Data Council Blog

Building a Column-Oriented, Distributed Data Store for Analytics - The Story of Druid

 

Druid is a modern data store built for analytics use cases. As the volume of data has exploded and companies have sought deeper insights from it, ad-hoc analytics have become difficult because more data is buried in distributed systems like Hadoop and Spark. The query model for these systems can result in long latencies, making them suboptimal for interactive analytics applications.

How to Build a Data Pipeline That Handles Hundreds of Different Inputs

How many different file formats does your ETL system need to parse? For many data pipelines, several well-defined formats will suffice. Things break, and at times require manual intervention, but not so often that a couple of engineers can't keep tabs on the system and keep things running relatively smoothly.

10 Unique Gift Ideas for Data Scientists and Engineers

 

Shopping for geeks can be stressful, especially if you don’t know what to get someone. Take a data scientist or engineer for example: do you get him or her gadgets, clothing, or an experiential gift? Should it be practical or fun? How about familiar or unique?

Open Source Software Wins $2K in Lieu of Conference Swag

 

Conference swag is great, but cash is better. This year, the sponsors of DataEngConf NYC '16, which included over 25 data science and engineering experts from top data, finance and media companies, decided to award one lucky open source software project $2,000, plus free tickets to the next DataEngConf in San Francisco in spring 2017, in lieu of conference swag.