Here's our January 2021 roundup of links from across the web that could be relevant to you:
1. Storing Cold Metadata with Alki (Dropbox)
Dropbox shared insights into Alki, the petabyte-scale metadata store it designed for infrequently accessed metadata (“cold data”). The post details how the one-size-fits-all database Edgestore was reaching capacity limits, and why audit logs were a good candidate for moving off costly SSDs. After considering off-the-shelf options, the team settled on building its own solution, Alki, on top of AWS services, with DynamoDB as the hot store and S3 as the cold store. Like HBase or Cassandra, Alki is based on log-structured merge-trees (LSM trees), but is better suited to handle hot-then-cold audit logs, as well as future use cases at Dropbox.
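For readers new to the hot/cold, LSM-style idea, here is a minimal sketch of the general pattern. It is not Dropbox's implementation: the `TieredStore` class, its `hot_limit` parameter, and the in-memory dict and list standing in for DynamoDB and S3 are all hypothetical and chosen only for illustration.

```python
from bisect import bisect_left


class TieredStore:
    """Toy LSM-style store: a small mutable hot tier plus immutable sorted cold runs.

    Hypothetical illustration only. The dict stands in for a hot store
    (DynamoDB in the post) and the list of sorted runs stands in for a
    cold object store (S3 in the post).
    """

    def __init__(self, hot_limit=2):
        self.hot = {}            # recent writes, mutable
        self.cold_runs = []      # immutable sorted runs, oldest first
        self.hot_limit = hot_limit

    def put(self, key, value):
        self.hot[key] = value
        # Once the hot tier grows past its limit, flush it as one sorted run.
        if len(self.hot) > self.hot_limit:
            self._flush()

    def _flush(self):
        self.cold_runs.append(sorted(self.hot.items()))
        self.hot.clear()

    def get(self, key):
        # The newest data wins: check the hot tier first, then cold runs newest to oldest.
        if key in self.hot:
            return self.hot[key]
        for run in reversed(self.cold_runs):
            keys = [k for k, _ in run]
            i = bisect_left(keys, key)  # binary search within the sorted run
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None


store = TieredStore(hot_limit=2)
for k, v in [("a", 1), ("b", 2), ("c", 3)]:
    store.put(k, v)          # the third put triggers a flush to the cold tier
store.put("a", 9)            # a newer value for "a" lands in the hot tier
```

In this sketch, `store.get("a")` is served from the hot tier, while `store.get("b")` falls through to the flushed cold run. A real system would also compact cold runs over time, which is left out here.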
Here's our December 2020 roundup of links from across the web that could be relevant to you:
1. The Modern Data Stack (Fishtown Analytics)
This long-form post on the dbt blog is a must-read. Titled “The Modern Data Stack: Past, Present, and Future,” it answers the question that Tristan Handy has been asking himself for the past two years: “What happened to the massive innovation we saw from 2012-2016?” His carefully thought-out analysis covers the natural cycles of technological shifts, defines the phase we are in as a ‘deployment’ one, and points out high-impact opportunity areas for the next few years, which you might find particularly useful if you are considering launching a new product.
Here's our November 2020 roundup of good reads and podcast episodes that might be relevant for your career in data:
1. Heroes of NLP: Quoc Le (Deeplearning.ai)
Here's our October 2020 roundup of good reads and podcast episodes that might be relevant to you as a data professional:
1. Multiplayer Editing: a Pragmatic Approach (Hex)
Created by Berlin-based developer Jan Oberhauser in 2019, n8n presents itself as “a free and open workflow automation tool”. Think of it as a locally hosted Zapier on steroids.
Here's our September 2020 roundup of good reads and podcast episodes that might be relevant to you as a data professional:
1. What Data Tools Don't Do (Data Council)
Our founder Pete Soderling co-authored a follow-on piece to his previous post with Abe Gong, core contributor to Great Expectations, and Sarah Catanzaro, Partner at Amplify Partners. For it, they interviewed the makers of some of the hottest data tools. The focus is still the same: rather than what these data tools can do, we hear about what they don't do, as a way to better understand how they fit together. From ApertureData to Xplenty, this new installment covers 21 new tools, and you can read it here.