Real Time Text Matching at Scale

Shayan Mohanty | Watchful.io

ABOUT THE TALK

The total number of data-producing devices in the world is increasing, and data-laden organizations are feeling a growing pressure to reduce the amount of time it takes to extract value and insight from their data. Traditional Extract-Transform-Load (ETL) pipelines are insufficient tools for dealing with the deluge of dark data being generated in the world, and are generally more likely to end up as bottlenecks than sources of “fast insight”.
 
The need to identify and process data in-stream is apparent across the spectrum from fully structured to totally unstructured data. Watchful is a real-time pattern-matching platform that makes stream processing easy and fast. It allows users to identify, filter, and route data in real time based on complex patterns in its content rather than inflexible headers/schemas, a model we call “Stream-Filter-Drain”.
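
As a rough illustration of the “Stream-Filter-Drain” idea, the sketch below routes each record to named drains based on patterns in its content. It is not Watchful's API: the rule list, the drain names, and the `route` function are all hypothetical, and Python's standard `re` module is used only for convenience.

```python
import re
from collections import defaultdict

# Hypothetical routing rules: (compiled pattern, drain name).
# These names are assumptions for illustration, not part of Watchful.
RULES = [
    (re.compile(r"\bERROR\b"), "alerts"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "pii_review"),
]

def route(stream, rules=RULES):
    """Send each record to every drain whose pattern matches its content."""
    drains = defaultdict(list)
    for record in stream:
        for pattern, drain in rules:
            if pattern.search(record):
                drains[drain].append(record)
    return dict(drains)

records = [
    "2019-01-01 ERROR disk full",
    "user ssn 123-45-6789 submitted",
    "heartbeat ok",
]
routed = route(records)
```

Records that match no rule are simply dropped here; a real system would likely need an explicit default drain.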
 
Under the hood, Watchful is powered by a sophisticated distributed non-backtracking regular expression engine and a coordination layer designed to provide a turn-key experience for end users. In this talk, we will discuss Watchful’s high-level architecture, the regex evaluation strategies it bundles with the engine, and the concurrency guarantees and scaling patterns it is designed for. We will also briefly touch on a few active use cases to illustrate how “Stream-Filter-Drain” is currently used to power complex real-time systems.
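
To make “non-backtracking” concrete, here is a minimal sketch of one common approach: simulating a nondeterministic finite automaton (NFA) by tracking the set of live states, so each input character is read exactly once. The abstract does not say which strategy Watchful uses, so treat this as an assumption-laden illustration. The NFA for `(a|b)*abb` is written by hand, and `TRANSITIONS` and `matches` are hypothetical names.

```python
# Hand-built NFA for the pattern (a|b)*abb (illustrative only).
# State 0 loops on 'a' and 'b'; states 1, 2, 3 track the suffix "abb".
TRANSITIONS = {
    (0, "a"): {0, 1},
    (0, "b"): {0},
    (1, "b"): {2},
    (2, "b"): {3},
}
START = 0
ACCEPT = {3}

def matches(text):
    """Full-string match by advancing a set of live states, one char at a time."""
    current = {START}
    for ch in text:
        nxt = set()
        for state in current:
            # Follow every transition available on this character.
            nxt |= TRANSITIONS.get((state, ch), set())
        if not nxt:
            # No live states remain, so no match is possible.
            return False
        current = nxt
    return bool(current & ACCEPT)
```

Because the matcher never revisits earlier input, its running time is bounded by the input length times the number of NFA states, regardless of how the pattern is written.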

Shayan Mohanty

Founder | Watchful.io

Shayan is the founder of Watchful.io. He is also a Data Engineer at Facebook. Previously, he worked at Able Lending as a Software Engineer. He graduated from the University of Texas at Austin.
