Apache Arrow is much more than a format: it is an entire ecosystem. In this talk, we'll dive into one of the newest Apache Arrow subprojects, Arrow Database Connectivity (ADBC), an API specification for Arrow-based database access, with APIs provided for C, Go, and Java. These APIs are implemented by drivers (or a driver manager), which use whatever underlying protocol each database requires. The goal of ADBC is to give applications a single Arrow-based API for working with multiple databases, whether or not those databases are Arrow-native. Application code shouldn't need to juggle conversions from non-Arrow-native data sources alongside bindings for Arrow-native database protocols.
Over the course of this session, you’ll get a crash course in ADBC and learn how it communicates with different data APIs (like Arrow Flight SQL and Postgres) using Arrow-native in-memory data. By the end, you’ll understand the use cases it can conquer and know where to access the resources you need to get started.
This talk will cover the goals of ADBC, its use cases, and examples of using it to communicate with different data APIs (such as Arrow Flight SQL or Postgres) using Arrow-native in-memory data.
Matt is a committer on the Apache Arrow project, where he frequently enhances the Golang Arrow and Parquet libraries, among other contributions, and helps grow the Arrow community. Matt recently joined Voltron Data to work on the Apache Arrow libraries full time and to grow the Arrow Golang community. His first book, "In-Memory Analytics with Apache Arrow", was published in June 2022 and is the first (and currently only) book on Apache Arrow.