Relational databases have become the backbone of entire industries since their inception. ETL pipelines that turn the gears of the corporate world would crumble without them. Deephaven has transformed the T in ETL (transforming transforms?), making it easier than ever to manipulate complex data in both static and real-time contexts all the same. Why not cover the E and the L, too? Well, now Deephaven does.
Deephaven’s v0.21 release brought the first Deephaven ODBC and ADBC client to the Python API. In this blog, I’ll show you how simple it is to connect to and ingest data from a relational database using Deephaven’s API.
ODBC makes accessing databases simple. Deephaven, a column-oriented query engine, is well-suited to source data from relational databases. Deephaven also relies heavily on Apache Arrow Flight to power the table engine. The Arrow Database Connectivity (ADBC) API is a natural fit for Deephaven as well. Start by using pip
to install the necessary drivers:
pip install adbc_driver_manager adbc_driver_postgresql
Then, in Deephaven, just import the packages:
from deephaven.dbc import adbc as dhadbc
from deephaven.dbc import odbc as dhodbc
Using Deephaven’s ADBC and ODBC clients is intuitive. If you have access to a relational database from your Deephaven instance, you need three things:
- The import statement
- The URI of the database
- A SQL query to execute
The workflows for both ADBC and ODBC are the same:
- Connect to the database.
- Execute a query in a cursor.
- Read the cursor.
What does that look like in a real example? Let’s look at an ADBC query:
from deephaven.dbc import adbc as dhadbc
from adbc_driver_postgresql import dbapi
import os
username = os.environ["POSTGRES_USERNAME"]
password = os.environ["POSTGRES_PASSWORD"]
url = os.environ["POSTGRES_URL"]
port = os.environ["POSTGRES_PORT"]
sql_query = "SELECT t_ts as Timestamp, CAST(t_id AS text) as Id, " +
"CAST(t_instrument as text) as Instrument, " +
"t_exchange as Exchange, t_price as Price, t_size as Size " +
"FROM CRYPTO TRADES"
uri = f"{url}:{port}/postgres?user={username}&password={password}"
with dbapi.connect(uri) as conn:
with conn.cursor() as cursor:
cursor.execute(sql_query)
table = dhadbc.read_cursor(cursor)
That’s all there is to it.
If your database only supports ODBC/JDBC, the Deephaven ADBC client can still be used. This flexibility provides increased efficiency when extracting data from relational database systems.
Our Slack community continues to grow. Join us!
Source link
lol