Release Notes for Deephaven v0.7.0 and v0.8.0

Deephaven remains laser-focused on making real-time data easy for everyone – on its own and coupled with static data.

The team continues to evolve the deephaven-core, barrage, and web-client-ui projects, releasing versions 0.7.0 and 0.8.0 of core at the beginning and end of December, respectively.

To support the community, we have been expanding documentation coverage and have established a new deephaven-examples GitHub organization to serve as a central warehouse of illustrative use cases. We encourage the community to contribute examples there as well.

Further, the Deephaven YouTube channel continues to grow. Subscribe to view the new content we plan to drop each week.

The description below follows the organization of development themes presented in our 2022 Roadmap.

Removed log4j as a dependency.
Added support for local arm64 builds.
Delivered the Deephaven learn library, which lets users combine Deephaven’s real-time table dynamics with Python AI models.
Created Docker images for Deephaven+PyTorch, +TensorFlow, and +scikit-learn for easy deployment by data scientists.
Made new deployment models available, particularly ones suited for local Python development.
Added input tables to the API and web-UI.
Meaningfully increased the performance of some aggregation cases (both updating and static).

Deliveries at the end of January will center on:

Engine performance improvements and measurement infrastructure.
A new CSV reader (that we’re proud of).
A new plug-in infrastructure (supporting both server-side and JS entrypoints).
A new Debezium integration (for CDC).
A meaningfully re-architected server-side Python experience.
A table-map implementation that will enable users to create and manage in-memory child tables based on keys of a parent table.

General ease of use

New deployment models

We invested in giving users new deployment models. Docker will remain a fundamental option, and the Envoy proxy will continue to be important for many setups, but we wanted to open simpler models for running and scaling Deephaven and its web-UI, particularly for local development. To accomplish this, we helped modify gRPC-Java. With that work, we can use Jetty, a Java servlet container, to run the server and to serve the web-UI, which wasn’t possible with Netty (which was used before this release). This allows you to run natively on Mac/Windows, or on Linux without the indirection of Docker, simplifying debugging and integration with other local resources. Some related websocket work also makes Deephaven compatible with gRPC-web clients. #1731

Improvements to (nicely simple) data sourcing methods

Users want the sourcing of data to be easy. We do too, so we built a URI-driven method in a library called ResolveTools.
With simple syntax like t = resolve('dh+plain://address/path/table_name'), you can inherit real-time, dynamic tables from Deephaven applications, CSVs, Parquet files, and Barrage tables – both locally and from remote sources (including public domains). In this release, we addressed some wiring related to those capabilities. #1706
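To make the URI shape concrete, here is a minimal sketch that splits the example URI into its parts using only the Python standard library. It is not Deephaven's resolver: the helper name describe_uri is hypothetical, and the interpretation of each part (scheme, host, path segments) is an assumption made for illustration.

```python
from urllib.parse import urlparse


def describe_uri(uri: str) -> dict:
    """Split a table URI into scheme, host, and path segments.

    Hypothetical helper for illustration only; it does not contact
    any server or call into Deephaven.
    """
    parsed = urlparse(uri)
    return {
        "scheme": parsed.scheme,  # e.g. "dh+plain"
        "host": parsed.netloc,    # e.g. "address"
        # Drop the empty segment produced by the leading "/".
        "path_parts": [part for part in parsed.path.split("/") if part],
    }


parts = describe_uri("dh+plain://address/path/table_name")
print(parts)
```

Read this way, the scheme selects how the table is fetched and the trailing path segment names the table, which is why a single string is enough to identify a remote source.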

Table replay available (also) in Python

Improved memory-use monitoring

The PerformanceQueries class provides process performance statistics to users. It is often used to analyze queries. We augmented the statistics it makes available. #1559

Query Engine

Improved aggregation infrastructure

We improved the performance of many aggregations in both the dynamic-update and static-batch cases. #1726
We added a new aggAllBy() method as catch-all infrastructure to support future aggregations that might be added to the engine. This method allows you to apply the same aggregation to all non-keyed columns. #1618
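The idea of applying one aggregation to every non-key column can be sketched in plain Python. This is only an illustration of the semantics described above, not the engine's implementation or its API: the function agg_all_by, its parameters, and the sample column names are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence


def agg_all_by(
    rows: List[Dict[str, object]],
    agg: Callable[[list], object],
    key_columns: Sequence[str],
) -> List[Dict[str, object]]:
    """Group rows by key_columns and apply agg to every other column."""
    # Map each key tuple to {column name: list of values in that group}.
    groups = defaultdict(lambda: defaultdict(list))
    for row in rows:
        key = tuple(row[k] for k in key_columns)
        for column, value in row.items():
            if column not in key_columns:
                groups[key][column].append(value)

    result = []
    for key, columns in groups.items():
        out = dict(zip(key_columns, key))
        for column, values in columns.items():
            out[column] = agg(values)  # same aggregation for each column
        result.append(out)
    return result


rows = [
    {"Sym": "AAPL", "Price": 10.0, "Size": 100},
    {"Sym": "AAPL", "Price": 12.0, "Size": 300},
    {"Sym": "MSFT", "Price": 20.0, "Size": 50},
]
summed = agg_all_by(rows, sum, ["Sym"])
print(summed)
```

The caller names only the key columns and the aggregation; every remaining column is picked up automatically, which is what makes this a convenient catch-all.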

QST now supports input tables

Input tables are a utility that makes integrating custom sources of streaming and batch data into Deephaven very easy. In this release, we added support for input tables via the Query Syntax Tree (QST), the declarative structure implementation available to users.

Python, ML, AI, and Data Science

Type casting inside Deephaven’s `learn` module

Deephaven’s learn module provides utilities for efficient data transfer between Deephaven tables and Python objects, as well as a framework for using popular machine-learning and deep-learning libraries with Deephaven tables. In this release, we fixed type casting so that the output of ML libraries arrives with the expected types. #1543

Other Python work

We restored the where_one_of functionality that a recent refactor had removed, so “OR” filters work again. #1650
Wheels are now being used directly in the build process. #1555
The jpy configuration will now be implied by the Python environment. #1708
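For readers unfamiliar with “OR” filtering, the following plain-Python sketch shows the semantics: a row is kept if at least one of the supplied conditions holds. It is an illustration only. The function name mirrors where_one_of for readability, but expressing filters as Python callables and rows as dictionaries is an assumption, not the Deephaven API.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def where_one_of(rows: List[Row], *predicates: Callable[[Row], bool]) -> List[Row]:
    """Keep each row that satisfies at least one predicate (logical OR)."""
    return [row for row in rows if any(pred(row) for pred in predicates)]


rows = [
    {"Sym": "AAPL", "Size": 100},
    {"Sym": "MSFT", "Size": 500},
    {"Sym": "GOOG", "Size": 50},
]

# Keep rows where the symbol is AAPL OR the size exceeds 400.
kept = where_one_of(
    rows,
    lambda r: r["Sym"] == "AAPL",
    lambda r: r["Size"] > 400,
)
print(kept)
```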

UI/UX and the JS API

UI-driven input experiences are now available in the web-UI

We updated the JS API to support gRPC input tables and delivered pretty slick user experiences around adding and changing data manually via the front end. If you pull up an input table in the UI, you can easily add rows and modify data therein. #1565
We updated Node.js to the latest 14.x release. #1565

Client APIs and the OpenAPI

We fixed an issue so that modified rows that are also shifted are correctly accounted for. #1564

Data Sources and Sinks

We added support for nested fields in Avro. #1667
Further, we fixed bugs in Avro options “mapping” and “mapping_only” in kafka_consumer.py. #1656
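One way to picture nested-field support is as a flattening step that turns a nested record into one value per column. The sketch below is a generic illustration in plain Python, not the Kafka consumer's actual code: the helper flatten_record, the "." separator, and the sample record are all assumptions.

```python
from typing import Dict


def flatten_record(record: Dict[str, object], separator: str = ".", prefix: str = "") -> Dict[str, object]:
    """Flatten nested dictionaries into a single level of path-named fields."""
    flat = {}
    for name, value in record.items():
        path = f"{prefix}{separator}{name}" if prefix else name
        if isinstance(value, dict):
            # Recurse into the nested record, carrying the path so far.
            flat.update(flatten_record(value, separator, path))
        else:
            flat[path] = value
    return flat


# Hypothetical decoded Avro record with one nested field.
trade = {"symbol": "AAPL", "quote": {"bid": 10.0, "ask": 10.5}}
columns = flatten_record(trade)
print(columns)
```

Each leaf of the nested record ends up under a path-style name, which is the kind of name a field-to-column mapping could then refer to.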

The Barrage Wire Protocol

We added support for BigDecimal and BigInteger in the Barrage protocol. #1627

To learn more about the documentation changes for each release, see:
