Google Revs Cloud Databases, Adds More GenAI to the Mix

Google Cloud has been stuck in third place among the cloud bigs for years, but it’s banking on a slew of database and AI enhancements unveiled at its Next Tokyo ‘24 conference today to help differentiate its offerings from those of competitors. The announcements include major updates to its Spanner database and new generative AI capabilities in BigQuery and Looker.

BigQuery and Looker, which are Google Cloud’s analytics database and BI front-end, respectively, are getting more hooks into Vertex AI and the various foundation models that it ships with under the Gemini brand.

Specifically, BigQuery is getting new GenAI assistant functions, such as SQL and Python code generation and understanding, which should accelerate the development of code for data science, data analysis, and data engineering.

The new GenAI capabilities will provide a boost in productivity for data preparation tasks, said Gerrit Kazmaier, Google’s vice president and general manager for database, data analytics, and Looker.

“Data is messy,” he said during a press conference last week. “Our new data preparation assistant basically is an intelligent data agent that is specialized on data preparation, and it’s basically helping everyone who has messy data, to wrangle it into a consistent form, the understanding, the semantics of fields, by proposing recommended transformations.”

Google Cloud worked with its DeepMind subsidiary to develop BigQuery’s new “code-assist” for SQL and Python

The BigQuery team worked “insanely hard” with Google DeepMind to develop the SQL and Python code generation capabilities, Kazmaier said. While SQL can get “very long, very complex,” the GenAI code assistant is able to follow it, reason about it, and even explain it back to you, he said, which helps with the generation of optimized code.

Google is also introducing to BigQuery a new user experience called a data canvas that has caught the eye of Kazmaier. “Data canvas is truly a marvel,” Kazmaier said. “It’s basically the perfect synergy between user experiences, [for] AI and data analysis.”

Data canvas moves users beyond the traditional paradigm of an IDE or a data science notebook by helping them build an interactive data graph with BigQuery, he said. The data graph, in turn, prompts the data agent about what the user is building, which leads to a virtuous cycle between user and AI.

This approach “leads us to incredible high accuracy numbers because essentially it’s a self-reinforcing dynamic by you incrementally building your analysis path,” Kazmaier said, “and the analysis path, because of the semantic relationship that it describes, it also is informing AI about your intentions. And you get much higher accuracy in predictions.”

“We have amazing customer feedback about data canvas,” he added. “We really expect groundbreaking productivity gains from that new experience and we’re so proud to put it into GA.”

Looker is also getting the Gemini treatment in the form of AI-powered formula assist capabilities (currently in preview) that help users “explore data and create metrics from complex formulas.” Google is also bringing new slide generation capabilities that will help analysts bring their queries to life with compelling graphics generated by AI.

BigQuery has traditionally been used for analyzing structured data. With this release, Google is opening it up to analyze all types of data, including structured, semi-structured, and unstructured data. Combined with additional GenAI enhancements in the Looker business intelligence (BI) tool, it will radically open up the potential field of questions that customers can ask–and actually get answered, Kazmaier said.

Looker users can generate slides using Gemini

“BI has been really constrained in so many ways. Usually organizations run out of time. They never run out of questions,” he said. “So we have focused our innovation on Looker and in BI in building customized LLM agents who are really the BI experts, which know how to select data, perform analysis, and summarize it. So basically everyone who has a question can get it without working with an analyst. … Instead of someone even having to ask the question, the system is basically mining that a priori and telling you what is relevant to pay attention to.”

Google Cloud also announced that it’s now supporting open source Apache Spark and Apache Kafka for data streaming and processing within BigQuery. It also announced a new real-time streaming option from its Analytics Hub that enables users to subscribe and get access to real-time data feeds. Finally, it launched a new data migration offering designed to help customers move into BigQuery.

Operational Databases

As if the enhancements in BigQuery and Looker weren’t enough, Google Cloud is also making big announcements for its transactional databases, including Spanner, the global SQL database chosen by customers with tight availability requirements.

For starters, Spanner is going multi-model, with graph, vector, and full-text search capabilities. These will help Spanner power the next generation of GenAI apps, says Andi Gutmans, Google’s vice president and general manager of databases.

“Graph is a really critical capability to understand how things relate to each other…[and] really understanding how do we deliver contextual capability out of our data,” Gutmans said during the press conference. “And with Spanner Graph, we’re basically taking advantage of that very strong consistency, very strong availability, and virtually unlimited scale of Spanner, but also supporting the graph model on top of that.”

Graph is a first-class data model within Spanner, according to Gutmans, but that doesn’t mean that customers have to store all of their data as a graph. “They can take their enterprise data and just start to build the graph capability on top of that or on the side, and query these things together, so that will really help customers extract the maximum out of their data,” he said.
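To make the idea of layering a graph over existing relational data concrete, here is a minimal sketch in plain Python. It does not use Spanner or any Google API; the `ACCOUNTS` and `TRANSFERS` tables, their column names, and both helper functions are hypothetical and exist only for illustration. The rows stay in their tabular form, and the graph is simply a view derived from a foreign-key-style table.

```python
from collections import defaultdict, deque

# Hypothetical relational tables, represented as lists of rows.
ACCOUNTS = [
    {"id": 1, "owner": "Ana"},
    {"id": 2, "owner": "Ben"},
    {"id": 3, "owner": "Cy"},
    {"id": 4, "owner": "Dee"},
]
TRANSFERS = [
    {"src": 1, "dst": 2},
    {"src": 2, "dst": 3},
    {"src": 3, "dst": 4},
]


def build_edges(transfers):
    """Derive a directed adjacency list from the transfer rows."""
    edges = defaultdict(list)
    for row in transfers:
        edges[row["src"]].append(row["dst"])
    return edges


def reachable_within(edges, start, max_hops):
    """Breadth-first search: node ids reachable from start in at most max_hops."""
    seen = {start}
    queue = deque([(start, 0)])
    found = []
    while queue:
        node, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not expand past the hop limit
        for nxt in edges.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                found.append(nxt)
                queue.append((nxt, hops + 1))
    return found


edges = build_edges(TRANSFERS)
owners = {row["id"]: row["owner"] for row in ACCOUNTS}
# Join the graph result back to the relational table to get owner names.
print([owners[i] for i in reachable_within(edges, 1, 2)])
```

The last line combines a graph traversal with an ordinary lookup against the accounts table, which is the "query these things together" pattern in miniature. A real graph database would express the traversal declaratively rather than with a hand-written search.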

Spanner is also supporting Graph Query Language (GQL), which is an ISO standard for querying graphs. It’s also supporting full-text search, which should help minimize data movement into search engines like Elasticsearch or Solr, Gutmans said. Finally, Spanner is gaining support for storing vector embeddings, which will help with semantic search and also help serve data as part of a retrieval augmented generation (RAG) setup.
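For readers unfamiliar with how stored embeddings enable semantic search, the following is a rough sketch in plain Python, not Spanner's actual interface. The document ids, the three-dimensional vectors in `ROWS`, and the function names are all made up for illustration; real embeddings come from a model and have hundreds or thousands of dimensions. The sketch ranks stored rows by cosine similarity to a query vector, which is one common similarity measure; a production system would typically use an approximate index rather than scanning every row.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, rows, k=2):
    """Return the ids of the k rows whose embeddings are most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, emb), doc_id) for doc_id, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical (doc_id, embedding) rows standing in for a table with a vector column.
ROWS = [
    ("returns-policy", [0.9, 0.1, 0.0]),
    ("shipping-times", [0.1, 0.9, 0.1]),
    ("refund-steps", [0.8, 0.2, 0.1]),
]

print(top_k([1.0, 0.0, 0.0], ROWS))
```

In a RAG setup, the documents returned by a lookup like `top_k` would be passed to a language model as context alongside the user's question.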

“What is most exciting here is that we’re bringing together the relational model, the graph model, the full-text search model and the vector search model, and all these models are fully interoperable, meaning I can actually build applications that have all these capabilities with them, and that’s really going to help our customers build very intelligent applications in a way that they’ve never been able to do before, in a single system, with the highest level of availability, consistency, and scale,” Gutmans said.

Finally, Google Cloud is rejiggering its Spanner packaging. It’s now offering the database in three versions, Standard, Enterprise, and Enterprise Plus, instead of the previous two. According to Gutmans, this will provide more flexibility and cost transparency to Spanner customers, since replication costs are built into Spanner Enterprise.

For more info, see the Google Cloud blog.
