Data is the holy grail of AI. From nimble startups to global conglomerates, organizations everywhere are pouring billions of dollars into mobilizing datasets for high-performing AI applications and systems.
Yet even after all that effort, accessing and using data from different sources and across modalities—whether text, video, or audio—is far from seamless. The work involves multiple layers of integration, which often leads to delays and missed business opportunities.
Enter California-based ApertureData. To tackle this challenge, the startup has developed a unified data layer, ApertureDB, that merges the power of graph and vector databases with multimodal data management. This helps AI and data teams bring their applications to market much faster than traditionally possible. Today, ApertureData announced $8.25 million in seed funding alongside the launch of a cloud-native version of their graph-vector database.
“ApertureDB can cut data infrastructure and dataset preparation times by 6-12 months, offering incredible value to CTOs and CDOs who are now expected to define a strategy for successful AI deployment in an extremely volatile environment with conflicting data requirements,” Vishakha Gupta, the founder and CEO of ApertureData, tells VentureBeat. She noted the offering can increase the productivity of data science and ML teams building multimodal AI by tenfold on average.
What does ApertureData bring to the table?
Many organizations find that managing their growing pile of multimodal data—terabytes of text, images, audio, and video arriving daily—is a bottleneck in leveraging AI for performance gains.
The problem isn’t a lack of data (the volume of unstructured data has only been growing) but the fragmented ecosystem of tools required to put that data to work in advanced AI.
Currently, teams have to ingest data from different sources and store it in cloud buckets, with continuously evolving metadata kept in files or databases. Then they have to write bespoke scripts to search, fetch, and preprocess the information.
Once the initial work is done, they have to layer in graph databases along with vector search and classification capabilities to deliver the planned generative AI experience. This complicates the setup, leaving teams struggling with significant integration and management tasks and ultimately delaying projects by several months.
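The fragmented workflow described above can be illustrated with a minimal sketch: three separate stand-in systems (an object store, a metadata store, and a vector index) held together by bespoke glue code. All names and structures here are hypothetical illustrations of the pattern, not any real product's API.

```python
# Stand-ins for three separate systems a team would normally stitch together.
object_store = {}      # e.g., a cloud bucket: asset_id -> raw bytes
metadata_db = {}       # e.g., a metadata database: asset_id -> metadata dict
vector_index = []      # e.g., a vector database: (asset_id, embedding) pairs

def ingest(asset_id, blob, metadata, embedding):
    """Bespoke glue: three writes to three systems that must stay in sync."""
    object_store[asset_id] = blob
    metadata_db[asset_id] = metadata
    vector_index.append((asset_id, embedding))

def search(query_embedding, modality):
    """More glue: vector search in one system, metadata filter in another."""
    def dist(v, w):
        return sum((a - b) ** 2 for a, b in zip(v, w))
    ranked = sorted(vector_index, key=lambda p: dist(p[1], query_embedding))
    # Filtering requires a second round trip to the metadata store.
    return [aid for aid, _ in ranked if metadata_db[aid]["modality"] == modality]

ingest("img1", b"...", {"modality": "image"}, [0.1, 0.9])
ingest("vid1", b"...", {"modality": "video"}, [0.2, 0.8])
results = search([0.1, 0.9], "image")  # -> ["img1"]
```

Every new capability (annotations, model tracking, visualization) adds another system and another layer of glue, which is the integration burden the article describes.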
“Enterprises expect their data layer to let them manage different modalities of data, prepare data easily for ML, be easy for dataset management, manage annotations, track model information, and let them search and visualize data using multimodal searches. Sadly their current choice to achieve each of those requirements is a manually integrated solution where they have to bring together cloud stores, databases, labels in various formats, finicky (vision) processing libraries, and vector databases, to transfer multimodal data input to meaningful AI or analytics output,” Gupta, who first saw glimpses of this problem when working with vision data at Intel, explained.
Prompted by this challenge, she teamed up with Luis Remis, a fellow research scientist at Intel Labs, and started ApertureData to build a data layer that could handle all the data tasks related to multimodal AI in one place.
The resulting product, ApertureDB, allows enterprises to centralize all relevant datasets—including large images, videos, documents, embeddings, and their associated metadata—for efficient retrieval and query handling. It stores the data behind a uniform schema and then provides knowledge graph and vector search capabilities for downstream use across the AI pipeline, whether for building a chatbot or a search system.
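The pairing of graph relationships with vector search described above can be sketched conceptually in plain Python. This is a generic, self-contained illustration of the idea—not ApertureDB's actual query API—with illustrative asset names and toy embeddings.

```python
import math

# Toy unified store: each asset carries an embedding plus graph edges
# linking it to related assets. Data here is purely illustrative.
assets = {
    "doc1": {"embedding": [1.0, 0.0], "related": ["img1"]},
    "img1": {"embedding": [0.9, 0.1], "related": ["doc1", "vid1"]},
    "vid1": {"embedding": [0.0, 1.0], "related": ["img1"]},
}

def cosine(u, v):
    num = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return num / norm

def nearest(query, k=1):
    """Vector-search step: rank assets by similarity to the query embedding."""
    ranked = sorted(assets, key=lambda a: cosine(assets[a]["embedding"], query),
                    reverse=True)
    return ranked[:k]

def expand(asset_ids):
    """Graph step: pull in directly related assets via the knowledge graph."""
    out = list(asset_ids)
    for aid in asset_ids:
        for rel in assets[aid]["related"]:
            if rel not in out:
                out.append(rel)
    return out

# Find the closest asset to a query embedding, then traverse its relationships.
hits = expand(nearest([1.0, 0.05]))  # -> ["doc1", "img1"]
```

The point of a unified data layer is that both steps run over one store with one schema, instead of a vector database answering the first step and a graph database the second.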
“Through 100s of conversations, we learned we need a database that not only understands the complexity of multimodal data management but also understands AI requirements to make it easy for AI teams to adopt and deploy in production. That’s what we have built with ApertureDB,” Gupta added.
How is it different from what’s in the market?
While there are plenty of AI-focused databases in the market, ApertureData hopes to create a niche for itself by offering a unified product that natively stores and recognizes multimodal data and easily blends the power of knowledge graphs with fast multimodal vector search for AI use cases. Users can easily store and delve into the relationships between their datasets and then use AI frameworks and tools of choice for targeted applications.
“Our true competition is a data platform built in-house with a combination of data tools like a relational / graph database, cloud storage, data processing libraries, vector database, and in-house scripts or visualization tools for transforming different modalities of data into useful insights. Incumbents we typically replace are databases like Postgres, Weaviate, Qdrant, Milvus, Pinecone, MongoDB, or Neo4j– but in the context of multimodal or generative AI use cases,” Gupta emphasized.
ApertureData claims its database, in its current form, can increase the productivity of data science and AI teams by an average of 10x. It can be as much as 35 times faster than disparate solutions at mobilizing multimodal datasets. In terms of vector search and classification specifically, the company says it is 2-4x faster than existing open-source vector databases on the market.
The CEO did not name specific customers but pointed out that the company has secured deployments with select Fortune 100 companies, including a major home-furnishings retailer and a large manufacturer, as well as biotech, retail, and emerging generative AI startups.
“Across our deployments, the common benefits we hear from our customers are productivity, scalability and performance,” she said, noting that the company saved $2 million for one of its customers.
As the next step, the company plans to expand the new cloud platform to accommodate emerging classes of AI applications, focus on ecosystem integrations to deliver a seamless user experience, and extend partner deployments.