OpenAI's November 2022 launch of ChatGPT, which marked a significant improvement in generative AI technology, sparked a surge of enterprise interest in AI development.
Generative AI models, when combined with an enterprise's proprietary data, can be used to develop applications that enable employees to interact with data using natural language and automate repetitive tasks.
However, for those applications to be worthwhile, they need relevant, high-quality data, and lots of it.
The more relevant, high-quality data there is to train an AI application, the more likely it is to deliver accurate, trustworthy outputs that can be used to inform business decisions.
Given the need for volume, unstructured data is more important than in the past. Structured data such as financial records and point-of-sale transactions makes up less than 20% of all data. To get enough data to properly train AI tools -- and provide a more comprehensive view of an organization's operations -- unstructured data is required.
As a result, vector search has taken on a critical role over the past two years, enabling enterprises to access their unstructured data as they develop AI tools. Providing that access, meanwhile, was the impetus for Aerospike first developing vector search capabilities, according to Naren Narendran, the vendor's chief engineering officer.
"Graph and vector are particularly important for our AI strategy," he said. "They are foundational for the future of AI applications and [are] the reason we got into those."
However, how vectors are indexed is critical to their effectiveness. If not done well, relevant vectorized data will be difficult to discover.
Aerospike's new vector search capabilities include what it calls a hierarchical navigable small world (HNSW) index.
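To illustrate why indexing matters, the minimal Python sketch below shows the exhaustive comparison that an index such as HNSW is generally designed to avoid: scoring a query against every stored vector. It is not Aerospike's implementation or API. The function names, the sample vectors and the choice of cosine similarity are assumptions made only for illustration.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def brute_force_search(query, vectors, k=2):
    """Score the query against every stored vector and return the top-k keys.

    Hypothetical helper: the cost grows linearly with the number of stored
    vectors, which is the cost an approximate index tries to reduce.
    """
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:k]]

# Made-up three-dimensional embeddings standing in for vectorized documents.
documents = {
    "invoice": [0.9, 0.1, 0.0],
    "receipt": [0.8, 0.2, 0.1],
    "memo": [0.0, 0.1, 0.9],
}

top = brute_force_search([1.0, 0.0, 0.0], documents, k=2)
print(top)
```

Because every query touches every vector, this approach slows as data volume grows, which is why a poorly built or missing index makes relevant vectors hard to find quickly.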
The indexing approach lets data be ingested into the database and indexed at the same time so it can be searched across devices. In addition, although ingestion and indexing might be taking place while users run real-time queries on the data, the workloads are kept separate to optimize performance.
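The separation of ingestion from indexing described above can be sketched, under heavy simplification, as a write path that stores a record immediately and a background worker that indexes it later. The Python example below is a toy model only: the `ToyVectorStore` class and its methods are hypothetical, the "index" is a plain set of keys rather than an HNSW graph, and nothing here reflects Aerospike's actual architecture or API.

```python
import queue
import threading

class ToyVectorStore:
    """Hypothetical store: writes return at once, indexing runs in the background."""

    def __init__(self):
        self.records = {}              # every ingested vector, by key
        self.indexed = set()           # keys the background worker has indexed
        self._pending = queue.Queue()  # keys waiting to be indexed
        self._lock = threading.Lock()
        worker = threading.Thread(target=self._index_loop, daemon=True)
        worker.start()

    def ingest(self, key, vector):
        """Store the vector immediately and hand indexing to the worker."""
        with self._lock:
            self.records[key] = vector
        self._pending.put(key)

    def _index_loop(self):
        """Background worker: mark each pending key as searchable."""
        while True:
            key = self._pending.get()
            with self._lock:
                self.indexed.add(key)
            self._pending.task_done()

    def wait_until_indexed(self):
        """Block until every ingested key has been indexed."""
        self._pending.join()

    def search(self, query, k=1):
        """Return the k nearest indexed keys by squared Euclidean distance."""
        with self._lock:
            candidates = [(key, self.records[key]) for key in self.indexed]

        def squared_distance(vec):
            return sum((q - v) ** 2 for q, v in zip(query, vec))

        candidates.sort(key=lambda item: squared_distance(item[1]))
        return [key for key, _ in candidates[:k]]

store = ToyVectorStore()
store.ingest("a", [0.0, 0.0])
store.ingest("b", [1.0, 1.0])
store.wait_until_indexed()
nearest = store.search([0.9, 1.1], k=1)
print(nearest)
```

In this sketch, a query issued before `wait_until_indexed()` returns would see only the records indexed so far, which is the trade-off of letting writes proceed without waiting on the index.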
Performance speed, meanwhile, is important for Aerospike given not only its historical focus on real-time analysis but also the need to keep AI tools updated with the most current data possible, according to Aslett.
"The ability to scale vector ingestion and indexing independently is in keeping with Aerospike's focus on real-time application requirements," he said. "In addition, it supports the increasing need for high-performance GenAI and AI inference to facilitate intelligent operational applications that deliver contextually relevant recommendations, predictions and forecasting."
Stephen Catanzano, an analyst at Informa TechTarget's Enterprise Strategy Group, similarly noted the importance of HNSW. He pointed out that enabling data to be ingested in real time while the system asynchronously builds an index fuels real-time, AI-powered decisions.
In addition, Catanzano highlighted the importance of new storage options for vectorized data, such as in-memory for small indexes or hybrid memory for large indexes, that had previously been available only in Aerospike's core database.
"The most significant features in this update are the durable self-healing indexes and flexible storage configurations," he said. "Together, these features enable better scalability, reduced operational overhead and lower infrastructure costs for enterprise AI systems."
Beyond new indexing and storage capabilities, Aerospike's Vector Search update includes the following features:
Combined, the new features comprise a compelling update, according to Catanzano.
"These features address key industry challenges like uninterrupted performance, scalability and cost reduction," he said. "This release [is] a noteworthy advancement rather than just an incremental improvement."
While rising interest in AI development led Aerospike to add vector search and graph technology to its database platform, the motivation for developing new vector search capabilities came from customer feedback and market observations, according to Narendran.
In particular, multi-model database capabilities in the same system address customer needs.
"[With multi-model capabilities], customers don't have to get a vector database, pull their data out and move it into the vector database," Narendran said.
Regarding market trends, the rising interest in developing AI, both generative AI and traditional AI, is a driving force, he continued.