Snowflake RAG excels at analytical GenAI, where response times are measured in minutes. Thanks to K2view, it can now support real-time operational GenAI as well.
Retrieval-Augmented Generation (RAG) is a design pattern that integrates your organization's private internal data with the publicly available data the Large Language Model (LLM) was trained on, to generate more accurate and personalized responses to user queries.
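At its core, the pattern boils down to three steps: retrieve private data relevant to the query, enrich the prompt with that context, and send the enriched prompt to the LLM. Here's a minimal sketch (all function names and the in-memory "data store" are hypothetical, standing in for a real retrieval backend such as Snowflake):

```python
def retrieve_context(query: str, user_id: str) -> str:
    """Placeholder retrieval step -- in practice this would query
    an internal data store such as Snowflake."""
    records = {"mike": "frequent_flyer_credits: 5000"}
    return records.get(user_id, "")

def build_prompt(query: str, context: str) -> str:
    """Augment the user's question with the retrieved private context."""
    return (
        "Answer using only the context below.\n"
        f"Context: {context}\n"
        f"Question: {query}"
    )

question = "How many frequent flyer credits do I have?"
context = retrieve_context(question, "mike")
prompt = build_prompt(question, context)
# The enriched prompt -- not the raw question -- is what the LLM receives,
# which is how it can answer with the customer's actual credit balance.
```

The key point is that the LLM itself is never retrained; only the prompt changes.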
For example, say a frequent flyer asks an airline’s RAG chatbot, “How many frequent flyer credits do I have?” An operational GenAI RAG tool would retrieve all the data related to that customer to generate a protected and precise real-time response, like “Mike, you have 5,000 frequent flyer credits available for use.” An agentic AI mechanism might go even further by suggesting, “I notice that, as a rule, you generally like to apply your credits to business-class upgrades. Would you like to apply 2,500 credits to an upgrade on the flight you just reserved?”
RAG tools retrieve relevant business data from data stores like Snowflake and then augment the LLM via contextually enriched prompts. Specifically, Snowflake Cortex implements RAG by:
Implementing RAG with Snowflake is an excellent choice for analytical GenAI. For example, answering a sales analyst's question like "Compare this year's Q3 sales results with last year's, with a breakdown by region, industry, and product" may require several complex queries that join multiple large tables in Snowflake. This process might take several minutes to execute.
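The analyst's question above might translate into SQL along these lines (the table and column names are illustrative, not an actual warehouse schema):

```python
# Illustrative analytical SQL for the year-over-year Q3 comparison.
# Table and column names are hypothetical; a real schema will differ.
Q3_COMPARISON_SQL = """
SELECT r.region,
       i.industry,
       p.product_name,
       SUM(CASE WHEN s.fiscal_year = 2024 THEN s.amount END) AS q3_this_year,
       SUM(CASE WHEN s.fiscal_year = 2023 THEN s.amount END) AS q3_last_year
FROM sales s
JOIN regions r    ON s.region_id   = r.region_id
JOIN industries i ON s.industry_id = i.industry_id
JOIN products p   ON s.product_id  = p.product_id
WHERE s.fiscal_quarter = 'Q3'
GROUP BY r.region, i.industry, p.product_name
"""
# Joining several large fact and dimension tables and aggregating across
# them is exactly why such analytical queries can take minutes to run.
```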
But, when it comes to operational GenAI use cases, such as assisting 500 contact center agents with a conversational AI chatbot, Snowflake would fall short due to the need for:
We recently conducted a survey of 300 AI professionals, spanning many different industries, and found that the two leading generative AI use cases are both operational.
Snowflake RAG is ill-suited to generative AI use cases in customer service, which are now being implemented by more than 50% of B2C organizations.
Why Snowflake RAG falls short for operational GenAI comes down to three factors: speed, security, and cost. More on this in the next section…
Most organizations store their raw enterprise data in data lakes like Snowflake Data Cloud. With this data already centralized and easily accessible, it makes perfect sense to leverage it for augmenting an LLM with RAG. LLM agents are implemented to access the structured data in Snowflake.
Despite the many advantages of data lakes (scalability, adaptability, and storage cost efficiency, to name a few), let’s examine some of their limitations when it comes to implementing an enterprise RAG:
K2view GenAI Data Fusion is a semantic data layer, optimized for Snowflake and operational GenAI use cases.
It’s an in-memory cache that dynamically organizes Snowflake data by entities. For example, in customer service the data would be organized by customers, with the data for each customer stored and managed in a “data lake of one”. In such a Micro-Database™, all the data for a specific customer (across all Snowflake tables) is organized as a single unit that can be queried in milliseconds via ANSI SQL.
So, when Jo, an authorized user, asks Robota the chatbot a question, K2view provides the appropriate LLM agent with secure access to all of Jo's data (and only Jo's data) at conversational latency – with the ability to answer any query related to Jo.
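Conceptually, that entity scoping might look like the sketch below. The names are hypothetical and the scoping is shown as simple query rewriting purely for illustration; in K2view's actual architecture, isolation is enforced by the Micro-Database layer itself, not by the caller:

```python
# Sketch of entity-scoped retrieval: every query the chatbot's agent
# issues is constrained to a single customer's data unit, so the agent
# can only ever see that customer's data. All names are illustrative.

def scoped_query(customer_id: str, sql: str) -> str:
    """Constrain a query to one customer's data unit.

    A real data layer would enforce this boundary itself; rewriting the
    SQL here just makes the guarantee visible.
    """
    return f"{sql.rstrip(';')} WHERE customer_id = '{customer_id}'"

# The agent asks a question on Jo's behalf; only Jo's rows are reachable.
q = scoped_query("jo-123", "SELECT SUM(credits) FROM loyalty")
```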
With K2view, you can now apply Snowflake RAG to operational GenAI use cases and start benefiting from enhanced cost savings and better user experiences.
Just one thing: When you call Snowflake about deploying Cortex RAG in your call center, be sure to tell them K2view sent you :)
Learn how K2view GenAI Data Fusion extends Snowflake RAG into the realm of operational GenAI.