LLM Function Calling Goes Way Beyond Text Generation

Written by Iris Zarecki | October 27, 2024

LLM function calling is the ability of a large language model to perform actions besides generating text by invoking APIs to interface with external tools. 

What is LLM function calling? 

LLM function calling is the process of enabling your large language model to interface with external tools via API calls. This feature allows LLMs – like Claude from Anthropic, Ernie from Baidu, Gemini from Google, and the GPT family from OpenAI – to perform tasks that require data from, and/or interaction with, external sources. It can also be used to augment data from internal company sources via Retrieval-Augmented Generation (RAG) and relevant LLM agents.

The function calling process starts with predefined functions that are fed into your LLM – including function descriptions and instructions for use. When the model detects the need to activate a function, it generates the function name and the required arguments in JSON format. These are then passed to the function for execution. See more on this below.
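
For instance, a single function definition passed to the model might look roughly like the Python dictionary below. The JSON-Schema-style parameter block is a common convention, but the exact wrapper format varies by provider, and get_current_weather is a hypothetical example:

```python
# Hypothetical function definition handed to the LLM along with the prompt.
# The JSON-Schema-style "parameters" block is a common convention; the exact
# wrapper format differs between LLM providers.
get_current_weather_definition = {
    "name": "get_current_weather",
    "description": "Return the current weather for a given city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name, e.g. 'London'"}
        },
        "required": ["city"],
    },
}
```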

Function calling enhances the utility of LLMs in practical applications like RAG chatbots. It does so by enabling LLMs to retrieve information or take actions that are outside the scope of their initial training. For example, a model could call a weather API to deliver real-time weather updates or trigger a database query to fetch updated stock quotes. In this way, LLM function calling bridges the gap between a model's language processing capabilities and the real-world information that consumers need. 

RAG vs LLM function calling 

RAG tools and LLM function calling were both designed to improve the accuracy and utility of an enterprise LLM by providing it with context. Where they differ is in how they source and apply that context.

RAG enhances LLMs by retrieving relevant context – structured data from enterprise systems or docs from vector databases – and augmenting the prompt with it. When a user query is received, RAG uses a retriever to find relevant structured or unstructured data, pulls the pertinent context from that data, and then includes it in an enhanced prompt that is passed to the LLM for generation. Active retrieval-augmented generation is ideal both for dynamic, quickly changing operational data and for static, slowly changing company docs in knowledge bases.
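
A minimal sketch of that retrieval flow is shown below; retriever and llm_generate are illustrative placeholders rather than any specific product's API:

```python
# Minimal RAG flow: retrieve relevant context, build an enhanced prompt,
# and pass it to the LLM. Both helpers are illustrative placeholders.
def answer_with_rag(query: str, retriever, llm_generate) -> str:
    # 1. Find the documents or records most relevant to the query
    docs = retriever.search(query, top_k=3)

    # 2. Fold the retrieved context into an enhanced prompt
    context = "\n\n".join(doc["text"] for doc in docs)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

    # 3. The LLM generates a response grounded in the retrieved context
    return llm_generate(prompt)
```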

In contrast, LLM function calling enables real-time interactions with external APIs and tools. It allows the LLM to trigger external systems, fetch live data (like stock prices or news updates), or perform actions in real time. This ability makes LLM function calling perfect for dynamic environments in which information is constantly changing. It’s also great for situations where the LLM needs to perform specific tasks autonomously, like managing workflows or integrating with real-time systems. 

How LLM function calling works 

LLM function calling enables LLMs to go beyond generating text by facilitating interaction with external systems, tools, and APIs. It enhances LLM AI learning by enabling the models not only to answer questions but also to perform real-world actions, customizing them for applications like automation, data retrieval, and task execution.

Here's how LLM function calling works (a code sketch of the end-to-end flow follows these steps):

  1. Prompt and function definitions 

    Using AI prompt engineering tools, the user provides a set of instructions that requires access to external data or an action. Along with this prompt, the application sends the LLM a list of available functions, including descriptions and input/output schemas – for example, a get_current_weather function that takes a city parameter and returns weather information.

  2. Function detection and selection 

    The LLM processes the prompt and determines if a function call is necessary. If so, it identifies the correct function from the provided list and generates a JSON dictionary that includes the selected function's name and the required input arguments (e.g., {"function": "get_current_weather", "city": "London"}).

  3. Execution 

    The application parses the JSON response and invokes the identified function – either sequentially or in parallel with other functions, depending on the requirements.

  4. Final response generation 

    After executing the function(s) and retrieving the required data, the output is fed back into the LLM. The LLM then integrates this information into its response, generating a message that's not only more accurate – because it's based on the data provided by the external function – but also less prone to AI hallucinations, thanks to your LLM guardrails.
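
Put together, the four steps translate roughly into the loop sketched below. The call_llm helper, the function-definition format, and get_current_weather are illustrative assumptions rather than any particular provider's API:

```python
import json

# Hypothetical external tool the application makes available to the model.
def get_current_weather(city: str) -> dict:
    return {"city": city, "temp_c": 14, "conditions": "light rain"}

AVAILABLE_FUNCTIONS = {"get_current_weather": get_current_weather}

FUNCTION_DEFINITIONS = [{
    "name": "get_current_weather",
    "description": "Return the current weather for a given city.",
    "parameters": {"city": "string"},
}]

def handle_user_prompt(user_prompt: str, call_llm) -> str:
    # Step 1: send the prompt plus the available function definitions to the LLM.
    first_reply = call_llm(prompt=user_prompt, functions=FUNCTION_DEFINITIONS)

    # Step 2: the model either answers directly or returns a JSON function call,
    # e.g. {"function": "get_current_weather", "city": "London"}.
    try:
        call = json.loads(first_reply)
    except json.JSONDecodeError:
        return first_reply  # plain-text answer, no function needed

    # Step 3: the application parses the JSON and invokes the selected function.
    func = AVAILABLE_FUNCTIONS[call["function"]]
    result = func(call["city"])

    # Step 4: feed the function output back to the LLM for the final response.
    return call_llm(
        prompt=f"{user_prompt}\n\nFunction result: {json.dumps(result)}"
    )
```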

Integrating LLM function calling with RAG 

By integrating LLM function calling with enterprise RAG, LLM users can enjoy the strengths of both techniques – for even more accurate, relevant and actionable responses.  

In such integrated systems, the LLM first determines if it needs external data to improve its response. If the user query requires updated or specialized information, the model initiates a retrieval process – pulling relevant documents or data from external databases, APIs, or the web –  via RAG conversational AI. This retrieved information is then passed back to the LLM via an enhanced prompt with enriched context, enabling it to generate a more accurate response.

If real-time data is required – for example, if a user asks for the latest stock prices or weather forecast – the LLM recognizes this and calls the relevant API directly, ensuring that only fresh, real-time data is used.

This integration works by dynamically coordinating when to retrieve data (via RAG architecture) and when to invoke specific functions (via LLM function calling). In advanced solutions, the LLM can even chain these operations – first retrieving relevant content and then processing it with an external function before delivering the final response.
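
A simplified coordinator for that kind of routing might look like the sketch below; classify_intent, rag_answer, live_tools, and call_llm are hypothetical placeholders, not K2view or any vendor's API:

```python
# Illustrative router: per query, answer from retrieved documents (RAG),
# call a live API (function calling), or chain both operations.
def answer(query: str, classify_intent, rag_answer, live_tools, call_llm) -> str:
    intent = classify_intent(query)  # e.g. "static", "live", or "both"

    if intent == "static":
        # Slowly changing knowledge: retrieve docs and generate.
        return rag_answer(query)

    if intent == "live":
        # Fresh data: call the relevant external API directly.
        tool = live_tools.pick(query)  # e.g. stock quotes or weather
        return call_llm(f"{query}\n\nLive data: {tool(query)}")

    # Chained: retrieve context first, then enrich it with a live function call.
    context = rag_answer(query)
    tool = live_tools.pick(query)
    return call_llm(f"{query}\n\nContext: {context}\n\nLive data: {tool(query)}")
```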

This synergy between RAG and LLM function calling allows these advanced systems to go beyond the static knowledge stored in your LLM vector database, providing responses that are not only up to date but also tailored to specific queries.

LLM function calling with GenAI Data Fusion  

GenAI Data Fusion, the K2View suite of RAG tools, integrates both retrieval-augmented generation and LLM function calling for generative AI use cases that requires precise, personalized interactions – beyond the scope of textual answers generated from a company’s private structured data. GenAI Data Fusion uses contextual LLM chain-of-thought prompting to ensure better, more thorough, and more relevant outputs. 

GenAI Data Fusion: 

  1. Retrieves customer data in real time, to create more accurate and relevant prompts. 

  2. Anonymizes sensitive data and PII (Personally Identifiable Information) dynamically. 

  3. Responds to data service access requests and provides recommendations on the fly. 

  4. Connects to enterprise systems – via API, CDC, messaging, or streaming – to collect data from multiple source systems.

GenAI Data Fusion powers your LLM to respond with greater accuracy to business-intensive queries based on external knowledge. 

Discover K2view GenAI Data Fusion, the RAG tools that integrate LLM function calling for AI use cases requiring real-time, external data.