Jan 2025 • 7 min read
Function Calling in LLMs: OpenAI and Anthropic Claude
Understanding how to enable LLMs to interact with external tools, APIs, and databases through structured function calling.
What is Function Calling?
Function calling enables LLMs like Claude and GPT to achieve complex and practical tasks beyond generating text alone. It bridges the gap between the model's conversational intelligence and external APIs, databases, and custom code.
OpenAI introduced function calling first, and it quickly became a core pattern for building LLM applications. Anthropic later brought the same capability, which it calls tool use, to the Claude 3 model family.
How Function Calling Works
The Process
- Define Tools: You send a request to the API that includes the user's query and a list of available tools the model can use. Each tool is defined with a name, description, and input schema.
- Model Decides: The LLM analyzes the user query and determines which tool (if any) to use based on tool descriptions.
- Returns Function Call: The model returns the function name and arguments in a structured format (JSON).
- You Execute: Your code parses the response and actually calls the function with the provided arguments.
- Return Results: You send the function results back to the model.
- Model Responds: The LLM uses the function results to generate a natural language response to the user.
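The six steps above can be sketched end to end with plain dictionaries. Everything here is illustrative: the get_weather tool, its stubbed data, and the hard-coded model response stand in for what a real provider SDK would return.

```python
import json

# Step 1: the tool definitions you would send with the request
tools = [{
    "name": "get_weather",
    "description": "Get the current temperature for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

# Your local implementations, keyed by tool name
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}  # stubbed data for the sketch

IMPLEMENTATIONS = {"get_weather": get_weather}

# Steps 2-3: the model decides and returns a structured call
# (simulated here; a real response comes back from the provider's API)
model_response = {
    "type": "tool_use",
    "name": "get_weather",
    "input": {"city": "Berlin"},
}

# Step 4: your code parses the response and executes the function
fn = IMPLEMENTATIONS[model_response["name"]]
result = fn(**model_response["input"])

# Step 5: the result goes back to the model as a new message,
# so it can generate the final natural language answer (step 6)
tool_result_message = {
    "role": "user",
    "content": [{"type": "tool_result", "content": json.dumps(result)}],
}
```

The dispatch-by-name dictionary is the key piece: the model only ever names a function, and your code decides whether and how to run it.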
Critical Point: You Execute Functions
The model does not execute functions on the provider's side; it only returns the function name and arguments, which your own code must parse and execute. This is a security feature: the LLM suggests what to do, but you control what actually happens.
Model Support
Anthropic Claude
Only Claude 3 models and newer support function calling (which Anthropic calls tool use) via the Anthropic API. This includes:
- claude-3-opus-20240229
- claude-3-sonnet-20240229
- claude-3-haiku-20240307
- claude-opus-4-20250514 (latest)
OpenAI
Function calling is supported in:
- gpt-4 and all gpt-4-* variants
- gpt-3.5-turbo (snapshot 0613 and later)
- gpt-4o (optimized for function calling)
Anthropic's Implementation
Messages API Structure
The Anthropic Messages API provides a clear mechanism for function calling through its tools, tool_use, and tool_result structures, which keeps implementation straightforward.
Tool Definition Format
Tools are defined with:
- name: Unique identifier for the tool
- description: What the tool does (the model uses this to decide when to call it)
- input_schema: JSON Schema defining expected parameters
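A complete definition in Anthropic's format looks like the following. The tool name, description text, and ticker field are all made up for illustration; only the three-key structure (name, description, input_schema) is prescribed by the API.

```python
# One tool in Anthropic's definition format. The description is what the
# model reads when deciding whether to call the tool, so it spells out
# both the purpose and when to use it.
stock_price_tool = {
    "name": "get_stock_price",
    "description": (
        "Get the latest trading price for a stock. "
        "Use this whenever the user asks about a share price. "
        "Requires the ticker symbol."
    ),
    "input_schema": {
        "type": "object",
        "properties": {
            "ticker": {
                "type": "string",
                "description": "Stock ticker symbol, e.g. 'AAPL'.",
            }
        },
        "required": ["ticker"],
    },
}
```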
Common Use Cases
Database Queries
Convert natural language questions into database queries. The model determines what data to fetch, you execute the query, and the model formats results into readable answers.
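One safe shape for a database tool is shown below with an in-memory SQLite table (the orders data and the total_spend function are invented for the sketch). The model supplies only the arguments; the SQL itself is a fixed, parameterized query, never raw SQL written by the model.

```python
import sqlite3

# In-memory demo table (illustrative data)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer TEXT, total REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "alice", 30.0), (2, "bob", 75.5), (3, "alice", 12.0)],
)

def total_spend(customer: str) -> float:
    """Tool implementation: a fixed parameterized query. The model chooses
    the customer argument; it never writes SQL."""
    row = conn.execute(
        "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer = ?",
        (customer,),
    ).fetchone()
    return row[0]

# The model would supply {"customer": "alice"} as the tool arguments
print(total_spend("alice"))  # → 42.0
```

Keeping the query fixed and parameterized sidesteps SQL injection from model output entirely, at the cost of one tool per query shape.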
API Integration
Enable the LLM to interact with external services: send emails, create calendar events, fetch weather data, search the web, or call custom business APIs.
Data Processing
Define functions for complex calculations, data transformations, or analysis that would be difficult or unreliable for the LLM to do through text generation alone.
Multi-Step Workflows
Chain multiple function calls together. The model can call one function, analyze results, and decide to call another function based on those results.
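A multi-step workflow is just the single-call loop run repeatedly until the model stops requesting tools. In this sketch a scripted list of turns stands in for the model (the find_user and list_orders tools and their return values are invented); a real loop would call the provider's API at each iteration instead.

```python
# Scripted "model" turns: look up a user, then fetch their orders,
# then answer in plain text. A real loop would get each turn from the API.
scripted_turns = [
    {"type": "tool_use", "name": "find_user", "input": {"email": "a@example.com"}},
    {"type": "tool_use", "name": "list_orders", "input": {"user_id": 7}},
    {"type": "text", "text": "User 7 has 2 orders."},
]

def find_user(email):
    return {"user_id": 7}          # stubbed lookup

def list_orders(user_id):
    return {"orders": [101, 102]}  # stubbed fetch

TOOLS = {"find_user": find_user, "list_orders": list_orders}

transcript = []
for turn in scripted_turns:
    if turn["type"] == "text":     # the model is done calling tools
        transcript.append(turn["text"])
        break
    result = TOOLS[turn["name"]](**turn["input"])
    transcript.append((turn["name"], result))  # fed back as a tool_result
```

The loop structure is the important part: the model sees each result before deciding on the next call, which is what lets it chain find_user into list_orders.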
Implementation with LlamaIndex
LlamaIndex provides clean abstractions for function calling with both OpenAI and Anthropic. The FunctionCallingAgent class handles the complex orchestration automatically, letting you focus on defining tools and business logic.
LlamaIndex Benefits
- Automatic tool discovery from function signatures
- Built-in retry logic for failed function calls
- Conversation memory management
- Streaming support for function calling
- Easy integration with RAG pipelines
Best Practices
Write Clear Tool Descriptions
The model relies entirely on tool descriptions to decide when to use them. Be specific about what each tool does, when to use it, and what parameters it needs.
Validate Inputs
Even though models are good at following schemas, always validate function arguments before execution. Models can occasionally generate invalid parameters.
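A minimal validator against the tool's own input_schema can catch most bad arguments before execution. This is a stdlib-only sketch covering required fields and basic types; a production system might use a full JSON Schema library instead.

```python
def validate_args(schema: dict, args: dict) -> list:
    """Check model-generated arguments against a JSON-Schema-like tool
    schema. Returns a list of error strings (empty means valid)."""
    errors = []
    type_map = {"string": str, "number": (int, float), "integer": int,
                "boolean": bool, "object": dict, "array": list}
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required argument: {name}")
    for name, value in args.items():
        prop = schema.get("properties", {}).get(name)
        if prop is None:
            errors.append(f"unexpected argument: {name}")
        elif not isinstance(value, type_map[prop["type"]]):
            errors.append(f"{name} should be {prop['type']}")
    return errors

schema = {"type": "object",
          "properties": {"city": {"type": "string"}},
          "required": ["city"]}

assert validate_args(schema, {"city": "Oslo"}) == []
assert validate_args(schema, {}) == ["missing required argument: city"]
assert validate_args(schema, {"city": 5}) == ["city should be string"]
```

Returning a list of errors rather than raising lets you send all of the problems back to the model in one tool_result, so it can correct its call in a single retry.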
Handle Errors Gracefully
When a function call fails, return clear error messages to the model. The model can often retry with different parameters or try an alternative approach.
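One way to do this with Anthropic's format is to package every outcome as a tool_result block and flag failures with is_error, so the error text reaches the model instead of crashing your loop. The divide function and the tool_use_id values here are illustrative.

```python
def run_tool(fn, args: dict, tool_use_id: str) -> dict:
    """Execute a tool and package the outcome as an Anthropic-style
    tool_result block; on failure the error message goes back to the
    model so it can retry or change approach."""
    try:
        output = fn(**args)
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": str(output)}
    except Exception as exc:
        return {"type": "tool_result", "tool_use_id": tool_use_id,
                "content": f"Error: {exc}", "is_error": True}

def divide(a: float, b: float) -> float:
    return a / b

ok = run_tool(divide, {"a": 6, "b": 2}, "toolu_01")
bad = run_tool(divide, {"a": 1, "b": 0}, "toolu_02")
# ok["content"] is "3.0"; bad carries is_error plus the error text
```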
Limit Function Scope
Each function should do one thing well. Don't create overly complex functions with many parameters—the model will struggle to use them correctly.
Return Structured Data
Function results should be structured (JSON) rather than free-form text when possible. This helps the model parse and use the data effectively.
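The difference is easy to see side by side (the order data is invented): the prose version forces the model to re-parse numbers out of a sentence, while the JSON version gives it named fields to work from.

```python
import json

# Free-form text: the model must re-extract each value from prose
text_result = "Found 3 orders totalling $117.50 for alice"

# Structured JSON: named, typed fields the model can use directly
structured_result = json.dumps({
    "customer": "alice",
    "order_count": 3,
    "total_usd": 117.50,
})
```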
Security Considerations
Never Trust Model Output Blindly
Always validate and sanitize function arguments. Implement authorization checks before executing sensitive operations.
Implement Rate Limiting
Prevent abuse by limiting how many function calls can be made per conversation or per user.
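A sliding-window limiter is enough for per-conversation limits. This is a sketch, not production code: the class name and the 3-calls-per-60-seconds numbers are arbitrary, and a multi-process deployment would need shared state (e.g. Redis) instead.

```python
import time
from collections import deque

class CallLimiter:
    """Allow at most max_calls tool executions per window seconds,
    tracked with a sliding window of timestamps."""
    def __init__(self, max_calls: int, window: float):
        self.max_calls, self.window = max_calls, window
        self.stamps = deque()

    def allow(self) -> bool:
        now = time.monotonic()
        # Drop timestamps that have aged out of the window
        while self.stamps and now - self.stamps[0] > self.window:
            self.stamps.popleft()
        if len(self.stamps) >= self.max_calls:
            return False  # over the limit; refuse this call
        self.stamps.append(now)
        return True

limiter = CallLimiter(max_calls=3, window=60.0)
results = [limiter.allow() for _ in range(5)]  # → [True, True, True, False, False]
```

You would keep one limiter per conversation (or per user) and check allow() before dispatching each tool call the model requests.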
Audit Function Calls
Log all function calls with parameters and results for security auditing and debugging.
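A decorator keeps the audit concern out of the tool implementations themselves. The logger name and the get_weather example below are illustrative; the point is that every call records the tool name, arguments, and result in one place.

```python
import functools
import json
import logging

logging.basicConfig(level=logging.INFO)
audit_log = logging.getLogger("tool_audit")

def audited(fn):
    """Wrap a tool so every call is logged with its args and result."""
    @functools.wraps(fn)
    def wrapper(**kwargs):
        result = fn(**kwargs)
        audit_log.info("tool=%s args=%s result=%s",
                       fn.__name__, json.dumps(kwargs), json.dumps(result))
        return result
    return wrapper

@audited
def get_weather(city: str) -> dict:
    return {"city": city, "temp_c": 21}  # stubbed
```

Because the wrapper logs after execution, failures surface as exceptions rather than silently missing log lines; logging attempted-but-failed calls too is a reasonable extension.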
Advanced Patterns
Parallel Function Calling
Modern models can request multiple independent function calls simultaneously. Execute them in parallel to reduce latency.
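When a turn contains several independent tool_use blocks, a thread pool executes them concurrently. The two stub tools and the simulated request list below are invented; the pattern of submitting in order and collecting results in the same order is what matters, since each result must be matched back to its request.

```python
from concurrent.futures import ThreadPoolExecutor

def get_weather(city):
    return {"city": city, "temp_c": 21}      # stubbed

def get_news(topic):
    return {"topic": topic, "headlines": 3}  # stubbed

TOOLS = {"get_weather": get_weather, "get_news": get_news}

# Two independent calls requested in a single model turn (simulated)
requested = [
    {"name": "get_weather", "input": {"city": "Paris"}},
    {"name": "get_news", "input": {"topic": "markets"}},
]

with ThreadPoolExecutor() as pool:
    futures = [pool.submit(TOOLS[r["name"]], **r["input"]) for r in requested]
    results = [f.result() for f in futures]  # order matches the requests
```

Threads suit I/O-bound tools (HTTP calls, database queries); each result is then sent back as its own tool_result in the follow-up message.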
Conditional Tool Availability
Don't always provide all tools. Based on conversation context or user permissions, dynamically adjust which tools are available to the model.
Function Calling Chains
Design workflows where one function's output becomes another's input. The model can orchestrate complex multi-step operations automatically.
Testing Function Calling Agents
Create comprehensive test suites covering:
- Correct function selection for various queries
- Proper argument extraction and formatting
- Error handling when functions fail
- Behavior when no appropriate function exists
- Multi-step workflows requiring multiple function calls
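For the first and fourth items, a deterministic stand-in for the model keeps tests fast and repeatable. The keyword router below is a deliberately crude fake, not how the model actually selects tools; real suites would instead replay recorded model outputs or call the live API in a slower integration tier.

```python
# A tiny deterministic "router" stands in for the model so tool-selection
# behavior can be asserted without network calls. Tool names are invented.
def route(query):
    q = query.lower()
    if "weather" in q:
        return "get_weather"
    if "price" in q or "stock" in q:
        return "get_stock_price"
    return None  # no appropriate tool exists

assert route("What's the weather in Oslo?") == "get_weather"
assert route("AAPL stock price today") == "get_stock_price"
assert route("Tell me a joke") is None
```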
This article was generated with the assistance of AI technology and reviewed for accuracy and relevance.