# 🦙 Llama Deploy 🤖
Llama Deploy (formerly `llama-agents`) is an async-first framework for deploying, scaling, and productionizing agentic multi-service systems based on workflows from `llama_index`.
With Llama Deploy, you can build any number of workflows in `llama_index` and then run them as services, accessible through an HTTP API by a user interface or by other services that are part of your system.
The goal of Llama Deploy is to easily transition something that you built in a notebook to something running on the cloud with the minimum amount of changes to the original code, possibly zero. To make this transition a pleasant one, the intrinsic complexity of running agents as services is managed by a component called the API Server, the only user-facing component in Llama Deploy. You can interact with the API Server in two ways:
- Using the `llamactl` CLI from a shell.
- Through the Llama Deploy SDK from a Python application or script (a minimal sketch follows below).
Both the SDK and the CLI are distributed with the Llama Deploy Python package, so batteries are included.
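As a rough sketch of the SDK path, something like the following runs a previously deployed workflow through the control plane. The workflow name `echo_workflow` is a placeholder, and the exact client API differs across Llama Deploy versions; this follows the older session-based interface:

```python
from llama_deploy import ControlPlaneConfig, LlamaDeployClient

# Connect to a running Llama Deploy control plane
# (by default it listens on http://127.0.0.1:8000).
client = LlamaDeployClient(ControlPlaneConfig())

# Sessions group related task runs; create one and run a workflow by name.
session = client.create_session()
result = session.run("echo_workflow", message="Hello from the SDK!")
print(result)
```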
The overall system layout is pictured below.
## Why Llama Deploy?
- Seamless Deployment: It bridges the gap between development and production, allowing you to deploy `llama_index` workflows with minimal changes to your code.
- Scalability: The microservices architecture enables easy scaling of individual components as your system grows.
- Flexibility: By using a hub-and-spoke architecture, you can easily swap out components (like message queues) or add new services without disrupting the entire system.
- Fault Tolerance: With built-in retry mechanisms and failure handling, Llama Deploy adds robustness in production environments.
- State Management: The control plane manages state across services, simplifying complex multi-step processes.
- Async-First: Designed for high-concurrency scenarios, making it suitable for real-time and high-throughput applications.
## Wait, where is `llama-agents`?
The introduction of Workflows in `llama_index` turned out to be the most intuitive way for our users to develop agentic applications. While we keep adding features to `llama_index` to support agentic applications, Llama Deploy focuses on closing the gap between local development and remote execution of agents as services.
## Installation
`llama_deploy` can be installed with pip, and includes the API Server Python SDK and `llamactl`:

```
pip install llama_deploy
```
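Once installed, a `llama_index` workflow can be served with minimal changes to the code you would run locally. The sketch below uses the `deploy_workflow` helper; the `EchoWorkflow` class, the service name `echo_workflow`, and the port are illustrative placeholders, and it assumes a control plane is already running on the default local address:

```python
import asyncio

from llama_deploy import ControlPlaneConfig, WorkflowServiceConfig, deploy_workflow
from llama_index.core.workflow import StartEvent, StopEvent, Workflow, step


class EchoWorkflow(Workflow):
    """A trivial workflow that echoes back the message it receives."""

    @step()
    async def echo(self, ev: StartEvent) -> StopEvent:
        return StopEvent(result=f"Message received: {ev.message}")


async def main() -> None:
    # Register the workflow with the control plane and run it as a service.
    await deploy_workflow(
        EchoWorkflow(),
        WorkflowServiceConfig(
            host="127.0.0.1", port=8002, service_name="echo_workflow"
        ),
        ControlPlaneConfig(),
    )


if __name__ == "__main__":
    asyncio.run(main())
```

With the service running, the same workflow can then be invoked by name from `llamactl` or the SDK, as shown earlier.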