Install OpenRAG in a Python project with uv
Use uv to install OpenRAG as a managed or unmanaged dependency in a new or existing Python project.
When you install OpenRAG with uv, you will use an OpenRAG terminal session to configure and manage your OpenRAG deployment.
For other installation methods, see Select an installation method.
Prerequisites
- For Microsoft Windows, you must use the Windows Subsystem for Linux (WSL). See Install OpenRAG on Windows before proceeding.
- Install Python version 3.13.
- Install uv.
- A Docker or Podman VM must be running before you start OpenRAG, unless you are running OpenRAG on a Linux-based VM, such as a Linux-based WSL image. The container VM must have sufficient resources to run the OpenRAG containers: the minimum recommendation is 8 GB of RAM and at least 50 GB of free disk space. For more information, see Troubleshoot OpenRAG.
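As a quick sanity check before starting the containers, you can inspect the host's resources from a Linux or WSL shell. These are standard Linux utilities, not OpenRAG-specific commands:

```shell
# Total RAM in GiB (aim for at least 8 GB):
free -g

# Free disk space on the root filesystem (aim for at least 50 GB
# on whichever filesystem holds your container data):
df -h /
```
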
- Install `podman-compose` or Docker Compose. To use Docker Compose with Podman, you must alias Docker Compose commands to Podman commands.
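One minimal way to route Docker Compose calls to Podman is a shell function, sketched here for `~/.bashrc` (adjust for your shell; this assumes Podman 4.1 or later, where `podman compose` delegates to a compose provider such as `podman-compose`):

```shell
# Route `docker ...` invocations, including `docker compose ...`, to Podman.
docker() {
  podman "$@"
}
```
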
- Gather the credentials and connection details for one or more supported model providers:
- OpenAI: Create an OpenAI API key.
- Anthropic: Create an Anthropic API key. Anthropic provides language models only; you must select an additional provider for embeddings.
- IBM watsonx.ai: Get your watsonx.ai API endpoint, IBM project ID, and IBM API key from your watsonx deployment.
- Ollama: Deploy an Ollama instance and models locally, in the cloud, or on a remote server. Then, get your Ollama server's base URL and the names of the models that you want to use.
OpenRAG requires at least one language model and one embedding model. If a provider offers both types of models, then you can use the same provider for both models. If a provider offers only one type, then you must configure two providers.
Language models must support tool calling to be compatible with OpenRAG.
For more information, see Complete the application onboarding process.
- Optional: Install GPU support with an NVIDIA GPU, CUDA support, and compatible NVIDIA drivers on the OpenRAG host machine. If you don't have GPU capabilities, OpenRAG provides an alternate CPU-only deployment that is suitable for most use cases. The default CPU-only deployment doesn't prevent you from using GPU acceleration in external services, such as Ollama servers.
Install and start OpenRAG with uv
Use `uv add` to install OpenRAG as a managed dependency in a new or existing uv Python project. This command adds OpenRAG to your `pyproject.toml` file and lockfile, which gives you better management of dependencies and the virtual environment than lower-level commands like `uv pip`.
If you encounter errors during installation, see Troubleshoot OpenRAG.
- Create a new uv-managed Python project:

  ```shell
  uv init PROJECT_NAME --python 3.13
  ```

  The `--python` flag ensures that your project uses the minimum required Python version for OpenRAG. You can omit this flag if your system's default Python version is 3.13.

- Change into your new project directory:

  ```shell
  cd PROJECT_NAME
  ```

  Because uv manages the virtual environment for you, you won't see a `(venv)` prompt. `uv` commands automatically use the project's virtual environment.

- Add OpenRAG to your project:

  - Add the latest version:

    ```shell
    uv add openrag
    ```

  - Add a specific version:

    ```shell
    uv add openrag==0.1.30
    ```

  - Add a local wheel:

    ```shell
    uv add path/to/openrag-VERSION-py3-none-any.whl
    ```

  For more options, see Managing dependencies with uv.
- Optional: If you want to use a pre-populated OpenRAG `.env` file, create one at `~/.openrag/tui` before starting OpenRAG.
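For example, a pre-populated `.env` file could be created like this. The variable name shown is illustrative only; use the keys that your OpenRAG version expects:

```shell
mkdir -p ~/.openrag/tui
cat > ~/.openrag/tui/.env <<'EOF'
# Illustrative entry; replace with the real keys your deployment needs.
OPENAI_API_KEY=sk-your-key-here
EOF
```
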
- Start the OpenRAG terminal session:

  ```shell
  uv run openrag
  ```

  Tip: For a GUI-like terminal experience, use the `--tui` flag when starting OpenRAG.
Configure OpenRAG in the terminal
When you install OpenRAG with uv, you manage the OpenRAG services in an OpenRAG terminal session.
The terminal session guides you through the initial configuration process before you start the OpenRAG services.
Your configuration values are stored in an OpenRAG `.env` file that is created automatically at `~/.openrag/tui`. If OpenRAG detects an existing `.env` file in this directory, OpenRAG can populate those values automatically during setup and onboarding. Container definitions are stored in the Docker Compose files in the same directory as the OpenRAG `.env` file.
The first time you install OpenRAG, the terminal session issues a series of prompts to guide you through the initial configuration.
If OpenRAG detects an existing OpenRAG .env file, you might need to select the Reconfigure option in the terminal to access the configuration settings.
- Enter an administrator password for the OpenSearch service. This password is required; if you don't provide one manually, a secure password is generated automatically.
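If you'd rather supply your own password than accept a generated one, a standard way to create a strong random password is `openssl`. The variable name here is illustrative, not an OpenRAG setting:

```shell
# 24 random bytes encode to a 32-character base64 password:
OPENSEARCH_ADMIN_PASSWORD="$(openssl rand -base64 24)"
echo "$OPENSEARCH_ADMIN_PASSWORD"
```
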
- Enter an administrator password for the Langflow service. This password is recommended but optional. If the Langflow password is empty, the Langflow server starts without authentication enabled. For more information, see Langflow settings.
- At the AI providers prompt, you can either enter your model provider credentials now or press Enter to configure these options later during application onboarding. There is no material difference between the two: if you provide a credential now, it can be populated automatically during the application onboarding process when you enable the Use environment API key option.

  OpenRAG's core functionality requires access to language and embedding models. By default, OpenRAG uses OpenAI models, so if you aren't sure which models or providers to use, provide an OpenAI API key to use OpenRAG's default model configuration.
- At the Langfuse tracing prompt, enter Y to enable the Langflow integration with Langfuse; otherwise, enter N. If you enter Y, provide the following Langfuse credentials:

  - Langfuse Secret Key: A secret key for your Langfuse project.
  - Langfuse Public Key: A public key for your Langfuse project.
  - Langfuse Host: Required for self-hosted Langfuse deployments. Leave empty for Langfuse Cloud.
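If you manage these values in the OpenRAG `.env` file instead of typing them at the prompt, the entries might look like the following sketch. The variable names are illustrative (check the keys your OpenRAG version expects); the `pk-lf-`/`sk-lf-` prefixes follow Langfuse's key naming convention:

```shell
# Illustrative .env entries for Langfuse tracing:
LANGFUSE_PUBLIC_KEY=pk-lf-xxxxxxxx
LANGFUSE_SECRET_KEY=sk-lf-xxxxxxxx
# Only for self-hosted Langfuse; leave unset for Langfuse Cloud:
LANGFUSE_HOST=https://langfuse.internal.example.com
```
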
- At the Documents path prompt, accept the default local documents path, or provide a path to a directory where you want OpenRAG to look for documents to ingest into your knowledge base.
- At the Cloud connectors and advanced settings prompt, enter Y if you want to configure any of the following additional settings; otherwise, enter N. You must configure advanced settings if you want to enable either OAuth mode or cloud storage connectors during initial setup:

  - OAuth mode: Controls document ownership and access in your OpenRAG OpenSearch knowledge base. Without OAuth mode, there is no differentiation between users; all users that access your OpenRAG instance can access and manage all uploaded documents.
  - Cloud storage connectors: Enables ingestion of documents from external storage services. You can also configure these connectors later.
- OAuth mode and cloud storage connectors: To configure one or more of these connectors, do the following:

  - Register OpenRAG as an OAuth application in your cloud provider, and then obtain the app's OAuth credentials, such as a client ID and secret key. To enable multiple connectors, you must register an app and generate credentials for each provider.

  - In your OpenRAG terminal session, enter the relevant OAuth credentials at the cloud connector prompts:

    - Google: Enter your Google OAuth Client ID and Google OAuth Client Secret. You can generate these in the Google Cloud Console. For more information, see the Google OAuth client documentation. Providing these Google credentials enables OAuth mode and the Google Drive cloud storage connector.

      Warning: Google is the only supported OAuth provider for OpenRAG. You must enter Google credentials if you want to enable OAuth mode. The Microsoft and Amazon credentials are used only to authorize the cloud storage connectors; OpenRAG doesn't offer OAuth provider integrations for Microsoft or Amazon.

    - Microsoft: For the Microsoft OAuth Client ID and Microsoft OAuth Client Secret, enter Azure application registration credentials for SharePoint and OneDrive. For more information, see the Microsoft Graph OAuth client documentation.

    - Amazon: Enter your AWS Access Key ID and AWS Secret Access Key with access to your S3 instance. For more information, see the AWS documentation on Configuring access to AWS applications.

    These credentials can be populated automatically if OpenRAG detects them in an OpenRAG `.env` file at `~/.openrag/tui`.

  - Register the redirect URIs shown in the terminal in your OAuth apps. The redirect URIs are used for the cloud storage connector webhooks. For Google, the redirect URIs are also used to redirect users back to OpenRAG after they sign in.
- Webhook Base URL: If you entered OAuth credentials, you can set the base address for your OAuth connector endpoints. If set, the OAuth connector webhook URLs are constructed as `WEBHOOK_BASE_URL/connectors/${provider}/webhook`. This option is required to enable automatic ingestion from cloud storage.

- OpenSearch Data Path: Use the default path, or specify the path where you want OpenRAG to create your OpenSearch index.

- Langflow Public URL: Sets the base address for the Langflow web interface, where users interact with the Langflow editor in a browser. You must set this value if you want to run Langflow on a port other than the default (`7860`).
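For instance, with a hypothetical base URL of `https://openrag.example.com`, the webhook URL pattern expands like this (the provider slugs in the loop are placeholders, not OpenRAG's actual identifiers):

```shell
WEBHOOK_BASE_URL="https://openrag.example.com"   # hypothetical public base address

# One webhook URL per configured provider, following the pattern
# WEBHOOK_BASE_URL/connectors/${provider}/webhook:
for provider in google microsoft amazon; do
  echo "${WEBHOOK_BASE_URL}/connectors/${provider}/webhook"
done
```
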
- At the `Start services now?` prompt, press Y to start the OpenRAG services. This process can take some time while OpenRAG pulls and runs the container images. If all services start successfully, the terminal prints a confirmation message: `Services are running.`

  Your configuration and credentials are stored in the OpenRAG `.env` file at `~/.openrag/tui`. If you modified any credentials that were pulled from an existing `.env` file, those values are updated in the `.env` file.
- If the OpenRAG app doesn't launch automatically, select the Open OpenRAG in browser option to launch the OpenRAG app and start the application onboarding process. You can also manually navigate to `localhost:3000` in a browser. If you provided Google OAuth credentials, you must sign in with Google before you are redirected to your OpenRAG instance.
Complete the application onboarding process
The first time you start the OpenRAG application, you must complete the application onboarding process to select language and embedding models that are essential for OpenRAG features like the Chat.
To complete onboarding, you must configure at least one language model and one embedding model.
You can use different providers for your language model and embedding model, such as Anthropic for the language model and OpenAI for the embedding model. Additionally, you can select multiple embedding models.
- Anthropic
- IBM watsonx.ai
- Ollama
- OpenAI (default)
Anthropic doesn't provide embedding models. If you select Anthropic for your language model, you must select a different provider for the embedding model.
- Enter your Anthropic API key, or enable Use environment API key to pull the key from your OpenRAG `.env` file.
- Under Advanced settings, select the language model that you want to use. Language models must support tool calling to be compatible with OpenRAG; incompatible models aren't listed.
- Click Complete.
- Select a provider for embeddings, provide the required information, and then select the embedding model that you want to use. For information about another provider's credentials and settings, see the instructions for that provider.
- Click Complete.
After you configure the embedding model, OpenRAG uses your credentials and models to ingest some initial documents. This tests the connection, and it allows you to ask OpenRAG about itself in the Chat. If there is a problem with the model configuration, an error occurs and you are redirected back to the application onboarding screen. Verify that the credential is valid and has access to the selected model, and then click Complete to retry ingestion.
- Continue through the overview slides for a brief introduction to OpenRAG, or click Skip overview. The overview demonstrates some basic functionality that is covered in the quickstart and in other parts of the OpenRAG documentation.
OpenRAG isn't guaranteed to be compatible with all models that are available through IBM watsonx.ai.
Language models must support tool calling to be compatible with OpenRAG. Incompatible models aren't listed in OpenRAG's settings or onboarding.
Additionally, models must be able to handle the agentic reasoning tasks required by OpenRAG. Models that are too small or not designed for agentic RAG tasks can return low quality, incorrect, or improperly formatted responses. For more information, see Chat issues.
You can submit an OpenRAG GitHub issue to request support for specific models.
- For watsonx.ai API Endpoint, select the base URL for your watsonx.ai model deployment.
- Enter your watsonx.ai deployment's project ID and API key. You can enable Use environment API key to pull the key from your OpenRAG `.env` file.
- Under Advanced settings, select the language model that you want to use. Language models must support tool calling to be compatible with OpenRAG; incompatible models aren't listed.
- Click Complete.
- Select a provider for embeddings, provide the required information, and then select the embedding model that you want to use. For information about another provider's credentials and settings, see the instructions for that provider.
- Click Complete.
After you configure the embedding model, OpenRAG uses your credentials and models to ingest some initial documents. This tests the connection, and it allows you to ask OpenRAG about itself in the Chat. If there is a problem with the model configuration, an error occurs and you are redirected back to the application onboarding screen. Verify that the credentials are valid and have access to the selected model, and then click Complete to retry ingestion.
- Continue through the overview slides for a brief introduction to OpenRAG, or click Skip overview. The overview demonstrates some basic functionality that is covered in the quickstart and in other parts of the OpenRAG documentation.
Using Ollama as your language and embedding model provider offers greater flexibility and configuration options for hosting models. However, it requires additional setup because Ollama isn't included with OpenRAG. You must deploy Ollama separately if you want to use Ollama as a model provider.
OpenRAG isn't guaranteed to be compatible with all models that are available through Ollama. Some models might produce unexpected results, such as JSON-formatted output instead of natural language responses, and some models aren't appropriate for the types of tasks that OpenRAG performs, such as those that generate media.
- Language models: Ollama-hosted language models must support tool calling to be compatible with OpenRAG. The OpenRAG team recommends `gpt-oss:20b` or `mistral-nemo:12b`. If you choose `gpt-oss:20b`, consider using Ollama Cloud or running Ollama on a remote machine because this model requires at least 16 GB of RAM.
- Embedding models: The OpenRAG team recommends `nomic-embed-text:latest`, `mxbai-embed-large:latest`, or `embeddinggemma:latest`.

You can experiment with other models, but if you encounter issues that you can't resolve through other RAG best practices (like context filters and prompt engineering), try switching to one of the recommended models. You can submit an OpenRAG GitHub issue to request support for specific models.
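On the Ollama side, pulling one recommended model of each type looks like the following sketch (guarded so it only runs where the `ollama` CLI is installed):

```shell
LANGUAGE_MODEL="mistral-nemo:12b"        # or gpt-oss:20b (needs at least 16 GB of RAM)
EMBEDDING_MODEL="nomic-embed-text:latest"

# Download the models onto the Ollama host:
if command -v ollama >/dev/null 2>&1; then
  ollama pull "$LANGUAGE_MODEL"
  ollama pull "$EMBEDDING_MODEL"
fi
```
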
- Install Ollama locally or on a remote server, or run models in Ollama Cloud. If you are running a remote server, it must be accessible from your OpenRAG deployment.

- In the OpenRAG onboarding dialog, enter your Ollama server's base URL:

  - Local Ollama server: Enter your Ollama server's base URL and port. The default Ollama server address is `http://localhost:11434`.
  - Ollama Cloud: Because Ollama Cloud models run at the same address as a local Ollama server and automatically offload to Ollama's cloud service, you can use the same base URL and port as you would for a local Ollama server. The default address is `http://localhost:11434`.
  - Remote server: Enter your remote Ollama server's base URL and port, such as `http://your-remote-server:11434`.
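Before onboarding, you can confirm that the base URL you plan to enter is reachable; Ollama's REST API lists the models on the server at `/api/tags`:

```shell
OLLAMA_BASE_URL="http://localhost:11434"   # adjust for a remote server

# Prints the server's available models as JSON, or a note if unreachable:
curl -s "${OLLAMA_BASE_URL}/api/tags" \
  || echo "Ollama server not reachable at ${OLLAMA_BASE_URL}"
```
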
- Select a language model that your Ollama server is running. OpenRAG lists only language models that support tool calling. If your server isn't running any compatible language models, you must either deploy a compatible language model on your Ollama server or use another provider for the language model.

  Language model and embedding model selections are independent: you can use the same server or different servers for each model. To use different providers for each model, you must configure both providers and select the relevant model for each provider.

- Click Complete.
- Select a provider for embeddings, provide the required information, and then select the embedding model that you want to use. For information about another provider's credentials and settings, see the instructions for that provider.
- Click Complete.
After you configure the embedding model, OpenRAG uses your credentials and models to ingest some initial documents. This tests the connection, and it allows you to ask OpenRAG about itself in the Chat. If there is a problem with the model configuration, an error occurs and you are redirected back to the application onboarding screen. Verify that the server address is valid, and that the selected model is running on the server. Then, click Complete to retry ingestion.
- Continue through the overview slides for a brief introduction to OpenRAG, or click Skip overview. The overview demonstrates some basic functionality that is covered in the quickstart and in other parts of the OpenRAG documentation.
- Enter your OpenAI API key, or enable Use environment API key to pull the key from your OpenRAG `.env` file.
- Under Advanced settings, select the language model that you want to use. Language models must support tool calling to be compatible with OpenRAG; incompatible models aren't listed.
- Click Complete.
- Select a provider for embeddings, provide the required information, and then select the embedding model that you want to use. For information about another provider's credentials and settings, see the instructions for that provider.
- Click Complete.
After you configure the embedding model, OpenRAG uses your credentials and models to ingest some initial documents. This tests the connection, and it allows you to ask OpenRAG about itself in the Chat. If there is a problem with the model configuration, an error occurs and you are redirected back to the application onboarding screen. Verify that the credential is valid and has access to the selected model, and then click Complete to retry ingestion.
- Continue through the overview slides for a brief introduction to OpenRAG, or click Skip overview. The overview demonstrates some basic functionality that is covered in the quickstart and in other parts of the OpenRAG documentation.
Next steps
- Try some of OpenRAG's core features in the quickstart.
- Learn how to manage OpenRAG services.
- Upload documents, and then use the Chat to explore your data.