
Troubleshoot OpenRAG

This page provides troubleshooting advice for issues you might encounter when using OpenRAG or contributing to OpenRAG.

OpenRAG installation fails with unable to get local issuer certificate

If you are installing OpenRAG on macOS and the installation fails with an unable to get local issuer certificate error, run the following command, and then retry the installation:

open "/Applications/Python VERSION/Install Certificates.command"

Replace VERSION with your installed Python version, such as 3.12.
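
If you aren't sure which Python version is installed, you can check it first, for example:

python3 --version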

No container runtime found

When you start the TUI, a No container runtime found error indicates that OpenRAG cannot find a running Docker or Podman machine.

Make sure Docker or Podman is installed, available in the PATH, and a VM is running.
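
For example, you can confirm that a runtime is available and that its VM is running with whichever of the following commands matches your setup:

docker info
podman machine list

If the Podman machine exists but isn't running, start it with podman machine start.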

Container out of memory errors

If you encounter container memory errors, try the following:

  • Increase your Podman or Docker VM's allocated memory.

    If you're using Podman on macOS, you might need to increase the memory allocated to your Podman machine. For example, the following commands stop and remove the existing machine, create a new one with 8 GB of RAM (the minimum recommended for OpenRAG), and then start it:

    podman machine stop
    podman machine rm
    podman machine init --memory 8192 # 8 GB example
    podman machine start

    You must also restart your OpenRAG services after increasing the container VM's memory.

  • Use a CPU-only deployment to reduce memory usage.

    For TUI-managed deployments, you can enable CPU mode on the TUI's Status page.

    For self-managed deployments, use the docker-compose.yml file, which doesn't include GPU overrides.
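
    For example, a minimal sketch for starting a CPU-only self-managed deployment, assuming you run it from the directory that contains OpenRAG's docker-compose.yml file:

    docker compose -f docker-compose.yml up -d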

OpenSearch fails to start

Check that the value of the OPENSEARCH_PASSWORD environment variable meets the OpenSearch password complexity requirements.
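
For example, a minimal .env entry. The exact complexity rules depend on your OpenSearch configuration, but they typically require at least 8 characters with uppercase, lowercase, numeric, and special characters:

# Illustrative value only; choose your own strong password
OPENSEARCH_PASSWORD=Sample-Passw0rd!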

If you need to change the password, you must reset the OpenRAG services.

OpenRAG fails to start from the TUI with operation not supported

This error occurs when starting OpenRAG with the TUI in WSL (Windows Subsystem for Linux).

The error occurs because OpenRAG is running within a WSL environment, so webbrowser.open() can't launch a browser automatically.

To access the OpenRAG application, open a web browser and enter http://localhost:3000 in the address bar.
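
To confirm that the frontend is reachable from within WSL, you can run, for example:

curl -I http://localhost:3000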

Application onboarding gets stuck or fails

If the application onboarding process hangs for a long time or fails for an unspecified reason, try the following:

  • Make sure you have enough free space (more than 50 GB) available for the temporary storage required for document ingestion.
  • Clear your browser's cache.
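
For example, to check free disk space on Linux or macOS:

df -h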

Langflow connection issues

Verify that the value of the LANGFLOW_SUPERUSER environment variable is correct. For more information about this variable and how this variable controls Langflow access, see Langflow settings.
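
To quickly check the value used by your deployment, you can inspect your OpenRAG .env file, for example:

grep LANGFLOW_SUPERUSER .env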

Port conflicts

By default, OpenRAG requires the following ports to be available on the host machine:

  • 3000: OpenRAG frontend
  • 7860: Langflow service
  • 8000: OpenRAG backend
  • 9200: OpenSearch service
  • 5601: OpenSearch dashboards
  • 5001: Docling service

If the default ports for the OpenRAG frontend (3000) or Langflow (7860) are already in use on your host machine, you can set the FRONTEND_PORT or LANGFLOW_PORT environment variables to map these services to different host ports. If you set LANGFLOW_PORT, you must also set LANGFLOW_PUBLIC_URL to use the new port. To apply the port mapping, restart the containers after editing your OpenRAG .env file.
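
For example, a minimal sketch of the relevant .env entries; the port values are illustrative, and the exact LANGFLOW_PUBLIC_URL depends on your host:

# Illustrative values; pick any free host ports
FRONTEND_PORT=3001
LANGFLOW_PORT=7861
LANGFLOW_PUBLIC_URL=http://localhost:7861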

Upgrade fails due to Langflow container already exists

If you encounter a langflow container already exists error when upgrading OpenRAG, this typically means you upgraded OpenRAG with uv, but you didn't remove or upgrade containers from a previous installation.

To resolve this issue, do the following:

  1. Remove only the Langflow container:

    1. Stop the Langflow container:

      Docker:
      docker stop langflow

      Podman:
      podman stop langflow

    2. Remove the Langflow container:

      Docker:
      docker rm langflow --force

      Podman:
      podman rm langflow --force
  2. Retry the upgrade.

  3. If removing and recreating the Langflow container doesn't resolve the issue, then you must reset all containers or reinstall OpenRAG.

  4. Retry the upgrade.

    If no updates are available after reinstalling OpenRAG, then you already have the latest version, and your deployment is up to date.

Model availability and performance issues

The following issues relate to the models you can use with OpenRAG and potential performance issues, such as malformed responses and excessive hallucinations.

See also Chat issues.

Language model isn't listed in OpenRAG settings or application onboarding

If your language model isn't listed in the OpenRAG settings or application onboarding, then the model likely doesn't support tool calling, which is required for OpenRAG. You must select a different model. If no other models are listed, make sure your model provider API key or instance has access to models that support tool calling.

You can submit an OpenRAG GitHub issue to request support for specific models.

IBM watsonx.ai model issues

OpenRAG isn't guaranteed to be compatible with all models that are available through IBM watsonx.ai.

Language models must support tool calling to be compatible with OpenRAG. Incompatible models aren't listed in OpenRAG's settings or onboarding.

Additionally, models must be able to handle the agentic reasoning tasks required by OpenRAG. Models that are too small or not designed for agentic RAG tasks can return low quality, incorrect, or improperly formatted responses. For more information, see Chat issues.

You can submit an OpenRAG GitHub issue to request support for specific models.

Ollama model issues

OpenRAG isn't guaranteed to be compatible with all models that are available through Ollama. Some models might produce unexpected results, such as JSON-formatted output instead of natural language responses, and some models aren't appropriate for the types of tasks that OpenRAG performs, such as those that generate media.

  • Language models: Ollama-hosted language models must support tool calling to be compatible with OpenRAG. The OpenRAG team recommends gpt-oss:20b or mistral-nemo:12b. If you choose gpt-oss:20b, consider using Ollama Cloud or running Ollama on a remote machine because this model requires at least 16 GB of RAM.

  • Embedding models: The OpenRAG team recommends nomic-embed-text:latest, mxbai-embed-large:latest, or embeddinggemma:latest.

You can experiment with other models, but if you encounter issues that you are unable to resolve through other RAG best practices (like context filters and prompt engineering), try switching to one of the recommended models. You can submit an OpenRAG GitHub issue to request support for specific models.
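
For example, assuming a local Ollama installation, you can pull the recommended models ahead of time:

ollama pull mistral-nemo:12b
ollama pull nomic-embed-text:latest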

Document ingestion or similarity search issues

The following issues can occur during document ingestion or as a result of suboptimal ingestion.

Failed or slow ingestion

If an ingestion task fails, try the following:

  • Make sure you ingest only supported file types.
  • Split very large files into smaller files.
  • Remove unusual or complex embedded content, such as videos or animations. Although Docling can replace some non-text content with placeholders during ingestion, some embedded content might cause errors.
  • Make sure your Podman/Docker VM has sufficient memory and temporary storage for the ingestion tasks. The minimum recommendation is 8 GB of RAM and at least 50 GB of free disk space, but the exact requirements depend on the size and complexity of your documents. If you regularly ingest large files, more RAM and space are recommended.
  • If OCR ingestion fails due to OCR missing, see OCR ingestion fails (easyocr not installed).

Timeouts when ingesting many files or very large files

Ingesting very large PDFs (more than 300 pages) and folders with many documents can take 30 minutes or more.

Make sure your container VM has sufficient memory for processing large files and folders.

If you experience timeouts during ingestion, edit the following environment variables in your OpenRAG .env file, as shown in the example after this list:

  • LANGFLOW_TIMEOUT
  • LANGFLOW_CONNECT_TIMEOUT
  • INGESTION_TIMEOUT
  • UPLOAD_BATCH_SIZE
  • MAX_WORKERS
  • LANGFLOW_WORKERS
  • DOCLING_WORKERS
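
For example, a minimal sketch of these entries in an OpenRAG .env file. The values are illustrative only; tune them for your document sizes and hardware:

# Illustrative values; adjust for your workload
LANGFLOW_TIMEOUT=600
LANGFLOW_CONNECT_TIMEOUT=60
INGESTION_TIMEOUT=1800
UPLOAD_BATCH_SIZE=5
MAX_WORKERS=4
LANGFLOW_WORKERS=2
DOCLING_WORKERS=2

Restart the containers after editing the file so the new values take effect.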

OCR ingestion fails (easyocr not installed)

Docling ingestion can fail with an OCR-related error that mentions easyocr is missing. This is likely due to a stale uv cache when you install OpenRAG with uvx.

When you invoke OpenRAG with uvx openrag, uvx creates a cached, ephemeral environment that doesn't modify your project. The location of this cache depends on your operating system. For example, on Linux it is typically ~/.cache/uv, and on macOS it is typically ~/Library/Caches/uv.

This cache can become stale, producing errors such as missing dependencies. To clear the cache and restart OpenRAG, do the following:

  1. If the TUI is open, press q to exit the TUI.

  2. Clear the uv cache:

    uv cache clean

    Or clear only the OpenRAG cache:

    uv cache clean openrag
  3. Invoke OpenRAG to restart the TUI:

    uvx openrag
  4. Click Launch OpenRAG, and then retry document ingestion.

If you install OpenRAG with uv, dependencies are synced directly from your pyproject.toml file. This should automatically install easyocr because easyocr is included as a dependency in OpenRAG's pyproject.toml.

If you don't need OCR, you can disable OCR-based processing in your ingestion settings to avoid requiring easyocr.

Problems when referencing documents in chat

If the OpenRAG Chat doesn't seem to use your documents correctly, browse your knowledge base to confirm that the documents are uploaded in full, and the chunks are correct.

If the documents are present and well-formed, check your knowledge filters. If you applied a filter to the chat, make sure the expected documents aren't excluded by the filter settings. You can test this by applying the filter when you browse the knowledge base. If the filter excludes any documents, the agent cannot access those documents. Be aware that some settings create dynamic filters that don't always produce the same results, such as a Search query combined with a low Response limit.

If the document chunks have missing, incorrect, or unexpected text, you must delete the documents from your knowledge base, modify the ingestion settings or the documents themselves, and then reingest the documents. For example:

  • Break combined documents into separate files for better metadata context.
  • Make sure scanned documents are legible enough for extraction, and enable the OCR option. Poorly scanned documents might require additional preparation or rescanning before ingestion.
  • Adjust the Chunk size and Chunk overlap settings to better suit your documents. Larger chunks provide more context but can include irrelevant information, while smaller chunks yield more precise semantic search but can lack context.

Chat issues

The following issues can occur when using the OpenRAG Chat feature.

Documents seem to be missing or misinterpreted

In the Chat, click Function Call: search_documents (tool_call) to view the log of tool calls made by the agent. This shows you how the agent used particular tools and its reasoning.

Click Knowledge to confirm that the documents are present in the OpenRAG OpenSearch knowledge base, and then click each document to see how the document was chunked. If a document was chunked improperly, you might need to tweak the ingestion settings or modify and reupload the document.

See also Document ingestion or similarity search issues.

Service is suddenly unavailable when it was working previously

First, verify that the container VM and the OpenRAG services are running and healthy.

Second, make sure there are no issues with the flow configuration. If you edited the OpenRAG OpenSearch Agent flow, use the Restore flow option to revert the flow to its original configuration.

If you want to preserve your customizations, you can export the flow before restoring it.

JSON-formatted responses

If a model returns JSON-formatted output instead of natural language responses, the model might not be designed for agentic reasoning and RAG tasks.

Try a different model.

For more information, see Model availability and performance issues.

Frequent hallucinations

If the model seems to hallucinate frequently, the model might be too small or poorly suited to RAG tasks.

Try a different model.

For more information, see Model availability and performance issues.

Responses are good but could be better

If your model is compatible with OpenRAG, and it isn't exhibiting any obvious errors, then you can use RAG best practices to refine the response quality, such as knowledge filters and prompt engineering.

Cannot detect GPU with Fedora or local Ollama

If your machine has GPU support, but OpenRAG is having problems detecting or using the GPU, you might see errors like the following:

Auto-detected mode as 'legacy'
nvidia-container-cli: ldcache error
error running createRuntime hook
unrecognized runtime 'crun'
OCI runtime error

To troubleshoot these issues, do the following:

  • Make sure your environment has a compatible version of the NVIDIA Container Toolkit.

  • If you are using a local Ollama deployment, Ollama must run in a container with GPU support.
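
For example, to check the installed toolkit version:

nvidia-ctk --version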

The following steps explain how to troubleshoot these issues with Fedora 43, Podman, local Ollama, and a machine that has an NVIDIA GPU:

  1. Make sure your NVIDIA driver is updated to the latest version.

  2. Remove old toolkit packages:

    sudo dnf5 remove -y nvidia-container-toolkit nvidia-container-toolkit-base libnvidia-container*

    This command doesn't remove the NVIDIA driver.

  3. Get the NVIDIA Container Toolkit package repository:

    sudo curl -s -o /etc/yum.repos.d/nvidia-container-toolkit.repo \
    https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo

    Version 1.14 or later is required.

  4. Refresh the package cache to recognize the new repository:

    sudo dnf5 clean expire-cache
    sudo dnf5 update
  5. Install the new toolkit:

    sudo dnf5 install -y nvidia-container-toolkit

    This command installs the following:

    • libnvidia-container version 1.14 or later
    • nvidia-container-toolkit-base version 1.14 or later
    • nvidia-container-toolkit version 1.14 or later

    These versions support Fedora 43, Podman, and crun.

  6. Configure the runtime for Podman (crun):

    sudo nvidia-ctk runtime configure --runtime=crun

    If this command fails, make sure you installed version 1.14 or later of the NVIDIA Container Toolkit, which supports crun.

  7. Fix the dynamic loader cache to prevent ldconfig errors:

    sudo ldconfig
  8. Restart the Podman socket:

    systemctl --user enable --now podman.socket
  9. Test the NVIDIA hook. The output must be empty; in particular, it must not include Auto-detected mode as 'legacy'.

    /usr/bin/nvidia-container-runtime-hook prestart
  10. Test the GPU with Podman:

    podman run --rm \
    --hooks-dir=/usr/share/containers/oci/hooks.d \
    --device nvidia.com/gpu=all \
    nvidia/cuda:12.3.0-base-ubuntu22.04 nvidia-smi

    Make sure nvidia-smi prints your GPU information instead of an error.

    The --rm argument automatically removes the container after the command exits.

  11. Run Ollama in a GPU-enabled container:

    podman run -d \
    --name ollama \
    --security-opt=label=disable \
    --hooks-dir=/usr/share/containers/oci/hooks.d \
    --device nvidia.com/gpu=all \
    -p 11434:11434 \
    -v $HOME/ollama:/root/.ollama \
    docker.io/ollama/ollama:latest

  12. Start the TUI, and then start OpenRAG with GPU mode enabled. Or, for self-managed deployments, deploy OpenRAG with the docker-compose-gpu.yml file.
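
    For example, a minimal sketch for a self-managed GPU deployment, assuming Podman Compose and the docker-compose-gpu.yml file in your current directory:

    podman compose -f docker-compose-gpu.yml up -d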