Configure ingestion

The knowledge ingestion settings determine how documents are processed when you upload documents into your knowledge base. This includes chunking strategies, embedding models, and image handling settings.

Existing documents aren't reprocessed after changing ingestion settings

Changes to knowledge ingestion settings only apply to documents that you upload after making changes. Documents uploaded before changing these settings aren't reprocessed.

After changing knowledge ingestion settings, you must determine if you need to reupload any documents to be consistent with the new settings.

It isn't always necessary to reupload documents after changing knowledge ingestion settings. For example, it is common to upload some documents with OCR enabled and others without it.

Select a Docling implementation

OpenRAG uses Docling for document ingestion. Docling processes files, splits them into chunks, and stores them as separate, structured documents in your OpenSearch knowledge base.

You can configure OpenRAG to use either a Docling Serve service or a Docling processor pipeline for document processing:

  • Docling Serve ingestion: By default, OpenRAG uses Docling Serve. When you start OpenRAG, a local docling serve process starts, and then OpenRAG runs Docling ingestion through the Docling Serve API.

    To use a remote docling serve instance or your own local instance, set DOCLING_SERVE_URL=http://HOST_IP:5001 in your OpenRAG .env file. The service must run on port 5001.

    For terminal-managed deployments, the OpenRAG terminal session can alert you if docling serve isn't running or isn't detected by OpenRAG. You can also use the terminal's Show status option to inspect and manage the OpenRAG services.

  • Docling processor ingestion: Instead of using a separate Docling Serve service and the Docling Serve API, you can use the Docling processor directly. To do this, set DISABLE_INGEST_WITH_LANGFLOW=true in your OpenRAG .env file, and then restart the OpenRAG services. For the underlying functionality for this option, see processors.py in the OpenRAG repository.
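The two ingestion options above correspond to settings in the OpenRAG `.env` file. A minimal sketch (the host IP is a placeholder; use one option or the other, not both):

```bash
# Option 1: point OpenRAG at a remote or self-managed Docling Serve instance.
# The service must listen on port 5001.
DOCLING_SERVE_URL=http://192.168.1.50:5001

# Option 2: bypass Docling Serve and use the Docling processor directly.
# Restart the OpenRAG services after setting this.
DISABLE_INGEST_WITH_LANGFLOW=true
```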

Set the embedding model and dimensions

When you install OpenRAG, you select at least one embedding model during the application onboarding process. OpenRAG automatically detects and configures the appropriate vector dimensions for your selected embedding model, ensuring optimal search performance and compatibility.

After onboarding, you can change the embedding model on the OpenRAG Settings page. OpenRAG automatically updates all relevant OpenRAG flows to use the selected embedding model and dimensions when ingesting documents.

You can only select models that are available from your configured model providers, such as an Ollama instance or an OpenAI account. For more information, see Troubleshoot model availability and performance issues.

Because the OpenRAG UI validates model availability and compatibility, setting models directly in the OpenRAG .env file isn't recommended.

Documents aren't reingested after changing the embedding model

OpenRAG doesn't reprocess existing documents when you change, add, or remove embedding models.

If you want to generate new embeddings for existing documents, you must reupload the documents after enabling the new embedding model.

To remove previously generated embeddings, you must remove the existing documents from the knowledge base.

Best practices for multiple embedding models

  • Make sure the active embedding model is correct before ingesting documents: For ingestion, OpenRAG allows only one active embedding model at a time. The active model is set on the Settings page.

  • Chat searches take longer with multiple embedding models: OpenRAG Chat can use multiple embedding models, regardless of the active model for ingestion. If your OpenRAG knowledge base has documents with embeddings from different embedding models, the agent runs a separate similarity search against each model's embeddings, and then prepares a unified set of search results. Be aware that this can increase processing time for chat responses.

  • Use filters to exclude unrelated models: You can create filters to separate documents that were embedded with different models. For example, if you use a specific model for financial reports, and a different model for corporate policy documents, you can use a filter to exclude the financial reports when chatting about corporate policies. This can save time and improve the relevance of search results because the agent doesn't need to search both sets of embeddings and then unify them into one set of results.

  • Unavailable embedding models can cause errors: Errors can occur in Chat if an embedding model is unavailable and your knowledge base contains documents with embeddings generated by that model. Similarly, errors can occur during ingestion if the active embedding model (on the Settings page) is unavailable.

Models become unavailable when you remove an embedding model or provider from the OpenRAG Settings page, your model provider credentials are expired or invalid, or there are network issues preventing access to the model.

If you remove a model or provider that you no longer want to use, it is recommended that you also remove the documents that were embedded with the now-unavailable model. If you don't remove or filter these documents, Chat responses can fail or return incomplete results because the agent cannot access the model to generate a query embedding for comparison with the document embeddings.
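The model-based filtering idea above can be sketched as an OpenSearch query body that combines a k-NN similarity clause with a metadata filter. This is an illustration only: the `embedding_model` metadata field and the `embedding` vector field are assumptions, not OpenRAG's actual index schema.

```python
def build_filtered_query(query_vector, model_name, k=10):
    """Build an OpenSearch query body that restricts a vector search to
    documents embedded with a single model.

    Field names ("embedding", "embedding_model") are hypothetical --
    check your OpenSearch index mapping for the actual names.
    """
    return {
        "size": k,
        "query": {
            "bool": {
                # Vector similarity search against the document embeddings.
                "must": [
                    {"knn": {"embedding": {"vector": query_vector, "k": k}}}
                ],
                # Exclude documents embedded with any other model.
                "filter": [
                    {"term": {"embedding_model": model_name}}
                ],
            }
        },
    }

body = build_filtered_query([0.1, 0.2, 0.3], "text-embedding-3-small")
```

Pass a body like this to your OpenSearch client's `search` call. Filtering before unifying results avoids the cross-model merge step described above.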

Set the chunking strategy

You can edit the following settings on the OpenRAG Settings page in the Knowledge Ingest section:

  • Chunk size: Set the number of characters for each text chunk when breaking down a file. Larger chunks yield more context per chunk, but can include irrelevant information. Smaller chunks yield more precise semantic search, but can lack context. The default value is 1000 characters, which is usually a good balance between context and precision.

  • Chunk overlap: Set the number of characters that adjacent chunks share across chunk boundaries. Use larger overlap values for documents where preserving context is most important. Use smaller overlap values for simpler documents or when processing speed is most important. The default value is 200 characters, which is an overlap of 20 percent with the default Chunk size of 1000. This is suitable for general use. For faster processing, decrease the overlap to approximately 10 percent. For complex documents where you need to preserve context across chunks, increase it to approximately 40 percent.
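To illustrate how chunk size and overlap interact, here is a minimal fixed-size chunker. This is a sketch only; OpenRAG's actual chunking is performed by Docling and may differ in detail.

```python
def chunk_text(text: str, chunk_size: int = 1000, overlap: int = 200):
    """Split text into fixed-size chunks whose boundaries overlap.

    Each chunk starts (chunk_size - overlap) characters after the
    previous one, so the last `overlap` characters of one chunk are
    repeated at the start of the next.
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# With the defaults (1000 chars, 200 overlap), a 2500-character document
# yields chunks starting at offsets 0, 800, 1600, and 2400.
text = "".join(str(i % 10) for i in range(2500))
chunks = chunk_text(text)
```

Larger overlap values repeat more text across chunks, which preserves context but increases the total number of chunks to embed and store.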

Configure table parsing

You can edit the following setting on the OpenRAG Settings page in the Knowledge Ingest section:

  • Table structure: Enables Docling's DocumentConverter tool for parsing tables. Instead of treating tables as plain text, tables are output as structured table data with preserved relationships and metadata. This option is enabled by default.

Configure OCR and image processing

You can edit the following settings on the OpenRAG Settings page in the Knowledge Ingest section:

  • OCR: Enables Optical Character Recognition (OCR) processing to extract text from images and scanned documents during ingestion. Leave this option disabled to process text-based documents faster with Docling's DocumentConverter; in that case, images are ignored and not processed.

    This option is disabled by default. Enabling OCR can slow ingestion performance.

    If OpenRAG detects that the local machine is running on macOS, OpenRAG uses the ocrmac OCR engine. Other platforms use easyocr.

  • Picture descriptions: Only applicable if OCR is enabled. Adds image descriptions generated by the SmolVLM-256M-Instruct model. Enabling picture descriptions can slow ingestion performance.

Set the local documents path

The local documents path is set when you install OpenRAG and start the OpenRAG services.

The default path for local uploads is ~/.openrag/documents. This is mounted to the /app/openrag-documents/ directory inside the OpenRAG container. Files added to the host or container directory are visible in both locations.
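For compose-managed deployments, this mapping typically appears as a volume mount. The following excerpt is a hypothetical sketch; the service name and compose file layout are assumptions, so adjust it to match your deployment.

```yaml
# Hypothetical excerpt from an OpenRAG compose file.
services:
  openrag:
    volumes:
      # host path : container path
      - ~/.openrag/documents:/app/openrag-documents
```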

To change this location, modify either of the following, and then restart the OpenRAG services:

See also