Inject retrieved context into a prompt
Provide retrieved documents as additional context to a language model so it can generate responses grounded in your own data.
This is the final step in a retrieval-augmented generation (RAG) workflow: the model receives both the user instruction and the retrieved context in the same prompt.
When this is useful
Inject retrieved context when you want to:
- Answer questions based on your own documents
- Ground model responses in specific source material
- Reduce hallucinations by providing explicit context
- Build RAG workflows for question answering or summarization
If you only need generic model responses without external context, see Prompt a model instead.
How context injection works in KNIME
In KNIME, retrieved documents are passed to the model as plain text.
There is no special “context channel”: you explicitly insert the retrieved content into the prompt text alongside the user instruction.
Typically, this involves:
- retrieving the most relevant documents from a vector store
- formatting them into a single text block
- combining that text with the user’s question or instruction
The model then uses both the instruction and the provided context to generate its response.
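The three steps above can be pictured in plain Python. This is only a sketch: in KNIME each step is a node, and the function names here (`retrieve`, `format_context`, `build_prompt`) are illustrative, not KNIME APIs.

```python
def retrieve(query, documents, k=2):
    """Toy retriever: rank documents by word overlap with the query.
    A real workflow would use a vector store and embeddings instead."""
    words = set(query.lower().split())
    scored = sorted(documents,
                    key=lambda d: len(words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def format_context(docs):
    """Join retrieved documents into one text block with separators."""
    return "\n---\n".join(docs)

def build_prompt(context, question):
    """Combine the context block with the user instruction."""
    return (f"Use the following context to answer the question.\n\n"
            f"Context:\n{context}\n\nQuestion:\n{question}")

docs = ["KNIME workflows are built from nodes.",
        "Vector stores hold document embeddings.",
        "Coffee is best brewed at 93 degrees."]
question = "What are KNIME workflows?"
prompt = build_prompt(format_context(retrieve(question, docs)), question)
print(prompt)
```

The key point is the last line: the context and the instruction end up as a single piece of text, with nothing special marking the context except the formatting you choose.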
Context is just text
Retrieved documents are not interpreted differently by the model. How you format and delimit the context in the prompt directly affects response quality.
Inject context into a prompt
Prerequisites
- A vector store containing embeddings
- A workflow that retrieves relevant documents as text
- An LLM selected via an LLM Selector
- A prompt that combines context and instruction
Retrieve relevant documents
Use a vector store retriever node to fetch the most relevant documents for a given query.
The result is typically a table containing:
- document text
- optional metadata
- similarity scores
Only the text content is required for prompt injection.
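A retrieval result of this shape, and the extraction of just the text column, might look like the following toy example (the column names and values are hypothetical, not the actual node output format):

```python
# Hypothetical retrieval result: one row per retrieved document.
rows = [
    {"text": "Refunds are processed within 14 days.",
     "source": "faq.md", "score": 0.91},
    {"text": "Shipping takes 3-5 business days.",
     "source": "faq.md", "score": 0.78},
]

# Only the text column is needed for prompt injection;
# metadata and scores can be dropped at this point.
context_texts = [row["text"] for row in rows]
print(context_texts)
```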
Prepare the context text
Combine the retrieved documents into a single text block.
Common approaches include:
- concatenating documents with clear separators
- limiting the number of retrieved documents
- truncating long passages to fit model limits
This step keeps the combined prompt within the model's context limits and gives the injected text a predictable shape.
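All three approaches can be combined in one small helper. The limits below (`max_docs`, `max_chars`, the separator) are illustrative defaults, not recommended values:

```python
def prepare_context(docs, max_docs=3, max_chars=500, sep="\n\n---\n\n"):
    """Build one context block: limit the document count, truncate
    long passages, and join everything with a clear separator."""
    kept = docs[:max_docs]                 # limit the number of documents
    kept = [d[:max_chars] for d in kept]   # truncate long passages
    return sep.join(kept)

docs = ["First document. " * 50, "Second document.", "Third.", "Fourth."]
context = prepare_context(docs)
print(context.count("---"))  # 3 documents kept, so 2 separators
```

Sensible values for the limits depend on the model's context window and on how many documents your retriever returns.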
Combine context and instruction
Create a prompt that clearly separates context from the user instruction.
Example structure:
Use the following context to answer the question.
Context:
{{retrieved_documents}}
Question:
{{user_question}}
Being explicit about what is context and what is the instruction helps the model reason more reliably over the provided information.
Delimit your context
Use labels, separators, or headings to clearly mark the context section in the prompt. This improves grounding and reduces ambiguity.
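Filling a template like the one above can be plain string substitution. This sketch uses Python's `str.format` placeholders (single braces) rather than the double-brace placeholders shown in the example structure; the placeholder names mirror the example and are not a KNIME convention:

```python
# A prompt template with labeled, clearly delimited sections.
TEMPLATE = (
    "Use the following context to answer the question.\n\n"
    "Context:\n{retrieved_documents}\n\n"
    "Question:\n{user_question}"
)

prompt = TEMPLATE.format(
    retrieved_documents="Refunds are processed within 14 days.",
    user_question="How long do refunds take?",
)
print(prompt)
```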
Send the prompt to the model
Use the LLM Prompter or LLM Chat Prompter node to send the combined prompt to the selected model.
Each execution injects the retrieved context dynamically, based on the current query.
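Dynamic injection can be pictured as a function that rebuilds the prompt on every call. Here `retrieve` and `send_to_model` are placeholders standing in for the retriever and prompter nodes, not real APIs:

```python
def answer(query, retrieve, send_to_model):
    """Each call re-retrieves context for the current query and
    injects it into a fresh prompt before sending it to the model."""
    context = "\n---\n".join(retrieve(query))
    prompt = (f"Use the following context to answer the question.\n\n"
              f"Context:\n{context}\n\nQuestion:\n{query}")
    return send_to_model(prompt)

# Stand-ins for the retriever and the model (assumptions for this sketch):
fake_retrieve = lambda q: ["Refunds are processed within 14 days."]
fake_model = lambda p: f"[model saw {len(p)} chars]"
print(answer("How long do refunds take?", fake_retrieve, fake_model))
```

Because the context is recomputed per query, two different questions against the same workflow can receive entirely different grounding material.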
Result
The model generates a response that is informed by the retrieved documents.
The output reflects both:
- the user’s instruction
- the context retrieved from the vector store
This allows responses to be grounded in your own data rather than general model knowledge.
Next steps
- Build a complete RAG workflow: Product FAQ Assistant tutorial