Create embeddings

Generate vector representations from text using the Text Embedder node.

When this is useful

Embeddings are numerical vectors that capture the semantic meaning of text and can be used for similarity search, clustering, and retrieval-based workflows.
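To make "compare by meaning" concrete, the sketch below computes cosine similarity, one common measure for comparing embedding vectors. The three short vectors are made-up stand-ins for real embeddings, and the function is illustrative rather than part of the KNIME AI Extension.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional "embeddings"; real ones are much longer.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Vectors pointing in similar directions score close to 1.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))
```

A score near 1 means the vectors point in nearly the same direction; a score near 0 means they are unrelated.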

Create embeddings when you want to:

Compare texts by meaning rather than keywords
Find similar documents or records
Group or cluster text data
Prepare data for retrieval-augmented generation (RAG)

If you only need text generation or classification, check out Prompt a model instead.

Build an embedding workflow

Prerequisites

Before you start, make sure that you have:

installed the KNIME AI Extension (see Install the KNIME AI Extension).
configured credentials for a supported provider, if you plan to use a hosted model (see LLM Providers).

1. Select an embedding model

How you select an embedding model depends on where the model runs.

Hosted models (authentication required)

If you use a hosted provider (for example, OpenAI):

Store your API key using a Credentials Configuration or Credentials Widget.
Connect the credentials to the corresponding authenticator, such as the OpenAI Authenticator.
Select an embedding-capable model using an Embedding Model Selector, for example the OpenAI Embedding Model Selector.

If the authenticator shows a green status light, the connection is successful.

Local models (no authentication required)

If you use a local model, configure the GPT4All Embedding Model Selector to point to your local model file.

Use the same embedding model consistently

Always use the same embedding model when creating and querying embeddings. Vectors produced by different models are not comparable, so mixing models leads to meaningless similarity results.
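One way to guard against accidental mixing, sketched here outside of KNIME, is to store the model name next to each vector and check it before comparing. The `Embedding` class, the `check_comparable` helper, and the model names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Embedding:
    model: str     # label of the model that produced the vector (hypothetical)
    vector: tuple  # the embedding values

def check_comparable(stored, query):
    """Raise ValueError if the two embeddings should not be compared."""
    if stored.model != query.model:
        raise ValueError(f"model mismatch: {stored.model!r} vs {query.model!r}")
    if len(stored.vector) != len(query.vector):
        raise ValueError("dimension mismatch")

doc = Embedding("model-a", (0.1, 0.2, 0.3))
same = Embedding("model-a", (0.3, 0.2, 0.1))
other = Embedding("model-b", (0.3, 0.2, 0.1))

check_comparable(doc, same)  # passes silently
try:
    check_comparable(doc, other)
except ValueError as err:
    print(err)
```

Note that `other` has the same dimension as `doc`. Matching dimensions alone do not make vectors from different models comparable, which is why the check uses the model label first.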

2. Provide text input

Provide the text you want to embed as a table column.

Each row in the table will be processed independently.

3. Generate embeddings

Use the Text Embedder node.

Configure the node as follows:

Connect the model input port to the selected embedding model
Connect the data input port to the table containing the text column
Select the column that contains the text to embed

The node appends a new column containing one embedding vector per input row.
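Conceptually, this step maps each row's text to a vector and stores the result in a new column. The sketch below mirrors that behavior in plain Python. `toy_embed` is a deliberately trivial stand-in for a real embedding model, and `append_embeddings` is a hypothetical helper, not the node's implementation.

```python
def toy_embed(text):
    """Stand-in for a real model: counts of 'a', 'e', and all letters."""
    lowered = text.lower()
    letters = [c for c in lowered if c.isalpha()]
    return [
        float(lowered.count("a")),
        float(lowered.count("e")),
        float(len(letters)),
    ]

def append_embeddings(rows, text_column, output_column="embedding"):
    """Return new rows, each with one extra column holding that row's vector."""
    return [{**row, output_column: toy_embed(row[text_column])} for row in rows]

table = [
    {"id": 1, "text": "Reset my password"},
    {"id": 2, "text": "Invoice overdue"},
]

result = append_embeddings(table, "text")
for row in result:
    print(row["id"], row["embedding"])
```

Each row is handled on its own, so the output has the same number of rows as the input, and the original columns are left unchanged.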

Result

Each embedding is a high-dimensional numeric representation of the input text, designed for similarity-based operations rather than direct inspection.
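A typical similarity-based operation is ranking stored texts against a query vector. The following is a minimal sketch that assumes cosine similarity as the measure. The labels and three-dimensional vectors are invented for illustration, and `top_k` is a hypothetical helper.

```python
import math

def cosine(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, corpus, k=2):
    """Return the k (label, score) pairs most similar to the query vector."""
    scored = [(label, cosine(query, vec)) for label, vec in corpus.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]

# Made-up embeddings keyed by the text they represent.
corpus = {
    "password reset guide": [0.9, 0.1, 0.1],
    "billing FAQ": [0.1, 0.9, 0.2],
    "account recovery steps": [0.8, 0.2, 0.1],
}
query = [1.0, 0.0, 0.1]

for label, score in top_k(query, corpus):
    print(f"{label}: {score:.3f}")
```

The same ranking idea underlies retrieval for RAG: the highest-scoring texts are the ones passed on as context.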

To see how embeddings are used for similarity analysis and visualization, check out this example workflow on the KNIME Hub:

➡️ Compare texts by semantic similarity

Next steps

Build a RAG pipeline using embeddings for context retrieval
