Create embeddings

Generate vector representations from text using the Text Embedder node.

When this is useful

Embeddings are numerical vectors that capture the semantic meaning of text and can be used for similarity search, clustering, and retrieval-based workflows.
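Similarity between two embeddings is typically measured with cosine similarity. A minimal sketch, using tiny toy vectors as stand-ins for real embeddings (which usually have hundreds or thousands of dimensions):

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: close to 1.0 means similar direction.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional vectors standing in for real embeddings.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]

# Semantically related texts score higher than unrelated ones.
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```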

Create embeddings when you want to:

  • Find semantically similar texts (similarity search)
  • Group related documents (clustering)
  • Retrieve relevant context in retrieval-based workflows

If you only need text generation or classification, check out Prompt a model instead.

Build an embedding workflow

Prerequisites

Before you start, make sure that you have:

  • Access to an embedding model: either an API key for a hosted provider (for example, OpenAI) or a local GPT4All model file
  • A table with a column containing the text you want to embed

1. Select an embedding model

How you select an embedding model depends on where the model runs.

Hosted models (authentication required)

If you use a hosted provider (for example, OpenAI):

  1. Store your API key using a Credentials Configuration or Credentials Widget.
  2. Connect the credentials to the corresponding authenticator, such as the OpenAI Authenticator.
  3. Select an embedding-capable model using an Embedding Model Selector, for example:
    OpenAI Embedding Model Selector

If the authenticator shows a green status light, the connection is successful.

Local models (no authentication required)

If you use a local model, configure the GPT4All Embedding Model Selector to point to your local model file.

Use the same embedding model consistently

Always use the same embedding model when creating and querying embeddings. Different models produce vectors in unrelated vector spaces, so mixing them leads to meaningless similarity results.
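A toy illustration of why mixed models fail: vectors from different models often do not even have the same dimensionality, so a similarity computation cannot compare them (the sizes below are made-up examples):

```python
# Hypothetical output sizes for two different embedding models.
vector_from_model_a = [0.1] * 1536   # e.g. a 1536-dimensional model
vector_from_model_b = [0.1] * 384    # e.g. a 384-dimensional model

def dot(a, b):
    # Any similarity computation implicitly assumes equal dimensions.
    if len(a) != len(b):
        raise ValueError("Embeddings come from incompatible models")
    return sum(x * y for x, y in zip(a, b))

try:
    dot(vector_from_model_a, vector_from_model_b)
except ValueError as e:
    print(e)  # Embeddings come from incompatible models
```

Even when two models happen to share a dimension count, their vector spaces are still unrelated, so the check above is necessary but not sufficient.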

2. Provide text input

Provide the text you want to embed as a table column.

Each row in the table will be processed independently.

3. Generate embeddings

Use the Text Embedder node.

Configure the node as follows:

  • Connect the model input port to the selected embedding model
  • Connect the data input port to the table containing the text column
  • Select the column that contains the text to embed

The node appends a new column containing one embedding vector per input row.
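Conceptually, the node behaves like the following sketch: each row's text is mapped to one vector, which is appended as a new column. Here `toy_embed` is a deterministic stand-in for the selected model, not the actual KNIME implementation:

```python
def toy_embed(text):
    # Stand-in for a real embedding model: a tiny vector derived from
    # character statistics. Real models return hundreds of floats.
    total = max(len(text), 1)
    vowels = sum(text.lower().count(v) for v in "aeiou")
    spaces = text.count(" ")
    return [vowels / total, spaces / total, total / 100.0]

def embed_column(table, text_column, output_column="embedding"):
    # Process each row independently and append one vector per row.
    return [
        {**row, output_column: toy_embed(row[text_column])}
        for row in table
    ]

table = [
    {"id": 1, "text": "KNIME builds visual workflows"},
    {"id": 2, "text": "Embeddings capture meaning"},
]
result = embed_column(table, "text")
print(result[0]["embedding"])
```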

Result

Each embedding is a high-dimensional numeric representation of the input text, designed for similarity-based operations rather than direct inspection.
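A typical similarity-based operation over the appended column is nearest-neighbor retrieval: embed a query with the same model, then rank stored rows by cosine similarity. A minimal sketch with toy vectors standing in for Text Embedder output:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_similar(query_vector, rows, embedding_column="embedding"):
    # Return the stored row whose embedding is closest to the query embedding.
    return max(rows, key=lambda r: cosine_similarity(query_vector, r[embedding_column]))

# Toy embeddings standing in for vectors produced by the Text Embedder.
rows = [
    {"text": "cats and kittens", "embedding": [0.9, 0.1, 0.0]},
    {"text": "quarterly invoice", "embedding": [0.0, 0.2, 0.9]},
]
query = [0.8, 0.2, 0.1]  # hypothetical embedding of a pet-related query
print(most_similar(query, rows)["text"])  # cats and kittens
```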

To see how embeddings are used for similarity analysis and visualization, check out this example workflow on the KNIME Hub:

➡️ Compare texts by semantic similarity

Next steps