Prompt a local model via Ollama

Run large language models locally in KNIME by connecting to Ollama through its OpenAI-compatible API, without relying on external cloud providers.

When this is useful

Prompt a local model via Ollama when you want to:

  • Run LLMs locally on your machine
  • Keep data on-prem and offline
  • Avoid usage-based API costs

If you want to use hosted models (e.g., OpenAI, Anthropic, Gemini), see Prompt a model.

Prompt a local model step by step

Prerequisites

Before you start, make sure that you have completed the steps below.

1. Install and start Ollama

  1. Download and install Ollama from https://ollama.com/download

  2. Choose a model from the Ollama library and install it either from the Ollama interface or by running a command in the terminal, for example:

    ollama run gemma3:12b
  3. Make sure Ollama is running in the background (default endpoint: http://localhost:11434/v1)
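To confirm the endpoint is reachable outside KNIME, you can list the installed models through Ollama's OpenAI-compatible API. This is a minimal sketch using only the Python standard library; the helper names (`models_url`, `list_models`) are illustrative, not part of KNIME or Ollama.

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434/v1"  # Ollama's default OpenAI-compatible endpoint

def models_url(base_url=OLLAMA_BASE):
    """URL of the OpenAI-style model listing endpoint."""
    return base_url.rstrip("/") + "/models"

def list_models(base_url=OLLAMA_BASE):
    """Return the model IDs Ollama currently serves (requires Ollama running)."""
    with urllib.request.urlopen(models_url(base_url)) as resp:
        payload = json.load(resp)
    # OpenAI-style list response: {"object": "list", "data": [{"id": ...}, ...]}
    return [m["id"] for m in payload.get("data", [])]
```

If `list_models()` returns the model you pulled (for example `gemma3:12b`), the endpoint is ready for the KNIME nodes below.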

2. Authenticate (dummy credentials)

Enter dummy credentials in the Credentials configuration node. Any dummy value can be used as the API key (for example any-string).

Add an OpenAI Authenticator node.

  • Select the credentials flow variable (the dummy credential)
  • In Advanced Settings, set:
    • OpenAI base URL: http://localhost:11434/v1

Why dummy credentials?

Ollama does not require authentication, but the OpenAI nodes expect credentials.
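In HTTP terms, the dummy credential simply fills the bearer token that every OpenAI-compatible client sends. A minimal stdlib sketch (the `authed_request` helper is illustrative, not a KNIME API):

```python
import urllib.request

OLLAMA_BASE = "http://localhost:11434/v1"  # Ollama's default OpenAI-compatible endpoint

def authed_request(path, api_key="any-string"):
    """Build an OpenAI-style request with a dummy bearer token.
    Ollama never validates the token, but OpenAI-compatible clients require one."""
    return urllib.request.Request(
        OLLAMA_BASE + path,
        headers={"Authorization": f"Bearer {api_key}"},
    )
```

Any non-empty string works as the key; only the base URL has to point at the local Ollama endpoint.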

3. Select the local model

Use an OpenAI LLM Selector node.

  • Pass the model name as a flow variable

  • Example model name:

    gemma3:12b

4. Send a prompt

Use the LLM Prompter node as usual.

  • Create a prompt from a table column or expression
  • Execute the node to receive model responses

Each row is processed independently.
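Under the hood, each row corresponds to one independent chat-completion request against the local endpoint. The following stdlib-only sketch mirrors that row-by-row behavior; the function names are illustrative and the model name assumes the `gemma3:12b` example from above.

```python
import json
import urllib.request

OLLAMA_BASE = "http://localhost:11434/v1"  # Ollama's default OpenAI-compatible endpoint

def build_chat_request(prompt, model="gemma3:12b", base_url=OLLAMA_BASE):
    """One OpenAI-style chat-completion request for a single table row."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer any-string",  # dummy key; Ollama ignores it
        },
    )

def prompt_rows(prompts, model="gemma3:12b"):
    """Send each prompt as its own request, mirroring the LLM Prompter's
    row-by-row processing (requires Ollama running locally)."""
    answers = []
    for prompt in prompts:
        with urllib.request.urlopen(build_chat_request(prompt, model)) as resp:
            reply = json.load(resp)
        answers.append(reply["choices"][0]["message"]["content"])
    return answers
```

Because every row is an independent request, the model keeps no conversational state between rows.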

Result

The prompt is executed entirely locally via Ollama, using standard KNIME OpenAI nodes.

No data is sent to external providers.

Next steps

See how an Ollama-based prompting workflow can be used in an agent tool in Build your first local agent.