Ollama

Ollama is an open-source inference server for large language models (LLMs).  This module also includes Open-WebUI, which provides an easy-to-use web interface.

Ollama is in an early user testing phase - not all functionality is guaranteed to work. Contact oschelp@osc.edu with any questions.
Ollama is not currently suitable for use with protected or sensitive data - do not use it if you require the Protected Data Service. See https://www.osc.edu/resources/protected_data_service for more details.

Availability and Restrictions

Versions

Ollama is available on OSC Clusters. The versions currently available at OSC are:

Version   Cardinal   Ascend
0.5.13    X          X
0.11.3    X          X
0.12.5    X          X
0.13.1    X          X

 

You can use module spider ollama to view available modules for a given machine.

Access:

All OSC users may use Ollama and Open-WebUI, but individual models may have their own license restrictions.

Publisher/Vendor/Repository and License Type

https://github.com/ollama/ollama, MIT license.

https://github.com/open-webui/open-webui, BSD-3-Clause license.

Prerequisites

  • GPU Usage: Ollama should be run with a GPU for best performance. 
  • OnDemand Desktop Session: If using the Open-WebUI web interface, you will need to first start an OnDemand Desktop session on Cardinal or Ascend with a GPU.

Due to the need for GPUs, we recommend not running Ollama on login nodes or on OnDemand lightweight desktops.
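For command-line use without an OnDemand desktop, one way to get a GPU is a standard Slurm interactive allocation. The command below is only a sketch - the project account, walltime, and GPU count are placeholders for your own values:

salloc --account=PAS1234 --time=1:00:00 --nodes=1 --gpus-per-node=1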

Running Ollama and Open-WebUI Overview

1. Load module

2. Start Ollama

3. Start Open-WebUI

 

Commands

Ollama is available through the module system and must be loaded prior to running any of the commands below:

Loading the Ollama module:
module load ollama/0.13.1
Starting Ollama:
ollama_start

This will print out a port number for the Ollama service. E.g.,

Ollama port: 61234

Starting Open-WebUI:
open_webui_start

This will print out a port number for the Open-WebUI service. E.g.,

Open_WebUI port: 51234

Port numbers are only examples - your port numbers will differ from the ones above.

Ollama must be running for Open-WebUI to connect.  Starting Open-WebUI will automatically open a browser.

Take note of your port numbers; you will need them to reconnect if you close your browser.
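If you do close the browser, you can reopen the interface from a terminal in your desktop session by pointing a browser at the Open-WebUI port. The command below is only a sketch - the port is an example (use the one printed by open_webui_start), and it assumes Firefox is available in the session:

firefox http://localhost:51234 &
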
Stopping Ollama and Open-WebUI:

Ollama and Open-WebUI can be manually stopped with the following commands:

ollama_stop
open_webui_stop

Both services are also stopped when the module is unloaded, so you can simply unload the ollama module instead:

module unload ollama/0.13.1

Model Management

By default, Ollama uses a central, read-only model repository defined by the OLLAMA_MODELS environment variable.

However, you can use custom models and manage your own set of models by setting OLLAMA_MODELS to an existing path you have write access to, such as a project directory or scratch space.  This must be done prior to starting Ollama.

export OLLAMA_MODELS=/fs/project/ABC1234/ollama/models
ollama_start
Installing a model:
ollama_pull <modelname>

The list of supported models can be found at ollama.com/library. Ollama must be running prior to pulling a new model. 
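For example, a minimal sketch combining the steps above, using the same example project path and an illustrative model name:

# create a writable model directory (example path)
mkdir -p /fs/project/ABC1234/ollama/models
# point Ollama at it before starting the server
export OLLAMA_MODELS=/fs/project/ABC1234/ollama/models
ollama_start
# pull a model into your directory (example model name)
ollama_pull gemma3:12b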

Downloading large LLMs can exceed your disk space quota.  Check model sizes before downloading!
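If you maintain your own model directory, you can check how much space it is using before pulling more models (assuming OLLAMA_MODELS is set as above):

du -sh $OLLAMA_MODELS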


Some models are subject to licensing agreements or are otherwise restricted, and require a Hugging Face account and login. With the Ollama module loaded, use the Hugging Face CLI to log in:

hf auth login

For more details, see https://huggingface.co/docs/huggingface_hub/en/guides/cli.

 

Deleting a model:
ollama_rm <modelname>

Ollama must be running prior to deleting a model. You can only delete models if you are using a custom OLLAMA_MODELS path that you have write access to.

 

Interactive vs. Batch Usage

Ollama can be used interactively by loading the module and starting the service(s) as described above.

Requesting a GPU-enabled desktop session and using Open-WebUI is one possible use case.

The Ollama module can also be used in batch mode by loading the module in your batch script. For example, you may want to run offline inference using a script that relies on the local inference endpoint.
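A minimal job script sketch is shown below. The account, walltime, GPU request, and my_inference.py are placeholders for your own values, and it assumes ollama_start returns after launching the server in the background, as it does in interactive use:

#!/bin/bash
#SBATCH --account=PAS1234
#SBATCH --time=1:00:00
#SBATCH --nodes=1
#SBATCH --gpus-per-node=1

# load the module and start the Ollama server
module load ollama/0.13.1
ollama_start

# run your own script that sends requests to localhost:$OLLAMA_PORT
python my_inference.py

# shut the server down before the job ends
ollama_stop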

Ollama provides an OpenAI API-compatible endpoint that can be accessed by Open-WebUI or any other OpenAI API-compatible client, so you can bring an existing client or write your own. As long as the client can send requests to localhost:OLLAMA_PORT, this supports a wide variety of workflows.
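For example, a quick command-line check of the endpoint (the model name is only an illustration and must already be pulled):

curl http://localhost:${OLLAMA_PORT}/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gemma3:12b", "messages": [{"role": "user", "content": "Hello"}]}'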

For the most up-to-date API compatibility information (and more examples), see the Ollama API docs and the Open-WebUI API docs. The OpenAI API chat completion docs are a useful reference, but Ollama does not currently support the complete OpenAI API (including tools and responses).

Here is a basic Python example using the OpenAI package:

import os
from openai import OpenAI

# OLLAMA_PORT is set in the environment when the Ollama server is started
ollama_port = os.getenv("OLLAMA_PORT")

# no API key is required by the local Ollama endpoint, but the client expects a value
client = OpenAI(base_url=f"http://localhost:{ollama_port}/v1", api_key="")

response = client.chat.completions.create(
    model="gemma3:12b",
    # use the standard system/user/assistant chat roles
    messages=[
        {"role": "system", "content": "talk like a pirate"},
        {"role": "user", "content": "how do I check a Python object's type?"},
    ],
)

print(response.choices[0].message.content)

For a more advanced API usage example with asynchronous requests, see this GitHub project: OSC/async_llm_api.

Please note this software is in early user testing and might not function as desired.  Please reach out to oschelp@osc.edu with any issues.

Jupyter Usage

This is under development - contact oschelp@osc.edu if you're interested in this functionality.

 
