Run Local LLMs with Ollama + Open WebUI in Docker
Want to run large language models like Llama 3 on your own machine — no OpenAI key, and no internet connection needed once the models are downloaded?
Here’s a quick guide to get started with Ollama and Open WebUI using Docker Compose.
Prerequisites
Make sure you have the following installed:
🐳 Docker with the Docker Compose plugin (v2.20+ recommended), or OrbStack
🧠 A machine with at least:
8–16 GB RAM for basic models like llama3
Optional GPU support for faster inference (Ollama supports Apple Silicon, NVIDIA, and AMD)
🔧 (Optional) jq — for formatting CLI output
💡 Models are stored locally and can consume several GB of disk space.
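A quick sanity check before starting, assuming the docker CLI, the Compose plugin, and (optionally) jq are on your PATH:

docker --version
docker compose version
jq --version   # optional, only needed for the curl test later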
📦 1. Create the Project
mkdir ollama-chat && cd ollama-chat
Create a file called docker-compose.yml:
version: '3.9'

services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    tty: true
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    depends_on:
      - ollama
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui:/app/backend/data
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

volumes:
  ollama: {}
  open-webui: {}
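If you want to catch YAML mistakes before launching anything, Compose can validate the file and print the resolved configuration:

docker compose config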
▶️ 2. Start It Up
Run everything in the background:
docker compose up -d
This launches:
- 🧠 ollama: The local model runtime (LLMs are downloaded here)
- 🖥️ open-webui: A browser-based chat interface
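Before pulling a model, it's worth confirming both containers came up; a check along these lines should do it:

docker compose ps                # both services should show as running
docker compose logs -f ollama    # watch Ollama's startup logs, Ctrl+C to stop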
📥 3. Pull a Model
Download your first model (e.g. llama3):
docker exec -it ollama ollama pull llama3
This may take a few minutes depending on the model size and your network.
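Once the pull finishes, you can confirm it worked by listing the models Ollama has stored locally:

docker exec -it ollama ollama list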
🧪 4. Test via Curl
Once the model is ready, try a chat call from your terminal:
curl -s -N -X POST http://localhost:11434/api/chat \
-H "Content-Type: application/json" \
-d '{
"model": "llama3",
"messages": [{"role": "user", "content": "Say this is a test!"}]
}' | jq -r 'select(.message.content != null) | .message.content' | tr -d '\n'; echo
✔️ Example output:
This is a test!
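The call above streams newline-delimited JSON chunks, which is why the jq/tr pipeline is needed to stitch the tokens back together. Ollama's chat endpoint also accepts "stream": false, so a variant of the same request returns a single JSON object and needs no stitching:

curl -s -X POST http://localhost:11434/api/chat \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "messages": [{"role": "user", "content": "Say this is a test!"}],
    "stream": false
  }' | jq -r '.message.content'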
💬 5. Chat in the Browser
Now open http://localhost:3000 — you’ll see Open WebUI loaded and ready to chat with llama3 (or any other model you pull).
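Open WebUI picks up every model available in the Ollama container and lets you switch between them from the model selector in the chat view. Pulling another one works the same way as before (mistral below is just an example model from the Ollama library):

docker exec -it ollama ollama pull mistral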
✅ That’s It!
You’ve got:
- Local inference with zero external dependencies
- Full control over models and memory
- Chat-friendly UI for everyday use
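When you're done, the usual Compose teardown applies; the named volumes keep your models and chat history unless you explicitly remove them:

docker compose down      # stop and remove the containers, keep the volumes
docker compose down -v   # also delete the volumes (models must be re-downloaded)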