No internet. No subscriptions. No one reading your prompts.
Just you and a very smart llama.
Imagine ChatGPT… but it lives on your computer. No internet needed. No subscriptions. No one reading your prompts. Just you and a very smart llama.
Ollama is a free, open-source tool that lets you download and run powerful AI models (called LLMs — Large Language Models) right on your own hardware. Your data never leaves your machine. It's like having a tiny genius living inside your PC.
Let's make sure your PC can handle this:
Think of it this way: RAM is your AI's desk space. More RAM = bigger, smarter models. A GPU is like giving your AI a Red Bull — everything goes faster.
OllamaSetup.exe will land in your Downloads folder🎉 Congrats — you just took your first step into a larger world.
OllamaSetup.exe🦙 See that little llama down there? That's your new best friend. It runs quietly in the background, waiting for your commands.
You're about to use the command line. Deep breaths. It's just typing words. You've got this.
Press Win + X → click "Terminal" or "Windows PowerShell"
Press Win + R, type cmd, hit Enter
You should see a dark window with a blinking cursor. This is where the magic happens.
"Pulling" a model means downloading an AI brain to your computer. Let's start with a great all-rounder:
ollama pull llama3.2
Hit Enter and watch it download. This grabs Meta's Llama 3.2 model (~2 GB for the 3B version).
pulling manifest
pulling abc123def456... 100% ▕████████████████▏ 2.0 GB
pulling 987654fedcba... 100% ▕████████████████▏ 1.5 KB
verifying sha256 digest
success
⏳ Download time depends on your internet speed. Perfect time to grab a coffee. Or a snack. Or contemplate the nature of consciousness.
This is the moment. Type:
ollama run llama3.2
You'll see a prompt that says >>>. That means the AI is listening. Try:
>>> Hey there! What can you help me with?
>>> Explain quantum physics like I'm a golden retriever
>>> Write me a haiku about debugging code
>>> What's a good recipe for someone who only has
eggs, cheese, and sadness?
>>> Pretend you're a pirate who's also a life coach
AND IT RESPONDS. Right there. On YOUR computer. No cloud. No API key. No monthly fee. Just raw, local AI power.
To exit the chat, type:
>>> /bye
Here's the fun part — there's a whole buffet of AI models. Each one has different strengths:
| Model | Command | Size | Good For |
|---|---|---|---|
| Llama 3.2 | ollama pull llama3.2 |
~2 GB | General chat, everyday tasks |
| Llama 3.1 8B | ollama pull llama3.1 |
~4.7 GB | Smarter general purpose |
| Mistral | ollama pull mistral |
~4 GB | Fast, great at reasoning |
| Code Llama | ollama pull codellama |
~3.8 GB | Writing & explaining code |
| Phi-3 | ollama pull phi3 |
~2.2 GB | Surprisingly smart for its size |
| Gemma 2 | ollama pull gemma2 |
~5.4 GB | Google's model, great quality |
| Llama 3.1 70B | ollama pull llama3.1:70b |
~40 GB | The big brain (needs 64 GB+ RAM) |
🧪 Start small, experiment, see what vibes with you. Delete models anytime with ollama rm model-name. Browse all available models at ollama.com/library
Keep this taped to your monitor.
ollama list # See all your downloaded models
ollama pull <model> # Download a new model
ollama run <model> # Start chatting with a model
ollama rm <model> # Delete a model (free up space)
ollama show <model> # See model details
ollama ps # See what's currently running
ollama serve # Manually start the Ollama server
ollama --version # Check your Ollama version
Typing in a terminal is cool and all, but what if you want a ChatGPT-like interface? Two popular free options:
If you have Docker installed (or want to install it):
docker run -d -p 3000:8080 --name open-webui \
ghcr.io/open-webui/open-webui:main
Then visit http://localhost:3000 in your browser. ChatGPT vibes, but local.
http://localhost:11434Close and reopen your terminal. If still broken, restart your PC. Ollama needs to be in your system PATH (the installer does this, but sometimes Windows needs a nudge).
You might be running on CPU only. Check if you have an NVIDIA GPU by typing nvidia-smi in your terminal. If you do, make sure your NVIDIA drivers are up to date.
The model is too big for your RAM. Try a smaller model like llama3.2 or phi3.
Search "Ollama" in the Start menu and launch it. It should appear in the tray.
Let's take a moment to appreciate what you did:
You're not just "using AI" anymore — you're running AI. On your own terms. Welcome to the club. 🦙
Now go forth and chat with your llama. 🦙