Local AI on the ZimaCube 2 — PCIe Expansion, Ollama, and Future-Proofing Your Homelab

Eva Wong

IceWhale author

Eva Wong is the Technical Writer and resident tinkerer at ZimaSpace. A lifelong geek with a passion for homelabs and open-source software, she specializes in translating complex technical concepts into accessible, hands-on guides. Eva believes that self-hosting should be fun, not intimidating. Through her tutorials, she empowers the community to demystify hardware setups, from building their first NAS to mastering Docker containers.

💡 Community Spotlight: Michael Luckenbill, ZimaCube 2 Pioneer Program

When I opened the ZimaCube 2 for the first time, I was not just looking at what it could do today. I was looking at what it could become tomorrow.

Inside, alongside the expected hardware, I found something that genuinely excited me: a 256GB Kingston NVMe drive for the OS — and an additional unused NVMe slot on the motherboard. Combined with the PCIe expansion slot on the rear, this was not just a NAS. It was a platform designed to grow.

ZimaCube 2 front panel view: 6 SATA drive bays, front USB-A and USB-C ports, built-in M.2 NVMe slot

Beyond Storage: A Platform Designed for Expansion

The ZimaCube 2 ships ready to go, but the real story is what you can add later:

Built In

6 × SATA bays
4 × M.2 NVMe slots
8GB DDR5 SODIMM
Dual 2.5Gb Ethernet
Metal chassis with active cooling

Expandable

Extra NVMe slot on motherboard
PCIe expansion slot (GPU/AI/Storage/Networking)
Upgradable SODIMM DDR5 RAM
Standard replaceable components

This is one of the biggest strengths of the system: it grows with your infrastructure needs. You do not have to buy everything upfront. You start with what you need and expand when you are ready.

Local AI: Why It Matters for Homelabs

One of my long-term goals is running more AI workloads locally. Not because cloud AI is bad — but because local AI gives you something different:

Privacy — Your data never leaves your network
Cost predictability — No per-token pricing, no API bills at the end of the month
Experimentation freedom — Try models, break things, start over — without worrying about cloud costs
Offline capability — AI that works when your internet does not
Learning — Understanding how models actually work by running them yourself

The ZimaCube 2 gives me a platform where I can experiment with all of this — Ollama for local LLMs, AI-assisted development workflows, image analysis pipelines, inference workloads, and self-hosted AI tooling — without relying entirely on cloud infrastructure.

What You Can Run Today (Without a GPU)

Even before adding a dedicated GPU, the ZimaCube 2 already provides a strong foundation for AI experimentation:

🤖 Ollama + Smaller Models — The stock system can run quantized 7B–8B parameter models comfortably. Think Llama 3, Mistral, and Phi-class models for chat, code assistance, and text analysis.
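
To make the Ollama workflow concrete, here is a minimal sketch of talking to a local Ollama server from Python using only the standard library. It assumes Ollama is running on its default port (11434) on the same machine and that a model has already been pulled; the model tag "llama3" and the helper names are illustrative choices, not something the ZimaCube 2 ships with.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default local port (11434).
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With "stream": False the server replies with one JSON object
        # whose "response" field holds the full completion.
        return json.loads(resp.read())["response"]
```

With the server running and a model pulled, calling ask("llama3", "Summarize what a NAS is in one sentence.") should return the model's reply as a string. The generous timeout is deliberate, since CPU-only generation can be slow.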

💻 AI-Assisted Development — Combine Ollama with tools like Continue.dev or Cody, and you have a local AI coding assistant that works entirely offline. No API keys, no usage caps, no privacy concerns.

📄 Document Analysis & RAG — With the ZFS storage pools, you can build local RAG (Retrieval Augmented Generation) pipelines — feed your documents, notes, and datasets into a local vector database and query them with an LLM running on the same machine.
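
The retrieval half of a RAG pipeline can be sketched in a few lines. The example below is a toy, not a production design: it stands in word counts for real embeddings and a Python list for a vector database, and the sample documents are made up. It only shows the shape of the idea: score stored text against the question, keep the best match, and place it in the prompt that goes to the LLM.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts. A real pipeline would use an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question: str, documents: list[str], k: int = 1) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    return sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Assemble the text that would be sent to the local LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"


# Hypothetical notes standing in for documents stored on the NAS.
docs = [
    "The backup job runs every night at 02:00 and writes to the ZFS pool.",
    "The media server container listens on port 8096.",
    "Guest Wi-Fi password rotates on the first of each month.",
]
question = "When does the backup job run?"
prompt = build_prompt(question, retrieve(question, docs))
```

In a real setup, embed would call an embedding model and retrieve would query a vector database, but the flow from question to context to prompt stays the same.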

🎁 Automation & Scripting — Local LLMs excel at structured tasks — classification, summarization, extraction. Hook Ollama into your existing Docker workflows and automate text processing pipelines that previously needed cloud APIs.
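
For structured tasks like classification, it helps to ask the model for JSON and to validate whatever comes back before acting on it. The sketch below builds a payload for Ollama's /api/generate endpoint with its "format": "json" option and parses the reply defensively. The label set and function names are hypothetical, chosen only for illustration.

```python
import json

ALLOWED_LABELS = ("invoice", "receipt", "other")  # hypothetical label set


def build_classify_payload(model: str, text: str) -> dict:
    """Payload for Ollama's /api/generate that requests a JSON-only answer."""
    prompt = (
        "Classify the document as one of: " + ", ".join(ALLOWED_LABELS) + ". "
        'Reply with JSON like {"label": "..."}.\n\nDocument:\n' + text
    )
    return {"model": model, "prompt": prompt, "format": "json", "stream": False}


def parse_label(model_output: str) -> str:
    """Validate the model's reply; fall back to 'other' on anything unexpected."""
    try:
        label = json.loads(model_output).get("label")
    except (json.JSONDecodeError, AttributeError):
        # Not JSON at all, or valid JSON that is not an object.
        return "other"
    return label if label in ALLOWED_LABELS else "other"
```

Falling back to a safe default keeps a pipeline running when a small model occasionally returns malformed or off-list output.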

ZimaCube 2 internal hardware layout, passive CPU heatsink, PCIe expansion slot and onboard NVMe slots

The GPU Upgrade Path

The PCIe expansion slot is where things get interesting long-term. Adding a GPU — even a modest one — transforms the ZimaCube 2 into a genuine local AI server:

Larger models — Run 13B–34B parameter models with GPU offloading
Faster inference — 10–50× speedup on token generation
Media transcoding — Hardware-accelerated Plex/Jellyfin transcoding
Image generation — Stable Diffusion and similar models
Multi-model serving — Run different models for different tasks simultaneously

🔌 The slot supports standard PCIe cards — no proprietary form factors, no vendor lock-in. You choose the GPU that fits your needs and budget.
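
When choosing a GPU for the larger models listed above, a rough back-of-the-envelope estimate of the weight size is a useful starting point. The sketch below covers the weights only. It assumes a flat number of bits per weight, while real quantization formats often use somewhat more than their nominal bit width, and the context cache and runtime overhead come on top, so treat the result as a lower bound rather than a sizing guarantee.

```python
def weights_gib(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GiB."""
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Model sizes mentioned in this article, at a nominal 4 bits per weight.
for size in (8, 13, 34):
    print(f"{size}B at 4-bit: about {weights_gib(size, 4):.1f} GiB of weights")
```

Comparing these figures against a card's VRAM, or against system RAM for CPU-only use, gives a first idea of which models fit entirely in memory and which would need partial offloading.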

Why This Architecture Matters

Modern homelabs increasingly overlap with AI workloads, local inference, media transcoding, container orchestration, and edge computing.

The ZimaCube 2 feels designed with that future in mind. It is not a sealed appliance that expects you to buy a new one when your needs change. It is a platform that says "here is what you need now, here is room for what you will want later."

For me, that is the difference between a gadget and infrastructure.

Thermal Reality Check: Can It Handle a GPU?

One natural question: can a compact, quiet system actually handle a GPU thermally?

The answer depends on the GPU, but the fundamentals are solid:

The metal chassis acts as a heatsink
The airflow design is intentional (not an afterthought)
The internal component layout leaves room for a PCIe card's airflow needs

The system already handles continuous Docker, ZFS, and networking workloads while staying cool to the touch. The thermal design has headroom.
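
If you would rather measure than guess, temperatures can be logged while a workload runs. The sketch below assumes a Linux-based OS that exposes sensors under /sys/class/thermal, which is the standard kernel interface but may not list every sensor on a given board; a GPU typically reports its own temperature through vendor tools instead.

```python
from pathlib import Path


def millidegrees_to_celsius(raw: str) -> float:
    """sysfs thermal zones report temperature in thousandths of a degree Celsius."""
    return int(raw.strip()) / 1000.0


def read_thermal_zones(base: Path = Path("/sys/class/thermal")) -> dict[str, float]:
    """Return {zone_name: temperature_c} for every readable thermal zone."""
    readings = {}
    for temp_file in sorted(base.glob("thermal_zone*/temp")):
        try:
            readings[temp_file.parent.name] = millidegrees_to_celsius(temp_file.read_text())
        except (OSError, ValueError):
            continue  # zone exists but is not readable right now
    return readings
```

Calling read_thermal_zones() before and during a long inference run gives a simple way to see how much headroom a particular configuration actually has.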

Frequently Asked Questions

Q1: Can the ZimaCube 2 run Ollama?

Yes. The stock configuration can run quantized 7B–8B parameter models (Llama 3, Mistral, Phi) comfortably for chat, code assistance, and text analysis. With a GPU added via PCIe, you can run larger models with significantly faster inference.

Q2: Does the ZimaCube 2 have a PCIe slot for a GPU?

Yes. The ZimaCube 2 includes a PCIe expansion slot that supports standard GPUs, AI accelerators, additional storage cards, and networking cards. No proprietary form factors or vendor lock-in.

Q3: What can I do with local AI on a NAS?

Local AI on a NAS enables private chat assistants, AI-assisted coding (with tools like Continue.dev), document analysis with RAG pipelines, automated text processing and classification, image analysis, and experimentation without cloud API costs.

Q4: How many NVMe slots does the ZimaCube 2 have?

The system has 4× M.2 NVMe slots plus an additional NVMe slot on the motherboard (originally holding the OS drive), which can be used for mirrored OS drives, dedicated Docker storage, or caching layers.

Q5: Can I upgrade the RAM later?

Yes. The ZimaCube 2 uses standard SODIMM DDR5 RAM, which is user-replaceable. The stock 8GB configuration handles container workloads well, and you can upgrade when your needs grow.

Q6: Is the ZimaCube 2 thermally capable of handling a GPU?

Yes, within limits that depend on the GPU you choose. The metal chassis, intentional airflow design, and internal component layout leave room for a PCIe card's airflow needs. The system handles continuous workloads while staying cool, and the thermal design has headroom for expansion.