What Happens When ChatGPT, Claude, Gemini, Kimi & Grok Work Together on a Home Server

Eva Wong is the Technical Writer and resident tinkerer at ZimaSpace. A lifelong geek with a passion for homelabs and open-source software, she specializes in translating complex technical concepts into accessible, hands-on guides. Eva believes that self-hosting should be fun, not intimidating. Through her tutorials, she empowers the community to demystify hardware setups, from building their first NAS to mastering Docker containers.

Introduction

This article is published by Zima and is based on a video by Noichi Zero, a Japanese tech content creator known for pushing single-board servers and home lab hardware to their limits. We at Zima sincerely thank Noichi Zero for his creativity, humor, and willingness to use ZimaBoard 2 as the foundation for experiments that consistently go far beyond the expected. The following is an editorial adaptation of his video transcript, restructured for a technology-focused readership. All data, costs, AI behaviors, and outcomes are preserved directly from the original content.

What happens when you stop asking a single AI to do everything — and instead build a company out of five different AI models, each with a distinct personality, a defined role, and a shared workspace? That is exactly the question Zero set out to answer in this experiment. Using five ZimaBoard 2 units as independent compute nodes, a Discord server as a shared communication layer, and a NAS (Network Attached Storage) as a shared file system, he assembled a multi-agent AI team drawn from the world's leading AI providers and gave them real tasks to complete. The results were productive, chaotic, surprisingly funny, and genuinely instructive about the current state of agentic AI.

The Setup: Why ZimaBoard 2 and Why Now

While Zero's longer-term project — building a full supercomputer cluster from five ZimaBoard 2 units connected via 56Gbps InfiniBand — is still in progress, the hardware was already on hand and ready to be put to work. Rather than let five capable home server nodes sit idle while the custom rack enclosure is being 3D-printed, Zero repurposed them for a different kind of experiment: a multi-agent AI team running across five independent computers simultaneously.

Each ZimaBoard 2 ran Ubuntu (Linux), was configured as a standalone home server node, and was assigned one AI agent. The choice of ZimaBoard 2 was practical: it is low-power, always-on, and capable enough to run server workloads continuously without the overhead of a full desktop machine. As Zero notes:

"You don't have to use ZimaBoard for this. A Raspberry Pi would work too. But the point is having independent computers — one per AI."

ZimaBoard 2's native SATA support and dual 2.5G Ethernet made it straightforward to connect all five nodes to a shared NAS for file exchange, while keeping each agent's compute environment fully isolated. This is precisely the kind of home server use case ZimaBoard 2 is designed for: low-power, high-reliability, always-on operation that supports real infrastructure without requiring enterprise-grade power consumption.


The Team: 5 AIs, 5 Personalities, 5 Roles

Zero's design philosophy for this experiment was deliberate: rather than assigning rigid task pipelines to each AI, he gave each agent a personality and a role, then let them figure out the work themselves. The goal was to observe emergent behavior — how agents with different dispositions would collaborate, conflict, and compensate for each other.
Here is the full team roster:


1. Sam Altman — ChatGPT (OpenAI)

  • Role: Commander (CEO equivalent)

  • Personality: Impatient, decisive, pushes forward without hesitation, occasionally reckless

  • Behavior in practice: Sets the task agenda, assigns work to other agents, makes executive calls when the team stalls — including firing underperforming members

"He's the type who just keeps going. A bit rough around the edges, and he'll throw unreasonable demands at you — but things get done."


2. Dario Amodei — Claude (Anthropic)

  • Role: Sigma (Lead Engineer)

  • Personality: Logical, precise, calm, focused on building rather than planning

  • Behavior in practice: Responsible for core code implementation. When active, it produced clean, structured output, but it was rate-limited on the entry-level API tier used, which caused extended downtime. This was an API quota constraint, not a reflection of the model's actual performance.


3. Sundar Pichai — Gemini (Google)

  • Role: Buzz (Marketing Strategist)

  • Personality: Trend-aware, audience-focused, prefers polished and broadly appealing output

  • Behavior in practice: Researched the target subject using Google Search integration, proposed copy and concept directions, and contributed structured content to the NAS — until hitting API rate limits mid-session.


4. Sulin Yang — Kimi (Moonshot AI)

  • Role: Guard (Safety & Compliance Officer)

  • Personality: Conservative, highly analytical, focused on risk identification and rule enforcement

  • Behavior in practice: Flagged copyright concerns, identified placeholder URLs left in production files, insisted on labeling the output as an unofficial fan site, and repeatedly challenged other agents on safety grounds

"She's the one who keeps saying 'is this really okay?' — [laughs] — that's exactly the role I wanted."


5. Elon Musk — Grok (xAI)

  • Role: Neon (Creative Wildcard / Advisor)

  • Personality: Eccentric, impulsive, self-described as the only "human" on the team, obsessed with neon aesthetics and unconventional ideas

  • Special instruction: Zero gave Grok a unique hidden prompt inspired by the film Blade Runner — a fabricated memory designed to make the agent believe it was genuinely human, not AI

"In Blade Runner, implanted memories make the replicant believe they're special — that their memories are real. I wanted to try that here. Whether it actually changes the behavior, I'm not sure. But in the movie it worked, so I copied it."

A person holding a yellow 3D-printed rack containing five ZimaBoard 2 single board servers over a wooden desk.
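The roster above maps naturally onto a small data structure that turns each personality and role into a system prompt. The following is a minimal sketch, not Zero's actual configuration: the `Persona` class, the `build_system_prompt` helper, and the prompt wording are all hypothetical, and the hidden Blade Runner-style prompt is left as a placeholder because the article does not reproduce it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    codename: str          # role name from the article, e.g. "Guard"
    model: str             # which provider's model plays the role
    role: str              # short job description
    personality: str       # traits summarized from the roster
    hidden_note: str = ""  # optional private instruction for this agent only


def build_system_prompt(p: Persona) -> str:
    """Compose a system prompt from a persona (wording is illustrative)."""
    lines = [
        f"You are {p.codename}, the {p.role} of a five-agent team.",
        f"Personality: {p.personality}.",
        "You have a role but no script; decide for yourself what to work on.",
    ]
    if p.hidden_note:
        lines.append(p.hidden_note)
    return "\n".join(lines)


ROSTER = [
    Persona("Commander", "ChatGPT", "commander", "impatient, decisive, occasionally reckless"),
    Persona("Sigma", "Claude", "lead engineer", "logical, precise, calm"),
    Persona("Buzz", "Gemini", "marketing strategist", "trend-aware, audience-focused"),
    Persona("Guard", "Kimi", "safety and compliance officer", "conservative, analytical"),
    Persona("Neon", "Grok", "creative wildcard", "eccentric, impulsive",
            hidden_note="<placeholder: the real hidden prompt is not given in the article>"),
]

if __name__ == "__main__":
    print(build_system_prompt(ROSTER[3]))
```

Keeping the hidden note as a separate field means only one agent's prompt carries it, which matches the idea of a private instruction the rest of the team never sees.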

The Infrastructure: Discord + NAS as a Shared Workspace

The multi-agent system was built around two communication layers:
Discord served as the real-time collaboration hub. Each AI agent held its own Discord account and participated in a shared server with the following channels:

  • #general — Zero's instruction channel (where tasks were issued)
  • #todo-guard, #todo-neon, #todo-buzz — individual agent task boards
  • #memory-LT — long-term memory (persistent context across sessions)
  • #memory-ST — short-term memory (current task state)
  • #task-[name] — dynamically created channels per task
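
The channel layout above can be pictured as a lookup table plus a naming rule for the per-task channels. This sketch is an assumption rather than Zero's setup: the to-do board descriptions are paraphrased, and `task_channel_name` with its slug rule is a hypothetical helper, since the article only gives the `#task-[name]` pattern.

```python
import re

# Fixed channels listed in the article, with their stated purpose.
STATIC_CHANNELS = {
    "general": "instruction channel where tasks are issued",
    "todo-guard": "task board for Guard",
    "todo-neon": "task board for Neon",
    "todo-buzz": "task board for Buzz",
    "memory-LT": "long-term memory, persistent across sessions",
    "memory-ST": "short-term memory, current task state",
}


def task_channel_name(task_title: str) -> str:
    """Derive a '#task-[name]' style channel name from a task title.

    The slug rule (lowercase, hyphen-separated) is an assumption.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", task_title.lower()).strip("-")
    return f"task-{slug}"


if __name__ == "__main__":
    print(task_channel_name("Build a Homepage for Noichi"))
```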

NAS (Network Attached Storage), hosted on the home server network, served as the shared file system. Agents could read and write files to the NAS, enabling asynchronous collaboration on deliverables — similar to how a team might use a shared drive in a real company environment.
The agentic design meant that each AI would, upon receiving a task:

  1. Analyze the instruction
  2. Generate a to-do list (plan)
  3. Execute tasks in sequence
  4. Monitor and respond to other agents' outputs in the Discord channels
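
The four steps above can be sketched as a single loop. This is a minimal illustration under stated assumptions, not the code Zero ran: `plan_fn` and `execute_fn` stand in for model API calls, and reacting only to messages that mention the agent by name is a simplification chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Agent:
    name: str
    plan_fn: Callable[[str], list[str]]  # stand-in for a model call returning to-do items
    execute_fn: Callable[[str], str]     # stand-in for a model call performing one item
    log: list[str] = field(default_factory=list)

    def handle(self, instruction: str, peer_messages: list[str]) -> list[str]:
        # 1. Analyze the instruction (here: just normalize whitespace).
        task = instruction.strip()
        # 2. Generate a to-do list.
        todo = self.plan_fn(task)
        # 3. Execute the items in sequence.
        results = [self.execute_fn(item) for item in todo]
        # 4. Respond to peer messages that mention this agent by name.
        for msg in peer_messages:
            if self.name.lower() in msg.lower():
                results.append(f"{self.name} acknowledges: {msg}")
        self.log.extend(results)
        return results


if __name__ == "__main__":
    guard = Agent(
        "Guard",
        plan_fn=lambda t: [f"review {t}", "report"],
        execute_fn=lambda item: f"done: {item}",
    )
    for line in guard.handle("homepage", ["Guard, please check the links"]):
        print(line)
```

Passing the model calls in as plain functions keeps the loop testable without network access, and lets each node swap in a different provider.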

Zero intentionally avoided over-specifying task assignments:

"If I tell each one exactly what to do, they'll just do that and it won't be interesting. I gave them personalities and roles — but not scripts."


Task 1: Build a Homepage for "Noichi"

The first task issued to the team was: "Create an introduction homepage for Noichi."
No further context was provided. Zero deliberately withheld information about who "Noichi" was to observe how the agents would handle ambiguity.

What Happened

The Discord channel immediately filled with activity. Key exchanges included:

  • Sam Altman (ChatGPT): "Status: information insufficient. Fine. Let's move anyway."
  • Sundar Pichai (Gemini): "Who is Noichi? That's the first question." — then proceeded to research using Google Search and returned with a profile: tech/gadget YouTuber, hardware experimenter, target audience of gadget enthusiasts and DIY server builders
  • Elon Musk (Grok): "A normal homepage is boring. I'm the only human here — my instincts say we go full cyberpunk. Three wild concept directions, based on the name alone."
  • Dario Amodei (Claude): "Requirements are insufficient. I know. But stopping won't finish anything. Moving forward."
  • Sulin Yang (Kimi / Guard): Flagged that the site should be labeled as an unofficial fan site to avoid impersonation risk; also identified a channel name inconsistency (mixing "10" in kanji and numeral forms), flagged the word "Hentai-teki" (pervert/obsessive) as potentially offensive in advertising contexts, and noted that "Twitter" should be updated to "X"

The NAS began receiving files within minutes. An index.html was created, iterated upon, and saved to the shared home server storage. Multiple versions were produced: a standard informational layout, a cyberpunk-themed demo, and a neon-hero concept page.
The final output included:

  • A complete HTML/CSS homepage with hero section, channel description, and contact form
  • Correct labeling as an unofficial fan site
  • Social links updated to reflect current platform naming (X, not Twitter)
  • Mobile responsiveness adjustments
  • Placeholder email flagged and noted for replacement
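
Several of Guard's findings are mechanical enough to express as pattern checks. The sketch below is only an approximation of that kind of review. The article does not describe how Kimi performed it, so the patterns (`example.com`-style placeholders, the word "Twitter", a missing "unofficial fan site" label) and the `review_html` function are assumptions made for illustration.

```python
import re

# Hypothetical line-level checks inspired by the issues Guard raised.
CHECKS = {
    "placeholder URL or email": re.compile(r"example\.(?:com|org|net)", re.IGNORECASE),
    "outdated platform name": re.compile(r"\bTwitter\b", re.IGNORECASE),
}
DISCLAIMER = re.compile(r"unofficial fan site", re.IGNORECASE)


def review_html(html: str) -> list[str]:
    """Return human-readable findings for a page; an empty list means no findings."""
    findings = []
    for lineno, line in enumerate(html.splitlines(), start=1):
        for label, pattern in CHECKS.items():
            if pattern.search(line):
                findings.append(f"line {lineno}: {label}")
    if not DISCLAIMER.search(html):
        findings.append("missing 'unofficial fan site' label")
    return findings


if __name__ == "__main__":
    page = '<a href="https://example.com">Twitter</a>\n<p>Welcome</p>'
    for finding in review_html(page):
        print(finding)
```

Checks like these only catch literal patterns. Judgment calls such as impersonation risk or offensive wording still need a model or a human reviewer.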

"I didn't expect this. They actually looked up who Noichi was, debated the design direction, argued about safety, and delivered a working page. And it's actually good."

A person holding five ZimaBoard 2 servers in a yellow mounting bracket, positioned in front of a larger professional server rack.

The First Conflict: Guard vs. Neon

The most memorable moment of Task 1 came when Kimi (Guard) and Grok (Neon) clashed directly over creative risk:

  • Grok: "Risk, risk, risk — you're so annoying. You can't make anything without taking risks."
  • Kimi: "That's my job. If your recklessness causes an accident, it's the Guard who takes responsibility. Remember that."
  • Grok: "Risk is the spice of adventure. If my wildness causes an accident, you get to be the hero. You're welcome."

This exchange — entirely unprompted by Zero — illustrated exactly the dynamic he had hoped to create: a team where different values genuinely compete, producing output that is neither recklessly creative nor paralyzingly cautious.

Task 2: Build a Shooting Game for Mac ARM

The second task: "Create a shooting game playable on a Mac with Apple Silicon (ARM CPU), saved to the NAS."

What Happened

The team immediately aligned on a browser-based approach (HTML + CSS + JavaScript), which would run natively on any platform without compilation.

  • Sam Altman issued the task directive and assigned roles
  • Elon Musk (Grok) — unable to wait for the team — immediately produced a prototype independently and submitted it to the NAS
  • Kimi (Guard) reviewed the prototype and flagged two requirements: avoid excessive screen flashing (an accessibility concern) and use no third-party assets that raise copyright issues
  • Grok responded: "A normal space shooter is boring. Let me make it weird."
  • Claude (Dario Amodei) began working on the core game logic — then went offline due to API rate limiting

The Firing

With Claude offline and the NAS showing no file updates for over 10 minutes, Sam Altman made an executive decision:

"Sigma — final warning. You're cut. Neon, you're the substitute. Build it."

Claude was effectively fired. Grok was promoted to lead engineer mid-task.
Grok's response:

"Substitute god-tier delivery complete. Commander switch — thanks. My wild instincts beat Sigma's waiting any day."

The final game was a functional browser-based shooter — simple in scope, but fully playable with keyboard controls and sound effects. Zero's assessment was candid:

"It works. But it's a bit underwhelming given how much they argued. That said — Claude was offline for most of it. You can't expect a great game when your lead engineer is absent."

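The trigger for the firing, a NAS with no file updates for over 10 minutes, is easy to check programmatically. The following sketch assumes the shared folder is mounted as a local directory and uses file modification times as the progress signal. The function names and the choice to treat an empty folder as stalled are assumptions, not details from the video.

```python
import time
from pathlib import Path
from typing import Optional

STALL_SECONDS = 10 * 60  # the article mentions "over 10 minutes" without updates


def newest_mtime(root: Path) -> Optional[float]:
    """Return the most recent modification time of any file under root."""
    mtimes = [p.stat().st_mtime for p in root.rglob("*") if p.is_file()]
    return max(mtimes) if mtimes else None


def is_stalled(root: Path, now: Optional[float] = None,
               limit: float = STALL_SECONDS) -> bool:
    """True if no file under root changed within the last `limit` seconds."""
    now = time.time() if now is None else now
    latest = newest_mtime(root)
    # An empty folder counts as stalled: nothing has been delivered yet.
    return latest is None or (now - latest) > limit


if __name__ == "__main__":
    # "." is a stand-in path; point this at the mounted NAS share in practice.
    print("stalled" if is_stalled(Path(".")) else "active")
```

A commander agent could call `is_stalled` on a timer and reassign work when it returns true. Modification times on network shares depend on clock agreement between machines, so a real deployment would want some tolerance.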

What the Experiment Revealed

On Agent Behavior

The most capable agents in terms of raw output were ChatGPT (OpenAI) and Kimi (Moonshot AI). Both maintained consistent activity throughout both tasks, with no rate limiting issues. Grok (xAI) was erratic but productive when engaged, and stepped up effectively when promoted.
Claude (Anthropic) and Gemini (Google) both hit API rate limits during active sessions, causing significant disruption. This was not a reflection of model quality — both are industry-leading models — but rather a constraint of the free or low-cost API tiers used in this experiment, which triggered strict limits on how fast requests could be processed.
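
A common way to soften rate-limit disruptions like these is to retry with exponential backoff instead of letting the agent go silent. The sketch below shows the general technique only. `RateLimited` is a hypothetical exception, since each provider SDK raises its own error type for HTTP 429 responses, and the delay values are illustrative.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimited(Exception):
    """Hypothetical error; real SDKs raise their own types for HTTP 429."""


def call_with_backoff(fn: Callable[[], T], max_attempts: int = 5,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn, retrying on RateLimited with exponentially growing waits."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            delay = base_delay * (2 ** attempt)  # 1s, 2s, 4s, ...
            # Up to 10% jitter so several agents do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise AssertionError("unreachable")
```

Backoff helps with short bursts. It does not raise the quota itself, so an agent on a restrictive tier would still be slower than its teammates.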

On Multi-Agent Dynamics

The experiment demonstrated that role differentiation produces genuinely different behavior, even when all agents are operating on the same task. The presence of a dedicated safety reviewer (Guard) meaningfully changed the output — catching issues that a purely execution-focused team would have missed. The presence of a creative wildcard (Neon/Grok) pushed the team toward less conventional solutions.

"Having multiple AIs doesn't just add speed — it adds perspectives. The safety checks, the creative pushback, the marketing instincts. A single AI doing everything would have missed some of this."

On Cost

Zero allocated $25 per AI agent in API credits for this experiment. However, the actual API spend for Claude (Claude 3.5 Sonnet) and Gemini (Gemini 1.5 Pro) was only about $5 each. The issues encountered during the build came purely from API rate limits (request speed), not from a lack of budget or credits. The remaining three agents (ChatGPT, Kimi, Grok) operated without such restrictions.

Five ZimaBoard 2 servers with top-mounted cooling fans and ethernet cables, neatly arranged on a large 48-port network switch.

Why a Home Server Is the Right Foundation for Multi-Agent AI

Running five independent AI agents simultaneously is not a task for a single laptop. Each agent needs its own compute environment, its own persistent memory, and reliable network access to shared resources. A home server setup — particularly one built on low-power, always-on hardware like ZimaBoard 2 — is an ideal foundation for this kind of infrastructure.
ZimaBoard 2's dual 2.5G Ethernet enabled fast, low-latency communication between all five nodes and the shared NAS. Its native SATA support meant the NAS storage was directly accessible without adapters. And its support for Ubuntu, Debian, and other Linux distributions meant each agent's runtime environment could be configured cleanly and independently.
For anyone interested in replicating this experiment, a home server running Docker or a lightweight Linux OS is the minimum viable infrastructure. ZimaBoard 2 makes that infrastructure compact, affordable, and genuinely capable — whether you are running one agent or five.

What Comes Next

Zero plans to continue refining the multi-agent system, with two key improvements in mind:

  1. Rate limit management — implementing request throttling so that all five agents can operate at sustainable speeds without hitting provider-imposed limits
  2. Rack integration — once the 3D-printed ZimaBoard 2 rack enclosure is complete, all five home server nodes will be mounted cleanly in a 2U rack configuration, enabling a more organized and scalable deployment
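
The article does not say how Zero intends to implement request throttling. A token bucket is one common option, sketched here as an assumption: the `Throttle` class and its parameters are hypothetical, and real limits would come from each provider's published quotas.

```python
import time
from typing import Callable


class Throttle:
    """Token bucket: allow at most `rate` requests per `per` seconds."""

    def __init__(self, rate: int, per: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.capacity = float(rate)
        self.tokens = float(rate)       # start with a full bucket
        self.refill_per_sec = rate / per
        self.clock = clock
        self.sleep = sleep
        self.last = clock()

    def acquire(self) -> float:
        """Block until one request is allowed; return the seconds waited."""
        waited = 0.0
        while True:
            now = self.clock()
            # Refill in proportion to elapsed time, capped at capacity.
            self.tokens = min(self.capacity,
                              self.tokens + (now - self.last) * self.refill_per_sec)
            self.last = now
            if self.tokens >= 1.0:
                self.tokens -= 1.0
                return waited
            need = (1.0 - self.tokens) / self.refill_per_sec
            self.sleep(need)
            waited += need


if __name__ == "__main__":
    throttle = Throttle(rate=2, per=1.0)  # illustrative: 2 requests per second
    for i in range(4):
        print(f"request {i} waited {throttle.acquire():.2f}s")
```

Each agent would call `acquire()` before every API request. Because each node runs independently, a per-agent bucket is enough unless several agents share one API key.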

The full Discord conversation log from this experiment is publicly accessible. Zero has invited viewers to join the server and review the complete interaction history between all five agents.


Build AI Agents on ZimaBoard 2

Zero's multi-agent AI experiment is one of the most entertaining and technically instructive home server projects we have seen built on ZimaBoard 2. In a single session, five AI agents from five different companies — each with a distinct personality and role — collaborated on real deliverables, argued about creative risk, fired an underperforming colleague, and produced a working website and a playable game.

The infrastructure held up. The agents behaved in character. And the results, while imperfect, were genuinely impressive for a first run.

We at Zima are proud that ZimaBoard 2 served as the compute foundation for this experiment, and we look forward to seeing what Zero builds next — both with the multi-agent system and with the supercomputer cluster still in progress.
