Hardware Setup

$ neofetch --hypervisor
  ╔═══════════════════════════════════════╗
  ║  HYPERVISOR (Physical Host)           ║
  ╠═══════════════════════════════════════╣
  ║  OS:       Proxmox VE (Debian-based)  ║
  ║  CPU:      Xeon E5-2690v4 (14C/28T)   ║
  ║  RAM:      128GB DDR4 ECC             ║
  ║  GPU:      AMD R9700 AI PRO 32GB      ║
  ║  Storage:  SSD pool + 22TB HDD        ║
  ║  Network:  LAN + Tailscale VPN        ║
  ║  Stack:    KVM / LXC / VFIO           ║
  ║  Uptime:   continuous                 ║
  ╚═══════════════════════════════════════╝

$ neofetch --ageis-node
  ╔═══════════════════════════════════════╗
  ║  AGEIS-NODE (AI VM)                   ║
  ╠═══════════════════════════════════════╣
  ║  OS:       Debian 13 (Trixie)         ║
  ║  CPU:      4 vCPUs (from Xeon pool)   ║
  ║  RAM:      16GB allocated             ║
  ║  GPU:      AMD R9700 32GB passthrough ║
  ║  Stack:    ROCm 6.x / Ollama          ║
  ║  Role:     AI inference + agents      ║
  ╚═══════════════════════════════════════╝

$ neofetch --embedder-node
  ╔═══════════════════════════════════════╗
  ║  EMBEDDER NODE                        ║
  ╠═══════════════════════════════════════╣
  ║  OS:       Debian 13 (Trixie)         ║
  ║  GPU:      GTX 1650 Super 4GB         ║
  ║  Role:     Embeddings + TTS worker    ║
  ║  Status:   operational                ║
  ╚═══════════════════════════════════════╝

Hypervisor — Physical Host

The physical machine runs Proxmox VE as the hypervisor, hosting KVM virtual machines and LXC containers for service isolation. Every VM and workload in the lab runs on this one box.
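
To see what the box is actually hosting, Proxmox's standard CLI does the job (output omitted here; VM names and IDs vary):

$ qm list     # KVM virtual machines (ageis-node lives here)
$ pct list    # LXC containers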

Processor

  CPU:       Intel Xeon E5-2690v4
  Cores:     14 cores / 28 threads @ 2.60GHz
  Platform:  X99
  Role:      Hypervisor + VM host

Memory

  Installed:  128GB DDR4 ECC
  Type:       ECC Registered
  Allocated:  VMs + Proxmox overhead

GPU

  GPU:          AMD Radeon AI PRO R9700
  VRAM:         32GB
  Passthrough:  VFIO to ageis-node VM
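
Host-side passthrough setup, as a sketch: IOMMU enabled on the kernel command line and the GPU bound to vfio-pci at boot. The PCI IDs below are placeholders; the real ones come from lspci -nn.

$ grep CMDLINE /etc/default/grub
GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=on iommu=pt"
$ cat /etc/modprobe.d/vfio.conf
options vfio-pci ids=1002:xxxx,1002:yyyy   # GPU + its audio function (placeholder IDs)
$ lspci -nnk -d 1002: | grep "in use"      # should report vfio-pci on the host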

Storage

  SSD:         2x 1TB TeamGroup + Samsung 870 EVO + Samsung 960 NVMe
  HDD:         22TB WD White (bulk storage)
  Boot:        Proxmox host OS
  VM Storage:  Local SSD pool

ageis-node — AI VM

A KVM virtual machine on the hypervisor dedicated to AI inference, the OpenClaw pipeline, and the ACT-R memory system. The R9700 is handed to the VM over VFIO PCIe passthrough for near-native GPU performance.

VM Resources

  vCPUs:    4 (allocated from Xeon)
  RAM:      16GB (expandable)
  GPU:      AMD R9700 32GB passthrough
  Driver:   ROCm (Linux)
  Storage:  Portion of SSD pool
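
The relevant slice of the VM config, as a sketch (VMID 101 and the PCI address are placeholders):

$ grep -E 'machine|cpu|cores|memory|hostpci' /etc/pve/qemu-server/101.conf
machine: q35
cpu: host
cores: 4
memory: 16384
hostpci0: 0000:03:00,pcie=1,x-vga=1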

Role

  Inference:  Local LLM via Ollama
  Agents:     OpenClaw pipeline
  Memory:     ACT-R cognitive system
  Database:   PostgreSQL
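
A minimal smoke test against the inference stack, assuming Ollama's default port and an illustrative model name:

$ curl -s http://localhost:11434/api/generate \
    -d '{"model": "llama3.1", "prompt": "Say hello.", "stream": false}'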

Embedder / Inference Node

A dedicated secondary node handles embedding generation and text-to-speech workloads, keeping those tasks off the main inference GPU.

Embedder Node

  OS:      Debian 13 (Trixie)
  GPU:     GTX 1650 Super 4GB
  Role:    Embeddings + TTS worker
  Status:  Operational
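
Assuming the embedder exposes an Ollama-style endpoint (hostname and model name are illustrative, not confirmed), a request looks like:

$ curl -s http://embedder:11434/api/embeddings \
    -d '{"model": "nomic-embed-text", "prompt": "homelab status report"}'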

Network

Both nodes sit on the local LAN with Tailscale VPN providing secure remote access. Services are accessible from anywhere through the mesh network without exposing ports to the internet.

Network Topology

  LAN:            Private local network
  VPN:            Tailscale mesh
  Remote Access:  Tailscale + SSH
  DNS:            Local resolver
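
Day-to-day access, assuming MagicDNS hostnames on the tailnet:

$ tailscale status              # peers and connection paths
$ tailscale ip -4 ageis-node    # tailnet address of a node
$ ssh user@ageis-node           # SSH over the mesh, no exposed ports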

ROCm GPU Stack

The AMD Radeon AI PRO R9700 runs ROCm on Linux for local GPU inference. ROCm provides the compute framework — similar to CUDA but for AMD hardware. This is what makes local AI inference possible without depending on NVIDIA.

The GPU is passed through from Proxmox to the ageis-node VM using KVM GPU passthrough. This gives the VM direct hardware access with near-native performance while keeping it isolated from the host.
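
Inside the VM, a quick check confirms the card arrived and the amdgpu driver claimed it (exact device naming varies by kernel):

$ lspci -nnk | grep -iA3 'vga\|display'   # look for "Kernel driver in use: amdgpu"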

$ rocm-smi --showmeminfo vram
GPU[0] : VRAM Total: 32GB
GPU[0] : VRAM Used:  variable (model dependent)

$ rocminfo | grep "Name"
Name: gfx1201
Marketing Name: AMD Radeon AI PRO R9700
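
For frameworks on top of ROCm: PyTorch's ROCm builds reuse the torch.cuda API surface, so the usual availability check works unchanged (assuming a ROCm build of PyTorch is installed in the VM):

$ python3 -c "import torch; print(torch.cuda.is_available(), torch.version.hip)"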

Running Services

  • OpenClaw — AI agent platform (primary workload)
  • Ollama — Local LLM inference server
  • Embedding service — Vector embeddings on the embedder node
  • TTS service — Text-to-speech on the embedder node
  • PostgreSQL — ACT-R memory store + application data
  • Tailscale — Mesh VPN across all nodes
  • Monitoring — Resource tracking and alerting
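
A quick reachability sweep ties these together, assuming tailnet hostnames and default ports:

$ curl -s http://ageis-node:11434/api/tags >/dev/null && echo "ollama: up"
$ pg_isready -h ageis-node      # PostgreSQL health check
$ tailscale ping embedder       # verify the mesh path to the embedder node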

Why Local-First

Running inference locally means no API rate limits, no per-token costs for routine tasks, no data leaving the network, and no dependency on external services being available. The cloud is a tool, not a requirement — heavy reasoning goes to Claude API, everything else stays local. Read the full argument in Why Local AI Matters, or see how this hardware powers the OpenClaw agent platform. For the complete infrastructure deep-dive, see the Homelab Architecture write-up.