Hypervisor — Physical Host
The physical machine runs Proxmox VE as the hypervisor, hosting KVM virtual machines and LXC containers for service isolation. Every VM and workload in the lab runs on this single host.
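A quick way to see what the hypervisor is carrying is Proxmox's own CLI: `qm` manages KVM guests and `pct` manages LXC containers. The IDs and names shown in the comment are illustrative, not this lab's actual inventory.

```shell
# List KVM virtual machines known to this Proxmox host
qm list

# List LXC containers
pct list

# Illustrative qm output (VMID and name are examples):
#   VMID NAME        STATUS   MEM(MB)  BOOTDISK(GB)
#    100 ageis-node  running    16384        128.00
```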
Processor
| CPU | Intel Xeon E5-2690 v4 |
| Cores | 14 cores / 28 threads @ 2.60GHz |
| Platform | X99 |
| Role | Hypervisor + VM host |
Memory
| Installed | 128GB DDR4 ECC |
| Type | ECC Registered |
| Allocated | VMs + Proxmox overhead |
GPU
| GPU | AMD Radeon AI PRO R9700 |
| VRAM | 32GB |
| Passthrough | VFIO to ageis-node VM |
Storage
| SSD | 2x 1TB TeamGroup + Samsung 870 EVO + Samsung 960 NVMe |
| HDD | 22TB WD White (bulk storage) |
| Boot | Proxmox host OS |
| VM Storage | Local SSD pool |
ageis-node — AI VM
A KVM virtual machine on the hypervisor dedicated to AI inference, the OpenClaw pipeline, and the ACT-R memory system. The R9700 is passed through as a PCIe device via VFIO for near-native GPU performance.
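A minimal sketch of how VFIO passthrough like this is typically wired up on a Proxmox host. The PCI address, vendor:device IDs, and VM ID below are placeholders, not values from this lab.

```shell
# 1. Enable the IOMMU on an Intel host: add to GRUB_CMDLINE_LINUX_DEFAULT
#    in /etc/default/grub, then run update-grub and reboot:
#      intel_iommu=on iommu=pt

# 2. Find the GPU's PCI address and vendor:device IDs
lspci -nn | grep -i vga

# 3. Bind the GPU to vfio-pci instead of the host driver
#    (the IDs here are placeholders)
echo "options vfio-pci ids=1002:aaaa,1002:bbbb" > /etc/modprobe.d/vfio.conf

# 4. Attach the device to the VM
#    (VM ID 100 and address 0000:03:00 are examples)
qm set 100 --hostpci0 0000:03:00,pcie=1
```

With `pcie=1`, Proxmox presents the device on a PCIe root port rather than legacy PCI, which is what a modern GPU driver expects.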
VM Resources
| vCPUs | 4 (allocated from Xeon) |
| RAM | 16GB (expandable) |
| GPU | AMD R9700 32GB passthrough |
| Driver | ROCm (Linux) |
| Storage | Portion of SSD pool |
Role
| Inference | Local LLM via Ollama |
| Agents | OpenClaw pipeline |
| Memory | ACT-R cognitive system |
| Database | PostgreSQL |
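Since inference runs through Ollama, other services on the VM (or across the tailnet) can call its HTTP API directly. The model name below is an assumption, not necessarily what this node serves.

```shell
# One-shot generation against the local Ollama server (default port 11434);
# "llama3.1" is a placeholder model name
curl -s http://localhost:11434/api/generate -d '{
  "model": "llama3.1",
  "prompt": "In one sentence, what is VFIO passthrough?",
  "stream": false
}'
```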
Embedder / Inference Node
A dedicated secondary node handles embedding generation and text-to-speech workloads, keeping those tasks off the main inference GPU.
Embedder Node
| OS | Debian 13 (Trixie) |
| GPU | GTX 1650 Super 4GB |
| Role | Embeddings + TTS worker |
| Status | Operational |
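If the embedding worker exposes an Ollama-compatible endpoint (an assumption; the hostname, port, and model below are all placeholders), requesting a vector from another node would look something like:

```shell
# Request an embedding from the embedder node over the LAN/tailnet
curl -s http://embedder:11434/api/embed -d '{
  "model": "nomic-embed-text",
  "input": "ACT-R activation decays with time since last retrieval."
}'
```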
Network
Both physical nodes sit on the local LAN, with Tailscale VPN providing secure remote access. Services are reachable from anywhere through the mesh network without exposing any ports to the internet.
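Checking mesh connectivity from any node uses Tailscale's own CLI; with MagicDNS enabled, tailnet hostnames resolve directly. The node name below comes from this write-up, but the SSH user is a placeholder.

```shell
# Show all peers in the tailnet and their reachability
tailscale status

# Print this node's Tailscale IPv4 address
tailscale ip -4

# SSH to the AI VM over the mesh, with no ports exposed to the internet
ssh user@ageis-node
```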
Network Topology
| LAN | Private local network |
| VPN | Tailscale mesh |
| Remote Access | Tailscale + SSH |
| DNS | Local resolver |
ROCm GPU Stack
The AMD Radeon AI PRO R9700 runs ROCm on Linux for local GPU inference. ROCm is AMD's open compute stack, filling the role CUDA plays on NVIDIA hardware, and it is what makes local AI inference possible without depending on NVIDIA.
The GPU is passed through from Proxmox to the ageis-node VM using KVM GPU passthrough. This gives the VM direct hardware access with near-native performance while keeping it isolated from the host.
```shell
$ rocm-smi --showmeminfo vram
GPU[0] : VRAM Total: 32GB
GPU[0] : VRAM Used: variable (model dependent)

$ rocminfo | grep "Name"
  Name:            gfx1201
  Marketing Name:  AMD Radeon AI PRO R9700
```
Running Services
- OpenClaw — AI agent platform (primary workload)
- Ollama — Local LLM inference server
- Embedding service — Vector embeddings on the embedder node
- TTS service — Text-to-speech on the embedder node
- PostgreSQL — ACT-R memory store + application data
- Tailscale — Mesh VPN across all nodes
- Monitoring — Resource tracking and alerting
Why Local-First
Running inference locally means no API rate limits, no per-token costs for routine tasks, no data leaving the network, and no dependency on external services being available. The cloud is a tool, not a requirement — heavy reasoning goes to Claude API, everything else stays local. Read the full argument in Why Local AI Matters, or see how this hardware powers the OpenClaw agent platform. For the complete infrastructure deep-dive, see the Homelab Architecture write-up.