Introducing DGX Lab

Justin Goheen, AI/ML Engineer · 4 min read

DGX Lab is a local-first developer dashboard for the NVIDIA DGX Spark. It bundles eight tools for model management, experiment tracking, agent observability, GPU profiling, training recipes, synthetic data, data curation, and dataset browsing -- all memory-aware against the machine's 128 GB of unified LPDDR5X.

This post walks through what it is, what it runs on, and how it fits together.

The hardware

The DGX Spark is a desktop with a GB10 Grace Blackwell SoC: an Arm-based Grace CPU and a Blackwell GPU sharing 128 GB of unified memory at roughly 273 GB/s. The GPU supports FP4, FP8, and FP16 precision. There is no discrete VRAM -- the memory pool is shared between CPU and GPU, so model weights, KV cache, and everything else running on the system draw from the same 128 GB. That changes how you budget for model loading and inference.
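To make the bandwidth figure concrete, here is a rough sketch of a common back-of-the-envelope estimate. It assumes single-stream decoding of a dense model is limited by memory bandwidth, with each generated token reading roughly all of the weights once. Batching, KV cache reads, and kernel efficiency are ignored, so treat the result as a loose ceiling, not a benchmark. The function name is illustrative and not part of DGX Lab.

```python
def decode_tokens_per_second_ceiling(model_gb: float, bandwidth_gb_s: float = 273.0) -> float:
    """Loose upper bound on single-stream decode speed for a dense model.

    Assumption: every generated token streams all weights from memory once,
    so throughput cannot exceed bandwidth / model size.
    """
    if model_gb <= 0:
        raise ValueError("model_gb must be positive")
    return bandwidth_gb_s / model_gb


# Example: a model whose weights occupy 38 GB, at ~273 GB/s.
ceiling = decode_tokens_per_second_ceiling(38.0)
print(f"at most ~{ceiling:.1f} tokens/s")
```

Under these assumptions, larger models trade capacity for speed: the pool is big enough to hold them, but the bandwidth sets how fast they can generate.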

[Figure: DGX Spark System Architecture. NVIDIA DGX Spark · GB10 Grace Blackwell · Desktop AI]

Memory budget

128 GB sounds like a lot until you load a 70B model. Quantization helps -- a 70B model at Q4 fits in around 38 GB, leaving room for KV cache, activations, and system overhead. DGX Lab's Control tool shows this breakdown for every model in your HuggingFace cache.
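The arithmetic behind a budget like this can be sketched in a few lines. This is a minimal illustration, not the Control tool's actual implementation. The model shape (80 layers, 8 KV heads, head dimension 128, similar to a Llama-3-70B-style architecture), the 8192-token context, and the 8 GB system overhead are all assumptions chosen for the example.

```python
from dataclasses import dataclass

TOTAL_GB = 128.0  # unified memory on the DGX Spark, from the post


def weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Weight footprint in decimal GB: 1e9 params * bits / 8 bytes each."""
    return params_billion * bits_per_weight / 8


def kv_cache_gb(layers: int, kv_heads: int, head_dim: int,
                context_tokens: int, bytes_per_value: int = 2) -> float:
    """KV cache for one sequence; the leading 2 covers keys and values."""
    return 2 * layers * kv_heads * head_dim * context_tokens * bytes_per_value / 1e9


@dataclass
class MemoryFit:
    weights: float
    kv_cache: float
    overhead: float
    total: float = TOTAL_GB

    @property
    def available(self) -> float:
        return self.total - (self.weights + self.kv_cache + self.overhead)

    @property
    def fits(self) -> bool:
        return self.available >= 0


# Hypothetical inputs: 70B parameters at a flat 4 bits per weight.
fit = MemoryFit(
    weights=weights_gb(70, 4),
    kv_cache=kv_cache_gb(layers=80, kv_heads=8, head_dim=128, context_tokens=8192),
    overhead=8.0,  # assumed OS and runtime overhead, not a measured value
)
print(f"weights {fit.weights:.1f} GB, KV cache {fit.kv_cache:.2f} GB, "
      f"available {fit.available:.1f} GB, fits={fit.fits}")
```

A flat 4 bits per weight gives 35 GB. Real 4-bit formats also store per-block scales alongside the weights, so files come out somewhat larger, which is consistent with the roughly 38 GB figure above.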

[Figure: 128 GB Unified Memory Budget. DGX Spark GB10 · 128 GB LPDDR5X · ~273 GB/s]

The chart above shows a typical allocation when serving a 70B Q4 model. The "Available" segment is what's left for additional workloads, experiments, or a second model.

The tools

DGX Lab ships eight tools, each backed by a FastAPI router and rendered by a Next.js page:

| Tool | Route | What it does |
| --- | --- | --- |
| Control | /control | Model library -- scan HF cache, search Hub, pull models, memory fit estimates |
| Logger | /logger | Experiment tracker -- run metrics from SQLite, Parquet, JSONL |
| Traces | /traces | Agent trace viewer -- span waterfall, cost/token aggregation |
| Monitor | /monitor | GPU dashboard -- gauges, system timeline, process table |
| AutoModel | /automodel | NeMo training recipes -- SFT, LoRA, pretraining, distillation, QAT |
| Designer | /designer | Synthetic data generation with provider/model config |
| Curator | /curator | NeMo Curator data curation pipelines and stage browser |
| Datasets | /datasets | Local and HuggingFace dataset browser with row preview |
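As an illustration of what "scan HF cache" involves, the sketch below walks the standard Hugging Face hub cache layout, where each model lives in a `models--<org>--<name>` directory and its file contents sit under `blobs/`. It is a standard-library approximation, not the Control tool's code. The function name is hypothetical, and the `huggingface_hub` library's own `scan_cache_dir()` is the more complete option.

```python
from pathlib import Path


def scan_hf_cache(cache_dir: Path) -> dict[str, int]:
    """Map each cached model repo id to its on-disk size in bytes.

    Sizes are summed over blobs/ so files shared between snapshots
    (which are symlinks into blobs/) are counted once.
    """
    sizes: dict[str, int] = {}
    for repo_dir in sorted(cache_dir.glob("models--*")):
        if not repo_dir.is_dir():
            continue
        # "models--org--name" -> "org/name"; ambiguous if a name itself contains "--".
        repo_id = repo_dir.name[len("models--"):].replace("--", "/")
        blobs = repo_dir / "blobs"
        total = 0
        if blobs.is_dir():
            total = sum(p.stat().st_size for p in blobs.iterdir() if p.is_file())
        sizes[repo_id] = total
    return sizes


if __name__ == "__main__":
    # Default cache location; HF_HOME or HF_HUB_CACHE can override it.
    default_cache = Path.home() / ".cache" / "huggingface" / "hub"
    if default_cache.is_dir():
        for repo_id, size in scan_hf_cache(default_cache).items():
            print(f"{repo_id}: {size / 1e9:.2f} GB")
```

Comparing each size against the remaining memory budget is then enough to draw a simple fit bar per model.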

Every tool that touches model loading or training reasons about the 128 GB budget. Control shows memory fit bars. Monitor tracks GPU utilization in real time. AutoModel validates that a training recipe will fit before launching.
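A pre-launch fit check for training can be approximated with per-parameter byte counts. The sketch below is an assumption-laden illustration, not how AutoModel validates recipes. It uses the common rule of thumb of about 16 bytes per trainable parameter for mixed-precision Adam (half-precision weights and gradients plus full-precision master weights and two optimizer moments), ignores activations, and reserves a hypothetical 16 GB for the rest of the system.

```python
TOTAL_GB = 128.0  # unified memory on the DGX Spark


def full_finetune_gb(params_billion: float, bytes_per_param: float = 16.0) -> float:
    """Weights + gradients + Adam state for every parameter (activations ignored)."""
    return params_billion * bytes_per_param


def lora_gb(params_billion: float, trainable_fraction: float = 0.01,
            base_bytes: float = 2.0, trainable_bytes: float = 16.0) -> float:
    """Frozen half-precision base weights plus full training state for the adapters."""
    frozen = params_billion * base_bytes
    adapters = params_billion * trainable_fraction * trainable_bytes
    return frozen + adapters


def recipe_fits(required_gb: float, total_gb: float = TOTAL_GB,
                reserve_gb: float = 16.0) -> bool:
    """True if the estimate leaves the assumed reserve free for the OS and activations."""
    return required_gb <= total_gb - reserve_gb


for name, need in [("8B full SFT", full_finetune_gb(8)), ("8B LoRA", lora_gb(8))]:
    verdict = "fits" if recipe_fits(need) else "does not fit"
    print(f"{name}: ~{need:.1f} GB -> {verdict}")
```

Under these assumptions a full fine-tune of an 8B model already consumes the whole pool, while a LoRA recipe on the same model leaves most of it free. That gap is the kind of difference a fit check needs to surface before a run starts.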

The stack

Frontend: Next.js 16 · React 19 · Tailwind CSS 4 · shadcn v4
Backend: FastAPI · Python 3.12 · uv
Deploy: Docker Compose · nginx · Tailscale
Hardware: DGX Spark · GB10 Grace Blackwell · 128 GB LPDDR5X
Agent: LangChain · Claude 3.5 Haiku (Bedrock) · FAISS + cuVS · LangSmith

The frontend is a Turborepo monorepo with Bun workspaces. apps/web is the Next.js app; packages/ui holds shared shadcn components. The backend runs on the Spark itself -- FastAPI with async routers, one per tool. Docker Compose bundles frontend, backend, and an nginx reverse proxy. Tailscale provides remote access from a Mac or any device on the tailnet.

Architecture

The DGX Lab Agent is a RAG-backed assistant that treats the entire codebase as its knowledge base. It uses nvidia/llama-embed-nemotron-8b for embeddings, FAISS with cuVS GPU acceleration for vector search, and nvidia/llama-nemotron-rerank-1b-v2 for reranking. The LLM is Claude 3.5 Haiku via AWS Bedrock. All interactions are traced with LangSmith.
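The retrieve-then-rerank flow described above can be sketched with the standard library alone. Everything here is a stand-in: word-count vectors replace the embedding model, brute-force cosine similarity replaces FAISS, and a word-overlap score replaces the reranker. The function names and sample chunks are hypothetical. The point is only the shape of the pipeline: a cheap wide search followed by a more careful pass over a short candidate list.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase word counts (stand-in for a learned model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 3) -> list[str]:
    """Stage 1: brute-force nearest neighbours (stand-in for a vector index)."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def rerank(query: str, candidates: list[str], top_n: int = 1) -> list[str]:
    """Stage 2: toy reranker scoring the fraction of query words in each candidate."""
    q_words = set(query.lower().split())

    def score(candidate: str) -> float:
        return len(q_words & set(candidate.lower().split())) / (len(q_words) or 1)

    return sorted(candidates, key=score, reverse=True)[:top_n]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text that would be sent to the LLM."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "the monitor router reports gpu utilization",
    "the logger router reads metrics from sqlite",
    "design tokens define the cyan accent",
]
question = "which router reads metrics"
best = rerank(question, retrieve(question, chunks, k=2))
print(build_prompt(question, best))
```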

Design language

Dark base. Cyan is earned -- active states, live-status dots, critical metrics only. Monospace for machine data, sans for navigation and prose. Density over decoration. The dashboard should feel like it was built by someone who runs large models at 2am, not a marketing team.

| Token | Value | Use |
| --- | --- | --- |
| --background | #09090b | Page base |
| --surface | #0f0f12 | Sidebar, panels |
| --elevated | #161619 | Cards, inputs |
| --color-cyan | #22d3ee | Primary accent -- scarce |

What's next

DGX Lab is open source under Apache 2.0. Clone it, fork it, adapt it for your hardware. The architecture is modular: adding a tool means one FastAPI router, one Next.js page, one sidebar entry.

git clone https://github.com/jxtngx/dgx-lab.git
cd dgx-lab
make dev

The docs cover setup, tool guides, and API reference. The codebase is the documentation.