
Paths, memory constants, and config

· 3 min read
Justin Goheen
AI/ML Engineer

Tools in DGX Lab assume local disk and Spark-class memory. Defaults match a stock DGX Spark layout; everything below can be overridden with environment variables read in backend/app/config.py.

Config surface in config.py
All 14 env-backed knobs live in that one file; grep it for `DGX_LAB_` and `DATA_DESIGNER_HOME`.

Model and experiment data

| Default path | Env var | Consumers |
| --- | --- | --- |
| `~/.cache/huggingface/hub` | `DGX_LAB_MODELS_DIR` | Control — cache scan, Hub pull, memory fit |
| `~/.dgx-lab/experiments` | `DGX_LAB_EXPERIMENTS_DIR` | Logger — SQLite / Parquet / JSONL metrics |
| `~/.dgx-lab/traces` | `DGX_LAB_TRACES_DIR` | Traces tool — JSONL agent traces |
| `~/.dgx-lab/designer` | `DGX_LAB_DESIGNER_DIR` | Designer — generated output |
| `~/.data-designer` | `DATA_DESIGNER_HOME` | Designer library config (`model_providers.yaml`, `model_configs.yaml`; owned by data-designer) |
| `~/.dgx-lab/curator` | `DGX_LAB_CURATOR_DIR` | Curator — pipelines and job state |
| `~/.dgx-lab/datasets` | `DGX_LAB_DATASETS_DIR` | Datasets browser — local staging |

If a directory does not exist yet, create it or run the tool that populates it — the backend does not magically scaffold all of these on first boot.
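Since the backend will not scaffold everything for you, creating a path up front is a one-liner. A hedged sketch (`ensure_dir` is a hypothetical helper, not part of config.py):

```python
from pathlib import Path

def ensure_dir(path: str) -> Path:
    """Create the directory (and any missing parents); no-op if it already exists."""
    p = Path(path).expanduser()
    p.mkdir(parents=True, exist_ok=True)
    return p

# e.g. before pointing the Logger at a fresh volume:
# ensure_dir("~/.dgx-lab/experiments")
```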

Agent, transcripts, and LangSmith

These paths support the in-dashboard agent, RAG index, and trace export — they are easy to miss if you only read the “eight tools” table.

| Default path | Env var | Role |
| --- | --- | --- |
| `~/.cursor/projects/<slug>/agent-transcripts` when present, else `~/.dgx-lab/agent-transcripts` | `DGX_LAB_AGENT_TRANSCRIPTS_DIR` | Cursor agent transcript ingestion |
| `~/.claude/projects` | `DGX_LAB_CLAUDE_TRANSCRIPTS_DIR` | Claude project transcripts |
| `~/.dgx-lab/langsmith-traces` | `DGX_LAB_LANGSMITH_TRACES_DIR` | LangSmith trace JSONL landing zone |
| `~/.dgx-lab/agent` | `DGX_LAB_AGENT_INDEX_DIR` | Embeddings / vector index backing RAG |
| Current working directory of the process | `DGX_LAB_CODEBASE_ROOT` | Codebase root for indexing and context |

Set these explicitly when you run multiple clones, store caches on another volume, or keep transcripts in a non-default Cursor layout.

Memory display constants

Unified memory on the Spark is 128 GB, with a ~273 GB/s bandwidth ceiling quoted in product messaging. DGX Lab bakes those numbers into its gauges and "fit" math unless you override them:

| Env var | Default | Purpose |
| --- | --- | --- |
| `DGX_LAB_MEMORY_TOTAL_GB` | 128 | Total pool for UI bars and estimates |
| `DGX_LAB_MEMORY_BW_MAX_GBS` | 273 | Upper bound for bandwidth visualization |

If you run the same codebase on different hardware (not the intended case, but possible), adjust these so the UI does not lie about headroom.

Memory UI defaults (overridable):

- Unified pool size (gauges, fit math): Spark default 128 GB; change it if you point the UI at different hardware.
- Bandwidth ceiling (visualization cap): default 273 GB/s, with the chart axis capped at 320 for headroom.

The same constants feed the memory fit story in Control: a typical 70B Q4 load against the 128 GB pool looks like the breakdown below (illustrative segments — your cache and workload will differ).
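As a rough worked example of that fit math (all numbers here are my assumptions, not the actual Control breakdown: Q4-family quants land around 4.5 bits per weight once scales are included, and the real estimate also has to budget KV cache, runtime overhead, and the OS):

```python
def q4_weights_gb(params_b: float, bits_per_weight: float = 4.5) -> float:
    """Approximate weight footprint in GB for a Q4-family quant.

    4.5 bits/weight is an assumption; real GGUF Q4 variants vary."""
    return params_b * bits_per_weight / 8

weights_gb = q4_weights_gb(70)   # ~39.4 GB of weights for a 70B model
headroom_gb = 128 - weights_gb   # remainder shared by KV cache, runtime, and the OS
```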

*(Chart: 128 GB Unified Memory Budget. DGX Spark GB10, 128 GB LPDDR5X, ~273 GB/s.)*

Why this matters

  • Control and Monitor tie directly to GPU visibility and cache paths — wrong DGX_LAB_MODELS_DIR means an empty library or wrong sizes.
  • Logger / Traces / Datasets are pure filesystem contracts — if the path is wrong, the UI is empty, not “broken.”
  • Agent behavior depends on DGX_LAB_CODEBASE_ROOT and index dir — a mismatched root indexes the wrong tree.

Related reading: Introducing DGX Lab for architecture context; the repo's README.md duplicates the high-level table for quick grepping.