# Parameter System Overview
VLA Foundry uses a hierarchical, immutable dataclass system built on draccus to configure every aspect of a training experiment. All configuration flows through a single top-level object --- `TrainExperimentParams` --- which composes several specialized parameter groups.
## Hierarchy
```
TrainExperimentParams
|-- model: ModelParams              # Architecture definition
|-- data: DataParams                # Dataset and preprocessing
|-- hparams: HyperParams            # Optimization and training
|-- distributed: DistributedParams  # Multi-GPU / multi-node
|-- ema: EMAParams                  # Exponential moving average
```
Each nested parameter class is an immutable frozen dataclass. Fields are populated from YAML config files, command-line arguments, or a combination of both, with command-line arguments always taking precedence.
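The precedence rule can be illustrated with a minimal, self-contained sketch (plain dicts standing in for the actual draccus machinery; the function name and config fields are illustrative assumptions):

```python
def apply_cli_overrides(yaml_cfg: dict, overrides: dict) -> dict:
    """Merge dot-notation CLI overrides into a nested YAML config.

    CLI values always win over YAML values, mirroring the
    precedence rule described above.
    """
    merged = {k: dict(v) if isinstance(v, dict) else v for k, v in yaml_cfg.items()}
    for dotted_key, value in overrides.items():
        node = merged
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            node = node.setdefault(part, {})
        node[leaf] = value  # the CLI value replaces whatever YAML supplied
    return merged

yaml_cfg = {"hparams": {"lr": 1e-4, "batch_size": 256}}
cfg = apply_cli_overrides(yaml_cfg, {"hparams.lr": 3e-4})
```

Only the overridden key changes; sibling YAML values pass through untouched.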
## Design Principles

### Separation of Concerns
Each parameter group owns a distinct slice of the configuration:
| Group | Responsibility |
|---|---|
| `ModelParams` | What to train --- architecture type, layer counts, pretrained weights |
| `DataParams` | What to train on --- dataset paths, modality, field definitions, normalization |
| `HyperParams` | How to train --- learning rate, batch size, optimizer, precision |
| `DistributedParams` | Where to train --- FSDP, world size, backend |
| `EMAParams` | Whether and how to maintain an EMA copy of the model |
This separation means you can swap a model preset without touching data configuration, or change the optimizer without modifying the architecture.
### Immutability
All parameter dataclasses are declared with `frozen=True`. Once constructed, fields cannot be reassigned. This guarantees that the configuration logged to Weights & Biases or saved alongside checkpoints is exactly what was used during training.
Derived fields (like `accum_freq` on `HyperParams`) are computed as `@property` accessors rather than stored as mutable attributes.
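A minimal sketch of the pattern (the class and field names here are simplified assumptions modeled on the `accum_freq` example, not the actual `HyperParams` source):

```python
from dataclasses import FrozenInstanceError, dataclass

@dataclass(frozen=True)
class DemoHyperParams:
    global_batch_size: int = 256
    micro_batch_size: int = 32

    @property
    def accum_freq(self) -> int:
        # Derived value: recomputed on access, never stored or mutated.
        return self.global_batch_size // self.micro_batch_size

hp = DemoHyperParams()
print(hp.accum_freq)

try:
    hp.global_batch_size = 512  # rejected: the dataclass is frozen
except FrozenInstanceError:
    print("frozen: fields cannot be reassigned")
```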
### Shared Attributes
Some values need to flow between parameter groups. For example, `DiffusionPolicyParams` needs the `action_dim` that is computed inside `DataParams`. This is handled by `init_shared_attributes(cfg)`, which is called during `TrainExperimentParams.__post_init__` and passes the full config to each sub-parameter group so it can read cross-cutting values.
```python
# Inside DiffusionPolicyParams
def init_shared_attributes(self, cfg):
    super().init_shared_attributes(cfg)
    object.__setattr__(self, "action_dim", cfg.data.action_dim)
    object.__setattr__(self, "proprioception_dim", cfg.data.proprioception_dim)
```
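The mechanism can be shown end to end with a self-contained sketch (the demo class and a `SimpleNamespace` config stand in for the real parameter classes; names are illustrative assumptions):

```python
from dataclasses import dataclass
from types import SimpleNamespace

@dataclass(frozen=True)
class DemoModelParams:
    action_dim: int = -1  # placeholder until shared attributes are set

    def init_shared_attributes(self, cfg) -> None:
        # object.__setattr__ bypasses the frozen check, so cross-cutting
        # values can be written exactly once during initialization.
        object.__setattr__(self, "action_dim", cfg.data.action_dim)

cfg = SimpleNamespace(data=SimpleNamespace(action_dim=7))
model = DemoModelParams()
model.init_shared_attributes(cfg)
```

Using `object.__setattr__` is the standard escape hatch for writing to a frozen dataclass; after this one-time initialization, ordinary assignment still raises as expected.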
### Registry-Based Polymorphism
Both `ModelParams` and `DataParams` use the draccus `ChoiceRegistry` for subclass selection at runtime. You specify the concrete type via a `type` field:
```yaml
model:
  type: diffusion_policy  # selects DiffusionPolicyParams
data:
  type: robotics          # selects RoboticsDataParams
```
New types are registered with decorators:
```python
@register_model_params("my_new_model")
@dataclass(frozen=True)
class MyNewModelParams(ModelParams):
    ...
```
## How Configs Are Loaded
- YAML files are the primary source. Use `!include` directives to compose from presets.
- Command-line arguments override any YAML value using dot-notation (`--hparams.lr 3e-4`).
- `__post_init__` runs validation and derives computed fields.
- `init_shared_attributes` propagates cross-cutting values between sub-params.
- `check_asserts` validates consistency (batch sizes, dataset manifest lengths, etc.).
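The validation step can be sketched with a minimal frozen dataclass (the class name, fields, and specific check are illustrative assumptions, not the actual `check_asserts` logic):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ValidatedHyperParams:
    global_batch_size: int = 256
    micro_batch_size: int = 32

    def __post_init__(self):
        # Consistency is checked at construction time, so an invalid
        # config fails fast instead of surfacing mid-training.
        if self.global_batch_size % self.micro_batch_size != 0:
            raise ValueError(
                "global_batch_size must be divisible by micro_batch_size"
            )
```

Because the check lives in `__post_init__`, no instance with inconsistent fields can ever exist.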
```python
from vla_foundry.params.train_experiment_params import load_experiment_params_from_yaml

params = load_experiment_params_from_yaml("path/to/config.yaml")
```
**Resolving Configs:** Set `--resolve_configs True` to print the fully merged configuration and exit without training. Optionally set `--resolve_configs_path ./` to also save it to `resolved_config.yaml`.
## Source Locations
| File | Contents |
|---|---|
| `vla_foundry/params/train_experiment_params.py` | `TrainExperimentParams`, YAML loading helpers |
| `vla_foundry/params/model_params.py` | `ModelParams` and all model-specific subclasses |
| `vla_foundry/params/base_data_params.py` | `DataParams` base class |
| `vla_foundry/params/data_params.py` | `TextDataParams`, `ImageCaptionDataParams`, `RoboticsDataParams` |
| `vla_foundry/params/hyper_params.py` | `HyperParams` |
| `vla_foundry/params/distributed_params.py` | `DistributedParams` |
| `vla_foundry/params/ema_params.py` | `EMAParams` |
## Next Steps
- TrainExperimentParams --- top-level fields
- ModelParams --- architecture configuration
- DataParams --- dataset configuration
- HyperParams --- optimization configuration