Config Reference
This page summarizes the configuration surface exposed through the containers.py dataclasses that back the CLI and Python APIs. Every block below lists defaults and the preset bundles applied by from_preset(...) helpers. Values are shown as declared in code; all options are ASCII-safe and can be overridden via YAML or dot-keys on the CLI.
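As a rough illustration of how dot-key overrides can map onto nested dataclasses, the sketch below applies `section.field` keys to a small config object. The classes are simplified stand-ins with illustrative defaults, and `apply_dot_overrides` is a hypothetical helper, not the package's actual implementation.

```python
from dataclasses import dataclass, field


@dataclass
class IOConfig:
    ploidy: int = 2
    verbose: bool = False


@dataclass
class TrainConfig:
    batch_size: int = 64          # illustrative default
    learning_rate: float = 1e-3   # illustrative default


@dataclass
class Config:
    io: IOConfig = field(default_factory=IOConfig)
    train: TrainConfig = field(default_factory=TrainConfig)


def apply_dot_overrides(cfg, overrides):
    """Set nested attributes from {"section.field": value} pairs (hypothetical helper)."""
    for dotted, value in overrides.items():
        *parents, leaf = dotted.split(".")
        target = cfg
        for name in parents:
            target = getattr(target, name)  # raises AttributeError on unknown sections
        if not hasattr(target, leaf):
            raise AttributeError(f"unknown config key: {dotted}")
        setattr(target, leaf, value)
    return cfg


cfg = apply_dot_overrides(Config(), {"io.ploidy": 1, "train.batch_size": 32})
print(cfg.io.ploidy, cfg.train.batch_size, cfg.train.learning_rate)
```

Rejecting unknown keys up front keeps a mistyped override from silently creating a new attribute.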
Common Blocks (Unsupervised NN)
The four deep imputer families (Autoencoder, VAE, NLPCA, UBP) share these structures.
| Field | Default | Description |
|---|---|---|
| | | Run/output prefix used for directories and logging. |
| | | Ploidy level ( |
| | | Logging verbosity. |
| | | RNG seed. |
| | | Parallel jobs for Optuna. |
| | | Averaging mode for metrics ( |

| Field | Default | Description |
|---|---|---|
| | | Latent init ( |
| | | Latent width. |
| | | Dropout applied to hidden layers. |
| | | Count of hidden layers. |
| | | Hidden non-linearity ( |
| | | Scales hidden widths. |
| | | Width layout ( |

| Field | Default | Description |
|---|---|---|
| | | Mini-batch size. |
| | | Base LR. |
| | | L1 regularization. |
| | | Early-stop patience (epochs). |
| | | Epoch bounds. |
| | | Holdout fraction. |
| | | Cap on class-weight ratio. |
| | | Power scaling for class weights. |
| | | Whether to normalize weights. |
| | | Whether to invert weights. |
| | | Focal-loss gamma. |
| | | Whether to anneal gamma during training. |
| | | |

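For orientation, the standard focal loss multiplies the cross-entropy of a sample by (1 - p_t)^gamma, so gamma=0.0 recovers plain cross-entropy. The sketch below is a generic pure-Python version with an optional per-class weight; `focal_loss` is an illustrative name and the code is not taken from the package.

```python
import math


def focal_loss(probs, target, gamma=0.0, class_weight=1.0):
    """Focal loss for one sample: -w * (1 - p_t)**gamma * log(p_t)."""
    p_t = probs[target]
    return -class_weight * (1.0 - p_t) ** gamma * math.log(p_t)


probs = [0.7, 0.2, 0.1]  # predicted class probabilities for one genotype call
ce = focal_loss(probs, target=0, gamma=0.0)  # identical to cross-entropy
fl = focal_loss(probs, target=0, gamma=2.0)  # confident example is down-weighted
print(round(ce, 4), round(fl, 4))
```

Larger gamma values shift the training signal toward examples the model still gets wrong, which is why the setting is usually paired with class weighting on imbalanced genotype classes.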
| Field | Default | Description |
|---|---|---|
| | | Turn on Optuna. |
| | | Objective metric (or list of metrics) used for Optuna tuning ( |
| | | Number of trials. |
| | | Reuse or persist Optuna DB. |
| | | Per-trial training limits used by model-specific tuning loops (when supported). |
| | | Model-specific patience setting used during tuning (when supported). |

| Field | Default | Description |
|---|---|---|
| | | Output format. |
| | | Resolution. |
| | | Base font size. |
| | | Remove top/right spines. |
| | | Interactive display toggle. |

| Field | Default | Description |
|---|---|---|
| | | Whether to simulate missingness for eval (required for unsupervised models). |
| | | |
| | | Proportion to mask. |
| | | Extra args forwarded to |

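To make the evaluation idea concrete: simulated missingness hides a fraction of the observed genotype calls so that imputations can be scored against known values. The sketch below shows plain random masking only, assuming integer-coded genotypes with -1 marking missing calls; `mask_random` is a hypothetical helper, not the package's strategy implementation.

```python
import random


def mask_random(genotypes, prop=0.2, missing=-1, seed=42):
    """Hide `prop` of the observed calls; return a masked copy and the hidden positions."""
    rng = random.Random(seed)
    # Only calls that are currently observed are eligible for masking.
    observed = [
        (i, j)
        for i, row in enumerate(genotypes)
        for j, g in enumerate(row)
        if g != missing
    ]
    n_mask = int(round(prop * len(observed)))
    hidden = rng.sample(observed, n_mask)
    masked = [list(row) for row in genotypes]  # leave the input untouched
    for i, j in hidden:
        masked[i][j] = missing
    return masked, sorted(hidden)


genotypes = [
    [0, 1, 2, 0, 1],
    [1, -1, 0, 2, 2],
    [0, 0, 1, 1, -1],
    [2, 1, 0, 0, 1],
]
masked, hidden = mask_random(genotypes, prop=0.2)
print(len(hidden), "calls hidden for evaluation")
```

The hidden positions act as the evaluation set: after imputation, predictions at those coordinates are compared with the original values.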
Field-by-field notes
IOConfig:
- ploidy: Set to 1 for haploids; controls class count and decoding.
- n_jobs: Controls Optuna parallelism.
ModelConfig:
- latent_dim: Governs compression strength; higher values retain more signal but add capacity and raise the risk of overfitting.
- layer_schedule: pyramid shrinks widths toward the bottleneck; linear steps widths linearly.
TrainConfig:
- weights_power: Adjusts how aggressive class weighting is (e.g., 0.5 for square-root weighting, 1.0 for standard inverse frequency).
- gamma: Controls focal loss behavior. Use gamma_schedule=True to anneal it.
TuneConfig:
- metrics: Provide a string for single-objective tuning or a list/tuple for multi-objective tuning.
SimConfig:
- sim_strategy: nonrandom strategies require a tree parser.
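One way to read the class-weight options together is: raise inverse class frequency to weights_power, cap the spread at weights_max_ratio, then optionally rescale to a mean of 1. The sketch below follows that reading; the order of operations and the normalization are assumptions, and `class_weights` is a hypothetical helper rather than the package's code.

```python
from collections import Counter


def class_weights(labels, power=1.0, max_ratio=None, normalize=True):
    """Inverse-frequency class weights with optional power, cap, and mean-1 scaling."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Inverse frequency raised to `power` (0.5 gives a square-root softening).
    weights = {c: (total / n) ** power for c, n in counts.items()}
    if max_ratio is not None:
        # Cap every weight at `max_ratio` times the smallest weight.
        ceiling = min(weights.values()) * max_ratio
        weights = {c: min(w, ceiling) for c, w in weights.items()}
    if normalize:
        mean = sum(weights.values()) / len(weights)
        weights = {c: w / mean for c, w in weights.items()}
    return weights


labels = [0] * 80 + [1] * 16 + [2] * 4  # imbalanced genotype classes
print(class_weights(labels, power=1.0))
print(class_weights(labels, power=0.5, max_ratio=3.0))
```

With power=0.5 the rare class is still favored, but far less sharply than under plain inverse frequency, and the cap bounds how much any single class can dominate the loss.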
Unsupervised NN presets
Each model exposes from_preset("fast" | "balanced" | "thorough") to seed a baseline, then allows overrides. The presets apply to Autoencoder, VAE, NLPCA, and UBP configs.
AutoencoderConfig
Preset baseline (all presets):
- io: verbose=False, ploidy=2.
- train: validation_split=0.20, weights_max_ratio=None, weights_power=1.0, weights_normalize=True, gamma=0.0.
- model: activation="relu", layer_schedule="pyramid", layer_scaling_factor=2.0.
- sim: simulate_missing=True, sim_strategy="random", sim_prop=0.2.
- tune: enabled=False, n_trials=100.
Preset overrides:
fast:
- model: latent_dim=4, num_hidden_layers=1, dropout_rate=0.10.
- train: batch_size=128, learning_rate=2e-3, early_stop_gen=15, max_epochs=200.
- tune: patience=15.
balanced:
- model: latent_dim=8, num_hidden_layers=2, dropout_rate=0.20.
- train: batch_size=64, learning_rate=1e-3, early_stop_gen=25, max_epochs=500.
- tune: patience=25.
thorough:
- model: latent_dim=16, num_hidden_layers=3, dropout_rate=0.30.
- train: batch_size=64, learning_rate=5e-4, early_stop_gen=50, max_epochs=1000.
- tune: patience=50.
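The baseline-plus-override pattern can be pictured with plain dataclasses. The sketch below reuses the Autoencoder preset values listed above for a subset of fields; the class layout, the dataclass defaults, and the `from_preset` body are illustrative assumptions rather than the package's code.

```python
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    latent_dim: int = 8            # placeholder default for this sketch
    num_hidden_layers: int = 2     # placeholder default for this sketch
    dropout_rate: float = 0.20     # placeholder default for this sketch
    activation: str = "relu"
    layer_schedule: str = "pyramid"


@dataclass
class TrainConfig:
    batch_size: int = 64           # placeholder default for this sketch
    learning_rate: float = 1e-3    # placeholder default for this sketch
    early_stop_gen: int = 25       # placeholder default for this sketch
    max_epochs: int = 500          # placeholder default for this sketch
    validation_split: float = 0.20


# Per-preset overrides, copied from the Autoencoder preset list.
_PRESETS = {
    "fast": {
        "model": dict(latent_dim=4, num_hidden_layers=1, dropout_rate=0.10),
        "train": dict(batch_size=128, learning_rate=2e-3,
                      early_stop_gen=15, max_epochs=200),
    },
    "balanced": {
        "model": dict(latent_dim=8, num_hidden_layers=2, dropout_rate=0.20),
        "train": dict(batch_size=64, learning_rate=1e-3,
                      early_stop_gen=25, max_epochs=500),
    },
    "thorough": {
        "model": dict(latent_dim=16, num_hidden_layers=3, dropout_rate=0.30),
        "train": dict(batch_size=64, learning_rate=5e-4,
                      early_stop_gen=50, max_epochs=1000),
    },
}


@dataclass
class AutoencoderSketchConfig:
    model: ModelConfig = field(default_factory=ModelConfig)
    train: TrainConfig = field(default_factory=TrainConfig)

    @classmethod
    def from_preset(cls, preset="balanced"):
        if preset not in _PRESETS:
            raise ValueError(f"unknown preset: {preset!r}")
        cfg = cls()
        for section, values in _PRESETS[preset].items():
            target = getattr(cfg, section)
            for name, value in values.items():
                setattr(target, name, value)
        return cfg


cfg = AutoencoderSketchConfig.from_preset("fast")
cfg.train.max_epochs = 300  # a preset seeds the baseline; later overrides still apply
print(cfg.model.latent_dim, cfg.train.batch_size, cfg.train.max_epochs)
```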
VAEConfig + VAEExtraConfig
Inherits structure from Autoencoder but adds VAEExtraConfig.
| Field | Default | Description |
|---|---|---|
| | | Final KL weight. |
| | | Whether to anneal KL beta. |

Preset overrides:
fast:
- model: latent_dim=4, num_hidden_layers=2, dropout_rate=0.10.
- train: batch_size=128, learning_rate=2e-3, early_stop_gen=15, max_epochs=200.
- vae: kl_beta=0.5.
- tune: patience=15.
balanced:
- model: latent_dim=8, num_hidden_layers=4, dropout_rate=0.20.
- train: batch_size=64, learning_rate=1e-3, early_stop_gen=25, max_epochs=500.
- vae: kl_beta=1.0.
- tune: patience=25.
thorough:
- model: latent_dim=16, num_hidden_layers=8, dropout_rate=0.30.
- train: batch_size=64, learning_rate=5e-4, early_stop_gen=50, max_epochs=1000.
- vae: kl_beta=1.0.
- tune: patience=50.
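KL annealing typically ramps the KL weight from 0 up to its final value over the early epochs. The sketch below shows a linear warm-up toward kl_beta; the warm-up length (`warmup_epochs`) and the linear shape are assumptions for illustration, since the schedule actually used is not specified on this page.

```python
def kl_beta_at(epoch, kl_beta=1.0, warmup_epochs=50, anneal=True):
    """KL weight at a given epoch: linear ramp up to `kl_beta`, then constant."""
    if not anneal or warmup_epochs <= 0:
        return kl_beta
    return kl_beta * min(1.0, epoch / warmup_epochs)


def vae_loss(recon_loss, kl_div, epoch, **kwargs):
    """Total loss = reconstruction term + beta(epoch) * KL term."""
    return recon_loss + kl_beta_at(epoch, **kwargs) * kl_div


for epoch in (0, 25, 50, 100):
    print(epoch, kl_beta_at(epoch, kl_beta=0.5))
```

Starting with a small KL weight lets the decoder learn to reconstruct before the latent space is pulled toward the prior, which helps avoid posterior collapse.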
NLPCAConfig + NLPCAExtraConfig
Inherits structure from Autoencoder and adds projection controls for latent refinement.
| Field | Default | Description |
|---|---|---|
| | | Learning rate for projection refinement. |
| | | Projection steps per evaluation or inference pass. |

Preset overrides mirror Autoencoder presets; NLPCA adds only projection controls.
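Projection refinement can be understood as optimizing a sample's latent vector against a frozen decoder, using only the observed features. The sketch below does this with gradient descent on a toy linear decoder; the parameter names `lr` and `steps`, the linear decoder, and the squared-error objective are all assumptions for illustration and do not reflect the package's internals.

```python
def refine_latent(z, W, x, observed, lr=0.05, steps=100):
    """Gradient descent on z to fit the observed entries of x under a fixed decoder W."""
    z = list(z)
    for _ in range(steps):
        # Residuals on observed features only; missing entries contribute nothing.
        resid = {
            j: sum(W[j][k] * z[k] for k in range(len(z))) - x[j]
            for j in observed
        }
        # Gradient of 0.5 * sum(resid**2) with respect to each latent coordinate.
        grad = [sum(resid[j] * W[j][k] for j in observed) for k in range(len(z))]
        z = [zk - lr * gk for zk, gk in zip(z, grad)]
    return z


W = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy decoder: 2 latent dims -> 3 features
x = [0.5, -0.2, None]                     # third feature is missing
z = refine_latent([0.0, 0.0], W, x, observed=[0, 1])
imputed = sum(W[2][k] * z[k] for k in range(2))  # decode the missing feature
print([round(v, 3) for v in z], round(imputed, 3))
```

More steps or a larger learning rate tighten the fit to the observed entries, at the cost of extra compute on every evaluation or inference pass.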
UBPConfig + UBPExtraConfig
Inherits structure from Autoencoder and adds projection controls for latent refinement.
| Field | Default | Description |
|---|---|---|
| | | Learning rate for projection refinement. |
| | | Projection steps per evaluation or inference pass. |

Preset overrides mirror Autoencoder presets; UBP adds only projection controls.
Deterministic Imputers
These configurations use simpler blocks and omit neural-network-specific settings such as ModelConfig.
MostFrequentConfig / RefAlleleConfig
Common Fields:
- split.test_size: Default 0.2.
- sim.simulate_missing: Default False (enabled in presets).
- algo.missing: Default -1.
MostFrequentAlgoConfig Extra Fields:
- by_populations: False.
- default: 0.
Supervised Wrappers (RF / HistGB)
Supervised models use distinct config classes ending in ConfigSupervised.
| Field | Default | Description |
|---|---|---|
| | | Run identity. |
| | | Parallel jobs. |
| | | RNG seed. |

| Field | Default | Description |
|---|---|---|
| | | Master toggle. |
| | | Tuning metric. |
| | | Trial count. |
| | | Parallel jobs for tuning. |
| | | Whether to use faster settings (subsampling etc.). |

| Field | Default | Description |
|---|---|---|
| | | Forest size. |
| | | Depth cap. |
| | | Class weighting strategy. |

| Field | Default | Description |
|---|---|---|
| | | Boosting iterations. |
| | | Step size. |
| | | Early stopping patience. |

Presets (Supervised)
RandomForest (RFConfig):
- fast: n_estimators=50, max_iter=5, tune.enabled=False.
- balanced: n_estimators=200, max_iter=10, tune.enabled=False.
- thorough: n_estimators=500, max_depth=50, max_iter=20, tune.enabled=False.
HistGradientBoosting (HGBConfig):
- fast: n_estimators=50, learning_rate=0.2, max_iter=5.
- balanced: n_estimators=150, learning_rate=0.1, max_iter=10.
- thorough: n_estimators=500, learning_rate=0.05, max_iter=20, n_iter_no_change=20.