# Checkpoint Conversion Context

Use this pack for the `convert/*` steps.

## Product Contract

- Conversion is an explicit pipeline stage. Do not silently change downstream
  steps to consume a different checkpoint layout.
- Keep source and destination paths separate so a failed conversion cannot
  corrupt the input checkpoint.
- Verify that tokenizer and config files are written alongside the weights in
  every HF output.
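
The last two contract points can be checked before and after a conversion runs. The following is a minimal sketch, not part of any existing tool: the helper names are hypothetical, and the file names (`config.json`, `tokenizer.json`, `tokenizer.model`, `tokenizer_config.json`) are assumptions about a typical HF directory that may not match every model.

```python
from pathlib import Path
from typing import List

# Assumed file names; real checkpoints may use a different tokenizer file set.
CONFIG_FILE = "config.json"
TOKENIZER_FILES = ("tokenizer.json", "tokenizer.model", "tokenizer_config.json")


def paths_are_separate(src: str, dst: str) -> bool:
    """True when neither path equals the other or is nested inside it."""
    s, d = Path(src).resolve(), Path(dst).resolve()
    return s != d and s not in d.parents and d not in s.parents


def missing_hf_files(out_dir: str) -> List[str]:
    """List the config/tokenizer pieces absent from an HF output directory."""
    out = Path(out_dir)
    missing = []
    if not (out / CONFIG_FILE).is_file():
        missing.append(CONFIG_FILE)
    # Any one of the assumed tokenizer files counts as "tokenizer present".
    if not any((out / name).is_file() for name in TOKENIZER_FILES):
        missing.append("tokenizer files")
    return missing
```

Rejecting nested paths as well as identical ones is a deliberate choice here: writing the output inside the input directory would still modify the source checkpoint tree.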

## Step Map

| Step | Input | Output | Use when |
|---|---|---|---|
| `convert/hf_to_megatron` | `checkpoint_hf` | `checkpoint_megatron` | A Megatron-Bridge consumer needs the distributed checkpoint layout |
| `convert/megatron_to_hf` | `checkpoint_megatron` | `checkpoint_hf` | HF-native eval, deployment, merge, or optimization needs the safetensors layout |
| `convert/merge_lora` | `checkpoint_lora` + `checkpoint_hf` | `checkpoint_hf` | Adapter must become a standalone HF checkpoint |
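
The table can also be held as data so that a step's inputs are checked before it runs. This is an illustrative sketch only: `STEP_IO` and `output_of` are hypothetical names, and the artifact labels are copied from the table rather than from any real API.

```python
from typing import Iterable

# Transcribed from the step map: step -> (required inputs, output).
STEP_IO = {
    "convert/hf_to_megatron": (("checkpoint_hf",), "checkpoint_megatron"),
    "convert/megatron_to_hf": (("checkpoint_megatron",), "checkpoint_hf"),
    "convert/merge_lora": (("checkpoint_lora", "checkpoint_hf"), "checkpoint_hf"),
}


def output_of(step: str, available: Iterable[str]) -> str:
    """Return the artifact a step produces, or raise if an input is missing."""
    inputs, output = STEP_IO[step]
    have = set(available)
    missing = [name for name in inputs if name not in have]
    if missing:
        raise ValueError(f"{step} is missing inputs: {', '.join(missing)}")
    return output
```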

## Rules

- For Megatron export, point at the concrete `iter_*` checkpoint directory, not
  the parent run directory.
- For HF import, point at a clean HF model directory with config, tokenizer, and
  weights.
- For LoRA merge, use the exact base model used during adapter training.
- Keep `trust_remote_code=true` only when the HF architecture requires it and
  the source is trusted.
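
Two of these rules lend themselves to a quick pre-flight check. The sketch below rests on stated assumptions: that `iter_*` directories end in a numeric iteration count, and that the adapter directory contains a PEFT-style `adapter_config.json` with a `base_model_name_or_path` field. Both helper names are hypothetical. Neither check proves that a checkpoint was fully written, and the string comparison is only a heuristic, since the same base can be referenced by a local path or a hub ID.

```python
import json
import re
from pathlib import Path
from typing import Optional

# Assumption: iteration directories look like iter_0000100.
ITER_DIR = re.compile(r"^iter_(\d+)$")


def latest_iter_dir(run_dir: str) -> Optional[Path]:
    """Return the highest-numbered iter_* subdirectory, or None if absent."""
    candidates = []
    for child in Path(run_dir).iterdir():
        match = ITER_DIR.match(child.name)
        if child.is_dir() and match:
            candidates.append((int(match.group(1)), child))
    return max(candidates, key=lambda c: c[0])[1] if candidates else None


def adapter_base_matches(adapter_dir: str, base_model: str) -> bool:
    """Compare the base recorded in adapter_config.json with the given one."""
    config_path = Path(adapter_dir) / "adapter_config.json"
    config = json.loads(config_path.read_text())
    return config.get("base_model_name_or_path") == base_model
```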

## Pipeline Patterns

- `peft/automodel` -> `convert/merge_lora` for standalone HF output.
- `sft/megatron_bridge` -> `convert/megatron_to_hf` for HF-native eval or
  deployment.
- `sft/automodel` -> `convert/hf_to_megatron` only when a Megatron-only
  downstream step requires it.
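
One way to read these patterns is as a lookup from what an upstream step produces to what the downstream step requires, with no conversion inserted when the layouts already match. The sketch below is an assumption-laden illustration: the `PRODUCES` mapping is inferred from the patterns above rather than stated in this pack, and `conversion_step` is a hypothetical helper.

```python
from typing import Optional

# Inferred from the patterns above, not from a documented contract.
PRODUCES = {
    "peft/automodel": "checkpoint_lora",
    "sft/megatron_bridge": "checkpoint_megatron",
    "sft/automodel": "checkpoint_hf",
}

# (produced layout, required layout) -> conversion step.
CONVERSIONS = {
    ("checkpoint_lora", "checkpoint_hf"): "convert/merge_lora",
    ("checkpoint_megatron", "checkpoint_hf"): "convert/megatron_to_hf",
    ("checkpoint_hf", "checkpoint_megatron"): "convert/hf_to_megatron",
}


def conversion_step(upstream: str, required: str) -> Optional[str]:
    """Return the conversion step to insert, or None if none is needed."""
    produced = PRODUCES[upstream]
    if produced == required:
        return None  # layouts already match; do not add a conversion stage
    try:
        return CONVERSIONS[(produced, required)]
    except KeyError:
        raise ValueError(f"no single conversion from {produced} to {required}")
```

For example, `conversion_step("sft/automodel", "checkpoint_hf")` returns `None`, which mirrors the rule that `convert/hf_to_megatron` is added only when a Megatron-only consumer requires it.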

## Failure Modes

- `source_not_clean_hf_checkpoint`: use a real HF model directory, not trainer
  logs or adapter-only output.
- `bad_megatron_checkpoint_path`: use the fully written `iter_*` directory.
- `base_model_mismatch`: merge adapters only with their original base.
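
As an illustration of the first failure mode, a source directory can be screened before conversion starts. This is a rough sketch under assumptions: `diagnose_hf_source` is a hypothetical helper, and the weight file patterns (`*.safetensors`, `pytorch_model*.bin`) and the `adapter_` prefix used to skip adapter-only weights reflect common HF and PEFT naming, not a rule defined in this pack.

```python
from pathlib import Path
from typing import Optional

FAILURE = "source_not_clean_hf_checkpoint"
# Assumed naming for full-model weight files.
WEIGHT_PATTERNS = ("*.safetensors", "pytorch_model*.bin")


def diagnose_hf_source(path: str) -> Optional[str]:
    """Return the failure code for an unusable HF source, else None."""
    src = Path(path)
    if not src.is_dir():
        return FAILURE
    has_config = (src / "config.json").is_file()
    # Adapter weights (e.g. adapter_model.safetensors) are not a full model.
    weights = [
        p
        for pattern in WEIGHT_PATTERNS
        for p in src.glob(pattern)
        if not p.name.startswith("adapter_")
    ]
    return None if has_config and weights else FAILURE
```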
