calm.pipeline — Lilush API

Overview

CALM pipeline utilities. Shared helpers for tool scripts and builtins: path resolution, CTDS writing, atomic saves, and key suffix computation.

Functions

Name                    Signature
resolve_paths           resolve_paths() -> paths
ensure_calm_dir         ensure_calm_dir() -> ok, err
write_uint16_le         write_uint16_le(f, val)
write_uint32_le         write_uint32_le(f, val)
find_cmd_pos            find_cmd_pos(seq) -> pos
write_ctds              write_ctds(path, sequences, cmd_positions) -> ok, err
atomic_save             atomic_save(tmp_path, final_path) -> ok, err
parse_text_dataset      parse_text_dataset(path) -> sequences, cmd_positions_or_err, skipped
size_name_from_config   size_name_from_config(d_model, n_layers) -> name

resolve_paths() -> paths

Resolve standard CALM file paths under $HOME/.local/share/lilush/calm/

ensure_calm_dir() -> ok, err

Ensure the CALM data directory exists (recursive mkdir)
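The directory handling described above can be illustrated with a short Python sketch. It is not the module's own code: the helper names `calm_dir` and `ensure_dir` are hypothetical, and only the base directory from the description is built, since the individual file names that resolve_paths() returns are not listed here.

```python
import os


def calm_dir(home=None):
    """Return the CALM data directory under the given home directory.

    Hypothetical helper: mirrors $HOME/.local/share/lilush/calm/ from the
    description; the actual fields of `paths` are not known here.
    """
    if home is None:
        home = os.environ.get("HOME") or os.path.expanduser("~")
    return os.path.join(home, ".local", "share", "lilush", "calm")


def ensure_dir(path):
    """Create `path` and any missing parents, returning (ok, err)."""
    try:
        # exist_ok=True makes the call a no-op when the directory is present.
        os.makedirs(path, exist_ok=True)
        return True, None
    except OSError as e:
        return False, str(e)
```

Returning a status and an error string instead of raising follows the `ok, err` convention shown in the signatures on this page.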

write_uint16_le(f, val)

Write a uint16 in little-endian to a file handle

write_uint32_le(f, val)

Write a uint32 in little-endian to a file handle
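As an illustration of what little-endian output means for both writers, here is a minimal Python sketch. It assumes `f` is any binary file-like object with a `write` method; the function names are reused only for readability, and this is not the module's implementation.

```python
import io
import struct


def write_uint16_le(f, val):
    # "<H" = little-endian unsigned 16-bit: least significant byte first.
    f.write(struct.pack("<H", val))


def write_uint32_le(f, val):
    # "<I" = little-endian unsigned 32-bit.
    f.write(struct.pack("<I", val))


buf = io.BytesIO()
write_uint16_le(buf, 0x1234)
write_uint32_le(buf, 0xDEADBEEF)
data = buf.getvalue()
# data is the 6 bytes 34 12 EF BE AD DE
```

In this sketch, `struct.pack` raises `struct.error` for values outside the unsigned range. How the real functions treat out-of-range values is not stated on this page.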

find_cmd_pos(seq) -> pos

Find the 0-indexed position of the ATN token in a sequence.

Returns the position if ATN is present; otherwise returns 0 (full-sequence loss).
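The lookup can be sketched in Python as follows. Two things here are assumptions: the ATN token id is passed in as a parameter (`atn_token`, hypothetical, since its value is not given on this page), and the first occurrence wins if ATN appears more than once.

```python
def find_cmd_pos(seq, atn_token):
    """Return the 0-indexed position of atn_token in seq, or 0 if absent."""
    for i, tok in enumerate(seq):
        if tok == atn_token:
            return i
    # No ATN token: fall back to 0, i.e. loss over the full sequence.
    return 0


ATN = 3  # hypothetical token id, for illustration only
pos_found = find_cmd_pos([5, 9, ATN, 7], ATN)   # 2
pos_missing = find_cmd_pos([5, 9, 7], ATN)      # 0
```

Under this convention a result of 0 covers both a missing ATN and an ATN at the very first position, so a caller cannot tell the two apart from the return value alone.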

write_ctds(path, sequences, cmd_positions) -> ok, err

Write a CTDS binary file from sequences and cmd_positions

atomic_save(tmp_path, final_path) -> ok, err

Atomic save: write to tmp_path, then rename to final_path
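A minimal Python sketch of the write-then-rename pattern follows. It assumes the caller has already written the complete file to `tmp_path` and that both paths are on the same filesystem; whether the real function also performs the write itself is not stated here.

```python
import os


def atomic_save(tmp_path, final_path):
    """Move a fully written temp file into place, returning (ok, err)."""
    try:
        # os.replace overwrites final_path in one step on the same
        # filesystem, so readers see either the old file or the new one.
        os.replace(tmp_path, final_path)
        return True, None
    except OSError as e:
        return False, str(e)
```

A caller would write the data to something like `model.bin.tmp` (a hypothetical name), close the file, and then call `atomic_save("model.bin.tmp", "model.bin")`. If the process dies mid-write, only the temporary file is left incomplete.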

parse_text_dataset(path) -> sequences, cmd_positions_or_err, skipped

Parse a plain-text training data file into sequences and cmd_positions

Returns tables ready for write_ctds(). On success, returns sequences, cmd_positions, and skipped. On error, returns nil followed by an error string in place of cmd_positions.

size_name_from_config(d_model, n_layers) -> name

Look up a model size name from d_model and n_layers