Alignment confidence

How confidently lemmas align across the seven dictionaries, and the review queue of the few that don't align cleanly (UC-CD-05 / UC-RV-03).

This Queue Proves

This queue proves that the alignment layer can isolate the rare headword matches where normalized SLP1 equality is plausible but not safe enough to trust silently.

This Queue Does Not Prove

It does not prove the machine alignment is wrong, or that nasalized and non-nasalized variants are always separate lemmas.

Trust Block

Evidence: src/data/review/low-confidence-alignment-review.json, dictionary source pointers, and the alignment confidence output in src/data/dicts/alignment-confidence.json.
Limitations: this queue covers multi-dictionary lemma matches that pass the comparison filters; it does not inspect every possible spelling variant.
Validation: generated by npm run build-alignment-review; checked by npm run validate-review-reports and npm test.
Owner repo: csl-atlas.
Next use: review a small source-linked sample while preserving the canonical review fields.

high = every dictionary used the identical raw <k1>; medium = the headwords matched only after normalization (stripping an accent, a nasalization mark, or a homonym digit). Because all seven dictionaries store SLP1 headwords, alignment is overwhelmingly high-confidence. The medium-confidence matches in the review queue below are the rare residue, not a backlog.
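The high/medium rule above can be sketched as follows. This is a minimal illustration, not the csl-atlas implementation: the function names, the "unaligned" fallback, and the exact set of characters stripped (SLP1 accent marks /, \ and ^, the nasalization mark ~, and a trailing homonym digit) are all assumptions made for the example.

```typescript
// Hypothetical confidence labels; "unaligned" is an assumed fallback for
// headwords that do not match even after normalization.
type Confidence = "high" | "medium" | "unaligned";

// Assumed normalization: drop accent marks (/ \ ^), the nasalization
// mark (~), and any trailing homonym digits.
function normalizeHeadword(k1: string): string {
  return k1.replace(/[\/\\^~]/g, "").replace(/\d+$/, "");
}

// Classify one candidate alignment from the raw <k1> values, one per dictionary.
function classifyAlignment(rawHeadwords: string[]): Confidence {
  if (rawHeadwords.length === 0) return "unaligned";

  // high: every dictionary used the identical raw headword.
  if (new Set(rawHeadwords).size === 1) return "high";

  // medium: the headwords agree only after normalization.
  const normalized = new Set(rawHeadwords.map(normalizeHeadword));
  return normalized.size === 1 ? "medium" : "unaligned";
}
```

Under these assumptions, classifyAlignment(["o", "o"]) yields "high", while classifyAlignment(["o", "o~"]) yields "medium" and would therefore land in the review queue.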

Review queue · low-confidence alignments

All current cases are headwords distinguished only by a nasalization mark (e.g. o vs o~). A reviewer decides whether each pair is the same lemma and records the decision in src/data/review/low-confidence-alignment-review.json; decisions are keyed by reviewId and preserved across rebuilds.
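One way to carry reviewer decisions across a rebuild is to merge them by reviewId. The sketch below is illustrative only: reviewId comes from the text above, but the other field names (headwords, decision, note) and the mergeReviewDecisions helper are hypothetical and may not match the actual JSON schema or build script.

```typescript
// Hypothetical shape of one queue entry; only reviewId is named in the source.
interface ReviewEntry {
  reviewId: string;
  headwords: string[];
  decision: string | null; // assumed field: the reviewer's verdict, null if unreviewed
  note: string | null;     // assumed field: free-text reviewer comment
}

// Copy reviewer-owned fields from the previous file onto freshly generated
// entries that share the same reviewId; new entries pass through unchanged.
function mergeReviewDecisions(
  previous: ReviewEntry[],
  regenerated: ReviewEntry[],
): ReviewEntry[] {
  const prior = new Map<string, ReviewEntry>();
  for (const entry of previous) {
    prior.set(entry.reviewId, entry);
  }

  return regenerated.map((entry) => {
    const old = prior.get(entry.reviewId);
    if (old === undefined) return entry;
    return { ...entry, decision: old.decision, note: old.note };
  });
}
```

In this sketch, entries that disappear from the regenerated queue are dropped rather than archived; a real build may choose differently.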

Generated by npm run build-alignment-review (after build-dict-comparison); validated by npm run validate-review-reports. CC-BY-SA-4.0.