Alignment confidence
How confidently lemmas align across the seven dictionaries, and the review queue of the few that don't align cleanly (UC-CD-05 / UC-RV-03).
This Queue Proves
This queue proves that the alignment layer can isolate the rare headword matches where normalized SLP1 equality is plausible but not safe enough to trust silently.
This Queue Does Not Prove
It does not prove the machine alignment is wrong, or that nasalized and non-nasalized variants are always separate lemmas.
Trust Block
- Evidence: src/data/review/low-confidence-alignment-review.json, dictionary source pointers, and the alignment confidence output from src/data/dicts/alignment-confidence.json.
- Limitations: this queue covers multi-dictionary lemma matches that pass the comparison filters; it does not inspect every possible spelling variant.
- Validation: generated by npm run build-alignment-review; checked by npm run validate-review-reports and npm test.
- Owner repo: csl-atlas.
- Next use: review a small source-linked sample while preserving the canonical review fields.
high = every dictionary used the identical raw <k1>; medium = matched only after normalization (stripping an accent, nasalization mark, or homonym digit). Because all seven dictionaries store SLP1 headwords, alignment is overwhelmingly high-confidence; the medium queue below is the rare residue, not a backlog.
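The two tiers can be illustrated with a minimal sketch. This is not the csl-atlas implementation: the function names, the "none" fallback tier, and the exact set of stripped characters (the SLP1 accent marks /, \ and ^, the nasalization mark ~, and trailing homonym digits) are assumptions made for illustration.

```typescript
// Hypothetical sketch; names and the stripped character set are assumptions.
type Confidence = "high" | "medium" | "none";

// Strip SLP1 accent marks (/ \ ^), the nasalization mark (~),
// and any trailing homonym digits from a raw <k1> headword.
function normalizeHeadword(k1: string): string {
  return k1.replace(/[\/\\^~]/g, "").replace(/\d+$/, "");
}

// Classify one candidate alignment given the raw <k1> from each dictionary.
function classifyAlignment(rawHeadwords: string[]): Confidence {
  if (rawHeadwords.length < 2) return "none"; // nothing to align
  // high: every dictionary used the identical raw headword
  if (new Set(rawHeadwords).size === 1) return "high";
  // medium: headwords agree only after normalization
  const normalized = new Set(rawHeadwords.map(normalizeHeadword));
  return normalized.size === 1 ? "medium" : "none";
}

console.log(classifyAlignment(["o", "o", "o"])); // prints: high
console.log(classifyAlignment(["o", "o~"]));     // prints: medium
```

Under these assumptions, the o vs o~ pair lands in the medium tier because the two headwords are equal only once the nasalization mark is removed.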
Review queue · low-confidence alignments
All current cases are headwords distinguished only by a nasalization mark (e.g. o vs o~). A reviewer decides whether they are the same lemma; record the decision in src/data/review/low-confidence-alignment-review.json (preserved across rebuilds by reviewId).
Generated by npm run build-alignment-review (after build-dict-comparison); validated by npm run validate-review-reports. CC-BY-SA-4.0.
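How a recorded decision might survive a rebuild can be sketched as a merge keyed on reviewId. Only the reviewId field is named in this page; the ReviewItem shape, the headwords and decision fields, and the mergeReviewDecisions function are hypothetical and may not match what build-alignment-review actually does.

```typescript
// Hypothetical record shape; only `reviewId` comes from the text above.
interface ReviewItem {
  reviewId: string;
  headwords: string[];
  decision?: string; // illustrative field for the reviewer's verdict
}

// Carry decisions from the previously saved queue into a freshly generated one.
function mergeReviewDecisions(
  previous: ReviewItem[],
  generated: ReviewItem[],
): ReviewItem[] {
  // Index existing decisions by reviewId.
  const decisions = new Map<string, string>();
  for (const item of previous) {
    if (item.decision !== undefined) decisions.set(item.reviewId, item.decision);
  }
  // Re-attach a decision wherever the regenerated item has a matching reviewId.
  return generated.map((item) => {
    const decision = decisions.get(item.reviewId);
    return decision === undefined ? item : { ...item, decision };
  });
}

const previous: ReviewItem[] = [
  { reviewId: "r1", headwords: ["o", "o~"], decision: "same-lemma" },
];
const generated: ReviewItem[] = [
  { reviewId: "r1", headwords: ["o", "o~"] },
  { reviewId: "r2", headwords: ["a", "a~"] },
];
const merged = mergeReviewDecisions(previous, generated);
console.log(merged[0].decision); // prints: same-lemma
```

In this sketch, items that are new in the regenerated queue stay undecided, and decisions whose reviewId no longer appears are dropped. Whether stale decisions should instead be archived is a design choice left open here.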