Corrections Workflow
This is the canonical sequence for correcting a csl-orig source dictionary (the
"Jim Funderburk / Dhaval Patel pattern"). Do not deviate from it — the steps exist to
guarantee XML validity and a complete audit trail.
The sequence
# 1. Snapshot current source
cp csl-orig/v02/{dict}/{dict}.txt temp_{dict}_0.txt
# 2. Apply corrections (via updateByLine.py or a direct edit)
python updateByLine.py temp_{dict}_0.txt change_file.txt temp_{dict}_1.txt
# 3. Put the result back into csl-orig
cp temp_{dict}_1.txt csl-orig/v02/{dict}/{dict}.txt
# 4. XML validation — REQUIRED before commit
cd csl-pywork/v02
sh generate_dict.sh {dict} tempparent/{dict}
sh xmlchk_xampp.sh {dict}
# On Windows without XAMPP: make_xml.py reporting
# "All records parsed by ET" is sufficient.
rm -rf tempparent/{dict} # clean up
# 5. Generate the audit-trail change file (store in csl-corrections)
cd csl-corrections/batch_YYYYMMDD/dictionaries/{dict}/
python diff_to_changes_dict.py temp_{dict}_0.txt \
    csl-orig/v02/{dict}/{dict}.txt change_{dict}_N.txt
# 6. Commit both repos
# csl-orig: the corrected dict file
# csl-corrections: the change file + readme.txt
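For the ElementTree fallback mentioned in step 4, the following is a minimal sketch of a well-formedness check. It is not the code of make_xml.py; the function name is_well_formed and the file path in the usage note are hypothetical. ElementTree checks well-formedness only and does not validate against a DTD, so treat it as a weaker signal than the xmllint-based check.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_path):
    """Return (True, None) if the file parses, else (False, error message).

    Hypothetical helper: it only confirms the XML is well-formed.
    """
    try:
        ET.parse(xml_path)
    except ET.ParseError as err:
        # ParseError messages include the line and column of the failure.
        return False, str(err)
    return True, None
```

A call such as is_well_formed("path/to/generated.xml") (placeholder path) returns a pair whose second element carries the parser's error message, which usually points at the offending line.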
Critical rules
- Corrections ARE committed directly to csl-orig; this is the canonical pattern. The change files in csl-corrections are the audit trail and are not applied at generation time.
- Always run XML validation (step 4) before committing to csl-orig.
- diff_to_changes_dict.py assumes the same line count in old and new. If you insert or delete lines, use diff_to_changes.py instead.
- No BOM. When writing files with Python use open(f, 'w', encoding='utf-8'), not utf-8-sig. The csl-orig files never carry a UTF-8 BOM. Verify with:
  python -c "print(open('f','rb').read(3).hex())" # must NOT print efbbbf
- printchange.txt records deviations from the scanned print, not digital/markup fixes.
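The BOM rule and the line-count rule above can be checked before committing. The sketch below is illustrative only: has_utf8_bom, count_lines and pick_diff_script are hypothetical helper names, not part of the csl tooling.

```python
import codecs


def has_utf8_bom(path):
    """True if the file starts with the UTF-8 byte order mark (EF BB BF)."""
    with open(path, "rb") as handle:
        return handle.read(3) == codecs.BOM_UTF8


def count_lines(path):
    """Count lines by iterating over the file in binary mode."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def pick_diff_script(old_path, new_path):
    """Name the diff script that the line-count rule points to."""
    if count_lines(old_path) == count_lines(new_path):
        return "diff_to_changes_dict.py"
    # Lines were inserted or deleted, so the line-paired script does not apply.
    return "diff_to_changes.py"
```

Running has_utf8_bom on the corrected file before step 3, and pick_diff_script on the snapshot and the corrected file before step 5, catches both problems while they are still cheap to fix.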
Local prerequisites (Windows)
generate_dict.sh needs:
- python3 on PATH (a wrapper that forwards to python works).
- mako installed (pip install mako).
- csl-websanlexicon as a sibling of csl-pywork.
- xmllint is typically unavailable locally; use the ElementTree (ET) parse success as the validation signal.
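The first three prerequisites can be checked from Python before running generate_dict.sh. This is a rough sketch: check_prerequisites is a hypothetical helper, and it tests whether mako is importable by the interpreter running the check, which may not be the same interpreter that python3 resolves to.

```python
import importlib.util
import shutil
from pathlib import Path


def check_prerequisites(pywork_dir):
    """Report which generate_dict.sh prerequisites appear to be satisfied.

    pywork_dir is the path to the local csl-pywork checkout.
    """
    pywork = Path(pywork_dir).resolve()
    return {
        "python3 on PATH": shutil.which("python3") is not None,
        "mako importable": importlib.util.find_spec("mako") is not None,
        "csl-websanlexicon sibling": (pywork.parent / "csl-websanlexicon").is_dir(),
    }
```

Printing the returned dictionary gives a quick pass/fail view; any False entry corresponds to one of the items in the list above.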
Change-file format
See Change Files for the exact line-paired format used by
updateByLine.py.