Repository Map

All repositories live in the sanskrit-lexicon GitHub organization.

Per-dictionary repositories

Each dictionary has its own repo, usually named after its dictionary code. Examples (see the full catalog):

| Code | Dictionary | Repo |
| --- | --- | --- |
| MWS | Monier-Williams | MWS |
| AP90 | Apte (1890) | AP90 |
| PWG | Böhtlingk-Roth (large) | PWG |
| PW | Böhtlingk (shorter) | PWK |
| GRA | Grassmann (Rig-Veda) | GRA |

Infrastructure repositories

| Repo | Role |
| --- | --- |
| csl-orig | Canonical source: v02/{dict}/{dict}.txt for every dictionary |
| csl-pywork | Build tooling: generate_dict.sh, make_xml.py, validators |
| csl-corrections | Audit trail: change files grouped in batch_YYYYMMDD/ |
| csl-websanlexicon | Web/display assets consumed by generation |
| csl-apidev | Web backend: the RESTful + Salt API (PHP) |
| csl-app | Cross-platform app (Android/iOS/macOS/Linux/Windows; Dart/Flutter) |
| csl-standards | Shared standards (incl. the normative Salt API contract) |
| csl-inflect | Inflected-forms generation (e.g. MW inflected forms) |
| csl-sqlite | The per-dictionary SQLite search databases (published via GitHub Releases) |
| csl-json | JSON form of the dictionary data ({words, text}) |
| cologne-stardict | StarDict / offline packaging (via Babylon export) |
| csl-doc | Sphinx per-dictionary front-matter / prefaces documentation |

Other infrastructure repos (csl-atlas, csl-observatory, …) support tooling and observability. See Data Formats for how the SQLite, JSON, and StarDict artifacts are produced.
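The two naming patterns in the table, v02/{dict}/{dict}.txt in csl-orig and batch_YYYYMMDD/ in csl-corrections, can be sketched as small helpers. This is only an illustration: the function names source_path and batch_dir are hypothetical, and lowercasing the dictionary code is an assumption based on the csl-orig/v02/pw/ example elsewhere on this page.

```python
from datetime import date
from pathlib import PurePosixPath


def source_path(dict_code: str) -> PurePosixPath:
    """Relative path of a dictionary's canonical source inside csl-orig.

    Assumption: directory and file names use the lowercased dictionary code.
    """
    code = dict_code.lower()
    return PurePosixPath("v02") / code / f"{code}.txt"


def batch_dir(day: date) -> str:
    """Directory name for a csl-corrections batch dated `day` (batch_YYYYMMDD)."""
    return f"batch_{day:%Y%m%d}"


if __name__ == "__main__":
    print(source_path("PW"))
    print(batch_dir(date.today()))
```

Paths are built with PurePosixPath so the sketch never touches the filesystem; join the result onto a local checkout directory as needed.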

Dictionary code vs. repository name

A dictionary's code (used in URLs, csl-orig, and the API) is not always its repo name. The clearest case: the shorter Petersburg dictionary has code PW (csl-orig/v02/pw/) but lives in the repo PWK. The catalog lists each dictionary's actual repo.
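A script that needs to go from a dictionary code to its repository could handle the mismatch with a small override table. The sketch below is an assumption-laden illustration, not part of any existing tooling: REPO_OVERRIDES, repo_name, and repo_url are hypothetical names, and the only override recorded is the PW case stated above. Any other exceptions would have to be taken from the catalog.

```python
# Hypothetical override table: dictionary code -> repo name.
# Only the PW -> PWK case is stated in the text; extend from the catalog.
REPO_OVERRIDES = {"PW": "PWK"}

# Base URL of the sanskrit-lexicon GitHub organization.
ORG_URL = "https://github.com/sanskrit-lexicon"


def repo_name(dict_code: str) -> str:
    """Return the repo name for a dictionary code, defaulting to the code itself."""
    code = dict_code.upper()
    return REPO_OVERRIDES.get(code, code)


def repo_url(dict_code: str) -> str:
    """Return the GitHub URL of the per-dictionary repo."""
    return f"{ORG_URL}/{repo_name(dict_code)}"


if __name__ == "__main__":
    for code in ("pw", "PWG", "gra"):
        print(code, "->", repo_url(code))
```

Defaulting to the code itself keeps the table short, since only the exceptions need to be listed.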

Conventions shared across repos

  • Session state: each repo keeps a tracked .ai_state.md journal.
  • Correction pattern: most repos apply corrections via updateByLine.py change files (see Change Files).
  • Input files: for the large dictionaries, input files live in sibling *xml repos (e.g. ../pwgxml/pwg.xml, ../mwsxml/mws.xml).
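The two example paths in the list above suggest a ../{name}xml/{name}.xml layout relative to a dictionary repo. The helper below sketches that pattern; it is inferred from those two examples only, so treat it as an assumption, and note that sibling_xml_path is a hypothetical name.

```python
from pathlib import PurePosixPath


def sibling_xml_path(name: str) -> PurePosixPath:
    """Relative path to the input XML in a sibling *xml repo.

    Assumption: the layout is ../{name}xml/{name}.xml with a lowercased name,
    as in ../pwgxml/pwg.xml and ../mwsxml/mws.xml.
    """
    stem = name.lower()
    return PurePosixPath("..") / f"{stem}xml" / f"{stem}.xml"


if __name__ == "__main__":
    for name in ("PWG", "MWS"):
        print(sibling_xml_path(name))
```

The helper only builds the path. Whether a given dictionary actually has a sibling *xml repo should be checked against the local checkout.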