Atlas of the Cologne Digital Sanskrit Lexicons

Trust Block

A comparative microstructural atlas of nine narrative Sanskrit-dictionary chapters plus an all-dictionary coverage layer spanning the indigenous kośa tradition (~6th c.) through to the Cologne Digital Sanskrit Lexicon (2024). Each chapter analyses one dictionary under an 18-block formal apparatus developed for MW; the coverage layer asks which parts of that apparatus transfer, and how large those parts are, across every available CDSL v02 dictionary.


Start here


Reader mode

Start with "Which dictionary should I use?" if you need a first stop: MW is the default public route, with task-specific checks in AP, PWG/PWK, VCP/SKD, WIL, and the specialized dictionaries. Then use the Reader lookup for dictionary-first search across MW, AP, PWG, PWK, WIL, VCP, and SKD. It accepts SLP1 and IAST headwords, shows dictionary coverage and source links, and keeps machine-derived evidence visibly labeled. Evidence labels explain what observed, derived, inferred, and reviewed mean.
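Because the Reader lookup accepts both SLP1 and IAST headwords, it helps to see how the two spellings relate. The following is a minimal sketch of the SLP1 to IAST direction using the standard SLP1 character table; the function name `slp1_to_iast` is hypothetical and is not the atlas's own code, and accent marks and other extensions are deliberately left out.

```python
import unicodedata

# Standard SLP1 -> IAST correspondences (one SLP1 character per sound).
SLP1_TO_IAST = {
    "a": "a", "A": "ā", "i": "i", "I": "ī", "u": "u", "U": "ū",
    "f": "ṛ", "F": "ṝ", "x": "ḷ", "X": "ḹ",
    "e": "e", "E": "ai", "o": "o", "O": "au",
    "M": "ṃ", "H": "ḥ",
    "k": "k", "K": "kh", "g": "g", "G": "gh", "N": "ṅ",
    "c": "c", "C": "ch", "j": "j", "J": "jh", "Y": "ñ",
    "w": "ṭ", "W": "ṭh", "q": "ḍ", "Q": "ḍh", "R": "ṇ",
    "t": "t", "T": "th", "d": "d", "D": "dh", "n": "n",
    "p": "p", "P": "ph", "b": "b", "B": "bh", "m": "m",
    "y": "y", "r": "r", "l": "l", "v": "v",
    "S": "ś", "z": "ṣ", "s": "s", "h": "h",
}


def slp1_to_iast(headword: str) -> str:
    """Convert an SLP1 headword to IAST (hypothetical helper).

    Characters outside the table are passed through unchanged; the result
    is NFC-normalised so that it compares reliably with other IAST strings.
    """
    converted = "".join(SLP1_TO_IAST.get(ch, ch) for ch in headword)
    return unicodedata.normalize("NFC", converted)


if __name__ == "__main__":
    for hw in ("kfzRa", "Darma", "saMskfta"):
        print(hw, "->", slp1_to_iast(hw))
```

SLP1 is case-sensitive (for example `D` is dh while `d` is d), so a lookup that takes SLP1 input should not lowercase it before conversion.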


Evidence atlas — quantitative tracks

Beyond the paper, the atlas is building deterministic, source-linked, evidence-labelled tracks over the CDSL dictionaries. Every count links back to a source record; uncertainty (observed / derived / inferred) is always visible. See ARCHITECTURE.md and the reader guide.
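One way to picture a source-linked, evidence-labelled count is a record that cannot exist without its label and its source references. This is a sketch under stated assumptions only: the class names, field names, and record identifiers below are hypothetical and are not taken from ARCHITECTURE.md.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Evidence(Enum):
    """The three uncertainty labels named in the text."""
    OBSERVED = "observed"
    DERIVED = "derived"
    INFERRED = "inferred"


@dataclass(frozen=True)
class LabelledCount:
    """A count that always carries its label and its source records (hypothetical shape)."""
    metric: str
    value: int
    evidence: Evidence
    source_records: Tuple[str, ...]  # hypothetical record identifiers

    def __post_init__(self) -> None:
        # A count with no source record behind it is rejected outright.
        if not self.source_records:
            raise ValueError("every count must link back to at least one source record")

    def render(self) -> str:
        """Show the value together with its label, so uncertainty stays visible."""
        n = len(self.source_records)
        return f"{self.value} {self.metric} [{self.evidence.value}, {n} source records]"


if __name__ == "__main__":
    count = LabelledCount("gender conflicts", 2, Evidence.OBSERVED, ("mw:0001", "ap:0002"))
    print(count.render())
```

Making the label and the source links mandatory fields, rather than optional annotations, is the design choice this sketch is meant to illustrate.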

MW quantitative depth · Phase 1

Monier-Williams anatomy at scale: depth dashboard (article types, citations, compound depth), diachronic layers (conservative source-layer profile), family depth (deepest lexical families).

Dictionary comparison · Phase 2

Broad Sanskrit/BHS headword coverage across eligible local dictionaries, with Core 7 deep comparison where adapters are validated: coverage matrix · pairwise overlap · gender conflicts · homonym splits · citation apparatus · sense depth · lemma dossier (look up a word).
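The coverage matrix and pairwise overlap can be sketched as plain set operations. This assumes each dictionary has already been reduced to a set of normalised headwords; the toy headword sets are invented for illustration, and Jaccard similarity is an assumption here, since the text does not say which overlap measure the atlas reports.

```python
from itertools import combinations


def coverage_matrix(headwords_by_dict):
    """Map each headword to {dictionary code: present?} across all dictionaries."""
    all_headwords = sorted(set().union(*headwords_by_dict.values()))
    return {
        hw: {code: hw in hws for code, hws in headwords_by_dict.items()}
        for hw in all_headwords
    }


def pairwise_overlap(headwords_by_dict):
    """For each dictionary pair, count shared headwords and compute Jaccard similarity."""
    result = {}
    for a, b in combinations(sorted(headwords_by_dict), 2):
        set_a, set_b = headwords_by_dict[a], headwords_by_dict[b]
        shared = len(set_a & set_b)
        union = len(set_a | set_b)
        result[(a, b)] = {
            "shared": shared,
            "jaccard": shared / union if union else 0.0,
        }
    return result


# Toy SLP1 headword sets, invented for illustration (not real coverage data).
toy = {
    "MW": {"agni", "Darma", "yoga"},
    "AP": {"agni", "Darma"},
    "WIL": {"agni", "kavi"},
}

if __name__ == "__main__":
    for pair, stats in pairwise_overlap(toy).items():
        print(pair, stats)
```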

Dictionary structure

Four tracks bring dictionary-structure research onto the atlas path: dictionary genealogy, convention fingerprints, structural register, and R2 sense alignment / sense granularity.

Review queues

Machine-flagged cases awaiting human judgement, schema-conforming and source-linked: gender conflicts · source layers · alignment confidence · source-siglum aliases.


How to read this atlas

Chapters are ordered to mirror the paper's argument arc per Decision 29:

  1. Framework-fit first — chapters 1–4 (MW, PWG, PWK, AP): structured bilingual dictionaries that exercise the full 18-block apparatus. Read these to learn how the framework works.
  2. Typographic precedents — chapters 5–6 (BEN, CAE): 19th-century single-volume works whose typographic markers (, *) are the historical precedents for MW's tagged <ls>L.</ls> hedge. Read these to learn where the hedge came from.
  3. The base — chapter 7 (WIL): the earliest CDSL dict (1832), Wilson's Amarakośa-derived word list with effectively no source apparatus. Read this to see where the European tradition begins.
  4. The genre boundary — chapters 8–9 (SKD, VCP): seven-volume Sanskrit-Sanskrit kośa works where the 18-block apparatus stops applying as a structured-bilingual model (no <lex>/<ls> tags, inline iti citation instead). Read these to learn where the framework must change.

The arc — framework-fit → precedent → base → genre-limit — is the same arc as PAPER.md §§3–8. A reader following the 9 chapters in order learns the framework, then learns its history, then learns its limits.


Read the paper

One consolidated study of Monier-Williams 1899 — a data-grounded body, triangulated against three external frameworks:

  • Grounded framework (body) — five constructs built from MW outward (block, slot, profile, hedge, infrastructure) + the block-economy thesis
  • Triangulation (§7) — how Wiegand, Atkins–Rundell and Hausmann converge as three witnesses to one analysis
  • Framework appendices A·B·C — the condensed Wiegand / Atkins–Rundell / Hausmann readings (incl. the proposed Provenienz-Komment)

Explore the tools

Browse the 9 dictionaries (Decision 29 order)

# | Code | Year | Tier | One-line summary
1 | MW | 1899 | A | Standard single-volume Sanskrit-English reference; 286,561 records; framework's home dictionary
2 | PWG | 1855–75 | A | 7-volume Sanskrit-German Grosses PW; densest <ls> apparatus (4.63/record); 0 hedges
3 | PWK | 1879–89 | A | Böhtlingk's own compact 7-volume / 7-Lieferung abridgement; dropped PWG's kośa apparatus before MW
4 | AP | 1890 / 1957 | A | Apte's Practical Sanskrit-English Dictionary; only post-MW dict with any <ls>L.</ls> (1×)
5 | BEN | 1866 | B | Benfey 1866; earliest typographic hedge precedent ( = "no authoritative references")
6 | CAE | 1891 | B | Cappeller 1891; first systematic typographic precedent (* for kośa-only); MW 1899 co-editor
7 | WIL | 1832 | B | Wilson 1832; earliest CDSL dict; the base from which the European tradition departs
8 | SKD | 1822–58 | C | 7-volume Śabdakalpadruma; first Sanskrit-Sanskrit kośa; the genre boundary, where the framework changes
9 | VCP | 1873–84 | C | 7-volume Vācaspatyam; confirms the genre boundary and motivates the all-dictionary coverage layer

Tier A = full template (8 sections); Tier B = compact + typography section; Tier C = genre-bound, prose-pattern analysis. Audit report: _consistency_audit.md.

Kośa-resolution repos (the sources that MW's <ls>L.</ls> hedge points back to; these are not atlas chapters): ARMH (Halāyudha) · ABCH (Hemacandra) · ACPH · ACSJ. A future Phase-5 project would extend the framework to these.


The central finding — three-stage <ls>L.</ls> lineage

The single most striking pattern in the atlas is the three-stage lineage of MW's lexicographer-only hedge (per the 2026-05-27 reading of the MW 1872 preface):

Year | Source | Marker | Meaning | Scale
1866 | Benfey | | "no authoritative references" (weaker, methodological) | ~900 typographic
1872 | MW 1st edn | declares L. in preface § II | "only in native lexicons" | preface-only; ≈ 0 in body
1891 | Cappeller | * | "taught only by grammarians or lexicographers" | 1,370 typographic; first systematic
1899 | MW 2nd edn (w/ Cappeller as co-editor) | <ls>L.</ls> | "lexicographer-only attestation" | 40,212 tagged + scaled

No step in this lineage is fully derivative of the others. Benfey 1866 is the earliest, weaker typographic precedent; MW 1872 is first with the concept (declared in MW's own preface, in his own words); Cappeller 1891 is first with the systematic typographic implementation; MW 1899 is first with the tagged + scaled implementation. PWG, the other major 19th-century dictionary, sits outside this lineage: it kept the named-kośa apparatus (its top 5 sigla all name indigenous Sanskrit lexicons: <ls>ŚKDR.</ls>, <ls>MED.</ls>, <ls>H. an.</ls>, etc.) and never used a hedge.

The Lineage Sankey visualises the resulting collapse: MW's 40,212 hedges summarise what PWG distributed across 821 named-kośa sigla.
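Figures such as hedge totals and <ls> density per record come down to counting tags in record bodies. A minimal sketch follows, assuming each record body is a string carrying `<ls>…</ls>` tags as quoted in this section; the function name `ls_profile` and the sample records are invented for illustration and are not the atlas's pipeline or data.

```python
import re
from collections import Counter

# Matches <ls>...</ls>, tolerating attributes on the opening tag.
LS_TAG = re.compile(r"<ls\b[^>]*>(.*?)</ls>")
HEDGE = "L."  # the lexicographer-only hedge siglum


def ls_profile(records):
    """Summarise the <ls> apparatus of a list of record bodies (hypothetical helper)."""
    sigla = Counter()
    for body in records:
        sigla.update(s.strip() for s in LS_TAG.findall(body))
    total = sum(sigla.values())
    return {
        "records": len(records),
        "ls_total": total,
        "ls_per_record": total / len(records) if records else 0.0,
        "hedges": sigla[HEDGE],
        "named_sigla": {s: n for s, n in sigla.items() if s != HEDGE},
    }


# Toy record bodies, invented for illustration.
toy_records = [
    "a kind of plant <ls>L.</ls>",
    "fire <ls>RV.</ls> <ls>MBh.</ls>",
    "N. of a man <ls>L.</ls>",
    "duty <ls>Mn.</ls>",
]

if __name__ == "__main__":
    print(ls_profile(toy_records))
```

Keeping the hedge count separate from the named sigla mirrors the contrast drawn above: one dictionary concentrates its kośa attestations in a single siglum, another spreads them over many named ones.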


Corpus frequency data

VisualDCS provides DCS corpus dashboards and per-lemma frequency data (M1–M8 CoNLL-U→SQLite pipeline). Once VisualDCS emits dcs_lemma_summary.json (see docs/VISUALDCS_CONSUMPTION_CONTRACT.md), this atlas will display per-lemma corpus frequency bands inline on lemma pages. Until then, follow the link above for DCS-derived evidence.


Companion documents on GitHub