Atlas of the Cologne Digital Sanskrit Lexicons

Trust Block

Evidence: committed atlas pages, generated data under src/data/, and linked roadmap/guide documents.
Limitations: landing/navigation page; it routes readers to dictionary evidence but is not itself a generated analysis result.
Validation: checked by npm test, npm run build, and Observable link validation.
Owner repo: csl-atlas.
Next use: start with /dictionary-chooser or Reader Lookup, then inspect source-linked dictionary records.

A comparative microstructural atlas of nine narrative Sanskrit-dictionary chapters plus an all-dictionary coverage layer spanning the indigenous kośa tradition (~6th c.) through to the Cologne Digital Sanskrit Lexicon (2024). Each chapter analyses one dictionary under an 18-block formal apparatus developed for MW; the coverage layer asks which parts of that apparatus transfer, and how large those parts are, across every available CDSL v02 dictionary.

Start here

Student path Student Research Desk Start from one Sanskrit word and move from dictionary choice to lookup, dossier, evidence trail, and limits. Builder path Researcher Dashboard See the current atlas work surface: human review, model-pending items, publication tails, and cross-repo leads. Lessons Student Lesson Track Use short classroom prompts to practice lookup, citation, disagreement, and correction caution. Research links Evidence Bridges See which nearby Sanskrit research repos should stay linked, become chips, or wait for a contract. First dictionary Which dictionary? Use MW first, then choose AP, PWG/PWK, VCP/SKD, WIL, or a specialized dictionary by task. Lookup Reader Lookup Search SLP1 or IAST headwords across core and broad dictionary coverage.

Reader mode

Start with Which dictionary should I use? if you need a first stop: MW is the default public route, with task-specific checks in AP, PWG/PWK, VCP/SKD, WIL, and specialized dictionaries. Then use the Reader lookup for dictionary-first search across MW, AP, PWG, PWK, WIL, VCP, and SKD. It accepts SLP1 and IAST headwords, shows dictionary coverage and source links, and keeps machine-derived evidence visibly labeled. Evidence labels explain observed, derived, inferred, and reviewed.

Evidence atlas — quantitative tracks

Beyond the paper, the atlas is building deterministic, source-linked, evidence-labelled tracks over the CDSL dictionaries. Every count links back to a source record; uncertainty (observed / derived / inferred) is always visible. See ARCHITECTURE.md and the reader guide.

MW quantitative depth · Phase 1

Monier-Williams anatomy at scale: depth dashboard (article types, citations, compound depth), diachronic layers (conservative source-layer profile), family depth (deepest lexical families).

Dictionary comparison · Phase 2

Broad Sanskrit/BHS headword coverage across eligible local dictionaries, with Core 7 deep comparison where adapters are validated: coverage matrix · pairwise overlap · gender conflicts · homonym splits · citation apparatus · sense depth · lemma dossier (look up a word).

Dictionary structure

Dictionary genealogy, convention fingerprints, structural register, and R2 sense alignment / sense granularity move dictionary-structure research into the atlas path.

Review queues

Machine-flagged cases awaiting human judgement, schema-conforming and source-linked: gender conflicts · source layers · alignment confidence · source-siglum aliases.

How to read this atlas

Chapters are ordered to mirror the paper's argument arc per Decision 29:

Framework-fit first — chapters 1–4 (MW, PWG, PWK, AP): structured bilingual dictionaries that exercise the full 18-block apparatus. Read these to learn how the framework works.
Typographic precedents — chapters 5–6 (BEN, CAE): 19th-century single-volume works whose typographic markers (†, *) are the historical precedents for MW's tagged <ls>L.</ls> hedge. Read these to learn where the hedge came from.
The base — chapter 7 (WIL): the earliest CDSL dict (1832), Wilson's Amarakośa-derived word list with effectively no source apparatus. Read this to see where the European tradition begins.
The genre boundary — chapters 8–9 (SKD, VCP): seven-volume Sanskrit-Sanskrit kośa works where the 18-block apparatus stops applying as a structured-bilingual model (no <lex>/<ls> tags, inline iti citation instead). Read these to learn where the framework must change.

The arc — framework-fit → precedent → base → genre-limit — is the same arc as PAPER.md §§3–8. A reader following the 9 chapters in order learns the framework, then learns its history, then learns its limits.

Start here

Read the paper

One consolidated study of Monier-Williams 1899 — a data-grounded body, triangulated against three external frameworks:

Grounded framework (body) — five constructs built from MW outward (block, slot, profile, hedge, infrastructure) + the block-economy thesis
Triangulation (§7) — how Wiegand, Atkins–Rundell and Hausmann converge as three witnesses to one analysis
Framework appendices A·B·C — the condensed Wiegand / Atkins–Rundell / Hausmann readings (incl. the proposed Provenienz-Komment)

Explore the tools

Cross-dictionary comparison — nine narrative dicts: citation density + common-block population
All-dictionary coverage — every CDSL v02 dictionary with a main text file: fit score, size, block mass, and entry-type inventory
Matrix explorer — 18 formal blocks × primary article types
Lineage Sankey — PWG → PWK → MW kosha-citation collapse
Typology treemap — 286,561 MW entries by article type
Lexicographic timeline — 6th c. — 2024
Type comparator — pick two types, see block differences
Citation tracer — click a source, see all entries

Browse the 9 dictionaries (Decision 29 order)

#	Code	Year	Tier	One-line summary
1	MW	1899	A	Standard single-volume Sanskrit-English reference; 286,561 records; framework's home dictionary
2	PWG	1855–75	A	7-volume Sanskrit-German Grosses PW; densest `<ls>` apparatus (4.63/record); 0 hedges
3	PWK	1879–89	A	Böhtlingk's own compact 7-volume / 7-Lieferung abridgement; dropped PWG's kosha apparatus before MW
4	AP	1890 / 1957	A	Apte's Practical Sanskrit-English Dictionary; only post-MW dict with any `<ls>L.</ls>` (1×)
5	BEN	1866	B	Benfey 1866; earliest typographic hedge precedent (`†` = "no authoritative references")
6	CAE	1891	B	Cappeller 1891; first systematic typographic precedent (`*` for kosha-only); MW 1899 co-editor
7	WIL	1832	B	Wilson 1832; earliest CDSL dict; the base from which the European tradition departs
8	SKD	1822–58	C	7-volume Śabdakalpadruma; first Sanskrit-Sanskrit kośa; the genre boundary — framework changes here
9	VCP	1873–84	C	7-volume Vācaspatyam; confirms the genre boundary and motivates the all-dictionary coverage layer

Tier A = full template (8 sections); Tier B = compact + typography section; Tier C = genre-bound, prose-pattern analysis. Audit report: _consistency_audit.md.

Kośa-resolution repos (sources MW's <ls>L.</ls> hedge points back to, not atlas chapters): ARMH (Halāyudha) · ABCH (Hemacandra) · ACPH · ACSJ. A future Phase-5 project would extend the framework to these.

The central finding — three-stage `<ls>L.</ls>` lineage

The single most striking pattern in the atlas is the three-stage lineage of MW's lexicographer-only hedge (per the 2026-05-27 MW 1872 preface read):

Year	Source	Marker	Meaning	Scale
1866	Benfey	`†`	"no authoritative references" (weaker, methodological)	~900 typographic
1872	MW 1st edn	declares L. in preface § II	"only in native lexicons"	preface-only; ≈ 0 in body
1891	Cappeller	`*`	"taught only by grammarians or lexicographers"	1,370 typographic — first systematic
1899	MW 2nd edn (w/ Cappeller as co-editor)	`<ls>L.</ls>`	"lexicographer-only attestation"	40,212 tagged + scaled

None of the four stages is fully derivative of the others. MW 1872 is first with the concept (declared in MW's own preface, in his own words); Cappeller 1891 is first with the systematic typographic implementation; MW 1899 is first with the tagged + scaled implementation. PWG, the other major 19th-century dictionary, sits outside this lineage — it kept the named-kosha apparatus (top 5 sigla all named indigenous Sanskrit lexicons: <ls>ŚKDR.</ls>, <ls>MED.</ls>, <ls>H. an.</ls>, etc.) and never used a hedge.

The Lineage Sankey visualises this collapse — MW's 40,212 hedges summarise what PWG distributed across 821 named-kosha sigla.

Corpus frequency data

VisualDCS provides DCS corpus dashboards and per-lemma frequency data (M1–M8 CoNLL-U→SQLite pipeline). Once VisualDCS emits dcs_lemma_summary.json (see docs/VISUALDCS_CONSUMPTION_CONTRACT.md), this atlas will display per-lemma corpus frequency bands inline on lemma pages. Until then, follow the link above for DCS-derived evidence.

Companion documents on GitHub

PAPER.md — the full IJL paper draft (~8,700 words; submission-v1 tag)
IJL_COVER_LETTER.md — submission cover letter
PAPER_RU.md — Russian companion paper (Vostok·Oriens target)
All-dictionary coverage and size layer — 43-dictionary extension from block population to size, type inventory, and framework-fit bands
DICT_PROFILE.md — narrative profile of MW1899
ENTRY_GUIDE.md — how to read an MW entry
MICROANALYSIS.md — block-level working notes
DOUBTS.md — 22 doubts, of which D2/D5/D17/D18/D19/D20/D21/D22 closed
Decision 29 — atlas chapter ordering — the order this atlas is built in