Annotated Swadesh wordlists for the Germanic group (Indo-European family).

Languages included: Gothic [grm-got]; Old Norse [grm-ono]; Icelandic [grm-isl]; Faroese [grm-far]; Bokmål Norwegian [grm-bok]; Danish [grm-dan]; Swedish [grm-swe].


I. Gothic.

Balg 1887 = Balg, G. H. A Comparative Glossary of the Gothic Language, with especial reference to English and German. Mayville, Wisconsin. // A complete dictionary of Gothic, covering the entire text corpus and explicitly listing most of the attestations of individual words; includes extensive etymological notes.

Ulfilas 1896 = Ulfilas oder die uns erhaltenen Denkmäler der gotischen Sprache. Paderborn: Druck und Verlag von Ferdinand Schöningh. // A complete edition of Ulfilas' Bible, together with a concise vocabulary and a brief grammatical sketch of Gothic.

Costello 1973 = Costello, John R. The Placement of Crimean Gothic by Means of Abridged Test Lists in Glottochronology. Journal of Indo-European Studies, 1:4, pp. 479-506. // A small paper describing an attempt to apply Swadesh glottochronology to the Crimean variety of Gothic, based on XVIth century data. Includes the complete list of 91 words recorded for Crimean Gothic, 27 of which are on the 110-item list used for the GLD.

II. Old Norse.

Main source

Cleasby & Vigfusson 1874 = Cleasby, Richard. An Icelandic-English Dictionary. Enlarged and completed by Gudbrand Vigfusson, M.A. Oxford: Clarendon Press. // The largest and still the most authoritative dictionary of Old Icelandic, illustrated by numerous examples and richly annotated as far as the semantic and distributional properties of the words are concerned, making it an excellent source for lexicostatistical list construction.

Additional sources

Zoega 1910 = Zoëga, Geir T. A Concise Dictionary of Old Icelandic. Oxford: Clarendon Press. // This is basically just a condensed version of [Cleasby & Vigfusson 1874], containing no additional data; references are provided merely for completeness' sake, and consulting the glosses is sometimes useful for determining the most basic and frequent meanings of a particular word.

De Vries 1962 = De Vries, Jan. Altnordisches Etymologisches Wörterbuch. Leiden: Brill. // An etymological dictionary of Old Norse. References are provided mainly for completeness' sake, although in a small handful of cases, etymological information is important in order to provide additional argumentation in favor of a particular meaning of the given word.

Bergsland & Vogt 1962 = Bergsland, Knut & Vogt, Hans. On the Validity of Glottochronology. In: Current Anthropology, 3, 2, pp. 115-153. // This "classic" paper on the intrinsic problems of the glottochronological method contains several 200-item Swadesh wordlists, relatively carefully assembled by specialists in various fields. Contains, in particular, a wordlist for Old Norse, compiled by the authors with the assistance of F. Hødnebø and E. Fjeld Halvorsen.

III. Icelandic.

Haraldsson 1996 = Haraldsson, Helgi. Rússnesk-Íslensk Orðabók. Reykjavík: Nesútgáfan. // Huge Russian-Icelandic dictionary (more than 50,000 head entries), well illustrated by examples of usage and strictly distinguishing modern from archaic usage.

Berkov 1962 = В. П. Берков. Исландско-русский словарь. Москва: Государственное издательство иностранных и национальных словарей. // Large Icelandic-Russian dictionary (more than 35,000 entries).

IV. Faroese.

Young & Clewer 1985 = Young, G. V. C.; Clewer, Cynthia R. Faroese-English Dictionary. Mansk-Svenska Publishing Co. Ltd. //

V. Bokmål Norwegian.

Arakin 2000 = В. Д. Аракин. Большой норвежско-русский словарь. Издание 3-е, исправленное. Т. I-II. Москва: Живой язык. // Huge Norwegian-Russian dictionary (more than 200,000 forms), with a brief grammatical sketch of Norwegian by M. I. Steblin-Kamenskij.

Berkov 2006 = В. П. Берков. Новый большой русско-норвежский словарь. Москва: Живой язык. // Huge Russian-Norwegian dictionary (more than 210,000 Russian equivalents of Norwegian forms, with "traditional Bokmål" and "radical Bokmål" forms consistently indicated along with the "default" orthographic norm).

VI. Danish.

Krymova et al. 2000 = Крымова, Н. И; Эмзина, А. Я.; Новакович, А. С. Большой датско-русский словарь с транскрипцией. Издание 5-е, исправленное. Москва: Живой язык. // Huge Danish-Russian dictionary (around 160,000 forms), with a brief grammatical sketch of Literary Danish by A. S. Novakovich.

Harrit & Harrit 2002 = Harrit, Jørgen; Harrit, Valentina. Russisk-Dansk Ordbog. Copenhagen: Gyldendal. // Large Russian-Danish dictionary (around 50,000 forms), primarily designed for Danish students of the Russian language.

VII. Swedish.

Marklund-Sharapova 2007a = Марклунд-Шарапова, Э. М. Новый большой шведско-русский словарь. Москва: Живой язык. // Huge Swedish-Russian dictionary (around 185,000 forms) with phonetic transcription of Swedish forms.

Marklund-Sharapova 2007b = Марклунд-Шарапова, Э. М. Новый большой русско-шведский словарь. Москва: Живой язык. // Huge Russian-Swedish dictionary (around 185,000 forms).


I. Gothic.

I.1. General.

All of the Gothic forms extracted from the dictionary [Balg 1887] are thoroughly checked against the actual text corpus [Ulfilas 1896]; most of the individual entries, with the exception of certain super-frequent items ('no', 'I', 'thou', etc.), are accompanied by at least one textual example to confirm their eligibility for inclusion.

Comments may also include some basic grammatical information (such as gender and type of stem for the noun entries). Where known from the XVIth century wordlist compiled by Busbecq, Crimean Gothic equivalents are also listed (although they are too few, and too insecure, to serve as the basis for a separate list).

I.2. Transliteration.

The standard transliteration of the Gothic alphabet into Latin letters is taken as the basis for further transliteration into the UTS. The main differences from the standard notation of Gothic words in most sources are as follows:

Common sources   UTS       Notes
e, ê             eː        The Gothic vowels e and o are generally assumed to have been long in most contexts. This length is reflected in the UTS.
o, ô             oː
ei               iː        There is a general consensus that the digraph ei transcribed a long monophthong in Gothic.
þ                θ
h                x         It is unknown if Gothic h was phonetically realized as a velar (x) or laryngeal (h) fricative. Since, historically, it is the result of lenition of original *k, we prefer to mark it as a velar (also in order to keep things symmetrical with the other fricatives, i. e. f and θ).
j                y
gg, gk           ŋg, ŋk
ai               ɛ         Only before -r- and -x-, -xʷ-; elsewhere, ai is retained.
au               ɔ         Only before -r- and -x-, -xʷ-; elsewhere, au is retained.

One transcriptional element that has not been introduced concerns the voiced fricatives, traditionally marked as ƀ, đ, ʒ (= UTS β, ð, ɣ). It is generally assumed that they were regular positional variants (intervocalic) of the corresponding voiced stops b, d, g, but direct evidence for this in Gothic is missing. We prefer to retain the orthographic transcription b, d, g in order to reduce the number of transcriptional symbols and ensure phonological unity for purposes of automatic analysis.

Only individual forms, included in the main Gothic field of the database or mentioned in the comments section, have been transliterated. Textual examples are always quoted in the standard transliteration of the Gothic alphabet, as represented in the actual data sources that were used.
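
The correspondences listed in the table above can be sketched as a small Python routine. This is only an illustration under stated assumptions, not the procedure actually used for the database: the function name gothic_to_uts is hypothetical; the long-vowel outputs eː, oː, iː are inferred from the notes on vowel length; the rendering of ƕ as xʷ is inferred from the -xʷ- context mentioned for ai and au; and the input is assumed to be in the standard transliteration without accent marks on ai, au. The sample words are supplied for illustration only. Letters not mentioned in the table pass through unchanged.

```python
import re


def gothic_to_uts(word: str) -> str:
    """Convert a Gothic form in standard Latin transliteration to UTS (sketch)."""
    w = word
    # ai, au become ɛ, ɔ only before r, h, ƕ (UTS r, x, xʷ); elsewhere they are retained.
    w = re.sub(r"ai(?=[rhƕ])", "ɛ", w)
    w = re.sub(r"au(?=[rhƕ])", "ɔ", w)
    # ei is treated as a long monophthong (assumed UTS value: iː).
    # This must run before the plain e rule below.
    w = w.replace("ei", "iː")
    # e and o are marked as long (assumed UTS values: eː, oː).
    w = re.sub(r"[eê]", "eː", w)
    w = re.sub(r"[oô]", "oː", w)
    # gg, gk stand for a velar nasal followed by a stop.
    w = w.replace("gg", "ŋg").replace("gk", "ŋk")
    # Single-letter substitutions; ƕ -> xʷ is an assumption, not a table row.
    w = w.replace("þ", "θ").replace("j", "y").replace("ƕ", "xʷ").replace("h", "x")
    return w


if __name__ == "__main__":
    for form in ["hairto", "dauhtar", "tuggo", "jer", "þreis", "augo"]:
        print(form, "->", gothic_to_uts(form))
```

The order of the steps matters: the contextual ai/au rule has to see the original h before it is rewritten as x, and ei has to be consumed before the general lengthening of e.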

II. Old Norse.

II.1. General.

The generic term "Old Norse" is here used primarily to denote "Old West Norse", or "Old Icelandic". Monuments written in this literary language span several centuries and several rather distinct genres (the primary difference being between poetry, written in a more archaic and/or stylized language, and prose, more closely reflecting the vernacular standard). In the construction of the wordlist, the following formal criteria were used:

(a) the age of "Old Norse" is marked as the 13th century A.D., since it is generally assumed that the largest corpus of Old Icelandic texts dates from around that period;

(b) prosaic texts are given explicit preference over poetic texts (fortunately, any words that are exclusively encountered in or much more characteristic of poetry than prose are accurately marked in Cleasby & Vigfusson's dictionary, saving the need to peruse textual corpora);

(c) in cases of "transit" synonymy, the factor of frequency of usage of a given word in texts is usually considered as the main argument; where frequencies are hard to determine or comparable, real synonymy is postulated, but such cases form a minority.
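
Criteria (b) and (c) can be pictured as a simple filter over candidate words for one Swadesh meaning. The sketch below is purely illustrative: the Candidate record, the select_entries function, and the ratio threshold used to decide when two frequencies count as "comparable" are all hypothetical, and the fallback to poetic words when no prose word exists is an added assumption. The selection described above was made by hand from dictionary evidence, not by a program.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    form: str                  # placeholder label, not a real Old Norse word
    poetic: bool               # marked as poetic in the dictionary
    frequency: Optional[int]   # text frequency; None if hard to determine


def select_entries(candidates: List[Candidate], ratio: float = 2.0) -> List[Candidate]:
    """Pick the entries for one meaning following criteria (b) and (c)."""
    if not candidates:
        return []
    # (b) prefer prose vocabulary; use poetic words only if nothing else exists
    # (this fallback is an assumption of the sketch).
    pool = [c for c in candidates if not c.poetic] or list(candidates)
    # (c) if a frequency cannot be determined, postulate real synonymy.
    if any(c.frequency is None for c in pool):
        return pool
    top = max(c.frequency for c in pool)
    # Keep every word whose frequency is comparable to the most frequent one;
    # "comparable" is modeled here as being within a factor of `ratio`.
    return [c for c in pool if c.frequency * ratio >= top]


if __name__ == "__main__":
    words = [
        Candidate("prose_a", poetic=False, frequency=120),
        Candidate("poetic_b", poetic=True, frequency=300),
        Candidate("prose_c", poetic=False, frequency=10),
    ]
    print([c.form for c in select_entries(words)])
```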

The wordlist was created independently of, but later checked against, the Old Norse wordlist published in [Bergsland & Vogt 1962]; only a few minor differences were discovered, most of them having to do with the slightly modified semantic standards of the GLD. I am also grateful to Dr. Ilya Sverdlov for valuable advice, drawn from his experience of working on Old Norse texts, on several dubious cases.

Paradigmatic information has not been included on a consistent basis, but gender is always indicated for nouns, different gender forms are adduced for adjectives, numerals, and pronouns when the discrepancies between them are significant, and past tense stems are given for verbs of the "strong" conjugation type.

II.2. Transliteration.

Since, on one hand, the generally employed Latin-based orthography for Old Icelandic is fairly straightforward, and, on the other hand, minute phonetic peculiarities of Old Icelandic pronunciation are not always established beyond doubt (and could vary depending on the century), we prefer to make as few transliterational changes between Cleasby et al.'s notations and the UTS as possible. The main discrepancies are summarized in the following table:

Common sources   UTS       Notes
                           Vowel length.
y                ü
æ (ǽ)            ɛː        This vowel is always phonetically long.
ø                ö         Spelled as œ in Cleasby's dictionary.
ǫ                ɔ         Spelled as ö in Cleasby's dictionary.
þ                θ
j                y

III. Icelandic.

III.1. General.

Two Russian-focused dictionaries of Icelandic, Berkov 1962 and the much more recent Haraldsson 1996, have been used as base references for the compilation of the 100-wordlist for Modern Icelandic. Besides that, dubious cases have been checked against practical usage in various Internet sources; I am also grateful to Dr. Ilya Sverdlov for occasional consultations.

III.2. Transliteration.

As per GLD standards, orthographic equivalents of Icelandic words are entered in curly brackets. Orthographic equivalents are also used throughout in the "notes" section. The primary entry, however, is transliterated into UTS according to the following rules:

(a) the basic phonetic form of the Icelandic word is selected from the transcription in [Berkov 1962];

(b) graphic change from the transcription in the dictionary to UTS is minimal (Berkov's j > UTS y; þ > UTS θ; χ > UTS x; q > UTS ɣ);

(c) however, certain phonetic details have been omitted / changed for convenience. Most importantly, we omit the complex system of Icelandic allophones for voiced / voiceless stops, phonetically realized as semi-voiced or voiceless aspirated phones depending on context; for the sake of simplicity and readability, for these sounds we always retain their orthographic (historic) notations;

(d) on the other hand, for the vowel system, which has genuinely undergone an impressive transformation from the Old Norse period to modern times, we consistently adduce the phonetic values as per Berkov's transcription system; thus, graphic u = Y, i = ı, o = ɔ ~ ɔː, ó = ouː, e = ɛ ~ ɛː, ei = ɛi̯, au = öi̯, æ = ai̯, etc.

IV. Faroese.

IV.1. General.

Our main source for Faroese has been the mid-size dictionary [Young & Clewer 1985]; additionally, a variety of web resources have been consulted for issues of more accurate transcription, detailed semantics, contextual usage, etc.

IV.2. Transliteration.

Transliteration principles mainly follow the rules specified in [Young & Clewer 1985], although, for the sake of simplicity, the transcription is not purely phonetic at some points; for instance, the phoneme [r] is not transcribed phonetically as ɹ, etc.

V. Bokmål Norwegian.

V.1. General.

Literary Norwegian ("Bokmål") is not an easy language to describe in lexicostatistical terms, since it is essentially a "hybrid" of colloquial Norwegian and Danish, with many words either directly borrowed from Danish or "influenced" by Danish forms (i.e. probably never "replaced" as such in colloquial usage, but reformed in accordance with Danish pronunciation). Things are complicated even further by the existence of several orthographic / orthoepic norms for Bokmål, including a "traditional" variant (where there are even more Danish-like forms) and a "radical" variant (where, on the other hand, some words have been "Norwegized", thus becoming closer to their Nynorsk equivalents). In the current database, the following approach is suggested:

(a) "Danish-looking" forms of Bokmål are counted neither as replacements nor as borrowings, but as etymological cognates of the corresponding Germanic forms in other languages, i. e. marked with positive numbers. E.g. such forms as dø 'to die', hånd 'hand' (instead of døy, hånn), etc., are counted as "influenced" by Danish, but not "borrowed" from Danish.

(b) Borrowings from German, such as spise 'to eat', are definitely counted as borrowings and marked with a negative number. Some basic words have also been suggested as borrowings from Swedish, e.g. sky 'cloud', but evidence for that is frequently ambiguous, and most of those words could also count as Danish "borrowings" / "influences". We treat them the same way, i.e. as normal cognates.

(c) We consistently follow the information in the dictionaries of [Arakin 2000] and [Berkov 2006], choosing "default Bokmål" over "traditional Bokmål" (usually same as the ultra-conservative "Riksmål" norm) and "radical Bokmål", although from a lexicostatistical point of view this makes no difference whatsoever (cognation indexes always remain the same if "Danish-influenced" forms are treated the same way as "fully inherited" forms).
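
The sign convention described in (a) and (b) can be summarized in a few lines of Python. The Origin labels, the cognation_index function, and the class numbers used in the example are hypothetical and serve only to illustrate the rule: Danish-influenced forms and forms with a doubtful Swedish source keep a positive cognation number, whereas borrowings from German receive a negative one.

```python
from enum import Enum


class Origin(Enum):
    INHERITED = "inherited"
    DANISH_INFLUENCED = "danish_influenced"   # e.g. Danish-looking Bokmål forms
    SWEDISH_SUGGESTED = "swedish_suggested"   # ambiguous evidence, treated as cognate
    GERMAN_BORROWING = "german_borrowing"     # e.g. spise 'to eat'


def cognation_index(cognate_class: int, origin: Origin) -> int:
    """Return the signed cognation number for a Bokmål entry (sketch)."""
    if cognate_class <= 0:
        raise ValueError("cognate_class must be a positive integer")
    # Only definite borrowings are marked with a negative number.
    if origin is Origin.GERMAN_BORROWING:
        return -cognate_class
    # Everything else counts as a normal cognate.
    return cognate_class


if __name__ == "__main__":
    # The class numbers 1 and 3 are arbitrary placeholders.
    print(cognation_index(1, Origin.DANISH_INFLUENCED))
    print(cognation_index(3, Origin.GERMAN_BORROWING))
```

Because Danish-influenced and fully inherited forms map to the same index, the choice between Bokmål norms in (c) leaves the cognation indexes unchanged.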

V.2. Transliteration.

As per GLD standards, orthographic equivalents of Norwegian words are entered in curly brackets. Orthographic equivalents are also used throughout in the "notes" section. The primary entry is transliterated into UTS according to the following rules:

(a) the basic phonetic form of the Norwegian word is determined by the pronunciation rules as described in [Arakin 2000: II, 524-528] (the dictionary itself only lists phonetic transcriptions where they are not predictable through orthography; in such cases, we take over the transcribed form as well);

(b) graphic change from the transcription in the dictionary to UTS is minimal (Arakin's ʃ > UTS š; ç > UTS ʆ; j > UTS y; æ > UTS ä; y > UTS ü); retroflex consonants (ṭ, ḍ, ṇ, ḷ), phonetically developing from clusters rt, rd, rn, rl, are transliterated as ʈ, ɖ, ɳ, ɭ respectively;

(c) for greater accuracy, high and low pitch accents on root morphemes are marked wherever they are explicitly present in the dictionaries (only in the transcription, not in normative orthography).
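
The symbol substitutions in rule (b) can be sketched as a single character table. The function name arakin_to_uts and the sample inputs are hypothetical; the mapping itself is taken from the rule. The substitutions have to be applied simultaneously rather than one after another, since j becomes y while y itself becomes ü; str.translate does exactly that. Unicode normalization is added on the assumption that the dotted consonants may arrive either precomposed or as a letter plus a combining dot below.

```python
import unicodedata

# Arakin's symbol -> UTS symbol, as listed in rule (b).
_ARAKIN_TO_UTS = str.maketrans({
    "ʃ": "\u0161",       # š
    "\u00e7": "ʆ",       # ç
    "j": "y",
    "\u00e6": "\u00e4",  # æ -> ä
    "y": "\u00fc",       # ü
    "\u1e6d": "ʈ",       # ṭ (from rt)
    "\u1e0d": "ɖ",       # ḍ (from rd)
    "\u1e47": "ɳ",       # ṇ (from rn)
    "\u1e37": "ɭ",       # ḷ (from rl)
})


def arakin_to_uts(transcription: str) -> str:
    """Map a transcription in Arakin's notation to UTS symbols (sketch)."""
    # NFC turns "t" + combining dot below into the single character ṭ, etc.
    normalized = unicodedata.normalize("NFC", transcription)
    # translate() applies all substitutions in one pass, so j -> y
    # is not subsequently turned into ü.
    return normalized.translate(_ARAKIN_TO_UTS)


if __name__ == "__main__":
    print(arakin_to_uts("jy"))        # synthetic input showing the j/y interaction
    print(arakin_to_uts("t\u0323"))   # decomposed ṭ
```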

VI. Danish.

VI.1. General.

The wordlist is based on Standard Danish (official form of the language, based on the dialect of Copenhagen). All forms have been elicited with the aid of two bilingual dictionaries (Russian-Danish and Danish-Russian), well illustrated by examples of usage; some complex cases were further checked against a variety of Internet sources reflecting literary and colloquial usage.

VI.2. Transliteration.

Since the dictionary [Krymova et al. 2000] includes complete phonetic transcription for all the listed Danish words (as an auxiliary measure to help the reader navigate the complex relation between conservative Danish orthography and actual pronunciation), we have used it as the basis for all primary slot entries, keeping further transliteration to UTS standards to an absolute minimum. In the primary slots, forms are adduced in phonetic transliteration and standard orthography. In the notes section, only standard orthographic variants are listed.

VII. Swedish.

VII.1. General.

The wordlist is based on Standard Swedish (the most common form of the language, based primarily on the dialect of Stockholm, as reflected in standard dictionaries of the language). All forms have been elicited with the aid of two bilingual dictionaries (Russian-Swedish and Swedish-Russian), well illustrated by examples of usage; some complex cases were further checked against a variety of Internet sources reflecting literary and colloquial usage.

VII.2. Transliteration.

As in the case of Danish, the dictionary [Marklund-Sharapova 2007a] includes complete phonetic transcription for all the listed Swedish words. This transcription was largely retained in our list, including stress and tonal marks, although a few minor vocalic allophones were merged (e. g. E and ɛ). The source also regularly marks consonantal length with ː; we render this with a double consonant when it is reflected in the orthography (e. g. ʆött {kött} 'meat'), but with a length mark when this is purely a phonetic convention without any orthographic basis (e. g. hɵnːd {hund} 'dog').

In the primary slots, forms are adduced in phonetic transliteration and standard orthography. In the notes section, only standard orthographic variants are listed.
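
The convention for Swedish consonantal length described above can be approximated by the following sketch. It is a simplified heuristic, not the actual procedure: the function name render_consonant_length, the vowel inventory, and the assumed shape of the source transcription (a consonant followed by ː) are all assumptions, and the check simply looks for the same letter doubled anywhere in the orthographic form, so spellings such as ck are not handled.

```python
import re

# Rough vowel inventory used to skip long vowels (an assumption of this sketch).
VOWELS = set("aeiouyåäöɵɛøʉɑ")


def render_consonant_length(transcription: str, orthography: str) -> str:
    """Write a long consonant as a double letter if the spelling doubles it."""

    def replace(match: "re.Match[str]") -> str:
        consonant = match.group(1)
        if consonant in VOWELS:
            return match.group(0)   # vowel length is left untouched
        doubled = consonant * 2
        # Double consonant only when the orthography shows it; otherwise keep ː.
        return doubled if doubled in orthography.lower() else match.group(0)

    return re.sub(r"(.)ː", replace, transcription)


if __name__ == "__main__":
    print(render_consonant_length("ʆötː", "kött"), "{kött} 'meat'")
    print(render_consonant_length("hɵnːd", "hund"), "{hund} 'dog'")
```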

Database compiled and annotated by: G. Starostin (last update: February 2016).