Annotated Swadesh wordlists for the !Wi group (Peripheral Khoisan family).

Languages included: ǀXam [kwi-xam]; ǀǀNg!ke [kwi-lng]; ǂKhomani [kwi-kho]; Nǀuu [kwi-nuu]; ǀǀXegwi [kwi-xeg]; ǀʼAuni [kwi-aun]; ǀHaasi [kwi-haa].
Reconstruction: Preliminary version available.


I. General

Bleek 1929 = Bleek, Dorothea F. Comparative Vocabularies of Bushman Languages. Cambridge University Press. // A collection of mid-size vocabularies from 12 "Bushman" dialects (several North, South, and Central Khoisan idioms are represented), with most of the data collected by D. Bleek herself. Not as thorough as Bleek 1956, and even less reliable with regard to transcription, but its English-Bushman organization makes it a useful source to consult when preparing Swadesh wordlists.

Bleek 1956 = Bleek, Dorothea F. A Bushman Dictionary. American Oriental Society: New Haven, Connecticut. // A huge (almost 700 pages) collection of comparative data on Khoisan that includes both Dorothea F. Bleek's own collection and data from numerous other researchers published up until the 1930s (W. Bleek, L. Lloyd, etc.). Transcription quality varies between the different sources, but is generally unreliable, as is typical of all Khoisan data published before the second half of the 20th century. Nevertheless, the edition still contains a wealth of priceless data, particularly on extinct North and South Khoisan languages.

Westphal 1965 = Westphal, E. O. J. Linguistic research in S.W.A. and Angola. In: Die ethnischen Gruppen Südwestafrikas. Wissenschaftliche Forschung in Südwestafrika, Bd 3. Windhoek: Südwestafrikanische Wissenschaftliche Gesellschaft, pp. 125-144. // Brief sketch of some of the author's investigations and considerations on the internal relationship of the so-called "Khoisan" languages. Contains a small comparative wordlist for 14 "Khoisan" idioms, drawn from the author's personal field data collections, including 25 Swadesh items.

II. ǀǀNg!ke

Bleek 2000 = Bleek, Dorothea F. Notes on the language of the //ŋ!ke or Bushmen of Griqualand West. Ed. by Tom Güldemann. Khoisan Forum Working Paper No. 15. // An archival edition of a relatively brief manuscript that provides some grammatical notes on ǀǀNg!ke.

III. ǂKhomani

Maingard 1937 = Louis F. Maingard. The ǂKhomani dialect of Bushman: its morphology and other characteristics. In: Bushmen of the southern Kalahari. Ed. by J. D. Rheinallt Jones & C. M. Doke. Witwatersrand University Press, Johannesburg, pp. 237-275. // Brief grammar sketch of the ǂKhomani dialect; no separate vocabulary, but including a set of three translated texts in the language.

Doke 1936 = Clement M. Doke. An outline of ǂKhomani Bushman phonetics. In: Bantu studies (Johannesburg), 10, pp. 433-461. // Description of the phonetic and phonological system of the ǂKhomani dialect, accompanied by a significant amount of illustrative lexical material, but without a separate vocabulary.

IV. Nǀuu

Sands et al. 2006 = Sands, Bonny; Miller, Amanda; Brugman, Johanna; Namaseb, Levi; Collins, Chris; Exter, Mats. 1400 item Nǀuu Dictionary. // Unpublished manuscript. All data have been provided courtesy of Bonny Sands.

Sands et al. 2007 = Sands, Bonny; Miller, Amanda; Brugman, Johanna. The Lexicon in Language Attrition: The Case of Nǀuu. // Brief paper discussing possible relations between peculiarities of the lexical inventory of Nǀuu and its current sociolinguistic status. Includes a little bit of illustrative data.

Miller et al. 2009 = Miller, Amanda; Brugman, Johanna; Sands, Bonny; Namaseb, Levi; Exter, Mats; Collins, Chris. Differences in airstream and posterior place of articulation among Nǀuu clicks. In: Journal of the International Phonetic Association, 39 (2), pp. 129-161. // Discussion of certain articulatory and acoustic peculiarities of the phonetic system of Nǀuu. Includes around a hundred illustrative lexical items.

Miller et al. 2007 = Draft version of Miller et al. 2009, available at: // Includes slightly different illustrative lexical data.

V. ǀǀXegwi

Ziervogel 1955 = Dirk Ziervogel. Notes on the language of the eastern Transvaal Bushmen. In: The disappearing Bushmen of Lake Chrissie, ed. by E. F. Potgieter. (Hidding-Currie publications of the University of South Africa, no. 1.). J L van Schaik. Pretoria, pp. 34-64. // A sketch of ǀǀXegwi phonology and grammar, accompanied by two illustrative texts, but no separate lexical glossary.

Lanham & Hallowes 1956 = Leonard Lanham, D. P. Hallowes. An outline of the structure of eastern Bushman. In: African studies (Johannesburg), 15 (3), pp. 97-118. // A sketch of ǀǀXegwi phonology and grammar without any texts or glossaries, but containing an important selection of illustrative lexical data.

Lanham & Hallowes 1956a = Leonard Lanham, D. P. Hallowes. Linguistic relationships and contacts expressed in the vocabulary of Eastern Bushman. In: African studies (Johannesburg), 15 (1), pp. 45-48. // Very short article on certified and potential areal connections between ǀǀXegwi and neighbouring languages, mostly Bantu. Contains a small handful of unique lexical data.

VI. ǀʼAuni

Bleek 1937 = Dorothea F. Bleek. Grammatical notes and texts in the ǀAuni language + ǀAuni vocabulary. In: Bushmen of the Southern Kalahari, ed. by J. D. Rheinallt Jones and C. M. Doke. Johannesburg: The University of the Witwatersrand Press, pp. 195-220. // Very brief notes on ǀʼAuni grammar and a mini-selection of texts, accompanied by a vocabulary of several hundred lexical items. Based on D. Bleek's own research with ǀʼAuni speakers in 1936, which is significantly superior to the results of her earlier work in 1911, published in [Bleek 1929].

VII. ǀHaasi

Story 1999 = Robert Story. Kʼuǀhaːsi Manuscript. Ed. by Anthony Traill. Khoisan Forum Working Paper No. 13. // Full reproduction of the brief wordlist, grammar notes, and phrases in the ǀHaasi language as recovered from Robert Story's original manuscript of 1937; accompanied by extensive notes by A. Traill, including noteworthy considerations on how to interpret Story's phonetic notation.


1. General.

I. ǀXam

The main entry, in the absolute majority of cases, represents Lucy Lloyd's transcription variant(s) of the ǀXam word, extracted from [Bleek 1956]; transcriptional variants from Wilhelm Bleek's earlier records as well as their equivalents in [Bleek 1929] (this source usually follows W. Bleek rather than L. Lloyd) are added in the notes section. Everything has been properly transliterated into UTS, although a few diacritics (such as the diaeresis and the non-phonological vowel shortness markers) have been dropped. Morphemic boundaries have been added only where they are clearly required by Khoisan phonotactics (e. g. before all syllables that begin with a stop consonant).

II. ǀǀNg!ke

All data are from D. Bleek's fieldwork, recorded in [Bleek 1929, 1956, 2000]. There are clearly several subdialects involved (as evidenced by significant variation in transcribed variants, including the occasionally emerging phenomenon of "click dropping"), but no significant lexicostatistical variation is observed.

III-IV. ǂKhomani and Nǀuu

This is ostensibly the exact same dialect, the recordings of which, however, are set 60-70 years apart. "ǂKhomani" is the old name as recorded in the descriptions of L. Maingard and C. Doke, and Nǀuu is the name applied to the language as spoken by the few re-discovered speakers in the late 1990s / early 2000s, and described by N. Crawhall, B. Sands, A. Miller and others.

V. ǀǀXegwi

Notes on the extinct ǀǀXegwi were first taken by D. F. Bleek (who calls the language "Batwa") in the 1910s, and later by D. Ziervogel and by L. Lanham & D. P. Hallowes in the 1950s. The most detailed, although perhaps not the most phonetically reliable, description belongs to Ziervogel, whose lexical data are taken as the default source. Lanham & Hallowes' description contains fewer lexical entries, but seems to be more accurate transcriptionally. A few empty slots have been filled with data from D. Bleek's records, which have to be treated cautiously due to occasional misglossings and poor transcriptional quality (in particular, a failure to recognize either uvular q or lateral affricates as autonomous phonological units).

VI. ǀʼAuni

All the data on the extinct ǀʼAuni come from D. F. Bleek's research, undertaken first in 1911 (results published in [Bleek 1929]), and then during an ethnographic expedition in 1936 (published as [Bleek 1937] and later included in [Bleek 1956]). Data from 1936, accompanied by textual evidence, are much more abundant and precise than data from 1911, although still not free of phonetic and semantic mistakes typical of most of the early research on Khoisan languages.

VII. ǀHaasi

The only data on the extinct ǀHaasi come from Robert Story [Story 1999]; they are sufficient to fill in approximately 70% of the Swadesh wordlist, but raise numerous questions as to the accuracy of both the phonetic notation and the semantic glosses. Nevertheless, as a language that is closely related to, but still distinct from ǀʼAuni, this is a very important link whose inclusion in the overall lexicostatistics is quite useful.

NB: It should be kept in mind that, although most of the extinct !Wi languages are still represented in the Ethnologue system, the current nomenclature is quite misleading in the case of ǀʼAuni and ǀHaasi: both are listed in the system as "dialects" of !Xóõ [nmn], but !Xóõ actually belongs to a different branch of South Khoisan (Taa), and neither ǀʼAuni nor ǀHaasi could ever be seriously considered as its "dialects". Hopefully, this error will be corrected in future editions.

2. Transcription.

Most of the transliterations concern old sources, collected in [Bleek 1956]; the major exception is data on Nǀuu, which has been transliterated from the orthographic conventions employed in [Miller et al. 2009] and other similar sources, with minor orthographic changes. These new sources also sometimes employ a more detailed phonetic transcription; in those cases when words in phonetic transcription are significantly different from words written in phonology-based orthography, their phonetic transcription is quoted in square brackets in the notes section. The following transliteration table may be useful for those who are unfamiliar with the tricky aspects of Khoisan phonology and phonetics:

Sound or sound type | Bleek/Lloyd transcription | Phonetic transcription for Nǀuu | UTS representation
Unaccompanied "simple" click | ǀk, ǂk... | ǀ, ǂ... | ǀ, ǂ...
Voiced click | ǀg, ǂg... | gǀ, gǂ... | ɡǀ, ɡǂ...
Nasalized click | ǀn, ǂn... | ŋǀ, ŋǂ... | ɳǀ, ɳǂ...
Aspirated click | ǀkh, ǂkh... | ǀh, ǂh... | ǀh, ǂh...
Glottalized click (1) | ǀ, ǂ... | ŋǀʼ, ŋǂʼ... | ǀʼ, ǂʼ...
"Delayed aspiration" click (2) | ǀh, ǂh... | ŋǀh, ŋǂh... | ǀʼh, ǂʼh...
Simple uvular release (3) | not distinguished from ǀk, ǂk... | ǀq, ǂq... | ǀq, ǂq...
Aspirated uvular release | not distinguished from ǀkh, ǂkh... | ǀqh, ǂqh... | ǀqh, ǂqh...
Velar/uvular fricative release (4) | ǀx, ǂx... | ǀχ, ǂχ... | ǀx, ǂx...
Velar/uvular affricate release (5) | ǀk", ǂk"... | ǀχʼ, ǂχʼ... | ǀxʼ, ǂxʼ...
Voiceless palatal stop or affricate (6) | ky ~ ty | c | ɕ
Voiced palatal stop or affricate | gy ~ dy | ɟ | ʓ
Voiceless alveolar affricate | ts | ts | c
Velar or uvular fricative | x | χ | x
Vowel | e ~ ɛ | e ~ ɛ | e (~ ɛ)
Vowel | o ~ ɔ | o ~ ɔ | o (~ ɔ)
Vowel | a | ɑ ~ ǝ | a (~ ǝ)
Nasalized vowels | ã, ẽ, õ... | an, en, on... | ã, ẽ, õ...
Pharyngealized vowels | aꜣ, eꜣ, oꜣ... | aˤ, eˤ, oˤ... | a̰, ḛ, o̰...
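
The click rows of the table above can be read as a simple lookup from Bleek/Lloyd accompaniments to UTS. The following Python sketch illustrates this under stated assumptions: the function name click_onset_to_uts and the data layout are hypothetical and are not part of the database tooling, only click onsets (not whole words) are handled, and only the four click symbols used in this document are listed. Since the old sources do not distinguish uvular releases from ǀk and ǀkh, a lookup of this kind cannot recover them.

```python
# Click symbols; the doubled lateral "ǀǀ" must be tried before the single "ǀ".
CLICKS = ("ǀǀ", "ǀ", "ǂ", "!")

# Bleek/Lloyd accompaniment -> (UTS prefix, UTS suffix), following the table.
ACCOMPANIMENTS = {
    "k": ("", ""),      # unaccompanied "simple" click: ǀk -> ǀ
    "g": ("ɡ", ""),     # voiced click: ǀg -> ɡǀ
    "n": ("ɳ", ""),     # nasalized click: ǀn -> ɳǀ
    "kh": ("", "h"),    # aspirated click: ǀkh -> ǀh
    "": ("", "ʼ"),      # glottalized click: bare ǀ -> ǀʼ
    "h": ("", "ʼh"),    # "delayed aspiration" click: ǀh -> ǀʼh
    "x": ("", "x"),     # fricative release: ǀx -> ǀx
    'k"': ("", "xʼ"),   # affricate release: ǀk" -> ǀxʼ
}


def click_onset_to_uts(onset):
    """Convert one Bleek/Lloyd click onset (click + accompaniment) to UTS."""
    for click in CLICKS:
        if onset.startswith(click):
            accompaniment = onset[len(click):]
            if accompaniment not in ACCOMPANIMENTS:
                raise ValueError("unknown accompaniment: %r" % accompaniment)
            prefix, suffix = ACCOMPANIMENTS[accompaniment]
            return prefix + click + suffix
    raise ValueError("no click symbol at start of: %r" % onset)
```

For instance, click_onset_to_uts("ǂkh") yields the aspirated click ǂh, while a bare click symbol is returned with the glottalization mark added.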

Additional notes:

(1) Glottalized release is in some Khoisan dialects accompanied by voiceless nasalization. Since glottalization is always recognized as the primary feature of these phonemes, and there do not seem to be any contrasts between pre-nasalized glottalized / non-nasalized glottalized clicks, nasalization is omitted from UTS transliteration.
Special note: In [Bleek 1937], glottalized release in ǀʼAuni words is sometimes marked in the usual way (i. e. the click symbol with no accompanying symbols), and sometimes with an explicit glottal stop (e.g. ǀǀa 'to go', but ǀǀʼa 'to dig'). It is not entirely clear what is meant by this, since such a contrast is unprecedented as far as phonetically well-described Khoisan languages are concerned. It may be that the explicit transcription of the glottal stop signifies additional glottalic articulation on the vowel (i.e. ǀǀʼa is really ǀǀaʔa, etc.).

(2) This click is alternately described as combining ejective (glottalized) articulation with ensuing aspiration (e. g. by C. Doke for ǂKhomani) or as a "voiceless nasal aspirated" click (e. g. by A. Miller et al. for Nǀuu). Despite variation in actual pronunciation, from a phonological standpoint this is always the same phoneme.

(3) "Uvular" clicks are now being reinterpreted as a special type of "linguo-pulmonic" consonants, whose main distinction from "simple" "lingual" clicks is a difference in airstream mechanism (see [Miller et al. 2009] for a detailed explanation), since in reality all clicks have posterior uvular, rather than velar, constrictions. Despite this, the uvular q is still retained as a special transcription marker for "linguo-pulmonic" sounds, and a correlation between this and similar click releases and simple uvular consonants is not out of the question.

(4) This release, in all of the old sources, is consistently marked and described as "velar fricative", but newer recordings and descriptions indicate that its phonetic quality is usually (perhaps even always) uvular. Despite this, we retain the old notation with x rather than χ, since uvular and velar fricatives are never known to contrast phonologically in Khoisan languages (at least, attested ones).

(5) This release correlates with the non-click phoneme that is usually described as a velar or uvular glottalic affricate. We retain one of the traditional notations for this affricate (xʼ is preferred over the widespread notation kx for technical reasons connected with automated analysis), although for Nǀuu at least, and probably for most other Khoisan languages, this is not phonetically exact.

(6) In transliterating the old sources, we are sometimes forced to re-transcribe ky ~ ki, ty ~ ti as , , in order to preserve the "contrast" between these two types of palatal articulation, even though in reality they must have been in free variation with each other (reflecting slightly different variants of the same palatal stop).

(7) In transliterating vowels, we follow these conventions: such pairs as e/ɛ, o/ɔ are retained for old sources which do not specifically indicate that these pairs are allophonic (although they might be and probably are), but unified (as e, o) for new sources which explicitly treat the pairs as allophones and only make the distinction in phonetic transcription.
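
The vowel convention in note (7) amounts to a conditional merge. A minimal Python sketch, assuming each source carries a flag saying whether it explicitly treats e/ɛ and o/ɔ as allophones (the function name normalize_vowels, the flag, and the one-letter test strings are hypothetical illustrations, not attested forms):

```python
# Open-mid vowels and the phonemes they are merged into for new sources.
VOWEL_MERGE = {"ɛ": "e", "ɔ": "o"}


def normalize_vowels(form, source_treats_pairs_as_allophones):
    """Unify e/ɛ and o/ɔ only for sources that treat them as allophones."""
    if not source_treats_pairs_as_allophones:
        # Old sources: the distinction is retained as transcribed.
        return form
    return "".join(VOWEL_MERGE.get(ch, ch) for ch in form)
```

Under this sketch, a form from an old source passes through unchanged, whereas the same string from a new source has ɛ and ɔ replaced by e and o.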

3. Reconstruction.

The task of reconstructing Proto-!Wi is exceedingly hard and thankless. The only !Wi language to have persisted into the 21st century and to have been recorded according to more or less "modern" standards of accuracy is Nǀuu, and even then it is not yet clear to what extent the speakers were influenced by Khoe languages. All the other languages suffer from all possible sorts of problems: primarily, phonetic inaccuracy (e. g. only Lanham & Hallowes' data on ǀǀXegwi recognize the presence of uvular clicks and consonants in this language), but also semantic errors and data incompleteness (the latter particularly important in the case of ǀHaasi). Consequently, all historical-comparative research on !Wi has to be taken cum grano salis, at least until a complete and well-organized digital data collection has been made available.

Nevertheless, in many cases it is still possible to make relatively adequate choices, based on the following criteria: (a) visibly recurrent phonetic correspondences between the various languages (including those that may in fact represent notational errors, but are still recurrent enough to confirm the non-accidental nature of the comparisons); (b) distribution of phonetically corresponding or at least phonetically similar (identical) potential cognates between languages (including also scattered bits of information on other !Wi languages, data on which are too scarce to include them in our lexicostatistics, but quite useful for reconstruction purposes; these languages are ǀǀKuǀǀe, ǀǀKxau, !Gã!ne, and Seroa). Additionally, it is permissible to rely on external information (most importantly, data from the only well-described language in the Taa group, !Xóõ) to confirm or disprove certain hypotheses concerning optimal candidates for the Swadesh proto-wordlist: since the !Wi-Taa relationship is well confirmed by regular lexicostatistics between living languages, their data may be "exchanged" to corroborate judgements about proto-wordlists as well.

A detailed table of phonetic correspondences is not given here, since in many cases the regularity of these correspondences remains questionable, reflexation splits remain unclear, and in even more cases it is not even perfectly understood whether the "correspondences" reflect actual phonetic discrepancies or transcriptional inaccuracies. Instead, whenever the correspondences are "non-trivial" (especially if this involves correspondences between different types of clicks or between clicks and non-clicks), detailed comments are given in the "Reconstruction shape" section of the notes (sometimes with references to other examples that support the correspondence).

It should be noted that, contra T. Güldemann's recent re-classification, we do not find the evidence in support of a "Taa + Lower Nǂossob" genetic grouping more overwhelming than evidence in support of a "ǀXam-Nǀuu-ǀǀXegwi + Lower Nǂossob" grouping. In the lexicostatistical sphere, the isoglosses which tie together Taa and Lower Nǂossob languages may all be regarded as shared archaisms rather than innovations (this is also somewhat confirmed by external comparison with the distantly related Ju languages). In the currently employed classification, "!Wi" is divided into "Narrow !Wi" (all varieties of ǀXam, Nǀuu, and the slightly more distant ǀǀXegwi, as well as several other extinct and poorly described languages in D. Bleek's dictionary, see above) and "Lower Nǂossob" (ǀʼAuni + ǀHaasi).

Unfortunately, this binary split is very uneven in terms of available data, since the entire Lower Nǂossob branch is only represented by inaccurate, incomplete, and unverifiable (due to the languages' extinction) old sources. This almost inevitably skews the "Proto-!Wi" reconstruction in favor of "Narrow !Wi" (with a particularly strong bias in favor of the best preserved and described !Wi language, Modern Nǀuu), and makes the reconstructed proto-wordlist largely unfit to help us in establishing the lexical replacement rates for various !Wi languages. It should, therefore, be regarded as more important for the purposes of external comparison (with Proto-Taa) than for comparison with its alleged modern descendants.

Database compiled and annotated by: G. Starostin (latest version: August 2015). The compiler also expresses sincere gratitude to B. Sands for providing unpublished data on Nǀuu.