This is the etymological database for the North Khoisan (or Zhu) group of languages, or rather dialects, since it is not quite clear whether the group really contains more than one language with a large set of dialects (it is perhaps most convenient to speak of the Northern and Southern clusters as two separate languages, but hardly beyond that). The database is directly linked to the Peripheral Khoisan (Juu-Taa) database. This somewhat downplays the fact that within Peripheral Khoisan, North Khoisan dialects are most tightly connected with ǂHoan; however, since ǂHoan is just one language without a sprawling dialect network, it is technically more suitable to link it to Peripheral Khoisan separately.

Data sources: for North Khoisan, we have one major source of data in the form of Patrick Dickens' excellent dictionary of Juǀʼhoan (Dickens 1994), which serves as the main reference point in the database. Many of the forms given by Dickens are de facto treated as representative of Proto-North Khoisan, even if they are not found among the materials for other dialects, simply because none of these materials match the scope of Dickens' collection of data. Of course, this raises the obvious concern that many of these forms may NOT be Proto-North Khoisan at all, but recent borrowings into Juǀʼhoan. I have done my best to weed out the most obvious of such borrowings, mainly from European languages, Bantu languages, and Nama (Central Khoisan); some, however, are almost sure to remain, so caution is to be exercised when a Juǀʼhoan form is not confirmed by any other entry in the same etymology.

Apart from Dickens 1994, the majority of reliable information comes from the Northern cluster, culled from works by T. Heikkinen, C. Koenig & B. Heine, and J. Snyman. The latter's comparative data on a whole set of NK dialects (from Snyman 1997) are also invaluable, although, unfortunately, rather scant. Data on Southern (and Central, if we take the so-called ǀǀAuǀǀen idiom to be representative of it) cluster dialects stem almost entirely from older collections, such as Dorothea Bleek's materials (Bleek 1929 and Bleek 1956). These data are uniformly weaker in quality, but still very valuable for evaluating the geographic distribution of the forms and making hypotheses on reconstruction.

The database consists of the following fields:
1. Proto-Ju: the reconstructed form, sometimes depending exclusively on the corresponding Juǀʼhoan entry, but in many cases based on comparative evaluation of the evidence.
2. Tonal pattern: at the moment I am not offering tonal reconstruction in the "Proto-Ju" field, but the tonal characteristics of those sources which seem to mark tonal oppositions consistently (Dickens 1994, Heikkinen 1986, Koenig-Heine 2001) are indicated separately. For the explanation of the notation system, see below.
3. Meaning: the meaning of the protoform (frequently just directly copied from Dickens 1994).
4. Juǀʼhoan: the main entry in Dickens 1994. All the forms are given in the official transliteration system employed by P. Dickens (even though I, personally, do not find it too comfortable). The forms are occasionally accompanied by their transcribed variants in Snyman 1970 (Sn.).
5. ǀǀAuǀǀen ("NI" in Bleek 1956): the default sources are Bleek 1929/Bleek 1956.
6. Nogau ("NIa" in Bleek 1956): the very few forms available represent C. Lebzelter's data as transcribed in Bleek 1956.
7. !Kung ("NII" in Bleek 1956): this field contains a veritable melange of forms from various dialects, all put together by D. Bleek under the NII heading. These include data originally collected by Lucy Lloyd (Ll.), H. Vedder (Vd.), J. H. Wilhelm (Wil.), and, most importantly, C. M. Doke (Dk.), whose notes are of particular importance as he was the only early-period researcher to consistently mark the so-called "retroflex click". I have also added some newer data from Ferdie Weich's San vocabulary (W.) here, although it seems that the dialect he is describing is rather "NIII" than "NII".
8. !O!Kung ("NIII" in Bleek 1956). The default sources are Bleek 1929/Bleek 1956, but the same field also contains J. Snyman's much later (and better quality) data on "Angolan !Xung" (Snyman 1980).
9. Kavango: the dialect of western Kavango and Ovamboland, with the data taken from Heikkinen 1986.
10. Ekoka: the Ekoka !Xung dialect, with data taken from Koenig & Heine 2001.
11-21. Tsumkwe, Tsintsabis, Okongo, Leeunes, Mpunguvlei, Cuito, Cuando, South Omatako, North Omatako, Kameeldoring, Lister: data from J. Snyman's dialect survey of North Khoisan, all entered from the appendices to Snyman 1997. Technically, "Tsumkwe" is practically the same as Juǀʼhoan proper, and some other dialects almost certainly overlap with others already listed above as described by other researchers, but it was thought that the data should best remain in table-like form close to the original form that J. Snyman shaped it into.
22. Notes: additional comments and considerations.
23. References: bibliographical links.

Notes on transcription:
For old data (Bleek 1956 and older), the original transliteration systems are followed. For data taken from newer and more reliable sources, some of the usual transliterating conventions of ToB are followed (such as transcribing prevoiced clicks as ɡǀ, etc., and nasalized ones as ɳǀ, etc.); in most cases, however, I follow the conventions used by researchers themselves. Proto-North Khoisan mostly uses the same conventions as the other Khoisan databases.
The notational system of Patrick Dickens for Juǀʼhoan is retained with two exceptions: (a) I use a subscript tilde to denote pharyngealized vowels (a̰, etc.) instead of marking them with the letter q (aq, etc.), as this tends to be confusing; (b) the velar affricate efflux is always transcribed by me in the standard way (ǀkx, ǀǀkx, etc.), whereas Dickens marks it with a simple -k- (ǀk, ǀǀk, etc.), which tends to be even more confusing.
Clicks: ǀ = dental click, ǂ = palatal click, ! = alveolar click, !̯ = retroflex click, ǀǀ = lateral click.
Click effluxes include the following:
- zero efflux (Proto-NK, post-Bleek researchers: no special marking; Bleek: ǀk, etc.);
- voiced efflux (Proto-NK, post-Bleek researchers: ɡǀ, etc.; Bleek: ǀg, etc.);
- velar fricative efflux (Proto-NK, post-Bleek researchers, Bleek: ǀx, etc.);
- prevoiced velar fricative efflux (Proto-NK, post-Bleek researchers: ɡǀx, etc.; Bleek ?);
- velar ejective affricate (Proto-NK, post-Bleek researchers: ǀkx or ǀxʼ, etc.; Bleek ǀkx or ǀk", etc.);
- prevoiced velar ejective affricate (Proto-NK, post-Bleek researchers: ɡǀkx or ɡǀxʼ etc.; Bleek ?);
- aspirated efflux (Proto-NK: ǀkh, etc.; post-Bleek researchers: ǀh, etc.; Bleek ǀkh, etc.);
- delayed aspiration efflux (Proto-NK: ǀh, etc.; post-Bleek researchers: ǀʼh, etc.; Bleek ǀh, etc.);
- nasalized efflux (Proto-NK, post-Bleek researchers: ɳǀ, etc.; Bleek ǀn, etc.);
- aspirated nasalized efflux (Proto-NK: ɳǀh, etc.; post-Bleek researchers: ɳǀʼh, etc.; Bleek ?);
- preglottalized nasalized efflux (Proto-NK, post-Bleek researchers: ʔǀn, etc.; existence not recognized in Bleek 1956 or previous works);
- glottal stop (Proto-NK, post-Bleek researchers: ǀʼ, etc.; Bleek: no special marking).

Special note on aspirated effluxes: most North Khoisan dialects distinguish "simple" aspiration of the click and "delayed" aspiration, accompanied by a glottal stop. It would seem appropriate to mark the two types in Proto-NK forms as *ǀh and *ǀʼh, respectively, following the more recent and phonetically adequate transcriptions. However, existing evidence shows that the first type of clicks more frequently corresponds to South Khoisan uvular aspirated clicks (ǀqh, etc.), whereas the second one is better associated with simple aspiration in South Khoisan (ǀh, etc.). Therefore I take the liberty of representing the two types in the reconstruction as *ǀkh and *ǀh respectively, to emphasize the fact that the velar stop element in the former is more important historically than the glottal stop element in the latter.
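The notational shift for the aspirated series can be sketched as a small rewriting step. This is only an illustration, assuming the input is a single click-plus-efflux segment in post-Bleek notation; the function name and the suffix-based approach are hypothetical and are not part of the database itself.

```python
# Hypothetical sketch: rewrite the aspirated effluxes of a click segment
# from post-Bleek notation into the Proto-NK notation described above.
GLOTTAL = "\u02bc"  # the modifier apostrophe used in "delayed aspiration"

# Post-Bleek efflux -> Proto-NK efflux (aspirated series only).
MODERN_TO_PROTO = {
    GLOTTAL + "h": "h",  # delayed aspiration
    "h": "kh",           # simple aspiration
}

def to_proto_notation(segment):
    """Return the segment with its aspirated efflux in Proto-NK notation.

    Longer effluxes are tried first, so that delayed aspiration is not
    mistaken for simple aspiration. Other segments are returned unchanged.
    """
    for modern in sorted(MODERN_TO_PROTO, key=len, reverse=True):
        if segment.endswith(modern):
            return segment[: -len(modern)] + MODERN_TO_PROTO[modern]
    return segment
```

Trying the longer efflux first matters here: a naive replacement of every final h would turn delayed aspiration into a spurious velar sequence.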

Non-click consonants: special comments are necessary as to the transcription of affricates and hissing/hushing fricatives. Various NK dialects distinguish between the following sounds:
voiceless hissing affricate: c (= ts in Dickens' transcription);
voiced hissing affricate: ʒ (not present in Dickens' transcription due to lack in Juǀʼhoan, but sometimes transcribed as dz in other sources);
ejective (or preglottalized) hissing affricate: cʼ (= tz in Dickens' transcription);
voiceless aspirated hissing affricate: cʰ (= tsh in Dickens' transcription);
preglottalized voiced hissing affricate: ʒʼ (= ds in Dickens' transcription);
preglottalized voiced aspirated hissing affricate: ʒʰʼ (= dsh in Dickens' transcription);
voiceless and voiced hissing fricatives: s, z;
voiceless hushing affricate: č (= tc in Dickens' transcription);
voiced hushing affricate: ǯ (not present in Dickens' transcription due to lack in Juǀʼhoan, but sometimes transcribed as in other sources);
ejective (or preglottalized) hushing affricate: čʼ (= tj in Dickens' transcription);
voiceless aspirated hushing affricate: čʰ (= tch in Dickens' transcription);
preglottalized voiced hushing affricate: ǯʼ (= dc in Dickens' transcription);
preglottalized voiced aspirated hushing affricate: ǯʰʼ (= dch in Dickens' transcription);
voiceless hushing fricative: š (= c in Dickens' transcription);
voiced hushing fricative: ž (= j in Dickens' transcription).
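The correspondences above amount to a lookup table from Dickens' spellings to the symbols used in the database. The following is a minimal sketch, assuming that only the word-initial consonant needs converting and that the symbols for tz and tsh follow the pattern of the hushing series (cʼ, cʰ); the function name is hypothetical and the test strings are invented, not real Juǀʼhoan words.

```python
# Hypothetical sketch: Dickens' spellings of affricates and fricatives
# mapped to the database symbols listed above.
DICKENS_TO_DB = {
    "ts": "c", "tz": "cʼ", "tsh": "cʰ", "ds": "ʒʼ", "dsh": "ʒʰʼ",
    "tc": "č", "tj": "čʼ", "tch": "čʰ", "dc": "ǯʼ", "dch": "ǯʰʼ",
    "c": "š", "j": "ž",
}

def convert_onset(word):
    """Convert a word-initial Dickens spelling to database notation.

    Longer spellings are tried first, so that "tsh" is not read as "ts"
    followed by "h". Words with any other onset are returned unchanged.
    """
    for spelling in sorted(DICKENS_TO_DB, key=len, reverse=True):
        if word.startswith(spelling):
            return DICKENS_TO_DB[spelling] + word[len(spelling):]
    return word

# Invented strings, for illustration only:
print(convert_onset("tcha"), convert_onset("tsa"), convert_onset("ca"))
```

Matching the longest spelling first is the only design decision of note, since several of Dickens' digraphs are prefixes of his trigraphs.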
In older sources (Bleek 1956 and before) all of these sounds are transcribed in a whole variety of manners, which often obscures their behaviour in the actual dialects, so it is often impossible to determine whether a particular spelling consistently reflects a particular phoneme or is just an idiosyncrasy on the part of the transcriber. In order to determine the original nature of the phoneme concerned, it is advisable to choose later material (Dickens, Heikkinen, etc.) as diagnostic. It should be kept in mind, though, that throughout the Northern cluster of dialects, hissing and hushing phonemes of PNK tend to regularly merge into just one hushing series, whereas some Southern dialects sometimes show the reverse development.

Vowels: ɛ, ɔ = open correlates to closed e, o (found mostly as phonetic variants in older materials and probably irrelevant on the PNK level). Pharyngealized vowels are marked with a subscript tilde (a̰, etc.); breathy vowels are marked with a superscript h (= Dickens' regular h); nasalized vowels are marked with a tilde (ã, õ, etc.).

Tones: most NK dialects distinguish up to four tones, but transcription systems sometimes vary. The most common markings are as follows:
High tone (H) - acute accent (á, é, etc.), but double acute accent in Heikkinen 1986 (a̋, e̋, etc.);
Mid tone (M) - macron (ā, ē, etc.), but simple acute accent in Heikkinen 1986 (á, é, etc.). Note that Juǀʼhoan has no mid tone, according to Dickens;
Low tone (L) - grave accent (à, è, etc.) in all sources;
Extra low tone (X) - unmarked in Juǀʼhoan (a, e, etc.), double grave accent (ȁ, ȅ, etc.) in Heikkinen 1986 and Ekoka (Koenig & Heine 2001).
The majority of old sources try to mark tonal distinctions in one way or another, and most of the transcriptions found in Bleek 1956 are reflected in the database in some form. However, these old tonal notations tend to be very unreliable and almost unusable for reconstruction purposes.
Each of the NK morphemes is either bisyllabic with two morae (CV-CV) or monosyllabic with two morae (CV-V), meaning that the tonal structure of each is represented by two tonal characteristics. The field "Tonal pattern" encodes these characteristics from the most reliable sources. For example, PNK *ɡǀui 'hyaena' is assigned the information XL [j] ~ XX [k] ~ XX [e], meaning that the morpheme has extra low tone on both vowels in Kavango and Ekoka, but extra low tone on -u- and low tone on -i in Juǀʼhoan.
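For readers who wish to process the "Tonal pattern" field automatically, the notation can be unpacked along the following lines. This is a rough sketch: the function name is hypothetical, and the reading of the source tags (j = Juǀʼhoan, k = Kavango, e = Ekoka) is inferred from the example above rather than taken from a formal specification.

```python
import re

# Tone letters as defined in the list of tones above.
TONES = {"H": "high", "M": "mid", "L": "low", "X": "extra low"}

# Source tags as inferred from the 'hyaena' example (an assumption).
SOURCE_TAGS = {"j": "Juǀʼhoan", "k": "Kavango", "e": "Ekoka"}

def parse_tonal_pattern(field):
    """Split a field such as 'XL [j] ~ XX [k] ~ XX [e]' into a dict
    mapping each source tag to the tones of the two morae."""
    result = {}
    for part in field.split("~"):
        match = re.fullmatch(r"\s*([HMLX])([HMLX])\s*\[(\w)\]\s*", part)
        if match is None:
            raise ValueError("unrecognized tonal pattern: " + repr(part))
        first, second, tag = match.groups()
        result[tag] = (TONES[first], TONES[second])
    return result

pattern = parse_tonal_pattern("XL [j] ~ XX [k] ~ XX [e]")
for tag, (first, second) in pattern.items():
    print(SOURCE_TAGS.get(tag, tag), "-", first, "+", second)
```

Entries that do not follow the two-letter pattern raise an error rather than being silently skipped, so that irregular records can be inspected by hand.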