The Global Lexicostatistical Database: Collaboration











With six or seven thousand languages spoken on the planet (many times more if dialects are counted), the GLD project clearly needs as many helping hands as possible to reach completion. However, since the standards required for GLD data input are, on the whole, much more demanding than those employed in most similar projects, useful collaboration requires more than simply an ability to type in wordlists.


Above all, we heartily welcome collaboration from professional linguists, preferably (but not necessarily) with a background in historical-comparative studies, either general or focused on specific language families. Such collaboration can take different forms, but in general there are two ways to go about it:


(1) Checks and criticisms concerning the lexical data, the accompanying notes, or both, for wordlists already compiled and uploaded to the site. Professional feedback of this kind is always appreciated, and may be passed on to the compilers either through the "Send comment or report error" form, which can be filled in while browsing any particular database, or directly by e-mail (send everything to George Starostin at


(2) Compiling your own wordlists. If you are a professional linguist with an active enough interest in the GLD to want to contribute your own lexicostatistical data (for which the GLD, of course, always gives full credit to the compiler), there is every opportunity to do so! What you will need to do is:


— contact the principal coordinator of the project (George Starostin, at to discuss the particular languages / areas you would be interested in covering (so that your work does not overlap with what has already been done or is currently being done by other members);


— carefully study and agree to the basic standards of GLD data input (such as the unified transcription system, the rules for dealing with synonyms and other complex cases, the correct notation of comments, etc.), all of which are described in the documents listed on the General information page (it is not nearly as difficult as it may sound);


— decide whether you would prefer to perform your work in a Unicode-based environment such as MS-Word (we can convert it into StarLing database form upon completion), or within the StarLing software itself, and download the corresponding templates from the Downloads page.


Note on format: The GLD is, to a certain extent, a flexible system, but in many respects its format is very rigid, in order to provide the uniformity that is absolutely necessary for the various automatic procedures of analysis to which the input data may be submitted. Therefore, it is imperative that every contributor to the GLD maintain the required standards, regardless of personal feelings and opinions about them. This does not imply that all GLD contributors must share the same theoretical views (e.g., on the role of lexicostatistics in historical-comparative linguistics in general and in genetic classification in particular).


Note on authorship: The GLD is a collective project that fully recognizes all degrees of individual contribution. All compilers of the wordlists and authors of the accompanying annotations are acknowledged as such for each wordlist. In turn, compilers and authors are themselves responsible for properly indicating all of their sources of information.




     © 2011-2014 George Starostin (site design, data input coordination)
    © 2011-2014 Phil Krylov (programming, technical support)