The Tower of Babel

Evolution of Human Language Project


Dear friends and colleagues,
to say that there can be no adequate replacement for S. Starostin in the world of comparative linguistics would be much more than an understatement. That said, he was never alone in his endless research, and yes, the work goes on. "The Tower of Babel" is still Sergei Starostin's homepage, and will always be as long as there are people still wishing to follow in his footsteps.
Anyway, cutting short the epic part: as some of you may have noticed, the site has undergone some MAJOR reworking over the past month. Crucial improvements include:
A) All of the previously available databases have been replaced by newer versions. This is particularly important for such families as Khoisan and Bahnaric, work on which is still in its initial phases, but improvements have been made in practically every database.
B) TONS of new databases added. Some of these were previously available on the Santa Fe version of the site, but some are presented here for the first time. Here is the list:
1) A database for "global" or "long-range" etymologies, mostly serving as a unifying point of entry for all of our Eurasiatic data, compiled by S. Starostin from a series of sources and featuring many of his own etymologies (for now, of course, highly provisional);
2) A database for Nostratic etymologies, gathering in one place most of the pioneering data by V. M. Illich-Svitych and A. Dolgopolsky, with extensive contributions from many ToB participants.
3) The Indo-European database, created by S. L. Nikolayev on the basis of Pokorny's dictionary and other sources. Subordinate databases include collections of Baltic and Germanic etymologies, also compiled by S. L. Nikolayev, as well as scanned and OCR-ed versions of Vasmer's etymological dictionary of the Russian language and Pokorny's Indo-European dictionary.
4) The Uralic database, originally created by Ye. Khelimsky and substantially enlarged and detailed by S. A. Starostin on the basis of Redei's dictionary and other sources. The database is preliminary and does not yet include any subordinate files, although several intermediate databases are currently under construction. It is nevertheless nice to have at least something on Uralic etymology online.
5) The Kartvelian database, created by S. A. Starostin on the basis of Klimov's and Fähnrich's etymological dictionaries.
6) The Afroasiatic (Afrasian, Semito-Hamitic) database, created by A. Militarev and O. Stolbova on the basis of a variety of sources. It is under constant construction, but large parts of it are in readable shape and may be useful to the general public. The subordinate Semitic database - especially those parts of it which have been published as the first two volumes of A. Militarev and L. Kogan's Semitic Etymological Dictionary - remains the most elaborate part. Many other subordinate databases are exceedingly small - around 100-200 etymologies - but nevertheless publishable. The only section that has not yet been made public is the Chadic one, due to certain technical difficulties; we hope to have this question settled very soon.
7) The general Sino-Caucasian database, serving as the higher hierarchy level for the already available North Caucasian, Sino-Tibetan, and Yenisseian databases. It was created in its entirety by S. A. Starostin and features the latest version of his Sino-Caucasian reconstruction (a detailed description of phonological correspondences between all three branches was also completed shortly before his demise and is now awaiting publication). The Sino-Caucasian database now also links to the subordinate Burushaski etymology; the Basque and Na-Dene data, unfortunately, have not yet been entered.
8) The Austric database includes a list of parallels between Austronesian and Austro-Asiatic languages, entered by I. Peiros and S. A. Starostin. Unfortunately, the subordinate Austronesian database (a reworked version of O. Dempwolff's dictionary) is not available yet due to technical reasons. A much larger version of the Austronesian database is also being prepared by I. Peiros and other participants of the project, but is currently in its initial stages. However, it is now possible to include a huge amount of Austro-Asiatic data, collected by I. Peiros and others as part of the IDS project. Formerly only the Bahnaric databases were available; today, subordinate databases include everything from Aslian to Palaung-Wa to Viet-Muong, etc. The quality of these databases varies widely - some do not include more than a hundred etymologies - but it is still better than nothing. Also available is a set of databases for Thai-Kadai languages, although it has not yet been linked to the common Austric base. More to come soon.
9) New databases for Khoisan languages, created by G. Starostin. These include a "macro-Khoisan" database - VERY preliminary stage, containing tentative matches between various subbranches of Khoisan including Hadza and Sandawe; a database for "Peripheral Khoisan", bringing together data from North Khoisan, South Khoisan and Eastern #Hoan; and several lower-level subordinate databases. All the reconstructions are liable to change practically every day, so reader beware.
C) The "Articles and Books" section has been hugely expanded as well. In addition to various works by S. Starostin that have been available for a long time, it now features contributions from most of the participants of ToB (including older as well as more current works) on language families ranging from Indo-European to Afroasiatic to Khoisan to Elamite. Most of the texts are in .PDF format to avoid any unnecessary trouble with font incompatibility. NOTE that the corpus is open to any submissions. If you have a well-written paper on comparative linguistics that you would like to see published, feel free to mail it to us so that we can add it to the archive.
D) Last but not least - we finally have a FORUM! Yes, you can now use it for technical questions, suggestions, announcements, or discussion on various linguistic topics. The forum has both an English language and a Russian language section and is moderated by G. Starostin.
More additions are to come soon. We are planning to have monthly updates in the database section, documented on the news page. Concerning new databases, Chadic and Eskimo etymologies are on their way. Presumably we will also be including 100-wordlists for selected databases, although work on this will be gradual. More updates in the "Articles and Books" section are coming up. Finally - and very importantly - we are now designing a memorial section for S. A. Starostin, which will include photos, obituaries, and maybe even audio files.
For now, that is all.

Signed: George Starostin

    "The Tower of Babel": The main goal of the project is to join efforts in the research of long-range connections between established linguistic families of the world. The Internet is a brilliant way to combine our efforts and to build up a commonly accessible database of roots, or etyma, reconstructed for the world's major (and minor) linguistic stocks.

On the following pages you will be able to browse computer databases for the dictionaries of Ozhegov, Zalizniak and Mueller, as well as analyze any Russian word and obtain its complete accented paradigm.

Languages of the World:
All Bases
North Caucasian
Himalayan (Limbu, Dumi, Kulung)
Chinese Dialects
Russian Language
    STARLING is a software package designed by Sergei Starostin for various types of linguistic text and database processing, including handling of linguistic fonts in the DOS and Windows operating systems, operations with linguistic databases and Internet presentation of linguistic data.
Articles and Books
    Altaic, North Caucasian, Yenissei, Dravidian...
    Etymological resources, Russian language...
Technical Advice
    Fonts...