Using the etymological database

Choosing the database

The first interface screen lets you choose among the available databases and tune the interface parameters.

The main choice field lets you specify the name of a particular database.

If you have the right to modify databases online, use the "Login" button; this requires a registered user name and password. Without these credentials you can still view all the databases in read-only mode.

Several checkboxes below let you switch viewing parameters.

The "Recoding" checkbox controls how special characters are displayed. Lexical data is stored in the database in phonetic transcription, using a special encoding system and special fonts (Windows fonts, and Macintosh fonts with the Western MacRoman coding table, are available for download). If you do not use these fonts or standard Unicode fonts (on Unicode see below), the program can render special symbols with standard ASCII characters in a special encoding system. In this case you must check the "Recoding" checkbox.

The "Graphics as gifs" checkbox controls how double-byte (Chinese, Greek, Arabic etc.) characters are displayed. By default these characters are displayed as pictures in GIF format. However, if the required fonts are installed on your computer, you may turn off the generation of GIFs. Remember: for Chinese you will need fonts in the BIG5 coding system; for Greek or Arabic you should install the fonts contained in TIMESTR.ZIP.

The "Use tables" checkbox allows you to turn off tables in the program output: it may be useful if you prefer a text browser (like Lynx).

The "Record numbers" checkbox lets you display the actual record numbers in the database or hide them (hidden is the default).

The "Multiple windows" checkbox lets you view links in separate windows or in the main browser window, whichever is more convenient for you.

After setting the above parameters press the "Submit query" button.

The small radio buttons below let you choose one of the three generally accepted encoding systems for Cyrillic (KOI, ALT and WIN) and switch between the English and Russian interface languages.
A recent addition (March 2001) is the option of using Unicode encoding. Choose UTF-RUS or UTF-ENG if your browser supports Unicode and a Unicode font is installed on your computer.

Note: So far only two browsers, Netscape 4 (and higher) and Internet Explorer 4 (and higher), support Unicode. Unicode fonts also differ in coverage. Times New Roman Star is a Unicode font and displays most symbols properly, but has Cyrillic letters in place of some of the Unicode IPA symbols. Chinese fonts like UWCXMF or Mingliu are fine for Chinese Unicode characters, but lack most of the IPA symbols. Code2000, Lucida Sans Unicode and Titus Cyberbit, on the other hand, do not include most of the CJK (Chinese, Japanese and Korean) symbols. So far the most comprehensive font available is Arial Unicode MS.
 

Press the "Change" button to enable the chosen encoding.
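
As a rough illustration of why the encoding choice matters, the following Python sketch encodes one Cyrillic word under each option. It assumes that KOI, ALT and WIN correspond to KOI8-R, code page 866 and Windows-1251, and that the UTF options use UTF-8; this page does not name the exact code pages, and the ENCODINGS table and function names are invented for the example.

```python
# Assumed mapping from the interface labels to Python codec names.
# The labels come from the interface; the codec choices are assumptions.
ENCODINGS = {
    "KOI": "koi8_r",   # assumed: KOI8-R
    "ALT": "cp866",    # assumed: the "alternative" DOS code page 866
    "WIN": "cp1251",   # assumed: Windows-1251
    "UTF": "utf-8",    # assumed: UTF-8 for UTF-RUS / UTF-ENG
}

SAMPLE = "слово"  # an arbitrary Cyrillic word used only for illustration


def encode_all(text):
    """Return the bytes of `text` under every encoding in ENCODINGS."""
    return {label: text.encode(codec) for label, codec in ENCODINGS.items()}


def decode_with(raw, label):
    """Decode `raw` as if it were in the encoding chosen by `label`."""
    return raw.decode(ENCODINGS[label], errors="replace")


if __name__ == "__main__":
    encoded = encode_all(SAMPLE)
    for label, raw in encoded.items():
        print(label, raw.hex(" "))
    # Reading KOI bytes with the WIN setting yields different letters,
    # which is what a wrong radio-button choice looks like on screen.
    print(decode_with(encoded["KOI"], "WIN"))
```

If the text on the page looks like a jumble of the wrong Cyrillic letters, the selected encoding most likely does not match the one your browser and fonts expect.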


Querying a database

The main part of this screen is a table. By filling in its slots you specify which database fields to include in your query and what they should contain. Every slot of the table corresponds to a field in the database.

Setting the "Include into report" flag next to a database field name includes that field in the query result. If no field is flagged, all the database fields are included automatically. The "Value" slots let you specify your requirements for the field value. The requirements are entered as strings, and the way the query engine interprets those strings is controlled by the "Query method" slots. The following query methods are supported.

Match beginning

The string specified in the "Value" slot must occur at the beginning of the field.


Match substring

The string specified in the "Value" slot must occur somewhere within the field, not necessarily at the beginning.


Like beginning

The string specified in the "Value" slot must be phonetically similar to the beginning of the field.


Like substring

The string specified in the "Value" slot must be phonetically similar to something found within the field, not necessarily at the beginning.


Match meaning

The string specified in the "Value" slot must be semantically similar to the contents of the field. Semantic match is understood in a broad sense, as similarity of some semantic components. Note that (so far) only English input is evaluated.


You can also search for a string in any of the database fields; use the "In any field" option to do that.

The "Sort by" list lets you choose the order of records in your query result.
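
The two exact-match methods, together with the "Include into report" and "Sort by" settings, can be sketched in Python as follows. This is only an illustration of the behaviour described above, not the actual query engine: the record layout, the field names PROTO and MEANING, the sample values and the function names are all invented, and case-sensitive comparison is an assumption. The phonetic ("Like") and semantic ("Match meaning") methods are not modelled, since this page does not describe how similarity is computed.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, str]

# Invented sample records; the field names and values are placeholders.
SAMPLE_RECORDS: List[Record] = [
    {"PROTO": "tala", "MEANING": "flat stone"},
    {"PROTO": "kata", "MEANING": "stone"},
    {"PROTO": "taka", "MEANING": "river"},
]

# Exact-match query methods, keyed by the labels used in the interface.
METHODS: Dict[str, Callable[[str, str], bool]] = {
    "Match beginning": lambda field, value: field.startswith(value),
    "Match substring": lambda field, value: value in field,
}


def query(records: List[Record], field: str, value: str, method: str,
          report: Optional[List[str]] = None,
          sort_by: Optional[str] = None) -> List[Record]:
    """Filter records by one field, then sort and trim the output."""
    test = METHODS[method]
    hits = [r for r in records if test(r.get(field, ""), value)]
    if sort_by is not None:
        hits.sort(key=lambda r: r.get(sort_by, ""))
    if not report:
        # No field flagged: every field is included in the result.
        return [dict(r) for r in hits]
    return [{name: r[name] for name in report if name in r} for r in hits]


if __name__ == "__main__":
    print(query(SAMPLE_RECORDS, "PROTO", "ta", "Match beginning"))
    print(query(SAMPLE_RECORDS, "MEANING", "stone", "Match substring",
                report=["PROTO"], sort_by="PROTO"))
```

In this sketch "Match beginning" is simply the stricter of the two tests: every record it returns is also returned by "Match substring" for the same value.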


Methods of entering queries

Special notes for using Chinese characters in queries

B. With Unicode

If your operating system and browser are Unicode-compatible and you have an IME (input method) system installed, you may enter Chinese characters and other Unicode symbols in your query (but wildcard search is so far disabled).


The databases presented here have been developed over several years by the Department of Comparative Linguistics and Ancient Languages of the Russian State University of the Humanities. The Chinese, Sino-Tibetan, North Caucasian and Yenisseian databases were compiled by S. A. Starostin; the Altaic database by S. A. Starostin, O. A. Mudrak and A. V. Dybo; the Dravidian database by G. S. Starostin; the Chukchee-Kamchatkan database by O. A. Mudrak.

The computer database and interface systems were developed by S. A. Starostin and Yu. Bronnikov on the basis of S. A. Starostin's STARLING system, using the Clipper, C/C++ and TCL computer languages.