Genealogy Indexer beta Forums

Transliteration Latin -> Cyrillic

Post by logan » Thu Dec 06, 2012 8:26 pm

GenealogyIndexer.org now features automatic transliteration from Latin to Cyrillic, to make it easier for researchers not familiar with Cyrillic to search the growing number of Cyrillic sources. By default, the search option "Add Latin -> Cyrillic" is enabled, meaning that the search engine will look for both the original Latin script search term and possible Cyrillic transliterations. Other options are "Only Latin -> Cyrillic" and "No Transliteration."
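To illustrate how the three options differ, here is a minimal sketch in Python. It is not the site's actual code: the `Mode` enum, the `expand_query` function, and the `transliterate` callback are all hypothetical names chosen for this example.

```python
from enum import Enum


class Mode(Enum):
    # Labels copied from the search options described above.
    ADD = "Add Latin -> Cyrillic"
    ONLY = "Only Latin -> Cyrillic"
    NONE = "No Transliteration"


def expand_query(term, mode, transliterate):
    """Return the list of strings to search for under the given mode.

    `transliterate` is a hypothetical callback that maps a Latin-script
    term to a list of possible Cyrillic spellings.
    """
    cyrillic = list(transliterate(term)) if mode is not Mode.NONE else []
    if mode is Mode.ONLY:
        # Search only the Cyrillic candidates.
        return cyrillic
    # ADD: original term plus candidates; NONE: original term alone.
    return [term] + cyrillic
```

With the default "Add" mode, the original Latin term stays in the query, so Latin-script sources are still searched alongside the Cyrillic ones.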

Regardless of whether you use the automatic transliteration or enter a search term directly in Cyrillic characters, the search engine will attempt to find both masculine and feminine forms of your search term, essentially using the rules for Russian names. If you know that your search term has an unusual gendered form, you should manually enter the Cyrillic spelling of that form.
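As a rough illustration of matching both gendered forms, the sketch below pairs a few common Russian surname endings. The suffix list is a simplified assumption made for this example, not the rule set the search engine actually uses, and `gendered_forms` is a hypothetical name.

```python
# Hypothetical, simplified (masculine, feminine) suffix pairs for
# Russian surnames. A real rule set would be considerably larger.
SUFFIX_PAIRS = [
    ("ский", "ская"),
    ("цкий", "цкая"),
    ("ов", "ова"),
    ("ев", "ева"),
    ("ин", "ина"),
]


def gendered_forms(surname):
    """Return the surname together with its other gendered form, if any."""
    s = surname.lower()
    forms = {s}
    for masc, fem in SUFFIX_PAIRS:
        if s.endswith(fem):
            # Feminine input: also search the masculine form.
            forms.add(s[: -len(fem)] + masc)
        elif s.endswith(masc):
            # Masculine input: also search the feminine form.
            forms.add(s[: -len(masc)] + fem)
    return forms
```

A surname with no matching ending, such as "Гофман", comes back unchanged, which is why an unusual gendered form still has to be entered manually.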

The search engine will completely ignore the Cyrillic yer characters, so there is no need to include them in your search term. You also do not need to worry about differences in pre- and post-reform Russian orthography.
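One way to picture this is a normalization step applied to both the search term and the indexed text before comparing them. The sketch below is only an assumption about how such a step could look: it treats both the hard sign and the soft sign as yers, and it maps four pre-reform letters to their modern equivalents. The `normalize` function and both tables are hypothetical.

```python
# Pre-reform Russian letters mapped to their post-reform equivalents.
PRE_REFORM = {
    "ѣ": "е",  # yat
    "і": "и",  # decimal i
    "ѳ": "ф",  # fita
    "ѵ": "и",  # izhitsa
}

# Assumption: "yer characters" covers both the hard and the soft sign.
YERS = {"ъ", "ь"}


def normalize(text):
    """Lowercase, drop yers, and modernize pre-reform letters."""
    out = []
    for ch in text.lower():
        if ch in YERS:
            continue
        out.append(PRE_REFORM.get(ch, ch))
    return "".join(out)
```

Under these assumptions, "Гофманъ" and "Гофман" normalize to the same string, so either spelling in a source matches either spelling in the search box.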

The automatic transliteration has been optimized for search terms that are surnames of Russian, German, Polish, Romanian, Lithuanian, or Latvian origin, and will try to find Cyrillic transliterations with Russian spelling (in the future, Ukrainian, Belarusian, and Bulgarian spellings will also be checked).

The automatic transliteration finds valid transliterations using about 15 different transliteration systems, and considers the possibility of both common transliteration errors and forgotten diacritical marks in the search term. You do not have to enter letters with umlauts, ogoneks, etc., but you might see false positives if you omit diacritical marks that should be present. Note that it is not uncommon for there to be several different transliterations in even a single source, none of which is a false positive (e.g., "Hoffmann" appears as four different Cyrillic spellings).
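To show why a single Latin spelling can yield several Cyrillic candidates, here is a toy sketch. The rule table is invented for this example and covers only the letters in "Hoffmann"; it does not reproduce any of the transliteration systems the site uses, and `tokenize` and `candidates` are hypothetical names.

```python
from itertools import product

# Hypothetical, tiny rule table: Latin chunk -> possible Cyrillic renderings.
RULES = {
    "h": ["г", "х"],
    "o": ["о"],
    "ff": ["фф", "ф"],
    "f": ["ф"],
    "m": ["м"],
    "a": ["а"],
    "nn": ["нн", "н"],
    "n": ["н"],
}


def tokenize(term):
    """Split the term into chunks found in RULES, longest match first."""
    term = term.lower()
    chunks, i = [], 0
    while i < len(term):
        for size in (2, 1):
            piece = term[i:i + size]
            if piece in RULES:
                chunks.append(piece)
                i += len(piece)
                break
        else:
            raise ValueError(f"no rule for {term[i]!r}")
    return chunks


def candidates(term):
    """Return every combination of the per-chunk renderings."""
    options = [RULES[chunk] for chunk in tokenize(term)]
    return {"".join(parts) for parts in product(*options)}
```

Even with this toy table, `candidates("Hoffmann")` produces eight spellings, because "h", "ff", and "nn" each have two renderings. Every extra alternative multiplies the candidate set, which is where false positives can creep in.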

If you are searching for a place rather than a personal name, keep in mind that the name used in Russian, Belarusian, or another Cyrillic-script language might not simply be a transliteration of the Polish, Lithuanian, or other Latin-script name.

The automatic transliteration and manual Cyrillic searches do not yet work with the OCR-Adjusted or D-M Soundex search options. Transliteration also does not work for search terms consisting of multiple words (Boolean AND).

The transliteration system is new and will continue to be improved. If you notice any strange behavior, excessive false positives, transliterations not found, etc., please let me know.

I might consider offering a transliteration API for other genealogy websites. If you might be interested, contact me.

Logan