International / foreign language support


Zoom supports a wide variety of languages, including French, German, Italian, Hebrew, Russian (Cyrillic), Greek, and Arabic. However, the search page must be configured to match the encoding used on your website.

The Zoom Indexer application supports Unicode, and therefore multi-byte character strings. This allows you to index and configure your search engine for many languages using the UTF-8 encoding. However, due to the nature of the scripting platform and the server restrictions of your web host, searching in some Asian languages such as Japanese, Chinese, and Korean is limited. Searches will still work, but accuracy is reduced because there is no proper language definition for identifying words.
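As a rough illustration of why word identification is harder for these languages, the sketch below splits text on whitespace and basic punctuation. This is not Zoom's actual word-breaking logic; the function name `split_words` and the sample strings are hypothetical and chosen only for demonstration.

```python
import re

def split_words(text):
    """Naively split text into words on whitespace and basic punctuation.

    Hypothetical helper for illustration only; not Zoom's indexing algorithm.
    """
    return [word for word in re.split(r"[\s,.;:!?]+", text) if word]

english = "Zoom indexes every word"
japanese = "検索エンジンは単語を識別します"  # written without spaces between words

english_words = split_words(english)    # four separate words
japanese_words = split_words(japanese)  # the whole sentence comes back as one "word"
```

Because the Japanese sentence contains no spaces, a splitter of this kind treats it as a single token, so a search for one of the words inside it cannot be matched as a whole word.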

For the following language settings to work, make sure your search template HTML file declares the correct charset. For example:

<meta HTTP-EQUIV="content-type" CONTENT="text/html; charset=windows-1252">

 
Zoom will warn you if the search template in the output directory does not use the same character set as the one selected in the indexing configuration (specified under “Languages” in the Configuration window).
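If you want to check a template yourself before indexing, a comparison along the following lines could be used. This is a minimal sketch, not the check Zoom performs internally; the names `template_charset` and `charset_matches` are hypothetical, and the regular expression only handles simple meta tags like the one shown above.

```python
import re

# Matches charset=... inside a <meta> tag, in either the HTTP-EQUIV form
# or the shorter <meta charset="..."> form. Illustrative pattern only.
META_CHARSET = re.compile(r'<meta[^>]*charset\s*=\s*["\']?([\w-]+)', re.IGNORECASE)

def template_charset(html_text):
    """Return the charset declared in the template, lowercased, or None."""
    match = META_CHARSET.search(html_text)
    return match.group(1).lower() if match else None

def charset_matches(html_text, configured_charset):
    """Return True if the template declares the configured charset."""
    declared = template_charset(html_text)
    return declared is not None and declared == configured_charset.lower()

template = '<meta HTTP-EQUIV="content-type" CONTENT="text/html; charset=windows-1252">'
same = charset_matches(template, "windows-1252")
different = charset_matches(template, "utf-8")
```

Here `same` is true and `different` is false, which corresponds to the mismatch case that would trigger a warning.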


Note: When Unicode and UTF-8 are selected, the indexer also stores all entries in the search engine data files as UTF-8 encoded characters. This means that accented HTML entities (such as “&uuml;”) are converted to UTF-8 encoded characters. When UTF-8 is not selected, they are converted using the standard ANSI encoding instead.
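The difference between the two conversions can be seen in the bytes that result. The sketch below uses Python's standard `html.unescape` and assumes that "ANSI" here means the windows-1252 code page (`cp1252` in Python); the helper name `entity_bytes` is hypothetical, and this is not Zoom's own conversion code.

```python
import html

def entity_bytes(text, encoding):
    """Decode HTML entities in text, then encode the result in the given encoding.

    Hypothetical helper for illustration only.
    """
    return html.unescape(text).encode(encoding)

utf8_bytes = entity_bytes("&uuml;", "utf-8")   # two bytes: 0xC3 0xBC
ansi_bytes = entity_bytes("&uuml;", "cp1252")  # one byte: 0xFC (assuming ANSI = windows-1252)
```

The same character occupies two bytes in UTF-8 and one byte in windows-1252, which is why the search template's charset must agree with the encoding stored in the index files.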

See also:

European languages (French, German, Danish, Swedish, etc.)
Russian (Cyrillic)
Japanese