Internationalization

Restrictions

CONTENTdm version 3.5 or later fully supports the Latin 1 character set (which covers Western European languages, among others) in all software components, including the Acquisition Station, Server, and Search Client. Characters outside the Latin 1 character set can be stored and viewed in metadata, subject to the conditions below.

  • Only Latin 1 characters may be used in the following fields and components:

    • Collection names
    • Metadata field names
    • Collection Editor
    • Compound Object Creator

  • Use of controlled vocabularies is restricted to the Latin 1 character set.

  • The search engine provides search capability for the Latin 1 character set only. For search purposes, accented vowels are mapped internally to their non-accented equivalents.

  • The Acquisition Station works with one codepage at a time, so characters from different codepages (Latin 1 and Cyrillic, for example) cannot be used in the same record within the Acquisition Station. However, the Simple Update/delete interface within Collection Administration can accommodate characters from different codepages in the same record. Users who have the Collection Editor interface can access the Simple Update/delete interface by following these steps:

    1. Enter your Server URL in your Web browser, followed by /cgi-bin/admin/start.exe. Go to the Collection Administration page for the desired collection. The URL of this page has the form: /cgi-bin/admin/home.exe?CISODB=/uw where "/uw" is replaced by the alias of your collection.

    2. In the Collection Administration URL, replace "home.exe" with "upsel.exe", leaving the rest of the URL the same: /cgi-bin/admin/upsel.exe?CISODB=/uw

    3. This is the URL for the Simple Update/delete interface. Bookmark this link. For additional information about Update/delete interfaces within CONTENTdm, see Updating or Deleting Items.
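The URL substitution in step 2 can be sketched in Python as follows. This is only an illustration: the function name simple_update_url and the host server.example.org are hypothetical, and the sketch assumes the Collection Administration URL has the form shown in step 1.

```python
from urllib.parse import urlsplit, urlunsplit


def simple_update_url(collection_admin_url: str) -> str:
    """Swap home.exe for upsel.exe, leaving the rest of the URL unchanged."""
    parts = urlsplit(collection_admin_url)
    if not parts.path.endswith("/home.exe"):
        raise ValueError("expected a Collection Administration URL ending in home.exe")
    new_path = parts.path[: -len("home.exe")] + "upsel.exe"
    return urlunsplit(parts._replace(path=new_path))


# Hypothetical server name; "/uw" is the example collection alias from step 1.
print(simple_update_url("http://server.example.org/cgi-bin/admin/home.exe?CISODB=/uw"))
```

The query string (the CISODB collection alias) is passed through untouched, so the same helper should work for any collection.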

Acquisition Station Codepage Support

The Acquisition Station supports the Windows-1252 codepage (Latin1) and Windows-1251 codepage (Cyrillic) by default. This is accomplished through the use of mapping files located on the Server in the conf/unicode directory. Additional codepages can be supported on the Acquisition Station if you define a mapping file and store it on the Server.

Mapping files are used to map the 8-bit extended ASCII characters stored in the Acquisition Station to the Unicode values stored on the Server. Only single byte character set (SBCS) codepages are supported.
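To make the byte-to-Unicode idea concrete, the short Python sketch below looks up the Unicode value of a single byte under a given codepage. It relies on Python's built-in cp1252 and cp1251 codecs rather than on the Server's mapping files, and the helper name unicode_value is hypothetical.

```python
def unicode_value(byte: int, codepage: str) -> str:
    """Return the 4-digit hexadecimal Unicode value for one SBCS byte."""
    return f"{ord(bytes([byte]).decode(codepage)):04X}"


print(unicode_value(0x80, "cp1252"))  # euro sign in Windows-1252
print(unicode_value(0xC0, "cp1251"))  # Cyrillic capital A in Windows-1251
```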

To Define Mapping Files

A mapping file should consist of 256 lines, one per SBCS character, each mapping the character to its corresponding Unicode value. Some example records from a mapping file:

70 0070 70
71 0071 71
72 0072 72
73 0073 73
74 0074 74
75 0075 75
76 0076 76
77 0077 77
78 0078 78
79 0079 79
7A 007A 7A
7B 007B 20
7C 007C 20
7D 007D 20
7E 007E 20
7F 007F 20
80 20AC 20
81 0000 20
82 201A 20
83 0192 66
84 201E 20
85 2026 20
86 2020 20
87 2021 20
88 02C6 20

Each line consists of:

  • A 2-character hexadecimal representation of the byte being defined (00-FF).
  • A 4-character hexadecimal representation of the corresponding Unicode character.
  • A 2-character hexadecimal representation of the Latin1 SBCS character to which the character should be mapped for searching purposes. Only the characters "a" to "z" ("61" to "7A") and "0" to "9" ("30" to "39") are available for searching. Specify "20" (space) for a search delimiter.
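The three fields described above can be read with a few lines of Python. This is a minimal sketch, not part of CONTENTdm: the names MappingEntry and parse_mapping_line are hypothetical, and it assumes each field is hexadecimal, as the sample records suggest.

```python
from typing import NamedTuple


class MappingEntry(NamedTuple):
    byte: int         # SBCS byte value, 0x00-0xFF
    codepoint: int    # Unicode value stored on the Server
    search_byte: int  # Latin1 byte used for searching (0x20 = delimiter)


def parse_mapping_line(line: str) -> MappingEntry:
    """Parse one record such as '83 0192 66' into its three values."""
    fields = line.strip().split(" ")
    if [len(f) for f in fields] != [2, 4, 2]:
        raise ValueError(f"malformed mapping line: {line!r}")
    byte, codepoint, search_byte = (int(f, 16) for f in fields)
    return MappingEntry(byte, codepoint, search_byte)


entry = parse_mapping_line("83 0192 66")
print(entry, chr(entry.search_byte))
```

For the sample record "83 0192 66", the search character is "f", which matches the note that only "a" to "z" and "0" to "9" are searchable.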

A single space separates adjacent values. Mapping files should be saved as .txt files, named for the codepage, and stored on the Server in the conf/unicode directory. See the files conf/unicode/windows-1251.txt and conf/unicode/windows-1252.txt for examples.
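A first draft of a mapping file for another SBCS codepage could be generated with a script along these lines. This is a sketch under stated assumptions, not the method used to produce the shipped files: it uses Python's built-in codecs for the byte-to-Unicode column, writes 0000 for bytes the codec leaves undefined (as in the sample record for byte 81), and fills the search column with an assumed rule (strip accents, lowercase, keep only a-z and 0-9, otherwise 20). That rule may not reproduce every shipped record (the sample maps byte 83 to 66, for instance), so review the output by hand. The names search_byte and mapping_lines are hypothetical.

```python
import unicodedata

SEARCHABLE = set("abcdefghijklmnopqrstuvwxyz0123456789")


def search_byte(char: str) -> int:
    """Assumed rule: drop accents, lowercase, keep only a-z and 0-9."""
    base = unicodedata.normalize("NFKD", char)[:1].lower()
    return ord(base) if base in SEARCHABLE else 0x20


def mapping_lines(codepage: str) -> list[str]:
    """Build 256 records of the form 'BB UUUU SS' for an SBCS codepage."""
    lines = []
    for byte in range(256):
        try:
            char = bytes([byte]).decode(codepage)
        except UnicodeDecodeError:
            lines.append(f"{byte:02X} 0000 20")  # byte undefined in this codepage
            continue
        lines.append(f"{byte:02X} {ord(char):04X} {search_byte(char):02X}")
    return lines


# Print a few records for Windows-1252 to compare against the samples.
print("\n".join(mapping_lines("cp1252")[0x70:0x74]))
```

To produce a file, the joined lines could be written to a .txt file named for the codepage and then copied to the conf/unicode directory on the Server.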

For more information on Windows codepages, see Code Pages Supported by Windows on the Microsoft Web site.



CONTENTdm® is a registered trademark of DiMeMa, Inc.
© 1997-2005 DiMeMa, Inc. All Rights Reserved.
