About code pages and Unicode support

The Unicode standard assigns a unique code point to every character in modern use worldwide. This allows plain text data to be exchanged between different platforms, systems, and programs without corruption. Unicode standardizes three encoding forms and seven encoding schemes:

Encoding forms

Mapping from a character set definition to the actual code units used to represent the data:

• UTF-8
• UTF-16
• UTF-32
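To illustrate what "code units" means in practice, the following Python sketch counts how many code units each encoding form needs for a single character. The helper name code_unit_counts is hypothetical and is not part of StreamServe. Python's built-in codecs are used here only as a convenient way to observe the code unit sizes (8, 16, and 32 bits).

```python
def code_unit_counts(text):
    """Return the number of code units each encoding form needs for text."""
    return {
        # UTF-8 uses 8-bit code units, so one byte is one code unit.
        "UTF-8": len(text.encode("utf-8")),
        # UTF-16 uses 16-bit code units (2 bytes each). The -le variant is
        # used only to avoid counting a byte order mark.
        "UTF-16": len(text.encode("utf-16-le")) // 2,
        # UTF-32 uses 32-bit code units (4 bytes each).
        "UTF-32": len(text.encode("utf-32-le")) // 4,
    }


for ch in ["A", "\u00e9", "\u20ac", "\U0001F600"]:
    print(f"U+{ord(ch):04X}", code_unit_counts(ch))
```

A character such as U+0041 needs one code unit in every form, whereas U+1F600 needs four code units in UTF-8, two in UTF-16 (a surrogate pair), and one in UTF-32.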

Encoding schemes

Encoding form plus byte serialization, and possible use of Byte Order Mark (BOM):

• UTF-8
• UTF-16BE
• UTF-16LE
• UTF-16
• UTF-32
• UTF-32BE
• UTF-32LE
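The difference between the seven encoding schemes becomes visible when the same character is serialized to bytes. The sketch below uses Python's standard codec names, which are assumed here to correspond to the Unicode scheme names listed above. The helper serialize is hypothetical. Note that Python's "utf-16" and "utf-32" codecs write a BOM in the byte order of the machine running the code, so the output of those two lines depends on the platform.

```python
import codecs

# Python codec names assumed to correspond to the seven Unicode encoding schemes.
SCHEMES = [
    "utf-8",
    "utf-16-be",
    "utf-16-le",
    "utf-16",
    "utf-32",
    "utf-32-be",
    "utf-32-le",
]


def serialize(text, scheme):
    """Serialize text to bytes using the given encoding scheme."""
    return text.encode(scheme)


for scheme in SCHEMES:
    # hex(" ") prints the bytes separated by spaces (Python 3.8 or later).
    print(f"{scheme:10}", serialize("A", scheme).hex(" "))

# The BE and LE schemes fix the byte order and carry no BOM.
print(serialize("A", "utf-16-be") == b"\x00A")
print(serialize("A", "utf-16-le") == b"A\x00")

# The unmarked UTF-16 scheme starts with a BOM that tells the reader the byte order.
print(serialize("A", "utf-16")[:2] in (codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE))
```

A program that reads UTF-16 or UTF-32 data can inspect the first bytes for a BOM to determine the byte order, whereas data labeled with a BE or LE scheme is read in the stated order.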

A code page is a coded character set in which each character is assigned a unique code within the code space of that code page. Code pages usually cover only a small subset of the characters available in Unicode.
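As a rough illustration of the limited coverage of a code page, the following Python sketch checks whether a piece of text can be represented in a given code page. The helper fits_code_page is hypothetical. The code pages cp1252 (Western European) and cp1251 (Cyrillic) are chosen only as examples and are not taken from the StreamServe documentation.

```python
def fits_code_page(text, code_page):
    """Return True if every character in text exists in the code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        # At least one character has no code in this code page.
        return False


euro = "\u20ac"      # EURO SIGN
zhe = "\u0416"       # CYRILLIC CAPITAL LETTER ZHE

print(fits_code_page(euro, "cp1252"))   # the euro sign is part of cp1252
print(fits_code_page(zhe, "cp1252"))    # Cyrillic letters are not
print(fits_code_page(zhe, "cp1251"))    # but they are part of cp1251

# The code in the code page differs from the Unicode code point.
print(euro.encode("cp1252").hex(), f"U+{ord(euro):04X}")
```

The same character can therefore have different codes in different code pages, or no code at all, whereas its Unicode code point stays the same.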

For more information about the Unicode standard, see http://www.unicode.org.

OpenText StreamServe 5.6

Updated: 2013-03-01