The ISO/IEC 10646 standard and the Unicode Consortium define the set of characters supported in Unicode. Several encoding schemes exist for the current set of supported characters and scripts, built on 8-bit, 16-bit, and 32-bit code units.
- UTF-8
: Unicode Transformation Format based on 8-bit code units; each code point takes 1 to 4 bytes.
- CESU-8
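: Compatibility Encoding Scheme for UTF-16: 8-Bit. Characters outside the Basic Multilingual Plane are encoded as two 3-byte sequences, one per UTF-16 surrogate.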
- UTF-16
: Unicode Transformation Format based on 16-bit code units; each code point takes one or two code units (2 or 4 bytes).
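- UTF-32
: Unicode Transformation Format based on 32-bit code units; every code point takes exactly 4 bytes.
- UCS-2
: Universal Character Set coded in 2 bytes; fixed width, so it covers only the Basic Multilingual Plane.
- UCS-4
: Universal Character Set coded in 4 bytes; fixed width, in practice equivalent to UTF-32.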
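The size differences between the encodings listed above can be seen in a minimal Python sketch. The helper names `encode_cesu8`, `encode_ucs2`, and `encoded_sizes` are hypothetical and exist only for this illustration. Python's standard codecs cover UTF-8, UTF-16, and UTF-32; CESU-8 and UCS-2 have no built-in codec, so they are approximated here under the assumption that CESU-8 is UTF-8 applied to each UTF-16 code unit and that UCS-2 is UTF-16 restricted to the Basic Multilingual Plane.

```python
def encode_cesu8(text: str) -> bytes:
    """Approximate CESU-8 (hypothetical helper, not a standard codec)."""
    utf16 = text.encode("utf-16-be")
    # Split the UTF-16 byte stream into 16-bit code units.
    units = [int.from_bytes(utf16[i:i + 2], "big") for i in range(0, len(utf16), 2)]
    # Encode every code unit, surrogates included, as its own UTF-8 sequence.
    return "".join(chr(u) for u in units).encode("utf-8", "surrogatepass")


def encode_ucs2(text: str) -> bytes:
    """Approximate big-endian UCS-2 (hypothetical helper): BMP characters only."""
    if any(ord(ch) > 0xFFFF for ch in text):
        raise ValueError("UCS-2 cannot represent characters outside the BMP")
    return text.encode("utf-16-be")


def encoded_sizes(text: str) -> dict:
    """Return the byte length of `text` in each encoding (no byte order mark)."""
    return {
        "UTF-8": len(text.encode("utf-8")),
        "CESU-8": len(encode_cesu8(text)),
        "UTF-16": len(text.encode("utf-16-be")),
        "UTF-32": len(text.encode("utf-32-be")),
    }


if __name__ == "__main__":
    # An ASCII letter, a BMP symbol, and a supplementary-plane character.
    for sample in ("A", "\u20ac", "\U0001F600"):
        print(f"U+{ord(sample):04X}", encoded_sizes(sample))
```

The big-endian codec variants are used so that no byte order mark is counted in the lengths. Only the supplementary-plane sample separates UTF-8 from CESU-8 and is rejected by `encode_ucs2`; for BMP characters the 8-bit encodings agree with each other, as do the 16-bit ones.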