NormalizeMode

class NormalizeMode

Defines how a Unicode string is transformed into a canonical form, standardizing issues such as whether an accented character is represented as a base character plus a combining accent or as a single precomposed character. Unicode strings should generally be normalized before they are compared.
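Why normalization matters for comparison can be sketched as follows. This is an illustration only: it uses Python's standard unicodedata module rather than this class, on the assumption that the NFD form named below corresponds to the standard Unicode form of the same name. The helper equal_normalized is hypothetical.

```python
import unicodedata

# Two encodings of the same visible character "é".
precomposed = "\u00e9"   # LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # "e" followed by COMBINING ACUTE ACCENT

def equal_normalized(a, b, form="NFD"):
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# A raw comparison sees different code point sequences.
raw_equal = precomposed == decomposed

# After normalization both strings have the same representation.
normalized_equal = equal_normalized(precomposed, decomposed)
```

Both operands are normalized with the same form; mixing forms on the two sides would defeat the comparison.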

Fields

ALL

In addition to the DEFAULT normalization, also standardize the “compatibility” characters in Unicode, for example mapping SUPERSCRIPT THREE to its standard form (in this case DIGIT THREE). Formatting information may be lost, but for most text operations such characters should be considered the same.
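The SUPERSCRIPT THREE case can be sketched with Python's standard unicodedata module. This is not this class's own API; it assumes that the Unicode forms NFD and NFKD behave like DEFAULT and ALL, as the aliases listed under Fields suggest.

```python
import unicodedata

superscript_three = "\u00b3"  # SUPERSCRIPT THREE

# Canonical normalization (the DEFAULT / NFD behaviour) leaves the
# compatibility character untouched.
canonical = unicodedata.normalize("NFD", superscript_three)

# Compatibility normalization (the ALL / NFKD behaviour) maps it to
# DIGIT THREE, dropping the superscript formatting.
compat = unicodedata.normalize("NFKD", superscript_three)
```

The lost formatting is the trade-off mentioned above: after compatibility normalization, text containing a superscript three is indistinguishable from text containing a plain digit three.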

ALL_COMPOSE

Like ALL, but produces composed forms rather than a maximally decomposed form.

DEFAULT

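Standardize differences that do not affect the text content, such as whether an accented character is represented as a base character plus a combining accent or as a single precomposed character. The result is a maximally decomposed form.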

DEFAULT_COMPOSE

Like DEFAULT, but produces composed forms rather than a maximally decomposed form.
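The difference between a decomposed and a composed result can be sketched with Python's standard unicodedata module. This is an illustration under the assumption that NFD and NFC match DEFAULT and DEFAULT_COMPOSE; it does not call this class.

```python
import unicodedata

text = "caf\u00e9"  # "café" with a precomposed final character

# Decomposed form: the accent becomes a separate combining code point.
decomposed = unicodedata.normalize("NFD", text)

# Composed form: base character and accent are merged again.
composed = unicodedata.normalize("NFC", decomposed)

# Number of code points in each representation.
decomposed_length = len(decomposed)
composed_length = len(composed)
```

Both results render identically; they differ only in how many code points are used, which matters for length calculations and byte-level storage.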

NFC

Another name for DEFAULT_COMPOSE.

NFD

Another name for DEFAULT.

NFKC

Another name for ALL_COMPOSE.

NFKD

Another name for ALL.
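The alias relationships among the eight fields can be summarized in a small lookup table. The sketch below is hypothetical: UNICODE_FORM and normalize_by_mode are illustrative names, and the normalization itself is delegated to Python's standard unicodedata module rather than to this class.

```python
import unicodedata

# Each field name mapped to the Unicode normalization form it denotes,
# following the alias descriptions above.
UNICODE_FORM = {
    "DEFAULT": "NFD",
    "NFD": "NFD",
    "DEFAULT_COMPOSE": "NFC",
    "NFC": "NFC",
    "ALL": "NFKD",
    "NFKD": "NFKD",
    "ALL_COMPOSE": "NFKC",
    "NFKC": "NFKC",
}

def normalize_by_mode(text, mode_name):
    """Hypothetical helper: normalize text using a field name from this page."""
    return unicodedata.normalize(UNICODE_FORM[mode_name], text)
```

For instance, normalize_by_mode(s, "ALL") and normalize_by_mode(s, "NFKD") give the same result for any string s, since both names select the same form.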