How collation works in Derby

Derby supports a wide range of character sets and encodes all of them by using the Unicode support that the java.lang.Character class provides in the Java Virtual Machine (JVM) in which the Derby database runs.

See the Java API documentation for the java.lang.Character class for the exact version of the Unicode Standard that is supported.

A collation is a set of rules for comparing characters in a character set. In Derby, collation rules affect comparisons of values of the CHAR and VARCHAR data types. Collation rules also affect how the LIKE Boolean operator processes values of the CHAR, VARCHAR, CLOB, and LONG VARCHAR data types.
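To make the idea of "a set of rules for comparing characters" concrete, the following minimal Java sketch sorts the same three strings under two different rule sets. It does not use Derby at all. The class name CollationRulesSketch and the helper sortedWith are hypothetical, java.text.Collator stands in for a locale-aware rule set, and the natural ordering of String stands in for a binary ordering. These are assumptions made for illustration only.

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Locale;

// Hypothetical illustration class; not part of Derby.
class CollationRulesSketch {

    // Returns a sorted copy of the values, ordered by the given comparison rule.
    static String[] sortedWith(Comparator<? super String> rule, String... values) {
        String[] copy = values.clone();
        Arrays.sort(copy, rule);
        return copy;
    }

    public static void main(String[] args) {
        String[] words = {"cherry", "Banana", "apple"};

        // Rule set 1: String's natural order, which compares numeric character values.
        Comparator<String> binaryLike = Comparator.naturalOrder();

        // Rule set 2: locale-aware rules for English from the Java class library.
        Collator english = Collator.getInstance(Locale.ENGLISH);

        // Uppercase 'B' (U+0042) has a smaller value than lowercase 'a' (U+0061).
        System.out.println(Arrays.toString(sortedWith(binaryLike, words)));
        // prints [Banana, apple, cherry]

        // English rules order by letter first, so "apple" precedes "Banana".
        System.out.println(Arrays.toString(sortedWith(english, words)));
        // prints [apple, Banana, cherry]
    }
}
```

The same data yields a different order under each rule set, which is why the collation chosen for a database matters for comparisons and sorting.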

The default Derby collation rule is based on the binary Unicode values of the characters. One character is greater than (>), equal to (=), or less than (<) another character according to a numeric comparison of their Unicode values. This rule allows for very efficient comparisons of character strings.
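The numeric comparison described above can be sketched in plain Java. This is an illustration of ordering by Unicode value, not Derby's internal code. The class BinaryOrderSketch and the method compareByCodePoint are hypothetical names, and the sketch does not model SQL-level details such as how trailing blanks are handled.

```java
// Hypothetical illustration class; not part of Derby.
class BinaryOrderSketch {

    // Compares two strings by the numeric values of their Unicode code points.
    // Returns a negative number, zero, or a positive number.
    static int compareByCodePoint(String a, String b) {
        int i = 0;
        int j = 0;
        while (i < a.length() && j < b.length()) {
            int ca = a.codePointAt(i);
            int cb = b.codePointAt(j);
            if (ca != cb) {
                // The first differing character decides the result.
                return Integer.compare(ca, cb);
            }
            i += Character.charCount(ca);
            j += Character.charCount(cb);
        }
        // One string is a prefix of the other (or both ended): the shorter sorts first.
        return Integer.compare(a.length() - i, b.length() - j);
    }

    public static void main(String[] args) {
        // 'B' is U+0042 and 'a' is U+0061, so "B" is less than "a".
        System.out.println(compareByCodePoint("B", "a") < 0);          // prints true
        // 'a' (U+0061) is greater than 'A' (U+0041).
        System.out.println(compareByCodePoint("apple", "Apple") > 0);  // prints true
        // 'é' (U+00E9) is greater than 'z' (U+007A).
        System.out.println(compareByCodePoint("\u00E9", "z") > 0);     // prints true
        System.out.println(compareByCodePoint("same", "same") == 0);   // prints true
    }
}
```

Because each step is a single integer comparison with no lookup of language-specific rules, this kind of ordering is cheap to compute. The trade-off is that the resulting order (all uppercase ASCII letters before all lowercase ones, accented letters after "z") may not match what users of a particular language expect.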

Note: In a LIKE comparison, Derby compares non-metacharacters one character at a time. This differs from the way Derby processes = comparisons, which compare the entire character string on the left side of the = operator with the entire character string on the right side. For details, see Differences between LIKE and equal (=) comparisons.
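The character-at-a-time behavior in the note can be pictured with a small pattern matcher. The following Java sketch is only an illustration under stated assumptions: it is not Derby's implementation, the names LikeSketch and like are hypothetical, it compares Java char values directly instead of applying a collation, and it omits the ESCAPE clause. It treats % as matching any run of characters (including none) and _ as matching exactly one character.

```java
// Hypothetical illustration class; not part of Derby.
class LikeSketch {

    // Returns true if the text matches the pattern.
    // '%' matches any run of characters, '_' matches exactly one character.
    static boolean like(String text, String pattern) {
        return match(text, 0, pattern, 0);
    }

    private static boolean match(String t, int ti, String p, int pi) {
        if (pi == p.length()) {
            // Pattern consumed: it is a match only if the text is consumed too.
            return ti == t.length();
        }
        char pc = p.charAt(pi);
        if (pc == '%') {
            // Try every possible length for the run that '%' absorbs.
            for (int k = ti; k <= t.length(); k++) {
                if (match(t, k, p, pi + 1)) {
                    return true;
                }
            }
            return false;
        }
        if (ti == t.length()) {
            return false; // Pattern still needs a character, but the text has ended.
        }
        // A non-metacharacter is compared against exactly one character of the text.
        if (pc == '_' || pc == t.charAt(ti)) {
            return match(t, ti + 1, p, pi + 1);
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(like("Derby", "D%"));    // prints true
        System.out.println(like("Derby", "D_rby")); // prints true
        System.out.println(like("Derby", "derby")); // prints false
    }
}
```

Each non-metacharacter in the pattern is checked against a single position in the text, whereas an = comparison evaluates the two strings as wholes.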
Related concepts
Locale-based collation
Database connection URL attributes that control collation
Examples of case-sensitive and case-insensitive string sorting
Differences between LIKE and equal (=) comparisons
Related tasks
Creating a database with locale-based collation
Creating a case-insensitive database
Creating a customized collator