Creating a customized collator

You may need to define a collation order different from that of the strengths provided by the collation=collation attribute.

To define a new collation order, follow these steps.

  1. Create a class that extends the java.text.spi.CollatorProvider class and that returns a collator that orders strings the way you want it to.
  2. Create a text file named META-INF/services/java.text.spi.CollatorProvider that contains one line with the name of your collator provider class.
  3. Put the compiled class file and the text file in a jar file that you drop into your JRE's lib/ext directory or in one of the directories specified by the java.ext.dirs property.

For example, suppose you want to define a collation order to make Greek characters sort near their Latin equivalents ('α' near 'a', ''β' near 'b', and so on). You could define another locale with a CollatorProvider that returns a java.text.RuleBasedCollator with ever rules you want. See the API documentation for the RuleBasedCollator class for details about how you specify rules. In its simplest form, a set of rules might look like "a,A < b,B < c,C", which means more or less that 'a' and 'A' should be sorted before 'b' and 'B', which should be sorted before 'c' and 'C'. So to get the Greek characters sorted near similar Latin characters, define a CollatorProvider that looks like this one:

public class MyCollatorProvider extends CollatorProvider {
    public Locale[] getAvailableLocales() {
        return new Locale[] {
            new Locale("en", "US", "greek")
        };
    }
    
    public Collator getInstance(Locale locale) {
        StringBuilder rules = new StringBuilder();
        // alpha should go between a and b
        rules.append("< a,A < \u03b1,\u0391 < b,B");
        // beta should go between b and c
        rules.append("& b,B < \u03b2,\u0392 < c,C");
        // add more rules here ....

        try {
            return new RuleBasedCollator(rules.toString());
        } catch (ParseException pe) {
            throw new Error(pe);
        }
    }
}

Again, put the compiled class and the META-INF/services/java.text.spi.CollatorProvider file in a jar file, and start the ij tool with the -Djava.ext.dirs=. option in the directory where the jar file is located. Create a database that uses the new locale and insert some data with both Greek and Latin characters:

ij> connect 'jdbc:derby:GreekDB;territory=en_US_greek;collation=TERRITORY_BASED;create=true';
ij> create table t (x varchar(12));
0 rows inserted/updated/deleted
ij> insert into t values 'a', 'b', 'c', 'α', 'β';
5 rows inserted/updated/deleted
ij> select * from t order by x;
X           
------------
a
α
b
β
c 

5 rows selected

The ordering is just as you wanted it, with the Greek characters between the Latin ones, and not at the end where they would normally be located.

One word of caution: If, after you have created a database, you update your custom CollatorProvider so that the ordering is changed, you will need to recreate the database. You must do this because the indexes in the database are ordered, and you may see strange results if the indexes are ordered with a different collator from the one your database is currently using.

Related concepts
Character-based collation in Derby
How collation works in Derby
Locale-based collation
Database connection URL attributes that control collation
Examples of case-sensitive and case-insensitive string sorting
Differences between LIKE and equal (=) comparisons
Related tasks
Creating a case-insensitive database
Creating a database with locale-based collation