Creating a database schema

Introduction

A Torque database schema consists of one or more XML files describing the structure of the databases. The files which are considered by the generator can be configured by the sourceIncludes and sourceExcludes settings of the maven plugin or ant task. By default, all files are considered ending with the suffix schema.xml.

All schema files should comply either to database-4-0-strict.xsd or to database-4-0.xsd. The difference between the strict schema and the normal schema is that in the strict schema, additional naming restrictions are made which are aimed at better interoperability between databases.

It is recommended to ensure schema compliance by adding xsd information into the root database element, e.g.

<database 
    xmlns="http://db.apache.org/torque/4.0/templates/database"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://db.apache.org/torque/4.0/templates/database 
        http://db.apache.org/torque/4.0/templates/database-strict.xsd"
   ...
      

or

<database 
    xmlns="http://db.apache.org/torque/4.0/templates/database"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://db.apache.org/torque/4.0/templates/database 
        http://db.apache.org/torque/4.0/templates/database.xsd"
   ...
      

Organisation of the schema files

A single database can consist of one or more schema files. For small database definitions, it is best to use one schema file per database. For complicated database definitions, it can be better to use more than one schema files. To have foreign key references between tables in different files, other schema files can either be included using the include-schema element, or they can be referenced using the external-schema element. The difference between included and referenced schemata is that an included schema can have foreign key relations between all tables in all files, whereas with referenced schema files, relations from the referenced schema to the main schema are forbidden. With both included and referenced schema files, care should be taken that the imported/included files are not read on their own by the maven plugin or ant tasks.

For example, in the test project, the files included-schema.xml and ext-schema.xml are included respectively referenced by the test-schema.xml. In the configuration of the torque-maven-plugin in the pom, these files are excluded as source files in order to read them only via the include-schema/external-schema element and not as schema files of their own.

The database element

Each schema file contains one database element as root element. The database element must have a name attribute, which defines the default handle by which the database can be accessed. For example, in the test project, most databases have the name bookstore, so by default they are accessed by the bookstore configuration settings. A database can also define whether by default primitive or object types are used (via the defaultJavaType attribute), or which is the default id method to use to generate primary keys (via the defaultIdMethod attribute). For more attributes and possible values, see the xsd database-4-0.xsd.

The database element can have the following child elements

  • option (currently not used by the torque templates)
  • include-schema (see above)
  • external-schema (see above)
  • domain (definition of common data types, see below)
  • table (see below)
  • view (see below)

domains

A domain is used to create a common type definition. A domain must have a name and a data type and can have a size, scale and default value. Once the domain is defined, it can be used as any other data type.

tables

Tables are defined by the table element. All tables must have a name and should have at least some columns. It is recommended that each table has a primary key (see below for reasons). Torque supports composite primary keys; to create one simply set the primary-key attribute to true in more than one column of the same table.

Torque can auto-generate numeric primary keys using either the native database mechanism (idMethod="native", recommended) or a method involving an extra table (idMethod="idBroker").

Foreign key relations between tables can be defined using the foreign-key element. Foreign keys can consist of one or more columns, simply add a reference elements for each column in the foreign key.

Indexes on the table can be set using the index element. Indexes can also consist of one or more columns. "Normal" indexes are not unique, use the unique element for defining unique constraints.

Columns

There are some column names which you can not use in Torque although your database would support them. These are any column name that contains characters that are not in Java's variable identifier character set. The reason is that column names are used as variable names in the OM Peer classes and these columns will cause the Torque generated code to not compile.

Note however, that SQL92 standard and up uses the same identifier characters as Java for non-delimited columns. So this should only apply to columns defined by the SQL standard as delimited columns, i.e. columns referred to surrounded by double quotes. Delimited column are considered to be case sensitive and/or contain non-standard characters. However, most good cross DB server designs should avoid these special types of column since some servers don't support delimited columns.

In addition, there are two column names that are handled slighly differently in the OM Peer classes. This is so they will not produce constants twice in the generated code. The following column names (case is ignored) will have an "_" prefixed in front of them in the Peer classes:

  • TABLE_NAME => _TABLE_NAME
  • DATABASE_NAME => _DATABASE_NAME

Furthermore, it is recommended that you do not use words which have a defined meaning in SQL as column names. Even if you can trick your database into using them, it is not sure whether Torque can do the same. And besides, even if it works for one database, if you ever decide to use another database, you may run into trouble then.

Primary keys

For every table, you should create a primary key which has no meaning in real life. The reasons for this are:

You should use a primary key at all because it creates a well-defined link between the objects in the database and the objects in memory. Often, one has to decide whether a java object in memory describes "the same" object as a certain row in the database. For example if you read an object from a database, change a field value and write it again to the database, you would usually want to update the row you read from. This is only possible if Torque can find the row in the database from which the object originated. For this, the primary key is used in relational databases. If two java objects have the same primary key, they describe "the same thing" and refer to the same row in the database. If you do not have a primary key, there is no well-defined way to decide if two java objects describe "the same thing". You might run into not being able to update an object.

Now that we know why we want to have a primary key at all, why should it have no meaning in real life ? This can be explained best by an example. Consider a table which holds manufactured parts. Each part has an unique serial number. So it is tempting to use the serial number as a primary key. But now imagine that we have registered the wrong serial number for a certain part in the database. Remember that the primary key is used to decide "is it the same object?" So we cannot change the serial number of a specified object without making it another object.

In Torque, this problem manifests itself in that there is no easy way to change the primary key of an object; you must trick Torque into it by using Torque's internals. This should be avoided if possible. If you use an additional primary key which has no meaning in real life, you do not run into that trouble.

views

Views behave much as tables, except that they cannot be written to. and that they usually reference data from other tables.

A view definition must contain the column names and types.

To let Torque create the sql for the view, you can specify a sql suffix for the view, e.g. "from book join author on book.author_id=author.author_id", and the select snippet for each column using the select attribute, e.g. "author.author_id"". This will usually work only for simple views, but is easy to read in the schema. If you cannot create the sql like this, you can also fill the attribute createSql with the sql needed to create the sql.

examle schema file

For an example schema file conaining most of the discussed features, look at the bookstore-schema.xml from the test project.

More than one database

Torque can access multiple databases with different structure and even different database vendors from within one Torque instance. Basically this means that Torque's inbuilt connection handling can handle more than one database connection pool.

There are three different scenarios for multiple databases, with different levels of support from Torque:

  • Each table has a unique name; different tables belong to different databases. This situation is fully supported by the Torque runtime. If the generated peer classes (generated by the torque generator using the torque templates) are used, the different databases are handled automatically in the standard cases because each table has a default database to which it belongs.
  • Some or all table names are not unique (there are tables with the same name in different databases), and the tables with the same name have the same structure (same column names and data types etc). This situation is also fully supported by the Torque runtime. However, the generated peer classes are not generated for this situation; meaning that by default, a connection for querying and saving is always opened to the default database for the given table name. This means you will either need to pass the correct database handle name for querying and saving, or not use the connection handling provided by the peer classes.
  • Some or all table names are not unique (there are tables with the same name in different databases), and some of the tables with the same name have a different structure than other tables with the same name. This situation is not recommended, it is unclear to which extent it is supported.

Databases with the same structure

If your databases have the same structure, define the schema and generate the classes as if it were only one database. Accessing different databases using the generated classes is covered later in this documentation.

Databases with different structures

On schema creation time, make sure that the different databases are defined in different schema files. Also make sure that the different databases have different name attributes in the database element. Then, define different database handles using the database names from the schema files. Torque will then use the database handle with the key equal to the database element's name attribute for each table.

It is recommended to generate the classes for the different databases into different packages to keep them apart. To achieve this, the corresponding classes must be generated in different generator runs with different package settings.

Using database Schema names

It is possible to qualify table names with schema names (as in ${schema}.${table}) at generate time. To do this, use the fully qualified table name as name attribute in the <table> element of your schema.xml. For example, to use the schema "bookstore" for the table "book", use the following table definition:

...
<table name="bookstore.book" description="Book table">
...

    

If the standard options are used, the resulting java class name will ignore the leading database name (e.g. the class will be named Book for the example above). If you want to retain the schema names in the java names, (i.e. the resulting java class should be named "BookstoreBook"), you can either use the javaName attribute of the table definition:

...
<table name="bookstore.book" javaName="BookstoreBook" description="Book table">
...

    

or you can set the generation option torque.om.retainSchemaNamesInJavaName to true.

If you use a sequence to autogenerate ids, the sequence will be generated in the same schema as the table.

In some databases (hsqldb, postgresql, mssql), a schema exists independly of other database objects (user or database). In these databases, a "create schema" (but no "drop schema") command(s) are created automatically in the ddl-sql script if qualified names are used as table names.