Emerald Group Publishing Limited
Copyright © 2006, Emerald Group Publishing Limited
Technical reports: temporal databases in depth
Keywords: Cybernetics, Databases, Systems
This technical report is based on the “Temporal Database in Depth: Time and the Data Warehouse” Oracle Corporation and e-Science Institute seminar held at the e-Science Institute, Edinburgh, Scotland, UK in November 2005. The event consisted of an intense day-long seminar presented by Chris Date, one of the foremost relational database experts. This was followed by a short presentation by Rob Squires of Oracle Corporation.
1.1 Temporal database. We consider here the issues which arise when the time dimension is added to a relational database and review some of the developments and work in progress. Reference is made, in particular, to the main approaches being considered to tackle temporal data and to the standpoint of database vendors. The review is based on recent work in the field and highlights the contributions and discussions made at the recent e-Science Institute seminar.
The temporal dimension
As storage gets cheaper and the accumulation of data reaches ever greater proportions, the questions of how to store, manage and extract information from this data become ever more urgent. In particular, the accumulation of historical data means that the temporal nature of data is of special interest. For example, the information “the status of supplier X is currently 20” has no particularly interesting temporal properties. However, if this information is expanded to “the status of supplier X is currently 20 and has been 20 since 1st July and was 15 from 12th April to 30th June” then a new temporal dimension has been added. The addition of this new dimension creates a whole host of interesting and complex issues.
These issues were explored by Chris Date at the recent seminar (Date, 2005) based on work carried out in collaboration with Darwen and Lorentzos (Date et al., 2005).
There is some controversy concerning the best approach to tackling temporal data. One school of thought is to treat temporal data as special, e.g. to represent time as a hidden attribute as in TSQL2, and so to depart from the principles of relational databases. The other main school of thought, to which Date belongs, is to treat temporal data like any other data and to adhere strictly to the principles of relational databases.
Date (2005) gave various examples to illustrate the problems incurred by adding a temporal dimension to data. In practice, such examples require extremely precise and lengthy specification; in this report they are necessarily discussed in a more informal manner. Consider two relations, supplier and supplierparts. The first relation, supplier, contains details of the suppliers and the time intervals during which each supplier was under contract, e.g. “FROM 12th April TO 30th June, inclusive, supplier X was under contract”. The second relation, supplierparts, contains details of the parts and the time intervals during which a supplier could provide a part, e.g. “FROM 12th April TO 30th April, inclusive, supplier X was able to supply part P”. A number of constraints are required in order to keep these two relations consistent, for example:
For each FROM and TO pair, the TO value must not be earlier than the FROM value, e.g. “FROM 1st July 2006 TO 1st Jan 2006” makes no sense.
If two tuples abut or overlap in terms of time, then they should be combined into a single tuple. For example, in relation supplier the tuples representing “FROM 12th April TO 30th June, inclusive, supplier X was under contract” and “FROM 15th June TO 30th August, inclusive, supplier X was under contract” should be merged into the one tuple “FROM 12th April TO 30th August, inclusive, supplier X was under contract”.
If supplierparts shows a supplier as being able to supply some part during an interval of time, then supplier must show that same supplier as being under contract throughout that same interval of time.
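The three constraints above can be sketched informally in Python. This is only an illustration under stated assumptions, not Date's formal specification: the tuple layouts, the function names and the year 2005 attached to the example dates are all hypothetical, and the first check allows FROM and TO to coincide so that single-day intervals are permitted.

```python
from datetime import date, timedelta
from typing import NamedTuple

ONE_DAY = timedelta(days=1)


class Contract(NamedTuple):
    """One tuple of relation supplier (hypothetical layout)."""
    supplier: str
    start: date  # FROM
    end: date    # TO, inclusive


class Supply(NamedTuple):
    """One tuple of relation supplierparts (hypothetical layout)."""
    supplier: str
    part: str
    start: date  # FROM
    end: date    # TO, inclusive


def well_formed(rows):
    # Constraint 1. Assumption: FROM and TO may coincide (single-day interval).
    return all(r.start <= r.end for r in rows)


def is_packed(contracts):
    # Constraint 2: no two tuples for the same supplier abut or overlap.
    by_supplier = {}
    for c in contracts:
        by_supplier.setdefault(c.supplier, []).append(c)
    for rows in by_supplier.values():
        rows.sort(key=lambda r: r.start)
        for prev, nxt in zip(rows, rows[1:]):
            if nxt.start <= prev.end + ONE_DAY:
                return False
    return True


def covered_by_contract(supply, contracts):
    # Constraint 3. Because the contracts are assumed to satisfy constraint 2,
    # "under contract throughout" means one contract tuple spans the interval.
    return any(
        c.supplier == supply.supplier
        and c.start <= supply.start
        and supply.end <= c.end
        for c in contracts
    )


# Example tuples taken from the text; the year is an assumption.
supplier = [Contract("X", date(2005, 4, 12), date(2005, 6, 30))]
supplierparts = [Supply("X", "P", date(2005, 4, 12), date(2005, 4, 30))]
unmerged = supplier + [Contract("X", date(2005, 6, 15), date(2005, 8, 30))]

print(well_formed(supplier), is_packed(supplier), is_packed(unmerged))
print(all(covered_by_contract(s, supplier) for s in supplierparts))
```

The relation called unmerged holds the two overlapping contract tuples from the second constraint, so is_packed rejects it until the tuples are merged.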
A large number of such constraints are required and, as Date demonstrated, these constraints are complex to formulate precisely. Additionally, queries on a temporal database are complex to formulate, e.g. “list the supplier name-FROM-TO triples for suppliers who have been unable to supply any parts at all during at least one interval of time, where FROM and TO together designate a maximal interval during which the supplier was in fact unable to supply any part at all”.
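To give a feel for the query quoted above, the following Python sketch expands intervals into individual days, subtracts, and collapses the remainder into maximal intervals. It is an informal illustration rather than Date's formulation. It assumes integer day numbers, inclusive intervals, and that only days on which the supplier was under contract are considered; the function names and sample data are hypothetical.

```python
def unpack(intervals):
    """Expand inclusive (start, end) pairs into the set of individual days."""
    return {d for start, end in intervals for d in range(start, end + 1)}


def pack(days):
    """Collapse a set of days into maximal inclusive (start, end) intervals."""
    result = []
    for d in sorted(days):
        if result and d == result[-1][1] + 1:
            result[-1] = (result[-1][0], d)   # extend the current interval
        else:
            result.append((d, d))             # start a new interval
    return result


def idle_intervals(supplier, supplierparts):
    """Return (name, FROM, TO) triples, each a maximal interval during which
    the supplier was under contract but able to supply no part at all.

    supplier:      iterable of (name, from, to)
    supplierparts: iterable of (name, part, from, to)
    """
    supplier = list(supplier)
    supplierparts = list(supplierparts)
    triples = []
    for name in sorted({n for n, _, _ in supplier}):
        contract_days = unpack((f, t) for n, f, t in supplier if n == name)
        supply_days = unpack((f, t) for n, _, f, t in supplierparts if n == name)
        for f, t in pack(contract_days - supply_days):
            triples.append((name, f, t))
    return triples


# Hypothetical sample data using day numbers.
supplier = [("X", 1, 10), ("Y", 3, 6)]
supplierparts = [("X", "P1", 2, 4), ("X", "P2", 4, 7), ("Y", "P1", 3, 6)]

print(idle_intervals(supplier, supplierparts))
```

With this sample data supplier X could supply something on days 2 to 7, so the maximal idle intervals are day 1 alone and days 8 to 10; supplier Y has none.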
Thus, Date (2005) shows that temporal data leads to constraints and queries that are currently unreasonably complex to express. What Date proposes is to treat these FROM-TO pairs as intervals and to define an interval logic and interval operators to handle them. Of particular interest are Date’s PACK and UNPACK operators, which allow a relation containing temporal data to be compressed or expanded, based upon the value of an interval attribute. For example, using the PACK operator on a relation containing the tuples “FROM day 1 TO day 2, inclusive, supplier X was under contract” and “FROM day 3 TO day 5, inclusive, supplier X was under contract” could result in a relation containing the tuple “FROM day 1 TO day 5, inclusive, supplier X was under contract”. Conversely, using the UNPACK operator on a relation containing the tuple “FROM day 1 TO day 2, inclusive, supplier X was under contract” could result in a relation containing the tuples “FROM day 1 TO day 1, inclusive, supplier X was under contract” and “FROM day 2 TO day 2, inclusive, supplier X was under contract”. Date then went on to show how the traditional relational algebra operators can be redefined in terms of these PACK/UNPACK operators. Using these new expanded operators one can talk about the UNION, DIFFERENCE, JOIN, etc. of relations containing attributes that are intervals of time.
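The behaviour of PACK and UNPACK in the two examples above can be imitated in a few lines of Python. The sketch below is a simplified stand-in rather than Date's definition: it assumes a relation is a set of (supplier, FROM, TO) tuples with integer day numbers and a single interval attribute, and the helper interval_difference is a hypothetical name used to show how a traditional operator might be rebuilt from unpack and pack.

```python
from collections import defaultdict


def unpack(relation):
    """Replace every tuple by one tuple per day, each with a unit interval."""
    return {(s, d, d) for s, f, t in relation for d in range(f, t + 1)}


def pack(relation):
    """Merge tuples for the same supplier whose intervals abut or overlap."""
    days = defaultdict(set)
    for s, f, t in relation:
        days[s].update(range(f, t + 1))
    packed = set()
    for s, ds in days.items():
        if not ds:                 # ignore malformed tuples with TO < FROM
            continue
        ordered = sorted(ds)
        start = prev = ordered[0]
        for d in ordered[1:]:
            if d != prev + 1:      # gap found: close the current interval
                packed.add((s, start, prev))
                start = d
            prev = d
        packed.add((s, start, prev))
    return packed


def interval_difference(r1, r2):
    """Sketch of DIFFERENCE over interval relations:
    unpack both operands, take the ordinary set difference, then repack."""
    return pack(unpack(r1) - unpack(r2))


print(pack({("X", 1, 2), ("X", 3, 5)}))
print(unpack({("X", 1, 2)}))
print(interval_difference({("X", 1, 10)}, {("X", 4, 6)}))
```

Expanding every interval day by day is costly for long intervals; it is used here only because it makes the semantics easy to follow.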
Having laid these foundations, Date moved on to discuss various issues regarding temporal databases. A sixth normal form was proposed to cater for relations with attributes that are intervals. The interesting problem of the moving point “now” was also discussed: how does one store the tuple “FROM day 3 TO the present time (i.e. ‘now’), supplier X was under contract”? A number of difficulties arise when dealing with intervals that involve the present time. Date suggested that a solution might be to decompose relations horizontally, so as to separate current information, like the previous tuple, from historical information such as “FROM day 3 TO day 6, inclusive, supplier Y was under contract”. Insertions, updates and deletions involving temporal databases are necessarily more complicated and require redefining in terms of the PACK/UNPACK operators mentioned previously. For example, when inserting a new tuple into a relation, the relation will have to be UNPACKed, the insertion made and the relation rePACKed. As mentioned previously, queries on temporal databases can also become complex, and Date argues that, using the techniques described in this seminar, certain queries can be handled which cannot be handled using techniques proposed by other researchers.
Squires (2005) of Oracle Corporation outlined his company’s current standpoint on temporal databases. Like other database vendors, Oracle currently have no commercial offering that tackles all the problems of temporal databases. Squires, however, has personally been developing some experimental temporal database software. This is based solely upon SQL, and contains no extensions of that language. He concurs with Date on the difficulties of temporal databases and emphasises the increasing urgency of developing a commercial temporal database.
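The horizontal decomposition suggested for the “now” problem can be pictured as two relations: one for open-ended current tuples that carry only a FROM value, and one for historical tuples with a closed interval. The Python sketch below is an illustration under assumptions, not Date's design: the names Since, During, end_contract and under_contract_on are hypothetical, days are integers, and the queried day is assumed not to lie in the future.

```python
from typing import NamedTuple


class Since(NamedTuple):
    """Current information: under contract from `start` until further notice."""
    supplier: str
    start: int


class During(NamedTuple):
    """Historical information: under contract from `start` to `end`, inclusive."""
    supplier: str
    start: int
    end: int


def end_contract(current, history, supplier, last_day):
    """Close a supplier's open-ended tuple and move it to the history relation.
    Returns new (current, history) sets; the inputs are left unchanged."""
    row = next((r for r in current if r.supplier == supplier), None)
    if row is None:
        raise KeyError(f"supplier {supplier!r} has no current contract")
    if last_day < row.start:
        raise ValueError("last_day precedes the start of the contract")
    return current - {row}, history | {During(supplier, row.start, last_day)}


def under_contract_on(current, history, supplier, day):
    """True if the supplier was under contract on the given (non-future) day."""
    return any(
        r.supplier == supplier and r.start <= day for r in current
    ) or any(
        r.supplier == supplier and r.start <= day <= r.end for r in history
    )


# The two example tuples from the text.
current = {Since("X", 3)}
history = {During("Y", 3, 6)}

print(under_contract_on(current, history, "X", 8))
current, history = end_contract(current, history, "X", 9)
print(current, history)
```

No tuple ever stores the moving value “now”: the open-ended fact lives in the current relation until the contract ends, at which point it becomes an ordinary closed interval.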
Temporal database seminar at Edinburgh
There was no doubt about the success of this event, which not only introduced new ideas and strategies for tackling the challenges of relational databases but also gave a comprehensive insight into current developments in the field.
Its relevance to research in systems, cybernetics and the management sciences made it a particularly profitable experience for all who participated, both in the seminars and, of course, in the many discussions that followed over the event period.
C.J.H. Mann, Norbert Wiener Institute
Date, C. (2005), “Temporal database in depth: time and the data warehouse”, Seminar, e-Science Institute, Edinburgh
Date, C., Darwen, H. and Lorentzos, N. (2005), Temporal Data and the Relational Model, Morgan Kaufmann, San Mateo, CA, ISBN 1-55860-855-9
Squires, R. (2005), paper presented at Oracle Corporation and the e-Science Institute Seminar, Edinburgh