Tuesday, 5 July 2011

Firebird's Indices, part 1

The concept assumed as the basis of indices is simple and visual and is one of the most important bases of databases design. On the basis of indices many database basic objects are grounded, and furthermore the correct use of indices is key to an improvement of the productivity of databases applications. However, what is index? Index is an ordered pointer of the records in the table. Pointer means that the index contains values of one or several fields in the table and the addresses of data pages where these values are located. In other words, index consists of pairs of values "field value" – "physical location of this field".

Thus, by value of the field (or fields), included in the index, using index we can find quickly that place in the table where the record containing this value is allocated. Ordered means that values of the fields stored in the index are ordered.

Very often index is compared with a library catalogue, in which all the books are recorded to cards and are ordered in some way: according to the alphabet or themes, and in every card contains the information where exactly the given book is allocated in storage.
Why do we need indices?

The only thing indices promote is speeding up of record retrieval by its indexed field (indexed - means included in the index). The main function of indices is to provide fast record retrieval in the table. Any index using comes to this.

How this retrieval function is realized? At the input of this function we have the value of indexed field (or several fields). As a result of retrieval we should receive the whole record, in which the indexed field has a preset value. First in the index (to put it more precisely, in the ordered array of values of the indexed field) the required value is searched, then the address of data page is taken where the required record is located, the server goes to this page and reads the found record. It looks rather inconvenient, however search, using index, is many times faster, than sequential enumeration of all values from the table.

If we continue to make the analogy between index and library catalogue, we will see that the record retrieval, using index, is very similar to the book search, using card. When we find a book in a rather small catalogue (in comparison with the whole library storage), we immediately receive the information about where exactly the book is stored and we can go straight there. Search without using index can be compared with sequential enumeration of all the books in the library!

Enumeration of all records in the table is called direct or natural. We should say that despite of powers of modern computers natural enumeration may be very long if the table contains a great number of records.
How are they organized?

Index is not a part of the table, it is a separate object connected with the table and other database objects. This is a very important point of DBMS implementation allowing to separate information storage from its representation.

Firebird as any other relational database stores records in tables in the unordered way, i.e. does not care at all how records are physically allocated in the table.

Unordered storing means that two records added to the table one by one may not be next to each other. Moreover, the data extracted from the table also have no order apart from the one that should be explicitly specified by the user making a retrieval query.

However, we cannot go without ordering the stored data: end-users of applications want to see the data in the defined order - for example, people's last names according to the alphabet. Indices solve the problem of data representation in the ordered way. Field values included in the index are ordered and represented in the special view, optimized for searching the required values (namely, this is essential for creating ordered sequences).

Separating data storage from their representation gives additional benefits in comparison with direct sorting – maybe you will need to sort the initial table in different ways. Then indices will help you – there can be up to 64 indices for every table!

If we speak about implementation of indices at the physical level, they represent a binary tree which nodes represent pairs "field value in index " – "data allocation in the table ". Retrieval of the required record in index is performed using the mechanism of hash-search - one of the fastest search algorithms.
Index application

Now, when it is clear what we can demand from indices, it is time to know about their function in a database. Indices are used in three main cases:

1. Speeding up of inquiry execution. Indices are created for the fields used under the conditions of search of SQL-inquiries.

2. Support of uniqueness of values in fields; a primary key constraint (about which it was told in the chapter "Tables. Primary keys ") demands that in the table there will be no two identical values of the fields included in a primary key. In order to meet this condition, when inserting a new record you should search for the same value that will be inserted. For record retrieval the special variety of index is used - a unique index (see below).

3. Support of reference integrity. Foreign keys constraints (which are considered in the chapter "Database constraints") are used to check that the values inserted into the table necessarily exist in other table. When creating a foreign key index is automatically created. This index is applied for speeding up the inquiries using join of tables, as well as for checking the conditions of the foreign key. We have briefly covered all possible index applications. Now we will consider the peculiarities of every case in more detail and we will answer most frequently arising questions concerning index application.

Read in the part 2:

* How to speed up SQL queries with indices
* Support of reference integrity
* Optimization of index performance

No comments:

Post a Comment


About Me

My photo
IBSurgeon was established in 2002, 10 years we recover databases and save Firebird/InterBase data.