MySQL 5 Storage Engines

By Ian Gilfillan

New Storage Engines in MySQL 5

MySQL 5 offers a number of new storage engines (previously called table types). In addition to the default MyISAM storage engine, and the InnoDB, BDB, HEAP and MERGE storage engines, there are four new types: CSV, ARCHIVE, FEDERATED and EXAMPLE, as well as a new name for the HEAP storage engine. It is now called the MEMORY storage engine. None of the new types are available by default - you can check for sure with the SHOW ENGINES statement. Here is what is on my default version of MySQL Max:

mysql> SHOW ENGINES;

+------------+---------+-----------------
| Engine     | Support | Comment                                                    |
+------------+---------+-----------------
| MyISAM     | DEFAULT | Default engine as of MySQL 3.23 with great performance     |
| HEAP       | YES     | Alias for MEMORY                                           |
| MEMORY     | YES     | Hash based, stored in memory, useful for temporary tables  |
| MERGE      | YES     | Collection of identical MyISAM tables                      |
| MRG_MYISAM | YES     | Alias for MERGE                                            |
| ISAM       | NO      | Obsolete storage engine, now replaced by MyISAM            |
| MRG_ISAM   | NO      | Obsolete storage engine, now replaced by MERGE             |
| InnoDB     | YES     | Supports transactions, row-level locking, and foreign keys |
| INNOBASE   | YES     | Alias for INNODB                                           |
| BDB        | YES     | Supports transactions and page-level locking               |
| BERKELEYDB | YES     | Alias for BDB                                              |
| NDBCLUSTER | NO      | Clustered, fault-tolerant, memory-based tables             |
| NDB        | NO      | Alias for NDBCLUSTER                                       |
| EXAMPLE    | NO      | Example storage engine                                     |
| ARCHIVE    | NO      | Archive storage engine                                     |
| CSV        | NO      | CSV storage engine                                         |
+------------+---------+------------------------------
To add support for the missing storage engines, you currently need to build MySQL with certain options. It is likely though that there will be binary versions that include these storage engines by default at some point. Until then, there is no other way to enable them.
Changes in the MEMORY storage engine
You can read my 2003 article on the HEAP table type as a start, as most of the detail has not changed, and refer below for modifications in MySQL 5.
Previously, the HEAP storage engine only made use of hash indexes. These allow finding specific matches extremely quickly, but do not return any kind of range data. An index matches a record, but there is no ordering to allow it to return subsequent records. Only the complete index can be used, the concept of leftmost prefixing (using the left part of an index) does not apply. The MEMORY storage engine now permits BTREE indexes as well (the kind used by MyISAM tables by default).
To specify an index type, use the USING clause, as in the following examples:

CREATE TABLE memory_table (f1 INT, INDEX USING BTREE (f1)) ENGINE = MEMORY;
                                     or
CREATE TABLE memory_table (f1 INT, INDEX USING HASH (f1)) ENGINE = MEMORY;

The HASH index is still the default, and will be the type of index created if you do not specify a particular kind.
  • Now support AUTO_INCREMENT
  • Now support INSERT DELAYED
  • Support indexes on columns that can contain NULL values
  • Never get converted to disk tables. (Temporary internal tables are automatically converted to disk table if they get too big, MEMORY tables never are. The max_heap_table_sizevariable (it hasn't yet changed its name to reflect the new storage engine name) places a limit on the memory utilization of MEMORY tables, and you can always place a MAX_ROWS limit as well, when creating the table.

The EXAMPLE storage engine

Added in MySQL 4.1.3 and only of interest to developers, the EXAMPLE storage engine does nothing, but is there to provide simple source code for developers to base new storage engines on. For those interested, the source code can be found in the sql/examples directory.
The FEDERATED storage engine
Added in MySQL 5.0.3, to make use of it you need to use the --with-federated-storage-engine option to configure when building MySQL. The FEDERATED storage engine allows you to access data from a table on another database server. That table can make use of any storage engine. Let's see it in action. First, CREATE a table on a remote server (you can do this on the same server for testing purposes, but doing so is fairly pointless otherwise).

CREATE TABLE myisam_table (f1 INT, PRIMARY KEY(f1))ENGINE=MYISAM;
Assuming that the default is set to create MyISAM tables (FEDERATED tables can access tables of any type), the above statement creates a definition file (.frm), an index file (.MYI) and a data file (.MYD). If you had created an InnoDB file, MySQL would create a definition (.frm) and index and data file (.idb). Now create the FEDERATED table on another server. The original table must always exist first:

CREATE TABLE federated_table (f1 INT, PRIMARY KEY(f1))ENGINE=FEDERATED
COMMENT='mysql://username:password@hostname.co.za:3306/dbname/myisam_table';
This creates a definition file (.frm), but the data and index files are that of the table on the remote server. The only unusual syntax is the COMMENT, which supplies the username, password (optional), port (optional), database and table name. This method is not particularly elegant, or secure, as the password is stored in clear text available to anyone who has access to the table data. However, in most cases it is likely that whoever has access to the FEDERATED table can also have access to the remote table, so this should not be too much of an issue. Bear in mind that this method of connecting will likely change in a future version.
There are some limitations on the use of FEDERATED tables. They are useful for easily accessing data on a different server, but fall short in many areas:
  • No Data Definition Language statements are permitted (such as DROP TABLE, ALTER TABLE)
  • They do not support transactions (since the remote server is contacted once, and the results returned to the local server)
  • Similarly, there is no way of being sure that the integrity of the local data is intact.
  • No prepared statements.

The CSV storage engine

Added in MySQL 4.1.4, tables of type CSV are actually just comma-delimited text files. This can be quite useful, and they exist to allow MySQL to interact easily with other applications that make use of CSV files, such as spreadsheets. They make no use of any sort of indexing. To enable this storage engine, use the --with-csv-storage-engine configure option when building MySQL.
Let's see how this works. You can import an existing CSV file. Assume you have a file containing firstname, surname and age, as follows:
"Jacob","Mbezela","42"
"Schabir","Field","29"
First, create the .frm definition file, as follows:

mysql>CREATE TABLE csv_names(firstname CHAR(30), surname CHAR(40), age INT) ENGINE = CSV;
 The blank data file is also created. Since the CSV file is just a plain text file, you can copy an existing CSV file into the same location, and it will be viewable from the MySQL client, as follows:

mysql> SELECT * FROM csv_names;
+-----------+------------+-----+
| firstname | surname    | age |
+-----------+------------+-----+
| Jacob     | Mbezela    |  42 |
| Schabir   | Field      |  29 |
+-----------+------------+-----+
With no indexes, the SELECT is not at all efficient, and performs a complete table scan. Conversely, you can INSERT a record from the MySQL client:

mysql> INSERT INTO csv_names VALUES('Quinton','Baxter','75');

and view the change in the CSV file:

"Jacob","Mbezela","42"
"Schabir","Field","29"
"Quinton","Baxter","75"

The ARCHIVE storage engine

Added in MySQL 4.1.3, the archive storage engine lives up to its name by storing large amounts of data without taking up too much space. It too makes no use of any sort of indexing, and there are no means to repair the table should it become corrupted during a crash. To enable this storage engine, use the -with-archive-storage-engine configure option when building MySQL.

mysql> CREATE TABLE archive_names(firstname CHAR(30), surname CHAR(40), age INT) ENGINE = ARCHIVE;
This, as always, creates a .frm definition file, as well as .ARZ and .ARM data and metadata files.

Being an archive, you cannot DELETE, UPDATE or REPLACE records - you can only INSERT and SELECT. Again, with no indexes, the SELECT needs to perform a complete table scan. Although the records are compressed upon insertion, OPTIMIZE TABLE can compress the entire dataset even further. A .ARN file will temporarily appear when this occurs.

mysql> INSERT INTO archive_name VALUES('Quinton','Baxter','75');
mysql> SELECT * FROM archive_names;
+-----------+------------+-----+
| firstname | surname    | age |
+-----------+------------+-----+
| Quinton   | Baxter     |  75 |
+-----------+------------+-----+
Conclusion
The new storage engines, whilst tricky for most people to use since they require rebuilding MySQL, can be useful for those with special needs. After all, people have taken the time to write them! They are bound to be included in the binaries at some point, so if you are dying to try them, but rebuilding MySQL is not for you, you hopefully do not have to be patient for long. Even if you are able to use them now, keep an eye on the official documentation, as things are likely to change as they approach maturity. Good luck!

2 comments:

Unknown said...

Forget the problem of an application crash or slow data access or response time for an overloaded SAN switch port with Traverse's service container that can monitor application response time and correlate that with the underlying storage components which are relevant to that application using its Business Container technology.
http://zyrion.com/solutions/server.php

Unknown said...

looks great. The parsing of corrupted documents can be also performed by the sql server database recovery tool, it is the fastest way to open affected files