-
MySQL and the SSB – Part 2 – MyISAM vs InnoDB low concurrency
This blog post is part two in what is now a continuing series on the Star Schema Benchmark.In my previous blog post I compared MySQL 5.5.30 to MySQL 5.6.10, both with default settings using only the InnoDB storage engine. In my testing I discovered that innodb_old_blocks_time had an effect on performance of the benchmark. There was some discussion in the comments and I promised to follow up with more SSB tests at a later date.I also promised more low concurrency SSB tests when Peter blogged about the importance of performance at low concurrency.The SSB The SSB tests a database’s ability to optimize queries for a star schema. A star schema presents some unique challenge to the database optimizer. The SSB benchmark consists of four sets of queries. Each set is known as a “flight”. I have labeled each query as Q{FLIGHT_NUMBER}.{QUERY_NUMBER}. In general, each flight examines different time periods or different regions. The flights represent the type of investigations and drill-downs that are common in OLAP analysis.Each query in each flight (Q1.1 for example) is tested with a cold buffer pool. Then the query is tested again without restarting the database. The first test is described as the cold test, and the second as the hot test. The database software is restarted after the hot test. All OS caches are dropped at this time as well.These set of queries were tested on the SSB at SCALE FACTOR: 20. This means there is approximately 12GB of data in the largest table.You can find the individual SSB query definitions in my previous blog post.Test environment These tests were done on a relatively fast machine with a Xeon E5-2680 (8 cores, 16 threads) with fast IO (OCZ R4 1.6TB) and 128GB memory. For the hot test, the data fits in the buffer pool and has been loaded by the cold test already. The buffer pool and adaptive hash index are cold for the cold test. All tests were done with no concurrency. The hardware for this test was provided by Adotomi. I will be blogging about raw performance of the OCZ card in another post.Also, while it is labeled on the graphs, it is important to note that in all cases, lower times are better.SSB Flight #1 Here you will see the start of an interesting trend. MyISAM is faster when the data is not cached (the cold run) but is slower in the hot (cached) run. I did some investigation during the testing and found that InnoDB does more IO than MyISAM when the database is cold, but uses less CPU time when the database is hot. I am only speculating (and I can investigate further), but I believe the adaptive hash index is improving performance of InnoDB significantly during the hot run, as hash indexes are faster than a b-tree index. Also accessing pages from the buffer pool should be faster than getting them from the OS cache, which is another advantage of InnoDB. SSB Flight #2 Flight #2 is similar to Flight #1. MyISAM is faster than InnoDB when the database is cold, but the opposite is true when the database is hot. SSB Flight #3 Here in some cases MyISAM is substantially faster than InnoDB both cold and hot. SSB Flight #4 There is one query in this flight, Q4.3, which is faster using MyISAM than InnoDB. Like the queries in Flight #3 that are faster using MyISAM, Q4.3 examines very little data. It seems that InnoDB performs better when a larger number of rows must be joined together (Q4.1, Q4.2) but worse when small amounts of data are examined. ConclusionIn some cases MyISAM is faster than InnoDB, but usually only when the buffer pool is cold. Please don’t take away that you should be using MyISAM for everything!. MyISAM may be good for raw performance, but there are limitations which MyISAM imposes that are difficult to work with. MyISAM does not maintain checksum consistency during regular operations and is not ACID compliant. MyISAM and InnoDB may perform differently under concurrency, which this benchmark does not cover. I will make a follow-up post about concurrency in another blog post in this series. Regardless, when the working set fits in memory, InnoDB almost always performs better, at least for this workload.NotesMySQL version used: 5.6.11, custom compiled to remove performance_schemaFor the InnoDB tests, a 64GB buffer pool was used. O_DIRECT was used so, there was no caching of data at the filesystem level. The InnoDB indexes were built using ALTER TABLE fast index creation (merge sort).For the MyISAM tests I used a 10GB key buffer. I used ALTER TABLE DISABLE KEYS and built the keys with sort via ALTER TABLE ENABLE KEYS. my.cnf[mysqld]
datadir=/mnt/mysql56/data
basedir=/usr/local/mysql
socket=/var/lib/mysql/mysql.sock
user=justin
innodb_buffer_pool_size=64G
innodb_log_file_size=4G
innodb_file_per_table
innodb_stats_on_metadata=off
innodb_file_format=barracuda
innodb_log_buffer_size=32M
innodb_buffer_pool_instances=16
metadata_locks_hash_instances=32
table_open_cache_instances=8
sort_buffer_size=128k
read_rnd_buffer_size=8M
join_buffer_size=8M
default_tmp_storage_engine=myisam
tmpdir=/dev/shm
innodb_undo_logs=32
innodb_old_blocks_time=0
table_open_cache=2048
table_definition_cache=16384
innodb_flush_method=O_DIRECT
key_buffer_size=10G
# Disabling symbolic-links is recommended to prevent assorted security risks
symbolic-links=0
innodb_stats_persistent
innodb_stats_auto_update=off
[mysqld_safe]
log-error=/var/log/mysqld.log
pid-file=/var/run/mysqld/mysqld.pidThe post MySQL and the SSB – Part 2 – MyISAM vs InnoDB low concurrency appeared first on MySQL Performance Blog.
-
MySQL Workbench 6.0 – A Sneak Preview
The MySQL Workbench team has been a little quiet for the past few months, but that’s because we’ve been busy working on an exciting new version. Read more about it in Tomas’ blog
-
MySQL Workbench 6.0 – A Sneak Preview
The MySQL Developer Tools team is known for their steady release cycle, putting out a new MySQL Workbench release every 4-5 weeks. Now that it has been a bit quiet for a while you may wonder what is going on. Let me share some inside knowledge of what’s happening behind the scenes.
MySQL Workbench is a key component of our MySQL stack and extremely popular as shown by download numbers and interest in our white papers. It is the face of MySQL on the desktop, and we aim to make it even more popular for developers & DBAs than it is today. We have been looking for strong C++ and Python developers to grow the team further, and I’m happy to welcome Marcin Szalowicz from Poland and Miguel Tadeu from Portugal who started this Monday. They are going to work on both, the C++ backend and the native UI frontend of the product.
For some time the focus for the team has been on the next major release of MySQL Workbench which will be called 6.0. If you are familiar with the current version (5.2) you are going to see that we have been listening carefully to the feedback we’ve got. Here are the main areas that have been improved.
Simplified Workflow
The current WB 5.2 UI was designed for three specific workflows, SQL Development, EER Modelling and Administration. This is represented by the 3 main areas of the WB 5.2 Home screen and makes good sense if you have a very specific role in your company and use WB for that specific purpose, e.g. Administration of MySQL servers. What we learned is that e.g. many developers working in the SQL Editor also want to perform administrative tasks on their development machines and therefore also open an Administrator session. The same is true for DBAs using the SQL Editor to run scripts and queries.
For that reason we have unified SQL Development & Administration and made both available in a single session when you connect to your MySQL Server. The connections now take the main area of the new, redesigned WB 6.0 Home screen. We have also adopt a modern looking UI.
Enterprise Feature Support
MySQL Enterprise Backup and the new Audit Log functionality found in MySQL 5.6 Enterprise are very compelling features for our customers. We got many requests to deliver a streamlined GUI for these features that will now be released as part of WB 6.0. MySQL Workbench will perform all tasks that are needed to configure your MySQL installation for the use of the backup feature and setting up an incremental backup is now as easy as clicking a button.
Improved Code Quality
Another area of focus was to improve the quality of the product. WB 6.0 will contain over 100 bug fixes and will be tested by a dedicated QA team. We are soon going to launch the public Beta test phase and with the help of the MySQL community the WB 6.0 will be the best MySQL Workbench release to date.
More Than 50 Improvements
Apart from working on the major UI changes a lot of work has been invested in improving and extending existing features like Server Status overview, Visual Explain, Editors, Schema Inspector, Table Data Search, and much more.
I hope you got interested in the upcoming MySQL Workbench 6.0 release and you will join the Beta test when it is first announced.
-
Some storage engine features you only get if you’re InnoDB
I had reason to look into the extended secondary index code in MariaDB and MySQL recently, and there was one bit that I really didn’t like.
MariaDB:
share->set_use_ext_keys_flag(legacy_db_type == DB_TYPE_INNODB);
MySQL:
use_extended_sk= (legacy_db_type == DB_TYPE_INNODB);
In case you were wondering what “legacy_db_type” actually does, let me tell you: it’s not legacy at all, it’s kind of key to how the whole “metadata” system in MySQL works. For example, to drop a table, this magic number is used to work out what storage engine to call to drop the table.
Now, these code snippets basically kiss goodbye to the idea of a “pluggable storage engine” architecture. If you’re not InnoDB, you don’t get to have certain features. This isn’t exactly MySQL or MariaDB encouraging an open storage engine ecosystem (quite the opposite really).
Having the MySQL server have this incredibly basic, busy and incomplete understanding of metadata has always been a bit of a mess. The code for reading a table definition out of the FRM file really does show its age, and has fingers all through the server.
If somebody was serious about refactoring server code, you’d certainly be looking here, as this code is a major source of arbitrary limitations. However, if you have the server and the engine(s) both having separate views of what is the “correct” state of metadata you end up with a mess (anyone who has had InnoDB be out of sync with FRMs knows this one). I worry that the FRM code will be replaced with something even less understandable by humans, again making the mistake that the server knows the state of the engine better than the engine does.
See Also:
Sergey Petrunia’s blog on the topic of extended keys: http://s.petrunia.net/blog/?p=74
Sergey Glukhov blogs on the MySQL implementation: http://glukhsv.blogspot.com.au/2012/12/innodb-extended-secondary-keys.html
-
Implementing asynchronous cascade delete in MySQL
A while back one of my foreign keys started causing trouble. The problem was that some parent rows had tens of thousand of child rows, and the foreign key was defined with CASCADE DELETE enabled. When we deleted one of those parent rows on a master database, it took several seconds to execute the delete because of the cascade. This led to latency for the end user, and also led to replication delays.
The immediate solution was make the application tolerant of orphaned rows in the child table and to drop the explicit foreign key constraint.
I didn't really want to leave those orphaned rows hanging around in the child table, so I decided to implement an asynchronous process to delete the orphaned rows on a scheduled basis. Read on for a description of that process.
Using the sakila database as an example, imagine I drop the foreign key between file_category and category, like so:
alter table sakila.film_category
drop foreign key fk_film_category_category;
Without the foreign key in place, deletes on the category table lead to orphaned rows in film_category. For example, I will delete the "New" category:
```
mysql> delete from sakila.category
-> where name = 'New';
Query OK, 1 row affected (0.01 sec)
```
There are several ways to count the orphaned rows. Here are two different naive implementations using OUTER JOIN or NOT EXISTS:
```
mysql> select count(*)
-> from sakila.film_category c
-> left outer join sakila.category p on p.category_id = c.category_id
-> where p.category_id is null;
+----------+
| count(*) |
+----------+
| 63 |
+----------+
1 row in set (0.00 sec)
mysql> select count(*)
-> from sakila.film_category c
-> where not exists
-> (
-> select NULL from sakila.category p where p.category_id = c.category_id
-> );
+----------+
| count(*) |
+----------+
| 63 |
+----------+
1 row in set (0.01 sec)
```
I can also delete the orphaned rows using the same query approaches. I'll roll the first delete back so I can demonstrate the second query in the same session:
```
mysql> set autocommit = 0;
Query OK, 0 rows affected (0.00 sec)
mysql> delete c.*
-> from sakila.film_category c
-> left outer join sakila.category p on p.category_id = c.category_id
-> where p.category_id is null;
Query OK, 63 rows affected (0.02 sec)
mysql> rollback;
Query OK, 0 rows affected (0.00 sec)
mysql> delete c.*
-> from sakila.film_category c
-> where not exists (select NULL from sakila.category p where p.category_id = c.category_id);
Query OK, 63 rows affected (0.01 sec)
mysql> commit;
Query OK, 0 rows affected (0.00 sec)
```
This approach will be very slow for tables containing millions of rows, so in my real world case I didn't use this approach. Instead I decided it would be a lot easier and faster to delete the orphaned rows if I knew who their parent was. To this end I created a new table to track the deleted rows, and populated it using a trigger. Continuing the example in the sakila database:
```
CREATE TABLE category_deleted (
category_id tinyint(3) unsigned NOT NULL,
name varchar(25) NOT NULL,
last_update timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
delete_time timestamp NOT NULL,
PRIMARY KEY (category_id)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
DELIMITER $$
DROP TRIGGER IF EXISTS sakila.TR_A_DEL_CATEGORY $$
CREATE TRIGGER sakila.TR_A_DEL_CATEGORY AFTER DELETE ON sakila.category FOR EACH ROW BEGIN
INSERT IGNORE INTO sakila.category_deleted (category_id, name, last_update, delete_time)
VALUES (old.category_id, old.name, old.last_update,now());
END $$
DELIMITER ;
```
Now I can delete another category to test the trigger:
```
mysql> -- delete a single category
mysql> delete from sakila.category
-> where name = 'Classics';
Query OK, 1 row affected (0.01 sec)
mysql> -- verify the trigger worked
mysql> select * from sakila.category_deleted;
+-------------+----------+---------------------+---------------------+
| category_id | name | last_update | delete_time |
+-------------+----------+---------------------+---------------------+
| 4 | Classics | 2006-02-15 04:46:27 | 2013-05-21 18:21:53 |
+-------------+----------+---------------------+---------------------+
1 row in set (0.00 sec)
mysql> -- count the orphaned rows using the _deleted table
mysql> select count(*)
-> from sakila.film_category c
-> inner join sakila.category_deleted p on p.category_id = c.category_id;
+----------+
| count(*) |
+----------+
| 57 |
+----------+
1 row in set (0.00 sec)
```
I also wanted to execute the deletes on the child table in chunks, so I implemented a stored procedure to delete the orphaned rows by iterating through the rows in the _deleted table, deleting the child rows, and then deleting from the _deleted table. If I had to implement it again, I would probably use common_schema to chunk the deletes so I wouldn't need the stored procedure.
Here's an implementation using common_schema:
```
-- first delete the orphaned rows from the child table
set @script := "
split(sakila.film_category: delete sakila.film_category.*
from sakila.film_category
inner join sakila.category_deleted on sakila.category_deleted.category_id = sakila.film_category.category_id
)
SELECT $split_total_rowcount AS 'rows deleted so far';
";
call common_schema.run(@script);
-- then delete the rows from the _deleted table (assuming they have no children)
delete p.*
from sakila.category_deleted p
left outer join sakila.film_category c on c.category_id = p.category_id
where c.category_id is null;
```
|