ProxySQL and Mirroring: What About It?
Written by Marco Tusa   



1. Overview

I love ProxySQL; I think it is a great component for expanding architecture flexibility and HA. But not all that shines is gold.

Let me be clear: I only want to set the expectations right, and avoid selling carbon as gold. Carbon has its own use; gold has another.


First of all, let me clarify what mirroring means to me (and, I hope, to most of you).

Then we need to cover the basics of how ProxySQL manages the (I cannot say mirroring) traffic dispatch.

ProxySQL receives a connection from the application, and through it we can have a simple SELECT or a more complex transaction. ProxySQL gets each query, passes it to the Query Processor, processes it, and identifies whether the query is to be mirrored. If so, it duplicates the whole MySQL session internal object and associates it with a mirror queue, which refers to a mirror thread pool.

If the pool is free (has an available slot in the set of concurrently active threads), the query is processed right away; if not, it stays in the queue. If the queue is full, the query is lost.

Whatever the query returns goes to /dev/null; as such, no result set is passed back to the client.

The whole process is not free for a server. In fact, if you check the CPU utilization, you will see that "mirroring" in ProxySQL actually doubles the CPU utilization. This means that the traffic on server A will also be impacted because of resource contention.

Summarizing, ProxySQL will:

  • Send queries for execution in a different order
  • Completely ignore any transaction isolation
  • Execute a different number of queries on server B than on server A
  • Add significant load on the server resources

Comparing this with the points and expectations I mention in the reasoning at the end of this article, it is quite clear to me that at the moment we cannot consider ProxySQL a valid mechanism for duplicating a consistent load from server A to server B.


Personally, I don’t think the ProxySQL development team (Rene’ :D) should waste its time fixing this part; there are so many other things to cover and improve in ProxySQL.

After having worked extensively with ProxySQL and done deep QA on mirroring, I think that either we keep it as a basic blind traffic dispatcher, or a full re-conceptualization is required.

2. Setup

Once we have clarified that, we can still see the ProxySQL "traffic dispatch" (I really cannot say mirroring) as a very interesting feature that may prove useful in many ways, especially because it is so easy to set up.

The following is the result of the tests I performed, which should help in setting the correct expectations.

The tests were simple: load data into a PXC cluster, and use ProxySQL to replicate the load onto a MySQL master-slave environment.

The machines for MySQL/PXC were VMs with CentOS 7, 4 CPUs, 3 GB RAM, and attached storage.

The machine for ProxySQL was a VM with CentOS 7, 8 CPUs, and 8 GB RAM.

Why did I choose to give ProxySQL such a higher volume of resources?

I knew in advance I might need to play a bit with a couple of settings requiring more memory and CPU cycles, and I wanted to be sure I would not get any problems from ProxySQL in relation to CPU and memory.

The application I used to generate load is a Java application I developed to perform my tests. The app is https://github.com/Tusamarco/blogs/blob/master/stresstool_base_app.tar.gz, and the whole set I used to do the test is here

I used four different tables:

+------------------+
| Tables_in_mirror |
+------------------+
| mirtabAUTOINC    |
| mirtabMID        |
| mirtabMIDPart    |
| mirtabMIDUUID    |
+------------------+


For the full table definition, see here


3. ProxySQL setup

OK, so let's start.

First, set up ProxySQL:

-- Hostgroup 500: PXC writer; hostgroup 501: PXC readers.
-- Hostname values are left empty here; fill in your own node addresses.
DELETE FROM mysql_servers WHERE hostgroup_id IN (500,501,700,701);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('',500,3306,60000,400);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('',501,3306,100,400);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('',501,3306,20000,400);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('',501,3306,20000,400);
-- Hostgroup 700: mirror platform master; hostgroup 701: mirror slaves.
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('',700,3306,1,400);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('',701,3306,1,400);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('',701,3306,1,400);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('',701,3306,1,400);
-- The application user, defaulting to the PXC writer hostgroup (500).
DELETE FROM mysql_users WHERE username='load_RW';
INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) VALUES ('load_RW','test',1,500,'test',1);
-- The rule doing the work: load_RW traffic goes to hostgroup 500 and is mirrored to hostgroup 700.
DELETE FROM mysql_query_rules WHERE rule_id=202;
INSERT INTO mysql_query_rules (rule_id,username,destination_hostgroup,mirror_hostgroup,active,retries,apply) VALUES(202,'load_RW',500,700,1,3,1);
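One detail worth remembering: the statements above only modify ProxySQL's in-memory admin configuration. The standard admin commands are needed to activate and persist them:

LOAD MYSQL SERVERS TO RUNTIME;
LOAD MYSQL USERS TO RUNTIME;
LOAD MYSQL QUERY RULES TO RUNTIME;
SAVE MYSQL SERVERS TO DISK;
SAVE MYSQL USERS TO DISK;
SAVE MYSQL QUERY RULES TO DISK;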


4. Tests results

Now the action. 

4.1. Test 1

The first test is mainly a simple functional test, during which I insert records using a single thread in PXC and in MySQL.

No surprises here: I have 3,000 loops, and at the end I have 3,000 records on both platforms.

To have a baseline, we can see that the ProxySQL CPU utilization is quite low:



At the same time, the number of "questions" against PXC and MySQL is very similar:





The other two metrics we want to keep an eye on are Mirror_concurrency and Mirror_queue_length. These refer respectively to mysql-mirror_max_concurrency and mysql-mirror_max_queue_length.

These two variables and metrics were introduced in ProxySQL 1.4.0, with the intent to control and manage the load ProxySQL generates internally in relation to the mirroring feature.
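Both metrics are exposed through the ProxySQL admin interface, so they can be watched during a test run with a simple query against the stats schema:

SELECT Variable_Name, Variable_Value
  FROM stats_mysql_global
 WHERE Variable_Name IN ('Mirror_concurrency','Mirror_queue_length');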



In this case, as you can see, we have a max of 3 concurrent connections and 0 queue entries. All good.

Now that we have a baseline, and we know that at the functional level "it works", let's see what happens when the load increases.


4.2. Test 2

The scope of the test was to identify how ProxySQL behaves with a standard configuration and increasing load.

It turns out that as soon as ProxySQL has a little bit more load, it starts to lose some queries along the way.

Executing 3,000 loops from 40 threads, insert only, results in 120,000 rows inserted in each of the 4 tables in PXC. But the tables on the Secondary (mirrored) platform only contain a variable number of rows, between 101,359 and 104,072, showing a consistent loss of data.

Reviewing the insights and comparing the connections running in PXC and the Secondary, we can see that, as expected, the number of PXC connections scales to serve the number of incoming requests, while the connections on the Secondary are limited by the default value of mysql-mirror_max_concurrency=16.



It is also interesting to notice that, in order to process the queue of transactions existing in ProxySQL, the connections on the Secondary persist longer than the connections in PXC.



As we can see above, the queue has an evident bell curve, reaching 6K entries, which is quite below the mysql-mirror_max_queue_length limit (32K). Yet queries were dropped by ProxySQL, which indicates that the queue is not really enough to accommodate the pending work.



CPU-wise, ProxySQL, as expected, takes a few more cycles, but nothing crazy. The overhead for the simple mirroring queue processing can be seen when the main load stops, around 12:47.


Another interesting graph to keep an eye on is the one describing the commands executed in PXC and in the Secondary:






As you can see, the traffic on the Secondary was significantly lower (669 average) than on PXC (1.17K). It then spikes when the main load on the PXC node terminates.

In short, it is quite clear that ProxySQL here is not able to scale with the traffic arriving at PXC, and it actually loses a significant amount of data on the Secondary.

Doubling the load in Test 3 shows the same behavior, with ProxySQL reaching its limit for traffic duplication.


But can this be optimized?

Of course yes. This is what mysql-mirror_max_concurrency is for. Let's see what happens if I increase the value from 16 to 100, just to make it crazy high.
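From the admin interface, the change itself is a one-liner, plus the usual load/save:

UPDATE global_variables SET variable_value='100'
 WHERE variable_name='mysql-mirror_max_concurrency';
LOAD MYSQL VARIABLES TO RUNTIME;
SAVE MYSQL VARIABLES TO DISK;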


4.3. Test 4

Two app nodes writing.

I am skipping the description of Test 3, because it is essentially the same as Test 2, only with more load.


The first thing that comes to attention is that both PXC and the Secondary report the same number of rows in the tables (240,000). That is a first good win.


Second, the number of running connections:


The lines are now much closer, and the queue drops to just a few entries.


Commands executed in PXC:


And commands executed in the Secondary:


The average execution reports the same value, with a very similar trend.


Finally, what was the CPU cost and effect?






As expected, some difference in the CPU usage distribution exists, but the trend is consistent between the two nodes and with the operations.


ProxySQL's CPU utilization is definitely higher than before:


But it is absolutely manageable, and still reflects the initial distribution.


Finally, what about CRUD?
So far I had only tested insert operations, but what happens if we run a full CRUD set of tests?


4.4. Test 7 (CRUD)

First of all, let us review the executed commands in PXC:


And in the Secondary:


While in appearance we have a very similar workload, selects aside, the behavior significantly diverges.
This is because in the Secondary the different operations are not encapsulated in a transaction, and are executed as they are received. We can see a significant difference in update and delete operations between the two.


The threads in execution also show a different picture between the two platforms.





It appears quite clear that PXC constantly has more running threads and more connections.

Nevertheless, both platforms process a similar total number of questions.





Both have an average of around 1.17K questions per second.

This is another indication of how much the behavior is impacted when we have concurrent operations but no respect for isolation or execution order.

The different behavior is also clear when reviewing the CPU utilization.






5. Conclusions

To close this long article, I want to go back to the start.

We cannot consider the mirror function in ProxySQL real mirroring; it is more a traffic redirection.


Using ProxySQL with this approach is still partially effective for testing the load and the effect it may have on a secondary platform.
As we know, data consistency is not guaranteed in this scenario, and selects, updates, and deletes are affected by this, given the different data sets and result sets they will manage.

Given that, the server behaviors will differ between the original and the mirror, if not in quantity then in quality.

I am convinced that when we need a tool able to test our production load on a different or new platform, we would do better to look at something else, like Query Playback, recently reviewed and significantly patched by DropBox (https://github.com/Percona-Lab/query-playback).

In the end, ProxySQL is already a cool tool, and if it doesn't cover this, well, I can live with that. I am more interested in having it work as it should, and as it does, in many other functionalities.


6. Acknowledgments

As usual, thanks to Rene', who worked on fixing and introducing new functionalities associated with the mirror, like the queue and concurrency control.

Thanks to the Percona team who developed PMM; all the graphs here (except three) come from PMM, and some of them are my customizations.

InnoDB Page Merging and Page Splitting
Written by Marco Tusa   

If you met one of the (few) MySQL consultants around the globe and asked him/her to review your queries and/or schemas, I am sure they would tell you something regarding the importance of good primary key design. Especially in the case of InnoDB, I'm sure they would start to explain index merges and page splits to you. These two notions are closely related to performance, and you should take this relationship into consideration when designing any index (not just PKs).

That may sound like mumbo jumbo to you, and you may be right. This is not easy stuff, especially when talking about internals. This is not something you deal with on a regular basis, and often you don’t want to deal with it at all.

But sometimes it’s a necessity. If so, this article is for you.

In this article, I want to shed some light in explaining some of the most unclear, behind the scenes operations in InnoDB: page index creation, page merging and page splitting.

In InnoDB, all data is an index. You've probably heard that as well, right? But what exactly does that mean?


0.1. File-Table Components

Let's say you have MySQL installed, the latest 5.7 version (Percona Server for MySQL, right?), and you have a table named wmills in the schema windmills. In the data directory (normally /var/lib/mysql/) you will see that it contains:




This is because the parameter innodb_file_per_table has been set to 1 by default since MySQL 5.6. With that setting, each table in your schema is represented by one file (or many files, if the table is partitioned).
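A quick way to confirm the setting on your own installation:

SHOW GLOBAL VARIABLES LIKE 'innodb_file_per_table';
-- or: SELECT @@GLOBAL.innodb_file_per_table;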

What is important here is that the physical container is a file named wmills.ibd. This file is broken up into and contains N number of segments. Each segment is associated with an index.

While a file’s dimensions do not shrink with row-deletions, a segment itself can grow or shrink in relation to a sub-element named extent. An extent can only exist inside a segment and has a fixed dimension of 1MB (in the case of default page size). A page is a sub-element of an extent and has a default size of 16KB.

Given that, an extent can contain a maximum of 64 pages. A page can contain two to N number of rows. The number of rows a page can contain is related to the size of the row, as defined by your table schema. There is a rule within InnoDB that says, at minimum, two rows must fit into a page. Therefore, we have a row-size limit of 8000 bytes.
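Putting numbers to it (a quick check, assuming the default 16KB page size and 1MB extent size mentioned above):

\[
\frac{1\,\mathrm{MB}}{16\,\mathrm{KB}} = 64\ \text{pages per extent},
\qquad
\text{max row size} \approx \frac{16\,\mathrm{KB}}{2} \approx 8000\ \text{bytes}
\]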

If you think this sounds like Matryoshka dolls, you are right! An image might help:


InnoDB uses B-trees to organize your data inside pages across extents, within segments.

Roots, Branches, and Leaves

Each page (leaf) contains 2-N rows(s) organized by the primary key. The tree has special pages to manage the different branch(es). These are known as internal nodes (INodes).


This image is just an example, and is not indicative of the real-world output below.

Let’s see the details:


ROOT NODE #3: 4 records, 68 bytes
 INTERNAL NODE #197: 464 records, 7888 bytes
 LEAF NODE #5: 57 records, 7524 bytes
 RECORD: (id=2)(uuid="884e471c-0e82-11e7-8bf6-08002734ed50", millid=139, kwatts_s=1956, date="2017-05-01", lo



Below is the table structure:


CREATE TABLE `wmills` (
  `id` bigint(11) NOT NULL AUTO_INCREMENT,
  `uuid` char(36) COLLATE utf8_bin NOT NULL,
  `millid` smallint(6) NOT NULL,
  `kwatts_s` int(11) NOT NULL,
  `date` date NOT NULL,
  `location` varchar(50) COLLATE utf8_bin DEFAULT NULL,
  `active` tinyint(2) NOT NULL DEFAULT '1',
  `strrecordtype` char(3) COLLATE utf8_bin NOT NULL,
  PRIMARY KEY (`id`),
  KEY `IDX_millid` (`millid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;


All styles of B-trees have a point of entry known as the root node. We’ve identified that here as page #3. The root page contains information such as index ID, number of INodes, etc. INode pages contain information about the pages themselves, their value ranges, etc. Finally, we have the leaf nodes, which is where we can find our data. In this example, we can see that leaf node #5 has 57 records for a total of 7524 bytes. Below that line is a record, and you can see the row data.

The concept here is that while you organize your data in tables and rows, InnoDB organizes it in branches, pages, and records. It is very important to keep in mind that InnoDB does not work on a single row basis. InnoDB always operates on pages. Once a page is loaded, it will then scan the page for the requested row/record.

Is that clear up to now? Good. Let’s continue.

0.2. Page Internals

A page can be empty or fully filled (100%). The row-records will be organized by PK. For example, if your table is using an AUTO_INCREMENT, you will have the sequence ID = 1, 2, 3, 4, etc.


A page also has another important attribute: MERGE_THRESHOLD. The default value of this parameter is 50% of the page, and it plays a very important role in InnoDB merge activity:


While you insert data, the page is filled up sequentially if the incoming record can be accommodated inside the page.

When a page is full, the next record will be inserted into the NEXT page:


Given the nature of B-trees, the structure is browsable not only top-down following the branches, but also horizontally across the leaf nodes. This is because each leaf node page has a pointer to the page that contains the NEXT record value in the sequence.

For example, Page #5 has a reference to the next page, Page #6. Page #6 has references backward to the previous page (Page #5) and a forward to the next page (Page #7).

This mechanism of a linked list allows for fast, in-order scans (i.e., Range Scans). As mentioned before, this is what happens when you are inserting and have a PK based on AUTO_INCREMENT. But what happens if I start to delete values?


0.3. Page Merging

When you delete a record, the record is not physically deleted. Instead, InnoDB flags it as deleted, and the space it used becomes reclaimable.


When a page has received enough deletes to match the MERGE_THRESHOLD (50% of the page size by default), InnoDB starts to look to the closest pages (NEXT and PREVIOUS) to see if there is any chance to optimize the space utilization by merging the two pages.


In this example, Page #6 is utilizing less than half of its space. Page #5 received many deletes and is also now less than 50% used. From InnoDB’s perspective, they are mergeable:


The merge operation results in Page #5 containing its previous data plus the data from Page #6. Page #6 becomes an empty page, usable for new data.


The same process also happens when we update a record and the size of the new record brings the page below the threshold.

The rule is: merges happen on delete and update operations involving closely linked pages.
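As a side note, since MySQL 5.7 the MERGE_THRESHOLD can also be configured per table or per index through the COMMENT clause. A sketch (the index name below is made up for illustration):

ALTER TABLE windmills.wmills COMMENT='MERGE_THRESHOLD=45';
-- or per index:
CREATE INDEX IDX_kwatts ON windmills.wmills (kwatts_s) COMMENT 'MERGE_THRESHOLD=40';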

If a merge operation is successful, the index_page_merge_successful metric in INFORMATION_SCHEMA.INNODB_METRICS is incremented.
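Most index metrics in INNODB_METRICS are disabled by default; assuming that is the case on your server, something along these lines enables and reads the merge counters:

SET GLOBAL innodb_monitor_enable = 'index_page_merge%';
SELECT NAME, COUNT, STATUS
  FROM INFORMATION_SCHEMA.INNODB_METRICS
 WHERE NAME LIKE 'index_page_merge%';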


0.4. Page Splits

As mentioned above, a page can be filled up to 100%. When this happens, the next page takes new records.

But what if we have the following situation?


Page #10 doesn’t have enough space to accommodate the new (or updated) record. Following the next page logic, the record should go on Page #11. However:


Page #11 is also full, and data cannot be inserted out of order. So what can be done?

Remember the linked list we spoke about? At this moment Page #10 has Prev=9 and Next=11.

What InnoDB will do is (simplified):

  1. Create a new page
  2. Identify where the original page (Page #10) can be split (at the record level)
  3. Move records
  4. Redefine the page relationships


A new Page #12 is created:


Page #11 stays as it is. The thing that changes is the relationship between the pages:

  • Page #10 will have Prev=9 and Next=12
  • Page #12 Prev=10 and Next=11
  • Page #11 Prev=12 and Next=13

The path of the B-tree still sees consistency since it is following a logical organization. However, physically the page is located out of order, and in most cases in a different extent.

As a rule we can say: page splits happen on insert or update operations, and cause page dislocation (in many cases onto different extents).

InnoDB tracks the number of page splits in INFORMATION_SCHEMA.INNODB_METRICS. Look for index_page_splits and index_page_reorg_attempts/successful metrics.
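Enabled the same way as the merge counters (for example via innodb_monitor_enable = 'index_page%'), the split and reorg counters can be read with:

SELECT NAME, COUNT
  FROM INFORMATION_SCHEMA.INNODB_METRICS
 WHERE NAME LIKE 'index_page_split%'
    OR NAME LIKE 'index_page_reorg%';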

Once the split page is created, the only way to move back is to have the created page drop below the merge threshold. When that happens, InnoDB moves the data from the split page with a merge operation.

The other way is to reorganize the data with OPTIMIZE TABLE. This can be a very heavy and long process, but often it is the only way to recover from a situation where too many pages are located in sparse extents.
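For InnoDB, OPTIMIZE TABLE is internally mapped to a full table rebuild, so both of the following achieve the reorganization:

OPTIMIZE TABLE windmills.wmills;
-- equivalent rebuild via ALTER:
ALTER TABLE windmills.wmills ENGINE=InnoDB;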

Another aspect to keep in mind is that during merge and split operations, InnoDB acquires an x-latch on the index tree. On a busy system, this can easily become a source of concern, causing index latch contention. When a write touches only a single page and no merge or split is involved, InnoDB calls it an "optimistic" update, and the latch is only taken in S (shared) mode. Merges and splits are "pessimistic" updates, and take the latch in X (exclusive) mode.


0.5. My Primary Key

A good Primary Key (PK) is not only important for retrieving data, but also for correctly distributing the data inside the extents while writing (which is also relevant for split and merge operations).

In the first case, I have a simple auto-increment. In the second, my PK is based on an ID (1-200 range) plus an auto-increment value. In the third, I have the same ID (1-200 range) but associated with a UUID.
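A sketch of how the three PK definitions might look, reusing the wmills columns from above (the exact DDL used in the tests is not shown here, so treat these as illustrative assumptions):

-- Case 1: simple auto-increment
PRIMARY KEY (`id`)
-- Case 2: ID in the 1-200 range plus the auto-increment value
PRIMARY KEY (`millid`,`id`)
-- Case 3: ID in the 1-200 range plus a UUID
PRIMARY KEY (`millid`,`uuid`)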

When inserting, InnoDB must add pages. This is read as a SPLIT operation:

Page Merging and Page Splitting

The behavior is quite different depending on the kind of Primary Key I use.

The first two cases will have more “compact” data distribution. This means they will also have better space utilization, while the semi-random nature of the UUID will cause a significant “sparse” page distribution (causing a higher number of pages and related split operations).

When it comes to merges, the number of merge attempts differs even more by PK type.


On insert-update-delete operations, the auto-increment PK has fewer page merge attempts and a lower success ratio (9.45%) than the other two types. The PK with UUID (on the other side of the spectrum) has a higher number of merge attempts, but at the same time also a significantly higher success ratio (22.34%), given that the "sparse" distribution left many pages partially empty.

Similar numbers, by PK type, also come from the secondary index.

0.6. Conclusion

MySQL/InnoDB constantly performs these operations, and you have very limited visibility of them. But they can bite you, and bite hard, especially if you are using spindle storage vs. SSD (which has different issues of its own, by the way).

The sad story is that there is very little we can do to optimize this on the server side using parameters or some other magic. But the good news is that there is A LOT that can be done at design time.

Use a proper Primary Key, and design secondary indexes keeping in mind that you shouldn't abuse them. Plan proper maintenance windows on the tables that you know will have very high levels of inserts/deletes/updates.

This is an important point to keep in mind. In InnoDB you cannot have fragmented records, but you can have a nightmare at the page/extent level. Ignoring table maintenance will cause more work at the I/O level, in memory, and in the InnoDB buffer pool.

You must rebuild some tables at regular intervals. Use whatever tricks it requires, including partitioning and external tools (pt-osc). Do not let a table become gigantic and fully fragmented.

Wasting disk space? Needing to load three pages instead of one to retrieve the record set you need? Each search causing significantly more reads?
That's your fault; there is no excuse for being sloppy!

Happy MySQL to everyone!


1. Acknowledgments

Laurynas Biveinis, who had the time and patience to explain some internals to me.

Jeremy Cole, for his project innodb_ruby (which I use constantly).
