Home MySQL Blogs
MySQL

My MySQL tipsvalid-rss-rogers




ProxySQL PXC Single Writer Mode and auto failover, re-bootstrap PDF Print E-mail
Written by Marco Tusa   
Tuesday, 21 November 2017 11:21

Overview

ProxySQL had been adopted as solution for HA in place of HAProxy in Percona PXC package. writing
The new solution has a lot of advantages and provide an unbelievable flexibility we did not had before.  But when talking about HA and PXC there was still a gap.

As already discussed in my previous article “ProxySQL and Percona XtraDB Cluster (Galera) Integration”(https://www.percona.com/blog/2016/09/15/proxysql-percona-cluster-galera-integration/),
ProxySQL is working great when using multi-master approach, but trying to have it in single Node writer, was not possible unless using creative (nice way to say wrong) solutions like what I was covering in “ProxySQL and PXC using Replication HostGroup” in the same article.

In the following months, I had few emails and requests from colleagues and externals reporting me that some time that specific solution was not working or was not covering this or that case.  No matter if I had mark the related feature active_failover as experimental, apparently that in conjunction of Replication HG was deployed as solution.

What was wrong with that? Well first of all that was only a test, an experimental feature, as such I was not going to maintain it or fix bugs. But secondly and more important, it was based on a wrong concept.  ProxySQL is on-top-of MySQL as reverse proxy. The specific feature was breaking the main idea and was actually acting on the PXC nodes changing their status (READ_ONLY).  That should not happen and any action or operation should involve only the ProxySQL node(s)

 

 

 

 

Why this blog

This blog will describe what I had implemented in the script as solution (not experimental), for the case in which we need to use PXC in Single Writer Mode (SWM). I will illustrate the new feature in PXC and the additional concept of Failover Host Group (FHG).

After few months, I realized I have to change the script and either remove the experimental feature or redesign it.  But something was missed in PXC to allow me (us) to correctly enable the SWM.  Actually at the beginning I was looking to have it fully integrated in ProxySQL, and I had several discussion with Percona colleagues, and Rene’, about this.  But at the end the PXC new internal features had not be enough to guarantee ProxySQL to be able to identify and manage most dangerous cases. Let us see what had being implemented in PXC and what was still missed such that I had to put my hands on the script code again.

Good .. but not good enough  When in the need to have PXC in SWM, we need to have ONE server in the HG receiving the WRITES, and as many as we like in the HG(s) receiving the READS.

What this imply is that ProxySQL either need to perform this operation using Replication HG (abusing of the READ_ONY flag), or we need a way to:

  • Identify which node(s) can be a WRITER candidate in place of
  • Identify/assign the correct relevance and priority to each candidate
  • Have a unique view to recognize the nodes in the cluster and their segment membership Identify as much as possible any reason(s) for which a node cannot be used as writer

What we were discussing internally, was to have a solution that would allow us to see the cluster status no matter which node would answer us, and should also be able to provide us which node could become the next writer. This last should be eventually evaluated against custom settings/preferences.

PXC 5.7.19 come with a new feature pxc_cluster_view, a table that reside in performance_schema and that can be easily access simply doing :

 

(pxc_test@192.168.1.11) [(none)]>select * FROM performance_schema.pxc_cluster_view ORDER BY SEGMENT,LOCAL_INDEX;
+-----------+--------------------------------------+--------+-------------+---------+
| HOST_NAME | UUID                                 | STATUS | LOCAL_INDEX | SEGMENT |
+-----------+--------------------------------------+--------+-------------+---------+
| node1     | 05b5554f-be34-11e7-aa92-5e038abbae35 | SYNCED |           0 |       1 |
| node2     | 63870cc3-af5d-11e7-a0db-463be8426737 | SYNCED |           1 |       1 |
| node3     | 666ad5a0-af5d-11e7-9b39-2726e5de8eb1 | SYNCED |           2 |       1 |
| node6     | 7540bdca-b267-11e7-bae9-464c3d263470 | SYNCED |           3 |       2 |
| node5     | ab551483-b267-11e7-a1a1-9be1826f877f | SYNCED |           4 |       2 |
| node4     | e24ebd01-af5d-11e7-86f0-42c8ceb6886c | SYNCED |           5 |       2 |
+-----------+--------------------------------------+--------+-------------+---------+

 

 

As you can see the table is reporting us the list of ALL nodes ordered by Segment and LOCAL_INDEX. Why I had use that sort? Because this order will give me, in descending order, the priority for which PXC see the nodes.  As we know segment is assigned and is part of the customization/design we have to take care when deploying the cluster.  Local_Index, instead, is an auto calculated value against the UUID assigned internally by the cluster.  As such while this is a good indicator internally, is totally useless for us in case we want to “customize” the priority.
But this is not all, as already covered in my previously mentioned article, PXC/Galera nodes can be in a state that will prevent the proper access to them, without having ProxySQL realizing it … or PXC itself .

One very simple example is the following:

(root@localhost) [(none)]>set global wsrep_reject_queries=ALL; 
2017-11-06T19:21:14.533407Z 738828 [Note] WSREP: Rejecting client queries due to manual setting

 

(pxc_test@192.168.1.231) [(none)]>select now();select * FROM performance_schema.pxc_cluster_view ORDER BY SEGMENT,LOCAL_INDEX;
| 2017-11-06 14:22:19 |
| HOST_NAME | UUID                                 | STATUS | LOCAL_INDEX | SEGMENT |
| node1     | 05b5554f-be34-11e7-aa92-5e038abbae35 | SYNCED |           0 |       1 | ← still looks good
| node2     | 63870cc3-af5d-11e7-a0db-463be8426737 | SYNCED |           1 |       1 |
| node3     | 666ad5a0-af5d-11e7-9b39-2726e5de8eb1 | SYNCED |           2 |       1 |
| node6     | 7540bdca-b267-11e7-bae9-464c3d263470 | SYNCED |           3 |       2 |
| node5     | ab551483-b267-11e7-a1a1-9be1826f877f | SYNCED |           4 |       2 |
| node4     | e24ebd01-af5d-11e7-86f0-42c8ceb6886c | SYNCED |           5 |       2 |


No need to say that by default the Vanilla ProxySQL version will not catch that either, so Bingo! Or to be more correct BOOOM!

Your cluster is doomed, because by mistake or on purpose something not traced had stop your production Write node.
So what else/more do we need, to have not only an healty PXC cluster but also the chance to perform a nice failover when using SWM?

  • Well of course the script covering the different node state, and that my script was already doing it.
  • We need to have the chance to indicate a list of preferred node, with priority. After we can decide to go for the priority as it comes from pxc_cluster_view if for any reasons our choice will not work. Or even go for a different segment as last resource but still respecting the pxc_cluster_view.

To fix the first and second points, I had first look to what had been done in ProxySQL for group replication. Group Replication has a dedicated table with a lot of additional information:

  • writer_hostgroup
  • backup_writer_hostgroup
  • reader_hostgroup
  • offline_hostgroup
  • active
  • max_writers
  • writer_is_also_reader
  • Max_transactions_behind

This could make sense for Group replication, also if I consider it a bit redundant, but for us and PXC, that would be too much and no needed given I can resolve to have what I need in a more elegant way.

Keep in mind that when Rene’ had implement the HostGroup approach he had put the foundations of an unbelievable flexibility with elegance and consistency.
Everything in ProxySQL revolve around HG, we do not have servers (well we do as part of) we have HG, our rules point to HG, queries are directed to HG servers are incidentally mention as part of HG.
So why not take the benefit of this? Afterall I had already used the special group 9000 to manage edge cases of maintenance and offline.

The only problem was, do I have all the information I need there? Let us see the content of mysql_servers table:

  • hostgroup_id
  • hostname
  • port
  • status
  • weight
  • compression
  • max_connections
  • max_replication_lag
  • use_ssl
  • max_latency_ms
  • comment

To generate a list of candidates WRITER for PXC, I need exactly that nothing less or more.
The new mysql_servers table will looks like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
SELECT hostgroup_id,hostname,port,STATUS,weight FROM runtime_mysql_servers ORDER BY hostgroup_id,weight DESC;
+--------------+---------------+------+--------+---------+
| hostgroup_id | hostname      | port | STATUS | weight  |
+--------------+---------------+------+--------+---------+
| 50           | 192.168.1.205 | 3306 | ONLINE | 1000000 |<-- writer
| 52           | 192.168.1.21  | 3306 | ONLINE | 1000000 |
| 52           | 192.168.1.233 | 3306 | ONLINE | 10      |
| 52           | 192.168.1.205 | 3306 | ONLINE | 1       |
| 52           | 192.168.1.22  | 3306 | ONLINE | 1       |
| 52           | 192.168.1.23  | 3306 | ONLINE | 1       |
| 52           | 192.168.1.231 | 3306 | ONLINE | 1       |
| 8050         | 192.168.1.205 | 3306 | ONLINE | 1000000 |<-- Candidates with weight
| 8050         | 192.168.1.231 | 3306 | ONLINE | 10000   |<--
| 8050         | 192.168.1.22  | 3306 | ONLINE | 100000  |<--
+--------------+---------------+------+--------+---------+
 

Now that we had established we have what we need, let us test, the solution.

letusdoit

Implement and test SWM and HA with galera_check.pl
As previously discuss the new script can efficiently manage the SWM and HA, using 3 different methods of failover:

  1. Use special HG8000 to set preferences
  2. Use pxc_cluster_view inside same segment
  3. Last resource pxc_cluster_view in ANY segment

What we will test is:

  • Controlled failover for maintenance
  • Server cannot get queries
  • Server crash

Failover methods:

  • Use provided list of servers
  • Use pxc_cluster_view for same segment
  • Use pxc_cluster_view for all segment

We will have:

  • 6 nodes
  • 2 segments
  • 2 HG (with read/write split)

We will test it as we did for https://www.percona.com/blog/2016/09/15/proxysql-percona-cluster-galera-integration/ with Sysbench commands:

 

Prepare
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.11 --mysql-port=6033 --mysql-user=pxc_test --mysql-password=test --mysql-db=test_galera --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=50000 --num-threads=10 --report-interval=10  prepare
 
Write
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.11 --mysql-port=6033 --mysql-user=pxc_test --mysql-password=test --mysql-db=test_galera --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=50000 --max-requests=0 --max-time=9000 --oltp-point-selects=5 --oltp-read-only=off --oltp-dist-type=uniform --oltp-reconnect-mode=transaction --oltp-skip-trx=off --num-threads=10 --report-interval=10 --mysql-ignore-errors=all run
 
Read only
sysbench --test=/usr/share/doc/sysbench/tests/db/oltp.lua --mysql-host=192.168.1.11 --mysql-port=6033 --mysql-user=pxc_test --mysql-password=test --mysql-db=test_galera --db-driver=mysql --oltp-tables-count=50 --oltp-tablesize=50000 --max-requests=0 --max-time=9000 --oltp-point-selects=5 --oltp-read-only=on --num-threads=10 --oltp-reconnect-mode=query --oltp-skip-trx=on --report-interval=10 --mysql-ignore-errors=all run

 

Set it up:

 

UPDATE global_variables SET Variable_Value=0 WHERE Variable_name='mysql-hostgroup_manager_verbose';  

 

 

Please take a moment to note the above setting.
IF you omit ProxySQL will verbosely report host-groups changes, I suggest to put it to 0 unless you are trying to debug “something” or your logs will become enormous soon.

 

 

INSERT INTO mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) VALUES ('pxc_test','test',1,50,'mysql',1);
LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK;

 

 

 

Below the entries to populate add your servers:

 

DELETE FROM mysql_servers WHERE hostgroup_id IN (50,52);
 
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.21',50,3306,1000000,2000);
 
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.205',52,3306,1,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.21',52,3306,1000000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.231',52,3306,1,2000);
 
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.22',52,3306,1,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.23',52,3306,1,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.233',52,3306,10,2000);

 

Here the section with the list of WRITER candidates:

 

INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.205',8050,3306,1000000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.231',8050,3306,100000,2000);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.22',8050,3306,100000,2000);
 
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;

 

 

Finally query rules to have read/write split:

 

DELETE FROM mysql_query_rules WHERE rule_id IN (40,41,45,46,80,81);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES(4,6033,'pxc_test',50,1,3,'.',1);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES(40,6033,'pxc_test',50,1,3,'^SELECT.*FOR UPDATE',1);
INSERT INTO mysql_query_rules (rule_id,proxy_port,username,destination_hostgroup,active,retries,match_digest,apply) VALUES(41,6033,'pxc_test',52,1,3,'^SELECT ',1); 
 
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

 

 

 

Time to test our SWM and failover.

Clone the script from git in your preferred directory, I will put it in opt for now.

git clone https://github.com/Tusamarco/proxy_sql_tools
chmod +x  /opt/proxy_sql_tools/galera_check.pl


Add to the script to the scheduler

 

DELETE FROM scheduler WHERE id=10;
INSERT  INTO scheduler (id,active,interval_ms,filename,arg1) VALUES (10,0,2000,"/opt/proxy_sql_tools/galera_check.pl","-u=admin -p=admin -h=127.0.0.1 -H=50:W,52:R -P=6032 --execution_time=0 --retry_down=0 --retry_up=0 --main_segment=1 --debug=0  --log=/var/lib/proxysql/galeraLog --active_failover=1");
LOAD SCHEDULER TO RUNTIME;SAVE SCHEDULER TO DISK;

 

 

Make it active:

UPDATE scheduler SET active=1 WHERE id=10; LOAD scheduler TO run;

 

If you had left the --execution_time=1 the script will constantly print the time it takes to execute in the log:

2017/11/10 09:03:44.439:[INFO] END EXECUTION Total Time(ms):391

I suggest you to initially keep it on so you will see what is actually happening.

 

Time to perform the first test.

Controlled failover for maintenance

redflagWhen you want to do a CONTROLLED failover, the best way to do with minimal impact is to manually add the second node with very low weight, and after remove the node you need to work on. I will show you how to do this safely.

Add the new new write node to the writer HG:

 

INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.205',50,3306,1,2000);
LOAD mysql servers TO run;

You will have something like this:

 

+---------+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| weight  | hostgroup | srv_host      | srv_port | STATUS | ConnUsed | ConnFree | ConnOK | ConnERR | Queries | Bytes_data_sent | Bytes_data_recv | Latency_us |
+---------+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+
| 1000000 | 50        | 192.168.1.21  | 3306     | ONLINE | 10       | 0        | 35863  | 0       | 1137903 | 75147779        | 1527773047     | 6794       |
| 1       | 50        | 192.168.1.205 | 3306     | ONLINE | 0        | 0        | 0      | 0       | 0       | 0               | 0              | 8236       |
| 1000    | 52        | 192.168.1.233 | 3306     | ONLINE | 8        | 1        | 33153  | 0       | 1747722 | 82574272        | 3919114554     | 9304       |
| 1000    | 52        | 192.168.1.231 | 3306     | ONLINE | 3        | 6        | 15206  | 0       | 1469312 | 69597188        | 3340444198     | 2989       |
| 1000    | 52        | 192.168.1.23  | 3306     | ONLINE | 6        | 3        | 36690  | 4       | 2046599 | 96797464        | 4606494587     | 10261      |
| 1000    | 52        | 192.168.1.22  | 3306     | ONLINE | 4        | 5        | 36778  | 262     | 2152217 | 101807337       | 4846638759      | 2108       |
| 10      | 52        | 192.168.1.21  | 3306     | ONLINE | 0        | 1        | 302    | 0       | 21960   | 1039596         | 49257671       | 6794       |
| 1000    | 52        | 192.168.1.205 | 3306     | ONLINE | 3        | 6        | 31355  | 0       | 2180310 | 103198644       | 4917528180      | 8236       |
+---------+-----------+---------------+----------+--------+----------+----------+--------+---------+---------+-----------------+-----------------+------------+

 

 

 

With the added host handling very few connections because the weight.

At this point you just need to connect to the node you want to put in maintenance and set it desync:

SET global wsrep_desync=1;

 

 

Once desync the script will put the server in OFFLINE_SOFT for you, and the server will allow the running connection to complete while the new ones will go to the node we had just insert.
No errors no service interruption at all:

 

2017/11/10 09:31:41.967:[WARN]  Move node:192.168.1.21;3306;50;3010 SQL: UPDATE mysql_servers SET status='OFFLINE_SOFT' WHERE hostgroup_id=50 AND hostname='192.168.1.21' AND port='3306'
2017/11/10 09:31:41.978:[WARN]  Move node:192.168.1.21;3306;52;3010 SQL: UPDATE mysql_servers SET status='OFFLINE_SOFT' WHERE hostgroup_id=52 AND hostname='192.168.1.21' AND port='3306'

 

 

Once maintenance is done, just reverse the wsrep_desync and the script will put the node back to the pool.
redflagIf you need to stop the mysql server on that node and you do not want it automatically back, remember to remove the entry from the host group 50 (the Writer one) or it will be automatically placed back in production as active.

 

DELETE FROM mysql_servers WHERE hostname=”192.168.1.21” AND port=3306 AND hostgroup_id=50;
LOAD mysql servers TO run;

Remember at the and to remove the temporary node from the Writer HG.

 

Real failover

Now a real fail over and we will test it using wsrep_rejectqueries to trigger the issue, but that can be any other issue.

In this test, given there is no other node accepting the writes in the HG, we should expect some connections error, but what we want is to have the failover in very short time.
Keep in mind we have define these as possible failover candidates:

 

+--------------+---------------+------+--------+---------+
| hostgroup_id | hostname      | port | STATUS | weight  |
+--------------+---------------+------+--------+---------+
| 8050         | 192.168.1.205 | 3306 | ONLINE | 1000000 |
| 8050         | 192.168.1.22  | 3306 | ONLINE | 100000  |
| 8050         | 192.168.1.231 | 3306 | ONLINE | 100000  |
+--------------+---------------+------+--------+---------+
 
While the pxc_cluster_view sees the following:
 
+-----------+--------------------------------------+--------+-------------+---------+
| HOST_NAME | UUID                                 | STATUS | LOCAL_INDEX | SEGMENT |
+-----------+--------------------------------------+--------+-------------+---------+
| node1     | 05b5554f-be34-11e7-aa92-5e038abbae35 | SYNCED |           0 |       1 |
| node2     | 63870cc3-af5d-11e7-a0db-463be8426737 | SYNCED |           2 |       1 |
| node3     | 666ad5a0-af5d-11e7-9b39-2726e5de8eb1 | SYNCED |           3 |       1 |
| node5     | 07f55a00-c57a-11e7-bd7d-8a568ca96345 | SYNCED |           1 |       2 |
| node4     | 6c185b87-c57a-11e7-b2a6-8f095c523cd3 | SYNCED |           4 |       2 |
| node6     | 7540bdca-b267-11e7-bae9-464c3d263470 | SYNCED |           5 |       2 |
+-----------+--------------------------------------+--------+-------------+---------+
 
FOR better understanding also FOR the following tests this IS the mapping, node name <> IP:
+-----------+---------------+
| HOST_NAME | IP            |
+-----------+---------------+
| node1     | 192.168.1.205 | 
| node2     | 192.168.1.21  | 
| node3     | 192.168.1.231 | 
| node5     | 192.168.1.23  | 
| node4     | 192.168.1.22  | 
| node6     | 192.168.1.233 | 
+-----------+---------------+

 

My Servers layout is:

 
+--------------+---------------+------+--------+---------+
| hostgroup_id | hostname      | port | STATUS | weight  |
+--------------+---------------+------+--------+---------+
| 50           | 192.168.1.205 | 3306 | ONLINE | 1000000 | ← writer
| 52           | 192.168.1.205 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.22  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.23  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.231 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.233 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.21  | 3306 | ONLINE | 10      |
| 8050         | 192.168.1.205 | 3306 | ONLINE | 1000000 |
| 8050         | 192.168.1.22  | 3306 | ONLINE | 100000  |
| 8050         | 192.168.1.231 | 3306 | ONLINE | 100000  |
+--------------+---------------+------+--------+---------+

 

The first candidate is

192.168.1.22

which is node4 and stay in a DIFFERENT segment than the current writer. As such the first election should go to the other node

192.168.1.231

which is Node3 and reside in the same segment.
Let us see what happens with the active-failover=1:
Do:

set global wsrep_reject_queries=all;

On the current writer.

The script will identify the Writer cannot accept queries and will take action:

2017/11/13 10:51:52.856:[WARN]  Move node:192.168.1.205;3306;50;3001 SQL: UPDATE mysql_servers SET hostgroup_id=9050 WHERE hostgroup_id=50 AND hostname='192.168.1.205' AND port='3306'
2017/11/13 10:51:52.861:[WARN]  Move node:192.168.1.205;3306;52;3001 SQL: UPDATE mysql_servers SET hostgroup_id=9052 WHERE hostgroup_id=52 AND hostname='192.168.1.205' AND port='3306'
2017/11/13 10:51:55.038:[WARN] Fail-over in action Using Method = 1
2017/11/13 10:51:55.038:[INFO] Special Backup - Group found! I am electing a node to writer following the indications
 This Node Try to become the new WRITER for HG 50 Server details: 192.168.1.231:3306:HG8050
2017/11/13 10:51:55.038:[INFO] This Node Try to become a WRITER promoting to HG 50192.168.1.231:3306:HG8050
2017/11/13 10:51:55.083:[WARN]  Move node:192.168.1.231:33061000002000 SQL:INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.231',50,3306,100000,2000);

 

It will also analyze the failover candidates and it will promote one node if present. Also as expected it will choose the one in the same segment.

What happen at the applications?

Writes
[ 2010s ] thds: 10 tps: 33.40 qps: 503.33 (r/w/o: 302.92/127.31/73.10) lat (ms,95%): 434.83 err/s: 0.00 reconn/s: 0.00
[ 2020s ] thds: 10 tps: 38.30 qps: 575.27 (r/w/o: 344.68/147.59/82.99) lat (ms,95%): 376.49 err/s: 0.00 reconn/s: 0.00
[ 2030s ] thds: 10 tps: 35.80 qps: 537.20 (r/w/o: 322.40/137.90/76.90) lat (ms,95%): 419.45 err/s: 0.00 reconn/s: 0.00
[ 2040s ] thds: 10 tps: 32.70 qps: 489.34 (r/w/o: 293.53/125.71/70.11) lat (ms,95%): 427.07 err/s: 0.00 reconn/s: 0.00
[ 2050s ] thds: 10 tps: 30.60 qps: 456.43 (r/w/o: 273.62/117.61/65.20) lat (ms,95%): 493.24 err/s: 0.00 reconn/s: 0.00
[ 2060s ] thds: 10 tps: 19.20 qps: 298.36 (r/w/o: 181.18/74.99/42.19) lat (ms,95%): 657.93 err/s: 0.00 reconn/s: 30.50 <-- here 
[ 2070s ] thds: 10 tps: 23.40 qps: 345.80 (r/w/o: 206.30/89.10/50.40) lat (ms,95%): 634.66 err/s: 0.00 reconn/s: 0.00
[ 2080s ] thds: 10 tps: 27.20 qps: 414.93 (r/w/o: 249.72/106.71/58.50) lat (ms,95%): 612.21 err/s: 0.00 reconn/s: 0.00
 
Reads
[ 2000s ] thds: 30 tps: 256.50 qps: 2309.91 (r/w/o: 2309.91/0.00/0.00) lat (ms,95%): 189.93 err/s: 0.00 reconn/s: 0.00
[ 2010s ] thds: 30 tps: 250.27 qps: 2254.26 (r/w/o: 2254.26/0.00/0.00) lat (ms,95%): 200.47 err/s: 0.00 reconn/s: 0.00
[ 2020s ] thds: 30 tps: 256.29 qps: 2305.30 (r/w/o: 2305.30/0.00/0.00) lat (ms,95%): 186.54 err/s: 0.00 reconn/s: 0.00
[ 2030s ] thds: 30 tps: 241.65 qps: 2176.28 (r/w/o: 2176.28/0.00/0.00) lat (ms,95%): 193.38 err/s: 0.00 reconn/s: 0.00
[ 2040s ] thds: 30 tps: 217.88 qps: 1956.96 (r/w/o: 1956.96/0.00/0.00) lat (ms,95%): 215.44 err/s: 0.00 reconn/s: 0.00
[ 2050s ] thds: 30 tps: 277.70 qps: 2504.09 (r/w/o: 2504.09/0.00/0.00) lat (ms,95%): 196.89 err/s: 0.00 reconn/s: 0.00
[ 2060s ] thds: 30 tps: 207.37 qps: 1863.52 (r/w/o: 1863.52/0.00/0.00) lat (ms,95%): 240.02 err/s: 0.00 reconn/s: 0.00
[ 2070s ] thds: 30 tps: 199.44 qps: 1795.70 (r/w/o: 1795.70/0.00/0.00) lat (ms,95%): 244.38 err/s: 0.00 reconn/s: 0.00
 

The moment of the server lock the application had a hiccup for the write as expected, but the time was very limited and the difference in WRITE transactions was only of 75-89/sec  against the normally processed ~120/sec, as such only ~50 missed during the incident.Reads 0 loss.

 

What if there is no other server in same segment?

My new servers layout is:

 

+--------------+---------------+------+--------+---------+
| hostgroup_id | hostname      | port | STATUS | weight  |
+--------------+---------------+------+--------+---------+
| 50           | 192.168.1.231 | 3306 | ONLINE | 100000  | <--- new 
| 52           | 192.168.1.22  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.23  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.231 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.233 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.21  | 3306 | ONLINE | 10      |
| 8050         | 192.168.1.205 | 3306 | ONLINE | 1000000 |
| 8050         | 192.168.1.22  | 3306 | ONLINE | 100000  |
| 8050         | 192.168.1.231 | 3306 | ONLINE | 100000  |
| 9052         | 192.168.1.205 | 3306 | ONLINE | 1000    | <--- set in special group
+--------------+---------------+------+--------+---------+

 

 

Let us put down the writer again. 

At this point the only available node is the one on the other segment. 
As expected the script will identify the failed node and will take action:

 

2017/11/13 12:15:18.535:[WARN]  Move node:192.168.1.231;3306;50;3001 SQL: UPDATE mysql_servers SET hostgroup_id=9050 WHERE hostgroup_id=50 AND hostname='192.168.1.231' AND port='3306'
2017/11/13 12:15:18.543:[WARN]  Move node:192.168.1.231;3306;52;3001 SQL: UPDATE mysql_servers SET hostgroup_id=9052 WHERE hostgroup_id=52 AND hostname='192.168.1.231' AND port='3306'
2017/11/13 12:15:20.646:[WARN] Fail-over in action Using Method = 1
2017/11/13 12:15:20.646:[INFO] Special Backup - Group found! I am electing a node to writer following the indications
 This Node Try to become the new WRITER for HG 50 Server details: 192.168.1.22:3306:HG8050
2017/11/13 12:15:20.646:[INFO] This Node Try to become a WRITER promoting to HG 50192.168.1.22:3306:HG8050
2017/11/13 12:15:20.660:[WARN]  Move node:192.168.1.22:33061000002000 SQL:INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.22',50,3306,100000,2000);

 

This because when using automatic_failover=1 I assume YOU KNOW what you are doing in using a failover group. As such while I still use the segments for internal priority, I am not filtering out the node(s) in a different segment, as if using automatic_failover=2.

Same minimal impact on TPS:

 

Writes
[ 7040s ] thds: 10 tps: 26.39 qps: 395.02 (r/w/o: 236.85/101.48/56.69) lat (ms,95%): 569.67 err/s: 0.00 reconn/s: 0.00
[ 7050s ] thds: 10 tps: 26.81 qps: 405.20 (r/w/o: 243.16/104.73/57.31) lat (ms,95%): 623.33 err/s: 0.00 reconn/s: 0.00
[ 7060s ] thds: 10 tps: 18.80 qps: 276.65 (r/w/o: 167.27/68.29/41.09) lat (ms,95%): 634.66 err/s: 0.00 reconn/s: 23.70
[ 7070s ] thds: 10 tps: 18.90 qps: 292.87 (r/w/o: 177.78/73.99/41.10) lat (ms,95%): 5813.24 err/s: 0.00 reconn/s: 2.80
[ 7080s ] thds: 10 tps: 24.00 qps: 356.84 (r/w/o: 213.42/91.71/51.71) lat (ms,95%): 623.33 err/s: 0.00 reconn/s: 0.00
[ 7090s ] thds: 10 tps: 20.60 qps: 312.27 (r/w/o: 187.58/79.69/45.00) lat (ms,95%): 759.88 err/s: 0.00 reconn/s: 0.00
[ 7100s ] thds: 10 tps: 19.80 qps: 299.33 (r/w/o: 179.02/78.41/41.90) lat (ms,95%): 787.74 err/s: 0.00 reconn/s: 0.00
[ 7110s ] thds: 10 tps: 26.70 qps: 398.33 (r/w/o: 239.12/101.51/57.70) lat (ms,95%): 669.89 err/s: 0.00 reconn/s: 0.00</pre>
 
Reads
[ 7040s ] thds: 30 tps: 207.90 qps: 1870.43 (r/w/o: 1870.43/0.00/0.00) lat (ms,95%): 227.40 err/s: 0.00 reconn/s: 0.00
[ 7050s ] thds: 30 tps: 198.88 qps: 1790.44 (r/w/o: 1790.44/0.00/0.00) lat (ms,95%): 240.02 err/s: 0.00 reconn/s: 0.00
[ 7060s ] thds: 30 tps: 165.52 qps: 1488.05 (r/w/o: 1488.05/0.00/0.00) lat (ms,95%): 282.25 err/s: 0.00 reconn/s: 0.00
[ 7070s ] thds: 30 tps: 186.87 qps: 1681.46 (r/w/o: 1681.46/0.00/0.00) lat (ms,95%): 267.41 err/s: 0.00 reconn/s: 0.00
[ 7080s ] thds: 30 tps: 178.24 qps: 1607.05 (r/w/o: 1607.05/0.00/0.00) lat (ms,95%): 267.41 err/s: 0.00 reconn/s: 0.00
[ 7090s ] thds: 30 tps: 167.67 qps: 1506.13 (r/w/o: 1506.13/0.00/0.00) lat (ms,95%): 282.25 err/s: 0.00 reconn/s: 0.00
[ 7100s ] thds: 30 tps: 170.50 qps: 1535.70 (r/w/o: 1535.70/0.00/0.00) lat (ms,95%): 287.38 err/s: 0.00 reconn/s: 0.00
[ 7110s ] thds: 30 tps: 175.52 qps: 1580.38 (r/w/o: 1580.38/0.00/0.00) lat (ms,95%): 267.41 err/s: 0.00 reconn/s: 0.00
 
Servers layout
+--------------+---------------+------+--------+---------+
| hostgroup_id | hostname      | port | STATUS | weight  |
+--------------+---------------+------+--------+---------+
| 50           | 192.168.1.22  | 3306 | ONLINE | 100000  |<-- new writer
| 52           | 192.168.1.22  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.23  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.233 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.21  | 3306 | ONLINE | 10      |
| 8050         | 192.168.1.205 | 3306 | ONLINE | 1000000 |
| 8050         | 192.168.1.22  | 3306 | ONLINE | 100000  |
| 8050         | 192.168.1.231 | 3306 | ONLINE | 100000  |
| 9052         | 192.168.1.205 | 3306 | ONLINE | 1000    |<-- node in special hg
| 9052         | 192.168.1.231 | 3306 | ONLINE | 1000    |<-- “
+--------------+---------------+------+--------+---------+
 

 

At this point as security feature also if you put back the failed nodes, the script WILL NOT fail back. But I am working on this feature request to implement th eoption to eventually perform fail-back.
As for today if you do:

Set global wsrep_reject_queries=none 

On both servers, and you will have:

 

+--------------+---------------+------+--------+---------+
| hostgroup_id | hostname      | port | STATUS | weight  |
+--------------+---------------+------+--------+---------+
| 50           | 192.168.1.22  | 3306 | ONLINE | 100000  |
| 52           | 192.168.1.205 | 3306 | ONLINE | 1000    |<-- back but READ
| 52           | 192.168.1.22  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.23  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.231 | 3306 | ONLINE | 1000    |<-- back but READ 
| 52           | 192.168.1.233 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.21  | 3306 | ONLINE | 10      |
| 8050         | 192.168.1.205 | 3306 | ONLINE | 1000000 |
| 8050         | 192.168.1.22  | 3306 | ONLINE | 100000  |
| 8050         | 192.168.1.231 | 3306 | ONLINE | 100000  |
+--------------+---------------+------+--------+---------+

 

 

To fail back you need to manually trigger it.
Connect to the current writer and set it to wsrep_reject_queries = ALL
The script will move back as Writer the node with Higher weight in the 8000 HG

 

>+--------------+---------------+------+--------+---------+
| hostgroup_id | hostname      | port | STATUS | weight  |
+--------------+---------------+------+--------+---------+
| 50           | 192.168.1.205 | 3306 | ONLINE | 1000000 |<-- back as writer
| 52           | 192.168.1.205 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.23  | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.231 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.233 | 3306 | ONLINE | 1000    |
| 52           | 192.168.1.21  | 3306 | ONLINE | 10      |
| 8050         | 192.168.1.205 | 3306 | ONLINE | 1000000 |
| 8050         | 192.168.1.22  | 3306 | ONLINE | 100000  |
| 8050         | 192.168.1.231 | 3306 | ONLINE | 100000  |
| 9052         | 192.168.1.22  | 3306 | ONLINE | 1000    |<-- set as special group
+--------------+---------------+------+--------+---------+

 

Server crash

In case of server crash the behaviour will be exactly the same with the difference that the node will be elected immediately, no intermediate step for the HG shift.
Killing the mysqld with kill -9 (SIGKILL) will result in the following:

 

writes
[ 240s ] thds: 10 tps: 42.80 qps: 642.78 (r/w/o: 385.99/166.00/90.80) lat (ms,95%): 331.91 err/s: 0.00 reconn/s: 0.00
[ 250s ] thds: 10 tps: 40.80 qps: 603.37 (r/w/o: 361.74/155.92/85.71) lat (ms,95%): 363.18 err/s: 0.00 reconn/s: 0.00
[ 260s ] thds: 10 tps: 39.60 qps: 600.79 (r/w/o: 361.20/156.20/83.40) lat (ms,95%): 376.49 err/s: 0.00 reconn/s: 0.00
[ 270s ] thds: 10 tps: 35.00 qps: 541.33 (r/w/o: 324.82/141.91/74.60) lat (ms,95%): 356.70 err/s: 0.00 reconn/s: 1.00<--
[ 280s ] thds: 10 tps: 14.80 qps: 216.47 (r/w/o: 130.78/54.89/30.80) lat (ms,95%): 7615.89 err/s: 0.00 reconn/s: 0.00
[ 290s ] thds: 10 tps: 23.10 qps: 347.64 (r/w/o: 208.42/91.21/48.00) lat (ms,95%): 682.06 err/s: 0.00 reconn/s: 0.00</pre><br>
 
reads
[ 230s ] thds: 30 tps: 253.27 qps: 2281.51 (r/w/o: 2281.51/0.00/0.00) lat (ms,95%): 193.38 err/s: 0.00 reconn/s: 0.00
[ 240s ] thds: 30 tps: 229.80 qps: 2066.20 (r/w/o: 2066.20/0.00/0.00) lat (ms,95%): 215.44 err/s: 0.00 reconn/s: 0.00
[ 250s ] thds: 30 tps: 217.51 qps: 1957.59 (r/w/o: 1957.59/0.00/0.00) lat (ms,95%): 240.02 err/s: 0.00 reconn/s: 0.00
[ 260s ] thds: 30 tps: 207.89 qps: 1872.62 (r/w/o: 1872.62/0.00/0.00) lat (ms,95%): 240.02 err/s: 0.00 reconn/s: 0.00
[ 270s ] thds: 30 tps: 200.18 qps: 1802.03 (r/w/o: 1802.03/0.00/0.00) lat (ms,95%): 253.35 err/s: 0.00 reconn/s: 0.00<--
[ 280s ] thds: 30 tps: 199.33 qps: 1791.80 (r/w/o: 1791.80/0.00/0.00) lat (ms,95%): 240.02 err/s: 0.00 reconn/s: 0.00
[ 290s ] thds: 30 tps: 187.49 qps: 1689.78 (r/w/o: 1689.78/0.00/0.00) lat (ms,95%): 257.95 err/s: 0.00 reconn/s: 0.00
 
Script Log:
2017/11/16 10:02:10.923:[ERROR] Cannot connect TO DBI:mysql:host=192.168.1.205;port=3306 AS monitor
2017/11/16 10:02:11.179:[WARN] Fail-over IN action USING Method = 1
2017/11/16 10:02:11.179:[INFO] Special Backup - GROUP found! I am electing a node TO writer following the indications
 This Node Try TO become the new WRITER FOR HG 50 Server details: 192.168.1.231:3306:HG8050
2017/11/16 10:02:11.179:[INFO] This Node Try TO become a WRITER promoting TO HG 50 192.168.1.231:3306:HG 8050
2017/11/16 10:02:11.234:[WARN]  Move node:192.168.1.231:33061000002000 SQL:INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,max_connections) VALUES ('192.168.1.231',50,3306,100000,2000);
2017/11/16 10:02:13.190:[ERROR] Cannot connect TO DBI:mysql:host=192.168.1.205;port=3306 AS monitor
 
Server layout:
(admin@127.0.0.1) [main]>select hostgroup_id,hostname,port,STATUS,weight FROM runtime_mysql_servers ORDER BY hostgroup_id,weight DESC;
+--------------+---------------+------+---------+---------+
| hostgroup_id | hostname      | port | STATUS  | weight  |
+--------------+---------------+------+---------+---------+
| 50           | 192.168.1.231 | 3306 | ONLINE  | 100000  |<-- new writer
| 52           | 192.168.1.205 | 3306 | SHUNNED | 1000    |<--shunned server
| 52           | 192.168.1.22  | 3306 | ONLINE  | 1000    |
| 52           | 192.168.1.23  | 3306 | ONLINE  | 1000    |
| 52           | 192.168.1.231 | 3306 | ONLINE  | 1000    |
| 52           | 192.168.1.233 | 3306 | ONLINE  | 1000    |
| 52           | 192.168.1.21  | 3306 | ONLINE  | 10      |
| 8050         | 192.168.1.205 | 3306 | SHUNNED | 1000000 |
| 8050         | 192.168.1.22  | 3306 | ONLINE  | 100000  |
| 8050         | 192.168.1.231 | 3306 | ONLINE  | 100000  |
+--------------+---------------+------+---------+---------+

 

Using Auto_failover=2|3

This is going to be exactly the same behaviour not matter what is the failover method choose.
The most flexible and controlled one is to use the 8000 HG with the identification of the list of servers prioritize by weight.

The use of pxc_cluster_view, will give you an pseudo automatic way to set the same priority.
But choosing that method will NOT guarantee you that the node you want will take the traffic given is based the internal Galera decision.

Conclusions

While the galera_check script was initially conceptualize to work with multiple writers, the inclusion of pxc_cluste_view and the addition of the special Host Group 8000, had provide the needed base for me to develop a sustainable feature to support single writer and failover.

I have to remember you that the preferred solution is and remain to use multiple writers with different weights, given that solution provide internal failover in Proxysql.
The use of active_failover can significantly reduce the amount of service interruption caused by a possible single writer node crash, still it is an external script and is not a effective as internal balancing base on weight.

By design the script at the moment is NOT supporting the failback, but I already have a feature request on this.
Once implemented this feature will allow a node that had being removed from the active pool, to recover his state once in a healthy state again.
Another very important aspect is the way ProxySQL will manage the configuration cross node using the cluster option.

This will be described in the next article.

More blogs (shorter) will come on how to use the script and ProxySQL Cluster.

Reference

https://github.com/Tusamarco/proxy_sql_tools
https://www.percona.com/blog/2016/09/15/proxysql-percona-cluster-galera-integration/

Last Updated on Tuesday, 21 November 2017 12:48
 
How ProxySQL deal with schema (and schemaname) PDF Print E-mail
Written by Marco Tusa   
Tuesday, 22 August 2017 00:00

I think that very often we are so focus in analyzing internals, or specific behaviours/situations/anomalies that we tend to forget the simple things.

It happened to me that last week a couple of customers raise the same question:  "How ProxySQL manage the default schema, or the ones declared inside a FROM/JOIN?"
I was a bit surprise because I was given that for granted, and my first thought was, 'well read the documentation', but then I realize we do not have a clear section in the documentation about this.

Given that and also because I realize I had not done a full and extensive test on how the SCHEMA is actually managed.
I decide to do a simple set of tests and write down few lines.


This blog is to answer that very simple question:"How ProxySQL manage the default schema, or the ones declared inside a FROM/JOIN?"
The blog is split in two parts, part 1 simple declaration and summary of what happen. Part 2 all the details and tests, in case you want to see them.

 

Schemaname and ProxySQL


In ProxySQL we can specify the schema in two different places and with different scope.

  1. In the mysql_user table as property of the USER, where it will represent the DEFAULT schema.
  2. In the mysql_query_rules as filter for which a query can be capture. The filter is valid only for the explicitly set default SCHEMA, (ie with -D mysql command line or USE). ProxySQL will NOT analyze the FROM SQL clausole. Given the limitation in the above point 2, it is not 100% safe to trust the SCHEMANAME as filter unless you are 200% sure the code do not contains commands to change default schema like USE.

On the other hand if I want to filter by a schemaname (in general) it is safer and more flexible to use regular expression and query_rules, as I will show later.

 

For the scope of this article I want to answer these simple sub-questions:

  1. How ProxySQL differes from MySQL in managing explicit default schema declaration? does it respect the -D or USE
  2. How proxy respect/follow security agains schema
  3. How schemaname filter acts in the query rules?
  4. How can I transparently redirect using schema name?

To test the above I have created:

two servers:

Master

Slave

two schemas:

world

City

Country

myworld

CityM

CountryM

three users:

uallworld, can access all the schemas (including test)

uworld, can access world in write/read on Master, read on slave. Can access myworld in read on slave.

umyworld, can access myworld in write/read on Master, read on slave. Can access world in read on slave.

Queries used during the tests:

1
2
3
4
5
6
7
8
9
10
11
    select database();
    update world.City set Population=10500000 where ID=1024;
    update world.Country set LifeExpectancy=62.5 where Code='IND';
    update myworld.CityM set Population=10500001 where ID=1024;
    update myworld.CountryM set LifeExpectancy=0 where Code='IND';
    Select * from world.City order by Population DESC limit 5 ;
    Select * from myworld.CityM order by Population DESC limit 5 ;
    Select City.*, Country.name, Country.LifeExpectancy from world.City as City 
        join world.Country as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
    Select City.*, Country.name, Country.LifeExpectancy from myworld.CityM as City 
        join myworld.CountryM as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
    Select City.*, Country.name, Country.LifeExpectancy from world.City as City 
        join myworld.CountryM as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
 

 

To setup the environment see instructions at Annex 1.

Short story

 

  1. How ProxySQL differes from MySQL in managing explicit default schema declaration? Does it respect the -D or USE?
    MySQL and ProxySQL will behave the same when passing the default schema, setting it as default.
    MySQL
    mysql -uuallworld -ptest -h192.168.1.107 -P 3306 -D test

    ProxySQL

    mysql -uuallworld -ptest -h127.0.0.1 -P 6033 -D test

    If a default schema is set in ProxySQL the schema coming from command line or connection (like in java:"connUrl=jdbc:mysql://192.168.1.50:6033/test"), will override the ProxySQL default.
    In case a default schema is not pass during the connection MySQL and ProxySQL will differs on how the behave:
           MySQL will set the current schema to NULL. It is to be noted that MySQL accept a NULL schema when u connect but then once SET it with USE you cannot set it back to NULL.
           ProxySQL will set it as the one declared default in the mysql_user table. If no schema is declared as default, ProxySQL will elect information_schema as the default. In short ProxySQL cannot have a default schema set to NULL.

     

  2. How proxy respect/follow security agains schema
    MySQL
    mysql -uuworld -ptest -h192.168.1.107 -P 3306 -D test ERROR 1044 (42000):
    Access denied for user 'uworld'@'%' to database 'test'

    ProxySQL

    mysql -uuworld -ptest -h127.0.0.1 -P 6033 -D test
    Welcome to the MySQL monitor. Commands end with ; or \g.
    Your MySQL connection id is 483902 Server version: 5.5.30 (ProxySQL)

    On connection the behaviour is different between MySQL and ProxySQL.
    Why? Because while you directly connect to MySQL, when you connect to ProxySQL you do not actually touch the final server.

    ProxySQL is NOT a forward proxy, but a reverse proxy, so its scope is act as an intermediary for its associated servers to be contacted by any client. Backend connection that will serve your client is establish at the monent you actually send a query, which will be comunicated to the relevent host group.


    As such when you connect, you only open a connection to the ProxySQL. While issue a query will USE a connection to the backend and if the user do not have the right grants an error will be returned.
    But ProxySQL will not known until you submit the query and it can decide where this query should go (which HG to point to).
    mysql> select database(); ERROR 1044 (42000): Access denied for user 'uworld'@'%' to database 'test'

    Aside from this all the GRANTS defined in MySQL are transparent and followed by ProxySQL

  3. How schemaname filter acts in the query rules?

    In MySQL we can easily change the default schema with USE , this action is fully supported by ProxySQL.
    But it may have some side effects when using "schemaname" as filter in the query_rules.

    If you define a rule that include the default schemaname and the default schema is changed with USE, the rule will not apply, and unpredictable results may happen.
    To prevent that ProxySQL has another option in mysql_user "schema_locked" which will prevent the schema to be changed by USE.

    This feature is present but not fully implemented, and after a brief conversation with Rene (https://github.com/sysown/proxysql/issues/1133), I hope it will be soon.

    Given that, when designing Query rules using the Default schema, you must to take in consideration the possibility to have the application or user changing the default schema and invalidating that rule.


  4. how can I transparently redirect using schema name?

    This is not a Schema feature, more one of the things that in ProxySQL are quite easy to set, while close to be impossible if done in plain MySQL.
    When connecting directly with MySQL there is no option for you to "play" with GRANTS and schema such that you will transparently allow a user to do an action on a schema/server pair
    and another on a different schema/server pair.
    When using ProxySQL to filter by schemaname is quite trivial.
    For instance assuming we have 3 users one is admin of the platform which include 2 separate schemas (or more), each user can access one schema for write (but that can be table as well),
    and a final slave with reporting information, where all the users that needs to read from other schema except their own can read cross schemas. While all the select not cross schema mus still got to the Master;
    This is not so uncommon, actually with few variant is exactly what one of the customer I spoke last week needs to do.
    Now in MySQL this is impossible while in ProxySQL is just a matter of 3 rules:
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply) values(10,'uworld',10,1,3,'^SELECT.*FOR UPDATE',1);
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(11,'uworld',10,1,3,'^SELECT ',0,50,0);
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(12,'uworld',11,1,3,'\smyworld.',1,50,50);
    Simply applying the rules above will allow the application to transparently access the data from different servers without errors,
    following the GRANTS given on the MySQL side. This for ONE user (uworld), but it can eventually extended to any, and the rule chain is minimal just 2 rules, so minimal overhead also with high traffic.



Summarizing

ProxySQL is following the MySQL model to access the schema, in most parts. There are a couple of differences though.

     ProxySQL will require to set a default schema when connecting, implicitly or explicitly.

     ProxySQL will not return an error at connection time, if a user is not authorized to connect to the given schema. Error will raise at the first query, moment when ProxySQL will actually establish the connection.

 

Finally using ProxySQL, will allow administrator to play with GRANTS and HG/servers to provide transparent access to data in a more granular way, choosing an HG where user may have read (or other specific) access, against one where user is not allow at all .

 

Long Cool Story ... click me

 

 

Annex 1

Create environment for test

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
create schema myworld;
create table myworld.CityM like City;
create table myworld.CountryM like Country;
create table myworld.CountryLanguageM like CountryLanguage;
insert into  myworld.CityM select * from City;
insert into  myworld.CountryM select * from Country;
insert into  myworld.CountryLanguageM select * from CountryLanguage;
 
delete from mysql_users where username like '%world';
insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('uworld','test',1,10,'world',1);
insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('umyworld','test',1,10,'myworld',1);
insert into mysql_users (username,password,active,default_hostgroup,default_schema,transaction_persistent) values ('uallworld','test',1,10,'test',1);
 
LOAD MYSQL USERS TO RUNTIME;SAVE MYSQL USERS TO DISK;
 
delete from mysql_servers where hostgroup_id in (10,11,20,21);
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,comment) VALUES ('192.168.1.107',10,3306,100,'master');
INSERT INTO mysql_servers (hostname,hostgroup_id,port,weight,comment) VALUES ('192.168.1.109',11,3307,100,'slave');
 
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
 
delete from mysql_replication_hostgroups;
INSERT INTO mysql_replication_hostgroups VALUES (10,11,'world-myworld replication hgroup');
LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
 
DO NOT RUN AT THE beginning this is for test 3 !!
delete from mysql_query_rules where rule_id in (10,11,12,13,14,15);
insert into mysql_query_rules (rule_id,username,schemaname,destination_hostgroup,active,retries,match_digest,apply) 
  values(10,'uworld','world',10,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,username,schemaname,destination_hostgroup,active,retries,match_digest,apply) 
  values(11,'uworld','world',11,1,3,'^SELECT ',1);
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;
 
DO NOT RUN AT THE beginning this is for test 4 !!
delete from mysql_query_rules where rule_id in (10,11,12,13,14,15);
Let see what we need and how to do it:
1) user(s) uworld & umyworld need to go to their default schema on Master for Writes.
2) user(s) uworld & umyworld should go to their default schema on  master for direct reads
3) user(s) uworld & umyworld should go to the slave for reads when the other schema is used
 
To do this we will need the following rules:
insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply) values(10,'uworld',10,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(11,'uworld',10,1,3,'^SELECT ',0,50,0); 
insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(12,'uworld',11,1,3,'myworld.',1,50,50);
 
 
insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply) values(13,'umyworld',10,1,3,'^SELECT.*FOR UPDATE',1);
insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(14,'umyworld',10,1,3,'^SELECT ',0,50,0); 
insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(15,'umyworld',11,1,3,'\sworld.',1,50,50);
 
LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;
 
GRANTS
-------
grant all on *.* to uallworld@'%' identified by 'test';
grant all on world.* to uworld@'%' identified by 'test';     
grant all on myworld.* to umyworld@'%' identified by 'test';   
 
on the slave
REVOKE ALL ON *.*  FROM 'uworld'@'%';
REVOKE ALL ON *.*  FROM 'umyworld'@'%';
grant select on myworld.* to uworld@'%' identified by 'test';     
grant select on world.* to umyworld@'%' identified by 'test';
 
 
To monitor what is happening
---------------------------------
watch -n 1 'mysql -h 127.0.0.1 -P 6032 -uadmin -padmin -t -e "select * from stats_mysql_connection_pool where hostgroup 
< 30 order by hostgroup,srv_host desc;" -e " select srv_host,command,avg(time_ms), count(ThreadID) from 
stats_mysql_processlist group by srv_host,command;" -e "select * from stats_mysql_users;";
mysql -h 127.0.0.1 -P 6032 -uadmin -padmin -t -e "select * from stats_mysql_global "|egrep -i  "(mirror|memory|stmt)"'  
 
select active,hits,destination_hostgroup, mysql_query_rules.rule_id, match_digest, match_pattern, 
   replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules 
   NATURAL JOIN stats.stats_mysql_query_rules where destination_hostgroup < 30  ORDER BY mysql_query_rules.rule_id; 
select * from stats_mysql_query_digest;
select * from stats_mysql_query_digest_reset;
Last Updated on Tuesday, 22 August 2017 18:33
 
How ProxySQL deal with schema (and schemaname) Long story PDF Print E-mail
Written by Marco Tusa   
Tuesday, 15 August 2017 15:47

This article is the spin-off of the article How ProxySQL deal with schema (and schemaname)

1 How ProxySQL differes from MySQL in managing explicit default schema declaration? does it respect the -D or USE

MySQL and ProxySQL will behave the same when passing the default schema, setting it as default.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
      MySQL
        mysql -uuallworld -ptest -h192.168.1.107 -P 3306 -D test
        select database();
        +------------+
        | database() |
        +------------+
        | test       |
        +------------+
 
        SHOW TABLES;
        +-----------------+
        | Tables_in_test  |
        +-----------------+
        | rewrite_tbtest  |
        | rewrite_tbtest1 |
        +-----------------+
 
      ProxySQL
        mysql -uuallworld -ptest -h127.0.0.1 -P 6033 -D test
        select database();
        +------------+
        | database() |
        +------------+
        | test       |
        +------------+
 
        SHOW TABLES;
        +-----------------+
        | Tables_in_test  |
        +-----------------+
        | rewrite_tbtest  |
        | rewrite_tbtest1 |
        +-----------------+
 

 

Passing a different schema from the default one in ProxySQL will override the ProxySQL default:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
        MySQL
         ./mysql-3306 -uuallworld -ptest -D mysql
        (uallworld@localhost) [mysql]>select database();
        +------------+
        | database() |
        +------------+
        | mysql      |
        +------------+
        ProxySQL
        [root@rms2 server57S]# ./mysql-3306 -uuallworld -ptest -h192.168.1.50 -P6033 -D mysql
 
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [mysql]>select database();
        +------------+
        | database() |
        +------------+
        | mysql      |
        +------------+
 

 

Connecting without DEFAULT Schema

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
      MySQL 
        mysql -uuallworld -ptest -h192.168.1.107 -P 3306
        mysql> select database();
          +------------+
          | database() |
          +------------+
          | NULL       |
          +------------+
 
      ProxySQL 
        mysql -uuallworld -ptest -h127.0.0.1 -P 6033
        mysql> select database();
        +------------+
        | database() |
        +------------+
        | test       |
        +------------+
 

 

The last is different between MySQL and ProxySQL.      

By default ProxySQL will connect the user to his default schema.     

Not only, if the default schema is not define:

1
2
3
4
5
6
7
8
9
10
11
12
13
      +-----------+----------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
      | username  | password | active | use_ssl | default_hostgroup | default_schema | schema_locked | transaction_persistent | fast_forward | backend | frontend | max_connections |
      +-----------+----------+--------+---------+-------------------+----------------+---------------+------------------------+--------------+---------+----------+-----------------+
      | uallworld | test     | 1      | 0       | 10                | NULL           | 0             | 1                      | 0            | 1       | 1        | 10000           |
 
      ProxySQL will point the connection to information_schema.
          mysql> select database();
          +--------------------+
          | database()         |
          +--------------------+
          | information_schema |
          +--------------------+
 

 

2 How proxy respect/follow security agains schema

1
2
3
4
5
6
7
8
9
10
11
      MySQL
      mysql -uuworld -ptest -h192.168.1.107 -P 3306 -D test
        mysql: [Warning] Using a password on the command line interface can be insecure.
        ERROR 1044 (42000): Access denied for user 'uworld'@'%' to database 'test'
 
      ProxySQL
      mysql -uuworld -ptest -h127.0.0.1 -P 6033 -D test
        mysql: [Warning] Using a password on the command line interface can be insecure.
        Welcome to the MySQL monitor.  Commands end with ; or \g.
        Your MySQL connection id is 483902
        Server version: 5.5.30 (ProxySQL)

 

And no warnings.     

Why?

Because while you directly connect to MySQL, when you connect to ProxySQL you do not actually touch the final server.      

You only open a connection to the ProxySQL, issue a query will open a connection and if the user do not have the right grants an error will be returned.

Let us see.

 just issuing the command:

 

1
2
3
        mysql> select database();
        ERROR 1044 (42000): Access denied for user 'uworld'@'%' to database 'test'
 

 

 

 

As soon as you try to query the MySQL server you got an error.     

 

 

3 How schemaname filter acts in the query rules?

In MySQL we can easily change the default schema with USE , this action is fully supported by ProxySQL.    
But it may have some side effects when using "schemaname" as filter in the query_rules.  
If you define a rule that include the default schemaname and the default schema is changed with USE, the rule will not apply, and unpredictable results may happen.  
To prevent that ProxySQL has another option in mysql_user "schema_locked" which will prevent the schema to be changed by USE.
This feature is present but not fully implemented, and after a brief conversation with Rene (https://github.com/sysown/proxysql/issues/1133).   
For now my advice is to DO NOT use schemaname as filter in the query_rules.  

Below the full test case:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
    mysql> select active,hits,destination_hostgroup, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules where destination_hostgroup < 30  ORDER BY mysql_query_rules.rule_id;
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      | active | hits | destination_hostgroup | rule_id | match_digest        | match_pattern | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      | 1      | 0    | 10                    | 10      | ^SELECT.*FOR UPDATE | NULL          | NULL            | NULL      | 1     | 0      | NULL    |
      | 1      | 0    | 11                    | 11      | ^SELECT             | NULL          | NULL            | NULL      | 1     | 0      | NULL    |
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      2 rows in set (0.00 sec)
 
 
      (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [world]>Select * from world.City order by Population DESC limit 5 ;
      +------+-----------------+-------------+--------------+------------+
      | ID   | Name            | CountryCode | District     | Population |
      +------+-----------------+-------------+--------------+------------+
      | 1024 | Mumbai (Bombay) | IND         | Maharashtra  |   10500000 |
      | 2331 | Seoul           | KOR         | Seoul        |    9981619 |
      |  206 | São Paulo       | BRA         | São Paulo    |    9968485 |
      | 1890 | Shanghai        | CHN         | Shanghai     |    9696300 |
      |  939 | Jakarta         | IDN         | Jakarta Raya |    9604900 |
      +------+-----------------+-------------+--------------+------------+
      5 rows in set (0.01 sec)
 
 
 
      mysql> select active,hits,destination_hostgroup, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules where destination_hostgroup < 30  ORDER BY mysql_query_rules.rule_id;
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      | active | hits | destination_hostgroup | rule_id | match_digest        | match_pattern | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      | 1      | 0    | 10                    | 10      | ^SELECT.*FOR UPDATE | NULL          | NULL            | NULL      | 1     | 0      | NULL    |
      | 1      | 1    | 11                    | 11      | ^SELECT             | NULL          | NULL            | NULL      | 1     | 0      | NULL    | <---
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      2 rows in set (0.00 sec)
 
      mysql> select * from stats_mysql_query_digest;
      +-----------+------------+----------+--------------------+-----------------------------------------------------------+------------+------------+------------+----------+----------+----------+
      | hostgroup | schemaname | username | digest             | digest_text                                               | count_star | first_seen | last_seen  | sum_time | min_time | max_time |
      +-----------+------------+----------+--------------------+-----------------------------------------------------------+------------+------------+------------+----------+----------+----------+
      | 10        | world      | uworld   | 0x594F2C744B698066 | select USER()                                             | 1          | 1502789864 | 1502789864 | 0        | 0        | 0        |
      | 10        | world      | umyworld | 0x134B0F1E837A0B8B | Select * from world.City order by Population DESC limit ? | 1          | 1502789849 | 1502789849 | 10264    | 10264    | 10264    |
      | 10        | world      | uworld   | 0x02033E45904D3DF0 | show databases                                            | 1          | 1502789863 | 1502789863 | 1750     | 1750     | 1750     |
      | 10        | world      | umyworld | 0x594F2C744B698066 | select USER()                                             | 1          | 1502789745 | 1502789745 | 0        | 0        | 0        |
      | 10        | world      | uworld   | 0x226CD90D52A2BA0B | select @@version_comment limit ?                          | 1          | 1502789863 | 1502789863 | 0        | 0        | 0        |
      | 10        | world      | uworld   | 0x99531AEFF718C501 | show tables                                               | 1          | 1502789863 | 1502789863 | 470      | 470      | 470      |
      | 10        | world      | umyworld | 0x226CD90D52A2BA0B | select @@version_comment limit ?                          | 1          | 1502789745 | 1502789745 | 0        | 0        | 0        |
      | 10        | world      | umyworld | 0x99531AEFF718C501 | show tables                                               | 1          | 1502789745 | 1502789745 | 7427     | 7427     | 7427     |
      | 11        | world      | uworld   | 0x134B0F1E837A0B8B | Select * from world.City order by Population DESC limit ? | 1          | 1502789866 | 1502789866 | 3349     | 3349     | 3349     | <---
      | 10        | world      | umyworld | 0x02033E45904D3DF0 | show databases                                            | 1          | 1502789745 | 1502789745 | 615      | 615      | 615      |
      +-----------+------------+----------+--------------------+-----------------------------------------------------------+------------+------------+------------+----------+----------+----------+
      10 rows in set (0.00 sec)
 
      (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [world]>USE information_schema;
      Reading table information for completion of table and column names
      You can turn off this feature to get a quicker startup with -A
 
      Database changed
      (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [information_schema]>Select * from world.City order by Population DESC limit 5 ;
      +------+-----------------+-------------+--------------+------------+
      | ID   | Name            | CountryCode | District     | Population |
      +------+-----------------+-------------+--------------+------------+
      | 1024 | Mumbai (Bombay) | IND         | Maharashtra  |   10500000 |
      | 2331 | Seoul           | KOR         | Seoul        |    9981619 |
      |  206 | São Paulo       | BRA         | São Paulo    |    9968485 |
      | 1890 | Shanghai        | CHN         | Shanghai     |    9696300 |
      |  939 | Jakarta         | IDN         | Jakarta Raya |    9604900 |
      +------+-----------------+-------------+--------------+------------+
      5 rows in set (0.01 sec)
 
 
 
 
      mysql> select * from stats_mysql_query_digest;
      +-----------+--------------------+----------+--------------------+-----------------------------------------------------------+------------+------------+------------+----------+----------+----------+
      | hostgroup | schemaname         | username | digest             | digest_text                                               | count_star | first_seen | last_seen  | sum_time | min_time | max_time |
      +-----------+--------------------+----------+--------------------+-----------------------------------------------------------+------------+------------+------------+----------+----------+----------+
      | 10        | information_schema | uworld   | 0x99531AEFF718C501 | show tables                                               | 1          | 1502789929 | 1502789929 | 540      | 540      | 540      |
      | 10        | information_schema | uworld   | 0x02033E45904D3DF0 | show databases                                            | 1          | 1502789929 | 1502789929 | 897      | 897      | 897      |
      | 11        | world              | uworld   | 0x620B328FE9D6D71A | SELECT DATABASE()                                         | 1          | 1502789929 | 1502789929 | 537      | 537      | 537      |
      | 10        | world              | uworld   | 0x594F2C744B698066 | select USER()                                             | 1          | 1502789864 | 1502789864 | 0        | 0        | 0        |
      | 10        | world              | umyworld | 0x134B0F1E837A0B8B | Select * from world.City order by Population DESC limit ? | 1          | 1502789849 | 1502789849 | 10264    | 10264    | 10264    |
      | 10        | world              | uworld   | 0x02033E45904D3DF0 | show databases                                            | 1          | 1502789863 | 1502789863 | 1750     | 1750     | 1750     |
      | 10        | world              | umyworld | 0x594F2C744B698066 | select USER()                                             | 1          | 1502789745 | 1502789745 | 0        | 0        | 0        |
      | 10        | world              | uworld   | 0x226CD90D52A2BA0B | select @@version_comment limit ?                          | 1          | 1502789863 | 1502789863 | 0        | 0        | 0        |
      | 10        | world              | uworld   | 0x99531AEFF718C501 | show tables                                               | 1          | 1502789863 | 1502789863 | 470      | 470      | 470      |
      | 10        | world              | umyworld | 0x226CD90D52A2BA0B | select @@version_comment limit ?                          | 1          | 1502789745 | 1502789745 | 0        | 0        | 0        |
      | 10        | world              | umyworld | 0x99531AEFF718C501 | show tables                                               | 1          | 1502789745 | 1502789745 | 7427     | 7427     | 7427     |
      | 10        | information_schema | uworld   | 0x134B0F1E837A0B8B | Select * from world.City order by Population DESC limit ? | 1          | 1502789933 | 1502789933 | 3707     | 3707     | 3707     | <---
      | 11        | world              | uworld   | 0x134B0F1E837A0B8B | Select * from world.City order by Population DESC limit ? | 1          | 1502789866 | 1502789866 | 3349     | 3349     | 3349     | <---
      | 10        | world              | umyworld | 0x02033E45904D3DF0 | show databases                                            | 1          | 1502789745 | 1502789745 | 615      | 615      | 615      |
      +-----------+--------------------+----------+--------------------+-----------------------------------------------------------+------------+------------+------------+----------+----------+----------+
      14 rows in set (0.01 sec)
 
      mysql> select active,hits,destination_hostgroup, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules where destination_hostgroup < 30  ORDER BY mysql_query_rules.rule_id;
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      | active | hits | destination_hostgroup | rule_id | match_digest        | match_pattern | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      | 1      | 0    | 10                    | 10      | ^SELECT.*FOR UPDATE | NULL          | NULL            | NULL      | 1     | 0      | NULL    |
      | 1      | 2    | 11                    | 11      | ^SELECT             | NULL          | NULL            | NULL      | 1     | 0      | NULL    |
      +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
      2 rows in set (0.00 sec)
 

 

4 how can I transparently redirect using schema name?

This is not a Schema feature, more one of the things that in ProxySQL are quite easy to set, while close to be impossible if done in plain MySQL.
When connecting directly with MySQL there is no option for you to "play" with GRANTS and schema such that you will transparently allow a user to do an action on a schema/server pair and another on a different schema/server pair.
When using ProxySQL to filter by schemaname is quite trivial.

For instance assuming we have 3 users one is admin of the platform which include 2 separate schemas (or more), each user can access one schema for write (but that can be table as well), and a final slave with reporting information, where all the users that needs to read from other schema except their own can read cross schemas.
While all the select not cross schema mus still got to the Master.
This is not so uncommon, actually with few variant is exactly what one of the customer I spoke last week needs to do.

Let see what we need and how to do it:

  1. user(s) uworld & umyworld need to go to their default schema on Master for Writes.        
  2. user(s) uworld & umyworld should go to their default schema on  master for direct reads
  3. user(s) uworld & umyworld should go to the slave for reads when the other schema is used

To do this we will need the following rules:

 

1
2
3
4
5
6
7
8
9
10
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply) values(10,'uworld',10,1,3,'^SELECT.*FOR UPDATE',1);
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(11,'uworld',10,1,3,'^SELECT ',0,50,0); 
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(12,'uworld',11,1,3,'\smyworld.',1,50,50);
 
 
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply) values(13,'umyworld',10,1,3,'^SELECT.*FOR UPDATE',1);
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(14,'umyworld',10,1,3,'^SELECT ',0,50,0); 
        insert into mysql_query_rules (rule_id,username,destination_hostgroup,active,retries,match_digest,apply,FlagOUT,FlagIN) values(15,'umyworld',11,1,3,'\sworld.',1,50,50);
 
        LOAD MYSQL QUERY RULES TO RUNTIME;SAVE MYSQL QUERY RULES TO DISK;

 

 

 

To check the behaviour:

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
        select database();
        update world.City set Population=10500000 where ID=1024;
        update world.Country set LifeExpectancy=62.5 where Code='IND';
        update myworld.CityM set Population=10500001 where ID=1024;
        update myworld.CountryM set LifeExpectancy=0 where Code='IND';
 
        select database();
        Select * from world.City order by Population DESC limit 5 ;
        Select * from myworld.CityM order by Population DESC limit 5 ;
 
        Select City.*, Country.name, Country.LifeExpectancy from world.City as City join world.Country as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
        Select City.*, Country.name, Country.LifeExpectancy from myworld.CityM as City join myworld.CountryM as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
 
        Select City.*, Country.name, Country.LifeExpectancy from world.City as City join myworld.CountryM as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
 

 

Once I run the test (queries above):

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
        mysql> select active,hits,destination_hostgroup, mysql_query_rules.rule_id, match_digest, match_pattern, replace_pattern, cache_ttl, apply,flagIn,flagOUT FROM mysql_query_rules NATURAL JOIN stats.stats_mysql_query_rules where destination_hostgroup < 30  ORDER BY mysql_query_rules.rule_id;
        +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
        | active | hits | destination_hostgroup | rule_id | match_digest        | match_pattern | replace_pattern | cache_ttl | apply | flagIN | flagOUT |
        +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
        | 1      | 0    | 10                    | 10      | ^SELECT.*FOR UPDATE | NULL          | NULL            | NULL      | 1     | 0      | NULL    |
        | 1      | 5    | 10                    | 11      | ^SELECT             | NULL          | NULL            | NULL      | 0     | 0      | 50      | <-- 5 selects in total OK
        | 1      | 3    | 11                    | 12      | myworld.            | NULL          | NULL            | NULL      | 1     | 50     | 50      | <-- 3 contains term myworld OK
        | 1      | 0    | 10                    | 13      | ^SELECT.*FOR UPDATE | NULL          | NULL            | NULL      | 1     | 0      | NULL    |
        | 1      | 0    | 10                    | 14      | ^SELECT             | NULL          | NULL            | NULL      | 0     | 0      | 50      |
        | 1      | 0    | 11                    | 15      | world.              | NULL          | NULL            | NULL      | 1     | 50     | 50      |
        +--------+------+-----------------------+---------+---------------------+---------------+-----------------+-----------+-------+--------+---------+
        6 rows in set (0.00 sec)
 
 
 
        mysql> select * from stats_mysql_query_digest;                                                                                                                                         
        +-----------+------------+----------+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+------------+------------+----------+----------+----------+
        | hostgroup | schemaname | username | digest             | digest_text                                                                                                                                                                       | count_star | first_seen | last_seen  | sum_time | min_time | max_time |
        +-----------+------------+----------+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+------------+------------+----------+----------+----------+
        | 11        | world      | uworld   | 0x921512ADAF79D0FF | Select City.*, Country.name, Country.LifeExpectancy from world.City as City join myworld.CountryM as Country on City.CountryCode=Country.code order by Population DESC limit ?    | 1          | 1502718358 | 1502718358 | 6531     | 6531     | 6531     |
        | 11        | world      | uworld   | 0xE9D26001526F2618 | Select City.*, Country.name, Country.LifeExpectancy from myworld.CityM as City join myworld.CountryM as Country on City.CountryCode=Country.code order by Population DESC limit ? | 1          | 1502718358 | 1502718358 | 6573     | 6573     | 6573     |
        | 10        | world      | uworld   | 0xE846287B5A6B3945 | Select City.*, Country.name, Country.LifeExpectancy from world.City as City join world.Country as Country on City.CountryCode=Country.code order by Population DESC limit ?       | 1          | 1502718358 | 1502718358 | 3181     | 3181     | 3181     |
        | 11        | world      | uworld   | 0x55FFF888F4642D3A | Select * from myworld.CityM order by Population DESC limit ?                                                                                                                      | 1          | 1502718358 | 1502718358 | 7753     | 7753     | 7753     |
        | 10        | world      | uworld   | 0x26DB674419D1E979 | update myworld.CountryM set LifeExpectancy=? where Code=?                                                                                                                         | 1          | 1502718358 | 1502718358 | 257      | 257      | 257      |
        | 10        | world      | uworld   | 0x056615DE2CFD8C8E | update myworld.CityM set Population=? where ID=?                                                                                                                                  | 1          | 1502718358 | 1502718358 | 235      | 235      | 235      |
        | 10        | world      | uworld   | 0x134B0F1E837A0B8B | Select * from world.City order by Population DESC limit ?                                                                                                                         | 1          | 1502718358 | 1502718358 | 3262     | 3262     | 3262     |
        | 10        | world      | uworld   | 0x7A15CC342D54452D | update world.Country set LifeExpectancy=?.? where Code=?                                                                                                                          | 1          | 1502718358 | 1502718358 | 319      | 319      | 319      |
        | 10        | world      | uworld   | 0x500E6F01B02078B6 | update world.City set Population=? where ID=?                                                                                                                                     | 1          | 1502718358 | 1502718358 | 970      | 970      | 970      |
        +-----------+------------+----------+--------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------------+------------+------------+----------+----------+----------+
 

 

As we can see with this simple set of rules my uworld user was able to perform exactly as expected and able to access the schema from the designated HG.       

All the selects with schema "myworld" were redirect to HG 11.

Results details:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>update world.City set Population=10500000 where ID=1024;
        Query OK, 1 row affected (0.01 sec)
        Rows matched: 1  Changed: 0  Warnings: 0
 
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>update world.Country set LifeExpectancy=62.5 where Code='IND';
        Query OK, 1 row affected (0.00 sec)
        Rows matched: 1  Changed: 0  Warnings: 0
 
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>update myworld.CityM set Population=10500001 where ID=1024;      <--- I am NOT managing this but I could if needed
        ERROR 1142 (42000): UPDATE command denied to user 'uworld'@'192.168.1.50' for table 'CityM'
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>update myworld.CountryM set LifeExpectancy=0 where Code='IND';   <--- I am NOT managing this but I could if needed
        ERROR 1142 (42000): UPDATE command denied to user 'uworld'@'192.168.1.50' for table 'CountryM'
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>
 
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>Select * from world.City order by Population DESC limit 5 ;
        +------+-----------------+-------------+--------------+------------+
        | ID   | Name            | CountryCode | District     | Population |
        +------+-----------------+-------------+--------------+------------+
        | 1024 | Mumbai (Bombay) | IND         | Maharashtra  |   10500000 |
        | 2331 | Seoul           | KOR         | Seoul        |    9981619 |
        |  206 | São Paulo       | BRA         | São Paulo    |    9968485 |
        | 1890 | Shanghai        | CHN         | Shanghai     |    9696300 |
        |  939 | Jakarta         | IDN         | Jakarta Raya |    9604900 |
        +------+-----------------+-------------+--------------+------------+
        5 rows in set (0.00 sec)
 
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>Select * from myworld.CityM order by Population DESC limit 5 ;
        +------+-----------------+-------------+--------------+------------+
        | ID   | Name            | CountryCode | District     | Population |
        +------+-----------------+-------------+--------------+------------+
        | 1024 | Mumbai (Bombay) | IND         | Maharashtra  |   10500001 |
        | 2331 | Seoul           | KOR         | Seoul        |    9981619 |
        |  206 | São Paulo       | BRA         | São Paulo    |    9968485 |
        | 1890 | Shanghai        | CHN         | Shanghai     |    9696300 |
        |  939 | Jakarta         | IDN         | Jakarta Raya |    9604900 |
        +------+-----------------+-------------+--------------+------------+
        5 rows in set (0.01 sec)
 
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>Select City.*, Country.name, Country.LifeExpectancy from world.City as City join world.Country as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        | ID   | Name            | CountryCode | District     | Population | name        | LifeExpectancy |
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        | 1024 | Mumbai (Bombay) | IND         | Maharashtra  |   10500000 | India       |           62.5 |
        | 2331 | Seoul           | KOR         | Seoul        |    9981619 | South Korea |           74.4 |
        |  206 | São Paulo       | BRA         | São Paulo    |    9968485 | Brazil      |           62.9 |
        | 1890 | Shanghai        | CHN         | Shanghai     |    9696300 | China       |           71.4 |
        |  939 | Jakarta         | IDN         | Jakarta Raya |    9604900 | Indonesia   |           68.0 |
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        5 rows in set (0.01 sec)
 
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>Select City.*, Country.name, Country.LifeExpectancy from myworld.CityM as City join myworld.CountryM as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        | ID   | Name            | CountryCode | District     | Population | name        | LifeExpectancy |
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        | 1024 | Mumbai (Bombay) | IND         | Maharashtra  |   10500001 | India       |            0.0 |
        | 2331 | Seoul           | KOR         | Seoul        |    9981619 | South Korea |           74.4 |
        |  206 | São Paulo       | BRA         | São Paulo    |    9968485 | Brazil      |           62.9 |
        | 1890 | Shanghai        | CHN         | Shanghai     |    9696300 | China       |           71.4 |
        |  939 | Jakarta         | IDN         | Jakarta Raya |    9604900 | Indonesia   |           68.0 |
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        5 rows in set (0.00 sec)
 
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>
        (
 This e-mail address is being protected from spambots. You need JavaScript enabled to view it
 ) [(none)]>Select City.*, Country.name, Country.LifeExpectancy from world.City as City join myworld.CountryM as Country on City.CountryCode=Country.code order by Population DESC limit 5; 
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        | ID   | Name            | CountryCode | District     | Population | name        | LifeExpectancy |
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        | 1024 | Mumbai (Bombay) | IND         | Maharashtra  |   10500000 | India       |            0.0 |
        | 2331 | Seoul           | KOR         | Seoul        |    9981619 | South Korea |           74.4 |
        |  206 | São Paulo       | BRA         | São Paulo    |    9968485 | Brazil      |           62.9 |
        | 1890 | Shanghai        | CHN         | Shanghai     |    9696300 | China       |           71.4 |
        |  939 | Jakarta         | IDN         | Jakarta Raya |    9604900 | Indonesia   |           68.0 |
        +------+-----------------+-------------+--------------+------------+-------------+----------------+
        5 rows in set (0.00 sec)
 

 

Obviosuly doing it on the other user/schema work exacty the same.

 

Back to main article

Last Updated on Tuesday, 22 August 2017 14:39
 
Sweet and sour can become bitter PDF Print E-mail
Written by Marco Tusa   
Thursday, 03 August 2017 14:01

 

Recently Fred published a post (http://lefred.be/content/mysql-group-replication-is-sweet-but-can-be-sour-if-you-misunderstand-it) in which he was stating, I had publish my blog (http://www.tusacentral.net/joomla/index.php/mysql-blogs/191-group-replication-sweet-a-sour.html) which contains few “misunderstanding”.

 

All the people that knows me, also know I have no problem to admit if I do a mistake, at the end doing mistake is a way of learning and be afraid of the ones who don’t do them.

But at the same time, I need to have proof of it. As I provide proof of what I wrote with numbers and tests (all available on github).

Let us put down the basis of the discussion with facts, not personal speculation or assertions.

Facts:

1) From MySQL official documentations (https://dev.mysql.com/doc/refman/5.7/en/group-replication-use-cases.html) 

Group Replication enables you to create fault-tolerant systems with redundancy by replicating the system state throughout a set of servers. Consequently, even if some of the servers fail, as long it is not all or a majority, the system is still available, and all it could have degraded performance or scalability, it is still available. Server failures are isolated and independent. They are tracked by a group membership service which relies on a distributed failure detector that is able to signal when any servers leave the group, either voluntarily or due to an unexpected halt. There is a distributed recovery procedure to ensure that when servers join the group they are brought up to date automatically. There is no need for server fail-over, and the multi-master update everywhere nature ensures that not even updates are blocked in the event of a single server failure. Therefore MySQL Group Replication guarantees that the database service is continuously available.

 

2) Still from MySQL (https://dev.mysql.com/worklog/task/?id=9838Group Replication introduced flow control to avoid excessive buffering and to maintain group members reasonably close to one another. For several reasons, it's more efficient to keep buffering low but, as mentioned before, it is not a requirement that members are kept in sync for replication to work: once a slave becomes delayed it will just increase the number of pending transactions in the relay log.

The flow control mechanism enters the scene to bound the back-log members can accumulate, in terms of transactions to certify and transactions to apply. Once those bounds are exceeded, and while they remain exceeded, the flow-control mechanism limits the throughput of writer members to adjust to the capacity of the slowest members of the group. By imposing conservative bounds it tries to make sure all members are able to keep-up and can return safely to within the bounds defined by the user.

 

 

So given the above, I, as “standard” user, read that as “when using MySQL and GR, and setting (eventually) the Flow Control correctly, I will achieve to have a data platform that is continuously available given the use of GR”.

Cool, right? So what I was doing in my tests? Well two main things, one is to see if I can use GR for scaling reads as I was doing (and hundreds of customers as well) with PXC, the other is to see if in case of crash of my Master, I can safely fail-over another slave.

This doesn’t mean I am comparing the way the two product works. I cannot care less at this stage, as I am sure most of the customer will not care. What they care is what they can do SAFELY, and with what. All the mambo-jambo about the technical details (how much sync or async I can be) is not their concern.

So the point was and is… Given I am used to product X can I move to product Y and if so how and what I should be aware of?

Once more I was trying to answer to the question “If GR and InnoDB cluster has to work as alternative to other (virtually) synchronous replication mechanism, what change or shift our customers must consider if they want move from one to the other.”

The outcome of the tests is pointing to answer that, period.

 

Let us clarify two things more:

I am perfectly aware (Fred talking to you) that GR use a different mechanism for FC, and that the numbers set in the group_replication_flow_control_certifier_threshold/ group_replication_flow_control_applier_threshold are then use to calculate the Quota. Still they are user threshold, and Specifies the number of waiting transactions in the applier/certifier queue that trigger flow control which are connected to the final statements reported above: By imposing conservative bounds it tries to make sure all members are able to keep-up and can return safely to within the bounds defined by the user.
Bound that as for the manual can go from 0 to 2147483648. 
As such setting it to 25 (I did try also 1000 or more with even worse results) is perfectly legit and I have some issues in considering it a mistake.

 

I was measuring the lag with the only tools MySQL/Oracle had give us, in the article I said I had used:
select  @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX( Received_transaction_set,':',-2),':',1),'-',-1) last_received FROM performance_schema.replication_connection_status WHERE Channel_name = 'group_replication_applier'; select (@last_rec - @last_exec) as real_lag"

Which use the only information available at the moment regarding the incoming and the current executed transactions.
Fred says that is not valid because the certification and
This means that a transaction listed on last_received_transaction_set is certified and queued to apply on this member. On the other members it may already be certified/applied or soon be.
I agree that it may not be perfect .. and I should have said: is for sure the last apply or soon to be given the certification on the master a node can know about.
But here we are not talking of 1 or 10 entries, but in most cases lag of hundreds or thousands entries. So also if this is not perfect and I can miss in a way or another a couple or entries because still processing the certification, I still think it is a valid way to evaluate the lag given the dimensions we are talking about.

BTW if this is not … well please help us and provide the right way to calculate the REAL lag between each node in a GR cluster and the writing master, with the most accurate precision.

 

About the comment of the node dimension, well thanks and yes you are right here I forgot to put the details.

The four nodes are VMS on separated hosts, so the gr1/2 where hosted on Host1 while gr3/4 hosted on the another host. Nothing else running on the hosts while test was running. Applications and monitor where running on a 3 host. The exact dimension is not really relevant given all of them were the same.

I want to say that I can setup the perfect environment for GR or PXC or NDB or whatever you are testing and showing how cool that is.

But I think we should consider that real life is far to be like that, and that we must see/test is how a product is behaving in adverse conditions, or if not adverse challenging.

 

Given all the above, in my opinion the tests clearly shown that also if the Flow Control mechanism in GR is the coolest thing ever conceptualize, at the end it does not what it is suppose to, no matter what.

Was I setting the thresholds wrong? Well not sure about that, given the results I got.

If I have 4 nodes and 2 of them (so … no majority) are lagging behind of hundreds or even thousands of entries, while my threshold is in the order of hundreds or less, this means that the cool mechanism is not doing his job, period.

One think is write down whatever about this or that but another is doing tests and provide numbers and evidences, not just words.

As said in a comment (not published in the Fred blog) I am more than happy to perform more tests and do them in conjunctions with anyone else.

Then if I am wrong I am wrong… but until I have different evidence, just saying a car can fly because it has very nice and colorful wings, doesn’t make it fly.

chitty-1

 


Last Updated on Saturday, 12 August 2017 15:06
 
Group-Replication, sweet & sour PDF Print E-mail
Written by Marco Tusa   
Thursday, 20 July 2017 00:00

A story around replication lag and Flow-Control.

Overview

In the last few months we had 2 main actors in the the MySQL ecosystem, ProxySQL and Group-Replication (with the evolution to InnoDB Cluster). 

While I had extensively covered the first, my last serious work on GR, goes back to some lab version in the past years. veg sweet and sour soup
Given the decision Oracle made to declare it GA, and the Percona decision to provide some level of support to GR, I decide it was time for me to take a look at it again.
A lot of reviews were already done covering different topics. I saw articles about GR and performance, GR and basic functionalities (or lack of it like automatic node provisioning), GR and ProxySQL and so on.

But one question was coming up over and over in my mind. If GR and InnoDB cluster has to work as alternative to other (virtually) synchronous replication mechanism, what change or shift our customers must consider if they want move from one to the other.
In solutions using Galera, like Percona Xtradb Cluster (PXC), there is a main concept to which all of us must refer to. The cluster is data-centric, which at the end brings us to what matters, which is the data and its state, that must be exactly the same on each node at a given time (commit/apply). To guarantee this PXC and others use a set of data validation and FlowControl that at the end will make possible to the cluster dataset to be consistent on each node, respecting the main principle (be data-centric).
Immediate application of this principle is that an application can query ANY node in the PXC and be sure to get the same data, or to write and know that the data will be visible on all node (virtually) at the same time.
Last but not list, if a node is not consistent with the others, it will be excluded and must be rebuild, or inconsistency fix, before joining back.

Not a minor thing if you think carefully, and a very useful thing to have because allow you to transparently split write/read operations, or failover from one node to another with lees troubles, and more.

When I thought to GR (or InnoDB Cluster), I put myself in the customer shoes, and I asked to the other myself: “Aside all the other things we know (see above) what is the real impact of moving form PXC to GR/Innodb-Cluster for my application? Because when you mention me that GR is still using (basically) replication with binlogs and relaylog, also if there is a Flow-Control mechanism an alarm bell started to ring in my mind.”

My Answer was: “Let us do a POC, and see what is really going on in that terms.”

 

The POC

Given that I setup a simple set of servers using GR, with a very basic application performing writes on a single Writer node, and (eventually) reads on the other nodes.Schema definition can be found here, mainly I had use the 4 tables from my windmills test suite, nothing special, or weird of specifically design for GR. Actually I had use this test a lot for PXC in the past, so was a perfect fit.

Test definition

The application will do very simple work, and I had thought to test 4 main cases:

  • One thread performing one insert at each transaction.
  • One thread performing 50 batched inserts at each transaction.
  • 8 threads performing one insert to each transaction.
  • 8 threads performing 50 batched inserts at each transaction.

As you can see nothing crazy, a quite simple operation.
Then I decide to test it using the following 4 conditions on the servers:

  • Single slave worker FC as default
  • Single slave worker FC set to 25
  • 8 slave workers FC as default
  • 8 slave workers FC set to 25

Again nothing weird or strange from my point of view.I had used 4 nodes:

  • Gr1 Writer
  • Gr2 Reader
  • Gr3 Reader minimal latency (~10ms)
  • Gr4 Reader minimal latency (~10ms)

Finally, I had to be sure I measure the lag in a way that allow me to reference it in a consistent way on all nodes.
I think we can safely say that the incoming GTID (last_ Received_transaction_set from replication_connection_status) is for sure the last apply on the master a slave node can know about.
This because network delay can prevent the last one to really be "received".
The other point of reference is the GTID_EXECUTED which refers to the latest GTID processed on the Node itself.

The closest Query that can track the distance will be:

1
select @last_exec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX( @@global.GTID_EXECUTED,':',-2),':',1),'-',-1) last_executed;
select  @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(SUBSTRING_INDEX( Received_transaction_set,':',-2),':',1),'-',-1) last_received
FROM performance_schema.replication_connection_status WHERE Channel_name = 'group_replication_applier'; select (@last_rec - @last_exec) as real_lag

 

Or in case of a single worker

1
select @last_exec:=SUBSTRING_INDEX(SUBSTRING_INDEX( @@global.GTID_EXECUTED,':',-1),'-',-1) last_executed;
select  @last_rec:=SUBSTRING_INDEX(SUBSTRING_INDEX(Received_transaction_set,':',-1),'-',-1) last_received 
FROM performance_schema.replication_connection_status WHERE Channel_name = 'group_replication_applier'; select (@last_rec - @last_exec) as real_lag;

 

The result will be something like this:

 

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
+---------------+
| last_executed |
+---------------+
| 23607         |
+---------------+
 
+---------------+
| last_received |
 
+---------------+
| 23607         |
+---------------+
 
+----------+
| real_lag |
 
+----------+
|        0 |
+----------+

 

The whole set of tests can be found here, with all the commands you may need to run the application (you can find it here), and replicate the tests.
I will focus on the results, or this blogpost will be far too long, but I invite you to see the details.


The results

Efficiency on Writer by execution time & Rows/secHere using the raw data from the tests (excel spreadsheet available here) I was interested to identify if and how the writer is affected by the use of GR and FC.

writer_efficency_bytime


Reviewing the Graph we can see that the Writer has a linear increase of the execution time (when using default FC), with the increase of the load, nothing really concerning and all in all expected also if the load is light, we will see after that the volume of rows at the end justify the execution time.

Different scenario if we use FC, the execution time increase significantly in both cases (single  worker/multiple workers). In the worth case (8 threads, 50 inserts batch) it becomes 4 time higher the same load without FC.
What happen to the inserted rows? In the application I trace the rows inserted/sec, as such is easy to see what is going on there as well.

writer_efficency_by_rows

We can see that the Writer with FC activated is able to insert less than 1/3 of the rows it can process without FC.We can definitely say, that FC has a significant impact on the Writer performance.
To clarify let see this graph:

Writer_difference_data_time_fc_nofc


Without FC the Writer is able to process a high volume of rows in a limited number of time (results from test 8 workers; 8 threads; 50 insert batch).
While with FC the situation changes drastically, the Writer will take a long time processing a significant smaller amount of rows/sec. In short performance will drop significantly.


But, hey I can be ok with that if this means to have a consistent data-set cross all the nodes.
At the end also PXC and similar, pay a significant price in performance to match the data-centric Principle.

Ok let see what happen on the other nodes.

Entries Lag

Well, the scenario is not so good.

Writer_slave_real_lag_all_tests

 

When NOT using FC, the nodes lag behind the writer significantly. Remember that by default the FC in GR is set to 25000 entries, I mean 25K of entries!!!

So what happens here is that as as soon as I put some salt (see load) on the Writer the slave nodes will start to lag.
When using the default single worker, that will be significant, while when using multiple workers, we will see that the lag will mainly happen only on the node(s) with a minimal (10ms network latency).
Sad thing is that is not really going down respect to the single thread worker, indicating the simple minimal latency of 10ms was enough to affect the replication.

Well time to activate the FC and have no lag.

Writer_slave_real_lag_all_tests_wFC

Unfortunately, this is not going to be the case. As we can see the lag in case of single worker remain high also in Gr2 (154 entries).
While when using multiple workers, the Gr3/4 nodes are able to perform much better, and lag is significantly less, but still high ~1k entries.

It is important to remember that at this time the Writer is processing 1/3 or less of the rows it is normally able to do. It is important to note, that I had set 25 to the entry limit in the FC, and never the less the Gr3 (and Gr4) nodes are lagging more than 1K entries behind.

To clarify, let check the two graphs below:

Writer_slave_distance_without_FC

 

Using the Writer (Master) as baseline in entry #N, without FC, the nodes (slaves) using GR, will start to lag behind the writer, in a significant way also with light load. The distance in this POC was from very minimal with 58 entries, up to 3849 entries in the case of higher load.

Writer_slave_distance_with_FC


Using FC the Writer (Master) will diverge less, as expected, but also if it will have a significant drop in performance (1/3 or less), the Nodes will lag anyhow, worse case up to 1363 entries.Need to underline that we have not further way (I am aware of) to tune the lag and prevent it to happen.
Which means an application cannot transparently split Writes/Reads and expect consistency. The gap will be too high


A graph that tell us a story

I was using PMM to keep an eye on the nodes while doing the tests. And one of the graph was really telling me that GR has still some “limits” if we want to consider it as the replication mechanism for a cluster.

test8


This graph shows the MySQL Queries executed on all the 4 nodes, in the 8-50 threads-batch with FC test.As you can see the Gr1 (the writer) is the first one to takeoff, followed by Gr2, the nodes Gr3 and Gr4 will require a bit more, given the binlog transmission (and 10ms delay), once the data is there, they match (inconsistently) the Gr2 node, this is an effect of the FC asking the master to slow down. But as seen previously, the nodes will never be able to match the Writer. At the end, when the load test is over, the nodes will continue to process the queue for additional ~130 sec. Considering that the whole load takes 420 sec on the Writer, this means that 1/3 of the total time on the Writer is spent AFTER on the slave to sync.

test4


The above graph shows the same test without FC, is interesting to see how the Writer was going above the 300 Queries/sec while G2 stay around 200 and Gr3/4 far below. The writer was able to process the whole load in ~120 seconds instead 420, while Gr3/4 continue to process the load for additional ~360 seconds.This means that without FC set the Nodes will lag around 360 seconds behind the Master, while with FC set to 25, they will lag 130 seconds.


A significant gap.

businessman_across_gap_cliff

 

Conclusions

Going back to the origin, and the reason why I was looking to this POC.
I, as customer of myself, think that my application(s) will not be a good fit for GR, given I have set PXC to scale out the reads, and be able to efficiently move my writer to another when in need to.
GR while based on a very interesting concept it is still based on asynchronous replication (as my colleague Kenny said). I am sure it could make sense in many other cases, but it cannot be compare to solutions that are based on virtually synchronous replication; and it still requires a lot of refinement.

 

On the other hand, all the applications that can afford to have a significant gap between writer and readers, will probably be fine with that.But I raise another question … was not standard replication already covering that?Reviewing the Oracle documentations (https://dev.mysql.com/doc/refman/5.7/en/group-replication-background.html) I can see why GR as part of the InnoDB cluster, can help me in improving HA, when comparing it to standard replication.But I also think it is important to understand that GR (and derived solutions like InnoDB cluster) are not comparable or a replacement of data-centric solutions as PXC. At least up to now.

 

Good MySQL to everyone.

 

References

https://dev.mysql.com/doc/refman/5.7/en/group-replication.html

https://dev.mysql.com/doc/refman/5.7/en/mysql-innodb-cluster-userguide.html

http://lefred.be/content/mysql-group-replication-understanding-flow-control/

https://dev.mysql.com/worklog/task/?id=9838

Last Updated on Thursday, 20 July 2017 16:26
 
«StartPrev12345678910NextEnd»

Page 1 of 13
 

Connecting from

Your IP: 54.234.247.118

Location: UNITED STATES

Who's Online

We have 125 guests online