
 

 Understand dirty reads when using ProxySQL

Recently I was asked to dig a bit into WHY some users were getting dirty reads when using PXC and ProxySQL.

While the immediate answer was easy, I took the opportunity to dig a bit deeper and build up a comparison between different HA solutions.

For those who cannot wait, the immediate answer is… drum roll: PXC is based on Galera replication, and as I have been saying for a VERY long time (since 2011), Galera replication is virtually synchronous. Given that, if you are not careful you MAY hit some dirty reads, especially if it is configured incorrectly.

There is nothing really bad here; we just need to know how to handle it correctly.

In any case, the important thing is to understand some basic concepts.

Two ways of seeing the world (the theory)

Once more, let us talk about the data-centric approach and the data-distributed one.

We can have one data state:

[Diagram: data-centric, a single shared data state]

Here all the data nodes see a single state of the data: you will consistently see the same data at a given moment T in time, where T is the moment of commit on the writer.

 Or we have data distributed:

[Diagram: data-distributed, independent per-node data states]

Here each node has an independent data state. This means that data can be visible on the writer but not yet visible on another node at the moment of commit, and there is no guarantee that data will be propagated within a given time.

The two extremes can be summarized as follows:

Tightly coupled database clusters

  • Data Centric approach (single state of the data, distributed commit)
  • Data is consistent in time cross nodes
  • Replication requires high performing link
  • Geographic distribution is forbidden

Loosely coupled database clusters

  • Single node approach (local commit)
  • Data state differs by node
  • Single node state does not affect the cluster
  • Replication link doesn’t need to be high performance
  • Geographic distribution is allowed 

 

Two ways of seeing the world (the reality)

Given that life is not perfect and we do not have only extremes, the most commonly used MySQL solutions find their place at different points in a two-dimensional Cartesian coordinate system:

[Chart: positioning of the solutions by HA level (X axis) and loose-tight coupling (Y axis)]

This graph has the level of high availability on the X axis and the level of loose-tight coupling on the Y axis.

As said, I am only considering the most used solutions:

  • MySQL – NDB cluster
  • Solutions based on Galera 
  • MySQL Group replication / InnoDB Cluster
  • Basic Asynchronous MySQL replication 

InnoDB Cluster and Galera each appear in two different positions, while the others occupy a single position in the graph. At the two extremes we have standard replication, which is the least tight and least HA, and NDB Cluster, which is the tightest solution with the highest HA.

Translating this into our initial problem: when using NDB we NEVER have dirty reads, while when using standard replication we know they will happen.

Another aspect we must take into consideration when reviewing our solutions is that nothing comes easy. The more we want to move toward the top-right corner, the more we need to be ready to give something up. This can be anything: performance, functionality, ease of management, etc.

When I first spoke about the above, I got a few comments; the most common was related to why I decided to position the solutions that way and HOW I tested it.

Well, initially I had a very complex approach, but thanks to the dirty reads issue and the initial work done by my colleague Marcelo Altman, I can provide a simple empirical test that you can replicate yourself: just use the code and instructions from HERE.

 

Down into the rabbit hole 

The platform

To perform the following tests, I have used:

  • A ProxySQL server
  • An NDB cluster with 3 MySQL nodes and 6 data nodes (3 node groups)
  • A 3-node PXC 5.7 cluster, single writer
  • A 3-node InnoDB Cluster, single writer 
  • A 3-node MySQL replica set
  • 1 Application node running a simple Perl script

All nodes were connected via a dedicated backbone network, separate from the front-end network receiving traffic from the script.

The tests

I ran the same simple test script with the same set of rules in ProxySQL.
For Galera and InnoDB Cluster I used ProxySQL's native support, also because I was trying to emulate the issue I had been asked to investigate.

For standard replication and NDB I used the mysql_replication_hostgroups settings, with the difference that the latter had 3 writers, while basic replication had only 1.
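For reference, here is a minimal sketch of the kind of ProxySQL admin configuration used for the replication-hostgroup case. The hostgroup IDs, hostnames and the read/write split rule are illustrative assumptions, not the exact setup used in the tests:

-- Hypothetical hostgroups: 100 = writer, 101 = readers
INSERT INTO mysql_replication_hostgroups (writer_hostgroup, reader_hostgroup, comment)
VALUES (100, 101, 'dirty read test');

INSERT INTO mysql_servers (hostgroup_id, hostname, port) VALUES
  (100, '10.0.0.10', 3306),
  (101, '10.0.0.11', 3306),
  (101, '10.0.0.12', 3306);

-- Send SELECTs to the reader hostgroup, everything else stays on the writer
INSERT INTO mysql_query_rules (rule_id, active, match_digest, destination_hostgroup, apply)
VALUES (200, 1, '^SELECT', 101, 1);

LOAD MYSQL SERVERS TO RUNTIME; SAVE MYSQL SERVERS TO DISK;
LOAD MYSQL QUERY RULES TO RUNTIME; SAVE MYSQL QUERY RULES TO DISK;

For Galera and Group Replication, the equivalent setup goes through the mysql_galera_hostgroups and mysql_group_replication_hostgroups tables instead.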

Finally, the script was a single-threaded operation: it created a table in the test schema, filled it with some data, then read the IDs in ascending order, modified each record with an UPDATE, and tried to read it back immediately after. 

When doing that through ProxySQL, the write goes to the writer host group (in our case a single node, even for NDB, although this is suboptimal), while reads are distributed across the reader host group. If for any reason an UPDATE is NOT yet committed on one of the nodes that are part of the reader HG, we will get a dirty read.
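To make the flow concrete, this is roughly the SQL sequence the script issues through ProxySQL; the table and column names are illustrative, not the exact ones used by the Perl script:

CREATE TABLE test.dirty_check (id INT PRIMARY KEY, counter INT NOT NULL DEFAULT 0);
INSERT INTO test.dirty_check (id) VALUES (1),(2),(3);   -- ... and so on

-- Then, for each id in ascending order:
UPDATE test.dirty_check SET counter = counter + 1 WHERE id = 1;   -- routed to the writer HG
SELECT counter FROM test.dirty_check WHERE id = 1;                -- load balanced on the reader HG

-- If the SELECT still returns the old counter value, the script counts a dirty read,
-- and it keeps re-reading to measure how long it takes to see the committed value (the lag).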

Simple no?!

The results

 

[Chart: dirty reads percentage and average lag per solution]

 

Let us review the graph. The number of dirty reads drops significantly moving from left to right, from 70% of the total with basic replication down to 0.06% with Galera (wsrep_sync_wait=0).

The average lag is the average time from the UPDATE commit to when the script returns a read with the correct data. 

It is interesting to note a few factors:

  1. In GR, the average cost difference between EVENTUAL and AFTER is negligible
  2. In Galera, the average cost with wsrep_sync_wait=3 is 4 times higher than with wsrep_sync_wait=0 
  3. NDB has an average cost in line with the others, BUT its max lag is very low, so the fluctuation due to synchronization is minimal (compared to the others)
  4. GR and Galera can have 0 dirty reads, but they need to be configured correctly. 

Describing the scenario in a bit more detail: MySQL NDB Cluster is the best, period! It is less performant in single-thread than PXC, but this is expected, given NDB is designed to handle a HIGH number of simultaneous transactions with very limited impact. Aside from that, it has 0 dirty reads and no appreciable lag between writer commit and reader. 

On the other side of the spectrum we have MySQL replication, with the highest number of dirty reads; performance was not bad, but data was totally inconsistent.

Galera (the PXC implementation) is the fastest solution when single-threaded and has only 0.06% dirty reads with WSREP_SYNC_WAIT=0, and 0 dirty reads with WSREP_SYNC_WAIT=3.
With Galera we are seeing, and paying for, something that is like that by design. A very good presentation (https://www.slideshare.net/lefred.descamps/galera-replication-demystified-how-does-it-work) by Fred Descamps explains how the whole thing works.

This slide is a good example:

[Slide from the linked Galera presentation]

By design, the apply and commit finalization in Galera may have (and does have) a delay between nodes. When the wsrep_sync_wait parameter is set, as explained in the documentation, the node initiates a causality check, blocking incoming queries while it catches up with the cluster. 

Once all data on the node receiving the READ request is commit_finalized, the node performs the read.
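For example, the causality check can be enabled only on the connections that really need consistent reads; a minimal sketch, reusing the illustrative table from above (value 3 covers reads plus UPDATE/DELETE, per the wsrep documentation linked below):

-- Check the current value
SHOW GLOBAL VARIABLES LIKE 'wsrep_sync_wait';

-- Enforce the causality check for this session only
SET SESSION wsrep_sync_wait = 3;
SELECT counter FROM test.dirty_check WHERE id = 1;   -- now waits until the node has caught up

-- Or cluster-wide, at a higher global cost
SET GLOBAL wsrep_sync_wait = 3;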

MySQL InnoDB Cluster is worth a bit of discussion. In MySQL 8.0.14 Oracle introduced the group_replication_consistency parameter (please read https://dev.mysql.com/doc/refman/8.0/en/group-replication-consistency-guarantees.html); in short, MySQL Group Replication can now handle write transactions and read consistency in different ways.

Relevant to us are two settings:

  • EVENTUAL
    • Both RO and RW transactions do not wait for preceding transactions to be applied before executing. This was the behavior of Group Replication before the group_replication_consistency variable was added. A RW transaction does not wait for other members to apply a transaction. This means that a transaction could be externalized on one member before the others.
  • AFTER
    • A RW transaction waits until its changes have been applied to all of the other members. This value has no effect on RO transactions. This mode ensures that when a transaction is committed on the local member, any subsequent transaction reads the written value or a more recent value on any group member. Use this mode with a group that is used for predominantly RO operations to ensure that applied RW transactions are applied everywhere once they commit. This could be used by your application to ensure that subsequent reads fetch the latest data which includes the latest writes.

 

As shown above, using AFTER is a win and will prevent dirty reads at a small cost.
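A minimal example of switching between the two consistency levels (setting the variable requires the appropriate privileges; the SESSION scope is usually preferable, as discussed below):

-- Default behavior: no wait, a write may not yet be visible on other members
SET SESSION group_replication_consistency = 'EVENTUAL';

-- Make a committed write visible on every member before returning
SET SESSION group_replication_consistency = 'AFTER';

-- Or for the whole server, applying to all new sessions
SET GLOBAL group_replication_consistency = 'AFTER';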

ProxySQL

ProxySQL has native support for Galera and Group Replication, including identification of the transactions/writesets behind them. Given that, we might think ProxySQL SHOULD prevent dirty reads, and it actually does when the lag is large enough to be caught. 

But dirty reads can happen in such a small time window that ProxySQL cannot catch them. 

As indicated above, we are talking about microseconds or 1-2 milliseconds. To catch such a small window, the ProxySQL monitor would have to pollute the MySQL servers with requests, and could still miss it given network latency. 

Given the above, the dirty read factor should be handled internally, as MySQL Group Replication and Galera do, providing the flexibility to choose what to do. 

There are always exceptions, and in our case the exception is basic MySQL replication. There, you can install and use the ProxySQL binlog reader, which helps keep the READS under control, but it will NOT be able to prevent dirty reads that happen within a very small time window.

Conclusion

Nothing comes for free; dirty reads are one of “those” things that can be prevented, but we must be ready to give something back. 

It doesn’t matter what, but we cannot have it all at the same time. 

Given that, it is important to identify case by case WHICH solution fits best: sometimes it can be NDB, other times Galera or Group Replication. There is NO silver bullet and there is no single way to proceed. 

Also, when using Galera or GR, the more demanding settings that prevent dirty reads can be set at the SESSION level, reducing the global cost.

Summarizing

  • NDB is the best, but it is complex and fits only specific use cases: a high number of threads, simple schema definitions, an in-memory dataset
  • Galera is great and combines performance and efficiency. It is a fast solution and flexible enough to prevent dirty reads at some cost.
    Use WSREP_SYNC_WAIT to tune that (see https://galeracluster.com/library/documentation/mysql-wsrep-options.html#wsrep-sync-wait)
  • MySQL Group Replication comes in right behind; we can avoid dirty reads at a small cost by using SET group_replication_consistency = 'AFTER'.
  • Standard replication can use the ProxySQL Binlog Reader; it will help, but it will not prevent dirty reads. 

To be clear:

  • With Galera use WSREP_SYNC_WAIT=3 for read consistency 
  • With GR use group_replication_consistency = 'AFTER'

I suggest using SESSION rather than GLOBAL scope, and playing a bit with the settings to understand exactly what is going on.
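As a sketch of that approach, the stricter setting can be enabled only around the reads that really need consistency and then relaxed again, so only the critical part of the workload pays the price (table name as in the illustrative example above):

-- Galera / PXC
SET SESSION wsrep_sync_wait = 3;
SELECT counter FROM test.dirty_check WHERE id = 1;   -- consistent read
SET SESSION wsrep_sync_wait = 0;

-- Group Replication
SET SESSION group_replication_consistency = 'AFTER';
-- ... critical transaction here ...
SET SESSION group_replication_consistency = 'EVENTUAL';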

 

I hope this article has given you a better understanding of the solutions we have out there, so that you will be able to make an informed decision when in need. 

 

References

https://www.proxysql.com/blog/proxysql-gtid-causal-reads

https://github.com/Tusamarco/proxy_sql_tools/tree/master/proxy_debug_tools

https://en.wikipedia.org/wiki/Isolation_(database_systems)#Dirty_reads

https://galeracluster.com/library/documentation/mysql-wsrep-options.html#wsrep-sync-wait

https://dev.mysql.com/doc/refman/8.0/en/group-replication-configuring-consistency-guarantees.html

https://www.slideshare.net/lefred.descamps/galera-replication-demystified-how-does-it-work

This week is almost over, and it went by after two nice events that I attended.

PerconaLive 2019 in Amsterdam Schiphol and ProxySQL Technology Day in Ghent.

What are my takeaways on both events?

Well, let us start with PerconaLive first.

Venue

The venue was nice, the hotel comfortable, and the rooms were nice as well, clean and quiet, allowing for a good rest and at the same time decent space to work if you had to. The conference was right there, so I was able to attend, work and rest without any effort. A+.

The hotel was far away from the city and Amsterdam’s distractions, which may be a negative for many, but honestly, IF I spend money and time of my life to attend a conference, I want to get the most out of it. Having the chance to stay focused on it and to talk with all the attendees, customers and experts is a plus. If you want to go around and be a tourist, take an extra day before or after the conference, and don’t try to do it while you should be working.

Attendees & Speakers

Attendees were curious and inquisitive; I noticed most of the talks were interactive, with people asking questions during the presentations and after. All good there. A couple of comments I have are directed more towards some speakers. First, I think some of them should rehearse their talks more; second, please, please, please STOP reading the slides. You should refer to them, but speak toward the audience, not the screen, and speak! Give your opinion, your thoughts; they can read your bullet points, there is no need for you to read the text to them.

Outside the rooms we had a good mix of people talking and interacting. Small groups were reshuffling quite often, which in the end resulted in better exchanges. Nevertheless, I think we should do better; sometimes we, and I am referring to people like me who are the “old ones” of the MySQL conferences, should do more to help customers connect with each other and with other experts.

Key-notes

I am not going to do a full review of the keynote sessions, but one thing came to my mind over and over, and to be honest it makes me feel unhappy and a bit upset.

The discussion we keep having about the open source model, and how some big giants (Amazon AWS, Google Cloud, Microsoft Azure, but not only) use the code developed by others, make gazillions and give back crumbs to the community that developed it, makes me mad.

We have not addressed it correctly at all, and this is because there is a gap not only in the licensing models but also in international legislation.

This is because open source was not a thing in the past for large businesses. It is only recently that open source has finally been recognized as a trustworthy and effective solution also for large, critical businesses.

Because of that, and the obvious interest of some large groups, we lack legislation that could limit or even prevent the abuse done by such large companies.

Technical Tracks

Anyhow, back to the technical tracks. Given this was a Percona Live event, we had a few different technologies present: MySQL, Postgres and MongoDB. It goes without saying that MySQL was predominant and the more interesting tracks were there. This is not because I come from MySQL, but because the ecosystem helped that track be more interesting.
Postgres had some interesting talks, but let us say it clearly: we had just a few people from that community.

Mongo was really low in attendance. The number of attendees during the talks and the absence of the MongoDB community clearly indicated that the event is not in the area of interest of MongoDB users.

Here I want to give some unsolicited advice. The fact that Percona supports multiple technologies is a great thing, and Percona does that in the best and most professional way possible.
But this doesn’t mean the company should dissipate its resources and create confusion and misunderstanding when organizing events like Percona Live.

Do all the announcements and off-session presentations you want to explain what the company does (and does in a great way) to serve the other technologies, but keep the core of the conference what it should be: a MySQL conference.

Why? It is simple: Percona has been leading in that area since MySQL AB vanished, and in the initial years it did a great job.

The community has benefitted a lot from it, and customers’ understanding and adoption have improved significantly because of Percona Live. With this attempt at building a mixed conference, Percona seems to have lost its compass during the event. Which is not true!!! But that is the impression the community gets, and, more importantly, I am not sure it is good for customers.

At the same time, the attempt is doomed to fail, because both Postgres and Mongo already have strong communities. So instead of trying to have the Sun orbit the Earth, let us have the Earth orbit the Sun.

My advice is to be more present at the Postgres/MongoDB events as a sponsor, collaborating with those communities, with talks and innovation (like Percona packaging for Postgres or MongoDB backup), making the existing conferences stronger with the Percona presence.

That will generate more interest in what Percona is doing than trying to pull the Sun out of its orbit.

MySQL

About MySQL: as usual we had a lot of great news and in-depth talks. We can summarize them by saying that MySQL/Oracle and Percona software are growing, becoming more efficient and ready to cover enterprise needs even better than before.

MySQL 8.0.17 contains great things, not only the clone plugin. The performance optimizations presented by Dimitri KRAVTCHUK, and how they were achieved, are a very important step, because FINALLY the barrier has been broken and core blocks that have been causing issues forever are being touched.

At the same time, it is sad to see how MariaDB is accumulating performance debt with respect to the other distributions. It is a sign that the decision to totally diverge is not paying off, at least in the real world, even if according to their sales they are doing better than ever.

As Dimitri said about himself, “I am happy not to use MariaDB”. I second that.

InnoDB Cluster

The other interesting topic coming up over and over was InnoDB Cluster and Group Replication. The product is getting better and better, and yes, it still has some issues, but we are far from the initial version in 5.7, light-years far. After some in-depth discussions with the Oracle guys, we can finally say that IF correctly tuned, GR is a strong HA solution; at the moment, and on paper, stronger than anything based on Galera.

But we still need to see, performance-wise, whether the solution keeps its promises under load, or IF there will be a need to relax the HA constraints, resulting in lower HA efficiency. The latter would make GR less HA-efficient than Galera, but still a valid alternative.

Galera

About Galera, I noticed a funny note in the Codership slides at the booth, which said “Galera powers PXC” or something like that.

Well of course, guys!!! We all know who the real producer of the Galera code is (Codership). Other companies adopt the core and change a few things around it to “customize” the approach.

Packaging that and adding some variants will never change the value of what you do. Just think about the latest PXC announcement, which includes: “PXC8 implements Galera 4”.

It seems to me we need to sit down at a table and resolve a bit of an identity crisis on both sides.

Vitess is also growing, and so is the interest around it; Morgo is doing his best to help the community understand what it does and how to approach it. Well done!

ProxySQL talks

There were a lot of talks about ProxySQL as well; I gave one, but there were many. ProxySQL is confirming its role as THE solution when you need to add a “proxy/router/firewall/HA/performance improvement” layer. I am still wondering what Oracle is waiting for to replace Router and start using a more performant and flexible product like ProxySQL.

ProxySQL Technology Day

Finally, let us talk about the ProxySQL Technology Day.

The event was organized in Ghent (a shame it was not at the same venue as PLEU) for the 3rd of October, after the close of PLEU 2019.

I spoke with many PLEU attendees and a lot of them said something like: “I would LOVE to attend the event if it were in Amsterdam, but I cannot make it to Ghent at the last minute”. Well, that is a bit of a shame, because the event was promoted and announced in time, but I have to say I understand that not everyone is willing to move away from Amsterdam, take the train and travel to historical Ghent.

Anyhow, the ProxySQL Technology Day was actually very well attended, not only in the number of people there but also in the companies participating. We had Oracle, Virtual Health, Percona, Pythian and obviously the ProxySQL guys.

It was also interesting to see the different levels of attendees, from senior DBAs and technical managers to students.

The event took place in the late afternoon, starting at 5 PM, but I think ProxySQL should plan the next one as a full-day event. Probably with a bit more structure in the line of talks to follow, but I really see it as a full-day event, possibly with real cases presented by customers. This is because real life always wins over any other kind of talk, and because a lot of attendees were looking for the chance to share real cases.

The other great thing I saw happening there was during Pizza Time (thanks Pythian!). The interaction between people was great; the involvement and the interest were definitely worth the trip. I answered more technical questions over pizza there than during the two days of PLEU. No barriers, no limits. I love that.

Given all the above, well, folks who stayed in Amsterdam: great to see your pictures on FB or whatever social platform, but trust me, you missed something!

Conclusion

PLEU 2019 in Amsterdam was a nice conference; it clearly shows we need to keep the focus on MySQL and diversify the efforts for the other technologies. It also shows that collaboration pays and fighting doesn’t.

Some things could have been done better, especially in the session scheduling and in following the community’s indications, but those are workable bumps in the road that should be addressed and clarified.
ProxySQL is doing great and getting better with time; just this week ProxySQL 2.0.7 was announced, which includes full native support for AWS Aurora and AUTODISCOVERY.

Wow so excited … must try it NOW!

Good MySQL to all…

Missed Opportunity

It is “that” time of the year… when autumn is just around the corner and temperatures start to drop.
But it is also the time for many exciting conferences in the MySQL world.
We have Oracle OpenWorld this week, with many interesting talks around MySQL 8.
Then in 2 weeks we will be at Percona Live Europe in Amsterdam, which is not a MySQL(-only) conference anymore (https://www.cvent.com/events/percona-live-open-source-database-conference-europe-2019/agenda-6321c2468b1b43328f97212f3e53f4de.aspx).
Percona has moved to a more “polyglot” approach, not only in its services but also in its events.
This is obviously an interesting experiment that allows people from different technologies to meet and discuss. At the end of the day it is quite a unique situation and opportunity; the only negative effect is that it takes space away from the MySQL community, which suffers a bit in terms of room, attendees and brainstorming focused on MySQL deep dives.
That said, there are a few interesting talks I am looking forward to attending:
• Security and GDPR, many sessions
• MySQL 8.0 Performance: Scalability & Benchmarks
• Percona will also present Percona XtraDB Cluster version 8, which is a must-attend session
Plus the other technologies, in which I am only marginally interested.

After Percona Live in Amsterdam there will be a ProxySQL Technology Day in Ghent (https://proxysql.com/blog/proxysql-technology-day-ghent-oct3rd2019). Ghent is a very nice city and worth a visit; reaching it from Amsterdam is only a 2-hour train ride. Given this event is on the 3rd of October, I will just move there immediately after PLEU.
The ProxySQL event is a half-day event starting at 5 PM, with 30-minute sessions focused on best practices for integrating the community-award-winning “ProxySQL” with the most common scenarios and solutions.
I like that, because I am expecting to see and discuss real cases and hands-on issues with the participants.

So, a lot of things, right?
But once more, I want to raise the red flag about the lack of a MySQL community event.
We do have many events, but most of them follow each company’s focus, and they are sparse and not well synchronized. Given that, more than anything else, we miss A MySQL event: a place we can gather around, one that not only attracts DBAs from companies that use MySQL (and sometimes abuse it), but is also a place for all of us to discuss and coordinate our efforts.

In the meantime, see you in Amsterdam, then Ghent, then Fosdem then …
Good MySQL to all

Setting correct system variable values is an essential step in getting the correct server behavior for the workload.  

In MySQL we have many system variables that can be changed at runtime, and most of them can be set at the session or global scope.

In the past, to change the value of a system variable at the global level, a user needed SUPER privileges. Once the system variable value is modified as global, the server changes its behavior for new sessions, and obviously at the global scope. For instance, one of the most commonly adjusted variables is probably max_connections. If you have max_connections=100 in your my.cnf or as the default value, and during the day as a DBA you notice that this is not enough, it is easy to add new connections on the fly; the command:

SET GLOBAL MAX_CONNECTIONS=500;

will do the work. But here is the issue: we have changed a GLOBAL value that applies to the whole server, but this change is ephemeral, and if the server restarts the setting is lost. In the past I have often seen servers with different configurations between my.cnf and the current server settings. To prevent this, or at least keep it under control, good DBAs have developed scripts that check if and where differences exist and fix them. The main issue is that very often we forget to update the configuration file while making the change, or we skip it on purpose to do the “fine tuning first” and then forget afterwards.

What's new in MySQL 8 about that? Well, we have a couple of small changes. First of all, the privileges: in MySQL 8 a user can have SYSTEM_VARIABLES_ADMIN or SUPER to modify GLOBAL system variables. The other news is the ability to have GLOBAL variable changes PERSIST on disk, and finally to know who made them and when. The new option for the SET command is PERSIST. So, if I have:

(root@localhost) [(none)]>show global variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 1500  |
+-----------------+-------+

and I want to change the value of max_connections and be sure this value is reloaded at restart, I will do:

(root@localhost) [(none)]>set PERSIST max_connections=150;

(root@localhost) [(none)]>show global variables like 'max_connections';
+-----------------+-------+
| Variable_name   | Value |
+-----------------+-------+
| max_connections | 150   |
+-----------------+-------+

With the usage of PERSIST, MySQL will write the information related to:

  • key (variable name)
  • value
  • timestamp (including microseconds)
  • user
  • host

A new file in the data directory, mysqld-auto.cnf, contains this information. The file is in JSON format and will contain the following:

{ "Version" : 1 , "mysql_server" : {
 "max_connections" : { 
"Value" : "150" , "Metadata" : {
 "Timestamp" : 1565363808318848 , "User" : "root" , "Host" : "localhost" 
} } } }

The information is also in Performance Schema:

select a.VARIABLE_NAME,b.VARIABLE_value ,SET_TIME,SET_USER,SET_HOST  
    from performance_schema.variables_info a 
        join performance_schema.global_variables b 
        on a.VARIABLE_NAME=b.VARIABLE_NAME  
    where b.VARIABLE_NAME like 'max_connections'\G

*************************** 1. row ***************************
 VARIABLE_NAME: max_connections
VARIABLE_value: 150
      SET_TIME: 2019-08-09 11:16:48.318989
      SET_USER: root
      SET_HOST: localhost

As you can see, the information present reports who made the change, from where, when, and the value. Unfortunately, there is no history here, but this could easily be implemented. To clear the PERSIST settings, run RESET PERSIST and all the persisted settings will be removed. To be clear, if you have:

{ "Version" : 1 , "mysql_server" : {
  "max_connections" : { "Value" : "151" , "Metadata" : { "Timestamp" : 1565367524946371 , "User" : "root" , "Host" : "localhost" } } , 
  "wait_timeout" : { "Value" : "151" , "Metadata" : { "Timestamp" : 1565368154036275 , "User" : "root" , "Host" : "localhost" } } 
} }

RESET PERSIST will do:

{ "Version" : 1 , "mysql_server" : { } }

which removes ALL THE SETTINGS, not just one.
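For completeness, the MySQL 8.0 manual also allows a variable name to be passed to the statement, clearing only that entry from mysqld-auto.cnf; a quick sketch:

-- Remove every persisted setting
RESET PERSIST;

-- Remove only one persisted variable (use IF EXISTS to avoid an error when it is not persisted)
RESET PERSIST max_connections;
RESET PERSIST IF EXISTS wait_timeout;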

Anyhow, why is this a good thing to have?

First, because we have no excuse now: when we change a variable, we have all the tools needed to make sure it will be applied at startup, if that is the intention of the change.

Second, it is good because storing the information in a file, and not only exposing it via Performance Schema, allows us to include it in any automation tool we have.
This is useful whether we decide to take action or just keep track of it, for example comparing it with my.cnf and fixing discrepancies automatically, also when the service is down or when cloning.
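As an example of the Performance Schema side of that automation, a query like the one below (the same join used earlier in this post) lists everything that is currently persisted, which can then be compared with my.cnf:

select a.VARIABLE_NAME, b.VARIABLE_VALUE, a.VARIABLE_SOURCE, a.SET_TIME, a.SET_USER, a.SET_HOST
    from performance_schema.variables_info a
        join performance_schema.global_variables b
        on a.VARIABLE_NAME = b.VARIABLE_NAME
    where a.VARIABLE_SOURCE = 'PERSISTED';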


On this, let me say that WHILE you can change the value in the mysqld-auto.cnf file and have the server use that value as the valid one at restart, this is not recommended; instead, please fix my.cnf and remove the information related to PERSIST.
Touching that file is also dangerous, because if you do something silly like removing a double quote, or in any way break the JSON format, the server will not start, and there will be NO error in the log.

{ "Version" : 1 , "mysql_server" : { "wait_timeout": { "Value : "150" , "Metadata" : { "Timestamp" : 1565455891278414, "User" : "root" , "Host" : "localhost" } } } }
                                                           ^^^ missing double quote 
tusa@tusa-dev:/opt/mysql_templates/mysql-8.0.17futex$ ps aux|grep 8113
tusa      8119  0.0  0.0  14224   896 pts/1    S+   12:54   0:00 grep --color=auto 8113
[1]+  Exit 1                  bin/mysqld --defaults-file=./my.cnf

I have opened a bug for this (https://bugs.mysql.com/bug.php?id=96501).

A short deep dive into the code (you can skip it if you don't care)

The new feature is handled in the files <source>/sql/persisted_variable.(h/cc). The new structure dealing with the PERSIST actions is:

struct st_persist_var {
  std::string key;
  std::string value;
  ulonglong timestamp;
  std::string user;
  std::string host;
  bool is_null;
  st_persist_var();
  st_persist_var(THD *thd);
  st_persist_var(const std::string key, const std::string value,
                 const ulonglong timestamp, const std::string user,
                 const std::string host, const bool is_null);
};

The main steps are in the st_persist_var constructors. It should be noted that when creating the timestamp, the code generates a value that is NOT fully compatible with the MySQL function FROM_UNIXTIME. The code assigning the timestamp also includes the microseconds, taking them from the timeval (tv) structure:

st_persist_var::st_persist_var(THD *thd) {
  timeval tv = thd->query_start_timeval_trunc(DATETIME_MAX_DECIMALS);
  timestamp = tv.tv_sec * 1000000ULL + tv.tv_usec;
  user = thd->security_context()->user().str;
  host = thd->security_context()->host().str;
  is_null = false;
}

Where:

    tv.tv_sec = 1565267482
    tv.tv_usec = 692276

This will generate timestamp = 1565267482692276. This TIMESTAMP is not valid in MySQL and cannot be read by the time functions, while the segment related to tv.tv_sec = 1565267482 works perfectly:

(root@localhost) [(none)]>select FROM_UNIXTIME(1565267482);
+---------------------------+
| FROM_UNIXTIME(1565267482) |
+---------------------------+
| 2019-08-08 08:31:22       |
+---------------------------+

(root@localhost) [(none)]>select FROM_UNIXTIME(1565267482692276);
+---------------------------------+
| FROM_UNIXTIME(1565267482692276) |
+---------------------------------+
| NULL                            |
+---------------------------------+

This is because the timestamp with microseconds is formatted differently in MySQL:
PERSIST code = 1565267482692276
MySQL        = 1565267482.692276
If I run select FROM_UNIXTIME(1565267482.692276); I get the right result:

+----------------------------------+
| FROM_UNIXTIME(1565267482.692276) |
+----------------------------------+
| 2019-08-08 08:31:22.692276       |
+----------------------------------+

of course, I can use the trick:

select FROM_UNIXTIME(1565267482692276/1000000);
+-----------------------------------------+
| FROM_UNIXTIME(1565267482692276/1000000) |
+-----------------------------------------+
| 2019-08-08 08:31:22.6922                |
+-----------------------------------------+

Well, that's all for the behind-the-scenes info; keep it in mind if you want to deal with the value coming from the JSON file.

Conclusion

Sometimes the small things can be better than the HUGE shiny things. I have seen many DBAs in trouble because they did not have this simple feature in MySQL, and many MySQL servers failing to start or not behaving as expected. Given that, I welcome PERSIST, and I am sure that people who manage thousands of servers, with different workloads and automation in place, will see it as a good thing as well.

References:

https://dev.mysql.com/doc/refman/8.0/en/persisted-system-variables.html
https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_timestamp
https://lefred.be/content/where-does-my-mysql-configuration-variable-value-come-from/
https://lefred.be/content/what-configuration-settings-did-i-change-on-my-mysql-server/

 

MariaDB 10.4 has been declared GA, and a few presentations on Galera4 were held at recent conferences.

So, I thought, it is time to give it a try and see what is going on.

It is not a secret that I consider the Codership guys my heroes, and that I have pushed for Galera as a solid and very flexible HA solution for many years.

Given that, my first comment is that it is a shame to have Galera4 available only in MariaDB. I would have preferred to test it on the vanilla MySQL version for many reasons, but mainly because MySQL/Oracle is and remains the official main line of the MySQL software, whether you like it or not, and as such I want to see how Galera4 behaves with it. Anyhow, Codership states that the other versions will be out AFTER the summer, and I hope this will be true.

To test the new version, given I do not have it on vanilla MySQL, I decided to use the other distribution, the one coming from Percona. In the end, the tests were done comparing MariaDB 10.4.x with PXC 5.7.x; in short, Galera4 vs Galera3.

I set up the two different pieces of software on the same machines, and I configured them as closely as possible. That said, I did 2 main sets of tests: data ingest and OLTP, both running for 90 minutes, not pushing like hell but gently simulating some traffic. Configuration files can be found here.

Galera4 streaming replication was disabled, following Codership's instructions (wsrep_trx_fragment_size=0).
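For reference, this is how streaming replication can be kept off at runtime (it can equally go in the my.cnf used for the tests); a fragment size of 0 disables it:

-- Disable Galera4 streaming replication (0 = off)
SET GLOBAL wsrep_trx_fragment_size = 0;
SHOW GLOBAL VARIABLES LIKE 'wsrep_trx_fragment%';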

Test 1: Ingest

For the ingest test I used my stresstool application (here), with only 10 threads and 50 batch inserts per thread; the schema definition is in the windmills.json file.

As always, an image says more than many words:

[Chart: ingest execution time by thread]

In general terms, PXC was able to execute the same load in less time than MariaDB.

[Chart: ingest events per thread]

And PXC was able to deal with a higher number of events per thread as well.

In PXC, the average Galera latency was around 9ms on the writer and around 5ms on the receivers, with the same load, same machines, same configuration:

[Graph: PXC Galera latency]

The latency in MariaDB was significantly higher: around 19ms for the writer, and between 9 and 5ms for the receivers.

[Graph: MariaDB Galera latency]

In short, overall PXC 5.7 with Galera3 performed better than MariaDB 10.4 with Galera4.

The amount of data transmitted and received on PXC was higher (a good thing) than on MariaDB:

PXC:

[Graph: PXC replication traffic]

MariaDB:

[Graph: MariaDB replication traffic]

OLTP

For the OLTP test I used sysbench with the OLTP r/w test, 180 threads (90 from each of two different application nodes), 200K rows per table, 40 tables and a 90-minute run.

Let's see what happened:

[Chart: OLTP writes/s and events per thread]

PXC performed better than MariaDB, executing more writes/s and more events per thread/sec.

Checking the latency, we can see:

[Chart: OLTP latency]

Also in this case, PXC had lower average latency than MariaDB during the execution.

What about Galera?

For PXC/Galera3, the average Galera latency was around 3.5ms on the writer and less on the receivers:

[Graph: PXC/Galera3 latency]

In this case, the latency in Galera4 was the same or lower than in Galera3:

[Graph: MariaDB/Galera4 latency]

Also analyzing the MAX latency:

Galera3

[Graph: Galera3 max latency]

Galera4

[Graph: Galera4 max latency]

We can see that Galera4 dealt with it much better than version 3.

 

I have done many other checks, and it seems to me that in OLTP (and I do not exclude this being valid for ingest as well) Galera4 is in some way penalized by the MariaDB platform.

I am just at the start of my investigation and I may be wrong, but I cannot confirm or deny this until Codership releases the code for MySQL.

Conclusions

Galera4 seems to come with some very good new features (please review Seppo's presentation), and one thing I noticed is that it comes with optimized node communication, reducing latency fluctuation.

Just this, for me, is great; plus we will have streaming replication, which can be very useful, but I have not tested it yet.

Nevertheless, I would not move to it just yet. I would wait for the other MySQL distributions to be supported, do some tests, and see where the performance problem is.

Because at the moment, even with a light load, the current version of PXC 5.7/Galera3 runs better than MariaDB/Galera4, so why should I migrate to a platform that locks me in, like MariaDB, and does not give me any benefit (yet)? Especially considering that once Galera4 is available for the standard MySQL versions, we can have all the good that comes with Galera4 without being locked in by MariaDB.

A small note about MariaDB: while I was playing with it, I noticed that by default MariaDB comes with the plugin maturity level set to BETA, which means potentially running, in production, code that is still in the beta stage. No comment!

References

https://github.com/Tusamarco/blogs/tree/master/Galera4

https://galeracluster.com/

Seppo presentation https://www.slideshare.net/SakariKeskitalo/galera-cluster-4-presentation-in-percona-live-austin-2019

https://mariadb.org/mariadb-10-4-4-now-available/

oracle_ace