My MySQL tips valid-rss-rogers

 

When trying to make things better, make our life very complicated.

I was working on a Security Threat Tool script, when I had to learn more about the interaction between static and dynamic privileges in MySQL 8.

Dynamic privileges is a “new” thing added in MySQL 8 to easily extend the privileges definition, and at the same time to provide more granularity. For instance the FLUSH operation now has dedicated Privileges and by scope. 

Dynamic privileges are assigned at runtime. Most of them are active when the server starts. But they can also change in respect to the components or plugin when activated. (https://dev.mysql.com/doc/mysql-security-excerpt/8.0/en/privileges-provided.html#privileges-provided-dynamic)

Static privileges are the classical privileges available in MySQL (https://dev.mysql.com/doc/mysql-security-excerpt/8.0/en/privileges-provided.html#privileges-provided-static)

Those are built into the server and cannot be changed.

So far all is good. If we can give more flexibility to the security mechanism existing in MySQL, well I am all for it.

My first step was to deal with the abuse of SUPER. 

About that the manual comes to help with a section: Migrating Accounts from SUPER to Dynamic Privileges (https://dev.mysql.com/doc/mysql-security-excerpt/8.0/en/privileges-provided.html#dynamic-privileges-migration-from-super) . 

Woo perfect!

Let us play a bit. First let me create a user:

create user secure_test@'localhost' identified by 'secret';
DC2-2(secure_test@localhost) [(none)]>show grants for current_user();
+-------------------------------------------------+
| Grants for secure_test@localhost                |
+-------------------------------------------------+
| GRANT USAGE ON *.* TO `secure_test`@`localhost` |
+-------------------------------------------------+

As you can see I can connect, but have no permissions.

On another terminal with an administrative account, let us do the classical operation to create a DBA:

GRANT ALL on *.* to secure_test@'localhost' WITH GRANT OPTION;

And now I have :

DC2-2(secure_test@localhost) [(none)]>show grants for current_user()\G
*************************** 1. row ***************************
Grants for secure_test@localhost: GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, SHUTDOWN, PROCESS, FILE, REFERENCES, INDEX, ALTER, SHOW DATABASES, SUPER, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER, CREATE TABLESPACE, CREATE ROLE, DROP ROLE ON *.* TO `secure_test`@`localhost` WITH GRANT OPTION
*************************** 2. row ***************************
Grants for secure_test@localhost: GRANT APPLICATION_PASSWORD_ADMIN,AUDIT_ADMIN,BACKUP_ADMIN,BINLOG_ADMIN,BINLOG_ENCRYPTION_ADMIN,CLONE_ADMIN,CONNECTION_ADMIN,ENCRYPTION_KEY_ADMIN,FLUSH_OPTIMIZER_COSTS,FLUSH_STATUS,FLUSH_TABLES,FLUSH_USER_RESOURCES,GROUP_REPLICATION_ADMIN,INNODB_REDO_LOG_ARCHIVE,INNODB_REDO_LOG_ENABLE,PERSIST_RO_VARIABLES_ADMIN,REPLICATION_APPLIER,REPLICATION_SLAVE_ADMIN,RESOURCE_GROUP_ADMIN,RESOURCE_GROUP_USER,ROLE_ADMIN,SERVICE_CONNECTION_ADMIN,SESSION_VARIABLES_ADMIN,SET_USER_ID,SHOW_ROUTINE,SYSTEM_USER,SYSTEM_VARIABLES_ADMIN,TABLE_ENCRYPTION_ADMIN,XA_RECOVER_ADMIN ON *.* TO `secure_test`@`localhost` WITH GRANT OPTION

As you can see I have a bunch of Privileges assigned. 

To be honest, to have to identify exactly what each privilege does and how it interacts with the others is challenging.

Anyhow, the manual tell us:

"For each account identified by the preceding query, determine the operations for which it needs SUPER. Then grant the dynamic privileges corresponding to those operations, and revoke SUPER."

In our case:

revoke SUPER on *.* from secure_test@'localhost';

Which will remove the SUPER privileges, but what else will remain active? 

Let us try one of the easiest things, let us modify the variable super_read_only.

With super I can change the value of the variable without problems , but if I remove the SUPER privileges, what will happen? 

DC2-2(secure_test@localhost) [(none)]>show grants for current_user()\G
*************************** 1. row ***************************
Grants for secure_test@localhost: GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, SHUTDOWN, PROCESS, FILE, REFERENCES, INDEX, ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER, CREATE TABLESPACE, CREATE ROLE, DROP ROLE ON *.* TO `secure_test`@`localhost` WITH GRANT OPTION
*************************** 2. row ***************************
Grants for secure_test@localhost: GRANT APPLICATION_PASSWORD_ADMIN,AUDIT_ADMIN,BACKUP_ADMIN,BINLOG_ADMIN,BINLOG_ENCRYPTION_ADMIN,CLONE_ADMIN,CONNECTION_ADMIN,ENCRYPTION_KEY_ADMIN,FLUSH_OPTIMIZER_COSTS,FLUSH_STATUS,FLUSH_TABLES,FLUSH_USER_RESOURCES,GROUP_REPLICATION_ADMIN,INNODB_REDO_LOG_ARCHIVE,INNODB_REDO_LOG_ENABLE,PERSIST_RO_VARIABLES_ADMIN,REPLICATION_APPLIER,REPLICATION_SLAVE_ADMIN,RESOURCE_GROUP_ADMIN,RESOURCE_GROUP_USER,ROLE_ADMIN,SERVICE_CONNECTION_ADMIN,SESSION_VARIABLES_ADMIN,SET_USER_ID,SHOW_ROUTINE,SYSTEM_USER,SYSTEM_VARIABLES_ADMIN,TABLE_ENCRYPTION_ADMIN,XA_RECOVER_ADMIN ON *.* TO `secure_test`@`localhost` WITH GRANT OPTION

As you can see SUPER is gone. 

DC2-2(secure_test@localhost) [(none)]>set global super_read_only=0;
Query OK, 0 rows affected (0.00 sec)

And I can still modify the global variable. WHY?  

The manual says that SYSTEM_VARIABLES_ADMIN from the dynamic privileges allow us to modify "Enables system variable changes at runtime". Well what if I revoke it? 

revoke SYSTEM_VARIABLES_ADMIN on *.* from  secure_test@'localhost';

DC2-2(secure_test@localhost) [(none)]>set global super_read_only=0;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER or SYSTEM_VARIABLES_ADMIN privilege(s) for this operation

Great! So in order to really remove/limit super I need to also remove SYSTEM_VARIABLES_ADMIN. But is that all?

Well to make it short, no it is not. 

Checking the manual you can see that SUPER is affecting all these:

  • BINLOG_ADMIN,
  • CONNECTION_ADMIN,
  • ENCRYPTION_KEY_ADMIN,
  • GROUP_REPLICATION_ADMIN,
  • REPLICATION_SLAVE_ADMIN,
  • SESSION_VARIABLES_ADMIN,
  • SET_USER_ID,
  • SYSTEM_VARIABLES_ADMIN

And these are the ones by default. But we can also have others depending on the plugins we have active. 

So in theory to be sure we are removing all SUPER related privileges, we should:

REVOKE SUPER, BINLOG_ADMIN, CONNECTION_ADMIN, ENCRYPTION_KEY_ADMIN, GROUP_REPLICATION_ADMIN, REPLICATION_SLAVE_ADMIN, SESSION_VARIABLES_ADMIN, SET_USER_ID, SYSTEM_VARIABLES_ADMIN on *.* from secure_test@'localhost';

This, should leave us with the equivalent of a user without SUPER:

DC2-2(secure_test@localhost) [(none)]>show grants for current_user()\G
*************************** 1. row ***************************
Grants for secure_test@localhost: GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, SHUTDOWN, PROCESS, FILE, REFERENCES, INDEX, ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER, CREATE TABLESPACE, CREATE ROLE, DROP ROLE ON *.* TO `secure_test`@`localhost` WITH GRANT OPTION
*************************** 2. row ***************************
Grants for secure_test@localhost: GRANT APPLICATION_PASSWORD_ADMIN,AUDIT_ADMIN,BACKUP_ADMIN,BINLOG_ENCRYPTION_ADMIN,CLONE_ADMIN,FLUSH_OPTIMIZER_COSTS,FLUSH_STATUS,FLUSH_TABLES,FLUSH_USER_RESOURCES,INNODB_REDO_LOG_ARCHIVE,INNODB_REDO_LOG_ENABLE,PERSIST_RO_VARIABLES_ADMIN,REPLICATION_APPLIER,RESOURCE_GROUP_ADMIN,RESOURCE_GROUP_USER,ROLE_ADMIN,SERVICE_CONNECTION_ADMIN,SHOW_ROUTINE,SYSTEM_USER,TABLE_ENCRYPTION_ADMIN,XA_RECOVER_ADMIN ON *.* TO `secure_test`@`localhost` WITH GRANT OPTION

CONCLUSIONS

In this first blog we have started to explore the usage of Dynamic privileges, and what we need to do to remove the SUPER privilege. 

Nevertheless, the list above is still a bit chaotic and unsafe. We still have SHUTDOWN or RELOAD or FILE, all of them are insecure and should be assigned with great care. In the next article we will see how to deal with Dynamic and Static privileges by Role and we also try to have clearer how they affect one another.

When organizing things helps to simplify life.

In the previous article we start to explore dynamic privileges and the interaction with the static ones. We also saw how to remove SUPER privilege from a DBA account. 

What we did was go by subtraction. But in real life, we should act differently. We should ADD only what is really needed for the account to work correctly.

Adding privilege one by one, and for each user is problematic given the level of interaction they may have, and also prone to mistakes. 

Instead we can use ROLES to group, assign and revoke the correct privileges in a much easier way.

This is becoming even more important in MySQL with the advent of dynamic privileges.

What should we do to correctly use ROLEs? Well first of all design.  

The first step is to identify the ROLES, in doing so we need to keep a basic principle, make it simple, as such let us try to avoid having too many ROLES, or ROLE with too many cross functional privileges.

My proposal: 

  • DBA (The lord of the databases who can do all)
  • MaintenanceAdmin (DBA minions :-) they can perform only some action on the server, and server only)
  • UserAdmin (Can create users assign grants and so on)
  • MonitorUser (See all process and read from performance_schema)
  • DBManager (Can add/drop/modify schemas/tables/triggers/view/routines etc )
  • DBDesigner (Can modify specific objects mostly with a clear identification by schema/table)
  • ReplicationAdmin (Can add/change/remove start/stop replication also GR)
  • BackupAdmin (Can take backup, cannot restore)

We have 8 administrative ROLES and they should cover ALL we need for administrative tasks.

Now let us create them:

CREATE ROLE 'DBA', 'MaintenanceAdmin', 'UserAdmin', 'MonitorUser', 'DBManager', 'DBDesigner', 'ReplicationAdmin', 'BackupAdmin'

DC2-1(root@localhost) [mysql]>Select user,host from mysql.user where account_locked ='Y' and password_expired='Y' order by 1;
+------------------+------+
| user             | host |
+------------------+------+
| BackupAdmin      | %    |
| DBA              | %    |
| DBDesigner       | %    |
| DBManager        | %    |
| MaintenanceAdmin | %    |
| MonitorUser      | %    |
| ReplicationAdmin | %    |
| UserAdmin        | %    |
+------------------+------+
8 rows in set (0.00 sec)

Let us check the roles one by one and see what privileges we need to assign.

Our test user do not have any grant:

DC2-2(secure_test@localhost) [(none)]>show grants for current_user()\G
*************************** 1. row ***************************
Grants for secure_test@localhost: GRANT USAGE ON *.* TO `secure_test`@`localhost`

DBA,

well you may say .. easy GRANT ALL.

Wrong! As already indicated in the previous article, doing that will also assign SUPER, which is deprecated from MySQL 8.0.x. Let us start with the right foot and add ONLY what we need:

GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, DROP, RELOAD, SHUTDOWN, PROCESS, FILE, REFERENCES, INDEX, ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES, LOCK TABLES, EXECUTE, REPLICATION SLAVE, REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, CREATE ROUTINE, ALTER ROUTINE, CREATE USER, EVENT, TRIGGER, CREATE TABLESPACE, CREATE ROLE, DROP ROLE ON *.* TO `DBA`@`%` WITH GRANT OPTION;
   
GRANT APPLICATION_PASSWORD_ADMIN,AUDIT_ADMIN,BACKUP_ADMIN,BINLOG_ADMIN,BINLOG_ENCRYPTION_ADMIN,CLONE_ADMIN,CONNECTION_ADMIN,ENCRYPTION_KEY_ADMIN,FLUSH_OPTIMIZER_COSTS,FLUSH_STATUS,FLUSH_TABLES,FLUSH_USER_RESOURCES,GROUP_REPLICATION_ADMIN,INNODB_REDO_LOG_ARCHIVE,INNODB_REDO_LOG_ENABLE,PERSIST_RO_VARIABLES_ADMIN,REPLICATION_APPLIER,REPLICATION_SLAVE_ADMIN,RESOURCE_GROUP_ADMIN,RESOURCE_GROUP_USER,ROLE_ADMIN,SERVICE_CONNECTION_ADMIN,SESSION_VARIABLES_ADMIN,SET_USER_ID,SHOW_ROUTINE,SYSTEM_USER,SYSTEM_VARIABLES_ADMIN,TABLE_ENCRYPTION_ADMIN,XA_RECOVER_ADMIN ON *.* TO `DBA`@`%` WITH GRANT OPTION;

That should be exactly the same as GRANT ALL, but without SUPER. 

To assign the ROLE to our test user:

GRANT `DBA`@`%` TO `secure_test`@`localhost`

Now our user has:

DC2-2(secure_test@localhost) [(none)]>show grants for current_user()\G
*************************** 1. row ***************************
Grants for secure_test@localhost: GRANT USAGE ON *.* TO `secure_test`@`localhost`
*************************** 2. row ***************************
Grants for secure_test@localhost: GRANT `DBA`@`%` TO `secure_test`@`localhost`

Correct you now see DBA as grant but that is not active:

DC2-2(secure_test@localhost) [(none)]>show grants for DBA@'%'\G
ERROR 1142 (42000): SELECT command denied to user 'secure_test'@'localhost' for table 'user'

To ACTIVATE a role you need to do it explicitly:

 SET DEFAULT ROLE DBA TO  secure_test@'localhost';

And have the user reconnect!

Once a role is activated we can also use:

show grants for current_user()\G

To check which privileges are now active for a specific user.

We can also control which role is active for which user querying the table mysql.default_roles. 

To remove the active role:

SET DEFAULT ROLE NONE TO  secure_test@'localhost';

Anyhow, NOW we have our DBA role available for all DBA and if we need to change something we can do it just there on the ROLE and not on each single user. 

MaintenanceAdmin,

GRANT EVENT, LOCK TABLES, RELOAD, SELECT, SHOW DATABASES, RELOAD, SHUTDOWN ON *.* TO `MaintenanceAdmin`@`%`;
GRANT BINLOG_ADMIN, CONNECTION_ADMIN, ENCRYPTION_KEY_ADMIN, GROUP_REPLICATION_ADMIN, REPLICATION_SLAVE_ADMIN, SESSION_VARIABLES_ADMIN, SET_USER_ID, SYSTEM_VARIABLES_ADMIN ON *.* TO `MaintenanceAdmin`@`%`;
GRANT `MaintenanceAdmin`@`%` TO `secure_test`@`localhost` ;

UserAdmin,

GRANT CREATE USER, GRANT OPTION, RELOAD, SHOW DATABASES ON *.* TO `UserAdmin`@`%`;
GRANT ROLE_ADMIN  ON *.* TO `UserAdmin`@`%`;
GRANT `UserAdmin`@`%` TO `secure_test`@`localhost` ;

MonitorUser,

GRANT PROCESS, REPLICATION CLIENT ON *.* TO `MonitorUser`@`%`;
GRANT SELECT ON performance_schema.* TO `MonitorUser`@`%`;
GRANT `MonitorUser`@`%` TO `secure_test`@`localhost` ;

DBManager,

GRANT ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE TABLESPACE, CREATE TEMPORARY TABLES, CREATE VIEW, DELETE, DROP, DROP ROLE, EVENT, INDEX, INSERT, LOCK TABLES, RELOAD, SELECT, SHOW DATABASES, SHOW VIEW, TRIGGER, UPDATE  ON *.* TO `DBManager`@`%`;
GRANT SET_USER_ID, SHOW_ROUTINE ON *.* TO `DBManager`@`%`;
GRANT `DBManager`@`%` TO `secure_test`@`localhost` ;

DBDesigner,

GRANT ALTER, ALTER ROUTINE, CREATE, CREATE ROUTINE, CREATE VIEW, INDEX, SELECT, SHOW DATABASES, SHOW VIEW, TRIGGER ON *.* TO `DBDesigner`@`%`;
GRANT `DBDesigner`@`%` TO `secure_test`@`localhost` ;

ReplicationAdmin,

GRANT REPLICATION CLIENT ON *.* TO `ReplicationAdmin`@`%`;
GRANT REPLICATION_APPLIER, REPLICATION_SLAVE_ADMIN, GROUP_REPLICATION_ADMIN, SERVICE_CONNECTION_ADMIN ON *.* TO `ReplicationAdmin`@`%`;
GRANT SELECT on performance_schema.* TO `ReplicationAdmin`@`%`;
GRANT SELECT on mysql.* TO `ReplicationAdmin`@`%`;
GRANT `ReplicationAdmin`@`%` TO `secure_test`@`localhost` ;

BackupAdmin,

GRANT EVENT, LOCK TABLES, SELECT, SHOW DATABASES ON *.* TO `BackupAdmin`@`%`;
GRANT BACKUP_ADMIN ON *.* TO `BackupAdmin`@`%`;
GRANT `BackupAdmin`@`%` TO `secure_test`@`localhost` ;

Once all our ROLES are in, we can test them. For instance we can check our ReplicationAdmin checking the Binary Logs and stopping/starting our Group Replication (or normal Replication):

DC2-2(secure_test@localhost) [(none)]>show binary logs;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER, REPLICATION CLIENT privilege(s) for this operation

DC2-2(secure_test@localhost) [(none)]>stop group_replication;
ERROR 1227 (42000): Access denied; you need (at least one of) the SUPER or GROUP_REPLICATION_ADMIN privilege(s) for this operation

Also if created and assigned the role is not active. Let us now enable the role for the user:

SET DEFAULT ROLE ReplicationAdmin  TO  secure_test@'localhost';

Remember to reconnect!

DC2-2(secure_test@localhost) [(none)]>show binary logs;
+---------------+-----------+-----------+
| Log_name      | File_size | Encrypted |
+---------------+-----------+-----------+
| binlog.000011 | 113802321 | No        |
| binlog.000012 |     19278 | No        |
+---------------+-----------+-----------+
2 rows in set (0.00 sec)

DC2-2(secure_test@localhost) [(none)]>stop group_replication;
Query OK, 0 rows affected (5.25 sec)

DC2-2(secure_test@localhost) [(none)]>select * from  performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 7fddf04f-9797-11eb-a193-08002734ed50 | gr5         |        3306 | OFFLINE      |             |                |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
1 row in set (0.00 sec)

DC2-2(secure_test@localhost) [(none)]>start group_replication;
Query OK, 0 rows affected (3.70 sec)

DC2-2(secure_test@localhost) [(none)]>select * from  performance_schema.replication_group_members;
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| CHANNEL_NAME              | MEMBER_ID                            | MEMBER_HOST | MEMBER_PORT | MEMBER_STATE | MEMBER_ROLE | MEMBER_VERSION |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
| group_replication_applier | 79ede65d-9797-11eb-9963-08002734ed50 | gr4         |        3306 | ONLINE       | PRIMARY     | 8.0.23         |
| group_replication_applier | 7e214802-9797-11eb-a0cf-08002734ed50 | gr6         |        3306 | ONLINE       | SECONDARY   | 8.0.23         |
| group_replication_applier | 7fddf04f-9797-11eb-a193-08002734ed50 | gr5         |        3306 | ONLINE       | SECONDARY   | 8.0.23         |
+---------------------------+--------------------------------------+-------------+-------------+--------------+-------------+----------------+
3 rows in set (0.01 sec)

And these are the privileges active:

DC2-2(secure_test@localhost) [(none)]>show grants for current_user()\G
*************************** 1. row ***************************
Grants for secure_test@localhost: GRANT REPLICATION CLIENT ON *.* TO `secure_test`@`localhost`
*************************** 2. row ***************************
Grants for secure_test@localhost: GRANT GROUP_REPLICATION_ADMIN,REPLICATION_APPLIER,REPLICATION_SLAVE_ADMIN,SERVICE_CONNECTION_ADMIN ON *.* TO `secure_test`@`localhost`
*************************** 3. row ***************************
Grants for secure_test@localhost: GRANT SELECT ON `mysql`.* TO `secure_test`@`localhost`
*************************** 4. row ***************************
Grants for secure_test@localhost: GRANT SELECT ON `performance_schema`.* TO `secure_test`@`localhost`
*************************** 5. row ***************************
Grants for secure_test@localhost: GRANT `BackupAdmin`@`%`,`DBA`@`%`,`DBDesigner`@`%`,`DBManager`@`%`,`MaintenanceAdmin`@`%`,`MonitorUser`@`%`,`ReplicationAdmin`@`%`,`UserAdmin`@`%` TO `secure_test`@`localhost`
5 rows in set (0.00 sec)

Conclusions

Using the ROLES allows us to modify the needed privileges in one place, and at the same time will allow us to keep under control the possible proliferation of dynamic privileges due the use of components or plugins, significantly reducing the complexity of having multiple privileges sources.

Roles are normally used in the most common Databases and MySQL had implemented them quite late. But using roles is the natural evolution of how we should deal with user grants when moving from small platforms to medium/large. 

The time when we assign single user privileges IS GONE, welcome to 2021 MySQLlers!

 

For your convenience I am distributing a simple SQL file with all commands to create the Roles as described in this article(link to github)

References

https://dev.mysql.com/doc/refman/8.0/en/roles.html

https://lefred.be/content/some-queries-related-to-mysql-roles/

https://lefred.be/content/mysql-8-0-listing-roles/

 

https://lefred.be/content/mysql-8-0-roles-and-graphml/

Well, we are almost there.

Percona Live 2021 is just around the corner and this year comes with a very “dense” agenda.  PL2021 Speaker cards Marco Tusa

From my side I am giving a speech that is a high-level comparison between PXC and PS with group replication as highly available solutions.

It is a bit new for me to be at such high level, given I prefer to dig into technical, but I also think it was worth to stop and have the chance better understand what gives us what.

Anyhow this blog is not about that.

Given, as said, this year we will have a very “dense” schedule, I initially look at it and get scared.

“Too much stuff!”, that was my initial thought. Then I take a breath, reserve a couple of hours of my personal time, and start to dig-in.

In general, I have to say that thanks to the fact we are having an online event with registered speech it would be easier to attend multiple tracks also when conflicting.

What I did, I start to prune the agenda from the topics I am less interested, like marketing, or from source we know are predicating right and acting in totally different way. Then I navigate the speeches reading abstracts, reading about the speaker (if I was not familiar with), and start to compile my list.

Once done I realized it was a quite good and solid track, and possible worthy to share.

Spoiler alert, I love fantasy, so you may end up in attending one or two speech which will talk about fables, but you know we always need to have some fun time.

Before sharing the schedule, I want to mention that without the incredible work done by Bronwyn Campbell (Global Events Manager in Percona) and Valentina Lago (Global Events Coordinator) Percona Live would not exists. Given that we should thanks them for the HUGE effort done to realize this incredible event.

And now my list, I have left some holes here and there, where there was nothing interesting for me, to attend the speeches that are in conflict.

The link to the agenda: https://perconaliveonline.sched.com/

 

Hope you will find it useful: 

All times are in EDT

May 12
6:30
Percona Server for MySQL in the Enterprise - Dimitri Vanoverbeke
Building and Scaling a Robust Zero-Code Data Pipeline With Open Source Technologies - Paul Brebner

8:00
Projections in ClickHouse - Amos Bird
Revertible, Recoverable Schema Migrations in Vitess - Shlomi Noach

8:30
Creating MySQL User-Defined Functions in C++ Has Never Been Easier - Yura Sorokin
ClickHouse 2021: New Features and Roadmap - Alexey Milovidov

9:00
The Essentials of Search - Kyle Davis

11:00
Organize the Migration of a Hundred Database Clusters to the Cloud - Maxime Fouilleul

13:00
Sharding All The Way Down: Building Fast and Highly Concurrent Databases on Modern Hardware - Avishai Ish-Shalom

13:30
Creating Chaos in Databases - Vadim Tkachenko

14:00
The Top 5 Things You Should Know About Databases on Kubernetes vs VMs - Marco Nicosia - Rachel Heaton - Adam Berlin

15:00
ARM Power! Comparing MySQL x86 vs ARM Performance - Vinicius. Grippa

16:00
Debug a Kubernetes Operator - Philipp Krenn
Kubernetes-on-Rails?! KateSQL: A Shopify-Scale Cloud-Hosted MySQL Platform - Jeremy Cole - Akshay Suryawanshi

17:30
MyRocks - The 30,000 Foot View - Mike Benshoof
Open Source DBaaS with PMM - Steve Hoffman

May 13
7:00
Hybrid TP/AP With MySQL and ClickHouse - Stig Bakken
OSINT - Do You Really Know What Data You're Leaking? - David Busby
How Machine Learning Inside Databases Solves Significant Data-Science Challenges - Jorge Torres - Patricio Cerda-Mardini

7:30
Setup and manage alerts for databases with Integrated alerting in Percona Monitoring and Management - Peter Boros - Zoriana Stefanyshyn

8:30
Test Applications' Storage Stability by Injecting Storage Errors - Keao Yang

9:00
Everything a DBA Should Know About Kubernetes - Janos Pasztor

9:30
MySQL Shell for DBAs - Frédéric Descamps

10:00
How to Develop BPF Tools with libbpf + BPF CO-RE - Wenbo Zhang

10:30
Don’t Feed Me Dog Food and Call it a 5 Star Meal. How the Open Source Landscape is Being Hijacked. - Matt Yonkovit

11:00
Brand New Development Announced at PL - Johannes Schlüter - Kenny Gryp
Scaling Out Distributed Storage Fabric with RocksDB - Yasaswi Kishore - Sandeep Madanala

13:00
Successfully Run Your MySQL NDB Cluster in Kubernetes - Tiago Alves

14:00
GraphQL as Analytical Language for Data Warehouses - Aleksey Studnev

 

 

 

 

 

 

 

 

Percona live is over, and we finally get some feedback about our presentations.

Initially I had 3 presentations to give:

  • Comparing High Available Solutions With Percona XtraDB Cluster and Percona Server With Group Replication.
  • Comparing Hash Join solution, the good, the bad and the worse. The New version 2021.
  • Boosting MySQL NDB cluster & MySQL Innodb Cluster with PrxySQL V2

But when I saw we were having a lot of great submissions, I decided that it would be better for the conference and for Percona to drop 2 of them and leave some free slots for others.

That has always been my line of conduct, I do not think we should have speakers with multiple talks unless exceptions.  I know in the past I had been presenting multiple times, but that is why I am stating this now, more and more.

Anyhow, the remaining talk was Comparing High Available Solutions With Percona XtraDB Cluster and Percona Server With Group Replication. Which is a high level discussion about High Availability and the two main solutions Percona provides for it. 

The presentation was previously recorded and played during the event. This allowed me to interact during the presentation much more than if I had to do it live.thankyou

I thought we had a good level of interactions and discussion which is always a good sign, but when we finally got the feedback I discovered my session had 260 attendees.  

Now, I am still trying to get more info (wondering why it is so difficult to get) but from what I discovered so far, it was in the first three most attended. 

Given that I want to THANK YOU to all who have been there. I want to say one million times THANK YOU to all the ones who had also interacted during the presentation.

The reason to be for a speaker is to have not only people in the room, but people who interact because they are interested.
I like to think this is the result of years of honest and unbias blogging and speaking. So once more thank you! 

Percona had not publish it yet on the official site, I am sharing it from youtube. In case you are still interested to view it, unfortunately the chat interaction is not there, given this was my original recording I had passed for the conference, but better than nothing right?

 

And now few comments about the conference. 

In general it was a good conference, I think we had a decent mix of technical and less technical content. I really enjoyed the key notes with Amanda Brock and after Luis Villa. I can be more in alignment with one instead of the other, but the topic and the side discussion in the chat was great.

About trying to follow the tracks and trying to attend the most interesting speeches (for me), I have to say I was a bit disappointed.

 

In my article here I was giving an indication of which presentations I was going to attend or when in conflict I was expecting to be able to see the recording. 

But there had been some unexpected inconveniences. 

The first one, positive, was that I wanted to attend more talks than the ones in the list. The second one, negative, was that the conference organization did not provide the recording after the speech. Actually as mentioned above most of them are still not published.

Now, while I understand that in case of LIVE talk, you may need some time to put them up, I do not get why we were not publishing the recording, as they were, after the official slot was over. Any additional refining or editing could have been done later, but in the meantime people would have been able to get better service and attend more.

Point was that as soon as I discovered it was not possible to watch the recording after the presentation, I started to keep on at least two screens, and record what I cannot focus on to be able to watch it later. Very annoying and disappointing. But hey space for improvement right? Next online event can be better. 

Another thing I liked was the platform used for the conference, easy, clean and allowed a lot of “offline” sharing. Yes, sometimes we had to reload the page to have the presentation running, but that was not a big deal.

I want to mention again the great work done by the Percona marketing team in making this event a great success. All of them, but special mention is for Bronwyn Campbell and Valentina Lago, without them Percona live 2021 would NOT be what it was.

Finally, while all of us around the globe HOPE to be able to attend the next conference in person, I strongly doubt this will be possible. We MUST keep in mind conferences as Percona Live is a global event. That will make it very unlikely to have people attending in person soon. Obviously the option is to limit it to a regional event, which for me will mean to diminish its relevance in a very impactful way. I miss meeting in person, but I would prefer to have the conference online, if that means I will miss a huge part of the speakers and their content!

And I may be pessimistic, but I doubt we will be in a safe position before the next event. 

Well that is all for now, but stay tuned… because the two missed presentations are coming ;) 

 

To all THANK YOU two hundred sixty thousand times once more.

thankyou

AKA Cluster Error Voting

What is Cluster Error Voting (CEV)?

“Cluster Error Voting is a new feature implemented by Alexey Yurchenko, and it is a protocol for nodes to decide how the cluster will react to problems in replication. When one or several nodes have an issue to apply an incoming transaction(s) (e.g. suspected inconsistency), this new feature helps. In a 5-node cluster, if 2-nodes fail to apply the transaction, they get removed and a DBA can go in to fix what went wrong so that the nodes can rejoin the cluster. (Seppo Jaakola)”

This feature was ported to Percona PXC in version 8.0.21, and as indicated above, it is about increasing the resilience of the cluster especially when TWO nodes fail to operate and may drop from the cluster abruptly. The protocol is activated in a cluster with any number of nodes.  

Before CEV if a node has a problem/error during a transaction, the node having the issue would just report the error in his own log and exit the cluster:

2021-04-23T15:18:38.568903Z 11 [ERROR] [MY-010584] [Repl] Slave SQL: Could not execute Write_rows event on table test.test_voting; Duplicate entry '21' for key 'test_voting.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 0, Error_code: MY-001062
2021-04-23T15:18:38.568976Z 11 [Warning] [MY-000000] [WSREP] Event 3 Write_rows apply failed: 121, seqno 16
2021-04-23T15:18:38.569717Z 11 [Note] [MY-000000] [Galera] Failed to apply write set: gtid: 224fddf7-a43b-11eb-84d5-2ebf2df70610:16 server_id: d7ae67e4-a43c-11eb-861f-8fbcf4f1cbb8 client_id: 40 trx_id: 115 flags: 3
2021-04-23T15:18:38.575439Z 11 [Note] [MY-000000] [Galera] Closing send monitor...
2021-04-23T15:18:38.575578Z 11 [Note] [MY-000000] [Galera] Closed send monitor.
2021-04-23T15:18:38.575647Z 11 [Note] [MY-000000] [Galera] gcomm: terminating thread
2021-04-23T15:18:38.575737Z 11 [Note] [MY-000000] [Galera] gcomm: joining thread
2021-04-23T15:18:38.576132Z 11 [Note] [MY-000000] [Galera] gcomm: closing backend
2021-04-23T15:18:38.577954Z 11 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,3206d174,5)
memb {
	727c277a,1
	}
joined {
	}
left {
	}
partitioned {
	3206d174,1
	d7ae67e4,1
	}
)
2021-04-23T15:18:38.578109Z 11 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2021-04-23T15:18:38.578158Z 11 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2021-04-23T15:18:38.578640Z 11 [Note] [MY-000000] [Galera] gcomm: closed
2021-04-23T15:18:38.578747Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

While the other nodes, will “just” report the node as out of the view:

2021-04-23T15:18:38.561402Z 0 [Note] [MY-000000] [Galera] forgetting 727c277a (tcp://10.0.0.23:4567)
2021-04-23T15:18:38.562751Z 0 [Note] [MY-000000] [Galera] Node 3206d174 state primary
2021-04-23T15:18:38.570411Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,3206d174,6)
memb {
	3206d174,1
	d7ae67e4,1
	}
joined {
	}
left {
	}
partitioned {
	727c277a,1
	}
)
2021-04-23T15:18:38.570679Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2021-04-23T15:18:38.574592Z 0 [Note] [MY-000000] [Galera] forgetting 727c277a (tcp://10.0.0.23:4567)
2021-04-23T15:18:38.574716Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2
2021-04-23

With CEV we have a different process. Let us review it with images first.

Let us start with a cluster based on:

1

3 Nodes where only one works as Primary.

Primary writes and as expected writestes are distributed on all nodes.

insert into test_voting values(null,REVERSE(UUID()), NOW()); <-- Few times

DC1-1(root@localhost) [test]>select * from test_voting;
+----+--------------------------------------+---------------------+
| id | what                                 | when                |
+----+--------------------------------------+---------------------+
|  3 | 05de43720080-938a-be11-305a-6d135601 | 2021-04-24 14:43:34 |
|  6 | 05de43720080-938a-be11-305a-7eb60711 | 2021-04-24 14:43:36 |
|  9 | 05de43720080-938a-be11-305a-6861c221 | 2021-04-24 14:43:37 |
| 12 | 05de43720080-938a-be11-305a-d43f0031 | 2021-04-24 14:43:38 |
| 15 | 05de43720080-938a-be11-305a-53891c31 | 2021-04-24 14:43:39 |
+----+--------------------------------------+---------------------+
5 rows in set (0.00 sec)

Some inexperienced DBA does manual operation on a secondary using the very unsafe feature wsrep_on.

And then by mistake or because he did not understand what he is doing:

insert into test_voting values(17,REVERSE(UUID()), NOW()); <-- with few different ids

At the end of the operation the Secondary node will have:

DC1-2(root@localhost) [test]>select * from test_voting;
+----+--------------------------------------+---------------------+
| id | what                                 | when                |
+----+--------------------------------------+---------------------+
|  3 | 05de43720080-938a-be11-305a-6d135601 | 2021-04-24 14:43:34 |
|  6 | 05de43720080-938a-be11-305a-7eb60711 | 2021-04-24 14:43:36 |
|  9 | 05de43720080-938a-be11-305a-6861c221 | 2021-04-24 14:43:37 |
| 12 | 05de43720080-938a-be11-305a-d43f0031 | 2021-04-24 14:43:38 |
| 15 | 05de43720080-938a-be11-305a-53891c31 | 2021-04-24 14:43:39 |
| 16 | 05de43720080-a39a-be11-405a-82715600 | 2021-04-24 14:50:17 |
| 17 | 05de43720080-a39a-be11-405a-f9d62e22 | 2021-04-24 14:51:14 |
| 18 | 05de43720080-a39a-be11-405a-f5624662 | 2021-04-24 14:51:20 |
| 19 | 05de43720080-a39a-be11-405a-cd8cd640 | 2021-04-24 14:50:23 |
+----+--------------------------------------+---------------------+

Which of course is not in line with the rest of the cluster, that still has the previous data.

Then our guy put the node back:

At this point the Primary does another insert in that table and:

Houston we have a problem! 

The secondary node already has the entry with that ID and cannot perform the insert:

2021-04-24T13:52:51.930184Z 12 [ERROR] [MY-010584] [Repl] Slave SQL: Could not execute Write_rows event on table test.test_voting; Duplicate entry '18' for key 'test_voting.PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log FIRST, end_log_pos 0, Error_code: MY-001062
2021-04-24T13:52:51.930295Z 12 [Warning] [MY-000000] [WSREP] Event 3 Write_rows apply failed: 121, seqno 4928120

But instead of exit from the cluster it will raise a verification through voting:

2021-04-24T13:52:51.932774Z 0 [Note] [MY-000000] [Galera] Member 0(node2) initiates vote on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120,878ded7898c83a72:  Duplicate entry '18' for key 'test_voting.PRIMARY', Error_code: 1062;
2021-04-24T13:52:51.932888Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
   878ded7898c83a72:   1/3
Waiting for more votes.
2021-04-24T13:52:51.936525Z 0 [Note] [MY-000000] [Galera] Member 1(node3) responds to vote on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120,0000000000000000: Success
2021-04-24T13:52:51.936626Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
   0000000000000000:   1/3
   878ded7898c83a72:   1/3
Waiting for more votes.
2021-04-24T13:52:52.003615Z 0 [Note] [MY-000000] [Galera] Member 2(node1) responds to vote on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120,0000000000000000: Success
2021-04-24T13:52:52.003722Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
   0000000000000000:   2/3
   878ded7898c83a72:   1/3
Winner: 0000000000000000

As you can see each node inform the cluster about the success or failure of the operation, the majority wins.

Once the majority had identified the operation was legit, as such, the node that ask for the voting will need to get out from the cluster:

2021-04-24T13:52:52.038510Z 12 [ERROR] [MY-000000] [Galera] Inconsistency detected: Inconsistent by consensus on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120
	 at galera/src/replicator_smm.cpp:process_apply_error():1433
2021-04-24T13:52:52.062666Z 12 [Note] [MY-000000] [Galera] Closing send monitor...
2021-04-24T13:52:52.062750Z 12 [Note] [MY-000000] [Galera] Closed send monitor.
2021-04-24T13:52:52.062796Z 12 [Note] [MY-000000] [Galera] gcomm: terminating thread
2021-04-24T13:52:52.062880Z 12 [Note] [MY-000000] [Galera] gcomm: joining thread
2021-04-24T13:52:52.063372Z 12 [Note] [MY-000000] [Galera] gcomm: closing backend
2021-04-24T13:52:52.085853Z 12 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(NON_PRIM,65a111c6-bb0f,23)
memb {
	65a111c6-bb0f,2
	}
joined {
	}
left {
	}
partitioned {
	aae38617-8dd5,2
	dc4eaa39-b39a,2
	}
)
2021-04-24T13:52:52.086241Z 12 [Note] [MY-000000] [Galera] PC protocol downgrade 1 -> 0
2021-04-24T13:52:52.086391Z 12 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view ((empty))
2021-04-24T13:52:52.150106Z 12 [Note] [MY-000000] [Galera] gcomm: closed
2021-04-24T13:52:52.150340Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = no, bootstrap = no, my_idx = 0, memb_num = 1

It is also nice to notice that now we have a decent level of information about what happened also in the other nodes, the log below is from the Primary:

2021-04-24T13:52:51.932829Z 0 [Note] [MY-000000] [Galera] Member 0(node2) initiates vote on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120,878ded7898c83a72:  Duplicate entry '18' for key 'test_voting.PRIMARY', Error_code: 1062;
2021-04-24T13:52:51.978123Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
…<snip>
2021-04-24T13:52:51.981647Z 0 [Note] [MY-000000] [Galera] Votes over ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120:
   0000000000000000:   2/3
   878ded7898c83a72:   1/3
Winner: 0000000000000000
2021-04-24T13:52:51.981887Z 11 [Note] [MY-000000] [Galera] Vote 0 (success) on ab5deb8e-389d-11eb-b1c0-36eca47bacf0:4928120 is consistent with group. Continue.
2021-04-24T13:52:52.064685Z 0 [Note] [MY-000000] [Galera] declaring aae38617-8dd5 at tcp://10.0.0.31:4567 stable
2021-04-24T13:52:52.064885Z 0 [Note] [MY-000000] [Galera] forgetting 65a111c6-bb0f (tcp://10.0.0.21:4567)
2021-04-24T13:52:52.066916Z 0 [Note] [MY-000000] [Galera] Node aae38617-8dd5 state primary
2021-04-24T13:52:52.071577Z 0 [Note] [MY-000000] [Galera] Current view of cluster as seen by this node
view (view_id(PRIM,aae38617-8dd5,24)
memb {
	aae38617-8dd5,2
	dc4eaa39-b39a,2
	}
joined {
	}
left {
	}
partitioned {
	65a111c6-bb0f,2
	}
)
2021-04-24T13:52:52.071683Z 0 [Note] [MY-000000] [Galera] Save the discovered primary-component to disk
2021-04-24T13:52:52.075293Z 0 [Note] [MY-000000] [Galera] forgetting 65a111c6-bb0f (tcp://10.0.0.21:4567)
2021-04-24T13:52:52.075419Z 0 [Note] [MY-000000] [Galera] New COMPONENT: primary = yes, bootstrap = no, my_idx = 1, memb_num = 2

At this point a DBA can start to investigate and manually fix the inconsistency and have the node rejoin the cluster. In the meanwhile the rest of the cluster continue to operate:

+----+--------------------------------------+---------------------+
| id | what                                 | when                |
+----+--------------------------------------+---------------------+
|  3 | 05de43720080-938a-be11-305a-6d135601 | 2021-04-24 14:43:34 |
|  6 | 05de43720080-938a-be11-305a-7eb60711 | 2021-04-24 14:43:36 |
|  9 | 05de43720080-938a-be11-305a-6861c221 | 2021-04-24 14:43:37 |
| 12 | 05de43720080-938a-be11-305a-d43f0031 | 2021-04-24 14:43:38 |
| 15 | 05de43720080-938a-be11-305a-53891c31 | 2021-04-24 14:43:39 |
| 18 | 05de43720080-938a-be11-405a-d02c7bc5 | 2021-04-24 14:52:51 |
+----+--------------------------------------+---------------------+

Conclusion

Cluster Error Voting (CEV), is a nice feature to have. It helps to understand better what goes wrong and it increases the stability of the cluster, that with the voting has a better way to manage the node expulsion.

Another aspect is the visibility, never underestimate the fact an information is available also on other nodes. Having it available on multiple nodes may help investigations in case the log on the failing node gets lost (for any reasons).

We still do not have active tuple certification, but is a good step, especially given the history we have seen of data drift in PXC/Galera in these 12 years of utilization.

My LAST comment, is that while I agree WSREP_ON can be a very powerful tool in the hands of experts as indicated in my colleague blog https://www.percona.com/blog/2019/03/25/how-to-perform-compatible-schema-changes-in-percona-xtradb-cluster-advanced-alternative/ . That option remains DANGEROUS, and you should never use it UNLESS your name is Przemysław Malkowski and you really know what you are doing.

 

Great MySQL to everybody!

References

https://www.percona.com/doc/percona-xtradb-cluster/8.0/release-notes/Percona-XtraDB-Cluster-8.0.21-12.1.html

https://youtu.be/LbaCyr9Soco

Latest conferences

PL2020
percona_tech_days_aug_2020

We have 175 guests and no members online

oracle_ace