Using and benchmarking Galera in different architectures

=============================================

What interested me most during the second day was, again, synchronous replication and the replication solutions provided by Continuent.

The first session I attended that day was the Galera one, presented by Henrik and Alexey.

The presentation was announced as follows:

"We will present results from benchmarking a MySQL Galera cluster under various workloads and also compare them to how other MySQL high-availability approaches perform. We will also go through the different ways you can setup Galera, some of its architectures are unique among MySQL clustering solutions.

* MySQL Galera

** Synchronous multi-master clustering, what does it mean?

** Load balancing and other options

** WAN replication

** How split brain is handled

** How split brain is handled in WAN replication

* How does it perform?

** In memory workload

** Scale-out for writes - how is it possible?

** Disk bound workload

** WAN replication

** Parallel slave threads

** Allowing slave to replicate (commit) out-of-order

"

I know how passionate Henrik is when talking about Galera, and I partially share the feeling. I say partially because I am still not fully convinced by the numbers, but I am working on that.

Anyhow, aside from the part related to benchmarking, I found the combination of building blocks and elements for the HA solution interesting.

Including redundant load balancers and using the MySQL JDBC driver with Galera is a simple but efficient way to provide HA.

Also important: we can finally drop DRBD, which has been for too long the only synchronous solution for MySQL. DRBD forced us to have one PRIMARY (RW) node and one SECONDARY node that was completely useless.

I also appreciated Alexey's honesty about scaling.

Galera will not scale to infinity, as some fools claim, but it can handle a decent number of nodes.

The actual limit obviously has to be discovered and calibrated against the real load pushed to the nodes; it cannot be defined as an absolute, abstract number.

Also interesting is how the Galera team manages server synchronization by "quorum". In short, with 3 nodes, if one cannot reach the other two but still receives writes (split brain), then at the moment of reunion the other two take over by "quorum".
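To make the 3-node case concrete, here is a minimal sketch of a majority check. It only illustrates the idea of "quorum" as a strict majority of nodes; the function name `has_quorum` is my own, and this is not Galera's actual code.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True if a partition holds a strict majority of the cluster.

    Hypothetical helper for illustration only, not a Galera API.
    """
    return 2 * reachable_nodes > cluster_size


cluster_size = 3

# The isolated node sees only itself: 1 of 3 is not a majority.
print(has_quorum(1, cluster_size))

# The two nodes that still see each other: 2 of 3 is a majority.
print(has_quorum(2, cluster_size))
```

With a strict majority rule, an even split (for example 2 of 4) gives quorum to neither side, which is one reason an odd number of nodes is usually preferred.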

An obvious and immediate problem arises with 3 data centers, one with 6 nodes and the others with 2 nodes each. If the DC with 6 nodes gets disconnected, the valid data will be in the 2 remaining data centers (4 nodes in total), but at the moment of reunion the DC with 6 nodes will hold the majority (6 of 10), and the whole data set will become invalid.

Alexey is working on a way to calculate the "weight" by proximity to fix this issue.
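The talk did not go into how such a "weight" would be calculated, so what follows is only a sketch of the general idea of a weighted quorum, applied to the 6 + 2 + 2 scenario. The node names, the weight values and the `holds_quorum` helper are all assumptions of mine, not Alexey's design.

```python
from typing import Dict, Iterable


def holds_quorum(partition: Iterable[str], weights: Dict[str, float]) -> bool:
    """True if the partition holds strictly more than half of the total weight.

    Hypothetical helper for illustration only, not a Galera API.
    """
    total = sum(weights.values())
    held = sum(weights[node] for node in partition)
    return 2 * held > total


# Three data centers: one with 6 nodes, two with 2 nodes each.
dc1 = [f"dc1-n{i}" for i in range(6)]
dc2 = [f"dc2-n{i}" for i in range(2)]
dc3 = [f"dc3-n{i}" for i in range(2)]

# Plain node counting: every node weighs the same.
equal = {node: 1.0 for node in dc1 + dc2 + dc3}
print(holds_quorum(dc1, equal))        # 6 of 10: the big DC wins
print(holds_quorum(dc2 + dc3, equal))  # 4 of 10: the two small DCs lose

# Made-up weights: nodes in the two small DCs count double.
weighted = {**{n: 1.0 for n in dc1}, **{n: 2.0 for n in dc2 + dc3}}
print(holds_quorum(dc1, weighted))        # 6 of 14: the big DC loses
print(holds_quorum(dc2 + dc3, weighted))  # 8 of 14: the two small DCs win
```

The only point of the sketch is that, once weights enter the sum, the side with the most nodes is no longer automatically the side that wins.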

Honestly, I am not sure that Galera is production ready, but it is for sure the most interesting and easiest solution for simple write scaling.

Reference for Galera at http://codership.com/