DRBD for lazy people ... a guide to DRBD installation
V 1 (05/09/2007)
======================================
How to set up DRBD from scratch
======================================
Get the DRBD code from http://oss.linbit.com/drbd/
==================================================
Expand it in /usr/src/drbd-8.0.X
# cd /usr/src
# tar -xvzf where/you/downloaded/it/drbd-8.0.X.tar.gz
Make sure you are running the kernel you want to build DRBD for.
If you want to use a 2.4.x kernel, you need to use DRBD 0.7.x.
Obviously you should have a build system installed, e.g. gcc, make, etc. ;)
Make sure you have the matching kernel source and that you have already compiled it!
# cd drbd-8.0.X/drbd
# make clean all
# make KDIR=/path/to/kernel/source
# make tools
# make install
# make install-tools
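The kernel/DRBD pairing above can be written down as a small helper. This is only a sketch: drbd_series_for_kernel and kdir_for_kernel are hypothetical names, not DRBD tools, and /lib/modules/<release>/build is the usual (not guaranteed) location of the build tree for an installed kernel.

```shell
# Suggest a DRBD series for a kernel release, following this guide:
# 2.4.x kernels need DRBD 0.7.x, 2.6.x kernels can use the 8.0.x series.
# (hypothetical helper, not part of DRBD)
drbd_series_for_kernel() {
    case "$1" in
        2.4.*) echo "0.7.x" ;;
        2.6.*) echo "8.0.x" ;;
        *)     echo "unknown" ;;   # not covered by this guide
    esac
}

# Usual location of the build tree for a given kernel release
# (assumption: your distribution follows this convention).
kdir_for_kernel() {
    echo "/lib/modules/$1/build"
}
```

Typical use would be drbd_series_for_kernel "$(uname -r)" before downloading, and make KDIR="$(kdir_for_kernel "$(uname -r)")" when building.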
=============================================
Configure DRBD (/etc/drbd.conf)
=============================================
global {
dialog-refresh 10; # set it to 0 to disable redrawing completely. [ default = 1 ]
}
# useless comment.
common {
protocol C; #USE C if you care about YOUR DATA!!!!
startup {
degr-wfc-timeout 0; #wait forever
}
disk {
on-io-error detach;
fencing dont-care; # If something goes wrong, just disconnect and fix it manually: that IS THE SAFEST WAY!!
}
syncer {
rate 16M;
al-extents 1801; #def 127 (must be a prime number)
}
net {
timeout 60;
connect-int 10;
ping-int 10;
}
}
resource mysql {
on <uname -n>#Node1 {
device /dev/drbd0;
disk /dev/sda1;
meta-disk internal;
address 10.0.0.253:7788;
}
on <uname -n> #Node2 {
device /dev/drbd0;
disk /dev/sda1;
meta-disk internal;
address 10.0.0.254:7788;
}
}
resource mysql2 {
on <uname -n>#Node1 {
device /dev/drbd1;
disk /dev/sdb1;
meta-disk internal;
address 10.0.0.253:7789;
}
on <uname -n> #Node2 {
device /dev/drbd1;
disk /dev/sdb1;
meta-disk internal;
address 10.0.0.254:7789;
}
}
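A mistake that is easy to make in a file like this is giving two resources the same device on the same host. The following is a rough sketch of a check for that, assuming the flat layout used above (one "on <host> {" line followed by its "device" line) and real host names without spaces; check_drbd_devices is a hypothetical helper, not a DRBD command.

```shell
# check_drbd_devices: read a drbd.conf-style file on stdin and print
# "host device" for every device that one host uses more than once.
# (hypothetical helper; it only understands the simple layout shown above)
check_drbd_devices() {
    awk '
        # "on node1 {" or "on node1#Node1 {": remember the host name
        $1 == "on" { host = $2; sub(/[{#].*/, "", host) }
        # "device /dev/drbd0;": strip the semicolon and count it per host
        $1 == "device" {
            dev = $2; sub(/;.*/, "", dev)
            if (seen[host, dev]++) print host, dev
        }
    '
}
```

Run it as check_drbd_devices < /etc/drbd.conf; no output means no duplicate was found.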
======================================================
STARTING UP
======================================================
ON BOTH NODES!
# modprobe drbd
# lsmod | grep drbd   # to check that the module is really loaded
You can also use the init script:
# /etc/init.d/drbd start|stop|restart
Create the metadata and bring the resource up:
# drbdadm create-md <resource | all>
# drbdadm up <resource | all>
ONLY on the node that will be the Primary:
# drbdadm -- --overwrite-data-of-peer primary <resource | all>
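To keep the per-node steps straight, here is a small sketch that only prints the commands for a given role, so you can review them before typing anything. It assumes the --overwrite-data-of-peer step is meant for the node chosen as Primary only; first_start_plan is a hypothetical helper and runs nothing.

```shell
# first_start_plan RESOURCE ROLE
# Print (do not run) the first-start commands for one node.
# ROLE is "primary" or "secondary". Hypothetical helper, not a DRBD tool.
first_start_plan() {
    res=$1
    role=$2
    echo "modprobe drbd"
    echo "drbdadm create-md $res"
    echo "drbdadm up $res"
    # Only the node whose data should win is promoted.
    if [ "$role" = "primary" ]; then
        echo "drbdadm -- --overwrite-data-of-peer primary $res"
    fi
}
```

For example, first_start_plan mysql secondary lists three commands, while first_start_plan mysql primary adds the promotion step.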
------------------------[ O P T I O N A L ]--------------------------------------
<<<TO AVOID THE INITIAL SYNCHRONIZATION>>>
On both nodes!
# drbdadm -- 6::::1 set-gi <res>
# drbdadm up <res>
# drbdadm down <res>
# drbdadm dump-md <res> > /tmp/md.txt
# sed -i -r -e 's/0xF{16}/0x0000000000000000/g' /tmp/md.txt
Important: you need to know EXACTLY how to reload the metadata.
To find out, run:
# drbdadm -d dump-md <resource>
Copy the output, which looks something like: drbdmeta /dev/drbd0 v08 /dev/sda1 internal dump-md
and change dump-md to restore-md followed by the edited file:
# drbdmeta /dev/drbd0 v08 /dev/sda1 internal restore-md /tmp/md.txt
# drbdadm up <res>
Done! It should not ask for a full resync unless you invalidate the device.
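If you want to see what the sed step does before touching a real dump, the same substitution can be tried on a sample line. A minimal sketch: clear_bitmap_words is a hypothetical wrapper, and -E is used here as the portable spelling of GNU sed's -r.

```shell
# clear_bitmap_words: replace every all-ones 64-bit word
# (0xFFFFFFFFFFFFFFFF) with zeros, reading stdin and writing stdout.
# Same substitution as the sed command above, without -i.
clear_bitmap_words() {
    sed -E 's/0xF{16}/0x0000000000000000/g'
}
```

Only words made of exactly sixteen uppercase F digits are rewritten; anything else in the dump passes through unchanged. Try it with clear_bitmap_words < /tmp/md.txt | less.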
-----------------------------------------------------------------
=================================================================
DRBD useful commands
=================================================================
drbdadm attach <resource |all>
drbdadm detach <resource |all>
drbdadm connect <resource |all>
drbdadm disconnect <resource |all>
drbdadm up <resource |all>
drbdadm down <resource |all>
------------------------------------
How to change a disk
------------------------------------
ONLY on the node where you change the disk
# drbdadm down <res>
Change the disk
# drbdadm create-md <res>
# drbdadm attach <res>
A full resync starts now! You cannot avoid it, so go for
a ride around the world ;-)
------------------------------------
# drbdadm show-gi mysql
       +--< Current data generation UUID >-
       |                +--< Bitmap's base data generation UUID >-
       |                |                +--< younger history UUID >-
       |                |                |                +-< older history >-
       V                V                V                V
C25DA6C2A0416DAB:F83BE889225BA471:34447320FCBB7BFB:CE0711E0804A8B81:1:1:1:0:0:0
                                                                    ^ ^ ^ ^ ^ ^
                                        -< Data consistency flag >--+ | | | | |
                               -< Data was/is currently up-to-date >--+ | | | |
                                    -< Node was/is currently primary >--+ | | |
                                    -< Node was/is currently connected >--+ | |
           -< Node was in the progress of setting all bits in the bitmap >--+ |
                          -< The peer's disk was out-dated or inconsistent >--+
------------
To modify it:
# drbdadm -- C25DA6C2A0416DAB:F83BE889225BA471:34447320FCBB7BFB:CE0711E0804A8B81:X:X:X:X:X:X set-gi <res>
------------
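The generation identifier string is just ten colon-separated fields, so it is easy to label them in a script. A small sketch follows; explain_gi and the field labels are hypothetical names chosen to match the diagram above.

```shell
# explain_gi GI
# Print one "label=value" line per field of a generation identifier
# string such as the one printed by get-gi / show-gi.
# (hypothetical helper; labels follow the diagram above)
explain_gi() {
    printf '%s\n' "$1" | awk -F: '{
        n = split("current bitmap history_young history_old consistent uptodate was_primary was_connected setting_all_bits peer_outdated", name, " ")
        for (i = 1; i <= n; i++) print name[i] "=" $i
    }'
}
```

For example, explain_gi "$(drbdadm get-gi mysql)" makes it easier to compare the two nodes field by field.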
========================
SPLIT BRAIN SCENARIO
========================
How to recognize it:
Both nodes are StandAlone and Secondary.
In /var/log/messages you will find:
Aug 17 11:20:40 kernel: drbd0: Handshake successful: DRBD Network Protocol version 86
Aug 17 11:20:40 kernel: drbd0: Split-Brain detected, dropping connection!
Aug 17 11:20:40 kernel: drbd0: self 8F9865BAD721B293:96D5A3054B60...
Aug 17 11:20:40 kernel: drbd0: peer FC877D3508BEBFDF:96D5A3054B6...
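A monitoring script can look for that message instead of waiting for somebody to notice. A minimal sketch, assuming the log lines keep the "kernel: drbdN: Split-Brain detected" form shown above; split_brain_devices is a hypothetical helper.

```shell
# split_brain_devices: read syslog lines on stdin and print each DRBD
# device that reported a split brain, once per device.
# (hypothetical helper; the pattern is taken from the log lines above)
split_brain_devices() {
    sed -n 's/.*kernel: \(drbd[0-9]*\): Split-Brain detected.*/\1/p' | sort -u
}
```

Use it as split_brain_devices < /var/log/messages; empty output means the message was not found.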
----------------------------------------
HOW to fix it: the EASY WAY ;-) first
----------------------------------------
Starting point: the device is not mounted on either node, and hopefully YOU can recognize
which node was the last ACTIVE one, i.e. the one containing the good data...
If you can, the fix is easy.
On the node that will be the SECONDARY (its changes are thrown away):
# drbdadm -- --discard-my-data connect <resource>
On the Primary node:
# drbdadm connect <resource>
If it does not come up as primary:
# drbdadm primary <resource>
----------------------------------------------------
HOW to fix it: the more difficult way (be very careful)
----------------------------------------------------
Identify the last good Active node (again, YOU must know which one it is),
then run there:
# drbdadm get-gi mysql
You will get something like:
C6DF6BCECD9703BD:0000000000000000:40C9DB5237D65911:A1985772A0E0B3E3:1:1:0:1:0:0
Run the same command on the Secondary node
and you will get something similar but NOT equal to the above result.
We can read the two results as follows:
(last good Active) Cg : B : H1 : H2 and (not good) Cu : B : H1 : H2.
The Cg and Cu values are different!
To fix it, replace the value of Cu with the value of B
and the value of B with 0.
So, at the end, issue these commands on the Secondary ONLY:
# drbdadm -- <B>:0:<H1>:<H2> set-gi <res>
# drbdadm connect <res>
A full resync will start.
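Rewriting the identifier by hand is where typos happen, so the rule above (Cu takes the value of B, B becomes 0, H1 and H2 stay) can be sketched as a helper that only builds the string. gi_for_secondary is a hypothetical name, and the values in the usage note are made up.

```shell
# gi_for_secondary GI
# Given the identifier read on the node whose data will be discarded
# (Cu:B:H1:H2:flags...), print the value to pass to set-gi: B:0:H1:H2
# (hypothetical helper; it changes nothing on disk)
gi_for_secondary() {
    printf '%s\n' "$1" | awk -F: '{ print $2 ":0:" $3 ":" $4 }'
}
```

With a made-up identifier AAAA:BBBB:CCCC:DDDD:1:1:0:1:0:0 this prints BBBB:0:CCCC:DDDD. Check the result by eye before feeding it to set-gi.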
-------------------------------------------------------
================================
HOW to force full resync
================================
On the node you want to resync:
# drbdadm invalidate <res>
or, from the other node, to resync the peer:
# drbdadm invalidate-remote <res>
==============================================
HOW to SEE the DRBD STATUS ????
==============================================
To check what is going on in your DRBD installation
you can:
See everything in one shot, live:
# watch -n 1 cat /proc/drbd
Every 1.0s: cat /proc/drbd Thu Sep 6 12:26:11 2007
version: 8.0.4 (api:86/proto:86)
SVN Revision: 2947 build by root@<uname -n>, 2007-09-03 17:36:50
0: cs:Connected st:Primary/Secondary ds:UpToDate/UpToDate C r---
ns:4 nr:32 dw:36 dr:1661 al:0 bm:0 lo:0 pe:0 ua:0 ap:0
resync: used:0/31 hits:0 misses:0 starving:0 dirty:0 changed:0
act_log: used:0/127 hits:1 misses:0 starving:0 dirty:0 changed:0
-------------------------------------------------------------------------
What to monitor here? Easy:
lo (local count) Number of open requests to the local I/O subsystem issued by DRBD.
pe (pending) Number of requests sent to the partner, but that have not yet
been answered by the latter.
ap (application pending) Number of block I/O requests forwarded to DRBD,
but not yet answered by DRBD.
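These three counters can be pulled out of /proc/drbd with a few lines of awk, for graphing or alerting. This is a sketch that assumes the 8.0 layout shown above (a "N: cs:..." line followed by a line of key:value counters); drbd_counters is a hypothetical helper.

```shell
# drbd_counters: read /proc/drbd-style text on stdin and print
# "minor lo pe ap" for every device found.
# (hypothetical helper; assumes the 8.0 output layout shown above)
drbd_counters() {
    awk '
        # Device header line, e.g. "0: cs:Connected st:Primary/Secondary ..."
        /^ *[0-9]+: cs:/ { minor = $1; sub(/:$/, "", minor) }
        # Counter line, e.g. "ns:4 nr:32 ... lo:0 pe:0 ua:0 ap:0"
        /(^| )lo:[0-9]+/ {
            for (i = 1; i <= NF; i++) {
                split($i, kv, ":")
                if (kv[1] == "lo") lo = kv[2]
                if (kv[1] == "pe") pe = kv[2]
                if (kv[1] == "ap") ap = kv[2]
            }
            print minor, lo, pe, ap
        }
    '
}
```

Run it as drbd_counters < /proc/drbd. Values that stay high for a long time are the ones worth looking at.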
---------------------------------------------------------------------------
From a custom script (heartbeat, mon, Nagios, BigBrother, etc.)
you can also get a quick and easy summary:
# drbdadm state <resource>    gives --> Primary/Secondary
# drbdadm dstate <resource>   gives --> UpToDate/UpToDate
# drbdadm cstate <resource>   gives --> Connected
By checking these states you can tell whether your DRBD is working
properly or not.
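As an example of such a script, here is a minimal Nagios-style check built on the cstate and dstate strings. It is a sketch only: drbd_check is a hypothetical helper, and treating everything except Connected plus UpToDate/UpToDate as critical is a simplification (a resync in progress would also be reported as critical).

```shell
# drbd_check CSTATE DSTATE
# Print a one-line status and return 0 (OK) or 2 (CRITICAL),
# following the usual Nagios exit code convention.
# (hypothetical helper; the policy here is deliberately strict)
drbd_check() {
    cstate=$1
    dstate=$2
    if [ "$cstate" = "Connected" ] && [ "$dstate" = "UpToDate/UpToDate" ]; then
        echo "OK: $cstate $dstate"
        return 0
    fi
    echo "CRITICAL: $cstate $dstate"
    return 2
}
```

In a real check you would call it as drbd_check "$(drbdadm cstate mysql)" "$(drbdadm dstate mysql)" and exit with its return code.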
==========================================================================
When NOT to USE DRBD
==========================================================================
1) All or most of your data is static.
2) Your application requires sub-second failover (if you need failover in less than 1 minute, choose another solution).
3) You are required to provide synchronous 3-way redundancy.
4) You only need replication for one particular purpose, and want extreme optimization
(for example, you need to replicate ONLY the MySQL service, in which case MySQL
replication could be a better solution, OR MySQL Proxy, I am working on it ;-)).
5) You can afford a SAN.
=========================================
Q&A
=========================================
If you need further help,
contact me at marco@mysql.com
=========================================
If you have comments, please send them to me ...
thanks