Saturday, April 20, 2019

More tips and a checklist for that elusive Percona XtraDB Cluster (PXC/Galera) SST or IST

SST and IST are the magic sauce of Galera and of Percona's implementation of it in MySQL, but there are dragons lurking that can prevent either from working.  Lots of people seem to run into this, so I created the checklist below to help you resolve your SST/IST issues.  At one time or another every one of these items has caused an SST/IST to fail for me.

 In general, when debugging SST/IST issues it is best to use a specific donor for your joiner node so that you can isolate your troubleshooting and testing to a specific set of nodes.

Checklist


  • Are the nodes in the same cluster?  Check wsrep_cluster_name
  • Is the server_id the same for all nodes in the cluster?
  • Is wsrep_node_address correct?  A typo here causes very interesting errors.
  • Is wsrep_node_name correct?  Again, a simple error here causes SST to fail.
  • Is the gcomm:// address list (wsrep_cluster_address) the same for all nodes?  Triple check this.  Use nslookup on every address to check for correctness, and do a reverse nslookup to make sure each FQDN resolves to the expected IP address.  (A small verification sketch follows this checklist.)
  • Is wsrep_sst_donor an FQDN or node name?  Do not use an IP address in this field.  Refer to the documentation for ways to specify it. 
  • Are ports 3306, 4444, 4567, and 4568 open?  This is a biggie so see below for tips on how to test.
  • Are you using xtrabackup_v2 for your wsrep_sst_method?  If not, you should be.  My experiences with rsync, mysqldump, and the deprecated xtrabackup (no _v2) have been poor.  
  • Are xtrabackup versions the same on all nodes?  
  • Are the FQDN and IP address entries in /etc/hosts correct on all nodes?  Test every name and IP address you see against DNS.
  • Examine every ERROR in the MySQL error log; the WARNINGS are sometimes worth a look too.  Occasionally an error unrelated to SST or IST is what actually causes the SST/IST to fail.  
  • Is socat working?  (See below on how to use socat to test ports.)
  • Will the MySQL server start at all?  If you have a valid install you can try something simple like mysqld --wsrep-recover to see if MySQL will even start.  If your datadir is invalid then blow it away, reinitialize MySQL using mysqld --initialize-insecure (which bootstraps the datadir using your my.cnf), and then try to start it with mysqld --wsrep-recover just to make sure it will start.  This may uncover other issues with MySQL that are not specific to SST/IST; it helped me once when an error existed in my.cnf that was not logged to the error log until I ran this standalone test.  
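To make working through the checklist less tedious, here is a minimal shell sketch of the DNS and version checks.  The node names, config path, and use of ssh are assumptions for illustration; substitute the FQDNs from your own gcomm:// list.

    #!/bin/bash
    # Hypothetical node list -- replace with the FQDNs from your gcomm:// address.
    NODES="node1.example.com node2.example.com node3.example.com"

    for host in $NODES; do
        echo "== $host =="
        nslookup "$host"                              # forward lookup: does the name resolve to the IP you expect?
        ip=$(getent hosts "$host" | awk '{print $1}')
        nslookup "$ip"                                # reverse lookup: does the IP resolve back to the expected FQDN?
        # Compare xtrabackup versions and wsrep settings across nodes (assumes ssh access).
        ssh "$host" 'xtrabackup --version; grep -E "wsrep_(cluster_name|cluster_address|node_name|node_address|sst_method|sst_donor)" /etc/my.cnf'
    done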

Port testing

Now about those pesky ports.  The best way to test them is with socat, which is the same utility SST/IST uses to move data between nodes.  The socat utility establishes a bidirectional tunnel between hosts and should be tested on both hosts.   Substitute your own IP addresses below for the donor and joiner.  You must test in both directions to be certain SST/IST will work.

Run this pair of tests for each of ports 4444, 4567, and 4568.  

From the joiner, run this:
    socat TCP4-LISTEN:4444 STDOUT    # this starts a listener on the port

From the donor, run this (you should see "hello" appear on your joiner):
    echo "hello" | socat -t 30 tcp:<joiner_ip_address>:4444 -

Now reverse this and do the same with the donor as the listener.  Repeat for all three ports.  If socat works in both directions then you have eliminated the ports as an issue, so go back and check the items in the checklist once again.  

For port 3306 use the MySQL client from each host to contact the other host.  Since your joiner is down you may need to start it without using the cluster in order to ensure that port 3306 is open.  (See the checklist on how to get a bare bones mysqld setup to run.) 
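For example, run something like this from each host (the host and user are placeholders); any response other than a connection or routing error proves the port is reachable:

    mysql -h <other_host> -P 3306 -u <user> -p -e "SELECT 1"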




I hope this is useful.

See also this article (somewhat dated and redundant) from Percona.
https://www.percona.com/blog/2014/12/30/diagnosing-sst-errors-with-percona-xtradb-cluster-for-mysql/


Friday, March 29, 2019

percona SST fails with broken pipe or no route to host

I ran into a very strange set of issues when trying to build a PXC cluster node using SST.  The SST kept failing either due to "broken pipe" on both the donor and joiner or "no route to host" which was seen frequently on the joiner and less so on the donor.

Lots of head scratching and work with the network team finally figured this one out; Percona support also played an important role.  We first started digging into the usual port issues.  PXC needs ports 3306, 4444, 4567, and 4568 open, and the group-communication port 4567 has to work before an SST can even start (the SST data itself moves over 4444).  Our hosts were located in two data centers and all of them were in the DMZ, so the networking was more complex than normal, with two sets of firewalls and NATting going on as well.  I am no network person, but thankfully I had good support.

Our first dig into the problem showed we had ports closed, and it was unclear on which end things were breaking.  We were able to use tcpdump to finally convince ourselves that traffic could flow across the network on the critical ports.  We did this by running

tcpdump -n -i eth0 host <other_node_ip> > output_file

We ran this on the joiner with the IP address of the donor, and we ran it on the donor with the IP address of the joiner.  This gives you a view of the traffic flow and can confirm that you are actually making the connection.

After much work by the network team to open firewalls we were finally able to see traffic flow between the two hosts.  During our investigation we decided to pin down the joiner and donor so that we would always know which hosts to investigate.  Here comes our first major error: we added wsrep_sst_donor= to our joiner's configuration and used the IP address of the donor!  Don't do this.  Galera specifically recommends against using an IP address here and wants either the wsrep_node_name or the hostname of the donor.  Once we fixed this, our SST succeeded.  But our woes were not over.
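In other words, the joiner should point at the donor by node name (or hostname), never by IP.  A minimal sketch of the joiner's configuration, assuming the donor's wsrep_node_name is node2 (a made-up name):

    # joiner's my.cnf, [mysqld] section -- node2 is a hypothetical donor node name
    wsrep_sst_donor=node2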

We then turned our attention to another node.  It too failed SST and we started head scratching.  Yes, the ports were open.  Yes, we had the hostname in the donor attribute.  After digging in Mr. Google we found a post that mentioned a failed SST due to a version difference.  Voila!  Our puppet manifest had accidentally put the wrong version of MySQL on one of the hosts, and that was the reason for the SST failure.  Once the node was upgraded correctly the SST worked.

Hopefully this will help someone someday.

You can read the full Galera post on why not to use an IP address as the donor here:
http://galeracluster.com/documentation-webpages/mysqlwsrepoptions.html#wsrep-sst-donor3



Thursday, January 10, 2019

create table duplicate key! What?

I ran into an interesting and informative edge case involving asynchronous replication in MySQL and Percona's pt-online-schema-change utility.

In our MySQL 5.7 environment we were adding columns to an 18M-row, 500 GB table.  Unfortunately a plain ALTER not only copies the table to complete the change, it also blocks the world from writing, so we had to come up with another way to manage it.  We have used Percona's pt-osc utility before and have great praise for it.  Our setup involved a PXC cluster and a single standalone slave.
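For reference, a pt-osc run for this kind of change looks roughly like this; the database, table, and column names are made up for illustration, and you should dry-run it before executing:

    # dry run first, then replace --dry-run with --execute
    pt-online-schema-change --alter "ADD COLUMN new_col INT NULL" D=mydb,t=big_table --dry-run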

pt-osc creates a new table, _table_new, with the foreign keys renamed to _old_name, i.e., a leading underscore is added so the FK names in the new table are unique.  But we had previously run pt-osc on this table and swapped the old/new tables successfully, so we did not notice that those FK names had already been changed.  When we ran a new pt-osc, the CREATE TABLE DDL replicated through to the slave and failed there with a duplicate key error.

The fix was simple: drop and recreate the foreign keys without the leading _ (use SET foreign_key_checks=0 to prevent a table scan on the FK rebuild).  Worked like a charm, and we resumed the slave.
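A minimal sketch of that fix, using made-up table and constraint names, run on the server where the DDL failed:

    -- disable FK checks for this session so the constraint is added without scanning the table
    SET foreign_key_checks = 0;
    ALTER TABLE child_table
        DROP FOREIGN KEY _fk_parent_id,
        ADD CONSTRAINT fk_parent_id FOREIGN KEY (parent_id) REFERENCES parent_table (id);
    SET foreign_key_checks = 1;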

I confess that it was not simple to figure this out and thankfully Percona was around to help me.


Sunday, January 6, 2019

with percona xtradb, mysqldump can create inconsistent ddl missing auto increment for example!

When running Percona's XtraDB Cluster with pxc_strict_mode=ENFORCING (the default), LOCK TABLES will generate an error.

mysqldump defaults to --opt, which includes --lock-tables, and that is where the error gets thrown.  Of course you can temporarily change pxc_strict_mode to DISABLED or PERMISSIVE to get around the error, but there can be side effects of doing that on a live system.

A better approach is to run mysqldump without --opt by specifying --skip-opt; on its own, though, that can generate incorrect DDL because it also disables --create-options, which is what puts MySQL-specific attributes into the dumped table definitions.  The safe method, if you want to use --skip-opt, is to add back the options individually and simply omit the --lock-tables implied by --opt.

As shown by mysqldump --help:

    --opt               Same as --add-drop-table, --add-locks, --create-options,
                        --quick, --extended-insert, --lock-tables, --set-charset,
                        and --disable-keys. Enabled by default, disable with
                        --skip-opt.

So omit --lock-tables, include the others, and your DDL will be consistent.
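Something along these lines should do it; the schema name and output file are placeholders, and this is a sketch rather than a canned recipe:

    mysqldump --skip-opt --add-drop-table --add-locks --create-options --quick \
        --extended-insert --set-charset --disable-keys mydb > mydb.sql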

Color me unhappy when I copied a schema from one PXC cluster to another, missed this, and ended up with a copied schema that omitted AUTO_INCREMENT and the automatic update of TIMESTAMP columns.  

Have fun.