Tuesday 28 July 2015

About the new warnings generated during 'PURGE BINARY LOGS' execution

In this blog, I will discuss about warnings that users might face if they are trying to purge active/in_use binary log which are introduced in MySQL-5.6.12/MySQL-5.7.2 releases.  Before jumping directly to that, I will be covering few basic things about PURGE BINARY LOGS command, about safe way to purge the binary logs on a replication forum. If you are good at basics on "PURGE BINARY LOGS" , you can choose to directly move to the section (at the end of the blog) where I was talking about these new warnings.

Introduction:
==========
A binary log file contains “events” that describe database changes occurred on a MySQL Server (typically called Master) that enabled binary logging using --log-bin. On a Master server, the files that are related to binary logging are binary log files (files with binlog_base_name.6_digit serial_number) and a binary log index file (file with binlog_index_base_name.index name) which contains the names of all available binary log files.

A binary log file is rotated when its size reaches the binary log limit (that can be specified using max_binlog_size) or when a user (with RELOAD privileges) issues  FLUSH LOGS command.  If you want to see the list of all the available binary log files on the server, you could use 

"SHOW [BINARY | MASTER]  LOGS"

These files can get accumulated over the period of time and will occupy good amount of disk space.
You can delete all binary log files with the RESET MASTER statement, or a subset of them with PURGE BINARY LOGS manually or automatically by setting expire_logs_days system variable.  

PURGE BINARY LOGS syntax:
=========================
PURGE { BINARY | MASTER } LOGS
{ TO 'log_name' | BEFORE datetime_expr }

This statement deletes all the binary log files listed in the log index file prior to the specified log file name or date. Deleted log files are also removed from the list recorded in the index file, so that the given log file becomes the first in the list.

Screen shot for PURGE BINARY LOGS TO 'log_name' example
=================================================

When you are purging binary logs, you have to make sure that no slaves will require them. To safely purge binary log files that are not required any more, follow this procedure:

  1. On each slave server, check Master_Log_File value in SHOW SLAVE STATUS output to check which master log file it is reading 
  2. Obtain a listing of the binary log files on the master server with SHOW BINARY LOGS.
  3. Determine the earliest log file among all the slaves. This is the target file. If all the slaves are up to date, this is the last log file on the list.
  4. Make a backup of all the log files you are about to delete. (This step is optional, but always advisable.)
  5. Purge all log files up to but not including the target file. Lets say the target file from the step.3 is binlog.000005. You can execute PURGE BINARY LOGS binlog.000005 which will purge all binary logs prior to binlog.000005 but not binlog.000005.
PURGE BINARY LOGS BEFORE datatime_expr:
======================================
If you want to purge all the binary log files before a given datatime, you can use

PURGE BINARY LOGS BEFORE datatime_expr

The BEFORE variant's datetime_expr argument should evaluate to a DATETIME value (a value in 'YYYY-MM-DD hh:mm:ss' format). Lets say we have 4 binary log files. Two of them last modified on July 3rd and two of them are last modified on July 4th



and you want to delete all the files that are modified before July 4th 00.00AM, then the following command can be used
PURGE BINARY LOGS BEFORE '2015-07-04 22:46:26'

Screen shot for PURGE BINARY LOGS BEFORE datetime_expr
=================================================


Purging binary logs using expire_logs_days system variable:
=============================================
You can also set the expire_logs_days system variable to expire binary log files automatically after a given number of days. This variable can be specified at the time of start up using command line parameters or by mentioning in conf file (my.cnf) or using 'SET GLOBAL expire_logs_days=#' command. You should set the variable no lower than the maximum number of days your slaves might lag behind the master. The permitted value range for this variable is between 0 to 99 , 0 being the default for it. The value '0' indicates that there is no automatic purging enabled. Expired binary log files purged either at the time of server start up or when the binary log is rotated.

ER_WARN_PURGE_LOG_IN_USE/ER_WARN_PURGE_LOG_IS_ACTIVE Warnings:
==================================================================
Purging is safe to do while slaves are replicating. You need not stop them.
If you have an active slave's dump thread or active user session that is currently reading one of the log files you are trying to delete, this statement does not delete that 
file. Instead it will generate a warning. This could be one of the reasons for users complaining about still seeing binary log files even after executing 'PURGE BINARY LOGS' command. Before 5.6.12 (or 5.7.2), the command was silently stopped when it finds an active/in use binary log.  But in the later versions, user will see a warning (ER_WARN_PURGE_LOG_IN_USE) if you are trying to purge a binary log that is in use (either a show binary log events on it or a dump thread reading it) and user will see a warning (ER_WARN_PURGE_LOG_IS_ACTIVE) if you are trying to purge the active binary log
where the current ongoing transactions are going into.

Screenshot to show ER_WARN_PURGE_LOG_IN_USE(1867) Warning:
=======================================================

Client1 is reading events from 'master-bin.000004'
Client2 is trying to execute purge binary logs to 'master-bin.000007'

As you can see above, PURGE BINARY LOGS command found that a file (master-bin.000004) is being read by a thread. Hence it did not purge that file and any file that follows in the list of files to be purged and throws a warning ER_WARN_PURGE_LOG_IN_USE(1867) with all possible information

 Screenshot to show ER_WARN_PURGE_LOG_IS_ACTIVE  Warning:
==========================================================

As you can see above, PURGE BINARY LOGS command found that one of the binary log files you are trying to delete is active binary log and it did not purge that file and throws a warning ER_WARN_PURGE_LOG_IS_ACTIVE (1868) with all possible information


Summary:
This blog post provided few basics things about PURGE BINARY LOGS command , safe way to purge the binary logs on a replication forum and at the end I also spoke about warnings (ER_WARN_PURGE_LOG_IN_USE/ER_WARN_PURGE_LOG_IS_ACTIVE) that users might face if they are trying to purge active/in_use binary log which are introduced in MySQL-5.6.12/MySQL-5.7.2 releases. As always, Thank you for using MySQL. If you find any issues, please feel free to create bugs @ bugs.mysql.com.

Saturday 21 December 2013

Why MySQL introduced a new command 'CHANGE REPLICATION FILTER'


The MySQL@Oracle announced 5.7.3 development milestone release, now available for download at dev.mysql.com. You can find the full list of changes and bug fixes in the 5.7.3 Release Notes. One of the changes which went in this milestone release was MAKE –REPLICATION-* FILTER SETTINGS DYNAMIC. With this work the slave options –replicate-* can be changed dynamically through the new command CHANGE REPLICATION FILTER. These new options allow slave replication filters to be changed while the server is running, without requiring a server stop and restart.
This blog is intended to explain the design decision of choosing a “new command” over “extending the existing SET command”.

Alternatives:

  1. MySQL could introduce few global variables (one for each --replicate.* options) which could be settable through a SET command. Currently ‘SET’command syntax does not allow users to give comma separated values for any existing global/local variables.
  2. To support CSV list without changing the ‘SET’ grammar, Users can give CSV in a string (“ “). Parser will pass this string to executor and executor will have to parse this string value and   extract  the individual values. Eg: SET GLOBAL @@replicate_do_db=”db1,db2” i.e., instead of parser, executor should do the separation. If executor is separating the values by implementing        
    a special parsing logic to support commands like these
    • SET GLOBAL @@replicate_do_db=”db1,db2”
    • SET GLOBAL @@replicate_wild_do_table=”db.t\_\%”
    • SET GLOBAL @@replicate_rewrite_db=”db1->db2”
    then the special parsing logic should handle the following special cases as well
    • the value contains special characters like double quotes(“), comma(,), underscore (_), percentage(%), backslash(\) and this list goes on with many more special characters.
    • should respect different CHARACTER SET combinations.
    We can write code to handle all these cases in the special parser logic. But the parser(lex, yacc) is specially designed to handle these kind of cases and many more special cases. So it is always better to let parser separate out the values instead of implementing a special parser which might be error prone and it might become very difficult to maintain it in future

  3. As an another alternative, lets try to see what happens if we extend the parser to accept CSV for these new variables. Eg: SET @@global.replicate_do_db=(db1,db2). Parenthesis”(, )” are to resolve conflicts in the existing SET grammar. In this approach, Parser will give an object to executor which contains list of values. But consider this following special case
    • mysql> SET @@global.replicate_do_db=(db1,db2);
    • mysql> SET @saved_replicate_do_db_list= @@global.replicate_do_db;
    • mysql> SET @@global.replicate_do_db=@saved_replicate_do_db_list;
    To support this special case, we should declare all these new variables as special type of variables so that parser not only evaluates @saved_replicate_do_db_list variable, but also should parse the value (db1,db2) and provide an object to executor which contains list of values which requires good amount of changes in the current MySQL parser design.
  4. Another alternative would be to extend the syntax to take just one value at a time just the same way how –replicate-do-db takes. Problem with this approach is if a user has to set 10 database names in replicate_do_db filter rule, user has to execute 10 SET commands. And also with this design, to remove an item from this list, One has to clear all the items and then execute this command 9 times which is not a good design when we are implementing a new feature. 
MySQL could implement any of the above three alternatives either by resolving the mentioned problems using some *good* amount of changes at all layers or by imposing some limitations on the user (like 'special characters are not allowed' or 'users cannot use local variables to set the value' or 'users should execute the command ‘N’ number of times to set ‘N’ values'). In order to solve all the above problems in a *simpler way*, MySQL introduced a new command ‘CHANGE REPLICATION FILTER’ with no limitations on users.
Advantages of the new command:
    • Parser is used to separate out the values (hence avoiding all the above mentioned problems)
    • Easy to extend the grammar for future filter extensions.

Summary: This blog is intended to explain the design decision of choosing a “new command” over “extending the existing SET command”.  As explained above, MySQL looked into all the possible alternatives and chosen the approach which we believe is best suited to users. We hope this new command will make your life simpler by avoiding few restarts of the server. Please let us know your feedback on the same. If you find any issues feel free to create bugs @ bugs.mysql.com.

Thursday 5 December 2013

Making MySQL Slave Replication Filters Dynamic


In MySQL Replication, users can filter statements either at master (using --binlog-* startup options) or at the slave (using --replicate-* startup options). Prior to MySQL-5.7.3, users could set these filtering rules either through command line parameters or by using my.cnf file. In either case MySQL server must be restarted in order to change the filtering rules. It is not easy to restart MySQL server in real time scenarios (because of downtime issues and also loss of buffer cache resulting in performance problems). It is always helpful having a way to dynamically configure these filtering rules. Particularly in environments where slaves are configured dynamically to replicate certain databases or tables based on load and usage.

In MySQL-5.7.3, a new command "CHANGE REPLICATION FILTER" has been introduced through which users can change the *slave* side replication filters dynamically without the need for restarting the server.

Replication filtering:
At the master, users can control the statements which need to get into the binary log file by specifying startup options (like –binlog-*). Users cannot use these options to control the databases and tables that need to be executed on the slave. Instead, use filtering on the slave to control the events that are executed on the slave. At the slave, decisions about whether to execute or ignore statements received from the master are made according to the --replicate-* options that the slave was started with. In the simplest case, when there are no --replicate-* options, the slave executes all statements that it receives from the master.

At the slave server, SQL thread is the only thread accessing replication filters at any give point of time as shown in the picture above. If the SQL thread is not active, filtering rules can be changed without any harm. Hence a new command “CHANGE REPLICATION FILTER” has been introduced which allows dynamic change of filters.

'CHANGE REPLICATION FILTER' Command:
Starting from MySQL-5.7.3, user can set one or more replication filtering rules on the slave dynamically while the server is up and running. It requires only SQL thread to be down.

The following slave replication filters can be changed dynamically using this command.

  • REPLICATE_DO_DB
  • REPLICATE_IGNORE_DB
  • REPLICATE_DO_TABLE
  • REPLICATE_IGNORE_TABLE
  • REPLICATE_WILD_DO_TABLE
  • REPLICATE_WILD_IGNORE_TABLE
  • REPLICATE_REWRITE_DB



    Syntax:
    CHANGE REPLICATION FILTER = filter[, filter][, ...]

    filter:

    REPLICATE_DO_DB = ([db_list])

    | REPLICATE_IGNORE_DB = ([db_list])

    | REPLICATE_DO_TABLE = ([tbl_list])

    | REPLICATE_IGNORE_TABLE = ([tbl_list])

    | REPLICATE_WILD_DO_TABLE = ([tbl_list])

    | REPLICATE_WILD_IGNORE_TABLE = ([wild_tbl_list])

    | REPLICATE_REWRITE_DB = ([db_pair_list])


    db_list:

    db_name[, db_name][, ...]


    tbl_list:

    tbl_name[, tbl_name][, ...]


    wild_tbl_list:

    'pattern'[, 'pattern'][, ...]


    db_pair_list:

    (db_pair)[, (db_pair)][, ...]


    db_pair:

    from_db, to_db 


    Example:

    CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1,db2);


    CHANGE REPLICATION FILTER REPLICATE_IGNORE_DB=(db1,db2);


    CHANGE REPLICATION FILTER REPLICATE_DO_TABLE=(db1.t1);


    CHANGE REPLICATION FILTER REPLICATE_IGNORE_TABLE=(db2.t2);


    CHANGE REPLICATION FILTER REPLICATE_WILD_DO_TABLE=('db.t%');


    CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE=('db%.a%');


    CHANGE REPLICATION FILTER REPLICATE_REWRITE_DB=((from_db, to_db));

FAQ:
1) What is the equivalent way of doing --replicate-do-db=db1 start option using the command?
A) You can achieve the same using
Eg: CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1)

2) To specify db1 and db2 in replicate-ignore-db, I was using –replicate-ignore-db=db1
–replicate-ignore-db=db2. How do I achieve the same using the command?
A) You should be able to set more than one value to any rule using one command.
Eg: To put db1, db2 database names in replicate-ignore-db, You can do
CHANGE REPLICATION FILTER REPLICATE_IGNORE_DB=(db1, db2)

3) Can I change master side filter rules (--binlog-*) using this command?
A) No, this new command is only for replication filter rules on slave (--replicate-* options ).

4) What are the replication filter rules which can be changed dynamically using this command?
A) You can change all slave side replication filter rules dynamically using this command. Following is the complete list of filter rules which can be set dynamically in MySQL-5.7.3.
  • REPLICATE_DO_DB
  • REPLICATE_IGNORE_DB
  • REPLICATE_DO_TABLE
  • REPLICATE_IGNORE_TABLE
  • REPLICATE_WILD_DO_TABLE
  • REPLICATE_WILD_IGNORE_TABLE
  • REPLICATE_REWRITE_DB

5) When I am using this command, I am getting error 1896 (This operation cannot be performed with a running slave sql thread; run STOP SLAVE SQL_THREAD first.). How do I rectify it?
A) When SQL thread is active, one cannot change the filtering rules. Unlike the case with the server startup options, this statement does not require restarting the server to take effect, only that the slave SQL thread be stopped. As the error message indicates execute 'STOP SLAVE SQL_THREAD', execute CHANGE REPLICATION FILTER command and start the slave SQL thread using 'START SLAVE SQL_THREAD' to make the new rules effective.
mysql> CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1);
ERROR 1896 (HY000): This operation cannot be performed with a running slave sql thread; run STOP SLAVE SQL_THREAD first

mysql> STOP SLAVE SQL_THREAD;
Query OK, 0 rows affected (0.05 sec)

mysql> CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1);
Query OK, 0 rows affected (0.00 sec)

6) Can I change more than one replication filter rules at a time using one command?
A) Yes, You can set multiple filter rules in one command
Eg: CHANGE REPLICATION FILTER REPLICATE_DO_DB=(db1),
REPLICATE_IGNORE_TABLE=(db1.t1);

7) Are brackets ( ) compulsory to set even one value?
A) Yes, You must use '(' ')' to mention the list of values (even for one value).
Eg: CHANGE REPLICATION FILTER REPLICATE_DO_TABLE=(db1.t1)

8) How to reset the values for a particular filter rule?
A) To reset the filter value, use void brackets "()" syntax, i.e, empty list will clear the existing
values
Eg: CHANGE REPLICATION FILTER REPLICATE_WILD_IGNORE_TABLE=();
clears all the existing values for REPLICATE_WILD_IGNORE_TABLE rule.

9) I have used only REPLICATE_IGNORE_TABLE in the command. What happens to other filter rules?
A) Unspecified filter rules will be unchanged.
Eg: CHANGE REPLICATION FILTER REPLICATE_REWRITE_DB = ((db1, db2), (db3,db4)); will clear all the existing list for REPLICATION_REWRITE_DB rule and it will add two db_pairs to the list. And the rest of the filter rules will not be changed.

10) Are there any behavioral changes when compared to server startup options --replicate-*?
A) No, there are no behavioral changes when compared to server startup options.

11) Are these changes made through the new command persistent upon restarting the server?
A) No, the changes are not persistent. You have to change the values again after restarting the server. If you want these changes to be persistent, you can put them in my.cnf file.

12) What will happen if I use same rule multiple times in the same command, will they be appended together?
A) No, if any rule is specified multiple times, the latter list will be considered (the earlier list will be ignored)
Eg: CHANGE REPLICATION FILTER REPLICATION_DO_DB=(db1, db2), REPLICATE_DO_DB=(db3,db4); the first db list db1, db2 list will be ignored and the REPLICATE_DO_DB rule will be updated with the second list db3, db4.

13) I have mentioned filter rules in my.cnf and also changed the values using CHANGE REPLICATION FILTER command. Which one takes high priority?
A) The values changed through the new command will be considered and values read through my.cnf settings will be ignored completely.

14) Will the existing way of setting the replication filters work on MySQL-5.7.3?
A) Yes, there is no change in that behavior. The filter rules can be set using
  • command line parameters at the time of server startup (--replicate-* )
  • set it in my.cnf file before server startups
  • set it dynamically using CHANGE REPLICATION FILTER COMMAND while the server is up and running
15) I have a db name with special characters, non utf-8 characters, eg: db`1, dä
How do I use it in the new command?
A) You have to specify the object name exactly the same way you have created the database using create database command. Note that db_names and table_names in db_list and table_list follow the metadata syntax exactly the same way they were created.
Eg: CHANGE REPLICATION FILTER REPLICATION_DO_DB=(`db``1`, `dä`);
and 'pattern' in wild_tbl_list follows exactly the same rule as it is used in 'LIKE' clause in 'SELECT' command.
Eg: CHANGE REPLICATE FILTER REPLICATION_WILD_IGNORE_TABLE=('db`1.t1');

16) When I am using this new command, I am getting 1794 error (Slave is not configured or failed to initialize properly), What is wrong?
A) This command cannot be executed on a stand-alone server where the replication is disabled.
As the error message says, you must at least set --server-id to enable either a master or a slave.
Additional error messages can be found in the MySQL error log.

17) I have used the command to change the filter rules. Now how to check the changed filter rules?
A) You can use "Show Slave Status" command.

18) What is the new syntax to mention --replicate-rewrite-db= db1->db2.
A) You can use CHANGE REPLICATION FILTER REPLICATION_REWRITE_DB=((db1,db2));
to rewrite all db1's statements into db2's statements. Just like other rules, you can mention more than one db_pair.
Eg: CHANGE REPLICATION FILTER REPLICATION_REWRITE_DB=((db1,db2), (db3,db4)) which tells the server to rewrite all db1 statements into db2 statements and all db3 statements into
db4 statements.

19) Can I add or delete few values from the list?
A) No, this new command is used to replace the old values with the new values and there is
no provision to add /delete values from the list.

20) Are there any test scripts to have a look at the usage in detail?
A) Yes, please look at $MYSQL_HOME/mysql-test/suite/rpl/t/rpl_filter_* for more details on
the usage.

Summary:
This blog post provides a very brief insight about Filtered replication and the new command “CHANGE REPLICATION FILTER” introduced as part of MySQL-5.7.3. It has a short FAQ section about the usage of the command. I hope we have made your life simpler by avoiding few restarts of the server. Please let us know your feedback on the same. If you find any issues feel free to create bugs @ bugs.mysql.com. I cannot end this post without mentioning the contribution from Davi Arnaut (for Bug#67362) which inspired us to implement this new feature. Thanks Davi!

References: