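Today I ran into the annoying 1062 and 1032 errors in MySQL replication yet again.
MySQL replication is starting to feel a bit unreliable to me, and I don't know what replication setups other DBAs use.
I'm wondering whether to switch to a different approach. As traffic grows, a single master with a single slave looks fragile, and the risk of downtime is high.
At first I was fairly optimistic. Going by the error message: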
mysql> show slave status\G;
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.101.210
                  Master_User: backup
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql-bin.001472
          Read_Master_Log_Pos: 339328924
               Relay_Log_File: hostname-relay-bin.004513
                Relay_Log_Pos: 66635985
        Relay_Master_Log_File: mysql-bin.001472
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
              Replicate_Do_DB:
          Replicate_Ignore_DB: mysql,test
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 1032
                   Last_Error: Could not execute Update_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '20551928' for key 'PRIMARY', Error_code: 1062; Can't find record in 'uc_member_info', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.001472, end_log_pos 66636646
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 66635839
              Relay_Log_Space: 339336281
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Update_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '20551928' for key 'PRIMARY', Error_code: 1062; Can't find record in 'uc_member_info', Error_code: 1032; handler error HA_ERR_END_OF_FILE; the event's master log mysql-bin.001472, end_log_pos 66636646
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 1
So the problem is a duplicate primary key (1062) together with a failed row update (1032).
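Reading the status by eye works, but the same check is easy to script. Here is a minimal Python sketch, assuming the output of show slave status\G has been captured as clean text (no timestamp prefixes). The helper names parse_slave_status and error_codes are my own, not part of any MySQL tool, and the sample is a shortened copy of the output above.

```python
import re

def parse_slave_status(text):
    """Parse the vertical output of SHOW SLAVE STATUS into a dict of strings."""
    status = {}
    for line in text.splitlines():
        # Skip the row separator and anything without a "Name: value" shape.
        if line.lstrip().startswith("*") or ":" not in line:
            continue
        # Split on the first colon only; error messages contain more colons.
        name, _, value = line.partition(":")
        status[name.strip()] = value.strip()
    return status

def error_codes(message):
    """Return every 'Error_code: N' number found in an error message, in order."""
    return [int(code) for code in re.findall(r"Error_code:\s*(\d+)", message)]

# Shortened sample taken from the status output shown in this post.
sample = """\
*************************** 1. row ***************************
             Slave_IO_Running: Yes
            Slave_SQL_Running: No
               Last_SQL_Errno: 1032
               Last_SQL_Error: Could not execute Update_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '20551928' for key 'PRIMARY', Error_code: 1062; Can't find record in 'uc_member_info', Error_code: 1032; handler error HA_ERR_END_OF_FILE
"""

status = parse_slave_status(sample)
codes = error_codes(status["Last_SQL_Error"])
print(status["Slave_SQL_Running"], codes)
```

Note that Last_SQL_Errno only reports 1032, while the message text carries both codes, which is why the sketch scans the message as well.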
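A duplicate primary key can normally just be skipped, so I prepared this command:
Command 1: stop slave sql_thread; set global sql_slave_skip_counter=1; start slave sql_thread;
MySQL replication relies on two main threads: 1: IO_Thread 2: SQL_Thread.
Most advice online says to run a plain stop slave / start slave. Here only the sql_thread had stopped abnormally, so I restarted just the sql_thread.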
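The choice of which thread to restart can be written down as a small rule. The Python sketch below is only an illustration: restart_statements is a hypothetical helper, and it takes a dict holding the Slave_IO_Running and Slave_SQL_Running values from show slave status. It only builds the statements and never talks to a server.

```python
def restart_statements(status, skip_events=0):
    """Pick the narrowest STOP/START SLAVE statements for the stopped thread(s)."""
    io_ok = status.get("Slave_IO_Running") == "Yes"
    sql_ok = status.get("Slave_SQL_Running") == "Yes"
    if io_ok and sql_ok:
        return []  # both threads are running, nothing to restart
    if io_ok:
        thread = " SQL_THREAD"   # only the SQL thread is down
    elif sql_ok:
        thread = " IO_THREAD"    # only the IO thread is down
    else:
        thread = ""              # both are down: plain STOP/START SLAVE
    statements = ["STOP SLAVE%s;" % thread]
    # The skip counter only matters when the SQL thread is the one that stopped.
    if skip_events > 0 and not sql_ok:
        statements.append("SET GLOBAL sql_slave_skip_counter = %d;" % skip_events)
    statements.append("START SLAVE%s;" % thread)
    return statements

plan = restart_statements({"Slave_IO_Running": "Yes", "Slave_SQL_Running": "No"}, skip_events=1)
for statement in plan:
    print(statement)
```

For the status in this post, the result is the same three statements as Command 1.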
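After running Command 1, though, the same error kept coming back. Skipping events one at a time was too tedious, and the errors all pointed at a single table: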
feiliu_bbs.uc_member_info
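In the end I couldn't put up with it any longer and decided to dump a fresh copy of the table from the master to the slave. These are the steps I took:
0: Stop replication on the slave
stop slave;
1: On the master, manually create a backup table for the target table uc_member_info, named uc_member_info_bak (where 1=2 copies only the column definitions: no rows and no indexes)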
create table uc_member_info_bak
select * from uc_member_info where 1=2;
2: Export the table
mysqldump -uroot -p dbname tablename > tablename.sql
3: scp the dump to the slave, then import it
mysql>source tablename.sql;
4: Swap the table names
Before swapping, add the indexes that exist on the original table uc_member_info to the new table uc_member_info_bak:
create index idx_xxxx on uc_member_info_bak(column_name);
Swap the names:
alter table uc_member_info rename to uc_member_info_old;
alter table uc_member_info_bak rename to uc_member_info;
5: Restart replication
start slave;
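One side note on the swap in step 4: between the two alter table statements there is a short window in which uc_member_info does not exist. MySQL's RENAME TABLE accepts several renames in one statement and performs them atomically, which closes that window. This is an alternative to what I actually ran, not a record of it. The Python sketch below just builds such a statement; swap_statement and its suffix parameters are names I made up for the illustration.

```python
def swap_statement(table, new_suffix="_bak", old_suffix="_old"):
    """Build one RENAME TABLE statement that swaps the reloaded copy into place."""
    def quote(name):
        # Backtick-quote an identifier, doubling any embedded backtick.
        return "`" + name.replace("`", "``") + "`"
    return "RENAME TABLE %s TO %s, %s TO %s;" % (
        quote(table), quote(table + old_suffix),   # live table -> *_old
        quote(table + new_suffix), quote(table),   # *_bak -> live name
    )

statement = swap_statement("uc_member_info")
print(statement)
```

The renames are applied left to right, so the live name is free by the time the second pair is processed.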
my.cnf was already configured to skip duplicate-primary-key errors:
slave-skip-errors = 1062
so the error log kept filling up with warnings like these:
130922 12:23:20 [Warning] Slave SQL: Could not execute Write_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '17564914' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001185, end_log_pos 694223737, Error_code: 1062
130922 12:23:21 [Warning] Slave SQL: Could not execute Write_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '17564915' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001185, end_log_pos 694274279, Error_code: 1062
130922 12:23:21 [Warning] Slave SQL: Could not execute Write_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '17564916' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001185, end_log_pos 694278383, Error_code: 1062
130922 12:23:21 [Warning] Slave SQL: Could not execute Write_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '17564917' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001185, end_log_pos 694326934, Error_code: 1062
130922 12:23:21 [Warning] Slave SQL: Could not execute Write_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '17564918' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001185, end_log_pos 694340338, Error_code: 1062
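The reason for these warnings:
The uc_member_info table now on the slave was dumped from the master, so it is ahead of the slave's original table, uc_member_info_old. The slave had been lagging behind the master for quite a while, and the master's binlog events from the period when replication was stopped had not yet been applied on the slave. When replication resumed, it replayed inserts for rows that the dump already contained.
The fix:
First stop replication, then look at the last error message: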
130922 12:23:21 [Warning] Slave SQL: Could not execute Write_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '17564918' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001185, end_log_pos 694340338, Error_code: 1062
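Delete the rows whose primary key is >= 17564918. The primary key values only ever increase, but they are not generated by an auto-increment sequence on this table; they come from the primary key of another table. So there is no need to worry about messing up an auto-increment sequence.
After the delete I resumed replication, and the duplicate-primary-key errors stopped appearing.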
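Picking the key out of the log by hand is error-prone when the lines wrap, so here is a rough Python sketch of the same step. It assumes each warning sits on one unwrapped line. The column name uid is a placeholder, since the real primary key column of uc_member_info isn't shown in this post, and duplicate_keys and delete_from_last are names invented for the example. The sample lines are the last two warnings from the log above.

```python
import re

# Matches the duplicate-primary-key warnings written by the slave SQL thread.
DUP_RE = re.compile(
    r"Could not execute Write_rows event on table (?P<table>[^;\s]+); "
    r"Duplicate entry '(?P<key>\d+)' for key 'PRIMARY'"
)

def duplicate_keys(log_text):
    """Return (table, key) pairs for every duplicate-primary-key warning, in log order."""
    return [(m.group("table"), int(m.group("key"))) for m in DUP_RE.finditer(log_text)]

def delete_from_last(log_text, pk_column):
    """Build the DELETE for rows at or beyond the last duplicate key in the log."""
    pairs = duplicate_keys(log_text)
    if not pairs:
        return None
    table, key = pairs[-1]
    return "DELETE FROM %s WHERE %s >= %d;" % (table, pk_column, key)

sample_log = """\
130922 12:23:21 [Warning] Slave SQL: Could not execute Write_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '17564917' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001185, end_log_pos 694326934, Error_code: 1062
130922 12:23:21 [Warning] Slave SQL: Could not execute Write_rows event on table feiliu_bbs.uc_member_info; Duplicate entry '17564918' for key 'PRIMARY', Error_code: 1062; handler error HA_ERR_FOUND_DUPP_KEY; the event's master log mysql-bin.001185, end_log_pos 694340338, Error_code: 1062
"""

# "uid" is a hypothetical column name; substitute the table's real primary key.
delete_sql = delete_from_last(sample_log, "uid")
print(delete_sql)
```

The sketch only prints the statement. Running it against the slave is a separate, deliberate step, and only makes sense while replication is stopped.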
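After about half an hour of replicating, error 1032 showed up again, still on uc_member_info. I now suspect this is not a single-table problem: the consistency of the slave's data as a whole is already broken. I'm ready to give up on patching and rebuild the slave. For now, though, the slave can't be taken down, because the other databases on it still have to serve reads. So it's time for the last resort!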
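The plan is to make replication skip these common errors so that it doesn't stop so easily.
This uses the slave_exec_mode parameter. Setting it to IDEMPOTENT makes the SQL thread suppress duplicate-key (1062) and key-not-found (1032) errors instead of stopping.
(For details on the parameter, see http://blog.csdn.net/zhangbiaobiaobiao/article/details/17072199, where I wrote down the gist.)
After that, I'll verify master/slave consistency and rebuild the slave from a fresh backup.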
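For the consistency check, the basic idea is to compare rows by primary key on both sides. Below is a minimal Python sketch of that comparison. It assumes the rows of one table have already been fetched from the master and the slave into dicts keyed by primary key; the fetching itself is left out. row_digest, compare_tables and the sample rows are all made up for illustration. It is only practical for small tables or for one chunk of a large table at a time.

```python
import hashlib

def row_digest(row):
    """Hash one row (a tuple of column values) into a hex digest."""
    # Join with a separator unlikely to occur in data; None is rendered as NULL.
    joined = "\x1f".join("NULL" if value is None else str(value) for value in row)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()

def compare_tables(master_rows, slave_rows):
    """Compare two {primary_key: row} mappings and report where they disagree."""
    master = {pk: row_digest(row) for pk, row in master_rows.items()}
    slave = {pk: row_digest(row) for pk, row in slave_rows.items()}
    shared = set(master) & set(slave)
    return {
        "missing_on_slave": sorted(set(master) - set(slave)),
        "extra_on_slave": sorted(set(slave) - set(master)),
        "different": sorted(pk for pk in shared if master[pk] != slave[pk]),
    }

# Hypothetical sample rows: (id, name).
master_rows = {1: (1, "a"), 2: (2, "b"), 3: (3, "c")}
slave_rows = {1: (1, "a"), 2: (2, "x"), 4: (4, "d")}

report = compare_tables(master_rows, slave_rows)
print(report)
```

Rows missing on the slave are the ones that would later trigger 1032, and extra rows are candidates for 1062, so the report lines up with the two errors in this post.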
-- Reposted from