1、
MySQL錯(cuò)誤日志里出現(xiàn):
140331 10:08:18 [ERROR] Error reading master configuration
140331 10:08:18 [ERROR] Failed to initialize the master info structure
140331 10:08:18 [Note] Event Scheduler: Loaded 0 events
 
從錯(cuò)誤提示就可以看出和mster info相關(guān),找到數(shù)據(jù)目錄下的master.info文件,直接cat查看
# cat master.info 
 
18
luocs-mysql-
 
bin.000004
267
1.1.1.1
rep1
PASSWORD
3306
60
0
 
 
 
 
 
0
1800.000
 
0
 
可以看出格式不對(duì),解決方法非常簡單,reset slave即可,這樣將會(huì)清空master.info文件,錯(cuò)誤也將消失。
 
———————————————————————————————————————————————————————–
2、
MySQL 往一張大表添加字段時(shí)報(bào)如下錯(cuò)誤:
ERROR 1799 (HY000) at line 1: Creating index 'PRIMARY' required more than 'innodb_online_alter_log_max_size' bytes of modification log. Please try again. 
 
解決方法:
我的數(shù)據(jù)庫為MySQL 5.5版本,innodb_online_alter_log_max_size值為默認(rèn)大小128M。
mysql> show variables like 'innodb_online_alter_log_max_size';
+——————————————+———————-+
| Variable_name                            | Value                |
+——————————————+———————-+
| innodb_online_alter_log_max_size         | 134217728            |
+——————————————+———————-+
1 rows in set (0.00 sec)
 
該參數(shù)為動(dòng)態(tài)參數(shù)且全局的,可通過如下命令加大
mysql> set global innodb_online_alter_log_max_size=402653184;
Query OK, 0 rows affected (0.03 sec)
 
加到合適大小,我往120G大小表里添加字段設(shè)置該值4G,成功執(zhí)行。
 
———————————————————————————————————————————————————————–
3、
MySQL日志:
140306 12:03:25  InnoDB: ERROR: the age of the last checkpoint is 9434024,
InnoDB: which exceeds the log group capacity 9433498.
InnoDB: If you are using big BLOB or TEXT rows, you must set the
InnoDB: combined size of log files at least 10 times bigger than the
InnoDB: largest such row.
 
應(yīng)該是Innodb引擎下日志大小設(shè)置過小導(dǎo)致的,某個(gè)事物產(chǎn)生大量日志,但innodb_log_file_size設(shè)置過小,可以加大解決。
 
解決方法:
直接貼網(wǎng)上找到的方法,如下
STEP 01) Change the following in /etc/my.cnf
[mysqld]
innodb_log_buffer_size          = 32M
innodb_buffer_pool_size         = 3G
innodb_log_file_size            = 768M
STEP 02) mysql -uroot -p -e"SET GLOBAL innodb_fast_shutdown = 0;"
STEP 03) service mysql stop
STEP 04) rm -f /var/lib/mysql/ib_logfile*
STEP 05) service mysql start
I added SET GLOBAL innodb_fast_shutdown = 0;. What does that do? It forces InnoDB to completely purge transactional changes from all of InnoDB moving parts, including the transactional logs (ib_logfile0, ib_logfile1). Thus, there is no need to backup the old ib_logfile0, ib_logfile1. If deleting them makes you nervous, then make Step 04
 
mv /var/lib/mysql/ib_logfile* ..
 
———————————————————————————————————————————————————————–
4、
使用pt-online-schema-change工具添加字段時(shí),收到錯(cuò)誤如下:
# pt-online-schema-change –alter="add column tag_common text default null" –user=root –password=xxxxxxxx D=MYDB,t=MYTB –execute
Cannot connect to D=lsedata_13Q1,h=10.13.7.47,p=…,u=root
No slaves found.  See –recursion-method if host BJL1-Y13-10-ops.gaoder.net has slaves.
Not checking slave lag because no slaves were found and –check-slave-lag was not specified.
 
# A software update is available:
#   * Percona Toolkit 2.2.6 has a possible security issue (CVE-2014-2029) upgrade is recommended. The current version for Percona::Toolkit is 2.2.7.
 
The table `MYDB`.`MYTB` has triggers.  This tool needs to create its own triggers, so the table cannot already have triggers.
 
這是MYTB表上之前就有觸發(fā)器的原因,可以從pt-online-schema-change的工作機(jī)制了解到:
 
1) 如果存在外鍵,根據(jù)alter-foreign-keys-method參數(shù)值,檢測外鍵相關(guān)的表,針對(duì)相應(yīng)的設(shè)置進(jìn)行處理;
2) 創(chuàng)建一個(gè)新的表,表結(jié)構(gòu)修改后的數(shù)據(jù)表,用于從源數(shù)據(jù)表向新表中導(dǎo)入數(shù)據(jù);
3) 創(chuàng)建觸發(fā)器,在復(fù)制數(shù)據(jù)開始之后,將對(duì)源數(shù)據(jù)表繼續(xù)進(jìn)行數(shù)據(jù)修改的操作記錄下來,以便在數(shù)據(jù)復(fù)制結(jié)束后執(zhí)行這些操作,保證數(shù)據(jù)不會(huì)丟失;
4) 復(fù)制數(shù)據(jù),從源數(shù)據(jù)表中復(fù)制數(shù)據(jù)到新表中;
5) 修改外鍵相關(guān)的子表,根據(jù)修改后的數(shù)據(jù),修改外鍵關(guān)聯(lián)的子表;
6) 更改源數(shù)據(jù)表為old表,把新表更改為源表名,并將old表刪除;
7) 刪除觸發(fā)器;
 
pt-online-schema-change詳細(xì)文檔,請(qǐng)閱讀:http://www.percona.com/doc/percona-toolkit/2.1/pt-online-schema-change.html
 
———————————————————————————————————————————————————————–
5、
我們?cè)谑褂胢ysqldump的時(shí)候可能會(huì)遇到如下錯(cuò)誤:
mysqldump: Got error: 1044: Access denied for user 'lseread'@'IP' to database 'lsedata_13q1' when doing LOCK TABLES
 
解決方法:
加上–skip-lock-tables即可,類似如下:
mysqldump -h1.1.1.1  -uuser   -ppassword   -P3306 mydb mytb –where "time <= cast('2014-04-03 16:00' as datetime)" –skip-lock-tables –default-character-set=utf8  > mytb.txt 
 
MySQL5.6開始提供新特性GTID模式,我們的研發(fā)人員在我提供的從庫上dump的時(shí)候遇到如下警告:
Warning: Using a password on the command line interface can be insecure.
Warning: A partial dump from a server that has GTIDs will by default include the GTIDs of all transactions, even those that changed suppressed parts of the database. If you don't want to restore GTIDs, pass –set-gtid-purged=OFF. To make a complete dump, pass –all-databases –triggers –routines –events.
 
雖然有警告,但數(shù)據(jù)還是可以dump出來,郁悶的是,想把dump文件導(dǎo)入到其他版本數(shù)據(jù)庫的時(shí)候?qū)?huì)遇到:
ERROR 1839 (HY000) at line 24: @@GLOBAL.GTID_PURGED can only be set when @@GLOBAL.GTID_MODE = ON.
 
參考如下文:
gtid_executed:
WHEN used WITH global scope, this variable contains a representation OF the SET OF ALL transactions that are logged IN the BINARY log. 
WHEN used WITH SESSION scope, it contains a representation OF the SET OF transactions that are written TO the cache IN the CURRENT SESSION.
Issuing RESET MASTER causes the global VALUE (but NOT the SESSION VALUE) OF this variable TO be reset TO an empty string.
 
解決方法:
dump的時(shí)候加上參數(shù)–gtid-mode=OFF,類似如下:
mysqldump -h1.1.1.1  -uuser   -ppassword   -P3306 mydb mytb –where "time <= cast('2014-04-03 16:00' as datetime)" –skip-lock-tables –default-character-set=utf8 –gtid-mode=OFF > mytb.txt 
 
———————————————————————————————————————————————————————–
6、
給一張大表添加字段,過了一段時(shí)間系統(tǒng)HANG住,添加字段工作中斷。系統(tǒng)重啟之后,想重新添加字段,卻遇到如下錯(cuò)誤:
mysql> alter table mytb add column yyy text default null;
ERROR 1050 (42S01): Table 'mydb/#sql-ib54' already exists
 
查看MySQL err日志:
2014-04-04 09:10:12 10578 [Note] /opt/mysql5.6/bin/mysqld: ready for connections.
Version: '5.6.17-log'  socket: '/opt/mysql5.6/data/mysql.sock'  port: 3307  Source distribution
2014-04-04 09:10:24 10578 [ERROR] InnoDB: Failed to find tablespace for table '"mydb"."#sql-ib54"' in the cache. Attempting to load the tablespace with space id 54.
2014-04-04 09:10:24 52e55940  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
2014-04-04 09:10:24 10578 [ERROR] InnoDB: Could not find a valid tablespace file for 'mydb/#sql-ib54'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
 
查看數(shù)據(jù)目錄,有#開頭的一些文件,如下:
# ls
– 其他表信息忽略,之所以.ibd文件較多是因?yàn)椴僮鞅韒ytb為分區(qū)表
#sql-ib58.ibd  #sql-ib65.ibd #sql-1935_2.frm  #sql-ib59.ibd  #sql-ib66.ibd
#sql-1935_2.par  #sql-ib60.ibd  #sql-ib67.ibd #sql-ib54.ibd    #sql-ib61.ibd
#sql-ib55.ibd    #sql-ib62.ibd #sql-ib56.ibd    #sql-ib63.ibd #sql-ib57.ibd    #sql-ib64.ibd
 
這個(gè)問題我頭一次碰到,沒深入研究過其中細(xì)節(jié),只是想著#開頭為臨時(shí)文件,覺得可以rm,不猶豫直接刪掉。
然后重啟數(shù)據(jù)庫,結(jié)果報(bào)了大量ERROR:
2014-04-04 09:10:12 2b1b9b20dfe0  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
2014-04-04 09:10:12 10578 [ERROR] InnoDB: Could not find a valid tablespace file for 'mydb/#sql-ib54'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2014-04-04 09:10:12 10578 [ERROR] InnoDB: Tablespace open failed for '"mydb"."#sql-ib54"', ignored.
2014-04-04 09:10:12 2b1b9b20dfe0  InnoDB: Operating system error number 2 in a file operation.
InnoDB: The error means the system cannot find the path specified.
InnoDB: If you are installing InnoDB, remember that you must create
InnoDB: directories yourself, InnoDB does not create them.
2014-04-04 09:10:12 10578 [ERROR] InnoDB: Could not find a valid tablespace file for 'mydb/#sql-ib55'. See http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html for how to resolve the issue.
2014-04-04 09:10:12 10578 [ERROR] InnoDB: Tablespace open failed for '"mydb"."#sql-ib55"', ignored.
……
 
遇到問題需要淡定,我嘗試?yán)^續(xù)添加字段,但報(bào)錯(cuò)依然。
然后我閱讀了下錯(cuò)誤日志里給出的http://dev.mysql.com/doc/refman/5.6/en/innodb-troubleshooting-datadict.html文章,才了解到,在獨(dú)立表空間模式下通過shell命令刪除表結(jié)構(gòu)和表空間文件的話就會(huì)遇到這種問題。
 
文中還給出了搶救方法,我找到了如下文段:
Problem with Temporary Table
 
If MySQL crashes in the middle of an ALTER TABLE operation, you may end up with an orphaned temporary table inside the InnoDB tablespace. Using the Table Monitor, you can see listed a table with a name that begins with #sql-. You can perform SQL statements on tables whose name contains the character “#” if you enclose the name within backticks. Thus, you can drop such an orphaned table like any other orphaned table using the method described earlier. To copy or rename a file in the Unix shell, you need to put the file name in double quotation marks if the file name contains “#”.
 
我就創(chuàng)建了一個(gè)新庫,創(chuàng)建了同結(jié)構(gòu)的表,然后再該表上進(jìn)行添加字段操作,這時(shí)候新庫數(shù)據(jù)目錄下又產(chǎn)生#sql-開頭的文件,我把這些文件全部拷貝到原庫目錄下。
接著全部改名,如下:
# mv \#sql-2ff9_1.frm sql-2ff9_1.frm
# mv \#sql-2ff9_1.par sql-2ff9_1.par
# mv \#sql-ib82.ibd sql-ib82.ibd 
# mv \#sql-ib83.ibd sql-ib83.ibd 
# mv \#sql-ib84.ibd sql-ib84.ibd 
# mv \#sql-ib85.ibd sql-ib85.ibd 
# mv \#sql-ib86.ibd sql-ib86.ibd 
# mv \#sql-ib87.ibd sql-ib87.ibd 
# mv \#sql-ib88.ibd sql-ib88.ibd 
# mv \#sql-ib89.ibd sql-ib89.ibd 
# mv \#sql-ib90.ibd sql-ib90.ibd 
# mv \#sql-ib91.ibd sql-ib91.ibd 
# mv \#sql-ib92.ibd sql-ib92.ibd 
# mv \#sql-ib93.ibd sql-ib93.ibd 
# mv \#sql-ib94.ibd sql-ib94.ibd 
# mv \#sql-ib95.ibd sql-ib95.ibd 
 
這樣show tables能查看表信息:
 
mysql> show tables;
+———————+
| Tables_in_mydb   |
+———————+
| #mysql50#sql-2ff9_1 |
+———————+
1 rows in set (0.01 sec)
 
但悲劇的是,我嘗試刪除#mysql50#sql-2ff9_1這個(gè)表,卻始終沒能刪掉。
時(shí)間耗得也不少,開發(fā)那邊一直在詢問進(jìn)展。
我就想到別的方案:
該表rename,創(chuàng)建結(jié)構(gòu)一樣的新表,導(dǎo)數(shù)據(jù)到新表,再往新表里增加字段,rename的表刪除。
 
實(shí)際證明此方案可行。就是數(shù)據(jù)量大,導(dǎo)數(shù)據(jù)過程時(shí)間長,當(dāng)然增加字段時(shí)間也一樣長。
 
那網(wǎng)上有些人說刪除ibdata1和ib_logfile0、1,然后重啟數(shù)據(jù)庫就可以解決,這方法可行嗎?
答案當(dāng)然是NO,在獨(dú)立表空間模式下,ibdata里保存數(shù)據(jù)字典以及UNDO信息,刪除之后重啟數(shù)據(jù)庫將會(huì)生成全新的ibdata,也就是丟失了字典信息和UNDO了。
 
到時(shí)候會(huì)出現(xiàn)如下現(xiàn)象:
mysql> show tables;
+——————-+
| Tables_in_mydb |
+——————-+
| t1          |
+——————-+
1 rows in set (0.01 sec)
 
mysql> select count(*) from t1;
ERROR 1146 (42S02): Table 'mydb.t1' doesn't exist

———————————————————————————————————————————————————————–

 

7、
主從復(fù)制失敗,查看slave日志如下錯(cuò)誤:
140405  4:16:12 [ERROR] Slave I/O: error reconnecting to master 'rep1@10.13.34.199:3306' – retry-time: 60  retries: 86400, Error_code: 2003
140405  6:53:12 [Note] Slave: connected to master 'rep1@10.13.34.199:3306',replication resumed in log 'mysql-bin.000275' at position 192295247
140405  6:53:12 [ERROR] Error reading packet from server: Could not find first log file name in binary log index file ( server_errno=1236)
140405  6:53:12 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file', Error_code: 1236
140405  6:53:12 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000275', position 192295247
140405  6:54:11 [Note] Error reading relay log event: slave SQL thread was killed
140405  6:54:11 [Note] Slave I/O thread: connected to master 'rep1@10.13.34.199:3306',replication started in log 'mysql-bin.000275' at position 192295247
140405  6:54:11 [ERROR] Error reading packet from server: Could not find first log file name in binary log index file ( server_errno=1236)
140405  6:54:11 [ERROR] Slave I/O: Got fatal error 1236 from master when reading data from binary log: 'Could not find first log file name in binary log index file', Error_code: 1236
140405  6:54:11 [Note] Slave I/O thread exiting, read up to log 'mysql-bin.000275', position 192295247
140405  6:54:11 [Note] Slave SQL thread initialized, starting replication in log 'mysql-bin.000275' at position 192295247, relay log './SHUBEI-34-198-relay-bin.000153' position: 192295393
 
從指定的binlog里找到指定的position位置,我發(fā)現(xiàn)這已經(jīng)到頭了,之后就是切了新binlog
# mysqlbinlog mysql-bin.000275 | grep -A 10 192295247 
#140405  3:16:06 server id 1  end_log_pos 192295247     Xid = 468032712
COMMIT/*!*/;
DELIMITER ;
# End of log file
ROLLBACK /* added by mysqlbinlog */;
/*!50003 SET COMPLETION_TYPE=@OLD_COMPLETION_TYPE*/;
 
這下問題好解決了,給從庫指定新的binlog即可。
 
主庫show master status;
mysql> show master status;
+——————+———-+————–+——————+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+——————+———-+————–+——————+
| mysql-bin.000276 |      107 |              | test             |
+——————+———-+————–+——————+
1 row in set (0.00 sec)
 
從庫重新配置復(fù)制:
mysql> stop slave;
Query OK, 0 rows affected (0.00 sec)
 
mysql> change master to master_host='10.13.34.199',master_port=3306,master_user='rep1',master_password='RepSlavE&2013', master_log_file='mysql-bin.000276',master_log_pos=107; 
Query OK, 0 rows affected (0.05 sec)
 
mysql> start slave;
Query OK, 0 rows affected (0.00 sec)
 
查看主從復(fù)制狀態(tài):
mysql> show slave status\G
—-省略—-
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

———————————————————————————————————————————————————————–
8、
解決ERROR 1872 (HY000): Slave failed to initialize relay log info structure from the repository
修改了relay-log的輸出格式之后,重啟MySQL,同步失敗
 
mysql> start slave;
ERROR 1872 (HY000): Slave failed to initialize relay log info structure from the repository
 
MySQL Error日志:
2014-02-20 16:35:19 27094 [ERROR] Failed to open the relay log './luocs166-relay-bin.000007' (relay_log_pos 359).
2014-02-20 16:35:19 27094 [ERROR] Could not find target log file mentioned in relay log info in the index file '/opt/mysql/data/anav-relay-log.index' during relay log initialization.
 
解決方法:
我在查看change master to命令的時(shí)候找到如下文段
mysql> ? change master to
– 省略大部分
The next example shows an operation that is less frequently employed.
It is used when the slave has relay log files that you want it to
execute again for some reason. To do this, the master need not be
reachable. You need only use CHANGE MASTER TO and start the SQL thread
(START SLAVE SQL_THREAD):
 
CHANGE MASTER TO
  RELAY_LOG_FILE='slave-relay-bin.006',
  RELAY_LOG_POS=4025;
 
但我的數(shù)據(jù)庫版本為5.6,使用了GTID模式,使用報(bào)錯(cuò):
mysql> CHANGE MASTER TO
    -> RELAY_LOG_FILE='anav-relay-log.000001',
    -> RELAY_LOG_POS=120;
ERROR 1776 (HY000): Parameters MASTER_LOG_FILE, MASTER_LOG_POS, RELAY_LOG_FILE and RELAY_LOG_POS cannot be set when MASTER_AUTO_POSITION is active.
 
因此如下解決:
mysql> reset slave;
Query OK, 0 rows affected (0.00 sec)
 
mysql> CHANGE MASTER TO
    -> MASTER_HOST='10.19.3.168',
    -> MASTER_USER='repl2',
    -> MASTER_PASSWORD='oracle',
    -> MASTER_AUTO_POSITION = 1;
Query OK, 0 rows affected, 2 warnings (0.02 sec)
 
mysql> start slave;
Query OK, 0 rows affected, 1 warning (0.01 sec)
 
mysql> show slave status\G
…. 省略部分 ….
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes