機房環境:
OS:windows 2008
數據庫版本:oracle 11g
事件起因:
昨天公司運維說機房的HP服務器磁盤壞了,我當時冷汗一下子就出來了,因為這是測試庫,沒有備份,沒有歸檔,但是里面可是有1.5T的數據啊,
![解決ORA-00600: <wbr>internal <wbr>error <wbr>code, <wbr>arguments: <wbr>[kcratr_nab_less_than_odr]錯誤 解決ORA-00600: <wbr>internal <wbr>error <wbr>code, <wbr>arguments: <wbr>[kcratr_nab_less_than_odr]錯誤](http://www.sinaimg.cn/uc/myshow/blog/misc/gif/E___6692EN00SIGG.gif)
下午他們修了一下午,系統可以進去了,但是數據庫在啟動到OPEN的時候報錯SYSTEM01數據文件缺失一千多個數據塊,由于沒有備份和歸檔,所以這個數據庫是掛掉了。
![解決ORA-00600: <wbr>internal <wbr>error <wbr>code, <wbr>arguments: <wbr>[kcratr_nab_less_than_odr]錯誤 解決ORA-00600: <wbr>internal <wbr>error <wbr>code, <wbr>arguments: <wbr>[kcratr_nab_less_than_odr]錯誤](http://www.sinaimg.cn/uc/myshow/blog/misc/gif/E___6706EN00SIGG.gif)
(DBA們一定要做好備份,否則最后全是自己的事情),幸好我們還有一個一周前的冷備庫,所以我直接物理遷移,把數據庫恢復到一周以前了,恢復的過程非常簡單,因為數據庫必須的那些文件路徑都已存在(如參數文件內的DUMP文件等),我只是把參數文件,密碼文件,控制文件,數據文件恢復到指定的目錄,然后直接STARTUP,在nomount和mount階段都是比較順利的,而到open階段的時候,則報錯了讓我鬧心的ORA-600ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr]的錯誤,因為大家都知道,ORA-600是數據庫自己內部的BUG,我也是第一次遇到,問了很多人大家都不會,所以只能自己去摸索解決,值得慶幸的是在摸索了一個小時后,我終于把數據庫OPEN了,以下就是我恢復的步驟,如果感覺對您有所幫助,那么請支持我一下。
![解決ORA-00600: <wbr>internal <wbr>error <wbr>code, <wbr>arguments: <wbr>[kcratr_nab_less_than_odr]錯誤 解決ORA-00600: <wbr>internal <wbr>error <wbr>code, <wbr>arguments: <wbr>[kcratr_nab_less_than_odr]錯誤](http://www.sinaimg.cn/uc/myshow/blog/misc/gif/E___7392ZH00SIGG.gif)
//報錯:ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [1770], [779181], [779205], [], [], [], [], [], [], []
//因為ORA-600的錯誤沒有具體提示,所以很多人無從下手,別著急,我們可以先去告警日志去看看,以下是我的告警日志信息:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [1770], [779181], [779205], [], [], [], [], [], [], [
Incident details in: d:/app/administrator/diag/rdbms/ccxe/ccxe/incident/incdir_2570/ccxe_ora_2164_i2570.trc
Aborting crash recovery due to error 600
Errors in file d:/app/administrator/diag/rdbms/ccxe/ccxe/trace/ccxe_ora_2164.trc:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [1770], [779181], [779205], [], [], [], [], [], [], []
Errors in file d:/app/administrator/diag/rdbms/ccxe/ccxe/trace/ccxe_ora_2164.trc:
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [1770], [779181], [779205], [], [], [], [], [], [], []
ORA-600 signalled during: alter database open...
Tue May 10 10:13:10 2011
Trace dumping is performing id=[cdmp_20110510101310]
Tue May 10 10:13:11 2011
Sweep [inc][2570]: completed
Sweep [inc2][2570]: completed
Tue May 10 10:29:52 2011
Shutting down instance (immediate)
Shutting down instance: further logons disabled
//我們通過告警日志文件好像也沒查到什么有用的信息,別著急,我們還可以再查找對應的TRC文件,我的對應TRC文件則是ccxe_ora_2788_i2164.trc,以下是我的TRC信息
WARNING! Crash recovery of thread 1 seq 1770 is
ending at redo block 779181 but should not have ended before
redo block 779205
//看,有警告了,意思是我在恢復的時候,丟失了redo日志,當時我很納悶,怎么會丟失redo呢?最后我把這個問題定位在傳輸redo的時候可能造成數據塊的丟失,因為我以前發生過這類事情,所以我重新拷貝redo日志,再次將數據庫啟動。
Incident 3771 created, dump file: d:/app/administrator/diag/rdbms/ccxe/ccxe/incident/incdir_3771/ccxe_ora_2164_i3771.trc
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [1770], [779181], [779205], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [1770], [779181], [779205], [], [], [], [], [], [], []
ORA-00600: internal error code, arguments: [kcratr_nab_less_than_odr], [1], [1770], [779181], [779205], [], [], [], [], [], [], []
*** 2011-05-10 10:47:11.138
Stopping background process MMNL
*** 2011-05-10 10:47:12.139
Stopping background process MMON
*** 2011-05-10 10:47:13.259
ksukia: Starting kill, flags = 1
ksukia: killed 0 out of 0 processes.
*** 2011-05-10 10:47:14.652
*** 2011-05-10 10:47:14.652 4132 krsh.c
ARCH: Archival disabled due to shutdown: 1089
*** 2011-05-10 10:47:15.653
*** 2011-05-10 10:47:15.653 4132 krsh.c
ARCH: Archival disabled due to shutdown: 1089
//重新拷貝redo日志到指定目錄,再次啟動數據庫。
Total System Global Area 6413680640 bytes
Fixed Size 2187728 bytes
Variable Size 754978352 bytes
Database Buffers 5637144576 bytes
Redo Buffers 19369984 bytes
Database mounted.
ORA-00338: log 4 of thread 1 is more recent than control file
ORA-00312: online log 4 thread 1:
'D:/APP/ADMINISTRATOR/ORADATA/CCXE/REDO04.LOG'
//報錯果然不一樣了,這次報錯的意思是我的日志比我的控制文件新,要我恢復控制文件。
SQL> recover database using backup controlfile until cancel; 指定日志文件恢復
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:/app/Administrator/oradata/CCXE/redo04.log
ORA-00279: change 128488612 generated at 05/10/2011 11:13:52 needed for thread
ORA-00289: suggestion :
D:/APP/ADMINISTRATOR/FLASH_RECOVERY_AREA/CCXE/ARCHIVELOG/2011_05_10/O1_MF_1_1771_%U_.ARC
ORA-00280: change 128488612 for thread 1 is in sequence #1771
ORA-00278: log file 'D:/app/Administrator/oradata/CCXE/redo04.log' no longer
needed for this recovery
Specify log: {<RET>=suggested | filename | AUTO | CANCEL}
D:/app/Administrator/oradata/CCXE/redo05.log
Log applied.
Media recovery complete.
SQL> alter database open; -必須以RESETLOGS方式打開數據庫
alter database open
ERROR at line 1:
ORA-01589: must use RESETLOGS or NORESETLOGS option for database open
SQL> alter database open RESETLOGS ;
Database altered.
OK,現在數據庫起來了,通過這次出錯我總結了幾點,希望對大家有幫助
1:數據庫不管是不是測試,都盡量備份,否則出問題哭都來不及。
2:數據庫盡量歸檔,這樣就可以恢復數據庫在任何時刻。
3:出錯時不要慌,要仔細查看報錯信息,分析報錯,根據自己所掌握的知識一步一步解決,當你把問題解決
時,你會發現排錯的過程真的是一個非常提升自己的過程。
4:平常一定要有幾個在這方面很強的朋友,因為有的時候并不是自己查資料就能查到的,更多的是需要朋友的
經驗與幫助。
注:以上就是這次錯誤的全部過程,過段時間我會把近期所遇到的錯誤都整理一下,都分享給大家,通過近期的工作我感覺數據庫真的是一個非常龐大的系統,每次遇到新錯的時候我都會很郁悶的解決,因為我發現我的懂的還是太少了,學習畢竟是一個由點到線,由線到網的過程,一個問題由模糊到清晰,由清晰再模糊,反復來幾遍最終都會很明白的。我們要有清晰的理論知識,出錯的時候才會找到出題所在,作為一個DBA冷靜,清晰的頭腦也很重要,通過DBA的這個行業,也是非常鍛煉自己的性格,希望大家都可以在DBA的這條職場之路,越走越遠,祝大家好運。
posted on 2015-03-20 15:08
壞男孩 閱讀(1371)
評論(0) 編輯 收藏 所屬分類:
ORACLE篇章