有個(gè)客戶跟我說(shuō),他們把weblogic從923升級(jí)到923以后,總是提示failed to resume transaction的。當(dāng)時(shí)沒(méi)有看到任何信息,以為跟JTA超時(shí)有關(guān),讓客戶把JTA timeout加大,同時(shí)把下面的設(shè)置加入jdbc-config.xml中??蛻舴答佭^(guò)來(lái)說(shuō):還是不行。 :(
<keep-xa-conn-till-tx-complete>true</keep-xa-conn-till-tx-complete>
<rollback-local-tx-upon-conn-close>true</rollback-local-tx-upon-conn-close>
比較郁悶,后來(lái)想想,客戶的jdbc是使用non-XA的driver, 同時(shí)將emulate-2pc設(shè)為了true,上面連個(gè)參數(shù)是針對(duì)XA的,測(cè)試沒(méi)有效果應(yīng)該是正常的。:) 客戶把具體錯(cuò)誤信息給我發(fā)了過(guò)來(lái),如下:
####<2008-10-27 上午10時(shí)35分28秒 CST> <Error> <JDBC> <SZSEWEB-YSXAPP> <appServer11> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1225074928234> <BEA-001112> <Test "SELECT 1" set up for pool "szseWebDataSource" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]服務(wù)器無(wú)法繼續(xù)執(zhí)行該事務(wù)。說(shuō)明: 3c00000047。".>
奇怪了,對(duì)于non-XA的connection, weblogic做測(cè)試的時(shí)候,應(yīng)該是不會(huì)其transaction的,而且是直接拿底層的物理connection來(lái)做個(gè)select,怎么會(huì)有transaction resume呢? 是不是底層物理連接出現(xiàn)了問(wèn)題? 跟客戶溝通了一下,大概了解了他們的應(yīng)用:通過(guò)jdbc調(diào)用SQL Server的stored procedure,而stored procedure中會(huì)起自己的transaction。 Tx中的操作分兩種類型:
1:業(yè)務(wù)簡(jiǎn)單,數(shù)據(jù)量大
2:數(shù)據(jù)不大,但業(yè)務(wù)比較復(fù)雜
總之,TX中的SQL執(zhí)行時(shí)間比較長(zhǎng)。
開始懷疑跟客戶的stored procedure有關(guān),建議他先去掉里面的TX,果然奏效。因?yàn)閠ransaction一直是我比較模糊的董東,也不敢跟客戶說(shuō):你不能這么寫,這么寫是不可以的。畢竟自己底氣不足啊。不是很清楚jdbc connection的auto commit提交的到底是哪個(gè)事務(wù),是driver的tx,還是stored procedure里的tx。應(yīng)該是前者吧?;艘簧衔?,自己搭了個(gè)測(cè)試環(huán)境,終于復(fù)現(xiàn)了這個(gè)問(wèn)題:
存儲(chǔ)過(guò)程:
1 create proc dbo.TestProc
2 as
3 begin transaction
4 waitfor delay '00:02:00'
5 insert into dbo.TestT_1 values('test')
6 commit
JDBC 代碼:
1 package com.bea.cs.test.jdbc;
2
3 import com.bea.cs.test.utils.JNDIRetriver;
4 import java.sql.*;
5 import java.io.*;
6 import javax.transaction.*;
7
8 public class SQLServerJDBCTest {
9
10 public static void main(String args[])
11 {
12 SQLServerJDBCTest test = new SQLServerJDBCTest();
13
14 for(int loop=0; loop<15; loop++)
15 test.callProc("jdbc/SQLServerNonXADS", loop);
16
17 try{
18 Thread.currentThread().sleep(10000);
19 }catch(Exception e){}
20
21 for(int loop=0; loop<15; loop++)
22 test.checkAutoCommit("jdbc/SQLServerNonXADS");
23 }
24
25 public void checkAutoCommit(String dsName)
26 {
27 CheckAutoCommitThread cacThread = new CheckAutoCommitThread(dsName);
28 cacThread.start();
29 }
30
31 class CheckAutoCommitThread extends Thread
32 {
33 private String dsName = null;
34
35 public CheckAutoCommitThread(String ds){
36 dsName = ds;
37 }
38
39 private void callProc(String dsName, int loop)
40 {
41 ProcThread procThread = new ProcThread(dsName, loop);
42 procThread.start();
43 }
44
45 class ProcThread extends Thread
46 {
47 private String ds = null;
48 private int id = -1;
49
50 public ProcThread(String dsName, int loop)
51 {
52 ds = dsName;
53 id = loop;
54 }
55
56 public void run()
57 {
58 String url = "t3://10.130.2.90:7001";
59 String sql = "{ call TestProc() }";
60 Connection conn = null;
61 JNDIRetriver retriever = new JNDIRetriver(url);
62 try{
63 conn = retriever.getJBDCConnection(ds);
64 boolean autoCommit = conn.getAutoCommit();
65 CallableStatement cstmt = conn.prepareCall(sql);
66
67 //start a thread to close current connection, so that a connection
68 //that attachs a tx will be returned to connection pool and when
69 //it's retrieved from connection pool by other client, the error
70 //will be reproduced.
71 ConnCloseThread closeThread = new ConnCloseThread(conn, id);
72 closeThread.start();
73 long start = System.currentTimeMillis();
74 System.out.println( "execute-" + id + "starts at: " + start/1000.0);
75 cstmt.execute();
76 long end = System.currentTimeMillis();
77 System.out.println("statement " + id + " execute: " + (end-start)/1000.0);
78 conn.close();
79 }catch(Exception e)
80 {
81 try{
82 System.out.println("connection is closed for exception: " + e.getMessage());
83 conn.close();
84 }catch(Exception e1){}
85 e.printStackTrace();
86 }
87 }
88 }
89
90 class ConnCloseThread extends Thread
91 {
92 private Connection connection = null;
93 private int id = -1;
94
95 public ConnCloseThread(Connection conn, int loop){
96 connection = conn;
97 id = loop;
98 }
99
100 public void run()
101 {
102 try{
103 Thread.currentThread().sleep(10000);
104 //connection.rollback();
105 long start = System.currentTimeMillis();
106 System.out.println( "closeConn-" + id + "starts at: " + start/1000.0);
107 connection.close();
108 long end = System.currentTimeMillis();
109 System.out.println("close connection " + id + " takes: " + (end -start)/1000.0);
110 }catch(Exception e){}
111 }
112 }
113 }
114
115
測(cè)試結(jié)果和預(yù)想的有點(diǎn)出入:ConnCloseThread中關(guān)閉連接的時(shí)候,不是立刻返回的。Connection.close()會(huì)觸發(fā)Connection.commit(),而因?yàn)檎{(diào)用的存儲(chǔ)過(guò)程中,存儲(chǔ)過(guò)程起了自己的事務(wù),connection.commit()必須等到存儲(chǔ)過(guò)程結(jié)束才能完成(這個(gè)是microsoft論壇上看到的)。如果所有connection.close()都等到tx commit或rollback完成才執(zhí)行的話,這個(gè)問(wèn)題就不會(huì)出現(xiàn)了??纯次业臏y(cè)試結(jié)果:
statement 5 execute:表示從存儲(chǔ)過(guò)程調(diào)用開始,到調(diào)用返回的時(shí)間
close connection 5 takes:表示關(guān)閉連接耗費(fèi)的時(shí)間(也就是connection.commit()等待存儲(chǔ)過(guò)程事務(wù)結(jié)果的時(shí)間)
statement 5 execute: 125.922
close connection 5 takes: 148.39
statement 14 execute: 130.031
close connection 14 takes: 148.39
statement 2 execute: 134.031
close connection 2 takes: 148.39
statement 6 execute: 138.14
close connection 6 takes: 148.406
statement 8 execute: 142.14
close connection 8 takes: 148.406
statement 0 execute: 146.156
close connection 0 takes: 148.406
statement 3 execute: 162.39
close connection 3 takes: 168.625
statement 11 execute: 166.39
close connection 11 takes: 168.625
statement 13 execute: 120.0
close connection 13 takes: 115.359
statement 12 execute: 150.265
close connection 12 takes: 148.406
statement 9 execute: 154.281
close connection 9 takes: 148.406
statement 1 execute: 158.39
close connection 1 takes: 148.406
statement 4 execute: 170.5
close connection 4 takes: 168.625
statement 10 execute: 174.515
close connection 10 takes: 168.625
statement 7 execute: 178.609
close connection 7 takes: 168.625
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966102> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3b00000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966132> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3e00000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '31' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966142> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3800000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966162> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3a00000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '29' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966172> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3400000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '19' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966172> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3600000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '20' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966182> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3f00000001.".>
從測(cè)試結(jié)果來(lái)看,凡是close connection耗時(shí)比execute statement短的,連接(物理連接)都會(huì)報(bào)出該問(wèn)題。分析原因:通過(guò)weblogic datasource獲取的connection并不是物理connection,而是由weblogic wrapped的connection。這些conection在被close后,并不會(huì)關(guān)閉物理連接,而只是將物理連接還池。我們對(duì)connection的所有操作,最終都會(huì)被delegated到底層物理連接上,即commit(),rollback()最終都是在物理連接上執(zhí)行。如果上面的connection.close(),底層物理連接沒(méi)有等到存儲(chǔ)過(guò)程事務(wù)結(jié)束就返回的話,那么物理連接上應(yīng)該還帶有此次操作的事務(wù),而weblogic這邊不會(huì)關(guān)系物理連接的情況,直接將連接放入connection pool供其它客戶端使用。這時(shí)候如果設(shè)定了test on reserve的話,下次客戶端從data source獲取連接時(shí),weblogic會(huì)檢查這個(gè)物理連接,作一個(gè)select操作的,這個(gè)有問(wèn)題的連接就會(huì)暴露出來(lái),也就是上面的異常。這個(gè)問(wèn)題如果使用driver manager來(lái)獲取連接的話(如果每次都關(guān)閉的話),則不會(huì)出現(xiàn),因?yàn)槭褂玫奈锢磉B接每次都是不同的。還好,weblogic會(huì)幫忙重新創(chuàng)建有問(wèn)題的連接。原因大概了解了,但這是誰(shuí)的問(wèn)題呢? 為什么connection.close()不等存儲(chǔ)過(guò)程的事務(wù)結(jié)束?
結(jié)論:一般而言,我們不建議通過(guò)JDBC調(diào)用存儲(chǔ)過(guò)程的時(shí)候,在存儲(chǔ)過(guò)程中定義事務(wù),應(yīng)該將tx的管理工作交給jdbc去做。 non-xa如此,xa亦如此,畢竟事務(wù)嵌套了以后,管理起來(lái)是個(gè)問(wèn)題,完整性更是個(gè)問(wèn)題。
posted on 2008-10-28 19:38
走走停停又三年 閱讀(9075)
評(píng)論(0) 編輯 收藏 所屬分類:
Database