有個客戶跟我說,他們把weblogic從923升級到923以后,總是提示failed to resume transaction的。當時沒有看到任何信息,以為跟JTA超時有關,讓客戶把JTA timeout加大,同時把下面的設置加入jdbc-config.xml中。客戶反饋過來說:還是不行。 :(
<keep-xa-conn-till-tx-complete>true</keep-xa-conn-till-tx-complete>
<rollback-local-tx-upon-conn-close>true</rollback-local-tx-upon-conn-close>
比較郁悶,后來想想,客戶的jdbc是使用non-XA的driver, 同時將emulate-2pc設為了true,上面連個參數是針對XA的,測試沒有效果應該是正常的。:) 客戶把具體錯誤信息給我發了過來,如下:
####<2008-10-27 上午10時35分28秒 CST> <Error> <JDBC> <SZSEWEB-YSXAPP> <appServer11> <[ACTIVE] ExecuteThread: '0' for queue: 'weblogic.kernel.Default (self-tuning)'> <<WLS Kernel>> <> <> <1225074928234> <BEA-001112> <Test "SELECT 1" set up for pool "szseWebDataSource" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]服務器無法繼續執行該事務。說明: 3c00000047。".>
奇怪了,對于non-XA的connection, weblogic做測試的時候,應該是不會其transaction的,而且是直接拿底層的物理connection來做個select,怎么會有transaction resume呢? 是不是底層物理連接出現了問題? 跟客戶溝通了一下,大概了解了他們的應用:通過jdbc調用SQL Server的stored procedure,而stored procedure中會起自己的transaction。 Tx中的操作分兩種類型:
1:業務簡單,數據量大
2:數據不大,但業務比較復雜
總之,TX中的SQL執行時間比較長。
開始懷疑跟客戶的stored procedure有關,建議他先去掉里面的TX,果然奏效。因為transaction一直是我比較模糊的董東,也不敢跟客戶說:你不能這么寫,這么寫是不可以的。畢竟自己底氣不足啊。不是很清楚jdbc connection的auto commit提交的到底是哪個事務,是driver的tx,還是stored procedure里的tx。應該是前者吧。花了一上午,自己搭了個測試環境,終于復現了這個問題:
存儲過程:
1 create proc dbo.TestProc
2 as
3 begin transaction
4 waitfor delay '00:02:00'
5 insert into dbo.TestT_1 values('test')
6 commit
JDBC 代碼:
1 package com.bea.cs.test.jdbc;
2
3 import com.bea.cs.test.utils.JNDIRetriver;
4 import java.sql.*;
5 import java.io.*;
6 import javax.transaction.*;
7
8 public class SQLServerJDBCTest {
9
10 public static void main(String args[])
11 {
12 SQLServerJDBCTest test = new SQLServerJDBCTest();
13
14 for(int loop=0; loop<15; loop++)
15 test.callProc("jdbc/SQLServerNonXADS", loop);
16
17 try{
18 Thread.currentThread().sleep(10000);
19 }catch(Exception e){}
20
21 for(int loop=0; loop<15; loop++)
22 test.checkAutoCommit("jdbc/SQLServerNonXADS");
23 }
24
25 public void checkAutoCommit(String dsName)
26 {
27 CheckAutoCommitThread cacThread = new CheckAutoCommitThread(dsName);
28 cacThread.start();
29 }
30
31 class CheckAutoCommitThread extends Thread
32 {
33 private String dsName = null;
34
35 public CheckAutoCommitThread(String ds){
36 dsName = ds;
37 }
38
39 private void callProc(String dsName, int loop)
40 {
41 ProcThread procThread = new ProcThread(dsName, loop);
42 procThread.start();
43 }
44
45 class ProcThread extends Thread
46 {
47 private String ds = null;
48 private int id = -1;
49
50 public ProcThread(String dsName, int loop)
51 {
52 ds = dsName;
53 id = loop;
54 }
55
56 public void run()
57 {
58 String url = "t3://10.130.2.90:7001";
59 String sql = "{ call TestProc() }";
60 Connection conn = null;
61 JNDIRetriver retriever = new JNDIRetriver(url);
62 try{
63 conn = retriever.getJBDCConnection(ds);
64 boolean autoCommit = conn.getAutoCommit();
65 CallableStatement cstmt = conn.prepareCall(sql);
66
67 //start a thread to close current connection, so that a connection
68 //that attachs a tx will be returned to connection pool and when
69 //it's retrieved from connection pool by other client, the error
70 //will be reproduced.
71 ConnCloseThread closeThread = new ConnCloseThread(conn, id);
72 closeThread.start();
73 long start = System.currentTimeMillis();
74 System.out.println( "execute-" + id + "starts at: " + start/1000.0);
75 cstmt.execute();
76 long end = System.currentTimeMillis();
77 System.out.println("statement " + id + " execute: " + (end-start)/1000.0);
78 conn.close();
79 }catch(Exception e)
80 {
81 try{
82 System.out.println("connection is closed for exception: " + e.getMessage());
83 conn.close();
84 }catch(Exception e1){}
85 e.printStackTrace();
86 }
87 }
88 }
89
90 class ConnCloseThread extends Thread
91 {
92 private Connection connection = null;
93 private int id = -1;
94
95 public ConnCloseThread(Connection conn, int loop){
96 connection = conn;
97 id = loop;
98 }
99
100 public void run()
101 {
102 try{
103 Thread.currentThread().sleep(10000);
104 //connection.rollback();
105 long start = System.currentTimeMillis();
106 System.out.println( "closeConn-" + id + "starts at: " + start/1000.0);
107 connection.close();
108 long end = System.currentTimeMillis();
109 System.out.println("close connection " + id + " takes: " + (end -start)/1000.0);
110 }catch(Exception e){}
111 }
112 }
113 }
114
115
測試結果和預想的有點出入:ConnCloseThread中關閉連接的時候,不是立刻返回的。Connection.close()會觸發Connection.commit(),而因為調用的存儲過程中,存儲過程起了自己的事務,connection.commit()必須等到存儲過程結束才能完成(這個是microsoft論壇上看到的)。如果所有connection.close()都等到tx commit或rollback完成才執行的話,這個問題就不會出現了。看看我的測試結果:
statement 5 execute:表示從存儲過程調用開始,到調用返回的時間
close connection 5 takes:表示關閉連接耗費的時間(也就是connection.commit()等待存儲過程事務結果的時間)
statement 5 execute: 125.922
close connection 5 takes: 148.39
statement 14 execute: 130.031
close connection 14 takes: 148.39
statement 2 execute: 134.031
close connection 2 takes: 148.39
statement 6 execute: 138.14
close connection 6 takes: 148.406
statement 8 execute: 142.14
close connection 8 takes: 148.406
statement 0 execute: 146.156
close connection 0 takes: 148.406
statement 3 execute: 162.39
close connection 3 takes: 168.625
statement 11 execute: 166.39
close connection 11 takes: 168.625
statement 13 execute: 120.0
close connection 13 takes: 115.359
statement 12 execute: 150.265
close connection 12 takes: 148.406
statement 9 execute: 154.281
close connection 9 takes: 148.406
statement 1 execute: 158.39
close connection 1 takes: 148.406
statement 4 execute: 170.5
close connection 4 takes: 168.625
statement 10 execute: 174.515
close connection 10 takes: 168.625
statement 7 execute: 178.609
close connection 7 takes: 168.625
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '1' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966102> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3b00000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '2' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966132> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3e00000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '31' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966142> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3800000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '4' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966162> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3a00000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '29' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966172> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3400000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '19' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966172> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3600000001.".>
####<Oct 28, 2008 5:59:26 PM CST> <Error> <JDBC> <fjin01> <AdminServer> <[ACTIVE] ExecuteThread: '20' for queue: 'weblogic.kernel.Default (self-tuning)'> <<anonymous>> <> <> <1225187966182> <BEA-001112> <Test "SELECT 1" set up for pool "SQLServerNonXADS" failed with exception: "java.sql.SQLException: [BEA][SQLServer JDBC Driver][SQLServer]The server failed to resume the transaction. Desc:3f00000001.".>
從測試結果來看,凡是close connection耗時比execute statement短的,連接(物理連接)都會報出該問題。分析原因:通過weblogic datasource獲取的connection并不是物理connection,而是由weblogic wrapped的connection。這些conection在被close后,并不會關閉物理連接,而只是將物理連接還池。我們對connection的所有操作,最終都會被delegated到底層物理連接上,即commit(),rollback()最終都是在物理連接上執行。如果上面的connection.close(),底層物理連接沒有等到存儲過程事務結束就返回的話,那么物理連接上應該還帶有此次操作的事務,而weblogic這邊不會關系物理連接的情況,直接將連接放入connection pool供其它客戶端使用。這時候如果設定了test on reserve的話,下次客戶端從data source獲取連接時,weblogic會檢查這個物理連接,作一個select操作的,這個有問題的連接就會暴露出來,也就是上面的異常。這個問題如果使用driver manager來獲取連接的話(如果每次都關閉的話),則不會出現,因為使用的物理連接每次都是不同的。還好,weblogic會幫忙重新創建有問題的連接。原因大概了解了,但這是誰的問題呢? 為什么connection.close()不等存儲過程的事務結束?
結論:一般而言,我們不建議通過JDBC調用存儲過程的時候,在存儲過程中定義事務,應該將tx的管理工作交給jdbc去做。 non-xa如此,xa亦如此,畢竟事務嵌套了以后,管理起來是個問題,完整性更是個問題。
posted on 2008-10-28 19:38
走走停停又三年 閱讀(9074)
評論(0) 編輯 收藏 所屬分類:
Database