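When developing a web application, whatever framework you use, as soon as you read or write Chinese data in a database you run into garbled Chinese text or exceptions caused by mismatched encodings. Tonight I finally sorted out the Chinese-character problem for Tomcat + MySQL + Struts. The fix is simple and quick to apply.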

    Before doing any of the following, set the charset of every HTML/JSP page to charset=gb2312.

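    The first problem to solve is garbled form submissions. When you use a Struts ActionForm, whether the form is built with Struts tags or plain HTML tags, you read and set the form fields through the ActionForm's getters and setters (which has the same effect as calling request.getParameter()). Without further handling, though, the values you get back are garbled. There are two main causes: 1. for forms submitted with POST, Tomcat's J2EE implementation decodes the parameters with its default encoding, ISO8859_1; 2. for requests submitted with GET, Tomcat decodes the query string differently from the way it handles POST. To read and display Chinese data correctly, each case therefore needs its own fix.
(1) For forms submitted with POST, write a filter. The filter is called before the submitted data is processed, so its Java code can change the encoding used to parse the parameters (the target encoding is specified as a parameter in web.xml). The filter code is as follows: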

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;


/**
 * <p>Example filter that sets the character encoding to be used in parsing the
 * incoming request, either unconditionally or only if the client did not
 * specify a character encoding.  Configuration of this filter is based on
 * the following initialization parameters:</p>
 * <ul>
 * <li><strong>encoding</strong> - The character encoding to be configured
 *     for this request, either conditionally or unconditionally based on
 *     the <code>ignore</code> initialization parameter.  This parameter
 *     is required, so there is no default.</li>
 * <li><strong>ignore</strong> - If set to "true", any character encoding
 *     specified by the client is ignored, and the value returned by the
 *     <code>selectEncoding()</code> method is set.  If set to "false",
 *     <code>selectEncoding()</code> is called <strong>only</strong> if the
 *     client has not already specified an encoding.  By default, this
 *     parameter is set to "true".</li>
 * </ul>
 *
 * <p>Although this filter can be used unchanged, it is also easy to
 * subclass it and make the <code>selectEncoding()</code> method more
 * intelligent about what encoding to choose, based on characteristics of
 * the incoming request (such as the values of the <code>Accept-Language</code>
 * and <code>User-Agent</code> headers, or a value stashed in the current
 * user's session).</p>
 *
 * @author Craig McClanahan
 * @version $Revision: 1.2 $ $Date: 2004/03/18 16:40:33 $
 */

public class SetCharacterEncodingFilter implements Filter {


    // ----------------------------------------------------- Instance Variables


    /**
     * The default character encoding to set for requests that pass through
     * this filter.
     */
    protected String encoding = null;


    /**
     * The filter configuration object we are associated with.  If this value
     * is null, this filter instance is not currently configured.
     */
    protected FilterConfig filterConfig = null;


    /**
     * Should a character encoding specified by the client be ignored?
     */
    protected boolean ignore = true;


    // --------------------------------------------------------- Public Methods


    /**
     * Take this filter out of service.
     */
    public void destroy() {

        this.encoding = null;
        this.filterConfig = null;

    }


    /**
     * Select and set (if specified) the character encoding to be used to
     * interpret request parameters for this request.
     *
     * @param request The servlet request we are processing
     * @param response The servlet response we are creating
     * @param chain The filter chain we are processing
     *
     * @exception IOException if an input/output error occurs
     * @exception ServletException if a servlet error occurs
     */
    public void doFilter(ServletRequest request, ServletResponse response,
                         FilterChain chain)
            throws IOException, ServletException {

        // Conditionally select and set the character encoding to be used
        if (ignore || (request.getCharacterEncoding() == null)) {
            String encoding = selectEncoding(request);
            if (encoding != null)
                request.setCharacterEncoding(encoding);
        }

        // Pass control on to the next filter
        chain.doFilter(request, response);

    }


    /**
     * Place this filter into service.
     *
     * @param filterConfig The filter configuration object
     */
    public void init(FilterConfig filterConfig) throws ServletException {

        this.filterConfig = filterConfig;
        this.encoding = filterConfig.getInitParameter("encoding");
        String value = filterConfig.getInitParameter("ignore");
        if (value == null)
            this.ignore = true;
        else if (value.equalsIgnoreCase("true"))
            this.ignore = true;
        else if (value.equalsIgnoreCase("yes"))
            this.ignore = true;
        else
            this.ignore = false;

    }


    // ------------------------------------------------------ Protected Methods


    /**
     * Select an appropriate character encoding to be used, based on the
     * characteristics of the current request and/or filter initialization
     * parameters.  If no character encoding should be set, return
     * <code>null</code>.
     * <p>
     * The default implementation unconditionally returns the value configured
     * by the <strong>encoding</strong> initialization parameter for this
     * filter.
     *
     * @param request The servlet request we are processing
     */
    protected String selectEncoding(ServletRequest request) {

        return (this.encoding);

    }

}
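
To see why the request encoding matters, here is a minimal sketch of what goes wrong when bytes sent as GBK are decoded as ISO-8859-1, which is the situation described above for POST parameters. The class and method names (MojibakeDemo, decodeAsLatin1, repair) are made up for this illustration and are not part of Tomcat or Struts.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {

    private static final Charset GBK = Charset.forName("GBK");

    // Simulates the browser sending GBK bytes and the server
    // decoding them with the wrong charset (ISO-8859-1).
    static String decodeAsLatin1(String original) {
        byte[] sent = original.getBytes(GBK);
        return new String(sent, StandardCharsets.ISO_8859_1);
    }

    // ISO-8859-1 maps every byte to exactly one char, so the raw bytes
    // can be recovered and decoded again with the right charset.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, GBK);
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two characters "中文"
        String garbled = decodeAsLatin1(original);
        System.out.println(garbled.equals(original));         // false
        System.out.println(garbled.length());                 // 4 (one char per GBK byte)
        System.out.println(repair(garbled).equals(original)); // true
    }
}
```

Calling request.setCharacterEncoding() in the filter avoids the wrong decoding in the first place, so no per-parameter repair like the one above is needed.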

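Compile the filter, place the class file under the web application's classes directory, and add the following to the application's web.xml. Note that <filter-class> must be the fully qualified class name: the listing above has no package declaration, so either add "package com.neusoft.equipment.controller;" at the top of the source file (and put the class file in the matching subdirectory) or change <filter-class> to match your own package.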

<filter>
  <filter-name>Set Character Encoding</filter-name>
  <filter-class>com.neusoft.equipment.controller.SetCharacterEncodingFilter</filter-class>
  <init-param>
    <param-name>encoding</param-name>
    <param-value>gbk</param-value>
  </init-param>
</filter>
<filter-mapping>
  <filter-name>Set Character Encoding</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

Any character set with multi-byte encoding support, such as gb2312, gbk or utf8, can store Chinese characters. gb2312 covers far fewer Chinese characters than gbk, and every character in gb2312 or gbk can also be encoded in utf8. The target encoding specified here is gbk. Restart Tomcat and the filter takes effect.
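
Besides encoding, the filter listed earlier also reads an optional ignore init-param. The following sketch restates, in plain Java without the servlet API, how init() interprets that value and when doFilter() overrides the request encoding. IgnoreParamDemo and its method names are hypothetical; the logic is copied from the filter source.

```java
public class IgnoreParamDemo {

    // Mirrors init(): a missing value, "true" or "yes" (any case)
    // means the client's declared encoding is ignored.
    static boolean parseIgnore(String value) {
        return value == null
                || value.equalsIgnoreCase("true")
                || value.equalsIgnoreCase("yes");
    }

    // Mirrors the condition in doFilter(): the configured encoding is
    // applied if ignore is set or the client declared no encoding.
    static boolean shouldSetEncoding(boolean ignore, String clientEncoding) {
        return ignore || clientEncoding == null;
    }

    public static void main(String[] args) {
        System.out.println(parseIgnore(null));                 // true
        System.out.println(parseIgnore("false"));              // false
        System.out.println(shouldSetEncoding(false, "UTF-8")); // false
        System.out.println(shouldSetEncoding(false, null));    // true
    }
}
```

With the web.xml above, ignore is not set, so the filter applies gbk to every request regardless of what the client declares.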
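(2) For forms submitted with GET, the parameters are appended to the URL of the request, and Tomcat handles them differently from POST parameters. The filter configured above therefore has no effect on GET requests, and the encoding has to be set elsewhere. Open Tomcat's server.xml configuration file, find the Connector element for port 80 (or 8080, or whichever port you have changed it to), and add one attribute to it: URIEncoding="GBK". The modified Connector looks like this: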

    <!-- Define a non-SSL HTTP/1.1 Connector on port 80 -->
    <Connector port="80" maxHttpHeaderSize="8192"
               maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
               enableLookups="false" redirectPort="8443" acceptCount="100"
               connectionTimeout="20000" disableUploadTimeout="true"
               URIEncoding="GBK"/>

After this change, restart Tomcat and it will handle form data submitted with GET correctly.
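
As a rough illustration of what URIEncoding changes, the sketch below percent-encodes a Chinese value as a browser would on a GBK page, then decodes it once with ISO-8859-1 and once with GBK. It uses java.net.URLEncoder and URLDecoder from the JDK rather than Tomcat's own decoder, so treat it as an analogy; QueryStringDemo and decodeQueryValue are names invented for the example.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class QueryStringDemo {

    // Percent-encodes a value using the given charset name.
    static String encodeQueryValue(String value, String charset) {
        try {
            return URLEncoder.encode(value, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    // Decodes a percent-encoded value using the given charset name.
    static String decodeQueryValue(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String original = "\u4e2d\u6587"; // the two characters "中文"
        String raw = encodeQueryValue(original, "GBK");
        System.out.println(raw); // %D6%D0%CE%C4

        // Decoding with the wrong charset yields four unrelated Latin-1 characters.
        System.out.println(decodeQueryValue(raw, "ISO-8859-1").equals(original)); // false
        // Decoding with the charset the page was sent in restores the text.
        System.out.println(decodeQueryValue(raw, "GBK").equals(original));        // true
    }
}
```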

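    The second problem to solve is garbled data when reading from and writing to the database. Different databases tend to support different encodings, which makes things messy in practice, and the fix usually differs from one database to the next. For MySQL there are all sorts of solutions online, but I find them too complicated. There is an extremely simple fix: edit the MySQL configuration file. Open the root directory of the MySQL installation, find the my.ini file, and in the [mysqld] section change default-character-set=latin1 to default-character-set=gbk. Then add default-character-set=gbk to the [client] section. (On MySQL 5.5 and later, the [mysqld] option is named character-set-server instead.) After the change, remember one more thing: go to Services under Administrative Tools in the Windows Control Panel, then stop and restart the MySQL service. That fixes MySQL's garbled-text problem at the root. It really is that simple.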
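
One way to confirm that the restarted server picked up the new setting is to read its character_set variables over JDBC. The sketch below is only an outline under assumptions: the JDBC URL, user name and password passed to check() are placeholders, a MySQL JDBC driver must be on the classpath, and which variables you expect to be gbk depends on your setup. CharsetCheck and its methods are hypothetical names.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class CharsetCheck {

    // Reads the server's character_set_* variables over an open connection.
    static Map<String, String> readCharsetVariables(Connection conn) throws SQLException {
        Map<String, String> vars = new LinkedHashMap<String, String>();
        try (Statement st = conn.createStatement();
             ResultSet rs = st.executeQuery("SHOW VARIABLES LIKE 'character_set%'")) {
            while (rs.next()) {
                vars.put(rs.getString(1), rs.getString(2)); // name, value
            }
        }
        return vars;
    }

    // Returns those of the given variable names whose value is missing
    // or differs from the expected charset (compared case-insensitively).
    static List<String> mismatches(Map<String, String> vars, String expected, String... names) {
        List<String> bad = new ArrayList<String>();
        for (String name : names) {
            if (!expected.equalsIgnoreCase(vars.get(name))) {
                bad.add(name);
            }
        }
        return bad;
    }

    // Connects with caller-supplied (placeholder) settings and reports
    // which of the chosen variables are not gbk.
    static List<String> check(String url, String user, String password) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password)) {
            Map<String, String> vars = readCharsetVariables(conn);
            return mismatches(vars, "gbk", "character_set_server", "character_set_database");
        }
    }
}
```

An empty list from check() would suggest the server-side setting took effect. A database created before the change may still report latin1 for character_set_database.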