
    javajohn

    Golden Years (金色年華)

    Chinese characters or Unicode

    Converting between Chinese characters and Unicode escapes

    (July 17, 2006, 11:07:58)

    1. Overview:

    If a project uses GBK encoding, converting Chinese characters is not a problem. If it uses UTF-8, however, handling Chinese characters is somewhat more troublesome.
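To make the difference between the two encodings concrete, here is a minimal sketch (not from the original post) that prints the bytes of the same character under each one. The class name `EncodingBytesDemo` and the helper `hex` are made up for illustration. GBK is an optional charset, so the sketch looks it up by name and skips it if the running JVM does not provide it.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original post.
class EncodingBytesDemo {

    // Render a byte array as space-separated two-digit hex values.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The character 我 (U+6211), written as an escape so the
        // encoding of the source file does not matter.
        String text = "\u6211";
        System.out.println("UTF-8: " + hex(text.getBytes(StandardCharsets.UTF_8)));
        // GBK may be missing on a minimal runtime, so check first.
        if (Charset.isSupported("GBK")) {
            System.out.println("GBK:   " + hex(text.getBytes(Charset.forName("GBK"))));
        }
    }
}
```

The same character takes three bytes in UTF-8 and two in GBK, which is why byte-level code written for one encoding cannot simply be reused for the other.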

    2. Implementation:

    The code is as follows:

    import java.io.DataOutputStream;
    import java.io.IOException;
    import java.nio.charset.StandardCharsets;

    // The class name is a placeholder; the original post shows only the methods.
    public class UnicodeConverter {

        // Convert the string to Unicode escapes and write it out
        public static void writeUnicode(final DataOutputStream out,
                final String value) {
            try {
                final String unicode = gbEncoding(value);
                // The escaped text is pure ASCII, so name the charset explicitly
                final byte[] data = unicode.getBytes(StandardCharsets.US_ASCII);
                final int dataLength = data.length;

                System.out.println("Data Length is: " + dataLength);
                System.out.println("Data is: " + value);
                out.writeInt(dataLength); // write the length of the string first
                out.write(data, 0, dataLength); // then write the converted string
            } catch (IOException e) {
                // The original swallowed this exception silently; report it instead
                e.printStackTrace();
            }
        }

        public static String gbEncoding(final String gbString) {
            final char[] utfBytes = gbString.toCharArray();
            final StringBuilder unicodeBytes = new StringBuilder();
            for (int byteIndex = 0; byteIndex < utfBytes.length; byteIndex++) {
                // Always pad to four hex digits; the original padded with "00" only,
                // which produced three-digit escapes for values below 0x10 and
                // left values from 0x100 to 0xfff unpadded
                unicodeBytes.append(String.format("\\u%04x", (int) utfBytes[byteIndex]));
            }
            return unicodeBytes.toString();
        }

        /**
         * Decodes the Unicode escapes in a String back into the characters
         * they represent, so that the text can be displayed in the UI.
         * @author javajohn
         * @param dataStr text that may contain four-digit Unicode escapes
         * @return the decoded text; empty if dataStr is null
         */
        public static StringBuffer decodeUnicode(final String dataStr) {
            final StringBuffer buffer = new StringBuffer();
            if (dataStr == null) {
                return buffer;
            }
            String tempStr = "";
            String operStr = dataStr;
            // No escape at all: return the input unchanged
            if (operStr.indexOf("\\u") == -1) {
                return buffer.append(operStr);
            }
            if (!operStr.startsWith("\\u")) {
                tempStr = operStr.substring(0, operStr.indexOf("\\u"));
                // From here on operStr always starts with a Unicode escape
                operStr = operStr.substring(operStr.indexOf("\\u"));
            }
            buffer.append(tempStr);
            // Loop: operStr always starts with a Unicode escape at this point
            while (operStr.startsWith("\\u")) {
                if (operStr.length() < 6) {
                    // Truncated escape: keep it as-is instead of throwing
                    buffer.append(operStr);
                    break;
                }
                tempStr = operStr.substring(0, 6);
                operStr = operStr.substring(6);
                final String charStr = tempStr.substring(2);
                final char letter = (char) Integer.parseInt(charStr, 16); // parse the hex digits
                buffer.append(letter);
                if (operStr.indexOf("\\u") == -1) {
                    buffer.append(operStr);
                } else {
                    // Copy the plain text and advance operStr to the next escape
                    tempStr = operStr.substring(0, operStr.indexOf("\\u"));
                    operStr = operStr.substring(operStr.indexOf("\\u"));
                    buffer.append(tempStr);
                }
            }
            return buffer;
        }
    }
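The listing covers only the writing side. As a sketch of how the length-prefixed escape text might be read back, the following self-contained example round-trips a string through in-memory streams. The class `EscapeRoundTrip` and its methods `escape`, `unescape`, and `roundTrip` are illustrative stand-ins, not the author's code, and `unescape` assumes the input consists solely of escapes with exactly four hex digits.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; the names are illustrative, not from the original post.
class EscapeRoundTrip {

    // Turn every char into a backslash, the letter u, and four hex digits.
    static String escape(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            sb.append(String.format("\\u%04x", (int) c));
        }
        return sb.toString();
    }

    // Reverse of escape(); assumes the input is a run of six-character escapes.
    static String unescape(String escaped) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i + 6 <= escaped.length(); i += 6) {
            sb.append((char) Integer.parseInt(escaped.substring(i + 2, i + 6), 16));
        }
        return sb.toString();
    }

    // Write a length prefix plus ASCII bytes, then read them back the same way.
    static String roundTrip(String text) {
        try {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(sink)) {
                byte[] data = escape(text).getBytes(StandardCharsets.US_ASCII);
                out.writeInt(data.length);       // length prefix first
                out.write(data, 0, data.length); // then the escaped text
            }
            try (DataInputStream in = new DataInputStream(
                    new ByteArrayInputStream(sink.toByteArray()))) {
                byte[] data = new byte[in.readInt()];
                in.readFully(data); // read exactly as many bytes as the prefix announced
                return unescape(new String(data, StandardCharsets.US_ASCII));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String original = "\u6211a"; // the character 我 followed by the letter a
        System.out.println(escape(original));
        System.out.println(roundTrip(original).equals(original));
    }
}
```

Because the escaped form is plain ASCII, the byte count written by `writeInt` always equals six times the number of UTF-16 code units in the input, so the reader can size its buffer from the prefix alone.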

    3. Conclusion:

    posted on 2006-07-17 11:07 javajohn  Views (5532)  Comments (1)  Category: My Memories

    Feedback

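    # re: Chinese characters or Unicode 2006-07-18 17:11 小豬

    On understanding code units and code points:
    1. A code point may consist of one or two code units.
    2. In my test program, "我 " also takes up only one code unit per character, so the number of code points equals the number of code units.
    Below are some notes on Chinese, Korean, and Japanese in Unicode that I found on the official Unicode website: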
    Q: I have heard that UTF-8 does not support some Japanese characters. Is this correct?

    A: There is a lot of misinformation floating around about the support of Chinese, Japanese and Korean (CJK) characters. The Unicode Standard supports all of the CJK characters from JIS X 0208, JIS X 0212, JIS X 0221, or JIS X 0213, for example, and many more. This is true no matter which encoding form of Unicode is used: UTF-8, UTF-16, or UTF-32.

    Unicode supports over 70,000 CJK characters right now, and work is underway to encode further additions. The International Standard ISO/IEC 10646 and the Unicode Standard are completely synchronized in repertoire and content. And that means that Unicode has the same repertoire as GB 18030, since that also is synchronized with ISO 10646 — although with a different ordering and byte format.
    So whichever encoding form is used (UTF-8, UTF-16, or UTF-32), Chinese is fully supported?
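One quick way to check this is to encode a Chinese string in each encoding form and decode it again. The sketch below is an added illustration, not part of the comment. `EncodingFormsDemo` and `survives` are made-up names, and each charset is looked up by name and skipped if the running JVM does not provide it.

```java
import java.nio.charset.Charset;

// Hypothetical demo class, not part of the original comment.
class EncodingFormsDemo {

    // True if the text is unchanged after an encode/decode cycle in the named charset.
    static boolean survives(String text, String charsetName) {
        Charset cs = Charset.forName(charsetName);
        return new String(text.getBytes(cs), cs).equals(text);
    }

    public static void main(String[] args) {
        // The two characters of the word 漢字, written as escapes so the
        // encoding of the source file does not matter.
        String text = "\u6f22\u5b57";
        for (String name : new String[] {"UTF-8", "UTF-16", "UTF-32"}) {
            if (Charset.isSupported(name)) {
                System.out.println(name + ": " + survives(text, name));
            }
        }
    }
}
```

All three forms encode the same code points, so a lossless round trip is expected for each; only the byte layout differs.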


    My test program is as follows:
    public class test0 {
        public static void main(String[] args) {
            String a = "我 ";
            int cuCount = a.length();
            System.out.println("the number of code units required for string \"test\" in the UTF-16 encoding is " + cuCount);
            int cpCount = a.codePointCount(0, a.length());
            System.out.println("the number of code points is " + cpCount);
            System.out.println("the end of string \"我 \" is " + a.charAt(a.length() - 1));
        }
    }

    The output is:
    the number of code units required for string "test" in the UTF-16 encoding is 2
    the number of code points is 2
    the end of string "我 " is [space]

    I found the "set encoding" option in Eclipse, where the encoding can be set.
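Point 1 of the comment only becomes visible with characters outside the Basic Multilingual Plane. "我" lies inside it and needs a single code unit, and the count of 2 in the test comes from the trailing space in the string. As a sketch (the class `CodePointDemo` and the helper `counts` are made up here), compare it with U+20000, a CJK Extension B character that UTF-16 stores as a surrogate pair:

```java
// Hypothetical demo class, not part of the original comment.
class CodePointDemo {

    // Returns {number of UTF-16 code units, number of code points} for the text.
    static int[] counts(String text) {
        return new int[] {text.length(), text.codePointCount(0, text.length())};
    }

    public static void main(String[] args) {
        String bmp = "\u6211"; // the character 我, U+6211
        String supplementary = new String(Character.toChars(0x20000)); // U+20000
        int[] a = counts(bmp);
        int[] b = counts(supplementary);
        System.out.println("U+6211:  units=" + a[0] + " points=" + a[1]);
        System.out.println("U+20000: units=" + b[0] + " points=" + b[1]);
    }
}
```

For the first string both numbers are 1; for the second, `length()` reports 2 code units while `codePointCount` reports 1 code point. Code that walks a string with `charAt`, including the escape conversion above, therefore sees each half of a surrogate pair separately.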

