<rt id="bn8ez"></rt>
<label id="bn8ez"></label>

  • <span id="bn8ez"></span>

    <label id="bn8ez"><meter id="bn8ez"></meter></label>

    Decode360's Blog

    業(yè)精于勤而荒于嬉 QQ:150355677 MSN:decode360@hotmail.com

      BlogJava :: 首頁 :: 新隨筆 :: 聯(lián)系 ::  :: 管理 ::
      397 隨筆 :: 33 文章 :: 29 評論 :: 0 Trackbacks
    取重復(fù)記錄中的任一條的分析
    ?
    ??? 最近一直考慮一個問題,找出表中同一字段重復(fù)記錄中的任意一條,照理來說這樣的一個需求,在Oracle內(nèi)部進行實現(xiàn)是很方便的,而且不需要對表進行2遍的掃描。但是事實上我想了很久也向不出來有什么函數(shù)可以直接實現(xiàn)這一功能,基本上所有可以這樣做的方法都需要進行嵌套才能完成。不知道為什么Oracle沒有提供這個功能,也許是有什么邏輯矛盾我沒有想到。現(xiàn)在總結(jié)一下,模擬的環(huán)境如下:
    ?
    ??? create table t1_t(a int,b varchar2(50));
    ??? insert into t1_t(a,b) (select mod(rownum,50) rn,object_name from user_objects);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? insert into t1_t(a,b) (select * from t1_t);
    ??? commit;
    ?
    ??? 至此t1_t表大致上有300萬條記錄左右,其中a字段的取值都是0-49之間的integer,現(xiàn)在的目標(biāo)就是取到50條記錄,a分別等于0~49,具體哪一條都可以,但需要同時取出b字段。
    ?
    ?
    一、方法大致有兩種:
    ?
    1、使用row_number() over函數(shù)找到首條記錄選出
    ?
    ??? select * from
    ??? (select a,b,row_number() over(partition by a order by 1) k from t1_t)
    ??? where k=1;
    ?
    ??? 實際執(zhí)行時間:在軟解析的情況下需要3s左右。執(zhí)行計劃如下:
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (SORT PUSHED RANK) (Cost=32068 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3255 Card=3632040 Bytes=101697120)
    ?
    Statistics
    ----------------------------------------------------------
    ??????? 165? recursive calls
    ???????? 22? db block gets
    ????? 17303? consistent gets
    ?????? 1937? physical reads
    ????? 75904? redo size
    ?????? 2682? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 4? sorts (memory)
    ????????? 1? sorts (disk)
    ???????? 50? rows processed
    ?
    2、兩層關(guān)聯(lián),用rowid別名進行匹配:
    ?
    ??? select t1.* from t1_t t1,(select min(rowid) rid from t1_t group by a) t2
    ??? where t1.rowid = t2.rid;
    ?
    ??? 實際執(zhí)行時間:軟解析情況下1.5s左右。執(zhí)行計劃如下,比上面的方法要好很多,主要是由于不需要進行排序的操作:
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=3919 Card=50 Bytes=2000)
    ?? 1??? 0?? NESTED LOOPS (Cost=3919 Card=50 Bytes=2000)
    ?? 2??? 1???? VIEW (Cost=3869 Card=50 Bytes=600)
    ?? 3??? 2?????? HASH (GROUP BY) (Cost=3869 Card=50 Bytes=750)
    ?? 4??? 3???????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3243 Card=3632040 Bytes=54480600)
    ?? 5??? 1???? TABLE ACCESS (BY USER ROWID) OF 'T1_T' (TABLE) (Cost=1 Card=1 Bytes=28)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ????? 16278? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2638? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ?
    二、考慮使用索引
    ?
    ?
    ??? 首先增加a、b字段的索引:
    ?

    ??? create index t1_t_a on t1_t(a);

    ??? create index t1_t_b on t1_t(b);

    ???
    ??? 要使用索引,需要將字段設(shè)置為not null,或在SQL中使用not null選項才可以:
    ?

    ??? alter table t1_t modify a not null ;

    ??? alter table t1_t modify b not null ;

    ?
    ?
    1、對于第1種方法,執(zhí)行計劃完全沒變,執(zhí)行效率略有提高
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=32068 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (SORT PUSHED RANK) (Cost=32068 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (FULL) OF 'T1_T' (TABLE) (Cost=3255 Card=3632040 Bytes=101697120)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 2? recursive calls
    ???????? 22? db block gets
    ????? 16228? consistent gets
    ???????? 21? physical reads
    ????????? 0? redo size
    ?????? 2682? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 1? sorts (disk)
    ???????? 50? rows processed
    ?
    2、對于第2中方法,用INDEX FAST FULL SCAN 代替了 TABLE ACCESS FULL SCAN,執(zhí)行效率也略有提高
    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2114 Card=50 Bytes=2000)
    ?? 1??? 0?? NESTED LOOPS (Cost=2114 Card=50 Bytes=2000)
    ?? 2??? 1???? VIEW (Cost=2064 Card=50 Bytes=600)
    ?? 3??? 2?????? HASH (GROUP BY) (Cost=2064 Card=50 Bytes=750)
    ?? 4??? 3???????? INDEX (FAST FULL SCAN) OF 'T1_T_A' (INDEX) (Cost=1439 Card=3632040 Bytes=54480600)
    ?? 5??? 1???? TABLE ACCESS (BY USER ROWID) OF 'T1_T' (TABLE) (Cost=1 Card=1 Bytes=28)

    Statistics
    ----------------------------------------------------------
    ??????? 210? recursive calls
    ????????? 0? db block gets
    ?????? 7145? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2638? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 5? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ?
    三、強制使用索引
    ?
    1、對于方法1,使用索引t1_t_a
    ?

    ??? select * from

    ??? ( select /*+INDEX(t1_t t1_t_a)*/ a,b,row_number() over( partition by a order by 1 ) k from t1_t)

    ??? where k= 1 ;


    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=844081 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=844081 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=844081 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? TABLE ACCESS (BY INDEX ROWID) OF 'T1_T' (TABLE) (Cost=815268 Card=3632040Bytes=101697120)
    ?
    ?? 4??? 3???????? INDEX (FULL SCAN) OF 'T1_T_A' (INDEX) (Cost=7154 Card=3632040)

    Statistics
    ----------------------------------------------------------
    ????????? 0? recursive calls
    ????????? 0? db block gets
    ???? 814027? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2817? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ??? 相比沒有使用索引時的效率還要低的多,因為是全表掃描,所以使用全索引掃描造成了額外的開銷。
    ?
    2、對于方法2:
    ?
    ??? 將INDEX FAST FULL SCAN 改成INDEX FULL SCAN 明顯沒有什么意義。
    ?
    ?
    四、a,b的聯(lián)合索引:
    ?
    ??? 首先創(chuàng)建a,b的聯(lián)合索引:
    ??? createindex t1_t_ab on t1_t(a,b);
    ?
    1、對方法1,使用索引t1_t_ab
    ?
    ?

    ??? select * from

    ??? ( select /*+INDEX(t1_t t1_t_ab)*/ a,b,row_number() over( partition by a order by 1 ) k from t1_t)

    ??? where k= 1 ;

    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=48327 Card=3632040 Bytes=192498120)
    ?? 1??? 0?? VIEW (Cost=48327 Card=3632040 Bytes=192498120)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=48327 Card=3632040 Bytes=101697120)
    ?? 3??? 2?????? INDEX (FULL SCAN) OF 'T1_T_AB' (INDEX) (Cost=19514 Card=3632040 Bytes=1016
    ????????? 97120)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ????? 19368? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ?????? 2783? bytes sent via SQL*Net to client
    ??????? 537? bytes received via SQL*Net from client
    ????????? 5? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ???????? 50? rows processed
    ?
    ??? 相比沒有使用索引的時候,性能有所降低,但是比使用t1_t_a索引要好的多了,但是如果對于a字段進行限制時,row_number() over函數(shù)會自動找到t1_t_ab索引,效果會很好
    ?

    ??? select * from

    ??? ( select a,b,row_number() over( partition by a order by 1 ) k from t1_twhere a='1' )

    ??? where k= 1 ;

    ?
    Execution Plan
    ----------------------------------------------------------
    ?? 0????? SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1119 Card=84004 Bytes=4452212)
    ?? 1??? 0?? VIEW (Cost=1119 Card=84004 Bytes=4452212)
    ?? 2??? 1???? WINDOW (NOSORT) (Cost=1119 Card=84004 Bytes=2352112)
    ?? 3??? 2?????? INDEX (RANGE SCAN) OF 'T1_T_AB' (INDEX) (Cost=453 Card=84004 Bytes=2352112)
    ?
    Statistics
    ----------------------------------------------------------
    ????????? 1? recursive calls
    ????????? 0? db block gets
    ??????? 410? consistent gets
    ????????? 0? physical reads
    ????????? 0? redo size
    ??????? 537? bytes sent via SQL*Net to client
    ??????? 504? bytes received via SQL*Net from client
    ????????? 2? SQL*Net roundtrips to/from client
    ????????? 0? sorts (memory)
    ????????? 0? sorts (disk)
    ????????? 1? rows processed
    ?
    ?
    五、總結(jié)
    ?
    ??? 基本上如果是要在全表中找到所有a的單條記錄,那么加不加索引的區(qū)別都不大,因為全表掃描是沒有必要走索引的,但是用方法2會比方法1要快速很多,資源占用也較少,原因是不需要排序的環(huán)節(jié)。
    ?
    ??? 如果只是想要取出單個a值的任意記錄,則可以選用方法1,并建立a,b字段的聯(lián)合索引,會大大提高效率。
    ?
    ?
    ?
    posted on 2009-03-04 21:57 decode360 閱讀(250) 評論(0)  編輯  收藏 所屬分類: 05.SQL
    主站蜘蛛池模板: 亚洲av无码无线在线观看 | 五月婷婷亚洲综合| 国产精品午夜免费观看网站| 久久精品国产亚洲av麻豆小说| 国产h视频在线观看免费| 无码日韩人妻AV一区免费l | 亚洲欧美日韩综合久久久久 | 亚洲精品国产成人专区| 毛片a级毛片免费观看免下载| 国产精品免费看久久久香蕉| 亚洲国产成人精品无码区在线秒播| 四虎免费影院4hu永久免费| 性无码免费一区二区三区在线| 亚洲精品国产精品| 亚洲成亚洲乱码一二三四区软件| 最新仑乱免费视频| 日本高清免费观看| 深夜福利在线免费观看| 亚洲午夜国产精品| 国产亚洲色婷婷久久99精品| 国产成人免费一区二区三区| 最好看最新的中文字幕免费| 一级特黄a大片免费| 亚洲日本VA中文字幕久久道具| 日韩va亚洲va欧洲va国产| 国产免费卡一卡三卡乱码| 成人免费视频69| 免费网站看av片| jizz日本免费| 亚洲精品色在线网站| 亚洲伊人久久精品| 五月天网站亚洲小说| 国产亚洲欧洲Aⅴ综合一区 | 蜜桃精品免费久久久久影院| 222www免费视频| 免费网站看av片| 黄色网址在线免费| 三级毛片在线免费观看| 美女裸免费观看网站| 亚洲第一se情网站| 亚洲国产精品无码中文lv|