亚洲国产精品无码久久98 ,亚洲精品无码成人,亚洲永久永久永久永久永久精品

(轉)oracle 正則表達式

環境: oracle 10g r2
更多內容可以參考:
<<Oracle? Database Application Developer's Guide - Fundamentals 10g Release 2 (10.2)>>

必須明白的是，oracle的正則表達式和其它的語言的并不是完全一樣。
oracle支持以下標準的正則表達式:

IEEE Portable Operating System Interface (POSIX) standard draft 1003.2/D11.2

Unicode Regular Expression Guidelines of the Unicode Consortium

Extends the matching capabilities for multilingual data beyond what is specified in the POSIX standard(即posix的多語言支持).

Adds support for the common Perl regular expression extensions that are not

included in the POSIX standard but do not conflict with it. Oracle Database

provides built-in support for some of the most heavily used Perl regular
expression operators, for example, character class shortcuts, the non-greedy
modifier, and so on(即部分perl標準,不包含在posix標準,但是又不與之沖突的)

---2008-09-08 做了個實驗，關鍵的要點是三個,重復次數{},+,*,?
---以及匹配模式[...] ,[^...],(...),(...)
--以及轉義符號 \d,\D,\\等。
------------------------------------------------------------------------------------------------------------------
                                     ID VALUE
--------------------------------------- ----------------------------------------
                                      9 lzf is a chinese
                                     10 1231231231lzf
                                     11 perfect
                                      1 13625061300
                                      2 059183383335
                                      3 13959029697
                                      4 13950323157
                                      5 who is your daddy?
                                      6 who are you?
                                      6 what is your name?
                                      6 I am 20 years old!
                                      7 股市下挫到2100點左右
                                      8 lzf ' is very happy 136ok
                                      8 do you know lzf,ok?
                                      8 ok,lzf is my brother
------------------------------------------------------------------------------------------------------------------

select * from test_reg where regexp_like(value,'^1(.+)7$') --1開頭，7結尾的內容
select * from test_reg where regexp_like(value,'\d+')       --一個或者多個數字
select * from test_reg where regexp_like(value,'^[a-zA-Z]{1,}[a-zA-Z]+$') --全部字符的.[:digit:]
select * from test_reg where regexp_like(value,'^[0-9]{1,}[0-9]+$') --全部數字的.
select * from test_reg where regexp_like(value,'[y][o][u]') --包含you
select * from test_reg where regexp_like(value,'(you)') --包含you
select * from test_reg where regexp_like(value,'[you]')     --包含y o u的,不分順序
select * from test_reg where regexp_like(value,'(lzf){1,}(.){0,}(ok)') --包含lzf 和ok的. 但是也僅僅限制與lzf 在ok前
以上試驗是在看了下文以及參考了OReilly.Oracle.PL.SQL.Programming.4th.Edition.Aug.2005.chs后作出的.
--------------
2010/08/04 增加以下例子

---
--以'POSIX('開頭的串,以下幾個表達式的結果都是一樣的。
select 1 from dual where regexp_like('POSIX(ADDDDDD,Y,13)','^(POSIX$)([A-Z]+)(,)([RYN]{1})(,)\d{1,}?$')

select 1 from dual where regexp_like('POSIX(ADDDDDD,Y,13)','^(POSIX$)([A-Z]+)(,)([RYN]{1})(,)\d+$')
select 1 from dual where regexp_like('POSIX(ADDDDDD,Y,13)','^(POSIX$)([A-Z]+)(,)([RYN]{1}),\d+$')
select 1 from dual where regexp_like('POSIX(ADDDDDD,Y,13)','^(POSIX$)([A-Z]+),([RYN]{1}),\d+$')
select 1 from dual where regexp_like('POSIX(ADDDDDD,Y,13)','^(POSIX$)([A-Z]+),([RYN]+),\d+$')

select 1 from dual where regexp_like('POSIX(ADDDDDD,Y,13)','^(POSIX$)([A-Z]+),[RYN]+,\d+$')
select 1 from dual where regexp_like('POSIX(A,N,13)','^(POSIX$)([A-Z]+),[RYN]+,\d+$')
----------
--位于字符串中的串
select 1 from dual where regexp_like('POSIX(A,N,13)','^(POSIX$)([A-Z]+),[RYN]+,\d+$')
select 1 from dual where regexp_like('P_OSI_X(JYL_GP_R,N,13)','^[A-Z_]+$([A-Z_]+),[RYN]+,\d+$')

--測試regexp_substr,基本的還是regexp_like

SQL> select regexp_substr('P_OSI_X(JYL_GP_R,N,13)+AVG(JYL_JJ_R,Y,3)','^[A-Z_]+$([A-Z_]+),[RYN]+,\d+$') FROM DUAL
2 /

REGEXP_SUBSTR('P_OSI_X(JYL_GP_
------------------------------
P_OSI_X(JYL_GP_R,N,13)

SQL> select regexp_substr('P_OSI_X(JYL_GP_R,N,13)+AVG(JYL_JJ_R,Y,3)','[A-Z_]+$([A-Z_]+),[RYN]+,\d+$',1,2) FROM DUAL
2 /

REGEXP_SUBSTR('P_OSI_X(JYL_GP_
------------------------------
AVG(JYL_JJ_R,Y,3)

2010/08/26 新增例子

-- delete the first ()
select regexp_replace('dad(disc_id=50003118)add','(^([A-Z]{1,}$)|($[A-Z]{1,})$)','',1,0,'i') FROM DUAL
--
match pattern
using classes
select regexp_substr(upper('dad(ssdsd89\9 dd== (sddf(d)),sdsd)add'),'^([A-Z]{1,}$)[[:space:][:alnum:]\,\_\($\=]{1,}\)[A-Z]{1,}$',1,1,'i') FROM DUAL

OReilly. 的參考內容
Section A.1. Metacharacters

A.1. Metacharacters

The R2 column in Tables A-1 through A-3 indicates which metacharacters were introduced in Oracle Database 10g Release 2. All others were introduced in Release 1.

Table A-1. Character-matching metacharacters
Syntax	R2?	Description
.		Matches any single character except for newline. Will match newline when the n flag is set. On Windows, Linux, and Unix platforms, chr(10) is recognized as the newline.
[ ... ]		Defines a matching list that matches any character listed between the brackets. You may specify ranges of characters, as in a-z. These ranges are interpreted based on the NLS_SORT setting. A dash (-) is a literal when it occurs first or last in the list (e.g., [abc-]). A closing-bracket (]) is a literal when it occurs first in the list (e.g., []abc]). A caret (^) in the first position makes the list a nonmatching list (see the next entry).
[^ ... ]		Matches any character not listed between the brackets. Referred to as a "nonmatching list."
[:class:]		Matches any character that belongs to the specified character class. May only be used within a matching list: [[:class:]abc] is a valid expression, [:class:]abc is not. Table A-5 lists the valid character class names.
[.coll.]		Matches the specified collation element, which may be one or more characters. May only be used within a matching list. For example, the expression [[.ch.]] matches the Spanish letter "ch". Table A-4 lists the valid collation elements.
[=char=]		Matches all characters that share the same base character as char. May be used only within a matching list. For example, [[=e=]] matches any of: "eé?è???E".
\d		Matches any digit. Equivalent to [[:digit:]].
\D		Matches any nondigit. Equivalent to [^[:digit:]]
\w		Matches any "word character." Word characters are defined to be alphabetic characters, numeric characters, and the underscore.
\W		Matches any nonword character.
\s		Matches any whitespace character. Equivalent to [[:space:]].
\S		Matches nonwhitespace characters. Equivalent to [^[:space:]].

Table A-2. Quantifiers
Syntax	R2?	Description
+		One or more
?		Zero or one
*		Zero or more
{m}		Exactly m occurrences
{m,}		At least m occurrences
{m,n}		At least m, and at most n occurrences
+?		One or more, but nongreedy
??		Zero or one, but nongreedy
{m}?		The same as {m}
{m,}?		At least m occurrences, but nongreedy and stops as soon as m occurrences are reached
{m,n}?		At least m, and at most n occurrences, but nongreedy; when possible, m occurrences are matched

Table A-3. Other metacharacters
Syntax	R2?	Description
\|		Specifies an alternation. An alternation within a subexpression doesn't extend beyond the subexpression.
( ...)		Defines a subexpresson.
\n		References the text matched by the nth subexpression. Backreferences may range from \1 through \9.
\		When not followed by a digit, the \ is an escape character. For example, use the pattern \\1 to look for a single backslash followed by the digit 1, use \( to look for an opening-parentheses (rather than begin a subexpression), etc.
^		Anchors an expression to the beginning of the string (in multiline mode, to the beginning of a line).
$		Anchors an expression to the end of the string (in multiline mode, to the end of a line).
\A		Anchors an expression to the beginning of the string regardless of whether multiline mode is specified.
\Z		Anchors an expression to the end of the string, or the a newline that happens to be ending a string, regardless of whether multiline mode is specified.
\z		Anchors an expression to the end of the string regardless of whether multiline mode is specified.

Table A-4. Collation elements
NLS_SORT	Multicharacter collation elements
XDANISH	aa oe	AA OE	Aa Oe
XSPANISH	ch ll	CH LL	Ch Ll
XHUNGARIAN	cs gy ly ny sz ty zs	CS GY LY NY SZ TY ZS	Cs Gy Ly Ny Sz Ty Zs
XCZECH	ch	CH	Ch
XCZECH_PUNCTUATION	ch	CH	Ch
XSLOVAK	dz d_ ch	DZ D_ CH	Dz D_ Ch
XCROATIAN	d_ lj nj	D_ LJ Nj	D_ Lj NJ

Table A-5. Supported character classes
Class	Description
[:alnum:]	Alphanumeric characters (same as [:alpha:] + [:digit:])
[:alpha:]	Alphabetic characters only
[:blank:]	Blank space characters, such as space and tab
[:cntrl:]	Nonprinting, or control characters
[:digit:]	Numeric digits
[:graph:]	Graphical characters (same as [:punct:] + [:upper:] + [:lower:] + [:digit:])
[:lower:]	Lowercase letters
[:print:]	Printable characters
[:punct:]	Punctuation characters
[:space:]	Whitespace characters such as space, formfeed, newline, carriage return, horizontal tab, and vertical tab
[:upper:]	Uppercase letters
[:xdigit:]	Hexadecimal characters

對于類的使用，要特別注意點，因為基本上要包含在一對中括號中。
例一：檢驗非漢字的函數
例如可以定義一個檢驗非漢字的函數，不考慮其它UNICODE
CREATE OR REPLACE Function func_existNormalChar(chars In Varchar2) Return Pls_Integer
Is
vResult Pls_Integer;
Begin
Select 1
     Into vresult From dual
        Where regexp_like(chars,'[!-~ ，。、？’“；：『』【】＋＝－——～！◎＃￥％……※×（）]+')
        Or    regexp_like(chars,'[[:space:]+]')
        Or    regexp_like(chars,'[[:blank:]+]')
        Or    regexp_like(chars,'[[:cntrl:]+]');
Return vresult;
Exception
When Others Then
    Return 0;
End;

例二：出現次數的檢驗

序號	例子	結果	說明
1	regexp_replace('a12c','[[:digit:]]+','*')	a*c	測試"+",即出現一次或者更多次的.由于采取的是連續的匹配,所以結果上是把多個數字合并為一個替代字符串,為了理解連續匹配的意思,看序號9更能夠明白.這是總體偏少的模式
2	regexp_replace('a12c','[[:digit:]]?','*')	a*c	測試"?",即0或者一次.這個符號,會在任意一個其它字符邊上插入一個假設出現的匹配類型符號,如例子,a的左邊本來是沒有,但是表達式認為數字字符在a的左邊出現了0次(這個是成立的).然后還要逐個替換過去,這是總體偏多的模式,過于匹配和逐個匹配結合
3	regexp_replace('a12c','[[:digit:]]','')	ac	測試"*",即0或者更多次數,這個模式類似與?類似,采取連續匹配+多余匹配
4	regexp_replace('a12c','[[:digit:]]+?','*')	a**c	測試"+?",匹配在前面一次或者多次的串,采取的是逐個匹配的策略(如果有連續的情況出現),這個是通常最希望使用的模式.
5	regexp_replace('a12c','[[:digit:]]?','')	a12c*	測試"*?",匹配0或者更多次,難于理解! 但可以簡單地理解為為在各個字符中以及兩邊插入替代串.
6	regexp_replace('a12c','[[:digit:]]??','*')	a12c*	測試"??",匹配0或者1次,難于理解啊! 但可以簡單地理解為為在各個字符中以及兩邊插入替代串.
7	regexp_replace('a12c','[[:digit:]]{1,}','*')	a*c	測試{1,},同+,結果已經說明
8
9	regexp_replace('a1x2c','[[:digit:]]+','*')	axc	測試"+",這個例子說明,如果被檢測的字符串中特定的匹配串如果不是連續的(多次),那么替代的還是屬于正常的范圍.

其它的次數模式暫時沒有測試,對于 *,?,+ ,*?,+?,??,基本有其對應的模式 {n,m}? 或者是{n,}? ,除了*,*?之外,因為此二者表示的都是0或者多次(大于1),無法用{n,m}的方式表示.

例三:多個類的并用
很多時候使用類還是很方便的,例如[:digit:]表示數字.[:space:]表示空格,換行,回車等等.
如果希望把特定串中的數字和回車空格等都替換為'-',則可以使用 "|"(替代操作符,其實就是類似一般高級語言中的或操作符號).
select regexp_replace('adfa jkjk jk
jk','([[:digit:]]+?)|([[:space:]]+?)','-') from dual;
結果如下:
adfa-jkjk-jk-jk
已經把空格和回車符號都替換為'-'了.
又例如:
Select regexp_replace('a1b c1d jk
jk','([[:digit:]]+?)|([[:space:]]+?)','-') From dual
結果如下:
a-b-c-d-jk-jk

例四:迭代替換
例如希望把以下串替換掉
'<xxxx <often> <common> <fat> >'

關于范圍操作符號 -

發音：橫桿或者減號

在列表匹配操作情況下，如果沒有特別的設置，就是表示匹配一個范圍：

例子：把字符串中非數字+-()的符號替換為空

這是一個不成功的匹配

SQL> select regexp_replace('1+[2c- 3([ad]km,4','[^0-9\+\-]','',1) from dual;

REGEXP_REPLACE('1+[2C-3([AD]KM
------------------------------
1+23(4

這也是一個錯誤的匹配，因為減號這個時候表示一個范圍
SQL> select regexp_replace('1+[2c- 3([ad]km,4','[^0-9\+-]','',1) from dual;

REGEXP_REPLACE('1+[2C-3([AD]KM
------------------------------
1+[2-3([,4

二者的區別在于單獨的一個減號，如果前面沒有跟上斜杠\，那么就是表示范圍，否則就是減號本身。

但即使如此，依然沒有達到目的，第一個把減號還是給去除了，雖然排除了其它的需要排除的符號，

第二個，則是當作范圍符號，大部分的沒有排除掉。

但是不太明白的是，按照oracle文檔的約定，在范圍匹配中形如[\-]表示匹配一個減號，為何依然匹配補上了，尤其是當夾著其它需要匹配的模式的時候。

現在只能嘗試著把這個模式放在最后：

SQL> select regexp_replace('1+[2c- 3([ad4','[^0-9$+$\-]','',1) from dual
2 /

REGEXP_REPLACE('1+[2C-3([AD4',
------------------------------
1+2-3(4

結果是可以，因為模式"\-"被放到了范圍中的尾巴。

那是文檔不全，還是理解不透徹了，

看來以后遇到這樣的問題還是先把減號放到尾巴去吧。

posted on 2011-08-22 10:34 HUIKK 閱讀(441) 評論(0) 編輯收藏所屬分類: DataBase

新用戶注冊刷新評論列表


只有注冊用戶登錄后才能發表評論。




網站導航: 博客園 IT新聞 Chat2DB C++博客博問管理
相關文章: Oracle Data Guard Linux 平臺 Physical Standby 搭建實例() oracle 批量更新 PL/SQL中SELECT INTO無記錄的情況 windows環境下Oracle數據庫冷備份和恢復全過程 oracle10g數據庫冷備份、恢復的具體步驟(轉) oracle 正則匹配中文 PowerDesigner設置 SQLPLUS SPOOL命令使用 Oracle執行SQL時遭遇特殊字符& oracle及操作系統對于文件大小的限制

夢幻之旅

公告

常用鏈接

留言簿(21)

隨筆分類(644)

隨筆檔案(669)

文章檔案(6)

最新隨筆

積分與排名

最新評論

閱讀排行榜

評論排行榜

A.1. Metacharacters

Table A-1. Character-matching metacharacters

Table A-2. Quantifiers

Table A-3. Other metacharacters

Table A-4. Collation elements

Table A-5. Supported character classes