1. Foreword
2. Overview
3. Origins
4. A First Look at Solr
5. Installing Solr
6. Solr's Analysis Order
7. A Worked Example of Solr with Chinese Text
8. Solr Query Operators
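[Foreword] By habit it is time for another technical article, so this time I will share some development experience with Lucene/Solr.
Lucene is a full-text search development library (API) written in Java that can be used to build powerful search features. Detailed introductions are a Google search away, so this article concentrates on Solr.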
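[Overview] Few people in China are studying Solr at the moment, and most who do came to it because a project required it. Solr descends from Lucene: it is an Apache Software Foundation project, and more precisely a subproject of Lucene. It comes from a distinguished family yet has technical characteristics of its own, and it fills the gap left by Lucene being only a developer toolkit, because Solr is a complete application. In other words, it is a full-text search server that works out of the box, lets us experience Lucene's power right away, and marks a big step towards turning Lucene into a product.
Figure: Solr's analysis (tokenization) demo page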
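[Origins] CNET Networks first built a number of applications on the Lucene API, and the prototype of Solr grew out of that work. The Apache Software Foundation later received Solr under the sponsorship of the Lucene top-level project; that was back in January 2006. On January 17, 2006, Solr formally became an Apache incubator project. Throughout incubation Solr steadily accumulated features and attracted stable communities of users, developers, and committers. It graduated one year later, on the 17th, having already released version 1.1.0 by then. The current stable version is 1.2. Solr made a strong showing at the 2007 Apache annual conference, and at the end of November this year it will come to Hong Kong for the 2007 Asian open source software summit. A pity it is not coming to Beijing :-(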
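[A First Look at Solr] A Solr server differs from an ordinary relational database not only in its core nature (it targets unstructured as well as structured data) but also, to a large degree, in its architecture. A Solr server normally has to be deployed in an application server or Java servlet container and cannot run standalone on a bare JVM. The exception is purely local use with no remote calls, where Solr can be embedded and no container is needed.
Figure: Solr architecture diagram
A Solr server stores data and retrieves it quickly and efficiently through its index. It exposes HTTP/XML and JSON APIs, which makes it easy to integrate in a multi-language environment, for example by writing client libraries for it. Clients currently exist for Java, PHP, Python, C#, JSON, and Ruby, among others. Unfortunately there is none for C/C++ (which is what I am looking into at the moment). Brian Whitman, who works on music search and classification, once used JNI on the Apple platform to embed Solr in C code, but that was a Cocoa project. With these clients, Solr is easy to integrate into a concrete application. The most complete is the Java client SolrJ, which has already been added to the Solr trunk and will be officially released in version 1.3.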
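If you only want to use Solr rather than develop Solr itself, these are the points to pay attention to:
1. The Solr server is configured in solrconfig.xml, including caching and servlet-specific settings; this is the system-wide configuration.
2. Indexing methods, index fields, and so on are defined in schema.xml; this configuration is specific to a Solr instance.
3. Index data files are stored by default in data/index under the Solr home directory. The path can be changed through the configuration in point 1, and the files in this directory can simply be copied elsewhere to reuse the index.
4. Building an index takes a long time. Using word-based indexing without a dictionary, indexing a gigabyte-scale set of 1.1 million Chinese records took me hours (the time depends on many factors; leave a comment if you would like to discuss it). Indexing is noticeably faster on Linux than on Windows. Use the commit operation to make newly added documents searchable, and remember to optimize the index: optimization costs a lot of resources and time, but it is also an important way to speed up searches, so weigh the trade-off carefully.
5. After installation the Solr directory contains these folders: bin mainly holds scripts for creating mirrors and performing remote synchronization; conf mainly holds the configuration files mentioned in points 1 and 2; admin mainly holds the files behind the web administration interface.
6. Solr 1.2 has no built-in security design, with no user groups or permission settings, so security needs attention in real deployments. The most effective approach at present is authorization at the application server level.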
Permalink to this article: http://www.jinsehupan.com/blog/?p=25
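[Installing Solr] The Solr distribution already ships a small example that uses Jetty as the servlet container, and you can use it to try Solr out. But what are the steps for a real deployment on your own platform and application server?
To get started with Solr, install the following software:
1. Java 1.5 or later;
2. Ant 1.6.x or later (for building and managing the Solr project; this is my personal recommendation, though Eclipse works too);
3. A web browser for viewing the admin pages (Firefox is officially recommended, but in practice I found no difference from IE);
4. A servlet container such as Tomcat 5.5 (version 6 is not recommended). This article assumes Tomcat running on port 8080. If you run a different servlet container or a different port, you may need to change the URLs in the code to reach the sample application and Solr.
Now for installation and configuration:
1. Build the project with Ant or download the sample application, then copy the Solr WAR file into the servlet container's webapps directory;
2. Obtain the solr folder, to be copied into the current directory later. You can produce it with ant build or find it in the downloaded archive; use it as a template for later changes;
3. Set the Solr home location in one of the following three ways:
Set the Java system property solr.solr.home (yes, it really is solr.solr.home; this is mostly used for embedded integration);
Configure a JNDI lookup of java:comp/env/solr/home that points to the solr directory. To do this, create the file tomcat55/conf/Catalina/localhost/solr.xml; note that the name of this XML file becomes the name of the Solr instance. In that file, the current directory from step 2 is given as f:/solrhome;
Start the servlet container from the directory that contains the solr directory (the default Solr home is solr under the current working directory);
4. Finally, if your application handles CJK (Chinese, Japanese, Korean) text and you see garbled characters, the fix is strictly speaking an application server setting rather than a Solr one: in Tomcat's conf/server.xml, set the URI encoding of the connector for your port (8080 in this article) to UTF-8, because the Solr 1.2 core works in UTF-8.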
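To see why the connector's URI encoding matters, here is a minimal sketch in plain Java. It assumes the client percent-encodes the query as UTF-8 and that the server decodes it with whatever charset the connector is configured for (ISO-8859-1 being, as far as I know, the default on Tomcat 5.5). The class name UriEncodingSketch and its method are made up for this illustration and are not part of Solr or Tomcat.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class, not part of Solr or Tomcat.
final class UriEncodingSketch {

    // Percent-encode the text as UTF-8 (what a client typically sends),
    // then decode it with the charset the server is assumed to use.
    static String roundTrip(String text, String serverCharset) {
        try {
            String encoded = URLEncoder.encode(text, "UTF-8");
            return URLDecoder.decode(encoded, serverCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + serverCharset, e);
        }
    }

    public static void main(String[] args) {
        String query = "\u4e2d\u6587"; // two Chinese characters
        // Same charset on both sides: the query survives intact.
        System.out.println(roundTrip(query, "UTF-8").equals(query));
        // Server decodes as ISO-8859-1: every UTF-8 byte becomes its own character.
        System.out.println(roundTrip(query, "ISO-8859-1").equals(query));
    }
}
```

The second round trip turns two characters into six unrelated ones, which is the kind of garbling the connector setting is meant to prevent.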
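[Solr's Analysis Order] Both indexing and keyword queries require Solr to tokenize strings. When a full-text field is added to the index, Solr first splits the text on whitespace, then passes the tokens through the configured filters in order; only what remains is added to the index for later querying. The order is as follows:
Indexing
1: whitespace tokenization, WhitespaceTokenizer
2: stop-word filtering, StopFilter
3: word splitting, WordDelimiterFilter
4: lowercasing, LowerCaseFilter
5: English stemming, EnglishPorterFilter
6: duplicate removal, RemoveDuplicatesTokenFilter
Querying
1: synonym expansion
2: stop-word filtering
3: word splitting
4: lowercasing
5: English stemming
6: duplicate removal
The above applies to English; for Chinese everything is similar except the whitespace step.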
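The idea of a tokenizer followed by an ordered filter chain can be sketched in plain Java without any Lucene dependency. This is only a toy: it covers the whitespace, stop-word, and lowercase steps and leaves out word splitting, stemming, and duplicate removal. The class name AnalysisChainSketch and the stop-word list are invented for the illustration; the real list comes from stopwords.txt.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical toy model of an analysis chain; it does not use Solr or Lucene classes.
final class AnalysisChainSketch {

    // Made-up stop-word list for the example.
    private static final Set<String> STOP_WORDS =
            new HashSet<>(Arrays.asList("a", "an", "the", "of"));

    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        // Step 1: split on runs of whitespace.
        for (String token : text.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // empty input yields one empty token; drop it
            }
            // Step 2: drop stop words, ignoring case.
            String lower = token.toLowerCase(Locale.ROOT);
            if (STOP_WORDS.contains(lower)) {
                continue;
            }
            // Step 4 in the list above: keep the lowercased form.
            tokens.add(lower);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(analyze("The Quick brown FOX of Solr"));
    }
}
```

Because each step only sees what the previous step let through, the order of the filters in the configuration changes the result, which is why the sequence is worth knowing.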
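[A Worked Example of Solr with Chinese Text]
1. First configure schema.xml. It is the equivalent of a table definition file: it defines the data types of the data added to the index. The 1.2 version of schema.xml mainly consists of types, fields, and a few other default settings.
A. First define a FieldType child node inside the types node, with parameters such as name, class, and positionIncrementGap. name is the name of the FieldType, and class names the Java class that defines the type's behaviour (solr.TextField, for instance, lives in org.apache.solr.schema, while the tokenizer and filter factories live in org.apache.solr.analysis). The most important part of a FieldType definition is the analyzer used when indexing and querying data of that type, which covers both tokenization and filtering. In the example, the text FieldType uses solr.WhitespaceTokenizerFactory in its index analyzer, that is, whitespace tokenization, followed by the filters solr.StopFilterFactory, solr.WordDelimiterFilterFactory, solr.LowerCaseFilterFactory, solr.EnglishPorterFilterFactory, and solr.RemoveDuplicatesTokenFilterFactory. When a text field is added to the index, Solr first tokenizes on whitespace, then applies the filters in order, and only the remaining tokens are indexed. Solr's analysis package does not include Chinese support, so here we use the language analyzers that ship with Lucene: the lib directory of the downloaded Solr archive contains lucene-analyzers-2.2.0.jar, which includes the cn and cjk packages, either of which can handle Chinese. We use cjk and register it in schema.xml.
That completes the definition of the supporting type.
B. Next, define the concrete fields inside the fields node (similar to columns in a database). Each is a field element whose attributes include name, type (one of the FieldTypes defined earlier), indexed (whether it is indexed), stored (whether it is stored), multiValued (whether it can hold several values), and so on.
Field definitions matter a great deal, and a few tips are worth noting: set multiValued to true for any field that might hold several values, to avoid errors being thrown while indexing; and set stored to false whenever the field value does not need to be stored.
C. I recommend defining a catch-all field and copying every full-text field into it, so that they can all be searched together.
The copying itself is set up in the copyField nodes.
D. You can also define dynamic fields. A dynamic field does not name one specific field; it only defines a rule for field names. For example, if you define a dynamicField whose name is *_i and whose type is text, any field whose name ends in _i is treated as matching that definition, such as name_i, gender_i, and school_i.
2. Configure solrconfig.xml, which holds Solr's system-level settings. One of the more important ones is the dataDir property, which lets you choose where the index files are stored. For large data volumes you should also configure automatic commits: the setting used here flushes to disk automatically once 200,000 documents have accumulated in memory, to avoid heap overflow. It is also an effective way around the rule of thumb that a single XML import file should stay under a few tens of megabytes.
3. After configuring these, restart the Solr server for the changes to take effect, then add data to it.
4. Data is added by POSTing XML to the server's update servlet. The XML structure is an add element containing many doc elements, each of which contains many field elements. Every record added to the index must carry a unique id (a number in this example) that identifies it. After creating the XML file (for example solr.xml), run java -jar post.jar solr.xml in the exampledocs directory to add the data. As for post.jar: if you have reconfigured the application server, for instance by using Tomcat on port 8080 with the instance renamed to solrx, you need to rebuild post.jar accordingly before using it.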
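The add/doc/field structure described in step 4 can be sketched with a small helper that builds the XML in plain Java. The class AddXmlSketch, its method names, and the title field are made up for this illustration; only the add, doc, and field element names and the name attribute come from the description above. The sketch only builds the string and leaves the actual POST to post.jar or an HTTP client of your choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that builds an update message; it is not part of Solr.
final class AddXmlSketch {

    // Escape the characters that are not allowed verbatim in XML text or attributes.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;");
    }

    // One <doc> with one <field name="..."> per map entry, in insertion order.
    static String doc(Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<doc>");
        for (Map.Entry<String, String> e : fields.entrySet()) {
            sb.append("<field name=\"").append(escape(e.getKey())).append("\">")
              .append(escape(e.getValue()))
              .append("</field>");
        }
        return sb.append("</doc>").toString();
    }

    // Wrap any number of documents in a single <add> element.
    static String add(String... docs) {
        StringBuilder sb = new StringBuilder("<add>");
        for (String d : docs) {
            sb.append(d);
        }
        return sb.append("</add>").toString();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("id", "1");                  // the unique id of the record
        fields.put("title", "Solr & Lucene");   // hypothetical field name
        System.out.println(add(doc(fields)));
    }
}
```

Escaping matters in practice: an unescaped ampersand or angle bracket in a field value is enough to make the whole update message invalid XML.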
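Also attached, for reference, is ronghao's example of implementing Chinese word segmentation:
Chinese word segmentation is very important for full-text search. This example uses qieqie's Paoding analyzer (which is very good :)). Integration is easy. I downloaded version 2.0.4-alpha2, which supports most-words segmentation and max-word-length segmentation. Create your own Chinese TokenizerFactory that extends Solr's BaseTokenizerFactory.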
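package com.ronghao.fulltextsearch.analyzer;

import java.io.Reader;
import java.util.Map;

import net.paoding.analysis.analyzer.PaodingTokenizer;
import net.paoding.analysis.analyzer.TokenCollector;
import net.paoding.analysis.analyzer.impl.MaxWordLengthTokenCollector;
import net.paoding.analysis.analyzer.impl.MostWordsTokenCollector;
import net.paoding.analysis.knife.PaodingMaker;
import org.apache.lucene.analysis.TokenStream;
import org.apache.solr.analysis.BaseTokenizerFactory;

/**
 * Created by IntelliJ IDEA.
 * User: ronghao
 * Date: 2007-11-3
 * Time: 14:40:59
 * Chinese word segmentation: a wrapper around the Paoding segmenter.
 */
public class ChineseTokenizerFactory extends BaseTokenizerFactory {
    /**
     * Most-words segmentation (the default mode).
     */
    public static final String MOST_WORDS_MODE = "most-words";
    /**
     * Max-word-length segmentation.
     */
    public static final String MAX_WORD_LENGTH_MODE = "max-word-length";

    private String mode = null;

    // Accepts null or "default" as a synonym for the most-words mode.
    public void setMode(String mode) {
        if (mode == null || MOST_WORDS_MODE.equalsIgnoreCase(mode)
                || "default".equalsIgnoreCase(mode)) {
            this.mode = MOST_WORDS_MODE;
        } else if (MAX_WORD_LENGTH_MODE.equalsIgnoreCase(mode)) {
            this.mode = MAX_WORD_LENGTH_MODE;
        } else {
            throw new IllegalArgumentException("Invalid analyzer mode parameter: " + mode);
        }
    }

    // Reads the "mode" attribute given on the tokenizer element in schema.xml.
    @Override
    public void init(Map<String, String> args) {
        super.init(args);
        setMode(args.get("mode"));
    }

    public TokenStream create(Reader input) {
        return new PaodingTokenizer(input, PaodingMaker.make(),
                createTokenCollector());
    }

    private TokenCollector createTokenCollector() {
        if (MOST_WORDS_MODE.equals(mode))
            return new MostWordsTokenCollector();
        if (MAX_WORD_LENGTH_MODE.equals(mode))
            return new MaxWordLengthTokenCollector();
        // Unreachable: setMode only ever stores one of the two modes above.
        throw new Error("never happened");
    }
}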
Add this tokenizer to the configuration of the text field type in schema.xml.
<fieldtype name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="com.ronghao.fulltextsearch.analyzer.ChineseTokenizerFactory" mode="most-words"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="com.ronghao.fulltextsearch.analyzer.ChineseTokenizerFactory" mode="most-words"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldtype>
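Once that is done, restart Tomcat and you can try out Paoding's Chinese segmentation at http://localhost:8080/solr/admin/analysis.jsp
Remember to copy paoding-analysis.jar into Solr's lib directory, and to change the dictionary home configured inside the jar.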
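[Solr Query Operators]
":" searches a given field for a given value; for example, *:* returns all documents
"?" wildcard matching any single character
"*" wildcard matching any number of characters (neither * nor ? may be used at the start of a search term)
"~" fuzzy search. For example, to find terms spelled similarly to "roam", write roam~, which matches words such as foam and roams; roam~0.8 returns only matches whose similarity is above 0.8.
"~" after a phrase is a proximity search. For example, to find "apache" and "jakarta" within 10 words of each other: "jakarta apache"~10
"^" controls relevance. For example, when searching for jakarta apache, to make "jakarta" count for more, append "^" and a boost factor to it: jakarta^4 apache
Boolean operator AND, also written &&
Boolean operator OR, also written ||
Boolean operator NOT, also written ! or - (an exclusion operator cannot form a query with just a single term)
"+" required operator: the term after "+" must exist in the corresponding field of the document
( ) groups terms into a subquery
[ ] inclusive range search, for example records within a time span including both ends: date:[200707 TO 200710]
{ } exclusive range search, for example records within a time span excluding both ends: date:{200707 TO 200710}
\ escape operator; the special characters are + - && || ! ( ) { } [ ] ^ " ~ * ? : \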
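When user input is pasted into a query, the special characters listed above need a backslash in front of them. The following is a minimal sketch of such a helper in plain Java, together with a builder for the inclusive range form. The class QueryEscapeSketch and its methods are invented for this illustration and are not Solr or Lucene API; the sketch also escapes single & and | characters, which is stricter than the list above requires.

```java
// Hypothetical helper for building query strings; not part of Solr or Lucene.
final class QueryEscapeSketch {

    // Characters that have a special meaning in the query syntax.
    private static final String SPECIAL = "+-&|!(){}[]^\"~*?:\\";

    // Put a backslash in front of every special character.
    static String escape(String term) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < term.length(); i++) {
            char c = term.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Build an inclusive range query such as date:[200707 TO 200710].
    static String inclusiveRange(String field, String from, String to) {
        return field + ":[" + escape(from) + " TO " + escape(to) + "]";
    }

    public static void main(String[] args) {
        System.out.println(escape("(1+1):2"));
        System.out.println(inclusiveRange("date", "200707", "200710"));
    }
}
```

Escaping only the user-supplied parts, and never the operators you add yourself, keeps the structure of the query intact.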