Lucene is distributed as a single JAR file. Below we look at the main Java packages inside that JAR so that readers get a first overview of them.
Package: org.apache.lucene.document
This package provides the classes needed to wrap the documents to be indexed, such as Document and Field. Every document is ultimately wrapped in a Document object.
Package: org.apache.lucene.analysis
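The main job of this package is to tokenize documents. Because a document must be tokenized before it can be indexed, this package can be seen as doing the preparatory work for index creation.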
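To make the idea of tokenization concrete, here is a minimal sketch that uses only the Java standard library. The class `SimpleTokenizerSketch` and its `tokenize` method are hypothetical names introduced for illustration; they are not part of Lucene, and a real Analyzer typically does more (for example, stop-word handling and language-specific rules).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Hypothetical illustration only; this is not Lucene's Analyzer API. */
public class SimpleTokenizerSketch {

    /** Splits text on runs of non-letter, non-digit characters and lowercases each token. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.split("[^\\p{L}\\p{N}]+")) {
            // split() can yield an empty first element when the text starts with a delimiter
            if (!piece.isEmpty()) {
                tokens.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [lucene, builds, an, index, lucene, searches, it]
        System.out.println(tokenize("Lucene builds an index; Lucene searches it."));
    }
}
```

Splitting on delimiters like this only works for languages that separate words with spaces or punctuation, which is one reason an analyzer has to be chosen to suit the language of the documents.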
Package: org.apache.lucene.index
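This package provides classes that help create an index and update an index that already exists. It contains two fundamental classes, IndexWriter and IndexReader: IndexWriter is used to create an index and add documents to it, and IndexReader is used to delete documents from the index.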
Package: org.apache.lucene.search
This package provides the classes needed to search an index that has been built, such as IndexSearcher and Hits. IndexSearcher defines the methods for searching a given index, and Hits holds the results of a search.
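To index documents, Lucene provides five fundamental classes: Document, Field, IndexWriter, Analyzer, and Directory. Their roles are described below.

Document
A Document describes a document, which here may be an HTML page, an e-mail message, or a text file. A Document object is made up of multiple Field objects. You can think of a Document as a record in a database, with each Field being one column of that record.

Field
A Field describes one attribute of a document. For example, the subject and the body of an e-mail message can be described by two separate Field objects.

Analyzer
Before a document is indexed, its content must first be tokenized, and this is the job of the Analyzer. Analyzer is an abstract class with several implementations; choose the one that suits the language and the application. The Analyzer hands the tokenized content to IndexWriter, which builds the index.

IndexWriter
IndexWriter is the core class Lucene uses to create an index. Its job is to add Document objects to the index one by one.

Directory
This class represents the location where a Lucene index is stored. It is an abstract class that currently has two implementations: FSDirectory, which represents an index stored in the file system, and RAMDirectory, which represents an index stored in memory.

Now that we are familiar with the classes needed to build an index, we can index the text files in a directory. Listing 1 gives the source code for indexing the text files under a given directory.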
package TestLucene;

import java.io.File;
import java.io.FileReader;
import java.io.Reader;
import java.util.Date;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;

/**
 * This class demonstrates the process of creating an index with Lucene
 * for text files.
 */
public class TxtFileIndexer {
    public static void main(String[] args) throws Exception {
        // indexDir is the directory that hosts Lucene's index files
        File indexDir = new File("D:\\luceneIndex");
        // dataDir is the directory that hosts the text files to be indexed
        File dataDir = new File("D:\\luceneData");
        Analyzer luceneAnalyzer = new StandardAnalyzer();
        File[] dataFiles = dataDir.listFiles();
        // The third argument (true) creates a new index, overwriting any existing one
        IndexWriter indexWriter = new IndexWriter(indexDir, luceneAnalyzer, true);
        long startTime = new Date().getTime();
        for (int i = 0; i < dataFiles.length; i++) {
            // Only regular files with a .txt extension are indexed
            if (dataFiles[i].isFile() && dataFiles[i].getName().endsWith(".txt")) {
                System.out.println("Indexing file " + dataFiles[i].getCanonicalPath());
                Document document = new Document();
                Reader txtReader = new FileReader(dataFiles[i]);
                document.add(Field.Text("path", dataFiles[i].getCanonicalPath()));
                document.add(Field.Text("contents", txtReader));
                indexWriter.addDocument(document);
            }
        }
        indexWriter.optimize();
        indexWriter.close();
        long endTime = new Date().getTime();
        System.out.println("It takes " + (endTime - startTime)
            + " milliseconds to create index for the files in directory "
            + dataDir.getPath());
    }
}
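Searching with Lucene is just as convenient as building an index. In the previous part we built an index for the text documents in a directory; now we search that index to find the documents that contain a given keyword or phrase. Lucene provides several fundamental classes for this: IndexSearcher, Term, Query, TermQuery, and Hits. Their roles are described below.

Query
This is an abstract class with several implementations, such as TermQuery, BooleanQuery, and PrefixQuery. Its purpose is to wrap the query string entered by the user in a Query that Lucene can understand.

Term
A Term is the basic unit of search. A Term object consists of two fields of type String, and can be created with a single statement:
Term term = new Term("fieldName", "queryWord");
The first argument names the Field of the document to search in, and the second argument is the keyword to search for.

TermQuery
TermQuery is a subclass of the abstract class Query, and it is also the most basic query class Lucene supports. A TermQuery object is created as follows:
TermQuery termQuery = new TermQuery(new Term("fieldName", "queryWord"));
Its constructor takes a single argument, a Term object.

IndexSearcher
IndexSearcher is used to search an index that has been built. It can only open an index in read-only mode, so several IndexSearcher instances can operate on the same index.

Hits
Hits holds the results of a search.

Having introduced the classes that searching requires, we can now search the index we built earlier. Listing 2 gives the code needed to perform the search.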
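In Listing 2, the constructor of IndexSearcher takes an object of type Directory. Directory is an abstract class that currently has two subclasses: FSDirectory and RAMDirectory. Our program passes in an FSDirectory object, which represents an index stored on disk. Once the constructor returns, the IndexSearcher has opened the index in read-only mode. The program then constructs a Term object, which specifies that we want to search the document contents for documents containing the keyword "lucene". Next it uses this Term to build a TermQuery and passes the TermQuery to the search method of IndexSearcher; the results are returned in a Hits object. Finally, a loop prints the path of every document found. That completes our search application. As you can see, building a search application with Lucene is quite simple.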
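The index-then-search flow described above can also be sketched without Lucene, using only the Java standard library. The class `TinyIndexSketch`, its methods, and the sample file names are hypothetical and exist only for illustration. This is a toy inverted index, not Lucene's implementation or API, but it shows why a term lookup against a prebuilt index is cheap: the work of tokenizing is done once, when a document is added.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical toy inverted index; it mirrors the flow only, not Lucene's API. */
public class TinyIndexSketch {
    // term -> paths of the documents whose contents contain that term
    private final Map<String, Set<String>> postings = new HashMap<>();

    /** Tokenizes the contents and records the document path under every term. */
    public void addDocument(String path, String contents) {
        for (String token : contents.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!token.isEmpty()) {
                postings.computeIfAbsent(token, k -> new LinkedHashSet<>()).add(path);
            }
        }
    }

    /** Returns the paths of documents containing the exact term, in the order they were added. */
    public List<String> search(String term) {
        Set<String> hits = postings.get(term.toLowerCase(Locale.ROOT));
        return hits == null ? new ArrayList<>() : new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        TinyIndexSketch index = new TinyIndexSketch();
        index.addDocument("a.txt", "Lucene is a search library");
        index.addDocument("b.txt", "Plain text without the keyword");
        index.addDocument("c.txt", "Indexing with LUCENE is simple");
        // prints "File: a.txt" and then "File: c.txt"
        for (String path : index.search("lucene")) {
            System.out.println("File: " + path);
        }
    }
}
```

In this sketch `addDocument` plays roughly the part of the analyzer plus index writer, and `search` plays the part of a single-term query whose result list stands in for the hits.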