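1. Query types
1.1. Overview
Call query.toString() to inspect the primitive (atomic) queries that a query is built from.
1.2. Searching with a specific analyzer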
IndexSearcher searcher = new IndexSearcher(path );
Hits hits = null;
Query query = null;
QueryParser parser = new QueryParser("contents", new StandardAnalyzer());
query = parser.parse("11 a and hello");
hits = searcher.search(query); // StandardAnalyzer drops the stop words "a" and "and", leaving contents:11 contents:hello (1 result)
System.out.println("Query " + query.toString() + " found " + hits.length() + " result(s)");
1.3. Searching by term: TermQuery
Query query = null;
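query = new TermQuery(new Term("name", "word1 a and"));
hits = searcher.search(query); // Query name:word1 a and found 0 result(s): TermQuery does not analyze its text, so the whole string is looked up as one term
System.out.println("Query " + query.toString() + " found " + hits.length() + " result(s)");
1.4. Searching with AND/OR logic: BooleanQuery
1. AND (intersection): MUST with MUST
2. OR (union): SHOULD with SHOULD
3. A minus B (matches of A that are not matches of B): MUST with MUST_NOT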
Query query1=null;
Query query2=null;
BooleanQuery query=null;
query1=new TermQuery(new Term("name","word1"));
query2=new TermQuery(new Term("name","word2"));
query=new BooleanQuery();
query.add(query1,BooleanClause.Occur.MUST);
query.add(query2,BooleanClause.Occur.MUST_NOT);
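The three combinations listed above can be pictured as set operations on the document ids matched by each clause. The following is a minimal sketch of that idea in plain Java, not Lucene code: the class BooleanClauseSets, its method names, and the sample document ids are all hypothetical.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model: each clause is represented by the set of document ids it matches.
final class BooleanClauseSets {
    // MUST + MUST: documents matched by both clauses (intersection).
    static Set<Integer> mustAndMust(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.retainAll(b);
        return result;
    }

    // SHOULD + SHOULD: documents matched by either clause (union).
    static Set<Integer> shouldOrShould(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.addAll(b);
        return result;
    }

    // MUST + MUST_NOT: documents matched by a but not by b (difference).
    static Set<Integer> mustButNot(Set<Integer> a, Set<Integer> b) {
        Set<Integer> result = new TreeSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> word1 = new TreeSet<>(Arrays.asList(1, 2, 3)); // assumed matches for name:word1
        Set<Integer> word2 = new TreeSet<>(Arrays.asList(3, 4));    // assumed matches for name:word2
        System.out.println(mustAndMust(word1, word2));    // [3]
        System.out.println(shouldOrShould(word1, word2)); // [1, 2, 3, 4]
        System.out.println(mustButNot(word1, word2));     // [1, 2]
    }
}
```

The snippet in this section (MUST on word1, MUST_NOT on word2) corresponds to mustButNot.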
1.5. Searching within a range: RangeQuery
Term beginTime=new Term("time","200001");
Term endTime=new Term("time","200005");
RangeQuery query=null;
query = new RangeQuery(beginTime, endTime, false); // false: the boundary values are excluded
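RangeQuery compares terms as strings, which is why fixed-width values such as 200001 work well here. The sketch below models the inclusive flag with String.compareTo. It is an illustration only: TermRangeSketch and inRange are hypothetical names, and it assumes plain string ordering is a fair stand-in for Lucene's term ordering on these ASCII digits.

```java
// Hypothetical model of a term range check using plain string ordering.
final class TermRangeSketch {
    // True when value lies between lower and upper; the bounds count only if inclusive is true.
    static boolean inRange(String value, String lower, String upper, boolean inclusive) {
        int vsLower = value.compareTo(lower);
        int vsUpper = value.compareTo(upper);
        return inclusive ? (vsLower >= 0 && vsUpper <= 0) : (vsLower > 0 && vsUpper < 0);
    }

    public static void main(String[] args) {
        System.out.println(inRange("200003", "200001", "200005", false)); // true
        System.out.println(inRange("200001", "200001", "200005", false)); // false: boundary excluded
        System.out.println(inRange("200001", "200001", "200005", true));  // true: boundary included
        // String ordering, not numeric ordering: a longer value can still fall inside the range.
        System.out.println(inRange("20000100", "200001", "200005", false)); // true
    }
}
```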
1.6. Searching by prefix: PrefixQuery
Term pre1=new Term("name","wor");
PrefixQuery query=null;
query = new PrefixQuery(pre1);
1.7. Phrase search: PhraseQuery
a) The default slop is 0
PhraseQuery query = new PhraseQuery();
query.add(new Term("bookname", "鋼"));
query.add(new Term("bookname", "鐵"));
Hits hits = searcher.search(query); // matches the phrase "鋼鐵", not "鋼" and "鐵" separately
b) Setting the slop (the default is 0)
PhraseQuery query = new PhraseQuery();
query.add(new Term("bookname", "鋼"));
query.add(new Term("bookname", "鐵"));
query.setSlop(1);
Hits hits = searcher.search(query); // matches "鋼鐵", or "鋼" and "鐵" with one character between them
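A rough way to picture slop is as the number of extra tokens allowed between the two terms. The sketch below is a deliberate simplification and not Lucene's implementation: it handles only two terms appearing in order, and the class PhraseSlopSketch, its method, and the sample tokens are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical, simplified model of slop for a two-term phrase matched in order.
final class PhraseSlopSketch {
    // True when `second` follows `first` with at most `slop` other tokens between them.
    static boolean matches(List<String> tokens, String first, String second, int slop) {
        for (int i = 0; i < tokens.size(); i++) {
            if (!tokens.get(i).equals(first)) {
                continue;
            }
            // Only look as far ahead as the slop allows.
            for (int j = i + 1; j < tokens.size() && j - i - 1 <= slop; j++) {
                if (tokens.get(j).equals(second)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> adjacent = Arrays.asList("a", "b");
        List<String> oneApart = Arrays.asList("a", "x", "b");
        System.out.println(matches(adjacent, "a", "b", 0)); // true
        System.out.println(matches(oneApart, "a", "b", 0)); // false
        System.out.println(matches(oneApart, "a", "b", 1)); // true
    }
}
```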
1.8. Multi-phrase search: MultiPhraseQuery
a) One prefix followed by several alternative suffixes
MultiPhraseQuery query = new MultiPhraseQuery();
// First add the prefix of the phrases to search for
query.add(new Term("bookname", "鋼"));
// Build 3 Terms to serve as the phrase suffixes
Term t1 = new Term("bookname", "鐵");
Term t2 = new Term("bookname", "和");
Term t3 = new Term("bookname", "要");
// Then add all the suffixes to the query; together with the prefix they form 3 phrases
query.add(new Term[]{t1, t2, t3});
Hits hits = searcher.search(query);
for (int i = 0; i < hits.length(); i++)
System.out.println(hits.doc(i));
b) Several alternative prefixes followed by one suffix
MultiPhraseQuery query = new MultiPhraseQuery();
Term t1 = new Term("bookname", "鋼");
Term t2 = new Term("bookname", "和");
query.add(new Term[]{t1, t2});
query.add(new Term("bookname", "鐵"));
c) Alternatives on both sides of a fixed middle term
MultiPhraseQuery query = new MultiPhraseQuery();
Term t1 = new Term("bookname", "鋼");
Term t2 = new Term("bookname", "和");
query.add(new Term[]{t1, t2});
query.add(new Term("bookname", "鐵"));
Term t3 = new Term("bookname", "是");
Term t4 = new Term("bookname", "戰");
query.add(new Term[]{t3, t4});
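One way to read these examples is that each add call fixes one position of the phrase, and an array of terms lists the alternatives allowed at that position. The sketch below enumerates the phrases that such a query stands for. It is plain Java for illustration only: MultiPhraseExpansion, expand, and the placeholder terms are hypothetical, and Lucene does not necessarily build the phrases this way internally.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that lists every phrase covered by per-position alternatives.
final class MultiPhraseExpansion {
    // Each element of `positions` holds the alternative terms allowed at that position.
    static List<String> expand(List<String[]> positions) {
        List<String> phrases = new ArrayList<>();
        phrases.add(""); // start from a single empty phrase
        for (String[] alternatives : positions) {
            List<String> next = new ArrayList<>();
            for (String prefix : phrases) {
                for (String term : alternatives) {
                    next.add(prefix.isEmpty() ? term : prefix + " " + term);
                }
            }
            phrases = next;
        }
        return phrases;
    }

    public static void main(String[] args) {
        // Shape of example a): one prefix, three suffixes.
        System.out.println(expand(Arrays.asList(
                new String[]{"p"}, new String[]{"x", "y", "z"}))); // [p x, p y, p z]
        // Shape of example c): two prefixes, one middle term, two suffixes.
        System.out.println(expand(Arrays.asList(
                new String[]{"a", "b"}, new String[]{"m"}, new String[]{"x", "y"})));
        // [a m x, a m y, b m x, b m y]
    }
}
```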
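1.9. Fuzzy search: FuzzyQuery
FuzzyQuery uses the Levenshtein (edit distance) algorithm, which compares two strings in terms of three kinds of edit:
- inserting a letter
- deleting a letter
- changing a letter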
FuzzyQuery query = new FuzzyQuery(new Term("content", "work"));
public FuzzyQuery(Term term)
public FuzzyQuery(Term term,float minimumSimilarity)throws IllegalArgumentException
public FuzzyQuery(Term term,float minimumSimilarity,int prefixLength)throws IllegalArgumentException
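minimumSimilarity is the minimum similarity a term must reach to match; the smaller it is, the more documents match. It defaults to 0.5 and must be less than 1.0.
FuzzyQuery query = new FuzzyQuery(new Term("content", "work"), 0.1f);
prefixLength is the number of leading characters that must match exactly.
FuzzyQuery query = new FuzzyQuery(new Term("content", "work"), 0.1f, 1);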
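To make the edit-distance idea concrete, here is a minimal sketch of the Levenshtein distance in plain Java, together with one possible way to turn a distance into a similarity score. The class LevenshteinSketch and both method names are hypothetical, and the normalization in similarity is an assumption chosen for illustration; Lucene's exact formula may differ.

```java
// Hypothetical sketch of Levenshtein distance; not taken from Lucene's source.
final class LevenshteinSketch {
    // Minimum number of single-character insertions, deletions, or changes turning a into b.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building b's first j characters from an empty string
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // deleting a's first i characters
            for (int j = 1; j <= b.length(); j++) {
                int change = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + change);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // Assumed normalization: 1.0 for identical strings, lower as more edits are needed.
    static double similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1.0 : 1.0 - (double) distance(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(distance("work", "word"));   // 1 (change k to d)
        System.out.println(distance("work", "works"));  // 1 (insert s)
        System.out.println(distance("work", "walk"));   // 2
        System.out.println(similarity("work", "word")); // 0.75
    }
}
```

Under this assumed scoring, a lower minimumSimilarity admits terms that need more edits, which is consistent with the statement that a smaller value matches more documents.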
1.10. Wildcard search: WildcardQuery
* matches zero or more characters
? matches exactly one character
WildcardQuery query = new WildcardQuery(new Term("content", "?qq*"));
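The two wildcard rules can be checked outside Lucene by translating a wildcard pattern into a java.util.regex pattern. This is a sketch for illustration, not how WildcardQuery is implemented; WildcardSketch and its methods are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical helper that applies the * and ? rules to a single term.
final class WildcardSketch {
    // Translates * to ".*" and ? to "." and quotes every other character literally.
    static Pattern toPattern(String wildcard) {
        StringBuilder regex = new StringBuilder();
        for (char c : wildcard.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), Pattern.DOTALL);
    }

    // True when the whole term matches the wildcard pattern.
    static boolean matches(String wildcard, String term) {
        return toPattern(wildcard).matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("?qq*", "aqq"));    // true: ? is "a", * is empty
        System.out.println(matches("?qq*", "aqqmail")); // true
        System.out.println(matches("?qq*", "qq"));     // false: ? needs exactly one character
        System.out.println(matches("?qq*", "abqq"));   // false: ? cannot cover two characters
    }
}
```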
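1.11. Span search
1.11.1. SpanTermQuery
On its own it matches the same documents as TermQuery.
SpanTermQuery query = new SpanTermQuery(new Term("content", "abc"));
1.11.2. SpanFirstQuery
Searches for the given term within a fixed width, measured from the start of the field's content.
SpanFirstQuery query = new SpanFirstQuery(new SpanTermQuery(new Term("content", "abc")), 3); // 3 counts words (term positions), not bytes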
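The "first 3 words" rule can be illustrated without Lucene by splitting a field value into tokens and looking only at the leading ones. The sketch below assumes whitespace tokenization, which a real analyzer may not use; SpanFirstSketch and matchesWithinFirst are hypothetical names.

```java
// Hypothetical model of "term occurs within the first N token positions".
final class SpanFirstSketch {
    // True when `term` is one of the first `end` whitespace-separated tokens of `text`.
    static boolean matchesWithinFirst(String text, String term, int end) {
        String[] tokens = text.trim().split("\\s+");
        for (int i = 0; i < tokens.length && i < end; i++) {
            if (tokens[i].equals(term)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(matchesWithinFirst("xx yy abc zz", "abc", 3)); // true: abc is the 3rd token
        System.out.println(matchesWithinFirst("xx yy zz abc", "abc", 3)); // false: abc is the 4th token
    }
}
```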
1.11.3. SpanNearQuery
SpanNearQuery is comparable to PhraseQuery.
SpanTermQuery people = new SpanTermQuery(new Term("content", "mary"));
SpanTermQuery how = new SpanTermQuery(new Term("content", "poor"));
SpanNearQuery query = new SpanNearQuery(new SpanQuery[]{people, how}, 3, false); // slop 3, order not required
1.11.4. SpanOrQuery
Combines the results of all the given SpanQuery objects.
SpanTermQuery s1 = new SpanTermQuery(new Term("content", "aa"));
SpanTermQuery s2 = new SpanTermQuery(new Term("content", "cc"));
SpanTermQuery s3 = new SpanTermQuery(new Term("content", "gg"));
SpanTermQuery s4 = new SpanTermQuery(new Term("content", "kk"));
SpanNearQuery query1 = new SpanNearQuery(new SpanQuery[]{s1, s2}, 1, false);
SpanNearQuery query2 = new SpanNearQuery(new SpanQuery[]{s3, s4}, 3, false);
SpanOrQuery query = new SpanOrQuery(new SpanQuery[]{query1, query2});
1.11.5. SpanNotQuery
Removes the results of the second SpanQuery from the results of the first SpanQuery.
SpanTermQuery s1 = new SpanTermQuery(new Term("content", "aa"));
SpanFirstQuery query1 = new SpanFirstQuery(s1, 3);
SpanTermQuery s3 = new SpanTermQuery(new Term("content", "gg"));
SpanTermQuery s4 = new SpanTermQuery(new Term("content", "kk"));
SpanNearQuery query2 = new SpanNearQuery(new SpanQuery[]{s3, s4}, 4, false);
SpanNotQuery query = new SpanNotQuery(query1, query2);
1.12. Regular expression search: RegexQuery
String regex="http://[a-z]{1,3}\\.abc\\.com/.*";
RegexQuery query=new RegexQuery(new Term("url",regex));
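The regular expression above can be tried against sample terms with java.util.regex before it is used in a query. This is only a sketch: UrlRegexSketch is a hypothetical name, the sample URLs are made up, and it assumes whole-term matching, which may not be exactly how RegexQuery applies the pattern (for this pattern the trailing .* makes the difference small).

```java
import java.util.regex.Pattern;

// Hypothetical check of the section's regular expression against sample terms.
final class UrlRegexSketch {
    // Same pattern as in the RegexQuery snippet: a 1 to 3 letter host label under abc.com.
    static final Pattern URL = Pattern.compile("http://[a-z]{1,3}\\.abc\\.com/.*");

    // True when the whole term matches the pattern.
    static boolean matches(String term) {
        return URL.matcher(term).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("http://www.abc.com/index.html")); // true
        System.out.println(matches("http://w.abc.com/"));             // true
        System.out.println(matches("http://news.abc.com/"));          // false: "news" has 4 letters
        System.out.println(matches("http://www.abc.org/"));           // false: wrong domain
    }
}
```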
This article comes from a CSDN blog; please credit the source when reposting: http://blog.csdn.net/xiaoping8411/archive/2010/03/24/5413757.aspx