1.Adding documents to an index:
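 import org.apache.lucene.analysis.SimpleAnalyzer;
 import org.apache.lucene.document.Document;
 import org.apache.lucene.document.Field;
 import org.apache.lucene.index.IndexWriter;
 import org.apache.lucene.store.Directory;
 import org.apache.lucene.store.FSDirectory;

 // Sample data: one array per field type, one element per document.
 protected String[] keywords = {"1", "2"};
 protected String[] unindexed = {"Netherlands", "Italy"};
 protected String[] unstored = {"Amsterdam has lots of bridges", "Venice has lots of canals"};
 protected String[] text = {"Amsterdam", "Venice"};
 // indexDir is the path of the index directory; true = create a new index.
 Directory dir = FSDirectory.getDirectory(indexDir, true);
 IndexWriter writer = new IndexWriter(dir, new SimpleAnalyzer(), true);
 writer.setUseCompoundFile(true);
 for (int i = 0; i < keywords.length; i++) {
   Document doc = new Document();
   doc.add(Field.Keyword("id", keywords[i]));          // stored, indexed, not tokenized
   doc.add(Field.UnIndexed("country", unindexed[i]));  // stored only, not searchable
   doc.add(Field.UnStored("contents", unstored[i]));   // tokenized and indexed, not stored
   doc.add(Field.Text("city", text[i]));               // tokenized, indexed and stored
   writer.addDocument(doc);
 }
 writer.optimize();  // merge the segments before closing
 writer.close();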
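2.Removing documents from an index:
 IndexReader reader = IndexReader.open(dir);
 reader.delete(1);  // delete by document number
 reader.close();
The call above deletes only one document at a time; the call below deletes every document that
matches a condition, here every document whose city field contains the given term:
 IndexReader reader = IndexReader.open(dir);
 // The term text is not analyzed, so it must match the indexed token exactly
 // (SimpleAnalyzer lowercases tokens at indexing time).
 reader.delete(new Term("city", "Amsterdam"));
 reader.close();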

3.Indexing dates:
 Document doc = new Document();
 doc.add(Field.Keyword("indexDate", new Date()));

4.Tuning indexing performance
 IndexWriter variable  System property                 Default value      Description
 ------------------------------------------------------------------------------------------------------------
 mergeFactor           org.apache.lucene.mergeFactor   10                 Controls segment merge frequency and size
 maxMergeDocs          org.apache.lucene.maxMergeDocs  Integer.MAX_VALUE  Limits the number of documents per segment
 minMergeDocs          org.apache.lucene.minMergeDocs  10                 Controls the amount of RAM used when indexing

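mergeFactor controls how many documents are buffered in memory before being written to disk, and
also how often index segments are merged. Its default value is 10: once 10 documents have been
buffered they must be written to disk, and whenever the number of segments reaches a power of 10
they are merged into a single segment. maxMergeDocs caps the number of documents that any one
segment may hold. A larger mergeFactor makes better use of RAM and speeds up indexing, but it also
means that merges happen less often, which can leave a large number of unmerged segments; a search
then has to open more segment files and runs more slowly.
minMergeDocs is another IndexWriter instance variable that affects indexing performance. Its
value controls how many Documents have to be buffered before they’re merged to a segment. In other
words, minMergeDocs also controls how many documents are buffered, as mergeFactor does.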
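
The trade-off between merge frequency and segment count can be illustrated with a small toy model.
This is a sketch in plain Java, not Lucene code: the class SegmentMergeModel and its methods are
hypothetical, and it assumes that minMergeDocs documents are buffered before each flush and that
mergeFactor equally sized segments are merged into one, subject to maxMergeDocs.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: it mimics the merge rule described in these notes, not Lucene's real code.
class SegmentMergeModel {
    final int mergeFactor;   // number of equally sized segments that triggers a merge
    final int minMergeDocs;  // documents buffered in RAM before a flush to "disk"
    final int maxMergeDocs;  // upper bound on documents per segment
    final List<Integer> segments = new ArrayList<>(); // document count of each on-disk segment
    int buffered = 0;        // documents currently held in RAM
    int merges = 0;          // number of merges performed so far

    SegmentMergeModel(int mergeFactor, int minMergeDocs, int maxMergeDocs) {
        if (mergeFactor < 2 || minMergeDocs < 1) {
            throw new IllegalArgumentException("mergeFactor must be >= 2 and minMergeDocs >= 1");
        }
        this.mergeFactor = mergeFactor;
        this.minMergeDocs = minMergeDocs;
        this.maxMergeDocs = maxMergeDocs;
    }

    void addDocument() {
        buffered++;
        if (buffered >= minMergeDocs) {
            segments.add(buffered);  // flush the RAM buffer as a new segment
            buffered = 0;
            maybeMerge();
        }
    }

    // Merge the newest mergeFactor segments while they all have the same size.
    private void maybeMerge() {
        while (segments.size() >= mergeFactor) {
            int n = segments.size();
            int size = segments.get(n - 1);
            long merged = 0;
            for (int i = n - mergeFactor; i < n; i++) {
                if (segments.get(i) != size) return;  // sizes differ: nothing to merge yet
                merged += size;
            }
            if (merged > maxMergeDocs) return;        // merged segment would be too large
            for (int i = 0; i < mergeFactor; i++) {
                segments.remove(segments.size() - 1);
            }
            segments.add((int) merged);
            merges++;
        }
    }

    // Convenience helper: index `docs` documents with no maxMergeDocs limit.
    static SegmentMergeModel run(int mergeFactor, int minMergeDocs, int docs) {
        SegmentMergeModel m = new SegmentMergeModel(mergeFactor, minMergeDocs, Integer.MAX_VALUE);
        for (int i = 0; i < docs; i++) m.addDocument();
        return m;
    }

    public static void main(String[] args) {
        SegmentMergeModel low = run(10, 10, 990);
        SegmentMergeModel high = run(100, 10, 990);
        System.out.println("mergeFactor=10:  " + low.segments.size() + " segments, " + low.merges + " merges");
        System.out.println("mergeFactor=100: " + high.segments.size() + " segments, " + high.merges + " merges");
    }
}
```

In this model, 990 documents with mergeFactor=10 end up in 18 segments after 9 merges, while
mergeFactor=100 performs no merges and leaves 99 segments for a search to open.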

5.RAMDirectory helps make use of RAM; clustering or multithreading can also be used to exploit
hardware and software resources fully and speed up indexing.

6.Sometimes you may want to limit how much of each field is indexed, for example indexing only the
first 1000 terms; use maxFieldLength to control this.
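
The effect of such a limit can be sketched in plain Java. This is an illustration only: the class
FieldLengthSketch and the method indexedTerms are hypothetical, and a whitespace split stands in
for a real Lucene Analyzer. The assumption is that terms past the limit are simply left out of the
index.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a whitespace tokenizer stands in for a real Lucene Analyzer.
class FieldLengthSketch {
    // Returns the terms of fieldText that would be indexed under the given limit.
    static List<String> indexedTerms(String fieldText, int maxFieldLength) {
        List<String> terms = new ArrayList<>();
        for (String token : fieldText.trim().split("\\s+")) {
            if (terms.size() >= maxFieldLength) break;  // limit reached: ignore the rest
            if (!token.isEmpty()) terms.add(token);
        }
        return terms;
    }

    public static void main(String[] args) {
        // With a limit of 3, only the first three terms remain searchable.
        System.out.println(indexedTerms("Venice has lots of canals", 3));
    }
}
```

Here a search for "canals" would no longer find the document, because that term lies beyond the
limit.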

7.IndexWriter’s optimize() method merges segments, reducing the number of segments and therefore
the time spent reading the index during a search.

8.Be careful in multithreaded environments: an index-modifying IndexReader operation can’t be
executed while an index-modifying IndexWriter operation is in progress. To prevent misuse, Lucene
locks the index while certain APIs are in use.
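
One common way to enforce this kind of mutual exclusion is a lock file inside the index directory.
The sketch below shows the idea in plain Java; it is not Lucene's implementation, and the class
IndexLockSketch, its methods, and the file name used here are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of lock-file style mutual exclusion; names are made up for illustration.
class IndexLockSketch {
    // Tries to take the lock by creating the file atomically; false if it is already held.
    static boolean tryLock(Path lockFile) {
        try {
            Files.createFile(lockFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Releases the lock by deleting the file.
    static void release(Path lockFile) {
        try {
            Files.deleteIfExists(lockFile);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Creates a throwaway directory that plays the role of the index directory.
    static Path newTempIndexDir() {
        try {
            return Files.createTempDirectory("index-sketch");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path lock = newTempIndexDir().resolve("write.lock");
        System.out.println("writer gets lock: " + tryLock(lock));       // lock is free
        System.out.println("reader gets lock: " + tryLock(lock));       // writer still holds it
        release(lock);                                                   // writer finishes
        System.out.println("reader after release: " + tryLock(lock));   // lock is free again
        release(lock);
    }
}
```

The second caller is refused until the first one releases the lock, which is the behavior that
stops a reader-side delete from running while a writer is modifying the index.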