亚洲阿v天堂在线2017免费,亚洲区视频在线观看,久久99国产亚洲精品观看

回文字符串的切割問(wèn)題

摘要: 給定一個(gè)字符串s，分割s使得每個(gè)子串都是回文的（即倒過(guò)來(lái)和原字符串是一樣的，如aba）
求最少的分割次數(shù)。閱讀全文

posted @ 2013-04-11 11:24 小明閱讀(4143) | 評(píng)論 (3) | 編輯收藏

TopCoder MatchMaker解法

問(wèn)題描述：

Problem Statement

THIS PROBLEM WAS TAKEN FROM THE SEMIFINALS OF THE TOPCODER INVITATIONAL

TOURNAMENT

DEFINITION

Class Name: MatchMaker

Method Name: getBestMatches

Paramaters: String[], String, int

Returns: String[]

Method signature (be sure your method is public): String[]

getBestMatches(String[] members, String currentUser, int sf);

PROBLEM STATEMENT

A new online match making company needs some software to help find the "perfect

couples". People who sign up answer a series of multiple-choice questions.

Then, when a member makes a "Get Best Mates" request, the software returns a

list of users whose gender matches the requested gender and whose answers to

the questions were equal to or greater than a similarity factor when compared

to the user's answers.

Implement a class MatchMaker, which contains a method getBestMatches. The

method takes as parameters a String[] members, String currentUser, and an int

sf:

- members contains information about all the members. Elements of members are

of the form "NAME G D X X X X X X X X X X"

* NAME represents the member's name

* G represents the gender of the current user.

* D represents the requested gender of the potential mate.

* Each X indicates the member's answer to one of the multiple-choice

questions. The first X is the answer to the first question, the second is the

answer to the second question, et cetera.

- currentUser is the name of the user who made the "Get Best Mates" request.

- sf is an integer representing the similarity factor.

The method returns a String[] consisting of members' names who have at least sf

identical answers to currentUser and are of the requested gender. The names

should be returned in order from most identical answers to least. If two

members have the same number of identical answers as the currentUser, the names

should be returned in the same relative order they were inputted.

TopCoder will ensure the validity of the inputs. Inputs are valid if all of

the following criteria are met:

- members will have between 1 and 50 elements, inclusive.

- Each element of members will have a length between 7 and 44, inclusive.

- NAME will have a length between 1 and 20, inclusive, and only contain

uppercase letters A-Z.

- G can be either an uppercase M or an uppercase F.

- D can be either an uppercase M or an uppercase F.

- Each X is a capital letter (A-D).

- The number of Xs in each element of the members is equal. The number of Xs

will be between 1 and 10, inclusive.

- No two elements will have the same NAME.

- Names are case sensitive.

- currentUser consists of between 1 and 20, inclusive, uppercase letters, A-Z,

and must be a member.

- sf is an int between 1 and 10, inclusive.

- sf must be less than or equal to the number of answers (Xs) of the members.

NOTES

The currentUser should not be included in the returned list of potential mates.

EXAMPLES

For the following examples, assume members =

{"BETTY F M A A C C",

"TOM M F A D C A",

"SUE F M D D D D",

"ELLEN F M A A C A",

"JOE M F A A C A",

"ED M F A D D A",

"SALLY F M C D A B",

"MARGE F M A A C C"}

If currentUser="BETTY" and sf=2, BETTY and TOM have two identical answers and

BETTY and JOE have three identical answers, so the method should return

{"JOE","TOM"}.

If currentUser="JOE" and sf=1, the method should return

{"ELLEN","BETTY","MARGE"}.

If currentUser="MARGE" and sf=4, the method should return [].

Definition

Class:

MatchMaker

Method:

getBestMatches

Parameters:

String[], String, int

Returns:

String[]

Method signature:

String[] getBestMatches(String[] param0, String param1, int param2)

(be sure your method is public)

================================================================我的代碼=============

import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

public class MatchMaker {
    enum GENDER{MALE,FEMALE};

    //"NAME G D X X X X X X X X X X"
    private static class Member{
        String name;
        GENDER gender;
        GENDER mate;
        String[] answers;
        int index;
        int matched = 0;
    }

    String[] getBestMatches(String[] members, String currentUser, int sf){
        List<Member> allMembers = new ArrayList<Member>();
        Member cu = null;
        for(int i=0;i<members.length;++i){
            String m = members[i];
            String[] c = m.split(" ");
            Member mem = new Member();
            mem.name= c[0];
            mem.gender = c[1].equals("M")?GENDER.MALE:GENDER.FEMALE;
            mem.mate = c[2].equals("M")?GENDER.MALE:GENDER.FEMALE;
            mem.index = i;
            mem.matched = 0;
            String[] answers = mem.answers = new String[c.length-3];
            for(int j=3;j<c.length;++j){
                answers[j-3] = c[j];
            }
            allMembers.add(mem);
            if(c[0].equals(currentUser)){
                cu = mem;
            }
        }
        List<Member> matched = new ArrayList<Member>();
        if(cu!=null){
            for(Member mem:allMembers){
                if(mem!=cu && mem.gender==cu.mate){
                    for(int i=0;i<mem.answers.length;++i){
                        if(mem.answers[i].equals(cu.answers[i])){
                            ++mem.matched;
                        }
                    }
                    if(mem.matched>=sf){
                        matched.add(mem);
                    }
                }
            }

            Collections.sort(matched, new Comparator<Member>(){
                public int compare(Member ma, Member mb) {
                    if(ma.matched!=mb.matched){
                        return mb.matched - ma.matched;
                    }
                    return ma.index-mb.index;
                }
            });

            String[] result = new String[matched.size()];
            for(int i=0;i<result.length;++i){
                result[i] = matched.get(i).name;
            }
            return result;
        }
        return new String[0];
    }
}

posted @ 2013-04-02 14:04 小明閱讀(402) | 評(píng)論 (0) | 編輯收藏

有點(diǎn)難度的java筆試題

以下是我在上一家公司出的java筆試題，有些難度，感興趣的同學(xué)可以做做看。

---Question---

1.What is the output of the following program?

public class Foo {

public static void main(String[] args){

Map<byte[], String> m = new HashMap<byte[], String>();

byte[] key = "abcd".getBytes();

m.put(key, "abcd");

System.out.println(m.containsKey(key));

System.out.println(m.containsKey("abcd"));

System.out.println(m.containsKey("abcd".getBytes()));

}

a) true,true,false b)true,false,false c)true,true,true d) false,false,false e)Program throws an exception

2. What is the proper string filled in the following program?

Public class Foo {

public static void main(String[] args) {

String s=”1\\2\\3\\4”;

//split the string with “\”

String []result = s.split(“____”);

for(String r:result){

System.out.println(r);

}

a) \ b) \\ c) \\\ d)\\\\ e)\\\\\

3. What is the output of the following program?

public class Foo {

public static void main(String[] args) {

char[] c = new char[] { '1' };

String s = new String(c);

System.out.println("abcd" + c);

System.out.println("abcd" + s);

}

a) Compile error b)abcd1,abcd1 c) abcd49,abcd1 d) Program throws an exception e)none of above

4. Which class is threading safe which one object can be used between multi-threads without extra synchronized?

a) Vector b) HashMap c) ArrayList d)StringBuilder e)HashSet

5. What is the output of the following program?

public class Foo {

public static void main(String[] args) throws IOException {

ByteArrayOutputStream baos = new ByteArrayOutputStream();

byte[] b = new byte[]{(byte)0x0,(byte)0x1,(byte)0x2};

baos.write(b);

baos.write(0x0102);

byte[] result = baos.toByteArray();

ByteArrayInputStream bais = new ByteArrayInputStream(result);

System.out.println(bais.available());

}

a) 0 b) 1 c)4 d) 5 e) Program throws an exception

6. What is return value of function “calc”?

public class Foo {

public static int calc() throws IOException{

int ret = 0;

try{

++ret;

throw new IOException("try");

}

catch(IOException ioe){

--ret;

return ret;

}

finally{

++ret;

return ret;

}

a) 0 b) 1 c)2 d)3 e) throws an exception

7. What is the output of the following program?

public class Foo {

public static class Value {

private int value;

public int get(){

return value;

}

public void set(int v){

value = v;

}

public static class Values implements Iterable<Value>{

public Values(int capacity){

this.capacity = capacity;

}

int count =1 ;

int capacity;

Value v = new Value();

public Iterator<Value> iterator() {

return new Iterator<Value>(){

public boolean hasNext() {

return count<=capacity;

}

public Value next() {

v.set(count++);

return v;

}

public void remove() {

throw new UnsupportedOperationException();

}

};

}

public static void main(String[] args) {

Values vs = new Values(10);

Value result = null;

for(Value v:vs){

if(result == null){

result = v;

}

else{

result.set(result.get()+v.get());

}

System.out.println(result.get());

}

a) 20 b)40 c)45 d)55 e)throws NullpointerException

8. If add keyword “final” before a class member function, it means:

a) The method can’t access the non-final member variable.

b) The method can’t modify the member variable.

c) The method can’t be override by subclass.

d) The method is a thread-safe function.

e) The method can’t be accessed by other non-final function.

9. About java memory and garbage collector, which statement is correct?

a) Moving variable from locale to class will make GC more effectively.

b) When Full GC is executing, all the user threads will be paused.

c) If object A contains a reference of object B and object B contains a reference of object A, the two objects can’t be reclaimed by GC.

d) When a thread exits, all objects which created by that thread will be reclaimed

e) It is recommended that calling “System.gc()” to control the memory usage.

10. About java classpath and classloader, which statement is NOT correct?

a) User can specify the classpath by using the option “-cp” in Java command line.

b) If user doesn’t specify classpath, the JVM search the class from the current folder by default.

c) A JVM can load two different versions of a library.

d) To define customized class loader, it is possible to load class from internet at runtime.

11. Which data structure has best performance when remove an element from it?

a) Vector b)ArrayList c)LinkedList d)HashMap e)HashSet

12. Which is the correct way to convert bytes from charset “gb2312” to “utf-8”?

byte[] src , dst;

a) dst = new String(src,”utf-8”).getBytes(“gb2312”);

b) dst = new String(src,”gb2312”).getBytes(“utf-8”);

c) dst = new String(src,”utf-16”).getBytes();

d) dst = new String(src).getBytes();

e) None of above.

posted @ 2012-11-07 09:46 小明閱讀(2540) | 評(píng)論 (3) | 編輯收藏

leveldb性能分析 - 隨機(jī)寫(xiě)

摘要: 準(zhǔn)備工作：1. 下載Snappy庫(kù)Download source code from: http://code.google.com/p/snappy編譯并安裝./configure & make & sudo make install2. 編譯leveldb自帶的db_benchmake db_bench注意:在ubuntu 11.04上編譯會(huì)出錯(cuò)，修改makefile:$(CX... 閱讀全文

posted @ 2012-03-22 17:32 小明閱讀(4170) | 評(píng)論 (0) | 編輯收藏

leveldb研究11- 流程分析：讀數(shù)據(jù)

leveldb讀數(shù)據(jù)

先看看ReadOptions有哪些參數(shù)可以指定：

// Options that control read operations
struct ReadOptions {
  // 是否檢查checksum
  // Default: false
  bool verify_checksums;

  // 是否將此次結(jié)果放入cache
  // Default: true
  bool fill_cache;

  //是否指定snapshot,否則讀取當(dāng)前版本
  // Default: NULL
  const Snapshot* snapshot;

  ReadOptions()
      : verify_checksums(false),
        fill_cache(true),
        snapshot(NULL) {
  }
};

下面看看讀取的詳細(xì)過(guò)程：
查詢(xún)memtable=>查詢(xún)previous memtable(imm_)=>查詢(xún)文件(緩沖）

Status DBImpl::Get(const ReadOptions& options,
                   const Slice& key,
                   std::string* value) {
  Status s;
  MutexLock l(&mutex_);
  SequenceNumber snapshot;
  //設(shè)置snapshot
  if (options.snapshot != NULL) {
    snapshot = reinterpret_cast<const SnapshotImpl*>(options.snapshot)->number_;
  } else {
    snapshot = versions_->LastSequence();
  }

  MemTable* mem = mem_;
  MemTable* imm = imm_;
  Version* current = versions_->current();
  mem->Ref();
  if (imm != NULL) imm->Ref();
  current->Ref();

  bool have_stat_update = false;
  Version::GetStats stats;

  // Unlock while reading from files and memtables
  {
    mutex_.Unlock();
    LookupKey lkey(key, snapshot);
    //先查詢(xún)memtable
    if (mem->Get(lkey, value, &s)) {
      // Done
    } else if (imm != NULL && imm->Get(lkey, value, &s)) { //然后查詢(xún)previous memtable:imm_
      // Done
    } else {
      //從文件中讀取
      s = current->Get(options, lkey, value, &stats);
      have_stat_update = true;
    }
    mutex_.Lock();
  }

  //是否有文件需要被compaction,參見(jiàn)allowed_seek
  if (have_stat_update && current->UpdateStats(stats)) {
    MaybeScheduleCompaction();
  }
  mem->Unref();
  if (imm != NULL) imm->Unref();
  current->Unref();
  return s;
}

重點(diǎn)來(lái)看看從version中讀?。?br />

Status Version::Get(const ReadOptions& options,
                    const LookupKey& k,
                    std::string* value,
                    GetStats* stats) {
  Slice ikey = k.internal_key();
  Slice user_key = k.user_key();
  const Comparator* ucmp = vset_->icmp_.user_comparator();
  Status s;

  stats->seek_file = NULL;
  stats->seek_file_level = -1;
  FileMetaData* last_file_read = NULL;
  int last_file_read_level = -1;

  //從level0向高層查找，如果再低級(jí)level中查到，則不再查詢(xún)
  std::vector<FileMetaData*> tmp;
  FileMetaData* tmp2;
  for (int level = 0; level < config::kNumLevels; level++) {
    size_t num_files = files_[level].size();
    //本層文件數(shù)為空，則返回
    if (num_files == 0) continue;

    // Get the list of files to search in this level
    FileMetaData* const* files = &files_[level][0];
    if (level == 0) {
      //level0特殊處理，因?yàn)閗ey是重疊，所有符合條件的文件必須被查找
      tmp.reserve(num_files);
      for (uint32_t i = 0; i < num_files; i++) {
        FileMetaData* f = files[i];
        if (ucmp->Compare(user_key, f->smallest.user_key()) >= 0 &&
            ucmp->Compare(user_key, f->largest.user_key()) <= 0) {
          tmp.push_back(f);
        }
      }
      if (tmp.empty()) continue;

      std::sort(tmp.begin(), tmp.end(), NewestFirst);
      files = &tmp[0];
      num_files = tmp.size();
    } else {
      // 二分法查找，某個(gè)key只可能屬于一個(gè)文件
      uint32_t index = FindFile(vset_->icmp_, files_[level], ikey);
      //沒(méi)有查到
      if (index >= num_files) {
        files = NULL;
        num_files = 0;
      } else {
        tmp2 = files[index];
        if (ucmp->Compare(user_key, tmp2->smallest.user_key()) < 0) {
          // All of "tmp2" is past any data for user_key
          files = NULL;
          num_files = 0;
        } else {
          files = &tmp2;
          num_files = 1;
        }
      }
    }

    for (uint32_t i = 0; i < num_files; ++i) { //遍歷本層符合條件的文件
      if (last_file_read != NULL && stats->seek_file == NULL) {
        //seek_file只記錄第一個(gè)
        stats->seek_file = last_file_read;
        stats->seek_file_level = last_file_read_level;
      }

      FileMetaData* f = files[i];
      last_file_read = f;
      last_file_read_level = level;

      //從table cache中讀取
      Iterator* iter = vset_->table_cache_->NewIterator(
          options,
          f->number,
          f->file_size);
      iter->Seek(ikey);
      const bool done = GetValue(ucmp, iter, user_key, value, &s);
      if (!iter->status().ok()) { //查找到
        s = iter->status();
        delete iter;
        return s;
      } else {
        delete iter;
        if (done) {
          return s;
        }
      }
    }
  }

  return Status::NotFound(Slice());  // Use an empty error message for speed
}

繼續(xù)跟蹤：TableCache

Iterator* TableCache::NewIterator(const ReadOptions& options,
                                  uint64_t file_number,
                                  uint64_t file_size,
                                  Table** tableptr) {
  if (tableptr != NULL) {
    *tableptr = NULL;
  }

  char buf[sizeof(file_number)];
  EncodeFixed64(buf, file_number);
  Slice key(buf, sizeof(buf));

  //從LRU cache中查找
  Cache::Handle* handle = cache_->Lookup(key);
  if (handle == NULL) {
    /加載文件
    std::string fname = TableFileName(dbname_, file_number);
    RandomAccessFile* file = NULL;
    Table* table = NULL;
    Status s = env_->NewRandomAccessFile(fname, &file);
    if (s.ok()) {
      s = Table::Open(*options_, file, file_size, &table);
    }

    if (!s.ok()) {
      assert(table == NULL);
      delete file;
      // We do not cache error results so that if the error is transient,
      // or somebody repairs the file, we recover automatically.
      return NewErrorIterator(s);
    }

    //插入Cache
    TableAndFile* tf = new TableAndFile;
    tf->file = file;
    tf->table = table;
    handle = cache_->Insert(key, tf, 1, &DeleteEntry);
  }

  Table* table = reinterpret_cast<TableAndFile*>(cache_->Value(handle))->table;
  //從Table對(duì)象中生成iterator
  Iterator* result = table->NewIterator(options);
  result->RegisterCleanup(&UnrefEntry, cache_, handle);
  if (tableptr != NULL) {
    *tableptr = table;
  }
  return result;
}

posted @ 2012-03-21 17:30 小明閱讀(2729) | 評(píng)論 (0) | 編輯收藏

leveldb研究10- 流程分析：寫(xiě)數(shù)據(jù)

總體來(lái)說(shuō),leveldb的寫(xiě)操作有兩個(gè)步驟，首先是針對(duì)log的append操作，然后是對(duì)memtable的插入操作。

影響寫(xiě)性能的因素有：
1. write_buffer_size
2. kL0_SlowdownWritesTrigger and kL0_StopWritesTrigger.提高這兩個(gè)值，能夠增加寫(xiě)的性能，但是降低讀的性能

看看WriteOptions有哪些參數(shù)可以指定

struct WriteOptions {
  //設(shè)置sync=true,leveldb會(huì)調(diào)用fsync()，這會(huì)降低插入性能
  //同時(shí)會(huì)增加數(shù)據(jù)的安全性
  //Default: false
  bool sync;

  WriteOptions()
      : sync(false) {
  }
};

首先把Key,value轉(zhuǎn)成WriteBatch

Status DB::Put(const WriteOptions& opt, const Slice& key, const Slice& value) {
  WriteBatch batch;
  batch.Put(key, value);
  return Write(opt, &batch);
}

接下來(lái)就是真正的插入了
這里使用了兩把鎖，主要是想提高并發(fā)能力，減少上鎖的時(shí)間。
首先是檢查是否可寫(xiě)，然后append log，最后是插入memtable
<db/dbimpl.cc>

Status DBImpl::Write(const WriteOptions& options, WriteBatch* updates) {
  Status status;
  //加鎖
  MutexLock l(&mutex_);
  LoggerId self;
  //拿到寫(xiě)log的權(quán)利
  AcquireLoggingResponsibility(&self);
  //檢查是否可寫(xiě)
  status = MakeRoomForWrite(false);  // May temporarily release lock and wait
  uint64_t last_sequence = versions_->LastSequence();
  if (status.ok()) {
    WriteBatchInternal::SetSequence(updates, last_sequence + 1);
    last_sequence += WriteBatchInternal::Count(updates);

    // Add to log and apply to memtable.  We can release the lock during
    // this phase since the "logger_" flag protects against concurrent
    // loggers and concurrent writes into mem_.
    {
      assert(logger_ == &self);
      mutex_.Unlock();
      //IO操作：寫(xiě)入LOG
      status = log_->AddRecord(WriteBatchInternal::Contents(updates));
      if (status.ok() && options.sync) {
        status = logfile_->Sync();
      }
      //插入memtable
      if (status.ok()) {
        status = WriteBatchInternal::InsertInto(updates, mem_);
      }
      mutex_.Lock();
      assert(logger_ == &self);
    }
    //設(shè)置新的seqence number
    versions_->SetLastSequence(last_sequence);
  }
  //釋放寫(xiě)LOG鎖
  ReleaseLoggingResponsibility(&self);
  return status;
}

寫(xiě)流量控制：
<db/dbimpl.cc>

Status DBImpl::MakeRoomForWrite(bool force) {
  mutex_.AssertHeld();
  assert(logger_ != NULL);
  bool allow_delay = !force;
  Status s;
  while (true) {
    if (!bg_error_.ok()) {
      // Yield previous error
      s = bg_error_;
      break;
    } else if (
        allow_delay &&
        versions_->NumLevelFiles(0) >= config::kL0_SlowdownWritesTrigger) {
      mutex_.Unlock();
      //如果level0的文件大于kL0_SlowdownWritesTrigger閾值，則sleep 1s，這樣給compaction更多的CPU
      env_->SleepForMicroseconds(1000);
      allow_delay = false;  // Do not delay a single write more than once
      mutex_.Lock();
    } else if (!force &&
               (mem_->ApproximateMemoryUsage() <= options_.write_buffer_size)) {
      //可寫(xiě)
      break;
    } else if (imm_ != NULL) {
      // imm_:之前的memtable 沒(méi)有被compaction，需要等待
      bg_cv_.Wait();
    } else if (versions_->NumLevelFiles(0) >= config::kL0_StopWritesTrigger) {
      // level0文件個(gè)數(shù)大于kL0_StopWritesTrigger,需要等待
      Log(options_.info_log, "waiting

\n");
      bg_cv_.Wait();
    } else {
      //生成新的額memtable和logfile，把當(dāng)前memtable傳給imm_
      assert(versions_->PrevLogNumber() == 0);
      uint64_t new_log_number = versions_->NewFileNumber();
      WritableFile* lfile = NULL;
      s = env_->NewWritableFile(LogFileName(dbname_, new_log_number), &lfile);
      if (!s.ok()) {
        break;
      }
      delete log_;
      delete logfile_;
      logfile_ = lfile;
      logfile_number_ = new_log_number;
      log_ = new log::Writer(lfile);
      imm_ = mem_;
      has_imm_.Release_Store(imm_);
      mem_ = new MemTable(internal_comparator_);
      mem_->Ref();
      force = false;   // Do not force another compaction if have room

// 發(fā)起compaction,dump imm_

      MaybeScheduleCompaction();
    }
  }
  return s;
}

posted @ 2012-03-21 14:41 小明閱讀(3448) | 評(píng)論 (0) | 編輯收藏

leveldb研究9- 流程分析：打開(kāi)數(shù)據(jù)庫(kù)

摘要: leveldb 是通過(guò)Open函數(shù)來(lái)打開(kāi)/新建數(shù)據(jù)庫(kù)。Code highlighting produced by Actipro CodeHighlighter (freeware)http://www.CodeHighlighter.com/-->static Status Open(const Options& options, &nb... 閱讀全文

posted @ 2012-03-20 16:02 小明閱讀(3439) | 評(píng)論 (0) | 編輯收藏

leveldb研究8- 內(nèi)存中的數(shù)據(jù)結(jié)構(gòu)Memtable/SkipList

我們知道,leveldb在寫(xiě)數(shù)據(jù)的時(shí)候，除了log文件，都要在內(nèi)存中寫(xiě)一份。

先看看SkipList【跳表】這個(gè)數(shù)據(jù)結(jié)構(gòu)：

SkipList有如下特點(diǎn)：
1. 本質(zhì)上一個(gè)排序好的鏈表
2. 分層，上層節(jié)點(diǎn)比下層的少，更具有跳躍性
3. 查詢(xún)的復(fù)雜度是O(logn)

SkipList跟紅黑樹(shù)等還是比較容易實(shí)現(xiàn)和理解的，主要長(zhǎng)處是比較容易實(shí)現(xiàn)Lock free和遍歷。
我們來(lái)看看leveldb的實(shí)現(xiàn)
插入：

//插入一個(gè)新的key
template<typename Key, class Comparator>
void SkipList<Key,Comparator>::Insert(const Key& key) {
  //查找插入節(jié)點(diǎn),prev為各層的前置節(jié)點(diǎn)
  Node* prev[kMaxHeight];
  Node* x = FindGreaterOrEqual(key, prev);

  // Our data structure does not allow duplicate insertion
  assert(x == NULL || !Equal(key, x->key));

  //生成隨機(jī)高度
  int height = RandomHeight();
  if (height > GetMaxHeight()) {
    for (int i = GetMaxHeight(); i < height; i++) {
      prev[i] = head_;
    }
    //設(shè)置當(dāng)前最大高度
    max_height_.NoBarrier_Store(reinterpret_cast<void*>(height));
  }

  //生成新節(jié)點(diǎn)
  x = NewNode(key, height);
  for (int i = 0; i < height; i++) {
    //設(shè)置新節(jié)點(diǎn)的各層的下一跳
    x->NoBarrier_SetNext(i, prev[i]->NoBarrier_Next(i));
    //設(shè)置前節(jié)點(diǎn)的next為當(dāng)前節(jié)點(diǎn)，完成插入
    prev[i]->SetNext(i, x);
  }
}

查詢(xún)：

template<typename Key, class Comparator>
typename SkipList<Key,Comparator>::Node* SkipList<Key,Comparator>::FindGreaterOrEqual(const Key& key, Node** prev)
    const {
  Node* x = head_;
  int level = GetMaxHeight() - 1; //從高層開(kāi)始查找，依次到0 level
  while (true) {
    Node* next = x->Next(level);
    if (KeyIsAfterNode(key, next)) { //比next key 要大
      // Keep searching in this list
      x = next;
    } else { //比next key小，查找下一層
      //標(biāo)記當(dāng)前l(fā)evel的前置節(jié)點(diǎn)
      if (prev != NULL) prev[level] = x;
      if (level == 0) {
        return next;
      } else {
        level--;
      }
    }
  }
}

template<typename Key, class Comparator>
bool SkipList<Key,Comparator>::Contains(const Key& key) const {
  Node* x = FindGreaterOrEqual(key, NULL);
  if (x != NULL && Equal(key, x->key)) {
    return true;
  } else {
    return false;
  }
}

接著我們看看leveldb MemTable的實(shí)現(xiàn)，很簡(jiǎn)單了，基本是對(duì)SkipList訪問(wèn)一個(gè)封裝
<db/memtable.h>

class MemTable {
public:
  explicit MemTable(const InternalKeyComparator& comparator);

  //增加引用計(jì)數(shù)
  void Ref() { ++refs_; }

  //減少引用計(jì)數(shù)
  void Unref() {
    --refs_;
    assert(refs_ >= 0);
    if (refs_ <= 0) {
      delete this;
    }
  }

  //內(nèi)存使用量
  size_t ApproximateMemoryUsage();

  //遍歷操作
  Iterator* NewIterator();

  //插入
  void Add(SequenceNumber seq, ValueType type,
           const Slice& key,
           const Slice& value);

  //查詢(xún)
  bool Get(const LookupKey& key, std::string* value, Status* s);

private:
  ~MemTable();  // Private since only Unref() should be used to delete it

  //key compartor，用于排序
  struct KeyComparator {
    const InternalKeyComparator comparator;
    explicit KeyComparator(const InternalKeyComparator& c) : comparator(c) { }
    int operator()(const char* a, const char* b) const;
  };
  friend class MemTableIterator;
  friend class MemTableBackwardIterator;

  typedef SkipList<const char*, KeyComparator> Table;

  KeyComparator comparator_;
  int refs_; //引用計(jì)數(shù)
  Arena arena_; //內(nèi)存分配器
  Table table_; //數(shù)據(jù)存放SkipList

  // No copying allowed
  MemTable(const MemTable&);
  void operator=(const MemTable&);
};

先看看插入
<db/memtable.cc>

void MemTable::Add(SequenceNumber s, ValueType type,
                   const Slice& key,
                   const Slice& value) {
  //數(shù)據(jù)結(jié)構(gòu)：
  //1.internal key size : Varint32 (length of 2+3)
  //2.key data
  //3.SequenceNumber+Key type: 8 bytes
  //4 value size: Varint32
  //5 value data
  size_t key_size = key.size();
  size_t val_size = value.size();
  size_t internal_key_size = key_size + 8;
  const size_t encoded_len =
      VarintLength(internal_key_size) + internal_key_size +
      VarintLength(val_size) + val_size;
  char* buf = arena_.Allocate(encoded_len);
  char* p = EncodeVarint32(buf, internal_key_size);
  memcpy(p, key.data(), key_size);
  p += key_size;
  EncodeFixed64(p, (s << 8) | type);
  p += 8;
  p = EncodeVarint32(p, val_size);
  memcpy(p, value.data(), val_size);
  assert((p + val_size) - buf == encoded_len);
  table_.Insert(buf);
}

查詢(xún)
<db/memtable.cc>

bool MemTable::Get(const LookupKey& key, std::string* value, Status* s) {
  Slice memkey = key.memtable_key();
  Table::Iterator iter(&table_);
  iter.Seek(memkey.data());
  if (iter.Valid()) {
    // entry format is:
    //    klength  varint32
    //    userkey  char[klength]
    //    tag      uint64
    //    vlength  varint32
    //    value    char[vlength]
    // Check that it belongs to same user key.  We do not check the
    // sequence number since the Seek() call above should have skipped
    // all entries with overly large sequence numbers.
    const char* entry = iter.key();
    uint32_t key_length;
    const char* key_ptr = GetVarint32Ptr(entry, entry+5, &key_length);
    if (comparator_.comparator.user_comparator()->Compare(
            Slice(key_ptr, key_length - 8),
            key.user_key()) == 0) {
      // Correct user key
      const uint64_t tag = DecodeFixed64(key_ptr + key_length - 8);
      switch (static_cast<ValueType>(tag & 0xff)) {
        case kTypeValue: {
          Slice v = GetLengthPrefixedSlice(key_ptr + key_length);
          value->assign(v.data(), v.size());
          return true;
        }
        case kTypeDeletion:
          *s = Status::NotFound(Slice());
          return true;
      }
    }
  }
  return false;
}

posted @ 2012-03-19 16:31 小明閱讀(2997) | 評(píng)論 (0) | 編輯收藏

leveldb研究7-Version/VersionSet/VersionEdit

leveldb 使用 version 來(lái)保存數(shù)據(jù)庫(kù)的狀態(tài)。

先看看一個(gè)重要的數(shù)據(jù)結(jié)果，sst file的META info
<db/version_edit.h>

struct FileMetaData {
  int refs; //引用計(jì)數(shù)
  int allowed_seeks; //允許的seeks次數(shù)
  uint64_t number;//文件編號(hào)
  uint64_t file_size;  //文件大小
  InternalKey smallest;    //最小的key
  InternalKey largest;      //最大的key

  FileMetaData() : refs(0), allowed_seeks(1 << 30), file_size(0) { }
};

這里面有一個(gè)很有意思的字段: allowed_seeks,代表了可以seek的次數(shù)，為0的時(shí)候表示這個(gè)文件需要被compaction.如何設(shè)置seeks次數(shù)呢？文件大小除以16k，不到100算100。

f->allowed_seeks = (f->file_size / 16384);
if (f->allowed_seeks < 100) f->allowed_seeks = 100;

原因，請(qǐng)看leveldb的注釋?zhuān)?br />

// We arrange to automatically compact this file after a certain number of seeks. Let's assume:
      //   (1) One seek costs 10ms
      //   (2) Writing or reading 1MB costs 10ms (100MB/s)
      //   (3) A compaction of 1MB does 25MB of IO:
      //         1MB read from this level
      //         10-12MB read from next level (boundaries may be misaligned)
      //         10-12MB written to next level
      // This implies that 25 seeks cost the same as the compaction
      // of 1MB of data. I.e., one seek costs approximately the
      // same as the compaction of 40KB of data. We are a little
      // conservative and allow approximately one seek for every 16KB
      // of data before triggering a compaction.

接下來(lái)看Version的定義，version其實(shí)就是一系列的SST file的集合。

class Version {
public:
  //生成iterator用于遍歷
  void AddIterators(const ReadOptions&, std::vector<Iterator*>* iters);

  //根據(jù)key來(lái)查詢(xún)，若沒(méi)有查到，更新GetStats
  struct GetStats {
    FileMetaData* seek_file;
    int seek_file_level;
  };
  Status Get(const ReadOptions&, const LookupKey& key, std::string* val,
             GetStats* stats);

  //是否需要進(jìn)行compaction
  bool UpdateStats(const GetStats& stats);

  //引用計(jì)算，避免在被引用時(shí)候刪除
  void Ref();
  void Unref();

  //查詢(xún)和key range有關(guān)的files
  void GetOverlappingInputs(
      int level,
      const InternalKey* begin,         // NULL means before all keys
      const InternalKey* end,           // NULL means after all keys
      std::vector<FileMetaData*>* inputs);

  //計(jì)算是否level對(duì)某個(gè)key range是否有overlap
  bool OverlapInLevel(int level,
                      const Slice* smallest_user_key,
                      const Slice* largest_user_key);

  //memtable output應(yīng)該放到哪個(gè)level
  int PickLevelForMemTableOutput(const Slice& smallest_user_key,
                                 const Slice& largest_user_key);

  //某個(gè)level的文件個(gè)數(shù)
   int NumFiles(int level) const { return files_[level].size(); }

  // Return a human readable string that describes this version's contents.
  std::string DebugString() const;

private:
  friend class Compaction;
  friend class VersionSet;

  class LevelFileNumIterator;
  Iterator* NewConcatenatingIterator(const ReadOptions&, int level) const;

  VersionSet* vset_;            // VersionSet to which this Version belongs
  Version* next_;               // Next version in linked list
  Version* prev_;               // Previous version in linked list
  int refs_;                    // Number of live refs to this version

  //sst files
  std::vector<FileMetaData*> files_[config::kNumLevels];

  //下一個(gè)要被compaction的文件
  FileMetaData* file_to_compact_;
  int file_to_compact_level_;

  //compaction score:>1表示要compaction
  double compaction_score_;
  int compaction_level_;

  explicit Version(VersionSet* vset)
      : vset_(vset), next_(this), prev_(this), refs_(0),
        file_to_compact_(NULL),
        file_to_compact_level_(-1),
        compaction_score_(-1),
        compaction_level_(-1) {
  }

  ~Version();

  // No copying allowed
  Version(const Version&);
  void operator=(const Version&);
};

那VersionSet呢？VersionSet 是version組成一個(gè)雙向循環(huán)鏈表。

class VersionSet{
//. . .
Env* const env_;
  const std::string dbname_;
  const Options* const options_;
  TableCache* const table_cache_;
  const InternalKeyComparator icmp_;
  uint64_t next_file_number_;
  uint64_t manifest_file_number_;
  uint64_t last_sequence_;
  uint64_t log_number_;

  WritableFile* descriptor_file_;
  log::Writer* descriptor_log_;
  Version dummy_versions_;  // Head of circular doubly-linked list of versions.
  Version* current_;        // == dummy_versions_.prev_

  //每層都有一個(gè)compact pointer用于指示下次從哪里開(kāi)始compact,以用于實(shí)現(xiàn)循環(huán)compact
  std::string compact_pointer_[config::kNumLevels];
//. . .
}

VersionEdit是version對(duì)象的變更記錄，用于寫(xiě)入manifest.這樣通過(guò)原始的version加上一系列的versionedit的記錄，就可以恢復(fù)到最新?tīng)顟B(tài)。

class VersionEdit {
public:
  VersionEdit() { Clear(); }
  ~VersionEdit() { }

  void Clear();

  void SetComparatorName(const Slice& name) {
    has_comparator_ = true;
    comparator_ = name.ToString();
  }
  void SetLogNumber(uint64_t num) {
    has_log_number_ = true;
    log_number_ = num;
  }
  void SetPrevLogNumber(uint64_t num) {
    has_prev_log_number_ = true;
    prev_log_number_ = num;
  }
  void SetNextFile(uint64_t num) {
    has_next_file_number_ = true;
    next_file_number_ = num;
  }
  void SetLastSequence(SequenceNumber seq) {
    has_last_sequence_ = true;
    last_sequence_ = seq;
  }
  void SetCompactPointer(int level, const InternalKey& key) {
    compact_pointers_.push_back(std::make_pair(level, key));
  }

  //添加meta file
  void AddFile(int level, uint64_t file,
               uint64_t file_size,
               const InternalKey& smallest,
               const InternalKey& largest) {
    FileMetaData f;
    f.number = file;
    f.file_size = file_size;
    f.smallest = smallest;
    f.largest = largest;
    new_files_.push_back(std::make_pair(level, f));
  }

  //刪除特定的文件
  void DeleteFile(int level, uint64_t file) {
    deleted_files_.insert(std::make_pair(level, file));
  }

  //編碼，解碼：用于寫(xiě)入manifest
  void EncodeTo(std::string* dst) const;
  Status DecodeFrom(const Slice& src);

  std::string DebugString() const;

private:
  friend class VersionSet;

  typedef std::set< std::pair<int, uint64_t> > DeletedFileSet;

  std::string comparator_;
  uint64_t log_number_;
  uint64_t prev_log_number_;
  uint64_t next_file_number_;
  SequenceNumber last_sequence_;
  bool has_comparator_;
  bool has_log_number_;
  bool has_prev_log_number_;
  bool has_next_file_number_;
  bool has_last_sequence_;

  std::vector< std::pair<int, InternalKey> > compact_pointers_;
  DeletedFileSet deleted_files_;
  std::vector< std::pair<int, FileMetaData> > new_files_;
};

posted @ 2012-03-16 17:10 小明閱讀(3955) | 評(píng)論 (0) | 編輯收藏

leveldb研究6- Level和Compaction

摘要: leveldb之所以使用level作為數(shù)據(jù)庫(kù)名稱(chēng)，精華就在于level的設(shè)計(jì)。本質(zhì)是一種歸并排序算法。這樣設(shè)計(jì)的好處主要是可以減少compaction的次數(shù)和每次的文件個(gè)數(shù)。Compaction為什么要compaction? compaction可以提高數(shù)據(jù)的查詢(xún)效率，沒(méi)有經(jīng)過(guò)compaction，需要從很多SST file去查找，而做過(guò)compaction后，只需要從有限的SST文件去... 閱讀全文

posted @ 2012-03-15 17:28 小明閱讀(7894) | 評(píng)論 (0) | 編輯收藏

小明思考

回文字符串的切割問(wèn)題

TopCoder MatchMaker解法

有點(diǎn)難度的java筆試題

leveldb性能分析 - 隨機(jī)寫(xiě)

leveldb研究11- 流程分析：讀數(shù)據(jù)

leveldb研究10- 流程分析：寫(xiě)數(shù)據(jù)

leveldb研究9- 流程分析：打開(kāi)數(shù)據(jù)庫(kù)

leveldb研究8- 內(nèi)存中的數(shù)據(jù)結(jié)構(gòu)Memtable/SkipList

leveldb研究7-Version/VersionSet/VersionEdit

leveldb研究6- Level和Compaction

日歷

常用鏈接

留言簿(5)

隨筆分類(lèi)

隨筆檔案

相冊(cè)

My blogs

搜索

最新評(píng)論

閱讀排行榜

評(píng)論排行榜