Database Transactions and Isolation Levels Explained

A transaction is a unit of execution in a database management system. It can be a single database operation (such as a SELECT) or a sequence of operations. Transactions have the ACID properties: Atomicity, Consistency, Isolation, and Durability.

Atomicity: guarantees that all operations in a transaction execute, or none of them do. Take a transfer transaction: the transfer either succeeds or fails as a whole. On success, the amount moves from the source account into the destination account and both balances change accordingly; on failure, neither account's balance changes. It can never happen that the source account is debited while the destination account receives nothing.

Consistency: guarantees that the database always keeps its data consistent: it is consistent before the transaction runs and consistent after it, whether or not the transaction succeeds. In the transfer example above, the database's data is consistent both before and after the transfer.

Isolation: when several transactions execute concurrently, the result should be the same as if they had executed serially. Obviously the simplest isolation is to run every transaction serially, first come first served, letting one finish before the next may start. But that makes the database inefficient: two transactions that merely read the same data, for instance, could perfectly well run concurrently. Different isolation levels exist to control the effects of concurrent execution; they are introduced in detail below.

Durability: once a transaction completes, its effect on the database is permanent; even if the database is damaged by a failure, it should be able to recover. The usual implementation uses a log.

 

Transaction isolation levels: an isolation level is a grade of concurrency control over transactions. ANSI/ISO SQL divides them into four levels: SERIALIZABLE, REPEATABLE READ, READ COMMITTED, and READ UNCOMMITTED. To implement isolation, databases usually use locks; when programming you generally only set the isolation level, and which locks to take is decided by the database. The four levels are introduced first, followed by examples of the concurrency problems that appear under the latter three (repeatable read, read committed, read uncommitted).

SERIALIZABLE: all transactions execute one after another, serially, which avoids phantom reads. For a database that implements concurrency control with locks, serializability requires acquiring a range lock when executing a range query (such as selecting the users whose age is between 10 and 30). A database whose concurrency control is not lock-based must instead roll back a transaction when it detects that the transaction violates serial execution.

REPEATABLE READ: all data retrieved by a SELECT cannot be modified, which prevents a transaction from reading inconsistent data on successive reads. There is no way to prevent phantom reads, however: other transactions cannot modify the selected rows, but they can still add rows, because the first transaction holds no range lock.

READ COMMITTED: data that has been read may still be modified by other transactions, which can lead to non-repeatable reads. That is, a transaction acquires a read lock while reading but releases it immediately afterwards (it does not wait for the transaction to end), while write locks are released only after the transaction commits. Once the read lock is released, the data may be modified by other transactions. This is also SQL Server's default isolation level.

READ UNCOMMITTED: the lowest isolation level, which allows other transactions to see uncommitted data. This level leads to dirty reads (Dirty Read).
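In application code, the isolation level is usually just a connection setting, and the database maps it to whatever locking scheme it uses. A minimal JDBC sketch; the connection URL, credentials, and users table are hypothetical:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class IsolationDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details; any JDBC driver works the same way.
        Connection con = DriverManager.getConnection(
                "jdbc:mysql://localhost:3306/test", "user", "password");
        con.setAutoCommit(false);
        // Ask for one of the four ANSI levels; which locks are taken is up to the database.
        con.setTransactionIsolation(Connection.TRANSACTION_REPEATABLE_READ);

        Statement st = con.createStatement();
        ResultSet rs = st.executeQuery(
                "SELECT * FROM users WHERE age BETWEEN 10 AND 30");
        while (rs.next()) {
            System.out.println(rs.getInt("id") + " " + rs.getString("name"));
        }
        con.commit();
        con.close();
    }
}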

 

Example: the following examines the concurrency problems corresponding to the latter three isolation levels. Suppose there are two transactions. Transaction 1 executes Query 1; then Transaction 2 executes Query 2 and commits; then Query 1 in Transaction 1 executes one more time. The queries run against the following table:

users

id    name    age
1     Joe     20
2     Jill    25

Repeatable Read (phantom reads)

A transaction executes the same query twice in succession, yet the returned result sets differ. This happens because no range lock (Range Lock) was acquired when the SELECT was executed, so other transactions could still insert new rows.

Transaction 1                                    Transaction 2

/* Query 1 */
SELECT * FROM users
WHERE age BETWEEN 10 AND 30;

                                                 /* Query 2 */
                                                 INSERT INTO users VALUES ( 3, 'Bob', 27 );
                                                 COMMIT;

/* Query 1 */
SELECT * FROM users
WHERE age BETWEEN 10 AND 30;
Note that Transaction 1 executes the same statement (Query 1) twice. Under a higher isolation level (namely serializable), both executions would return the same result set; under the repeatable-read level, however, the two result sets differ. So why is this level called "repeatable read"? Because it does solve the non-repeatable read problem described next.

Read Committed (non-repeatable reads)

In database systems that implement concurrency control with locks, non-repeatable reads occur because no read lock is kept for the duration of the transaction when the SELECT is executed.

Transaction 1                                    Transaction 2

/* Query 1 */
SELECT * FROM users WHERE id = 1;

                                                 /* Query 2 */
                                                 UPDATE users SET age = 21 WHERE id = 1;
                                                 COMMIT;

/* Query 1 */
SELECT * FROM users WHERE id = 1;

In this example, Transaction 2 commits successfully, so Transaction 1's second read obtains a different age. Under the SERIALIZABLE and REPEATABLE READ isolation levels, the database should return the same value both times; under READ COMMITTED and READ UNCOMMITTED it returns the updated value. This is a non-repeatable read.
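The interleaving above can be reproduced from code with two connections. A JDBC sketch under the same hypothetical connection details as before; Transaction 1 runs at READ COMMITTED, so its second read sees Transaction 2's committed update:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class NonRepeatableReadDemo {
    static int readAge(Connection c) throws SQLException {
        Statement st = c.createStatement();
        ResultSet rs = st.executeQuery("SELECT age FROM users WHERE id = 1");
        rs.next();
        int age = rs.getInt(1);
        rs.close();
        return age;
    }

    public static void main(String[] args) throws Exception {
        String url = "jdbc:mysql://localhost:3306/test"; // hypothetical
        Connection t1 = DriverManager.getConnection(url, "user", "password");
        Connection t2 = DriverManager.getConnection(url, "user", "password");
        t1.setAutoCommit(false);
        t2.setAutoCommit(false);
        t1.setTransactionIsolation(Connection.TRANSACTION_READ_COMMITTED);

        int first = readAge(t1);                     // Query 1, first execution

        Statement st2 = t2.createStatement();        // Transaction 2 updates and commits
        st2.executeUpdate("UPDATE users SET age = 21 WHERE id = 1");
        t2.commit();

        int second = readAge(t1);                    // Query 1 again, same transaction
        System.out.println(first + " -> " + second); // values differ under READ COMMITTED
        t1.commit();
        t1.close();
        t2.close();
    }
}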

Read Uncommitted (dirty reads)

If one transaction reads a value that another transaction has modified, and the modifying transaction is later rolled back, then the reading transaction has obtained dirty data; this is the so-called dirty read. It happens whenever transactions are allowed to read uncommitted updates.

Transaction 1                                    Transaction 2

/* Query 1 */
SELECT * FROM users WHERE id = 1;

                                                 /* Query 2 */
                                                 UPDATE users SET age = 21 WHERE id = 1;

/* Query 1 */
SELECT * FROM users WHERE id = 1;

                                                 ROLLBACK;

Summing up, we arrive at the following table:

Isolation level     Dirty reads   Non-repeatable reads   Phantom reads
READ UNCOMMITTED    YES           YES                    YES
READ COMMITTED      NO            YES                    YES
REPEATABLE READ     NO            NO                     YES
SERIALIZABLE        NO            NO                     NO



Chan Chen, 2012-12-21 13:06


BLOBs and CLOBs

solidDB® can store binary and character data up to 2147483647 (2G - 1) bytes long. When such data exceeds a certain length, the data is called a BLOB (Binary Large OBject) or CLOB (Character Large OBject), depending upon the data type that stores the information. CLOBs contain only "plain text" and can be stored in any of the following data types:

CHAR, WCHAR

VARCHAR, WVARCHAR

LONG VARCHAR (mapped to standard type CLOB),

LONG WVARCHAR (mapped to standard type NCLOB)

BLOBs can store any type of data that can be represented as a sequence of bytes, such as a digitized picture, video, audio, or a formatted text document. (They can also store plain text, but you'll have more flexibility if you store plain text in CLOBs.) BLOBs are stored in any of the following data types:

BINARY

VARBINARY

LONG VARBINARY (mapped to standard type BLOB)

Since character data is a sequence of bytes, character data can be stored in BINARY fields, as well as in CHAR fields. CLOBs can be considered a subset of BLOBs.

For convenience, we will use the term BLOBs to refer to both CLOBs and BLOBs.

For most non-BLOB data types, such as integer, float, date, etc., there is a rich set of valid operations that you can do on that data type. For example, you can add, subtract, multiply, divide, and do other operations with FLOAT values. Because a BLOB is a sequence of bytes and the database server does not know the "meaning" of that sequence of bytes (i.e. it doesn't know whether the bytes represent a movie, a song, or the design of the space shuttle), the operations that you can do on BLOBs are very limited.
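Those operations boil down to storing the bytes and getting them back. For illustration, a minimal JDBC sketch of writing and reading BLOB data as streams; the pictures table, column names, and connection URL are hypothetical, and this is generic JDBC rather than any solidDB-specific API:

import java.io.File;
import java.io.FileInputStream;
import java.io.InputStream;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class BlobDemo {
    public static void main(String[] args) throws Exception {
        // Hypothetical connection details.
        Connection con = DriverManager.getConnection(
                "jdbc:solid://localhost:1313", "dba", "dba");

        // Write: stream a file into a LONG VARBINARY (BLOB) column.
        File photo = new File("photo.jpg");
        PreparedStatement ins = con.prepareStatement(
                "INSERT INTO pictures (id, img) VALUES (?, ?)");
        ins.setInt(1, 1);
        InputStream in = new FileInputStream(photo);
        ins.setBinaryStream(2, in, (int) photo.length());
        ins.executeUpdate();
        in.close();

        // Read: stream the BLOB back out rather than materializing it in memory.
        PreparedStatement sel = con.prepareStatement(
                "SELECT img FROM pictures WHERE id = ?");
        sel.setInt(1, 1);
        ResultSet rs = sel.executeQuery();
        if (rs.next()) {
            InputStream img = rs.getBinaryStream(1);
            byte[] buf = new byte[8192];
            int n, total = 0;
            while ((n = img.read(buf)) != -1) {
                total += n; // process each chunk here
            }
            System.out.println("read " + total + " bytes");
            img.close();
        }
        con.close();
    }
}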

solidDB does allow you to perform some string operations on CLOBs. For example, you can search for a particular substring (e.g. a person's name) inside a CLOB by using the LOCATE() function. Because such operations require a lot of the server's resources (memory and/or CPU time), solidDB allows you to limit the number of bytes of the CLOB that are processed. For example, you might specify that only the first 1 megabyte of each CLOB be searched when doing a string search. For more information, see the description of the MaxBlobExpressionSize configuration parameter in solidDB Administration Guide.

Although it is theoretically possible to store the entire blob "inside" a typical table, if the blob is large, then the server usually performs better if most or all of the blob is not stored in the table. In solidDB, if a blob is no more than N bytes long, then the blob is stored in the table. If the blob is longer than N bytes, then the first N bytes are stored in the table, and the rest of the blob is stored outside the table as disk blocks in the physical database file. The exact value of "N" depends in part upon the structure of the table, the disk page size that you specified when you created the database, etc., but is always at least 256. (Data 256 bytes or shorter is always stored in the table.)

If a data row size is larger than one third of the disk block size of the database file, you must store it partly as a BLOB.

The SYS_BLOBS system table is used as a directory for all BLOB data in the physical database file. One SYS_BLOB entry can accommodate 50 BLOB parts. If the BLOB size exceeds 50 parts, several SYS_BLOB entries per BLOB are needed.

The query below returns an estimate on the total size of BLOBs in the database.

select sum(totalsize) from sys_blobs

The estimate is not accurate, because the info is only maintained at checkpoints. After two empty checkpoints, this query should return an accurate response.



Chan Chen, 2012-11-30 13:44


Why LDAP Uses BDB as Its Backend Database

1. Many world-famous companies have adopted BDB as the backend database for a variety of mission-critical applications.
Sleepycat Software makes Berkeley DB, the most widely used application-specific data management software in the world with more than 200 million deployments. Customers such as Amazon, AOL, British Telecom, Cisco Systems, EMC, Ericsson, Google, Hitachi, HP, Motorola, RSA Security, Sun Microsystems, TIBCO and Veritas also rely on Berkeley DB for fast, scalable, reliable and cost-effective data management for their mission-critical applications. 

2. The following is an explanation given by an expert on chinaunix.net, quoted here:
MySQL uses BDB to implement one of its backends. MySQL is fast, and BDB is N times faster still than MySQL.
BDB's concurrency is higher than an RDBMS's.
Its capacity scales up to 256TB.
Its hash-based selects fetch data faster than an RDBMS.

3. Comparing the BDB database with several other kinds of databases.
The BDB database is unlike the other kinds of databases, namely relational databases, object-oriented databases, and network databases: it is an embedded database.

First, a brief summary of how BDB differs from the other kinds:
(1) Almost without exception, they all adopted the Structured Query Language (SQL); BDB did not.
(2) Almost without exception, they all adopted the client/server model; BDB adopts an embedded model.

Below is some material about BDB found on the web, which explains why BDB differs from most of today's mainstream databases (the source of the quoted material was not noted):
(1) Berkeley DB is an open source embedded database library that provides scalable, high-performance, transaction-protected data management services to applications. Berkeley DB provides a simple function-call API for data access and management.
(2) Berkeley DB is embedded because it links directly into the application. It runs in the same address space as the application. As a result, no inter-process communication, either over the network or between processes on the same machine, is required for database operations. Berkeley DB provides a simple function-call API for a number of programming languages, including C, C++, Java, Perl, Tcl, Python, and PHP. All database operations happen inside the library. Multiple processes, or multiple threads in a single process, can all use the database at the same time as each uses the Berkeley DB library. Low-level services like locking, transaction logging, shared buffer management, memory management, and so on are all handled transparently by the library.
(3) The library is extremely portable. It runs under almost all UNIX and Linux variants, Windows, and a number of embedded real-time operating systems. It runs on both 32-bit and 64-bit systems. It has been deployed on high-end Internet servers, desktop machines, and on palmtop computers, set-top boxes, in network switches, and elsewhere. Once Berkeley DB is linked into the application, the end user generally does not know that there's a database present at all.
(4) Berkeley DB is scalable in a number of respects. The database library itself is quite compact (under 300 kilobytes of text space on common architectures), but it can manage databases up to 256 terabytes in size. It also supports high concurrency, with thousands of users operating on the same database at the same time. Berkeley DB is small enough to run in tightly constrained embedded systems, but can take advantage of gigabytes of memory and terabytes of disk on high-end server machines.
(5) Berkeley DB generally outperforms relational and object-oriented database systems in embedded applications for a couple of reasons. First, because the library runs in the same address space, no inter-process communication is required for database operations. The cost of communicating between processes on a single machine, or among machines on a network, is much higher than the cost of making a function call. Second, because Berkeley DB uses a simple function-call interface for all operations, there is no query language to parse, and no execution plan to produce.
(6) In contrast to most other database systems, Berkeley DB provides relatively simple data access services. Berkeley DB supports only a few logical operations on records (a code sketch of this narrow API follows the list). They are:

Insert a record in a table. 
Delete a record from a table. 
Find a record in a table by looking up its key. 
Update a record that has already been found. 
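For a concrete feel of this narrow API, here is a minimal sketch using the Berkeley DB Java Edition bindings; the environment directory, database name, and key/value contents are made up, and the quoted text above describes the C library, whose function-call API follows the same insert/find/update/delete model:

import com.sleepycat.je.Database;
import com.sleepycat.je.DatabaseConfig;
import com.sleepycat.je.DatabaseEntry;
import com.sleepycat.je.Environment;
import com.sleepycat.je.EnvironmentConfig;
import com.sleepycat.je.LockMode;
import com.sleepycat.je.OperationStatus;
import java.io.File;

public class BdbDemo {
    public static void main(String[] args) throws Exception {
        EnvironmentConfig envCfg = new EnvironmentConfig();
        envCfg.setAllowCreate(true);
        // The environment directory must already exist.
        Environment env = new Environment(new File("/tmp/bdb-env"), envCfg);

        DatabaseConfig dbCfg = new DatabaseConfig();
        dbCfg.setAllowCreate(true);
        Database db = env.openDatabase(null, "users", dbCfg);

        // Insert a record: key and value are plain byte arrays.
        DatabaseEntry key = new DatabaseEntry("user:1".getBytes("UTF-8"));
        DatabaseEntry val = new DatabaseEntry("Joe,20".getBytes("UTF-8"));
        db.put(null, key, val);   // put() is also how you update a record you found

        // Find a record by looking up its key.
        DatabaseEntry found = new DatabaseEntry();
        if (db.get(null, key, found, LockMode.DEFAULT) == OperationStatus.SUCCESS) {
            System.out.println(new String(found.getData(), "UTF-8"));
        }

        // Delete a record.
        db.delete(null, key);

        db.close();
        env.close();
    }
}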
(7) Berkeley DB is not a standalone database server. It is a library, and runs in the address space of the application that uses it. It is possible to build a server application that uses Berkeley DB for data management. For example, many commercial and open source Lightweight Directory Access Protocol (LDAP) servers use Berkeley DB for record storage. LDAP clients connect to these servers over the network. Individual servers make calls through the Berkeley DB API to find records and return them to clients. On its own, however, Berkeley DB is not a server.
所以,BDB是一U完全不同于其它数据库管理系l的数据库,而且它也不是一个数据库服务器端?/span>

4. Strengths and weaknesses of BDB.
Berkeley DB is an ideal database system for applications that need fast, scalable, and reliable embedded database management. For applications that need different services, however, it can be a poor choice.
Berkeley DB was conceived and built to provide fast, reliable, transaction-protected record storage. The library itself was never intended to provide interactive query support, graphical reporting tools, or similar services that some other database systems provide. We have tried always to err on the side of minimalism and simplicity. By keeping the library small and simple, we create fewer opportunities for bugs to creep in, and we guarantee that the database system stays fast, because there is very little code to execute. If your application needs that set of features, then Berkeley DB is almost certainly the best choice for you.

5. My personal view
The key reason BDB suits LDAP is that it can guarantee LDAP's fast response: BDB itself is an embedded database, and speed is its greatest strength as well as its biggest advantage over other database systems. Now look at LDAP: it is the kind of database that, once its data is built, rarely needs to change, and its most common operations are reads, queries, searches, and other operations that do not alter the database's contents. Letting BDB handle exactly these kinds of operations is without doubt the best choice. That way, even with large numbers of users submitting database queries, LDAP can still quickly return useful information. So speed is the biggest factor in LDAP's choice of BDB, and it is also the fundamental reason the vast majority of today's LDAP servers choose BDB.

Chan Chen, 2012-05-03 11:38


Make Auto Incrementing Field in MongoDB

Side counter method
One can keep a counter of the current _id in a side document, in a collection dedicated to counters.
Then use FindAndModify to atomically obtain an id and increment the counter.

> db.counters.insert({_id: "userId", c: 0});

> var o = db.counters.findAndModify(
...        {query: {_id: "userId"}, update: {$inc: {c: 1}}});
{ "_id" : "userId", "c" : 0 }
> db.mycollection.insert({_id:o.c, stuff:"abc"});

> o = db.counters.findAndModify(
...        {query: {_id: "userId"}, update: {$inc: {c: 1}}});
{ "_id" : "userId", "c" : 1 }
> db.mycollection.insert({_id:o.c, stuff:"another one"});
Once you obtain the next id in the client, you can use it and be sure no other client has it.
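The same side-counter pattern works from application code. A sketch with the 2.x-era MongoDB Java driver (database and collection names follow the shell example above; this findAndModify overload returns the counter document as it was before the $inc was applied):

import com.mongodb.*;

public class CounterDemo {
    public static void main(String[] args) throws Exception {
        Mongo mongo = new Mongo("localhost", 27017);
        DB db = mongo.getDB("test");
        DBCollection counters = db.getCollection("counters");
        DBCollection items = db.getCollection("mycollection");

        // One-time setup of the counter document.
        counters.insert(new BasicDBObject("_id", "userId").append("c", 0));

        // Atomic read-and-increment: returns the document as it was
        // *before* the $inc was applied.
        DBObject query = new BasicDBObject("_id", "userId");
        DBObject update = new BasicDBObject("$inc", new BasicDBObject("c", 1));
        DBObject counter = counters.findAndModify(query, update);
        long id = ((Number) counter.get("c")).longValue();

        items.insert(new BasicDBObject("_id", id).append("stuff", "abc"));
        mongo.close();
    }
}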

Optimistic loop method
One can do it with an optimistic concurrency "insert if not present" loop. The following example, in Mongo shell Javascript syntax, demonstrates.

// insert incrementing _id values into a collection
function insertObject(o) {
    x = db.myCollection;
    while( 1 ) {
        // determine next _id value to try
        var c = x.find({},{_id:1}).sort({_id:-1}).limit(1);
        var i = c.hasNext() ? c.next()._id + 1 : 1;
        o._id = i;
        x.insert(o);
        var err = db.getLastErrorObj();
        if( err && err.code ) {
            if( err.code == 11000 /* dup key */ )
                continue;
            else
                print("unexpected error inserting data: " + tojson(err));
        }
        break;
    }
}
The above should work well unless there is an extremely high concurrent insert rate on the collection. In that case, there would be a lot of looping potentially.


Chan Chen, 2012-04-14 01:11


SP (MySQL Stored Procedures)

-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `chan_insert_date_by_starttime`()
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE var_id decimal(18,0);
  DECLARE var_issue decimal(18,0);
  DECLARE var_date datetime;
  DECLARE cur1 CURSOR FOR
    SELECT issue, datevalue
    FROM jira.customfieldvalue
    where customfield in (10006,10007)
    and issue in (
        SELECT distinct issue
        FROM jira.customfieldvalue
        where customfield in (10007)
    )
    and issue not in (
        SELECT distinct issue
        FROM jira.customfieldvalue
        where customfield in (10006)
    );
  -- Without this handler, FETCH past the last row never ends the loop.
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;
  OPEN cur1;
  read_loop: LOOP
    FETCH cur1 INTO var_issue, var_date;
    IF done THEN
      LEAVE read_loop;
    END IF;
    -- max(id) + 1 is not safe under concurrent writers; acceptable for a one-off migration.
    select (max(id) + 1) into var_id from jira.customfieldvalue;
    INSERT INTO jira.customfieldvalue(id, issue, customfield, datevalue)
    VALUES (var_id, var_issue, 10006, var_date);
  END LOOP;
  CLOSE cur1;
END$$
DELIMITER ;



-- --------------------------------------------------------------------------------
-- Routine DDL
-- Note: comments before and after the routine body will not be stored by the server
-- --------------------------------------------------------------------------------
DELIMITER $$
CREATE DEFINER=`root`@`localhost` PROCEDURE `chan_insert_date_by_created`()
BEGIN
  DECLARE done INT DEFAULT 0;
  DECLARE var_id decimal(18,0);
  DECLARE var_issue decimal(18,0);
  DECLARE var_date datetime;
  DECLARE cur1 CURSOR FOR
    SELECT id, created FROM jira.jiraissue where id in
    (   SELECT issue
        FROM jira.customfieldvalue
        where customfield in (10000)
        and issue not in (
            SELECT distinct issue
            FROM jira.customfieldvalue
            where customfield in (10007)
        )
        and issue not in (
            SELECT distinct issue
            FROM jira.customfieldvalue
            where customfield in (10006)
        )
    );
  -- Same termination handler as in the procedure above.
  DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

  OPEN cur1;
  read_loop: LOOP
    FETCH cur1 INTO var_issue, var_date;
    IF done THEN
      LEAVE read_loop;
    END IF;
    select (max(id) + 1) into var_id from jira.customfieldvalue;
    INSERT INTO jira.customfieldvalue(id, issue, customfield, datevalue)
    VALUES (var_id, var_issue, 10006, var_date);
  END LOOP;
  CLOSE cur1;
END$$
DELIMITER ;


Chan Chen, 2012-04-10 18:14


MySQL Auto Backup
Below is the script example to backup mysql database in command line:-
$ mysqldump -h localhost -u username -p database_name > backup_db.sql
If your mysql database is very big, you might want to compress your sql file.
Just use the mysql backup command below and pipe the output to gzip;
you will then get the output as a gzip file.
$ mysqldump -u username -h localhost -p database_name | gzip -9 > backup_db.sql.gz
If you want to extract the .gz file, use the command below:-
$ gunzip backup_db.sql.gz
Type the following command to import sql data file:
$ mysql -u username -p -h localhost DATA-BASE-NAME < data.sql
In this example, import 'data.sql' file into 'blog' database using vivek as username:
$ mysql -u vivek -p -h localhost blog < data.sql
If you have a dedicated database server, replace the localhost hostname with the actual server name or IP address as follows:
$ mysql -u username -p -h 202.54.1.10 databasename < data.sql
OR use hostname such as mysql.cyberciti.biz
$ mysql -u username -p -h mysql.cyberciti.biz database-name < data.sql
If you do not know the database name or database name is included in sql dump you can try out something as follows:
$ mysql -u username -p -h 202.54.1.10 < data.sql
To back up the database automatically, please read "10 ways to Automatically & Manually Backup MySQL Database"; here is one example:
[root@sdc-d1-pangaea-devops1 ~]# vi /etc/crontab
SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# For details see man 4 crontabs

# Example of job definition:
# .---------------- minute (0 - 59)
# |  .------------- hour (0 - 23)
# |  |  .---------- day of month (1 - 31)
# |  |  |  .------- month (1 - 12) OR jan,feb,mar,apr ...
# |  |  |  |  .---- day of week (0 - 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
59 23 * * tue root mysqldump -u root -proot123 jira > /export/home/web/dbbak/jira/jira_$(date +\%Y\%m\%d).sql
This entry will generate a backup file at 23:59 every Tuesday.
Note
Cron does not recognize `date +%Y%m%d` (% is special in a crontab and must be escaped), so use $(date +\%Y\%m\%d) instead of it.


Chan Chen, 2012-03-21 17:43


MongoDB Admin Tool -- RockMongo Install for Ubuntu

Querying the database from the mongo shell produces output like this:
> db.User.find()
{ "_id" : ObjectId("4f657b5e6803fa511a000000"), "about" : "Change the way people learn", "blackListCreateTime" : "2012-03-16 13:00", "createTime" : "2012-03-16 13:00", "email" : "admin@stdtlk.com", "fanCreateTime" : "2012-03-16 13:00", "fanId" : [     ObjectId("4f657b5e6803fa511a000000"),     ObjectId("4f657b5e6803fa511a000000") ], "firstName" : "chan", "lastName" : "chen", "levle" : "0", "password" : "admin", "photoURL" : "img/admin.jpg", "school" : "Stdtlk University", "title" : "starter" }
{ "_id" : ObjectId("4f6582986803fa3d1a000000"), "about" : "I like the way to learn", "blackListCreateTime" : "2012-03-16 13:00", "createTime" : "2012-03-16 13:00", "email" : "google@stdtlk.com", "fanCreateTime" : "2012-03-16 13:00", "fanId" : [     ObjectId("4f657b5e6803fa511a000000"),     ObjectId("4f657b5e6803fa511a000000") ], "firstName" : "google", "lastName" : "google", "levle" : "0", "password" : "google", "photoURL" : "img/google.jpg", "school" : "AAA University", "title" : "starter" }
{ "_id" : ObjectId("4f6582cc6803fa501a000003"), "about" : "Apple is greate", "blackListCreateTime" : "2012-03-16 13:00", "createTime" : "2012-03-16 13:00", "email" : "apple@stdtlk.com", "fanCreateTime" : "2012-03-16 13:00", "fanId" : [     ObjectId("4f657b5e6803fa511a000000"),     ObjectId("4f657b5e6803fa511a000000") ], "firstName" : "apple", "lastName" : "apple", "levle" : "0", "password" : "apple", "photoURL" : "img/apple.jpg", "school" : "BBB University", "title" : "starter" }
{ "_id" : ObjectId("4f6582ec6803faa91c000000"), "about" : "microsoft is old", "blackListCreateTime" : "2012-03-16 13:00", "createTime" : "2012-03-16 13:00", "email" : "microsoft@stdtlk.com", "fanCreateTime" : "2012-03-16 13:00", "fanId" : [     ObjectId("4f657b5e6803fa511a000000"),     ObjectId("4f657b5e6803fa511a000000") ], "firstName" : "microsoft", "lastName" : "microsoft", "levle" : "0", "password" : "microsoft", "photoURL" : "img/microsoft.jpg", "school" : "CCC University", "title" : "starter" }

All documents are output on one line each; if a document has a field with a huge number of characters, it is not easy to find and read what we are looking for. So I looked around on the internet and read about the MongoDB Admin UI, which recommends some useful tools for administering the database. RockMongo is the tool that caught my eye, and many folks on Stack Overflow also gave it positive feedback, so I decided to try it out. Here are the steps to install RockMongo on Ubuntu.
1. Install apache2, php5
sudo apt-get install apache2 php5 php5-dev php5-cli

2. Install php mongo driver
pecl install mongo

3. Config php.ini
root@ubuntu:~# find / -name php.ini
/etc/php5/apache2/php.ini
/etc/php5/cli/php.ini
root@ubuntu:~# echo "extension=mongo.so" >> /etc/php5/apache2/php.ini

4. Check that PHP installed successfully
root@ubuntu:~# sudo find / -name www
/var/www
root@ubuntu:~# sudo echo "<?php phpinfo(); ?>" >> /var/www/info.php
Open http://localhost/info.php
If everything went OK, the phpinfo page will be displayed.
5. Download RockMongo
root@ubuntu:/var/www# cd /var/www
root@ubuntu:/var/www# wget http://rock-php.googlecode.com/files/rockmongo-v1.1.0.zip
root@ubuntu:/var/www# unzip rockmongo-v1.1.0.zip -d rockmongo

6. Launch mongod

7. Restart the apache server
root@ubuntu:~# /etc/init.d/apache2 restart

8. Open http://localhost/rockmongo/index.php. By default, the username and password are both admin.

Chan Chen, 2012-03-18 15:54


Indexes in MongoDB

Chan Chen, 2012-02-27 14:34


Optimizing Object IDs

The _id field in a MongoDB document is very important and is always indexed for normal collections. This page lists some recommendations. Note that it is common to use the BSON ObjectID datatype for _id's, but the values of an _id field can be of any type.

Use the collection's 'natural primary key' in the _id field.

_id's can be any type, so if your objects have a natural unique identifier, consider using that in _id to both save space and avoid an additional index.

When possible, use _id values that are roughly in ascending order.

If the _id's are in a somewhat well defined order, on inserts the entire b-tree for the _id index need not be loaded. BSON ObjectIds have this property.

Store Binary GUIDs as BinData, rather than as hex encoded strings

BSON includes a binary data datatype for storing byte arrays. Using this will make the id values, and their respective keys in the _id index, half the size.

Note that unlike the BSON Object ID type (see above), most UUIDs do not have a rough ascending order, which creates additional caching needs for their index.

> // mongo shell bindata info: 
> help misc
    b = new BinData(subtype,base64str)     create a BSON BinData value     
    b.subtype()     the BinData subtype (0..255)
    b.length()     length of the BinData data in bytes
    b.hex()     the data as a hex encoded string
    b.base64()     the data as a base 64 encoded string
    b.toString()
Extract insertion times from _id rather than having a separate timestamp field.

The BSON ObjectId format provides documents with a creation timestamp (one second granularity) for free. Almost all drivers implement methods for extracting these timestamps; see the relevant api docs for details. In the shell:

> // mongo shell ObjectId methods 
> help misc
    o = new ObjectId() create a new ObjectId
    o.getTimestamp() return timestamp derived from first 32 bits of the OID
    o.isObjectId()
    o.toString()
    o.equals(otherid)
Sort by _id to sort by insertion time

BSON ObjectId's begin with a timestamp. Thus sorting by _id, when using the ObjectID type, results in sorting by time. Note: granularity of the timestamp portion of the ObjectID is to one second only.

> // get 10 newest items 
> db.mycollection.find().sort({_id:-1}).limit(10);
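The same two tricks can be used from Java. A sketch with the 2.x-era driver, where org.bson.types.ObjectId exposes the embedded timestamp via getTime() in milliseconds (method names vary across driver versions, so treat this as an assumption; the collection name is hypothetical):

import com.mongodb.*;
import org.bson.types.ObjectId;
import java.util.Date;

public class ObjectIdDemo {
    public static void main(String[] args) throws Exception {
        Mongo mongo = new Mongo("localhost", 27017);
        DBCollection col = mongo.getDB("test").getCollection("mycollection");

        // Newest ten documents: ObjectIds begin with a timestamp, so a
        // descending sort on _id is a descending sort on insertion time.
        DBCursor newest = col.find().sort(new BasicDBObject("_id", -1)).limit(10);
        while (newest.hasNext()) {
            DBObject doc = newest.next();
            ObjectId id = (ObjectId) doc.get("_id");
            // The embedded creation timestamp (one-second granularity).
            System.out.println(new Date(id.getTime()) + " " + doc);
        }
        mongo.close();
    }
}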


Chan Chen, 2012-02-27 14:06


Five Reasons of Choosing MongoDB
  1. Document Database > Most of your data is embedded in a document, so in order to get the data about a person, you don't have to join several tables. Thus, better performance for many use cases.
  2. Strong Query Language > Despite not being an RDBMS, MongoDB has a very strong query language that allows you to get something very specific or very general from a document or documents. The DB is queried using JavaScript, so you can do many more things besides querying (e.g. functions, calculations).
  3. Sharding & Replication > Sharding allows your application to scale horizontally rather than vertically. In other words, more small servers instead of one huge server. And replication gives you fail-over safety in several configurations (e.g. master/slave).
  4. Powerful Indexing > I originally got interested in MongoDB because it allows geo-spatial indexing out of the box, but it has many other indexing configurations as well.
  5. Cross-Platform > MongoDB has many drivers.


Chan Chen, 2012-02-24 22:12


MongoDB vs. RDBMS Schema Design

Chan Chen, 2012-02-20 19:34


Mongo Metadata
The <dbname>.system.* namespaces in MongoDB are special and contain database system information.  System collections include:
<dbname>.system.namespaces lists all namespaces.
<dbname>.system.indexes lists all indexes. Additional namespace / index metadata exists in the database.ns files, and is opaque.
<dbname>.system.profile stores database profiling information.
<dbname>.system.users lists users who may access the database.
Additionally, in the local database only there is replication information in system collections, e.g., local.system.replset contains the replica set configuration.
Information on the structure of a stored document is stored within the document itself. See BSON.
There are several restrictions on manipulation of objects in the system collections. Inserting in system.indexes adds an index, but otherwise that table is immutable (the special drop index command updates it for you). system.users is modifiable. system.profile is droppable.
Note: $ is a reserved character. Do not use it in namespace names or within field names. Internal collections for indexes use the $ character in their names. These collections store b-tree bucket data and are not in BSON format (thus direct querying is not possible).
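Since the system collections are readable like ordinary collections, listing this metadata from code is straightforward. A small sketch with the 2.x-era Java driver (the database name is hypothetical):

import com.mongodb.*;

public class MetadataDemo {
    public static void main(String[] args) throws Exception {
        Mongo mongo = new Mongo("localhost", 27017);
        DB db = mongo.getDB("test"); // hypothetical database name

        // System collections can be queried like ordinary collections.
        DBCursor namespaces = db.getCollection("system.namespaces").find();
        while (namespaces.hasNext()) {
            System.out.println("namespace: " + namespaces.next().get("name"));
        }
        DBCursor indexes = db.getCollection("system.indexes").find();
        while (indexes.hasNext()) {
            System.out.println("index: " + indexes.next());
        }
        mongo.close();
    }
}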


Chan Chen, 2012-02-19 16:47


Schema Design for MongoDB

Chan Chen, 2012-02-18 15:58


Basic Terms of MongoDB

Document
MongoDB can be thought of as a document-oriented database. By 'document', we mean structured documents, not freeform text documents. These documents can be thought of as objects, but only the data of an object, not the code, methods or class hierarchy. Additionally, there is much less linking between documents in MongoDB data models than there is between objects in a program written in an object-oriented programming language.
In MongoDB the documents are conceptually JSON. More specifically, the documents are represented in a format called BSON (standing for Binary JSON).
Documents are stored in Collections.
Maximum Document Size
MongoDB limits the data size of individual BSON objects/documents. At the time of this writing the limit is 16MB.
This limit is designed as a sanity-check; it is not a technical limit on document sizes. The thinking is that if documents are larger than this size, it is likely the schema is not ideal. Further it allows drivers to make some assumptions on the max size of documents.
The concept is that the maximum document size is a limit that ensures each document does not require an excessive amount of RAM from the machine, or require too much network bandwidth to fetch. For example, fetching a full 100MB document would take over 1 second to fetch over a gigabit ethernet connection. In this situation one would be limited to 1 request per second.
Over time, as computers grow in capacity, the limit will be adjusted upward.
Collection
MongoDB collections are essentially named groupings of documents. You can think of them as roughly equivalent to relational database tables.
A MongoDB collection is a collection of BSON documents. These documents usually have the same structure, but this is not a requirement, since MongoDB is a schema-free (or more accurately, "dynamic schema") database. You may store a heterogeneous set of documents within a collection, as you do not need to predefine the collection's "columns" or fields.
A collection is created when the first document is inserted.
Collection names should begin with letters or an underscore and may include numbers; $ is reserved. Collections can be organized in namespaces; these are named groups of collections defined using a dot notation. For example, you could define the collections blog.posts and blog.authors, both residing under "blog". Note that this is simply an organizational mechanism for the user -- the collection namespace is flat from the database's perspective.
The maximum size of a collection name is 128 characters (including the name of the db and indexes). It is probably best to keep it under 80/90 chars.
Namespace
MongoDB stores BSON objects in collections. The concatenation of the database name and the collection name (with a period in between) is called a namespace.
For example, acme.users is a namespace, where acme is the database name, and users is the collection name. Note that periods can occur in collection names, so a name such as acme.blog.posts is legal too (in that case blog.posts is the collection name).
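A small sketch of these terms in code, using the 2.x-era Java driver and the acme example above: the collection springs into existence on the first insert, its documents need not share a structure, and its namespace is the database name plus the collection name:

import com.mongodb.*;

public class TermsDemo {
    public static void main(String[] args) throws Exception {
        Mongo mongo = new Mongo("localhost", 27017);
        DB db = mongo.getDB("acme");

        // No explicit CREATE: the collection is created on the first insert.
        DBCollection users = db.getCollection("users");
        users.insert(new BasicDBObject("name", "Joe").append("age", 20));

        // Documents in one collection need not share a structure ("dynamic schema").
        users.insert(new BasicDBObject("name", "Jill")
                .append("tags", new String[] { "admin", "editor" }));

        // The namespace is "<dbname>.<collection>".
        System.out.println(users.getFullName()); // prints "acme.users"
        mongo.close();
    }
}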

 



Chan Chen, 2012-02-18 15:52


Why Use a Non-Relational Database?

With the rise of Web 2.0 sites on the internet, non-relational databases have become an extremely hot new field, and non-relational database products are developing very rapidly. Traditional relational databases, however, are proving inadequate for Web 2.0 sites, especially the ultra-large-scale, highly concurrent, purely dynamic SNS-type Web 2.0 sites, and they have exposed many problems that are hard to overcome, for example:

1. High performance: the need for highly concurrent database reads and writes
Web 2.0 sites generate dynamic pages and dynamic content in real time according to each user's personalized information, so static page caching is essentially unusable, and the concurrent load on the database is very high, often reaching tens of thousands of read/write requests per second. A relational database can just barely withstand ten thousand SQL queries per second, but at ten thousand SQL write requests per second the disk I/O can no longer cope. In fact even an ordinary BBS site often needs highly concurrent writes, for instance JavaEye's real-time statistics on online user state, view counters for hot posts, vote counts, and so on, so this is quite a common requirement.

2. Huge Storage: the need for efficient storage and access of massive data
SNS sites like Facebook, Twitter, and FriendFeed generate massive amounts of user activity every day. Take FriendFeed: one month adds up to 250 million user updates, and for a relational database, running SQL queries against a table holding 250 million records is extremely inefficient, even intolerable. Or consider the user login systems of large web sites, such as Tencent or Shanda, with accounts counted in the hundreds of millions; relational databases can hardly cope there either.

3. High Scalability and High Availability: the need for highly scalable and highly available databases
In a web-based architecture, the database is the hardest tier to scale horizontally. When an application's user base and traffic grow day by day, your database cannot simply be given more performance and load capacity by adding more hardware and service nodes the way web servers and app servers can. For many sites that must provide uninterrupted 24-hour service, upgrading and expanding the database is a very painful affair, usually requiring downtime for maintenance and data migration. Why can't a database scale by continuously adding server nodes?

Faced with these "three highs", relational databases run into obstacles that are hard to overcome, while for Web 2.0 sites many of the main features of a relational database often have no place to shine, for example:

1. The demand for database transactional consistency
Many real-time web systems do not require strict database transactions, have low demands on read consistency, and in some cases low demands on write consistency as well. Transaction management therefore becomes a heavy burden on a database under high load.

2. The demand for real-time database writes and reads
In a relational database, if you insert a row and query immediately afterwards, that row can definitely be read back. Many web applications, however, do not demand such strong real-time behavior: for example, after I (robbin of JavaEye) post a message, it is perfectly acceptable for my subscribers to see it only after a few seconds, or even tens of seconds.

3. The demand for complex SQL queries, especially multi-table joins
Any web system with large data volumes strongly avoids join queries across several large tables, as well as the complex SQL report queries of data-analysis workloads; SNS sites in particular avoid such situations by design, from the requirements and product side. Usually there are only single-table primary-key queries and single-table paged queries with simple conditions, so SQL's power is greatly diluted.

Relational databases therefore look less and less suitable for these ever more numerous application scenarios, and non-relational databases arose to solve this class of problems. In the last couple of years all kinds of non-relational databases, especially key-value stores (Key-Value Store DBs), have sprung up everywhere, enough to make your head spin. Not long ago the NoSQL Conference was held abroad, where NoSQL databases appeared one after another; counting the ones that did not show up but are well known, there are at least 10 open source NoSQL DBs, for example:

Redis, Tokyo Cabinet, Cassandra, Voldemort, MongoDB, Dynomite, HBase, CouchDB, Hypertable, Riak, Tin, Flare, Lightcloud, KiokuDB, Scalaris, Kai, ThruDB, ......

Some of these NoSQL databases are written in C/C++, some in Java, and some in Erlang, each with its own unique strengths; there are more than one can keep up with, so I (robbin) can only pick a few of the more distinctive, more promising ones to study and get to know. These NoSQL databases fall roughly into the following three categories:

I. Key-value databases satisfying extremely high read/write performance needs: Redis, Tokyo Cabinet, Flare

The main trait of high-performance key-value databases is extremely high concurrent read/write performance. Redis, Tokyo Cabinet, and Flare are all written in C, and their performance is all quite impressive; beyond the excellent performance, each also has its own unique features:

1. Redis
Redis is a very young project that has just released version 1.0. Redis is essentially an in-memory key-value database, much like memcached: the whole database is loaded into memory for all operations, and the data is periodically flushed to disk for persistence by an asynchronous operation. Because it operates purely in memory, Redis's performance is outstanding, handling over 100,000 reads/writes per second; it is the fastest key-value DB I know of.

What makes Redis stand out is not only performance. Redis's greatest charm is its support for storing List and Set data structures, along with all kinds of operations on Lists, such as pushing and popping data at both ends, taking a range of a List, sorting, and so on, plus union and intersection operations on Sets. In addition, the maximum size of a single value is 1GB, unlike memcached, which can only hold 1MB of data per key, so Redis can be used to implement a lot of useful functionality: for example, its List can serve as a FIFO doubly linked list, giving you a lightweight, high-performance message queue service, and its Set can build a high-performance tag system, and so on. Redis can also set an expire time on stored key-values, so it can also be used as an enhanced version of memcached.
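As a sketch of what those List and Set structures buy you, here is the message-queue and tag-system idea with the Jedis client; the client choice and key names are my own assumptions, since the post names no library:

import redis.clients.jedis.Jedis;
import java.util.Set;

public class RedisListSetDemo {
    public static void main(String[] args) {
        Jedis jedis = new Jedis("localhost", 6379);

        // Lightweight FIFO queue on a List: producers LPUSH one end,
        // consumers RPOP the other.
        jedis.lpush("jobs", "job-1");
        jedis.lpush("jobs", "job-2");
        String next = jedis.rpop("jobs"); // "job-1": first in, first out

        // Simple tag system on Sets, with server-side intersection.
        jedis.sadd("tag:java", "post:1");
        jedis.sadd("tag:java", "post:2");
        jedis.sadd("tag:nosql", "post:2");
        jedis.sadd("tag:nosql", "post:3");
        Set<String> both = jedis.sinter("tag:java", "tag:nosql"); // [post:2]

        System.out.println(next + " " + both);
        jedis.close();
    }
}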

Redis's main drawbacks are that its capacity is bounded by physical memory, so it cannot be used for high-performance reads and writes of truly massive data, and that it has no native scale-out mechanism: it lacks scale (horizontal expansion) capability and must rely on the client side for distributed reads and writes. So the scenarios Redis fits are mainly confined to high-performance operations and computation on smaller data volumes. Sites currently using Redis include GitHub and Engine Yard.

2. Tokyo Cabinet and Tokyo Tyrant
TC and TT are developed by the Japanese programmer Mikio Hirabayashi and are mainly used on Japan's biggest SNS site, mixi.jp. TC is the older of the two and is by now a very mature project, as well as the biggest hot spot in the key-value database field, now widely deployed on a great many web sites. TC is a high-performance storage engine, while TT provides a multi-threaded, highly concurrent server on top of it; its performance is also excellent, handling 40,000 to 50,000 reads/writes per second.

Besides key-value storage, TC also supports a hashtable data type, so it looks much like a simple database table, and it supports column-based conditional queries, paged queries and sorting, which amounts to the basic query functionality of a single table. It can therefore simply replace many relational-database operations, which is one of the main reasons TC is so well liked; there is a Ruby project, miyazakiresistance, that wraps TT's hashtable operations in an ActiveRecord-like interface, which is a pleasure to use.

In real use at mixi, TC/TT stores more than 20 million records while sustaining over ten thousand concurrent connections; it is a long-proven project. TC guarantees extremely high concurrent read/write performance while providing a reliable data persistence mechanism, and on top of that offers the relational-table-like hashtable structure plus simple conditional, paging and sorting operations: a very good NoSQL database.

TC's main weakness is that once the data volume reaches the hundreds of millions of records, concurrent write performance drops off sharply. "NoSQL: If Only It Was That Easy" mentions that its author found write performance beginning to fall off steeply when inserting 160 million 2-20KB records into TC. It seems TC's performance degrades badly once the record count climbs into the hundreds of millions; judging from the mixi figures TC's author himself provides, no such obvious write bottleneck had yet been hit at the tens-of-millions scale.

Here is a simple performance comparison of memcached, Redis, and Tokyo Tyrant done by Tim Yang, for reference only.

3. Flare
TC was developed for Japan's biggest SNS site, mixi, while Flare was developed by Japan's second biggest SNS site, green.jp; interesting, no? Simply put, Flare adds scale capability to TC. It replaces the TT part: Flare's authors wrote their own network server around TC, and Flare's main feature is its scale support. It adds a node server in front of the network service tier to manage multiple backend server nodes, so data nodes can be added dynamically, nodes can be removed, and failover is supported as well. If your usage scenario requires that TC be able to scale, consider Flare.

Flare's only drawback is that it speaks only the memcached protocol, so when you use Flare you cannot use TC's table data structure; you can only use the key-value structure for storage.

II. Document-oriented databases satisfying massive storage and access needs: MongoDB, CouchDB

The main problem document-oriented non-relational databases solve is not high-performance concurrent reads and writes but guaranteeing good query performance while storing massive amounts of data. MongoDB is developed in C++, while CouchDB is developed in Erlang:

1. MongoDB
MongoDB is a product that sits between relational and non-relational databases; of all non-relational databases it is the most feature-rich and the most like a relational database. The data structures it supports are very loose, a JSON-like BSON format, so it can store fairly complex data types. Mongo's greatest feature is that the query language it supports is extremely powerful; its syntax somewhat resembles an object-oriented query language, and it can achieve almost all the functionality of a single-table query in a relational database, with support for indexing the data as well.

Mongo mainly addresses the access efficiency of massive data. According to the official documentation, once the data volume passes 50GB, Mongo's database access speed is more than 10 times that of MySQL. Mongo's concurrent read/write efficiency is not especially outstanding; according to the official performance tests, it handles roughly 5,000 to 15,000 read/write requests per second. As for Mongo's concurrent read/write performance, I (robbin) also plan to test it properly when I have time.

Because Mongo is mainly aimed at massive data storage, it also ships with an excellent distributed file system, GridFS, which can support massive data storage; but I have also seen comments suggesting GridFS performance is not great, a point that still awaits a hands-on performance test to verify.

Finally, because Mongo can support complex data structures and comes with powerful data query capabilities, it is extremely popular, and many projects consider using MongoDB in place of MySQL for web applications that are not particularly complex. For instance, "why we migrated from MySQL to MongoDB" is a real case of migrating from MySQL to MongoDB: the data volume was simply too large, so they moved to Mongo, and query speed improved very markedly.

MongoDB also has a Ruby project, MongoMapper, a MongoDB interface written in imitation of Merb's DataMapper; it is very simple to use, almost identical to DataMapper, and very powerful and easy to work with.

2. CouchDB
CouchDB is by now a very famous project and hardly seems to need an introduction. But I have little interest in CouchDB, mainly because CouchDB offers only an HTTP REST interface, so purely in terms of concurrent read/write performance CouchDB is quite poor, which made me drop my interest in it immediately.

III. Distributed-computing-oriented databases satisfying high scalability and availability needs: Cassandra, Voldemort

The problem domain that scale-oriented databases address is actually rather different from the two categories above: such a system must first of all be a distributed database system, in which database nodes spread across different machines jointly form one database service, and on top of this distributed architecture it provides online, easy-to-use scalability, for example adding more data nodes or removing data nodes without stopping the service. Cassandra is therefore often seen as an open-source alternative to Google BigTable. Cassandra and Voldemort are both developed in Java:

1. Cassandra
The Cassandra project was open-sourced by Facebook in 2008; since then Facebook itself has used another, non-open-source branch of Cassandra, while the open-sourced Cassandra has been maintained mainly by Amazon's Dynamite team, and Cassandra is regarded as a version 2.0 of Dynamite. Besides Facebook, Twitter and digg.com are both using Cassandra.

Cassandra's defining trait is that it is not one database but a distributed network service jointly formed by a crowd of database nodes. A write to Cassandra is replicated to other nodes, and a read of Cassandra is routed to some node to be served. For a Cassandra cluster, scaling performance is a fairly simple matter of adding nodes to the cluster; I have seen an article saying Facebook's Cassandra cluster consists of more than 100 database servers.

Cassandra also supports fairly rich data structures and a powerful query language, quite similar to MongoDB, though its query capability is somewhat weaker than Mongo's. Evan Weaver, who leads Twitter's platform architecture department, wrote an article introducing Cassandra, http://blog.evanweaver.com/articles/2009/07/06/up-and-running-with-cassandra/, with a very detailed introduction.

Measured as a single node, Cassandra's concurrent read/write performance is not particularly good; one article reports benchmark results of slightly under 10,000 reads/writes per second per node, and I have also seen comments questioning that figure. But evaluating the performance of a single Cassandra node is meaningless: a real distributed database service necessarily consists of many nodes, and its concurrent performance depends on the number of nodes in the whole system and the efficiency of routing, not merely on the concurrent load capacity of a single node.

2. Voldemort
Voldemort is a distributed database system, similar to Cassandra, aimed at solving the scale problem. Cassandra came out of the SNS site Facebook, while Voldemort came out of the SNS site LinkedIn. Come to think of it, SNS sites have contributed quite a few NoSQL databases for us, such as Cassandra, Voldemort, Tokyo Cabinet, and Flare. There is not much material on Voldemort, so I did not look at it especially carefully; the figures Voldemort officially gives for concurrent read/write performance are also quite good, over 15,000 reads/writes per second.

From Facebook developing Cassandra and LinkedIn developing Voldemort, we can also roughly see how desperate large foreign SNS sites' need is for distributed databases, and especially for databases with scale capability. As I (robbin) mentioned earlier, in a web application architecture the web tier and app tier are comparatively easy to scale horizontally; only the database is a single point, extremely hard to scale. Facebook and LinkedIn have now explored a very promising direction in distributing non-relational databases, which is also the main reason Cassandra is so hot right now.

Today, NoSQL databases are an exciting field, with new technologies and new products constantly popping up and changing the technical notions we had already settled into. Having looked into it a little myself (robbin), I feel I could sink deep into it; one can say the NoSQL database field is broad and profound, and I (robbin) can only scratch the surface. I am writing this article both as a small summary of my own impressions and to toss out a brick in hopes of attracting jade, hoping to draw friends with experience in this field into discussion and exchange.

As far as my (robbin's) personal interests go, distributed database systems are not technology I can actually put to use, so I do not plan to spend time going deep into them; but the other two data domains (high-performance NoSQL DBs and massive-storage NoSQL DBs) both interest me a great deal, especially the three NoSQL databases Redis, TT/TC, and MongoDB, so next I will write three articles introducing these three databases in detail.

Chan Chen, 2012-02-18 15:48