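Atomicity: guarantees that the operations in a transaction are either all executed or not executed at all. Take a money transfer transaction: it either succeeds or fails. If it succeeds, the amount moves from the source account to the destination account and both balances change accordingly. If it fails, neither balance changes. It can never happen that the source account is debited while the destination account does not receive the money.
Consistency: guarantees that the database always stays in a consistent state. The data is consistent before the transaction and consistent after it, whether or not the transaction succeeds. In the example above, the database is consistent both before and after the transfer.
Isolation: when several transactions run concurrently, the result should be the same as if they had run one after another. The simplest form of isolation is to run all transactions serially, first come, first served, so that one transaction must finish before the next may start. That makes the database inefficient, though. For example, two different transactions that only read the same data can safely run concurrently. Different isolation levels exist to control the effects of concurrent execution; they are described in detail below.
Durability: once a transaction has completed, its effect on the database is permanent. Even if the database is damaged by a failure, it should be able to recover. This is usually implemented with a log.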
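The transfer example under atomicity can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not a production design: the accounts table, its columns, and the transfer() helper are hypothetical names chosen for the example.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src))
            row = conn.execute(
                "SELECT balance FROM accounts WHERE id = ?", (src,)).fetchone()
            if row is None or row[0] < 0:
                raise ValueError("unknown source account or insufficient funds")
            cur = conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst))
            if cur.rowcount != 1:
                raise ValueError("unknown destination account")
        return True
    except ValueError:
        return False  # the debit above has been rolled back

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))

# Hypothetical sample data: account 1 holds 100, account 2 holds 50.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

A failed transfer leaves both balances exactly as they were, because the debit and the credit belong to the same transaction.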
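Transaction isolation levels: an isolation level is the degree of concurrency control applied to transactions. ANSI/ISO SQL defines four levels: SERIALIZABLE, REPEATABLE READ, READ COMMITTED, and READ UNCOMMITTED. Databases usually implement isolation levels with locks. When programming, you generally only set the isolation level; which locks are actually taken is decided by the database. The four levels are introduced first, followed by examples of the concurrency problems that can occur in the last three (REPEATABLE READ, READ COMMITTED, and READ UNCOMMITTED).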
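SERIALIZABLE: all transactions execute serially, one after another, which avoids phantom reads. In a database that implements concurrency control with locks, serializability requires acquiring a range lock whenever a range query is executed (for example, selecting the users aged between 10 and 30). In a database that does not use lock-based concurrency control, a transaction that is found to violate serial execution must be rolled back.
REPEATABLE READ: no data retrieved by a SELECT can be modified by other transactions, which prevents a transaction from seeing different values when it reads the same data twice. Phantom reads are still possible, however: other transactions cannot change the selected rows, but they can add new rows, because the first transaction holds no range lock.
READ COMMITTED: data that has been read can be modified by other transactions, so non-repeatable reads may occur. A transaction acquires a read lock when it reads data but releases it immediately after the read (it does not wait until the transaction ends), whereas write locks are released only after the transaction commits. Once the read lock is released, other transactions may modify the data. This is also the default isolation level of SQL Server.
READ UNCOMMITTED: the lowest isolation level. It allows other transactions to see data that has not been committed, which leads to dirty reads.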
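Example: the following scenarios show the concurrency problems that correspond to the last three isolation levels. Assume two transactions. Transaction 1 executes Query 1, then Transaction 2 executes Query 2 and commits, and then Transaction 1 executes Query 1 again. The queries run against a table named users.
A phantom read occurs when a transaction executes the same query twice and the two result sets differ. This happens because no range lock was acquired when the SELECT was executed, so other transactions can still insert new rows.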
Transaction 1 | Transaction 2 |
/* Query 1 */ SELECT * FROM users WHERE age BETWEEN 10 AND 30; | |
| /* Query 2 */ INSERT INTO users VALUES ( 3, 'Bob', 27 ); COMMIT; |
/* Query 1 */ SELECT * FROM users WHERE age BETWEEN 10 AND 30; | |
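Note that Transaction 1 executes the same query (Query 1) twice. Under a higher isolation level (that is, SERIALIZABLE), both executions should return the same result set. Under REPEATABLE READ, the two result sets differ. Why is the level called repeatable read, then? Because it does solve the non-repeatable read problem.
In a database system that implements concurrency control with locks, a non-repeatable read occurs because no read lock is held when the SELECT is executed.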
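The effect of a range lock on the phantom read scenario can be sketched with a toy in-memory model. This is not how any particular database implements locking: the RangeLockedTable class is hypothetical, the two starting rows are made up for illustration, and a real database would make the writer wait rather than refuse the insert.

```python
class RangeLockedTable:
    """Toy table of (id, name, age) rows with optional range locks on age."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.range_locks = []  # entries are (owner, low, high)

    def select_age_between(self, txn, low, high, lock_range):
        """Return matching rows; optionally lock the age range for txn."""
        if lock_range:
            self.range_locks.append((txn, low, high))
        return [row for row in self.rows if low <= row[2] <= high]

    def insert(self, txn, row):
        """Insert unless another transaction holds a lock covering row's age."""
        for owner, low, high in self.range_locks:
            if owner != txn and low <= row[2] <= high:
                return False  # a real database would block here instead
        self.rows.append(row)
        return True


def run_phantom_scenario(lock_range):
    """Replay Query 1, Query 2, Query 1 from the table above."""
    table = RangeLockedTable([(1, "Alice", 20), (2, "Carol", 25)])  # made-up rows
    first = table.select_age_between("T1", 10, 30, lock_range)
    inserted = table.insert("T2", (3, "Bob", 27))
    second = table.select_age_between("T1", 10, 30, lock_range)
    return first, second, inserted
```

With lock_range=False the second result set contains the newly inserted row (the phantom). With lock_range=True the insert is refused and both result sets are identical.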
Transaction 1 | Transaction 2 |
/* Query 1 */ SELECT * FROM users WHERE id = 1; | |
| /* Query 2 */ UPDATE users SET age = 21 WHERE id = 1; COMMIT; |
/* Query 1 */ SELECT * FROM users WHERE id = 1; | |
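In this example, Transaction 2 commits successfully, so the second read in Transaction 1 returns a different age. Under the SERIALIZABLE and REPEATABLE READ isolation levels, the database should return the same value both times. Under READ COMMITTED and READ UNCOMMITTED, the database returns the updated value, and a non-repeatable read has occurred.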
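The difference between holding a read lock until the end of the transaction and releasing it right after the read can be sketched with another toy model. The LockingStore class and the starting age of 20 are assumptions made for the example; a real lock manager would make the writer wait instead of returning False.

```python
class LockingStore:
    """Toy key/value store with per-key read locks."""

    def __init__(self, data):
        self.data = dict(data)
        self.read_locks = {}  # key -> set of transactions holding a read lock

    def read(self, txn, key, hold_lock):
        """Read a value; keep the read lock only if hold_lock is true."""
        if hold_lock:
            self.read_locks.setdefault(key, set()).add(txn)
        # When hold_lock is false the lock is released right after the read,
        # so nothing is recorded.
        return self.data[key]

    def write(self, txn, key, value):
        """Write unless another transaction still holds a read lock on key."""
        other_holders = self.read_locks.get(key, set()) - {txn}
        if other_holders:
            return False  # a real database would block the writer here
        self.data[key] = value
        return True


def run_reread_scenario(hold_lock):
    """Replay Query 1, Query 2, Query 1 from the table above on row id = 1."""
    store = LockingStore({1: 20})  # hypothetical starting age
    first = store.read("T1", 1, hold_lock)
    updated = store.write("T2", 1, 21)
    second = store.read("T1", 1, hold_lock)
    return first, second, updated
```

Calling run_reread_scenario(False) models READ COMMITTED and yields two different values for the same row, while run_reread_scenario(True) models REPEATABLE READ and yields the same value twice.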
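If one transaction reads a value that another transaction has modified, and that other transaction is then rolled back, the first transaction has read dirty data. This is what is known as a dirty read. It happens when transactions are allowed to read updates that have not been committed.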
Transaction 1 | Transaction 2 |
/* Query 1 */ SELECT * FROM users WHERE id = 1; | |
| /* Query 2 */ UPDATE users SET age = 21 WHERE id = 1; |
/* Query 1 */ SELECT * FROM users WHERE id = 1; | |
| ROLLBACK; |
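The dirty read scenario can be sketched by modelling a row that keeps its committed value separate from a pending, uncommitted write. The VersionedRow class and the starting age of 20 are hypothetical; the sketch only illustrates which value a reader sees, not how a database stores versions.

```python
_NO_WRITE = object()  # sentinel: no uncommitted write is pending


class VersionedRow:
    """Toy row holding a committed value and at most one pending write."""

    def __init__(self, committed):
        self.committed = committed
        self.pending = _NO_WRITE

    def write(self, value):
        """Record an uncommitted update."""
        self.pending = value

    def commit(self):
        """Make the pending update permanent."""
        if self.pending is not _NO_WRITE:
            self.committed = self.pending
            self.pending = _NO_WRITE

    def rollback(self):
        """Discard the pending update."""
        self.pending = _NO_WRITE

    def read(self, allow_dirty):
        """Return the pending value only if dirty reads are allowed."""
        if allow_dirty and self.pending is not _NO_WRITE:
            return self.pending
        return self.committed


def run_dirty_read_scenario(allow_dirty):
    """Replay read, uncommitted update, read, rollback, read."""
    row = VersionedRow(20)  # hypothetical starting age
    first = row.read(allow_dirty)
    row.write(21)           # Transaction 2 updates but does not commit
    second = row.read(allow_dirty)
    row.rollback()          # Transaction 2 rolls back
    final = row.read(allow_dirty)
    return first, second, final
```

With allow_dirty=True (READ UNCOMMITTED) the second read returns a value that never becomes part of the committed data. With allow_dirty=False the reader only ever sees the committed value.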
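To summarize, the isolation levels and the problems each one allows are:
Isolation level | Dirty read | Non-repeatable read | Phantom read |
READ UNCOMMITTED | YES | YES | YES |
READ COMMITTED | NO | YES | YES |
REPEATABLE READ | NO | NO | YES |
SERIALIZABLE | NO | NO | NO |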
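The summary table can also be written down as a lookup structure, which makes it easy to ask which level is the weakest one that rules out a given problem. The ANOMALIES mapping and the helper name are illustrative choices, not part of any database API.

```python
# Levels listed from weakest to strongest; True means the problem can occur.
ANOMALIES = {
    "READ UNCOMMITTED": {"dirty read": True, "non-repeatable read": True, "phantom read": True},
    "READ COMMITTED": {"dirty read": False, "non-repeatable read": True, "phantom read": True},
    "REPEATABLE READ": {"dirty read": False, "non-repeatable read": False, "phantom read": True},
    "SERIALIZABLE": {"dirty read": False, "non-repeatable read": False, "phantom read": False},
}


def weakest_level_preventing(anomaly):
    """Return the first (weakest) isolation level under which `anomaly` cannot occur."""
    for level, allowed in ANOMALIES.items():  # dicts keep insertion order
        if not allowed[anomaly]:
            return level
    raise ValueError(f"no level prevents {anomaly!r}")
```

For instance, weakest_level_preventing("phantom read") walks the levels in order and stops at SERIALIZABLE, matching the last row of the table.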
solidDB® can store binary and character data up to 2147483647 (2G - 1) bytes long. When such data exceeds a certain length, the data is called a BLOB (Binary Large OBject) or CLOB (Character Large OBject), depending upon the data type that stores the information. CLOBs contain only "plain text" and can be stored in any of the following data types:
CHAR, WCHAR
VARCHAR, WVARCHAR
LONG VARCHAR (mapped to standard type CLOB)
LONG WVARCHAR (mapped to standard type NCLOB)
BLOBs can store any type of data that can be represented as a sequence of bytes, such as a digitized picture, video, audio, or a formatted text document. (They can also store plain text, but you'll have more flexibility if you store plain text in CLOBs.) BLOBs are stored in any of the following data types:
BINARY
VARBINARY
LONG VARBINARY (mapped to standard type BLOB)
Since character data is a sequence of bytes, character data can be stored in BINARY fields, as well as in CHAR fields. CLOBs can be considered a subset of BLOBs.
For convenience, we will use the term BLOBs to refer to both CLOBs and BLOBs.
For most non-BLOB data types, such as integer, float, date, etc., there is a rich set of valid operations that you can do on that data type. For example, you can add, subtract, multiply, divide, and do other operations with FLOAT values. Because a BLOB is a sequence of bytes and the database server does not know the "meaning" of that sequence of bytes (i.e. it doesn't know whether the bytes represent a movie, a song, or the design of the space shuttle), the operations that you can do on BLOBs are very limited.
solidDB does allow you to perform some string operations on CLOBs. For example, you can search for a particular substring (e.g. a person's name) inside a CLOB by using the LOCATE() function. Because such operations require a lot of the server's resources (memory and/or CPU time), solidDB allows you to limit the number of bytes of the CLOB that are processed. For example, you might specify that only the first 1 megabyte of each CLOB be searched when doing a string search. For more information, see the description of the MaxBlobExpressionSize configuration parameter in the solidDB Administration Guide.
Although it is theoretically possible to store the entire BLOB "inside" a typical table, if the BLOB is large, then the server usually performs better if most or all of the BLOB is not stored in the table. In solidDB, if a BLOB is no more than N bytes long, then the BLOB is stored in the table. If the BLOB is longer than N bytes, then the first N bytes are stored in the table, and the rest of the BLOB is stored outside the table as disk blocks in the physical database file. The exact value of "N" depends in part upon the structure of the table, the disk page size that you specified when you created the database, etc., but is always at least 256. (Data 256 bytes or shorter is always stored in the table.)
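The in-table versus out-of-table split described above can be sketched as a small function. This only illustrates the rule as stated; the split_blob() name is hypothetical, and the value passed for N is an assumption because the real threshold depends on the table structure and page size.

```python
def split_blob(blob: bytes, n: int):
    """Split a value into the part kept in the table and the overflow part.

    `n` stands for the threshold N from the text; the text only guarantees
    that it is at least 256, so smaller values are rejected.
    """
    if n < 256:
        raise ValueError("N is always at least 256")
    in_table = blob[:n]   # the first N bytes stay in the row
    overflow = blob[n:]   # the rest goes to separate disk blocks
    return in_table, overflow
```

A 100-byte value with N = 256 stays entirely in the table, while a 1000-byte value keeps its first 256 bytes in the row and spills the remaining 744 bytes.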
If a data row size is larger than one third of the disk block size of the database file, you must store it partly as a BLOB.
The SYS_BLOBS system table is used as a directory for all BLOB data in the physical database file. One SYS_BLOBS entry can accommodate 50 BLOB parts. If the BLOB size exceeds 50 parts, several SYS_BLOBS entries per BLOB are needed.
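The number of directory entries a BLOB needs follows from the 50-parts-per-entry rule by a ceiling division. In this sketch the function name is hypothetical and the number of parts is taken as an input, since the text does not say how large one part is.

```python
PARTS_PER_ENTRY = 50  # from the text: one entry accommodates 50 BLOB parts


def sys_blobs_entries(parts: int) -> int:
    """Return how many SYS_BLOBS entries a BLOB with `parts` parts needs."""
    if parts < 1:
        raise ValueError("a BLOB has at least one part")
    # Ceiling division using integers only.
    return -(-parts // PARTS_PER_ENTRY)
```

A BLOB with exactly 50 parts fits in one entry, and the 51st part is what forces a second entry.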
The query below returns an estimate on the total size of BLOBs in the database.
select sum(totalsize) from sys_blobs
The estimate is not accurate, because the information is only maintained at checkpoints. After two empty checkpoints, this query should return an accurate result.
_id values can be of any type, so if your objects have a natural unique identifier, consider using it as the _id to both save space and avoid an additional index.
If the _id values arrive in a roughly ascending order, inserts do not require the entire B-tree of the _id index to be loaded. BSON ObjectIds have this property.
BSON includes a binary data datatype for storing byte arrays. Storing UUIDs in this type rather than as hex strings makes the id values, and their respective keys in the _id index, half the size.
Note that unlike the BSON ObjectId type (see above), most UUIDs do not have a rough ascending order, which creates additional caching needs for their index.
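The size difference between a UUID stored as raw bytes and the same UUID stored as a hex string can be checked with Python's standard uuid module. The UUID value below is an arbitrary example, and the base64 string is computed only because the shell's BinData constructor takes base64 input.

```python
import base64
import uuid

# Arbitrary example UUID, fixed so the sizes are reproducible.
example = uuid.UUID("12345678-1234-5678-1234-567812345678")

as_hex = example.hex      # hex string without hyphens
as_bytes = example.bytes  # raw byte array, suitable for a binary field

# Base64 text of the raw bytes, the form a BinData constructor expects.
as_base64 = base64.b64encode(as_bytes).decode("ascii")

print(len(as_hex), len(as_bytes))  # prints "32 16"
```

The raw form is 16 bytes against 32 characters for the hex form, which is where the factor of two comes from.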
> // mongo shell bindata info:
> help misc
b = new BinData(subtype,base64str) create a BSON BinData value
b.subtype() the BinData subtype (0..255)
b.length() length of the BinData data in bytes
b.hex() the data as a hex encoded string
b.base64() the data as a base 64 encoded string
b.toString()
The BSON ObjectId format provides documents with a creation timestamp (one second granularity) for free. Almost all drivers implement methods for extracting these timestamps; see the relevant API docs for details. In the shell:
> // mongo shell ObjectId methods
> help misc
o = new ObjectId() create a new ObjectId
o.getTimestamp() return timestamp derived from first 32 bits of the OID
o.isObjectId()
o.toString()
o.equals(otherid)
BSON ObjectIds begin with a timestamp. Thus sorting by _id, when using the ObjectId type, results in sorting by time. Note: the granularity of the timestamp portion of the ObjectId is one second only.
> // get 10 newest items
> db.mycollection.find().sort({_id:-1}).limit(10);
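Why sorting by _id sorts by time can be seen by decoding the leading 32 bits of an ObjectId by hand. The sketch below works on the 24-character hex form and does not use any MongoDB driver; the function name and the two sample ids are illustrative values.

```python
from datetime import datetime, timezone


def objectid_timestamp(oid_hex: str) -> datetime:
    """Decode the creation time stored in the first 4 bytes of an ObjectId."""
    if len(oid_hex) != 24:
        raise ValueError("an ObjectId is 12 bytes, i.e. 24 hex characters")
    seconds = int(oid_hex[:8], 16)  # first 32 bits: seconds since the Unix epoch
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Two sample ids; sorting the hex strings also sorts them by creation time,
# because the timestamp is the most significant part of the value.
sample_ids = ["507f1f77bcf86cd799439011", "507f191e810c19729de860ea"]
newest_first = sorted(sample_ids, reverse=True)
```

Because the timestamp has one second granularity, ids created within the same second are ordered by their remaining bytes rather than by exact creation time.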