
ThreadLocal and synchronized

rainman — Mon, 06 Oct 2008 04:13:00 GMT

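Summary: Java has good support for multithreading, and with Java it is easy to write a multithreaded program. Multithreading, however, can cause concurrent-access problems. Both synchronized and ThreadLocal are used to solve concurrent-access problems in multithreaded code. Most people are fairly familiar with synchronized but far less so with ThreadLocal. The concurrency problem: when an object is accessed by two threads at the same time, one of the threads may get an unpredictable result. ... Read more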


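Revisiting ReentrantLock

rainman — Fri, 03 Oct 2008 09:55:00 GMT

A reentrant lock (ReentrantLock) is a recursive, non-blocking synchronization mechanism. I had always thought of it as a simple replacement for synchronized, with an implementation that did not differ much. In recent practice, however, I found that the two are worlds apart.

The official description reads: a reentrant mutual exclusion Lock with the same basic behavior and semantics as the implicit monitor lock accessed using synchronized methods and statements, but with extended capabilities. A ReentrantLock is owned by the thread last successfully locking, but not yet unlocking it. A thread invoking lock will return, successfully acquiring the lock, when the lock is not owned by another thread. The method will return immediately if the current thread already owns the lock. This can be checked using the methods isHeldByCurrentThread() and getHoldCount().

It provides the lock() method:
If the lock is not held by another thread, lock() acquires it and returns immediately, setting the lock hold count to 1.
If the current thread already holds the lock, the hold count is incremented by 1 and the method returns immediately.
If the lock is held by another thread, the current thread is disabled for thread scheduling purposes and lies dormant until the lock has been acquired, at which time the lock hold count is set to 1.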
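The hold-count rules above can be observed directly. The following is a minimal sketch; the class name HoldCountDemo and the helper observeHoldCounts() are made up for this illustration, and only lock(), unlock(), and getHoldCount() come from ReentrantLock itself.

```java
import java.util.Arrays;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo class: records the hold count at each step of a nested lock/unlock.
public class HoldCountDemo {

    // Returns the hold counts seen after: first lock(), nested lock(),
    // first unlock(), and final unlock().
    public static int[] observeHoldCounts() {
        ReentrantLock lock = new ReentrantLock();
        int[] counts = new int[4];
        lock.lock();                        // lock is free: hold count becomes 1
        try {
            counts[0] = lock.getHoldCount();
            lock.lock();                    // same thread again: hold count becomes 2
            try {
                counts[1] = lock.getHoldCount();
            } finally {
                lock.unlock();              // back to 1, the thread still owns the lock
            }
            counts[2] = lock.getHoldCount();
        } finally {
            lock.unlock();                  // hold count 0: the lock is released
        }
        counts[3] = lock.getHoldCount();
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(observeHoldCounts()));
    }
}
```

Each lock() must be paired with an unlock(); the lock only becomes available to other threads when the count returns to 0.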

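Recently, while studying how task scheduling is implemented in java.util.concurrent, I read some of the DelayQueue code, such as take(). The main job of this method is to take the task that should run soonest (the one with the highest priority) from the priority queue (PriorityQueue). If that task's scheduled time has not yet arrived, take() must wait for the remaining time. Otherwise, if the time has come, it returns the task. The offer() method adds a task to the queue.

This raised a question: if the task at the head of the queue is due in one hour and a task due in 10 seconds is submitted now, what happens? Let's first look at the source of take():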

public E take() throws InterruptedException {
    final ReentrantLock lock = this.lock;
    lock.lockInterruptibly();
    try {
        for (;;) {
            E first = q.peek();
            if (first == null) {
                available.await();
            } else {
                long delay = first.getDelay(TimeUnit.NANOSECONDS);
                if (delay > 0) {
                    long tl = available.awaitNanos(delay);
                } else {
                    E x = q.poll();
                    assert x != null;
                    if (q.size() != 0)
                        available.signalAll(); // wake up other takers
                    return x;
                }
            }
        }
    } finally {
        lock.unlock();
    }
}

And here is the source of offer():

public boolean offer(E e) {
    final ReentrantLock lock = this.lock;
    lock.lock();
    try {
        E first = q.peek();
        q.offer(e);
        if (first == null || e.compareTo(first) < 0)
            available.signalAll();
        return true;
    } finally {
        lock.unlock();
    }
}

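As the code shows, take() and offer() both lock the same reentrant lock. Thinking in terms of synchronized (something like synchronized(obj)), the two methods would be mutually exclusive. Back to the question: take() has to wait 1 hour before it can return, while offer() needs to submit, right now, a task that runs in 10 seconds. Will offer() have to wait until take() returns before it can submit? The answer is no, and the verification code I wrote confirms it. This made me more interested in the reentrant lock: it really is a non-blocking lock.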
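The scenario in question can be tried against the real DelayQueue. This is only a sketch: the Task class, its names, and the shortened delays (50 ms instead of 10 seconds) are assumptions made for the illustration, not part of the original experiment.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Hypothetical demo: offer() succeeds while another thread is blocked inside take().
public class DelayQueueOfferDemo {

    // Illustrative task type: a name plus an absolute due time.
    static final class Task implements Delayed {
        final String name;
        final long dueNanos;

        Task(String name, long delay, TimeUnit unit) {
            this.name = name;
            this.dueNanos = System.nanoTime() + unit.toNanos(delay);
        }

        public long getDelay(TimeUnit unit) {
            return unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    // Returns the name of the task that the blocked take() eventually receives.
    public static String run() {
        final DelayQueue<Task> queue = new DelayQueue<Task>();
        queue.offer(new Task("in one hour", 1, TimeUnit.HOURS));
        final String[] taken = new String[1];

        Thread taker = new Thread(new Runnable() {
            public void run() {
                try {
                    taken[0] = queue.take().name;   // waits on a Condition inside the queue
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        });
        taker.start();
        try {
            Thread.sleep(100);                      // give the taker time to start waiting
            // offer() is not held up by the waiting take(); it returns right away.
            queue.offer(new Task("in 50 ms", 50, TimeUnit.MILLISECONDS));
            taker.join(5000);                       // join also makes taken[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        taker.interrupt();                          // no effect if the taker already finished
        return taken[0];
    }

    public static void main(String[] args) {
        System.out.println("take() returned the task due " + run());
    }
}
```

The taker ends up with the later-submitted, earlier-due task, which is the behavior described above.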

The following code may illustrate the point: it starts 4 threads, and each one prints the lock's current state before acquiring it and then waits 5 seconds.

import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Wrapper class added so the listing compiles; the name is arbitrary.
public class LockStateDemo {
    public static void main(String[] args) throws InterruptedException {
        final ExecutorService exec = Executors.newFixedThreadPool(4);
        final ReentrantLock lock = new ReentrantLock();
        final Condition con = lock.newCondition();
        final int time = 5;
        final Runnable add = new Runnable() {
            public void run() {
                System.out.println("Pre " + lock);
                lock.lock();
                try {
                    // await() releases the lock while this thread waits
                    con.await(time, TimeUnit.SECONDS);
                } catch (InterruptedException e) {
                    e.printStackTrace();
                } finally {
                    System.out.println("Post " + lock.toString());
                    lock.unlock();
                }
            }
        };
        for (int index = 0; index < 4; index++)
            exec.submit(add);
        exec.shutdown();
    }
}

Here is its output:

Pre ReentrantLock@a59698[Unlocked]
Pre ReentrantLock@a59698[Unlocked]
Pre ReentrantLock@a59698[Unlocked]
Pre ReentrantLock@a59698[Unlocked]
Post ReentrantLock@a59698[Locked by thread pool-1-thread-1]
Post ReentrantLock@a59698[Locked by thread pool-1-thread-2]
Post ReentrantLock@a59698[Locked by thread pool-1-thread-3]
Post ReentrantLock@a59698[Locked by thread pool-1-thread-4]

Every thread saw the lock state as "Unlocked", so all of them could proceed. But when con.await is replaced with Thread.sleep(5000), the output becomes:

Pre ReentrantLock@a59698[Unlocked]
Pre ReentrantLock@a59698[Locked by thread pool-1-thread-1]
Pre ReentrantLock@a59698[Locked by thread pool-1-thread-1]
Pre ReentrantLock@a59698[Locked by thread pool-1-thread-1]
Post ReentrantLock@a59698[Locked by thread pool-1-thread-1]
Post ReentrantLock@a59698[Locked by thread pool-1-thread-2]
Post ReentrantLock@a59698[Locked by thread pool-1-thread-3]
Post ReentrantLock@a59698[Locked by thread pool-1-thread-4]

The comparison shows that while a thread is waiting (con.await) it no longer holds (keeps) the lock, so other threads can acquire the reentrant lock.
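The same observation can be made deterministic, without reading timing-dependent output. In the sketch below (class and variable names are invented for the illustration), the main thread calls lock() only after the other thread is known to own the lock, so the call can return only because await() gave the lock up.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical demo: Condition.await() releases the lock while the thread waits.
public class AwaitReleasesLockDemo {

    // Returns true if the caller acquired the lock while the other thread
    // was still alive and parked in await().
    public static boolean run() {
        final ReentrantLock lock = new ReentrantLock();
        final Condition con = lock.newCondition();
        final CountDownLatch holding = new CountDownLatch(1);
        final boolean[] signalled = { false };    // guarded by lock

        Thread waiter = new Thread(new Runnable() {
            public void run() {
                lock.lock();
                try {
                    holding.countDown();          // this thread now owns the lock
                    while (!signalled[0]) {
                        con.await();              // releases the lock and waits
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    lock.unlock();
                }
            }
        });
        waiter.start();

        boolean acquiredWhileWaiting = false;
        try {
            holding.await();                      // the waiter owns the lock at this point
            lock.lock();                          // returns only once the waiter is in await()
            try {
                acquiredWhileWaiting = lock.isHeldByCurrentThread() && waiter.isAlive();
                signalled[0] = true;
                con.signalAll();
            } finally {
                lock.unlock();                    // lets the waiter re-acquire and finish
            }
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return acquiredWhileWaiting;
    }

    public static void main(String[] args) {
        System.out.println("acquired while other thread awaited: " + run());
    }
}
```

For comparison, the built-in monitor behaves analogously: Object.wait() also releases the monitor of a synchronized block while the thread waits.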

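It is worth going back to the official Java explanation: "If the lock is held by another thread then the current thread becomes disabled for thread scheduling purposes and lies dormant until the lock has been acquired". My understanding of "held" here is that it covers every state other than waiting, such as the thread sleeping, running a for loop, or any other activity in which the CPU takes part. Once a thread enters the wait state it no longer keeps the lock, and other threads can acquire it. When the thread is woken (by a signal or a timeout) it carries on and "holds" the lock again, provided, of course, that no other thread is still "holding" the reentrant lock.

To sum up in one sentence: for a reentrant lock, "lock" and "keep" are two different concepts. A thread that has locked the lock does not necessarily keep it, but a thread that keeps the lock must have locked it.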


rainman — Fri, 03 Oct 2008 06:35:00 GMT

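Fifteen years ago, multiprocessor systems were highly specialized systems costing hundreds of thousands of dollars (and most of them had two to four processors). Today, multiprocessor systems are cheap and plentiful, nearly every major microprocessor has built-in support for multiprocessing, and many systems support dozens or hundreds of processors.

To exploit the power of multiprocessor systems, applications are generally structured using multiple threads. But as anyone who has written a concurrent application can tell you, simply dividing the work across multiple threads is not enough to achieve good hardware utilization: you must also ensure that your threads really are working most of the time, rather than waiting for more work or waiting for locks on shared data structures.

The problem: coordination between threads

Few tasks can be truly parallelized in such a way that threads need no coordination at all. Consider a thread pool, where the tasks being executed are generally independent of each other. If the thread pool feeds off a common work queue, then removing elements from or adding elements to the work queue must be thread-safe, and that means coordinating access to the head, tail, or inter-node link pointers. It is this coordination that causes all the trouble.

The standard approach: locking

The traditional way to coordinate access to shared fields in the Java language is to use synchronization, ensuring that all accesses to shared fields are made while holding the appropriate lock. With synchronization, you are assured (assuming your class is written correctly) that whichever thread holds the lock protecting a given set of variables has exclusive access to those variables, and that changes to those variables become visible to other threads when they subsequently acquire the lock. The downside is that if the lock is heavily contended (threads frequently ask for the lock while another thread holds it), throughput can suffer, because contended synchronization can be quite expensive. (Public Service Announcement: uncontended synchronization is now quite inexpensive on modern JVMs.)
Another problem with lock-based algorithms is that if a thread holding a lock is delayed (because of a page fault, a scheduling delay, or some other unexpected delay), then no thread requiring that lock can make progress.
Volatile variables can also be used to store shared variables at a lower cost than synchronization, but they have limitations. While writes to a volatile variable are guaranteed to be immediately visible to other threads, there is no way to make a read-modify-write sequence of operations atomic. This means, for example, that a volatile variable cannot be used to reliably implement a mutex (mutual exclusion lock) or a counter.
Implementing counters and mutexes with locking

Consider developing a thread-safe counter class that exposes get(), increment(), and decrement() operations. Listing 1 shows an example of how this class might be implemented using locking (synchronization). Note that all the methods, even get(), need to be synchronized for the class to be thread-safe, so that no updates are lost and all threads see the most recent value of the counter.
Listing 1. A synchronized counter class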
public class SynchronizedCounter {
    private int value;
    public synchronized int getValue() { return value; }
    public synchronized int increment() { return ++value; }
    public synchronized int decrement() { return --value; }
}

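The increment() and decrement() operations are atomic read-modify-write operations. To increment the counter safely, you must take the current value, add one to it, and write the new value back, all as a single operation that no other thread can interrupt. Otherwise, if two threads tried to execute the increment at the same time, an unlucky interleaving of operations would result in the counter being incremented only once instead of twice. (Note that this cannot be achieved reliably by making the value instance variable volatile.)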
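A small experiment makes the guarantee concrete: several threads increment a synchronized counter and no update is lost. The demo class, its nested counter, and the thread and iteration counts are assumptions chosen for this sketch.

```java
// Hypothetical demo: concurrent increments on a synchronized counter lose no updates.
public class SynchronizedCounterDemo {

    // Same shape as the counter in Listing 1, nested here to keep the sketch self-contained.
    static class SynchronizedCounter {
        private int value;
        public synchronized int getValue() { return value; }
        public synchronized int increment() { return ++value; }
    }

    // Starts the given number of threads, each incrementing perThread times,
    // and returns the final counter value.
    public static int run(int threads, final int perThread) {
        final SynchronizedCounter counter = new SynchronizedCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        counter.increment();   // read-modify-write under the object's lock
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.getValue();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10000));
    }
}
```

With the synchronized keyword removed from increment(), the final value could come out lower than threads × perThread, although a given run is not guaranteed to show the loss.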

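The atomic read-modify-write combination shows up in many concurrent algorithms. The code in Listing 2 implements a simple mutex, and its acquire() method is also an atomic read-modify-write operation. To acquire the mutex, you have to make sure that no one else holds it (curOwner == null) and then record the fact that you own it (curOwner = Thread.currentThread()), all without any possibility that another thread could come along in the middle and modify the curOwner field.
Listing 2. A synchronized mutex class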

public class SynchronizedMutex {
    private Thread curOwner = null;
    public synchronized void acquire() throws InterruptedException {
        if (Thread.interrupted()) throw new InterruptedException();
        while (curOwner != null)
            wait();
        curOwner = Thread.currentThread();
    }
    public synchronized void release() {
        if (curOwner == Thread.currentThread()) {
            curOwner = null;
            notify();
        } else
            throw new IllegalStateException("not owner of mutex");
    }
}

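The counter class in Listing 1 works reliably and performs well when there is little or no contention. Under heavy contention, however, performance suffers dramatically, because the JVM spends more time scheduling threads and managing contention and queues of waiting threads, and less time doing real work such as incrementing the counter. You may recall the graphs in last month's column, which showed how throughput can drop sharply once multiple threads contend for a built-in monitor lock using synchronization. While that column showed how the new ReentrantLock class is a more scalable replacement for synchronization, for some problems there is an even better approach.

Problems with locking

With locking, if one thread tries to acquire a lock that another thread already holds, the thread blocks until the lock becomes available. This approach has some obvious drawbacks, including the fact that a thread blocked waiting for a lock cannot do anything else. The result can be very bad if the blocked thread is a high-priority task (a hazard known as priority inversion).

Locks have other hazards too, such as deadlock (which can happen when multiple locks are acquired in an inconsistent order). Even without such hazards, locks are a relatively coarse-grained coordination mechanism, and as such they are rather heavyweight for managing a simple operation such as incrementing a counter or updating the owner of a mutex. It would be better to have a finer-grained mechanism for reliably managing concurrent updates to individual variables, and most modern processors have one.

Hardware synchronization primitives

As stated earlier, most modern processors include support for multiprocessing. This support of course includes the ability for multiple processors to share peripherals and main memory, but it generally also includes additions to the instruction set that support the special requirements of multiprocessing. In particular, nearly every modern processor has instructions for updating shared variables in a way that can either detect or prevent concurrent access by other processors.

Compare and swap (CAS)

The first processors that supported concurrency provided atomic test-and-set operations, which generally operated on a single bit. The most common approach in current processors, including Intel and Sparc processors, is to implement a primitive called compare-and-swap, or CAS. (On Intel processors, compare-and-swap is implemented by the cmpxchg family of instructions. PowerPC processors have a pair of instructions called "load and reserve" and "store conditional" that achieve the same goal; MIPS is similar to PowerPC, except that the first instruction is called "load linked".)

A CAS operation takes three operands: a memory location (V), the expected old value (A), and a new value (B). If the value at the memory location matches the expected old value, the processor atomically updates the location to the new value; otherwise it does nothing. In either case, it returns the value that was at the location before the CAS instruction. (Some flavors of CAS instead return only whether the CAS succeeded, without fetching the current value.) CAS effectively says: "I think location V should contain the value A; if it does, put B there; otherwise, don't change it, just tell me what value is there now."

The usual way to use CAS for synchronization is to read a value A from address V, perform a multistep computation to derive a new value B, and then use CAS to change the value at V from A to B. The CAS succeeds if the value at V has not been changed in the meantime.

Instructions like CAS allow an algorithm to perform a read-modify-write sequence without fear of another thread modifying the variable in the meantime, because if another thread did modify the variable, the CAS would detect it (and fail) and the algorithm could retry the operation. Listing 3 illustrates the behavior (but not the performance characteristics) of the CAS operation; the value of CAS is that it is implemented in hardware and is extremely lightweight (on most processors):

Listing 3. Code illustrating the behavior (but not the performance) of compare-and-swap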

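public class SimulatedCAS {
    private int value;

    public synchronized int getValue() { return value; }

    // Returns the value that was present before the call, whether or not the swap happened.
    public synchronized int compareAndSwap(int expectedValue, int newValue) {
        int oldValue = value;
        if (value == expectedValue)
            value = newValue;
        return oldValue;
    }
}

Implementing counters with CAS

Concurrent algorithms based on CAS are called lock-free, because threads never have to wait for a lock (sometimes called a mutex or critical section, depending on the terminology of your threading platform). Whether the CAS operation succeeds or fails, it completes in a predictable amount of time. If the CAS fails, the caller can retry the CAS operation or take whatever other action is appropriate. Listing 4 shows the counter class rewritten to use CAS instead of locking: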

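public class CasCounter {
    // Initialized here so that getValue() and increment() cannot hit a null reference.
    private final SimulatedCAS value = new SimulatedCAS();

    public int getValue() {
        return value.getValue();
    }

    public int increment() {
        int oldValue = value.getValue();
        // Retry until no other thread changed the value between the read and the CAS.
        while (value.compareAndSwap(oldValue, oldValue + 1) != oldValue)
            oldValue = value.getValue();
        return oldValue + 1;
    }
}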
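The same retry loop can be written against the real java.util.concurrent.atomic.AtomicInteger, whose compareAndSet reports success as a boolean rather than returning the old value. The class name AtomicCasCounter and the run() helper are assumptions made for this sketch; it is not the article's listing.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter: the CAS retry loop expressed with AtomicInteger.compareAndSet.
public class AtomicCasCounter {
    private final AtomicInteger value = new AtomicInteger();

    public int getValue() {
        return value.get();
    }

    public int increment() {
        for (;;) {
            int old = value.get();
            // Succeeds only if nobody changed the value since we read it.
            if (value.compareAndSet(old, old + 1)) {
                return old + 1;
            }
        }
    }

    // Starts the given number of threads, each incrementing perThread times,
    // and returns the final counter value.
    public static int run(int threads, final int perThread) {
        final AtomicCasCounter counter = new AtomicCasCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(new Runnable() {
                public void run() {
                    for (int j = 0; j < perThread; j++) {
                        counter.increment();
                    }
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return counter.getValue();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 10000));
    }
}
```

In everyday code AtomicInteger.incrementAndGet() does the same job; the explicit loop is kept here to show the read, compute, CAS, retry pattern.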

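Lock-free and wait-free algorithms

An algorithm is said to be wait-free if every thread continues to make progress in the face of arbitrary delay (or even failure) of other threads. By contrast, a lock-free algorithm requires only that some thread always makes progress. (Another definition of wait-free is that each thread is guaranteed to compute its operations correctly in a bounded number of its own steps, regardless of the actions, timing, interleaving, or speed of the other threads. This bound may be a function of the number of threads in the system; for example, if ten threads each execute the CasCounter.increment() operation once, in the worst case each thread will have to retry at most nine times before its increment completes.)

Over the past 15 years there has been a great deal of research into wait-free and lock-free algorithms (also called nonblocking algorithms), and nonblocking algorithms have been found for many common data structures. Nonblocking algorithms are used extensively at the operating system and JVM level for tasks such as thread and process scheduling. Although they are more complicated to implement, they have a number of advantages over lock-based alternatives: hazards such as priority inversion and deadlock are avoided, contention is less expensive, and coordination happens at a finer level of granularity, which allows a higher degree of parallelism.
Atomic variable classes

Until JDK 5.0, it was not possible to write wait-free, lock-free algorithms in the Java language without using native code. That changed with the addition of the atomic variable classes in the java.util.concurrent.atomic package. The atomic variable classes all expose a compare-and-set primitive (similar to compare-and-swap), which is implemented using the fastest native construct available on the platform (compare-and-swap, load linked/store conditional, or, in the worst case, spin locks). The java.util.concurrent.atomic package provides nine flavors of atomic variables (AtomicInteger; AtomicLong; AtomicReference; AtomicBoolean; array forms of atomic integer, long, and reference; and the atomic marked reference and stamped reference classes, which atomically update a pair of values).
The atomic variable classes can be thought of as a generalization of volatile variables, extending the concept of a volatile variable to support atomic conditional compare-and-set updates. Reads and writes of atomic variables have the same memory semantics as read and write accesses to volatile variables.
Although the atomic variable classes may look superficially like the SynchronizedCounter example in Listing 1, the similarity is only superficial. Under the hood, operations on atomic variables are turned into the hardware primitives that the platform provides for concurrent access, such as compare-and-swap.
Finer grained means lighter weight

A common technique for tuning the scalability of a concurrent application that is experiencing contention is to reduce the granularity of the lock objects used, in the hope that more lock acquisitions will go from contended to uncontended. Converting from locking to atomic variables achieves the same end: by switching to a finer-grained coordination mechanism, fewer operations become contended, which improves throughput.

The ABA problem
Because CAS basically asks "is the value of V still A?" before changing V, a CAS-based algorithm can be confused if the value changes from A to B and back to A between the time V was first read and the time the CAS on V is performed. In that case the CAS operation succeeds, but in some situations the result may not be what you expect. (Note that the counter and mutex examples in Listing 1 and Listing 2 do not have this problem, but not all algorithms are immune.) This is called the ABA problem. It is generally handled by associating a tag, or version number, with each value to be CASed and atomically updating both the value and the tag. The AtomicStampedReference class supports this approach.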
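A short single-threaded sketch shows the difference a stamp makes. The class name AbaDemo and the string values are invented for the illustration; AtomicReference and AtomicStampedReference are the real JDK classes.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical demo: a plain CAS misses an A -> B -> A change, a stamped CAS detects it.
public class AbaDemo {

    // Returns { plain CAS succeeded, stamped CAS succeeded }.
    public static boolean[] run() {
        final String a = "A", b = "B", c = "C";
        AtomicReference<String> plain = new AtomicReference<String>(a);
        AtomicStampedReference<String> stamped = new AtomicStampedReference<String>(a, 0);

        // A "slow" reader remembers what it saw.
        String seen = plain.get();
        int seenStamp = stamped.getStamp();

        // Meanwhile the value goes A -> B -> A; each stamped update bumps the stamp.
        plain.set(b);
        plain.set(a);
        stamped.compareAndSet(a, b, 0, 1);
        stamped.compareAndSet(b, a, 1, 2);

        // The plain CAS cannot tell that anything happened in between.
        boolean plainOk = plain.compareAndSet(seen, c);
        // The stamped CAS fails: the reference matches, but the stamp is now 2, not 0.
        boolean stampedOk = stamped.compareAndSet(seen, c, seenStamp, seenStamp + 1);

        return new boolean[] { plainOk, stampedOk };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println("plain CAS succeeded: " + r[0] + ", stamped CAS succeeded: " + r[1]);
    }
}
```

Whether the undetected change matters depends on the algorithm; for a simple counter it does not, while for linked structures that recycle nodes it can.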

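Atomic variables in java.util.concurrent

Nearly all the classes in the java.util.concurrent package use atomic variables instead of synchronization, either directly or indirectly. Classes like ConcurrentLinkedQueue use atomic variables to implement wait-free algorithms directly, and classes like ConcurrentHashMap use ReentrantLock for locking where needed. ReentrantLock, in turn, uses atomic variables to maintain the queue of threads waiting for the lock.
These classes could not have been built without the JVM improvements in JDK 5.0, which exposed (to the class libraries, but not to user classes) an interface for accessing hardware-level synchronization primitives. The atomic variable classes, and other classes in java.util.concurrent, in turn expose these features to user classes.

Achieving higher throughput with atomic variables

Last month, I looked at how ReentrantLock offers a scalability benefit over synchronization, and constructed a simple, high-contention example benchmark that simulates rolling dice with a pseudorandom number generator. I showed implementations that coordinate through synchronization, ReentrantLock, and fair ReentrantLock, and presented the results. This month, I add another implementation to that benchmark, one that uses AtomicLong to update the PRNG state.
Listing 5 shows the PRNG implementation that uses synchronization and the alternative implementation that uses CAS. Note that the CAS must be executed in a loop, because it may fail one or more times before it succeeds, as is always the case with code that uses CAS.
Listing 5. Implementing a thread-safe PRNG with synchronization and atomic variables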

public class PseudoRandomUsingSynch implements PseudoRandom {
    private int seed;

    public PseudoRandomUsingSynch(int s) { seed = s; }

    public synchronized int nextInt(int n) {
        int s = seed;
        seed = Util.calculateNext(seed);
        return s % n;
    }
}

public class PseudoRandomUsingAtomic implements PseudoRandom {
    private final AtomicInteger seed;

    public PseudoRandomUsingAtomic(int s) {
        seed = new AtomicInteger(s);
    }

    public int nextInt(int n) {
        for (;;) {
            int s = seed.get();
            int nexts = Util.calculateNext(s);
            if (seed.compareAndSet(s, nexts))
                return s % n;
        }
    }
}

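The graphs in Figures 1 and 2 below are similar to those shown last month, with one more line added for the atomic-based approach. They show the throughput (in rolls per second) of random number generation with various numbers of threads on an 8-way Ultrasparc3 and on a single-processor Pentium 4. The thread counts in the tests are deceptive: these threads exhibit far more contention than is typical, so they show the break-even point between ReentrantLock and atomic variables at a much lower number of threads than would be the case in a realistic program. You will see that although ReentrantLock already has a large advantage over synchronization, atomic variables offer a further improvement over ReentrantLock. (Because so little work is done in each unit of work, the graphs below probably understate the scalability benefits of atomic variables compared with ReentrantLock.)

Figure 1. Benchmark throughput for synchronization, ReentrantLock, fair Lock, and AtomicLong on an 8-way Ultrasparc3

Figure 2. Benchmark throughput for synchronization, ReentrantLock, fair Lock, and AtomicLong on a single-processor Pentium 4

Most users are unlikely to develop nonblocking algorithms with atomic variables themselves; they are more likely to use the versions provided in java.util.concurrent, such as ConcurrentLinkedQueue. But in case you are wondering where the performance gain of these classes comes from, compared with their counterparts in earlier JDKs, it comes from the fine-grained, hardware-level concurrency primitives exposed through the atomic variable classes.
Developers can use atomic variables directly as a higher-performance replacement for shared counters, sequence number generators, and other independent shared variables that would otherwise have to be protected by synchronization.

Summary

JDK 5.0 is a huge step forward for the development of high-performance concurrent classes. By exposing new low-level coordination primitives internally and providing a set of public atomic variable classes, it makes it practical, for the first time, to develop wait-free, lock-free algorithms in the Java language. The classes in java.util.concurrent are in turn built on these low-level atomic variable facilities, which gives them a significant scalability advantage over earlier classes that performed similar functions. While you may never use atomic variables directly, you should still cheer for their existence.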


rainman — Thu, 02 Oct 2008 15:12:00 GMT

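The problem we face

Figure 1. Thread scenario

In this figure, a node represents a single Thread and an edge represents an execution step.
The figure as a whole means: when the ROOT thread finishes, thread T1 runs; when T1 finishes, T2 and T3 run concurrently. The two edges from T2 and T3 to T4 mean that T4 can start only after both T2 and T3 have finished. The remaining steps follow the same pattern until END, which marks the end of the whole process. Of course, this is only a simplified sketch; a real thread scenario may involve hundreds of threads. Note also that the whole scenario has exactly one entry point and one exit point. What does that mean? It is explained later in this article.
This involves Java's thread synchronization and mutual exclusion mechanisms: for example, how to make T1 run before T2 and T3, and how to start T4 only after both T2 and T3 have finished.

Describing the model

How can we describe the scenario shown in Figure 1? We can describe the model in XML. I define a "Thread" element to represent a thread.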

<ThreadList>
    <Thread ID = "thread-id" PRETHREAD = "prethread1, prethread2…"></Thread>
    <Thread ID = "thread-id" PRETHREAD = "prethread3, prethread4…"></Thread>
</ThreadList>

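Here ID is the thread's unique identifier, and PRETHREAD lists the IDs of the thread's direct prerequisite threads, separated by commas.
Inside the Thread element you can add the specific information about the task you want the thread to perform.
Describing the model is in fact a very important part of solving the problem: the whole thread scenario can be described in one consistent form and used as the input to the engine of the Java multithreaded concurrency control framework. In other words, the thread execution pattern is described in XML, so changing the XML configuration file is enough to change the whole execution pattern, without touching any source code.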
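As an illustration of how such a description could be consumed, the sketch below reads a ThreadList document into a map from thread ID to prerequisite IDs with the JDK's DOM parser. The class name ThreadListParser, the example() document, and the IDs in it (taken from the Figure 1 scenario) are assumptions; the original article does not show its parsing code.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical parser for the <ThreadList> format described above.
public class ThreadListParser {

    // Maps each thread ID to the list of its direct prerequisite thread IDs.
    public static Map<String, List<String>> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("Thread");
            Map<String, List<String>> result = new LinkedHashMap<String, List<String>>();
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                List<String> pre = new ArrayList<String>();
                // PRETHREAD is a comma-separated list; a missing attribute yields "".
                for (String id : e.getAttribute("PRETHREAD").split(",")) {
                    if (!id.trim().isEmpty()) {
                        pre.add(id.trim());
                    }
                }
                result.put(e.getAttribute("ID"), pre);
            }
            return result;
        } catch (Exception ex) {
            throw new IllegalArgumentException("invalid thread list", ex);
        }
    }

    // Example document for the ROOT/T1/T2/T3/T4 part of the Figure 1 scenario.
    public static Map<String, List<String>> example() {
        return parse("<ThreadList>"
                + "<Thread ID=\"ROOT\" PRETHREAD=\"\"></Thread>"
                + "<Thread ID=\"T1\" PRETHREAD=\"ROOT\"></Thread>"
                + "<Thread ID=\"T2\" PRETHREAD=\"T1\"></Thread>"
                + "<Thread ID=\"T3\" PRETHREAD=\"T1\"></Thread>"
                + "<Thread ID=\"T4\" PRETHREAD=\"T2, T3\"></Thread>"
                + "</ThreadList>");
    }

    public static void main(String[] args) {
        System.out.println(example());
    }
}
```

The resulting map is the dependency graph that either of the two scheduling mechanisms can work from.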

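Two implementation mechanisms

For the Java multithreading framework, we implement two modes, "external" and "internal".

"External": main-thread polling

Figure 2. Static class diagram

Thread is the worker thread. ThreadEntry is a wrapper class for Thread; prerequisite is a HashMap that holds the states of the Thread's prerequisite threads. As Figure 1 shows, T4's prerequisite threads are T2 and T3, so its prerequisite contains the states of T2 and T3. The threadEntryList in TestScenario contains all the ThreadEntry objects.
Figure 3. Thread execution scenario

TestScenario, as the main thread, acts as an "external" monitor that keeps polling the state of every ThreadEntry in threadEntryList. When a ThreadEntry receives an isReady query, it checks its own prerequisite; once all its prerequisite threads are in the "finished normally" state, it returns ready. TestScenario then calls the ThreadEntry's startThread() method to authorize the ThreadEntry to run its thread, and the Thread does its actual work in its run() method. After finishing normally, it calls setPreRequisteState() to update the whole Scenario: in every ThreadEntry in threadEntryList whose prerequisite contains this Thread, the state of this Thread is set to "finished normally".
Figure 4. The state change process

As Figure 1 shows, T4's prerequisite threads are T2 and T3, which run concurrently. As shown in Figure 4, suppose T2 finishes first. It calls setPreRequisteState() to update the whole Scenario: in every ThreadEntry in threadEntryList whose prerequisite contains T2, the state of T2 is set to "finished normally". At this point T2's state in T4's prerequisite is "finished normally", but T3 has not finished, so its state is "not finished". T4's isReady query therefore returns false and T4 does not run. Only after T3 finishes and its state is updated to "finished normally" does T4 become ready and start running.
The other nodes work the same way: when they finish normally they broadcast that fact across the whole scenario, and the main thread keeps polling the state of each ThreadEntry to start the threads.
This is one way to implement the framework for controlling Java threads: a main control thread that polls a state table.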
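The polling mode can be sketched in a few dozen lines. This is a simplified stand-in, not the article's implementation: the Entry class replaces ThreadEntry, a volatile flag replaces the prerequisite HashMap, and all names are invented for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the "external" mode: the calling thread polls until every entry is done.
public class PollingScenarioDemo {

    static class Entry {
        final String id;
        final List<Entry> prerequisites = new ArrayList<Entry>();
        volatile boolean finished;   // written by the worker, read by the poller
        boolean started;             // touched only by the polling thread

        Entry(String id, Entry... pre) {
            this.id = id;
            prerequisites.addAll(Arrays.asList(pre));
        }

        // Ready when every prerequisite has finished normally.
        boolean isReady() {
            for (Entry p : prerequisites) {
                if (!p.finished) return false;
            }
            return true;
        }
    }

    // Runs ROOT -> T1 -> (T2, T3) -> T4 and returns the order in which the workers ran.
    public static List<String> run() {
        final List<String> log = Collections.synchronizedList(new ArrayList<String>());
        Entry root = new Entry("ROOT");
        Entry t1 = new Entry("T1", root);
        Entry t2 = new Entry("T2", t1);
        Entry t3 = new Entry("T3", t1);
        Entry t4 = new Entry("T4", t2, t3);
        List<Entry> entries = Arrays.asList(root, t1, t2, t3, t4);

        int remaining = entries.size();
        while (remaining > 0) {                  // the polling loop; it burns CPU by design
            remaining = 0;
            for (final Entry e : entries) {
                if (!e.finished) remaining++;
                if (!e.started && e.isReady()) {
                    e.started = true;
                    new Thread(new Runnable() {
                        public void run() {
                            log.add(e.id);       // stands in for the thread's real work
                            e.finished = true;   // "finished normally"
                        }
                    }).start();
                }
            }
            Thread.yield();
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

ROOT is always logged first, T1 second, and T4 last; T2 and T3 may appear in either order.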
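Advantages: The conceptual structure is clear and the implementation is simple. It avoids Java's lock mechanism, which lowers the chance of deadlock. When an exception prevents some threads from finishing normally, no threads are left hanging.
Disadvantages: The main-thread polling mechanism consumes CPU time. When the graph has too many nodes (n>??? and each individual thread runs only briefly (t

"Internal": wait & notify

In contrast to the "external" main-thread polling mechanism, the "internal" mode uses a self-controlled, chain-triggered mechanism.
Figure 5. Static class diagram of the lock mechanism

The lock in Thread is the current Thread's own lock. lockList is a HashMap that holds references to the locks of its successor threads, and getLock and setLock operate on the Locks in lockList. One very important member is waitForCount, a reference count that gives the number of prerequisite threads the current thread is still waiting for. For example, T4 in Figure 1 initially waits for the prerequisite threads T2 and T3, so its waitForCount is 2.
Figure 6. Execution sequence diagram of the lock mechanism

When the whole process starts, we start all the threads, but the lock held by each thread is in the wait state, so every thread is waiting. We then notify the root thread's own lock, and the root thread starts running. When root's run method finishes, it looks up the waitForCount of each successor thread and decrements it by one. It then checks waitForCount again: if it equals 0, all the prerequisite threads of that successor have finished, so we notify that thread's lock and the successor goes from the waiting state to the running state. The process then continues recursively, in a chain, until the whole process has finished.
Take T2, T3, and T4 again as an example. During initThreadLock we learn that T4 has two direct prerequisite threads, T2 and T3, so T4's waitForCount is 2. Suppose T3 finishes first while T2 is still running. T3 first walks through all its direct successor threads and decrements their waitForCount by 1. It has only one direct successor, T4, so T4's waitForCount becomes 1, which is not 0; T4's lock is not notified and T4 keeps waiting. When T2 finishes, it performs the same steps as T3. T4's waitForCount now becomes 0, T2 notifies T4's lock, and T4 goes from waiting to running. The other nodes behave in the same way.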
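The waitForCount scheme can be sketched with plain wait/notify. This is a simplified illustration under stated assumptions: the Node class and its method names are invented, each node uses its own monitor as its lock (instead of a separate lock object and lockList), and ROOT needs no explicit notify because its count starts at 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the "internal" mode: each node waits until its waitForCount reaches 0.
public class WaitForCountDemo {

    static class Node implements Runnable {
        final String id;
        final List<Node> successors = new ArrayList<Node>();
        final List<String> log;
        private int waitForCount;          // guarded by this node's monitor

        Node(String id, List<String> log) {
            this.id = id;
            this.log = log;
        }

        // Declares that this node must finish before the given successor may start.
        void before(Node successor) {
            successors.add(successor);
            synchronized (successor) {
                successor.waitForCount++;
            }
        }

        // Called by a prerequisite when it has finished.
        synchronized void prerequisiteFinished() {
            if (--waitForCount == 0) {
                notifyAll();               // wake the waiting node
            }
        }

        public void run() {
            synchronized (this) {
                while (waitForCount > 0) { // loop guards against spurious wakeups
                    try {
                        wait();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                        return;
                    }
                }
            }
            synchronized (log) {
                log.add(id);               // stands in for the node's real work
            }
            for (Node s : successors) {
                s.prerequisiteFinished();  // chain-trigger the successors
            }
        }
    }

    // Runs ROOT -> T1 -> (T2, T3) -> T4 and returns the order in which the nodes ran.
    public static List<String> run() {
        List<String> log = new ArrayList<String>();
        Node root = new Node("ROOT", log);
        Node t1 = new Node("T1", log);
        Node t2 = new Node("T2", log);
        Node t3 = new Node("T3", log);
        Node t4 = new Node("T4", log);
        root.before(t1);
        t1.before(t2);
        t1.before(t3);
        t2.before(t4);
        t3.before(t4);

        List<Thread> threads = new ArrayList<Thread>();
        for (Node n : new Node[] { t4, t3, t2, t1, root }) {   // start order does not matter
            Thread t = new Thread(n);
            threads.add(t);
            t.start();
        }
        for (Thread t : threads) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

As the disadvantages below point out, a node that dies before calling prerequisiteFinished() would leave its successors waiting forever; the sketch does not handle that case.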
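Of course, we could also put the information about the whole process into a separate global object that all threads consult to get what they need, instead of using this distributed storage approach.
Advantages: Because it uses wait & notify rather than polling, it wastes no CPU resources and executes more efficiently. It is also more responsive than the "external" main-thread polling mechanism.
Disadvantages: It relies on the lock mechanism of Java Objects, which is more complicated to implement. And because triggering is chained, if some thread fails, all its successor threads hang and the whole scenario fails. To prevent this, we also have to build a thread-monitoring mechanism to make sure the scenario runs normally.

Extension

The figure below expresses a recursive, iterative idea. For example, as shown in Figure 7, the node T1 represents a thread. Now forget the notion of a thread and abstract T1 as a process; imagine it as a galaxy. Looking inside T1, it is itself a collection of many subprocesses, and the relationships among those subprocesses can, just as in Figure 1, be represented by a graph.
Figure 7. Nested subprocesses

Imagine what kind of framework this is: a process framework with unlimited extensibility. We only define the relationships between the processes and need not care how each process runs. In practice, a concrete piece of work can be assigned to a leaf node, such as reading a file, submitting a JCL job, or executing an SQL statement.
In fact, by following some traversal rule, this nested recursive structure could be converted into a flat, single-layer graph instead of the original layered network structure. We do not do so for the following reasons:

Doing so would produce too many nodes and too many edges, making the graph bewildering to look at.
More importantly, each scenario (such as T1 or T13 in Figure 7) is a unit of aggregated state, with high reusability and reliability.
The framework is highly abstract. Its actual execution can be distributed, and a unit can be a whole system, marking the boundary with other systems.

In effect, this is a hierarchical control framework of aggregated state, which we can rely on to perform autonomic computing. We will discuss its applications in other articles.

Summary

This article introduced a framework for concurrency control of Java threads and presented two implementation models. Each has its own strengths, weaknesses, and range of application, and they can serve as a reference when you need to control the concurrency of Java threads.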


About the author

Chen Wei is a graduate of Huazhong University of Science and Technology and a Software Engineer at IBM CSDL, on the DB2 for z/OS team. Contact: chenwbj@cn.ibm.com


Synchronization and the Java Memory Model

rainman — Tue, 30 Sep 2008 08:55:00 GMT

This set of excerpts from section 2.2 includes the main discussions on how the Java Memory Model impacts concurrent programming.
For information about ongoing work on the memory model, see Bill Pugh's Java Memory Model pages.

Consider the tiny class, defined without any synchronization:

final class SetCheck {
    private int a = 0;
    private long b = 0;

    void set() {
        a = 1;
        b = -1;
    }

    boolean check() {
        return ((b == 0) || (b == -1 && a == 1));
    }
}
In a purely sequential language, the method check could never return false. This holds even though compilers, run-time systems, and hardware might process this code in a way that you might not intuitively expect. For example, any of the following might apply to the execution of method set:

The compiler may rearrange the order of the statements, so b may be assigned before a. If the method is inlined, the compiler may further rearrange the orders with respect to yet other statements.

The processor may rearrange the execution order of machine instructions corresponding to the statements, or even execute them at the same time.

The memory system (as governed by cache control units) may rearrange the order in which writes are committed to memory cells corresponding to the variables. These writes may overlap with other computations and memory actions.

The compiler, processor, and/or memory system may interleave the machine-level effects of the two statements. For example on a 32-bit machine, the high-order word of b may be written first, followed by the write to a, followed by the write to the low-order word of b.

The compiler, processor, and/or memory system may cause the memory cells representing the variables not to be updated until sometime after (if ever) a subsequent check is called, but instead to maintain the corresponding values (for example in CPU registers) in such a way that the code still has the intended effect.

In a sequential language, none of this can matter so long as program execution obeys as-if-serial semantics. Sequential programs cannot depend on the internal processing details of statements within simple code blocks, so they are free to be manipulated in all these ways. This provides essential flexibility for compilers and machines. Exploitation of such opportunities (via pipelined superscalar CPUs, multilevel caches, load/store balancing, interprocedural register allocation, and so on) is responsible for a significant amount of the massive improvements in execution speed seen in computing over the past decade. The as-if-serial property of these manipulations shields sequential programmers from needing to know if or how they take place. Programmers who never create their own threads are almost never impacted by these issues.
Things are different in concurrent programming. Here, it is entirely possible for check to be called in one thread while set is being executed in another, in which case the check might be "spying" on the optimized execution of set. And if any of the above manipulations occur, it is possible for check to return false. For example, as detailed below, check could read a value for the long b that is neither 0 nor -1, but instead a half-written in-between value. Also, out-of-order execution of the statements in set may cause check to read b as -1 but then read a as still 0.

In other words, not only may concurrent executions be interleaved, but they may also be reordered and otherwise manipulated in an optimized form that bears little resemblance to their source code. As compiler and run-time technology matures and multiprocessors become more prevalent, such phenomena become more common. They can lead to surprising results for programmers with backgrounds in sequential programming (in other words, just about all programmers) who have never been exposed to the underlying execution properties of allegedly sequential code. This can be the source of subtle concurrent programming errors.

In almost all cases, there is an obvious, simple way to avoid contemplation of all the complexities arising in concurrent programs due to optimized execution mechanics: Use synchronization. For example, if both methods in class SetCheck are declared as synchronized, then you can be sure that no internal processing details can affect the intended outcome of this code.
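As a minimal sketch of that fix, the class below is SetCheck with both methods declared synchronized, plus a small driver that calls check() repeatedly while another thread runs set(). The class name SynchronizedSetCheck and the run() helper are additions for this illustration.

```java
// Hypothetical variant of SetCheck: both methods synchronized on the same object.
public final class SynchronizedSetCheck {
    private int a = 0;
    private long b = 0;

    synchronized void set() {
        a = 1;
        b = -1;
    }

    synchronized boolean check() {
        return ((b == 0) || (b == -1 && a == 1));
    }

    // Calls check() many times while another thread executes set();
    // returns true if no call ever observed an inconsistent state.
    public static boolean run() {
        final SynchronizedSetCheck sc = new SynchronizedSetCheck();
        Thread setter = new Thread(new Runnable() {
            public void run() {
                sc.set();
            }
        });
        setter.start();
        boolean ok = true;
        for (int i = 0; i < 100000; i++) {
            ok &= sc.check();      // sees either the state before set() or the state after it
        }
        try {
            setter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return ok && sc.check();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Because set() and check() use the same lock, check() can never run in the middle of set(), and the lock release and acquire make the writes visible in order.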

But sometimes you cannot or do not want to use synchronization. Or perhaps you must reason about someone else's code that does not use it. In these cases you must rely on the minimal guarantees about resulting semantics spelled out by the Java Memory Model. This model allows the kinds of manipulations listed above, but bounds their potential effects on execution semantics and additionally points to some techniques programmers can use to control some aspects of these semantics (most of which are discussed in �K?).

The Java Memory Model is part of The Java™ Language Specification, described primarily in JLS chapter 17. Here, we discuss only the basic motivation, properties, and programming consequences of the model. The treatment here reflects a few clarifications and updates that are missing from the first edition of JLS.

The assumptions underlying the model can be viewed as an idealization of a standard SMP machine of the sort described in �K?.4:

For purposes of the model, every thread can be thought of as running on a different CPU from any other thread. Even on multiprocessors, this is infrequent in practice, but the fact that this CPU-per-thread mapping is among the legal ways to implement threads accounts for some of the model's initially surprising properties. For example, because CPUs hold registers that cannot be directly accessed by other CPUs, the model must allow for cases in which one thread does not know about values being manipulated by another thread. However, the impact of the model is by no means restricted to multiprocessors. The actions of compilers and processors can lead to identical concerns even on single-CPU systems.

The model does not specifically address whether the kinds of execution tactics discussed above are performed by compilers, CPUs, cache controllers, or any other mechanism. It does not even discuss them in terms of classes, objects, and methods familiar to programmers. Instead, the model defines an abstract relation between threads and main memory. Every thread is defined to have a working memory (an abstraction of caches and registers) in which to store values. The model guarantees a few properties surrounding the interactions of instruction sequences corresponding to methods and memory cells corresponding to fields. Most rules are phrased in terms of when values must be transferred between the main memory and per-thread working memory. The rules address three intertwined issues:

Atomicity
Which instructions must have indivisible effects. For purposes of the model, these rules need to be stated only for simple reads and writes of memory cells representing fields - instance and static variables, also including array elements, but not including local variables inside methods.
Visibility
Under what conditions the effects of one thread are visible to another. The effects of interest here are writes to fields, as seen via reads of those fields.
Ordering
Under what conditions the effects of operations can appear out of order to any given thread. The main ordering issues surround reads and writes associated with sequences of assignment statements.
When synchronization is used consistently, each of these properties has a simple characterization: All changes made in one synchronized method or block are atomic and visible with respect to other synchronized methods and blocks employing the same lock, and processing of synchronized methods or blocks within any given thread is in program-specified order. Even though processing of statements within blocks may be out of order, this cannot matter to other threads employing synchronization.
When synchronization is not used or is used inconsistently, answers become more complex. The guarantees made by the memory model are weaker than most programmers intuitively expect, and are also weaker than those typically provided on any given JVM implementation. This imposes additional obligations on programmers attempting to ensure the object consistency relations that lie at the heart of exclusion practices: Objects must maintain invariants as seen by all threads that rely on them, not just by the thread performing any given state modification.

The most important rules and properties specified by the model are discussed below.

Atomicity
Accesses and updates to the memory cells corresponding to fields of any type except long or double are guaranteed to be atomic. This includes fields serving as references to other objects. Additionally, atomicity extends to volatile long and double. (Even though non-volatile longs and doubles are not guaranteed atomic, they are of course allowed to be.)
Atomicity guarantees ensure that when a non-long/double field is used in an expression, you will obtain either its initial value or some value that was written by some thread, but not some jumble of bits resulting from two or more threads both trying to write values at the same time. However, as seen below, atomicity alone does not guarantee that you will get the value most recently written by any thread. For this reason, atomicity guarantees per se normally have little impact on concurrent program design.

Visibility
Changes to fields made by one thread are guaranteed to be visible to other threads only under the following conditions:

A writing thread releases a synchronization lock and a reading thread subsequently acquires that same synchronization lock.
In essence, releasing a lock forces a flush of all writes from working memory employed by the thread, and acquiring a lock forces a (re)load of the values of accessible fields. While lock actions provide exclusion only for the operations performed within a synchronized method or block, these memory effects are defined to cover all fields used by the thread performing the action.

Note the double meaning of synchronized: it deals with locks that permit higher-level synchronization protocols, while at the same time dealing with the memory system (sometimes via low-level memory barrier machine instructions) to keep value representations in synch across threads. This reflects one way in which concurrent programming bears more similarity to distributed programming than to sequential programming. The latter sense of synchronized may be viewed as a mechanism by which a method running in one thread indicates that it is willing to send and/or receive changes to variables to and from methods running in other threads. From this point of view, using locks and passing messages might be seen merely as syntactic variants of each other.

If a field is declared as volatile, any value written to it is flushed and made visible by the writer thread before the writer thread performs any further memory operation (i.e., for the purposes at hand it is flushed immediately). Reader threads must reload the values of volatile fields upon each access.

The first time a thread accesses a field of an object, it sees either the initial value of the field or a value since written by some other thread.
Among other consequences, it is bad practice to make available the reference to an incompletely constructed object (see �K?.2). It can also be risky to start new threads inside a constructor, especially in a class that may be subclassed. Thread.start has the same memory effects as a lock release by the thread calling start, followed by a lock acquire by the started thread. If a Runnable superclass invokes new Thread(this).start() before subclass constructors execute, then the object might not be fully initialized when the run method executes. Similarly, if you create and start a new thread T and then create an object X used by thread T, you cannot be sure that the fields of X will be visible to T unless you employ synchronization surrounding all references to object X. Or, when applicable, you can create X before starting T.

As a thread terminates, all written variables are flushed to main memory. For example, if one thread synchronizes on the termination of another thread using Thread.join, then it is guaranteed to see the effects made by that thread (see �K?.2).
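The start and join guarantees described above can be sketched as follows. This is only an illustration under stated assumptions: the `Message` and `LengthTask` classes are hypothetical names invented here, not taken from the source text. The shared object is fully populated before the thread is started, the thread is started from a separate `start()` method rather than from a constructor, and the caller reads the result only after `join()` returns.

```java
// Hypothetical classes for illustration only; the names are not from the source text.
final class Message {
    String text;     // plain (non-volatile) fields on purpose
    int length;
}

final class LengthTask implements Runnable {
    private final Message message;
    private Thread thread;

    LengthTask(Message message) {
        // The constructor only stores state; it does not start a thread,
        // so no running thread can observe a half-constructed object.
        this.message = message;
    }

    void start() {
        thread = new Thread(this);
        thread.start();           // writes made before start() are visible inside run()
    }

    public void run() {
        message.length = message.text.length();   // worker writes a plain field
    }

    void await() {
        try {
            thread.join();        // once join() returns, the worker's writes are visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
    }
}
```

A caller would fill in `Message.text`, construct the task, call `start()`, then `await()`, and only then read `Message.length`.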

Note that visibility problems never arise when passing references to objects across methods in the same thread.
The memory model guarantees that, given the eventual occurrence of the above operations, a particular update to a particular field made by one thread will eventually be visible to another. But eventually can be an arbitrarily long time. Long stretches of code in threads that use no synchronization can be hopelessly out of synch with other threads with respect to values of fields. In particular, it is always wrong to write loops waiting for values written by other threads unless the fields are volatile or accessed via synchronization (see �K?.6).
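As a minimal sketch of the waiting-loop rule above, the following hypothetical `Spinner` class (an illustrative name, not from the source) polls a flag written by another thread. The flag is declared volatile; without that, or without synchronization, the loop would not be guaranteed to ever observe the update.

```java
// Hypothetical example; class and method names are illustrative assumptions.
final class Spinner implements Runnable {
    private volatile boolean done;   // volatile: every loop iteration re-reads the field

    public void run() {
        while (!done) {
            Thread.yield();          // polite busy-wait; real code would usually block instead
        }
    }

    void finish() { done = true; }   // the write becomes visible to the spinning thread
}

final class SpinnerDemo {
    // Returns true if the spinning thread noticed the flag and terminated.
    static boolean stopsAfterFinish() {
        Spinner spinner = new Spinner();
        Thread t = new Thread(spinner);
        t.start();
        spinner.finish();
        try {
            t.join(5000);            // bounded wait so a mistake cannot hang the caller
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !t.isAlive();
    }
}
```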

The model also allows inconsistent visibility in the absence of synchronization. For example, it is possible to obtain a fresh value for one field of an object, but a stale value for another. Similarly, it is possible to read a fresh, updated value of a reference variable, but a stale value of one of the fields of the object now being referenced.

However, the rules do not require visibility failures across threads; they merely allow these failures to occur. This is one aspect of the fact that not using synchronization in multithreaded code doesn't guarantee safety violations; it just allows them. On most current JVM implementations and platforms, even those employing multiple processors, detectable visibility failures rarely occur. The use of common caches across threads sharing a CPU, the lack of aggressive compiler-based optimizations, and the presence of strong cache consistency hardware often cause values to act as if they propagate immediately among threads. This makes testing for freedom from visibility-based errors impractical, since such errors might occur extremely rarely, or only on platforms you do not have access to, or only on those that have not even been built yet. These same comments apply to multithreaded safety failures more generally. Concurrent programs that do not use synchronization fail for many reasons, including memory consistency problems.

Ordering
Ordering rules fall under two cases, within-thread and between-thread:

From the point of view of the thread performing the actions in a method, instructions proceed in the normal as-if-serial manner that applies in sequential programming languages.

From the point of view of other threads that might be "spying" on this thread by concurrently running unsynchronized methods, almost anything can happen. The only useful constraint is that the relative orderings of synchronized methods and blocks, as well as operations on volatile fields, are always preserved.

Again, these are only the minimal guaranteed properties. In any given program or platform, you may find stricter orderings. But you cannot rely on them, and you may find it difficult to test for code that would fail on JVM implementations that have different properties but still conform to the rules.
Note that the within-thread point of view is implicitly adopted in all other discussions of semantics in JLS. For example, arithmetic expression evaluation is performed in left-to-right order (JLS section 15.6) as viewed by the thread performing the operations, but not necessarily as viewed by other threads.

The within-thread as-if-serial property is helpful only when only one thread at a time is manipulating variables, due to synchronization, structural exclusion, or pure chance. When multiple threads are all running unsynchronized code that reads and writes common fields, then arbitrary interleavings, atomicity failures, race conditions, and visibility failures may result in execution patterns that make the notion of as-if-serial just about meaningless with respect to any given thread.

Even though JLS addresses some particular legal and illegal reorderings that can occur, interactions with these other issues reduce practical guarantees to saying that the results may reflect just about any possible interleaving of just about any possible reordering. So there is no point in trying to reason about the ordering properties of such code.

Volatile
In terms of atomicity, visibility, and ordering, declaring a field as volatile is nearly identical in effect to using a little fully synchronized class protecting only that field via get/set methods, as in:
final class VFloat {
  private float value;
  final synchronized void set(float f) { value = f; }
  final synchronized float get() { return value; }
}
Declaring a field as volatile differs only in that no locking is involved. In particular, composite read/write operations such as the "++" operation on volatile variables are not performed atomically.
Also, ordering and visibility effects surround only the single access or update to the volatile field itself. Declaring a reference field as volatile does not ensure visibility of non-volatile fields that are accessed via this reference. Similarly, declaring an array field as volatile does not ensure visibility of its elements. Volatility cannot be manually propagated for arrays because array elements themselves cannot be declared as volatile.
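Because "++" on a volatile field is a separate read followed by a separate write, concurrent increments can be lost. One possible alternative, sketched below, swaps in `java.util.concurrent.atomic.AtomicInteger`. That class is not mentioned in the text above, and the sketch assumes Java 5 or later; the `HitCounter` name is hypothetical. A small fully synchronized class in the style of VFloat would work as well.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical counter; the class names are illustrative assumptions.
final class HitCounter {
    // With "volatile int hits; hits++" two threads could read the same old value
    // and both write back old + 1, losing one increment.
    private final AtomicInteger hits = new AtomicInteger();

    void hit() { hits.incrementAndGet(); }   // one atomic read-modify-write
    int total() { return hits.get(); }
}

final class HitCounterDemo {
    // Runs two threads that each record perThread hits and returns the final total.
    static int countFromTwoThreads(int perThread) {
        HitCounter counter = new HitCounter();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                counter.hit();
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return counter.total();
    }
}
```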

Because no locking is involved, declaring fields as volatile is likely to be cheaper than using synchronization, or at least no more expensive. However, if volatile fields are accessed frequently inside methods, their use is likely to lead to slower performance than would locking the entire methods.

Declaring fields as volatile can be useful when you do not need locking for any other reason, yet values must be accurately accessible across multiple threads. This may occur when:

The field need not obey any invariants with respect to others.

Writes to the field do not depend on its current value.

No thread ever writes an illegal value with respect to intended semantics.

The actions of readers do not depend on values of other non-volatile fields.

Using volatile fields can make sense when it is somehow known that only one thread can change a field, but many other threads are allowed to read it at any time. For example, a Thermometer class might declare its temperature field as volatile. As discussed in �K?.2, a volatile can be useful as a completion flag. Additional examples are illustrated in �K?, where the use of lightweight executable frameworks automates some aspects of synchronization, but volatile declarations are needed to ensure that result field values are visible across tasks.
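The Thermometer case mentioned above might look like the sketch below. Only the class name and the idea of a volatile temperature field come from the text; the method names and the choice of `double` are assumptions made for illustration.

```java
// Sketch of the single-writer, many-readers case; method names are assumptions.
final class Thermometer {
    // Written by one sensor thread, read by any number of other threads.
    private volatile double temperature;

    // The new value does not depend on the old one, so no lock is needed.
    void update(double reading) { temperature = reading; }

    // Readers see the most recently written reading.
    double current() { return temperature; }
}
```

If a second field had to stay consistent with the temperature (say, a timestamp), this approach would no longer satisfy the conditions listed above, and locking would be the safer choice.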

Posted by rainman, 2008-09-30 16:55

Threading lightly: Synchronization is not the enemy (reposted from IBM developerWorks)

rainman — Tue, 30 Sep 2008 05:31:00 GMT

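The language specifications of most programming languages do not address threading and concurrency; these issues have traditionally been left to the platform or operating system to spell out. The Java Language Specification (JLS), however, explicitly includes a threading model and provides several language elements that developers can use to make their programs thread-safe.
Explicit support for threading is a mixed blessing. It makes it easier for us to write programs that exploit the power and convenience of threads, but it also means that we have to pay attention to the thread safety of the classes we write, because any class is quite likely to be used in a multithreaded environment.
Many users first find that they have to understand threading not because they are writing programs that create and manage threads, but because they are using a tool or framework that is itself multithreaded. Any developer who has used the Swing GUI framework or written a servlet or JSP page has (knowingly or not) been exposed to the complexities of threading.
The Java architects wanted to create a language that would run well on modern hardware, including multiprocessor systems. To achieve this, most of the work of coordinating threads was pushed onto the software developer; the programmer must specify where data is shared between threads. In Java programs, the primary tool for managing coordination between threads is the synchronized keyword. In the absence of synchronization, the JVM is free to take considerable liberty with the timing and ordering of operations executed in different threads. Most of the time this is exactly what we want, because it improves performance, but it also places an extra burden on programmers, who must identify for themselves when this performance gain would compromise program correctness.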
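What does synchronized really mean?

Most Java programmers think of a synchronized block or method entirely in terms of enforcing a mutex (mutual exclusion semaphore) or defining a critical section (a block of code that must execute atomically). The semantics of synchronized do include mutual exclusion and atomicity, but what happens before monitor entry and after monitor exit is considerably more complicated.
The semantics of synchronized do guarantee that only one thread at a time can access the protected section, but they also include rules about how the synchronizing thread interacts with main memory. A good way to understand the Java Memory Model (JMM) is to picture each thread as running on a separate processor: all processors access the same main memory space, and each has its own cache, but these caches may not always be in sync with main memory. In the absence of synchronization, the JMM allows two threads to see different values at the same memory address. When synchronizing on a monitor (lock), the JMM requires that the cache be invalidated immediately after the lock is acquired and flushed (writing modified memory locations back to main memory) before the lock is released. It is not hard to see why synchronization can affect program performance so much; flushing the cache frequently can be expensive.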


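Walking a fine line

The consequences of improper synchronization are severe: data corruption and race conditions, which can cause a program to crash, produce incorrect results, or behave unpredictably. Worse, these conditions may occur only rarely and sporadically, which makes the problem hard to detect and reproduce. If the test environment differs substantially from the development environment, whether in configuration or in load, these problems may not show up in the test environment at all, leading to the mistaken conclusion that our program is correct when in fact the problems simply have not occurred yet.

Race condition defined

A race condition is a situation in which two or more threads or processes read or write some shared data, and the final result depends on the timing of how those threads are scheduled. Race conditions can lead to unpredictable results and subtle program bugs.

On the other hand, using synchronization inappropriately or excessively leads to other problems, such as poor performance and deadlock. Poor performance is certainly less serious than data corruption, but it is still a serious problem and should not be ignored. Writing good multithreaded programs means walking a fine line: synchronize enough to protect your data from corruption, but not so much that you risk deadlock or needlessly impair program performance.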


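How expensive is synchronization?

Because it involves flushing and invalidating caches, a synchronized block in the Java language is generally more expensive than the critical section facilities offered by many platforms, which are usually implemented with an atomic "test and set bit" machine instruction. Even when a program consists of a single thread running on a single processor, a synchronized method call is still slower than an unsynchronized one. If the synchronization also involves lock contention, the performance cost is much greater, because several thread switches and system calls are required.
Fortunately, continuous improvements with each JVM release have both raised the overall performance of Java programs and reduced the relative cost of synchronization, and further improvements are likely. Moreover, the performance cost of synchronization is often overstated. One well-known source has claimed that a synchronized method call is 50 times slower than an unsynchronized one. While that statement may be true, it is also misleading, and it has led many developers to avoid synchronization even where it is needed.
It makes little sense to express the performance penalty of synchronization strictly as a percentage, because an uncontended synchronization imposes a fixed cost on a block or method. The percentage penalty implied by that fixed delay depends on how much work is done inside the synchronized block. A synchronized call to an empty method may be 20 times slower than an unsynchronized call to an empty method, but how often do we call empty methods? When we measure the penalty using more representative small methods, the percentages quickly drop to a tolerable range.
Table 1 puts some of these numbers in perspective. It lists the penalty of a synchronized method call relative to an unsynchronized one in several cases, on several platforms and JVMs. In each case, I ran a simple program that measured the run time of a loop calling a method 10,000,000 times. I called both a synchronized and an unsynchronized version and compared the results. The figures in the table are the ratio of the run time of the synchronized version to that of the unsynchronized version; they show the performance penalty of synchronization. Each run called one of the simple methods in Listing 1.
Table 1 shows the performance of a synchronized method call relative to an unsynchronized one; to measure the penalty in absolute terms, you must also take into account the speed improvements in the JVM, which are not reflected in the data. In most tests, each newer JVM version substantially improved overall JVM performance, and the 1.4 Java virtual machine will quite likely improve performance further when it is released.
Table 1. Performance penalty of uncontended synchronization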

JDK                         staticEmpty  empty  fetch  hashmapGet  singleton  create
Linux / JDK 1.1             9.2          2.4    2.5    n/a         2.0        1.42
Linux / IBM Java SDK 1.1    33.9         18.4   14.1   n/a         6.9        1.2
Linux / JDK 1.2             2.5          2.2    2.2    1.64        2.2        1.4
Linux / JDK 1.3 (no JIT)    2.52         2.58   2.02   1.44        1.4        1.1
Linux / JDK 1.3 -server     28.9         21.0   39.0   1.87        9.0        2.3
Linux / JDK 1.3 -client     21.2         4.2    4.3    1.7         5.2        2.1
Linux / IBM Java SDK 1.3    8.2          33.4   33.4   1.7         20.7       35.3
Linux / gcj 3.0             2.1          3.6    3.3    1.2         2.4        2.1
Solaris / JDK 1.1           38.6         20.1   12.8   n/a         11.8       2.1
Solaris / JDK 1.2           39.2         8.6    5.0    1.4         3.1        3.1
Solaris / JDK 1.3 (no JIT)  2.0          1.8    1.8    1.0         1.2        1.1
Solaris / JDK 1.3 -client   19.8         1.5    1.1    1.3         2.1        1.7
Solaris / JDK 1.3 -server   1.8          2.3    53.0   1.3         4.2        3.2

Listing 1. Simple methods used in the benchmark

public static void staticEmpty() { }
public void empty() { }
public Object fetch() { return field; }
public Object singleton() {
    if (singletonField == null)
        singletonField = new Object();
    return singletonField;
}
public Object hashmapGet() {
    return hashMap.get("this");
}
public Object create() {
    return new Object();
}
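The article does not show the program that produced the timings, so the following is only a guess at its general shape: it times a loop of calls to a synchronized and an unsynchronized variant of the fetch method and reports the ratio. The `SyncBenchmark` class, its method names, the warm-up pass, and the use of `System.nanoTime()` (Java 5 and later) are all assumptions made here. On a modern JIT the loop may be optimized heavily, so the results are not comparable to the table.

```java
// Hypothetical timing harness; not the author's original benchmark program.
final class SyncBenchmark {
    private final Object field = new Object();

    Object fetch() { return field; }                    // unsynchronized variant
    synchronized Object fetchSync() { return field; }   // synchronized variant

    // Elapsed nanoseconds for `calls` unsynchronized invocations.
    long timePlain(int calls) {
        Object sink = null;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink = fetch();
        }
        long elapsed = System.nanoTime() - start;
        if (calls > 0 && sink == null) throw new IllegalStateException("fetch returned null");
        return elapsed;
    }

    // Elapsed nanoseconds for `calls` synchronized invocations.
    long timeSync(int calls) {
        Object sink = null;
        long start = System.nanoTime();
        for (int i = 0; i < calls; i++) {
            sink = fetchSync();
        }
        long elapsed = System.nanoTime() - start;
        if (calls > 0 && sink == null) throw new IllegalStateException("fetchSync returned null");
        return elapsed;
    }

    // Ratio of synchronized to unsynchronized run time, the quantity reported in the table.
    static double ratio(int calls) {
        SyncBenchmark b = new SyncBenchmark();
        b.timePlain(calls);                              // warm-up pass for both variants
        b.timeSync(calls);
        long plain = b.timePlain(calls);
        long sync = b.timeSync(calls);
        return (double) sync / Math.max(plain, 1L);      // guard against a zero denominator
    }
}
```

Calling `SyncBenchmark.ratio(10000000)` would mirror the call count the article describes; the number it returns will vary from run to run and from JVM to JVM.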

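These small benchmarks also illustrate the challenges of interpreting performance results in the presence of dynamic compilers. The dramatic differences in the numbers for JDK 1.3 with and without the JIT call for some explanation. For the very simple methods (empty and fetch), the nature of the benchmark (it does nothing but execute a tight loop that does almost no work) allows the JIT to compile the entire loop dynamically, squeezing the run time down to almost nothing. Whether the JIT could do so in a real program depends on many factors, so the non-JIT timings are probably more useful for a fair comparison. In any case, for the more substantial methods (create and hashmapGet), the JIT could not improve the unsynchronized case as dramatically as it did for the simpler methods. It is also impossible to tell from the data whether the JVM was able to optimize away significant parts of the test. Similarly, the difference between the comparable IBM and Sun JDKs reflects the IBM Java SDK's ability to optimize the unsynchronized loop more aggressively, not a more expensive synchronized version. This was evident in the raw timing data (not presented here).
From these numbers we can conclude that uncontended synchronization still carries a performance penalty, but for many methods that are not trivially small the penalty falls to a reasonable level; in most cases it is somewhere between 10% and 200% (a relatively small number). So while synchronizing every method is unwise (it also increases the likelihood of deadlock), we need not be so afraid of synchronization either. The simple tests used here indicate that an uncontended synchronization costs less than creating an object or looking up a HashMap entry.
Because early books and articles implied that uncontended synchronization carried a huge performance cost, many programmers went to great lengths to avoid synchronization. This fear led to many problematic techniques, such as double-checked locking (DCL). Many books and articles on Java programming recommend DCL, and it looks like a clever way to avoid unnecessary synchronization, but in fact it does not work and should be avoided. The reasons DCL fails are complicated and beyond the scope of this article (for details, see the links in Resources).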
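For lazy initialization, the straightforward alternative to DCL is simply to synchronize the accessor and accept the (usually uncontended) lock on every call. The sketch below assumes a hypothetical `Registry` class; it is an illustration, not code from the article. As a side note, under the revised memory model in Java 5 and later DCL can be made correct by declaring the field volatile, but that goes beyond what the article covers.

```java
// Hypothetical lazily created singleton; the class name is an illustrative assumption.
final class Registry {
    private static Registry instance;   // guarded by the Registry.class lock

    private Registry() { }

    // Every call acquires the class lock; there is no unsynchronized fast path,
    // so no thread can observe a partially constructed Registry.
    static synchronized Registry getInstance() {
        if (instance == null) {
            instance = new Registry();
        }
        return instance;
    }
}
```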


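Don't be contentious

Assuming synchronization is used correctly, its real performance impact is felt when threads actually contend for a lock. The cost of an uncontended synchronization and that of a contended one differ greatly; a simple test program indicated that a contended synchronization is 50 times slower than an uncontended one. Combining this fact with the observations drawn above, we can see that a contended synchronization costs at least as much as creating 50 objects.
So, when tuning an application's use of synchronization, we should work to reduce the amount of actual contention rather than simply trying to avoid synchronization altogether. Part 2 of this series will focus on techniques for reducing contention, including reducing lock granularity, reducing the size of synchronized blocks, and reducing the amount of data shared between threads.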


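When do I need to synchronize?

To make your programs thread-safe, you must first identify which data will be shared between threads. If you are writing data that another thread may later read, or reading data that another thread may already have written, then that data is shared and access to it must be synchronized. Some programmers may be surprised to find that these rules apply even when simply checking whether a shared reference is non-null.
Many people find these definitions surprisingly strict. It is commonly believed that you do not need to acquire a lock just to read an object's fields, especially since the JLS guarantees that 32-bit reads are atomic. Unfortunately, this belief is wrong. Unless the fields in question are declared volatile, the JMM does not require the underlying platform to provide cache coherency or sequential consistency across processors, so on some platforms it is quite possible to read stale data without synchronization. For more detailed information, see Resources.
After identifying the data to be shared, you must also decide how to protect it. In simple cases, you can protect data fields simply by declaring them volatile; in other cases, you must acquire a lock before reading or writing shared data. It is good practice to state explicitly which lock protects a given field or object, and to document that in your code.
It is also worth noting that simply synchronizing accessor methods (or declaring the underlying field volatile) may not be enough to protect a shared field. Consider the following example: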

private int foo;
public synchronized int getFoo() { return foo; }
public synchronized void setFoo(int f) { foo = f; }

If a caller wants to increment the foo property, the following code for doing so is not thread-safe:

setFoo(getFoo() + 1);

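If two threads try to increment foo at the same time, the value of foo may end up increased by 1 or by 2, depending on timing. The caller needs to synchronize on a lock to prevent this race condition; a good practice is to specify in the class's JavaDoc which lock to synchronize on, so that callers of the class do not have to guess.
This situation is a good example of why we should pay attention to data integrity at multiple levels of granularity. Synchronizing the accessor methods ensures that callers see a consistent and up-to-date value of the property, but if we want future values of the property to be consistent with current values, or multiple properties to be consistent with one another, we must also synchronize composite operations, possibly on a coarser-grained lock.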
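One way to make the increment safe, sketched under the assumption that the accessors lock on the holder object itself (which is what synchronized instance methods do), is for the caller to hold that same lock around the whole read-then-write. The `FooHolder` and `FooClient` names are hypothetical wrappers around the accessors shown earlier.

```java
// Hypothetical wrapper classes around the getFoo/setFoo accessors from the example.
final class FooHolder {
    private int foo;   // guarded by this FooHolder's intrinsic lock

    public synchronized int getFoo() { return foo; }
    public synchronized void setFoo(int f) { foo = f; }
}

final class FooClient {
    // The compound operation holds the same lock the accessors use (the lock is
    // reentrant), so no other thread can run between getFoo() and setFoo().
    static void increment(FooHolder holder) {
        synchronized (holder) {
            holder.setFoo(holder.getFoo() + 1);
        }
    }

    // Runs two threads that each increment perThread times and returns the final value.
    static int incrementFromTwoThreads(int perThread) {
        FooHolder holder = new FooHolder();
        Runnable work = () -> {
            for (int i = 0; i < perThread; i++) {
                increment(holder);
            }
        };
        Thread a = new Thread(work);
        Thread b = new Thread(work);
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return holder.getFoo();
    }
}
```

An alternative design would put a synchronized `increment()` method on the holder class itself, which spares callers from having to know which lock to take.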


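When in doubt, consider a synchronized wrapper

Sometimes, when writing a class, we do not know whether it will be used in a shared environment. We want our class to be thread-safe, but we do not want to burden a class that is always used in a single-threaded environment with the overhead of synchronization, and we may not know what lock granularity is appropriate for the class's users. Fortunately, we can have it both ways by providing synchronized wrappers. The Collections classes are a good example of this technique: they are unsynchronized, but each interface defined in the framework has a synchronized wrapper (for example, Collections.synchronizedMap()) that wraps each method in a synchronized version.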
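A brief sketch of the wrapper approach with the standard `Collections.synchronizedMap()` method follows. The `WrapperDemo` class and its method are hypothetical names for illustration; the two threads write disjoint key ranges so the expected size is easy to check.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; only Collections.synchronizedMap() comes from the text.
final class WrapperDemo {
    // Fills one shared map from two threads and returns its final size.
    static int fillFromTwoThreads(int perThread) {
        // Single-threaded code could use the plain HashMap directly;
        // shared code wraps it so that each method call is synchronized.
        Map<Integer, String> shared =
                Collections.synchronizedMap(new HashMap<Integer, String>());

        Thread a = new Thread(() -> {
            for (int i = 0; i < perThread; i++) shared.put(i, "a");
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < perThread; i++) shared.put(perThread + i, "b");
        });
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while joining", e);
        }
        return shared.size();
    }
}
```

The wrapper protects individual calls only. Compound actions such as check-then-put, or iterating over the map, still need the caller to synchronize on the wrapper object.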


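Conclusion

Although the JLS gives us tools for making our programs thread-safe, thread safety does not come for free. Using synchronization incurs a performance penalty, and using it improperly exposes us to the risk of data corruption, inconsistent results, or deadlock. Fortunately, JVMs have improved considerably over the past few years, greatly reducing the performance penalty associated with using synchronization correctly. By carefully analyzing how data is shared between threads and appropriately synchronizing operations on shared data, you can make your programs thread-safe without incurring an excessive performance burden.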

Resources

You can read the English original of this article on the developerWorks worldwide site.

Click Discuss at the top or bottom of the article to join the discussion forum "Java threading: Tips, tricks, and techniques", moderated by Brian Goetz.

Java Performance Tuning by Jack Shirazi (O'Reilly & Associates, 2000) offers guidance on resolving performance problems on the Java platform. The reference material supplied with the book provides good performance tuning tips.

Java Performance and Scalability, Volume 1: Server-Side Programming Techniques by Dov Bulka (Addison-Wesley, 2000) provides a wealth of design tips and tricks for improving the performance of your applications.

Java Platform Performance: Strategies and Tactics by Steve Wilson and Jeff Kesselman (Addison-Wesley, 2000) gives experienced Java programmers techniques for producing fast, efficient Java code.

Brian Goetz's recent article "Double-checked locking: Clever, but broken" (JavaWorld, February 2001) explores the JMM in detail and describes the surprising consequences of not synchronizing in certain situations.

In his article "Warning! Threading in a multiprocessor world" (JavaWorld, February 2001), Allen Holub, a recognized authority on multithreading, reveals why most of the tricks for reducing the burden of synchronization do not work.

Peter Haggar describes how to acquire multiple locks in a fixed, consistent order to avoid deadlock (developerWorks, September 2000).

In his article "Writing multithreaded Java applications" (developerWorks, September 2001), Alex Roetter introduces the Java Thread API, outlines the issues involved in multithreading, and offers solutions to common problems.

Concurrent Programming in Java, Second Edition by Doug Lea (Addison-Wesley, 1999) is the authoritative book on the subtle issues of multithreaded programming in the Java language.

"Synchronization and the Java Memory Model" is an excerpt from Doug Lea's book that covers what synchronized actually means.

Bill Pugh's Java Memory Model page is a good starting point for learning about the JMM.

The "Double Checked Locking is Broken" declaration describes why DCL does not work when implemented in the Java language.

Chapter 17 of The Java Language Specification, Second Edition by Bill Joy, Guy Steele, and James Gosling (Addison-Wesley, 2000) covers the in-depth details of the Java Memory Model.

The IBM T.J. Watson Research Center has an entire project team devoted to performance management.

Find more resources in the developerWorks Java technology zone.

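About the author

Brian Goetz is a software consultant and has been a professional software developer for the past 15 years. He is a Principal Consultant at Quiotix, a software development and consulting firm located in Los Altos, California. Contact Brian at brian@quiotix.com.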

Posted by rainman, 2008-09-30 13:31


Chen Wei, a graduate of Huazhong University of Science and Technology, is a Software Engineer at IBM CSDL on the DB2 for z/OS team. Contact: chenwbj@cn.ibm.com
