Update:
For anyone interested, this bug has been addressed and fixed in Java 7u6 build b14. You can see the bug report/fixes here
Original Answer
When reasoning about memory visibility and ordering, you need to think in terms of the happens-before relationship. The important precondition for b != 0 is a == 1. If a != 1, then b can be either 0 or 1.
Once a thread sees a == 1, that thread is guaranteed to see b == 1.
Post Java 5, in the OP example, once the while(a == 0) loop exits, b is guaranteed to be 1.
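The guarantee described above can be sketched as follows. This is a minimal illustration, not the OP's program: it assumes a is declared volatile and b is a plain field (which is what the Java 5+ guarantee relies on), and the class and method names are made up for the example.

```java
public class HappensBeforeSketch {
    // Assumption: `a` is volatile and `b` is a plain field.
    private volatile int a;
    private int b;

    // Writer: the plain write to b precedes the volatile write to a in program order.
    void publish() {
        b = 1;
        a = 1;
    }

    // Reader: spins until a volatile read observes a == 1, then reads b.
    int awaitAndRead() {
        while (a == 0) {
            // busy-wait; every iteration performs a volatile read of a
        }
        return b;
    }

    // Runs one reader thread against one writer and returns the value of b the reader saw.
    static int observe() {
        final HappensBeforeSketch s = new HappensBeforeSketch();
        final int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = s.awaitAndRead());
        reader.start();
        s.publish();
        try {
            reader.join(); // join makes the reader's write to seen[0] visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("b observed by reader: " + observe());
    }
}
```

The write to b happens-before the volatile write to a, and that volatile write happens-before any volatile read that sees a == 1, so a conforming JVM should never return 0 from awaitAndRead().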
Edit:
I ran the simulation many times and didn't see your output.
What OS, Java version & CPU are you testing under?
I am on Windows 7, Java 1.6_24 (trying with _31)
Edit 2:
Kudos to the OP and Walter Laan. For me it only happened when I switched from 64-bit Java to 32-bit Java, on (but possibly not limited to) a 64-bit Windows 7.
Edit 3:
The assignment to tt, or rather the getstatic of b, seems to have a significant impact (to prove this, remove the int tt = b; and it should always work).
It appears that the load of b into tt stores the field's value locally, and that locally stored value (not tt itself) is then used in the if conditional. So if b == 0 is true, it probably means that the value stored locally for tt was 0 (at this point it's a race to assign 1 to the local tt). This seems to be true only for 32-bit Java 1.6 and 7 with the client VM set.
I compared the assembly output of the two runs, and the immediate difference was here. (Keep in mind these are snippets.)
This printed "error"
0x021dd753: test %eax,0x180100 ; {poll}
0x021dd759: cmp $0x0,%ecx
0x021dd75c: je 0x021dd748 ;*ifeq
; - Test$1::run@7 (line 13)
0x021dd75e: cmp $0x0,%edx
0x021dd761: jne 0x021dd788 ;*ifne
; - Test$1::run@13 (line 17)
0x021dd767: nop
0x021dd768: jmp 0x021dd7b8 ; {no_reloc}
0x021dd76d: xchg %ax,%ax
0x021dd770: jmp 0x021dd7d2 ; implicit exception: dispatches to 0x021dd7c2
0x021dd775: nop ;*getstatic out
; - Test$1::run@16 (line 18)
0x021dd776: cmp (%ecx),%eax ; implicit exception: dispatches to 0x021dd7dc
0x021dd778: mov $0x39239500,%edx ;*invokevirtual println
And
This did not print "error"
0x0226d763: test %eax,0x180100 ; {poll}
0x0226d769: cmp $0x0,%edx
0x0226d76c: je 0x0226d758 ;*ifeq
; - Test$1::run@7 (line 13)
0x0226d76e: mov $0x341b77f8,%edx ; {oop('Test')}
0x0226d773: mov 0x154(%edx),%edx ;*getstatic b
; - Test::access$0@0 (line 3)
; - Test$1::run@10 (line 17)
0x0226d779: cmp $0x0,%edx
0x0226d77c: jne 0x0226d7a8 ;*ifne
; - Test$1::run@13 (line 17)
0x0226d782: nopw 0x0(%eax,%eax,1)
0x0226d788: jmp 0x0226d7ed ; {no_reloc}
0x0226d78d: xchg %ax,%ax
0x0226d790: jmp 0x0226d807 ; implicit exception: dispatches to 0x0226d7f7
0x0226d795: nop ;*getstatic out
; - Test$1::run@16 (line 18)
0x0226d796: cmp (%ecx),%eax ; implicit exception: dispatches to 0x0226d811
0x0226d798: mov $0x39239500,%edx ;*invokevirtual println
In this example the first entry is from a run that printed "error", while the second is from one that didn't.
It seems that the working run loaded and assigned b correctly before testing whether it was equal to 0.
0x0226d76e: mov $0x341b77f8,%edx ; {oop('Test')}
0x0226d773: mov 0x154(%edx),%edx ;*getstatic b
; - Test::access$0@0 (line 3)
; - Test$1::run@10 (line 17)
0x0226d779: cmp $0x0,%edx
0x0226d77c: jne 0x0226d7a8 ;*ifne
; - Test$1::run@13 (line 17)
The run that printed "error", by contrast, used the cached value in %edx
0x021dd75e: cmp $0x0,%edx
0x021dd761: jne 0x021dd788 ;*ifne
; - Test$1::run@13 (line 17)
Those who have more experience with assembler, please weigh in :)
Edit 4:
This should be my last edit, as the concurrency devs are getting their hands on it. I did test some more with and without the int tt = b; assignment. I found that when I increase the max from 100 to 1000, there seems to be a 100% error rate when int tt = b is included and a 0% error rate when it is excluded.
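For anyone who wants to repeat this comparison, here is a rough harness sketch. It is not the OP's exact code: the Flags holder, the countErrors method and the readBFirst switch are hypothetical names, it uses instance fields rather than static ones, and it assumes a is volatile while b is not. Because the shape of the code differs from the original, a JIT may well compile it differently.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class StaleReadHarness {
    // Assumption: `a` is volatile and `b` is a plain field.
    static class Flags {
        volatile int a;
        int b;
    }

    // Starts `threads` readers, then publishes b = 1 followed by a = 1.
    // Returns how many readers still saw b == 0 after leaving the spin loop.
    static int countErrors(int threads, final boolean readBFirst) {
        final Flags f = new Flags();
        final AtomicInteger errors = new AtomicInteger();
        Thread[] readers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            readers[i] = new Thread(() -> {
                // Optional early read of b, standing in for `int tt = b;`; the value is unused.
                int tt = readBFirst ? f.b : -1;
                while (f.a == 0) {
                    // spin on the volatile flag
                }
                if (f.b == 0) {
                    errors.incrementAndGet();
                }
            });
            readers[i].start();
        }
        f.b = 1;
        f.a = 1;
        for (Thread t : readers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return errors.get();
    }

    public static void main(String[] args) {
        // `max` is the number of reader threads, e.g. 100 or 1000.
        int max = args.length > 0 ? Integer.parseInt(args[0]) : 100;
        System.out.println("with early read:    " + countErrors(max, true) + " / " + max);
        System.out.println("without early read: " + countErrors(max, false) + " / " + max);
    }
}
```

On a JVM that implements the memory model correctly both counts should be 0; a non-zero count with the early read enabled would match the behaviour described above.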