Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c - How do mutex lock and unlock functions prevent CPU reordering?

As far as I know, a function call acts as a compiler barrier, but not as a CPU barrier.

This tutorial says the following:

acquiring a lock implies acquire semantics, while releasing a lock implies release semantics! All the memory operations in between are contained inside a nice little barrier sandwich, preventing any undesirable memory reordering across the boundaries.

I assume that the above quote is talking about CPU reordering and not about compiler reordering.

But I don't understand how locking and unlocking a mutex causes the CPU to give these functions acquire and release semantics.

For example, if we have the following C code:

pthread_mutex_lock(&lock);
i = 10;
j = 20;
pthread_mutex_unlock(&lock);

The above C code is translated into the following (pseudo) assembly instructions:

push the address of lock into the stack
call pthread_mutex_lock()
mov 10 into i
mov 20 into j
push the address of lock into the stack
call pthread_mutex_unlock()

Now what prevents the CPU from moving mov 10 into i and mov 20 into j above call pthread_mutex_lock() or below call pthread_mutex_unlock()?

If it is the call instruction that prevents the CPU from reordering, then why does the tutorial I quoted make it sound as if the mutex lock and unlock functions are what prevent it? Why didn't it say that any function call prevents CPU reordering?

My question is about the x86 architecture.

1 Answer

The short answer is that the bodies of pthread_mutex_lock and pthread_mutex_unlock include the necessary platform-specific memory barriers, which prevent the CPU from moving memory accesses inside the critical section out of it. Execution flows from the calling code into the lock and unlock functions via a call instruction, and it is this dynamic instruction trace that you have to consider for the purposes of reordering, not the static sequence you see in an assembly listing.
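To make the point concrete, here is a minimal sketch of a toy spinlock written with C11 atomics. It is not how glibc implements pthread mutexes; the names toy_lock_acquire, toy_lock_release, worker and run_toy_lock_demo are hypothetical and exist only for illustration. The ordering guarantee comes from the atomic operations inside the lock and unlock bodies, not from the fact that they are reached through a call.

```c
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical toy lock for illustration only; not glibc's mutex. */
static atomic_flag toy_lock = ATOMIC_FLAG_INIT;
static long shared_counter = 0;   /* plain, non-atomic data guarded by toy_lock */

static void toy_lock_acquire(void)
{
    /* Atomic read-modify-write with acquire ordering: memory accesses
       that follow it in program order cannot be moved before it. */
    while (atomic_flag_test_and_set_explicit(&toy_lock, memory_order_acquire)) {
        /* spin until the flag was previously clear */
    }
}

static void toy_lock_release(void)
{
    /* Store with release ordering: memory accesses that precede it
       in program order cannot be moved after it. */
    atomic_flag_clear_explicit(&toy_lock, memory_order_release);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int n = 0; n < 100000; n++) {
        toy_lock_acquire();
        shared_counter++;          /* the critical section */
        toy_lock_release();
    }
    return NULL;
}

/* Runs four workers and returns the final counter value. */
long run_toy_lock_demo(void)
{
    pthread_t threads[4];
    shared_counter = 0;
    for (int t = 0; t < 4; t++)
        pthread_create(&threads[t], NULL, worker, NULL);
    for (int t = 0; t < 4; t++)
        pthread_join(threads[t], NULL);
    return shared_counter;
}
```

Calling run_toy_lock_demo() from main should return 400000 (4 threads times 100000 increments), because every increment happens between an acquire and a release on the same flag. If the calls to toy_lock_acquire and toy_lock_release were replaced by calls to ordinary empty functions, the call instructions would still be there but the increments would race.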

On x86 specifically, you probably won't find explicit, standalone memory barriers inside those functions, because they already use lock-prefixed instructions to perform the actual locking and unlocking atomically. These instructions imply a full memory barrier, which prevents the CPU reordering you are concerned about.

For example, on my Ubuntu 16.04 system with glibc 2.23, pthread_mutex_lock is implemented using a lock cmpxchg (compare-and-exchange) and pthread_mutex_unlock is implemented using lock dec (decrement), both of which have full barrier semantics.
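As a rough illustration of that shape (and only the shape: this is not glibc's code and it has no futex-based waiting), the sketch below takes a lock word with a compare-and-exchange and releases it with a decrement. The toy_mutex type and its functions are hypothetical. On x86-64, GCC and Clang typically compile the compare-exchange to lock cmpxchg and the fetch-sub to a lock-prefixed subtract or decrement, so the barrier comes along with the atomic instruction itself.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical lock word: 0 = unlocked, 1 = locked. */
typedef struct { atomic_int state; } toy_mutex;

/* Try to take the lock with a compare-and-exchange (0 -> 1).
   Returns true if the lock was acquired, false if it was already held. */
static bool toy_mutex_trylock(toy_mutex *m)
{
    int expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &m->state, &expected, 1,
        memory_order_acquire,   /* ordering if the exchange succeeds */
        memory_order_relaxed);  /* ordering if it fails */
}

/* Release the lock by decrementing the lock word (1 -> 0). */
static void toy_mutex_unlock(toy_mutex *m)
{
    atomic_fetch_sub_explicit(&m->state, 1, memory_order_release);
}
```

A caller would initialise the lock word to 0, loop or back off while toy_mutex_trylock returns false, and call toy_mutex_unlock only while holding the lock. Inspecting the compiler output (for example with gcc -O2 -S) is a simple way to check which lock-prefixed instructions your toolchain actually emits.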

