Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
215 views
in Technique[技术] by (71.8m points)

c++ - Why compilers no longer optimize this UB with strict aliasing

One of the first results for strict aliasing on google is this article http://dbp-consulting.com/tutorials/StrictAliasing.html
One interesting thing I noticed is this: http://goo.gl/lPtIa5

uint32_t swaphalves(uint32_t a) {
  uint32_t acopy = a;
  uint16_t* ptr = (uint16_t*)&acopy;
  uint16_t tmp = ptr[0];
  ptr[0] = ptr[1];
  ptr[1] = tmp;
  return acopy;
}

is compiled to

swaphalves(unsigned int):
        mov     eax, edi
        ret

by GCC 4.4.7. Any compiler newer than that (4.4 is mentioned in the article so article is not wrong) does not implement the function as it could using strict aliasing. What is the reason for this? Was it in fact bug in GCC or GCC decided to drop it since many lines of code were written in a way that thay produce UB or it is just a compiler regression that lasts for years... Also Clang does not optimize it.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

The GCC developers put some effort into making the compiler behave "as expected" in these cases. (I wish I could give you a proper reference for this - I remember it coming up on a mailing list or somesuch at some time).

At any rate, something you say:

... does not implement the function as it could using strict aliasing

... implies perhaps a slight misunderstanding of what the strict aliasing rules are for. Your code sample invokes undefined behavior - so any compilation is technically valid, including just a plain ret or generation of a trap instruction, or even nothing at all (it's legitimate to assume the method can never be called). That newer versions of GCC produce longer/slower code is hardly a deficiency, since producing code that does any particular thing at all would not violate the standard. In fact, the newer versions improve the situation by producing code that does what the programmer probably intended the code to do instead of silently doing something differerent.

What would you rather - that the compiler produces fast code that doesn't do what you want, or slightly slower code that does do what you want?

That being said, I firmly believe that you should not write code that breaks the strict aliasing rules. Relying on the compiler doing the "right" thing when it is "obvious" what is intended is walking a tightrope. Optimisation is hard enough already, without the compiler having to guess at - and make allowances for - what the programmer intended. Further, it's possible to write code which obeys the rules and which can be turned into very efficient object code by the compiler. Indeed the further question can be raised:

Why did earlier versions of GCC behave the way they did, and "optimize" the function by relying on adherence to the strict aliasing rules?

That's a little bit complicated, but is interesting for this discussion (especially in light of suggestions that the compiler is going to some lengths just to break code). Strict aliasing is a component of (or rather, a rule which assists) a process called alias analysis. This process decides whether two pointers alias or not. There are, essentially, 3 possible conditions between any two pointers:

  • They MUST NOT ALIAS (the strict aliasing rule makes it easy to deduce this condition, though it can sometimes be deduced in other ways).
  • They MUST ALIAS (this requires analysis; value propagation might detect this condition for instance)
  • They MAY ALIAS. This is the default condition when neither of the other two conditions can be established.

In the case of the code in your question, strict aliasing implies a MUST NOT ALIAS condition between &acopy and ptr (it is trivial to make this determination, because the two values have incompatible types which are not allowed to alias). This condition allows for the optimisation that you then see: all the manipulation of *ptr values can be discarded because they cannot in theory effect the value of acopy and they do not otherwise escape the function (which can be determined via escape analysis).

It takes further effort to determine the MUST ALIAS condition between the two pointers. Furthermore, in doing so the compiler would need to ignore (at least temporarily) the previously ascertained MUST NOT ALIAS condition, which means it must spend time attempting to ascertain the truth of a condition which, if everything is as it should be, must be false.

When both MUST NOT ALIAS and MUST ALIAS conditions are determined, we have a case where the code must be invoking undefined behaviour (and we can issue a warning). We then have to decide which condition to keep and which to discard. Because MUST NOT ALIAS, in this case, comes from a constraint which can be (and indeed has been) broken by the user, it is the best option to discard.

So, the older versions of GCC either do not do the requisite analysis to determine the MUST ALIAS condition (perhaps because the opposite MUST NOT ALIAS condition has already been established), or alternatively, the older GCC version opts to discard the MUST ALIAS condition in preference to the MUST NOT ALIAS condition, which leads to faster code which does not do what the programmer most likely intended. In either case, it seems that the newer versions offer an improvement.


与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Click Here to Ask a Question

...