The C11 standard says this, 6.8.5/6:
An iteration statement whose controlling expression is not a constant expression,156) that
performs no input/output operations, does not access volatile objects, and performs no
synchronization or atomic operations in its body, controlling expression, or (in the case of a for statement) its expression-3, may be assumed by the implementation to
terminate.157)
The two foot notes are not normative but provide useful information:
156) An omitted controlling expression is replaced by a nonzero constant, which is a constant expression.
157) This is intended to allow compiler transformations such as removal of empty loops even when
termination cannot be proven.
In your case, while(1)
is a crystal clear constant expression, so it may not be assumed by the implementation to terminate. Such an implementation would be hopelessly broken, since "for-ever" loops is a common programming construct.
What happens to the "unreachable code" after the loop is however, as far as I know, not well-defined. However, clang does indeed behave very strange. Comparing the machine code with gcc (x86):
gcc 9.2 -O3 -std=c11 -pedantic-errors
.LC0:
.string "begin"
main:
sub rsp, 8
mov edi, OFFSET FLAT:.LC0
call puts
.L2:
jmp .L2
clang 9.0.0 -O3 -std=c11 -pedantic-errors
main: # @main
push rax
mov edi, offset .Lstr
call puts
.Lstr:
.asciz "begin"
gcc generates the loop, clang just runs into the woods and exits with error 255.
I'm leaning towards this being non-compliant behavior of clang. Because I tried to expand your example further like this:
#include <stdio.h>
#include <setjmp.h>
static _Noreturn void die() {
while(1)
;
}
int main(void) {
jmp_buf buf;
_Bool first = !setjmp(buf);
printf("begin
");
if(first)
{
die();
longjmp(buf, 1);
}
printf("unreachable
");
}
I added C11 _Noreturn
in an attempt to help the compiler further along. It should be clear that this function will hang up, from that keyword alone.
setjmp
will return 0 upon first execution, so this program should just smash into the while(1)
and stop there, only printing "begin" (assuming
flushes stdout). This happens with gcc.
If the loop was simply removed, it should print "begin" 2 times then print "unreachable". On clang however (godbolt), it prints "begin" 1 time and then "unreachable" before returning exit code 0. That's just plain wrong no matter how you put it.
I can find no case for claiming undefined behavior here, so my take is that this is a bug in clang. At any rate, this behavior makes clang 100% useless for programs like embedded systems, where you simply must be able to rely on eternal loops hanging the program (while waiting for a watchdog etc).