CompCert – Formally-verified C compiler (github.com/absint)
227 points by deterministic on June 27, 2021 | hide | past | favorite | 123 comments


I have had the pleasure of visiting AbsInt and hearing talks on, amongst other things, CompCert. It really is a fascinating piece of software. I would highly recommend looking at the 2016 paper [1] for a bit more detail on how it works. If you're interested you can find other publications as well.

The core idea is that every phase of the compiler is proven not to change the behavior of the program. I.e. if you ran any intermediate representation you'd always get the same behavior.

The main limitation is, in my opinion, that the C specification isn't available in any form that a proof assistant can consume. Therefore, they had to translate the specification into that form manually. It is possible that this step contains mistakes. On the upside, every C compiler has this problem.

[1] https://hal.inria.fr/hal-01238879


> The main limitation is, in my opinion, that the C specification isn't available in any form that a proof assistant can consume. Therefore, they had to translate the specification into that form manually. It is possible that this step contains mistakes. On the upside, every C compiler has this problem.

Most compiler bugs are not in their interpretation of the specification, but in various optimization passes. CompCert's formal verification ensures that all optimizations preserve semantics; that alone makes it worth whatever they're charging.


Eh, often there is not a clear distinction between the two. Whether an optimization is correct or not depends heavily on the interpretation of very subtle aspects of the specification.

For correctness, it is most important that the programmer and the compiler agree on the interpretation of the specification. The good thing here is that there is an unambiguous interpretation which the compiler uses, and that anyone can read. The downside is that most C programmers probably can't understand this machine-readable language, and so will simply refer to the original human-readable specification.

What would be quite neat is if they translated the machine-readable specification back into a set of clarifying notes against the original spec.


> "On the upside, every C compiler has this problem."

Reminds me of the farmer who peeks out of the basement after a big storm, and reports back to his wife: "Good news and bad news. The bad news is, our barn blew down. The good news is, all the neighbors' barns blew down too!"


>Most compiler bugs are not in their interpretation of the specification, but in various optimization passes.

I work on a large, open-source, industrial-grade compiler for a living and I disagree with this statement unless it is specifically referring to C compilers and not compilers for other languages.


Or are bugs due to interpretations of the specification just easier to find?

The point of compiler fuzzing like Csmith was that optimization bugs are really hard to find by the kinds of testing that would find more typical conformance bugs. Before these test generators were written one might have thought that optimization bugs were rare, simply because they hadn't been triggered much.


>Or are bugs due to interpretations of the specification just easier to find?

It could be. Another possible explanation could be that I am biased due to mostly working on the frontend of the compiler. I don't think that's the case though, most of the tickets that our customers open seem to be related to the frontend doing something that doesn't match the spec.


That customers report front end bugs could also reflect how easy they are to trigger. I admit this could mean they are more important, even if they aren't more numerous.

The relative frequency of different kinds of bugs can also change over time. Testing has a diminishing returns behavior. If front end bugs are easy to trigger, they will get expunged, and you'll be left with the harder to trigger bugs elsewhere in the compiler. I saw this in writing conformance tests for Common Lisp implementations: after implementations got the little direct tests right (function FOO implements its specified behavior), the remaining defects tended to be deeper and more subtle. Eventually random testing was churning out bizarre compiler interactions (but even that eventually burned through the reachable bugs.)

It would be interesting to study various compiler testing approaches using mutation: introduce bugs to the compiler by mutating its code, then see what fraction the testing can detect. Fully mutating a compiler is likely infeasible, but one could create a random sample of mutations and make estimates.


It's the Pareto principle, I think: 80% of code that's out there will exercise 20% of the compiler's codepaths. It gets increasingly difficult to find meaningful code that tests those paths.


Which compiler do you work on?


I work on a GCC frontend, but not the C one.


"large, open-source, industrial-grade compiler". No kidding; I thought the same! How many can there really be?


Yeah and apparently a secret?


I indeed prefer to avoid doxing myself.


I wholeheartedly agree. I didn't mean to say that the translation of the specification was a big problem, merely that all others are even more insignificant.


Also, this has been studied empirically, and CompCert holds up well:

https://www.cs.utah.edu/~regehr/papers/pldi11-preprint.pdf

https://compcert.org/motivations.html

I think things may have improved further since the 2011 paper, where they found bugs in parts of CompCert which were not, at the time at least, formally verified.

The formally verified parts of CompCert withstood torture testing admirably.


PLDI does a "most influential paper" award each year, for the paper from the PLDI 10 years previous that had the most influence. The 2021 winner was that paper.


I think a lot of others here are going to have the question "what does it do with UB", which is partially answered in the doc:

https://compcert.org/man/manual005.html#language-reference

Overflow in arithmetic over signed integer types is defined as taking the mathematically-exact result and reducing it modulo 2^32 or 2^64 to the range of representable signed integers. Bitwise operators (&, |, ^, ~, <<, >>) over signed integer types are interpreted following two’s-complement representation.

...so at least for that case, it behaves sanely, but it's unclear what happens for other types of UB.


There was a CPPCon talk [I think?] about changes in this direction for C++ and the "problem" with this approach goes like this:

Although lots of things blow up (and a few of them literally) due to overflow, very few blew up because it was UB, most of them blew up because it was overflow, and so having the natural two's complement behaviour as CompCert does here would not improve those programs.

If, in the modern era, you began again from scratch (like Rust) you might choose to (like Rust does in production) define this as having the natural two's complement behaviour because that's unsurprising.

But chances are if a programmer encounters overflow and didn't choose a type which has expressed behaviour for overflow, they did not actually want overflow. So in C++ the result of that argument is C++ should get wrapping and saturating integer types (Rust got these some time ago), so you can say "When counter overflows, just saturate" or "When index overflows, just wrap it" explicitly showing you actually made a decision and the compiler ought to accommodate this.

Wherever possible you actually want to try to flag cases where overflow happened and the programmer apparently hadn't even considered that. Is it better to wrap silently than crash at runtime? Maybe, in some cases, but every single place a language lets you raise 2 to the power 32 and then store it in a generic 32-bit integer at compile time without pointing out what a terrible idea this was is a source of bugs, and "defining" it as zero doesn't fix those bugs.

A lot of UB in C is necessary for compiler optimisation in the face of a language that doesn't do as much as it could to help the compiler tell what is really intended. So it's not really practical for CompCert to abolish some of that.


The pain in C is that it lacks any features for dealing with overflow properly, and has a few other features that even work against it.

If you check for overflow in a slightly wrong way, you'll accidentally "prove" to the compiler that the overflow check can be optimized out entirely. There's no explicit saturating or checked arithmetic. Even when you use only unsigned types, you still need to be paranoid about implicit integer promotion that can make intermediate operations signed.

These things could be fixed with relatively small changes to the spec or stdlib, but I don't think there's a will for it among C users.


I think saturation might actually have more takers, because I think almost everybody who knows they want saturation arithmetic would jump at saturated_int16_t or whatever, whereas too many people who should have checked arithmetic don't even realise it's what they need.

But I agree with you these wouldn't be difficult to implement and are worth the appropriate committee looking at so that at least the programmers who realise they have a problem also have a solution.

C, like C++ and Ada, is actually standardized by a JTC1 sub-(sub-?)-committee, and so it's not so much about the will of C users in general as it is about the will of advocates to justify this to that committee and push it through all the bureaucracy.

Anyone actually interested in this should definitely engage with the people trying to get stuff done for C++ because the two languages know that they both benefit from the relatively good mutual compatibility, so neither would appreciate a pointless divergence on this work.


While not standardized, there are compiler builtins for operations which care about overflow. They will not be optimized out and are very efficient (cost ~1 cycle if no overflow).


I had to use them when I was formally proving some C snippets (see my other comment: https://news.ycombinator.com/item?id=27649879)


> Although lots of things blow up (and a few of them literally) due to overflow, very few blew up because it was UB, most of them blew up because it was overflow

The real problem is when a compiler deletes code due to UB. The poster example being that you can't check for overflow by adding two numbers and examining if overflow occurred, since overflow is undefined. Strict aliasing can similarly break specific memory management code, because it often has to juggle raw memory and an unfortunate inlining can produce a function where a pointer exists simultaneously as two different incompatible types. Another favourite of mine is the fact you can't check if a reference in C++ is null, because it can't be, even though it can happen. The compiler just deletes code like "if (&x)".

This all happens silently, and only with full optimisations, making it possibly the worst footgun in all of C and C++, and it's not even the fault of the language, but a choice by the implementers.

At work we just don't write standard C, we use -fwrapv, -fno-strict-aliasing and a couple of others. Writing software at scale is simply impossible with UB looming over your head all the time.


Could you point out what (legal, UB-free) way exists to obtain a null reference in C++? If it requires UB elsewhere for the expression "if (&x)" to make any sense, I don't see why this is something you blame the compiler implementer for.


It's fairly trivial to make one by dereferencing a pointer that just happens to be null based on user input, which the compiler can't check. There are several acceptable ways for the compiler to deal with it:

1. Require the programmer to prove to the compiler that the pointer is not null.

2. Insert null checks and abort the program if the pointer is null.

Both of those are deficient for me, because it's possible for some other bug to cause the memory holding the reference to be overwritten with 0, so I really favour:

3. Treat references as immutable pointers that can be null, but disallow creating them from pointers that are obviously constantly null.

Obviously currently all ways to obtain a null reference are UB, but at the same time the compiler happily produces a program that creates a null reference, but deletes the check for it. Which is the absolute worst possible way to do it. It's a needless optimisation that requires you to solve the halting problem to be safe.


`memset(&some_struct,0,sizeof some_struct);`, for starters. I don't know (or care) if some random chunk of the C++ spec says that's disallowed, but it happens in quite a lot of working code. I also use `if(!this) return PLACEHOLDER;` (in non-virtual methods) a fair bit, which amounts to the same thing: C++ says it never happens, C++ is wrong as a matter of observed fact.


All it takes is

  X& x = *p;  // p might be 0.
  Y& y = p->y;
It is the responsibility of whoever made the reference not to dereference a null pointer when initializing it. In the second form, it wouldn't help if you could check for a null &y, because it (probably) wouldn't be null anyway.


Right but the post you're replying to asked for any way to do this without invoking Undefined Behaviour.

If p is a null pointer, dereferencing it is Undefined Behaviour and you lose immediately.


Program correctness is not just yes-or-no if you take consequences into account and accept that there is some room for error, and that we deal with errors instead of just meeting our fate. One thing is an incorrect value that is caught a few assertions later (we check our invariants, don't we?). A completely different thing is entire blocks of code missing from an executable and control flow broken so badly that the program is now the binary equivalent of a triggered psychopath.

We know that every program still contains an error, and we want to reduce its snowball effect wherever possible. UB as handled by a modern compiler is completely opposite to this idea.


> One thing is an incorrect value that is catched few assertions later (we check our invariants, don’t we?).

So like I said, some of them literally blow up.

The assertion happened, it concluded that the system had entered an unpredicted state, the rocket exploded. That was the fail safe condition.

That rocket wasn't actually programmed in C. The nasal demons you're probably imagining infesting everything weren't a factor. Overflow blew the rocket up, like I said, not any associated undefined behaviour. This is very common.


Is this rocket example real or imaginary? If the latter, rockets do not have explode-on-assertion-failure mechanisms, afaik. I'm not a rocket engineer, but I find that at least restarting a subsystem is more appropriate than letting it explode "because yeah anyway". The difference between a regular error and UB is that you can't handle UB by definition.


Ariane 5's test flight self-destructed 37 seconds after launch. Its horizontal velocity was outside the range that could be represented in a 16-bit variable. In fact this check wasn't appropriate for Ariane 5 and so was removed in subsequent versions, but that's hindsight.

Maybe you find that "restarting" a rocket would be "more appropriate" but any people who die when an entire rocket full of explosive fuel crashes into their home because it's "restarting" wouldn't agree. Once the rocket ceases to be inside its design conditions, the safest thing to do is blow it up, so that's what happens.

Historically a "range safety officer", a human, would be watching the launch and, if it seems to deviate from their expectations they remotely trigger destruction. This is tricky to do correctly, the remote trigger might malfunction, and humans react slowly, so this is not the preferred option for today's unmanned rockets which are full of computers anyway.


The Ariane 5 failure was much more interesting than that.

The designers recycled the Ariane 4's "inertial platform" IMU but neither adjusted its parameters for expected Ariane 5 conditions, nor bothered to test it in such conditions. When flight accelerations exceeded Ariane 4's design parameters, the inertial platform sent debug messages down the channel meant for input to the engine gimbals. The engine gimbals interpreted those error messages as measurements, and steered accordingly, triggering destruction of the rocket and its payload of two (then) $200M+ satellites.

Many things could have prevented the failure. Checking the IMU specs against, or actually testing the IMU under, projected flight conditions would have revealed the problem. Not building the code in debug-mode, or not sending debug messages down some random control channel, or (on the receiving end) ignoring ill-formed input to the steering system, would have allowed the mission to complete.

In subsequent flights the software on the IMU was adjusted to expect accelerations that had been originally projected and did occur, and all was well. I don't know whether it will still send debug messages to the engines in case of a booboo, or whether the engines would ignore debug messages now.

Note that there was no actual need for the checks that resulted in the error messages. They were assertions of the sort that most Europeans of the time believed should be left in production code. I doubt this experience changed many of their minds.


The (possibly apocryphal) version I was told includes another layer: The problematic code was part of a last-minute kludge for the Ariane 4. The kludged code violated the Ariane 4's specifications. The Ariane 4 team knew this, but the code "couldn't" be triggered under the Ariane 4's conditions.

A few years later, the code is pulled out for re-use in the Ariane 5. Apparently the kludge wasn't documented and no one bothered to confirm that the Ariane 4 code was actually up to the Ariane 4 spec. Had the Ariane 4 code been up to spec, the code reuse would have been fine.


No need for modern era, Algol, Pascal, Modula-2 and Basic compilers already allowed to control checked overflow.


> but it's unclear what happens for other types of UB.

Yeah, strictly speaking we don't care as much what it does in specific cases of undefined behaviour (eg signed integer overflow), since compiler bugs in any specific case, while infuriating, can be worked around on a case-by-case basis (eg, warning on any null pointer check that comes after a dereference). The real concern is what happens in the arbitrarily-large number of cases that don't get specific attention.


CompCert's main theorem is that the compiled code maintains or improves the behavior of the source code, as in, it can only maintain or remove erroneous behaviors. So, it can remove a division by zero, for instance, but it will never add one.

Otherwise, the compiler does not guarantee the absence of UB; for that, you need Verasco (http://compcert.inria.fr/verasco/), a static analyzer that tries to prove that a given code is UB-free. If it succeeds, then combined with CompCert's semantic guarantee, you get compiled code that is as UB-free as the source code.

Verasco was a research project and it's not clear it's still being developed.


Am I right in thinking that all undefined behavior must have a defined outcome in the implementation? Otherwise, I would guess that "proven correct" is only relevant for the compilation of programs that cannot enter UB territory.

Given Moonchild's comment about all optimizations retaining semantics, this leaves me wondering what happens with some of the more notorious UB-with-optimization footguns, such as the elimination of a chunk of code because an unsigned variable cannot go negative. If such optimizations are performed by this compiler, does the "all optimizations retain semantics" rule mean that this code elimination would be applied before any optimization (to avoid there being a difference between the before and after semantics)? I am not saying that would be wrong, but I am curious about how these cases are handled.


No. There's "Implementation Defined" behaviour where the compiler vendor is expected to explain what their compiler (and maybe operating system, hardware architecture, site installation) does in these cases.

But "Undefined Behaviour" says that anything might happen. The reason to be so vague is that this frees the compiler to conclude that it can ignore this scenario. Whatever it does can't be wrong for Undefined Behaviour, so, it needn't consider any cases that trigger this. This makes it actually practical to produce reasonable object code for good programs that don't actually have Undefined Behaviour.

The practical alternative is that you can just outlaw a huge amount of stuff so that there's no problem for the compiler. For example you could say OK, my language refuses to admit that pointers are just an integer which happens to correspond to a machine address. Anything that depends on that is just removed from the language entirely. But now you can't do a lot of clever tricks, so, you abandon C and move to a language where you can get your job done.

Rust's approach is to put all that stuff that - if you get it wrong - is horribly unsafe and require its "unsafe" keyword. So in one sense it's actually two very similar languages, the safe language which you would have abandoned for a lower-level language because it can't get what you need done, and the low-level language where if you screw up everything catches fire. This would not have been practical in the 1970s but it is practical today.

The semantics of undefined behaviour are exactly the same before and after any transformation of the program, because any behaviour is a semantically reasonable implementation of your program. You screwed up, not the compiler.


Two real world examples I built:

1. In Rust I have a type named OnewayLess that insists it has equivalence (implements the Eq trait) and is totally ordered (implements the Ord trait), yet, in defiance of reason, it also insists that any of its members are less than anything when asked, even themselves.

This is incoherent nonsense, but it's safe under Rust. Rust promises that even though I am clearly a lunatic who doesn't understand what equivalence and ordering mean, my programs do not have Undefined Behaviour. If I try to sort a vector of this type, that won't reformat my hard disk, or more practically change unrelated data in the program. However, it very well might take forever (the sort keeps looking for the smallest object and never finds it) or run out of memory (the sort might try to keep references to the infinitely many "lesser" items from the vector) and that isn't undefined, that's just consequences of my reprehensible type design.

2. In our RDF storage engine written in C an older version contained a mistake in which an array of temporary pointers might get partially overwritten.

In C there is no safety checking, these are raw memory addresses, and dereferencing a pointer that now might be garbage is Undefined Behaviour so all bets are off. It could do absolutely anything. If the compiler is instead required to somehow check, when referencing them, that these are still good pointers, the performance of many C programs is ruined, even though our bug was in unrelated code.


Sure, but this does not seem to provide any insight into how, in this compiler, the preservation of semantics under optimization plays out in optimizations which assume the program will never attempt to perform UB.


I feel like that's very well covered already when I wrote:

"The semantics of undefined behaviour are exactly the same before and after any transformation of the program"

[emphasis added]

I think you've got this idea that Undefined Behaviour somehow means that the behaviour is defined but is being kept secret from you. It isn't. The behaviour really is Undefined. Doing further transformations to it is therefore always harmless.


So how does a computer cause “undefined” behavior to occur? Does it invoke the “undefined” opcode? Of course not. A computer executing a program that has entered UB will do something specific but not specified in the standard.

I don’t think you are getting that Undefined is a concept in the language specification domain, that is not literally implemented. Consequently, you don’t even see yet what my question is about. When I think of a way to make it clearer, I will follow up.


UB can not be relied upon as the compiler can do arbitrary transformations to the code. Relying on it will cause security issues or head scratching bugs down the line.


Yes, yes, yes, we all know this. That is not what this thread is about. By replying to one comment in isolation, you are completely missing the context. For clarification of what it is about, read the rest of the thread.


This compiler provably follows the semantics allowed (and defined) by the C standard. Optimizations have nothing to do with UB. Optimizations simply aim to produce more efficient machine code that still matches the semantics as defined by the C standard. Non-optimizing compilers (or this compiler with optimizations off) can (and often do) produce exactly the same output.

The C standard Appendix J.2 contains an incomplete list of undefined behavior causes. Note that some of them simply cannot be checked at compile time, eg "The execution of a program contains a data race (5.1.2.4)" cannot have the semantics of the resulting program defined by the compiler. C does not provide the compiler with enough information to statically prevent data races. I'm not familiar with CompCert, but it should be possible to go through appendix J.2 and find the UB causes which are detectable at compile time, and then go through CompCert and see what it does for each. What it does will likely depend on the particular program it's compiling, of course.


In the post I replied to, Userbinator gave a specific example of a case where this compiler defines explicitly what does happen when a specific type of UB does occur.

In these cases, it is important to distinguish between what the standard defines, and what the implementation does. The standard says, in effect, that we cannot assume any specific outcome from UB, but in any particular case in a specific implementation, something definite (and often deterministic, though context-dependent) will happen. This, of course, is often a source of problems, when programmers know (or think they know) what will happen, and depend on it.


This entire discussion seems more than a little silly to me. At the end of the day no one cares if the compiler is correct. What they care about is that the airplane that is controlled by the code that the compiler emitted does not crash (and I'm talking here about a literal crash with bent metal and broken bodies). If the airplane does crash, no one will be satisfied by the explanation that crashing an airplane is correct behavior in the face of a program that contained an integer overflow. The C standard itself is inadequate to meet the obvious needs of mission-critical systems, and so a compiler that is proven to adhere to this standard is likewise inadequate. Something in the standard has to change even if that something is a requirement to warn the programmer that the compiler has detected UB. At least then you stand a fighting chance. Otherwise you might not discover the problem until you have a smoking hole in the ground.


Well, the question I originally posed (and which has not been addressed yet) is about one specific aspect in what formal compiler verification can do for you in a language with extensive UB.


Yes, and I'm saying: when you have as much UB as C does it makes no sense at all to prove the compiler correct because even with a proven-correct compiler you can still have catastrophic failures in the end product. You have to define at least some of the UB in an implementation-dependent way for the proof-of-correctness to have any actual value in the real world.


I don't agree.

Consider WUFFS https://github.com/google/wuffs

WUFFS will give you a C library that can turn PNG data into raw image data and is definitely correct. It's not exactly idiomatic C, you'd assume if a person wrote this code it's probably wrong, but WUFFS promises it's correct. CompCert should turn that C library into executable code which is therefore also definitely correct.

Now, maybe you will screw up code that reads the PNG data from a disk file, or draws the image on a screen, or a million other things, but the WUFFS library components are definitely fine, not just "Bill wrote it and he's got 20 years experience" fine or "It passed the unit tests" fine but "Four Color Theorem" fine.


> WUFFS promises it's correct

With respect to what standard of correctness? If the answer is that the object code is guaranteed to behave according to the source according to the C standard that is a useless guarantee because of the possibility that there is UB in the source code, in which case the compiler could release the kraken and it would still be "correct".


I should have said "safe" rather than correct here.

WUFFS promises that you can't write unsafe things. For example you can't have arithmetic overflow in WUFFS. Every time it sees arithmetic in your code the compiler is trying to decide why this operation might overflow. If you add together two 8-bit variables and put the result in a 16-bit variable, that doesn't overflow, but if you try to put the result in another 8-bit integer variable the compiler assumes, unless it can see otherwise, that this is an overflow, which is forbidden, so your code doesn't compile.

Buffer overflows are the same, if you're indexing into a buffer with an 8-bit unsigned variable and the buffer has 400 entries it's cool. If it has 100 entries but the compiler has concluded this variable can only be between say, 20 and 48 then that's cool too. But if the compiler can't prove this variable isn't 100 then you've got a potential buffer overflow and the code does not compile.

The C code is an output from WUFFS. WUFFS says "This C code is definitely safe" and then CompCert says "This object code is definitely a correct implementation of your C code" so the result is WUFFS compiled by CompCert is definitely safe object code.


But I can ask the exact same question: safe by what standard?

> you can't have arithmetic overflow in WUFFS

That seems improbable. Can I write x++? Because if I can, then I can have an arithmetic overflow.

> Buffer overflows are the same

That seems improbable. Can I write *p++? Because if I can then I can overflow a buffer.


Dividing by zero is undefined behaviour.

It's fairly straightforward to provide a proof, given a definition of division by zero, that 1 = 0. It's often done in first-year introduction to algebra classes.

It would be correct, then, for a compiler to substitute the value 1 everywhere the programmer used the value 0 (or NULL, or '\0') and still produce a valid program with the same behaviour, because 1 and 0 have been proven to be equivalent.

So no, you are not right in thinking all implementations must define undefined behaviour. Undefined is undefined. C code containing undefined behaviour is not valid C code, and the compiler is only required to produce correct output when the input is valid C code.


That is a fair point about divide-by-zero [1], but the very post I was replying to gave an example of where this compiler does say explicitly what happens in one case of what falls under undefined behavior in the C standards.

To clarify, what I am interested in here is the interaction of C-standard UB with any guarantee this compiler may give with regard to semantics preservation under optimization.

[1] In reality, the hardware will do something, and deterministically in every case that I am aware of, when the divide operation is invoked with zero as the divisor. Computer division is not an exact implementation of mathematical division.


> To clarify, what I am interested in here is the interaction of C-standard UB with any guarantee this compiler may give with regard to semantics preservation under optimization.

To clarify my point: undefined behaviour has any semantics you might want it to have. Anything and everything is semantically correct even under optimization. Even if the compiler says it will do one thing and then does another: it's correct either way.

The computer may "do something" with division by zero. Or the compiler might completely elide the code and replace it with a picture of a teapot before it even gets as far as hitting a register in anger. It's still valid, semantically, because you're dividing by zero from which you can prove anything and everything.


> It's fairly straightforward to provide a proof, given a definition of division by zero, that 1 = 0.

No, it is not. It's fairly straightforward to provide a proof, given the false[0] premise that x*y/y = x, that 1 = 0.

0: Hell, I don't even need division by zero for that one:

  uint8_t x = 2;
  x = x*128;
  // x is now (uint8_t)(2*128) == (uint8_t)256 == 0
  x = x/128;
  // x is now 0/128 == 0
  // but the premise x*y/y == x says x "should" be 2 here
  if(x == 2) printf("it's two\n");  // never prints
  printf("it's %i\n",(int)x);       // prints "it's 0"


CakeML[0] is another formally verified compiler. Notably, unlike CompCert, it is open source.

The language it implements (an SML dialect) is high-level and garbage collected, meaning that it is not usable in all of the same domains, but work is ongoing to reuse much of the compiler infrastructure for 'pancake', a low-level language.

0. https://github.com/CakeML/cakeml


Xavier Leroy, who is a major contributor to this project, was an INRIA senior researcher and the main OCaml developer. He has a lecture (in French) at the Collège de France where he introduces the formal verification of a basic compiler for an imperative language: https://www.college-de-france.fr/site/xavier-leroy/course-20..., very interesting!


As noted in the License section of the readme,

> CompCert is not free software. This non-commercial release can only be used for evaluation, research, educational and personal purposes. A commercial version of CompCert, without this restriction and with professional support and extra features, can be purchased from AbsInt. See the file LICENSE for more information.

However, https://github.com/AbsInt/CompCert/blob/master/LICENSE goes on to say that

> The following files in this distribution are dual-licensed both under the INRIA Non-Commercial License Agreement and under the Free Software Foundation GNU Lesser General Public License, either version 2.1 or (at your option) any later version: ...

and then lists a bunch of specific files and paths, which may or may not (this is the key question) constitute a majority of the repo. Looking around, I see a few references to the LGPL and LGPLv3.

The LICENSE also makes clear (immediately following the file list) that

> If you opt for the GNU Lesser General Public License, these files are free software and can be used both in commercial and non-commercial contexts, subject to the terms of the GNU Lesser General Public License.

So, broadly speaking: am I just looking at a standard smoke-and-mirrors routine (creating just enough uncertainty that most commercial operations would reasonably pursue the custom license) or are there any actual intuition-violating footguns hiding anywhere in the codebase?


It appears that the LGPL does NOT cover the "meat" of the codebase (see the directories arm, x86, etc) and instead covers the "edges" where their code may interact with other code.

A comment indicates they relicensed some of it GPL->LGPL so I assume they hold the copyright.

The BSD licensed stuff seems to be imported code.


>The files in question are, from a formal verification standpoint, the interface to CompCert. They are licensed under the non-commercial license (NCL) so that they can be used together with the rest of CompCert (the implementation of the compiler, so to speak), which is NCL-only.

>Additionally, the interface files in question are also licensed under the GPL so that they can be used in other, open-source projects such as VST (http://vst.cs.princeton.edu/) that connect with CompCert.

at https://github.com/AbsInt/CompCert/issues/140 (2016); GPL -> LGPL in a later commit.


The copyright holder doesn't have to adhere to the GPL themselves. Even if the complete repository bar one file is GPL, as long as you keep that file, you can't use the project under the GPL. You can replace that file and then use it under the GPL though.


FYI: the directory contains 957 files; 798 of these files are NOT dual-licensed.

But of course it's not a "majority" decision, nor are all files equally important.


Wait, what? I thought the whole point of GPL is that it is viral, i.e. if you use GPL in a part of your shipped program you must release the whole thing.

From GPLv2:

"These requirements apply to the modified work as a whole. If identifiable sections of that work are not derived from the Program, and can be reasonably considered independent and separate works in themselves, then this License, and its terms, do not apply to those sections when you distribute them as separate works. But when you distribute the same sections as part of a whole which is a work based on the Program, the distribution of the whole must be on the terms of this License, whose permissions for other licensees extend to the entire whole, and thus to each and every part regardless of who wrote it."


That only applies to people who don’t own the copyright. If you wrote the original code base, you can license it GPL, but continue to sell it under a closed-source license as well.

On the other hand, if someone forks it under the GPL and adds some additional features, the original author cannot pull those features back into the original closed-source project without ceasing to distribute the whole thing as closed source.

Choosing GPL effectively forks your code, and further code added by others can no longer be added to the “proprietary” branch. But you can continue adding code yourself in parallel in both branches.


> code added by others can no longer be added to the “proprietary” branch

One thing to keep in mind, if you sign over your copyright by agreeing to contribute using a Contributor License Agreement (CLA), the person or organization can take your contributions proprietary.


Makes sense, thanks.


IANAL.

LGPL is not GPL. You are, for example, allowed to dynamically link from proprietary code to LGPL libraries. This is not the case with GPL shared libraries. You should take a look at the LGPL license agreement and then see how the code is used.


You can link to GPL libraries from proprietary code if you buy a commercial license from the library's copyright owner.


It is not a question of whether some files constitute a majority, since this is not decided by majority vote.

As far as I can tell from your evaluation, it is probably GPL. But without a GPL linker exception you cannot create non-GPL binaries.


That is confusing. GCC specifically includes the "GCC runtime library exception" so that you can produce non-GPL binaries.


Yes. GCC does that and there is no problem, but the post is not about GCC (as far as I can tell from my tiny mobile screen)


>but the post is not about gcc

Right. It's about a compiler with a GPL license but without a similar exception. I was making a comparison.


The CompCert compiler is not under the GPL or LGPL, it is under a proprietary license. Some parts may be released with other licenses, but that is different.


That's not the point going down this thread. If part of it is GPL (and not LGPL), it makes it impossible to release compiled binaries made with it that are not GPL. It appears they attempted to switch to LGPL to fix that two months ago[1], but perhaps missed some parts.

[1] https://github.com/AbsInt/CompCert/commit/04f499c632a76e4605...



I gave an introductory talk to Red Hat about proving C code with Frama-C, which touched on CompCert. (Coq, Frama-C, CompCert and OCaml are all written by an overlapping set of the same people.) Note I'm very much a beginner here myself.

http://oirase.annexia.org/tmp/2020-frama-c-tech-talk.mp4


Some caveats in this compiler:

"In C source files, attribute qualifiers can occur anywhere a standard type qualifier (const, volatile) can occur, and also just after the struct and union keywords. For partial compatibility with GCC, the CompCert parser allows attributes to occur in several other places, but may silently ignore them.

Warning. Some standard C libraries, when used in conjunction with CompCert, deactivate the __attribute__ keyword: the standard includes, or CompCert itself, define __attribute__ as a macro that erases its argument. This is the case for the Glibc standard library under Linux, and the Xcode header files under macOS. For this reason, please use the __attribute keyword in preference to the __attribute__ keyword."

I learned C in the '80s and did a lot of it for about 10 years. After not doing much software development for another 10 years or so, I wrote an embedded OS and an application (for the MSP430 platform). I was surprised by the advances in compiler optimization while I had been away. I needed to be much more diligent when interacting with hardware and locking (e.g., use the volatile qualifier generously, or my code would not work). I also found some odd compiler bugs (in the TI compiler) related to bit fields.


I wonder how often a C program does something wrong because of a compiler bug versus a program bug. I’d think the former is extraordinarily rare for e.g. gcc. Although, if you spend an hour studying every line of code (as is customary when writing safety-critical code), it’s possible you may arrive at a point where the primary source of bugs is the compiler.

However, for non safety-critical software I’d say buying a proven correct C compiler would be a waste of money.


Fedora gets the latest GCC release and we often run into bugs (which we file in the upstream tracker). Now to be fair most of them are not critical: the most common ones affect warnings where the compiler starts warning about code that we think is correct. But it does happen that we get more serious things. Several times I recall we've had to track down all packages that were compiled with a particular version of gcc/binutils and recompile them because we discovered a code-gen bug.

In Fedora this is all low stakes but it wouldn't be if the software was deployed in a plane.


I was lucky enough to work with people working on CompCert in French public aerospace research labs.

I remember a presentation in which they presented the number of bugs they found on other compilers (GCC, Clang, MSVC) while developing test cases and benchmarks, and it was surprisingly high!

I don’t want to say any figures as I don’t remember it enough, but I think it was in the hundreds.

And there are many horror stories on Hacker News of people being impacted by compiler bugs in prod on non-critical software, sometimes with significant financial impact.

But yeah, not sure the price is worth it for non critical software in general.


Do you have a link to the presentation please ? All compilers have bugs, but I am interested in their comparison work in the context of CompCert


Did they fix or report the bugs upstream?


Yes I think they systematically reported it, and explained that their work was beneficial to all C compilers as a consequence.

They did not fix them though, that would have needed a team 10x larger


They're usually at least reported. At least that was the case for all the compiler-bug related papers I've read.


Compiler bugs are not rare - it's just that most people never notice them.


Especially if you compile without optimisations. Optimisations can introduce bugs (although most of the time these are related to undefined behaviour in the code), but without them it is rare…


Compiler bugs outside undefined behaviour are not rare: calling conventions as implemented in C compilers are complex, and random testing found bugs in several compilers. However, the bugs involve parameter lists and types that rarely occur in practice.

https://lindig.github.io/papers/lindig-aadebug-2005.pdf


It would be nice if Tesla would use something like this for certain parts of the car. But you just know they won’t. Love to hear if anyone knows of any formal methods are used at all, and not talking about the self driving stack, but stuff like throttle control.


Relevance??


Not usually in the habit of giving much thought to Tesla, but suggesting that a proven correct C compiler created by a company that builds transportation devices that contain lots of computers might be interesting for another company that builds transportation devices that use lots of computers seems to me to be perfectly sensible.

Especially since that other company has a bit of a reputation for writing shoddy software.

In fact, the more I think about it, the more relevant this comment becomes. Building computerized transportation devices is hard, and apparently one of the things you need to do it well is a proven-correct C compiler. A far cry from the type of software development most people do, which involves slapping together some stuff we found on the internet and writing some glue code around it.


I was like, since when does Airbnb build transportation devices?

Oh.

Nevermind.


Ada/Spark is also supposed to be pretty good here.


Cars are similar to planes in that they're very complex electric/mechanical machines. They have lots of code where if it is done poorly (Toyota spaghetti code acceleration nightmare) people could be seriously injured or even die. Tesla isn't the only car company, but has a pretty strong fan base. The poster is just generalizing the OP to car software. Seems relevant to me.


Seems weird to wish a car company would use a specific technique when you don't know if they already do this or not and they're not known to have any problems caused by not doing it.


Someone else commented in this sub thread that Tesla is known for some poor software, so maybe he/she is assuming they don't use the best possible practices.

I know nothing of Tesla though and have no plans of buying one anytime soon. Still seems pretty relevant to me though if you don't take every topic to be so narrowly defined.


If all you think about is Tesla, every comment thread is a relevant place to post.


Yeah, it even gets the undefined behavior right.


Not completely:

> about 90% of the compiler’s algorithms (including all optimizations and all code generation algorithms) are proved correct in Coq, but the remaining 10% (including elaboration, presimplifications, assembling and linking) are not verified


Btw, Monocypher passed CompCert in -interp mode.


Couldn't this problem be dealt with using unit tests?


You cannot use unit tests for everything, since the number of unit tests required is essentially infinite. Logically, unit tests can only prove an error exists; they cannot show the program is correct. What they most likely did was mathematically show that all possible edge cases (potentially infinitely many) are considered.


I have never heard of verified compiler before. Not entirely sure if I understand everything from their docs either. But here are my thoughts.

Even though you cannot unit test everything, I would still prefer it to having another compiler pass to ensure the C semantics are the same as the compiled code's. Even if the compiled code matches the C, the C code can be buggy from the start.

I don't understand the use of "mathematically proven" in this context either. Proving that an x86 add instruction or a set of x86 instructions has the same semantics as the C code is just a pure computer science problem, not mathematics?


> CompCert is not free software. This non-commercial release can only be used for evaluation, research, educational and personal purposes.


and ?


Isn't that against Github rules? And doesn't that mean the 150 forks (republishings) are in legal danger?


No. GitHub's Terms of Service grant certain rights, allowing other users to do certain things like fork and view the code.

But the ToS do not suddenly give everyone the right to modify, publish, or reuse the code however they wish.

> Because you retain ownership of and responsibility for Your Content, we need you to grant us — and other GitHub Users — certain legal permissions, listed in Sections D.4 — D.7. These license grants apply to Your Content. If you upload Content that already comes with a license granting GitHub the permissions we need to run our Service, no additional license is required. You understand that you will not receive any payment for any of the rights granted in Sections D.4 — D.7. The licenses you grant to us will end when you remove Your Content from our servers, unless other Users have forked it. [0]

[0] https://docs.github.com/en/github/site-policy/github-terms-o...


> Isn't that against Github rules?

No. There is no rule against proprietary code on GitHub.

> And doesn't that mean the 150 forks (republishings) are in legal danger?

No.

If you actually read the INRIA license:

> The License entitles you to use the Software to conduct research or education and to create Derivative Works solely for academic, non-commercial research endeavors of the Licensee (A "Derivative Work" is a work that is a modification of, enhancement to, derived from, or based upon the Software).


> Isn't that against Github rules? And doesn't that mean the 150 forks (republishings) are in legal danger?

It’s not against GitHub rules. Were you under the impression users could only host repositories on GitHub that have licenses that fulfill the open source definition? How did you get that misimpression?


Not GP, put I was under this impression. I think I saw some instructions to that extent on the create-repo page in 2008 or 2009, which was changed at some point.


The free tier of GitHub used to require the repositories to be public. Some time back, they also added free private repositories. Commercial licenses always allowed public and private repositories.

And of course, once you set a repository as public, forking and downloading are allowed, because that is what the public flag means. Now, once you try to use the downloaded code, the actual license terms apply. But that shouldn't concern GitHub.


There is no requirement for software on GitHub to be open source. The GitHub ToS just demand some basic rights needed for GitHub to function (e.g. forking and downloading must be allowed).


Not if the forks are for evaluation, research, educational and personal purposes.


No lol.


The CompCert C verified compiler is a compiler for a large subset of the C programming language that generates code for the PowerPC, ARM, x86 and RISC-V processors.

I suspect that large subset is doing a lot of lifting.


On their website, the claim is more ambitious: "The main result of the project is the CompCert C verified compiler, a high-assurance compiler for _almost all_ of the C language (ISO C99), generating efficient code for the PowerPC, ARM, RISC-V and x86 processors." (https://compcert.org/)


That makes me wonder even more what the missing parts are.


The differences are documented at https://compcert.org/man/manual001.html#sec13


Interesting that only a restricted switch is supported, but goto is apparently available unrestricted. Any switch should be trivially translatable to if-goto-else-if-goto-etc, so I wonder what the reason for that might be.


Just a guess, but perhaps they don't bother supporting 'unstructured switch' because they expect that input programs will most likely be written in the MISRA C subset of C anyway?

I don't imagine it's due to technical challenges in formal modelling, as 'structured switch' statements still permit the default case to simply break immediately.



