
neat! I almost implemented this by accident while implementing this paper/technique: http://www.cs.umd.edu/~mwh/papers/martin10ownership.html

I bet they used llvm to do this? it would be child's play to do this in llvm ...

the downside to this technique is that you can only instrument / inspect code that you compile, so if you have any third party code that you link against, and that code results in memory errors, you have problems. you always do have third party code, like the C runtime. you can get around this though by having the source for the runtime and recompiling it with your instrumentation added.

you CAN recompile the runtime on windows (they ship it to you in a ready-to-build way as part of visual studio), but there are a lot of other libraries that they don't do that for (advapi, com, rpc, etc). I guess it's fortunate for chromium that they can run these tests on the linux codebase and have the windows codebase/build pick up the improvements made for free ...

also, if you're willing to insert some annotations into the code that you are instrumenting you can perform race condition detection with lower overhead. but konstantin already wrote a DBI extension that does race condition detection without requiring annotations (ahem, most of the time) so I guess that is low on his to-do list...



Yes, it uses a modified clang binary with which you compile your program.

I don't really see the disadvantage of not being able to debug within the runtime. The class of errors this analyzer is meant to address is stuff like use after free, out of bounds access, and use after return. If your runtime is doing those then a) you have bigger problems, b) you wouldn't be able to fix it anyway without the source.


library functions can do an out of bounds access on your behalf, for example: char a[16]; memset(a, 0, 100);

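and as I understand their implementation, this won't be found, because they only instrument loads and stores in code that their clang compiles. the compiled code will look something like: push 100; push 0; push offset a; call memset. the out of bounds store itself happens inside memset, which was not instrumented.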

there are two options that I see: (1) compile the runtime with the instrumentation, or (2) assemble a massive list of functions which you know to dereference pointers they take as arguments, and then hope that list is somewhat representative / comprehensive
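To make the memset point concrete, here is a minimal toy model in C. It is only a sketch: the arena, the shadow array, and every function name are made up for illustration and are not asan's actual data structures or API. It also counts bad accesses instead of aborting on the first one, which a real tool would likely do.

```c
#include <assert.h>
#include <stddef.h>

#define ARENA_SIZE 256   /* size of the toy address space (assumption) */
#define OBJ_OFF    32    /* offset of the 16-byte object, like `char a[16]` */
#define OBJ_SIZE   16

static unsigned char arena[ARENA_SIZE];     /* toy memory */
static unsigned char poisoned[ARENA_SIZE];  /* toy shadow: 1 = must not be touched */

/* Mark everything except the object itself as off limits. */
static void reset_layout(void) {
    for (size_t i = 0; i < ARENA_SIZE; i++)
        poisoned[i] = !(i >= OBJ_OFF && i < OBJ_OFF + OBJ_SIZE);
}

/* Shape of a store in instrumented code: check the shadow, then write.
 * Returns 1 if the access was flagged, 0 otherwise. */
static int instrumented_store(size_t off, unsigned char v) {
    assert(off < ARENA_SIZE);
    if (poisoned[off])
        return 1;
    arena[off] = v;
    return 0;
}

/* Stand-in for a precompiled library routine: it writes with no checks. */
static void library_fill(size_t off, unsigned char v, size_t n) {
    assert(off + n <= ARENA_SIZE);
    for (size_t i = 0; i < n; i++)
        arena[off + i] = v;
}

/* The overflow written as a loop in instrumented code: every store is checked. */
static int overflow_in_instrumented_code(void) {
    int reports = 0;
    reset_layout();
    for (size_t i = 0; i < 100; i++)
        reports += instrumented_store(OBJ_OFF + i, 0);
    return reports;
}

/* The same overflow done by the library routine: the caller only passes
 * arguments and calls, so no check ever runs. */
static int overflow_in_library_call(void) {
    int reports = 0;
    reset_layout();
    library_fill(OBJ_OFF, 0, 100);
    return reports;
}
```

In this model the loop version flags the 84 stores that fall past the 16-byte object, while the library call flags nothing even though it writes exactly the same bytes.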


Indeed, asan will generally miss bugs in third party libraries which are not compiled with it. But for memset and a few other popular libc functions asan has interceptors, so in your example the bug could be reported. Currently, when dealing with memset, asan analyzes only the first and the last accessed byte, so in your example the bug is reported only if a[99] lands inside a redzone; otherwise it may still be missed. For the heap, asan uses 128-byte redzones by default, but for the stack the redzones are only 32 bytes.
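The first/last byte check can be sketched with a small toy model in C. This is not asan's interceptor code; the function name, the flat layout, and the assumption that memory past the right redzone is ordinary addressable memory are all hypothetical. Only the sizes (16-byte array, 100-byte memset, 32-byte and 128-byte redzones) come from the comments above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_SPACE 512   /* size of the toy address space (assumption) */

/* Returns true if a fill of n bytes starting at the object would be flagged
 * by a check that looks only at the first and the last byte touched.
 * Assumed layout:
 *   [left redzone][object][right redzone][other addressable memory] */
static bool would_report(size_t obj_size, size_t redzone, size_t n) {
    unsigned char poisoned[TOY_SPACE];
    size_t obj = redzone;            /* object starts right after the left redzone */
    size_t right = obj + obj_size;   /* right redzone starts right after the object */

    /* The toy only models accesses that stay inside its address space. */
    assert(n > 0 && obj + n <= TOY_SPACE && right + redzone <= TOY_SPACE);

    memset(poisoned, 0, sizeof poisoned);   /* everything addressable ...   */
    memset(poisoned, 1, redzone);           /* ... except the left redzone  */
    memset(poisoned + right, 1, redzone);   /* ... and the right redzone    */

    /* Only the two end points of the range are inspected. */
    return poisoned[obj] || poisoned[obj + n - 1];
}
```

Under these assumptions, would_report(16, 32, 100) is false, because a[99] lands past the 32-byte redzone, while would_report(16, 128, 100) is true, because a[99] is still inside the 128-byte redzone. A smaller overflow such as would_report(16, 32, 40) is also caught.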



