Very interesting. I'm trying to do something closely related for other reasons.
One idea is that it might be possible to brute-force the semantics of short snippets of code using genetic algorithms. A similar technique has been demonstrated a few times by the author of [1].
Eventually I want to use this to rapidly search a large number of binaries for insecure behavior. But to do that I need to be able to formulate questions like:
"Find me a function where attacker-controlled data is marshaled to a size type and then used to allocate memory, to which a different attacker-controlled amount of attacker-controlled data is written".
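To make that question concrete, here is a rough sketch of how it could be phrased as a check over a toy, straight-line IR. Everything in it is an assumption for illustration: the `Op` record, the four op kinds (`input`, `cast_size`, `alloc`, `write`) and `find_suspicious_allocs` are made up, not the format or API of any real tool, and a real query would also need control flow, aliasing and so on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """One statement in a hypothetical, straight-line toy IR."""
    kind: str          # "input", "cast_size", "alloc" or "write"
    dst: str = ""      # variable defined by this op, if any
    args: tuple = ()   # variables used by this op


def find_suspicious_allocs(ops):
    """Return buffers whose size comes from one attacker input and which are
    written with attacker data using a length from a different attacker input."""
    origin = {}          # variable -> attacker input it derives from
    alloc_size_src = {}  # buffer -> attacker input that determined its size
    hits = []
    for op in ops:
        if op.kind == "input":
            # Attacker-controlled value enters the function here.
            origin[op.dst] = op.dst
        elif op.kind == "cast_size":
            # Marshaling to a size type keeps the taint of its operand.
            src = origin.get(op.args[0])
            if src is not None:
                origin[op.dst] = src
        elif op.kind == "alloc":
            # Remember which input (if any) controls the allocation size.
            src = origin.get(op.args[0])
            if src is not None:
                alloc_size_src[op.dst] = src
        elif op.kind == "write":
            buf, data, length = op.args
            size_src = alloc_size_src.get(buf)
            len_src = origin.get(length)
            if (size_src is not None and len_src is not None
                    and data in origin and len_src != size_src):
                hits.append(buf)
    return hits


# Size comes from input n, but the write length comes from input m.
vulnerable = [
    Op("input", "n"), Op("input", "m"), Op("input", "payload"),
    Op("cast_size", "sz", ("n",)),
    Op("alloc", "buf", ("sz",)),
    Op("write", "", ("buf", "payload", "m")),
]

# Same allocation, but the write length is the same value as the size.
safe = vulnerable[:-1] + [Op("write", "", ("buf", "payload", "sz"))]

print(find_suspicious_allocs(vulnerable))
print(find_suspicious_allocs(safe))
```

The point of the sketch is only that the question reduces to "where does each value come from", which is the part I'd hope recovered semantics could answer at scale.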
Basically this: https://github.com/Battelle/PaperMachete
[1]: https://github.com/xoreaxeaxeax/sandsifter