I'm going to have to request a citation for the claim that it's possible to do SQL injection via DTMF or IVR. The article doesn't seem to provide any explanation as to how that might work.
As a past practitioner of Big Freaking Enterprise Apps who has seen the sausage get made, and also as someone who has shot his own foot off with telephony APIs, this does not cause me any doubt whatsoever. Here's a fairly straightforward implementation:
String[] availableActions = {"raw SQL for read last payment date", "raw SQL for read current balance", ...};
String sqlToExecute = availableActions[dataFromCaller];  // dataFromCaller: the menu option the caller keyed in or spoke
return new SQLStatement(sqlToExecute, bindParams).execute();  // Hah, I am immune to SQL injections because I used bind parameters; I am the awesome
I mean, clearly I've given you the ability to crash this method already, right? (You can cause ArrayIndexOutOfBoundsException, trivially.)
There's all sorts of fun available if six months later a different team updates the utility method that actually defines availableActions to, e.g., add in SQL which is only used by sysadmins, through the web interface.
Can you explain this further? I'm sure I'm missing something obvious, but what exactly would dataFromCaller be (I'm assuming the result of a function that maps an analyzed audio sample [input] to a number), and how could I manipulate it to cause the ArrayIndexOutOfBoundsException?
By saying "six" when the system just listed options 1 through 5 to you - or, as patio11 hinted, that may actually allow you to execute SQL statements not intended to be accessible to customers at all.
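A minimal sketch of one way to close that hole, assuming the caller's choice arrives as an int. The names CustomerMenu, Action, and actionFor are hypothetical and do not come from the snippet above. The phone menu gets its own fixed table instead of indexing into a shared list that another team might later extend:

```java
import java.util.Map;
import java.util.Optional;

// Hypothetical names throughout; an illustration, not the code from the thread.
class CustomerMenu {
    // Actions the phone menu is allowed to trigger. Queries added elsewhere
    // (e.g. sysadmin-only SQL) are unreachable from here by construction.
    enum Action { LAST_PAYMENT_DATE, CURRENT_BALANCE }

    // Explicit digit -> action table for callers, separate from any shared list.
    private static final Map<Integer, Action> CALLER_OPTIONS = Map.of(
            1, Action.LAST_PAYMENT_DATE,
            2, Action.CURRENT_BALANCE);

    // Returns empty for any digit the menu did not offer, instead of throwing
    // or reaching an entry the caller was never meant to see.
    static Optional<Action> actionFor(int digitFromCaller) {
        return Optional.ofNullable(CALLER_OPTIONS.get(digitFromCaller));
    }

    public static void main(String[] args) {
        System.out.println(actionFor(1)); // an offered option
        System.out.println(actionFor(6)); // "six" when only 1 and 2 were offered
    }
}
```

With this shape, saying "six" yields an empty result that the IVR can answer with "invalid option", and growing the shared SQL list has no effect on what callers can reach.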
Thanks for the explanation. I'm still skeptical that SQL could be injected though. I'm not seeing how DTMF or speech would translate to the punctuation usually required for that type of attack.
P.S. Coding with telephony APIs sounds horrible. I used to do phone system installs (Nortel, etc) - glad I don't anymore!
I'm not seeing how DTMF or speech would translate to the punctuation usually required for that type of attack.
You're letting the user specify an index into an array. There may be direct memory access involved (C code, shudder, somebody get Thomas; that stuff scares me). This combination has been proven to get a lot of people to any input they want and a whole lot they don't.
Time and time again: don't "trust" user data. Sanitize it before passing it to a subsystem or DSL, no matter how improbable it seems that a user could insert unexpected characters.
Don't "sanitize" either; it's next to impossible to get right. Treat anything that came from user input as tainted (your language should be able to help you with this), and never pass a tainted string to a database except as a parameter.
Whitelisting works better than blacklisting, I think. You can accidentally make a whitelist too permissive, but at least you can correct it after the fact. A blacklist might never be restrictive enough, and you will end up fighting a losing battle. So a function “sanitize :: Tainted a → Filter a b → b” is acceptable as long as individual filters are secure.
I'm not sure I'm reading your notation right. Do I understand correctly that you think two sanitization functions applied in sequence are safe if both functions applied by themselves are safe? I think that that is true as a mathematical statement but often gets explosively disproven in production systems, because e.g. you can exploit different assumptions made by the filters in ways which were not anticipated (to say nothing of formally proven to be impossible), such that once-safe input becomes viable attack code.
For example, if you have both a) Unicode canonicalization and b) escape_single_quotes() running around your application, I really do not like your chances of beating a fuzzer (plus, optionally, a pen-tester who understands encoding issues).
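To make that concrete, here is a small sketch of how those two filters can interact. It assumes the canonicalization step is NFKC normalization via java.text.Normalizer, and that escapeSingleQuotes (a hypothetical stand-in for escape_single_quotes()) simply doubles ASCII quotes. The fullwidth apostrophe U+FF07 is not an ASCII quote when the escaper sees it, but it becomes one after normalization:

```java
import java.text.Normalizer;

// Hypothetical filters for illustration only.
class FilterOrdering {
    // Doubles every ASCII single quote, the classic SQL-style escape.
    static String escapeSingleQuotes(String s) {
        return s.replace("'", "''");
    }

    // NFKC compatibility normalization: folds look-alike characters such as
    // U+FF07 (fullwidth apostrophe) to their ASCII equivalents.
    static String canonicalize(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFKC);
    }

    public static void main(String[] args) {
        String input = "x\uFF07 OR 1=1"; // contains a fullwidth apostrophe

        // Escape first, canonicalize second: the escaper finds nothing to
        // escape, then normalization produces a bare ASCII quote.
        System.out.println(canonicalize(escapeSingleQuotes(input)));

        // Canonicalize first, escape second: the quote is doubled as intended.
        System.out.println(escapeSingleQuotes(canonicalize(input)));
    }
}
```

Each filter behaves correctly in isolation; only the ordering decides whether an unescaped quote survives, which is the kind of assumption mismatch a fuzzer tends to find.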
Sorry, perhaps you may find “template<class A, class B> B sanitize(Tainted<A> tainted, Filter<A, B> filter)” more readable. I was saying that “sanitize” is fine in principle—it’s unexpected interaction between filters that’s dangerous, as you say. The problem is that input filters seem more composable than they actually are. I think the best you can do in practice is attack your code with a large number of random inputs and assert that the expected properties hold.
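A rough Java rendering of that signature, together with the "attack it with random inputs" idea. Everything here (TaintSketch, DIGITS_ONLY, holdsForRandomInputs) is a hypothetical name chosen for illustration, and the loop is a toy property check under those assumptions, not a real fuzzer:

```java
import java.util.Random;
import java.util.function.Function;

// Hypothetical sketch of sanitize(Tainted<A>, Filter<A, B>) in Java.
class TaintSketch {
    // Wraps untrusted input so it cannot be used as a plain value by accident.
    static final class Tainted<A> {
        private final A value;
        Tainted(A value) { this.value = value; }
    }

    // A filter is a function from raw input to a trusted representation.
    interface Filter<A, B> extends Function<A, B> {}

    // The only way to unwrap a Tainted value is through a filter.
    static <A, B> B sanitize(Tainted<A> tainted, Filter<A, B> filter) {
        return filter.apply(tainted.value);
    }

    // Example whitelist filter: keep only ASCII digits, drop everything else.
    static final Filter<String, String> DIGITS_ONLY = s -> s.replaceAll("[^0-9]", "");

    // Feeds random strings through the filter and checks the expected
    // property (output consists solely of ASCII digits) on every trial.
    static boolean holdsForRandomInputs(int trials, long seed) {
        Random rng = new Random(seed);
        for (int i = 0; i < trials; i++) {
            StringBuilder sb = new StringBuilder();
            int len = rng.nextInt(20);
            for (int j = 0; j < len; j++) {
                sb.append((char) rng.nextInt(0x3000)); // arbitrary slice of the BMP
            }
            String out = sanitize(new Tainted<>(sb.toString()), DIGITS_ONLY);
            if (!out.matches("[0-9]*")) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(sanitize(new Tainted<>("a1'b2"), DIGITS_ONLY));
        System.out.println(holdsForRandomInputs(1000, 42L));
    }
}
```

A check like this only exercises one filter at a time; to probe the composition problem discussed above, the same loop would need to run over the full chain of filters as deployed.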