
So, real question: in the "offset"/"limit" example, what makes it any safer if the programmer initially sets those types to plain integers? The same problem persists, does it not?

Does the explicit creation of a type add this introspection? I'm not convinced that it does. Now, once you fix this bug, encoding it in a type prevents it from creeping into other parts of the code. This seems more like DRY principles in action.



Yeah, it seems to be more about guarantees as a code base grows larger and more people touch it.

If there's a Limit class whose constructor and setters all check that the value is in range, say 5 to 100, and all existing code that needs a limit uses an instance of Limit, it becomes less likely that a code change uses the limit input exactly as the user provided it (and thus possibly out of range).
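Something like this sketch, in TypeScript (the class name and the 5–100 range here are just illustrative):

```typescript
// Illustrative sketch: a Limit that can only exist in a valid state.
class Limit {
  readonly value: number;

  constructor(value: number) {
    // Reject anything outside the (hypothetical) safe range up front.
    if (!Number.isInteger(value) || value < 5 || value > 100) {
      throw new RangeError(`limit must be an integer in 5..100, got ${value}`);
    }
    this.value = value;
  }
}
```

Because the field is readonly and the only way in is the constructor, holding a Limit is itself evidence the check already ran.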

But you'd still need to have had someone be smart enough to make sure the Limit class does prevent limits that could cause DB crashes.

In practice I'm thinking, ok, so someone must have thought... Hey we should validate this user input and put in some logic for it.

So I think what this says is: validation works by having all external input validated as it is received. But it can be easy to make a code change at the boundary where you forget to add proper validation. If all existing functions in the lower layers, like in the data access layer, are designed to take a Limit object, the person who took a limit as external input and was about to pass it to the query function will get a compile error and realize... oh, I need to first parse my integer limit into a Limit. That reminds them to use the thing that enforces the valid range.
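To make the compile-error point concrete, a hedged sketch (all names and the range are made up): the lower layer's signature demands a Limit, so a raw number from the request can't slip through unparsed.

```typescript
// The valid-range rule lives in one place.
class Limit {
  readonly value: number;
  constructor(value: number) {
    if (!Number.isInteger(value) || value < 5 || value > 100) {
      throw new RangeError(`limit out of range: ${value}`);
    }
    this.value = value;
  }
}

// Data-access layer: only accepts a Limit, never a bare number.
function queryUsers(limit: Limit): string {
  return `SELECT * FROM users LIMIT ${limit.value}`;
}

const raw = 25; // imagine this came straight from the request
// queryUsers(raw);                     // compile error: number is not a Limit
const sql = queryUsers(new Limit(raw)); // forced through the validation
```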

If instead the code had a util function called assertValidLimit, and the query function took the limit as an integer, it would be easy for that person to forget to call assertValidLimit when getting the limit from the user, and then pass the unvalidated value to the query, possibly causing a vulnerability.

And lastly, they seem to argue, you could validate in the query function itself, so it wouldn't matter if callers forget to validate, since the one place where it matters would do it. But it's hard to fail at that layer: you might already have made other changes by then, and aborting there can leave your state corrupted.

So basically it seems the argument is:

"It is best to validate external input at the boundary as soon as it is received, but it can be easy to forget to do so, and that's dangerous. So, to help you not forget, have all implementing functions take a different type than the type of the external input, which will remind people... oh right, I need to parse this thing first, and in doing so assert it's valid as well."


Well said! I would only like to add that I highly discourage adding validations/assertions in the actual data class; this often makes them hard to work with and reuse. It is better to have this parsing logic as a simple function, perhaps at the factory level if you prefer that kind of flavor :)
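For instance (a sketch; parseLimit and the 5–100 range are invented for illustration), keep the data type plain and put the checks in a standalone parse function:

```typescript
// Plain data type: trivial to construct in tests, serialize, map over, etc.
interface Limit {
  readonly value: number;
}

// The parsing/validation lives in a function, not inside the class itself.
function parseLimit(n: number): Limit {
  if (!Number.isInteger(n) || n < 5 || n > 100) {
    throw new RangeError(`limit out of range: ${n}`);
  }
  return { value: n };
}
```

The tradeoff is that a structural type like this can still be built by hand, bypassing the check, which is why some people reach for branded types instead.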


Apologies if I did a poor job of explaining, what you wrote seems in agreement with what I was attempting to convey.

If one were only using integer types then the same problem would persist, that's correct. The problem would be solved by defining our limit type to only represent positive integers up to a specific safe value.

Type refinement is done on the input boundaries of the system during runtime to prevent errors from propagating.


This is not yet possible in TypeScript, but imagine if you could define a numerical subtype that requires the value to fall within some range, e.g.:

`type Limit = 0..100;`

See discussion here: https://github.com/Microsoft/TypeScript/issues/15480
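Until something like that lands, a common workaround is a "branded" number type: the brand exists only at compile time, and the only way to obtain a Limit is through a validating function (names and the 0–100 range here are illustrative):

```typescript
// A branded number: plain number at runtime, distinct type at compile time.
type Limit = number & { readonly __brand: "Limit" };

function toLimit(n: number): Limit {
  if (!Number.isInteger(n) || n < 0 || n > 100) {
    throw new RangeError(`expected an integer in 0..100, got ${n}`);
  }
  return n as Limit; // the cast is confined to this one function
}
```

Functions that take a Limit then get the range guarantee for free, since a bare number won't typecheck.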



