What I am big on is forcing developers to make deliberate choices. That's why I like React's policy of naming functionality "dangerouslySetInnerHTML" or "__SECRET_DOM_DO_NOT_USE_OR_YOU_WILL_BE_FIRED".
If you add usages for these in a PR I'm reviewing without justification, it's not getting merged.
So why not make cryptographically unsafe random unsafeRandom() or shittyRandom() or iCopyPastedThisFromStackOverflowRandom()?
This assumes writing crypto code is the most common use case for random numbers.
How often do you write crypto code?
vs
How often do people use random numbers + threshold for A/B tests? How often do game developers use random numbers for gameplay variety? How often is random used for animation variety? Do these use cases need the overhead of a cryptography RNG?
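For instance, the A/B-test case is typically deterministic per-user bucketing, where nothing adversarial is at stake. A minimal Python sketch (function name and threshold are made up):

```python
import random

def in_experiment(user_id: int, fraction: float = 0.5) -> bool:
    # Seed a cheap PRNG with the user ID so each user lands in a stable
    # bucket across sessions; unpredictability to an attacker is irrelevant.
    return random.Random(user_id).random() < fraction

enrolled = sum(in_experiment(uid, 0.5) for uid in range(10_000))
print(enrolled)  # roughly half the users
```

None of this needs a CSPRNG; it needs speed and reproducibility.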
A former employer had the same issue as in the article: the security team implemented an automated vulnerability scanner in our GitHub Enterprise instance, and it spammed comments and marked a review as requiring changes on any merge request that touched a file using java.util.Random. It lasted a day before the security team was made to turn it off, because on our team (and many others) literally zero uses of random numbers required a secure random.
Making one specific aspect of incompetent crypto coding “safer” doesn’t solve any of the problems.
We can argue about what threshold you need to reach to be considered “capable” of writing crypto code (that’s not just a learning exercise), but not knowing the deficiencies of random() is clearly well below that bar. Even knowing that is barely past the “don’t eat your crayons” level of skill.
If anything, a hook in your CI pipeline that automatically fires anybody who checks in code using random() in a cryptographic context is a better way to “make all of us safer”.
You could be a great mathematician and not realize what kind of random you're picking. It's not as simple as you put it.
This discussion goes for other stuff too. Rust is getting popular because even amazing C and C++ devs make terrible mistakes that cause severe security and privacy problems down the line.
Just because someone is a great mathematician does not mean they would be a great cryptographer. Mathematics and cryptography are about as related as computer science and cryptography (not that much).
Not sure what you have in mind when you say GUIDs, but the word "guid" doesn't convey any info here. Most people are referring to UUIDs when they say GUID, and a v4 UUID has 122 bits of randomness. A random 122-bit number can certainly be sufficient for most applications, like API keys over the network.
Some standard uuid libraries have weird not-very-random fallbacks when the RNG fails to initialize. It’s rare, but you can get strange stuff without realizing it. Better to use a real RNG and check error codes.
I genuinely don't see the reason why non-cryptographic random number generators exist outside of niche applications.
The main arguments I've seen are speed and determinism.
However, a cryptographically secure, deterministic PRNG can be built from hash or block cipher primitives that have hardware acceleration, making them quite fast. Seed (and potentially periodically re-seed) it from a strong source of randomness, and you've got a fast and cryptographically secure non-deterministic PRNG.
I thought that "classic" PRNGs like the widespread Mersenne Twister even had issues that can cause practical problems when used in certain kinds of simulations (Monte Carlo, possibly) that rely on large amounts of random numbers, but I haven't been able to find a clear source for this.
I'm certainly defaulting to secure ones, and I'm surprised modern languages and libraries don't do this by default for their standard randomness functions.
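As a rough sketch of that idea: a deterministic generator built on a hash primitive, seedable from a fixed value (for reproducibility) or from `os.urandom` (for unpredictability). This is stdlib-only illustration; real code should use an audited DRBG construction, not this toy:

```python
import hashlib

class HashDRBG:
    """Toy deterministic generator: SHA-256 in counter mode over a hashed seed."""
    def __init__(self, seed: bytes):
        self.key = hashlib.sha256(seed).digest()
        self.counter = 0
        self.buffer = b""

    def random_bytes(self, n: int) -> bytes:
        while len(self.buffer) < n:
            block = hashlib.sha256(
                self.key + self.counter.to_bytes(8, "big")
            ).digest()
            self.counter += 1
            self.buffer += block
        out, self.buffer = self.buffer[:n], self.buffer[n:]
        return out

# Fixed seed -> reproducible stream; os.urandom(32) as seed -> unpredictable one.
a = HashDRBG(b"fixed-seed")
b = HashDRBG(b"fixed-seed")
assert a.random_bytes(16) == b.random_bytes(16)
```

With hardware-accelerated SHA or AES primitives, this general shape is why a secure deterministic PRNG can still be fast.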
> I genuinely don't see the reason why non-cryptographic random number generators exist outside of niche applications.
Because for well over 99.99% of developers, cryptography is a “niche application”.
I’ve never written crypto code I’ve deployed anywhere. If I need crypto, I use the highest level crypto library I can find that people I trust who _do_ know about crypto recommend.
The only time I recall a non-cryptographic random() function surprising or affecting me was way back when I discovered that if you forgot to seed the random number generator on an Apple II, the games I wrote in BASIC all started out with the same "random" choices.
Cryptography is rare, but generating other data that needs to be unpredictable (e.g. session IDs, password reset tokens, random passwords getting generated, gift card codes, real-money gambling numbers) is quite common, I think.
And the default implementation of random() doesn't seem to be any faster than AES-CTR (which is the core of one form of secure PRNG).
It's true that most people who aren't doing sensitive work don't need cryptographically secure random number generators. But if you use something secure by default, it probably won't cause problems, and to the extent that it does, you can catch them with some profiling. If you use an insecure RNG by default, it probably won't cause problems either, but if it does, you'll find out when a black-hat hacker compromises your system in production.
Very few people set out to roll their own crypto. The issue in my experience is less about someone writing their own hand-optimized password hash function and more about people having overly-narrow views of what counts as security critical code.
The problem is the “banning” things as suggested in the article leads to bullshit like FIPS 140-2 mode.
So MD5 hashes are disabled in the system library even for applications where they present no meaningful risk, for example. The other issue is that bans usually come with lists of things that are and aren't banned. So now you are stuck waiting for some disinterested committee to approve something that delivers a benefit.
Use the minimal amount of resources and scale up. If you need mostly random don't force cryptographic level random. It adds unnecessary processor cycles and reduces speed.
I benchmarked it, and AES-CTR is faster than `random()` on a machine with AES-NI.
That's my main point: It does _not_ seem to be meaningfully more expensive to use cryptographic randomness. Yes, you could build a faster non-cryptographic PRNG, but that's not what is done by the default library.
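A rough way to sanity-check the relative cost on your own machine is to time the stdlib PRNG against the secure one. Note that in Python the interpreter overhead dominates, so treat the numbers as indicative only:

```python
import random
import secrets
import timeit

N = 100_000
# Mersenne Twister (insecure default) vs. the OS-backed CSPRNG.
t_insecure = timeit.timeit(lambda: random.getrandbits(64), number=N)
t_secure = timeit.timeit(lambda: secrets.randbits(64), number=N)
print(f"Mersenne Twister: {t_insecure:.3f}s  CSPRNG: {t_secure:.3f}s")
```

In lower-level languages with AES-NI, a userspace AES-CTR generator avoids the syscall per draw and closes the gap further.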
The most common case for me is when fuzzing, when I want reproducible random numbers, so it must be possible to initialize the RNG from a known fixed seed so that I can re-run and debug a failure.
Hence my suggestion to use a deterministic cryptographic generator: This way, you can either get "proper" randomness by seeding it with a random value, or predictable (and yet strong against anyone who doesn't know the seed) randomness by seeding with a fixed value.
> It lasted a day before the security team was made to turn it off, because on our team (and many others) literally zero uses of random numbers required a secure random.
Can concur: we're currently approaching 400k SLOC of C++ in the repo. A quick-and-dirty grep turns up a few dozen different places where random is needed. Literally 0% of it is for secure stuff. Most of it has to be as fast as possible (and can be very low quality, since it just needs to be noisy enough to look random to human perception).
This just kind of proves GP's point. Random APIs usually tell you what the RNG is, but not the why/how. Most people don't care if it's /dev/(u)random, Mersenne twister, PCG, LFSR, LCG, RDRAND, etc. They care about roughly 4 attributes:
- Is it good for crypto
- Is it fast
- Is it reproducible
- Is it portable
But fundamentally, it's about the use case and interface:
- I need secure random (strong, slow, secure)
- I need Monte Carlo (good enough, fast, reproducible)
- I need chaotic behavior for my game/stress test/back off protocol (usually can be barely random, fast, reproducible)
I think calling the last case InsecureRandom or RandomEnough is reasonable to convey "don't use me for secure purposes".
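In Python terms, the three use cases already map onto different standard-library tools; a sketch of the intent-based split:

```python
import random
import secrets

# 1. Secure random: strong, slower -- tokens, keys, anything adversarial.
session_token = secrets.token_urlsafe(32)

# 2. Monte Carlo: good enough, fast, reproducible from a fixed seed.
mc = random.Random(12345)
samples = [mc.random() for _ in range(1000)]

# 3. Chaotic-but-cheap: gameplay variety, jittered backoff, stress tests.
jitter = random.Random(98765).uniform(0.5, 1.5)
```

The names `secrets` vs. `random` do some of the "don't use me for secure purposes" signaling, though not as loudly as `InsecureRandom` would.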
Interestingly, a major aspect of video game speedrunning is figuring out how the game generates random numbers and then exploiting that knowledge. For example, speedrunners avoid all random battles in an RPG with this tactic. I'm not arguing games need true random, for the record.
What does that have to do with being verbose and letting developers know they are using an insecure method when it would apply to them?
If I'm writing code and using rng for gameplay variety, and then I notice that I have to use a function called "insecureRandom", at the very least I'm going to read up on an interesting aspect of computing and be a little more informed at the end of the day.
Because suitability for use in secure algorithms is just one property of the random number generator. At what point do we then decide it needs to be uniformInsecureBoundedRandom?
Why don't we apply the same logic to string comparisons. Should we replace String.equals with String.shortcuttableEquals()? Since there's plenty of circumstances where that is inappropriate for crypto uses also.
What about other functions with important caveats? Should we have mailGmailMightReject()? fsyncCantFixHardware()? file.existsAtCurrentInstant()?
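The string-comparison analogy is real, for what it's worth: Python already ships a separately named constant-time comparison for exactly this reason:

```python
import hmac

# Ordinary == bails out at the first differing byte, leaking timing info.
# hmac.compare_digest takes time independent of where the inputs differ.
assert hmac.compare_digest("secret-token", "secret-token")
assert not hmac.compare_digest("secret-token", "secret-tokeX")
```

The short, obvious name (`==`) stays fast and general; the security-sensitive variant gets the explicit name.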
The other thing is, true randomness doesn't seem random to humans, which is why Spotify and others had to modify shuffle. So true random might not be appropriate for the use case.
But I bet that the main goal is not "Let's make this change so it sounds more random, and then people will like it because they think it's randomer." Rather, it's "Let's make this change so it sounds better, and also as a side effect people might think it's more random."
The creation of a trueRandom function certainly seems to solve this problem more than taking away a useful tool for cases where pseudo-random is good enough.
It's really not clear cut in either way on the surface.
On one side, you can argue that leaning people towards true random will cause unnecessary performance impact because the majority of cases don't need true random.
On another side, the impact of not using true random could cause a catastrophic result for a large number of people.
So which has more weight? I dunno.
In either case, it would be nice if developers knew the consequences of using either method, so this discussion is really more about education than anything else.
>the impact of not using true random could cause a catastrophic result for a large number of people.
And the impact of using a 1000x slower trueRandom could cause catastrophic results for an even larger number of people, since PRNGs are by far most often used where speed is more important than security.
And once you pick a "true random", how true is it? Will it be secure in 10 years? Will we then need a "truerTrueRandom" to mitigate that true random has failed to pass future mathematical or hardware tests? Will it return random numbers fast enough for future uses?
It's a rabbit hole. Let developers use the one they need, and since the vast majority does not need secure random, don't force it on them at significant cost.
If your crypto developer cannot know which to use you're going to have a lot more holes in your crypto than the RNG.
Why not? I feel the same about naming a method to shame/discourage use as you suggested for outright banning. It shouldn't require a justification to use non-CSPRNG, because most use cases I run into for random are not crypto, because I don't write my own crypto.
It's also a good idea to give safer things shorter names.
So make random() a CSPRNG (and an alias for SecureRandom() for people who want to be explicit) while InsecureFastRandom() is just what it says and has no other name. Then if you really need performance over unpredictability, it's there, but nobody is confused about what they're getting. And lazy people who don't like to type or pay close attention get the safe one.
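A sketch of that naming scheme in Python (all names here are hypothetical, mirroring the suggestion above):

```python
import secrets
import random as _mt  # the stdlib Mersenne Twister, deliberately tucked away

def SecureRandom() -> float:
    """Cryptographically strong float in [0, 1)."""
    return secrets.randbits(53) / (1 << 53)

random = SecureRandom  # the short, lazy-friendly name is the safe one

def InsecureFastRandom() -> float:
    """Fast, predictable PRNG -- the name says exactly what you get."""
    return _mt.random()
```

The lazy path lands on the secure generator, and opting into speed-over-unpredictability requires typing the warning yourself.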
random() should be the most universally applicable random which includes making it as secure as possible. Non-universally applicable randoms should be named accordingly.
Actually, in my experience using the default random implementation in games:
* It's not fast enough.
* It has patterns that can be seen if you are using it to, for instance, generate 2d noise.
So for games you'd typically use, say, the Mersenne Twister [1], which is faster (amortized) and is distributed evenly across 623 dimensions. [2]
It's not cryptographic, but it's far better for games. If you're not going to have a crypto default random, better to at least have a really good and really fast one.
Mersenne Twister is the MD5 of random number generators: it's neither secure nor fast (and, unlike MD5, it's not space-efficient or simple to implement either). You can have something with better randomness that runs several times faster, uses less space, and has a much more compact and simple implementation.
So given that Mersenne Twister is worse in pretty much every way other than having a catchy name (likely the sole source of its continued popularity), it's probably not a good choice when replacing some "default" random number generator with something better.
I don't think better randomness than MT is as high a bar as you make it out to be, both links above also briefly mention statistical shortcomings of MT.
Patterned random numbers become a feature in some games, even if an unintentional one.
I used to be able to play about a dozen Pac-Man levels knowing exactly where every ghost was going to go and every bonus that was going to appear. And I wasn’t a very good player.
Pac-Man would have been a less fun game for many people if it used crypto grade random.
Pac-Man ghosts didn't actually move all that randomly; each ghost had its own pursuit strategy. You likely learned their movement patterns without realizing it.
Look up “password generator” or similar terms on npm and take a look at how the packages you find generate random numbers. I did this ~5 years ago and it took until the second page of results before I found any packages that used a crypto-secure rng.
Even that seems unlikely to be problematic for anything short of literally constant seeds AND a generator becoming extremely popular.
The vast majority of people reuse low-entropy passwords; figuring out which password generator someone used would be a much higher bar than guessing passwords directly, and just knowing the insecure generator wouldn't reduce the entropy by that much.
Actually, a password generator on GitHub whose output is derived from literally just seconds-since-1970 would still be a good generator for almost all use cases.
Can you explain more? I genuinely don't see any plausible threat model that a user running a Math.random()-based custom password generator would be susceptible to but the same algorithm using SecureRandom would not. Both cases are so drastically better than manually thinking up a password that it's not even close.
I think if there's any gap, the advice would be to not roll your own password generator at all and only use ones authored by security experts: just using SecureRandom instead of Random isn't going to magically guarantee you didn't mess up some other way and write a low-entropy password generator.
I was trying to generate random string identifiers to make some element IDs unique and took the most popular library, which was accidentally crypto levels of secure.
The claim that most uses of random() are not in places where cryptographic security is needed is not in conflict with any list of examples where it is needed.
Here are the last several times I saw random() used.
Seeding a neural network for a Coursera course. Not only do you need to call random() a bunch of times, but the ability to set a seed and get deterministic results makes grading of the results massively easier.
Creating simulation data used for integration tests on a piece of software.
Picking a few random numbers that I used in an explanation of an answer.
Of course the plural of anecdote is not data. However in my corner of the world it is very rare to need cryptographically secure anything. And when I do, I know better than to code it myself. But it is common to need a lot of cheap numbers in a hurry.
chacha8 (chacha20 but with fewer rounds) seeded with either a deterministic seed or from /dev/urandom (depending on what you need) is a perfectly fine PRNG, and should run at 4 GB/sec on a single core, which is plenty.
Only 7-round ChaCha is cryptographically broken, so ChaCha20 has an excessive security margin. Using 8 rounds everywhere you use random today (and don't need a CSPRNG) should be a no-brainer, I guess.
On hardware with AES-NI (like all x86 for the last 10 years), a reduced-round AES-CTR should be very fast and also strong enough for the cases when you don't need a CSPRNG. Full AES-128 outputs 4.7 GB/sec on my laptop, and recent chips have better AES throughput.
Well of course an online poker game should be using a CSPRNG.
The parent comment said "Most simulations, games ...". Most. Not all. I think it's pretty obvious that poker would not be included in a statement of "most".
The real litmus test is the question of "What happens if a malicious actor is able to predict the random numbers?"
I personally parsed the claim as "most (simulations, games, everything)" rather than "(most simulations), games, everything", but I can see how that can go either way.
If you're using it to give hands to people, then no. If you suitably hash or whiten the outcome, then yes.
If you're using it to generate Monte Carlo hand playouts to compute % outcomes to assist such a game, then you most certainly want an extremely fast generator. Only slightly more complex than an LCG is the PCG family (it's built on an LCG underneath, and is just as simple and fast in most cases). And you'll need such Monte Carlo simulations to detect cheating, among other things.
So even for an online poker game you need to know what you're doing. Neither type of RNG will solve all the issues you need.
So yes, at the base a simple LCG would suffice if you know how to use it.
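For reference, a textbook LCG really is only a few lines (constants here are the widely published Numerical Recipes values); fine for Monte Carlo playouts, useless for anything adversarial:

```python
class LCG:
    """32-bit linear congruential generator. Fast, tiny, reproducible --
    and trivially predictable, so never use it for security."""
    MULT, INC, MOD = 1664525, 1013904223, 1 << 32

    def __init__(self, seed: int):
        self.state = seed % self.MOD

    def next_u32(self) -> int:
        self.state = (self.MULT * self.state + self.INC) % self.MOD
        return self.state

    def uniform(self) -> float:
        # Map the 32-bit state to a float in [0, 1).
        return self.next_u32() / self.MOD

rng = LCG(42)
values = [rng.uniform() for _ in range(3)]
```

PCG adds an output permutation on top of exactly this kind of state transition, fixing the LCG's weak low bits at almost no cost.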
If your devs for your online casino aren’t smart and experienced enough to know when blanket advice and recommendations about random numbers don’t apply to them - enjoy going broke, it shouldn’t take long...
I’d bet a lot of money that the number of times random() is used in a non-cryptographic context is many orders of magnitude higher than the use of random numbers in crypto code. That makes non-crypto use “the most universally applicable” case.
I would also agree that “non universally applicable random” such as those used in crypto code should be named accordingly. Which they are: secure_random() is the right choice for the vanishingly small number of developers writing crypto code, and random() is the right name for pretty much everybody except those who need to know a lot of other crypto-specific things as well. Those developers can’t use random() the same way they can’t use non-constant-time comparisons or algorithms that leak side channels via power monitoring or cache hit rates. Fixing random() and thereby letting people who don’t know they should have been calling secure_random() write code in niches where they don’t know enough to get everything else right is, without doubt, going to end up with way more code that is not “as secure as possible”, even if it happens to use secure random numbers.
The thing is, if you need random numbers fast, then profiling will tell you "Oops, used random() when I should have used fast_random()." That's an easy change to make, and it'd show up in profiling if performance were an issue.
If random() never comes up in profiling, why care that you are getting the secure version?
The danger of missing "secure_random()" is that it creates a security vulnerability (Potentially leading to loss of money, information, etc). The danger of going the "fast_random()" route is that your application will run slower potentially leading to a dev needing to spend time to swap in "fast_random()" for "random()". That, to me, doesn't seem like a major problem or risk.
Programmers are notoriously bad at predicting when something will end up being a performance issue. So why preoptimize because we assume most people want/need speed?
Then we’ll end up with a csprng getting used in a tight loop iterating over every pixel in a raytracer...
“Lazy people who don’t want to type” are not the sort of people I want writing the code I might use or interact with that requires cryptographically secure random numbers...
> Then we’ll end up with a csprng getting used in a tight loop iterating over every pixel in a raytracer...
Which will then be conspicuous enough for the developer to notice and fix it.
> “Lazy people who don’t want to type” are not the sort of people I want writing the code I might use or interact with that requires cryptographically secure random numbers...
Yeah, and if I remember correctly one neat technique for exploiting security vulnerabilities in Firefox was to use them in order to set turn_off_all_security_so_that_viruses_can_take_over_this_computer to true, with obvious results.
Rust's "unsafe" is a pretty bad name, and the reasoning behind using it is completely different. It doesn't mark something as "dangerously unsafe, don't use"; to a consumer it indicates "exercise caution", and to the compiler it just allows five things:
- Dereference a raw pointer
- Call an unsafe function or method
- Access or modify a mutable static variable
- Implement an unsafe trait
- Access fields of unions
The point of "unsafe" in Rust is to highlight which areas require more human attention, not to discourage its usage.
`dangerouslySetInnerHTML` is literally dangerous and allows XSS if used with outside input.
It is also faster than the other variant. The same is true for `random()`: both can be used, when you know what you are doing, to gain some performance.
Meanwhile, `unsafe` Rust by itself is no different from safe Rust in terms of speed. You have no choice but to use it in the places it's supposed to be used.
The point of `dangerouslySetInnerHTML` is also to highlight an area which requires more human attention. It's perfectly safe if you have otherwise handled escaping or validation of the content. It's just that you want to pay careful attention to that code to ensure that you're doing it correctly, whereas in normal React code you don't have to think about escaping at all because the runtime handles it for you.
Likewise, `unsafe` marks areas where you need to be really careful that you are upholding the safety invariants yourself, whereas in normal Rust code you don't need to think about that at all.
The word 'safe' in Rust has a very specific, technical meaning. 'unsafe' is simply code that is not automatically 'safe' in the Rust sense.
A non-Rust developer sees safe/unsafe and gets worked up, but that just means that he should put his rust-colored glasses [*] on.
This is not unusual in ICT, known for its colorful language. A non-ICTer hears 'black hat' and thinks about how cool and stylish the hat is. An ICTer hears 'hacker' and thinks about how cool and stylish the hack is.
Neither does React's dangerouslySetInnerHTML. What both of these do is mark a potentially dangerous operation with an in-your-face warning message that you have to go out of your way to ignore. Which in practice is very useful, as often the biggest problem with security issues is not knowing what you don't know. It's impractical to review the entire codebase on a regular basis, and it can be hard to know which bits to focus on.
I agree, with the caveat that use of random for cryptography is actually a domain specific use case.
It's probably okay to leave the function as it is and just drill into people that if you're doing cryptography, you either need to know exactly what you're doing all the way down to the hardware or you need to leave it a task for somebody else more specialized than you. I, for one, never assume random() is cryptographically secure, but it might be because I grew up programming during the era where random was computed off of clock cycles since CPU startup because there wasn't much other cheap entropy to lay a hand on ("battery-backed onboard date clock?! Oh, look who has AKERS money!").
Beginner friendliness is something to remember, too. There are half a dozen words you could use to describe pseudoRandom(). Random() is easy for a first year or non-professional to remember.
Most of the time the people who write and name the functions don't know it's not secure or safe. So you would still need to ban random when the new name is implemented.
The "ban" can be evaded by telling semgrep to ignore it for one line. https://semgrep.dev/docs/ignoring-findings/ This doesn't really scale though - if someone bans it with a different tool, you'd have to tell each tool to ignore this line.
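For illustration, semgrep's per-line escape hatch is just a comment on the flagged line (the rule ID here is made up):

```python
import random

# The trailing comment tells semgrep to skip this finding for this line only.
jitter = random.random()  # nosemgrep: insecure-random-rule -- retry backoff, not security
```

As noted, each scanner has its own suppression syntax, so the annotations don't transfer between tools.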
I have required parameter to push our app to production called: YES_I_HAVE_ALREADY_MERGED_THE_LIB_REPOS_AND_WAITED_FOR_THEM_TO_COMPLETE_BEFORE_MERGING_THE_APP_REPOS
Gets the point across and will still work when I'm long gone.
So you’re not big on bans but if you use dangerouslySetInnerHTML then it’s definitely not getting merged? Is that not a ban? Do you just not like when tooling enforces it?
No, as I said, it would raise a red flag. That flag can be lowered by justification, e.g. if you add types or constraints to only allow safe-enough parameters.
The root problem here is the notion that you need to choose between "strong and slow" randomness vs. "weak and fast" randomness. If every language's random() was strong and fast, most developers would never have to think about it.
"Strong" randomness is often too slow because every time you ask for new entropy, you make a syscall. The solution is to use 32 bytes of strong randomness to seed a userspace CSPRNG. You can generate gigabytes of secure entropy per second in userspace. If you need deterministic entropy, just use the same seed.
This isn't a one-size-fits-all solution, of course. If you only need to generate a few keys now and then, it's marginally safer to make a separate syscall for each of them. If you're targeting some tiny SoC, then sure, use xorshift instead. But what we care about is the common case, and right now the common case is a developer choosing the weak, deterministic RNG because it's faster and has a more convenient API and the secure RNG says "for cryptographic purposes" and well this usecase doesn't seem like cryptography, it's just a simple load balancer...
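A stdlib sketch of that pattern: one syscall for 32 seed bytes, then arbitrary amounts of derived output in userspace via an extendable-output hash. A real implementation would use a stream cipher like ChaCha20 or AES-CTR and reseed periodically:

```python
import hashlib
import os

seed = os.urandom(32)  # one syscall: 32 bytes of kernel entropy

# SHAKE-256 is an extendable-output function: one seed, as many bytes as needed,
# all computed in userspace with no further syscalls.
stream = hashlib.shake_256(seed).digest(1 << 20)  # 1 MiB of derived output
print(len(stream))
```

For deterministic entropy (tests, fuzzing), replace `os.urandom(32)` with a fixed seed and you get the same stream every run.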
> The solution is to use 32 bytes of strong randomness to seed a userspace CSPRNG
All cryptographic randomness generation should be performed by the kernel.
You always have to think about security because if you don't think about security you're going to get hacked. By all means, name the insecure randomness generation function ‘insecure_random’. It does help. But secure-by-default helps you only marginally because when building secure software you don't get to just use the defaults; you have to think about what they're doing.
You have to (for example) know and think about timing attacks even if you're using a cryptographic primitives library that's hardened against them, because it's really easy to introduce timing dependence into your own code and none of Daniel Bernstein or Tanja Lange’s careful designs will save you.
That's fair. There's no silver bullet for security. But we should not let the perfect be the enemy of the good. Everyone writing non-trivial systems should have some understanding of security; but the more components we make secure-by-default, the less those developers need to learn.
> You can generate gigabytes of secure entropy per second in userspace.
I haven't thought about this before, so please have patience:
I guess the "secure" qualifier does a lot of work in this sentence? That there's 32 bytes of "true entropy", but "secure entropy" is theoretically weaker but practically just as strong with reasonable assumptions about an attacker's computing resources.
So I'd guess the "secure" qualifier must mean something like "given any quantity of derived pseudorandom information, the seed bytes can't be efficiently deduced? Pretty neat. (I had a knee-jerk disagreement until I re-read your post and saw that you said "32 bytes", not "32 bits". Quite plausible -- and cool -- that we have a good solution with just a small amount more seed randomness though.)
To answer the question as to when you should use cryptographic random(), ask yourself "What is the worst that could happen if someone guesses the result of random()?"
If the answer is "I don't know," go cryptographic. You'll save your butt if you didn't know it was important.
If the answer is along the lines of "someone could impersonate a user, or leak information they shouldn't see," for the love of all that is holy, use cryptographic. This is basically every scenario where you are using random to generate an ID of some kind, and while it's only truly critical if that ID is all you need for validation, it does provide another layer of security even if you also require other information to match before giving out elevated access.
If the answer is "it defeats the algorithm I'm trying to do" (think something like ASLR, where you're randomizing the offsets of addresses so that attackers don't know where things are located), well, the reason why you need to use cryptographic should be blindingly obvious.
If the answer is instead "they can reproduce my results," well, you shouldn't use cryptographic in this case. And that's not a lot of cases: Monte Carlo simulations, testing, fuzzing are the obvious poster children for this category, and indeed reproducibility in these cases tends to be a highly valuable feature rather than an anti-feature.
Cryptographic random is almost never harmful to your application, and almost always provides some benefit in reducing guessability of your system. You should err on the side of using cryptographic random(), and only not use it when you are sure that guessability will not harm security in any way and you know that the cryptographic nature actively harms your application.
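Concretely, the split described above looks like this in Python (examples only):

```python
import random
import secrets

# Guessability would hurt: session IDs, reset links, gift codes -> cryptographic.
reset_token = secrets.token_urlsafe(32)

# Reproducibility is the feature: simulations, tests, fuzzing -> seeded PRNG.
sim = random.Random(2024)
trial = [sim.gauss(0.0, 1.0) for _ in range(100)]
```

The seeded branch can be re-run bit-for-bit to debug a failure; the secure branch cannot and must not be.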
I would argue that if you're asking yourself "What is the worst that could happen if someone guesses the result of random()?" and your answer is "I don't know," then you're doing something you shouldn't be doing.
There are a lot of domains where security is a non-issue and performance is a huge concern (graphics, game logic, many kind of simulations, etc), and the default is the reverse... always just use the installed non-secure random() and if that's too slow consider other options.
Having a flag you can enable to warn about a non-secure random() usage, when it makes sense for your company/usage, sure. But banning it outright makes no sense, and the default behavior you want is very situational.
When it comes to optimization, there's a useful adage: make it work, then make it fast. Secure should really be seen as a necessary component of correct (and that it often isn't is a testament to the failure of our profession).
In that vein, the default random should be cryptographically-secure, with all the logic necessary to actually effect that security (e.g., not reusing seeds after a call to fork). You can also go ahead and provide an insecure random as well, but choosing the insecure random should always be something that the programmer has to go out of their way to do.
Secure should really be seen as a necessary component of correct security. I don’t see random() as part of security, and the problem is that people use it as such (that’s the failure of our profession as I see it). You wouldn’t want the default string equality operator to be constant time to prevent a possible timing attack, and in the same way I don’t think random() should be cryptographically secure by default. If you need secure random values, you are (should be) a domain-expert and should be selecting an appropriate cryptographically secure random generator from a security library, in the same way you would with a constant-time equality function.
I guess it's a matter of perspective over who random() is for. I see random() as for the programmers who don't know what kind of randomness they need, and don't need to know that because they just need something 'random' not something secure. I expect the domain-experts to know that it's not what they need. In my mind it's not that random() is not secure, it's that using it for something it's not intended for is insecure.
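For what it's worth, Python's stdlib draws exactly this line for the equality case: the constant-time comparison lives in a security module rather than replacing the default operator (both names below are real stdlib APIs):

```python
import hmac

# Constant-time comparison for secret values (tokens, MACs);
# the ordinary == operator stays fast for everything else.
ok = hmac.compare_digest(b"expected-token", b"expected-token")
bad = hmac.compare_digest(b"expected-token", b"guess")
```

The domain expert reaches for `hmac.compare_digest` deliberately; everyone else never pays the constant-time cost.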
>When it comes to optimization, there's a useful adage: make it work, then make it fast.
When you need something to be fast, you better design it from the start to be fast. This is terrible advice for everything but some UI/web cases.
Speed is a feature. Not every feature can be just 'added' to existing code without changing most of it.
All of that is especially true for running simulations and the like. Whether these are fast is often determined by the architecture you decided on in the beginning.
Nowadays the world is full of libraries, frameworks, etc. that will never be as fast as the competition, because they can't become fast without changing their APIs completely.
It's true that there's no one-liner you can use to sum up the whole field of software engineering, but that doesn't mean none of them are useful.
I completely agree that things which need to be fast should be designed structurally to be fast.
But which RNG function you use in a given function is about as far as you can get from an architectural decision. You shouldn't need to refactor large swathes of your codebase to accommodate a substitution of one random number generator for a faster one.
Server-side folks generate random identifiers and shared secrets all the time. Yes, it's niche, but not "extremely" and you don't use a crypto library for this (you use secure random!)
There is a difference between generating these kinds of IDs and writing the generator for these kinds of IDs. You shouldn't be rolling your own UUID generator if you don't fully understand the concerns and requirements regarding your source of randomness.
Generally speaking, I'd agree the need for a cryptographically secure random is niche in that it is limited to the implementation of specific libraries/functions that, despite being widely used, should NOT be frequently re-implemented.
That's only 120 bits of entropy, which means that you'll get a collision after generating ~ 2^60 IDs.
Ok, maybe you're not worried about that scale; but I normally recommend 256-bit IDs in order to make sure that you don't need to worry about that possibility.
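The collision estimate above is the birthday bound; a quick sketch of the arithmetic, using the standard approximation p ≈ 1 − e^(−n²/2^(b+1)) for n uniform b-bit IDs:

```python
import math

def collision_probability(bits: int, n: int) -> float:
    # Birthday approximation: probability of at least one collision
    # among n uniformly random `bits`-bit IDs.
    return 1.0 - math.exp(-(n * n) / 2 ** (bits + 1))

# With 120 random bits, ~2**60 IDs gives a very real collision chance...
p120 = collision_probability(120, 2 ** 60)   # roughly 0.39
# ...while 256-bit IDs keep it negligible at the same scale.
p256 = collision_probability(256, 2 ** 60)
```

This is why the rule of thumb is that b-bit IDs start colliding around 2^(b/2) generations.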
Say you're making an online game, and you need an RNG on your server. Above all, this RNG needs to be unpredictable, or someone will easily game it. Most non-cryptographic PRNGs are very predictable, so it's dangerous to use them.
I think this is a scenario that (a) isn't "extremely niche," and (b) warrants CSPRNGs.
It's not that you shouldn't be using it necessarily, it's just that for many cases (games, procedural generation, graphics, many kinds of simulations) it's unnecessary and slow. In my experience, if someone doesn't know whether they need a cryptographically secure random(), or whether a given random() implementation is secure, then they (a) don't need it or (b) are trying to implement something they shouldn't be.
It is expensive to increase entropy of a random source. So for randomised algorithms you might not get the performance that merited the algorithms in the first place.
Cryptographic randomness has practically no downside if you use it for non-cryptographic purposes. Not true the other way round. And I'm inclined to say, given how many misconceptions there are around randomness, that people are not good at knowing whether they need secure randomness.
The only possible justification for insecure randomness would be performance, but you'd need to generate a lot of random numbers to even be able to measure that.
> Cryptographic randomness has practically no downside if you use it for non-cryptographic purposes
Cryptographic randomness is typically slower than other forms of randomness.
In all of the programming I've done in my career, I've only needed cryptographic randomness a few times. For the rest, a fast pseudorandom number generator seeded by the clock was the correct choice.
Inversely, in my career there's only been a handful of times where cryptographic randomness was too slow.
I'd argue it's better to do the safe thing by default and switching to the faster alternative when you have proof you need it. Doing the fast thing by default and fixing security later is how we got Meltdown/Spectre.
It's going to depend on your experience; I have often run into the exact opposite extreme, where the non-secure random is too slow for my use cases (games, graphics, procedural texture gen, etc.) and a much faster but less statistically random generator better suited my needs.
I would argue that the default behavior should favor the novice and non-domain-expert. Should game programmers, graphic programmers, etc, be expected to know that they need to tune the performance of random() or should the domain-experts writing cryptographic algorithms be expected to understand the limitations of random() as it applies to their use-case?
The game programmer who needed better performance can "just" switch to a faster algorithm when the profiling calls for it (and if you don't notice it, well, no harm anyway).
The guy who just needed to generate some cryptographic keys? Rotate everything, and you had some pretty horrible hidden vulnerabilities in the meantime.
> Inversely, in my career there's only been a handful of times where cryptographic randomness was too slow.
Yeah, that's the opposite of my experience. Often the libc random is too slow and limiting performance and I need to substitute something even faster.
But I agree that using something that is securely random by default is a good idea. People can substitute faster thing fit for their purpose if needed.
My counter would be that if someone "doesn't know whether they need secure randomness" then the problem is not that random() is not secure, it's the fact that someone is doing something they really should not be doing in the first place.
Obligatory mention here for the fine folks of systemd, who made a properly seeded CSPRNG a requirement for merely booting a system and then kept bricking people's systems when it turned out that finding that seed at boot time is a non-trivial problem. All for what, avoiding collisions in some hash table implementation?
I don't really care for the browser application, if you made a TLS connection in the first place obviously you better have the randomness and might as well make random() use that, but someone explicitly using a CSPRNG in a native application is a huge code smell on the level of implementing your own crypto.
As much as it's overkill for most people, I'm a fan of safe defaults so I say let random() be slow and good. It's better to find out your code is slow due to a slow random() than to find out it's broken because you didn't know and thought random() was really random.
If you need a fast source of randomness, for some Monte Carlo algorithm for example, then you know this and can pick a deliberate pseudo-random generator that fits your needs.
I worked on a Monte Carlo path tracer. Early on we swapped out the random number generator from the standard random(). Initially not for speed, but due to the poor distribution.
After optimizing other areas it became a bottleneck and we swapped it out again for a faster one.
It is. The question was how often that's the case. If 50% of the uses of random() are bad, then getting those fixed may be worth the cost of annoying the authors of the legitimate 50%.
It turned out to be much less useful than that. So they got rid of it.
Indeed, I use something like it from a vendor supplied C math library for a noise generator on an embedded app, where I really just care about its crude statistical behavior.
But short of saying "banned," any review of security critical code should include an explanation of where the random numbers are coming from and why they're trusted. Or in general for any code review: Why do you believe your numbers?
> Eg. for randomised algorithms you need a fast source of randomness.
Though typical random() implementations are LCGs, which have poor distributions when you look only at the least significant bits or project them into multiple dimensions.
As a result they may make some randomized algorithms perform poorly!
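The low-bit problem is easy to demonstrate: in any LCG with a power-of-two modulus and odd multiplier and increment (the constants below are the common Numerical Recipes ones, used here just as an example), the least significant bit simply alternates:

```python
def lcg(seed: int, a: int = 1664525, c: int = 1013904223, m: int = 2 ** 32):
    # A classic 32-bit linear congruential generator: x <- (a*x + c) mod m.
    x = seed
    while True:
        x = (a * x + c) % m
        yield x

gen = lcg(seed=42)
low_bits = [next(gen) & 1 for _ in range(16)]
# The least significant bit has period 2: it strictly alternates 1, 0, 1, 0, ...
```

So a randomized algorithm that keys decisions off `value % 2` gets a perfectly deterministic pattern, not randomness.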
Honestly, the one thing that got me into graphics (from physics and math) was just the incredible amount of: "you can literally do anything so long as you make it pretty in the end."
I took that as a life philosophy and it's been pretty great so far.
random() typically works by storing and updating some internal state, so successive calls return different numbers.
In shaders, the same code is executed in parallel for potentially every pixel. Storing state would mean pixels could only be calculated serially, slowing things down.
Hence you need a random-ish function that depends only on its input. Very low RNG quality is not a blocker, as long as things look good.
So in shaders you see a lot of random generators which simply take the pixel coordinate or something else that distinguishes 2 pixels and do some nonsense operations on them.
It's written in GLSL, a C-like language for shaders designed to be executed on the GPU.
The 2D vector used for input is to represent pixel coordinates mapped from 0 to 1 or -1 to 1 on both axes.
The magic numbers are nothing special, they are just large numbers to make the result unpredictable for the given input.
The top-level function is fract(), which takes the fractional part of a number. So if the result of the inner computation is twisted enough, it will be hard to trace it back to the original values. There are lots of variations on these one-liners, and most of them do a great job of producing noise.
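One of those variations, transliterated to Python purely to show the stateless pattern (this is the widely copied fract(sin(dot(uv, vec2(12.9898, 78.233))) * 43758.5453) one-liner, not any particular engine's function):

```python
import math

def shader_rand(x: float, y: float) -> float:
    # Stateless "random": the same pixel coordinate always yields the same value,
    # so every pixel can be computed in parallel with no shared state.
    v = math.sin(x * 12.9898 + y * 78.233) * 43758.5453
    return v - math.floor(v)  # fract(): keep only the fractional part

r = shader_rand(0.25, 0.75)  # in [0, 1), reproducible per coordinate
```

The quality is poor by RNG standards, but as the comment above says, it only has to look good.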
Just make Math.random() cryptographically secure; now all your apps are fixed, and no existing code breaks. I can't imagine anything relying on Math.random() being "less" random than a CSPRNG.
Why must CSPRNGs always have alternative, obtuse APIs? We're still stuck on C-style srand() + rand().
Cryptography is so ubiquitous now that failure to provide cryptographically secure random numbers should be viewed as a hardware flaw.
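For comparison, Python's stdlib already splits the two concerns into equally simple APIs: `random` (seedable Mersenne Twister, fast) and `secrets` (OS CSPRNG). Both calls below are real stdlib APIs:

```python
import random
import secrets

token = secrets.token_hex(16)  # OS CSPRNG: session tokens, keys, nonces
roll = random.randint(1, 6)    # Mersenne Twister: fine for gameplay, sampling
```

Neither side of the split is obtuse; the question in this thread is only which one the short, obvious name should point at.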
Some folks purposely want random-ish results. When OpenBSD was changing the behaviour of its legacy POSIX random functions it was observed:
This API is used in two patterns:
1. Under the assumption it provides good random numbers.
This is the primary usage case by most developers.
This is their expectation.
2. A 'seed' can be re-provided at a later time, allowing
replay of a previous "random sequence", oh wait, I mean
a deterministic sequence...
They went through the code, especially the third-party packages/ports, to identify uses:
> Differentiating pattern 1 from pattern 2 involved looking at the seed being given to the subsystem. If the software tried to supply a "good seed", and had no framework for re-submitting a seed for reuse, then it was clear it wanted good random numbers. Those ports could be eliminated from consideration, since they indicated they wanted good random numbers.
> This left only 41 ports for consideration. Generally, these are doing reseeding for reproduceable effects during benchmarking. Further analysis may show some of these ports do not need determinism, but if there is any doubt they can be mindlessly modified as described below.
This is exactly what I mean about being stuck in the C mindset. You're looking at the problem though the lens of what this giant pile of ancient C software does.
Why should newer languages take the same approach to APIs that these old code bases did? It's not like we're porting all those programs to JS. Those C APIs were written long before hardware could provide fast good randomness, heck even before cryptography was standard practice instead of a special use case.
Not to mention in JS you can't even seed the random number generator. If you want predictable "random" numbers, you should have to jump through additional hoops. By default random numbers should be cryptographically secure.
EDIT: It's also worth mentioning that from your reported dataset, 41 of 8800 programs analyzed used srand to get a repeatable set of "random" numbers. That's 0.47%. I'm happy to break less than half a percent of software if it helps prevent the far more ubiquitous failures of software using insecure random numbers.
A reproducible pseudorandom sequence is necessary for fuzz testing with randomized inputs. It isn't strictly the domain of "ancient C". Why should everyone be hobbled by security guarantees they won't need?
Also reproducible builds, non-fuzz testing of very complex systems, the list goes on.
Quite often CS papers describe randomized data structures and algorithms. Quite often (at least historically, not sure now) ML models were seeded from random states.
If you want something to be algorithmically reproducible, you need some pretty strong guarantees from component parts. If you're using a hash table, and that hash table makes use of nondeterministic state that you can't control, you need to make sure that you're not using it in a way that lets that nondeterminism leak out -- if it doesn't provide a deterministic iteration order, you shouldn't iterate over it.
Sometimes determinism can be won back (just using lookup methods on your hash table, sorting the hash elements after iterating over the hash table) but in some cases it's not really possible, and in many cases it's at least impractical. Not providing a mechanism for nondeterminism in the first place can be simpler, but it comes with different problems. (Security problems are an obvious example -- an OS that gives deterministic bits when you ask for random bits is a worry.)
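A small illustration of winning determinism back by sorting: Python randomizes string hashing between runs, so set iteration order can vary from process to process, but sorting restores a reproducible order at some cost:

```python
names = {"carol", "alice", "bob"}  # iteration order depends on the hash seed
deterministic = sorted(names)      # reproducible across runs and platforms
```

For lookups nothing changes; the nondeterminism only leaks out when you iterate.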
Why should everyone be compromised because fuzzers want to use the simpler rand() API instead of something different?
I'm not saying we shouldn't have a way to generate predictable "random" sequences. But the default, simple, API for random numbers in any given language should be secure.
The vast majority of software does not need pseudorandom sequences. Developers shouldn't have to think every time "is pseudorandom good enough for this use case?" It should just be strong random every time. If you need a pseudorandom we should have a separate API for that.
Do you have a guarantee that random() will give you the same sequence on different endians, on different libcs, on different 32/64-bit arches, on different OSes or even distros?
The expectation of reproducibility is probably lost if you switch any of the above "details". So the problem with bad random generation isn't only that the people who think its entropy is good are wrong; the ones who think rand() hasn't changed in the last 40(?) years are equally wrong.
I'm looking at the problem through the lens of changing the behaviour of an existing API.
If you want to create 'secure' APIs that 'do the right' thing, I'm not against it. But leave the old stuff around, or mark it deprecated, throwing warnings and errors on compilation even, for possible future removal—don't change it.
This was specifically a discussion of the impact of purposefully breaking a strict interpretation of a POSIX API, though. They looked at where it was used, which is appropriate.
If by "predictable" all that's meant is "you can deterministically recreate every output bit generated by the function, forever, using a single value that represents the starting state of the function", then you can certainly have that and a secure RNG function. Just use ChaCha20's function with the key representing your seed.
That might be a good default, but where would the seed come from? You also need to make sure that the order in which random bits are read is deterministic, which is a lot harder than it sounds.
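Leaving seed provenance aside, the deterministic-but-unpredictable stream itself can be sketched with SHA-256 in counter mode as a stdlib stand-in for ChaCha20 (the construction, not the specific primitive, is the point):

```python
import hashlib

def seeded_stream(seed: bytes):
    # Deterministic byte stream: hash(seed || counter), block by block.
    # Same seed -> the same bits forever; without the seed, the bits
    # are computationally unpredictable.
    counter = 0
    while True:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        yield from block
        counter += 1

s1 = seeded_stream(b"test-seed")
s2 = seeded_stream(b"test-seed")
first_a = bytes(next(s1) for _ in range(32))
first_b = bytes(next(s2) for _ in range(32))  # identical replay
```

The read-order caveat above still applies: two threads pulling from one stream interleave nondeterministically unless you serialize access.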
> Okay, sure, but I might be running a simulation or something, why should I be punished because some idiot decided to srand(time(0))?
It's about cost/benefit. As far as the compiler/framework knows, there's a 20% (say) chance that you've just introduced a major security bug into your program. Doesn't the benefit of requiring some explicit acknowledgement of that case outweigh the cost? You contrast "I" with "some idiot", but the evidence of the last 20+ years is that most programmers who think they can write secure code can't; if you make those kind of warnings only to people who opt-in to them, the very people who most need them will not get them.
Because you're in the minority. Why should rand() be reserved for your use case while people with other use cases need to use a more obtuse SecureRandom() API?
Defaults should be secure. The simple case should be secure.
Because rand has a specific and well defined meaning. In addition I see no evidence that this use-case is any less popular. Anyway, this subthread is about having to go through extra hoops to have deterministic random numbers, your post is irrelevant to that. I do not think that anyone would be against defining random() to return a secure random number in your language.
> Defaults should be secure
Are you supporting that new computers should come preinstalled with Qubes OS, have a constant time memcmp, constant time font rendering, etc?
> Because you're in the minority
Fuck people with allergies, right? Let's only provide food with nuts and have them go through "additional hoops" to get food that won't kill them.
Unless I'm wrong, the only valid reason to not use a better random number generator is for performance / simplicity, which then demands benchmarks and evaluation.
Seeded random is a glorious thing in the right circumstances. As an example, I've used it for 'random' testing sequences (jumbling up a list of inputs) but in a way I can later re-run EXACTLY the same test.
It's also useful for other data generation tasks where the output can basically be saved as a seed, making it lightweight and easy to store - it could be written it on a scrap of paper in seconds.
Maybe it's a bad name though - it should be called seededRandom() or semiRandom() or deterministicRandom(). Or perhaps it should be truly random if no seed is set. Hard to know. Maybe the true random only needs to be the seed to a deterministic random, reset on a frequent basis in some cases.
Then there's the category of casual random that doesn't matter, like random colours just for the sake of it. It doesn't need to be a secure safe random.
And... assuming that any random function is truly random is a mistake anyway. Base it on hardware, and it may fail. Base it on software, and where's the source of entropy? Add to that the possibility of bugs/defects in the implementation, and it might not be as random as it needs to be. It's better to assume ALL RNGs are PRNGs, with the caveat that some are decidedly better than others.
So no I wouldn't support a ban on it, nor would I support removing it from any language/runtime where it might be useful.
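The re-runnable test jumbling described above is only a few lines with a seeded, isolated generator; Python's `random.Random` shown as one example:

```python
import random

def jumble(seed: int, items: list) -> list:
    # Reproducible shuffle: the same seed yields exactly the same order,
    # so a failing test run can be replayed from just the seed.
    rng = random.Random(seed)  # isolated instance; global RNG state untouched
    out = list(items)
    rng.shuffle(out)
    return out

first_run = jumble(1234, ["a", "b", "c", "d", "e"])
replay = jumble(1234, ["a", "b", "c", "d", "e"])  # identical order
```

Storing the seed is the lightweight "save the output as a scrap of paper" trick from the comment above.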
Came here to say this. I have spent a lot of time in hardware validation. Pseudo-random (explicitly NOT random) sequences are hugely useful.
I once had a lights out server room of 60 servers whose entire purpose was to take skeletonized tests and a seed for a pseudo-random function and generate a test instance. That test instance went to one of a dozen test jigs. What was recorded was: pass/fail, the git sha of the template, and the seed. Any failing test could be reproduced at any time from just the git sha and the seed. True random would have killed that whole methodology.
That sounds awesome, and the use of a repeatable random vital. Another example of random in a non-cryptography context where it's unpredictable under normal operation, but completely predictable when needed. If you wanted you could run the same test on all the test jigs with different seeds, safe in the knowledge that you could re-run all of them exactly again and again if required. Or you could add problematic seeds to a list for repeated retest with future versions. So much power and freedom!
There are many uses of random() that do not require cryptographic security: simulation, simulated annealing, sound synthesis, digital signal processing and the like. It would be a nuisance if developers of those kinds of software have to fight warnings because developers of completely different applications can't get it right.
Further, such users usually want to be able to repeat a test case: start from the same seed, get the same sequence. They don't want true randomness, they want a repeatable sequence with good statistical properties.
It's all about the discipline of the team in the end... You can ban things all day, but it just takes 2 developers deciding they don't give a shit to code, review & merge that use-fast-random-for-session-token PR. There is more than 1 way to get something that is "random", so basic string matching for methods you don't like is certainly not a guarantee.
In our organization the policy is very simple. We have a static method available throughout, CryptographyService.GenerateCsprngBytes(count = 64). All developers are aware that any security-sensitive requirement around entropy must use this method. It wraps the OS-level offering and encourages a minimum reasonable level of entropy with a default count.
I don't see any reason to make it more complicated than this. Communication with your team is more important than writing check-in rules to prevent bad things from happening.
As for other uses of Math.Random, et al., we don't have any official policy. Because we have clearly communicated that security-sensitive applications should always use the secure method, we don't need to add a bunch of additional band-aids on top. Enrich the team before the process.
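A minimal sketch of such a blessed wrapper in Python (the name and default count mirror the comment above; the body is an assumption about what it might wrap, here the OS CSPRNG via `secrets`):

```python
import secrets

def generate_csprng_bytes(count: int = 64) -> bytes:
    # Single well-known entry point for security-sensitive entropy.
    # Wraps the OS CSPRNG and nudges callers toward a sane minimum.
    if count < 16:
        raise ValueError("security-sensitive entropy should be >= 16 bytes")
    return secrets.token_bytes(count)
```

The value of the wrapper is social as much as technical: one name to communicate, one place to audit.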
> Communication with your team is more important than writing check-in rules to prevent bad things from happening.
There's some subtlety here. This is sort of a security vs safety issue.
Some people are just reckless, and that's a human problem that is best dealt with through a stern talking to (or, ultimately, termination) rather than technical measures. You'd require an oppressive amount of check-in rules in place to be even remotely effective at stopping this behaviour, and those would just make life miserable for everybody else.
Some people are new to the team and/or just plain inexperienced, and it takes time for them to absorb all the standard practices so they can innocently cause trouble. Even veterans will make mistakes. Low friction guard rails can help keep those people from getting into too much trouble without being too onerous.
It seems like the "proper" solution to this problem would be to make all random number generators pull from the cryptographically secure randomness pool by default. If your random number needs are within what can be provided with strong guarantees, it doesn't seem like there's any reason to give you anything but strongly random numbers.
People who need more random numbers per second than can be generated securely will simply have to pass explicit parameters indicating they want to stop getting high quality numbers. This would be easy to see in code review and highlight the choices being made.
Most static code analysis tools I've used allow you to write exceptions for rules into your code using comments that follow a particular signature. Isn't it sufficient to just ban the use of random() and require devs to use one of those comments to effectively "sign off" on it if they encounter a good use case?
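For example, Bandit (a Python security linter) flags the `random` module under rule B311 and honors inline waivers, so a deliberate use can be signed off right where it happens (the `# nosec` comment is Bandit's real syntax; the surrounding code is illustrative):

```python
import random

# Gameplay jitter only; predictability has no security impact here.
jitter = random.uniform(0.9, 1.1)  # nosec B311

def pick_variant(user_fraction: float) -> str:
    # A/B bucketing; does not need a CSPRNG.
    return "B" if random.random() < user_fraction else "A"  # nosec B311
```

The waiver comment doubles as the justification reviewers can grep for.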
> Fix Rate is the percentage of merge-blocking findings that are fixed (i.e., not muted*) in CI. We believe this is a proxy for engineering value. As we run our own developer-focused security programs, we’re obsessing over how to increase the Fix Rate for the rules on our projects.
> Observing a bad 0% Fix Rate for random() (with only 7 data points from our projects), we decided to silence the rule for r2c developers
I wouldn't necessarily consider a fix rate of 0% to be bad. If you're doing something that looks bad, you should probably leave a comment that silences the static analysis error and justifies why you're doing it. If that justification is missing, the static analysis tool should flag it IMO.
If using random() in a crypto project is a bad code smell, then I'd say every use of it should come with a brief justification.
For cases where small statistical differences matter, like Monte Carlo simulation, the overhead is non-trivial. A linear congruential generator is so simple it can be inlined by the compiler.
"Fix rate" is an interesting metric, but I don’t think it’s a good proxy for engineering value.
When I see a false positive flagged by a compiler warning or static analyzer, sometimes I’ll fix it just because I’m not sure I want to turn off the rule. For example, I often use -Wunused-parameter with -Werror with Clang or GCC, and then just use (void)arg; to silence the false positives.
No mention of arc4random(3)? Seems like a solved problem in BSD land.
The key takeaways I feel like are:
1. You want as simple of an interface as possible. arc4random(3) returns a single random 32 bit integer, or you can tell it to fill a buffer with them.
2. Just make it cryptographically secure. arc4random does it and it seems to be fine.
It is extremely useful for testing. It keeps this code simple and simpler tests are less buggy. Randomize a bunch of choices in input ranges and run a test. Need to re-run that exact scenario? Just set the seed to the same as the first go.
To quote Theo de Raadt's recent commit to OpenBSD's rand(3)/random(3) man pages, since people seem to be very confused here.
+The deterministic sequence algorithm changed a number of times since
+original development, is underspecified, and should not be relied upon to
+remain consistent between platforms and over time.
This is probably a hot take, but I don't think standard libraries should contain a random number generator.
There is no globally correct choice of PRNG algorithm. If you need randomness, you should ask what kind of randomness - cryptography wants secure, Monte Carlo methods want fast, games can often benefit from N-dimensional equidistribution, etc. - and find a library specialised for that use case. A standard library function that nobody uses because there's always a better alternative elsewhere is a bad standard library function.
I most commonly use random() for generating names (e.g. docker's container names) and generating test inputs. I don't care about cryptographic safety in either case.
But you probably care about collisions in that case. The small state of a language-default insecure PRNG will make collisions much more likely. Especially if seeded by a clock.
I have seen temp file name collisions cause data corruption in a real system because the default language RNG was used. Also infinite loops in a production system because random() was called in the same clock tick by two separate threads generating a handle value. Both wasted weeks of effort to pin down.
random() should default to the system CSPRNG. Provide insecureFastRandom() for those who know they need it and it is safe for their use.
It certainly is in my experience: everything from deterministically shuffling lists to Rogue-like map generation to audio/visual glitch & noise effects.
If anything, I would say it should be the other way around. There are very few use cases outside of cryptography. So flag it if people use a cryptographically secure PRNG directly. In almost all cases they would be better off finding a library that does what they need.
Semi-unrelated question: how secure are the hardware random number generators on, say, a Raspberry Pi? Would these sorts of things be cryptographically secure?
I think those built-in RNGs should be treated like RDRAND on x86, you may mix them into the pool along with all the other sources of entropy your platform supplies, and then generate cryptographic streams seeded out of the pool. Using them alone might turn out to be a bad idea in the long run.
We did essentially that at my employer. I think the rationale is good, biasing toward a secure random function makes sense because the downside (as I understand it) is performance, but defaulting to insecure random has worse downsides. And if there ends up being a hot path where secure random is too inefficient, you can change it in that case. (This is in a context where a secure random function is readily at hand, when that's not the case, it could be trickier.)
Is it possible to have something like random.org but without paying for it?
Say you want to build a lottery application: who can you rely on to get a very good random number generator at low cost?
Should the government provide this for free to developers? Seems like it's in everybody's interest to have a ~true random() function.
Pokerstars uses lasers, someone else uses lava lamps, radio waves, what else?
Also, on a side note: how much have we truly discovered about the nature of "randomness"? Nassim Taleb says it's not random if you run into somebody you know in the supermarket while thinking of them. Some physicists have likened it to a Newtonian picture, others to a more parallel one. Why is it that some natural order emerges out of the "randomness" of a 52-card deck? How is it that a randomly swinging spotlight in a dark room is able to "find" a plant that is also "randomly" placed in the room? Or the randomness of people's birth dates and times emerging in lottery tickets? Or, even more controversially, the fact that RNG output seems to be influenced by the collective consciousness?
I hate how bloated software development has become nowadays and I hate all these tools which keep raising warnings about vulnerabilities which are not relevant to the use case.
It's unbelievable that we live in a society where we're obsessed with achieving 100% test coverage of all our small petty software systems but our monetary system itself (the mother of all systems) is not even integration tested... First time anyone ever heard the word 'test' when discussing the financial system was 'stress test of the banking sector' after 2008 crisis and these tests are so superficial, it's a joke! What's worse about the monetary system is that known vulnerabilities don't even get patched after decades of active exploits! With all its unnecessary bureaucracy, the software industry is a joke. What is the point of all this rigorous software testing infrastructure when the entire environment within which the software exists is unsound and untested? Managers don't trust their developers... But it's the managers who don't deserve to be trusted.
"Banning" language features that are only "banned" when a linter is placed as a roadblock between the developer and the versioning system has to be one of the dumbest things we developers have inflicted upon ourselves.
How is this preventing me from recreating my own shit `random()` when it's entirely too late in the evening, deadlines are looming and the garbage office politics preclude me from disabling this asinine thing?
I used to regularly trip on these damned reified rituals where we're only supposed to use a single type of quotes, or maybe use short vars like `i,j,k` in for loops, and other garbage "rules" that might have sounded great when originally put in place but are horrible on a day-to-day basis.
There's also the behavioral cost: I've noticed more and more people simply don't read the code in code reviews anymore.
Most feedback I get these days is stuff like "that's not supposed to be snake_case" or "a single space should be put between methods". Sometimes a glaring logic bug will crop up after 3 or 4 people who "approved" the review gladly and openly admit _they_ _did_ _not_ _read_ _the_ _code_. I think this is related to us overstressing all these tiny "form" mishaps and disregarding the "function" bits, because it's harder to write a roadblock that checks for those.
> There's also the behavioral cost, I've noticed more and more people simply don't read the code in code reviews anymore. Most feedback I get these days is stuff like "that's not supposed to be snake_case" or "a single space should be put between methods", and sometimes a glaring logic bug will crop up after 3 or 4 people who "approved" the review will gladly and openly admit _they_ _did_ _not_ _read_ _the_ _code_, and I think this is related to us overstressing the importance of all these tiny "form" mishaps and disregarding the "function" bits because it's harder to write a roadblock that checks for them.
The goal of linters should be to ensure this doesn't happen. By turning non-compliant code into a test failure no reviewer attention ever needs to be wasted on formatting issues (which I fully agree is a complete waste of everyone's time).
* If the tests pass, the formatting is correct. Great, reviewers have nothing to complain about.
* If the tests don't pass, the formatting has to be fixed before the code will be merged. Code that doesn't pass the tests should never be merged in the first place. As a result, reviewers don't need to care as it has to be fixed anyway.
Reviewers that still feel the need to point out formatting inconsistencies are likely just disinterested in providing genuine feedback, and would not have provided useful feedback regardless of the presence of linters.