BG3 So, About BG3 dice rolls….

Redglyph · August 28, 2023

Pladio said:
It's funny how people don't realise that rolling two 2s in a row is just as hard as two 1s but you don't see people pay attention to that either.

The odds of rolling a 15 and then a 2 specifically are also the same.

People only seem to attach special attention to very specific patterns they believe are mathematically any different.

They aren't.

Yes, they are: rolling a 15 or a 2 is in the same bag but rolling a 1 is in the other bag. So it's a 19/20 probability against a 1/20 probability.

Pladio said:
Basically, unless you start logging 1000 results your numbers are likely going to be very random unless the rng is badly created. With modern computers, making a good rng isn't that complicated anymore.

Even excel's rng works quite well.

If you generate 100 sets of 5 numbers from 1 to 20 you will end up with quite a few distributions that don't seem random.

But if you generate 100 sets of 1000 numbers, most sets will have pretty good distributions. If you then create 100 sets of 100k numbers it's very unlikely any of them will turn out to be a whole bunch of only 1s.

It all depends on how the random generator is made and used. Firstly, making a good RNG is complicated; there are many books and articles on the subject. But programmers often rely on library functions that return an n-bit integer value with a uniform probability, and then it's up to them to use that correctly.

If you need to generate a 1d20, you can see the first problem: the typical random sets - values on n bits - don't produce a multiple of 20 samples. If you take a 16-bit random value modulo 20, you introduce a bias, because 20 doesn't divide 65536. It's so small that you won't notice it though, but if you take 8-bit values, that's a few %, so you may see the 1 or the 20 (very) slightly more or less often.
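To put numbers on that bias, here is a minimal sketch that runs every possible n-bit value through a naive `% 20` and counts how often each face comes up. The function name `face_counts` is made up for the illustration, and it assumes the underlying n-bit source is perfectly uniform.

```rust
/// Count how often each residue 0..=19 (die face minus one) appears when
/// every possible `bits`-bit value is reduced with a naive `% 20`.
/// Assumes the n-bit source itself is perfectly uniform.
fn face_counts(bits: u32) -> [u32; 20] {
    let mut counts = [0u32; 20];
    for value in 0..(1u32 << bits) {
        counts[(value % 20) as usize] += 1;
    }
    counts
}

fn main() {
    for bits in [8, 16] {
        let counts = face_counts(bits);
        let total = 1u32 << bits;
        let min = *counts.iter().min().unwrap();
        let max = *counts.iter().max().unwrap();
        println!(
            "{}-bit: each face appears {} to {} times out of {} ({:.3}% to {:.3}%)",
            bits,
            min,
            max,
            total,
            100.0 * min as f64 / total as f64,
            100.0 * max as f64 / total as f64
        );
    }
}
```

With 8-bit values and a `value % 20 + 1` mapping, faces 1 to 16 each get 13 of the 256 values and faces 17 to 20 only get 12, so the top four faces are slightly rarer. With 16-bit values the split is 3277 against 3276, which is far too small to notice in play.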

Many programmers also have the impression that using a modulo is costly because it involves a division (which isn't true for constants), so they'll find creative workarounds to 'optimize', and who knows what bias they can introduce then. Though in this case they don't need to use it very often so I doubt they would bother 'optimizing' that.

Some libraries offer floating-point random values, which are sometimes tricky to use or that are not well balanced. It's also easy to mess up when rounding those values. Or by abusing the seed, etc.

Pladio · August 28, 2023

Redglyph said:
Yes, they are: rolling a 15 or a 2 is in the same bag but rolling a 1 is in the other bag. So it's a 19/20 probability against a 1/20 probability.

No, you misunderstood.

Rolling a 15 AND a 2 you have a 1/20 and another 1/20 chance of rolling these. They are independent odds, so it's 1/400 to roll a 15 AND a 2.

Rolling a 1 AND a 1 you have 1/20 to roll the first 1 and then 1/20 again to roll the second 1. It's 1/400.

There are no different bags, since the rolls come after one another. It's the same bag with the same set of numbers inside it.
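Both ways of counting can be checked by brute force. This is only a sketch that enumerates the 400 equally likely ordered pairs of a fair d20; the helper name `count_pairs` is made up for the example.

```rust
/// Count how many of the 400 ordered (first, second) d20 results satisfy
/// the given condition. Every pair is equally likely on a fair die.
fn count_pairs(matches: impl Fn(u32, u32) -> bool) -> u32 {
    let mut count = 0;
    for first in 1..=20 {
        for second in 1..=20 {
            if matches(first, second) {
                count += 1;
            }
        }
    }
    count
}

fn main() {
    // Two specific values in a specific order: one pair each.
    println!("15 then 2: {}/400", count_pairs(|a, b| a == 15 && b == 2));
    println!("1 then 1:  {}/400", count_pairs(|a, b| a == 1 && b == 1));
    // The "1" versus "not a 1" split on a single roll.
    println!("first roll is a 1:     {}/400", count_pairs(|a, _| a == 1));
    println!("first roll is not a 1: {}/400", count_pairs(|a, _| a != 1));
}
```

Any specific ordered pair is 1 in 400, whichever two numbers you pick. The 1/20 against 19/20 split only appears when you compare the single face '1' with the other 19 faces lumped together.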

Redglyph said:
It all depends on how the random generator is made and used. Firstly, making a good RNG is complicated; there are many books and articles on the subject. But programmers often rely on library functions that return an n-bit integer value with a uniform probability, and then it's up to them to use that correctly.

If you need to generate a 1d20, you can see the first problem: the typical random sets - values on n bits - don't produce a multiple of 20 samples. If you take a 16-bit random value modulo 20, you introduce a bias, because 20 doesn't divide 65536. It's so small that you won't notice it though, but if you take 8-bit values, that's a few %, so you may see the 1 or the 20 (very) slightly more or less often.

Many programmers also have the impression that using a modulo is costly because it involves a division (which isn't true for constants), so they'll find creative workarounds to 'optimize', and who knows what bias they can introduce then. Though in this case they don't need to use it very often so I doubt they would bother 'optimizing' that.

Some libraries offer floating-point random values, which are sometimes tricky to use or that are not well balanced. It's also easy to mess up when rounding those values. Or by abusing the seed, etc.

Of course, but modern computers can handle these things easily.
I just did it on Excel randomly generating numbers from 1-20 and the distribution is mostly flat after 67k numbers:

People only see what they perceive to be wrong.
It is very likely that a small subset of numbers will have things that don't seem to make sense. That is literally how probabilities work.

Redglyph · August 28, 2023

Pladio said:
No, you misunderstood.

Rolling a 15 AND a 2 you have a 1/20 and another 1/20 chance of rolling these. They are independent odds, so it's 1/400 to roll a 15 AND a 2.

Rolling a 1 AND a 1 you have 1/20 to roll the first 1 and then 1/20 again to roll the second 1. It's 1/400.

There are no different bags, since the rolls come after one another. It's the same bag with the same set of numbers inside it.

Maybe I did, or maybe that's not what I meant. I do understand that rolling any other specific value, like a 2, has the same probability. But you said 'People only seem to attach special attention to very specific patterns they believe are mathematically any different'. They are different. The specific patterns that matter here are either 'rolling a 1' or 'not rolling a 1'; those are the patterns people are fixated on because of what they mean, and they have different probabilities.

As you and others said, it's often a perception issue because the 'rolling a 1' event feels unfair and the 'not rolling a 1' feels normal. Even if the unfairness only happens 1 time in 20, it will be more easily remembered after a while.

Five times in a row is stretching it though.

Pladio said:
Of course, but modern computers can handle these things easily.
I just did it on Excel randomly generating numbers from 1-20 and the distribution is mostly flat after 67k numbers:
People only see what they perceive to be wrong.
It is very likely that a small subset of numbers will have things that don't seem to make sense. That is literally how probabilities work.

Modern computers have nothing to do with it, it's a software issue.

When you write a program that needs PRNG, you may use a known LFSR or you'll more likely use a library because there's no point in reinventing the wheel. And, as I explained, it's easy to misuse it or make a bad choice. Especially if you think it's easy.

Just do a search on pseudorandom generator quality and you'll get an idea of how complex this research topic is (see here to get an idea, though it's only Wikipedia). It doesn't matter as much for games, but it shows how easily one can make a mistake.

sakichop · August 28, 2023

Pladio said:
It's funny how people don't realise that rolling two 2s in a row is just as hard as two 1s but you don't see people pay attention to that either.

The odds of rolling a 15 and then a 2 specifically are also the same.

People only seem to attach special attention to very specific patterns they believe are mathematically any different.

They aren't.

Sure, if you’re rolling a d20. The discussion isn’t about the probability of rolling a specific number on an actual die. The discussion is about a computer simulating the rolls with an RNG.

If you believe they are exactly simulating a d20 then great.

But it could also be that they are cooking numbers in certain situations. For instance, as Lackblogger suggested, to add some drama, or maybe there are certain situations where they would prefer you fail a check, or maybe they change the algorithm for each difficulty so that the higher the difficulty, the worse the rolls.

The fact that they have a “karmic dice” setting tells me they are messing with the rolls, at least in that mode. So it’s not about not knowing simple math; it’s about what algorithm Larian is using. I was just curious what others were seeing.

Redglyph · August 28, 2023

sakichop said:
But it could also be that they are cooking numbers in certain situations.

That's easy to test by reloading a few times, even if it takes a while.
When you had your streak without karmic enabled, was it in a particular situation?

Note that I've seen threads in the Larian forums where people suspected karmic mode of giving more critical misses, or even series of them.

Zloth · August 29, 2023

Drithius said:
That being said, the chance for 5 crit hits/misses in a row is exceedingly low. 1/20^5 = 1 in 3.2 million. So, see if it happens again

I don't think that's quite right. If you take 5 D20, roll them all at once, and get all 1s, the chances are what you say. That isn't really what's going on here, though. First, lots of rolls are being made and the player is looking for a "surprising" streak anywhere in the series. If Sakichop had gotten a 20, then a 1, then a 20, then another 1, then yet another 20, and finally another 1 again, we'd likely still get a topic.

When you're evaluating stuff like this, it's important to first make the hypothesis and then do the measurements. Otherwise, it's like shooting an arrow at a wall and then drawing the target around wherever it hits. If you just let the random numbers spool along and look for anything that seems like a pattern, you'll get something eventually.
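To get a feel for how much "a streak anywhere, of any face" changes the odds, here is a rough sketch. It assumes a fair d20 and independent rolls, and the function name `any_face_streak` is made up for the example. It tracks the length of the current run of identical faces and accumulates the probability that no run has reached k yet.

```rust
/// Probability that `n` rolls of a fair die with `sides` faces contain at
/// least one run of `k` or more identical results, whatever the face.
fn any_face_streak(n: usize, k: usize, sides: u32) -> f64 {
    if n == 0 || k == 0 {
        return 0.0;
    }
    if k == 1 {
        return 1.0; // a single roll is already a "run" of length 1
    }
    let same = 1.0 / sides as f64; // chance the next roll repeats the last
    // run[r]: probability that no run of length k has happened so far and
    // the current run of identical faces has length r (1 <= r < k).
    let mut run = vec![0.0; k];
    run[1] = 1.0; // state after the first roll
    for _ in 1..n {
        let mut next = vec![0.0; k];
        for r in 1..k {
            next[1] += run[r] * (1.0 - same); // different face: run restarts
            if r + 1 < k {
                next[r + 1] += run[r] * same; // same face: run grows
            }
            // If r + 1 == k the streak completes and that mass is dropped.
        }
        run = next;
    }
    1.0 - run.iter().sum::<f64>()
}

fn main() {
    for n in [5, 100, 1000] {
        println!(
            "{} rolls, 5 identical in a row (any face): {:.5} %",
            n,
            100.0 * any_face_streak(n, 5, 20)
        );
    }
}
```

With exactly 5 rolls this gives 20 * (1/20)^5 = 1 in 160,000, already twenty times more likely than five 1s specifically, and the probability keeps growing with the number of rolls you look through.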

All that is for simple dice, of course. From IGN's article, it sounds like karmic dice break up streaks - but only bad streaks. You can still roll a bunch of crits in a row.

notdart · August 29, 2023

Pladio said:
It's funny how people don't realise that rolling two 2s in a row is just as hard as two 1s but you don't see people pay attention to that either.

The odds of rolling a 15 and then a 2 specifically are also the same.

People only seem to attach special attention to very specific patterns they believe are mathematically any different.

They aren't.

Basically, unless you start logging 1000 results your numbers are likely going to be very random unless the rng is badly created. With modern computers, making a good rng isn't that complicated anymore.

Even excel's rng works quite well.

If you generate 100 sets of 5 numbers from 1 to 20 you will end up with quite a few distributions that don't seem random.

But if you generate 100 sets of 1000 numbers, most sets will have pretty good distributions. If you then create 100 sets of 100k numbers it's very unlikely any of them will turn out to be a whole bunch of only 1s.

This is actually not true. I don't know about today, but the last time I looked into it there was a problem with random number generators not being that random, especially if the default seed was used. There are tricks to reseed the generator to improve things, but unless there has been a major rewrite in the last decade, by default you won't love the results, especially when you examine short sequences of numbers.

Pladio · August 30, 2023

Redglyph said:
Maybe I did, or maybe that's not what I meant. I do understand that rolling any other specific value, like a 2, has the same probability. But you said 'People only seem to attach special attention to very specific patterns they believe are mathematically any different'. They are different. The specific patterns that matter here are either 'rolling a 1' or 'not rolling a 1'; those are the patterns people are fixated on because of what they mean, and they have different probabilities.

As you and others said, it's often a perception issue because the 'rolling a 1' event feels unfair and the 'not rolling a 1' feels normal. Even if the unfairness only happens 1 time in 20, it will be more easily remembered after a while.

Five times in a row is stretching it though.

Modern computers have nothing to do with it, it's a software issue.

When you write a program that needs PRNG, you may use a known LFSR or you'll more likely use a library because there's no point in reinventing the wheel. And, as I explained, it's easy to misuse it or make a bad choice. Especially if you think it's easy.

Just do a search on pseudorandom generator quality and you'll get an idea of how complex this research topic is (see here to get an idea, though it's only Wikipedia). It doesn't matter as much for games, but it shows how easily one can make a mistake.

Regarding #1 and the perception of different outcomes: the point was that mathematically, rolling two specific numbers, no matter which they are, has the same odds. Comparing 19 numbers to one obviously gives different odds. The point is that no one seems to be surprised when they roll a 12 AND a 2, but they are surprised when they roll a 1 AND a 1. Most people aren't even surprised if they roll a 2 AND a 5, even though both are failures (if the pass is DC 10) and that pair has the exact same odds of occurring as a 1 AND a 1... Yes, it seems odd, but that's the way it is.

I know that PRNGs aren't fully random. But they don't have to be. Once you have enough bits, they effectively become random, as shown by a very simple Excel table of random numbers. Once you have large enough datasets, the numbers tend to be uniform in distribution (unless your RNG is based on a different probability density function such as a Gaussian distribution). So unless you tell me that the Larian coders are so bad that they can't make such a simple RNG that is random in the long term (whether by using libraries or by writing it themselves), I can't imagine the RNG in the game being wrong.

sakichop said:
Sure, if you’re rolling a d20. The discussion isn’t about the probability of rolling a specific number on an actual die. The discussion is about a computer simulating the rolls with an RNG.

If you believe they are exactly simulating a d20 then great.

But it could also be that they are cooking numbers in certain situations. For instance, as Lackblogger suggested, to add some drama, or maybe there are certain situations where they would prefer you fail a check, or maybe they change the algorithm for each difficulty so that the higher the difficulty, the worse the rolls.

The fact that they have a “karmic dice” setting tells me they are messing with the rolls, at least in that mode. So it’s not about not knowing simple math; it’s about what algorithm Larian is using. I was just curious what others were seeing.

Well, there being a 'karmic' dice setting, my automatic assumption is that not using it meant it would be a PRNG as per Redglyph's statement. As per above, making one that's effectively random for all practical intents and purposes is not that difficult. I made my own ones using MATLAB in university (although there's no way I would remember how to do them now).

Same as most other games, my suggestion would be to write down about 1,000 rolls in a row and come back with actual findings.
In XCOM2, people always complain about dice rolls (oh how did I miss the 94% shot?!), and never complain when they land seven 45% shots in a row.

notdart said:
This is actually not true. I don't know about today, but the last time I looked into it there was a problem with random number generators not being that random, especially if the default seed was used. There are tricks to reseed the generator to improve things, but unless there has been a major rewrite in the last decade, by default you won't love the results, especially when you examine short sequences of numbers.

See above Excel example and other points I made.

Redglyph · August 30, 2023

Pladio said:
Regarding #1 and the perception of different outcomes: the point was that mathematically, rolling two specific numbers, no matter which they are, has the same odds. Comparing 19 numbers to one obviously gives different odds. The point is that no one seems to be surprised when they roll a 12 AND a 2, but they are surprised when they roll a 1 AND a 1. Most people aren't even surprised if they roll a 2 AND a 5, even though both are failures (if the pass is DC 10) and that pair has the exact same odds of occurring as a 1 AND a 1... Yes, it seems odd, but that's the way it is.

Those values are not as remarkable because they aren't criticals and the outcome depends on the DC, the proficiency level, and possibly other boosts. I think that's partly why. But I agree, people often get fixated on specific values, sometimes without a good reason.

Pladio said:
I know that PRNGs aren't fully random. But they don't have to be. Once you have enough bits, they effectively become random, as shown by a very simple Excel table of random numbers. Once you have large enough datasets, the numbers tend to be uniform in distribution (unless your RNG is based on a different probability density function such as a Gaussian distribution). So unless you tell me that the Larian coders are so bad that they can't make such a simple RNG that is random in the long term (whether by using libraries or by writing it themselves), I can't imagine the RNG in the game being wrong.

If you have large enough datasets, and if the values are not messed with as I explained, yes, they may be considered relatively uniform. For example, even a basic LFSR with a maximal-length polynomial will generate all the values of the set once per period without any duplication (except it won't generate zero, which is already a small bias). But the properties of successive values depend on which polynomial you select, and there's no guarantee you'll get something smooth all the time. If you have to use a simple modulo or mask some bits to get a 1d20 range, you may get even worse results. So more sophisticated libraries improve the odds, if you will. I have no idea what Larian or Tactical Adventures is using, but since it's not something everyone is aware of, there's a chance of getting a significant bias.

It's been a while since I studied Galois Fields and I don't think we need to get that boring here anyway, but you can check by yourself. Pseudorandom generators are a research topic, not because of games but because they're used in cryptography, where they need to have good properties to avoid guesses and manipulation.

EDIT: That being said, an entire LFSR sequence has the same probability to occur as any other permutation.

Redglyph · August 30, 2023

Note that getting 3 or more criticals in a row isn't as rare as we may think. I calculated the probability of getting a streak of at least 2, 3, 4, or 5 consecutive '1's somewhere in 100 rolls:

100 rolls, 2 or more '1': 21.07264 %
100 rolls, 3 or more '1': 1.15809 %
100 rolls, 4 or more '1': 0.05761 %
100 rolls, 5 or more '1': 0.00285 %

5 is still quite unlucky, sorry @sakichop (I don't know how many times you rolled though).

It's not too complicated to compute if you take a recursive method.

Here's the code:

Rust Playground (play.rust-lang.org)

If you have n rolls (independent outcomes), what's the probability pr(n, k) of getting a streak of k or more 'positive' events (like a critical failure)? The probability of one positive outcome is p, and the probability of a negative outcome is 1-p. Take k = 3 as an example.

Either
1) You directly get 3 positive outcomes: probability = p^3
2) You don't get 3 positive outcomes in the first 3 events. That means the first negative outcome is the 1st, the 2nd, or the 3rd event. The probability of a streak of 3 or more is, in each case, the probability that the case happens multiplied by the probability of getting the streak in the remaining rolls:

negative: (1-p) * pr(n-1, 3)
positive, negative: p * (1-p) * pr(n-2, 3)
positive, positive, negative: p * p * (1-p) * pr(n-3, 3)

So the general formula is: pr(n, k) = p^k + sum(i = 1 .. k, p^(i-1) * (1-p) * pr(n-i, k))
where pr(n, k) = 0 when n < k
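For anyone who wants to check the numbers without opening the playground, here is a minimal sketch of that recurrence. It is not necessarily identical to the linked code: it fills a table from the bottom up instead of recursing, and the function name `streak_probability` is made up for the example.

```rust
/// pr(n, k): probability of at least one streak of `k` or more positive
/// outcomes in `n` independent trials, each positive with probability `p`.
/// Direct transcription of the recurrence above, filled bottom-up so that
/// each pr(m, k) is computed only once.
fn streak_probability(n: usize, k: usize, p: f64) -> f64 {
    // pr[m] holds pr(m, k); entries with m < k stay at 0.
    let mut pr = vec![0.0; n + 1];
    for m in k..=n {
        // Case 1: the first k outcomes are all positive.
        let mut total = p.powi(k as i32);
        // Case 2: the first negative outcome is the i-th one.
        for i in 1..=k {
            total += p.powi(i as i32 - 1) * (1.0 - p) * pr[m - i];
        }
        pr[m] = total;
    }
    pr[n]
}

fn main() {
    // Critical failure on a d20: p = 1/20.
    for k in 2..=5 {
        println!(
            "100 rolls, streak of {} or more '1': {:.5} %",
            k,
            100.0 * streak_probability(100, k, 0.05)
        );
    }
}
```

The table costs about n * k steps, so 100 rolls are instantaneous, and the same function works for any other event probability, for example p = 0.1 for "a 1 or a 20".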

But what about PRNGs?

Since I was bored, I made a little example by cobbling together a few pieces of code (which is a good base for something else I had in mind anyway).

I'm comparing a few LFSRs, and the results are quite interesting.
Here's the code: >link< The important bits (!) are in the 'rand' method, where the LFSRs are implemented. I'm using a naive modulo 20 to get 1d20 values.

I'm testing how many streaks of 3 or more identical rolls I get on the whole sequence, plus some other statistics.

Here are some of the results (you can click RUN to see them all):

## Testing Xorshift16
length: 65535, mean: 10.49966, min = 1, max = 20
10 sequence(s) of 1, best length = 4
10 sequence(s) of 20, best length = 4
Sequence(s): 1:10, 2:12, 3:8, 4:9, 5:10, 6:8, 7:10, 8:8, 9:6, 10:6, 11:8, 12:5, 13:7, 14:7, 15:3, 16:7, 17:9, 18:6, 19:8, 20:10
Frequencies: 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%

## Testing Galois16
length: 65535, mean: 10.49966, min = 1, max = 20
410 sequence(s) of 1, best length = 12
153 sequence(s) of 20, best length = 4
Sequence(s): 1:410, 4:51, 20:153
Frequencies: 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%

## Testing Xorshift32
length: 4294967295, mean: 10.50000, min = 1, max = 20
509990 sequence(s) of 1, best length = 7
509864 sequence(s) of 20, best length = 8
Sequence(s): 1:509990, 2:508920, 3:510120, 4:509622, 5:510305, 6:508374, 7:510420, 8:510209, 9:509928, 10:510038, 11:510456, 12:508997, 13:509466, 14:510744, 15:510642, 16:510580, 17:510876, 18:510083, 19:509482, 20:509864
Frequencies: 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%

## Testing Galois32
length: 4294967295, mean: 10.50000, min = 1, max = 20
26843546 sequence(s) of 1, best length = 28
0 sequence(s) of 20, best length = 1
Sequence(s): 1:26843546, 6:5033165, 10:5033166
Frequencies: 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%

The Galois LFSR (or its reverse, sometimes called Fibonacci) is probably what most people will use, because the polynomials are known and it's easy to code. With a 16-bit generator:

## Testing Galois16
length: 65535, mean: 10.49966, min = 1, max = 20
410 sequence(s) of 1, best length = 12
153 sequence(s) of 20, best length = 4
Sequence(s): 1:410, 4:51, 20:153
Frequencies: 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%, 5.00%

This means it has found 410 streaks of 3 or more '1' (on average, you may meet one every ~159 rolls). The longest has 12 critical misses in a row, which is a lot. What's also strange is the `Sequences`: there are streaks of 3 or more only for the die faces '1', '4', and '20'. All the other faces don't have one. You can see the same aberration in Galois32, which has many more samples.

In comparison, the Xorshift is much better balanced: that shows the importance of selecting a good generator.

If you want, you can compare that to the probabilities calculated above.
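Since the link may not survive, here is a reduced sketch of the 16-bit part of the experiment. It is not the linked code. The tap mask 0xB400, the (7, 9, 8) xorshift triple, the seed 0xACE1, and the helper names are all assumptions picked for the illustration, so the streak counts will not necessarily match the figures posted above.

```rust
/// One step of a right-shifting 16-bit Galois LFSR. The tap mask 0xB400 is
/// an assumption (a commonly used maximal-length choice), not necessarily
/// the polynomial of the linked code.
fn galois16(state: u16) -> u16 {
    let shifted = state >> 1;
    if state & 1 == 1 { shifted ^ 0xB400 } else { shifted }
}

/// One step of a 16-bit xorshift. The (7, 9, 8) triple is also an assumption.
fn xorshift16(mut x: u16) -> u16 {
    x ^= x << 7;
    x ^= x >> 9;
    x ^= x << 8;
    x
}

/// Run `step` for `len` steps, map each state to a 1d20 roll with a naive
/// `% 20 + 1`, and return (number of streaks of at least `min_len` rolls
/// equal to `face`, length of the longest streak of `face`).
fn streak_stats(step: fn(u16) -> u16, seed: u16, len: u32, face: u16, min_len: u32) -> (u32, u32) {
    let mut state = seed;
    let (mut count, mut best, mut run) = (0u32, 0u32, 0u32);
    for _ in 0..len {
        state = step(state);
        if state % 20 + 1 == face {
            run += 1;
        } else {
            if run >= min_len {
                count += 1;
            }
            best = best.max(run);
            run = 0;
        }
    }
    // Account for a streak that is still open at the end of the sequence.
    if run >= min_len {
        count += 1;
    }
    (count, best.max(run))
}

fn report(name: &str, step: fn(u16) -> u16) {
    for face in [1u16, 20] {
        // 65535 steps cover one full period of a maximal-length generator.
        let (count, best) = streak_stats(step, 0xACE1, 65_535, face, 3);
        println!("{}: {} streak(s) of 3+ for face {}, longest = {}", name, count, face, best);
    }
}

fn main() {
    report("Galois16", galois16);
    report("Xorshift16", xorshift16);
}
```

One likely reason for the long runs of '1' with this kind of LFSR: when the state is even, the next state is simply half of it. The states 40960, 20480, 10240, and so on down to 20 are all multiples of 20 and follow each other, which gives 12 consecutive rolls of '1' after the naive modulo. That is consistent with the 'best length = 12' reported for Galois16.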

notdart · September 3, 2023

One thing I should mention is that computer random number generators have entropy (sp?), and they can lose randomness over a long period of time. Again, I forget the details: at one point I had to deal with this at work. We had systems up for years that used random numbers as part of the encryption system, and we ran into issues over long periods of time. It was fringe work: look it up once, solve the problem, and forget about it. But I think the short summary is that if you've been playing BG3 continuously for the past 5 years, you might notice oddities in your 'random' sequence.

Redglyph · September 3, 2023

I suppose you meant entropy, as in the randomness of a source that can provide enough variety (in this case)? An entropic source, like a measure of time, can only be used to set the seed. The sequence remains almost certainly the same because of the PRNG, but the implementation can make its length much longer (with a permuted congruential generator, for example).

Unless they always use an entropic source, but people usually don't do that for two good reasons: 1) It's too unreliable, especially with the diversity of hardware people use. 2) It's nice to have a PRNG because it's repeatable, which is very desirable when validating the code and debugging it.

I don't know how many rolls we do in a game but it can't be more than a thousand. Even a small, basic 32-bit PRNG has a period of ~4 billion, so there's enough margin not to worry about that. Each time you reload or restart a game, you'll notice that the rolls are different, which means they either refresh the seed or continue with the same sequence so that the users don't see any repetition.
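To illustrate the seed point, here is a small sketch with a 32-bit xorshift (the classic 13, 17, 5 shift triple). It has nothing to do with what the game really uses; the type name `Xorshift32`, the helper `rolls`, and the clock-based seed are assumptions made for the example.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Minimal 32-bit xorshift generator, for illustration only.
struct Xorshift32 {
    state: u32,
}

impl Xorshift32 {
    fn new(seed: u32) -> Self {
        // Zero is a fixed point of xorshift, so replace it with 1.
        Self { state: if seed == 0 { 1 } else { seed } }
    }

    fn next_u32(&mut self) -> u32 {
        let mut x = self.state;
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        self.state = x;
        x
    }

    /// Naive modulo mapping to a die face (slightly biased, as discussed).
    fn d20(&mut self) -> u32 {
        self.next_u32() % 20 + 1
    }
}

/// The first `n` d20 rolls produced from a given seed.
fn rolls(seed: u32, n: usize) -> Vec<u32> {
    let mut rng = Xorshift32::new(seed);
    (0..n).map(|_| rng.d20()).collect()
}

fn main() {
    // Same seed, same rolls: handy for debugging and validation.
    println!("seed 2023: {:?}", rolls(2023, 10));
    println!("seed 2023: {:?}", rolls(2023, 10));

    // Seeding from the clock (an entropic source) gives a different
    // starting point on each run, but the generator stays deterministic.
    let now = SystemTime::now().duration_since(UNIX_EPOCH).unwrap();
    println!("clock seed: {:?}", rolls(now.as_nanos() as u32, 10));
}
```

A game could store the generator state in the save file, or reseed on load; both would look the same to the player, who only sees rolls that differ from one attempt to the next.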

JDR13 · September 3, 2023

Redglyph said:
Each time you reload or restart a game, you'll notice that the rolls are different, which means they either refresh the seed or continue with the same sequence so that the users don't see any repetition.

How would we know? The rolls are always different anyways.

largh · September 4, 2023
