Is it Random enough?

**Tony Flury** · November 13th, 2010

Just a thought: 175 runs is far too small a sample. To test the randomness you need to be doing thousands or tens of thousands of runs.

**worksofcraft** · November 13th, 2010

I don't quite follow what you want, but you can calculate what to expect.

Think of your set of files as an N-sided die, so selecting one at random is like rolling that die. The number of times any one file comes up then follows what is known as a "binomial distribution". You can calculate the probability that a given file is picked exactly k times with: n! * p^k * (1-p)^(n-k) / (k! * (n-k)!)

p = probability of picking any one file on a single run, i.e. 1/N for the N files in your set
k = number of times the same file gets picked
n = the number of samples, i.e. 175 in your case
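For anyone who wants to plug numbers into that formula, here is a minimal Python sketch. The function name `binomial_pmf` and the collection size `N = 200` are illustrative assumptions; only n = 175 comes from the discussion so far.

```python
from math import comb

def binomial_pmf(k, n, p):
    """Probability that one particular file is picked exactly k times in n runs."""
    # comb(n, k) is n! / (k! * (n-k)!), so this is the formula above term by term.
    return comb(n, k) * p**k * (1 - p)**(n - k)

N = 200      # assumed number of files in the set (illustrative)
n = 175      # number of runs, as in the thread
p = 1 / N    # chance of any one file on a single run

for k in range(4):
    print(f"picked {k} time(s): {binomial_pmf(k, n, p):.3f}")
```

Summing `binomial_pmf(k, n, p)` over every k from 0 to n should come out at 1, which is a quick sanity check on the formula.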

**endotherm** · November 13th, 2010

something is definitely not right here. as some of you may have guessed from my code snippet, this utility is for queuing up a random episode of something from my video collection. I added a series of 200 files, but have had to skip at least 8 of them in the last 12-16 hours of use, which is probably about 30 runs. I think I'm going to try salting the number a little and see where that gets me.

**ad_267** · November 13th, 2010

Maybe the problem is it's too random. Each item has an equal chance of being picked every time, regardless of whether it's already been selected. Instead you might want to keep a list of all the items and, after selecting one randomly, remove it from the list.

This is an interesting read on the topic: http://blogs.msdn.com/b/shawnhar/arc...andomness.aspx
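The remove-after-picking idea could be sketched roughly like this in Python. The class name `ShuffleBag` and its `next()` method are made up for illustration and are not taken from the original utility; refilling the list once it runs empty is also an assumption about the desired behaviour.

```python
import random

class ShuffleBag:
    """Hands out every item once, in random order, before repeating any."""

    def __init__(self, items, rng=None):
        self._items = list(items)
        if not self._items:
            raise ValueError("ShuffleBag needs at least one item")
        self._rng = rng or random.Random()
        self._remaining = []

    def next(self):
        # Refill and reshuffle only when every item has been handed out.
        if not self._remaining:
            self._remaining = self._items[:]
            self._rng.shuffle(self._remaining)
        # Popping removes the chosen item so it cannot repeat this round.
        return self._remaining.pop()

bag = ShuffleBag(f"episode_{i:03d}.avi" for i in range(200))
print(bag.next())
```

With 200 files, the first 200 calls to `next()` never repeat a file. A repeat can still happen right at the boundary between two rounds, so a real player might want to guard against that case.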

**worksofcraft** · November 13th, 2010

I found a binomial chance calculator online.

Assuming a true random distribution, it calculated the following for your 200 files (i.e. the single-trial success probability is 0.005) when you pick one at random 175 times: on average about 42% of your files will not be chosen, about 36% will be chosen once, 16% will be chosen twice and less than 6% more than twice.
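Those percentages can also be checked empirically with a quick simulation. This is only a sketch: the function name `simulate`, the number of trials and the fixed seed are arbitrary choices, and Python's built-in `random` module stands in for whatever generator the original utility uses.

```python
import random
from collections import Counter

def simulate(num_files=200, num_picks=175, trials=2000, seed=1):
    """Return the average fraction of files played 0, 1, 2 and 3+ times."""
    rng = random.Random(seed)
    tally = Counter()
    for _ in range(trials):
        # plays[f] = how often file f came up in this batch of picks
        plays = Counter(rng.randrange(num_files) for _ in range(num_picks))
        tally[0] += num_files - len(plays)      # files never picked
        for times in plays.values():
            tally[min(times, 3)] += 1           # bucket 3 means "more than twice"
    total = num_files * trials
    return {k: tally[k] / total for k in range(4)}

for times, fraction in simulate().items():
    label = "3+" if times == 3 else str(times)
    print(f"played {label} time(s): {fraction:.1%}")
```

If the generator is behaving, the four fractions should land close to the calculator's figures above.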

**endotherm** · November 15th, 2010

well, I've come to some interesting conclusions.

I decided to breed my own rng, by multiplying two randoms together and modding them, just for salt. I noticed that my liking for the new algorithm went up, even though all the metrics on the diagnostics went down.
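One possible reason the metrics went down: the product of two uniform numbers, taken modulo N, is generally not uniform. The sketch below assumes both raw draws are integers in 0..N-1 and that the result is used directly as a file index, which may not match the actual code. It tallies every possible pair instead of sampling, so there is no randomness involved.

```python
N = 200  # number of files, as in the thread

# Count how many of the N*N possible (a, b) pairs map to each index.
counts = [0] * N
for a in range(N):
    for b in range(N):
        counts[(a * b) % N] += 1

# A uniform scheme would give every index exactly N of the N*N pairs.
print("uniform share per index:", N)
print("pairs landing on index 0:", counts[0])
print("smallest share:", min(counts), "largest share:", max(counts))
```

Index 0 is heavily favoured because any pair containing a zero, or any product that is a multiple of N, lands there, while other indices get less than their fair share.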

@works: yeah, that describes the output I'm getting pretty well. neat to see the math in action!

ultimately, you guys are right; "random" is not really the path to the ends I seek, so I bit the bullet and added features to record and administer a history of viewed files. a file won't replay unless it has no history entry within the last 6 months, but I'm sure I'll have to tweak that. I already had data persistence code in place, so adding on to it was trivial enough.
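A history-based picker along those lines might look something like the following Python sketch. It is not the actual code from the utility: the function name `pick_unwatched`, the in-memory dict used as the history, and the 182-day stand-in for "6 months" are all assumptions, and persistence is left out.

```python
import random
import time

SIX_MONTHS = 182 * 24 * 3600  # seconds; a rough stand-in for "6 months"

def pick_unwatched(files, history, now=None, window=SIX_MONTHS, rng=random):
    """Pick a random file that has not been viewed within `window` seconds.

    `history` maps file name -> timestamp of the last viewing and is
    updated in place. Returns None when every file was viewed recently.
    """
    now = time.time() if now is None else now
    # Files with no history entry count as "viewed infinitely long ago".
    eligible = [f for f in files
                if now - history.get(f, float("-inf")) >= window]
    if not eligible:
        return None
    choice = rng.choice(eligible)
    history[choice] = now  # record the viewing so it is skipped next time
    return choice

history = {}
print(pick_unwatched(["ep1.avi", "ep2.avi", "ep3.avi"], history))
```

Returning `None` when nothing is eligible leaves it to the caller to decide what to do next, for example shortening the window, which fits the "I'm sure I'll have to tweak that" caveat.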

this has been a very interesting conversation. thanks everyone for your thoughts and ideas!