Before we look at that some more, I thought it might be fun to try generating some random data for 40 coin tosses by hand. All I did was this:
1) Opened Excel
2) Typed heads or tails into a cell
3) Repeated until I had 40 cells filled in
I tried to make this "random". Here is the data:
Heads,Tails,Heads,Tails,Tails,Tails
Heads,Heads,Tails,Heads,Tails,Tails
Heads,Heads,Tails,Heads,Tails,Tails
Tails,Heads,Tails,Heads,Tails,Heads
Heads,Tails,Heads,Tails,Tails,Tails
Heads,Heads,Tails,Heads,Tails,Heads
Heads,Tails,Tails,Heads
19 Heads (47.5%)
21 Tails (52.5%)
Reading the rows left to right, top to bottom, the longest run was 3 identical tosses in a row. On 3 separate occasions we "threw" Tails 3 times in a row. The longest run of Heads was just 2 in a row.
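If you'd rather not count by hand, here is a minimal Python sketch that re-does the tally. It assumes the rows above are read left to right, top to bottom, and it shortens Heads/Tails to H/T. The helper names (`run_lengths`, `longest_run`) are just my own for illustration.

```python
from itertools import groupby

# The 40 hand-typed tosses from the post, one string per row (H = Heads, T = Tails).
ROWS = ["HTHTTT", "HHTHTT", "HHTHTT", "THTHTH", "HTHTTT", "HHTHTH", "HTTH"]
tosses = "".join(ROWS)

def run_lengths(seq):
    """Return a list of (face, length) pairs, one per run of identical faces."""
    return [(face, len(list(group))) for face, group in groupby(seq)]

def longest_run(seq, face):
    """Length of the longest run of `face` in `seq` (0 if it never appears)."""
    return max((n for f, n in run_lengths(seq) if f == face), default=0)

heads = tosses.count("H")
tails = tosses.count("T")
print(f"{heads} Heads ({heads / len(tosses):.1%})")
print(f"{tails} Tails ({tails / len(tosses):.1%})")
print("Longest Heads run:", longest_run(tosses, "H"))
print("Longest Tails run:", longest_run(tosses, "T"))
```

The same two helpers should work on any other sequence of H and T characters, so they can be reused on real or spreadsheet-generated tosses later.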
Pretty random, huh?
Actually, this data is pretty obviously made up, as we will discover.
And in a later post, I think we can prove this by testing for randomness (P.S. I haven't tested this dataset yet, but I'm pretty sure the test will flag it as fake. We'll see later).
The main reason it's obviously made up is that it looks random. Genuinely random data often doesn't look random, but made-up data does. Counterintuitive, I know.
The reason this data looks random is the following:
- The data is pretty evenly distributed (over a small sample) - 47.5% vs 52.5%
- There are no long chains (the longest chain is 3)
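To put a rough number on the "no long chains" point, here is a small Python sketch that works out the exact chance that a run of fair, independent tosses never contains a chain longer than a given length. The fair-coin and independence assumptions are mine, and the function name `prob_longest_run_at_most` is made up for this example. Treat it as a back-of-the-envelope check, not the randomness test promised for the later post.

```python
from fractions import Fraction

def prob_longest_run_at_most(n, k):
    """Exact chance that n fair, independent tosses contain no run longer than k."""
    # ways[i] = number of ways to split i tosses into consecutive runs of length 1..k.
    ways = [1] + [0] * n
    for i in range(1, n + 1):
        ways[i] = sum(ways[i - r] for r in range(1, min(k, i) + 1))
    # Faces alternate from one run to the next, so each split corresponds to
    # exactly two sequences: one starting with Heads, one starting with Tails.
    return Fraction(2 * ways[n], 2 ** n)

p = prob_longest_run_at_most(40, 3)
print(f"P(longest run <= 3 in 40 fair tosses) = {float(p):.3f}")
```

Changing the second argument shows how quickly that chance grows as longer chains are allowed.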
Let's keep this in mind for the next set of posts, where we look at:
Real coin tosses and Excel-generated coin tosses
This will work in Google Docs; it does basically the same thing:
=if(RAND()<0.5,"head","tail")
I never even thought about Google Docs. Thanks for that, it allows me to publish examples. BTW, RANDBETWEEN also works in Google Docs, so you can follow the instructions exactly from the other post.