shirou wrote:I don't know why anyone should ever expect to get the median. You should, after a large number of trials, expect to get the expected value, hence the name. But like I said, whatever he was in fact expecting is completely irrelevant, because what he did was roll a bunch of dice and record the results. He's not reporting what he thought he would get; he's reporting what he did get.
Anyway, I will agree with you that the study was not up to full academic rigor. I don't think it was trying to be -- it was just an experiment to give us some information about the dice we use to play games. What I will not agree with is that we cannot draw simple, qualitative conclusions. A student shaking a die in his hand is a sufficiently random starting configuration that it should not introduce enough bias to give an end result that should have an almost 0 probability of occurring.
The study is interesting, but the main problem is that we can't see the whole dataset. Though there appears to be significant variation between Chessex dice and GW dice overall, we don't know the variation within those groups. Do all GW dice roll about 29% ones? No, there is a variation there of between 23% and 33%. But we don't know the spread; all that is said is that "We removed any statistical anomalies and came up with 29%." So there's a problem with variance here, both for the rolls of individual dice and between dice of the same set (we only know the extremes, 23% and 33%). Chessex dice and GW dice may appear different, but the variance within those datasets may wipe out the statistical significance of the difference.
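To put a rough number on the spread question, here is a minimal sketch of how much die-to-die variation pure chance would produce if every GW die really had the same 29% chance of a one. The rolls-per-die figures are hypothetical, since the write-up doesn't give them, and `sampling_sd` is just a name made up for this illustration.

```python
import math

def sampling_sd(p, rolls_per_die):
    """Standard deviation of one die's observed proportion of ones,
    assuming every die has the same true probability p (binomial model)."""
    return math.sqrt(p * (1 - p) / rolls_per_die)

# 0.29, 0.23 and 0.33 are the figures quoted from the study.
# The rolls-per-die values are ASSUMPTIONS; the study does not report them.
for rolls_per_die in (100, 1000):
    sd = sampling_sd(0.29, rolls_per_die)
    low_z = (0.23 - 0.29) / sd   # how many SDs below the mean the lowest die sits
    high_z = (0.33 - 0.29) / sd  # how many SDs above the mean the highest die sits
    print(f"{rolls_per_die} rolls/die: sd={sd:.3f}, low z={low_z:.1f}, high z={high_z:.1f}")
```

If each die was rolled about 100 times, the chance-only standard deviation is roughly 4.5 percentage points, so a 23% to 33% range could be nothing but noise. At about 1000 rolls per die it shrinks to roughly 1.4 points, and the same range would suggest the dice genuinely differ from one another. Without the per-die counts we can't tell which case we're in.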
Though if the smallest was 23%, that's still well above the roughly 16.7% (1 in 6) you'd expect if the dice were totally fair. I can't see a statistical test being applied anywhere, but even without knowing the standard deviation or skew of the data, and even with that large variance, with this number of dice being rolled and such a large difference from the expected result, I'd guess there is a clear "significant difference".
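For what it's worth, the kind of test I have in mind is a simple one-proportion z-test against a fair 1-in-6 chance. This is only a sketch: the count of 230 ones in 1000 rolls is a hypothetical stand-in for the lowest (23%) die, because the actual number of rolls isn't given, and `one_proportion_z` is an illustrative name, not anything from the study.

```python
import math

def one_proportion_z(ones, rolls, p0=1/6):
    """One-sided z-test: is the observed share of ones higher than p0?

    Uses the normal approximation to the binomial, which is reasonable
    when rolls * p0 and rolls * (1 - p0) are both comfortably large.
    Returns (z, p_value).
    """
    p_hat = ones / rolls
    se = math.sqrt(p0 * (1 - p0) / rolls)        # standard error under the fair-die hypothesis
    z = (p_hat - p0) / se
    p_value = 0.5 * math.erfc(z / math.sqrt(2))  # upper-tail probability of a standard normal
    return z, p_value

# ASSUMPTION: 1000 rolls of a single die; the study does not state the count.
z, p = one_proportion_z(ones=230, rolls=1000)
print(f"z = {z:.2f}, one-sided p = {p:.2e}")
```

Under that assumed roll count, even the least biased die sits more than five standard errors above fair, which is why I'd expect the difference to hold up. With far fewer rolls per die the same 23% would be much less convincing.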
There could also be an effect from the way the dice are thrown. I assume that all testers got an equal go at throwing all the dice, and that one person didn't throw all the GW ones and another all the Casino ones. People could have an effect on the way dice are thrown; as an extreme example, the results of a die throw can be effectively manipulated through controlled dice shots.
The only other problem I see is that the dice in each case have come from the same source, a cube of Chessex and a box of GW dice, which implies they were all manufactured together. So imperfections in the dice tested may be common to all those tested, but not common across the whole range. I'm not saying it's probable, but it's certainly not impossible.