2017/07/23 16:58:41
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Longtime Dakkanaut
|
Gamgee wrote:
Until you can prove the proper usage of statistics and why you are superior at them this point is merely an attack on the opponent and getting very close to getting personal.
Again, this is incredibly sloppy thinking. Just because I don't have an easily-accessible better method does not mean that we should all pretend that some extremely unreliable method is actually reliable. This is saying that if there aren't any real doctors around we should pretend that faith healing and leeches can solve our problems. No. It is worthwhile in itself to reject quackery.
|
|
 |
 |
2017/07/23 17:01:09
|
 |
Longtime Dakkanaut
|
Don't shoot the messenger. I'm not your opponent; I'm simply letting you know what you did and what you need to do to impress me with a sound logical argument. You still didn't give any evidence for any of your claims, by the way.
Anyways I'm leaving with the others as it seems all fruitful communication has come to an end.
|
|
 |
 |
2017/07/23 17:01:21
|
 |
Potent Possessed Daemonvessel
Why Aye Ya Canny Dakkanaughts!
|
Dionysodorus wrote:I can only assume that you're just lying about your familiarity with how statistics is used in the real world. Your approach is far, far below acceptable real-world standards in political polling, product quality control, determining the effectiveness of new medical procedures, and really any investigation into any interesting question.
Is this a political poll? No. Is this product control? No. Is this determining effectiveness of medical procedures? No. It also helps if you understand that the acceptability of error margins depends on the research. When conducting medical research there is an extremely low tolerance for mistakes, so the error margins must be extremely low and prejudice/biases have to be removed; this is because the consequences of the research showing positive results when the drug/procedure should have failed the tests are extremely high (i.e. people could die). When product control is carried out there is also a low tolerance for error, because failure to produce exactly what the manufacturer has advertised could lead to the company either losing money or being sued. And what about our data set? Well, the consequences of the data being skewed are that people cannot use the data as 100% proof that a faction is OP when arguing online. Do you see the differences in consequences? Different consequences = different tolerance for error margins and bias.
If you don't understand this concept I don't fully believe that you have any familiarity with statistics. Automatically Appended Next Post: Dionysodorus wrote:Again, this is incredibly sloppy thinking. Just because I don't have an easily-accessible better method does not mean that we should all pretend that some extremely unreliable method is actually reliable. This is saying that if there aren't any real doctors around we should pretend that faith healing and leeches can solve our problems. No. It is worthwhile in itself to reject quackery.
Who said this data is reliable? This is just a rough idea of how the balance looks in this supposedly most balanced edition ever. No one claimed we had Drug testing levels of certainty in these results.
|
This message was edited 1 time. Last update was at 2017/07/23 17:05:31
Ghorros wrote:The moral of the story: Don't park your Imperial Knight in a field of Gretchin carrying power tools.
Marmatag wrote:All the while, my opponent is furious, throwing his codex on the floor, trying to slash his wrists with safety scissors. |
|
 |
 |
2017/07/23 17:06:36
|
 |
Longtime Dakkanaut
|
mrhappyface wrote:Dionysodorus wrote:I can only assume that you're just lying about your familiarity with how statistics is used in the real world. Your approach is far, far below acceptable real-world standards in political polling, product quality control, determining the effectiveness of new medical procedures, and really any investigation into any interesting question.
Is this a political poll? No. Is this product control? No. Is this determining effectiveness of medical procedures? No. It also helps if you understand that the acceptability of error margins depends on the research. When conducting medical research there is an extremely low tolerance for mistakes, so the error margins must be extremely low and prejudice/biases have to be removed; this is because the consequences of the research showing positive results when the drug/procedure should have failed the tests are extremely high (i.e. people could die). When product control is carried out there is also a low tolerance for error, because failure to produce exactly what the manufacturer has advertised could lead to the company either losing money or being sued. And what about our data set? Well, the consequences of the data being skewed are that people cannot use the data as 100% proof that a faction is OP when arguing online. Do you see the differences in consequences? Different consequences = different tolerance for error margins and bias.
If you don't understand this concept I don't fully believe that you have any familiarity with statistics.
I mean, sure, if you don't care much about being right you can use really sloppy methods. That's all you're saying here. I agree with that. And so I think now we agree that your data amounts to very weak evidence because you've made basically no attempt to control for error and bias, and so realistic error bars would span basically the whole possible range of values.
I do want to ask, though: do you actually think that you're competent to evaluate this stuff? Like, I promise I won't respond to call you a liar if you say you've got a PhD in applied statistics, I just want to know where you think you're coming from. What is your relevant experience in gathering this sort of data and evaluating it to figure out what's true?
Automatically Appended Next Post:
mrhappyface wrote:
Who said this data is reliable? This is just a rough idea of how the balance looks in this supposedly most balanced edition ever. No one claimed we had Drug testing levels of certainty in these results.
Okay, now I'm confused because literally all I've been saying is that this is an incredibly unreliable method and therefore the data is basically useless for drawing conclusions. That's what people jumped all over me for. Edit: Oh, I see, you're straw-manning me as saying that the only alternative to what you're doing is drug testing levels of certainty.
|
This message was edited 2 times. Last update was at 2017/07/23 17:09:28
|
|
 |
 |
2017/07/23 17:08:50
|
 |
Longtime Dakkanaut
|
I will play devil's advocate for my opponent again because I'm bored.
I will say one thing. Until we know what type of game was reported by each player in those statistics it could very well skew the results.
So for example I know the Tau data in there is composed of at least three different types of games.
Pick up games, tournament games, and team doubles results. Each of these could easily have different variations on rules that skew the results.
/devils advocate end
However the amount they would be skewed by is very little. Right now the ITC and ATC are using the majority of the core rules. Even on the latest Frontline Gaming, Reece and co. said they want to play as close to the books as they can without making changes, and right now there are almost no changes to the core rules.
So if anything the data across all factors would be more reliable now than 6 months down the line, when ITC rules have developed more and clearly skewed the results.
In the future it would be best if people posting in here say what type of game it was so we can get a better idea of where some of the data is coming from. Also I will agree no one in here said this proof was golden goose and should be the standard. No doubt the ITC will also have its information as well.
Also we are seeing the list redistribute data recently as people's armies have been updated for 8th. So this is in fact prime data. As good as we can possibly get without going to a higher source like the ITC or other tournaments. Also as for trustworthiness? So far there doesn't seem to be any malicious intent skewing the data like angry mobs or high emotions. Ultimately you're going to trust people or you're not. Do you think every single person reporting a game lied? I have no doubt there could be liars, but they would be very few. The only other possibility is that everyone is lying or somehow unintentionally falsifying their results, which would be astounding and mean all data everywhere is useless, in which case you probably have something more serious to worry about than a game.
|
This message was edited 2 times. Last update was at 2017/07/23 17:16:24
|
|
 |
 |
2017/07/23 17:13:33
|
 |
Longtime Dakkanaut
|
Interesting side note: it's not actually the case that in drug testing we try really, really, really hard to get the right answer. It's instead that you've got to demonstrate that the treatment clearly works. If the data we've got is insufficient to say that, then the FDA or whoever doesn't approve the treatment. But them doing that is not actually the same as saying that the treatment doesn't work. We're choosing to err on the side of not approving useless or harmful drugs rather than making sure that we're correctly sorting drugs into useless/harmful and useful categories.
|
|
 |
 |
2017/07/23 17:21:42
|
 |
Longtime Dakkanaut
|
I will also say this. GW seem to be genuinely interested in balancing. The latest FAQ had a points adjustment for razorwing flocks. So perhaps in the future GW won't need to be right for 2 years, but only for a few months.
If GW is willing to constantly tweak until perfected (or as close as they can get) then this data is a good starting point. If this was old GW who never updates values or makes changes your point would hold more weight since they would want to be as correct as possible because of a lack of quick updates.
|
This message was edited 2 times. Last update was at 2017/07/23 17:22:39
|
|
 |
 |
2017/07/23 17:24:29
|
 |
Longtime Dakkanaut
|
I guess another interesting side note:
There's been a lot of discussion the past few years about the "replication crisis" in the sciences. Basically, in lots of fields a surprising number of scientific studies turn out to be non-reproducible. Someone did a study and published a paper saying something -- "chocolate keeps you from getting cancer" or whatever -- but then other people tried to do a very similar study and didn't find the same effect. It has been estimated that less than half of medical studies are replicable (these being the apparent gold standard that we should never expect the survey in this thread to come close to).
But a failure to replicate has proven to be an even bigger problem in the social sciences, which rely on surveys and which can only imperfectly control for variables. Researcher degrees of freedom -- the ability of the people analyzing the data to make decisions post-hoc, such as in this thread when factions and subfactions are split out -- are also a big issue. The most rigorous attempts to study this problem to date suggest that perhaps only 30-40% of psychology studies are replicable. But some have found numbers closer to 25%.
This is to say that, actually, most academic psychology studies are bad evidence for the conclusions that their data seem to support. If you're not doing at least as good of a job as academic psychologists, your data is not worth very much. Automatically Appended Next Post: Gamgee wrote:I will also say this. GW seem to be genuinely interested in balancing. The latest FAQ had a points adjustment for razorwing flocks. So perhaps in the future GW won't need to be right for 2 years, but only for a few months.
If GW is willing to constantly tweak until perfected (or as close as they can get) then this data is a good starting point. If this was old GW who never updates values or makes changes your point would hold more weight since they would want to be as correct as possible because of a lack of quick updates.
GW has pretty easy access to much better data than anything like this -- they can get something much closer to a representative random sample by having stores report results. And, like I said earlier, you're going to do pretty well with a handful of mathematically competent people with the right mindsets, without trying to collect data at all. I'm not sure they have such people or that they are trying to determine whether the game is balanced in a competent way, but they do have many better options that are not too hard to implement.
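On the researcher-degrees-of-freedom point above, here is a rough sketch of why splitting results into many subfactions after the fact is risky. It assumes a hypothetical world where every subfaction truly wins exactly 50% of its games, and asks how often at least one of them still looks "significantly" unbalanced under a simple normal-approximation test. The function name, the 20 subfactions and the 30 games each are all illustrative assumptions, not numbers from this thread.

```python
import math
import random


def chance_of_false_alarm(n_subfactions, games_each, trials=2000, seed=0):
    """Estimate how often at least one perfectly balanced subfaction
    looks unbalanced at roughly the 5% level (two-sided z-test).

    Every subfaction is simulated with a true win rate of exactly 0.5,
    so any "significant" result here is a false alarm by construction.
    """
    rng = random.Random(seed)
    # Standard deviation of the win count under a fair coin is sqrt(n * 0.25).
    threshold = 1.96 * math.sqrt(games_each * 0.25)
    alarms = 0
    for _ in range(trials):
        for _ in range(n_subfactions):
            wins = sum(rng.random() < 0.5 for _ in range(games_each))
            if abs(wins - games_each / 2) > threshold:
                alarms += 1
                break  # one false alarm is enough for this trial
    return alarms / trials


if __name__ == "__main__":
    print("1 faction:     ", chance_of_false_alarm(1, 30))
    print("20 subfactions:", chance_of_false_alarm(20, 30))
```

With one group the false-alarm rate stays near the nominal 5%; with twenty groups it should come out far higher, which is the sense in which post-hoc splitting inflates apparent findings.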
|
This message was edited 1 time. Last update was at 2017/07/23 17:30:35
|
|
 |
 |
2017/07/23 17:31:41
|
 |
Longtime Dakkanaut
|
1. Sources. I may be familiar with the topic at hand, but others aren't.
2. This is true, they may not be replicable, but I think the source is a deeper issue at hand in society.
3. So if the majority failed replication then how did the good ones?
4. If you want replicated data it's only been a little over two months since 8th is out. It is unlikely to have enough data to do such a thing as replication.
5. Pure replication won't work in this context since the data has already begun changing with FAQs altering results as well as the ITC going to do its own house rulings. Finally in this case wouldn't pure replication of this study not work since it would just work to recreate the same unbiased results? [Okay I understand that is the point now I had to think for a second] In the sciences there is a fixed outcome being looked for. In 40k there is no fixed result. It is an ethereal "balance" or as close to it as can be obtained and so mere replication would fall short of this.
6. Are you willing to play hundreds of games against at least a hundred people or so to attempt to replicate these results in your own methods?
7. Actual sources to your information please and links.
|
This message was edited 1 time. Last update was at 2017/07/23 17:33:31
|
|
 |
 |
2017/07/23 17:31:59
|
 |
Mighty Vampire Count
|
Gamgee wrote:I will also say this. GW seem to be genuinely interested in balancing. The latest FAQ had a points adjustment for razorwing flocks. So perhaps in the future GW won't need to be right for 2 years, but only for a few months.
If GW is willing to constantly tweak until perfected (or as close as they can get) then this data is a good starting point. If this was old GW who never updates values or makes changes your point would hold more weight since they would want to be as correct as possible because of a lack of quick updates.
I can't see the razorwing flock pts adjustment on the FAQ?
|
I AM A MARINE PLAYER
"Unimaginably ancient xenos artefact somewhere on the planet, hive fleet poised above our heads, hidden 'stealer broods making an early start....and now a bloody Chaos cult crawling out of the woodwork just in case we were bored. Welcome to my world, Ciaphas."
Inquisitor Amberley Vail, Ordo Xenos
"I will admit that some Primachs like Russ or Horus could have a chance against an unarmed 12 year old novice but, a full Battle Sister??!! One to one? In close combat? Perhaps three Primarchs fighting together... but just one Primarch?" da001
www.dakkadakka.com/dakkaforum/posts/list/528517.page
A Bloody Road - my Warhammer Fantasy Fiction |
|
 |
 |
2017/07/23 17:32:31
|
 |
Potent Possessed Daemonvessel
Why Aye Ya Canny Dakkanaughts!
|
Dionysodorus wrote:I mean, sure, if you don't care much about being right you can use really sloppy methods. That's all you're saying here. I agree with that. And so I think now we agree that your data amounts to very weak evidence because you've made basically no attempt to control for error and bias, and so realistic error bars would span basically the whole possible range of values.
"Weak evidence" and "full range error bars" are great exaggerations: error bars would be smaller for those results with far more games reported, and error bars would be even smaller if the results were reported by many different people. There are many factions on the board that we cannot draw conclusions from because of their tiny sample size, but other factions have enough reported games for us to draw a vague idea of the strength of different factions. Saying we can't take conclusions from the IG, SM, CSM, Eldar, DE, Ork, etc. results is just wrong and you should know that.
I do want to ask, though: do you actually think that you're competent to evaluate this stuff? Like, I promise I won't respond to call you a liar if you say you've got a PhD in applied statistics, I just want to know where you think you're coming from. What is your relevant experience in gathering this sort of data and evaluating it to figure out what's true?
I am not a PhD level statistician, I am currently studying for a Masters in Medical Engineering which does have me come into contact with the statistics of prosthetic tensile strengths, the wear on different joints, etc. I also took a course of further statistics in College. Though it doesn't take someone with experience in statistics (just someone with common sense) to go "Guard are doing quite well this edition. Space Marines aren't doing as well as I'd have thought but that is quite a lot of people's starting armies, so I'll take that with a pinch of salt. Tau aren't doing great but they have been getting better over the last couple of weeks, perhaps that is something to do with people now realising Commander spam is pretty good". See, we can draw conclusions from this data.
Okay, now I'm confused because literally all I've been saying is that this is an incredibly unreliable method and therefore the data is basically useless for drawing conclusions. That's what people jumped all over me for. Edit: Oh, I see, you're straw-manning me as saying that the only alternative to what you're doing is drug testing levels of certainty.
I am certainly not straw manning you: just because a set of data does not have 99% reliability doesn't mean no conclusions can be drawn from it; that simply means you have to take the data with a pinch of salt.
You can't claim that I'm straw manning you when you make wild claims like the data is garbage and useless for drawing conclusions when I'm claiming data doesn't have to have the accuracy of scientific studies to be useful.
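To put a rough number on the "more games means smaller error bars" point, here is a minimal sketch using the Wilson score interval for a win rate. It assumes the reported games are an independent, unbiased sample, which is exactly the assumption being disputed in this thread, so treat it as the best case. The game counts in the usage lines are made up for illustration.

```python
import math


def win_rate_interval(wins, games, z=1.96):
    """Wilson score interval for a win rate (z=1.96 gives roughly 95%).

    Only accounts for random sampling error; it says nothing about
    bias in which games get reported.
    """
    if games <= 0:
        raise ValueError("games must be positive")
    p = wins / games
    denom = 1 + z * z / games
    centre = (p + z * z / (2 * games)) / denom
    half_width = z * math.sqrt(p * (1 - p) / games + z * z / (4 * games * games)) / denom
    return centre - half_width, centre + half_width


if __name__ == "__main__":
    # Hypothetical counts: the same 60% win rate at two sample sizes.
    for wins, games in [(6, 10), (60, 100)]:
        low, high = win_rate_interval(wins, games)
        print(f"{wins}/{games}: {low:.2f} to {high:.2f}")
```

The interval for 10 games is several times wider than the one for 100 games, which is why the small-sample factions can't support conclusions even under the most generous assumptions.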
|
Ghorros wrote:The moral of the story: Don't park your Imperial Knight in a field of Gretchin carrying power tools.
Marmatag wrote:All the while, my opponent is furious, throwing his codex on the floor, trying to slash his wrists with safety scissors. |
|
 |
 |
2017/07/23 17:34:50
|
 |
Longtime Dakkanaut
|
Mr Morden wrote: Gamgee wrote:I will also say this. GW seem to be genuinely interested in balancing. The latest FAQ had a points adjustment for razorwing flocks. So perhaps in the future GW won't need to be right for 2 years, but only for a few months.
If GW is willing to constantly tweak until perfected (or as close as they can get) then this data is a good starting point. If this was old GW who never updates values or makes changes your point would hold more weight since they would want to be as correct as possible because of a lack of quick updates.
I can't see the razorwing flock pts adjustment on the FAQ?
Page 2. New changes are in pink.
As to my qualifications, I am not a statistician, but an extremely competent logician and independent thinker. I was eventually going to go in for a doctorate in psychology (medical), but dropped out before I got my bachelor's because I didn't like the people in the program. I keep my mind sharp though. Suffice to say I skewed the results in my class so much I raised the GPA of my class and made them look better.
|
This message was edited 2 times. Last update was at 2017/07/23 17:43:32
|
|
 |
 |
2017/07/23 17:37:07
|
 |
Devestating Grey Knight Dreadknight
|
Magenta.
|
SHUPPET wrote:
wtf is this buddhist monk ascendant martial dice arts crap lol
|
|
 |
 |
2017/07/23 17:38:47
|
 |
Longtime Dakkanaut
|
The only solid fact posted by my opponent. OW. I'm bleeding out. Someone made a better logical argument than any one person in this thread today lol.
|
|
 |
 |
2017/07/23 17:42:49
|
 |
Mighty Vampire Count
|
Gamgee wrote: Mr Morden wrote: Gamgee wrote:I will also say this. GW seem to be genuinely interested in balancing. The latest FAQ had a points adjustment for razorwing flocks. So perhaps in the future GW won't need to be right for 2 years, but only for a few months.
If GW is willing to constantly tweak until perfected (or as close as they can get) then this data is a good starting point. If this was old GW who never updates values or makes changes your point would hold more weight since they would want to be as correct as possible because of a lack of quick updates.
I can't see the razorwing flock pts adjustment on the FAQ?
Page 2. New changes are in pink.
As to my qualifications, I am not a statistician, but an extremely competent logician and independent thinker. I was eventually going to go in for a doctorate in psychology, but dropped out before I got my bachelor's because I didn't like the people in the program. I keep my mind sharp though. Suffice to say I skewed the results in my class so much I raised the GPA of my class and made them look better.
Oh thanks - I was on the GW site and it had not updated. Been to the thread and there's some good stuff - is this to sort out flyer balance?
Page 215
– Sudden Death
Change point 2 to read:
‘If at the end of any turn after the first battle round, one player has no models on the battlefield, the game ends immediately and their opponent automatically wins a crushing victory. When determining if a player has any units on the battlefield, do not include any units with the Flyer Battlefield Role – these units cannot operate within a combat airspace indefinitely and they cannot hold territory without ground support.’
|
I AM A MARINE PLAYER
"Unimaginably ancient xenos artefact somewhere on the planet, hive fleet poised above our heads, hidden 'stealer broods making an early start....and now a bloody Chaos cult crawling out of the woodwork just in case we were bored. Welcome to my world, Ciaphas."
Inquisitor Amberley Vail, Ordo Xenos
"I will admit that some Primachs like Russ or Horus could have a chance against an unarmed 12 year old novice but, a full Battle Sister??!! One to one? In close combat? Perhaps three Primarchs fighting together... but just one Primarch?" da001
www.dakkadakka.com/dakkaforum/posts/list/528517.page
A Bloody Road - my Warhammer Fantasy Fiction |
|
 |
 |
2017/07/23 17:47:31
|
 |
Longtime Dakkanaut
|
mrhappyface wrote:Dionysodorus wrote:I mean, sure, if you don't care much about being right you can use really sloppy methods. That's all you're saying here. I agree with that. And so I think now we agree that your data amounts to very weak evidence because you've made basically no attempt to control for error and bias, and so realistic error bars would span basically the whole possible range of values.
"Weak evidence" and "full range error bars" are great exaggerations: error bars would be smaller for those results with far more games reported, and error bars would be even smaller if the results were reported by many different people. There are many factions on the board that we cannot draw conclusions from because of their tiny sample size, but other factions have enough reported games for us to draw a vague idea of the strength of different factions. Saying we can't take conclusions from the IG, SM, CSM, Eldar, DE, Ork, etc. results is just wrong and you should know that.
This is only true if there is not much bias. Growing your sample size doesn't help if your collection method is bad.
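A toy simulation makes the point concrete. Assume, purely for illustration, a faction whose true win rate is 50%, and assume players post 90% of their wins but only 60% of their losses; those reporting rates are invented, not measured from this thread. The reported win rate then settles near 60% however many games are collected.

```python
import random


def reported_win_rate(n_games, true_win_rate=0.5,
                      p_report_win=0.9, p_report_loss=0.6, seed=0):
    """Simulate self-reported results where wins are posted more often
    than losses, and return the win rate seen in the reports.

    All default parameters are hypothetical, chosen only to illustrate
    selection bias.
    """
    rng = random.Random(seed)
    wins = losses = 0
    for _ in range(n_games):
        won = rng.random() < true_win_rate
        if won and rng.random() < p_report_win:
            wins += 1
        elif not won and rng.random() < p_report_loss:
            losses += 1
    total = wins + losses
    return wins / total if total else float("nan")


if __name__ == "__main__":
    for n in (100, 10_000, 200_000):
        print(n, round(reported_win_rate(n), 3))
```

Under these assumptions the expected reported rate is 0.45 / (0.45 + 0.30) = 0.6. More games only make the estimate converge more tightly on that wrong number; they never move it back toward 0.5.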
mrhappyface wrote:
I am not a PhD level statistician, I am currently studying for a Masters in Medical Engineering which does have me come into contact with the statistics of prosthetic tensile strengths, the wear on different joints, etc. I also took a course of further statistics in College. Though it doesn't take someone with experience in statistics (just someone with common sense) to go "Guard are doing quite well this edition. Space Marines aren't doing as well as I'd have thought but that is quite a lot of people's starting armies, so I'll take that with a pinch of salt. Tau aren't doing great but they have been getting better over the last couple of weeks, perhaps that is something to do with people now realising Commander spam is pretty good". See, we can draw conclusions from this data.
You can't bootstrap like this. You're trying to establish that your data is reliable by testing it against your sense of how the game is balanced currently, but that means that you can't then say that the data is evidence for how the game is balanced currently. That's an incredibly unreliable kind of inference. Edit: You're also allowing yourself to construct just-so stories to explain your data. You could explain almost any result this way. This means that we shouldn't really believe any of the particular stories you're telling to fit however the data happened to end up.
I am certainly not straw manning you: just because a set of data does not have 99% reliability doesn't mean no conclusions can be drawn from it; that simply means you have to take the data with a pinch of salt.
You can't claim that I'm straw manning you when you make wild claims like the data is garbage and useless for drawing conclusions when I'm claiming data doesn't have to have the accuracy of scientific studies to be useful.
I mean, of course it was a straw man. I'm not saying your data has to have 99% reliability to be worth anything. I am saying that it has much, much less than 99% reliability. Everyone is better off ignoring it because people are generally very bad at reasoning about very uncertain claims.
Gamgee wrote:1. Sources. I may be familiar with the topic at hand, but others aren't.
2. This is true, they may not be replicable, but I think the source is a deeper issue at hand in society.
3. So if the majority failed replication then how did the good ones?
4. If you want replicated data it's only been a little over two months since 8th is out. It is unlikely to have enough data to do such a thing as replication.
5. Pure replication won't work in this context since the data has already begun changing with FAQs altering results as well as the ITC going to do its own house rulings. Finally in this case wouldn't pure replication of this study not work since it would just work to recreate the same unbiased results? In the sciences there is a fixed outcome being looked for. In 40k there is no fixed result. It is an ethereal "balance" or as close to it as can be obtained and so mere replication would fall short of this.
6. Are you willing to play hundreds of games against at least a hundred people or so to attempt to replicate these results in your own methods?
7. Actual sources to your information please and links.
There's a ton of work on this but the Wikipedia page on the replication crisis is probably a good overview. The Science-Based Medicine blog has an in-depth treatment of the problem (in medicine) here: https://sciencebasedmedicine.org/is-there-a-reproducibility-crisis-in-biomedical-science-no-but-there-is-a-reproducibility-problem/ . Andrew Gelman's blog spends a lot of time on issues of data analysis and confidence in study results, and is generally great, but is probably a bit too complex for a lay reader: http://andrewgelman.com/
I don't really understand your questions 3-6. I'm not demanding replication. The point is that a lack of reproducibility in a field suggests that results are not reliable -- you should not be very confident in conclusions drawn from them until they have been confirmed several times. What I was saying was that, if a particular study in a major psychology journal is generally not good evidence for its conclusions, then surely the data in this thread is not good evidence for its conclusions.
|
This message was edited 2 times. Last update was at 2017/07/23 17:54:47
|
|
 |
 |
2017/07/23 17:52:55
|
 |
!!Goffik Rocker!!
|
Quickjager wrote:The only real problem that can arise from this is if someone is trying to intentionally dilute the data with false reporting or not reporting the losses.
He who does this must fear the inevitable dakka-vengeance van coming for him.
|
|
 |
 |
2017/07/23 17:57:36
|
 |
Pewling Menial
KY, US
|
So...uh...
Admech win vs Tau
Admech win vs Space Marines (Primaris)
|
|
 |
 |
2017/07/23 18:02:50
|
 |
Longtime Dakkanaut
|
1. Thank you for the sources.
2. You have proven that there is a problem in general with statistics as they are presented to the public.
3. Despite the above you don't actually offer any sort of replication evidence to show that this data is in fact not able to be replicated just that it could.
4. You appeal to a higher authority, which is a logical fallacy. You state that because the above is a known problem in statistics and testing, there must therefore be a problem here, which might not be the case. While I have to admit that it could be a problem, I still need to see your own testing on attempting to replicate the results for this to be a valid point against this specific survey.
5. Medical science is a realm of facts. People playing a game are not so logical. If a computer could make the perfectly balanced game but it isn't fun then the balance never really mattered in the first place. As much as I hate to say it there is much subjectivity in balance and you can't force people to change their minds even if the logic backs it up.
6. It will take months and months of analysis to get to the heart of the logical issues at hand and the math and balancing, but GW does not have that time or luxury. Even if you replicate the study once it would need to be done at least a few more dozen times for it to hold up to scientific scrutiny and rigour which would cause the game to be stuck in balance paralysis. People are not patient enough to wait 6+ months of analysing before every rule addition or change to the game and the company would quickly fall apart as it could never sell or create anything without months of mental gymnastics.
7. In this case a simpler less accurate method is more useful for practicality sake. GW is also able to make changes to its game faster than the scientific community and thus if they do over do balance on one thing or another it can always be tweaked in the future. It's a more rough and tumble approach to numbers and math for sure but it gives them much more adaptability and speed.
Your a smart person who clearly can make a good logical argument, but in this case I just don't feel it holds up to practicality and real life. If you were arguing pure facts like in science then of course this would be a fantastic and compelling argument. Heck I would be right there along side of you as it is once we hit peoples "feelings" the bane of so many scientists it can start to fall apart.
|
This message was edited 2 times. Last update was at 2017/07/23 18:07:27
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 18:10:11
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Potent Possessed Daemonvessel
Why Aye Ya Canny Dakkanaughts!
|
Dionysodorus wrote:This is only true if there is not much bias. Growing your sample size doesn't help if your collection method is bad.
How much bias do you think there is exactly? What bias is throwing this data out of skew so much that it is unusable?
You can't bootstrap like this. You're trying to establish that your data is reliable by testing it against your sense of how the game is balanced currently, but that means that you can't then say that the data is evidence for how the game is balanced currently. That's an incredibly unreliable kind of inference. Edit: You're also allowing yourself to construct just-so stories to explain your data. You could explain almost any result this way. This means that we shouldn't really believe any of the particular stories you're telling to fit however the data happened to end up.
I'm not fitting data around my experiences. I've only seen first hand how World Eaters, Imperial Guard and Orks play this edition (all of my other games have been one-offs against armies I don't play against that much); my opinions on Space Marines, Tau, Eldar, etc. come from this data and what battle reports I've seen. I'm constructing stories that explain the data; these are called 'theories' and are a perfectly reasonable way of interpreting data.
I mean, of course it was a straw man. I'm not saying your data has to have 99% reliability to be worth anything. I am saying that it has much, much less than 99% reliability. Everyone is better off ignoring it because people are generally very bad at reasoning about very uncertain claims.
I'm having to say stuff like it isn't 99% reliable because you won't tell us what the reliability of this data is (so far it's all been a lot of hot air about some biases, which you haven't identified, that skew the results beyond use) and you won't tell us at what level of reliability the data becomes usable. I personally believe, because of the sample size and the fact that these results are being replicated, the reliability of this data is above 50% (of course this isn't the case for the factions with low numbers of results). As for people? I'd say most of the people regularly coming back to this thread to discuss the results are learned enough to be able to make conclusions from these results whilst still keeping an amount of skepticism due to the lack of extreme reliability.
|
Ghorros wrote:The moral of the story: Don't park your Imperial Knight in a field of Gretchin carrying power tools.
Marmatag wrote:All the while, my opponent is furious, throwing his codex on the floor, trying to slash his wrists with safety scissors. |
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 18:45:29
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
!!Goffik Rocker!!
|
It's just not as big of a deal to bother about statistics so much.
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 18:47:21
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Potent Possessed Daemonvessel
Why Aye Ya Canny Dakkanaughts!
|
koooaei wrote:It's just not as big of a deal to bother about statistics so much.
+1 for talking sense.
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 19:33:40
Subject: Re:We've seen the ITC results, but what about Dakkas results so far?
|
 |
Warp-Screaming Noise Marine
|
Reported Dionysodorus for persistently derailing the thread and being generally unpleasant. Go be a smartass somewhere else.
|
Drukhari - 4.7k
Space Marines - 3.1k
Chaos Space Marines - 2.9k
Harlequins - 0.9k
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 20:10:54
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Longtime Dakkanaut
|
mrhappyface wrote:
How much bias do you think there is exactly? What bias is throwing this data out of skew so much that it is unusable?
What do you mean "how much bias"? How are you wanting this quantified? Like I've said, I think the results are sufficiently unreliable that they're basically not worth considering as evidence. In an earlier post I outlined some potential sources of bias. It's on you to demonstrate that you've avoided or accounted for those if you want to make an argument that this data provides good evidence for some conclusion. One of them was even something you probably could have at least looked at with a slightly better collection strategy -- you want to see if the reported results are biased in favor of the reporting players. I feel like you're wanting me to tell you something like what I think the win rate of reporting players is, but obviously that's silly. The problem is that we don't know what the win rate of reporting players is because you failed to collect that information. I think it could be much better than 50%. Glancing over the first two pages this seems to be the case, since in most cases we can figure out which army is the reporter's; reporting players seem to win more than they lose and players reporting lots of games tend to win a lot more than they lose. So at minimum I think it would be a good idea to try to adjust for this, though given your methodology you don't have any real chance of accounting for selection bias in general. As other people have pointed out, a relatively large fraction of the Sisters' matches are due to a small number of players. The usual explanation for their high win rate has been that our Sisters players are just good but it is also possible that the problem is that Sisters are simply more popular among reporters than among the general population and so the general trend of a bias in favor of reporters has a bigger impact here. It would be hard to tease this out.
I'm not fitting data around my experiences. I've only seen first hand how World Eaters, Imperial Guard and Orks play this edition (all of my other games have been one-offs against armies I don't play against that much); my opinions on Space Marines, Tau, Eldar, etc. come from this data and what battle reports I've seen. I'm constructing stories that explain the data; these are called 'theories' and are a perfectly reasonable way of interpreting data.
I mean, in your first response to me you were talking like I was being ridiculous because what I was saying flew in the face of common sense -- everyone knows that Guard and Sisters are good and Orks and Tau aren't. But, yes, you're offering possible explanations for why the different factions could have the win rates suggested by your data. Let's leave aside that we really have no reason to believe these win rates (at least, from your data). We further have basically no reason to believe your explanations for why they have these win rates. Your data shed no light on this; you're just using it to try to confirm your prejudices.
I'm having to say stuff like it isn't 99% reliable because you won't tell us what the reliability of this data is (so far it's all been a lot of hot air about some biases, which you haven't identified, that skew the results beyond use) and you won't tell us at what level of reliability the data becomes usable. I personally believe, because of the sample size and the fact that these results are being replicated, the reliability of this data is above 50% (of course this isn't the case for the factions with low numbers of results). As for people? I'd say most of the people regularly coming back to this thread to discuss the results are learned enough to be able to make conclusions from these results whilst still keeping an amount of skepticism due to the lack of extreme reliability.
I mean, surely it's your job to tell me how reliable your data is, right? If I tell you I've got a fantastic investment opportunity and if you give me $1000 you'll get rich, it's not on you to prove that I don't really have a foolproof scheme. I've got to show you that it is. I've got to be able to explain what I intend to do with the money and how I'm going to get the returns I'm promising, and I've got to answer objections like: "How likely is it that this combination bowling alley / car wash you want to build turns out to be very unpopular and goes bankrupt?" It's not your job to prove that there's no market for that. If you want to make some claim from this data, you've got to actually have an argument that the data is reliable. This is standard in things like polling. The pollster will report nonresponse rates and demographic information and how they tried to obtain a representative sample and what sorts of adjustments they did to control for various kinds of bias, trying to show that they've kept the possible error within acceptable limits.
Sample size alone doesn't cut it. Obviously this does not account for selection bias. That it's being "replicated" -- and I'm not really sure what you mean here -- also doesn't cut it. I think you mean that as you get more results the answers aren't changing too much? But of course this still doesn't even attempt to consider selection bias.
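To illustrate the point about sample size, here is a minimal simulation sketch. Everything in it is an assumption made up for the example: a faction with a true 50% win rate, and players who report 90% of their wins but only 50% of their losses. None of these numbers come from the thread. The reported win rate settles near 64% however many games are collected, so more data only makes the wrong answer more stable.

```python
import random

def reported_win_rate(n_games, true_win_rate=0.5,
                      p_report_win=0.9, p_report_loss=0.5, seed=0):
    """Simulate n_games and return the win rate among *reported* games.

    All probabilities are hypothetical: players are assumed to report
    wins more often than losses, which is one form of selection bias.
    """
    rng = random.Random(seed)
    wins = losses = 0
    for _ in range(n_games):
        won = rng.random() < true_win_rate
        p_report = p_report_win if won else p_report_loss
        if rng.random() < p_report:
            if won:
                wins += 1
            else:
                losses += 1
    total = wins + losses
    return wins / total if total else float("nan")

# The estimate stabilises as n grows, but around the biased value, not 0.5.
for n in (100, 10_000, 200_000):
    print(n, round(reported_win_rate(n), 3))
```

Under these assumed reporting rates the limit is 0.45 / 0.70, about 0.643, regardless of sample size.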
You say that you think your results are 50% reliable. What does an exact percentage mean here? To be clear, I was throwing around "99% reliable" only after you did, when I thought it was clear that this just meant "it is really inconceivable that this stuff would be significantly off from the right answer". But I have no idea what "50% reliable" means. Maybe something like "there is a 50/50 chance that this stuff is basically right?" Obviously that needs to be made a lot more precise. But, regardless, that's not good enough because I agree that the general trend here is probably right. I think that Guard really is likely doing well, and Orks and Tau really are likely not doing that well. I just don't believe that on the basis of your data. The question of whether some of the general trends you're showing are real is very different from the question of whether your data gives us reason to believe that those trends are real. As I've said, I think what you're doing adds basically no value above and beyond what someone could glean from glancing over the forum and seeing which factions there's a consensus on.
I think it is clearly not true that most people in this thread "are learned enough to be able to make conclusions from these results whilst still keeping an amount of skepticism due to the lack of extreme reliability." Very few people in general are. And most people here talking about the data are making elementary errors. I mean, someone is posting up confidence intervals derived from the sample sizes, and this is just straight out of "How to Lie with Statistics". I notice that nobody seems to have called him on this incredibly misleading presentation (to be clear, I don't think it's malicious). Of course, we also know that most people wildly overestimate their own competence to understand this sort of thing. This is part of why I think what you're doing is not just harmless but actively misleading.
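As a rough sketch of why an interval derived from sample size alone can mislead: the usual normal-approximation interval for a proportion depends only on the observed win rate and the number of games. The function name and the 60-wins-in-100-games figures below are hypothetical, chosen for illustration. The interval narrows as games are added, but nothing in the formula knows how the games were collected, so it measures sampling error only.

```python
import math

def wald_interval(wins, n, z=1.96):
    """Normal-approximation (Wald) 95% interval for a win rate.

    Captures sampling error only; it assumes the n games are a
    random sample, which self-reported results are not.
    """
    p = wins / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Same observed win rate, four times the games: the interval halves in width,
# but any bias in who reports is untouched.
for wins, n in ((60, 100), (240, 400)):
    lo, hi = wald_interval(wins, n)
    print(n, round(lo, 3), round(hi, 3))
```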
Automatically Appended Next Post:
SarisKhan wrote:Reported Dionysodorus for persistently derailing the thread and being generally unpleasant. Go be a smartass somewhere else.
I do not really understand why it is derailing to talk about what we can conclude from the data collected in this thread. That is literally half of the point of the thread, other than collecting the data in the first place.
Edit:
I kept looking a little bit through some of the reported data. People have been talking like we have enough Ork games to say that they're probably underpowered. The difference between Ork wins and Ork losses is basically the same as the difference in wins and losses reported by a single player on page 2. Remove one person from your data set and the Orks look a lot like the Grey Knights.
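The check described above can be sketched as a leave-one-reporter-out pass over a faction's results. The record layout, the player names and the numbers here are all invented for illustration and are not the thread's data. The idea is simply to recompute a faction's wins and losses with each reporter's games removed and see whether the picture changes.

```python
from collections import defaultdict

def leave_one_out(records):
    """records: list of (reporter, wins, losses) for one faction.

    Returns {reporter: (wins, losses)} where each entry is the faction's
    total with that reporter's games removed.
    """
    total_wins = sum(w for _, w, _ in records)
    total_losses = sum(l for _, _, l in records)
    per_reporter = defaultdict(lambda: [0, 0])
    for reporter, w, l in records:
        per_reporter[reporter][0] += w
        per_reporter[reporter][1] += l
    return {
        reporter: (total_wins - w, total_losses - l)
        for reporter, (w, l) in per_reporter.items()
    }

# Hypothetical numbers: one reporter accounts for most of the deficit.
ork_records = [("player_a", 1, 9), ("player_b", 5, 4), ("player_c", 6, 5)]
for reporter, (w, l) in leave_one_out(ork_records).items():
    print(f"without {reporter}: {w} wins, {l} losses")
```

If dropping a single reporter flips a faction from clearly losing to roughly even, the faction's result is mostly telling you about that one player.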
|
This message was edited 4 times. Last update was at 2017/07/23 20:26:45
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 20:22:21
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Longtime Dakkanaut
|
That's a lot of bs words to say you don't trust him and accuse him of actively falsifying data. I've debated enough people to know this is about to get ugly. Sadly I have to report you. I didn't want to, since I've been in a similar position on this forum in the past and I've given you more than enough chances to make your point, but my experience tells me otherwise.
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 20:27:06
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Longtime Dakkanaut
|
Gamgee wrote:That's a lot of bs words to say you don't trust him and accuse him of actively falsifying data. I've debated enough people to know this is about to get ugly. Sadly I have to report you. I didn't want to, since I've been in a similar position on this forum in the past and I've given you more than enough chances to make your point, but my experience tells me otherwise.
...I don't think I've ever once suggested that anyone was falsifying data. Like, I haven't even raised the possibility that any of the people reporting games are making games up to make their faction look better or worse. Though come to think of it it's pretty plausible that there's some of that going on, right?
|
This message was edited 1 time. Last update was at 2017/07/23 20:28:30
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 20:32:37
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Longtime Dakkanaut
|
"This is part of why I think what you're doing is not just harmless but actively misleading. "
You wrote that. You're accusing him of actively misleading. You seem to think it's his intent. You're debating from a very angry and hostile place right now and it's showing. I stand by you needing to take a walk when you're disconnecting from the debate that much.
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 20:35:11
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Longtime Dakkanaut
|
One can mislead without intending to do so. He is causing people to have a wrong idea or impression. I think I've been pretty clear that I think the problem is that people are not competent to evaluate information like this rather than that they are competent but malicious.
Also I would note that I phrased it a bit more diplomatically than that, since I said what he was doing was misleading.
|
This message was edited 1 time. Last update was at 2017/07/23 20:35:54
|
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 20:35:15
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Fate-Controlling Farseer
|
Dark Angels w/ Knights vs Daemons w/ Thousand Sons - Dark Angels Lost
|
Full Frontal Nerdity |
|
 |
 |
![[Post New]](/s/i/i.gif) 2017/07/23 20:46:53
Subject: We've seen the ITC results, but what about Dakkas results so far?
|
 |
Potent Possessed Daemonvessel
Why Aye Ya Canny Dakkanaughts!
|
I really don't know what to say to you. You keep bringing up reliability measures that you would have in research projects; you want me to show you that my data is a reliable data source. You know as well as I do that this data has no measures to prevent skew, it doesn't have enough results to be a reliable piece of evidence, and the participants in this data collection are being trusted to be truthful but I can't enforce that. It isn't a reliable data set. Trying to publish a report on the current meta of 40k using this data is a laughable idea.
But all we are doing here is taking a quick look at 8th ed 40k through the experiences of Dakka players and using our knowledge of 40k from our years of experience to make suggestions about the game's balance.
You say that you have no reason to believe my theories because I'm just using unreliable data to confirm my prejudices; what you fail to remember is that I am a veteran player (or like to think I am). It is widely known that Space Marines are the most popular faction and have more new players than any other faction, but they also have many players who have used Space Marines for a long time. When I look at these results it makes sense that they would be below average, and the data follows that trend. It may be a coincidence, but since the data isn't varying from this norm I see no need for concern. Imperial Guard have always been about an average army but now the data says that they are doing far better than normal; they are varying from the norm. So we look into it, read batreps, etc. and find that IG are in fact better than normal.
The data isn't reliable, but so far it has been in accordance with reports from all over the world. Keep harping on about how no conclusions can be drawn from it, but the data has 'coincidentally' been right on the money for almost all of the results. A data set like this lets us look at variances from the norm and ask "why is that different than normal?".
I can't argue against you any more, seeing as we are going around in circles. The data is not reliable but we do not need it to be reliable, we just need it to establish the norm.
Automatically Appended Next Post:
Dionysodorus wrote:One can mislead without intending to do so. He is causing people to have a wrong idea or impression. I think I've been pretty clear that I think the problem is that people are not competent to evaluate information like this rather than that they are competent but malicious.
Also I would note that I phrased it a bit more diplomatically than that, since I said what he was doing was misleading.
Misleading people how? Will people look at this and go "Guard are OP!" without even looking at any other sources? No one is taking these results as gospel.
|
This message was edited 1 time. Last update was at 2017/07/23 20:50:53
|
 |
 |
|