About those "Polls"

What does "validity" mean when used in a proper scientific or statistical fashion?

Validity answers a simple question: to what extent does this study or poll actually help us understand what it says it will help us understand? A well-conducted poll, with a large sample size and unbiased questions, will be Valid, in that it Adds Knowledge. A small poll with biased questions, given right after an emotional event, has zero validity; it can't actually inform you of anything.

Political polling has been an absolute nightmare. The polls were actually worse at predicting this last election than they were in 2016. Trump wasn't even supposed to come close. Why are they off by so much?

Well, part of it is that these polls are not Valid to be used as predictions of the actual election until maybe the last week or better yet the last day. In fact, their Validity goes up with each hour closer to the election, but you wouldn't know that from the people who talk about them weeks before the election. Those people are just using polls for their own agenda and quoting from them in a way that no statistician ever would.

Polls are truly just a snapshot of how the country feels on that day, or even in that hour, and they don't have much validity at all until the very end, and that's only if they're done right.
A major issue is that pollsters have built more complicated models rather than simplifying them.

Each "fix" actually creates a new problem because it comes with yet another assumption. What kind of assumptions? For instance, they start with the assumption that they already know what share of voters are Republicans, Democrats, and Independents. But what if that assumption is wrong? Then you get what is called an over-sample.

A hypothetical example: they start with the assumption that we have 35% Republicans, 25% Independents, and 40% Democrats. They then call up a thousand people, but what if they can't get in touch with exactly 350 Republicans, 250 Independents, and 400 Democrats? Well, then they just weight the sample and assign each respondent a different value. If they only get in touch with 300 Democrats, they do NOT change their assumption and conclude that there are only 300 Democrats; they still assume there are 400 (they just couldn't get in touch with a hundred of them). They give each of the 300 Democrats a value of 1.333 (300 times 1.333 is about 400), and now they have their 400 people by creating a hundred new ones using statistical magic!
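The arithmetic in that example can be sketched in a few lines of Python. This is only an illustration, not any pollster's actual method: the 35/25/40 targets and the 300 Democrats come from the example above, while the 400 Republicans and 300 Independents reached are numbers I made up so the sample adds to a thousand, and `party_weights` is a hypothetical helper.

```python
# Hypothetical party-ID weighting. Only the 35/25/40 targets and the
# 300 Democrats come from the example; the other counts are assumptions.

def party_weights(target_share, respondents):
    """Return the weight given to each respondent in every party group."""
    total = sum(respondents.values())
    return {
        party: target_share[party] * total / respondents[party]
        for party in respondents
    }

# The pollster's assumed party mix.
target_share = {"republican": 0.35, "independent": 0.25, "democrat": 0.40}

# Who actually picked up the phone (a thousand people in total).
respondents = {"republican": 400, "independent": 300, "democrat": 300}

weights = party_weights(target_share, respondents)

for party, weight in weights.items():
    weighted_count = weight * respondents[party]
    print(f"{party}: weight {weight:.3f}, weighted count {weighted_count:.0f}")
```

Each of the 300 Democrats counts as roughly 1.333 people, and each over-represented Republican counts as less than one. Nothing in the code ever checks whether the 35/25/40 target was right to begin with.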

But what if their assumption is wrong and there were more people who now call themselves Republican? What if Trump flipped more people than the assumptions predicted? What if things change? What if there really were 45% Republicans, not 35%? Then you end up with them saying Wisconsin is going to Biden by 12 points when the real margin turns out to be less than one.
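To see how much a wrong party-mix assumption can move the headline number, here is a rough sketch. Every number in it is made up for illustration: the support rates by party, the 45/25/30 "actual" mix (the 45% Republican figure with the rest filled in as an assumption), and the simplification to a two-candidate race. It is not a reconstruction of any real Wisconsin poll, and `biden_margin` is a hypothetical helper.

```python
# Hypothetical: same answers from respondents, two different party mixes.

def biden_margin(party_share, biden_support):
    """Biden-minus-Trump margin in points, assuming a two-candidate race."""
    biden = sum(party_share[p] * biden_support[p] for p in party_share)
    trump = 1.0 - biden
    return (biden - trump) * 100

# Assumed share of each party group that backs Biden (invented numbers).
biden_support = {"republican": 0.05, "independent": 0.50, "democrat": 0.95}

# The mix the pollster assumed versus a mix with ten more points of Republicans.
assumed_mix = {"republican": 0.35, "independent": 0.25, "democrat": 0.40}
actual_mix = {"republican": 0.45, "independent": 0.25, "democrat": 0.30}

assumed_margin = biden_margin(assumed_mix, biden_support)
actual_margin = biden_margin(actual_mix, biden_support)

print(f"Margin under the assumed mix: {assumed_margin:+.1f} points")
print(f"Margin under the actual mix:  {actual_margin:+.1f} points")
print(f"Error from the assumption:    {assumed_margin - actual_margin:.1f} points")
```

With these invented inputs, nobody changed their answer, yet the reported margin swings by double digits purely because the assumed party mix was off.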

This is just one set of assumptions they make, and they make more and more assumptions the more complex their modeling becomes. Each one invites more mistakes, and the mistakes compound.

So why does Rasmussen do better when it comes to political polling? It's because they don't start out with any assumptions at all. They simply follow the same people over time and track how their answers change from day to day and week to week. They created a simplified way of tracking opinions; the rest of them doubled down on a more complex system, which only adds more Assumptions.

Opinion polling is just as bad, if not worse, because it's often used with a different goal in mind: to elicit a response or to PUSH you in a given direction.
Garbage in, garbage out. I can easily create a poll that you wouldn't even know was worded to get the response that I wanted, but I would get the response I wanted, not from everyone, but from as many as I needed.

This is called a "Push Poll," and they are often given immediately after an event, before people have time to really think about it. Hours after a school shooting, they will poll about gun laws and word it in a way that makes you a heartless bastard if you answer wrong, or they will ask about global warming during a heatwave but not during a blizzard, etc.
And to top it all off, they often ask a completely indifferent group of people about topics that they care nothing about and actually believe they can draw valid conclusions from those polls.

Politicians then cite these polls as excuses to pass new laws. Is that ethical? Of course not, but that's how it's done. The media cites these polls to make headlines, knowing full well they are not valid when used out of context. They also use them to Convince and Persuade the public.

Take a fake/push poll, roll it out, and tell everyone they should feel the same way. That's not ethical, but it's what they do.

A poll given to people who don't actually give a damn has ZERO Validity. A poll given to people ignorant of a topic has zero validity. A poll given right after a major event has zero validity. And on and on and on.

Are the pollsters making a muck of it on purpose? Well, some of the time yes! Some are in fact just creeps who will do anything for a buck. But often a poll is taken with a very specific goal in mind and the people who use that poll know its limitations. Used properly and with the full understanding of the limitations it becomes valid for that limited purpose and that purpose alone.

Our media, however, will take any poll, no matter how limited, and talk about it as if it answers everything in the world and why! And our politicians do the same. They quote from polls so far out of context that they aren't in the same country. They would do a poll of people in South Africa and claim it's valid to use in regards to South Philadelphia. That's an exaggeration, but not by much.

Drawing conclusions from push-polling indifferent people doesn't help us understand Anything. Over-sampling is so common that it's become the norm. Adding extra confounding assumptions to computer models is the norm. Using polling in a way that is so far away from its intended purpose that its validity falls to Zero is the norm. Cherry-picking polls is the norm, etc.

They have abused polling so badly that it needs its own Me Too Movement.

Sadly, polling has become yet another tool of the propagandists, who use polls to make claims that the poll makers know can't possibly be valid. Some poll makers are paid to make erroneous push polls. Many are truly confused as to why adding complexity to computer-modeled polling isn't a good thing.

At this point in time, the only one I pay much attention to is Rasmussen, whose only real problem is the sample size. I wish they were sampling about ten times as many people, but for now, they are the best we have.
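For a sense of what ten times the sample would buy, here is a small sketch using the standard margin-of-error formula for a simple random sample at 95% confidence. The sample sizes of 1,000 and 10,000 are hypothetical round numbers, not Rasmussen's actual figures, and the formula only covers random sampling error, not the assumption problems described above.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate 95% margin of error for a simple random sample of size n.

    p=0.5 is the worst case; z=1.96 is the usual 95% confidence multiplier.
    """
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical sample sizes: a typical poll versus one ten times larger.
small_sample = 1_000
large_sample = 10_000

moe_small = margin_of_error(small_sample)
moe_large = margin_of_error(large_sample)

print(f"n = {small_sample}: about +/- {moe_small * 100:.1f} points")
print(f"n = {large_sample}: about +/- {moe_large * 100:.1f} points")
print(f"Improvement factor: {moe_small / moe_large:.2f}")
```

Ten times the people shrinks the random error by a factor of about 3.2 (the square root of ten), not by ten.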

Time for the eggheads to make a stand and demand their work be used in proper context.
Statisticians of the world, unite and take over.
h/t Brad Smith

