The Guardian have tried their hand at statistical analysis again – after resounding failures the last two times, you have to at least salute their determination.
As part of their “Web We Want” initiative, the Guardian have published Max Kelsen’s extensive study of Twitter “abuse” suffered by politicians. The study seeks to demonstrate and explain the “concerning” level of abuse, and manages to do neither. Instead it becomes just a tool for the Guardian to justify and renew their assault on the idea of internet free speech.
Methods and Data
The first point that needs to be addressed is how this study defines, and subsequently identifies, “abuse”:
Tweets were filtered into those that contained abusive words, and those that didn’t. While this will include false positives in the case of tweets primarily directed at one politician but containing abuse directed at another, these are in the minority.
Their method WILL produce false positives. Not “might produce”, “will produce”; a very important distinction.
But don’t worry, these “false positives” are, they assure us, definitely “in the minority”. They never say how they know this, or how they could know, since no data is given. For all we – and possibly they – know, the admitted “false positives” could make up literally 100% of their sample.
And it should be noted that these “false positives” could include a total reversal of the intent of the tweet. For example, the phrase “Hillary Clinton is not a bitch” would be shuffled into the “abuse” pile simply for containing the word “bitch”.
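To make the point concrete, here is a minimal sketch of a bare keyword filter of the kind the quoted method describes. It is not Max Kelsen’s code: the study publishes neither its word list nor its implementation, so the word list (ABUSIVE_WORDS) and the function name (contains_abusive_word) are my own assumptions, for illustration only.

```python
import re

# Hypothetical word list: the study does not publish the one it used.
ABUSIVE_WORDS = {"bitch", "idiot"}


def contains_abusive_word(tweet: str) -> bool:
    """Return True if any listed word appears in the tweet.

    Context is ignored entirely: negation, quotation and sarcasm
    all look the same to a plain word match.
    """
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return any(token in ABUSIVE_WORDS for token in tokens)


tweets = [
    "Hillary Clinton is not a bitch",       # defends her, still flagged
    "Hillary Clinton is a bitch",           # the case the filter is meant to catch
    "Hillary Clinton gave a speech today",  # no listed word, not flagged
]

flags = [contains_abusive_word(tweet) for tweet in tweets]

for tweet, flagged in zip(tweets, flags):
    print(f"{'ABUSE' if flagged else 'clean'}: {tweet}")
```

Under these assumptions the insult and its negation land in the same pile, and nothing in the filter can tell them apart. Counting how often that happens would need a hand-labelled sample, which the study does not report.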
Still, it’s not every statistics firm that would have the chutzpah to freely admit that anything up to 49% of their data may be totally and irrevocably flawed. So hats off to Max Kelsen on that score anyway.
The study also suggests that the vast majority (75%+) of “abusive” tweets come from men, without in turn pointing out that Twitter never specifically asks for a user’s gender, and actually “assigns” it using an algorithm that famously skews male.
…but wait a minute:
The gender of tweeters was assigned where possible based on available information, such as bio information or the tweets themselves.
So Twitter’s algorithm doesn’t actually matter, because this “analysis” didn’t even get that technical. No, they just looked at the accounts and sort of guessed. Brilliant.
None of which really matters, in the end, because their graphs reveal that – even including all those false positives – less than 2% of Twitter posts are abusive.
Less than two percent. 98% of tweets are non-abusive.
That’s hardly a tickly cough, let alone the “epidemic” that the Guardian is so fond of describing. The study itself seems to recognise the minuteness of the alleged problem, saying this in their summary:
A key point to make is that data alone is not an accurate way to reflect the impact of abuse.
Again, it’s not every statistical study that would sum up: “OK, there’s not much data here…but it feels bigger than it looks”. Maybe this is some new, progressive mathematics – much like the Common Core syllabus in the US – where numbers are given increased weight based on how they make one feel.
It doesn’t take a skilled reader of subtext to see where this is going – the intent of the “Web We Want” section, coincidentally launched in parallel with Yvette Cooper’s “Reclaim the Internet” campaign, has always been clear. They attack free speech under the guise of protecting the “oppressed” and the “bullied” – most of the time, this means women.
That slant is clear here. The headline reads:
From Julia Gillard to Hillary Clinton: online abuse of politicians around the world
…which implies there is disparity between men and women in the amount of abuse received. This early paragraph does the same:
The abuse of politicians online, particularly women, is perceived by some to come with the territory. But as high-profile cases flag the urgent need to clean up the web, the scope of the problem is now revealed in greater detail in work by a Brisbane-based social data company, Max Kelsen.
The phrase “particularly women” is an interesting one, especially since, just a little way down the page, they reveal that the abuse is, in reality, evenly split between men and women over their samples.
Hillary Clinton receives more “abuse” than Bernie Sanders, and Julia Gillard was apparently abused more than Kevin Rudd…but Chris Christie received more abuse than Carly Fiorina, and Andy Burnham and Jeremy Corbyn both received nearly twice as much “abuse” as their female counterparts. In short: There’s no real difference between the genders.
You’d be forgiven, given the tone, for thinking the opposite – the article cites the Jess Phillips claim of 600 threats in one night, repeats Yvette Cooper’s famous “threat” (which, to me, reads as an obviously rather tasteless joke), and then treats us to some pictures of Jo Cox’s mournful public, suggesting that controlling what people are allowed to say on the internet might have saved her life.
The study tells us to disregard the data, and focus on the “emotional impact” of the abuse. I would say disregard the data (or lack thereof), and instead focus on how the Guardian is choosing to present it.