Hazel and baby! 🪴🐩🧶
Foodie, Data Nerd, Sleep Deprived. We repealed the 8th! 1/7 of @repeal_shield. Besties with the block button. hazelisok@mastodon.ie She/her.

May 24, 2018, 18 tweets

So, here is the long-promised @repeal_shield block list overview. I know I've been promising it for a while, but the list grew so rapidly, and I'm new to the data analysis stuff, so it took much longer than expected. #RepealShield

Firstly, this analysis is not based on the full block list as it currently stands. There are currently over 16,000 on the list - this analysis is based on the last data dump from a few days ago so that's 13,474 users.

Now, we've seen some absolutely odious users. Some are not just actively pushing abuse and misinformation, but also breaking the Twitter rules (e.g. fake political party accounts). So, unsurprisingly, some accounts have since been suspended: about 2.42% of all blocked accounts.

Another small group have set their accounts from public to protected. We also see a few accounts (mainly re-reg accounts) pre-emptively blocking RS. As we can't access much of the info from these accounts, let's exclude them too. That leaves us with 12,929 accounts to dig into.
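As a quick sanity check on those exclusions, here is the arithmetic as a small Python sketch. It assumes the 2.42% applies to the 13,474 accounts in the data dump; the split between suspended and protected/blocking accounts is derived from that assumption, so treat it as approximate.

```python
# Figures quoted in the thread.
dump_total = 13_474       # accounts in the data dump
suspended_share = 0.0242  # share of blocked accounts since suspended
analysed = 12_929         # accounts left to dig into

# Assumption: the 2.42% is measured against the data dump total.
suspended = round(dump_total * suspended_share)
excluded = dump_total - analysed
protected_or_blocking = excluded - suspended

print(suspended, excluded, protected_or_blocking)  # 326 545 219
```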

One of the easiest checks to do is language - this checks the language the user uses Twitter in. Unsurprisingly this is primarily English, although 31 languages have been represented. Excluding English, the top 5 languages are: Spanish, French, Italian, German and Polish.

The majority of accounts on the list have not allowed geo-enabling permissions. This means that only 27.6% of users allow Twitter to access their actual location.

Now, location on Twitter is a tricky one. One metric we can look at is the timezone the account is set to. Only 44.2% of accounts on the list provide this information.

Here's where things start to get interesting. Ireland is UTC+01:00 so we should expect to see most conversation coming from there, and perhaps UTC+02:00 (western Europe). Instead we see the majority are from the Americas.

Let's look at that same data in a slightly different way by roughly grouping the timezones by continent. Accounts set to a European timezone only account for 27.3% of all accounts blocked by the shield, while accounts in the Americas account for 70.4%.
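A rough grouping like this can be sketched in Python. This is only an illustration, not the method used for the figures above: the function names are made up, and it assumes each account's timezone is available as a UTC offset in hours, with the offset ranges chosen by hand.

```python
from collections import Counter

def rough_continent(utc_offset_hours):
    """Very rough continent bucket from a UTC offset in hours (assumed mapping)."""
    if utc_offset_hours is None:
        return "unknown"
    if -10 <= utc_offset_hours <= -3:
        return "Americas"
    if -1 <= utc_offset_hours <= 3:
        return "Europe/Africa"
    return "other"

def continent_shares(offsets):
    """Percentage share per bucket, ignoring accounts with no timezone set."""
    known = [rough_continent(o) for o in offsets if o is not None]
    counts = Counter(known)
    total = len(known)
    return {name: round(100 * n / total, 1) for name, n in counts.items()}

# Made-up sample offsets, not real data from the list.
sample = [-5, -5, -8, 1, 0, None, 9, -6]
print(continent_shares(sample))
```

Offsets alone can't tell Europe from Africa, which is why the bucket name is deliberately vague.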

Users can add any location to their profile, and it isn't verified. As with geo-location, many users don't provide a location, and some provide junk locations. Based on the user-provided locations, we blocked users from 102 different countries from Tokyo to Iran to Kenya to Venezuela.

54.2% of users provided a location. There is no way of verifying these locations, so that needs to be kept in mind. Due to the unique situation in Northern Ireland, I've separated them out from the UK, but they are still counted as the same country (UK) in the country list.

52% of stated locations are in the US, 25% are in Ireland, 7.5% in the UK and 3% in Northern Ireland. Now, those figures for Ireland seem quite high as only 27.3% of all account timezones are based in Europe, so let's look at the timezones of accounts with an Irish location.

58% of users with an Irish location didn't provide a timezone. This is significantly higher than the overall no location rate for users on the list (45.8%). Of those that provided a timezone, 20% are in the Americas, 70% in Ireland/UK and 8.5% are elsewhere in Europe.

Although we can't do location verification, we could consider users that state an Irish location and also have an Irish timezone to be "legit". That is 522 users, or 4% of the list. It's important to note that's not 522 unique people; we've seen multiple re-regs to get around a block.
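For anyone curious how a check like that could look, here is a minimal Python sketch. Everything in it is an assumption for illustration: the `location` and `time_zone` field names, the list of place words, and the "dublin" timezone label. It also leaves out Northern Ireland, since that was counted separately above.

```python
# Hypothetical helpers: field names and keyword lists are assumptions.
IRISH_PLACE_WORDS = ("ireland", "dublin", "cork", "galway", "limerick")
IRISH_TIME_ZONES = {"dublin"}  # assumed timezone label

def states_irish_location(location):
    """True if the free-text location mentions an Irish place word."""
    text = (location or "").lower()
    if "northern ireland" in text:
        return False  # counted separately in this analysis
    return any(word in text for word in IRISH_PLACE_WORDS)

def looks_legit_irish(account):
    """Irish stated location AND Irish timezone, as described above."""
    time_zone = (account.get("time_zone") or "").lower()
    return states_irish_location(account.get("location")) and time_zone in IRISH_TIME_ZONES

# Made-up sample accounts, not real data from the list.
accounts = [
    {"location": "Dublin, Ireland", "time_zone": "Dublin"},
    {"location": "Ireland", "time_zone": "Eastern Time (US & Canada)"},
    {"location": "Belfast, Northern Ireland", "time_zone": "Dublin"},
    {"location": None, "time_zone": None},
]
print(sum(looks_legit_irish(a) for a in accounts))  # 1
```

Keyword matching on a free-text field will miss plenty and let some junk through, so this is a rough filter at best.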

There is a theory regarding bot/sock accounts from Russia - these accounts end in 8 digits e.g. Hazel01234567. (Learn more about that theory here: lbc.co.uk/radio/presente…). So do we have many bot number accounts? There are 427 accounts with this format, or 3.3% of the list.
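The "ends in 8 digits" pattern is easy to test with a regular expression. Here's a minimal Python sketch; the helper name is made up, and it assumes "ends in 8 digits" means exactly eight trailing digits after a non-digit character.

```python
import re

# A non-digit character followed by exactly eight digits at the end of the handle.
EIGHT_DIGIT_SUFFIX = re.compile(r"[^0-9][0-9]{8}$")

def has_bot_style_handle(handle):
    """True if the handle ends in exactly eight digits (hypothetical helper)."""
    return EIGHT_DIGIT_SUFFIX.search(handle) is not None

# Made-up sample handles.
handles = ["Hazel01234567", "Hazel123", "Hazel0123456789", "repeal_shield"]
print([h for h in handles if has_bot_style_handle(h)])  # ['Hazel01234567']
```

Matching this format doesn't prove an account is a bot, as Twitter itself suggests usernames in this style to new users.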

There are also a very small number of verified Twitter accounts on the list. A total of 114 verified accounts have been blocked, accounting for only 0.88% of the list.
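The percentages for these small groups can be checked against the 12,929 accounts analysed. A short Python sketch of that arithmetic (the `share` helper is just for illustration):

```python
ANALYSED = 12_929  # accounts left after exclusions

def share(count, total=ANALYSED):
    """Percentage of the analysed list, rounded to two decimals."""
    return round(100 * count / total, 2)

print(share(522))  # Irish location + Irish timezone: 4.04
print(share(427))  # handles ending in 8 digits: 3.3
print(share(114))  # verified accounts: 0.88
```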

Finally, let's look at the accounts on the list by creation date. We can see that while there's been a steady stream of accounts blocked that joined over the last few years, the account creation really ramped up in 2017 and 2018.
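Counting accounts by the year they joined could look something like the Python sketch below. It assumes each account's creation date is a string in the form "Thu May 24 10:00:00 +0000 2018"; the function name and sample values are made up.

```python
from collections import Counter
from datetime import datetime

# Assumed timestamp format, e.g. "Thu May 24 10:00:00 +0000 2018".
CREATED_AT_FORMAT = "%a %b %d %H:%M:%S %z %Y"

def creation_year_counts(created_at_values):
    """Number of accounts per creation year, sorted by year."""
    years = (datetime.strptime(v, CREATED_AT_FORMAT).year for v in created_at_values)
    return dict(sorted(Counter(years).items()))

# Made-up sample timestamps, not real data from the list.
sample_dates = [
    "Thu May 24 10:00:00 +0000 2018",
    "Sun Jan 01 00:00:00 +0000 2017",
    "Tue Mar 21 20:50:14 +0000 2006",
    "Thu May 24 18:30:00 +0000 2018",
]
print(creation_year_counts(sample_dates))  # {2006: 1, 2017: 1, 2018: 2}
```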

So that's all I have for now, not much but a few interesting bits all the same. I had planned to get much more done (e.g. keywords analysis on the bios & comparison vs subscribers), but the canvassing and conversations in the real world take priority at the final hurdle!
