In the so-called "Arab Spring", social bots were allegedly deployed on a massive scale. US government agencies have funded research on the manipulation of social networks since 2011. See: https://www.darpa.mil/program/social-media-in-strategic-communication
Bots also turn up again and again among Trump's supporters. The president has now tweeted about the current protests in Iran, which raises the question of whether social bots play a role in this debate as well.
Social Bots: A Definition Is Missing
Social bots are automated accounts that pretend to be human users. That sounds simple, but it is not a precise definition. At what point does an account count as automated? What has to be the case for the intent to deceive to be clear? In practice, one makes do with collecting indicators. It must be clear, however, that these are only hints. Nothing is so strange that it would not also occur on the internet all by itself. It can also easily happen that precisely those accounts that do not want to be "transparent" are classified as bots (http://politicaldatascience.blogspot.de/2016/03/ethics-of-botdetection.html).
Heuristics
One can try to identify social bots using simple rules. Advantage: the rules are easy to understand. Disadvantage: one does not know how well the classification really works. The alternative is machine learning. For that, however, one needs (in principle) already coded data, i.e. accounts that are known to be bots. Since bots apparently differ considerably across contexts and language areas, the results can often be very poor.
Source
Many tweets are not created with the official Twitter app but with other programs that access the API directly. The originating application ("source") is stored in the Twitter data (although clever software can manipulate it).
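As a minimal sketch of how the "source" field can be inspected: the API delivers it as an HTML anchor, so the host and the application name have to be pulled out of the markup. The toy values and the helper names `source_host` and `source_app` below are assumptions for illustration, not part of the original analysis.

```r
# Toy 'source' values in the HTML-anchor format delivered by the Twitter API
src <- c(
  "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
  "<a href=\"https://ifttt.com\" rel=\"nofollow\">IFTTT</a>",
  "<a href=\"http://tapbots.com/tweetbot\" rel=\"nofollow\">Tweetbot for iOS</a>"
)

# Hypothetical helper: extract the host from the href attribute
source_host <- function(x) {
  href <- sub('^<a href="([^"]*)".*$', "\\1", x)  # keep only the URL
  host <- sub("^[a-z]+://", "", href)              # drop the protocol
  sub("[/:].*$", "", host)                         # drop port and path
}

# Hypothetical helper: extract the visible application name from the anchor
source_app <- function(x) sub("^<a [^>]*>(.*)</a>$", "\\1", x)

hosts <- source_host(src)
apps  <- source_app(src)
print(data.frame(host = hosts, app = apps))
```

Matching on the clean host rather than on the raw HTML string makes it easier to maintain a list of suspicious applications.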
Same Text, No Retweet
If a text is posted that has been used in exactly the same form by another user, this can point to an automated bot, at least if it is not a retweet.
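The duplicate check can be sketched on toy data as follows. This is an illustration under assumptions: retweets are recognised here by the conventional "RT @" prefix, and every copy of a repeated text is flagged, including the first one. Whether the first occurrence should count is a design choice.

```r
# Toy data: two identical original tweets, two identical retweets, one unique tweet
tweets <- data.frame(
  user = c("a", "b", "c", "d", "e"),
  text = c("Free Iran now!", "Free Iran now!",
           "RT @a: Free Iran now!", "RT @a: Free Iran now!",
           "Something else"),
  stringsAsFactors = FALSE
)

# Assumption: a retweet starts with the conventional "RT @" prefix
is_retweet <- grepl("^RT @", tweets$text)

# duplicated() alone marks only the second and later copies;
# adding fromLast = TRUE marks every copy, including the first
is_copy <- duplicated(tweets$text) | duplicated(tweets$text, fromLast = TRUE)

# Row indices of identical texts that are not retweets
dup_idx <- which(is_copy & !is_retweet)
print(tweets[dup_idx, ])
```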
Hyperactive Cracker
The point (or at least one point) of bots is that they can post more than normal people and thereby distort trends. How often a user tweets per day therefore says a good deal about whether it is a human or a machine. But how often is too often? A relatively conservative estimate is to look at how often the most active 5% of users tweet. That, too, is somewhat arbitrary and rests on the idea that at least 5% of tweets come from bots (which all studies so far roughly confirm).
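The 5% rule can be sketched with made-up numbers like this. The column names and values are assumptions for illustration; accounts with an age of zero days are set to NA so that they do not produce an infinite rate.

```r
# Toy data: lifetime tweet counts and account age in days (made-up values)
users <- data.frame(
  statuses_count = c(10, 200, 3000, 50000, 400000, 120, 900, 15, 60, 7000),
  age_days       = c(100, 400, 1000, 2000, 1500, 30, 300, 0, 20, 700)
)

# Accounts created today have age 0: treat their rate as unknown, not Inf
users$tw_per_day <- ifelse(users$age_days > 0,
                           users$statuses_count / users$age_days, NA)

# Threshold: 95th percentile of tweets per day among accounts with a known rate
threshold <- quantile(users$tw_per_day, probs = 0.95, na.rm = TRUE)

# Row indices of accounts above the threshold; which() drops the NA rows
hyperactive <- which(users$tw_per_day > threshold)
print(users[hyperactive, ])
```

The threshold depends entirely on the sample at hand, so it should be recomputed for every new data set rather than reused.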
Friend/Follower Ratio
The ratio of friends to followers is also very interesting. In the past, bots had many friends (they followed many users) but few followers themselves. Our analyses have shown that the ratio is now often balanced for bots, because the bot software only follows additional users when new followers are gained as well. Often bots also simply follow each other. So one can look for users whose friend/follower ratio is roughly 1. To avoid misclassifications, one should additionally look for a large number of followers (e.g. more than 1,000).
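A minimal sketch of this rule on toy data follows. The tolerance of 0.1 around a ratio of 1 is a hypothetical choice, since the text only says "roughly 1", and the column names are assumed.

```r
# Toy data: friends (accounts followed) and followers per account
accounts <- data.frame(
  friends_count   = c(5000, 1220, 150, 30000, 0),
  followers_count = c(40, 1200, 160, 29000, 2500)
)

tol           <- 0.1    # hypothetical tolerance: ratios in [0.9, 1.1] count as "about 1"
min_followers <- 1000   # cut-off suggested in the text

ratio <- accounts$friends_count / accounts$followers_count

# Balanced ratio AND a large audience; small balanced accounts are ignored
balanced <- which(abs(ratio - 1) <= tol &
                  accounts$followers_count > min_followers)
print(accounts[balanced, ])
```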
Verified Users
Media accounts in particular often tweet a great deal and could therefore look like bots (especially since most of them do operate in an automated way). Most of these accounts, however, are verified by Twitter. One should definitely make sure that such accounts do not end up on the bot list.
#Iran
I applied these heuristics to a sample of about 100,000 tweets containing the word "Iran", collected on 30 December 2017 over 4 hours via Twitter's streaming API.
3,500 tweets came from "automated" sources. 114 tweets were duplicates (without being retweets). Over their entire "lifetime", the most active 5% in the sample posted more than 176 tweets per day on average. That added another 5,100 tweets. (Incidentally, half of these accounts had already been classified as bots via "source".) A further 4,500 tweets (or rather their senders) had a conspicuous friend/follower ratio. With duplicates and verified users removed again, that leaves 11,000 tweets from bots (or rather: tweets that appear to come from bots according to the heuristics applied here). This is roughly in line with estimates from other studies in other contexts.
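How the individual hit lists are merged into one count can be sketched as follows. All index vectors here are hypothetical toy values. The point is only that a tweet flagged by several heuristics is counted once and that verified accounts are removed at the end.

```r
n <- 12                         # toy sample of 12 tweets

# Hypothetical row indices flagged by each heuristic
bots_auto   <- c(1, 2, 3)       # automated source
bots_dup    <- c(3, 4)          # identical text, not a retweet
bots_active <- c(2, 5, 6)       # hyperactive accounts
bots_ratio  <- c(7, 8)          # conspicuous friend/follower ratio

# Hypothetical verification flags, one per tweet
verified <- rep(FALSE, n)
verified[c(6, 8)] <- TRUE

# A tweet counts once, no matter how many heuristics flag it
flagged <- sort(unique(c(bots_auto, bots_dup, bots_active, bots_ratio)))

# Verified accounts are taken off the list again
bots <- flagged[!verified[flagged]]

share <- length(bots) / n       # share of presumed bot tweets in the sample
print(bots)
print(share)
```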
One could now go further and look at what the bots are talking about. Here is just a brief overview in the form of two word clouds (all messages without bots and all bot messages):
Left: without bots; right: bots only
Do Social Bots Have an Effect?
We do not know. But we also do not know whether Trump's tweeting has an effect.
R Code
Anyone who reads the code carefully will also find a list of the 100 most active "bots" in the sample. I looked at some of them more closely and believe that the heuristics work quite well. Errors, however, always come with the territory!
Sat Dec 30 17:55:17 2017
# Iran BotAnalyse
# load packages
library(streamR)
## Loading required package: RCurl
## Loading required package: bitops
## Loading required package: rjson
library(ROAuth)
library(twitteR)
# # authenticate Twitter API
# requestURL <- "https://api.twitter.com/oauth/request_token"
# accessURL <- "https://api.twitter.com/oauth/access_token"
# authURL <- "https://api.twitter.com/oauth/authorize"
# consumerKey <- "xxxxxyyyyyzzzzzz"
# consumerSecret <- "xxxxxxyyyyyzzzzzzz111111222222"
#
# token = "xxxxxyyyyyyyyyyyzzzzzzzzz"
# tokenSecret = "xxxxxxxxxxxxxyyyyyyyyyyyzzzzzzzzz"
#
# # get 4 hours of "Iran" from STREAMING-API
# for(i in 1:4){
# file = paste0("tweets", gsub(" |:", "-", Sys.time()), ".json")
# track = "Iran"
# follow = NULL
# loc = NULL #c(50.33, 6.1, 52.36, 9.4)
# lang = NULL
# time = 60*60
# tweets = NULL
# filterStream(file.name = file, track = track,
# follow = follow, locations = loc, language = lang,
# timeout = time, tweets = tweets, oauth = Cred,
# verbose = TRUE)
# }
# change file names according to your data
df <- parseTweets("tweets2017-12-30-11-47-05.json", verbose = FALSE)
df2 <- parseTweets("tweets2017-12-30-12-47-05.json", verbose = FALSE)
df3 <- parseTweets("tweets2017-12-30-13-47-05.json", verbose = FALSE)
df4 <- parseTweets("tweets2017-12-30-14-47-05.json", verbose = FALSE)
df <- rbind(df, df2, df3, df4)
rm(list=c("df2", "df3", "df4"))
# function to get URL-parts
URL_parts <- function(x) {
  m <- regexec("^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)", x)
  parts <- do.call(rbind,
                   lapply(regmatches(x, m), `[`, c(3L, 4L, 6L, 7L)))
  colnames(parts) <- c("protocol", "host", "port", "path")
  parts
}
sources <- URL_parts(df$source)[,2]
sort(table(sources)[table(sources) > 30], decreasing = TRUE)
## sources
## twitter.com
## 76361
## twitter.com" rel="nofollow">Twitter Web Client<
## 20421
## mobile.twitter.com" rel="nofollow">Twitter Lite<
## 4501
## ifttt.com" rel="nofollow">IFTTT<
## 1611
## about.twitter.com
## 1143
## dlvrit.com
## 482
## www.google.com
## 438
## www.twitter.com" rel="nofollow">Twitter for Windows<
## 356
## publicize.wp.com
## 312
## www.facebook.com
## 270
## tapbots.com
## 217
## www.hootsuite.com" rel="nofollow">Hootsuite<
## 215
## mobile.twitter.com" rel="nofollow">Mobile Web (M2)<
## 204
## www.tweetcaster.com" rel="nofollow">TweetCaster for Android<
## 169
## www.twitter.com" rel="nofollow">Twitter for Windows Phone<
## 136
## www.echofon.com
## 85
## www.crowdfireapp.com" rel="nofollow">Crowdfire - Go Big<
## 78
## mvilla.it
## 67
## github.com
## 55
## bufferapp.com" rel="nofollow">Buffer<
## 51
## itunes.apple.com
## 49
## twitterrific.com" rel="nofollow">Twitterrific<
## 46
## paper.li" rel="nofollow">Paper.li<
## 38
## instagram.com" rel="nofollow">Instagram<
## 34
## zou.tv" rel="nofollow">الفيديو للكباُر فقط<
## 31
# Sources other than twitter.com are suspicious
# collect suspicious sources
auto <- c("publicize.wp.com", "www.echofon.com", "www.hootsuite.com", "tapbots.com",
          "www.tweetcaster.com", "www.crowdfireapp.com", "ifttt.com", "twittbot.net",
          "software.complete.org", "twicca.r246.jp", "roundteam.co", "twibble.io",
          "paper.li", "twitterrific.com", "mvilla.it", "dlvrit.com", "bufferapp.com")
# show URLs to check the source webpages
unique(df$source[grepl(paste(auto,collapse="|"), df$source)])
## [1] "<a href=\"https://dlvrit.com/\" rel=\"nofollow\">dlvr.it</a>"
## [2] "<a href=\"https://ifttt.com\" rel=\"nofollow\">IFTTT</a>"
## [3] "<a href=\"http://twibble.io\" rel=\"nofollow\">Twibble.io</a>"
## [4] "<a href=\"http://www.echofon.com\" rel=\"nofollow\">Echofon Android</a>"
## [5] "<a href=\"http://www.tweetcaster.com\" rel=\"nofollow\">TweetCaster for Android</a>"
## [6] "<a href=\"http://www.echofon.com/\" rel=\"nofollow\">Echofon</a>"
## [7] "<a href=\"http://bufferapp.com\" rel=\"nofollow\">Buffer</a>"
## [8] "<a href=\"http://publicize.wp.com/\" rel=\"nofollow\">WordPress.com</a>"
## [9] "<a href=\"http://www.hootsuite.com\" rel=\"nofollow\">Hootsuite</a>"
## [10] "<a href=\"http://tapbots.com/tweetbot\" rel=\"nofollow\">Tweetbot for iΟS</a>"
## [11] "<a href=\"http://twittbot.net/\" rel=\"nofollow\">twittbot.net</a>"
## [12] "<a href=\"http://twitterrific.com\" rel=\"nofollow\">Twitterrific</a>"
## [13] "<a href=\"http://twicca.r246.jp/\" rel=\"nofollow\">twicca</a>"
## [14] "<a href=\"http://www.crowdfireapp.com\" rel=\"nofollow\">Crowdfire - Go Big</a>"
## [15] "<a href=\"https://roundteam.co\" rel=\"nofollow\">RoundTeam</a>"
## [16] "<a href=\"http://tapbots.com/software/tweetbot/mac\" rel=\"nofollow\">Tweetbot for Mac</a>"
## [17] "<a href=\"http://paper.li\" rel=\"nofollow\">Paper.li</a>"
## [18] "<a href=\"http://mvilla.it/fenix\" rel=\"nofollow\">Fenix 2</a>"
## [19] "<a href=\"http://mvilla.it/fenix\" rel=\"nofollow\">Fenix 2 Preview</a>"
## [20] "<a href=\"http://tapbots.com/tweetbot\" rel=\"nofollow\">Tweetbot 3 for iΟS</a>"
## [21] "<a href=\"http://software.complete.org/twidge\" rel=\"nofollow\">twidge</a>"
## [22] "<a href=\"http://mvilla.it/fenix\" rel=\"nofollow\">Fenix for Android</a>"
## [23] "<a href=\"http://twitterrific.com\" rel=\"nofollow\">Twitterrific for Mac</a>"
# Tweets from these sources might be bots
botsAuto <- which(grepl(paste(auto,collapse="|"), df$source))
# Sometimes, bots use the same text, not marked as retweet.
botsDup <- which(duplicated(df$text)&df$retweet_count==0)
# bots post a lot (often). Take a look at distribution of
# tweets per user (logarithm to base 10)
hist(log10(df$statuses_count))
# Set time format to English
Sys.setlocale("LC_TIME", "C")
## [1] "C"
# Function to convert Twitter dates:
formatTwDate <- function(datestring, format = "datetime") {
  if (format == "datetime") {
    date <- as.POSIXct(datestring, format = "%a %b %d %H:%M:%S %z %Y")
  }
  if (format == "date") {
    date <- as.Date(datestring, format = "%a %b %d %H:%M:%S %z %Y")
  }
  return(date)
}
# Create column with age of accounts in days.
df$age <- sapply(df$user_created_at, function(x) Sys.Date() - as.Date(formatTwDate(x)))
# Divide number of tweets by days
df$TwPerDay <- df$statuses_count/df$age
# Take a look at the distribution
hist(log10(df$TwPerDay))
plot(df$TwPerDay)
# How many tweets per day is suspicious?
round(quantile(df$TwPerDay, probs= seq(0, 1, 0.05)))
## 0% 5% 10% 15% 20% 25% 30% 35% 40% 45% 50% 55% 60% 65% 70%
## 0 0 1 2 3 4 5 7 9 11 14 17 22 27 33
## 75% 80% 85% 90% 95% 100%
## 42 54 73 104 175 Inf
# Quite a lot of users have age zero, i.e. their accounts were created today.
# They could be normal users, bots, or activists.
df$age[which(df$TwPerDay==Inf)]
## [1] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [36] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [71] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [106] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [141] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [176] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [211] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## [246] 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
df$user_created_at[which(df$TwPerDay==Inf)]
## [1] "Sat Dec 30 10:39:03 +0000 2017" "Sat Dec 30 10:32:59 +0000 2017"
## [3] "Sat Dec 30 10:08:22 +0000 2017" "Sat Dec 30 10:39:03 +0000 2017"
## [5] "Sat Dec 30 00:29:29 +0000 2017" "Sat Dec 30 10:39:03 +0000 2017"
## [7] "Sat Dec 30 09:51:52 +0000 2017" "Sat Dec 30 10:47:33 +0000 2017"
## [9] "Sat Dec 30 10:47:33 +0000 2017" "Sat Dec 30 10:47:33 +0000 2017"
## [11] "Sat Dec 30 10:47:33 +0000 2017" "Sat Dec 30 10:47:33 +0000 2017"
## [13] "Sat Dec 30 10:58:56 +0000 2017" "Sat Dec 30 07:15:43 +0000 2017"
## [15] "Sat Dec 30 10:47:33 +0000 2017" "Sat Dec 30 07:31:03 +0000 2017"
## [17] "Sat Dec 30 10:47:33 +0000 2017" "Sat Dec 30 07:28:36 +0000 2017"
## [19] "Sat Dec 30 06:44:07 +0000 2017" "Sat Dec 30 00:43:22 +0000 2017"
## [21] "Sat Dec 30 09:45:29 +0000 2017" "Sat Dec 30 11:06:39 +0000 2017"
## [23] "Sat Dec 30 09:26:04 +0000 2017" "Sat Dec 30 07:38:38 +0000 2017"
## [25] "Sat Dec 30 01:41:50 +0000 2017" "Sat Dec 30 00:43:22 +0000 2017"
## [27] "Sat Dec 30 10:46:56 +0000 2017" "Sat Dec 30 11:12:20 +0000 2017"
## [29] "Sat Dec 30 11:06:46 +0000 2017" "Sat Dec 30 11:19:47 +0000 2017"
## [31] "Sat Dec 30 11:19:47 +0000 2017" "Sat Dec 30 11:14:15 +0000 2017"
## [33] "Sat Dec 30 11:14:15 +0000 2017" "Sat Dec 30 10:08:22 +0000 2017"
## [35] "Sat Dec 30 11:14:15 +0000 2017" "Sat Dec 30 11:14:15 +0000 2017"
## [37] "Sat Dec 30 10:02:22 +0000 2017" "Sat Dec 30 01:55:20 +0000 2017"
## [39] "Sat Dec 30 07:28:49 +0000 2017" "Sat Dec 30 11:19:47 +0000 2017"
## [41] "Sat Dec 30 11:19:47 +0000 2017" "Sat Dec 30 11:14:15 +0000 2017"
## [43] "Sat Dec 30 11:58:03 +0000 2017" "Sat Dec 30 11:19:47 +0000 2017"
## [45] "Sat Dec 30 11:35:00 +0000 2017" "Sat Dec 30 11:35:00 +0000 2017"
## [47] "Sat Dec 30 06:16:37 +0000 2017" "Sat Dec 30 12:13:08 +0000 2017"
## [49] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 10:27:19 +0000 2017"
## [51] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 12:11:14 +0000 2017"
## [53] "Sat Dec 30 12:11:14 +0000 2017" "Sat Dec 30 10:27:19 +0000 2017"
## [55] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 09:49:20 +0000 2017"
## [57] "Sat Dec 30 08:30:42 +0000 2017" "Sat Dec 30 10:27:19 +0000 2017"
## [59] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 10:27:19 +0000 2017"
## [61] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 11:40:10 +0000 2017"
## [63] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 09:53:00 +0000 2017"
## [65] "Sat Dec 30 12:17:58 +0000 2017" "Sat Dec 30 10:27:19 +0000 2017"
## [67] "Sat Dec 30 12:17:58 +0000 2017" "Sat Dec 30 12:17:58 +0000 2017"
## [69] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 12:28:41 +0000 2017"
## [71] "Sat Dec 30 12:17:58 +0000 2017" "Sat Dec 30 12:17:58 +0000 2017"
## [73] "Sat Dec 30 11:13:44 +0000 2017" "Sat Dec 30 07:31:03 +0000 2017"
## [75] "Sat Dec 30 00:43:22 +0000 2017" "Sat Dec 30 12:21:32 +0000 2017"
## [77] "Sat Dec 30 09:41:47 +0000 2017" "Sat Dec 30 12:52:58 +0000 2017"
## [79] "Sat Dec 30 13:01:07 +0000 2017" "Sat Dec 30 06:27:24 +0000 2017"
## [81] "Sat Dec 30 10:46:34 +0000 2017" "Sat Dec 30 10:46:34 +0000 2017"
## [83] "Sat Dec 30 10:15:29 +0000 2017" "Sat Dec 30 10:27:19 +0000 2017"
## [85] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 12:54:47 +0000 2017"
## [87] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [89] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 10:15:29 +0000 2017"
## [91] "Sat Dec 30 10:15:29 +0000 2017" "Sat Dec 30 12:13:08 +0000 2017"
## [93] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [95] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [97] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 13:00:55 +0000 2017"
## [99] "Sat Dec 30 13:11:10 +0000 2017" "Sat Dec 30 13:18:55 +0000 2017"
## [101] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 12:46:59 +0000 2017"
## [103] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 13:09:56 +0000 2017"
## [105] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 12:55:29 +0000 2017"
## [107] "Sat Dec 30 13:12:57 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [109] "Sat Dec 30 12:18:27 +0000 2017" "Sat Dec 30 13:13:58 +0000 2017"
## [111] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 09:41:41 +0000 2017"
## [113] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 13:03:02 +0000 2017"
## [115] "Sat Dec 30 12:43:37 +0000 2017" "Sat Dec 30 12:59:21 +0000 2017"
## [117] "Sat Dec 30 12:11:05 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [119] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [121] "Sat Dec 30 12:54:21 +0000 2017" "Sat Dec 30 12:48:24 +0000 2017"
## [123] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 12:57:47 +0000 2017"
## [125] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 12:42:24 +0000 2017"
## [127] "Sat Dec 30 13:09:54 +0000 2017" "Sat Dec 30 12:48:03 +0000 2017"
## [129] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [131] "Sat Dec 30 12:57:49 +0000 2017" "Sat Dec 30 12:53:22 +0000 2017"
## [133] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 12:57:15 +0000 2017"
## [135] "Sat Dec 30 13:17:07 +0000 2017" "Sat Dec 30 07:39:20 +0000 2017"
## [137] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 13:08:47 +0000 2017"
## [139] "Sat Dec 30 13:10:14 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [141] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 02:05:53 +0000 2017"
## [143] "Sat Dec 30 12:57:54 +0000 2017" "Sat Dec 30 13:00:49 +0000 2017"
## [145] "Sat Dec 30 13:17:07 +0000 2017" "Sat Dec 30 09:32:30 +0000 2017"
## [147] "Sat Dec 30 12:51:22 +0000 2017" "Sat Dec 30 13:10:05 +0000 2017"
## [149] "Sat Dec 30 12:57:09 +0000 2017" "Sat Dec 30 13:18:20 +0000 2017"
## [151] "Sat Dec 30 13:20:54 +0000 2017" "Sat Dec 30 13:18:20 +0000 2017"
## [153] "Sat Dec 30 13:07:08 +0000 2017" "Sat Dec 30 13:07:23 +0000 2017"
## [155] "Sat Dec 30 13:10:05 +0000 2017" "Sat Dec 30 12:41:34 +0000 2017"
## [157] "Sat Dec 30 12:59:23 +0000 2017" "Sat Dec 30 13:08:32 +0000 2017"
## [159] "Sat Dec 30 13:04:07 +0000 2017" "Sat Dec 30 13:08:47 +0000 2017"
## [161] "Sat Dec 30 13:06:12 +0000 2017" "Sat Dec 30 13:22:53 +0000 2017"
## [163] "Sat Dec 30 13:11:43 +0000 2017" "Sat Dec 30 13:03:36 +0000 2017"
## [165] "Sat Dec 30 12:52:03 +0000 2017" "Sat Dec 30 13:16:25 +0000 2017"
## [167] "Sat Dec 30 13:08:04 +0000 2017" "Sat Dec 30 12:57:07 +0000 2017"
## [169] "Sat Dec 30 10:27:19 +0000 2017" "Sat Dec 30 13:21:37 +0000 2017"
## [171] "Sat Dec 30 13:24:26 +0000 2017" "Sat Dec 30 12:58:47 +0000 2017"
## [173] "Sat Dec 30 13:04:56 +0000 2017" "Sat Dec 30 13:12:47 +0000 2017"
## [175] "Sat Dec 30 12:56:25 +0000 2017" "Sat Dec 30 13:19:36 +0000 2017"
## [177] "Sat Dec 30 13:10:27 +0000 2017" "Sat Dec 30 13:09:04 +0000 2017"
## [179] "Sat Dec 30 13:06:03 +0000 2017" "Sat Dec 30 13:02:28 +0000 2017"
## [181] "Sat Dec 30 13:19:28 +0000 2017" "Sat Dec 30 11:18:26 +0000 2017"
## [183] "Sat Dec 30 12:57:33 +0000 2017" "Sat Dec 30 13:23:50 +0000 2017"
## [185] "Sat Dec 30 12:37:21 +0000 2017" "Sat Dec 30 12:54:49 +0000 2017"
## [187] "Sat Dec 30 12:17:01 +0000 2017" "Sat Dec 30 13:00:47 +0000 2017"
## [189] "Sat Dec 30 13:03:11 +0000 2017" "Sat Dec 30 13:12:01 +0000 2017"
## [191] "Sat Dec 30 12:59:34 +0000 2017" "Sat Dec 30 09:59:02 +0000 2017"
## [193] "Sat Dec 30 13:07:40 +0000 2017" "Sat Dec 30 13:22:02 +0000 2017"
## [195] "Sat Dec 30 13:31:59 +0000 2017" "Sat Dec 30 06:24:21 +0000 2017"
## [197] "Sat Dec 30 13:28:26 +0000 2017" "Sat Dec 30 10:44:17 +0000 2017"
## [199] "Sat Dec 30 02:05:53 +0000 2017" "Sat Dec 30 13:01:07 +0000 2017"
## [201] "Sat Dec 30 13:01:07 +0000 2017" "Sat Dec 30 13:02:32 +0000 2017"
## [203] "Sat Dec 30 13:01:07 +0000 2017" "Sat Dec 30 12:32:40 +0000 2017"
## [205] "Sat Dec 30 13:01:07 +0000 2017" "Sat Dec 30 13:30:08 +0000 2017"
## [207] "Sat Dec 30 09:26:04 +0000 2017" "Sat Dec 30 13:30:08 +0000 2017"
## [209] "Sat Dec 30 09:26:04 +0000 2017" "Sat Dec 30 12:30:51 +0000 2017"
## [211] "Sat Dec 30 11:22:56 +0000 2017" "Sat Dec 30 10:46:34 +0000 2017"
## [213] "Sat Dec 30 13:45:40 +0000 2017" "Sat Dec 30 10:44:17 +0000 2017"
## [215] "Sat Dec 30 04:16:04 +0000 2017" "Sat Dec 30 13:46:43 +0000 2017"
## [217] "Sat Dec 30 12:11:14 +0000 2017" "Sat Dec 30 04:16:04 +0000 2017"
## [219] "Sat Dec 30 04:16:04 +0000 2017" "Sat Dec 30 13:18:22 +0000 2017"
## [221] "Sat Dec 30 04:16:04 +0000 2017" "Sat Dec 30 13:46:43 +0000 2017"
## [223] "Sat Dec 30 09:43:22 +0000 2017" "Sat Dec 30 14:03:52 +0000 2017"
## [225] "Sat Dec 30 13:46:43 +0000 2017" "Sat Dec 30 12:41:21 +0000 2017"
## [227] "Sat Dec 30 12:41:21 +0000 2017" "Sat Dec 30 13:46:43 +0000 2017"
## [229] "Sat Dec 30 06:17:39 +0000 2017" "Sat Dec 30 06:17:39 +0000 2017"
## [231] "Sat Dec 30 12:11:14 +0000 2017" "Sat Dec 30 06:17:39 +0000 2017"
## [233] "Sat Dec 30 06:17:39 +0000 2017" "Sat Dec 30 12:11:14 +0000 2017"
## [235] "Sat Dec 30 12:11:14 +0000 2017" "Sat Dec 30 04:16:04 +0000 2017"
## [237] "Sat Dec 30 14:23:15 +0000 2017" "Sat Dec 30 14:23:15 +0000 2017"
## [239] "Sat Dec 30 11:15:30 +0000 2017" "Sat Dec 30 11:15:30 +0000 2017"
## [241] "Sat Dec 30 04:16:04 +0000 2017" "Sat Dec 30 06:44:07 +0000 2017"
## [243] "Sat Dec 30 05:07:24 +0000 2017" "Sat Dec 30 14:02:05 +0000 2017"
## [245] "Sat Dec 30 04:16:04 +0000 2017" "Sat Dec 30 01:17:11 +0000 2017"
## [247] "Sat Dec 30 14:31:39 +0000 2017" "Sat Dec 30 06:44:07 +0000 2017"
## [249] "Sat Dec 30 13:46:43 +0000 2017" "Sat Dec 30 13:01:07 +0000 2017"
## [251] "Sat Dec 30 13:46:43 +0000 2017" "Sat Dec 30 10:27:19 +0000 2017"
## [253] "Sat Dec 30 13:01:07 +0000 2017" "Sat Dec 30 03:08:46 +0000 2017"
## [255] "Sat Dec 30 13:01:07 +0000 2017" "Sat Dec 30 13:01:07 +0000 2017"
## [257] "Sat Dec 30 13:46:43 +0000 2017" "Sat Dec 30 08:41:32 +0000 2017"
## [259] "Sat Dec 30 13:22:16 +0000 2017" "Sat Dec 30 06:17:39 +0000 2017"
df$statuses_count[which(df$TwPerDay==Inf)]
## [1] 1 4 18 2 5 3 1 2 2 3 4 5 1 7 7 31 8
## [18] 3 19 21 1 1 1 270 17 24 2 2 2 1 3 3 5 45
## [35] 9 10 5 1 1 1 2 11 2 3 4 5 1 1 9 10 11
## [52] 2 3 15 16 13 17 20 21 22 24 1 26 128 26 28 27 52
## [69] 29 2 71 72 5 35 36 1 2 2 4 5 3 4 46 39 41
## [86] 1 101 102 104 54 55 4 107 108 109 110 110 2 20 1 112 25
## [103] 113 20 114 4 24 115 4 5 116 2 117 19 14 14 1 119 120
## [120] 121 8 11 122 27 123 5 13 5 124 125 21 14 126 26 20 55
## [137] 127 6 18 128 129 131 29 22 32 17 32 40 15 1 8 9 4
## [154] 5 16 1 4 12 7 31 24 8 3 12 1 28 34 4 51 11
## [171] 3 6 40 23 11 3 21 24 16 20 41 20 9 27 1 7 16
## [188] 28 7 15 31 4 3 31 15 11 25 16 141 6 7 4 8 5
## [205] 10 1 2 4 3 4 3 37 2 22 2 8 29 3 4 2 5
## [222] 9 2 2 13 248 249 15 2 4 38 8 9 39 40 11 1 2
## [239] 10 11 15 22 2 3 20 27 3 23 22 37 23 83 39 7 41
## [256] 43 26 1 3 13
# Take the upper five percent as Bots (without age 0)
botsTWpD <- which(df$TwPerDay>176 & df$age!=0)
# Not all automated accounts are very active. They could be users who
# simply use a different app than Twitter's own.
# We could think about throwing them out again.
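# botsAuto holds row indices of df, so keep those indices themselves
# (not their positions within botsAuto) that are not in botsTWpD.
botsFriendlyAuto <- botsAuto[!(botsAuto %in% botsTWpD)]
df$statuses_count[botsFriendlyAuto]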
## [1] 7669 49052 14646 59 50549 9048 691 35837
## [9] 38 17861 106566 555416 1751 27205 95388 180925
## [17] 153004 218207 3997 4515 12102 25416 65166 4386
## [25] 140 172159 2661 2219 35838 19583 1896 16381
## [33] 32400 3721 27985 143842 577 1752 292 2512
## [41] 64746 494192 9131 1585 193675 5006 12826 2805
## [49] 1133 79331 4908 6461 48611 60147 8715 4972
## [57] 193676 11758 40231 11281 78508 95389 1 48026
## [65] 119617 67821 3666 3544 24449 70581 33783 238284
## [73] 45344 3339 6115 18221 81241 193677 8664 1479
## [81] 99152 1717 681 53 12691 172047 9320 16422
## [89] 8475 416 2847 2495 1987 3476 85738 3998
## [97] 176056 32522 65307 193678 45684 45028 19435 1531
## [105] 5271 160320 53591 1679 8676 21298 15620 1532
## [113] 1194 6159 173 3828 790 98 5208 36548
## [121] 28672 107744 16423 15622 38112 48004 1746 132781
## [129] 147226 116902 2546 10369 29854 5055 79127 568
## [137] 2092 1513 19436 3063 3594 35841 24077 174
## [145] 27312 2496 6319 94 30474 80317 8098 85
## [153] 19 1401 6996 35842 175714 13452 71235 12827
## [161] 70582 30339 378 5890 4458 1406956 46065 25662
## [169] 15347 177647 2201 12421 122 71315 2220 22806
## [177] 172160 40500 949 13079 698796 13346 38381 5820
## [185] 80098 38800 280469 6997 4910 26557 4055 414983
## [193] 2814 8665 2445 149168 7612 1681 1988 1000
## [201] 359 20474 16334 47216 148785 1552 54 5022
## [209] 120580 1521 12938 3386 71317 74628 13347 119899
## [217] 6462 4641 27194 40227 42697 7763 13236 13380
## [225] 587358 6998 390526 28046 7368 38521 13009 33179
## [233] 119695 255038 12852 3896 8289 75547 67484 13010
## [241] 1329 16910 189729 27195 3387 448 2315 3893
## [249] 1516 22179 3246 214615 35845 261410 2541 46911
## [257] 6999 4517 35731 5479 1730 2806 65452 2950
## [265] 29069 9531 45312 36384 337498 4471 5366 54891
## [273] 5558 74666 14088 143 35846 6702 214616 71318
## [281] 59384 7845 5174 164 7555 17298 52034 1876
## [289] 7995 11016 6955 25934 7274 42538 28 7707
## [297] 7363 40070 179 56432 10659 88949 10795 7817
## [305] 48309 31086 52338 15790 225710 267877 12829 554
## [313] 19394 773 67670 222384 1610 727999 51264 1718
## [321] 24183 22848 70584 304 3365 4566 27 17895
## [329] 32526 131 80 44 79522 22538 340 13369
## [337] 115729 17647 70290 2125 623 60949 7000 185966
## [345] 107745 27381 13879 220 26669 71276 341 8592
## [353] 2131 2517 1448 10901 103951 6165 5821 153008
## [361] 38499 11774 2160 1952 10252 29819 6988 95393
## [369] 4843 4449 30711 222385 630491 13237 20471 5559
## [377] 60041 13425 29855 1019 304496 35768 11527 5379
## [385] 45780 184238 19282 37906 103360 15303 2512 39002
## [393] 28279 15391 38113 2067 25201 27807 595 16993
## [401] 11390 2152 2510 18055 18690 35413 7001 175033
## [409] 120 72040 34316 1855 97363 412697 33704 27065
## [417] 7238 3194 61091 22330 49876 5414 10276 24535
## [425] 20009 6606 719 26247 43820 71319 200 72041
## [433] 11455 212161 238287 12830 3221 6415 6249 28621
## [441] 6045 42987 352960 848 3411 1897 10660 7002
## [449] 207927 43906 146 18056 22332 4519 3241 7793
## [457] 2507 9023 11663 125 1292 38500 92 550
## [465] 12492 8813 259 19992 6062 23083 103 47217
## [473] 1012 87355 27852 12831 59680 3415 7003 10639
## [481] 6390 18634 18838 68497 41478 760 12969 7819
## [489] 49054 1824 6141 70585 22405 15878 16000 1392
## [497] 1780 11636 5823 15268 21080 144 24184 22628
## [505] 1802 18740 48613 10275 17132 286861 31931 28609
## [513] 7531 61530 12832 7491 213 11020 6393 1314
## [521] 138586 1586 16335 1419 41423 2493 5824 64747
## [529] 321265 26010 78354 6064 10902 23083 221566 7208
## [537] 12 48378 12593 11919 661 2590 3657 15394
## [545] 23029 6644 22033 12840 14520 2702 32009 22841
## [553] 22727 106567 29820 37907 9303 102543 6394 29875
## [561] 21707 1915 21707 6813 69737 347 727 13973
## [569] 531 6065 5406 16774 78791 20240 5561 1339
## [577] 52866 5085 29876 7660 49877 428607 28652 49055
## [585] 40624 6458 3654 5375 30409 1912 353 321266
## [593] 5936 22544 15185 6042 7809 127 12745 106568
## [601] 51768 49068 7641 79935 524 532 358 17053
## [609] 238290 39197 31743 16107 5334 4504 39766 260324
## [617] 2812 8225 113 103624 2954 10170 14521 16425
## [625] 74246 42853 26558 47635 7629 50207 21894 4946
## [633] 59488 60081 260325 73069 96689 11420 423096 1864
## [641] 61 35417 70586 146220 4026 63387 11454 359
## [649] 28665 3787 14961 19994 128 13346 8716 2061
## [657] 24115 24043 10172 62 38353 534 1898 10093
## [665] 37 10173 179944 76706 33756 37909 284831 2128
## [673] 54897 38501 13825 11366 7695 156 22038 9679
## [681] 18613 47636 79936 13611 4765 23313 40 62
## [689] 9283 9366 16860 17054 41160 2469 64 10968
## [697] 6066 3792 3545 6066 33615 423097 12443 1056
## [705] 27747 1590 28677 10952 857609 1495 29701 68310
## [713] 45645 3486 4674 28050 22807 56911 42610 7710
## [721] 126192 536 1757 2118 6067 48 29 135281
## [729] 22039 1509 1796 8104 37910 6535 27199 80131
## [737] 1866 15547 2236 35860 6788 6068 32 300
## [745] 28051 3008 552 5825 2535 559 40378 718
## [753] 3109 538 1641 50209 4697 25416 97131 64796
## [761] 32402 871 27117 5584 4577 3705 23603 21353
## [769] 8717 183 17055 26618 33617 480 539 31195
## [777] 17085 9002 43470 36388 35861 27 23314 60426
## [785] 45564 2374 9684 1276 19999 16300 22041 71238
## [793] 19369 1571 97749 939 146224 27748 1762 360734
## [801] 13260 3364 33618 205530 10035 7312 129 60427
## [809] 10103 17409 9846 1685 44267 24450 35778 25
## [817] 8718 359 13401 102 39767 4765 208 8377
## [825] 4119 19651 228 102374 60428 28808 257 9513
## [833] 251 3312 22293 52413 11230 41651 6047 18491
## [841] 540 301536 9285 12595 238296 287004 35779 9583
## [849] 1003759 22042 7315 2766 60429 71322 71323 71323
## [857] 13961 45524 7998 889 45648 46218 8029 994
## [865] 1621 26 942 17056 97578 47817 60430 1217531
## [873] 268754 12158 2965 33622 1827 1111 1038 23281
## [881] 189619 712 7004 1559 7741 96052 2113 29819
## [889] 27201 260 31351 22043 153009 8108 60431 22986
## [897] 97750 40859 26519 28711 2777 16112 32868 20242
## [905] 85 9752 17483 16464 8719 13958 157199 25310
## [913] 12969 42855 60432 4948 113277 79937 268 153010
## [921] 40503 440 299 5844 58815 50463 22044 2396
## [929] 16336 358 298196 209157 11227 55023 45879 218
## [937] 6317 10998 6530 22860 719 7308 163898 100986
## [945] 12139 10694 24992 4010 33394 116458 1584 6057
## [953] 10277 1310 260326 28376 300737 60433 22143 4452
## [961] 346 24343 32869 20 2001 40964 6071 170426
## [969] 3140 21710 45648 12976 25312 1377 62828 130532
## [977] 8 124 702488 132417 51692 24074 45649 58205
## [985] 15754 450 990 4308 40966 2088 1304 1078
## [993] 15223 32447 1315522 134525 40096 8239 29447 3816
## [1001] 21608 96054 2593 19175 110902 68327 11832 1450
## [1009] 3215 21183 8720 33931 29448 45526 170427 10903
## [1017] 1096 22488 105391 7364 43901 843336 57088 28717
## [1025] 15224 17716 18913 11630 81542 205534 1120 66257
## [1033] 18694 761 1000 146228 268761 50135 152 8829
## [1041] 3116 20652 28624 21992 48235 176332 4614 5412
## [1049] 231 28806 1078 66323 6073 1025 16304 11115
## [1057] 5524 179432 587915 124225 176517 14683 17940 40860
## [1065] 4233 97689 11470 413 1487 6635 18149 6074
## [1073] 720 15225 9149 23431 12557 81 2127 1592
## [1081] 39769 1311 15710 24 1078 17002 587917 1618
## [1089] 20528 100987 19433 6070 409 661 907 16465
## [1097] 221580 4949 34417 96553 26590 8721 20221 1947
## [1105] 945 2488 15226 7430 18062 1078 16426 12
## [1113] 7616 40971 7366 68329 20269 3220 301539 7567
## [1121] 11997 2526 115731 10913 80909 36250 1843 4006
## [1129] 52867 12596 2637 29527 19 66404 2250 5466
## [1137] 18915 3262 1098 7278 20653 1059530 4950 17965
## [1145] 4895 48236 176518 141518 25490 100380 3564 19469
## [1153] 17517 12250 82 3396 1040 1078 15408 4614
## [1161] 14789 290 35866 39495 6075 2200 21239 68619
## [1169] 18855 3263 242083 12854 33861 8968 15227 1078
## [1177] 2527 16427 522 9078 21660 45882 14163 977
## [1185] 28623 18723 441 47264 8528 94183 90511 26591
## [1193] 2251 3190 12928 8600 48 83 81543 1078
## [1201] 186352 724 91042 6021 9830 215 18917 71048
## [1209] 30619 1896 9251 176519 261432 1431 10857 16305
## [1217] 19 56181 1648 3671 15766 3264 48310 3999
## [1225] 33862 261433 15227 5293 48470 698 113349 10887
## [1233] 666976 10035 3721 22052 36782 37911 1078 26592
## [1241] 26747 18918 2585 2870 24118 7115 110784 5129
## [1249] 626962 509 3 16442 1170 4808 159326 84
## [1257] 11486 7292 16428 4289 1369893 15757 104349 73303
## [1265] 189623 2324 5561 482 48311 15128 57407 3577
## [1273] 160 2627 34527 7431 6399 161748 15228 115732
## [1281] 7601 629247 73304 2172 132482 8086 843 8160
## [1289] 1534 8382 34028 10984 22145 10858 4290 3721
## [1297] 941 101805 60706 7282 20804 7493 261436 375
## [1305] 81070 21538 11022 12836 5515 5519 96 13589
## [1313] 20010 8336 260327 4291 24593 24162 4108 4248
## [1321] 4669 1499 125 4826 4762 22054 27314 1137
## [1329] 48312 5967 11679 420 46224 107747 48313 40975
## [1337] 35558 17637 4776 30 257722 14164 53678 12241
## [1345] 39873 7283 14644 1769 15637 7822 1687 6079
## [1353] 7826 37373 48315 4586 14115 42572 8222 63
## [1361] 14631 229 7619 1517 4293 9288 526 96399
## [1369] 16431 3597 942 20394 9504 21795 3129 4000
## [1377] 35585 4328 2564 1152563 206 2011 1780 5697
## [1385] 20183 91043 67491 673 23623 12242 48315 14530
## [1393] 117 9352 6217 4294 44270 357 16576 5435
## [1401] 71242 189 2788 4294 5034 16309 721 7309
## [1409] 594 62230 4953 42858 1602684 3510 652 176520
## [1417] 15362 19698 138587 38505 49 360886 1312 90135
## [1425] 87 28668 32465 39773 1318 30854 14214 76888
## [1433] 4079 59760 43248 229 379 10154 466 10150
## [1441] 11270 85 20244 45817 35936 2938 8244 2018
## [1449] 202 1659 9316 9335 71296 1577 15154 99648
## [1457] 83415 1332 1456 27164 120000 10186 1105 46494
## [1465] 76042 104823 10076 107748 7589 4495 1554 55384
## [1473] 29292 260329 141520 8320 74907 4784 2457 2256
## [1481] 11962 80132 12837 23867 103362 5966 260330 320369
## [1489] 104824 320370 68904 225575 261447 25684 104039 5414
## [1497] 16429 695906 5564 154488 9461 4514 35999 14116
## [1505] 38506 29048 8371 8321 7210 2252 15424 3798
## [1513] 8836 146229 728 2 43934 35868 3564 78803
## [1521] 17485 414985 25222 32870 21184 839 33374 22147
## [1529] 10359 1365 13733 180 4074 21046 35938 52408
## [1537] 6270 533054 5045 87772 16432 52589 15647 1738
## [1545] 82218 6944 14055 12177 11963 124623 505439 65194
## [1553] 59372 11023 17967 24173 8447 6117 28811 235138
## [1561] 1381 235139 18238 1602686 13422 235141 221590 34438
## [1569] 9782 235142 1717 5783 12622 161 176 18621
## [1577] 15235 74784 8294 133 8961 1990 40978 21469
## [1585] 117635 78124 88175 1805 200058 235403 50464 77848
## [1593] 1772 47087 15157 14904 1917 30029 11198 31140
## [1601] 2169 213366 3738 3093 90512 1876 1688 14571
## [1609] 9064 38507 52410 96978 3299 3345 209204 273
## [1617] 69898 454 28051 15236 5481 200772 6957 373
## [1625] 316698 1792 5063 223454 36382 43067 978 43292
## [1633] 17969 206 10334 6168 23848 11729 15363 104013
## [1641] 30811 7652 23756 5670 16577 10196 25223 36197
## [1649] 29559 15759 3185 15237 33597 280474 1084 4715
## [1657] 7108 74785 48240 22712 5099 29248 3952 2384
## [1665] 28344 43656 119626 8221 23133 57936 56557 141231
## [1673] 1574 1236 11024 3096 40979 17107 82297 75
## [1681] 23740 117637 1950 2932 191 7900 742 67427
## [1689] 28405 455329 8295 63 28385 3291 133115 103560
## [1697] 13875 2460 11663 2485 3313 104014 1652 28598
## [1705] 21851 100821 42704 31353 73414 1927 7226 6084
## [1713] 5695 9378 116458 973 389 5317 8326 17971
## [1721] 2401 11801 801 45231 74786 35869 24121 52868
## [1729] 1727 9927 16004 8296 77102 895 3770 370
## [1737] 51460 55402 1833 35870 15239 28599 24321 153807
## [1745] 741 5 23736 10335 8327 52086 167772 12837
## [1753] 1280449 6581 14368 113353 11796 13231 529946 58159
## [1761] 73679 3206 13183 475613 455 339763 153808 1515
## [1769] 198792 147367 1065 1063 4 15426 8694 6806
## [1777] 22220 16684 1883 193 25418 2637 5509 929
## [1785] 103363 543 5969 17412 2152 15240 5377 13070
## [1793] 17518 11665 2461 4009 26662 7747 946 20205
## [1801] 40039 6085 10741 1904 24165 12287 67428 1125
## [1809] 4853 49124 30 60442 1240 3 36789 18120
## [1817] 4228 31100 12112 198793 3823 11368 53056 21873
## [1825] 28270 176620 1393 120778 22795 21645 18622 15419
## [1833] 8449 738 55403 2261 67429 6118 6447 14589
## [1841] 42526 158762 18263 33 73680 194 2729 5797
## [1849] 50477 37290 113280 974 23642 2083 7516 15987
## [1857] 208098 35142 15162 104040 2285 31004 3512 5181
## [1865] 4017 4002 38585 15365 13069 44282 119901 5897
## [1873] 422 1087 99074 3622 17486 6973 229656 67430
## [1881] 104493 85235 16431 221597 14929 74787 5104 6432
## [1889] 21688 286278 12761 145536 260331 7012 108 1198
## [1897] 71855 80133 4772 802 18151 11110 1103 12771
## [1905] 189959 37391 3023 10337 5288 11341 320617 49768
## [1913] 31853 41685 19285 1163 14693 3870 226 1104
## [1921] 1280450 35782 675 11640 83592 4203 16432 26831
## [1929] 31844 18744 12 3875 443124 46284 104829 63428
## [1937] 42514 4516 32872 3458 16433 91 17066 58858
## [1945] 19759 4729 19164 57 35289 1144 34842 95678
## [1953] 7072 6088 1923 104830 172164 605 26243 406
## [1961] 2591 10854 26833 253303 1929 202 7293 26942
## [1969] 103365 963733 51347 622 38336 99288 104831 22809
## [1977] 1893 42304 50195 612 20833 22059 1499 5185
## [1985] 136879 72249 2128 3349 10093 2109 17119 11
## [1993] 40 89102 10988 95396 1440 6010 23511 12741
## [2001] 6791 42707 4261 10152 173509 176 10989 7374
## [2009] 8560 2956 105683 3884 2364 1358 20245 39158
## [2017] 27991 19925 2828 81143 900 3067 2465 5107
## [2025] 2211 41398 271 20893 42075 206713 20246 125355
## [2033] 8439 28600 31745 125353 13501 45068 5236 6090
## [2041] 19165 6793 42612 70 54525 1715 186 7575
## [2049] 119696 2450 225999 22611 31 1718 1756 24123
## [2057] 174579 9927 16435 189 164 18242 120779 73719
## [2065] 2105040 5107 260333 1441 96566 41427 4494 76909
## [2073] 10946 18881 61855 710 30895 10429 24124 90998
## [2081] 27699 15366 27697 63000 59928 8293 261732 8432
## [2089] 12702 33270 20039 195 8503 172166 189961 2466
## [2097] 57410 130528 156586 114372 2 15883 589619 9352
## [2105] 26841 1901 8103 453 4672 1009 94 107751
## [2113] 22187 14289 6511 31938 5 18887 3148 67418
## [2121] 20394 3989 6673 38021 13538 475 92777 9210
## [2129] 3799 15884 6599 46835 1593 11907 32830 30475
## [2137] 440 15327 33910 35787 32442 2977 1376 24126
## [2145] 4613 32874 14192 60 3446 3955 21356 1687
## [2153] 1537 81557 66511 173891 21837 896 3946 17060
## [2161] 26992 14890 607 14338 93188 68630 8275 24545
## [2169] 10553 3680 10108 32583 15328 6094 18189 2112
## [2177] 27315 10160 110793 198804 13582 76446 3448 32193
## [2185] 63430 3277 1857 267 14891 7915 248 5322
## [2193] 3301 35333 15727 4698 27414 154491 43664 64875
## [2201] 9733 65514 1009 59389 134 30659 25310 243
## [2209] 2046 62910 2113 8683 11429 5698 67436 2967
## [2217] 14569 18664 1010 59743 538 1123 24128 6096
## [2225] 35433 127851 9252 15940 108253 42820 32156 1894
## [2233] 15 1011 268578 233 556 7217 21037 10162
## [2241] 8987 1169 30783 43296 3074 44283 15184 67383
## [2249] 63431 226002 25312 573 5826 36257 9702 24129
## [2257] 29451 36244 67495 783 130505 260335 778 18186
## [2265] 31701 3317 37397 10487 317 9014 1170 6564
## [2273] 172168 80318 44032 2608 15012 93189 32443 7821
## [2281] 21548 40509 35877 144102 40510 379275 40511 1
## [2289] 107752 6444 1014 10163 1106 2217 12812 15654
## [2297] 1200 81559 35878 8581 179853 2052 41414 15276
## [2305] 1458 237573 4674 26755 90513 29360 6774 2476
## [2313] 75761 27132 175538 29046 48571 15663 3823 11519
## [2321] 13239 27811 154891 3758 3221 1708 588 2955
## [2329] 52221 4919 32110 43154 20477 44665 52 6033
## [2337] 8597 5916 103952 63459 10488 10978 4602 2705
## [2345] 1193390 1029 100 2462 414985 11347 226003 44947
## [2353] 6647 33310 70115 41416 169933 90 10145 89114
## [2361] 20012 4474 5190 33684 196 5063 154892 5369
## [2369] 173895 144539 33863 4496 170443 45070 114088 4082
## [2377] 491536 95629 105 3074 33144 38835 6582 21048
## [2385] 11287 35879 17882 58501 33221 10801 46191 10390
## [2393] 59978 15702 16137 33864 39770 27841 62722 6411
## [2401] 562 218100 9019 58502 217 1203 22030 3235811
## [2409] 9076 12534 38511 226004 1214 6520 5721 1820
## [2417] 21185 39338 1576 13733 5827 30 207941 12922
## [2425] 105 53941 18974 236074 9436 81560 5173 72
## [2433] 27 902 37910 10338 5193 17947 106801 5111
## [2441] 27685 9725 8025 5797 1004 209 5828 177498
## [2449] 124235 1569 32543 19437 27337 16726 42160 16412
## [2457] 42678 21402 88725 28655 6493 2726 224 5829
## [2465] 36201 173896 229659 10543 57 502 52533 2117
## [2473] 35882 10829 33822 62 41577 32876 1386 29898
## [2481] 10339 60448 154896 2735 1966 868 80 19349
## [2489] 1373 43667 26793 72934 5416 179854 35883 60
## [2497] 23444 16436 72274 103368 106802 60449 15369 1163276
## [2505] 17796 42515 7672 5830 98420 1684 20154 12259
## [2513] 14893 162244 11168 2989 63329 5370 16437 135
## [2521] 19609 226005 815 2128 2713 40862 62233 7325
## [2529] 9399 10135 1460 20655 18090 21049 60451 101855
## [2537] 3772 6208 5831 257 25444 3684 181951 99712
## [2545] 6599 5039 8522 89116 3759 119698 5065 175539
## [2553] 442 53836 72 181952 163993 236182 20013 150822
## [2561] 771 11262 573 9295 103369 24340 302970 14798
## [2569] 526 4354 19665 169934 2133 28237 19641 10
## [2577] 847 21839 17884 5254 78395 2136 15648 15193
## [2585] 122 459 260339 11834 107325 4677 33145 216407
## [2593] 8450 8451 176836 15248 4299 65002 11174 81330
## [2601] 12525 1387 16180 32457 46228 6036 3463 76886
## [2609] 10720 6024 41629 416 39775 101 143175 80772
## [2617] 1813 203090 18535 8791 1086 26837 1329 21352
## [2625] 16379 14576 37982 49615 8087 252856 1633 35561
## [2633] 52611 209174 23759 2872 5219 21 24151 19423
## [2641] 26599 169 72378 1113 268234 61857 2467 3869
## [2649] 56614 15108 35562 41430 30857 9252 1077 54
## [2657] 27753 15249 180287 10928 35888 14006 1115 20016
## [2665] 10632 17030 9859 4613 150825 3916 35790 3853
## [2673] 38087 3464 6204 813 8613 7792 39602 9507
## [2681] 68531 7961 44271 57110 2082 160 120003 2717
## [2689] 52060 29265 15615 35791 5673 17325 7836 7949
## [2697] 173897 3840 4959 26413 103 779 2668 6723
## [2705] 19146 21209 9153 26838 1116 275 6637 4678
## [2713] 747 111371 15888 412 1138 3301 102225 38167
## [2721] 617867 14738 117 28282 6778 120003 29247 67256
## [2729] 20234 8474 6584 2129 1388 6076 8012 52535
## [2737] 33 295898 9917 115787 135729 55188 1033 12712
## [2745] 4035 2429 25044 7220 21909 8877 7092 16611
## [2753] 10662 265496 8205 5113 28234 12629 1592 5146
## [2761] 27723 16296 21164 11073 785 10423 2468 5970
## [2769] 664 1003770 203092 5316 53372 2307 1375 50350
## [2777] 156758 1729 5074 32413
round(df$TwPerDay[botsFriendlyAuto])
## [1] 4 15 5 1 57 4 2 24 19 35 49 162 9
## [14] 59 45 107 51 69 14 2 12 22 44 32 0 91
## [27] 1 92 24 92 1 15 37 5 34 67 4 9 1
## [40] 2 47 169 6 5 68 3 8 2 11 31 2 9
## [53] 49 24 18 24 68 8 29 7 33 45 Inf 23 41
## [66] 76 3 3 14 59 13 104 56 1 4 33 50 68
## [79] 28 5 66 3 0 18 31 72 12 10 60 2 21
## [92] 3 5 5 48 14 85 46 54 68 14 16 15 1
## [105] 2 68 26 4 4 10 5 1 0 7 0 3 1
## [118] 1 3 17 55 80 10 10 32 37 1 51 51 74
## [131] 4 6 13 4 25 0 4 4 15 4 1 24 8
## [144] 0 23 3 3 94 22 31 3 0 0 1 38 24
## [157] 75 12 38 8 59 48 24 9 3 1135 15 13 7
## [170] 85 5 18 20 29 92 12 91 15 4 6 1952 138
## [183] 21 4 26 17 138 38 3 13 3 137 7 28 1
## [196] 299 4 4 5 1 12 10 7 88 92 3 18 2
## [209] 85 1 6 1 29 26 138 100 9 29 13 37 70
## [222] 3 103 7 169 38 232 19 3 11 31 13 107 202
## [235] 12 4 9 24 21 31 20 6 158 13 1 2 1
## [248] 1 4 18 2 53 24 429 1 53 38 16 14 4
## [261] 6 2 25 2 9 3 32 15 107 13 12 30 21
## [274] 31 33 1 24 3 53 29 20 3 3 20 5 14
## [287] 20 5 9 6 3 12 20 19 0 3 4 27 0
## [300] 43 5 31 3 7 21 106 27 11 88 91 8 0
## [313] 8 7 31 69 1 231 28 3 14 45 59 10 14
## [326] 2 0 46 46 0 2 0 53 99 0 12 67 15
## [339] 73 3 0 18 38 50 80 30 163 1 56 35 0
## [352] 4 3 1 1 3 64 44 4 51 29 10 1 22
## [365] 8 46 2 45 3 4 34 69 194 103 8 21 64
## [378] 65 13 0 106 14 46 52 23 58 9 65 70 13
## [391] 10 19 20 5 32 5 8 14 2 7 8 1 9
## [404] 55 10 17 38 54 0 34 14 1 59 191 10 10
## [417] 18 10 19 11 24 2 7 12 143 3 9 23 39
## [430] 29 12 34 5 90 104 8 5 4 46 35 3 15
## [443] 377 0 2 1 3 38 90 30 0 55 11 6 28
## [456] 3 7 21 5 3 2 29 0 0 8 6 0 42
## [469] 27 13 0 88 7 305 18 8 29 5 38 10 3
## [482] 6 6 19 15 1 9 5 15 14 5 59 40 35
## [495] 6 1 2 8 4 8 8 5 14 17 1 13 49
## [508] 23 12 126 44 16 22 29 8 3 0 6 15 1
## [521] 53 5 7 18 13 6 4 47 149 30 40 27 3
## [534] 13 232 5 0 17 8 33 5 5 2 5 40 5
## [547] 18 9 32 75 17 8 28 49 46 65 11 49 15
## [560] 36 11 6 11 3 22 13 0 6 2 27 2 7
## [573] 34 7 21 7 73 4 36 8 24 1016 29 15 14
## [586] 4 4 7 21 1 1 149 2 10 9 32 174 0
## [599] 6 49 36 24 8 44 0 2 40 17 104 35 24
## [612] 11 2 2 111 85 256 4 0 33 2 14 32 10
## [625] 97 17 13 16 4 15 25 26 29 119 85 85 49
## [638] 19 148 1 3 17 59 86 2 35 11 40 10 5
## [651] 14 44 0 12 18 3 778 14 14 3 37 2 1
## [664] 55 4 14 81 30 18 65 104 2 30 29 7 54
## [677] 59 0 18 7 10 16 44 22 3 20 0 0 5
## [690] 14 7 17 143 9 3 6 2 5 3 27 192 148
## [703] 4 2 11 9 55 4 459 8 31 24 50 27 4
## [716] 19 12 167 77 3 52 2 1 1 27 48 0 79
## [729] 18 7 1 7 65 2 13 79 1 6 1 24 5
## [742] 27 0 38 19 7 1 4 8 1 18 14 2 2
## [755] 1 15 2 43 56 25 37 1 64 3 13 2 13
## [768] 7 18 0 17 9 192 0 2 15 9 73 14 12
## [781] 24 0 20 72 30 1 22 2 44 5 18 38 17
## [794] 29 26 235 86 11 1 192 4 2 17 194 18 4
## [807] 0 72 9 33 4 4 44 14 14 0 18 12 6
## [820] 0 111 85 2 7 25 6 0 112 72 12 1 4
## [833] 1 2 10 45 6 18 33 9 2 200 5 8 104
## [846] 228 14 4 1049 18 21 1 72 29 29 29 14 27
## [859] 9 2 50 83 4 2 1 0 1 17 88 14 72
## [872] 386 157 6 21 17 14 8 1 11 109 1 3 1
## [885] 5 44 6 15 13 0 16 18 51 24 72 10 26
## [898] 46 8 8 8 21 14 7 2 15 9 96 18 13
## [911] 115 23 13 17 72 26 53 44 0 51 15 13 19
## [924] 9 23 71 18 1 7 0 126 193 5 24 65 1
## [937] 4 18 5 37 14 3 85 286 6 5 16 2 10
## [950] 40 3 9 5 7 85 72 156 72 28 4 0 10
## [963] 14 0 4 207 27 59 2 42 50 17 23 34 28
## [976] 60 0 21 313 70 17 22 50 33 7 2 1 3
## [989] 207 1 4 135 16 12 593 63 12 7 12 2 7
## [1002] 44 5 11 49 25 11 4 2 18 18 15 12 27
## [1015] 59 3 5 11 48 4 31 300 121 45 16 31 24
## [1028] 5 82 194 4 22 9 0 1 86 157 20 7 4
## [1041] 3 7 35 20 101 123 5 69 1 9 135 38 27
## [1054] 0 5 13 2 99 223 85 67 10 40 46 2 32
## [1067] 9 0 27 2 10 27 14 16 3 9 10 3 3
## [1080] 9 111 0 7 0 135 104 223 5 22 286 442 25
## [1093] 0 0 0 96 233 26 91 201 23 18 6 3 2
## [1106] 6 16 15 29 135 10 0 14 207 4 25 35 3
## [1119] 200 4 9 316 67 8 92 82 1 2 73 8 3
## [1132] 78 3 21 1 8 24 3 5 17 7 297 26 5
## [1145] 4 101 67 129 12 81 1 10 12 5 3 12 1
## [1158] 135 5 2 6 15 24 24 27 2 21 501 6 3
## [1171] 229 4 36 5 16 135 316 10 0 3 7 65 11
## [1184] 15 9 7 0 15 3 63 69 23 1 7 23 53
## [1197] 0 3 82 135 61 9 28 17 13 10 24 23 12
## [1210] 14 3 67 279 1 53 5 0 19 1 5 10 3
## [1223] 21 14 36 279 16 35 35 233 177 8 786 7 2
## [1236] 18 15 65 135 23 12 24 81 90 778 7 35 14
## [1249] 334 10 3 5 10 12 52 3 18 2 10 7 484
## [1262] 18 42 59 109 1 3 0 21 29 41 1 0 3
## [1275] 15 15 4 284 16 67 6 220 59 6 47 4 0
## [1288] 3 3 7 43 16 28 53 7 1 235 54 19 17
## [1301] 7 3 279 0 45 6 6 8 3 3 0 6 44
## [1314] 22 85 7 10 12 1 3 2 8 3 58 9 18
## [1327] 23 1 21 6 66 1 192 80 21 207 12 15 4
## [1340] 0 89 11 22 37 42 17 6 1 10 5 4 27
## [1353] 3 25 21 2 5 25 8 0 33 0 3 4 7
## [1366] 5 1 33 27 47 236 7 7 70 5 14 23 24
## [1379] 20 1656 206 3 5 4 9 28 21 1 83 37 56
## [1392] 5 0 3 31 7 44 22 7 2 38 0 4 16
## [1405] 2 5 4 3 10 20 2 17 553 13 1 67 17
## [1418] 7 53 29 0 127 0 272 0 10 14 111 8 11
## [1431] 8 79 5 198 56 0 24 6 1 17 7 3 7
## [1444] 81 276 8 19 5 1 1 4 194 27 4 7 60
## [1457] 84 13 1 127 158 18 0 18 193 49 4 80 23
## [1470] 1 1 46 25 85 129 4 42 11 33 9 13 79
## [1483] 8 10 70 12 85 366 49 366 99 224 279 11 54
## [1496] 69 10 246 21 127 3 2 29 5 29 33 4 4
## [1509] 5 1 27 11 4 86 9 Inf 20 24 5 34 9
## [1522] 137 8 14 18 0 22 28 4 1 7 0 12 23
## [1535] 276 23 6 453 3 31 27 153 8 1 35 3 20
## [1548] 5 13 62 160 44 182 6 20 9 32 4 12 66
## [1561] 86 66 10 553 26 66 233 29 15 66 10 9 5
## [1574] 0 44 38 16 33 9 1 11 23 207 26 61 43
## [1587] 34 1 88 127 72 49 1 19 7 8 46 21 5
## [1600] 133 1 66 2 37 69 24 6 6 5 29 23 39
## [1613] 3 6 312 6 152 2 22 16 6 62 3 0 96
## [1626] 9 3 135 15 85 15 18 20 0 3 12 54 5
## [1639] 17 172 32 14 16 13 7 5 8 85 12 18 2
## [1652] 16 15 138 1 2 6 33 101 20 8 36 3 2
## [1665] 25 56 41 10 93 30 67 137 2 11 6 2 207
## [1678] 45 45 0 8 61 2 3 0 5 0 45 16 505
## [1691] 9 0 13 4 50 49 63 33 5 5 4 172 1
## [1704] 46 12 32 70 16 37 107 21 27 2 11 40 4
## [1717] 0 2 4 20 30 4 0 26 33 24 778 73 1
## [1730] 17 25 9 100 9 4 0 21 18 22 24 16 46
## [1743] 8 50 1 Inf 10 3 4 37 117 10 2039 27 19
## [1756] 177 29 5 636 18 28 27 7 157 2 157 50 2
## [1769] 198 46 0 0 0 27 4 8 18 11 2 0 43
## [1782] 4 10 0 70 2 6 22 1 16 3 7 12 12
## [1795] 33 2 14 4 236 6 13 27 4 3 12 5 45
## [1808] 3 3 40 0 72 6 Inf 17 302 2 13 11 198
## [1821] 2 54 23 7 22 176 1 40 18 172 38 5 26
## [1834] 0 18 10 45 4 3 7 26 452 7 0 28 0
## [1847] 1 3 67 21 53 4 14 4 3 137 107 19 7
## [1860] 54 22 14 13 3 3 14 20 17 10 14 100 2
## [1873] 1 0 31 49 9 3 86 45 129 92 10 233 7
## [1886] 33 8 9 53 283 76 144 85 13 0 1 34 79
## [1899] 2 0 10 7 5 5 61 29 2 3 8 4 119
## [1912] 24 13 34 9 1 8 16 0 5 2039 14 2 6
## [1925] 44 5 10 35 11 6 0 4 138 18 49 43 17
## [1938] 2 14 2 10 3 13 32 9 9 6 0 54 1
## [1951] 23 33 3 27 1 49 91 18 56 6 3 5 35
## [1964] 104 46 3 3 10 70 869 17 3 16 1460 49 12
## [1977] 5 18 16 0 12 18 2 2 43 42 106 6 5
## [1990] 33 9 0 20 28 5 45 1 2 14 4 13 70
## [2003] 34 17 203 0 5 10 28 10 59 37 1 1 7
## [2016] 33 21 9 257 279 0 1 33 8 53 300 5 6
## [2029] 205 68 7 36 10 46 24 117 22 14 2 27 6
## [2042] 13 77 0 75 19 1 11 107 1 135 8 0 1
## [2055] 18 778 510 5 10 189 0 6 40 33 706 8 85
## [2068] 2 31 13 43 79 4 8 27 5 13 7 778 27
## [2081] 29 17 14 21 25 3 76 5 6 20 11 0 160
## [2094] 91 61 2 41 65 142 150 2 35 426 4 8 1
## [2107] 3 6 15 1 3 80 7 129 9 31 0 8 11
## [2120] 35 7 40 2 15 5 0 49 6 11 35 5 163
## [2133] 1 5 15 8 9 16 17 14 88 6 1 778 2
## [2146] 17 22 1 431 1 7 12 2 82 21 61 58 9
## [2159] 2 8 14 84 7 5 76 33 3 12 5 2 4
## [2172] 31 16 27 13 33 23 6 35 198 10 47 431 69
## [2185] 43 3 1 0 84 3 21 8 52 20 9 3 12
## [2198] 127 289 26 3 21 1 20 0 41 23 0 6 20
## [2211] 33 4 19 4 45 4 5 8 1 25 3 2 778
## [2224] 27 14 49 3 21 32 17 18 1 0 1 79 0
## [2237] 1 4 7 6 10 11 17 18 2 14 9 27 43
## [2250] 135 23 0 4 78 6 778 12 29 21 5 33 85
## [2263] 0 13 11 1 29 5 0 5 11 18 91 31 21
## [2276] 41 4 76 88 174 16 74 24 99 74 228 74 Inf
## [2289] 80 3 1 6 5 4 22 8 1 82 24 3 81
## [2302] 2 300 7 2 143 15 12 69 24 4 2 115 14
## [2315] 92 16 69 6 2 5 6 15 50 1 1 6 2
## [2328] 33 68 2 10 16 10 341 1 2 26 2 64 23
## [2341] 5 5 2 1 361 0 0 1 137 15 135 18 5
## [2354] 26 24 300 53 0 3 83 70 13 2 19 0 4
## [2367] 50 9 61 205 36 1 59 14 52 5 218 192 0
## [2380] 29 14 21 6 9 5 24 10 34 14 11 21 10
## [2393] 34 98 40 36 38 13 19 6 281 66 36 34 9
## [2406] 1 7 1159 3 5 29 135 2 7 2 8 18 15
## [2419] 29 7 4 3 90 62 0 17 8 86 3 82 31
## [2432] 0 0 0 12 3 2 40 173 8 29 4 6 4
## [2445] 1 0 4 67 85 26 110 15 10 7 79 7 309
## [2458] 20 39 16 2 1 0 4 85 61 86 4 0 36
## [2471] 58 33 24 37 15 62 16 14 4 11 3 72 50
## [2484] 1 1 0 0 51 1 289 49 18 69 81 24 0
## [2497] 9 10 42 70 173 72 17 1074 19 17 3 4 44
## [2510] 9 10 7 84 108 147 8 29 5 10 0 8 135
## [2523] 1 3 1 46 20 8 3 158 1 7 15 9 72
## [2536] 63 3 4 4 10 11 1 66 46 36 8 17 83
## [2549] 1 107 3 92 0 25 0 66 77 268 143 60 2
## [2562] 9 2 5 70 13 781 16 0 4 58 53 107 21
## [2575] 8 0 0 58 10 10 56 25 8 7 0 2 85
## [2588] 4 33 4 14 68 9 15 59 16 16 28 6 98
## [2601] 18 4 5 12 83 6 2 31 5 17 13 1 111
## [2614] 0 290 123 5 187 11 4 1 10 2 7 14 6
## [2627] 14 54 3 100 204 12 73 82 13 2 71 21 13
## [2640] 6 79 5 23 5 138 27 33 2 18 9 12 300
## [2653] 11 49 28 0 11 16 270 21 24 5 5 143 6
## [2666] 9 3 4 60 14 14 4 57 2 103 1 5 3
## [2679] 11 9 44 44 44 18 1 0 158 4 39 10 7
## [2692] 14 6 7 3 120 61 2 26 8 4 0 6 5
## [2705] 15 8 11 10 5 0 29 4 2 39 35 0 0
## [2718] 3 64 226 772 72 20 20 2 158 13 22 6 3
## [2731] 6 3 10 25 7 58 0 102 4 742 50 86 0
## [2744] 5 3 1 9 258 28 12 7 6 3 324 4 8
## [2757] 137 47 1 3 14 6 6 33 1 4 33 6 0
## [2770] 1049 187 6 145 9 1 31 40 6 4 37
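A note on the `Inf` entries in the output above: they sit at positions 63, 1516, 1746, 1814 and 2288, which are exactly the positions where `df$age[botsFriendlyAuto]` is 0. This suggests that `TwPerDay` is a tweet count divided by the account age in days, so accounts created on the day of collection produce a division by zero. The following is a minimal sketch of that effect and one possible guard, not the computation used in this post. The data frame `df_demo` and the column `statuses` are hypothetical names chosen for illustration.

```r
# Toy data; 'df_demo' and 'statuses' are hypothetical names, not from the post.
df_demo <- data.frame(
  statuses = c(8700, 120, 45, 0),  # total tweets per account
  age      = c(2175, 0, 3, 10)     # account age in days; 0 = created today
)

# Naive ratio: dividing by an age of 0 yields Inf
df_demo$TwPerDay <- df_demo$statuses / df_demo$age

# One possible guard: count accounts younger than one day as one day old
df_demo$TwPerDaySafe <- df_demo$statuses / pmax(df_demo$age, 1)

round(df_demo$TwPerDay)
round(df_demo$TwPerDaySafe)
```

Whether such a floor is appropriate depends on the analysis. Another option is to drop the affected accounts with `is.finite()` before computing summary statistics, since a single `Inf` makes `mean()` return `Inf` as well.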
df$age[botsFriendlyAuto]
## [1] 2175 3181 3238 48 892 2568 305 1499 2 515 2155 3423 201
## [14] 460 2120 1692 3016 3174 283 2429 1032 1163 1480 136 329 1897
## [27] 2478 24 1499 213 1668 1105 881 808 835 2137 130 201 484
## [40] 1465 1391 2921 1446 333 2847 1884 1533 1591 107 2589 2385 754
## [53] 986 2484 473 210 2847 1549 1403 1534 2390 2120 0 2088 2904
## [66] 894 1055 1019 1765 1205 2511 2292 817 2300 1573 544 1613 2847
## [79] 308 328 1506 500 2298 3 403 2375 765 1656 141 239 134
## [92] 820 408 692 1782 283 2081 713 1218 2847 3167 2868 1301 1970
## [105] 2203 2369 2094 439 2185 2209 3120 1970 2745 832 1283 1513 568
## [118] 75 1940 2158 523 1354 1656 1529 1187 1306 3022 2615 2915 1590
## [131] 587 1614 2385 1383 3147 2175 487 413 1301 803 3051 1499 2990
## [144] 1283 1177 820 1956 1 1401 2621 2673 1822 44 1625 186 1499
## [157] 2349 1099 1856 1533 1205 631 16 639 1418 1240 3093 1999 2146
## [170] 2089 428 701 6 2452 24 1946 1897 2647 262 2110 358 97
## [183] 1816 1566 3106 2256 2039 186 1751 2046 1191 3034 396 308 3776
## [196] 499 1973 439 408 1834 31 2112 2402 538 1619 555 3 2174
## [209] 1417 1356 2036 2452 2452 2835 97 1205 754 158 2120 1101 614
## [222] 2400 129 2034 3468 186 1680 1440 2345 3384 425 2515 1115 1262
## [235] 1055 1093 946 3198 3227 425 67 2760 1203 2120 2452 212 3098
## [248] 2625 388 1200 2122 4049 1499 610 3188 884 186 276 2477 1387
## [261] 293 1591 2644 1864 3140 3201 1400 2407 3161 333 466 1832 270
## [274] 2418 429 210 1499 2210 4049 2452 2975 2687 1799 8 1517 1198
## [287] 2620 354 868 1827 2549 2251 363 2211 1018 2670 1784 1463 1283
## [300] 1319 2154 2841 3111 1131 2287 293 1953 1378 2569 2947 1533 1291
## [313] 2454 106 2187 3222 1507 3151 1845 500 1670 505 1205 32 235
## [326] 2371 273 390 713 323 43 450 1508 228 1209 1137 1720 1183
## [339] 964 733 1903 3395 186 3704 1354 898 85 352 478 2055 713
## [352] 2037 619 1767 1414 3267 1636 141 1566 3016 1345 1132 2099 89
## [365] 1318 648 2902 2120 1661 1181 904 3222 3255 129 2564 270 932
## [378] 207 2385 2601 2886 2523 252 104 1971 3197 2235 581 1470 1187
## [391] 240 2042 1411 3053 1187 398 3201 2038 254 2303 1413 2526 265
## [404] 327 1846 2060 186 3235 1825 2106 2497 1758 1660 2165 3318 2612
## [417] 405 326 3133 1950 2041 2417 1453 2069 140 2372 80 1141 1111
## [430] 2452 16 2106 2193 2353 2292 1533 631 1826 137 824 2022 2870
## [443] 936 2216 2010 1668 3073 186 2315 1487 1997 327 1950 755 114
## [456] 2313 338 439 2447 47 606 1345 1891 2418 1478 1537 2046 473
## [469] 222 1754 237 538 143 286 1557 1533 2050 670 186 1083 2413
## [482] 3208 2910 3573 2796 625 1480 1647 3181 126 1201 1205 561 452
## [495] 2597 955 1176 1456 1566 1805 2568 30 1670 1330 1687 1490 986
## [508] 448 1402 2271 726 1748 340 2115 1533 2789 925 1827 423 1352
## [521] 2598 333 2402 77 3264 411 1566 1391 2151 854 1940 222 3267
## [534] 1754 953 1311 965 2855 1513 357 133 486 2162 3053 571 1432
## [547] 1225 1499 451 36 1901 2820 804 2155 648 581 823 2109 423
## [560] 834 1981 336 1981 2268 3104 27 2022 2500 292 222 3053 2318
## [573] 2309 2923 270 195 728 1375 834 989 2041 422 988 3181 2804
## [586] 1556 983 768 1453 1342 476 2151 2375 2231 1638 186 45 3281
## [599] 1996 2155 1419 2005 955 1821 2643 292 9 1005 2292 1111 1306
## [612] 1412 2389 2412 358 3069 11 2101 824 3128 1429 714 451 1656
## [625] 769 2504 2046 3066 2130 3366 859 193 2086 506 3069 857 1983
## [638] 615 2857 2621 21 2060 1205 1699 1889 1823 1042 9 2958 713
## [651] 1088 454 3281 1085 473 591 31 1736 714 21 1027 292 1668
## [664] 185 10 714 2227 2562 1924 581 2739 1399 1832 1345 1898 212
## [677] 130 1420 1225 1436 1925 3066 1821 633 1766 1175 1990 757 1932
## [690] 684 2354 1005 288 277 21 1979 2684 825 1019 222 175 2857
## [703] 3160 438 2589 186 523 3020 1868 180 972 2859 916 128 1093
## [716] 1440 1946 341 553 2670 2445 292 1637 1821 222 1 273 1715
## [729] 1225 216 1889 1246 581 3014 2120 1013 1746 2394 2963 1499 1478
## [742] 222 218 8 1440 444 832 1566 307 633 2220 51 1743 292
## [755] 1432 3366 2184 596 1742 2590 881 1421 423 1675 348 1531 1786
## [768] 3098 473 1283 1005 2897 175 1472 292 2119 1969 124 3052 3121
## [781] 1499 2080 1175 837 1513 2516 432 700 454 3329 1225 1856 1122
## [794] 55 3738 4 1699 2589 2001 1875 3186 1348 1990 1058 545 2035
## [807] 3281 837 1178 524 2292 439 1000 1765 2523 2023 473 31 2128
## [820] 2849 358 56 118 1138 164 3025 2052 912 837 2317 326 2494
## [833] 392 1926 2194 1163 1887 2360 186 2014 292 1508 1932 1513 2292
## [846] 1259 2523 2273 957 1225 351 2323 837 2452 2452 2452 978 1672
## [859] 868 418 916 556 1983 544 2168 55 1167 1005 1115 3374 837
## [872] 3151 1717 2095 138 1990 126 136 1899 2118 1741 839 2149 1052
## [885] 1599 2180 356 1966 2120 2046 1933 1225 3016 336 837 2195 3738
## [898] 897 3234 3536 336 760 2367 2923 43 642 1972 172 473 1067
## [911] 1369 1110 1025 2504 837 193 2121 1821 609 3016 2647 35 16
## [924] 658 2504 712 1225 2949 2402 1640 2372 1085 2157 2340 705 180
## [937] 1522 606 1389 623 51 2236 1935 353 1988 2032 1516 2246 3213
## [950] 2887 453 643 1922 180 3069 395 1931 837 803 1181 1641 2530
## [963] 2367 2461 532 198 222 2892 1465 517 916 742 1110 41 2268
## [976] 2183 3081 6 2243 1896 3127 1083 916 1765 2409 237 767 1484
## [989] 198 1891 291 8 935 2804 2220 2147 3242 1254 2422 2333 3118
## [1002] 2180 486 1679 2258 2709 1058 363 1858 1206 473 2295 2422 1672
## [1015] 2892 3267 224 2007 2183 1784 1430 2811 472 641 935 576 787
## [1028] 2498 1000 1058 293 3081 2116 2443 1063 1699 1717 2529 22 2498
## [1041] 899 3176 824 1108 477 1439 873 78 264 3222 8 1766 222
## [1054] 2449 3329 850 3130 1821 2638 1468 2622 1444 451 897 2507 3092
## [1067] 1318 1369 55 3300 1793 222 51 935 2941 2483 1269 28 733
## [1080] 186 358 3119 2197 376 8 163 2638 323 938 353 44 244
## [1093] 1593 1463 2554 172 953 193 378 480 1132 473 3119 671 412
## [1106] 396 935 512 615 8 1656 110 560 198 1784 2709 585 1263
## [1119] 1508 2005 1389 8 1720 1419 878 441 2367 1614 728 1513 862
## [1132] 379 6 3113 3208 726 787 1148 224 428 3176 3569 193 3337
## [1145] 1173 477 2622 1098 2064 1232 2494 1860 1521 2377 28 284 1204
## [1158] 8 3053 2679 2373 19 1499 1661 222 1194 1018 137 3092 1148
## [1171] 1058 2859 936 1805 935 8 8 1656 1098 2746 3283 705 1344
## [1184] 65 3189 2864 3120 3210 3183 1504 1303 1132 3208 478 558 162
## [1197] 281 28 1000 8 3064 80 3227 345 768 21 787 3108 2605
## [1210] 136 3067 2622 936 2177 203 3329 44 2963 1732 688 1650 1148
## [1223] 2287 283 936 936 935 150 1378 3 639 1343 849 1500 1950
## [1236] 1225 2506 581 8 1132 2232 787 32 32 31 998 3193 377
## [1249] 1879 50 1 3119 122 410 3085 28 646 3048 1656 596 2831
## [1262] 886 2477 1237 1741 3088 2105 2366 2287 524 1401 2997 3338 838
## [1275] 2289 512 1554 570 935 1720 1173 2865 1237 353 2827 2243 2073
## [1288] 2524 611 1138 798 687 803 203 596 2637 4 1887 3227 428
## [1301] 3166 2789 936 2141 1809 3535 1827 1533 1579 1794 1379 2244 454
## [1314] 385 3069 596 2464 2040 3205 1304 1985 180 36 83 507 1225
## [1327] 1177 1106 2287 968 178 798 241 1354 2287 198 2975 1200 1325
## [1340] 91 2905 1344 2472 328 954 428 2504 2276 1520 1647 439 222
## [1353] 2742 1518 2287 1861 2697 1720 1087 484 442 2052 2500 388 596
## [1366] 1932 704 2943 612 76 4 3083 1376 313 609 283 1552 182
## [1379] 127 696 1 576 344 1317 2309 3227 3227 1077 284 328 865
## [1392] 3156 1693 2738 200 596 1000 16 2446 3115 1856 1283 700 270
## [1405] 2347 3040 179 2236 61 3128 2148 2504 2897 273 926 2622 929
## [1418] 2850 2598 1345 281 2839 3246 331 444 2837 2240 358 166 2781
## [1431] 1743 977 755 302 773 1761 16 1713 480 589 1508 28 2923
## [1444] 563 130 389 436 421 283 2255 2438 48 2657 442 2309 1662
## [1457] 992 106 2522 214 760 551 3090 2536 395 2125 2509 1354 331
## [1470] 3250 1356 1217 1188 3069 1098 1919 1786 450 75 259 955 1013
## [1483] 1533 2380 1470 513 3069 875 2125 875 694 1006 936 2347 1919
## [1496] 78 1656 2829 270 1221 3178 2135 1231 2697 1345 891 2143 1919
## [1509] 1311 3208 567 351 2498 1699 80 0 2187 1499 720 2309 1972
## [1522] 3034 3017 2367 1206 2015 1539 803 2614 2128 1950 679 333 913
## [1535] 130 2266 1095 1178 1610 2789 612 344 1843 3106 2360 2420 712
## [1548] 2244 955 2021 3159 1480 327 1827 890 2829 267 1573 2317 3543
## [1561] 16 3543 1749 2897 525 3543 953 1172 672 3543 173 630 2719
## [1574] 3338 4 496 935 2296 946 228 819 85 198 839 1921 1809
## [1587] 2591 1211 2264 1859 703 1574 2369 2467 2309 1914 42 1430 2457
## [1600] 235 3187 3214 1972 84 1303 77 289 2418 1916 1345 2266 2480
## [1613] 1268 535 670 42 461 212 1282 935 917 3227 2503 3320 3285
## [1626] 199 1952 1658 2406 504 65 2405 890 530 3085 530 441 2210
## [1639] 929 606 966 538 1490 438 2446 2162 3017 428 2420 886 1633
## [1652] 935 2267 2039 1625 2084 1141 2296 477 1157 664 807 1350 1269
## [1665] 1122 782 2904 857 248 1936 838 1031 720 112 1827 1451 198
## [1678] 383 1818 715 3006 1921 914 878 1283 1652 3143 1513 1794 901
## [1691] 946 656 2117 774 2646 2121 221 75 2476 527 738 606 2597
## [1704] 620 1850 3118 614 1933 1983 18 345 222 2482 875 2887 218
## [1717] 1374 2209 1919 890 81 3158 2158 1740 2296 1499 31 728 2375
## [1730] 570 637 946 770 97 984 3165 2408 3044 84 1499 935 620
## [1743] 3194 3103 495 0 2437 3085 1919 1418 1438 1322 628 243 744
## [1756] 639 406 2669 833 3219 2644 117 1989 3035 212 2162 3103 772
## [1769] 1002 3191 3066 3247 2206 567 2347 884 1258 1566 1079 1283 596
## [1782] 648 553 2051 1470 316 968 803 1785 935 2142 1807 1521 977
## [1795] 75 1614 1924 2195 4 3491 3113 222 2582 729 2040 2303 1513
## [1808] 421 1858 1233 273 837 224 0 2196 60 2060 2375 1153 1002
## [1821] 1939 212 2312 2982 1299 1004 2016 2992 1258 126 496 3053 328
## [1834] 3270 3044 234 1513 1573 1913 2069 1640 351 2635 218 2644 1283
## [1847] 2965 1787 748 1812 2121 218 1657 564 2177 117 1950 1876 2309
## [1860] 1919 103 2160 273 1613 1551 283 1951 929 1373 3247 1205 2525
## [1873] 437 3104 3244 74 1972 2565 2685 1513 812 925 1656 953 2182
## [1886] 2296 664 751 413 1013 167 1013 3069 528 833 976 2094 1013
## [1899] 1943 2534 1793 1644 224 2552 3114 1294 1469 3085 682 3064 2699
## [1912] 2087 2469 1237 2235 1647 1791 242 1911 224 628 2523 341 1821
## [1925] 1888 889 1656 777 2772 2936 1753 1065 3202 2642 2125 1459 2465
## [1938] 2049 2367 1970 1656 28 1271 1822 2176 518 3027 1587 658 1781
## [1951] 1529 2889 2032 222 2859 2125 1897 34 472 72 913 2088 777
## [1964] 2439 42 63 2149 2733 1470 1109 3093 180 2412 68 2125 1946
## [1977] 415 2320 3053 2857 1717 1225 662 3079 3192 1722 20 535 2098
## [1990] 64 1837 2076 2 3155 2288 2120 2155 2704 1715 2842 541 614
## [2003] 124 589 854 1281 2288 731 302 302 1781 106 2966 1459 2923
## [2016] 1189 1333 2159 11 291 2491 2161 75 664 42 138 60 3269
## [2029] 205 3036 2923 3447 854 620 1306 1067 605 3237 2126 222 3027
## [2042] 541 553 691 723 90 157 697 1115 1977 1668 2815 91 2269
## [2055] 100 31 342 2109 1656 1 2039 3195 2992 2226 2983 664 3069
## [2068] 600 3107 3264 104 977 2478 2327 2319 141 2331 1414 31 3313
## [2081] 959 929 1984 2931 2399 3119 3432 1855 2120 1668 1743 1283 53
## [2094] 1897 3114 1291 1401 2002 1104 764 1 452 1383 2319 3438 2179
## [2107] 2673 72 304 1804 28 1354 3113 111 698 1037 2206 2323 281
## [2120] 1928 3083 100 2845 2569 2532 1127 1875 1559 351 452 1226 287
## [2133] 2495 2548 2240 3609 48 979 2052 2523 370 466 2196 31 2456
## [2146] 1952 646 48 8 3086 3098 137 633 1000 3212 2829 375 97
## [2159] 2024 2093 1969 177 87 3112 1224 2067 2554 2069 1972 2438 2311
## [2172] 1040 979 222 1419 64 1177 1713 3193 1002 1405 1634 8 467
## [2185] 1459 1298 2168 633 177 3119 12 642 64 1750 1817 1505 2271
## [2198] 1221 151 2523 2851 3110 926 2975 3281 741 1110 510 315 3105
## [2211] 64 2185 615 1317 1513 754 3225 2480 926 2391 157 462 31
## [2224] 222 2510 2620 3067 767 3340 2528 1803 2502 274 926 3395 1761
## [2237] 799 1715 3121 1713 926 106 1803 2405 1237 3247 1636 2490 1459
## [2250] 1668 1110 3008 1566 463 1564 31 2422 1229 3227 151 3933 3069
## [2263] 2606 1445 2871 2517 1294 2259 2574 1889 106 363 1897 2621 2140
## [2276] 63 3348 1224 370 45 1388 550 1499 1449 550 1666 550 0
## [2289] 1354 2156 926 1713 224 569 592 1882 1631 1000 1499 2492 2215
## [2302] 1357 138 2243 771 1659 304 2232 1303 1237 1541 1135 656 1873
## [2315] 1906 1768 703 2579 2048 2172 2271 1907 3118 2734 3055 277 303
## [2328] 90 771 2164 3184 2708 2112 131 83 3179 328 2873 1636 2800
## [2341] 2259 2038 2068 2299 3303 3211 2145 2008 3034 750 1668 2489 1432
## [2354] 1285 2980 138 3208 444 3933 1069 286 333 3079 1816 1283 1383
## [2367] 3118 581 2829 704 936 3250 2892 3237 2205 755 2258 498 2236
## [2380] 105 2324 1864 1073 2427 2423 1499 1762 1743 2434 968 2249 1065
## [2393] 1780 161 404 936 1056 2205 3320 1150 2 3316 251 1743 23
## [2406] 1106 3045 2793 3192 2342 1345 1668 491 987 3136 235 1206 2614
## [2419] 55 1950 1566 10 2315 210 405 3083 2276 2750 2739 1000 165
## [2432] 1807 71 2491 3108 3085 3079 451 616 664 958 2404 1408 1412
## [2445] 1240 1812 1566 2659 1468 60 296 1301 2791 2251 537 2306 138
## [2458] 1074 2297 1842 3226 2783 2066 1566 428 2829 2685 2559 1587 14
## [2471] 911 64 1499 289 2259 1 2570 2367 377 2815 3085 837 3118
## [2484] 2171 2065 2466 300 382 1963 151 546 3946 78 2215 1499 1097
## [2497] 2483 1656 1722 1470 616 837 929 1083 943 2465 2257 1566 2241
## [2510] 194 2011 1807 177 1497 76 362 2189 1046 1656 3281 2413 1668
## [2523] 566 733 1868 897 3128 864 2761 64 2008 3176 1219 2427 837
## [2536] 1610 1226 1761 1566 25 2349 2542 2769 2183 184 652 514 1069
## [2549] 2734 1115 1776 1906 3120 2150 305 2769 2125 881 140 2510 314
## [2562] 1193 382 1932 1470 1855 388 942 2521 1020 339 3208 20 1370
## [2575] 2525 56 2061 375 1762 518 1407 85 1843 2176 1148 212 3069
## [2588] 3081 3270 1093 2324 3180 965 582 3007 935 270 2360 1726 832
## [2601] 683 377 3142 2804 556 1090 1970 2472 2246 345 3107 450 358
## [2614] 2145 494 658 363 1086 1741 2023 1625 2773 675 3167 1180 2380
## [2627] 2632 912 2471 2524 8 2975 721 2544 1801 1325 74 1 1929
## [2640] 3193 335 35 3082 224 1943 2319 75 2520 3117 1774 2975 138
## [2653] 2781 187 39 1919 2589 935 667 530 1499 2902 224 140 1669
## [2666] 1808 3218 1222 2510 289 2523 999 672 1970 60 1228 1695 2465
## [2679] 3673 1080 1568 183 1000 3228 1534 1074 760 770 1341 2845 2332
## [2692] 2523 988 2474 2548 66 2829 1629 193 3238 28 2606 466 1311
## [2705] 1248 2618 807 2773 224 1471 232 1093 450 2851 452 1070 2529
## [2718] 1279 1604 169 800 206 6 1411 3291 760 2190 3071 3119 3236
## [2731] 1073 733 138 244 1171 911 354 2908 2315 156 2737 645 2532
## [2744] 2415 1251 1728 2845 28 771 725 1091 2741 3202 820 1968 664
## [2757] 206 268 2424 1583 2028 2777 3672 336 794 2632 75 968 1923
## [2770] 957 1086 945 368 245 2358 1605 3929 279 1383 881
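# The friend/follower (FF) ratio is a useful signal. For bots, the ratio is often around 1.
# Plot followers against friends on log axes (+1 avoids log(0));
# accounts on the diagonal have as many friends as followers.
plot(df$followers_count+1, df$friends_count+1, log = "xy")
abline(0,1)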
# If FF-ratio is close to 1 and there are more than 1000 followers
# it is likely a bot. How likely? We don't know!
botsFF <- which(round((df$friends_count+1)/(df$followers_count+1),1)==1 &
df$followers_count>1000)
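To make the rule above concrete, here is a minimal sketch on invented accounts. The data frame `toy` and all of its values are hypothetical; only the column names `friends_count` and `followers_count` come from the code above.

```r
# Hypothetical toy data: four invented accounts (not from the Iran data set).
toy <- data.frame(
  screen_name     = c("a", "b", "c", "d"),
  friends_count   = c(5000, 120, 1500, 30),
  followers_count = c(5100, 4000, 1490, 30)
)

# Same rule as above: the FF ratio rounds to 1.0 and there are
# more than 1000 followers.
ff_ratio <- (toy$friends_count + 1) / (toy$followers_count + 1)
suspects <- which(round(ff_ratio, 1) == 1 & toy$followers_count > 1000)

toy$screen_name[suspects]
```

Rounding to one decimal means that ratios between roughly 0.95 and 1.05 count as "close to 1". Account "d" has a ratio of exactly 1 but is not flagged, because it has fewer than 1000 followers.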
# Combine all our bot-suspects.
bots <- unique(c(botsDup, botsTWpD, botsFF, botsAuto))
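# Media companies use bots as well, but most of the time
# they are verified users, so drop verified accounts.
# Logical indexing is used instead of -which(): if no suspect is
# verified, bots[-integer(0)] would return an empty vector.
bots <- bots[!(df$verified[bots] %in% TRUE)]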
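A note on dropping elements by index in R: negative indexing with `which()` misbehaves when nothing matches. The following minimal sketch uses invented values (`suspects` and `verified` are hypothetical and not taken from the data set) to show the difference to logical indexing.

```r
# Hypothetical values: three suspect row indices, none of them verified.
suspects <- c(2L, 5L, 9L)
verified <- c(FALSE, FALSE, FALSE)

# which(verified) is integer(0), and x[-integer(0)] selects nothing,
# so every suspect is dropped by accident.
dropped_all <- suspects[-which(verified)]

# Logical indexing keeps all suspects when none of them is verified.
kept <- suspects[!(verified %in% TRUE)]

length(dropped_all)
length(kept)
```

The two approaches give the same result as long as at least one element matches, which is why the problem is easy to miss.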
Botsources <- URL_parts(df$source[bots])[,2]
sort(table(Botsources)[table(Botsources)>10], decreasing = T)
## Botsources
## twitter.com
## 5430
## twitter.com" rel="nofollow">Twitter Web Client<
## 1653
## ifttt.com" rel="nofollow">IFTTT<
## 1598
## dlvrit.com
## 427
## mobile.twitter.com" rel="nofollow">Twitter Lite<
## 376
## publicize.wp.com
## 309
## about.twitter.com
## 252
## tapbots.com
## 214
## www.hootsuite.com" rel="nofollow">Hootsuite<
## 193
## www.google.com
## 181
## www.tweetcaster.com" rel="nofollow">TweetCaster for Android<
## 169
## www.echofon.com
## 85
## www.crowdfireapp.com" rel="nofollow">Crowdfire - Go Big<
## 78
## mvilla.it
## 67
## bufferapp.com" rel="nofollow">Buffer<
## 45
## twitterrific.com" rel="nofollow">Twitterrific<
## 45
## paper.li" rel="nofollow">Paper.li<
## 37
## mobile.twitter.com" rel="nofollow">Mobile Web (M2)<
## 34
## twicca.r246.jp
## 29
## roundteam.co" rel="nofollow">RoundTeam<
## 28
## twibble.io" rel="nofollow">Twibble.io<
## 27
## www.facebook.com
## 25
## www.twitter.com" rel="nofollow">Twitter for Windows<
## 24
## twittbot.net
## 23
## semanticearth.com
## 17
## archive.org
## 15
## uk.timesofnews.com" rel="nofollow">Times of News from UK<
## 15
## software.complete.org
## 12
## lissted.com" rel="nofollow">Lissted<
## 11
## www.twhirl.org" rel="nofollow">Seesmic twhirl<
## 11
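Several labels in the table above, such as `twitter.com" rel="nofollow">Twitter Web Client<`, still carry fragments of the HTML anchor stored in the `source` field. As a rough sketch, and assuming the field always has the shape `<a href="..." rel="nofollow">Client name</a>`, host and client name could be separated with base R regular expressions. The example strings are invented, and this is not the `URL_parts` helper used above.

```r
# Hypothetical examples in the anchor format of the `source` field.
src <- c(
  '<a href="http://twitter.com" rel="nofollow">Twitter Web Client</a>',
  '<a href="https://ifttt.com" rel="nofollow">IFTTT</a>',
  '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>'
)

# Client label: what remains after removing the opening and closing tags.
client <- gsub("<[^>]+>", "", src)

# Host name: everything after "://" up to the next slash or quote.
host <- sub('.*href="[a-z]+://([^/"]+).*', "\\1", src)

data.frame(client, host)
```

Counting by `client` instead of by host would also keep different apps apart that share a host, for example the various official Twitter clients.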
# Show 10 random bot texts.
df$text[bots][sample(length(bots), 10)]
## [1] "RT @GenRickDeMarco: Obama solution:Give pallets of cash to Iran,a Russian ally #IranDeal which mirrored a Bill Clinton deal w/North Korea.…"
## [2] "Citi973: Iran protests: Citizens told to avoid ‘illegal gatherings’ |More here: https://t.co/eUYUthRxan #CitiNews:… https://t.co/WzU4qpLvmf"
## [3] "RT @Kinghaven111: لليوم الثالث ع التوالي👇\n\nايران سوف تنتفض ضد الاستبداد والطغيان هناك اصرار رهيب بكل قوة من الشعب الايراني،لتغيير حكم الملا…"
## [4] "RT @PamelaGeller: Bravo! Finally undoing the vicious anti-freedom policies of @BarackObama https://t.co/iiusxiFc52"
## [5] "RT @President1Trump: The green movement in Iran is happening again folks! At least we know that this @POTUS will have the good people of Ir…"
## [6] "RT @pnique: Curiosamente, el Tribunal de Cuentas no ha encontrado la financiación rusa, iraní y venezolana de Podemos. Lo que sí ha encontr…"
## [7] "RT @motostrelki1989: 테헤란 경찰이 더이상 여성을 복장 불량을 이유로 체포하지 않을 것이라 발표 https://t.co/wNIepy7Fkw"
## [8] "Iran says US President Trump support for protests &#39;deceitful&#39; https://t.co/lgSINOIxk2 #NewIE #World"
## [9] "RT @finy06: THE WORLD IS WATCHING!! #IranProtests #Revolution #HassanRouhani \nIran warns against 'illegal gatherings' after protests https:…"
## [10] "RT @mdubowitz: Imagine a free, democratic, independent, wealthy Iran. Giving full expression to beauty of Persian culture. Tapping into bra…"
# Get the 100 most active bots.
sort(table(df$screen_name[bots]), decreasing = T)[1:100]
##
## 5b20be6386164f8       ProSyria1      s_total_s2       TuxiTalk1
##             101              88              78              63
##      mehrijavan     UTHornsRawk       enzyplex_   2017TRUMP2017
##              52              50              47              43
##      rentonMAGA      BeanfromPa CityofInvestmnt     PMoallemian
##              40              38              37              36
## costellodaniel1       MarjanFa1         uliw315   etdbrief_ro_1
##              34              33              33              32
## MsContrarianSci      idesignwis        AnvilTMF        putzie63
##              32              30              28              28
## JoelGoldenberg1       zyiteblog   ascending2him    G436R52F2V08
##              27              27              25              25
##   sama_ebrahimi    FaranakAzad1   Leslieforlife         Lyn1350
##              25              24              24              24
## TinaOrt79591465  cloudwanderer3       nyvetvote           VNVE_
##              24              23              23              23
##  Davewellwisher         salahrv        WMW_2018       10_KSA_10
##              22              22              22              21
##    besoypirozi1      IranHurria LazarusDeLargeV SharonM44754993
##              21              21              21              21
## sunrise_freedom       6muammer5    AmericanMom2       bbeekk321
##              21              20              20              20
##  clarinetwoman2 donjone38970700           s0a1r     EvOConnor15
##              20              20              20              19
##      ginapaints            OLPL   RobChristie11       SlowRoll2
##              19              19              19              19
## FaridehTavasso1         gxr5055    Hassan__Hadi       mitraba60
##              18              18              18              18
##  nastaranazimi9 newslivenetwork    THEDOGPOUND1  AnnCarolPerry1
##              18              18              18              17
##          ektrit          GboruM     instapundit         mglom11
##              17              17              17              17
##  scheerenberger      sheyda_hsh        syjere17     TimesofNews
##              17              17              17              17
## zeeshan_shah_dc   CelesteHerget      E__Strobel         MSpan10
##              17              16              16              16
##        RoRoscoe   alialmutawa24  datrumpnation1         mtoni93
##              16              15              15              15
##   DoomsdayVixon         fefo646   GiglioMarilyn   kaveh20092009
##              14              14              14              14
## Melissa31920880      Milatrud11 MuhammadWasAJew Nvehecnycrrcom1
##              14              14              14              14
##    PascaLegrand    ResidentOfFL Sammie_Snickers    ZyiteGadgets
##              14              14              14              14
## DavidGr78574965       Della1946   EdwardPDeRosa    Jim_Peoples_
##              13              13              13              13
##         JimPolk   leeleemunster     lynn_weiser       MaddyMaga
##              13              13              13              13
## nasher747ghamdi Nazanin22902172     SDrinsinger  americanshomer
##              13              13              13              12
# Build Wordcloud
library("tm")
## Loading required package: NLP
library("SnowballC")
library("wordcloud")
## Loading required package: RColorBrewer
library("RColorBrewer")
# docs <- Corpus(VectorSource(df$text[-bots]))
#
# # Remove numbers
# docs <- tm_map(docs, removeNumbers)
# # Remove punctuation
# docs <- tm_map(docs, removePunctuation)
# # Eliminate extra white spaces
# docs <- tm_map(docs, stripWhitespace)
#
# dtm <- TermDocumentMatrix(docs)
# m <- as.matrix(dtm)
# load("m.RDATA")
# v <- sort(rowSums(m),decreasing=TRUE)
# d <- data.frame(word = names(v),freq=v)
# save(d, file="d.RDATA")
# head(d, 10)
load("d.RDATA")
# Same with bots:
docsB <- Corpus(VectorSource(df$text[bots]))
# Remove numbers
docsB <- tm_map(docsB, removeNumbers)
# Remove punctuation
docsB <- tm_map(docsB, removePunctuation)
# Eliminate extra white spaces
docsB <- tm_map(docsB, stripWhitespace)
dtmB <- TermDocumentMatrix(docsB)
mB <- as.matrix(dtmB)
vB <- sort(rowSums(mB),decreasing=TRUE)
dB <- data.frame(word = names(vB),freq=vB)
# Most frequent words in the non-bot tweets (d, loaded from d.RDATA above).
head(d, 10)
## word freq
## iran iran 36067
## the the 34758
## and and 10486
## this this 8355
## are are 7242
## protests protests 7180
## people people 6170
## iranprotests iranprotests 5813
## for for 5026
## iranian iranian 4802
# png("NoBots.png", type = "cairo", units = "cm", width = 30, height = 15, res = 600)
par(mfrow=c(1,2))
wordcloud(words = d$word, freq = d$freq, min.freq = 10,
max.words=200, random.order=FALSE, rot.per=0.35,
colors=colorRampPalette(c("orange", "black"))(100), random.color = T)
## Warning in wordcloud(words = d$word, freq = d$freq, min.freq = 10,
## max.words = 200, : independent could not be fit on page. It will not be
## plotted.
## Warning in wordcloud(words = d$word, freq = d$freq, min.freq = 10,
## max.words = 200, : تظاهراتسراسرى could not be fit on page. It will not be
## plotted.
wordcloud(words = dB$word, freq = dB$freq, min.freq = 10,
max.words=200, random.order=FALSE, rot.per=0.35,
colors=colorRampPalette(c("red", "black"))(100), random.color = T)
# dev.off()
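Two word clouds side by side are hard to compare by eye, especially because the bot corpus is much smaller than the rest. One possible complement, which is not part of the original analysis, is to compare relative word shares directly. The sketch below uses two invented frequency tables, `d_toy` and `dB_toy`, in the same shape as `d` and `dB` (columns `word` and `freq`). The add-one smoothing is an assumption made here to avoid division by zero.

```r
# Hypothetical frequency tables shaped like d (non-bots) and dB (bots).
d_toy  <- data.frame(word = c("iran", "protests", "people", "freedom"),
                     freq = c(400, 300, 200, 100))
dB_toy <- data.frame(word = c("iran", "protests", "maga", "freedom"),
                     freq = c(40, 10, 30, 20))

# Join on word; a word missing from one table gets a frequency of 0.
both <- merge(d_toy, dB_toy, by = "word", all = TRUE,
              suffixes = c("_users", "_bots"))
both[is.na(both)] <- 0

# Relative shares with add-one smoothing, then the bot/user ratio.
share_users <- (both$freq_users + 1) / sum(both$freq_users + 1)
share_bots  <- (both$freq_bots  + 1) / sum(both$freq_bots  + 1)
both$bot_ratio <- share_bots / share_users

# Words that are over-represented among suspected bots come first.
both[order(both$bot_ratio, decreasing = TRUE), c("word", "bot_ratio")]
```

A ratio well above 1 marks a word that suspected bots use disproportionately often. Like the heuristics themselves, this is only a hint and says nothing about how reliable the bot classification is.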