Socio-Economic Data for the German General Election

The general election is coming up, so wouldn't it be cool to run some data-driven models? The government is publishing some very interesting statistics at the level of the electoral constituencies. This is quite rare. Why? In Germany, the electoral constituencies do not coincide with the regional districts that the normal governmental statistics are based on. You can see the data here: https://www.bundeswahlleiter.de/bundestagswahlen/2017/strukturdaten.html.
Alas, the CSV files are not really R-friendly. Therefore, I wrote the following script, which gives you three CSV files: one for 2013, one for 2017, and one with the variables that are included for both elections.
I have also changed the column names so that they are easier to work with.

library(readr)

df2017  <- read_delim("https://www.bundeswahlleiter.de/dam/jcr/f7566722-a528-4b18-bea3-ea419371e300/btw17_strukturdaten.csv", 
                      ";", escape_double = FALSE, col_names = FALSE, 
                      locale = locale(decimal_mark = ","), 
                      comment = "#", trim_ws = TRUE, skip = 2)

## Parsed with column specification:
## cols(
##   .default = col_double(),
##   X1 = col_character(),
##   X2 = col_integer(),
##   X3 = col_character(),
##   X4 = col_integer(),
##   X26 = col_integer(),
##   X27 = col_integer(),
##   X29 = col_character(),
##   X39 = col_character(),
##   X40 = col_character(),
##   X41 = col_character(),
##   X42 = col_character(),
##   X52 = col_character()
## )

## See spec(...) for full column specifications.
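
To see what those arguments do, here is a small sketch on made-up text rather than the real file. It uses base R's read.table as a stand-in for read_delim (my own substitution, so that it runs without any packages), and the sample rows and values are invented. It shows the skipped header lines, the ignored comment line, the decimal comma, and how a trailing semicolon produces an extra, completely empty last column, which is presumably why the last column gets dropped in the script.

```r
# Invented sample in the assumed layout: two header lines, one comment line,
# then data rows that each end with a trailing semicolon.
txt <- paste(
  "Land;Nr;Name;Anteil;",
  ";;;Prozent;",
  "# a comment line",
  "Schleswig-Holstein;1;Flensburg;12,5;",
  "Hamburg;18;Hamburg-Mitte;7,25;",
  sep = "\n"
)

# Base-R analogue of the read_delim call: semicolon separator, decimal comma,
# skip the two header lines, ignore lines starting with '#'.
raw <- read.table(text = txt, sep = ";", dec = ",", skip = 2,
                  header = FALSE, comment.char = "#", quote = "",
                  strip.white = TRUE, stringsAsFactors = FALSE)

str(raw)  # the last column holds nothing but NA

# Drop every column that contains only NA instead of hard-coding its position.
clean <- raw[, colSums(!is.na(raw)) > 0, drop = FALSE]
```

Dropping all-NA columns by content is a bit more robust than dropping by index, but it would also remove a real variable that happens to be empty, so it is worth checking the result.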

# Drop the last column (52), which is not covered by the names below
df2017 <- df2017[, -52]
newNames<-c("Land","WahlkreisNR","Wahlkreis",
            "Gemeinden","Flaeche","Bevoelkerung",
            "Deutsch","AuslaenderProz","Einwohnerkm2",
            "Geburtensaldo","Wanderungssaldo","Alter18",
            "Alter1825","Alter2535","Alter3560",
            "Alter6075","Alter75","OhneMigration",
            "MitMigration","RoemischKath","Evangelisch",
            "SonstigeRel","Eigentuemerquote","Wohnungen",
            "WohnungenBestand","Einkommen","BIP",
            "Autos","BerufsschuleAbschluss","Abschluss",
            "OhneHaupt","Haupt","Real",
            "Abi","KitaKinder","Unternehmen",
            "Handwerksunternehmen","Besch","BeschLand",
            "BeschProd","BeschHandel","BeschDienst",
            "BeschSonst","SozialEmpf","SozialEmpfNichtErwerb",
            "SozialEmpfAusl","Arbeitsl","ArbeitslMaenner",
            "ArbeitslFrauen","Arbeitsl1519","Arbeitsl5564")
colnames(df2017) <- newNames
write.csv(df2017, "Strukturdaten2017.csv", row.names = FALSE)
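
Since the names are assigned purely by position, a names vector that is too short or contains a duplicate can easily go unnoticed. Here is a minimal sketch of a guard for that. The helper rename_checked is hypothetical (it is not part of the script above), and the toy data frame is made up.

```r
# Hypothetical helper: assign column names only if the count matches
# and no name appears twice.
rename_checked <- function(df, new_names) {
  if (length(new_names) != ncol(df)) {
    stop(sprintf("got %d names for %d columns", length(new_names), ncol(df)))
  }
  if (anyDuplicated(new_names) > 0) {
    stop("duplicated column names: ",
         paste(unique(new_names[duplicated(new_names)]), collapse = ", "))
  }
  colnames(df) <- new_names
  df
}

# Made-up stand-in for a freshly read file with default column names.
toy <- data.frame(X1 = c("Bayern", "Hessen"), X2 = c(1L, 2L), X3 = c(0.5, 0.7),
                  stringsAsFactors = FALSE)

renamed <- rename_checked(toy, c("Land", "WahlkreisNR", "AuslaenderProz"))

# A names vector that is one entry short is rejected with an error message.
mismatch <- tryCatch(rename_checked(toy, c("Land", "WahlkreisNR")),
                     error = function(e) conditionMessage(e))
```

In the script this would amount to something like df2017 <- rename_checked(df2017, newNames) in place of the plain colnames assignment.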


#2013
df2013 <- read_delim("https://www.bundeswahlleiter.de/dam/jcr/65ef1c2d-4df0-44e2-8881-99a176b4896c/btw13_strukturdaten.csv", 
           ";", escape_double = FALSE, col_names = FALSE, 
           locale = locale(decimal_mark = ","), 
           comment = "#", trim_ws = TRUE, skip = 2)

## Parsed with column specification:
## cols(
##   .default = col_double(),
##   X1 = col_character(),
##   X2 = col_integer(),
##   X3 = col_character(),
##   X4 = col_character(),
##   X43 = col_character()
## )
## See spec(...) for full column specifications.

# Drop the last column (43), which is not covered by the names below
df2013 <- df2013[, -43]


newNames <- c("Land","WahlkreisNR","Wahlkreis",
              "Gemeinden","Flaeche",
              "Bevoelkerung","Maennlich",
              "Deutsch","Einwohnerkm2",
              "Geburtensaldo","Wanderungssaldo",
              "Alter18","Alter1825",
              "Alter2535","Alter3560",
              "Alter6075","Alter75",
              "Abschluss","OhneHaupt",
              "Haupt","Real",
              "Abi","Autos",
              "Wohnungen","WohnungenBestand",
              "Betriebe","Gewerbesteuer",
              "Gewerbeanmeldung","Gewerbeabmeldung",
              "Insolvenz","InsolvenzBesch",
              "Besch","BeschLand",
              "BeschProd","BeschHandel",
              "BeschDienst","BeschSonst",
              "Arbeitsl","ArbeitslFrauen",
              "ArbeitslAusl","SozialEmpf",
              "SozialEmpfNichtErwerb")
colnames(df2013) <- newNames

write.csv(df2013, "Strukturdaten2013.csv", row.names = FALSE)
 
# Tag each data set with its election year
df2013 <- cbind(df2013, Jahr = 2013)
df2017 <- cbind(df2017, Jahr = 2017)

# Keep only the variables that exist in both years and stack the two data sets
commonNames <- intersect(colnames(df2013), colnames(df2017))
dfCommon <- rbind(df2013[, commonNames], df2017[, commonNames])

write.csv(dfCommon, "StrukturdatenCommon.csv", row.names = FALSE)
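
One thing to watch in the merged file: according to the parse output above, some columns are read with different types in the two years (for example X4, Gemeinden, is integer in 2017 but character in 2013), and rbind will then presumably fall back to character for those columns. Here is a minimal sketch of one way to harmonize the types before stacking. The helpers to_numeric_de and stack_common are my own invention, the toy data is made up, and I am assuming that the character columns hold German-formatted numbers plus the odd non-numeric placeholder.

```r
# Hypothetical helper: turn German-formatted number strings into numerics.
# Assumes "." is a thousands separator and "," the decimal mark.
to_numeric_de <- function(x) {
  if (is.numeric(x)) return(x)
  x <- gsub(".", "", as.character(x), fixed = TRUE)  # remove thousands separators
  x <- sub(",", ".", x, fixed = TRUE)                # decimal comma -> decimal point
  suppressWarnings(as.numeric(x))                    # anything non-numeric becomes NA
}

# Hypothetical helper: keep the shared columns, coerce all non-id columns
# to numeric, then stack the two years.
stack_common <- function(old, new, id_cols) {
  common <- intersect(colnames(old), colnames(new))
  value_cols <- setdiff(common, id_cols)
  old <- old[, common, drop = FALSE]
  new <- new[, common, drop = FALSE]
  old[value_cols] <- lapply(old[value_cols], to_numeric_de)
  new[value_cols] <- lapply(new[value_cols], to_numeric_de)
  rbind(old, new)
}

# Made-up stand-ins for the two years: BeschLand is numeric in one
# and character (with a placeholder) in the other.
toy2013 <- data.frame(Wahlkreis = c("A", "B"), BeschLand = c(1.5, 2.0),
                      Betriebe = c(10, 20), Jahr = 2013,
                      stringsAsFactors = FALSE)
toy2017 <- data.frame(Wahlkreis = c("A", "B"), BeschLand = c("1,7", "-"),
                      BIP = c(30, 40), Jahr = 2017,
                      stringsAsFactors = FALSE)

toyCommon <- stack_common(toy2013, toy2017, id_cols = "Wahlkreis")
str(toyCommon)
```

Placeholders silently become NA here, so it is worth counting the NAs per column afterwards to see how much was lost.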

Political Data Science
