Data

I’m going to apologize in advance to anyone who is not an Americanist or a behavioralist: I have not used any of the non-American datasets other than the Eurobarometer (and even then, it’s been a while). I did my best to be as inclusive as possible while sticking to data that would be fairly straightforward to use. Below is a brief description of each dataset, along with a link to its website, where the codebook can be found. Note that you are not limited to the data listed here. You can use any data you want; I just thought this would be helpful for getting you started.

American Data

ANES

The American National Election Studies (ANES) is by far the most widely used data in American politics (although it might be getting edged out by the CES). It is a nationally representative (probably) sample of Americans who are asked their opinions on various political events and entities. Here is the link to the website for the data. There you will find the codebook and information about the data. Note that in order to download anything you will need to make an account (it is completely free, and I recommend using your FSU email if you have access to it). Below is sample code showing how you could read the data into R.

library(tidyverse)

# remember to specify the correct path to your data
anes <- read_csv("ANES.csv")

## If the file is saved in your Downloads folder, the code would look like this (on Windows).
## Note that it is better practice to keep your data and script in the same folder, or the
## data in a subfolder of the script folder. More on that in class.
anes <- read_csv("C:/Users/Name/Downloads/ANES.csv")
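If you do keep the data in a subfolder of the script folder, the path can be built relative to that folder instead of typing out the whole drive path. Here is a minimal sketch of that idea. The subfolder name "data" is an assumption for illustration, so swap in whatever your folder is actually called.

```r
library(tidyverse)

# Hypothetical layout: the script sits in the project folder and the
# data sits in a subfolder called "data" (the folder name is an assumption)
anes_path <- file.path("data", "ANES.csv")

if (file.exists(anes_path)) {
  anes <- read_csv(anes_path)
} else {
  # getwd() shows the folder R is treating as its starting point
  message("Could not find ", anes_path, " starting from ", getwd())
}
```

If you get the message instead of the data, R is most likely looking in a different folder than you expect, and the folder printed in the message tells you where it started.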

CES

The Cooperative Election Study (CES, formerly the CCES) is the largest readily available survey of American voters that I know of. There are ~60,000 respondents, meaning that each state is represented with a fair bit of accuracy (a few folks have a paper on this; I think one is by Gelman, but I could be wrong). The CES asks respondents a lot of the same questions as the ANES, with some notable exceptions (no feeling thermometers or the more political-psych type stuff), but it asks a lot more policy questions. It is great if you want to get an idea of what your home state looks like. The data is called the “common content” because it is the part that they make readily available to everyone. Departments can also buy modules on the CES to get ~1000 respondents pre- and post-election, which can be a great way to field experiments and pool resources. Here is the link to the page where the data comes from, with the codebook and such. Here is some sample code for reading the data into R:

library(tidyverse)

# remember to specify the correct path to your data
ces22 <- read_csv("CES22_Common.csv")

## example from another folder
ces22 <- read_csv("C:/Users/Name/Downloads/CES22_Common.csv")

State Data

Someone else already went to the trouble of compiling these data, so I’m not going to go through each of the sources. If you go here, you’ll be taken to the website of Jeremy Darrington from Princeton. There, he has collected a bunch of different state datasets that can be used for whatever interests you may have.

Comparative Data

QoG

The Quality of Government (QoG) dataset is an open-source dataset that measures…the quality of government. The version here is the basic dataset, which has the most frequently used variables. The codebook and more info on this dataset can be found here. Below is some example code to load the data into R:

library(tidyverse)

# remember to specify the correct path to your data
qog <- read_csv("qog_bas_cs_jan23.csv")

# example from another folder
qog <- read_csv("C:/Users/Name/Downloads/qog_bas_cs_jan23.csv")

CPDS

The Comparative Political Data Set (CPDS) is a collection of country-level comparative data. It consists of annual data for 36 democracies, covering 1960 to 2021 or, for countries that became democracies later, starting from whenever the country became a democracy. The data page is here.

Please note that this data is an .xlsx file rather than a CSV. Therefore, we will need to change the code slightly to read the data in correctly. Other than the specific function, specifying the path and assigning the data to an object is exactly the same. The code is as follows:

# installing a new package; this only needs to be done once and can be run in the console
install.packages("xlsx")
library(tidyverse)

# remember to specify the correct path to your data
# read.xlsx() has to be told which sheet to read; sheetIndex = 1 assumes the data is on the first sheet
cpds <- xlsx::read.xlsx("CPDS_1960-2021_Update_2023.xlsx", sheetIndex = 1)

# example from another folder
cpds <- xlsx::read.xlsx("C:/Users/Name/Downloads/CPDS_1960-2021_Update_2023.xlsx", sheetIndex = 1)
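The xlsx package relies on Java, which can be a headache to set up on some computers. If the code above will not run for you, one alternative is the readxl package, which is installed along with the tidyverse and does not need Java. The sketch below is just one way to do it. It uses the same file name as above, and the choice of sheet 1 is an assumption, so check the printed sheet names first.

```r
# readxl is installed with the tidyverse, but library(tidyverse) does not
# attach it, so it has to be loaded separately
library(readxl)

cpds_path <- "CPDS_1960-2021_Update_2023.xlsx"

if (file.exists(cpds_path)) {
  # print the sheet names so you can see which one holds the data
  print(excel_sheets(cpds_path))

  # sheet = 1 is an assumption; swap in the correct sheet name or number
  cpds <- read_excel(cpds_path, sheet = 1)
}
```

read_excel() returns a tibble, so the result behaves the same way as the data you get from read_csv().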

Manifesto Project

The Manifesto Project analyses parties’ election manifestos to study the policy preferences of political parties globally. Their website is here. Here is some sample code for reading the data into R:

library(tidyverse)

# remember to specify the correct path to your data
man <- read_csv("parties_MPDataset_MPDS2023a.csv")

# example from another folder
man <- read_csv("C:/Users/Name/Downloads/parties_MPDataset_MPDS2023a.csv")

International Relations Data

I’m going to offer an extra apology to the IR scholars among you. The data for IR is absolutely terrible. I personally find collecting this sort of data challenging, which is why the selection here is limited. If none of the data here interests you and you need help finding something else, just let me know.

V-Dem

The V-Dem (Varieties of Democracy) data provides democracy ratings for a large number of countries. It is very commonly used in IR. The download link to the data is above. You will need to make an account with them, but it’s completely free. You can download the data as a CSV, Rdata, or Stata file.

library(tidyverse)

# remember to specify the correct path to your data
vdem <- read_csv("vdem_file_name.csv")

# example from another folder
vdem <- read_csv("C:/Users/Name/Downloads/vdem_file_name.csv")
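If you download the R or Stata version instead of the CSV, the function changes but the logic stays the same. The sketch below shows one way this could look. The file names are placeholders, just like vdem_file_name.csv above, and the .rds extension for the R file is an assumption, so use whatever your download is actually called.

```r
library(tidyverse)
library(haven)  # for Stata files; installed with the tidyverse but not attached by it

# Placeholder file names (assumptions); replace with the names of your downloads
vdem_rds <- "vdem_file_name.rds"
vdem_dta <- "vdem_file_name.dta"

if (file.exists(vdem_rds)) {
  # an .rds file holds a single R object, which readRDS() returns
  vdem <- readRDS(vdem_rds)
} else if (file.exists(vdem_dta)) {
  # read_dta() reads Stata files and keeps the variable labels
  vdem <- read_dta(vdem_dta)
}
```

If the R file ends in .RData or .rda instead, use load() with the file path. Unlike readRDS(), load() does not return the data. It restores the object under whatever name it was saved with, so do not assign its result to vdem.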

CoW

The Correlates of War (CoW) data is exactly what the name suggests. It is a large collection of data containing information about conflicts between countries as well as conflicts involving non-state actors. The link is above. No account is needed, but there are a few different options for the data, so you can pick whichever fits what you’re trying to do. I’ve done a little bit with the intra-state war data (which you can find here), but I’ll admit it is pretty messy, so do with that what you will.
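One common source of messiness in data like this is missing values stored as placeholder numbers rather than left blank. The sketch below shows how read_csv() can turn a placeholder into NA while reading. Everything in it is made up for illustration: the toy file, the column names, and the choice of -9 as the placeholder are all assumptions, so check the codebook for the file you download before copying this.

```r
library(tidyverse)

# Toy stand-in for a downloaded file, invented purely for illustration
toy_file <- tempfile(fileext = ".csv")
write_lines(c("war_id,deaths", "1,500", "2,-9"), toy_file)

# Treat -9 as missing while reading (an assumption; confirm the
# placeholder values in the codebook before doing this on real data)
toy <- read_csv(toy_file, na = c("", "NA", "-9"), show_col_types = FALSE)

# count how many values in the column ended up missing
sum(is.na(toy$deaths))
```

Be careful with this approach if a placeholder number could also be a real value in some column. In that case it is safer to read the file as-is and recode one column at a time.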

Other Resources

Outside of the two data sources above, for IR there is also Polity, Penn World Tables, CINC scores, PRIO conflict data, the CIRI Human Rights Dataset (aka CIRIGHTS), and data from the World Bank and the IMF.