The original PDFs that Facebook released to Congress were processed to extract each post's metadata and image. You can find this code on GitHub if you would like to reuse it. The resulting CSV data was then loaded into the Omeka instance that you are currently looking at. Students then tagged each Facebook post using a codebook developed by the IRAds team.

The resulting augmented dataset is available for download here under a CC-BY license.

The zip file includes items.csv, which lists all the items and their respective metadata. It also includes tag-matrix.csv, a matrix of tag combinations that helps you see which tags co-occur the most. Although items.csv includes a column with the image URL, the PNG images are also bundled in the images directory of the zip file for ease of use.
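
As a rough illustration of how tag-matrix.csv might be used, the Python sketch below ranks the tag pairs that co-occur most often. The file layout is an assumption, not something documented here: the sketch expects a square matrix whose first row holds tag names after an empty corner cell, and whose remaining rows each start with a tag name followed by integer counts. The function names (load_tag_matrix, parse_tag_matrix, top_pairs) are hypothetical and chosen only for this example, so check the actual file and adjust before relying on it.

```python
import csv
from itertools import combinations


def parse_tag_matrix(rows):
    """Turn parsed CSV rows into (tags, counts).

    ASSUMED layout (not confirmed by the dataset documentation):
    row 0 is an empty corner cell followed by tag names; every later
    row is a tag name followed by integer co-occurrence counts.
    """
    tags = rows[0][1:]
    counts = {}
    for row in rows[1:]:
        row_tag = row[0]
        for col_tag, cell in zip(tags, row[1:]):
            # Treat empty cells as zero co-occurrences.
            counts[(row_tag, col_tag)] = int(cell or 0)
    return tags, counts


def load_tag_matrix(path):
    """Read the matrix from a CSV file on disk."""
    with open(path, newline="", encoding="utf-8") as f:
        return parse_tag_matrix(list(csv.reader(f)))


def top_pairs(tags, counts, n=10):
    """Return the n most frequent pairs of distinct tags as (count, a, b)."""
    # combinations() visits each unordered pair once and skips the diagonal.
    pairs = [(counts.get((a, b), 0), a, b) for a, b in combinations(tags, 2)]
    # Highest count first; ties are broken alphabetically for stable output.
    pairs.sort(key=lambda p: (-p[0], p[1], p[2]))
    return pairs[:n]
```

After unzipping the download, something like `tags, counts = load_tag_matrix("tag-matrix.csv")` followed by `top_pairs(tags, counts, 10)` would list the ten strongest pairings, provided the file matches the assumed layout.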

If you have any questions about the data or ideas for improving it, please contact:

Damien Pfister
Department of Communication
University of Maryland

Cite as:

Lindblad, P., Murphy, N., Pfister, D.S., Styer, M., Summers, E., and Yang, M. Internet Research Agency Ads Dataset. [data file]. Retrieved from