Evaluation Scales with Item Response Theory

This page contains code and data used in the following paper:


The dataset consists of response patterns collected using the Amazon Mechanical Turk crowdsourcing platform.

The zip file includes the data and a README.

License: The data is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License. The data is based on the Stanford SNLI project.

Download: zip file


The code used to generate the evaluation scales in the paper is written in R. R files are included for each of the 5 evaluation scales.

Download: code hosted on GitHub

Questions about the code or data? Contact me at lalor at cs dot umass dot edu.