Dataset for the EMNLP 2015 paper