Validation of experimental results through replication is central to the scientific process. In this paper we report on our efforts to replicate the central result of Bogdanova et al. (2015), "Detecting Semantically Equivalent Questions in Online User Forums", which reported results far surpassing the state of the art for the task of duplicate question detection. That effort led us to find a flaw in the data preprocessing of the original paper, one that casts doubt on the validity of the results reported there.
@InProceedings{SILVA18.7,
  author    = {João Silva and João António Rodrigues and Vladislav Maraev and Chakaveh Saedi and António Branco},
  title     = {A 20\% Jump in Duplicate Question Detection Accuracy? Replicating IBM team’s experiment and finding problems in its data preparation},
  booktitle = {Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)},
  year      = {2018},
  month     = {may},
  date      = {7-12},
  location  = {Miyazaki, Japan},
  editor    = {António Branco and Nicoletta Calzolari and Khalid Choukri},
  publisher = {European Language Resources Association (ELRA)},
  address   = {Paris, France},
  isbn      = {979-10-95546-21-4},
  language  = {english}
}