This repository contains all code used in our paper as well as additional material. Unfortunately, for legal reasons we are not allowed to share any of the datasets used, but we wanted to share our method nonetheless.
The distribution of fake news is not a new problem, but it is a rapidly growing one. The shift to news consumption via social media has been one of the drivers of the spread of misleading and deliberately wrong information, as, in addition to its ease of use, there is rarely any veracity monitoring. Due to the harmful effects of such fake news on society, detecting it has become increasingly important. We present an approach to the problem that leverages the power of transformer-based language models while simultaneously addressing one of their inherent limitations. Our framework, CMTR-BERT, combines multiple text representations with the goal of circumventing the sequence-length limits, and the related loss of information, that the underlying transformer architecture typically suffers from. Additionally, it enables the incorporation of contextual information. Extensive experiments on two very different, publicly available datasets demonstrate that our approach is able to set new state-of-the-art performance benchmarks. Apart from the benefit of using automatic text summarization techniques, we also find that the incorporation of contextual information contributes to performance gains.
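The core idea of condensing a long article so that an additional, information-dense representation fits within a transformer's input limit can be illustrated with a minimal frequency-based extractive summarizer. This is purely an illustration of the general technique: the function name, scoring scheme, and sentence splitting below are our own simplified assumptions, not the summarization models used in the paper.

```python
import re
from collections import Counter

def extractive_summary(text: str, max_sentences: int = 3) -> str:
    """Naive frequency-based extractive summarization (illustrative only).

    Scores each sentence by the average document-wide frequency of its
    words and keeps the top-scoring sentences in their original order.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    if len(sentences) <= max_sentences:
        return text.strip()
    # Document-wide word frequencies serve as a crude importance signal.
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(sentence: str) -> float:
        tokens = re.findall(r"[a-z']+", sentence.lower())
        return sum(freq[t] for t in tokens) / (len(tokens) or 1)

    # Rank sentences by score, then restore original document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)
```

In a pipeline like the one the paper describes, such a summary would be one of several text representations (alongside, e.g., the truncated full text and contextual information) fed into BERT-based encoders, so that salient content beyond the token limit is not simply cut off.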
Accepted at LREC 2022; the proceedings appeared in June 2022. To cite, use:
@inproceedings{hartl2022applying,
  title={Applying Automatic Text Summarization for Fake News Detection},
  author={Hartl, Philipp and Kruschwitz, Udo},
  booktitle={Proceedings of the Thirteenth Language Resources and Evaluation Conference},
  pages={2702--2713},
  year={2022}
}