Durham University

Computer Science


Publication details for Dr Steven Bradley

Gajbhiye, Amit, Jaf, Sardar, Al-Moubayed, Noura, Bradley, Steven & McGough, A. Stephen (2018), CAM: A Combined Attention Model for Natural Language Inference, in Abe, Naoki, Liu, Huan, Pu, Calton, Hu, Xiaohua, Ahmed, Nesreen, Qiao, Mu, Song, Yang, Kossmann, Donald, Liu, Bing, Lee, Kisung, Tang, Jiliang, He, Jingrui & Saltz, Jeffrey eds, IEEE International Conference on BIG DATA. Seattle, United States of America, IEEE, Piscataway, N.J., 1009-1014.


Natural Language Inference (NLI) is a fundamental
step towards natural language understanding. The task aims
to detect whether a premise entails or contradicts a given
hypothesis. NLI contributes to a wide range of natural language
understanding applications such as question answering,
text summarization and information extraction. Recently, the
public availability of large datasets such as Stanford Natural
Language Inference (SNLI) and SciTail has made it feasible
to train complex neural NLI models. In particular, Bidirectional
Long Short-Term Memory networks (BiLSTMs) with attention
mechanisms have shown promising performance for NLI. In
this paper, we propose a Combined Attention Model (CAM)
for NLI. CAM combines two attention mechanisms: intra-attention
and inter-attention. The model first captures the
semantics of the individual input premise and hypothesis with
intra-attention and then aligns the premise and hypothesis with
inter-sentence attention. We evaluate CAM on two benchmark
datasets: Stanford Natural Language Inference (SNLI) and
SciTail, achieving 86.14% accuracy on SNLI and 77.23% on
SciTail. Further, to investigate the effectiveness of each
attention mechanism individually and in combination, we
present an analysis showing that the intra- and inter-attention
mechanisms achieve higher accuracy when combined
than when used independently.
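
The two-stage idea described in the abstract (intra-attention within each sentence, then inter-attention between premise and hypothesis) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the scaled dot-product scoring, the random token vectors standing in for BiLSTM outputs, and the function names `intra_attention` and `inter_attention` are all assumptions made for illustration only.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def intra_attention(tokens):
    # Hypothetical intra-attention: every token attends to all tokens
    # of the same sentence (scaled dot-product scoring is an assumption).
    scores = tokens @ tokens.T / np.sqrt(tokens.shape[1])   # (n, n)
    weights = softmax(scores, axis=-1)                      # rows sum to 1
    return weights @ tokens                                 # (n, d)

def inter_attention(prem, hyp):
    # Hypothetical inter-attention: soft-align each sentence to the other.
    scores = prem @ hyp.T / np.sqrt(prem.shape[1])          # (n_p, n_h)
    prem_aligned = softmax(scores, axis=1) @ hyp            # (n_p, d)
    hyp_aligned = softmax(scores, axis=0).T @ prem          # (n_h, d)
    return prem_aligned, hyp_aligned

# Random vectors stand in for encoder outputs (assumption: 8-dim states).
rng = np.random.default_rng(0)
premise = rng.normal(size=(5, 8))      # 5 premise tokens
hypothesis = rng.normal(size=(3, 8))   # 3 hypothesis tokens

# Stage 1: intra-attention on each sentence separately.
p_ctx = intra_attention(premise)
h_ctx = intra_attention(hypothesis)

# Stage 2: inter-attention aligns the two sentences.
p_aligned, h_aligned = inter_attention(p_ctx, h_ctx)
```

In a full model, the aligned representations would typically be pooled and passed to a classifier that predicts the inference label; that stage is omitted here because the abstract does not specify it.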