Hierarchical gated recurrent neural network with adversarial and virtual adversarial training on text classification

Poon HK; Yap WS; Tee YK; Lee WK; Goi BM

doi:10.1016/j.neunet.2019.08.017

Poon HK ¹ , Yap WS ² , Tee YK ¹ , Lee WK ³ , Goi BM ¹

Affiliations

¹ Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Malaysia
² Lee Kong Chian Faculty of Engineering and Science, Universiti Tunku Abdul Rahman, Malaysia. Electronic address: yapws@utar.edu.my
³ Faculty of Information and Communication Technology, Universiti Tunku Abdul Rahman, Malaysia

Neural Netw, 2019 Nov;119:299-312.

PMID: 31499354 DOI: 10.1016/j.neunet.2019.08.017

Abstract

Document classification aims to assign one or more classes to a document based on its content, for ease of management. The hierarchical attention network (HAN) has been shown to be effective at classifying ambiguous documents. HAN parses information-intense documents into slices (i.e., words and sentences) such that each slice can be learned separately and in parallel before assigning the classes. However, the hierarchical attention approach introduces redundant training parameters, which makes the model prone to overfitting. To mitigate overfitting, we propose a variant of the hierarchical attention network that applies adversarial and virtual adversarial perturbations to 1) the word representation, 2) the sentence representation and 3) both word and sentence representations. The proposed variant is tested on eight publicly available datasets. The results show that the proposed variant outperforms the hierarchical attention network with and without random perturbation. More importantly, the proposed variant achieves state-of-the-art performance on multiple benchmark datasets. Visualizations and analysis are provided to show that perturbation can effectively alleviate the overfitting issue and improve the performance of the hierarchical attention network.
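
As an illustration of what an adversarial perturbation on a word representation can look like, the sketch below uses the commonly used normalized-gradient form, r_adv = epsilon * g / ||g||_2, where g is the gradient of the loss with respect to the embeddings. The abstract does not give the authors' exact formulation, so this form is an assumption, and the mean-pooled linear classifier, the function names and the value of `epsilon` are all hypothetical stand-ins rather than the hierarchical model itself. Virtual adversarial training, which does not use the label, is not shown.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def loss_and_grad(emb, w, b, y):
    """Binary cross-entropy of a toy mean-pooled linear classifier,
    plus its analytic gradient with respect to the word embeddings."""
    n = emb.shape[0]
    p = sigmoid(emb.mean(axis=0) @ w + b)
    loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    # d(loss)/d(logit) = p - y; each word row contributes w / n to the logit.
    grad = np.tile((p - y) * w / n, (n, 1))
    return loss, grad

def adversarial_perturbation(grad, epsilon):
    """Rescale the gradient so its overall L2 (Frobenius) norm equals epsilon."""
    return epsilon * grad / (np.linalg.norm(grad) + 1e-12)

rng = np.random.default_rng(0)
emb = rng.normal(size=(5, 8))  # toy document: 5 words, 8-dim embeddings
w = rng.normal(size=8)         # toy classifier weights (assumed)
b = 0.0
y = 1.0                        # toy binary label
epsilon = 0.5                  # perturbation budget (assumed value)

clean_loss, grad = loss_and_grad(emb, w, b, y)
r_adv = adversarial_perturbation(grad, epsilon)
adv_loss, _ = loss_and_grad(emb + r_adv, w, b, y)

print("clean loss:", clean_loss)
print("adversarial loss:", adv_loss)
```

In adversarial training, the loss on the perturbed input is typically added to the clean loss as a regularizer. The same rescaling could in principle be applied to sentence-level vectors instead of, or in addition to, word embeddings, which would correspond to the three settings listed in the abstract.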
