Title

Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition

Author

Yufan Jiang, Chi Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu

Abstract

In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing. In particular, we improve differentiable architecture search by removing the softmax-local constraint. Also, we apply differentiable NAS to named entity recognition (NER). It is the first time that differentiable NAS methods are adopted in NLP tasks other than language modeling. On both the PTB language modeling and CoNLL-2003 English NER data, our method outperforms strong baselines. It achieves a new state-of-the-art on the NER task.
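To make the abstract's key phrase concrete: in DARTS, each edge of the cell normalizes its architecture weights with its own softmax, so candidate operations only compete within a single edge (the "softmax-local constraint"); the paper instead normalizes jointly over all candidate operations on all edges entering a node, letting edges compete directly. Below is a minimal sketch of that contrast, not the authors' code; the helper names (`darts_node`, `idarts_node`, `alphas`, `ops`) and tensor layout are hypothetical.

```python
import torch
import torch.nn.functional as F

def darts_node(xs, alphas, ops):
    """Standard DARTS mixing for one node (softmax-local constraint).

    xs:     list of predecessor node outputs (tensors of equal shape)
    alphas: tensor [num_edges, num_ops] of architecture parameters
    ops:    per-edge lists of candidate operation modules
    """
    out = 0
    for i, x in enumerate(xs):
        # One softmax *per edge*: operations compete only within edge i.
        w = F.softmax(alphas[i], dim=-1)
        out = out + sum(w[o] * ops[i][o](x) for o in range(len(ops[i])))
    return out

def idarts_node(xs, alphas, ops):
    """Mixing with the local constraint removed (our reading of the paper)."""
    # One softmax over *all* (edge, op) pairs feeding this node,
    # so weights are comparable across edges as well as operations.
    w = F.softmax(alphas.flatten(), dim=-1).view_as(alphas)
    out = 0
    for i, x in enumerate(xs):
        out = out + sum(w[i, o] * ops[i][o](x) for o in range(len(ops[i])))
    return out
```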

Bib

@inproceedings{jiang-etal-2019-improved,
    title = "Improved Differentiable Architecture Search for Language Modeling and Named Entity Recognition",
    author = "Jiang, Yufan and Hu, Chi and Xiao, Tong and Zhang, Chunliang and Zhu, Jingbo",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)",
    month = nov,
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
    url = "https://www.aclweb.org/anthology/D19-1367",
    doi = "10.18653/v1/D19-1367",
    pages = "3585--3590",
    abstract = "In this paper, we study differentiable neural architecture search (NAS) methods for natural language processing. In particular, we improve differentiable architecture search by removing the softmax-local constraint. Also, we apply differentiable NAS to named entity recognition (NER). It is the first time that differentiable NAS methods are adopted in NLP tasks other than language modeling. On both the PTB language modeling and CoNLL-2003 English NER data, our method outperforms strong baselines. It achieves a new state-of-the-art on the NER task.",
}