Knowing Right from Wrong: Should We Use More Complex Models for Automatic Short-Answer Scoring in Bahasa Indonesia?
Ali Akbar Septiandri, Yosef Ardhito Winatmoko and Ilham Firdausi Putra
Rank and run-time aware compression of NLP Applications
Urmish Thakker, Jesse Beu, Dibakar Gope, Ganesh Dasika and Matthew Mattina
Incremental Neural Coreference Resolution in Constant Memory
Patrick Xia, João Sedoc and Benjamin Van Durme
Learning Informative Representations of Biomedical Relations with Latent Variable Models
Harshil Shah and Julien Fauqueur
End to End Binarized Neural Networks for Text Classification
Kumar Shridhar, Harshil Jain, Akshat Agarwal and Denis Kleyko
Large Product Key Memory for Pre-trained Language Models
Gyuwan Kim and Tae Hwan Jung
P-SIF: Document Embeddings using Partition Averaging
Vivek Gupta, Ankit Saw, Pegah Nokhiz, Praneeth Netrapalli, Piyush Rai and Partha Talukdar
Exploring the Boundaries of Low-Resource BERT Distillation
Moshe Wasserblat, Oren Pereg and Peter Izsak
Efficient Estimation of Influence of a Training Instance
Sosuke Kobayashi, Sho Yokoi, Jun Suzuki and Kentaro Inui
Efficient Inference For Neural Machine Translation
Yi-Te Hsu, Sarthak Garg, Yi-Hsiu Liao and Ilya Chatsviorkin
Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm
Alicia Tsai and Laurent El Ghaoui
Don’t Read Too Much Into It: Adaptive Computation for Open-Domain Question Answering
Yuxiang Wu, Pasquale Minervini, Pontus Stenetorp and Sebastian Riedel
A Two-stage Model for Slot Filling in Low-resource Settings: Domain-agnostic Non-slot Reduction and Pretrained Contextual Embeddings
Cennet Oguz and Ngoc Thang Vu
Early Exiting BERT for Efficient Document Ranking
Ji Xin, Rodrigo Nogueira, Yaoliang Yu and Jimmy Lin
Keyphrase Generation with GANs in Low-Resource Scenarios
Giuseppe Lancioni, Saida S. Mohamed, Beatrice Portelli, Giuseppe Serra and Carlo Tasso
Quasi-Multitask Learning: an Efficient Surrogate for Obtaining Model Ensembles
Norbert Kis-Szabó and Gábor Berend
A Little Bit Is Worse Than None: Ranking with Limited Training Data
Xinyu Zhang, Andrew Yates and Jimmy Lin
Predictive Model Selection for Transfer Learning in Sequence Labeling Tasks
Parul Awasthy, Bishwaranjan Bhattacharjee, John Kender and Radu Florian
Load What You Need: Smaller Versions of Multilingual BERT
Amine Abdaoui, Camille Pradel and Grégoire Sigel
SqueezeBERT: What can computer vision teach NLP about efficient neural networks?
Forrest Iandola, Albert Shaw, Ravi Krishna and Kurt Keutzer
Analysis of Resource-efficient Predictive Models for Natural Language Processing
Raj Pranesh and Ambesh Shekhar
Towards Accurate and Reliable Energy Measurement of NLP Models
Qingqing Cao, Aruna Balasubramanian and Niranjan Balasubramanian
FastFormers: Highly Efficient Transformer Models for Natural Language Understanding
Young Jin Kim and Hany Hassan
A comparison between CNNs and WFAs for Sequence Classification
Ariadna Quattoni and Xavier Carreras
Counterfactual Augmentation for Training Next Response Selection
Seungtaek Choi, Myeongho Jeong, Jinyoung Yeo and Seung-won Hwang
Do We Need to Create Big Datasets to Learn a Task?
Swaroop Mishra and Bhavdeep Singh Sachdeva
Doped Structured Matrices for Extreme Compression of LSTM Models
Urmish Thakker, Paul Whatmough, Zhi-Gang Liu, Matthew Mattina and Jesse Beu
Guiding Attention for Self-Supervised Learning with Transformers
Ameet Deshpande and Karthik Narasimhan
DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling
Jiecao Chen, Liu Yang, Karthik Raman, Michael Bendersky, Jung-Jung Yeh, Yun Zhou, Marc Najork, Danyang Cai and Ehsan Emadzadeh
Constrained Decoding for Computationally Efficient Named Entity Recognition Taggers
Brian Lester, Daniel Pressel, Amy Hemmeter, Sagnik Ray Choudhury and Srinivas Bangalore
Unsupervised Few-Bits Semantic Hashing with Implicit Topics Modeling
Fanghua Ye, Jarana Manotumruksa and Emine Yilmaz
OptSLA: an Optimization-Based Approach for Sequential Label Aggregation
Nasim Sabetpour, Adithya Kulkarni and Qi Li
Improve Transformer Models with Better Relative Position Embeddings
Zhiheng Huang, Davis Liang, Peng Xu and Bing Xiang
General Purpose Text Embeddings from Pre-trained Language Models for Scalable Inference
Jingfei Du, Myle Ott, Haoran Li, Xing Zhou and Veselin Stoyanov
Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation
Alessandro Raganato, Yves Scherrer and Jörg Tiedemann
Pruning Redundant Mappings in Transformer Models via Spectral-Normalized Identity Prior
Zi Lin, Jeremiah Zhe Liu, Zi Yang, Nan Hua and Dan Roth
Improving QA Generalization by Concurrent Modeling of Multiple Biases
Mingzhu Wu, Nafise Sadat Moosavi, Andreas Rücklé and Iryna Gurevych
PBoS: Probabilistic Bag-of-Subwords for Generalizing Word Embedding
Zhao Jinman, Shawn Zhong, Xiaomin Zhang and Yingyu Liang
Identifying Spurious Correlations for Robust Text Classification
Zhao Wang and Aron Culotta
Extremely Low Bit Transformer Quantization for On-Device Neural Machine Translation
Insoo Chung, Byeongwook Kim, Yoonjung Choi, Se Jung Kwon, Yongkweon Jeon, Baeseong Park, Sangha Kim and Dongsoo Lee
Blockwise Self-Attention for Long Document Understanding
Jiezhong Qiu, Hao Ma, Omer Levy, Scott Wen-tau Yih, Sinong Wang and Jie Tang
Multi-hop Question Generation with Graph Convolutional Network
Dan Su, Yan Xu, Wenliang Dai, Ziwei Ji, Tiezheng Yu and Pascale Fung
Inexpensive Domain Adaptation of Pretrained Language Models: Case Studies on Biomedical NER and Covid-19 QA
Nina Poerner, Ulli Waltinger and Hinrich Schütze
TopicBERT for Energy Efficient Document Classification
Yatin Chaudhary, Pankaj Gupta, Khushbu Saxena, Vivek Kulkarni, Thomas Runkler and Hinrich Schütze
Enhance Robustness of Sequence Labelling with Masked Adversarial Training
Luoxin Chen, Xinyue Liu, Weitong Ruan and Jianhua Lu
Domain Adversarial Fine-Tuning as an Effective Regularizer
Giorgos Vernikos, Katerina Margatina, Alexandra Chronopoulou and Ion Androutsopoulos
SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy
Umanga Bista, Alexander Patrick Mathews, Aditya Krishna Menon and Lexing Xie
Understanding tables with intermediate pre-training
Julian Martin Eisenschlos, Syrine Krichene and Thomas Müller
Probabilistic Case-based Reasoning for Open-World Knowledge Graph Completion
Rajarshi Das, Ameya Godbole, Nicholas Monath, Manzil Zaheer and Andrew McCallum
Semi-supervised Formality Style Transfer using Language Model Discriminator and Mutual Information Maximization
Kunal Chawla and Diyi Yang