COLING 2018 Accepted Papers

Here is the list of papers accepted at COLING 2018, to appear in Santa Fe. The list was held back until the best paper process was complete, so that the awards could be selected without committee members knowing the identity of the paper authors.

Congratulations to all authors of accepted papers; we look forward to seeing you in New Mexico!

  • A Comparison of Transformer and Recurrent Neural Networks on Multilingual Neural Machine Translation – Surafel Melaku Lakew, Mauro Cettolo and Marcello Federico.
  • A Computational Model for the Linguistic Notion of Morphological Paradigm – Miikka Silfverberg, Ling Liu and Mans Hulden.
  • A Knowledge-Augmented Neural Network Model for Implicit Discourse Relation Classification – Yudai Kishimoto, Yugo Murawaki and Sadao Kurohashi.
  • A Lexicon-Based Supervised Attention Model for Neural Sentiment Analysis – Yicheng Zou, Tao Gui, Qi Zhang and Xuanjing Huang.
  • A Multi-Attention based Neural Network with External Knowledge for Story Ending Predicting Task – Qian Li, Ziwei Li, Jin-Mao Wei, Yanhui Gu, Adam Jatowt and Zhenglu Yang.
  • A New Approach to Animacy Detection – Labiba Jahan, Geeticka Chauhan and Mark Finlayson.
  • A New Concept of Deep Reinforcement Learning based Augmented General Tagging System – Yu Wang, Abhishek Patel and Hongxia Jin.
  • A Position-aware Bidirectional Attention Network for Aspect-level Sentiment Analysis – Shuqin Gu, Lipeng Zhang, Yuexian Hou and Yin Song.
  • A Practical Incremental Learning Framework For Sparse Entity Extraction – Hussein Al-Olimat, Steven Gustafson, Jason Mackay, Krishnaprasad Thirunarayan and Amit Sheth.
  • A Prospective-Performance Network to Alleviate Myopia in Beam Search for Response Generation – Zongsheng Wang, Yunzhi Bai, Bowen Wu, Zhen Xu, Zhuoran Wang and Baoxun Wang.
  • A Reinforcement Learning Framework for Natural Question Generation using Bi-discriminators – Zhihao Fan, Zhongyu Wei, Siyuan Wang, Yang Liu and Xuanjing Huang.
  • A Retrospective Analysis of the Fake News Challenge Stance-Detection Task – Andreas Hanselowski, Avinesh PVS, Benjamin Schiller, Felix Caspelherr, Debanjan Chaudhuri, Christian M. Meyer and Iryna Gurevych.
  • Ab Initio: Automatic Latin Proto-word Reconstruction – Alina Maria Ciobanu and Liviu P. Dinu.
  • Abstract Meaning Representation for Multi-Document Summarization – Kexin Liao, Logan Lebanoff and Fei Liu.
  • Abstractive Unsupervised Multi-Document Summarization using Paraphrastic Sentence Fusion – Mir Tafseer Nayeem, Tanvir Ahmed Fuad and Yllias Chali.
  • Adopting the Word-Pair-Dependency-Triplets with Individual Comparison for Natural Language Inference – Qianlong Du, Chengqing Zong and Keh-Yih Su.
  • Adversarial Domain Adaptation for Variational Neural Language Generation in Dialogue Systems – Van-Khanh Tran and Le-Minh Nguyen.
  • Adversarial Multi-lingual Neural Relation Extraction – Xiaozhi Wang, Xu Han, Yankai Lin, Zhiyuan Liu and Maosong Sun.
  • Aff2Vec: Affect–Enriched Distributional Word Representations – Sopan Khosla, Niyati Chhaya and Kushal Chawla.
  • All-in-one: Multi-task Learning for Rumour Verification – Elena Kochkina, Maria Liakata and Arkaitz Zubiaga.
  • An Attribute Enhanced Domain Adaptive Model for Cold-Start Spam Review Detection – Zhenni You, Tieyun Qian and Bing Liu.
  • An Empirical Study on Fine-Grained Named Entity Recognition – Khai Mai, Thai-Hoang Pham, Minh Trung Nguyen, Nguyen Tuan Duc, Danushka Bollegala, Ryohei Sasano and Satoshi Sekine.
  • An Exploration of Three Lightly-supervised Representation Learning Approaches for Named Entity Classification – Ajay Nagesh and Mihai Surdeanu.
  • AnlamVer: Semantic Model Evaluation Dataset for Turkish – Word Similarity and Relatedness – Gökhan Ercan and Olcay Taner Yıldız.
  • Answerable or Not: Devising a Dataset for Extending Machine Reading Comprehension – Mao Nakanishi, Tetsunori Kobayashi and Yoshihiko Hayashi.
  • Ask No More: Deciding when to guess in referential visual dialogue – Ravi Shekhar, Tim Baumgärtner, Aashish Venkatesh, Elia Bruni, Raffaella Bernardi and Raquel Fernández.
  • Aspect and Sentiment Aware Abstractive Review Summarization – Min Yang, Qiang Qu, Ying Shen, Qiao Liu, Wei Zhao and Jia Zhu.
  • Aspect-based summarization of pros and cons in unstructured product reviews – Florian Kunneman, Sander Wubben, Antal van den Bosch and Emiel Krahmer.
  • Assessing Composition in Sentence Vector Representations – Allyson Ettinger, Ahmed Elgohary, Colin Phillips and Philip Resnik.
  • Attending Sentences to detect Satirical Fake News – Sohan De Sarkar, Fan Yang and Arjun Mukherjee.
  • Authorless Topic Models: Biasing Models Away from Known Structure – Laure Thompson and David Mimno.
  • Authorship Attribution By Consensus Among Multiple Features – Jagadeesh Patchala and Raj Bhatnagar.
  • Automated Fact Checking: Task Formulations, Methods and Future Directions – James Thorne and Andreas Vlachos.
  • Automated Scoring: Beyond Natural Language Processing – Nitin Madnani and Aoife Cahill.
  • Automatic Detection of Fake News – Verónica Pérez-Rosas, Bennett Kleinberg, Alexandra Lefevre and Rada Mihalcea.
  • Bridge Video and Text with Cascade Syntactic Structure – Guolong Wang, Zheng Qin, Kaiping Xu, Kai Huang and Shuxiong Ye.
  • Bringing replication and reproduction together with generalisability in NLP: Three reproduction studies for Target Dependent Sentiment Analysis – Andrew Moore and Paul Rayson.
  • Can Rumour Stance Alone Predict Veracity? – Sebastian Dungs, Ahmet Aker, Norbert Fuhr and Kalina Bontcheva.
  • CASCADE: Contextual Sarcasm Detection in Online Discussion Forums – Devamanyu Hazarika, Soujanya Poria, Sruthi Gorantla, Erik Cambria, Roger Zimmermann and Rada Mihalcea.
  • Challenges and Opportunities of Applying Natural Language Processing in Business Process Management – Han Van der Aa, Josep Carmona, Henrik Leopold, Jan Mendling and Lluís Padró.
  • Challenges of language technologies for the indigenous languages of the Americas – Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra and Ivan Meza-Ruiz.
  • Context-Sensitive Generation of Open-Domain Conversational Responses – Wei-Nan Zhang, Yiming Cui, Yifa Wang, Qingfu Zhu, Lingzhi Li, Lianqiang Zhou and Ting Liu.
  • Contextual String Embeddings for Sequence Labeling – Alan Akbik, Duncan Blythe and Roland Vollgraf.
  • Cooperative Denoising for Distantly Supervised Relation Extraction – Kai Lei, Daoyuan Chen, Yaliang Li, Nan Du, Min Yang, Wei Fan and Ying Shen.
  • Cross-lingual Argumentation Mining: Machine Translation (and a bit of Projection) is All You Need! – Steffen Eger, Johannes Daxenberger, Christian Stab and Iryna Gurevych.
  • Deep Enhanced Representation for Implicit Discourse Relation Recognition – Hongxiao Bai and Hai Zhao.
  • Dependent Gated Reading for Cloze-Style Question Answering – Reza Ghaeini, Xiaoli Fern, Hamed Shahbazi and Prasad Tadepalli.
  • Design Challenges and Misconceptions in Neural Sequence Labeling – Jie Yang, Shuailong Liang and Yue Zhang.
  • Design Challenges in Named Entity Transliteration – Yuval Merhav and Stephen Ash.
  • Dialogue-act-driven Conversation Model: An Experimental Study – Harshit Kumar, Arvind Agarwal and Sachindra Joshi.
  • Distance-Free Modeling of Multi-Predicate Interactions in End-to-End Japanese Predicate-Argument Structure Analysis – Yuichiroh Matsubayashi and Kentaro Inui.
  • Distinguishing affixoid formations from compounds – Josef Ruppenhofer, Michael Wiegand, Rebecca Wilm and Katja Markert.
  • Does Higher Order LSTM Have Better Accuracy for Segmenting and Labeling Sequence Data? – Yi Zhang, Xu Sun, Shuming Ma, Yang Yang and Xuancheng Ren.
  • Dynamic Multi-Level Multi-Task Learning for Sentence Simplification – Han Guo, Ramakanth Pasunuru and Mohit Bansal.
  • Effective Attention Modeling for Aspect-Level Sentiment Classification – Ruidan He, Wee Sun Lee, Hwee Tou Ng and Daniel Dahlmeier.
  • Embedding Words as Distributions with a Bayesian Skip-gram Model – Arthur Bražinskas, Serhii Havrylov and Ivan Titov.
  • Emotion Detection and Classification in a Multigenre Corpus with Joint Multi-Task Deep Learning – Shabnam Tafreshi and Mona Diab.
  • Emotion Representation Mapping for Automatic Lexicon Construction (Mostly) Performs on Human Level – Sven Buechel and Udo Hahn.
  • Employing Text Matching Network to Recognise Nuclearity in Chinese Discourse – Sheng Xu, Peifeng Li, Guodong Zhou and Qiaoming Zhu.
  • Enhanced Aspect Level Sentiment Classification with Auxiliary Memory – Peisong Zhu and Tieyun Qian.
  • Enhancing Sentence Embedding with Generalized Pooling – Qian Chen, Zhen-Hua Ling and Xiaodan Zhu.
  • Exploiting Structure in Representation of Named Entities using Active Learning – Nikita Bhutani, Kun Qian, Yunyao Li, H. V. Jagadish, Mauricio Hernandez and Mitesh Vasa.
  • Exploiting Syntactic Structures for Humor Recognition – Lizhen Liu, Donghai Zhang and Wei Song.
  • Exploratory Neural Relation Classification for Domain Knowledge Acquisition – Yan Fan, Chengyu Wang and Xiaofeng He.
  • Exploring the Influence of Spelling Errors on Lexical Variation Measures – Ryo Nagata, Taisei Sato and Hiroya Takamura.
  • Expressively vulgar: The socio-dynamics of vulgarity and its effects on sentiment analysis in social media – Isabel Cachola, Eric Holgate, Daniel Preoţiuc-Pietro and Junyi Jessy Li.
  • Extracting Parallel Sentences with Bidirectional Recurrent Neural Networks to Improve Machine Translation – Francis Grégoire and Philippe Langlais.
  • Extractive Headline Generation Based on Learning to Rank for Community Question Answering – Tatsuru Higurashi, Hayato Kobayashi, Takeshi Masuyama and Kazuma Murao.
  • Folksonomication: Predicting Tags for Movies from Plot Synopses using Emotion Flow Encoded Neural Network – Sudipta Kar, Suraj Maharjan and Thamar Solorio.
  • From Text to Lexicon: Bridging the Gap between Word Embeddings and Lexical Resources – Ilia Kuznetsov and Iryna Gurevych.
  • Fusing Recency into Neural Machine Translation with an Inter-Sentence Gate Model – Shaohui Kuang and Deyi Xiong.
  • GenSense: A Generalized Sense Retrofitting Model – Yang-Yin Lee, Ting-Yu Yen, Hen-Hsen Huang, Yow-Ting Shiue and Hsin-Hsi Chen.
  • Graphene: Semantically-Linked Propositions in Open Information Extraction – Matthias Cetto, Christina Niklaus, André Freitas and Siegfried Handschuh.
  • Grounded Textual Entailment – Hoa Vu, Claudio Greco, Aliia Erofeeva, Somayeh Jafaritazehjan, Guido Linders, Marc Tanti, Alberto Testoni, Raffaella Bernardi and Albert Gatt.
  • How emotional are you? Neural Architectures for Emotion Intensity Prediction in Microblogs – Devang Kulshreshtha, Pranav Goel and Anil Kumar Singh.
  • Hybrid Attention based Multimodal Network for Spoken Language Classification – Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li and Ivan Marsic.
  • Implicit Discourse Relation Recognition using Neural Tensor Network with Interactive Attention and Sparse Learning – Fengyu Guo, Ruifang He, Di Jin, Jianwu Dang, Longbiao Wang and Xiangang Li.
  • Improving Neural Machine Translation by Incorporating Hierarchical Subword Features – Makoto Morishita, Jun Suzuki and Masaaki Nagata.
  • Integrating Question Classification and Deep Learning for improved Answer Selection – Harish Tayyar Madabushi, Mark Lee and John Barnden.
  • Investigating Productive and Receptive Knowledge: A Profile for Second Language Learning – Leonardo Zilio, Rodrigo Wilkens and Cédrick Fairon.
  • Joint Modeling of Structure Identification and Nuclearity Recognition in Macro Chinese Discourse Treebank – Xiaomin Chu, Feng Jiang, Yi Zhou, Guodong Zhou and Qiaoming Zhu.
  • Knowledge as A Bridge: Improving Cross-domain Answer Selection with External Knowledge – Yang Deng, Ying Shen, Min Yang, Yaliang Li, Nan Du, Wei Fan and Kai Lei.
  • Learning Features from Co-occurrences: A Theoretical Analysis – Yanpeng Li.
  • Learning from Measurements in Crowdsourcing Models: Inferring Ground Truth from Diverse Annotation Types – Paul Felt, Eric Ringger, Kevin Seppi and Jordan Boyd-Graber.
  • Learning Sentiment Composition from Sentiment Lexicons – Orith Toledo-Ronen, Roy Bar-Haim, Alon Halfon, Charles Jochim, Amir Menczel, Ranit Aharonov and Noam Slonim.
  • Learning Target-Specific Representations of Financial News Documents For Cumulative Abnormal Return Prediction – Junwen Duan, Yue Zhang, Xiao Ding, Ching-Yun Chang and Ting Liu.
  • Learning to Generate Word Representations using Subword Information – Yeachan Kim, Kang-Min Kim, Ji-Min Lee and SangKeun Lee.
  • Learning Word Meta-Embeddings by Autoencoding – Danushka Bollegala and Cong Bao.
  • Low-resource Cross-lingual Event Type Detection via Distant Supervision with Minimal Effort – Aldrian Obaja Muis, Naoki Otani, Nidhi Vyas, Ruochen Xu, Yiming Yang, Teruko Mitamura and Eduard Hovy.
  • Lyrics Segmentation: Textual Macrostructure Detection using Convolutions – Michael Fell, Yaroslav Nechaev, Elena Cabrio and Fabien Gandon.
  • Measuring the Diversity of Automatic Image Descriptions – Emiel van Miltenburg, Desmond Elliott and Piek Vossen.
  • Model-Free Context-Aware Word Composition – Bo An, Xianpei Han and Le Sun.
  • Modeling Coherence for Neural Machine Translation with Dynamic and Topic Caches – Shaohui Kuang, Deyi Xiong, Weihua Luo and Guodong Zhou.
  • Modeling Semantics with Gated Graph Neural Networks for Knowledge Base Question Answering – Daniil Sorokin and Iryna Gurevych.
  • Modeling with Recurrent Neural Networks for Open Vocabulary Slots – Jun-Seong Kim, Junghoe Kim, SeungUn Park, Kwangyong Lee and Yoonju Lee.
  • Multilevel Heuristics for Rationale-Based Entity Relation Classification in Sentences – Shiou Tian Hsu, Mandar Chaudhary and Nagiza Samatova.
  • Multilingual Neural Machine Translation with Task-Specific Attention – Graeme Blackwood, Miguel Ballesteros and Todd Ward.
  • Multimodal Grounding for Language Processing – Lisa Beinborn, Teresa Botschen and Iryna Gurevych.
  • Neural Activation Semantic Models: Computational lexical semantic models of localized neural activations – Nikos Athanasiou, Elias Iosif and Alexandros Potamianos.
  • Neural Collective Entity Linking – Yixin Cao, Lei Hou, Juanzi Li and Zhiyuan Liu.
  • Neural Machine Translation with Decoding History Enhanced Attention – Mingxuan Wang.
  • Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering – Wuwei Lan and Wei Xu.
  • Neural Relation Classification with Text Descriptions – Feiliang Ren, Di Zhou, Zhihui Liu, Yongcheng Li, Rongsheng Zhao, Yongkang Liu and Xiaobo Liang.
  • Neural Transition-based String Transduction for Limited-Resource Setting in Morphology – Peter Makarov and Simon Clematide.
  • Novelty Goes Deep. A Deep Neural Solution To Document Level Novelty Detection – Tirthankar Ghosal, Vignesh Edithal, Asif Ekbal, Pushpak Bhattacharyya, Srinivasa Satya Sameer Kumar Chivukula and George Tsatsaronis.
  • On Adversarial Examples for Character-Level Neural Machine Translation – Javid Ebrahimi, Daniel Lowd and Dejing Dou.
  • One-shot Learning for Question-Answering in Gaokao History Challenge – Zhuosheng Zhang and Hai Zhao.
  • Open Information Extraction from Conjunctive Sentences – Swarnadeep Saha and Mausam.
  • Open Information Extraction on Scientific Text: An Evaluation – Paul Groth, Mike Lauruhn, Antony Scerri and Ron Daniel, Jr.
  • Pattern-revising Enhanced Simple Question Answering over Knowledge Bases – Yanchao Hao, Hao Liu, Shizhu He, Kang Liu and Jun Zhao.
  • Personalized Text Retrieval for Learners of Chinese as a Foreign Language – Chak Yan Yeung and John Lee.
  • Predicting Stances from Social Media Posts using Factorization Machines – Akira Sasaki, Kazuaki Hanawa, Naoaki Okazaki and Kentaro Inui.
  • Punctuation as Native Language Interference – Ilia Markov, Vivi Nastase and Carlo Strapparava.
  • Quantifying training challenges of dependency parsers – Lauriane Aufrant, Guillaume Wisniewski and François Yvon.
  • Recognizing Humour using Word Associations and Humour Anchor Extraction – Andrew Cattle and Xiaojuan Ma.
  • Recurrent One-Hop Predictions for Reasoning over Knowledge Graphs – Wenpeng Yin, Yadollah Yaghoobzadeh and Hinrich Schütze.
  • Relation Induction in Word Embeddings Revisited – Zied Bouraoui, Shoaib Jameel and Steven Schockaert.
  • Representations and Architectures in Neural Sentiment Analysis for Morphologically Rich Languages: A Case Study from Modern Hebrew – Adam Amram, Anat Ben-David and Reut Tsarfaty.
  • Rethinking the Agreement in Human Evaluation Tasks – Jacopo Amidei, Paul Piwek and Alistair Willis.
  • RNN Simulations of Grammaticality Judgments on Long-distance Dependencies – Shammur Absar Chowdhury and Roberto Zamparelli.
  • Self-Normalization Properties of Language Modeling – Jacob Goldberger and Oren Melamud.
  • Semi-Supervised Disfluency Detection – Feng Wang, Zhen Yang, Wei Chen, Shuang Xu, Bo Xu and Qianqian Dong.
  • Semi-Supervised Lexicon Learning for Wide-Coverage Semantic Parsing – Bo Chen, Bo An, Le Sun and Xianpei Han.
  • Sequence-to-Sequence Data Augmentation for Dialogue Language Understanding – Yutai Hou, Yijia Liu, Wanxiang Che and Ting Liu.
  • SGM: Sequence Generation Model for Multi-label Classification – Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma, Wei Wu and Houfeng Wang.
  • Simple Algorithms For Sentiment Analysis On Sentiment Rich, Data Poor Domains. – Prathusha K Sarma and William Sethares.
  • Sprucing up the trees – Error detection in treebanks – Ines Rehbein and Josef Ruppenhofer.
  • Stress Test Evaluation for Natural Language Inference – Aakanksha Naik, Abhilasha Ravichander, Norman Sadeh, Carolyn Rose and Graham Neubig.
  • Structure-Infused Copy Mechanisms for Abstractive Summarization – Kaiqiang Song, Lin Zhao and Fei Liu.
  • Structured Dialogue Policy with Graph Neural Networks – Lu Chen, Bowen Tan, Sishan Long and Kai Yu.
  • Subword-augmented Embedding for Cloze Reading Comprehension – Zhuosheng Zhang, Yafang Huang and Hai Zhao.
  • Systematic Study of Long Tail Phenomena in Entity Linking – Filip Ilievski, Piek Vossen and Stefan Schlobach.
  • The Road to Success: Assessing the Fate of Linguistic Innovations in Online Communities – Marco Del Tredici and Raquel Fernández.
  • They Exist! Introducing Plural Mentions to Coreference Resolution and Entity Linking – Ethan Zhou and Jinho D. Choi.
  • Topic or Style? Exploring the Most Useful Features for Authorship Attribution – Yunita Sari, Mark Stevenson and Andreas Vlachos.
  • Towards a unified framework for bilingual terminology extraction of single-word and multi-word terms – Jingshu Liu, Emmanuel Morin and Peña Saldarriaga.
  • Towards identifying the optimal datasize for lexically-based Bayesian inference of linguistic phylogenies – Taraka Rama and Søren Wichmann.
  • Transition-based Neural RST Parsing with Implicit Syntax Features – Nan Yu, Meishan Zhang and Guohong Fu.
  • Treat us like the sequences we are: Prepositional Paraphrasing of Noun Compounds using LSTM – Girishkumar Ponkiya, Kevin Patel, Pushpak Bhattacharyya and Girish Palshikar.
  • Triad-based Neural Network for Coreference Resolution – Yuanliang Meng and Anna Rumshisky.
  • Two Local Models for Neural Constituent Parsing – Zhiyang Teng and Yue Zhang.
  • Unsupervised Morphology Learning with Statistical Paradigms – Hongzhi Xu, Mitchell Marcus, Charles Yang and Lyle Ungar.
  • Using J-K-fold Cross Validation To Reduce Variance When Tuning NLP Models – Henry Moss, David Leslie and Paul Rayson.
  • Variational Attention for Sequence-to-Sequence Models – Hareesh Bahuleyan, Lili Mou, Olga Vechtomova and Pascal Poupart.
  • What represents “style” in authorship attribution? – Kalaivani Sundararajan and Damon Woodard.
  • Who is Killed by Police: Introducing Supervised Attention for Hierarchical LSTMs – Minh Nguyen and Thien Nguyen.
  • Word-Level Loss Extensions for Neural Temporal Relation Classification – Artuur Leeuwenberg and Marie-Francine Moens.
  • Zero Pronoun Resolution with Attention-based Neural Network – Qingyu Yin, Yu Zhang, Wei-Nan Zhang, Ting Liu and William Yang Wang.
  • A Dataset for Building Code-Mixed Goal Oriented Conversation Systems – Suman Banerjee, Nikita Moghe, Siddhartha Arora and Mitesh M. Khapra.
  • A Deep Dive into Word Sense Disambiguation with LSTM – Minh Le, Marten Postma, Jacopo Urbani and Piek Vossen.
  • A Full End-to-End Semantic Role Labeler, Syntactic-agnostic Over Syntactic-aware? – Jiaxun Cai, Shexia He, Zuchao Li and Hai Zhao.
  • A LSTM Approach with Sub-Word Embeddings for Mongolian Phrase Break Prediction – Rui Liu, Feilong Bao, Guanglai Gao, Hui Zhang and Yonghe Wang.
  • A Neural Question Answering Model Based on Semi-Structured Tables – Hao Wang, Xiaodong Zhang, Shuming Ma, Xu Sun, Houfeng Wang and Mengxiang Wang.
  • A Nontrivial Sentence Corpus for the Task of Sentence Readability Assessment in Portuguese – Sidney Evaldo Leal, Magali Sanches Duran and Sandra Maria Aluísio.
  • A Pseudo Label based Dataless Naive Bayes Algorithm for Text Classification with Seed Words – Ximing Li and Bo Yang.
  • A Reassessment of Reference-Based Grammatical Error Correction Metrics – Shamil Chollampatt and Hwee Tou Ng.
  • A review of Spanish corpora annotated with negation – Salud María Jiménez-Zafra, Roser Morante, Maite Martin and L. Alfonso Urena Lopez.
  • A Review on Deep Learning Techniques Applied to Answer Selection – Tuan Manh Lai, Trung Bui and Sheng Li.
  • A Survey of Domain Adaptation for Neural Machine Translation – Chenhui Chu and Rui Wang.
  • A Survey on Open Information Extraction – Christina Niklaus, Matthias Cetto, André Freitas and Siegfried Handschuh.
  • A Survey on Recent Advances in Named Entity Recognition from Deep Learning models – Vikas Yadav and Steven Bethard.
  • Adaptive Learning of Local Semantic and Global Structure Representations for Text Classification – Jianyu Zhao, Zhiqiang Zhan, Qichuan Yang, Yang Zhang, Changjian Hu, Zhensheng Li, Liuxin Zhang and Zhiqiang He.
  • Adaptive Multi-Task Transfer Learning for Chinese Word Segmentation in Medical Text – Junjie Xing, Kenny Zhu and Shaodian Zhang.
  • Adaptive Weighting for Neural Machine Translation – Yachao Li, Junhui Li and Min Zhang.
  • Addressee and Response Selection for Multilingual Conversation – Motoki Sato, Hiroki Ouchi and Yuta Tsuboi.
  • Adversarial Feature Adaptation for Cross-lingual Relation Classification – Bowei Zou, Zengzhuang Xu, Yu Hong and Guodong Zhou.
  • AMR Beyond the Sentence: the Multi-sentence AMR corpus – Tim O’Gorman, Michael Regan, Kira Griffitt, Martha Palmer, Ulf Hermjakob and Kevin Knight.
  • An Analysis of Annotated Corpora for Emotion Classification in Text – Laura Ana Maria Bostan and Roman Klinger.
  • An Empirical Investigation of Error Types in Vietnamese Parsing – Quy Nguyen, Yusuke Miyao, Hiroshi Noji and Nhung Nguyen.
  • An Evaluation of Neural Machine Translation Models on Historical Spelling Normalization – Gongbo Tang, Fabienne Cap, Eva Pettersson and Joakim Nivre.
  • An Interpretable Reasoning Network for Multi-Relation Question Answering – Mantong Zhou, Minlie Huang and Xiaoyan Zhu.
  • An Operation Network for Abstractive Sentence Compression – Naitong Yu, Jie Zhang, Minlie Huang and Xiaoyan Zhu.
  • Ant Colony System for Multi-Document Summarization – Asma Al-Saleh and Mohamed El Bachir Menai.
  • Argumentation Synthesis following Rhetorical Strategies – Henning Wachsmuth, Manfred Stede, Roxanne El Baff, Khalid Al Khatib, Maria Skeppstedt and Benno Stein.
  • Arguments and Adjuncts in Universal Dependencies – Adam Przepiórkowski and Agnieszka Patejuk.
  • Arrows are the Verbs of Diagrams – Malihe Alikhani and Matthew Stone.
  • Assessing Quality Estimation Models for Sentence-Level Prediction – Hoang Cuong and Jia Xu.
  • Attributed and Predictive Entity Embedding for Fine-Grained Entity Typing in Knowledge Bases – Hailong Jin, Lei Hou, Juanzi Li and Tiansi Dong.
  • Author Profiling for Abuse Detection – Pushkar Mishra, Marco Del Tredici, Helen Yannakoudakis and Ekaterina Shutova.
  • Authorship Identification for Literary Book Recommendations – Haifa Alharthi, Diana Inkpen and Stan Szpakowicz.
  • Automatic Assessment of Conceptual Text Complexity Using Knowledge Graphs – Sanja Štajner and Ioana Hulpus.
  • Automatically Creating a Lexicon of Verbal Polarity Shifters: Mono- and Cross-lingual Methods for German – Marc Schulder, Michael Wiegand and Josef Ruppenhofer.
  • Automatically Extracting Qualia Relations for the Rich Event Ontology – Ghazaleh Kazeminejad, Claire Bonial, Susan Windisch Brown and Martha Palmer.
  • Bridging resolution: Task definition, corpus resources and rule-based experiments – Ina Roesiger, Arndt Riester and Jonas Kuhn.
  • Butterfly Effects in Frame Semantic Parsing: impact of data processing on model ranking – Alexandre Kabbach, Corentin Ribeyre and Aurélie Herbelot.
  • Can Taxonomy Help? Improving Semantic Question Matching using Question Taxonomy – Deepak Gupta, Rajkumar Pujari, Asif Ekbal, Pushpak Bhattacharyya, Anutosh Maitra, Tom Jain and Shubhashis Sengupta.
  • Character-Level Feature Extraction with Densely Connected Networks – Chanhee Lee, Young-Bum Kim, Dongyub Lee and Heuiseok Lim.
  • Clausal Modifiers in the Grammar Matrix – Kristen Howell and Olga Zamaraeva.
  • Combining Information-Weighted Sequence Alignment and Sound Correspondence Models for Improved Cognate Detection – Johannes Dellert.
  • Convolutional Neural Network for Universal Sentence Embeddings – Xiaoqi Jiao, Fang Wang and Dan Feng.
  • Corpus-based Content Construction – Balaji Vasan Srinivasan, Pranav Maneriker, Kundan Krishna and Natwar Modani.
  • Correcting Chinese Word Usage Errors for Learning Chinese as a Second Language – Yow-Ting Shiue, Hen-Hsen Huang and Hsin-Hsi Chen.
  • Cross-lingual Knowledge Projection Using Machine Translation and Target-side Knowledge Base Completion – Naoki Otani, Hirokazu Kiyomaru, Daisuke Kawahara and Sadao Kurohashi.
  • Cross-media User Profiling with Joint Textual and Social User Embedding – Jingjing Wang, Shoushan Li, Mingqi Jiang, Hanqian Wu and Guodong Zhou.
  • Crowdsourcing a Large Corpus of Clickbait on Twitter – Martin Potthast, Tim Gollub, Kristof Komlossy, Sebastian Schuster, Matti Wiegmann, Erika Patricia Garces Fernandez, Matthias Hagen and Benno Stein.
  • Deconvolution-Based Global Decoding for Neural Machine Translation – Junyang Lin, Xu Sun, Xuancheng Ren, Shuming Ma, Jinsong Su and Qi Su.
  • Deep Neural Networks at the Service of Multilingual Parallel Sentence Extraction – Ahmad Aghaebrahimian.
  • deepQuest: A Framework for Neural-based Quality Estimation – Julia Ive, Frédéric Blain and Lucia Specia.
  • Diachronic word embeddings and semantic shifts: a survey – Andrey Kutuzov, Lilja Øvrelid, Terrence Szymanski and Erik Velldal.
  • DIDEC: The Dutch Image Description and Eye-tracking Corpus – Emiel van Miltenburg, Ákos Kádár, Ruud Koolen and Emiel Krahmer.
  • Distantly Supervised NER with Partial Annotation Learning and Reinforcement Learning – Yaosheng Yang, Wenliang Chen, Zhenghua Li, Zhengqiu He and Min Zhang.
  • Document-level Multi-aspect Sentiment Classification by Jointly Modeling Users, Aspects, and Overall Ratings – Junjie Li, Haitong Yang and Chengqing Zong.
  • Double Path Networks for Sequence to Sequence Learning – Kaitao Song, Xu Tan, Di He, Jianfeng Lu, Tao QIN and Tie-Yan Liu.
  • Dynamic Feature Selection with Attention in Incremental Parsing – Ryosuke Kohita, Hiroshi Noji and Yuji Matsumoto.
  • Embedding WordNet Knowledge for Textual Entailment – Yunshi Lan and Jing Jiang.
  • Encoding Sentiment Information into Word Vectors for Sentiment Analysis – Zhe Ye, Fang Li and Timothy Baldwin.
  • Enhancing General Sentiment Lexicons for Domain-Specific Use – Tim Kreutz and Walter Daelemans.
  • Enriching Word Embeddings with Domain Knowledge for Readability Assessment – Zhiwei Jiang, Qing Gu, Yafeng Yin and Daoxu Chen.
  • Ensure the Correctness of the Summary: Incorporate Entailment Knowledge into Abstractive Sentence Summarization – Haoran Li, Junnan Zhu, Jiajun Zhang and Chengqing Zong.
  • Evaluating the text quality, human likeness and tailoring component of PASS: A Dutch data-to-text system for soccer – Chris van der Lee, Bart Verduijn, Emiel Krahmer and Sander Wubben.
  • Evaluation of Unsupervised Compositional Representations – Hanan Aldarmaki and Mona Diab.
  • Farewell Freebase: Migrating the SimpleQuestions Dataset to DBpedia – Michael Azmy, Peng Shi, Ihab Ilyas and Jimmy Lin.
  • Fast and Accurate Reordering with ITG Transition RNN – Hao Zhang, Axel Ng and Richard Sproat.
  • Few-Shot Charge Prediction with Discriminative Legal Attributes – Zikun Hu, Xiang Li, Cunchao Tu, Zhiyuan Liu and Maosong Sun.
  • Fine-Grained Arabic Dialect Identification – Mohammad Salameh and Houda Bouamor.
  • Generating Reasonable and Diversified Story Ending Using Sequence to Sequence Model with Adversarial Training – Zhongyang Li, Xiao Ding and Ting Liu.
  • Generic refinement of expressive grammar formalisms with an application to discontinuous constituent parsing – Kilian Gebhardt.
  • Genre Identification and the Compositional Effect of Genre in Literature – Joseph Worsham and Jugal Kalita.
  • Gold Standard Annotations for Preposition and Verb Sense with Semantic Role Labels in Adult-Child Interactions – Lori Moon, Christos Christodoulopoulos, Cynthia Fisher, Sandra Franco and Dan Roth.
  • Graph Based Decoding for Event Sequencing and Coreference Resolution – Zhengzhong Liu, Teruko Mitamura and Eduard Hovy.
  • HL-EncDec: A Hybrid-Level Encoder-Decoder for Neural Response Generation – Sixing Wu, Dawei Zhang, Ying Li, Xing Xie and Zhonghai Wu.
  • How Predictable is Your State? Leveraging Lexical and Contextual Information for Predicting Legislative Floor Action at the State Level – Vladimir Eidelman, Anastassia Kornilova and Daniel Argyle.
  • Identifying Emergent Research Trends by Key Authors and Phrases – Shenhao Jiang, Animesh Prasad, Min-Yen Kan and Kazunari Sugiyama.
  • If you’ve seen some, you’ve seen them all: Identifying variants of multiword expressions – Caroline Pasquer, Agata Savary, Carlos Ramisch and Jean-Yves Antoine.
  • Improving Feature Extraction for Pathology Reports with Precise Negation Scope Detection – Olga Zamaraeva, Kristen Howell and Adam Rhine.
  • Improving Named Entity Recognition by Jointly Learning to Disambiguate Morphological Tags – Onur Gungor, Suzan Uskudarli and Tunga Gungor.
  • Incorporating Argument-Level Interactions for Persuasion Comments Evaluation using Co-attention Model – Lu Ji, Zhongyu Wei, Xiangkun Hu, Yang Liu, Qi Zhang and Xuanjing Huang.
  • Incorporating Deep Visual Features into Multiobjective based Multi-view Search Result Clustering – Sayantan Mitra, Mohammed Hasanuzzaman and Sriparna Saha.
  • Incorporating Image Matching Into Knowledge Acquisition for Event-Oriented Relation Recognition – Yu Hong, Yang Xu, Huibin Ruan, Bowei Zou, Jianmin Yao and Guodong Zhou.
  • Incorporating Syntactic Uncertainty in Neural Machine Translation with a Forest-to-Sequence Model – Poorya Zaremoodi and Gholamreza Haffari.
  • Incremental Natural Language Processing: Challenges, Strategies, and Evaluation – Arne Köhn.
  • Indigenous language technologies in Canada: Assessment, challenges, and successes – Patrick Littell, Anna Kazantseva, Roland Kuhn, Aidan Pine, Antti Arppe, Christopher Cox and Marie-Odile Junker.
  • Information Aggregation via Dynamic Routing for Sequence Encoding – Jingjing Gong, Xipeng Qiu, Shaojing Wang and Xuanjing Huang.
  • Integrating Tree Structures and Graph Structures with Neural Networks to Classify Discussion Discourse Acts – Yasuhide Miura, Ryuji Kano, Motoki Taniguchi, Tomoki Taniguchi, Shotaro Misawa and Tomoko Ohkuma.
  • Interaction-Aware Topic Model for Microblog Conversations through Network Embedding and User Attention – Ruifang He, Xuefei Zhang, Di Jin, Longbiao Wang, Jianwu Dang and Xiangang Li.
  • Interpretation of Implicit Conditions in Database Search Dialogues – Shunya Fukunaga, Hitoshi Nishikawa, Takenobu Tokunaga, Hikaru Yokono and Tetsuro Takahashi.
  • Investigating the Working of Text Classifiers – Devendra Sachan, Manzil Zaheer and Ruslan Salakhutdinov.
  • iParaphrasing: Extracting Visually Grounded Paraphrases via an Image – Chenhui Chu, Mayu Otani and Yuta Nakashima.
  • ISO-Standard Domain-Independent Dialogue Act Tagging for Conversational Agents – Stefano Mezza, Alessandra Cervone, Evgeny Stepanov, Giuliano Tortoreto and Giuseppe Riccardi.
  • Joint Learning from Labeled and Unlabeled Data for Information Retrieval – Bo Li, Ping Cheng and Le Jia.
  • Joint Neural Entity Disambiguation with Output Space Search – Hamed Shahbazi, Xiaoli Fern, Reza Ghaeini, Chao Ma, Rasha Mohammad Obeidat and Prasad Tadepalli.
  • JTAV: Jointly Learning Social Media Content Representation by Fusing Textual, Acoustic, and Visual Features – Hongru Liang, Haozheng Wang, Jun Wang, Shaodi You, Zhe Sun, Jin-Mao Wei and Zhenglu Yang.
  • Killing Four Birds with Two Stones: Multi-Task Learning for Non-Literal Language Detection – Erik-Lân Do Dinh, Steffen Eger and Iryna Gurevych.
  • LCQMC: A Large-scale Chinese Question Matching Corpus – Xin Liu, Qingcai Chen, Chong Deng, Huajun Zeng, Jing Chen, Dongfang Li and Buzhou Tang.
  • Learning Emotion-enriched Word Representations – Ameeta Agrawal, Aijun An and Manos Papagelis.
  • Learning Multilingual Topics from Incomparable Corpora – Shudong Hao and Michael J. Paul.
  • Learning Semantic Sentence Embeddings using Sequential Pair-wise Discriminator – Badri Narayana Patro, Vinod Kumar Kurmi, Sandeep Kumar and Vinay Namboodiri.
  • Learning to Progressively Recognize New Named Entities with Sequence to Sequence Model – Lingzhen Chen and Alessandro Moschitti.
  • Learning to Search in Long Documents Using Document Structure – Mor Geva and Jonathan Berant.
  • Learning Visually-Grounded Semantics from Contrastive Adversarial Samples – Haoyue Shi, Jiayuan Mao, Tete Xiao, Yuning Jiang and Jian Sun.
  • Learning What to Share: Leaky Multi-Task Network for Text Classification – Liqiang Xiao, Honglun Zhang, Wenqing Chen, Yongkun Wang and Yaohui Jin.
  • Learning with Noise-Contrastive Estimation: Easing training by learning to scale – Matthieu Labeau and Alexandre Allauzen.
  • Leveraging Meta-Embeddings for Bilingual Lexicon Extraction from Specialized Comparable Corpora – Amir Hazem and Emmanuel Morin.
  • Lexi: A tool for adaptive, personalized text simplification – Joachim Bingel, Gustavo Paetzold and Anders Søgaard.
  • Local String Transduction as Sequence Labeling – Joana Ribeiro, Shashi Narayan, Shay B. Cohen and Xavier Carreras.
  • Location Name Extraction from Targeted Text Streams using Gazetteer-based Statistical Language Models – Hussein Al-Olimat, Krishnaprasad Thirunarayan, Valerie Shalin and Amit Sheth.
  • MCDTB: A Macro-level Chinese Discourse TreeBank – Feng Jiang, Sheng Xu, Xiaomin Chu, Peifeng Li, Qiaoming Zhu and Guodong Zhou.
  • MEMD: A Diversity-Promoting Learning Framework for Short-Text Conversation – Meng Zou, Xihan Li, Haokun Liu and Zhihong Deng.
  • Modeling Multi-turn Conversation with Deep Utterance Aggregation – Zhuosheng Zhang, Jiangtong Li, Pengfei Zhu and Hai Zhao.
  • Modeling the Readability of German Targeting Adults and Children: An empirically broad analysis and its cross-corpus validation – Zarah Weiß and Detmar Meurers.
  • Multi-layer Representation Fusion for Neural Machine Translation – Qiang Wang, Fuxue Li, Tong Xiao, Yanyang Li, Yinqiao Li and Jingbo Zhu.
  • Multi-Perspective Context Aggregation for Semi-supervised Cloze-style Reading Comprehension – Liang Wang, Sujian Li, Wei Zhao, Kewei Shen, Meng Sun, Ruoyu Jia and Jingming Liu.
  • Multi-Source Multi-Class Fake News Detection – Hamid Karimi, Proteek Roy, Sari Saba-Sadiya and Jiliang Tang.
  • Multi-task and Multi-lingual Joint Learning of Neural Lexical Utterance Classification based on Partially-shared Modeling – Ryo Masumura, Tomohiro Tanaka, Ryuichiro Higashinaka, Hirokazu Masataki and Yushi Aono.
  • Multi-task dialog act and sentiment recognition on Mastodon – Christophe Cerisara, Somayeh Jafaritazehjani, Adedayo Oluokun and Hoa T. Le.
  • Multi-Task Learning for Sequence Tagging: An Empirical Study – Soravit Changpinyo, Hexiang Hu and Fei Sha.
  • Multi-Task Neural Models for Translating Between Styles Within and Across Languages – Xing Niu, Sudha Rao and Marine Carpuat.
  • Narrative Schema Stability in News Text – Dan Simonson and Anthony Davis.
  • Natural Language Interface for Databases Using a Dual-Encoder Model – Ionel Alexandru Hosu, Radu Cristian Alexandru Iacob, Florin Brad, Stefan Ruseti and Traian Rebedea.
  • Neural Machine Translation Incorporating Named Entity – Arata Ugawa, Akihiro Tamura, Takashi Ninomiya, Hiroya Takamura and Manabu Okumura.
  • Neural Math Word Problem Solver with Reinforcement Learning – Danqing Huang, Jing Liu, Chin-Yew Lin and Jian Yin.
  • NIPS Conversational Intelligence Challenge 2017 Winner System: Skill-based Conversational Agent with Supervised Dialog Manager – Idris Yusupov and Yurii Kuratov.
  • One vs. Many QA Matching with both Word-level and Sentence-level Attention Network – Lu Wang, Shoushan Li, Changlong Sun, Luo Si, Xiaozhong Liu, Min Zhang and Guodong Zhou.
  • Open-Domain Event Detection using Distant Supervision – Jun Araki and Teruko Mitamura.
  • Par4Sim — Adaptive Paraphrasing for Text Simplification – Seid Muhie Yimam and Chris Biemann.
  • Parallel Corpora for bi-lingual English-Ethiopian Languages Statistical Machine Translation – Michael Melese, Solomon Teferra Abate, Martha Yifiru Tachbelie, Million Meshesha, Wondwossen Mulugeta, Yaregal Assibie, Solomon Atinafu, Binyam Ephrem, Tewodros Abebe, Hafte Abera, Amanuel Lemma, Tsegaye Andargie, Seifedin Shifaw and Wondimagegnhue Tsegaye.
  • Part-of-Speech Tagging on an Endangered Language: a Parallel Griko-Italian Resource – Antonios Anastasopoulos, Marika Lekakou, Josep Quer, Eleni Zimianiti, Justin DeBenedetto and David Chiang.
  • Personalizing Lexical Simplification – John Lee and Chak Yan Yeung.
  • Pluralizing Nouns across Agglutinating Bantu Languages – Joan Byamugisha, C. Maria Keet and Brian DeRenzi.
  • Point Precisely: Towards Ensuring the Precision of Data in Generated Texts Using Delayed Copy Mechanism – Liunian Li and Xiaojun Wan.
  • Projecting Embeddings for Domain Adaption: Joint Modeling of Sentiment Analysis in Diverse Domains – Jeremy Barnes, Roman Klinger and Sabine Schulte im Walde.
  • Reading Comprehension with Graph-based Temporal-Causal Reasoning – Yawei Sun, Gong Cheng and Yuzhong Qu.
  • Real-time Change Point Detection using On-line Topic Models – Yunli Wang and Cyril Goutte.
  • Refining Source Representations with Relation Networks for Neural Machine Translation – Wen Zhang, Jiawei Hu, Yang Feng and Qun Liu.
  • Representation Learning of Entities and Documents from Knowledge Base Descriptions – Ikuya Yamada, Hiroyuki Shindo and Yoshiyasu Takefuji.
  • Reproducing and Regularizing the SCRN Model – Olzhas Kabdolov, Zhenisbek Assylbekov and Rustem Takhanov.
  • Responding E-commerce Product Questions via Exploiting QA Collections and Reviews – Qian Yu, Wai Lam and Zihao Wang.
  • ReSyf: a French lexicon with ranked synonyms – Mokhtar Boumedyen Billami, Thomas François and Nuria Gala.
  • Retrofitting Distributional Embeddings to Knowledge Graphs with Functional Relations – Ben Lengerich, Andrew Maas and Christopher Potts.
  • Revisiting the Hierarchical Multiscale LSTM – Ákos Kádár, Marc-Alexandre Côté, Grzegorz Chrupała and Afra Alishahi.
  • Rich Character-Level Information for Korean Morphological Analysis and Part-of-Speech Tagging – Andrew Matteson, Chanhee Lee, Youngbum Kim and Heuiseok Lim.
  • Robust Lexical Features for Improved Neural Network Named-Entity Recognition – Abbas Ghaddar and Phillippe Langlais.
  • RuSentiment: An Enriched Sentiment Analysis Dataset for Social Media in Russian – Anna Rogers, Alexey Romanov, Anna Rumshisky, Svitlana Volkova, Mikhail Gronas and Alex Gribov.
  • Scoring and Classifying Implicit Positive Interpretations: A Challenge of Class Imbalance – Chantal van Son, Roser Morante, Lora Aroyo and Piek Vossen.
  • Semantic Parsing for Technical Support Questions – Abhirut Gupta, Anupama Ray, Gargi Dasgupta, Gautam Singh, Pooja Aggarwal and Prateeti Mohapatra.
  • Sensitivity to Input Order: Evaluation of an Incremental and Memory-Limited Bayesian Cross-Situational Word Learning Model – Sepideh Sadeghi and Matthias Scheutz.
  • Sentence Weighting for Neural Machine Translation Domain Adaptation – Shiqi Zhang and Deyi Xiong.
  • Seq2seq Dependency Parsing – Zuchao Li, Jiaxun Cai, Shexia He and Hai Zhao.
  • Sequence-to-Sequence Learning for Task-oriented Dialogue with Dialogue State Representation – Haoyang Wen, Yijia Liu, Wanxiang Che, Libo Qin and Ting Liu.
  • SeVeN: Augmenting Word Embeddings with Unsupervised Relation Vectors – Luis Espinosa Anke and Steven Schockaert.
  • Simple Neologism Based Domain Independent Models to Predict Year of Authorship – Vivek Kulkarni, Yingtao Tian, Parth Dandiwala and Steve Skiena.
  • Sliced Recurrent Neural Networks – Zeping Yu and Gongshen Liu.
  • SMHD: a Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions – Arman Cohan, Bart Desmet, Andrew Yates, Luca Soldaini, Sean MacAvaney and Nazli Goharian.
  • Source Critical Reinforcement Learning for Transferring Spoken Language Understanding to a New Language – He Bai, Yu Zhou, Jiajun Zhang, Liang Zhao, Mei-Yuh Hwang and Chengqing Zong.
  • Stance Detection with Hierarchical Attention Network – Qingying Sun, Zhongqing Wang, Qiaoming Zhu and Guodong Zhou.
  • Structured Representation Learning for Online Debate Stance Prediction – Chang Li, Aldo Porco and Dan Goldwasser.
  • Style Detection for Free Verse Poetry from Text and Speech – Timo Baumann, Hussein Hussein and Burkhard Meyer-Sickendiek.
  • Style Obfuscation by Invariance – Chris Emmery, Enrique Manjavacas Arevalo and Grzegorz Chrupała.
  • Summarization Evaluation in the Absence of Human Model Summaries Using the Compositionality of Word Embeddings – Elaheh ShafieiBavani, Mohammad Ebrahimi, Raymond Wong and Fang Chen.
  • Synonymy in Bilingual Context: The CzEngClass Lexicon – Zdenka Uresova, Eva Fucikova, Eva Hajicova and Jan Hajic.
  • Tailoring Neural Architectures for Translating from Morphologically Rich Languages – Peyman Passban, Andy Way and Qun Liu.
  • Task-oriented Word Embedding for Text Classification – Qian Liu, Heyan Huang, Yang Gao, Xiaochi Wei, Yuxin Tian and Luyang Liu.
  • The APVA-TURBO Approach To Question Answering in Knowledge Base – Yue Wang, Richong Zhang, Cheng Xu and Yongyi Mao.
  • Toward Better Loanword Identification in Uyghur Using Cross-lingual Word Embeddings – Chenggang Mi, Yating Yang, Lei Wang, Xi Zhou and Tonghai Jiang.
  • Towards a Language for Natural Language Treebank Transductions – Carlos A. Prolo.
  • Towards an argumentative content search engine using weak supervision – Ran Levy, Ben Bogin, Shai Gretz, Ranit Aharonov and Noam Slonim.
  • Transfer Learning for a Letter-Ngrams to Word Decoder in the Context of Historical Handwriting Recognition with Scarce Resources – Adeline Granet, Emmanuel Morin, Harold Mouchère, Solen Quiniou and Christian Viard-Gaudin.
  • Transfer Learning for Entity Recognition of Novel Classes – Juan Diego Rodriguez, Adam Caldwell and Alexander Liu.
  • Twitter corpus of Resource-Scarce Languages for Sentiment Analysis and Multilingual Emoji Prediction – Nurendra Choudhary, Rajat Singh, Vijjini Anvesh Rao and Manish Shrivastava.
  • Urdu Word Segmentation using Conditional Random Fields (CRFs) – Haris Bin Zia, Agha Ali Raza and Awais Athar.
  • User-Level Race and Ethnicity Predictors from Twitter Text – Daniel Preoţiuc-Pietro and Lyle Ungar.
  • Using Formulaic Expressions in Writing Assistance Systems – Kenichi Iwatsuki and Akiko Aizawa.
  • Using Word Embeddings for Unsupervised Acronym Disambiguation – Jean Charbonnier and Christian Wartena.
  • Visual Question Answering Dataset for Bilingual Image Understanding: A Study of Cross-Lingual Transfer Using Attention Maps – Nobuyuki Shimizu, Na Rong and Takashi Miyazaki.
  • Vocabulary Tailored Summary Generation – Kundan Krishna, Aniket Murhekar, Saumitra Sharma and Balaji Vasan Srinivasan.
  • What’s in Your Embedding, And How It Predicts Task Performance – Anna Rogers, Shashwath Hosur Ananthakrishna and Anna Rumshisky.
  • Who Feels What and Why? Annotation of a Literature Corpus with Semantic Roles of Emotions – Evgeny Kim and Roman Klinger.
  • Why does PairDiff work? – A Mathematical Analysis of Bilinear Relational Compositional Operators for Analogy Detection – Huda Hakami, Kohei Hayashi and Danushka Bollegala.
  • WikiRef: Wikilinks as a route to recommending appropriate references for scientific Wikipedia pages – Abhik Jana, Pranjal Kanojiya, Pawan Goyal and Animesh Mukherjee.
  • Word Sense Disambiguation Based on Word Similarity Calculation Using Word Vector Representation from a Knowledge-based Graph – Dongsuk O, Sunjae Kwon, Kyungsun Kim and Youngjoong Ko.

COLING 2018 Best Papers

There are multiple categories of award at COLING 2018, as we laid out in an earlier blog post. We received 44 best paper nominations across the ten categories, and conferred best paper awards in the following categories:

  • Best error analysis: SGM: Sequence Generation Model for Multi-label Classification, by Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma, Wei Wu and Houfeng Wang.
  • Best evaluation: SGM: Sequence Generation Model for Multi-label Classification, by Pengcheng Yang, Xu Sun, Wei Li, Shuming Ma, Wei Wu and Houfeng Wang.
  • Best linguistic analysis: Distinguishing affixoid formations from compounds, by Josef Ruppenhofer, Michael Wiegand, Rebecca Wilm and Katja Markert
  • Best NLP engineering experiment: Authorless Topic Models: Biasing Models Away from Known Structure, by Laure Thompson and David Mimno
  • Best position paper: Arguments and Adjuncts in Universal Dependencies, by Adam Przepiórkowski and Agnieszka Patejuk
  • Best reproduction paper: Neural Network Models for Paraphrase Identification, Semantic Textual Similarity, Natural Language Inference, and Question Answering, by Wuwei Lan and Wei Xu
  • Best resource paper: AnlamVer: Semantic Model Evaluation Dataset for Turkish – Word Similarity and Relatedness, by Gökhan Ercan and Olcay Taner Yıldız
  • Best survey paper: A Survey on Open Information Extraction, by Christina Niklaus, Matthias Cetto, André Freitas and Siegfried Handschuh
  • Most reproducible: Design Challenges and Misconceptions in Neural Sequence Labeling, by Jie Yang, Shuailong Liang and Yue Zhang

Note that, as announced last year, in the interests of open science and reproducibility COLING 2018 did not confer best paper awards on papers whose code/resources could not be made publicly available by camera-ready time. This means you can ask the best paper authors for the associated data and programs right now, and they should be able to provide you with a link.

In addition, we would like to note the following papers as “Area Chair Favorites”, which were nominated by reviewers and recognised as excellent by chairs.

  • Visual Question Answering Dataset for Bilingual Image Understanding: A study of cross-lingual transfer using attention maps. Nobuyuki Shimizu, Na Rong and Takashi Miyazaki
  • Using J-K-fold Cross Validation To Reduce Variance When Tuning NLP Models. Henry Moss, David Leslie and Paul Rayson
  • Measuring the Diversity of Automatic Image Descriptions. Emiel van Miltenburg, Desmond Elliott and Piek Vossen
  • Reading Comprehension with Graph-based Temporal-Causal Reasoning. Yawei Sun, Gong Cheng and Yuzhong Qu
  • Diachronic word embeddings and semantic shifts: a survey. Andrey Kutuzov, Lilja Øvrelid, Terrence Szymanski and Erik Velldal
  • Transfer Learning for Entity Recognition of Novel Classes. Juan Diego Rodriguez, Adam Caldwell and Alexander Liu
  • Joint Modeling of Structure Identification and Nuclearity Recognition in Macro Chinese Discourse Treebank. Xiaomin Chu, Feng Jiang, Yi Zhou, Guodong Zhou and Qiaoming Zhu
  • Unsupervised Morphology Learning with Statistical Paradigms. Hongzhi Xu, Mitchell Marcus, Charles Yang and Lyle Ungar
  • Challenges of language technologies for the indigenous languages of the Americas. Manuel Mager, Ximena Gutierrez-Vasques, Gerardo Sierra and Ivan Meza-Ruiz
  • A Lexicon-Based Supervised Attention Model for Neural Sentiment Analysis. Yicheng Zou, Tao Gui, Qi Zhang and Xuanjing Huang
  • From Text to Lexicon: Bridging the Gap between Word Embeddings and Lexical Resources. Ilia Kuznetsov and Iryna Gurevych
  • The Road to Success: Assessing the Fate of Linguistic Innovations in Online Communities. Marco Del Tredici and Raquel Fernández
  • Relation Induction in Word Embeddings Revisited. Zied Bouraoui, Shoaib Jameel and Steven Schockaert
  • Learning with Noise-Contrastive Estimation: Easing training by learning to scale. Matthieu Labeau and Alexandre Allauzen
  • Stress Test Evaluation for Natural Language Inference. Aakanksha Naik, Abhilasha Ravichander, Norman Sadeh, Carolyn Rose and Graham Neubig
  • Recurrent One-Hop Predictions for Reasoning over Knowledge Graphs. Wenpeng Yin, Yadollah Yaghoobzadeh and Hinrich Schütze
  • SMHD: a Large-Scale Resource for Exploring Online Language Usage for Multiple Mental Health Conditions. Arman Cohan, Bart Desmet, Andrew Yates, Luca Soldaini, Sean MacAvaney and Nazli Goharian
  • Automatically Extracting Qualia Relations for the Rich Event Ontology. Ghazaleh Kazeminejad, Claire Bonial, Susan Windisch Brown and Martha Palmer
  • What represents “style” in authorship attribution? Kalaivani Sundararajan and Damon Woodard
  • SeVeN: Augmenting Word Embeddings with Unsupervised Relation Vectors. Luis Espinosa Anke and Steven Schockaert
  • GenSense: A Generalized Sense Retrofitting Model. Yang-Yin Lee, Ting-Yu Yen, Hen-Hsen Huang, Yow-Ting Shiue and Hsin-Hsi Chen
  • A Multi-Attention based Neural Network with External Knowledge for Story Ending Predicting Task. Qian Li, Ziwei Li, Jin-Mao Wei, Yanhui Gu, Adam Jatowt and Zhenglu Yang
  • Abstract Meaning Representation for Multi-Document Summarization. Kexin Liao, Logan Lebanoff and Fei Liu
  • Cooperative Denoising for Distantly Supervised Relation Extraction. Kai Lei, Daoyuan Chen, Yaliang Li, Nan Du, Min Yang, Wei Fan and Ying Shen
  • Dialogue Act Driven Conversation Model: An Experimental Study. Harshit Kumar, Arvind Agarwal and Sachindra Joshi
  • Dynamic Multi-Level, Multi-Task Learning for Sentence Simplification. Han Guo, Ramakanth Pasunuru and Mohit Bansal
  • A Knowledge-Augmented Neural Network Model for Implicit Discourse Relation Classification. Yudai Kishimoto, Yugo Murawaki and Sadao Kurohashi
  • Abstractive Multi-Document Summarization using Paraphrastic Sentence Fusion. Mir Tafseer Nayeem, Tanvir Ahmed Fuad and Yllias Chali
  • They Exist! Introducing Plural Mentions to Coreference Resolution and Entity Linking. Ethan Zhou and Jinho D. Choi
  • A Comparison of Transformer and Recurrent Neural Networks on Multilingual NMT. Surafel Melaku Lakew, Mauro Cettolo and Marcello Federico
  • Expressively vulgar: The socio-dynamics of vulgarity and its effects on sentiment analysis in social media. Isabel Cachola, Eric Holgate, Daniel Preoţiuc-Pietro and Junyi Jessy Li
  • On Adversarial Examples for Character-Level Neural Machine Translation. Javid Ebrahimi, Daniel Lowd and Dejing Dou
  • Neural Transition-based String Transduction for Limited-Resource Setting in Morphology. Peter Makarov and Simon Clematide
  • Structured Dialogue Policy with Graph Neural Networks. Lu Chen, Bowen Tan, Sishan Long and Kai Yu

We would like to extend exceptional thanks to our best paper committee.

Review statistics

There have been many things to measure in our review process at COLING so far. Here are a few.

Firstly, it’s interesting to see how many reviewers recommend the authors cite them. We can’t evaluate how appropriate this is, but it happened in 68 out of 2806 reviews (2.4%).

Best paper nominations are quite rare in general, which gives the best paper committee very little signal to work with. To gain more information, in addition to asking whether a paper warranted further recognition, we asked each reviewer to say whether a given paper was the best of those they had reviewed. This worked well for 747 reviewers, but 274 (26.8%) said that no paper in their reviewing allocation was the best.
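For readers who want to check the arithmetic, the percentages above follow directly from the raw counts. Below is a minimal sketch in Python (our own illustration, not part of any COLING tooling; it assumes the 26.8% is taken over the 747 + 274 reviewers who answered the question):

    # Reproduce the review-statistics percentages quoted above.
    def percentage(part, whole):
        # Return part/whole as a percentage, rounded to one decimal place.
        return round(100 * part / whole, 1)

    print(percentage(68, 2806))        # reviews requesting self-citation: 2.4
    print(percentage(274, 747 + 274))  # reviewers naming no best paper: 26.8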

Mean scores and confidence can be broken down by type, as follows.

Paper type                                   Mean score   Mean confidence
Computationally-aided linguistic analysis       2.85           3.42
NLP engineering experiment paper                2.86           3.51
Position paper                                  2.41           3.36
Reproduction paper                              2.92           3.54
Resource paper                                  2.76           3.50
Survey paper                                    2.93           3.58

We can see that reviewers were least confident with position papers, and were both most confident and most pleased with survey papers—though reproduction papers came in a close second in regard to mean score. This fits the general expectation that position papers are hard to evaluate.

The overall distribution of scores follows.

Anonymity and Review

Anonymous review is one way of achieving a fairer process. Ongoing discussion among many in our field led us to examine how well it was really working, and to rethink how anonymity was implemented for COLING this year.

One step we took was to make sure that area chairs did not know who the authors were. This is important because area chairs are the ones who put forward recommendations based on the reviews; they are the people who decide the fate of borderline papers and who weigh reviewer ratings to judge whether a paper fell on the wrong side of the acceptance boundary. This is a critical and powerful role, so if a venue has chosen to run an anonymized process, it should be especially careful that area chairs never see the paper authors’ names.

This policy caused a little initial surprise, but everyone adapted quickly. For it to work, authors must continue to hide their identity, especially in the author response to chairs, which is the stage the process is currently at.

We also increased anonymity in reviewer discussion: reviewers did not, and still do not, know each other’s identities. To keep the tone of reviews professional, reviewer identities will be revealed to one another later in the process, so if you are one of our generous program committee members, you will eventually be able to see who wrote that excellent review, and also who left the blank one, on the submissions you also reviewed.

It’s established that signed reviews, that is, those including the reviewer’s name, are generally found by authors to be of better quality and tone. We therefore gave reviewers the option to sign their reviews; this time, 121 out of 1020 active reviewers (11.9%) took it up.

On the topic of anonymity, there have been a few rejections due to poor or absent anonymization. To help future authors, here are some of the ways anonymity can be broken:

  • Linking to a personal or institutional GitHub account and making it clear in the prose that it is the authors’ (e.g. “We make this available at github.com/authorname/tool/”).
  • Describing and citing prior work as “we showed”, “our previous work”, and so on
  • Leaving names and affiliations on the front page
  • Including unpublished papers in the bibliography

Some of these can be avoided by holding back references to one’s own past work during review and adding them only in the camera-ready copy, a strategy we recommend. Of course this is not always possible, but in most of the cases we saw, refraining from self-citation would not have damaged the narrative and would have left the paper compliant.

The final step in the review process, from the author side, is author response to chairs. Please remember to keep yourself anonymous here—the chairs know neither author nor reviewer identities, which helps them be impartial.

COLING 2018 Submissions Overview

We’ve had a successful COLING so far, with over a thousand papers submitted, covering a variety of areas. In total, 1017 papers were submitted to the main conference, all full-length.

Each submitted paper had a type assigned by the authors, which affects how it is reviewed; the types were developed based on our earlier blog post on paper types. The “NLP engineering experiment” paper was unsurprisingly the dominant type, though it accounted for only 65% of all papers. We were very happy to receive 25 survey papers, 31 position papers, and 35 reproduction papers, as well as a solid 106 resource papers and a strong showing of 163 computationally-aided linguistic analysis papers, the second largest contingent.
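Since the dominant paper type is only given as a percentage, its implied count can be recovered from the other figures. Here is a small sketch, assuming the 65% is computed over all 1017 submissions:

    # Back out the implied number of NLP engineering experiment papers.
    total = 1017
    other_types = {"survey": 25, "position": 31, "reproduction": 35,
                   "resource": 106, "computationally-aided linguistic analysis": 163}
    nlp_engineering = total - sum(other_types.values())
    print(nlp_engineering)                        # 657
    print(round(100 * nlp_engineering / total))   # 65 (percent)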

Some papers were withdrawn or desk rejected before review began in earnest. Between ACs and PC co-chairs, in total, 32 papers were rejected without review. Excluding desk rejects, so far 41 papers have been withdrawn from consideration by the authors.

Allocating papers to areas gave each area a mean and median of 27 papers. The largest area has 31 papers and the smallest 19. We interpret this as indicating that area chairs will not be overloaded, which should lead to better review quality and better interpretation of reviews.

Who gets to author a paper? A note on the Vancouver recommendations

At COLING 2018, we require submitted work to follow the Vancouver Convention on authorship – i.e. who gets to be an author on a paper. This guest post by Željko Agić of ITU Copenhagen introduces the topic.

One of the basic principles of publishing scientific research is that research papers are authored and signed by researchers.

Recently, the tenet of authorship has sparked some very interesting discussions in our community. In light of the increased use of preprint servers, we have been questioning the *ACL conference publication workflows. These discussions have mostly had to do with peer review biases, but also with authorship: should we enable blind preprint publication?

The notion of unattributed publications mostly does not sit well with researchers. We do not even know how to cite such papers, while we can invoke entire research programs in our paper narratives through a single last name.

Authorship is of crucial importance in research, and not just in writing up our related work sections. This goes without saying to all us fellow researchers. While in everyday language an author is simply a writer or an instigator of a piece of work, the question is slightly more nuanced in publishing scientific work:

  • What activities qualify one for paper authorship?
  • If there are multiple contributors, how should they be ordered?
  • Who decides on the list of paper authors?

These questions have sparked many controversies over the centuries of scientific research. One F. D. C. Willard, short for Felis Domesticus Chester, has authored a physics paper, as has Galadriel Mirkwood, a Tolkien-loving Afghan hound versed in medical research. Others have built on the shoulders of giants such as Mickey Mouse and his prolific group.

Yet, authorship is no laughing matter: it can make and break research careers, and its (un)fair treatment can make the difference between a wonderful research group and, at the very least, an uneasy one. A fair and transparent approach to authorship is of particular importance to early-stage researchers. There, the tall tales of PhD students might include the following conjectures:

  • The PIs in medical research just sign all the papers their students author.
  • In algorithms research the author ordering is always alphabetical.
  • Conference papers do not make explicit the individual author contributions.
  • The first and the last author matter the most.

The curiosities and the conjectures listed above all stem from the fact that there seems to be no awareness of any standard rulebook to play by in publishing research. This in turn gives rise to the many different traditions in different fields.

Yet, there is a rulebook!

One prominent attempt to put forth a set of guidelines for determining authorship is the Vancouver Group recommendations. The Vancouver Group is the International Committee of Medical Journal Editors (ICMJE), which in 1985 introduced a set of criteria for authorship. The criteria have seen many updates over the years, to match the latest developments in research and publishing. Their scope far surpasses the topic of authorship, spanning the scientific publication process: reviewing, editorial work, publishing, copyright, and the like.

While the recommendations stem from the medical field, they have since been broadened and are now widely adopted. The following is an excerpt from the recommendations relating to the authorship criteria.

The ICMJE recommends that authorship be based on the following 4 criteria:

1. Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND

2. Drafting the work or revising it critically for important intellectual content; AND

3. Final approval of the version to be published; AND

4. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

(…)

All those designated as authors should meet all four criteria for authorship, and all who meet the four criteria should be identified as authors. Those who do not meet all four criteria should be acknowledged.

(…)

These authorship criteria are intended to reserve the status of authorship for those who deserve credit and can take responsibility for the work. The criteria are not intended for use as a means to disqualify colleagues from authorship who otherwise meet authorship criteria by denying them the opportunity to meet criterion #s 2 or 3.

Note that an AND operator ties the four criteria together, while there are some ORs within the individual entries. Thus, in essence, to adhere to the Vancouver recommendations for authorship one has to meet all four requirements, but each of the four may be met minimally.

To take one example:

If you substantially contributed to 1) data analysis, and to 2) revising the paper draft, and then you subsequently 3) approved of the final version and 4) agreed to be held accountable for all the work, then congrats! you have met the authorship criteria!

One could take other routes through the four criteria, some arguably easier, some even harder.
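
To make the AND/OR structure concrete, here is a small illustrative sketch (our illustration, not part of the recommendations themselves; all variable names are invented) that encodes the four criteria as booleans:

```python
# Illustrative sketch only: encodes the AND/OR structure of the four
# ICMJE criteria as booleans. All names are invented for the example.

def meets_criterion_1(conception, design, acquisition, analysis, interpretation):
    # Criterion 1 is an OR over several kinds of substantial contribution.
    return any([conception, design, acquisition, analysis, interpretation])

def qualifies_for_authorship(criterion_1, drafted_or_revised,
                             approved_final_version, accountable_for_work):
    # The four criteria are joined by AND: all of them must hold.
    return all([criterion_1, drafted_or_revised,
                approved_final_version, accountable_for_work])

# The worked example above: data analysis + revising the draft + approval + accountability.
c1 = meets_criterion_1(False, False, False, True, False)
print(qualifies_for_authorship(c1, True, True, True))  # -> True
```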

In my own view, we as a field should hope for the Vancouver recommendations to have already been adopted in NLP research, if only implicitly through the way our research groups and collaborations work.

Yet, are they? What are your thoughts? In your view, are the Vancouver recommendations well-matched with the COLING 2018 paper types? In general, are there aspects of your work in NLP that are left uncovered by the authorship criteria? Might there be at least some controversy and discussion potential to this matchup? 🙂

Metadata and COLING submissions

As the deadline for submission draws near, we’d like to alert our authors to a few things that are a bit different from previous COLINGs and other computational linguistics/NLP venues in the hopes that this will help the submission process go smoothly.

Paper types

Please consider the paper type you indicate carefully, as this will affect what the reviewers are instructed to look for in your paper. We encourage you to read the description of the paper types and especially the associated reviewer questions carefully. Which set of questions would you most like to have asked of your paper? (And if reading the questions inspires you to reframe/edit a bit to better address them before submitting, that is absolutely fair game!)

Emiel van Miltenburg raised the point on Twitter last week that it can be difficult to categorize papers and in particular that certain papers might fall between our paper types, combining characteristics of more than one, or being something else entirely.

Emiel and colleagues wondered whether we could implement a “tagging” system where authors could indicate the range of paper types their paper relates to. That is an intriguing idea, but it doesn’t work with the way we are using paper types to improve the diversity and range of papers at COLING. As noted above, the paper types entail different questions on the review forms. We’re doing that because otherwise it seems that everything gets evaluated against the NLP Engineering Experiment paper type, which in turn means it’s hard to get papers of the other types accepted. And as we hope we’ve made blindingly clear, we really are interested in getting a broad range of paper types!

Keywords

The other aspect of our submission form that will have a strong impact on how your paper is reviewed is the keywords. Following the system pioneered by Ani Nenkova and Owen Rambow as PC co-chairs for NAACL 2016, we have asked our reviewers to all describe their areas of expertise along five dimensions:

  1. Linguistic targets of study
  2. Application tasks
  3. Approaches
  4. Languages
  5. Genres

(All five of these have a none of the above/not-applicable option.) The reviewers (and area chairs) have each indicated the items on these dimensions they have the expertise and interest to review for. As an author, we ask you to indicate which items on each dimension best describe the paper you are submitting. Softconf will then match your paper to an area, choosing the assignment of papers to areas that best matches reviewer expertise to the papers submitted.
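
As a purely hypothetical illustration (Softconf’s actual matching procedure may differ, and the keyword values below are invented), here is a sketch of how overlap along the five dimensions could be scored:

```python
# Hypothetical sketch only: Softconf's real matching is more sophisticated,
# and the keyword values below are invented for the example.

DIMENSIONS = ["linguistic_targets", "application_tasks",
              "approaches", "languages", "genres"]

def overlap_score(paper_keywords, reviewer_keywords):
    """Count how many of the paper's keywords the reviewer has declared
    expertise and interest in, summed across the five dimensions."""
    return sum(
        len(set(paper_keywords.get(dim, [])) & set(reviewer_keywords.get(dim, [])))
        for dim in DIMENSIONS
    )

paper = {"application_tasks": ["machine translation"],
         "approaches": ["neural networks"],
         "languages": ["German"]}
reviewer = {"application_tasks": ["machine translation", "summarization"],
            "approaches": ["neural networks"],
            "languages": ["English", "French"]}

print(overlap_score(paper, reviewer))  # -> 2
```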

In sum: To ensure the most informed reviewing possible of your paper, please fill out these keywords carefully.  We urge you to start your submission in the system ahead of time so you aren’t trying to complete this task in a hurry just at the deadline.

Dual submission policy

Our Call for Papers indicates the following dual submission policy:

Papers that have been or will be under consideration for other venues at the same time must indicate this at submission time. If a paper is accepted for publication at COLING, it must be immediately withdrawn from other venues. If a paper under review at COLING is accepted elsewhere and authors intend to proceed there, the COLING committee must be notified immediately.

We have added a field in the submission form for you to be able to indicate this information.

LRE Map

COLING 2018 is participating in the LRE Map, as described in the guest post by Nicoletta Calzolari below. In the submission form, you are asked to provide information about language resources your research has used, and those it has produced. Do not worry about anonymity on this form: this information is not shared with reviewers.

Author responsibilities and the COLING 2018 desk reject policy

As our field experiences an upswing in participation, we have more submissions to our conferences, and this means we have to be careful to keep the reviewing process as efficient as possible. One tool used by editors and chairs is the “desk reject”. This is a way to filter out papers that clearly shouldn’t get through for whatever reason, without asking area chairs and reviewers to handle them, leaving our volunteers to use their energy on the important process of dealing with your serious work.

A desk reject is an automatic rejection without further review. This saves time, but is also quite a strong reaction to a submission. For that reason, this post clarifies possible reasons for a desk reject and the stages at which this might occur. It is the responsibility of the authors to make sure to avoid these situations.

Reasons for desk rejects:

  • Page length violations. The content limit at COLING is 9 pages. (You may include as many pages as needed for references.) Appendices, if part of the main paper, must fit within those nine pages. It’s unfair to judge longer papers against those that have kept to the limit, so exceeding the page limit means a desk reject.
  • Template cheating. The LaTeX and Word templates give a level playing field for everyone. Squeezing out whitespace, adjusting margins, and changing the font size all stop that playing field from being even and give an unfair advantage. If you’re not using the official template, you’ve altered it, or your manuscript uses it in a way that goes beyond our intent, the paper may be desk rejected.
  • Missing or poor anonymisation. It’s well-established that non-anonymised papers from “big name” authors and institutions fare better during review. To avoid this effect, and others, COLING is running double-blind; see our post on the nuances of double-blinding. We do not endeavour to be arbiters of what does or does not constitute a “big name”—rather, any paper that is poorly anonymised (or not anonymised at all) will face desk reject. See below for a few more comments on anonymisation.
  • Inappropriate content. We want to give our reviewers and chairs research papers to review. Content that really does not fit this will be desk rejected.
  • Plagiarism. Submitting work that has already appeared, has already been accepted for publication at another venue, or has any significant overlap with other works submitted to COLING will be desk rejected. Several major NLP conferences are actively collaborating on this.
  • Breaking the arXiv embargo. COLING follows the ACL pre-print policy. This means that a paper will be considered only if it has not been published on a pre-print service, or was published there more than a month before the deadline (i.e. before February 16, 2018). Pre-prints published (non-anonymously) after this date may not be submitted for review at COLING. In conjunction with other NLP conferences this year, we’ll be looking for instances of this and desk rejecting them.

Desk rejects can be issued at four separate points. In order:

  1. Automatic rejection by the START submission system, which has a few checks at various levels.
  2. A rejection by the PC co-chairs, before papers are allocated to areas.
  3. After papers are placed in areas, ACs have the opportunity to check for problems. One response is to desk reject.
  4. Finally, during and immediately after allocation of papers to reviewers, an individual reviewer may send a message to invoke desk rejection, which will be queried and checked by at least two people from the ACs or PC co-chairs.

If you are an honest researcher trying to publish your important and exciting work, the above probably do not apply to you. But if they do, please think twice. We would prefer to send out no desk rejects, and imagine it would be much more pleasant for our authors if none were to receive one. So, now you know what to avoid!

Postscript on anonymisation

Papers must be anonymised. This protects everybody during review. It’s a complex issue to implement, which is why we earlier had a post dedicated to double blindness in peer review. There are strict anonymisation guidelines in the call for papers and the only way to be sure that nobody takes exception during the review process is to follow these guidelines.

We’ve received several questions on what the best practices for anonymisation are. We realize that in long-standing projects, it can be impossible to truly disguise the group that work comes from. Nonetheless, we expect all COLING authors to follow these forms of anonymisation:

  1. Do NOT include author names/affiliations in the version of the paper submitted for review.  Instead, the author block should say “Anonymous”.
  2. When making reference to your own published work, cite it as if written by someone else: “Following Lee (2007), …” “Using the evaluation metric proposed by Garcia (2016), …”
  3. The only time it’s okay to use “anonymous” in a citation is when you are referring to your own unpublished work: “The details of the construction of the data are described in our companion paper (anonymous, under review).”
  4. Expanded versions of earlier workshop papers should rework the prose sufficiently so as not to turn up as potential plagiarism examples. The final published version of such papers should acknowledge the earlier workshop paper, but that acknowledgment should be suppressed in the version submitted for review.
  5. More generally, the acknowledgments section should be left out of the version submitted for review.
  6. Papers making code available for reproducibility or resources available for community use should host a version of it at a URL that doesn’t reveal the authors’ identity or institution.

We have been asked a few times whether LRE Map entries can be completed without de-anonymising submissions. The LRE Map data will not be shared with reviewers, so this is not a concern.

Keeping resources anonymised is a little harder. We recommend you keep things like names of people and labs out of your code and files; for example, uploaded Java code that runs within an edu.uchicago.nlp namespace would be problematic. Similarly, if the URL given is within a personal namespace, this breaks double-blindness and must be avoided. Google Drive, Dropbox and Amazon S3 – as well as many other file-sharing services – offer reasonably anonymous (and often free) file sharing URLs, and we recommend you use those if you can’t upload your data/code/resources into START as supplementary materials.
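
If it helps, a simple pre-submission check is to scan everything you plan to upload for strings that could reveal your identity. The sketch below is only a suggestion, not an official COLING tool; the search terms and directory name are placeholders to replace with your own names, usernames, lab names, and institutional namespaces.

```python
# Optional pre-upload check (a suggestion, not an official tool): scan files
# for identity-revealing strings. Replace the example terms with your own
# names, usernames, lab names, and institutional namespaces.

import pathlib

REVEALING_TERMS = ["edu.uchicago.nlp", "authorname", "My NLP Lab"]  # examples only

def scan(directory):
    for path in pathlib.Path(directory).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8", errors="ignore")
        except OSError:
            continue
        for term in REVEALING_TERMS:
            if term.lower() in text.lower():
                print(f"{path}: contains '{term}'")

if __name__ == "__main__":
    scan("supplementary_material")  # placeholder directory name
```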

LRE Map: What? Why? When? Who?

This guest post is by Nicoletta Calzolari.

Undocumented Language Resources (LRs) don’t exist!

The LRE Map of Language Resources (data and tools) (http://lremap.elra.info) is an innovative instrument introduced at LREC 2010 with the aim of monitoring the wealth of data and technologies developed and used in our field. Why “Map”? Because we aimed to represent the relevant features of a large territory, including aspects not represented in the official catalogues of the field’s major players. But we had other purposes too: we wanted to draw attention to the importance of the LRs behind many of our papers, and also to map the “use” of LRs, to understand the purposes for which LRs are developed.

Its collaborative, bottom-up creation was critical: we conceived the Map as a means to foster a “change of culture” in our community, whereby everyone is asked to make a minimal effort to document the LRs that are used or created, and thus comes to understand the need for proper documentation. By spreading the LR documentation effort across many people, instead of leaving it only in the hands of the distribution centres, we also encourage awareness of the importance of metadata and proper documentation. Documenting a resource is the first step towards making it identifiable, which in turn is the first step towards reproducibility.

We kept the requested information at a simple level, knowing that we had to compromise between richness of metadata and authors’ willingness to fill it in.
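
For illustration only (the actual LRE Map form defines its own fields; the field names and values below are invented), even a minimal structured record of this kind is enough to make a resource identifiable and its use documented:

```python
# Illustrative only: the actual LRE Map form defines its own fields; these
# names and values are invented. The point is that a small, structured record
# already makes a resource identifiable and its use documented.

resource_record = {
    "resource_name": "Example Treebank",        # invented resource
    "resource_type": "corpus",                  # e.g. corpus, lexicon, tool
    "modality": "written",
    "language": "Spanish",
    "use_in_paper": "dependency parsing",       # how the paper used it
    "availability": "freely available",
    "status": "existing, used in this paper",   # vs. newly created
}

for field, value in resource_record.items():
    print(f"{field}: {value}")
```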

With all these purposes in mind, we thought we could exploit the great opportunity offered by LREC and the involvement of so many authors, from so many countries, working on different modalities and in so many areas of NLP. The Map has since also been used at other major conferences, in particular COLING, and this provides another opportunity for useful comparisons.

The Map currently describes 7453 LR instances, collected from 17 different conferences. The major conferences for which we have data on a regular basis are LREC and COLING.

With initiatives such as the LRE Map and “Share your LRs” (introduced in 2014), we want to encourage in the field of LT and LRs what is already in use in more mature disciplines, i.e. to make proper documentation and reproducibility normal practice. We think that research is strongly shaped by such infrastructural (meta-research) activities, and therefore we continue to promote – also through initiatives like these – greater visibility of LRs, easier sharing of LRs, and the reproducibility of research results.

Here is the vision: it must become common practice in our field too that, when you submit a paper to a conference or a journal, you are offered the opportunity to document and upload the LRs related to your research. This is even more important in a data-intensive discipline like NLP. The small cost each of us pays to document and share should be repaid by the benefit we gain from others’ efforts.

What do we ask of colleagues submitting to COLING 2018? Please document all the LRs mentioned in your paper!

SemEval: Striving for Reproducibility in Research – Guest post

Being able to reproduce experiments and results is important to advancing our knowledge, but it’s not something we’ve always been able to do well. In a series of guest posts, we have invited perspectives and advice on reproducibility in NLP.

by Saif M. Mohammad, National Research Council Canada.

A shared task invites participation in a competition where system predictions are examined and ranked by an independent party on a common evaluation framework (common new training and test sets, common evaluation metrics, etc.). The International Workshop on Semantic Evaluation (SemEval) is a popular shared task platform for computational semantic analysis. (See SemEval-2017; participate in SemEval-2018!) Every year, the workshop selects a dozen or so tasks (from a competitive pool of proposals) and co-ordinates their organization: setting up task websites, releasing training and test sets, conducting evaluations, and publishing proceedings. It draws hundreds of participants, and publishes over a thousand pages of proceedings. It’s awesome!

Embedded in SemEval, but perhaps less obvious, is a drive for reproducibility in research: obtaining the same results again, using the same method. Why does reproducibility matter? Reproducibility is a foundational tenet of the scientific method. There is no truth other than reproducibility. If repeated data annotations provide wildly diverging labels, then that data is not capturing anything meaningful. If no one else is able to replicate one’s algorithm and results, then that original work is called into question. (See Most Scientists Can’t Replicate Studies by their Peers and also this wonderful article by Ted Pedersen, Empiricism Is Not a Matter of Faith.)

I have been involved with SemEval in many roles: from a follower of the work, to a participant, a task organizer, and co-chair. In this post, I share my thoughts on some of the key ways in which SemEval encourages reproducibility, and how many of these initiatives can easily be carried over to your research (whether or not it is part of a shared task).

SemEval has two core components:

Tasks: SemEval chooses a mix of repeat tasks (tasks that were run in prior years), new-to-SemEval tasks (tasks studied separately by different research groups, but not part of SemEval yet), and some completely new tasks. The completely new tasks are exciting and allow the community to make quick progress. The new-to-SemEval tasks allow for the comparison and use of disparate past work (ideas, algorithms, and linguistic resources) on a common new test set. The repeat tasks allow participants to build on past submissions and help track progress over the years. By drawing the attention of the community to a set of tasks, SemEval has a way of cleaning house. Literature is scoured, dusted, and re-examined to identify what generalizes well: which ideas and resources are truly helpful.

Bragging rights apart, a common motivation to participate in SemEval is to test whether a particular hypothesis is true or not. Irrespective of what rank a system attains, participants are encouraged to report results on multiple baselines, benchmarks, and comparison submissions.

Data and Resources: The common new (previously unseen) test set is a crucial component of SemEval. It minimizes the risk of highly optimistic results from (over)training on a familiar dataset. Participants usually have only two or three weeks from when they get access to the test set to when they have to provide system submissions. Task organizers often provide links to code and other resources that participants can use, including baseline systems and the winning systems from the past years. Participants can thus build on these resources.

SemEval makes a concerted effort to keep the data and the evaluation framework for the shared tasks available through the task websites even after the official competition. Thus, people with new approaches can continue to compare results with that of earlier participants, even years later. The official proceedings record the work done by the task organizers and participants.

Task Websites: For each task, the organizers set up a website providing details of the task definition, data, annotation questionnaires, links to relevant resources, and references. Since 2017, the tasks are run on shared task platforms such as CodaLab. These include special features such as phases and leaderboards. Phases often correspond to a pre-evaluation period (when systems have access to the training data but not the test data), the official evaluation period (when the test data is released and official system submissions are to be made), and a post-evaluation period. The leaderboard is a convenient way to record system results. Once the organizers set up the task website with the evaluation script, the system automatically generates results for every new submission and posts them on the leaderboard. There is a separate leaderboard for each phase. Thus, even after the official competition has concluded, one can upload submissions, and the auto-computed results are posted on the leaderboard. Anyone interested in a task can view all of the results in one place.
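
As a rough illustration of the kind of evaluation script that drives such a leaderboard, here is a minimal sketch only, assuming a simple one-label-per-line file format; each task defines its own formats, metrics, and platform conventions.

```python
# Minimal sketch of a shared-task scoring script. The one-label-per-line
# format, file paths, and "accuracy: ..." output line are assumptions for
# illustration; real tasks define their own formats and metrics.

import sys

def read_labels(path):
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

def accuracy(gold, predicted):
    if len(gold) != len(predicted):
        raise ValueError("Submission and gold file differ in length.")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

if __name__ == "__main__":
    gold_path, pred_path, out_path = sys.argv[1:4]
    score = accuracy(read_labels(gold_path), read_labels(pred_path))
    with open(out_path, "w", encoding="utf-8") as out:
        out.write(f"accuracy: {score:.4f}\n")  # the leaderboard reads this line
```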

SemEval also encourages participants to make system submissions freely available and to make system code available where possible.

Proceedings: For each task, the organizers write a task-description paper that describes their task, data, evaluation, results, and a summary of participating systems. Participants write a system-description paper describing their system and submissions. Special emphasis is placed on replicability in the instructions to authors and in the reviewing process. For the task paper: “present all details that will allow someone else to replicate the data creation process and evaluation.” For the system paper: “present all details that will allow someone else to replicate your system.” All papers are accepted except for system papers that fail to provide clear and adequate details of their submission. Thus SemEval is also a great place to record negative results — ideas that seemed promising but did not work out.

Bonus article: Why it’s time to publish research “failures”

All of the above make SemEval a great sandbox for working on compelling tasks, reproducing and refining ideas from prior research, and developing new ones that are accessible to all. Nonetheless, shared tasks can entail certain less-desirable outcomes that are worth noting and avoiding:

  • Focus on rankings: While the drive to have the top-ranked submission can be productive, it is not everything. More important is the analysis to help improve our collective understanding of the task. Thus, irrespective of one’s rank, it is useful to test different hypotheses and report negative results. 
  • Comparing post-competition results with official competition results: A crucial benefit of participating under the rigor of a shared task is that one does not have access to the reference/gold labels of the test data until the competition has concluded. This is a benefit because having open access to the reference labels can lead to unfair and unconscious optimisation on the test set. Every time one sees the result of their system on a test set and tries something different, it is a step towards optimising on the test set. However, once the competition has concluded the gold labels are released so that the task organizers are not the only gatekeepers for analysis. Thus, even though post-competition work on the task–data combination is very much encouraged, the comparisons of those results with the official competition results have to pass a higher bar of examination and skepticism.

There are other pitfalls worth noting too—feel free to share your thoughts in the comments.

“That’s great!” you say, “but we are not always involved in shared tasks…”

How do I encourage reproducibility of *my* research?

Here are some pointers to get started:

  • In your paper, describe all that is needed for someone else to reproduce the work. Make use of provisions for Appendices. Don’t be limited by page lengths. Post details on websites and provide links in your paper.
  • Create a webpage for the research project. Briefly describe the work in a manner that anybody interested can come away understanding what you are working on and why that matters. There is merit in communicating our work to people at large, and not just to our research peers. Also:
    • Post the project papers or provide links to them.
    • Post annotation questionnaires.
    • Post the code on repositories such as GitHub and CodaLab. Provide links.
    • Share evaluation scripts.
    • Provide interactive visualisations to explore the data and system predictions. Highlight interesting findings.
    • Post tables with results of work on a particular task of interest. This is especially handy if you are working on a new task or creating new data for a task. Use tools such as CodaLab to create leaderboards and allow others to upload their system predictions.
    • If you are releasing data or code, briefly describe the resource, and add information on:
      • What can the resource be used for and how?
      • What hypotheses can be tested with this resource?
      • What are the properties of the resource — its strengths, biases, and limitations?
      • How can one build on the resource to create something new?
  • (Feel free to add more suggestions through your comments below.)

Sharing your work often follows months and years of dedicated research. So enjoy it, and don’t forget to let your excitement shine through! 🙂

Many thanks to Svetlana Kiritchenko, Graeme Hirst, Ted Pedersen, Peter Turney, and Tara Small for comments and discussions.

References: