Call for input: Paper types and associated review forms

In our opening post, we laid out our goals as PC co-chairs for COLING 2018. In this post, we present our approach to the subgoal (of goal #1) of creating a program with many different types of research contributions. As both authors and reviewers, we have been frustrated by the one-size-fits-all review form typical of conferences in our field. When reviewing, how do we answer the ‘technical correctness’ question about a position paper? Or the ‘impact of resources’ question on a paper that doesn’t present any resources?

We believe that a program that includes a wide variety of paper types (as well as a wide variety of paper topics) will be more valuable both for conference attendees and for the field as a whole. We hypothesize that more tailored review forms will lead to fairer treatment of different types of papers, and that fairer treatment will lead to a more varied program. Of course, if we don’t get many papers outside the traditional type (called “NLP engineering experiment paper” below), having tailored review forms won’t do us much good. Therefore, we aim to get the word out early (via this blog post) so that our audience knows what kinds of papers we’re interested in.

Furthermore, we’re interested in what kinds of papers you’re interested in. Below you will find our initial set of five categories, with drafts of the associated review forms. You’ll see some questions are shared across some or all of the paper types, but we’ve elected to lay them out this way (even though it might feel repetitive) so that you can look at each category, putting yourself in both the position of author and of reviewer, and think about what we might be missing/which questions might be inappropriate. Let us know in the comments!

As you answer, keep in mind that our goal with the review forms is to help reviewers structure their reviews in such a way that they are helpful for the area chairs in making final acceptance decisions, informative for the authors (so they understand the decisions that were made), and helpful for the authors (as they improve their work either for camera ready, or for submission to a later venue).

Computationally-aided linguistic analysis

The focus of this paper type is new linguistic insight.

  • Relevance: Is this paper relevant to COLING?
  • Readability/clarity: From the way the paper is written, can you tell what research question was addressed, what was done and why, and how the results relate to the research question?
  • Originality: How original and innovative is the research described? Originality could be in the linguistic question being addressed, in the methodology applied to the linguistic question, or in the combination of the two.
  • Technical correctness/soundness: Is the research described in the paper technically sound and correct? Can one trust the claims of the paper—are they supported by the analysis or experiments and are the results correctly interpreted?
  • Reproducibility: Is there sufficient detail for someone in the same field to reproduce/replicate the results?
  • Generalizability: Does the paper show how the results generalize, either by deepening our understanding of some linguistic system in general or by demonstrating methodology that can be applied to other problems as well?
  • Meaningful comparison: Does the paper clearly place the described work with respect to existing literature? Is it clear both what is novel in the research presented and how it builds on earlier work?
  • Substance: Does this paper have enough substance for a full-length paper, or would it benefit from further development?
  • Overall recommendation: There are many good submissions competing for slots at COLING 2018; how important is it to feature this one? Will people learn a lot by reading this paper or seeing it presented? Please be decisive—it is better to differ from other reviewers than to grade everything in the middle.

NLP engineering experiment paper

This paper type matches the bulk of submissions at recent CL and NLP conferences.

  • Relevance: Is this paper relevant to COLING?
  • Readability/clarity: From the way the paper is written, can you tell what research question was addressed, what was done and why, and how the results relate to the research question?
  • Originality: How original and innovative is the research described? Note that originality could involve a new technique or a new task, or it could lie in the careful analysis of what happens when a known technique is applied to a known task (where the pairing is novel) or in the careful analysis of what happens when a known technique is applied to a known task in a new language.
  • Technical correctness/soundness: Is the research described in the paper technically sound and correct? Can one trust the claims of the paper—are they supported by the analysis or experiments and are the results correctly interpreted?
  • Reproducibility: Is there sufficient detail for someone in the same field to reproduce/replicate the results?
  • Error analysis: Does the paper provide a thoughtful error analysis, which looks for linguistic patterns in the types of errors made by the system(s) evaluated and sheds light on either avenues for future work or the source of the strengths/weaknesses of the systems?
  • Meaningful comparison: Does the paper clearly place the described work with respect to existing literature? Is it clear both what is novel in the research presented and how it builds on earlier work?
  • Substance: Does this paper have enough substance for a full-length paper, or would it benefit from further work?
  • Overall recommendation: There are many good submissions competing for slots at COLING 2018; how important is it to feature this one? Will people learn a lot by reading this paper or seeing it presented? Please be decisive—it is better to differ from other reviewers than to grade everything in the middle.

Reproduction paper

The contribution of a reproduction paper lies in analyses of and in insights into existing methods and problems—plus the added certainty that comes with validating previous results.

  • Relevance: Is this paper relevant to COLING?
  • Readability/clarity: Is the paper well-written and well-structured?
  • Analysis: If the paper was able to replicate the results of the earlier work, does it clearly lay out what needed to be filled in in order to do so? If it wasn’t able to replicate the results of earlier work, does it clearly identify what information was missing/the likely causes?
  • Generalizability: Does the paper go beyond replicating the original results to explore whether they can be reproduced in another setting? Alternatively, in cases of non-replicability, does the paper discuss the broader implications of that result?
  • Informativeness: To what extent does the analysis reported in the paper deepen our understanding of the methodology used or the problem approached? Will the information in the paper help practitioners with their choice of technique/resource?
  • Meaningful comparison: In addition to identifying the experimental results being replicated, does the paper motivate why these particular results are an important target for reproduction and what the future implications are of their having been reproduced or been found to be non-reproducible?
  • Overall recommendation: There are many good submissions competing for slots at COLING 2018; how important is it to feature this one? Will people learn a lot by reading this paper or seeing it presented? Please be decisive—it is better to differ from other reviewers than to grade everything in the middle.

Resource paper

Papers in this track present a new language resource. This could be a corpus, but it could also be an annotation standard, a tool, and so on.

  • Relevance: Is this paper relevant to COLING? Will the resource presented likely be of use to our community?
  • Readability/clarity: From the way the paper is written, can you tell how the resource was produced, how the quality of annotations (if any) was evaluated, and why the resource should be of interest?
  • Originality: Does the resource fill a need in the existing collection of accessible resources? Note that originality could be in the choice of language/language variety or genre, in the design of the annotation scheme, in the scale of the resource, or still other parameters.
  • Resource quality: What kind of quality control was carried out? If appropriate, was inter-annotator agreement measured, and if so, with appropriate metrics? Otherwise, what other evaluation was conducted, and how satisfactory were the results?
  • Resource accessibility: Will it be straightforward for researchers to download or otherwise access the resource in order to use it in their own work? To what extent can work based on this resource be shared?
  • Metadata: Do the authors make clear whose language use is captured in the resource and to what populations experimental results based on the resource can be generalized? In the case of annotated resources, are the demographics of the annotators also characterized?
  • Meaningful comparison: Is the new resource situated with respect to existing work in the field, including similar resources it took inspiration from or improves on? Is it clear what is novel about the resource?
  • Overall recommendation: There are many good submissions competing for slots at COLING 2018; how important is it to feature this one? Will people learn a lot by reading this paper or seeing it presented? Please be decisive—it is better to differ from other reviewers than to grade everything in the middle.

Position paper

A position paper presents a challenge to conventional thinking or a futuristic new vision. It could open up a new area or novel technology, propose changes in existing research, or give a new set of ground rules.

  • Relevance: Is this paper relevant to COLING?
  • Readability/clarity: Is it clear what the position is that the paper is arguing for? Are the arguments for it laid out in an understandable way?
  • Soundness: Are the arguments presented in the paper relevant and coherent? Is the vision well-defined, with success criteria? (Note: it should be possible to give a high score here even if you don’t agree with the position taken by the authors.)
  • Creativity: How novel or bold is the position taken in the paper? Does it break new ground in a well-thought-through and creative way?
  • Scope: How much scope for new research is opened up by this paper? What effect could it have on existing areas and questions?
  • Meaningful comparison: Is the paper well-situated with respect to previous work, both position papers (taking the same or opposing side on the same or similar issues) and relevant theoretical or experimental work?
  • Substance: Does the paper have enough substance for a full-length paper? Is the issue sufficiently important? Are the arguments sufficiently thoughtful and varied?
  • Overall recommendation: There are many good submissions competing for slots at COLING 2018; how important is it to feature this one? Please be decisive—it is better to differ from other reviewers than to grade everything in the middle.


So, those are the initial set of submission types. These paper types aren’t limited to single tracks; that is, there won’t be a dedicated position paper track with its own reviewers and chair. You might find a resource paper in any track, for example, and a multilingual embeddings track (if one appears—but that’s for a future post) might contain all five kinds of paper mixed together. This makes it even more important that the right questions are asked for each paper type, to help hard-working reviewers judge each kind of paper in an appropriate light.

Our questions for you: Is there a type of paper you’d either like to submit to COLING or would like to see at COLING that you think doesn’t fit any of these five already? Should any of the review questions be dropped or refined for any of the paper types? Are there review questions it would be useful to add? Please let us know in the comments!


COLING 2018 PC Blog: Welcome!

Emily M. Bender (University of Washington) and Leon Derczynski (University of Sheffield)

We (Emily M. Bender and Leon Derczynski) are the PC co-chairs for COLING 2018, to be held in Santa Fe, NM, USA, 20-25 August 2018. Inspired by Min-Yen Kan and Regina Barzilay’s ACL 2017 PC Blog, we will be keeping one of our own. We start today with a brief post introducing ourselves and outlining our goals for COLING 2018. In later posts, we’ll describe the various plans we have for meeting those goals.

First the intros:

Emily is a Professor of Linguistics and Adjunct Professor of Computer Science & Engineering at the University of Washington, Seattle WA (USA), where she has been on the faculty since 2003 and has served as the Faculty Director of the Professional Masters in Computational Linguistics (CLMS) since its inception in 2005. Her degrees are all in Linguistics (AB UC Berkeley, MA and PhD Stanford) and her primary research interests are in grammar engineering, computational semantics, and computational linguistic typology. She is also interested in ethics in NLP, the application of computational methods to linguistic analysis, and different ways of integrating linguistic knowledge into NLP.

Leon is a Research Fellow in Computer Science at the University of Sheffield (UK), the home of the ICCL, where he has been a researcher since 2012, with visiting positions at Aarhus Universitet (Denmark), Innopolis University (Russian Federation), and the University of California, San Diego (USA). His degrees are in Computer Science (MComp and PhD), also from Sheffield, and his research interests are in noisy text, unsupervised methods, and spatio-temporal information extraction. He is also interested in chunking and tagging, effective crowdsourcing, and assessing veracity and fake news.

We first met by proxy, through Tim Baldwin, at LREC 2014 in Reykjavik. Tim pointed out that we were both, at the time, visiting scholars in a hip Danish city without an NLP group of its own: Aarhus. Shortly after returning from Iceland, on Tim’s recommendation, we met for lunch a few times in Aarhus, chatting about understanding language, language diversity, and the interface between data-driven computational techniques and linguistic reality. We have made a point of catching up regularly ever since, and we still have connections in the city, which became even more hip as the European Capital of Culture for 2017!

Then goals:

Our goals for COLING 2018 are (1) to create a program of high-quality papers that represent diverse approaches to and applications of computational linguistics, written and presented by researchers from throughout our international community; (2) to facilitate thoughtful reviewing which is both informative to ACs (and to us as PC co-chairs) and helpful to authors; and (3) to ensure that the results published at COLING 2018 are as reproducible as possible.

To give a bit more detail on the first goal, by diverse approaches/applications, we mean that we intend to attract (in the tradition of COLING):

  • papers which develop linguistic insight as well as papers which deepen our understanding of how machine learning can be applied to NLP — and papers that do both!
  • research on a broad variety of languages and genres
  • many different types of research contributions (application papers, resource papers, methodology papers, position papers, reproduction papers…)

We have the challenge and the privilege of taking on this role at a time when our field is growing tremendously quickly. We hope to advance the way our conferences work by trying new things and improving the experience from all sides. In approaching this task, we started by reviewing the strategies taken by PC chairs at other recent conferences (including COLING 2016, NAACL 2016, and ACL 2017), learning from them, and then adapting strategies based on our goals for COLING 2018. We strongly believe that one key to achieving a diverse and strong program is community engagement. Thus our first step towards that is starting this blog. Over the coming weeks we will tell you more about what we are working on and seek input on various points in the process. We look forward to working with you and hope to see many of you in Santa Fe next August!