Best paper categories and requirements

Recognition of excellent work is very important. In particular, we see the role of best/outstanding paper awards as two-fold: on the one hand, they give the conference program committee a chance to highlight papers it found particularly compelling and to promote them to a broader audience; on the other hand, they provide recognition to the authors and may help advance their careers.

From the perspective of both of these functions we think it is critical that different kinds of excellent work be recognized.  Accordingly, we have established an expanded set of categories in which an award will be given for COLING 2018. The categories are:

  • Best linguistic analysis
  • Best NLP engineering experiment
  • Best reproduction paper
  • Best resource paper
  • Best position paper
  • Best survey paper
  • Best evaluation, for a paper whose evaluation is especially well designed and carried out
  • Most reproducible, for a paper whose work is especially easy to reproduce
  • Best challenge, for a paper that sets a new challenge

The first six of these correspond to our paper types.  The last three cross-cut those categories, at least to a certain extent.  We hope that ‘Best evaluation’ and ‘Most reproducible’ in particular will provide motivation for raising the bar in best practice in these areas.

A winner and runner-up will be selected for each category by a best paper committee. However, while there are now more opportunities for recognition, we’ve also raised the minimum requirements for winning a prize: any work with associated code or other resources must make those openly available, and must do so before the best paper committee makes its selections.

We’ve taken this step to provide a solid reward for those who share their work and help advance our field (see e.g. “Sharing is Caring”, Nissim et al. 2017, Computational Linguistics), without excluding others who cannot easily share their work (e.g. industrial authors) from participating in COLING 2018’s many tracks.

We look forward with great anticipation to this collection of papers!

Untangling biases and nuances in double-blind peer review at scale

It’s important to get reviewing right, and remove as many biases as we can. We had a discussion about how to do this in COLING, presented in this blog post in interview format. The participants are the program co-chairs, Emily M. Bender and Leon Derczynski.

LD: How do you feel about blindness in the review process? It could be great for us to have blindness in a few regards; I’ll start with the most important to me. First, reviewers do not see author identities. Second, reviewers do not see each other’s identities. Most people would adjust their own review to align with e.g. Chris Manning’s (sounds terribly boring for him if this happens!). Third, area chairs do not see author identities. Finally, area chairs do not see reviewer identities in connection with their reviews or with a paper. But I don’t know how much of this is possible within the confines of conference management software. The last seems the most risky; but reviewer identities being hidden from each other seems like a no-brainer. What do you think?

Reviewers blind to each other

EMB: It looks like we have a healthy difference of opinion here 🙂 Absolutely, reviewers should not see author identities. On reviewers not seeing each other’s identities, though, I disagree. I think the inter-reviewer discussion tends to go better when people know who they are talking to. Perhaps we can get the software to track the score changes and ask the ACs to be on guard for bigwigs dragging others to their opinions?

LD: Alright, we can try that; but after reading that report from URoch, how would you expect PhD students, postdocs, or assistant professors to have reacted to a review of Florian Jaeger’s, if they’d had or intended to have any connection with his lab? On the other side, I hear a lot from people unwilling to go against big names, because they’ll look silly. So my perception is that discussion goes worse when people know who they’re contradicting, though reviews might end up being more civil, too. I still think big names distort reviews here, despite getting reviewing wrong just as often as the small names, so having reviewers know who the others are makes for less fair reviewing.

EMB: I wonder to what extent we’ll have ‘big names’ among our reviewers. I wonder if we can get the best of both worlds, though, by revealing all reviewers’ names to each other only after the decisions are out. So people will be on good behavior in the discussions (and reviews), knowing that they’ll be associated with their remarks eventually, but won’t be swayed by big names during the process?

LD: Yes, let’s do this. OK, what about hiding authors from area chairs?

Authors and ACs

EMB: I think hiding author identities from ACs is a good idea, but we still need to handle conflicts of interest somehow. And then there are the cases where reviewers think the authors should be citing some previous work X, when X is actually by the authors themselves. Maybe we can have the small team of “roving” ACs do that work? I’m not sure how they could handle all the COI checking, though.

LD: Ah, that’s tough. I don’t know too much about how the COI process typically works from the AC side, so I can’t comment here. If we agree on the intention—that author identities should ideally be hidden from ACs—we can make the problem better-defined and share it with the community, so some development happens.

EMB: Right. Having ACs be blind to authors is also being discussed in other places in the field, so we might be able to follow in their footsteps.

Reviewers and ACs

LD: So how about reviewer identities being hidden from ACs?

EMB: I disagree again about area chairs not seeing reviewer identities next to their reviews. While a paper should be evaluated solely on its merits, I don’t think we can rely on the reviewers to get absolutely everything into their reviews. And so having the AC know who’s writing which review can provide helpful context.

LD: I suppose we are choosing ACs we hope will be strong and authoritative in their domain. Do you agree there’s a risk of bias here? I’m not convinced that knowing a reviewer’s identity helps much: all humans make mistakes with great reliability (else annotation would be easier), so what we really get is random magnification or minimization of a review’s influence, depending on the AC’s impression of a particular reviewer, even though any given review’s quality varies on its own.

EMB: True, but/and it’s even more complex: The AC can only directly detect some aspects of review quality (is it thorough? helpful?) but doesn’t necessarily have the ability to tell whether it’s accurate. Also—how are the ACs supposed to do the allocation of reviewers to papers, and do things like make sure those with more linguistic expertise are evenly distributed, if they don’t know who the reviewers are?

LD: My concern is that ACs will be biased about which reviewers are “reliable” (and anyway, no reviewer is 100% reliable). However, in the interest of simplicity: we’ve already taken steps to ensure that we have a varied, balanced AC pool this iteration, which I hope will reduce the effect of AC-reviewer bias compared to conferences with mostly static AC pools. And the problem of allocating reviewers to papers remains unsettled.

EMB: Right. Maybe we’re making enough changes this year?

LD: Right.

Resource papers

LD: An addendum: this kind of blindness may prove impossible for resource-type papers, where author anonymity may become an optionally relaxable constraint.

EMB: Well, I think people should at least go through the motions.

LD: Sure—this makes life easier, too. As long as authors aren’t torn apart during review because someone can guess the authors behind a resource.

EMB: Good point. I’ll make a note in our draft AC duties document.

Reviewing style

LD: I want to bring up review style as well. To nudge reviewers towards good reviewing style, I’d like reviewers to have the option of signing their reviews, with signatures made available to authors only at notification. The reviewer identity would not be attached to a specific review, but given in a general form: “Reviewers of this paper included: Natalie Schluter.” We know adversarial reviewing drops when reviewer identity is known, and I’d love to see CS, a discipline known for nasty reviews, begin to move in a positive direction. Indeed, as PC co-chairs of a CS-related conference, I feel we in particular have a duty to address this problem. My hope is that I can write a script to add this information, if we do it.

EMB: If the reviewers are opting in, perhaps it makes more sense for them to claim their own reviews. If I think one of my co-reviewers was a jerk, I would be less inclined to put my name to the group of reviews.

LD: That’s an interesting point. Nevertheless, I’d like us to make progress on this front. In some time-rich utopia it might make sense to have the reviewers all agree whether or not to sign all three reviews, and only have their identities revealed to each other after that, but we don’t have time. How about this: reviews may be signed, but the signatures are revealed only at the point notifications are sent out? This prevents reviewers from knowing who the others are during the process, and lets those who want to stay hidden do so, as well as protecting us all from the collateral damage that jerk reviewers cause.

This could work with a checkbox (“Sign my review with my name in the final author notification”), with the rest scripted in Softconf.

EMB: So how about: the option to sign for the authors’ view (the checkbox), plus all reviewers revealed to each other once the decisions are done?

LD: Good, let’s do that. Reviewer identities are hidden from other reviewers during the process and revealed afterwards, and reviewers have the option to sign their review via a checkbox in Softconf.

EMB: Great.
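To make that concrete, here is a minimal sketch of the kind of script Leon mentions, written in Python. The CSV export, its column names (paper_id, reviewer_name, sign_review), and the notifications structure are hypothetical placeholders for illustration, not Softconf’s actual export format or API.

    import csv
    from collections import defaultdict

    def signature_lines(assignments_csv):
        """Collect opted-in reviewer names per paper and build the general
        signature line to append to each author notification."""
        signers = defaultdict(list)
        with open(assignments_csv, newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                # 'sign_review' holds the value of the hypothetical opt-in checkbox.
                if row["sign_review"].strip().lower() == "yes":
                    signers[row["paper_id"]].append(row["reviewer_name"])
        # One general line per paper, not tied to any individual review.
        return {
            paper_id: "Reviewers of this paper included: " + ", ".join(sorted(names))
            for paper_id, names in signers.items()
        }

    # Usage sketch: append the line to each paper's notification text.
    # for paper_id, line in signature_lines("assignments.csv").items():
    #     notifications[paper_id] += "\n\n" + line

Because names are listed for the paper as a whole, and only in the final notification, the signature does not reveal which reviewer wrote which review.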

Questions

What do you think? What would you change about the double-blind process?

What kinds of invited speakers could we have?

As we begin to plan the keynote talks for COLING, we are looking for community input. The keynote talks, among the few shared experiences at a conference with multiple parallel tracks, serve both to anchor the ‘conversation’ that the field is having through the conference and to push it in new directions. In the past, speakers have come both from close to the center of our community and from outside it, bringing new and important perspectives that contextualize COLING, and sharing stories and insights that have led to great successes.

We are seeking two kinds of input:

  1. In public, in the comments on this post: What kinds of topics would you like to hear about in the invited keynotes? We’re interested both in suggestions within computational linguistics and in specific topics from related fields: linguistics, machine learning, cognitive science, and applications of computational linguistics to other fields.
  2. Privately, via this web form: If you have specific speakers you would like to nominate, please send us their contact info and any further information you’d like to share.


COLING 2018 PC Blog: Welcome!

Emily M. Bender (University of Washington) and Leon Derczynski (University of Sheffield)

We (Emily M. Bender and Leon Derczynski) are the PC co-chairs for COLING 2018, to be held in Santa Fe, NM, USA, 20-25 August 2018. Inspired by Min-Yen Kan and Regina Barzilay’s ACL 2017 PC Blog, we will be keeping one of our own. We start today with a brief post introducing ourselves and outlining our goals for COLING 2018. In later posts, we’ll describe the various plans we have for meeting those goals.

First the intros:

Emily is a Professor of Linguistics and Adjunct Professor of Computer Science & Engineering at the University of Washington, Seattle WA (USA), where she has been on the faculty since 2003 and has served as the Faculty Director of the Professional Masters in Computational Linguistics (CLMS) since its inception in 2005. Her degrees are all in Linguistics (AB UC Berkeley, MA and PhD Stanford) and her primary research interests are in grammar engineering, computational semantics, and computational linguistic typology. She is also interested in ethics in NLP, the application of computational methods to linguistic analysis, and different ways of integrating linguistic knowledge into NLP.

Leon is a Research Fellow in Computer Science at the University of Sheffield (UK), home of the ICCL, where he has been a researcher since 2012, including visiting positions at Aarhus Universitet (Denmark), Innopolis University (Russian Federation), and the University of California, San Diego (USA). His degrees are in Computer Science (MComp and PhD), also from Sheffield, and his research interests are in noisy text, unsupervised methods, and spatio-temporal information extraction. He is also interested in chunking and tagging, effective crowdsourcing, and assessing veracity and fake news.

We first met by proxy, through Tim Baldwin, at LREC 2014 in Reykjavik. Tim pointed out that we both happened to be visiting scholars at the time in a hip Danish city devoid of its own NLP group: Aarhus. Shortly after returning from Iceland, each on Tim’s recommendation, we met for lunch a few times in Aarhus, chatting about understanding language, language diversity, and the interface between data-driven computational techniques and linguistic reality. We have made a point of catching up regularly ever since, and the city is a place where we still have connections, now even more hip as the European Capital of Culture for 2017!

Then goals:

Our goals for COLING 2018 are (1) to create a program of high quality papers which represent diverse approaches to and applications of computational linguistics written and presented by researchers from throughout our international community; (2) to facilitate thoughtful reviewing which is both informative to ACs (and to us as PC co-chairs) and helpful to authors; and (3) to ensure that the results published at COLING 2018 are as reproducible as possible.

To give a bit more detail on the first goal, by diverse approaches/applications, we mean that we intend to attract (in the tradition of COLING):

  • papers which develop linguistic insight as well as papers which deepen our understanding of how machine learning can be applied to NLP — and papers that do both!
  • research on a broad variety of languages and genres
  • many different types of research contributions (application papers, resource papers, methodology papers, position papers, reproduction papers…)

We have the challenge and the privilege of taking on this role at a time when our field is growing tremendously quickly.  We hope to advance the way our conferences work by trying new things and improving the experience from all sides.  In approaching this task, we started by reviewing the strategies taken by PC chairs at other recent conferences (including COLING 2016, NAACL 2016, and ACL 2017), learning from them, and then adapting strategies based on our goals for COLING 2018.  We strongly believe that one key to achieving a diverse and strong program is community engagement.  Thus our first step towards that is starting this blog.  Over the coming weeks we will tell you more about what we are working on and seek input on various points in the process.  We look forward to working with you and hope to see many of you in Santa Fe next August!