Author response

The value of the author response mechanism is frequently debated in our field, and it can be a source of stress for authors. On the one hand, when our work is being reviewed by others, it can feel disempowering not to have the opportunity to respond to those reviews. On the other hand, there is the perennial question of whether author responses ever “help” (in the sense of taking a paper over the line from “reject” to “accept”). (On that point, see this very thoughtful analysis by Hal Daumé III of the process for NAACL 2013.) And finally, there is the issue that author responses must be turned around in a short time and can be tricky to write: how do we strike the right tone (firm, polite, confident; not pleading or angry), especially when we might still be feeling the sting of negative reviews? As reviewers, we have seen both very effective author responses (expressing gratitude for feedback and pointing out sources of misunderstanding) and very ineffective ones (pure vitriol, or long lists of promises of what will be accomplished before the camera-ready version).

In light of all of this, what we settled on for COLING 2018 is an optional author response to be seen by the area chairs only – and not the reviewers. Thus we are providing authors with the opportunity to flag reviewer misunderstandings for area chairs and to answer questions raised by reviews. The latter should only be done when the information is already available and can be indicated in a short statement (e.g. “Indeed, we did set the random seed and will include this information in the camera ready” but not “That is an interesting idea for a further experiment, we will run that one and include the numbers in the camera ready”). We also note that author response is optional and area chairs will not read anything into the lack of an author response.

Author response will run from 20-24 April.

Why this route? Well, the quantitative evidence is that pointing out reviewer mistakes rarely leads to a change in scores. The folk knowledge has long been that responses are really used by ACs to detect misaligned reviews. So rather than encourage an intrinsically difficult piece of communication that has had little to no effect in the past, we instead route the replies directly to the party with the authority to act on them. This creates a little extra work for ACs, but as they are acting in pairs and areas are all roughly the same compact size, our hope is that their time can be spent more on working out the dialog around a paper and less on administering a huge set of authors and reviewers.

Lessons Learned

The role of PC chair is interesting in many ways. It provides a perhaps unparalleled opportunity to influence the way in which research is approached and presented in our field. For COLING 2018, we have been taking this responsibility very seriously and working hard, through both our decisions about the review process and the publicizing of those ideas on this blog, to push the field in directions that we believe will be fruitful, including stronger interdisciplinarity and more reproducibility.

On the flip side, the role of PC chair comes with some serious downsides. One is the heart-rending process of deciding on desk rejects and then informing their authors. We did our utmost to do this as fairly as humanly possible, starting with publicizing our desk reject policy. We hoped that move would reduce the number of desk rejects, and it may have, but there were still a handful of papers rejected without review under the policy.

The most common reason for a desk reject, by a long way, was paper length (i.e. documents submitted with more than 9 content pages). Papers in a completely incorrect template were also desk rejected (we saw e.g. NAACL, ACL, and NIPS formats), as were those with squashed line spacing, reduced font size, removed author boxes, and so on. Another reason for desk rejection was broken anonymisation; some papers, for example, linked to the author’s private GitHub repository—the sort of thing that can really wait until camera ready. One paper was rejected for breaking the arXiv embargo period, having been published there fewer than 30 days before the COLING deadline. No edits were allowed after the deadline had passed. This was a very unpleasant process overall, and we can only plead with authors to follow the guidelines so that their work gets the attention it needs instead of rejection without feedback. That way there don’t have to be any desk rejects at all. They are often desperately unpleasant to send, and probably even worse to receive.

In this blog post, we wanted to briefly reflect on what we have learned about the kinds of practices that back people into corners and lead to the mistakes that result in desk rejects. In general, we see that there is a culture of last-minutism in our field. Deadlines can inspire people to get things done that otherwise seem impossible, but doing things in a rush also has downsides. Here are some DOs and DON’Ts of paper submission that we hope will spare people some pain in the future:

  • Do access the submission system early, so you know what awaits.
  • Do read the CFP carefully. Such documents can be intimidating, especially for first-time submitters, but the information there all has a purpose, and it’s easier to make use of if you get it early.
  • Don’t leave submitting your final paper until the absolute last minute. If something goes wrong (e.g. submitting the wrong pdf, losing your internet connection), you’ll have missed the deadline. This happens regularly and is wasteful. Sometimes you might not find out it was the wrong PDF until after the deadline, or might be so rushed that the paper spills over the page limit unnoticed. This means the hard work has to wait for another conference.

And finally a couple of thoughts on interacting with PC chairs, especially in large conferences:

  • Please don’t ask the PC chairs to upload a PDF for you after the deadline. The deadline is a deadline. Asking for it to be bent is asking the PC chairs to not apply policies evenly and fairly.
  • Do be aware that the PC chairs in a conference this size are communicating with ~1000 authors and ~1000 reviewers, and keep that in mind as you make requests.

COLING 2018 Submissions Overview

We’ve had a successful COLING so far, with over a thousand papers submitted, covering a variety of areas. In total, 1017 papers were submitted to the main conference, all full-length.

Each submitted paper had a type assigned by the authors, which affects how it is reviewed. The types were developed based on our earlier blog post on paper types. The “NLP Engineering Experiment paper” was unsurprisingly the dominant type, though it made up only 65% of all papers. We were very happy to receive 25 survey papers, 31 position papers, and 35 reproduction papers—as well as a solid 106 resource papers and a strong showing of 163 computationally-aided linguistic analysis papers, the second largest contingent.

Some papers were withdrawn or desk rejected before review began in earnest. Between ACs and PC co-chairs, in total, 32 papers were rejected without review. Excluding desk rejects, so far 41 papers have been withdrawn from consideration by the authors.

Allocating papers to areas gave each area a mean and median of 27 papers. The largest area has 31 papers and the smallest 19. We interpret this as a sign that area chairs will not be overloaded, which should lead to better quality in both the reviews and their interpretation.

Author survey results

Shortly after the submission deadline, we sent out a survey to our authors, with the goal of better understanding how our outreach was working.

Respondents

We sent the notification of the survey via START to all corresponding authors (so roughly 1000 people) and asked them to share it with co-authors. The survey recorded 434 total responses, which is a pretty satisfying response rate!

Of those 434, 302 (69.6%) indicated that they were submitting to COLING for the first time, and 101 (23.3%) to a major NLP conference for the first time.

Outreach

We asked how people first found out about COLING 2018. The most popular response was “Web search” (44.2%), followed by “Call for Papers sent over email (e.g. corpora mailing list, ACL mailing list)” (35.9%), then “Other” (12.4%) and “Social media” (7.4%).  The “Other” answers included word-of-mouth, knowing to expect COLING to come around in 2018, and websites that aggregate CFPs.

Paper types

We wanted to find out if people were aware of the paper types (since this is relatively unusual in our field) before submitting their papers, and if so, how they found out. Most—349 (80.4%)—were aware of the paper types ahead of time.  Of these, the vast majority (93.4%) found out about the paper types via the Call for Papers. Otherwise, people found out because someone else told them (7.4%), via our Twitter or Facebook feeds (6.0%), or via our blog (3.7%).

We also asked whether it was clear to authors which paper type was appropriate for their paper and whether they think paper types are a good idea. The answers in both cases were pretty strongly positive: 78.8% said it was clear and 91.0% said it was a good idea. (Interestingly, 74 people who said it wasn’t clear which paper type was a good fit for theirs nonetheless said paper types were a good idea, and 21 people who thought it was clear which paper type fit nonetheless said they weren’t.)

Writing mentoring program

We wanted to know if our authors were aware of the writing mentoring program, and for those who were but didn’t take advantage of it, why not. 277 respondents (63.8%) said they were aware of it. The most common reason chosen for not taking advantage of it was “I didn’t/couldn’t have a draft ready in time.” (150 respondents), followed by “I have good mentoring available to me in my local institution” (97 respondents). The other two options available in that check-all-that-apply question were “I have a lot of practice writing papers already” (74 respondents) and “Other” (10). Alas, a few people indicated that they only discovered it too late.

Other channels

We have been putting significant effort into getting information out about our process, but still worry that the channels we’re using aren’t reaching everyone. We asked “What other channels would you like to see information like this publicized on?”, referring specifically to the paper types. Most people did not respond, or indicated that what we’re doing is enough. Other responses included: LINGUIST List, LinkedIn, Instagram, ResearchGate, and email. Ideas for email included creating a conference-specific mailing list that people can subscribe to and sending out messages to all email addresses registered in START.

We include these ideas here for posterity (and the benefit of future people filling this role). We have used LinkedIn and Weibo in a limited capacity and are using Twitter and Facebook. Adding additional social media (Instagram) sounds plausible, but is not in our plans for this year. An email list that people could opt into for updates makes a lot of sense, though there’s still the problem of getting the word out about that list. Perhaps a good way to do that would be to include that info in the CFP (starting from the first CFP). Emailing everyone through START may not be feasible (depending on START’s email privacy policy) and at any rate wouldn’t help reach those who have never submitted to a compling/NLP conference before.

Blog readership

Of course we wanted to know if our authors are reading this blog.  44.7% of respondents weren’t aware of the blog (prior to being asked that question!), 15.0% had found it only recently, 24.9% had been aware of it for at least a month but less than 6, and 15.4% indicated that they’ve been aware of it for at least 6 months. 9.2% of respondents read (almost) everything we post, 32.0% read it sometimes, and the remainder don’t read it or read it only rarely.

We also wanted to know if the PC blog helped our authors to understand our submission process or shape their submissions to COLING 2018. 22.8% indicated “Yes, a lot!” and 28.1% “Yes, a little”. On the no side, 22.1% chose “No, not really” and 27.0% “No, not at all”. “Yes, a lot!” people, we’re doing this for you 🙂


Outstanding Mentors

The COLING 2018 writing mentoring program went extremely well—we are grateful to all of the mentors who volunteered their time to provide thoughtful comments to the authors who participated. Furthermore, the prompts we used in the writing mentoring form (listed in the description of the program) were effective in eliciting useful feedback for authors.

There is great willingness in our field to participate on the mentoring side. Over 100 mentors signed up, which means we could have provided mentoring for even more papers than we did. It seems that the biggest hurdle to success for such a program is getting the word out to those who would most likely benefit from it. (We’ve got another blog post in the works about outreach & responses to our author survey.)

Reviewing the work of the mentors to find those to recognize as outstanding mentors was inspiring—and the task of choosing difficult—because so many did such a great job. Even if the mentored papers aren’t ultimately accepted to COLING, the authors who received mentoring will have benefited from thoughtful, constructive feedback on their work, which we hope will inform both future writing on the same topic and perhaps even their approach to writing on other topics.

Against that background, the following mentors distinguished themselves as particularly outstanding:

  • Kevin Cohen
  • Carla Parra Escartín
  • David Mimno
  • Emily Morgan
  • Irina Temnikova
  • Jennifer Williams

Thank you to all of our mentors!

Reviewing in an interdisciplinary field

The process of peer review, when functioning at its best, ensures that work published in archival venues is carefully vetted, such that the results are likely to be reliable and the presentation of those results interpretable by scholars in the field at large. But what does “peer review” mean for an interdisciplinary field? Who are the relevant “peers”? We believe that for both goals—vetting the reliability of results and vetting the readability of their presentation—reviewing in an interdisciplinary field ideally involves reviewers coming from different perspectives.

In our particular context, we had the added (near-)novelty of our keyword-based area assignment system. (Near-novelty, because this was pioneered by NAACL 2016.) This means our areas are not named, but rather emerge from the clusters of both reviewer interests and paper topics. On the upside, this means that really popular topics (“semantics”, or “MT”) can be spread across areas, so we don’t have area chairs scrambling for large numbers of additional reviewers. On the downside, the areas can’t be named until the dust has settled, so reviewers don’t necessarily have a clear sense of which area (in the traditional sense) they are assigned to. In addition, interests that were very popular and therefore not highly discriminative (e.g. “MT”) weren’t given much weight on their own by the clustering algorithm.
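For readers curious how areas can “emerge” from keywords, here is a minimal sketch of the general idea, under our own illustrative assumptions: papers are represented as bags of keywords, very common keywords are down-weighted idf-style, and a clustering algorithm groups similar papers into unnamed areas. This is purely an exposition aid, not the actual Softconf/START implementation, whose details we don’t control.

# Illustrative sketch of keyword-based area formation (our own toy
# assumption for exposition; NOT the actual Softconf/START algorithm).
# Papers are bags of keywords; tf-idf down-weights very popular keywords
# such as "MT", and k-means groups papers into emergent, unnamed areas.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

# Toy data: one keyword string per submitted paper.
papers = [
    "MT neural-networks low-resource",
    "MT evaluation metrics",
    "semantics parsing neural-networks",
    "semantics lexical-resources annotation",
    "dialogue generation neural-networks",
    "dialogue evaluation crowdsourcing",
]

vectorizer = TfidfVectorizer(token_pattern=r"[^ ]+")  # keywords, not words
X = vectorizer.fit_transform(papers)

kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
for paper, area in zip(papers, kmeans.fit_predict(X)):
    print(f"area {area}: {paper}")

In the real system, reviewer keyword profiles presumably live in the same space, so that each emergent cluster comes with enough matching reviewer expertise attached to it.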

During the bidding process, we had a handful of requests from reviewers to change areas (really a small number, considering the overall size of the reviewing pool!). These requests came in three types. First, there were a few who just said: There’s relatively little in this area that I feel qualified to review, could I try a different area? In all cases, we were able to find another area that was a better match.

Second, there were reviewers who said “My research interest is X, and I don’t see any papers on X in my area. I only want to review papers on X.” We found these remarks a little surprising, as our understanding of the field is that it is not a collection of independent areas that don’t inform each other, but rather a collection of ways of looking at the same very general and very pervasive phenomenon: human language and the ways in which it can be processed by computers. Indeed, we structured the keywords into multiple groups—targets, tasks, approaches, languages and genres—to increase the chance of intersection with at least a few areas of any given reviewer’s expertise. We very much hope that the majority of researchers in our field read outside their specific subfield and are open to influence from other subfields on their own work.

The third type was reviewers, typically of a more linguistic than computational orientation, who expressed concern that because they aren’t familiar with the details of the models being used, they wouldn’t be able to review effectively. To these reviewers, we pointed out that it is equally important to look critically at the evaluation (what data is being used and how) and the relationship of the work to the linguistic concepts it is drawing on. Having reviewers with deep linguistic expertise is critical and both COLING and the authors very much benefit from it.

To create the best cross-field review, then, it helps to examine each of one’s strengths and compare them with the multiple facets presented by any paper. No single reviewer is likely to be expert in every area and aspect of a manuscript, but there’s a good chance that, as long as some care has been applied to matching, there will be some crossover expertise. Be bold with that expertise. Indeed, the knowledge that multiple reviewers bring is often complementary: as a reviewer, you can bring more knowledge to at least one aspect of a paper than the others sharing the workload—even if that is not the aspect you initially expected.

COI policy

With a conference the size of COLING, managing conflicts of interest in the reviewing process is a challenge. Below, we sketch our COI handling policy in the interest of transparency. In all cases, our goal is to maintain the fairness and integrity of the double-blind review process while making sure that our hard-working volunteer program committee members can still submit to COLING themselves.

reviewer <> author

Softconf will automatically COI any reviewers from the same institution as an author.  In addition, in the bidding phase, we will ask reviewers to indicate any COIs that were not automatically caught. When ACs match reviewers to papers, this will be done so as to avoid COIs.

AC <> author

Any paper for which an AC has a COI (beyond simply sharing an affiliation) will be handled by rerouting the paper to another area. Here we’re talking about papers authored by ACs, their students, or other close collaborators. Since the areas are emergent (rather than strictly defined a priori), we anticipate it being relatively straightforward to find the next best matching area for all such papers. Papers authored by ACs themselves are relatively straightforward to detect. Beyond that, we will be asking ACs to identify any such COI papers as we assign papers to areas.

PC chair <> author

Perhaps the trickiest case is COIs involving the PC co-chairs ourselves. Neither of us is submitting our own papers to the COLING main conference. (Workshops, being handled entirely separately, are fair game in principle.) However, the fact that we’ve taken on this role shouldn’t prevent our students and other close collaborators from submitting to COLING. In such cases, the entire process (including assignment to an area or possibly to the “Special Circumstances” ACs, assignment to reviewers, and the final accept/reject decision) will be overseen by the other co-chair in conjunction with the General Chair, Pierre Isabelle. This way, we still ensure that two people are present at every level of chairing.

Who gets to author a paper? A note on the Vancouver recommendations

At COLING 2018, we require submitted work to follow the Vancouver Convention on authorship – i.e. who gets to be an author on a paper. This guest post by Željko Agić of ITU Copenhagen introduces the topic.


One of the basic principles of publishing scientific research is that research papers are authored and signed by researchers.

Recently, the tenet of authorship has sparked some very interesting discussions in our community. In light of the increased use of preprint servers, we have been questioning the *ACL conference publication workflows. These discussions have mostly had to do with peer review biases, but also with authorship: should we enable blind preprint publications?

The notion of unattributed publications mostly does not sit well with researchers. We do not even know how to cite such papers, while we can invoke entire research programs in our paper narratives through a single last name.

Authorship is of crucial importance in research, and not just in writing up our related work sections. This goes without saying for all of us fellow researchers. While in everyday language an author is simply a writer or an instigator of a piece of work, the question is slightly more nuanced in publishing scientific work:

  • What activities qualify one for paper authorship?
  • If there are multiple contributors, how should they be ordered?
  • Who decides on the list of paper authors?

These questions have sparked many controversies over the centuries of scientific research. An F. D. C. Willard, short for Felis Domesticus Chester, has authored a physics paper, much like Galadriel Mirkwood, a Tolkien-loving Afghan hound versed in medical research. Others have built on the shoulders of giants such as Mickey Mouse and his prolific group.

Yet, authorship is no laughing matter: it can make or break research careers, and its (un)fair treatment can make the difference between a wonderful research group and, at the least, an uneasy one. A fair and transparent approach to authorship is of particular importance to early-stage researchers. There, the tall tales of PhD students might include the following conjectures:

  • The PIs in medical research just sign all the papers their students author.
  • In algorithms research the author ordering is always alphabetical.
  • Conference papers do not make explicit the individual author contributions.
  • The first and the last author matter the most.

The curiosities and the conjectures listed above all stem from the fact that there seems to be no awareness of any standard rulebook to play by in publishing research. This in turn gives rise to the many different traditions in different fields.

Yet, there is a rulebook!

One prominent attempt to put forth a set of guidelines for determining authorship is the Vancouver Group recommendations. The Vancouver Group is the International Committee of Medical Journal Editors (ICMJE), which in 1985 introduced a set of criteria for authorship. The criteria have seen many updates over the years, to match the latest developments in research and publishing. Their scope far surpasses the topic of authorship, spanning the whole scientific publication process: reviewing, editorial work, publishing, copyright, and the like.

While the recommendations stem from the medical field, they have since been broadened and are now widely adopted. The following is an excerpt from the recommendations relating to the authorship criteria.

The ICMJE recommends that authorship be based on the following 4 criteria:

1. Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND

2. Drafting the work or revising it critically for important intellectual content; AND

3. Final approval of the version to be published; AND

4. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

(…)

All those designated as authors should meet all four criteria for authorship, and all who meet the four criteria should be identified as authors. Those who do not meet all four criteria should be acknowledged.

(…)

These authorship criteria are intended to reserve the status of authorship for those who deserve credit and can take responsibility for the work. The criteria are not intended for use as a means to disqualify colleagues from authorship who otherwise meet authorship criteria by denying them the opportunity to meet criterion #s 2 or 3.

Note that there is an AND operator tying the four criteria together, but there are some ORs within the individual entries. Thus, in essence, to adhere to the Vancouver recommendations for authorship one has to meet all four requirements, while each of the four may be met minimally.

To take one example:

If you substantially contributed to 1) data analysis, and to 2) revising the paper draft, and then you subsequently 3) approved of the final version and 4) agreed to be held accountable for all the work, then congrats! you have met the authorship criteria!

One could take other routes through the four criteria, some arguably easier, some even harder.
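If it helps to see the AND-of-ORs structure spelled out, here is a toy formalization under our own naming assumptions; the plain-English recommendations above remain the authoritative statement.

# Toy formalization of the AND-of-ORs structure of the ICMJE criteria.
# The predicate names and contribution labels are our own illustrative
# assumptions; the recommendations themselves are the authoritative text.

SUBSTANTIAL_CONTRIBUTIONS = {
    "conception", "design", "data acquisition", "data analysis",
    "data interpretation",
}

def qualifies_as_author(contributions, drafted_or_revised,
                        approved_final_version, accountable):
    # Criterion 1 is an OR over kinds of substantial contribution;
    # criteria 1-4 are then tied together with AND.
    criterion_1 = bool(SUBSTANTIAL_CONTRIBUTIONS & set(contributions))
    return (criterion_1 and drafted_or_revised
            and approved_final_version and accountable)

# The example from the text: data analysis + revising the draft
# + approving the final version + accountability.
print(qualifies_as_author({"data analysis"}, True, True, True))  # True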

In my own view, we as a field should hope for the Vancouver recommendations to have already been adopted in NLP research, if only implicitly through the way our research groups and collaborations work.

Yet, are they? What are your thoughts? In your view, are the Vancouver recommendations well-matched with the COLING 2018 paper types? In general, are there aspects of your work in NLP that are left uncovered by the authorship criteria? Might there be at least some controversy and discussion potential to this matchup? 🙂

Metadata and COLING submissions

As the deadline for submission draws near, we’d like to alert our authors to a few things that are a bit different from previous COLINGs and other computational linguistics/NLP venues in the hopes that this will help the submission process go smoothly.

Paper types

Please consider the paper type you indicate carefully, as this will affect what the reviewers are instructed to look for in your paper. We encourage you to read the description of the paper types and especially the associated reviewer questions carefully. Which set of questions would you most like to have asked of your paper? (And if reading the questions inspires you to reframe/edit a bit to better address them before submitting, that is absolutely fair game!)

Emiel van Miltenburg raised the point on Twitter last week that it can be difficult to categorize papers and in particular that certain papers might fall between our paper types, combining characteristics of more than one, or being something else entirely.

Emiel and colleagues wondered whether we could implement a “tagging” system where authors could indicate the range of paper types their paper relates to. That is an intriguing idea, but it doesn’t work with the way we are using paper types to improve the diversity and range of papers at COLING. As noted above, the paper types entail different questions on the review forms. We’re doing that because otherwise it seems that everything gets evaluated against the NLP Engineering Experiment paper type, which in turn means it’s hard to get papers of the other types accepted. And as we hope we’ve made blindingly clear, we really are interested in getting a broad range of paper types!

Keywords

The other aspect of our submission form that will have a strong impact on how your paper is reviewed is the keywords. Following the system pioneered by Ani Nenkova and Owen Rambow as PC co-chairs for NAACL 2016, we have asked all our reviewers to describe their areas of expertise along five dimensions:

  1. Linguistic targets of study
  2. Application tasks
  3. Approaches
  4. Languages
  5. Genres

(All five of these have a none-of-the-above/not-applicable option.) The reviewers (and area chairs) each indicated all of the items on each dimension that they have the expertise and interest to review. For authors, we ask you to indicate which items on each dimension best describe the paper you are submitting. Softconf will then place your paper in an area, via an assignment of papers to areas that best matches reviewer expertise to the papers submitted.
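To make concrete how the keywords might be used, here is a minimal sketch of scoring paper-reviewer fit from keyword overlap across the five dimensions. The scoring function, weighting, and keyword names are purely our illustrative assumptions, not the actual Softconf matching algorithm.

# Illustrative sketch of scoring paper-reviewer fit from keyword overlap
# (an assumption for exposition only, not the actual Softconf logic).
DIMENSIONS = ["targets", "tasks", "approaches", "languages", "genres"]

def match_score(paper_keywords, reviewer_keywords):
    """Average, over the dimensions the paper uses, of the fraction of
    the paper's keywords that the reviewer also selected."""
    fractions = []
    for dim in DIMENSIONS:
        paper = set(paper_keywords.get(dim, []))
        reviewer = set(reviewer_keywords.get(dim, []))
        if not paper:
            continue  # "none of the above" on this dimension: skip it
        fractions.append(len(paper & reviewer) / len(paper))
    return sum(fractions) / len(fractions) if fractions else 0.0

paper = {"targets": ["morphology"], "tasks": ["machine translation"],
         "approaches": ["neural network"], "languages": ["Turkish"]}
reviewer = {"targets": ["morphology", "syntax"],
            "tasks": ["machine translation"],
            "approaches": ["statistical"],
            "languages": ["Turkish", "Finnish"]}
print(match_score(paper, reviewer))  # 0.75 for these toy profiles

The point of the sketch is simply that every dimension you fill out carefully gives the system more signal for routing your paper to reviewers who know that territory.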

In sum: To ensure the most informed reviewing possible of your paper, please fill out these keywords carefully.  We urge you to start your submission in the system ahead of time so you aren’t trying to complete this task in a hurry just at the deadline.

Dual submission policy

Our Call for Papers indicates the following dual submission policy:

Papers that have been or will be under consideration for other venues at the same time must indicate this at submission time. If a paper is accepted for publication at COLING, it must be immediately withdrawn from other venues. If a paper under review at COLING is accepted elsewhere and authors intend to proceed there, the COLING committee must be notified immediately.

We have added a field in the submission form for you to be able to indicate this information.

LRE Map

COLING 2018 is participating in the LRE Map, as described in this guest post by Nicoletta Calzolari. In the submission form, you are asked to provide information about language resources your research has used—and those it has produced. Do not worry about anonymity on this form: this information is not shared with reviewers.

Things to do in Santa Fe in late August (2): The Indian Market

The 97th Santa Fe Summer Indian Market

Santa Fe Indian Market August 18-19, 2018
Saturday 7am – 5pm
Sunday 8am-5pm
Historic Downtown Plaza

The Santa Fe Indian Market is in the historic downtown plaza in beautiful Santa Fe, New Mexico. The streets of downtown are transformed into the largest Native arts market and exhibition. There is nowhere else in the world you can go and see this many Native artists exhibiting in one place.

Read about it here: http://swaia.org/About_SWAIA/index.html

For a schedule of events and ticket information, please follow this link: http://swaia.org/Indian_Market/2017_Schedule_and_Tickets/

For reviews on Tripadvisor go here:

https://www.tripadvisor.com/Attraction_Review-g60958-d4735854-Reviews-Santa_Fe_Indian_Market-Santa_Fe_New_Mexico.html