Reviewing in an interdisciplinary field

The process of peer review, when functioning at its best, ensures that work published in archival venues is carefully vetted, such that the results are likely to be reliable and the presentation of those results is interpretable by scholars in the field at large. But what does “peer review” mean for an interdisciplinary field? Who are the relevant “peers”? We believe that for both goals (vetting reliability of results and vetting readability of presentation), reviewing in an interdisciplinary field ideally involves reviewers coming from different perspectives.

In our particular context, we had the added (near-)novelty of our keyword-based area assignment system. (Near-novelty, because this was pioneered by NAACL 2016.) This means our areas are not named, but rather emerge from the clusters of both reviewer interests and paper topics. On the upside, this means that really popular topics (“semantics”, or “MT”) can be spread across areas, such that we don’t have area chairs scrambling for large numbers of additional reviewers. On the downside, the areas can’t be named until the dust has settled, so reviewers don’t necessarily have a clear sense of which area (in the traditional sense) they are assigned to. In addition, interests that were very popular and therefore not highly discriminative (e.g. “MT”) weren’t given very much weight alone by the clustering algorithm.

During the bidding process, we had a handful of requests from reviewers to change areas (really a small number, considering the overall size of the reviewing pool!). These requests came in three types. First, there were a few who just said: There’s relatively little in this area that I feel qualified to review, could I try a different area? In all cases, we were able to find another area that was a better match.

Second, there were reviewers who said “My research interest is X, and I don’t see any papers on X in my area. I only want to review papers on X.” We found these remarks a little surprising, as our understanding of the field is that it is not a collection of independent areas that don’t inform each other, but rather a collection of ways of looking at the same very general and very pervasive phenomenon: human language and the ways in which it can be processed by computers. Indeed, we structured the keywords into multiple groups—targets, tasks, approaches, languages and genres—to increase intersection on at least a few areas of any given reviewer’s expertise. We very much hope that the majority of researchers in our field read outside their specific subfield and are open to influence from other subfields on their own.

The third type was reviewers, typically of a more linguistic than computational orientation, who expressed concern that because they aren’t familiar with the details of the models being used, they wouldn’t be able to review effectively. To these reviewers, we pointed out that it is equally important to look critically at the evaluation (what data is being used and how) and the relationship of the work to the linguistic concepts it is drawing on. Having reviewers with deep linguistic expertise is critical and both COLING and the authors very much benefit from it.

To create the best cross-field review, then, it helps to examine each of one’s strengths and compare these with the multiple facets presented by any paper. No single reviewer is likely to be expert in every area and aspect of a manuscript, but as long as some care has been applied to matching, there’s a good chance of some crossover expertise. Be bold with that expertise. And indeed, the coverage of knowledge that multiple reviewers have is often complementary. As a reviewer, you can bring some knowledge to reviewing at least one aspect of a paper (more so than others sharing the workload), even if that is not the aspect initially expected.

COI policy

With a conference the size of COLING, managing conflicts of interest in the reviewing process is a challenge. Below, we sketch our COI handling policy, in the interest of transparency. In all cases, our goals are to maintain the fairness and integrity of the double-blind review process while also making sure that our hard-working volunteer program committee members can still submit to COLING.

reviewer <> author

Softconf will automatically COI any reviewers from the same institution as an author.  In addition, in the bidding phase, we will ask reviewers to indicate any COIs that were not automatically caught. When ACs match reviewers to papers, this will be done so as to avoid COIs.
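To make the two reviewer COI checks concrete, here is a minimal Python sketch. It is not Softconf's implementation: the `Person` record, the function names, and the case-insensitive institution comparison are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple


@dataclass(frozen=True)
class Person:
    # Hypothetical record; a real system stores far more than this.
    name: str
    institution: str


def has_coi(reviewer: Person, authors: Iterable[Person],
            declared: FrozenSet[Tuple[str, str]] = frozenset()) -> bool:
    """True if the reviewer shares an institution with any author, or has
    declared a conflict with one (as a (reviewer_name, author_name) pair)."""
    for author in authors:
        same_place = (reviewer.institution.strip().lower()
                      == author.institution.strip().lower())
        if same_place or (reviewer.name, author.name) in declared:
            return True
    return False


def eligible_reviewers(reviewers: Iterable[Person], authors: List[Person],
                       declared: FrozenSet[Tuple[str, str]] = frozenset()
                       ) -> List[Person]:
    """Reviewers who may be matched to a paper with the given authors."""
    return [r for r in reviewers if not has_coi(r, authors, declared)]
```

The `declared` set stands in for the conflicts reviewers report during bidding, that is, the ones the automatic institution check cannot catch.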

AC <> author

Any paper for which an AC has a COI (beyond simply sharing an affiliation) will be handled by rerouting the paper to another area. Here we’re talking about papers authored by ACs, their students, or other close collaborators. Since the areas are emergent (rather than strictly defined a priori) we anticipate it being relatively straightforward to find the area that is the next best match for all such papers. For papers authored by ACs themselves, this is relatively straightforward to detect. Beyond that, we will be asking ACs to identify any such COI papers as we assign papers to areas.

PC chair <> author

Perhaps the trickiest case is COIs involving the PC co-chairs ourselves. Neither of us is submitting our own papers to the COLING main conference. (Workshops, being handled entirely separately, are fair game in principle.) However, the fact that we’ve taken on this role shouldn’t prevent our students and other close collaborators from submitting to COLING. In this case, the entire process (including assignment to an area or possibly the “Special Circumstances” ACs, assignment to reviewers, and final accept/reject decision) will be overseen by our counterpart in conjunction with the General Chair, Pierre Isabelle. This way, we still ensure that two people are present at every level of chairing.

Who gets to author a paper? A note on the Vancouver recommendations

At COLING 2018, we require submitted work to follow the Vancouver Convention on authorship – i.e. who gets to be an author on a paper. This guest post by Željko Agić of ITU Copenhagen introduces the topic.

One of the basic principles of publishing scientific research is that research papers are authored and signed by researchers.

Recently, the tenet of authorship has sparked some very interesting discussions in our community. In light of the increased use of preprint servers, we have been questioning the *ACL conference publication workflows. These mostly had to do with the peer review biases, but also with authorship: Should we enable blind preprint publications?

The notion of unattributed publications mostly does not sit well with researchers. We do not even know how to cite such papers, while we can invoke entire research programs in our paper narratives through a single last name.

Authorship is of crucial importance in research, and not just in writing up our related work sections. This goes without saying to all of us fellow researchers. While in everyday language an author is simply a writer or an instigator of a piece of work, the question is slightly more nuanced in publishing scientific work:

  • What activities qualify one for paper authorship?
  • If there are multiple contributors, how should they be ordered?
  • Who decides on the list of paper authors?

These questions have sparked many controversies over the centuries of scientific research. An F. D. C. Willard, short for Felis Domesticus Chester, has authored a physics paper, similar to Galadriel Mirkwood, a Tolkien-loving Afghan hound versed in medical research. Others have built on the shoulders of giants such as Mickey Mouse and his prolific group.

Yet, authorship is no laughing matter: It can make and break research careers, and its (un)fair treatment can make a difference between a wonderful research group and an uneasy one at the least. A fair and transparent approach to authorship is of particular importance to early-stage researchers. There, the tall tales of PhD students might include the following conjectures:

  • The PIs in medical research just sign all the papers their students author.
  • In algorithms research the author ordering is always alphabetical.
  • Conference papers do not make explicit the individual author contributions.
  • The first and the last author matter the most.

The curiosities and the conjectures listed above all stem from the fact that there seems to be no awareness of any standard rulebook to play by in publishing research. This in turn gives rise to the many different traditions in different fields.

Yet, there is a rulebook!

One prominent attempt to put forth a set of guidelines for determining authorship is the Vancouver Group recommendations. The Vancouver Group are the International Committee of Medical Journal Editors (ICMJE), who in 1985 introduced a set of criteria for authorship. The criteria have seen many updates over the years, to match the latest developments in research and publishing. Their scope far surpasses the topic of authorship, and spans across the scientific publication process: reviewing, editorial work, publishing, copyright, and the like.

While the recommendations do stem from the medical field, they are nowadays broadened and thus widely adopted. The following is an excerpt from the recommendations in relation to authorship criteria.

The ICMJE recommends that authorship be based on the following 4 criteria:

1. Substantial contributions to the conception or design of the work; or the acquisition, analysis, or interpretation of data for the work; AND

2. Drafting the work or revising it critically for important intellectual content; AND

3. Final approval of the version to be published; AND

4. Agreement to be accountable for all aspects of the work in ensuring that questions related to the accuracy or integrity of any part of the work are appropriately investigated and resolved.

(…)

All those designated as authors should meet all four criteria for authorship, and all who meet the four criteria should be identified as authors. Those who do not meet all four criteria should be acknowledged.

(…)

These authorship criteria are intended to reserve the status of authorship for those who deserve credit and can take responsibility for the work. The criteria are not intended for use as a means to disqualify colleagues from authorship who otherwise meet authorship criteria by denying them the opportunity to meet criterion #s 2 or 3.

Note that there is an AND operator tying the four criteria, but there are some ORs within the individual entries. Thus, in essence, to adhere to the Vancouver recommendations for authorship, one has to meet all four requirements, while in meeting each of the four, one is allowed to meet them minimally.

To take one example:

If you substantially contributed to 1) data analysis, and to 2) revising the paper draft, and then you subsequently 3) approved of the final version and 4) agreed to be held accountable for all the work, then congrats! you have met the authorship criteria!

One could take other routes through the four criteria, some arguably easier, while some even harder.
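The AND/OR structure of the four criteria can be written down as a small boolean check. The following Python sketch is only an illustration of that logic: the field names are made up here, and each flag is assumed to already mean a substantial contribution of that kind.

```python
from dataclasses import dataclass


@dataclass
class Contribution:
    # Criterion 1 alternatives (any one suffices).
    conception_or_design: bool = False
    data_acquisition: bool = False
    data_analysis: bool = False
    data_interpretation: bool = False
    # Criterion 2 alternatives (any one suffices).
    drafted: bool = False
    revised_critically: bool = False
    # Criteria 3 and 4 have no alternatives.
    approved_final_version: bool = False
    agreed_accountable: bool = False


def meets_authorship_criteria(c: Contribution) -> bool:
    """All four criteria must hold; ORs appear only inside criteria 1 and 2."""
    criterion_1 = (c.conception_or_design or c.data_acquisition
                   or c.data_analysis or c.data_interpretation)
    criterion_2 = c.drafted or c.revised_critically
    return (criterion_1 and criterion_2
            and c.approved_final_version and c.agreed_accountable)


# The example route from the text: data analysis, revising the draft,
# approving the final version, and agreeing to be accountable.
example = Contribution(data_analysis=True, revised_critically=True,
                       approved_final_version=True, agreed_accountable=True)
print(meets_authorship_criteria(example))
```

Dropping any one of the four, for instance final approval, makes the check fail. In the recommendations, such a contributor is acknowledged rather than listed as an author.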

In my own view, we as a field should hope for the Vancouver recommendations to have already been adopted in NLP research, if only implicitly through the way our research groups and collaborations work.

Yet, are they? What are your thoughts? In your view, are the Vancouver recommendations well-matched with the COLING 2018 paper types? In general, are there aspects of your work in NLP that are left uncovered by the authorship criteria? Might there be at least some controversy and discussion potential to this matchup? 🙂

Metadata and COLING submissions

As the deadline for submission draws near, we’d like to alert our authors to a few things that are a bit different from previous COLINGs and other computational linguistics/NLP venues in the hopes that this will help the submission process go smoothly.

Paper types

Please consider the paper type you indicate carefully, as this will affect what the reviewers are instructed to look for in your paper. We encourage you to read the description of the paper types and especially the associated reviewer questions carefully. Which set of questions would you most like to have asked of your paper? (And if reading the questions inspires you to reframe/edit a bit to better address them before submitting, that is absolutely fair game!)

Emiel van Miltenburg raised the point on Twitter last week that it can be difficult to categorize papers and in particular that certain papers might fall between our paper types, combining characteristics of more than one, or being something else entirely.

Emiel and colleagues wondered whether we could implement a “tagging” system where authors could indicate the range of paper types their paper relates to. That is an intriguing idea, but it doesn’t work with the way we are using paper types to improve the diversity and range of papers at COLING. As noted above, the paper types entail different questions on the review forms. We’re doing that because otherwise it seems that everything gets evaluated against the NLP Engineering Experiment paper type, which in turn means it’s hard to get papers of the other types accepted. And as we hope we’ve made it blindingly clear, we really are interested in getting a broad range of paper types!

Keywords

The other aspect of our submission form that will have a strong impact on how your paper is reviewed is the keywords. Following the system pioneered by Ani Nenkova and Owen Rambow as PC co-chairs for NAACL 2016, we have asked our reviewers to all describe their areas of expertise along five dimensions:

  1. Linguistic targets of study
  2. Application tasks
  3. Approaches
  4. Languages
  5. Genres

(All five of these have a none of the above/not-applicable option.) The reviewers (and area chairs) all indicated all of the items on each of these dimensions they have the expertise and interest to review for. For authors, we ask you to indicate which items on each dimension best describe the paper you are submitting. Softconf will then match your paper to an area based on the assignment of papers to areas that best optimizes reviewer expertise for the papers submitted.

In sum: To ensure the most informed reviewing possible of your paper, please fill out these keywords carefully.  We urge you to start your submission in the system ahead of time so you aren’t trying to complete this task in a hurry just at the deadline.
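To give a feel for why careful keywords matter, here is a small Python sketch of keyword-based matching. It is not the algorithm Softconf uses, which clusters papers and reviewers into areas. It swaps in a simpler technique, scoring each reviewer by weighted keyword overlap with a paper, using an IDF-style weight so that keywords nearly every reviewer selects count for little. All function names and the weighting formula are assumptions for illustration.

```python
import math
from collections import Counter
from typing import Dict, List, Set


def keyword_weights(reviewer_keywords: Dict[str, Set[str]]) -> Dict[str, float]:
    """IDF-style weight log(N / count): a keyword chosen by every reviewer
    gets weight 0, a rare keyword gets a high weight."""
    n = len(reviewer_keywords)
    counts = Counter(k for kws in reviewer_keywords.values() for k in kws)
    return {k: math.log(n / c) for k, c in counts.items()}


def match_score(paper_keywords: Set[str], reviewer_kws: Set[str],
                weights: Dict[str, float]) -> float:
    """Sum of the weights of the keywords the paper and reviewer share."""
    return sum(weights.get(k, 0.0) for k in paper_keywords & reviewer_kws)


def best_reviewers(paper_keywords: Set[str],
                   reviewer_keywords: Dict[str, Set[str]],
                   top_n: int = 3) -> List[str]:
    """Reviewer ids ranked by weighted overlap with the paper's keywords."""
    weights = keyword_weights(reviewer_keywords)
    ranked = sorted(
        reviewer_keywords,
        key=lambda r: match_score(paper_keywords, reviewer_keywords[r], weights),
        reverse=True,
    )
    return ranked[:top_n]
```

Under this kind of weighting, a paper tagged only with a very popular keyword matches every reviewer about equally well. The more specific keywords are what steer it toward the right expertise.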

Dual submission policy

Our Call for Papers indicates the following dual submission policy:

Papers that have been or will be under consideration for other venues at the same time must indicate this at submission time. If a paper is accepted for publication at COLING, it must be immediately withdrawn from other venues. If a paper under review at COLING is accepted elsewhere and authors intend to proceed there, the COLING committee must be notified immediately.

We have added a field in the submission form for you to be able to indicate this information.

LRE Map

COLING 2018 is participating in LRE map, as described in this guest post by Nicoletta Calzolari. In the submission form, you are asked to provide information about language resources your research has used—and those it has produced. Do not worry about anonymity on this form.  This information is not shared with reviewers.

Things to do in Santa Fe in late August (2): The Indian Market

The 97th Santa Fe Summer Indian Market

Santa Fe Indian Market August 18-19, 2018
Saturday 7am – 5pm
Sunday 8am-5pm
Historic Downtown Plaza

The Santa Fe Indian Market is in the historic downtown plaza in beautiful Santa Fe, New Mexico. The streets of downtown are transformed into the largest Native arts market and exhibition. There is nowhere else in the world you can go and see this many Native artists exhibiting in one place.

Read about it here: http://swaia.org/About_SWAIA/index.html

For a schedule of events and ticket information, please follow this link: http://swaia.org/Indian_Market/2017_Schedule_and_Tickets/

For reviews on Tripadvisor go here:

https://www.tripadvisor.com/Attraction_Review-g60958-d4735854-Reviews-Santa_Fe_Indian_Market-Santa_Fe_New_Mexico.html

Reviewer Code of Conduct

We ask the reviewers for COLING 2018 to adhere to the following code of conduct. (This has also been sent to the reviewers via email, but for transparency’s sake we post it here as well.)

As you prepare your reviews, keep in mind that our goal with the review forms is to help reviewers structure their reviews in such a way that they are helpful for the area chairs in making final acceptance decisions, informative for the authors (so they understand the decisions that were made), and helpful for the authors (as they improve their work either for camera ready, or for submission to a later venue). To that end, we ask you to follow these guidelines as you prepare your reviews:

Be timely: Even if you don’t plan to start your reviews as soon as they are assigned, please do log in to START and see which papers you got. This will allow you to notify us of conflicts of interest in time for us to reassign the paper. Furthermore, please don’t count on finishing your reviews at the last minute. As we all know, things can come up, and that time you were counting on might not be there. As we coordinate the efforts of 1200+ PC members, it is imperative that everyone be timely.

Be constructive: Be sure to state what you find valuable about each paper, even if this is difficult to do. There’s a person on the other end of your review, who has put thought and effort into their paper. Your suggestions for improvement will be better received if the author can also see that you understood what they were trying to do. Normative statements (e.g. “insufficient evaluation”) are much more valuable to both authors and chairs when there are supporting explanations, so include them.

Be thorough: Read both the review forms and your papers carefully and provide detailed comments. We ask for scores on specific dimensions because we want you to consider those dimensions as you evaluate the paper. But ultimately, your comments will be more helpful, both to the ACs and to the authors, than the numerical scores. So please comment on each of the points as well in the text of your review. Note, too, that we have quite different review forms for different paper types, because we believe that different paper types should be evaluated in (somewhat) different ways (e.g. a position paper shouldn’t be criticized for not including an evaluation section). Please look at the review form before reading the paper so you know what you are looking for.

Maintain confidentiality: We have confidence that, as a professional researcher, you already know that this entire process is confidential and how to treat it that way. Do not share the papers you review or discuss their contents with others. Do not appropriate the ideas in the paper.

Author responsibilities and the COLING 2018 desk reject policy

As our field experiences an upswing in participation, we have more submissions to our conferences, and this means we have to be careful to keep the reviewing process as efficient as possible. One tool used by editors and chairs is the “desk reject”. This is a way to filter out papers that clearly shouldn’t get through for whatever reason, without asking area chairs and reviewers to handle them, leaving our volunteers to use their energy on the important process of dealing with your serious work.

A desk reject is an automatic rejection without further review. This saves time, but is also quite a strong reaction to a submission. For that reason, this post clarifies possible reasons for a desk reject and the stages at which this might occur. It is the responsibility of the authors to make sure to avoid these situations.

Reasons for desk rejects:

  • Page length violations. The content limit at COLING is 9 pages. (You may include as many pages as needed for references.) Appendices, if part of the main paper, must fit within those nine pages. It’s unfair to judge longer papers against those that have kept to the limit, so exceeding the page limit means a desk reject.
  • Template cheating. The LaTeX and Word templates give a level playing field for everyone. Squeezing out whitespace, adjusting margins, and changing the font size all stop that playing field from being even and give an unfair advantage. If you’re not using the official template, you’ve altered that template, or the way a manuscript uses it goes beyond our intent, then the paper may be desk rejected.
  • Missing or poor anonymisation. It’s well-established that non-anonymised papers from “big name” authors and institutions fare better during review. To avoid this effect, and others, COLING is running double-blind; see our post on the nuances of double-blinding. We do not endeavour to be arbiters of what does or does not constitute a “big name”—rather, any paper that is poorly anonymised (or not anonymised at all) will face desk reject. See below for a few more comments on anonymisation.
  • Inappropriate content. We want to give our reviewers and chairs research papers to review. Content that really does not fit this will be desk rejected.
  • Plagiarism. Work that has already appeared, has already been accepted for publication at another venue, or has any significant overlap with other works submitted to COLING will be desk rejected. Several major NLP conferences are actively collaborating on this.
  • Breaking the arXiv embargo. COLING follows the ACL pre-print policy. This means that only papers not published on pre-print services or published on pre-print services more than a month before the deadline (i.e. before February 16, 2018) will be considered. Pre-prints published after this date (non-anonymously) may not be submitted for review at COLING. In conjunction with other NLP conferences this year, we’ll be looking for instances of this and desk rejecting them.
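Two of the reasons above are mechanical enough to check before you submit. The Python sketch below is a self-check for authors, not the START system's logic. The function name is hypothetical, and treating a pre-print posted exactly on February 16, 2018 as a violation is an assumption, since the policy only speaks of "before" and "after" that date.

```python
from datetime import date
from typing import List, Optional

CONTENT_PAGE_LIMIT = 9                # content pages; references are unlimited
PREPRINT_CUTOFF = date(2018, 2, 16)   # one month before the deadline


def desk_reject_reasons(content_pages: int,
                        preprint_date: Optional[date],
                        preprint_is_anonymous: bool = False) -> List[str]:
    """Return the mechanical desk-reject reasons a submission would trigger.

    content_pages counts everything except references (appendices included).
    preprint_date is None if the paper was never posted as a pre-print.
    """
    reasons = []
    if content_pages > CONTENT_PAGE_LIMIT:
        reasons.append("page limit")
    # Assumption: a non-anonymous pre-print posted ON the cutoff date
    # is treated the same as one posted after it.
    if (preprint_date is not None and not preprint_is_anonymous
            and preprint_date >= PREPRINT_CUTOFF):
        reasons.append("preprint embargo")
    return reasons


print(desk_reject_reasons(content_pages=10, preprint_date=date(2018, 3, 1)))
```

The other reasons (template changes, anonymisation, content, overlap with other work) need human judgment and are not covered by this sketch.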

The desk rejects are determined at four separate points. In order,

  1. Automatic rejection by the START submission system, which has a few checks at various levels.
  2. A rejection by the PC co-chairs, before papers are allocated to areas.
  3. After papers are placed in areas, ACs have the opportunity to check for problems. One response is to desk reject.
  4. Finally, during and immediately after allocation of papers to reviewers, an individual reviewer may send a message to invoke desk rejection, which will be queried and checked by at least two people from the ACs or PC co-chairs.

As an honest researcher trying to publish your important and exciting work, the above probably do not apply to you. But if they do, please think twice. We would prefer to send out no desk rejects and imagine it would be much more pleasant for our authors if none were to receive a desk reject. So, now you know what to avoid!

Postscript on anonymisation

Papers must be anonymised. This protects everybody during review. It’s a complex issue to implement, which is why we earlier had a post dedicated to double blindness in peer review. There are strict anonymisation guidelines in the call for papers and the only way to be sure that nobody takes exception during the review process is to follow these guidelines.

We’ve received several questions on what the best practices for anonymisation are. We realize that in long-standing projects, it can be impossible to truly disguise the group that work comes from. Nonetheless, we expect all COLING authors to follow these forms of anonymisation:

  1. Do NOT include author names/affiliations in the version of the paper submitted for review.  Instead, the author block should say “Anonymous”.
  2. When making reference to your own published work, cite it as if written by someone else: “Following Lee (2007), …” “Using the evaluation metric proposed by Garcia (2016), …”
  3. The only time it’s okay to use “anonymous” in a citation is when you are referring to your own unpublished work: “The details of the construction of the data are described in our companion paper (anonymous, under review).”
  4. Expanded versions of earlier workshop papers should rework the prose sufficiently so as not to turn up as potential plagiarism examples. The final published version of such papers should acknowledge the earlier workshop paper, but that should be suppressed in the version submitted for review.
  5. More generally, the acknowledgments section should be left out of the version submitted for review.
  6. Papers making code available for reproducibility or resources available for community use should host a version of that at a URL that doesn’t reveal the authors’ identity or  institution.

We have been asked a few times about whether LRE Map entries can be done without de-anonymising submissions.  The LRE Map data will not be shared with reviewers, so this is not a concern.

Keeping resources anonymised is a little harder. We recommend you keep things like names of people and labs out of your code and files; for example, Java code uploaded that ran within an edu.uchicago.nlp namespace would be problematic. Similarly, if the URL given is within a personal namespace, this breaks double-blindness, and must be avoided. Google Drive, Dropbox and Amazon S3 – as well as many other file-sharing services – offer reasonably anonymous (and often free) file sharing URLs, and we recommend you use those if you can’t upload your data/code/resources into START as supplementary materials.
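One way to catch leftovers such as an institutional package namespace is to search your files for identifying strings before uploading them. The Python sketch below is a minimal example of that idea. The function name is made up, and the list of identifiers is something you would supply yourself (lab names, usernames, domain fragments).

```python
import re
from typing import List, Sequence, Tuple


def find_identifying_strings(text: str,
                             identifiers: Sequence[str]) -> List[Tuple[int, str]]:
    """Return (line_number, line) pairs for lines that mention any of the
    given identifiers, ignoring case."""
    if not identifiers:
        return []
    pattern = re.compile("|".join(re.escape(i) for i in identifiers),
                         re.IGNORECASE)
    return [(number, line.strip())
            for number, line in enumerate(text.splitlines(), start=1)
            if pattern.search(line)]


# Example: the kind of Java namespace mentioned above.
source = "package edu.uchicago.nlp.parser;\n\nclass Parser {}\n"
for number, line in find_identifying_strings(source, ["uchicago"]):
    print(f"line {number}: {line}")
```

A plain string search like this only finds what you think to look for, so it complements a manual read-through of READMEs, licence headers, and file paths rather than replacing it.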

LRE Map: What? Why? When? Who?

This guest post is by Nicoletta Calzolari.

Not-documented Language Resources (LRs) don’t exist!

The LRE Map of Language Resources (data and tools) (http://lremap.elra.info) is an innovative instrument introduced at LREC2010 with the aim of monitoring the wealth of data and technologies developed and used in our field. Why “Map”? Because we aimed at representing the relevant features of a large territory, also for the aspects not represented in the official catalogues of the major players of the field. But we had other purposes too: we wanted to draw attention to the importance of the LRs that are behind many of our papers and to map also the “use” of LRs, to understand the purposes of the developed LRs.

Its collaborative, bottom-up creation was critical: we conceived the Map as a means to influence a “change of culture” in our community, whereby everyone is asked to make a minimal effort to document the LRs that are used or created, and thus comes to understand the need for proper documentation. By spreading the LR documentation effort across many people instead of leaving it only in the hands of the distribution centres, we also encourage awareness of the importance of metadata and proper documentation. Documenting a resource is the first step for making it identifiable, which in its turn is the first step towards reproducibility.

We kept the requested information at a simple level, knowing that we had to compromise between richness of metadata and willingness of authors to fill them in.

With all these purposes in mind we thought we could exploit the great opportunity offered by LREC and the involvement of so many authors from so many countries, from different modalities and working in so many areas of NLP. Afterwards the Map was used also in the framework of other major Conferences, in particular by COLING, and this provides another opportunity for useful comparisons.

The number of LRs currently described in the Map is 7453 (instances), collected from 17 different conferences. The major conferences for which we have data on a regular basis are LREC and COLING.

With initiatives such as the LRE Map and “Share your LRs” (introduced in 2014) we want to encourage in the field of LT and LRs what is already in use in more mature disciplines, i.e. ensure proper documentation and reproducibility as a normal practice. We think that research is strongly affected also by such infrastructural (meta-research) activities and therefore we continue to promote – also through such initiatives – a greater visibility of LRs, the sharing of LRs in an easier way and the reproducibility of research results.

Here is the vision: it must become common practice also in our field that when you submit a paper either to a conference or a journal you are offered the opportunity to document and upload the LRs related to your research. This is even more important in a data-intensive discipline like NLP. The small cost that each of us pays to document, share, etc. should be repaid by the benefit we gain from others’ efforts.

What do we ask of colleagues submitting to COLING 2018? Please document all the LRs mentioned in your paper!

Why Santa Fe?

We want the first COLING conference in the US since 1984 to be remembered not only for its scientific impact (which, by the way, we fully expect to be top-notch). We want the participants to have a great experience that is not “same old, same old.”

This is why COLING 2018 will not be held at some huge hotel (“you saw one, you saw them all”) or a sprawling college campus in a big city. It will happen somewhere interesting, beautiful and unusual.

Allow us to introduce Santa Fe, whose motto is “the city different.” It was founded over 400 years ago (this is really old for the US!). It has no skyscrapers, but the Palace of the Governors, built in 1610 on Santa Fe Plaza in the heart of the historic district, is the oldest continuously occupied public building in the United States. The air at 7,000 feet is clean, and the sun shines on average 325 days of the year. The surrounding countryside is breathtakingly beautiful.

In this blog we will talk about what Santa Fe has to offer and will also include practical tips for visitors. We will be happy to answer your questions.

For starters, here is what people say about Santa Fe:

(Much more info here: https://santafe.org/)