Author survey results

Shortly after the submission deadline, we sent out a survey to our authors, with the goal of better understanding how our outreach was working.


We sent the notification of the survey via START to all corresponding authors (so roughly 1000 people) and asked them to share it with co-authors. The survey recorded 434 total responses, which is a pretty satisfying response rate!

Of those 434, 302 (69.6%) indicated that they were submitting to COLING for the first time, and 101 (23.3%) to a major NLP conference for the first time.


We asked how people first found out about COLING 2018. The most popular response was “Web search” (44.2%), followed by “Call for Papers sent over email (e.g. corpora mailing list, ACL mailing list)” (35.9%), then “Other” (12.4%) and “Social media” (7.4%).  The “Other” answers included word-of-mouth, knowing to expect COLING to come around in 2018, and websites that aggregate CFPs.

Paper types

We wanted to find out if people were aware of the paper types (since this is relatively unusual in our field) before submitting their papers, and if so, how they found out. Most—349 (80.4%)—were aware of the paper types ahead of time.  Of these, the vast majority (93.4%) found out about the paper types via the Call for Papers. Otherwise, people found out because someone else told them (7.4%), via our Twitter or Facebook feeds (6.0%), or via our blog (3.7%).

We also asked whether it was clear to authors which paper type was appropriate for their paper and whether they think paper types are a good idea. The answers in both cases were pretty strongly positive: 78.8% said it was clear and 91.0% said it was a good idea. (Interestingly, 74 people who said it wasn’t clear which paper type was a good fit for theirs nonetheless said paper types are a good idea, and 21 people who thought it was clear which paper type fit nonetheless said paper types aren’t a good idea.)

Writing mentoring program

We wanted to know if our authors were aware of the writing mentoring program, and for those who were but didn’t take advantage of it, why not. 277 respondents (63.8%) said they were aware of it. The most common reason chosen for not taking advantage of it was “I didn’t/couldn’t have a draft ready in time.” (150 respondents), followed by “I have good mentoring available to me in my local institution” (97 respondents). The other two options available in that check-all-that-apply question were “I have a lot of practice writing papers already” (74 respondents) and “Other” (10). Alas, a few people indicated that they only discovered it too late.

Other channels

We have been putting significant effort into getting information out about our process, but still worry that the channels we’re using aren’t reaching everyone. We asked “What other channels would you like to see information like this publicized on?” referring specifically to the paper types. Most people did not respond, or indicated that what we’re doing is enough. Other responses included: LINGUIST List, LinkedIn, Instagram, ResearchGate, and email. Ideas for email include creating a conference-specific mailing list that people can subscribe to and sending out messages to all email addresses registered in START.

We include these ideas here for posterity (and the benefit of future people filling this role). We have used LinkedIn and Weibo in a limited capacity and are using Twitter and Facebook. Adding additional social media (Instagram) sounds plausible, but is not in our plans for this year. An email list that people could opt into for updates makes a lot of sense, though there’s still the problem of getting the word out about that list. Perhaps a good way to do that would be to include that info in the CFP (starting from the first CFP). Emailing everyone through START may not be feasible (depending on START’s email privacy policy) and at any rate wouldn’t help reach those who have never submitted to a compling/NLP conference before.

Blog readership

Of course we wanted to know if our authors are reading this blog.  44.7% of respondents weren’t aware of the blog (prior to being asked that question!), 15.0% had found it only recently, 24.9% had been aware of it for at least a month but less than 6, and 15.4% indicated that they’ve been aware of it for at least 6 months. 9.2% of respondents read (almost) everything we post, 32.0% read it sometimes, and the remainder don’t read it or read it only rarely.

We also wanted to know if the PC blog helped our authors to understand our submission process or shape their submissions to COLING 2018. 22.8% indicated “Yes, a lot!” and 28.1% “Yes, a little”. On the no side, 22.1% chose “No, not really” and 27.0% “No, not at all”. “Yes, a lot!” people, we’re doing this for you 🙂


Outstanding Mentors

The COLING 2018 writing mentoring program went extremely well—we are grateful to all of the mentors who volunteered their time to provide thoughtful comments to the authors who participated. Furthermore, the prompts we used in the writing mentoring form (listed in the description of the program) were effective in eliciting useful feedback for authors.

There is great willingness in our field to participate from the mentoring side.  Over 100 mentors signed up, which means we could have provided mentoring for even more papers than we did. It seems that the biggest hurdle to success for such a program is getting the word out to those who would most likely benefit from it. (We’ve got another blog post in the works about outreach & responses to our author survey.)

Reviewing the work of the mentors to find those to recognize as outstanding mentors was inspiring—and the task of choosing difficult—because so many did such a great job. Even if the mentored papers aren’t ultimately accepted to COLING, the authors who received mentoring will have benefited from thoughtful, constructive feedback on their work, which we hope will inform both future writing on the same topic and perhaps even their approach to writing on other topics.

Against that background, the following mentors distinguished themselves as particularly outstanding:

  • Kevin Cohen
  • Carla Parra Escartín
  • David Mimno
  • Emily Morgan
  • Irina Temnikova
  • Jennifer Williams

Thank you to all of our mentors!

Reviewing in an interdisciplinary field

The process of peer review, when functioning at its best, ensures that work published in archival venues is carefully vetted, such that the results are likely to be reliable and the presentation of those results interpretable by scholars in the field at large. But what does “peer review” mean for an interdisciplinary field? Who are the relevant “peers”? We believe that for both goals—vetting reliability of results and vetting readability of presentation—reviewing in an interdisciplinary field ideally involves reviewers coming from different perspectives.

In our particular context, we had the added (near-)novelty of our keyword-based area assignment system. (Near-novelty, because this was pioneered by NAACL 2016.) This means our areas are not named, but rather emerge from the clusters of both reviewer interests and paper topics. On the upside, this means that really popular topics (“semantics”, or “MT”) can be spread across areas, such that we don’t have area chairs scrambling for large numbers of additional reviewers. On the downside, the areas can’t be named until the dust has settled, so reviewers don’t necessarily have a clear sense of which area (in the traditional sense) they are assigned to. In addition, interests that were very popular and therefore not highly discriminative (e.g. “MT”) weren’t given very much weight alone by the clustering algorithm.
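The downweighting of common keywords is essentially inverse-document-frequency weighting. As a minimal sketch (our own illustration, not the actual START/NAACL clustering code, and with entirely made-up profiles), keyword profiles can be compared under IDF weights so that a ubiquitous interest like “MT” contributes nothing on its own:

```python
import math
from collections import Counter

def idf_weights(keyword_sets):
    """Downweight keywords that appear in many profiles (e.g. "MT"),
    so similarity is driven by more discriminative interests."""
    n = len(keyword_sets)
    df = Counter(kw for s in keyword_sets for kw in set(s))
    return {kw: math.log(n / df[kw]) for kw in df}

def weighted_overlap(a, b, w):
    """IDF-weighted similarity between two keyword profiles."""
    return sum(w[kw] for kw in set(a) & set(b))

# Toy profiles: "MT" appears everywhere, so it carries zero weight alone.
profiles = [
    {"MT", "low-resource"},
    {"MT", "low-resource", "morphology"},
    {"MT", "parsing"},
    {"MT", "parsing", "treebanks"},
]
w = idf_weights(profiles)
assert w["MT"] == 0.0  # log(4/4): shared by all profiles, zero weight
# Profiles 0 and 1 match via "low-resource", not via "MT".
assert weighted_overlap(profiles[0], profiles[1], w) > weighted_overlap(profiles[0], profiles[2], w)
```

Any standard clustering algorithm run over such weighted profiles will then group reviewers and papers by their discriminative interests, which is the behavior described above.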

During the bidding process, we had a handful of requests from reviewers to change areas (really a small number, considering the overall size of the reviewing pool!). These requests came in three types. First, there were a few who just said: There’s relatively little in this area that I feel qualified to review, could I try a different area? In all cases, we were able to find another area that was a better match.

Second, there were reviewers who said “My research interest is X, and I don’t see any papers on X in my area. I only want to review papers on X.” We found these remarks a little surprising, as our understanding of the field is that it is not a collection of independent areas that don’t inform each other, but rather a collection of ways of looking at the same very general and very pervasive phenomenon: human language and the ways in which it can be processed by computers. Indeed, we structured the keywords into multiple groups—targets, tasks, approaches, languages and genres—to increase intersection on at least a few areas of any given reviewer’s expertise. We very much hope that the majority of researchers in our field read outside their specific subfield and are open to influence from other subfields on their own work.

The third type was reviewers, typically of a more linguistic than computational orientation, who expressed concern that because they aren’t familiar with the details of the models being used, they wouldn’t be able to review effectively. To these reviewers, we pointed out that it is equally important to look critically at the evaluation (what data is being used and how) and the relationship of the work to the linguistic concepts it is drawing on. Having reviewers with deep linguistic expertise is critical and both COLING and the authors very much benefit from it.

To produce the best cross-field review, then, it helps to take stock of one’s strengths and compare them with the multiple facets presented by any paper. No single reviewer is likely to be expert in every area and aspect of a manuscript, but as long as some care has been applied to matching, there is a good chance of some crossover expertise. Be bold with that expertise. Indeed, the coverage of knowledge across multiple reviewers is often complementary: as a reviewer, you can bring real knowledge to at least one aspect of a paper, more so than others sharing the workload, even if that is not the aspect you initially expected.

COI policy

With a conference the size of COLING, managing conflicts of interest in the reviewing process is a challenge. Below, we sketch our COI handling policy, in the interest of transparency. In all cases, our goals are to maintain the fairness and integrity of the double-blind review process while also making sure that our hard-working volunteer program committee members can still submit to COLING.

Reviewer <> author

Softconf will automatically COI any reviewers from the same institution as an author.  In addition, in the bidding phase, we will ask reviewers to indicate any COIs that were not automatically caught. When ACs match reviewers to papers, this will be done so as to avoid COIs.

AC <> author

Any paper for which an AC has a COI (beyond simply sharing an affiliation) will be handled by rerouting the paper to another area. Here we’re talking about papers authored by ACs, their students, or other close collaborators. Since the areas are emergent (rather than strictly defined a priori), we anticipate it being relatively straightforward to find the next best matching area for all such papers. Papers authored by ACs themselves are relatively straightforward to detect. Beyond that, we will be asking ACs to identify any such COI papers as we assign papers to areas.

PC chair <> author

Perhaps the trickiest case is COIs involving the PC co-chairs ourselves. Neither of us is submitting our own papers to the COLING main conference. (Workshops, being handled entirely separately, are fair game in principle.) However, the fact that we’ve taken on this role shouldn’t prevent our students and other close collaborators from submitting to COLING. In this case, the entire process (including assignment to an area or possibly the “Special Circumstances” ACs, assignment to reviewers, and final accept/reject decision) will be overseen by our counterpart in conjunction with the General Chair, Pierre Isabelle. This way, we still ensure that two people are present at every level of chairing.

Reviewer Code of Conduct

We ask the reviewers for COLING 2018 to adhere to the following code of conduct. (This has also been sent to the reviewers via email, but for transparency’s sake we post it here as well.)

Reviewer Code of Conduct

As you prepare your reviews, keep in mind that our goal with the review forms is to help reviewers structure their reviews in such a way that they are helpful for the area chairs in making final acceptance decisions, informative for the authors (so they understand the decisions that were made), and helpful for the authors (as they improve their work either for camera ready, or for submission to a later venue). To that end, we ask you to follow these guidelines as you prepare your reviews:

Be timely: Even if you don’t plan to start your reviews as soon as they are assigned, please do log in to START and see which papers you got. This will allow you to notify us of conflicts of interest in time for us to reassign the paper. Furthermore, please don’t count on finishing your reviews at the last minute. As we all know, things can come up, and that time you were counting on might not be there. As we coordinate the efforts of 1200+ PC members, it is imperative that everyone be timely.

Be constructive: Be sure to state what you find valuable about each paper, even if this is difficult to do. There’s a person on the other end of your review, who has put thought and effort into their paper. Your suggestions for improvement will be better received if the author can also see that you understood what they were trying to do. Normative statements (e.g. “insufficient evaluation”) are much more valuable to both authors and chairs when there are supporting explanations, so include them.

Be thorough: Read both the review forms and your papers carefully and provide detailed comments. We ask for scores on specific dimensions because we want you to consider those dimensions as you evaluate the paper. But ultimately, your comments will be more helpful, both to the ACs and to the authors, than the numerical scores. So please comment on each of the points as well in the text of your review. Note, too, that we have quite different review forms for different paper types, because we believe that different paper types should be evaluated in (somewhat) different ways (e.g. a position paper shouldn’t be criticized for not including an evaluation section). Please look at the review form before reading the paper so you know what you are looking for.

Maintain confidentiality: As a professional researcher, you already know that this entire process is confidential and how to treat it accordingly. Do not share the papers you review or discuss their contents with others. Do not appropriate the ideas in the paper.

Error analysis in research and writing

The COLING 2018 main conference deadline is in about eight weeks — have you integrated error analysis into your workflow yet?

One distinctive feature of our review forms for COLING 2018 is the question we’ve added about error analysis in the form for the NLP Engineering Experiment paper type. Specifically, we will ask reviewers to consider:

  • Error analysis: Does the paper provide a thoughtful error analysis, which looks for linguistic patterns in the types of errors made by the system(s) evaluated and sheds light on either avenues for future work or the source of the strengths/weaknesses of the systems?

Is error analysis required for NLP engineering experiment papers at COLING?

We’ve been asked this, in light of the fact that many NLP engineering experiment papers (by far the most common type of paper published in computational linguistics and NLP conferences of late) do not have error analysis and many of those are still influential, important and valuable.

Our response is of necessity somewhat nuanced. In our ideal world, all NLP engineering experiment papers at COLING 2018 would include thoughtful error analyses. We believe that this would amplify the contributions of the research we publish, both in terms of short-term interest and long-term relevance. However, we also recognize that error analysis is not yet as prominent in the field as it could be, and, we would argue, as it should be.

And so, our answer is that error analysis is not a strict requirement. However, we ask our reviewers to look for it, to value it, and to include the value of the error analysis in their overall evaluation of the papers they review. (And conversely, we absolutely do not want to see reviewers complaining that space in the paper is ‘wasted’ on error analysis.)

But why is error analysis so important?

As Antske Fokkens puts it in her excellent guest post on reproducibility:

The outcome becomes much more convincing if the hypothesis correctly predicts which kind of errors the new approach would solve compared to the baseline. For instance, if you predict that reinforcement learning reduces error propagation, investigate the error propagation in the new system compared to the baseline. Even if it is difficult to predict where improvement comes from, a decent error analysis showing which phenomena are treated better than by other systems, which perform as good or bad and which have gotten worse can provide valuable insights into why an approach works or, more importantly, why it does not.

In other words, a good error analysis tells us something about why method X is effective or ineffective for problem Y. This in turn provides a much richer starting point for further research, allowing us to go beyond throwing learning algorithms at the wall of tasks and seeing which stick, while allowing us to also discover which are the harder parts of a problem. And, as Antske also points out, a good error analysis makes it easier to publish papers about negative results. The observation that method X doesn’t work for problem Y is far more interesting if we can learn something about why not!

How do you do error analysis anyway?

Fundamentally, error analysis involves examining the errors made by a system and developing a classification of them. (This is typically best done over dev data, to avoid compromising held-out test sets.) At a superficial level, this can involve breaking things down by input length, token frequency or looking at confusion matrices. But we should not limit ourselves to examining only labels (rather than input linguistic forms) as with confusion matrices, or superficial properties of the linguistic signal. Languages are, after all, complex systems and linguistic forms are structured. So a deeper error analysis involves examining those linguistic forms and looking for patterns. The categories in the error analysis typically aren’t determined ahead of time, but rather emerge from the data. Does your sentiment analysis system get confused by counterfactuals? Does your event detection system miss negation not expressed by a simple form like not? Does your MT system trip up on translating pronouns especially when they are dropped in the source language? Do your morphological analysis system or string-based features meant to capture noisy morphology make assumptions about the form and position of affixes that aren’t equally valid across test languages?
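To make the mechanics concrete, here is a toy sketch (the data, the categories, and the pattern checks are entirely hypothetical stand-ins, not from any real system) of the basic loop: collect dev-set errors, assign emergent categories, and tally them:

```python
from collections import Counter

# Hypothetical dev-set results for a sentiment system: (text, gold, predicted).
errors = [
    ("I would have loved it if it worked", "neg", "pos"),
    ("not bad at all", "pos", "neg"),
    ("hardly a masterpiece", "neg", "pos"),
    ("great battery, shame about the screen", "mixed", "pos"),
]

def categorize(text):
    """Assign coarse error categories; in practice these emerge from
    inspecting the data and are refined as patterns appear."""
    cats = []
    if "would have" in text or "if it" in text:
        cats.append("counterfactual")
    if any(tok in text.split() for tok in ("not", "hardly", "no")):
        cats.append("negation")
    if not cats:
        cats.append("other")
    return cats

# Tally categories over the examples the system got wrong.
tally = Counter(cat for text, gold, pred in errors if gold != pred
                for cat in categorize(text))
print(tally.most_common())
```

The resulting counts point at where to look next (here, negation handling), which is exactly the kind of insight a confusion matrix over labels alone cannot give you.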

As Emily noted in a guest post over on the NAACL PC blog:

Error analysis of this type requires a good deal of linguistic insight, and can be an excellent arena for collaboration with linguists (and far more rewarding to the linguist than doing annotation). Start this process early. The conversations can be tricky, as you try to explain how the system works to a linguist who might not be familiar with the type of algorithms you’re using and the linguist in turn tries to explain the patterns they are seeing in the errors. But they can be rewarding in equal measure as the linguistic insight brought out by the error analysis can inform further system development.


This brings us to why COLING in particular should be a leader in placing the spotlight on error analysis: As we noted in a previous blog post, COLING has a tradition of being a locus of interdisciplinary communication between (computational) linguistics and NLP as practiced in computer science. Error analysis is a key, under-discussed component of our research process that benefits from such interdisciplinary communication.

Workshop review process for ACL, COLING, EMNLP, and NAACL 2018

This guest post by the workshop chairs describes the process by which workshops were reviewed for COLING and the other major conferences in 2018 and how they were allocated.

For approximately the last 10 years, ACL, COLING, EMNLP, and NAACL have issued a joint call for workshops. While this adds an additional level of effort and coordination for the conference organizers, it lets workshop organizers focus on putting together a strong program and helps to ensure a balanced set of offerings for attendees across the major conferences each year. Workshop proposals are submitted early in the year, and specify which conference(s) they prefer or require. A committee composed of the workshop chairs of each conference then undertakes a review process of the proposals, and decides which proposals to accept and how to assign them to venues. This blog post explains how the process worked in 2018; it largely followed the guidance on the ACL wiki.

We began by gathering the workshop chairs in August 2017. At that time, workshop chairs from ACL (Brendan O’Connor, Eva Maria Vecchi), COLING (Tim Baldwin, Yoav Goldberg, Jing Jiang), and NAACL (Marie Meteer, Jason Williams) had been appointed, but EMNLP (which occurs last of the 4 events in 2018) had not. This group drafted the call for workshops, largely following previous calls.

The call was issued on August 31, 2017, and specified a due date of October 22, 2017. During those months, the workshop chairs from EMNLP were appointed (Marieke van Erp, Vincent Ng) and joined the committee, which now consisted of 9 people. We received a total of 58 workshop proposals.

We went into the review process with the following goals:

  • Ensure a high-quality workshop program across the conferences
  • Ensure that the topics are relevant to the research community
  • Avoid having topically very similar workshops at the same conference
  • For placing workshops in conferences, follow proposer’s preferences wherever possible, diverging only in cases where there existed space limitations and/or substantial topical overlap

In addition to quality and relevance, it is worth noting here that space is an important consideration for workshops. Each conference has a fixed set of meeting rooms available for workshops, and the sizes of those rooms vary widely, with the smallest room holding 44 people and the largest holding 500. We therefore made considerable effort to estimate the expected attendance at workshops (explained more below).

We started by having each proposal reviewed by 2 members of the committee, with most committee members reviewing around 15 proposals. To aid in the review process, we attempted to first categorize the workshop proposals, to help align proposals with areas of expertise on the committee. This categorization proved quite difficult because many proposals intentionally spanned several disciplines, but it did help identify proposals that were similar.

Our review form included the following questions:

  • Relevance: Is the topic of this workshop interesting for the NLP community?
  • Originality: Is the topic of this workshop original? (“no” not necessarily a bad thing)
  • Variety: Does the topic of this workshop add to the diversity of topics discussed in the NLP community? (“no” not necessarily a bad thing)
  • Quality of organizing team: Will the organisers be able to run a successful workshop?
  • Quality of program committee: Have the organisers drawn together a high-quality PC?
  • Quality of invited speakers (if any): Have high-quality, appropriate invited speaker(s) been identified by the organisers?
  • Quality of proposal: Is the topic of the workshop motivated and clearly explained?
  • Coherence: Is the topic of the workshop coherent?
  • Size (smaller size not necessarily a bad thing):
    • Number of previous attendees: Is there an indication of previous numbers of workshop attendees, and if so, what is that number?
    • Number of previous submissions: Is there an indication of previous numbers of submissions, and if so, what is that number?
    • Projected number of attendees: Is there an indication of projected numbers of workshop attendees, and if so, what is that number?
  • Recommendation: Final recommendation
  • Text comments to provide to proposers
  • Text comments for internal committee use

As was done last year, we also surveyed ACL members to seek input on which workshops people were likely to attend. We felt this survey would be useful in two respects. First, it gave us some additional signal on the relative attendance at each workshop (in addition to workshop organizers’ estimates), which helps assign workshops to appropriately sized rooms. Second, it gave us a rough signal about the interest level from the community. We expected that results from this type of survey would almost certainly be biased, and kept this in mind when interpreting them.

Before considering the bulk of the 58 submissions, we note that there are a handful of large, long-standing workshops which the ACL organization agrees to pre-admit, including *SEM, WMT, CoNLL, and SemEval. These were all placed at their first-choice venue.

We then dug into our main responsibility of making accept/reject and placement decisions for the bulk of proposals. In making these decisions, we took into account proposal preferences, our reviews, available space, and results from the survey. Although we operated as a joint committee, ultimately the workshop chairs for each conference took responsibility for workshops accepted to their conference.

We first examined space. These 4 conferences in 2018 each had between 8 and 14 rooms available over 2 days, with room capacities ranging from 40 to 500 people. The total space available nearly matched the number of proposals. Specifically — had all proposals been accepted — there was enough space for all but 3 proposals to be at their first choice venue, and the remaining 3 at their second choice.

Considering the reviews, the 2 reviews per proposal showed very low variance: about ⅔ of the final recommendations were identical, and the remaining ⅓ differed by 1 point on a 4-point scale. Overall, we were very impressed by the quality of the proposals, which covered a broad range of topics with strong organizing committees, reviewers, and invited speakers. None of the reviewers recommended 1 (clear reject) for any proposal. Further, the survey results for most borderline proposals showed reasonable interest from the community.

We also considered topicality. Here we found that there were 5 pairs of workshops where each requested the same conference as their first choice, and were topically very similar. In four of the pairs, we assigned a workshop to its second choice conference. In the final pair, in light of all the factors listed above, one workshop was rejected.

In summary, of the 58 proposals, 53 workshops were accepted to their first-choice conference; 4 were accepted to their second-choice conference; and 1 was rejected.

For the general chairs of *ACL conferences next year, we would definitely recommend continuing to organize a similarly large number of workshop rooms. For workshop chairs, we stress that reviewing and selecting workshops is qualitatively different than reviewing and selecting papers; for this reason, we recommend reviewing the proposals among the committee rather than recruiting reviewers (as was previously pointed out by the workshop chairs from the previous year). We would also suggest having workshop chairs consider using a structured form for workshop submissions, since a fair amount of manual effort was required to extract structured data from each proposal document.


For ACL:
Brendan O’Connor, University of Massachusetts Amherst
Eva Maria Vecchi, University of Cambridge

For COLING:
Tim Baldwin, University of Melbourne
Yoav Goldberg, Bar Ilan University
Jing Jiang, Singapore Management University

For NAACL:
Marie Meteer, Brandeis University
Jason Williams, Microsoft Research

For EMNLP:
Marieke van Erp, KNAW Humanities Cluster
Vincent Ng, University of Texas at Dallas

COLING as a Locus of Interdisciplinary Communication

The nature of the relationship between (computational) linguistics and natural language processing remains a hot topic in the field.  There is at this point a substantial history of workshops focused on how to get the most out of this interaction, including at least:

[There are undoubtedly more!  Please let us know what we’ve missed in the comments and we’ll add them to this list.]

The interaction between the fields also tends to be a hot-button topic on Twitter, leading to very long and sometimes informative discussions, such as the NLP/CL Megathread of April 2017 (as captured by Sebastian Mielke) or the November 2017 discussion on linguistics, NLP, and interdisciplinarity, summarized in blog posts by Emily M. Bender and Ryan Cotterell.

It is very important to us as PC co-chairs of COLING 2018 to continue the COLING tradition of providing a venue that encourages interdisciplinary work. COLING as a venue should host both computationally-aided linguistic analysis and linguistically informed work on natural language processing. Furthermore, it should provide a space for authors of each of these kinds of papers to provide feedback to each other.

Actions we have taken so far to support this vision include recruiting area chairs whose expertise spans the two fields and designing our paper types and associated review forms accordingly.

We’d like to see even more discussion of how interdisciplinarity works/can work in our field. What do you consider to be best practices for carrying out such interdisciplinary work? What role do you see for linguistics in NLP/how do computational methods inform your linguistic research? How do you build and maintain collaborations? When you read (or review) in this field, what kind of features of a paper stand out for you as particularly good approaches to interdisciplinary work? Finally, how can COLING further support such best practices?


Recruiting Area Chairs

An absolutely key ingredient for a successful conference is a stellar team of area chairs (ACs). What do we mean by stellar? We need people who take the task seriously, work hard to ensure fairness, bring their expertise to bear in selecting papers that make valuable contributions and constitute a vibrant program, can be effective leaders and get the reviewers to do their job well, and finally who represent a broad range of diverse interests and perspectives on our field. What a tall order!

On top of that, given the size of conferences in our field presently, we need a large team of such amazing colleagues. How big? We are planning for 2000 submissions (yikes!), which we will allocate evenly across 40 areas, so roughly 50 papers per area. We plan to have area chairs work in pairs, so we need 80 area chairs to cover 40 areas. In addition, we anticipate a range of troubleshooting and consulting beyond what we two as PC co-chairs can handle, and so we also want an additional 10 area chairs who can assist across areas, with START troubleshooting, handling papers with COI issues, and whatever else comes up. That means we’re looking for about 100 people total.

We decided to do the recruiting in two phases. The first phase involved recruiting 50 area chairs directly by invitation. Phase II is an open call for nominations (and self-nominations!) for the remaining 50 area chairs. The purpose of this blog post is to give you an update on how we are doing in terms of various metrics of diversity, and, more importantly, to alert you to the call for area chairs. If you would like to serve as area chair, or if you know someone who you’d like to nominate, please fill out this form.

As we select additional area chairs, we will be looking to round out the range of areas of expertise we have recruited so far (see below); maintain our gender balance; improve our regional diversity; improve the representation of area chairs from non-academic affiliations; and improve racial/ethnic diversity. The stats for our area chairs so far are as follows (based on a self-report survey we sent to the area chairs).

Research Interests

A diverse range of areas was described in a free-text entry. Those mentioned more than once are shown in the chart, and the hapaxes are listed below.

  • Accent Variation
  • Active Learning
  • Argument Mining
  • Aspect
  • Authorship Analysis (Attribution, Profiling, Plagiarism Detection)
  • Automatic Summarization
  • Biomedical/clinical Text Processing
  • BioNLP
  • Clinical NLP
  • Clustering
  • Code-mixing
  • Code-switching
  • Computational Cognitive Modeling
  • Computational Discourse
  • Computational Lexical Semantics
  • Computational Lexicography
  • Computational Morphology
  • Computational Pragmatics
  • Conversational AI
  • Conversation Modeling
  • Corpora Construction
  • Corpus Design And Development
  • Corpus Linguistics
  • Cross-language Speech Recognition
  • Cross-lingual Learning
  • Data Modeling And System Architecture
  • Dialogue Pragmatics
  • Dialogue System
  • Dialogue Systems
  • Discourse Modes
  • Discourse Parsing
  • Document Summarization
  • Emotion Analysis
  • Endangered Language Documentation
  • Evaluation
  • Event And Temporal Processing
  • Experimental Linguistics
  • Eye Movements
  • Fact Checking
  • Grammar Correction
  • Grammar Engineering
  • Grammar Induction
  • Grounded Language Learning
  • Grounded Semantics
  • HPSG
  • Incremental Language Processing
  • Information Retrieval
  • KA
  • Korean NLP
  • Language Acquisition
  • Lexical Resources
  • Linguistic Annotation
  • Linguistic Issues In NLP
  • Linguistic Processing Of Non-canonical Text
  • Low-resource Learning
  • Machine Reading
  • Modality
  • Multilingual Systems
  • Multimodal NLP
  • NER
  • NLG
  • NLP In Health Care & Education
  • NLU
  • Ontologies
  • Ontology Construction
  • Phonology
  • POS Tagging
  • Reading
  • Reasoning
  • Relation Extraction
  • Resources
  • Resources And Evaluation
  • Rhetorical Types
  • Semantic Parsing
  • Semantic Processing
  • Short-answer Scoring
  • Situation Types
  • Social Media
  • Social Media Analysis
  • Social Media Analytics
  • Software And Tools
  • Speech
  • Speech Perception
  • Speech Recognition
  • Speech Synthesis
  • Spoken Language Understanding
  • Stance Detection
  • Structured Prediction
  • Summarization
  • Syntactic And Semantic Parsing
  • Syntax/parsing
  • Tagging
  • Temporal Information Extraction
  • Text Classification
  • Text Mining
  • Text Simplification
  • Text Types
  • Transfer Learning
  • Treebanks
  • Vision And Language
  • Weakly Supervised Learning


Gender

We asked a completely open-ended question here, which was furthermore optional, and then binned the answers into three categories: female, male, and other/question skipped.

Country of affiliation

Another open-ended question, which we again binned by region.  Latin America is the Americas minus the US and Canada.  Australia is counted as Asia.  So far Africa is not represented.


Type of affiliation

Our survey anticipated five possible answers here: Academia, Industry – research lab, Industry – other, Government, Other; but only the first two are represented so far.


Race/ethnicity

We are interested in making sure that our senior program committee is diverse in terms of race/ethnicity, but it is very difficult to talk about what this means in an international context, because racial constructs are very much products of the cultures they are part of. So rather than ask for specific race/ethnicity categories, which we would be unprepared to summarize across cultures, we decided to ask the following pair of questions, both of which were optional (like the question about gender):

As we work to make sure that our senior PC is appropriately diverse, we would like to consider race/ethnicity.  Yet, at the level of an international organization, it is very unclear what categories could possibly be appropriate for such a survey.  Accordingly, we have settled on the distinction minoritized (treated as a minority)/not minoritized (treated as normative/majority).


In the context of your country of current affiliation, and with respect to your race/ethnicity, are you: (optional)

  • Minoritized
  • Not minoritized

During your education or career prior to your current affiliation, has there ever been a significant period of time during which you were minoritized with respect to your race/ethnicity? (optional)

  • Yes
  • No

Please join us!

We’re looking for about 50 more ACs!  Please consider nominating yourself and/or other people who you think would do a good job and also help us round out our leadership team along the various dimensions identified above.  Both self- and other-nominations can be made at this form. You can nominate as many people as you like (but please only nominate yourself once 😉).


Writing Mentoring Program

Submit your manuscript for mentoring here:

Among the goals we outlined in our inaugural post was the following:

(1) to create a program of high quality papers which represent diverse approaches to and applications of computational linguistics written and presented by researchers from throughout our international community;

One of our strategies for achieving this goal is to create a writing mentoring program, which takes place before the reviewing stage. This optional program is focused on helping those who perhaps aren’t used to publishing in the field of computational linguistics, are early in their careers, and so on. We see mentoring as a tool that makes COLING accessible to a broader range of high-quality ideas. In other words, this isn’t about pushing borderline papers into acceptance, but rather about alleviating presentational problems with papers that, in their underlying research quality, easily meet the high required standard.

In order for this program to be successful, we need buy-in from prospective mentors. In this blog post, we provide the outlines of the program, in order to let the community (including both prospective mentors and mentees) know what we have in mind and to seek (as usual) your feedback.

We plan to run the mentoring program through the START system, as follows:

  • Anyone wishing to receive mentoring will submit an abstract by 4 weeks before the COLING submission deadline. Authors will be instructed that submitting an abstract at this point represents a commitment to submit a full draft by the mentoring deadline and then to submit to COLING.
  • Requesting mentoring doesn’t guarantee receiving mentoring and receiving mentoring doesn’t guarantee acceptance to the conference program.
  • Any reviewer willing to serve as mentor will bid on those abstracts and indicate how many papers total they are willing to mentor. Mentors will receive guidance from the program committee co-chairs on their duties as mentors, as well as a code of conduct.
  • Area chairs will assign papers to mentors by 3 weeks before the submission deadline, giving priority as follows. (Note that if there are not enough mentors, not every paper requesting mentoring will receive it.)
    1. Authors from non-anglophone institutions
    2. Authors from beyond well-represented institutions
  • Authors wishing to receive mentoring will submit complete drafts via START by 3 weeks before the submission deadline.
  • Mentors will provide feedback within one week, using a ‘mentoring form’ created by the PCs structured to encourage constructive feedback.
  • No mentor will serve as a reviewer for a paper they mentored.
  • Mentor bidding will be anonymous, but actual mentoring will not be (in either direction).
  • Mentors will be recognized in the conference handbook/website, but COLING will not indicate which papers received mentoring (though authors are free to acknowledge mentorship in their acknowledgments section).
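The relative deadlines in the timeline above can be sketched as a quick date calculation. The submission deadline used here is purely hypothetical, for illustration only:

```python
from datetime import date, timedelta

# Hypothetical COLING submission deadline, for illustration only.
submission_deadline = date(2018, 3, 16)
week = timedelta(weeks=1)

abstract_due = submission_deadline - 4 * week  # mentoring request (abstract) due
draft_due = submission_deadline - 3 * week     # full draft via START
feedback_due = draft_due + week                # mentors respond within one week

for label, d in [("abstract due", abstract_due),
                 ("draft due", draft_due),
                 ("mentor feedback due", feedback_due),
                 ("submission deadline", submission_deadline)]:
    print(f"{label}: {d.isoformat()}")
```

Note that under this schedule authors still have two weeks between receiving mentor feedback and the submission deadline to act on it.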

As a starting point, here are our initial questions for the mentoring form:

  • What is the main claim or result of this paper?
  • What are the strengths of this paper?
  • What questions do you have as a reader?  What do you wish to know about the research that was carried out that is unclear as yet from the paper?
  • What aspect of the paper do you think the COLING audience will find most interesting?
  • Which paper category/review form do you think is most appropriate for this paper?
  • Taking into consideration the specific questions in that review form, in what ways could the presentation of the research be strengthened?
  • If you find grammatical or stylistic issues in the writing, or more generally think improvements are possible in the overall organization and structure, please indicate these. It may be most convenient to do so by marking up a PDF with comments.

Regarding the code of conduct: by signing up to mentor a paper, mentors agree to:

  • Maintain confidentiality: Do not share the paper draft or discuss its contents with others (without express permission from the author).  Do not appropriate the ideas in the paper.
  • Commit to prompt feedback: Read the paper and provide feedback via the form by the deadline specified.
  • Be constructive: Avoid sarcastic or harsh evaluative remarks; phrase feedback in terms of how to improve, rather than what is wrong or bad.

The benefits to authors are clear: participants will receive feedback on the presentation of their work, which, if heeded, may improve the paper’s chances of acceptance and enhance its impact once published. Perhaps the benefits to mentors are more in need of articulation. Here are the benefits we see: Mentors will be recognized through a listing in the conference handbook and website, with outstanding mentors receiving further recognition. In addition, mentoring should be rewarding in itself, because the exercise of giving constructive feedback on academic writing provides insight into what makes good writing. Finally, the mentoring program will benefit the entire COLING audience, through both improved presentation of research results and improved diversity of authors included in the conference.

Our questions for our readership at this point are:

  1. What would make this program more enticing to you as a prospective mentor or author?
  2. As a prospective mentor or author, are there additional things you’d like to see in the mentoring form?
  3. Are there points you think we should add to the code of conduct?