skills/research/research-paper-writing/references/reviewer-guidelines.md
This reference documents how reviewers evaluate papers at major ML/AI conferences, helping authors anticipate and address reviewer concerns.
All major ML conferences assess papers across four core dimensions: quality, clarity, significance, and originality.
What reviewers ask about quality:
How to ensure high quality:
What reviewers ask about clarity:
How to ensure clarity:
What reviewers ask about significance:
How to demonstrate significance:
What reviewers ask about originality:
Key insight from NeurIPS guidelines:
"Originality does not necessarily require introducing an entirely new method. Papers that provide novel insights from evaluating existing approaches or shed light on why methods succeed can also be highly original."
| Score | Label | Description |
|---|---|---|
| 6 | Strong Accept | Groundbreaking, flawless work; top 2-3% of submissions |
| 5 | Accept | Technically solid, high impact; would benefit the community |
| 4 | Borderline Accept | Solid work with limited evaluation; leans accept |
| 3 | Borderline Reject | Solid but weaknesses outweigh strengths; leans reject |
| 2 | Reject | Technical flaws or weak evaluation |
| 1 | Strong Reject | Well-known results or unaddressed ethics concerns |
Reviewers are explicitly instructed to:
Note: These dates are from the 2025 cycle. Always check the current year's call for papers at the venue website.
ICML reviewers provide:
ICML uses a similar 1-6 scale with calibration:
ICLR uses OpenReview with:
ICLR reviews include:
ACL adds NLP-specific evaluation:
ACL specifically requires a Limitations section. Reviewers check:
ACL has a dedicated ethics review process for:
AAAI reviewers evaluate papers along axes similar to those used at NeurIPS and ICML, with some differences in weighting:
| Criterion | Weight | Notes |
|---|---|---|
| Technical quality | High | Soundness of approach, correctness of results |
| Significance | High | Importance of the problem and contribution |
| Novelty | Medium-High | New ideas, methods, or insights |
| Clarity | Medium | Clear writing, well-organized presentation |
| Reproducibility | Medium | Sufficient detail to reproduce results |
COLM reviews focus on relevance to language modeling in addition to standard criteria:
| Criterion | Weight | Notes |
|---|---|---|
| Relevance | High | Must be relevant to the language modeling community |
| Technical quality | High | Sound methodology, well-supported claims |
| Novelty | Medium-High | New insights about language models |
| Clarity | Medium | Clear presentation, reproducible |
| Significance | Medium-High | Impact on LM research and practice |
COLM uses an ICLR-style scoring system:
Good reviewers follow these principles:
Strong Review Structure:
Summary (1 paragraph):
- What the paper does
- Main contribution claimed
Strengths (3-5 bullets):
- Specific positive aspects
- Why these matter
Weaknesses (3-5 bullets):
- Specific concerns
- Why these matter
- Suggestions for addressing
Questions (2-4 items):
- Clarifications needed
- Things that would change assessment
Minor Issues (optional):
- Typos, unclear sentences
- Formatting issues
Overall Assessment:
- Clear recommendation with reasoning

Quality concerns:

| Concern | How to Pre-empt |
|---|---|
| "Baselines too weak" | Use state-of-the-art baselines, cite recent work |
| "Missing ablations" | Include systematic ablation study |
| "No error bars" | Report standard deviation or standard error over multiple runs |
| "Hyperparameters not tuned" | Document tuning process, search ranges |
| "Claims not supported" | Ensure every claim has evidence |
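
The "No error bars" concern above can usually be addressed with a few lines of analysis code. The following is a minimal sketch, assuming each score comes from an independent run (for example, a different random seed) evaluated on the same test set. The function name `summarize_runs` and the accuracy values are hypothetical and serve only as an illustration.

```python
import math
import statistics


def summarize_runs(scores):
    """Return mean, sample standard deviation, and standard error of the mean.

    `scores` holds one metric value per independent run (e.g. per seed).
    """
    if len(scores) < 2:
        raise ValueError("need at least two runs to estimate variability")
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)  # sample std dev (n - 1 denominator)
    sem = std / math.sqrt(len(scores))  # standard error of the mean
    return mean, std, sem


# Hypothetical test accuracies from five seeds (illustrative values only).
accuracies = [0.912, 0.905, 0.918, 0.909, 0.915]
mean, std, sem = summarize_runs(accuracies)
print(f"accuracy: {mean:.3f} ± {std:.3f} (std over {len(accuracies)} seeds), SEM {sem:.3f}")
```

Whichever measure is reported, state in the table or figure caption whether the ± value is a standard deviation or a standard error and how many runs it covers, since the two differ by a factor of the square root of the number of runs.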

Originality concerns:

| Concern | How to Pre-empt |
|---|---|
| "Incremental contribution" | Clearly articulate what is new relative to prior work |
| "Similar to [paper X]" | Explicitly compare to X in Related Work |
| "Straightforward extension" | Highlight non-obvious aspects |

Clarity concerns:

| Concern | How to Pre-empt |
|---|---|
| "Hard to follow" | Use clear structure, signposting |
| "Notation inconsistent" | Review all notation, create notation table |
| "Missing details" | Include reproducibility appendix |
| "Figures unclear" | Self-contained captions, proper sizing |

Significance concerns:

| Concern | How to Pre-empt |
|---|---|
| "Limited impact" | Discuss broader implications |
| "Narrow evaluation" | Evaluate on multiple benchmarks |
| "Only works in restricted setting" | Acknowledge scope, explain why still valuable |
Do:
Don't:
We thank the reviewers for their thoughtful feedback.
## Reviewer 1
**R1-Q1: [Quoted concern]**
[Direct response with evidence]
**R1-Q2: [Quoted concern]**
[Direct response with evidence]
## Reviewer 2
...
## Summary of Changes
If accepted, we will:
1. [Specific change]
2. [Specific change]
3. [Specific change]
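
When a rebuttal covers several reviewers, the template above can be filled in programmatically so that no quoted concern is left without a response. This is a minimal sketch rather than a venue-provided tool: `render_rebuttal` and the shape of its arguments are hypothetical, and the inputs shown are the same placeholders used in the template.

```python
def render_rebuttal(concerns, changes):
    """Render a rebuttal skeleton in the R<reviewer>-Q<question> format.

    concerns: dict mapping reviewer number -> list of (quoted concern, response)
    changes:  list of specific changes promised if the paper is accepted
    """
    lines = ["We thank the reviewers for their thoughtful feedback.", ""]
    for reviewer in sorted(concerns):
        lines.append(f"## Reviewer {reviewer}")
        for q, (concern, response) in enumerate(concerns[reviewer], start=1):
            lines.append(f"**R{reviewer}-Q{q}: {concern}**")
            lines.append(response)
        lines.append("")  # blank line between reviewer sections
    lines.append("## Summary of Changes")
    lines.append("If accepted, we will:")
    for i, change in enumerate(changes, start=1):
        lines.append(f"{i}. {change}")
    return "\n".join(lines)


# Placeholder inputs (hypothetical; replace with the actual reviews).
draft = render_rebuttal(
    {1: [("[Quoted concern]", "[Direct response with evidence]")]},
    ["[Specific change]"],
)
print(draft)
```

Keeping concerns and responses paired in one structure makes it easy to spot a concern with an empty response before the rebuttal deadline.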
Some reviewer feedback should simply be accepted:
Acknowledge these gracefully: "The reviewer is correct that... We will revise to..."
You can respectfully disagree when:
Frame disagreements constructively: "We appreciate this perspective. However, [explanation]..."
Before submitting, ask yourself:
Quality:
Clarity:
Significance:
Originality: