scientific-skills/venue-templates/references/cs_conference_style.md
Comprehensive writing guide for ACL, EMNLP, NAACL (NLP), CHI, CSCW (HCI), SIGKDD, WWW, SIGIR (data mining/IR), and other major CS conferences.
Last Updated: 2024
CS conferences span diverse subfields with distinct writing cultures. This guide covers NLP, HCI, and data mining/IR venues, each with unique expectations and evaluation criteria.
"Strong empirical results on standard benchmarks with insightful analysis."
NLP papers balance empirical rigor with linguistic insight. Human evaluation is increasingly important alongside automatic metrics.
| Characteristic | Description |
|---|---|
| Task-focused | Clear problem definition |
| Benchmark-oriented | Standard datasets emphasized |
| Analysis-rich | Error analysis, qualitative examples |
| Reproducible | Full implementation details |
Coreference resolution remains challenging for pronouns with distant or
ambiguous antecedents. Prior neural approaches struggle with these
difficult cases due to limited context modeling. We introduce
LongContext-Coref, a retrieval-augmented coreference model that
dynamically retrieves relevant context from document history. On the
OntoNotes 5.0 benchmark, LongContext-Coref achieves 83.4 F1, improving
over the previous state-of-the-art by 1.2 points. On the challenging
WinoBias dataset, we reduce gender bias by 34% while maintaining
accuracy. Qualitative analysis reveals that our model successfully
resolves pronouns requiring world knowledge, a known weakness of
prior approaches.
├── Introduction
│ ├── Task motivation
│ ├── Prior work limitations
│ ├── Your contribution
│ └── Contribution bullets
├── Related Work
├── Method
│ ├── Problem formulation
│ ├── Model architecture
│ └── Training procedure
├── Experiments
│ ├── Datasets (with statistics)
│ ├── Baselines
│ ├── Main results
│ ├── Analysis
│ │ ├── Error analysis
│ │ ├── Ablation study
│ │ └── Qualitative examples
│ └── Human evaluation (if applicable)
├── Discussion / Limitations
└── Conclusion
Increasingly expected for generation tasks:
Table 3: Human Evaluation Results (100 samples, 3 annotators)
─────────────────────────────────────────────────────────────
Method | Fluency | Coherence | Factuality | Overall
─────────────────────────────────────────────────────────────
Baseline | 3.8 | 3.2 | 3.5 | 3.5
GPT-3.5 | 4.2 | 4.0 | 3.7 | 4.0
Our Method | 4.4 | 4.3 | 4.1 | 4.3
─────────────────────────────────────────────────────────────
Inter-annotator κ = 0.72. Scale: 1-5 (higher is better).
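The example table reports an inter-annotator κ without saying which statistic it is. As a minimal sketch, assuming nominal labels and the same number of annotators for every item, Fleiss' kappa could be computed as below. The function name `fleiss_kappa` and the toy ratings are illustrative assumptions, not taken from any paper. For ordinal 1-5 scales, a weighted statistic such as Krippendorff's alpha may be a better fit.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def fleiss_kappa(ratings: Sequence[Sequence[Hashable]]) -> float:
    """Fleiss' kappa for items each rated by the same number of annotators.

    ratings[i] holds the labels that the annotators gave to item i.
    Labels are treated as nominal categories (an assumption of this sketch).
    """
    n_items = len(ratings)
    if n_items == 0:
        raise ValueError("need at least one rated item")
    n_raters = len(ratings[0])
    if n_raters < 2 or any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number (>= 2) of ratings")

    categories = sorted({label for item in ratings for label in item}, key=repr)
    counts: List[Counter] = [Counter(item) for item in ratings]

    # Observed agreement: mean over items of the fraction of agreeing rater pairs.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_expected = sum(share ** 2 for share in shares)

    if p_expected == 1.0:
        raise ValueError("kappa is undefined when all ratings share one category")
    return (p_observed - p_expected) / (1.0 - p_expected)


if __name__ == "__main__":
    # Hypothetical ratings: 4 items, 3 annotators, 1-5 scale.
    toy = [[4, 4, 5], [3, 3, 3], [5, 4, 4], [2, 3, 3]]
    print(f"Fleiss' kappa = {fleiss_kappa(toy):.2f}")
```

When reporting the value, it helps to name the statistic and the number of items and annotators alongside it, as the table caption does.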
"Technology in service of humans—understand users first, then design and evaluate."
HCI papers are fundamentally user-centered. Technology novelty alone is insufficient; understanding human needs and demonstrating user benefit are essential.
| Characteristic | Description |
|---|---|
| User-centered | Focus on people, not technology |
| Design-informed | Grounded in design thinking |
| Empirical | User studies provide evidence |
| Reflective | Consider broader implications |
Video calling has become essential for remote collaboration, yet
current interfaces poorly support the peripheral awareness that makes
in-person work effective. Through formative interviews with 24 remote
workers, we identified three key challenges: difficulty gauging
colleague availability, lack of ambient presence cues, and interruption
anxiety. We designed AmbientOffice, a peripheral display system that
conveys teammate presence through subtle ambient visualizations. In a
two-week deployment study with 18 participants across three distributed
teams, AmbientOffice increased spontaneous collaboration by 40% and
reduced perceived isolation (p<0.01). Participants valued the system's
non-intrusive nature and reported feeling more connected to remote
colleagues. We discuss implications for designing ambient awareness
systems and the tension between visibility and privacy in remote work.
├── Introduction
│ ├── Problem in human terms
│ ├── Why technology can help
│ └── Contribution summary
├── Related Work
│ ├── Domain background
│ ├── Prior systems
│ └── Theoretical frameworks
├── Formative Work (often)
│ ├── Interviews / observations
│ └── Design requirements
├── System Design
│ ├── Design rationale
│ ├── Implementation
│ └── Interface walkthrough
├── Evaluation
│ ├── Study design
│ ├── Participants
│ ├── Procedure
│ ├── Findings (quant + qual)
│ └── Limitations
├── Discussion
│ ├── Design implications
│ ├── Generalizability
│ └── Future work
└── Conclusion
├── Introduction
├── Related Work
├── Methods
│ ├── Participants
│ ├── Procedure
│ ├── Data collection
│ └── Analysis method (thematic, grounded theory, etc.)
├── Findings
│ ├── Theme 1 (with quotes)
│ ├── Theme 2 (with quotes)
│ └── Theme 3 (with quotes)
├── Discussion
│ ├── Implications for design
│ ├── Implications for research
│ └── Limitations
└── Conclusion
Use direct quotes to ground findings:
Participants valued the ambient nature of the display. As P7 described:
"It's like having a window to my teammate's office. I don't need to
actively check it, but I know they're there." This passive awareness
reduced the barrier to initiating contact.
Translate findings into actionable guidance:
**Implication 1: Support peripheral awareness without demanding attention.**
Ambient displays should be visible in peripheral vision but not require
active monitoring. Designers should consider calm technology principles.
**Implication 2: Balance visibility with privacy.**
Users want to share presence but fear surveillance. Systems should
provide granular controls and make visibility mutual.
acmart document class with sigchi option
"Scalable methods for real-world data with demonstrated practical impact."
Data mining papers emphasize scalability, real-world applicability, and solid experimental methodology.
| Characteristic | Description |
|---|---|
| Scalable | Handle large datasets |
| Practical | Real-world applications |
| Reproducible | Datasets and code shared |
| Industrial | Industry datasets valued |
Fraud detection in e-commerce requires processing millions of
transactions in real-time while adapting to evolving attack patterns.
We present FraudShield, a graph neural network framework for real-time
fraud detection that scales to billion-edge transaction graphs. Unlike
prior methods that require full graph access, FraudShield uses
incremental updates with O(1) inference cost per transaction. On a
proprietary dataset of 2.3 billion transactions from a major e-commerce
platform, FraudShield achieves 94.2% precision at 80% recall,
outperforming production baselines by 12%. The system has been deployed
at [Company], processing 50K transactions per second and preventing
an estimated $400M in annual fraud losses. We release an anonymized
benchmark dataset and code.
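An operating-point metric such as the "precision at 80% recall" in the example abstract depends on how the threshold is picked. One common convention, assumed here, is to report the best precision over all score thresholds whose recall meets the target. The sketch below uses a hypothetical helper `precision_at_recall` and made-up scores; it is not the evaluation code of any particular system.

```python
from typing import Sequence


def precision_at_recall(
    scores: Sequence[float], labels: Sequence[int], target_recall: float
) -> float:
    """Best precision among thresholds whose recall is >= target_recall.

    scores: higher means more likely positive (e.g. fraud).
    labels: 1 for positive, 0 for negative.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one positive label")

    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    best = 0.0
    true_pos = 0
    i = 0
    while i < len(ranked):
        # Items with tied scores fall on the same side of any threshold,
        # so they are consumed together.
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            true_pos += ranked[j][1]
            j += 1
        recall = true_pos / total_pos
        precision = true_pos / j  # j items are predicted positive so far
        if recall >= target_recall:
            best = max(best, precision)
        i = j
    return best


if __name__ == "__main__":
    # Hypothetical model scores and ground-truth labels.
    scores = [0.9, 0.8, 0.7, 0.6, 0.5]
    labels = [1, 0, 1, 1, 0]
    print(precision_at_recall(scores, labels, target_recall=0.8))
```

Whichever convention a paper uses, stating it next to the number lets reviewers compare it against baselines.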
├── Introduction
│ ├── Problem and impact
│ ├── Technical challenges
│ ├── Your approach
│ └── Contributions
├── Related Work
├── Preliminaries
│ ├── Problem definition
│ └── Notation
├── Method
│ ├── Overview
│ ├── Technical components
│ └── Complexity analysis
├── Experiments
│ ├── Datasets (with scale statistics)
│ ├── Baselines
│ ├── Main results
│ ├── Scalability experiments
│ ├── Ablation study
│ └── Case study / deployment
└── Conclusion
Table 4: Scalability Comparison (runtime in seconds)
──────────────────────────────────────────────────────
Dataset | Nodes | Edges | GCN | GraphSAGE | Ours
──────────────────────────────────────────────────────
Cora | 2.7K | 5.4K | 0.3 | 0.2 | 0.1
Citeseer | 3.3K | 4.7K | 0.4 | 0.3 | 0.1
PubMed | 19.7K | 44.3K | 1.2 | 0.8 | 0.3
ogbn-arxiv | 169K | 1.17M | 8.4 | 4.2 | 1.6
ogbn-papers | 111M | 1.6B | OOM | OOM | 42.3
──────────────────────────────────────────────────────
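Readers often want the speedup factors implied by a runtime table like this one. As a small sketch, the snippet below derives them from the GCN and Ours columns of the example table, treating OOM entries as missing. The helper names `speedup` and `format_speedups` are hypothetical.

```python
from typing import List, Optional, Tuple, Union

Runtime = Union[float, str]  # seconds, or the string "OOM"


def speedup(baseline: Runtime, ours: Runtime) -> Optional[float]:
    """Return baseline / ours, or None when either run did not finish."""
    if isinstance(baseline, str) or isinstance(ours, str):
        return None
    if ours <= 0:
        raise ValueError("runtime must be positive")
    return baseline / ours


# (dataset, GCN seconds, our seconds), copied from the example table.
ROWS: List[Tuple[str, Runtime, Runtime]] = [
    ("Cora", 0.3, 0.1),
    ("Citeseer", 0.4, 0.1),
    ("PubMed", 1.2, 0.3),
    ("ogbn-arxiv", 8.4, 1.6),
    ("ogbn-papers", "OOM", 42.3),
]


def format_speedups(rows: List[Tuple[str, Runtime, Runtime]]) -> List[str]:
    """One line per dataset; out-of-memory runs are reported as n/a."""
    lines = []
    for name, base, ours in rows:
        factor = speedup(base, ours)
        text = "n/a (OOM)" if factor is None else f"{factor:.1f}x"
        lines.append(f"{name}: {text}")
    return lines


if __name__ == "__main__":
    print("\n".join(format_speedups(ROWS)))
```

Reporting OOM separately, rather than as an infinite speedup, keeps the comparison honest about what was actually measured.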
Used across all CS venues:
Our contributions are:
• We identify [problem/insight]
• We propose [method name] that [key innovation]
• We demonstrate [results] on [benchmarks]
• We release [code/data] at [URL]
All CS venues increasingly expect:
| Aspect | ACL/EMNLP | CHI | KDD/WWW | SIGIR |
|---|---|---|---|---|
| Focus | NLP tasks | User studies | Scalable ML | IR/search |
| Evaluation | Benchmarks + human | User studies | Large-scale experiments | Datasets |
| Theory weight | Moderate | Low | Moderate | Moderate |
| Industry value | High | Medium | Very high | High |
| Page limit | 8 long / 4 short | 10 + refs | 9 + refs | 10 + refs |
| Review style | ARR | Direct | Direct | Direct |
- venue_writing_styles.md - Master style overview
- ml_conference_style.md - NeurIPS/ICML style guide
- conferences_formatting.md - Technical formatting requirements
- reviewer_expectations.md - What CS reviewers seek