scientific-skills/scientific-slides/references/data_visualization_slides.md
Effective data visualization for presentations differs fundamentally from figure design for journals. While publications prioritize comprehensive detail, presentation slides must emphasize clarity, impact, and immediate comprehension. This guide covers adapting figures for slides, choosing appropriate chart types, and avoiding common visualization mistakes.
The Core Difference:
Simplification Strategies:
Remove Non-Essential Elements:
Focus on Key Message:
Example Transformation:
Journal Figure:
- 6 panels (A-F)
- 4 experimental conditions per panel
- 50+ data points visible
- Complex statistical annotations
- Small font labels
Presentation Version:
- 3 separate slides (1-2 panels each)
- Focus on key comparison per slide
- Large, clear data representation
- One statistical result highlighted
- Large, readable labels
Guide Attention:
Techniques:
Color Emphasis:
Main Result: Bold, saturated color (e.g., blue)
Comparison: Muted gray or desaturated color
Background: Very light gray or white
Size Emphasis:
Key line/bar: Thicker (3-4pt)
Reference lines: Thinner (1-2pt)
Grid lines: Very thin (0.5pt) or remove
Annotation:
Add text callouts: "34% increase" with arrow
Add shapes: Circle key region
Add color highlights: Background shading for important area
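The three emphasis techniques above can be combined in a few lines of matplotlib. This is a minimal sketch, not a prescribed recipe: the group names and values are hypothetical, and the helper `emphasis_colors` is an illustrative name, not part of any library.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical values, for illustration only
groups = ["Control", "Drug A", "Drug B"]
values = [50, 67, 54]
key = "Drug A"

def emphasis_colors(labels, key_label, main="#0173B2", muted="#B0B0B0"):
    """Saturated color for the key group, muted gray for everything else."""
    return [main if label == key_label else muted for label in labels]

fig, ax = plt.subplots(figsize=(10, 6))
ax.set_facecolor("white")
positions = list(range(len(groups)))
ax.bar(positions, values, color=emphasis_colors(groups, key))
ax.set_xticks(positions)
ax.set_xticklabels(groups, fontsize=18)
ax.axhline(values[0], color="gray", linewidth=1, linestyle="--")  # thin reference line
ax.grid(False)  # no grid lines
ax.set_ylim(0, 85)

# Text callout with an arrow pointing at the key bar
pct = round(100 * (values[1] - values[0]) / values[0])
ax.annotate(
    f"{pct}% increase",
    xy=(1, values[1]),
    xytext=(1.6, values[1] + 8),
    fontsize=20,
    arrowprops=dict(arrowstyle="->", linewidth=2),
)
```

Only one element carries the saturated color, so the callout and the bar reinforce each other instead of competing.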
Font Sizes for Presentations:
The Distance Test:
Line and Marker Sizes:
Build Complex Figures Incrementally:
Instead of showing the complete figure at once:
Benefits:
Implementation:
\pause or overlays
Best For:
Presentation Optimization:
✅ DO:
- Large, clear bars with adequate spacing
- Horizontal bars for long category names
- Direct labeling on bars (not legend)
- Order by value (highest to lowest) unless natural order exists
- Start y-axis at zero for accurate visual comparison
❌ DON'T:
- Too many categories (max 8-10)
- 3D bars (distorts perception)
- Multiple grouped comparisons (split to separate slides)
- Decorative patterns or gradients
Example Enhancement:
Before: 12 categories, small fonts, legend
After: Top 6 categories only, large fonts, direct labels, key bar highlighted
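The "after" version of this enhancement might look like the following sketch. The category counts are hypothetical, `top_n` is an illustrative helper, and `Axes.bar_label` assumes matplotlib 3.4 or newer.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical counts for 12 categories (illustration only)
counts = {
    f"Category {name}": value
    for name, value in zip("ABCDEFGHIJKL", [12, 45, 7, 33, 28, 51, 9, 19, 3, 40, 15, 22])
}

def top_n(data, n=6):
    """Keep the n largest categories, ordered from highest to lowest."""
    return sorted(data.items(), key=lambda item: item[1], reverse=True)[:n]

top = top_n(counts)
labels = [name for name, _ in top]
values = [value for _, value in top]

fig, ax = plt.subplots(figsize=(10, 6))
colors = ["#0173B2"] + ["#B0B0B0"] * (len(top) - 1)  # highlight the key (largest) bar
bars = ax.barh(range(len(top)), values, color=colors)
ax.set_yticks(range(len(top)))
ax.set_yticklabels(labels, fontsize=18)
ax.invert_yaxis()                            # largest bar at the top
ax.bar_label(bars, fontsize=18, padding=4)   # direct labels instead of a legend
ax.set_xlim(0, max(values) * 1.15)           # bars start at zero, with room for labels
ax.tick_params(axis="x", labelsize=16)
for side in ("top", "right"):
    ax.spines[side].set_visible(False)
```

Horizontal bars keep long category names readable without rotating text.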
Best For:
Presentation Optimization:
✅ DO:
- Thick lines (2-4pt)
- Distinct colors AND line styles (solid, dashed, dotted)
- Direct line labeling (at end of lines, not legend)
- Highlight key line with color/thickness
- Minimal gridlines or none
- Clear markers at data points
❌ DON'T:
- More than 4-5 lines per plot
- Similar colors (ensure high contrast)
- Small markers or thin lines
- Cluttered with excess gridlines
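Direct line labeling and redundant color plus line-style coding can be sketched as below. The two series are hypothetical, and the specific widths and colors are one reasonable choice within the ranges listed above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Hypothetical time series (illustration only)
x = [0, 1, 2, 3, 4]
series = {
    "Treatment": [1.0, 1.6, 2.5, 3.1, 3.9],
    "Control": [1.0, 1.1, 1.3, 1.2, 1.4],
}
# Key line: saturated, thick, solid. Comparison: gray, thinner, dashed.
styles = {
    "Treatment": dict(color="#0173B2", linewidth=4, linestyle="-"),
    "Control": dict(color="#808080", linewidth=2, linestyle="--"),
}

fig, ax = plt.subplots(figsize=(10, 6))
end_labels = []
for name, y in series.items():
    ax.plot(x, y, marker="o", markersize=10, **styles[name])
    # Label each line at its last point instead of using a legend
    label = ax.annotate(
        name,
        xy=(x[-1], y[-1]),
        xytext=(8, 0),
        textcoords="offset points",
        va="center",
        fontsize=18,
        color=styles[name]["color"],
    )
    end_labels.append(label)
ax.set_xlim(0, 4.8)  # leave room on the right for the labels
ax.grid(False)
```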
Time Series Tips:
Best For:
Presentation Optimization:
✅ DO:
- Large, distinct markers (8-12pt)
- Color code groups clearly
- Show trendline if discussing correlation
- Annotate key points (outliers, examples)
- Report R² or p-value directly on plot
❌ DON'T:
- Overplot (too many overlapping points)
- Small markers
- Multiple marker types that look similar
- Missing scale information
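A scatter plot with a trendline and R² printed on the plot could be sketched like this. The data are synthetic (seeded random noise around a line), and a simple least-squares fit via `numpy.polyfit` is assumed; use whatever model your analysis actually reports.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data for illustration: a linear trend plus noise
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 40)
y = 2.0 * x + 1.0 + rng.normal(0, 2.0, size=x.size)

# Least-squares line and coefficient of determination
slope, intercept = np.polyfit(x, y, 1)
r_squared = float(np.corrcoef(x, y)[0, 1] ** 2)

fig, ax = plt.subplots(figsize=(10, 6))
# s is the marker area in points^2, so s=100 gives roughly a 10pt-wide marker
ax.scatter(x, y, s=100, color="#0173B2", alpha=0.8)
ax.plot(x, slope * x + intercept, color="black", linewidth=3)
ax.text(0.05, 0.92, f"R² = {r_squared:.2f}", transform=ax.transAxes, fontsize=20)
ax.set_xlabel("Predictor (units)", fontsize=20)
ax.set_ylabel("Response (units)", fontsize=20)
```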
Overplotting Solutions:
Best For:
Presentation Optimization:
✅ DO:
- Large, clear boxes
- Color code groups
- Add individual data points if n is small (< 30)
- Annotate median or mean values
- Explain components (quartiles, whiskers) first time shown
❌ DON'T:
- Assume audience knows box plot conventions
- Use without brief explanation
- Too many groups (max 6-8)
- Omit axis labels and units
First Use: If your audience may be unfamiliar, briefly explain: "Box shows middle 50% of data, line is median, whiskers show range." If your plotting tool draws whiskers to 1.5×IQR and plots outliers separately, say so instead.
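One way to follow these box plot guidelines is sketched below: individual points are overlaid only when a group has fewer than 30 observations, and each median is annotated. The samples are synthetic. Note that matplotlib's default whiskers extend to 1.5×IQR rather than the full range, which is worth stating on the slide.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic samples for illustration (n = 12 per group)
rng = np.random.default_rng(1)
groups = {
    "Control": rng.normal(10, 2, 12),
    "Treated": rng.normal(13, 2, 12),
}

fig, ax = plt.subplots(figsize=(10, 6))
positions = list(range(1, len(groups) + 1))
ax.boxplot(list(groups.values()), positions=positions, widths=0.5, showfliers=False)
ax.set_xticks(positions)
ax.set_xticklabels(list(groups.keys()), fontsize=18)

medians = {}
for pos, (name, vals) in zip(positions, groups.items()):
    if len(vals) < 30:
        # Small n: show every observation, with slight horizontal jitter
        jitter = rng.uniform(-0.08, 0.08, size=len(vals))
        ax.scatter(pos + jitter, vals, s=60, color="#0173B2", alpha=0.7, zorder=3)
    medians[name] = float(np.median(vals))
    ax.text(pos + 0.3, medians[name], f"median {medians[name]:.1f}", va="center", fontsize=16)

ax.set_ylabel("Response (a.u.)", fontsize=20)
```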
Best For:
Presentation Optimization:
✅ DO:
- Large cells (readable grid)
- Clear, intuitive color scale (diverging or sequential)
- Label rows and columns with large fonts
- Show color scale legend prominently
- Cluster or order meaningfully
- Highlight key region with border
❌ DON'T:
- Too many rows/columns (a 200×200 matrix is unreadable)
- Poor color scales (rainbow, red-green)
- Missing dendrograms if claiming clusters
- Tiny labels
Simplification:
Best For:
Presentation Optimization:
✅ DO:
- Large nodes and labels
- Clear edge directionality (arrows)
- Color or size code importance
- Highlight path of interest
- Simplify to essential connections
- Use layout that minimizes crossing edges
❌ DON'T:
- Show entire complex network at once
- Hairball diagrams (too many connections)
- Small labels on nodes
- Unclear what nodes and edges represent
Build Strategy:
Kaplan-Meier Survival Curves:
✅ Optimize:
- Thick lines (3-4pt)
- Show confidence intervals as shaded regions
- Mark censored observations clearly
- Report hazard ratio and p-value on plot
- Extend axes to show full follow-up
Forest Plots:
✅ Optimize:
- Large markers (diamonds or squares)
- Clear confidence interval bars
- Large font for study names
- Highlight overall estimate
- Show line of no effect prominently
ROC Curves:
✅ Optimize:
- Thick curve line
- Show diagonal reference line (AUC = 0.5)
- Report AUC with confidence interval on plot
- Mark optimal threshold if discussing cutpoint
- Compare ≤ 3 curves per plot
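As a rough illustration of the ROC guidelines, the sketch below computes the curve and AUC by hand with NumPy and draws them with the diagonal reference line. The labels and scores are hypothetical, `roc_points` and `trapezoid_auc` are illustrative helpers that assume binary 0/1 labels with no tied scores, and no confidence interval is computed here.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def roc_points(labels, scores):
    """Return (fpr, tpr), sweeping the threshold from the highest score down.

    Assumes binary 0/1 labels and no tied scores.
    """
    order = np.argsort(-np.asarray(scores, dtype=float), kind="stable")
    y = np.asarray(labels)[order]
    tps = np.cumsum(y)        # true positives at each threshold
    fps = np.cumsum(1 - y)    # false positives at each threshold
    tpr = np.concatenate([[0.0], tps / tps[-1]])
    fpr = np.concatenate([[0.0], fps / fps[-1]])
    return fpr, tpr

def trapezoid_auc(fpr, tpr):
    """Area under the curve by the trapezoid rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Hypothetical classifier output (illustration only)
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
fpr, tpr = roc_points(labels, scores)
auc = trapezoid_auc(fpr, tpr)

fig, ax = plt.subplots(figsize=(7, 7))
ax.plot(fpr, tpr, color="#0173B2", linewidth=4)
ax.plot([0, 1], [0, 1], color="gray", linewidth=2, linestyle="--")  # AUC = 0.5 reference
ax.text(0.55, 0.1, f"AUC = {auc:.2f}", fontsize=20)
ax.set_xlabel("False positive rate", fontsize=20)
ax.set_ylabel("True positive rate", fontsize=20)
```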
When to Use: Ordered data (low to high)
Good Palettes:
Avoid:
When to Use: Data with meaningful midpoint (e.g., +/− change, correlation from -1 to +1)
Good Palettes:
Key Principle: Midpoint should be visually neutral (white or light gray)
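A minimal sketch of the neutral-midpoint principle: use symmetric color limits so that zero maps to the center of a diverging colormap. The matrix is synthetic, and `RdBu_r` is one of matplotlib's built-in diverging maps, chosen here only as an example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# Synthetic correlation-like values in [-1, 1] (illustration only)
rng = np.random.default_rng(2)
data = rng.uniform(-1, 1, size=(6, 6))

# Symmetric limits put zero exactly at the middle of the color scale
limit = float(np.abs(data).max())

fig, ax = plt.subplots(figsize=(8, 6))
image = ax.imshow(data, cmap="RdBu_r", vmin=-limit, vmax=limit)
cbar = fig.colorbar(image, ax=ax)
cbar.ax.tick_params(labelsize=16)  # prominent, readable color scale legend

midpoint_position = float(image.norm(0.0))     # where zero falls on the 0-1 color scale
midpoint_color = image.cmap(midpoint_position)  # RGBA at the midpoint, near white
```

If the data range is lopsided (say -0.2 to +0.9), letting matplotlib pick the limits automatically would shift the neutral color away from zero, which is why the limits are set explicitly.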
When to Use: Distinct groups with no order
Good Practices:
Example Set:
Blue (#0173B2)
Orange (#DE8F05)
Green (#029E73)
Purple (#CC78BC)
Red (#CA3542)
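To apply the example set consistently, it can be installed as matplotlib's default color cycle. This is a sketch; the variable name `palette` is arbitrary, and the hex codes are copied from the list above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from cycler import cycler
from matplotlib.colors import to_hex

# Hex codes from the example set above
palette = ["#0173B2", "#DE8F05", "#029E73", "#CC78BC", "#CA3542"]

# Every subsequent plot cycles through these colors in order
plt.rcParams["axes.prop_cycle"] = cycler(color=palette)

fig, ax = plt.subplots(figsize=(10, 6))
lines = [ax.plot([0, 1], [i, i + 1], linewidth=3)[0] for i in range(len(palette))]
used_colors = [to_hex(line.get_color()) for line in lines]
```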
Strategy: Use color to direct attention
Main Result: Bright, saturated color (e.g., blue)
Comparison: Neutral (gray) or muted color
Background: Very light gray or white
Example Application:
Problem: Showing too much data at once
Example:
Solution:
Problem: Text too small to read
Common Issues:
Solution:
Problem: Unnecessary decorative elements
Examples:
Solution:
Problem: Visual representation distorts data
Examples:
Solution:
Problem: Colors reduce clarity or accessibility
Examples:
Solution:
Problem: Audience can't interpret visualization
Missing Elements:
Solution:
Problem: Wrong visualization for data type
Examples:
Solution:
Scenario: Showing multi-panel experimental result
Approach 1: Sequential Panels
Slide 1: Panel A only (baseline condition)
Slide 2: Panels A+B (add treatment effect)
Slide 3: Panels A+B+C (add time course)
Slide 4: All panels with interpretation overlay
Approach 2: Layered Data
Slide 1: Axes and experimental design schematic
Slide 2: Add control group data
Slide 3: Add treatment group data
Slide 4: Highlight difference, show statistics
Approach 3: Zoom and Context
Slide 1: Full dataset overview
Slide 2: Zoom to interesting region
Slide 3: Highlight specific points in zoomed view
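Approach 2 (layered data) can be produced from a single figure by saving a snapshot after each layer is added. The sketch below assumes hypothetical data and file names; `build_layered_slides` is an illustrative helper, not an existing tool.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

def build_layered_slides(out_dir):
    """Save one PNG per build step: axes, + control, + treatment, + highlight."""
    x = [0, 1, 2, 3, 4]
    control = [1.0, 1.1, 1.2, 1.2, 1.3]    # hypothetical data
    treatment = [1.0, 1.5, 2.1, 2.8, 3.4]  # hypothetical data

    fig, ax = plt.subplots(figsize=(10, 6))
    # Fix the limits up front so the axes do not jump between builds
    ax.set_xlim(0, 4.8)
    ax.set_ylim(0, 4)
    ax.set_xlabel("Time (h)", fontsize=20)
    ax.set_ylabel("Signal (a.u.)", fontsize=20)

    paths = []

    def snapshot(step):
        path = os.path.join(out_dir, f"build_{step}.png")
        fig.savefig(path, dpi=150)
        paths.append(path)

    snapshot(1)                                               # axes only
    ax.plot(x, control, color="#808080", linewidth=3, marker="o")
    snapshot(2)                                               # + control group
    ax.plot(x, treatment, color="#0173B2", linewidth=4, marker="o")
    snapshot(3)                                               # + treatment group
    ax.annotate("", xy=(4, treatment[-1]), xytext=(4, control[-1]),
                arrowprops=dict(arrowstyle="<->", linewidth=2))
    ax.text(4.1, 2.3, "difference", fontsize=18)
    snapshot(4)                                               # + highlight
    plt.close(fig)
    return paths

slide_paths = build_layered_slides(tempfile.mkdtemp())
```

Because every snapshot comes from the same figure with fixed limits, the images align exactly when placed on consecutive slides.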
Use Animation (PowerPoint/Beamer overlays):
Use Separate Slides:
For Generated Figures:
For Published Figures:
Edit in Graphics Software:
Tools:
Check:
Test:
Enhancements:
Verbal Integration:
Recreate When:
Reuse When:
Python (matplotlib, seaborn):
import matplotlib.pyplot as plt
import seaborn as sns
# Set presentation-friendly defaults
plt.rcParams['font.size'] = 18
plt.rcParams['axes.linewidth'] = 2
plt.rcParams['lines.linewidth'] = 3
plt.rcParams['figure.figsize'] = (10, 6)
# Create plot with large, clear elements
# Export as high-res PNG or PDF
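The two placeholder comments above could be filled in along these lines. This is a sketch with made-up data; it scopes the same defaults with `plt.rc_context` so they do not leak into other figures, and writes the PNG to an in-memory buffer (pass a file name to `savefig` instead to write to disk).

```python
import io

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Same presentation-friendly defaults as above, applied only inside the block
presentation_rc = {
    "font.size": 18,
    "axes.linewidth": 2,
    "lines.linewidth": 3,
    "figure.figsize": (10, 6),
}

with plt.rc_context(presentation_rc):
    fig, ax = plt.subplots()
    # Made-up data, for illustration only
    (line,) = ax.plot([0, 1, 2, 3], [0, 1, 4, 9], marker="o", markersize=10)
    ax.set_xlabel("x")
    ax.set_ylabel("x squared")

    # Export as high-resolution PNG; use format="pdf" for vector output
    buffer = io.BytesIO()
    fig.savefig(buffer, format="png", dpi=300, bbox_inches="tight")
```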
R (ggplot2):
library(ggplot2)
# Presentation theme
theme_presentation <- theme_minimal() +
  theme(
    text = element_text(size = 18),
    axis.text = element_text(size = 16),
    axis.title = element_text(size = 20),
    legend.text = element_text(size = 16)
  )
# Apply to plots
ggplot(data, aes(x, y)) + geom_point(size=4) + theme_presentation
GraphPad Prism:
Excel/PowerPoint:
Before including a figure in your presentation:
Clarity:
Readability:
Design:
Context:
Technical Quality:
Progressive Disclosure (if complex):