src/process/resources/skills/officecli-academic-paper/SKILL.md
This skill is a scene layer on top of officecli-docx. Every docx hard rule — style architecture, heading hierarchy, shell quoting, break=newPage alias, belt-and-suspenders page breaks, live PAGE field, Delivery Gate, renderer quirks — is inherited, not re-taught. This file adds only what academic papers need on top: citation styles, equations, SEQ / PAGEREF cross-refs, multi-column journal layout, bibliography hanging indent, abstract/keywords/affiliation block.
When the docx base rules cover it, the text here says → see docx v2 §X. Read docx v2 first if you have not.
If officecli is missing:
# macOS / Linux
curl -fsSL https://raw.githubusercontent.com/iOfficeAI/OfficeCLI/main/install.sh | bash
# Windows (PowerShell)
irm https://raw.githubusercontent.com/iOfficeAI/OfficeCLI/main/install.ps1 | iex
Verify with officecli --version (open a new terminal if PATH hasn't picked up the change). If install fails, download a binary from https://github.com/iOfficeAI/OfficeCLI/releases.
This skill teaches what an academic paper requires, not every command flag. When a prop name, enum value, or field instruction is uncertain, consult help BEFORE guessing.
officecli help docx # All docx elements
officecli help docx <element> # Full schema (e.g. section, equation, field, footnote)
officecli help docx <element> --json # Machine-readable
Help is pinned to the installed CLI version. When this skill and help disagree, help wins. Every --prop X= in this file has been grep-verified against officecli help docx <element> — if help adds / renames a prop in a later version, trust help.
Inherits docx v2. You should have read skills/officecli-docx/SKILL.md first. This skill assumes you know how to add paragraphs, set styles, build tables, insert images, manage TOC/footer/headers, force page breaks, and run the Delivery Gate. If any of those are unfamiliar, open a second session on docx v2 before continuing.
Shell quoting, incremental execution, $FILE convention → see docx v2 §Shell & Execution Discipline. The same rules apply here verbatim — quote [N] paths, single-quote any value containing $ (including $2.8B in a body paragraph or @ DOIs), never hand-write \$ \t \n in executable examples, one command at a time. Academic-paper examples below use $FILE as a shell variable (FILE="thesis.docx").
An academic paper is a docx with a scholarly layer on top: verifiable citations, precise equations, cross-refs that stay in sync, a formatted reference list. The base docx rules still apply; academic adds six deltas:
- Citation style: pick one; the in-text shape ([1] vs (Smith, 2024)) follows.
- Equations: inline oMath inside prose, display oMathPara as standalone blocks, optionally numbered.
- Figure and table cross-references: SEQ Figure / SEQ Table fields count them; PAGEREF links "see Fig. 2" to its live page number.

Stay in docx v2 for white papers, policy briefs, technical reports, HR templates: anything without a venue / citation style. Use this skill only when the document will carry at least TWO of: citation-style biblio, equations, SEQ/PAGEREF cross-refs, multi-column, abstract + keywords block.
Everything in docx v2 §Requirements for Outputs applies. On top of that, academic papers MUST meet these additional rules:
- Abstract: block-style, no firstLineIndent. Use spaceAfter=12pt for paragraph separation. If view issues reports "body paragraph missing first-line indent" on an Abstract paragraph, it's a false positive; ignore it.
- Reference-list entries: indent=720 hangingIndent=720 (left indent 0.5", first-line reversed by the same amount). First line flush left; wrapped lines indent under the author name.
- Figure and table captions: live SEQ Figure / SEQ Table fields (not hardcoded "Figure 1" text that drifts when you insert a new figure mid-document). Delivery Gate 5 verifies.

Academic covers differ from professional covers. Minimum elements: title (centered, 20-22pt bold), author(s), affiliation, submission target or journal, date, abstract, keywords. The "60% fill" rule from docx v2 §Visual delivery floor still applies: a three-line cover with half a page of whitespace is a fail. See §Abstract / keywords / affiliation block below for the first-page recipe.
Academic section numbers are part of the heading text, not computed via list numbering. officecli's numId/listStyle mechanism is fragile across Heading1 re-use, so hand-write the prefix. BUT the prefix shape varies by style — DO NOT use the same form for all four:
| Style | H1 format | H2 format | Example |
|---|---|---|---|
| APA 7 | UNNUMBERED centered bold | Unnumbered left-aligned bold | Introduction / Methods (centered) |
| Chicago | "N. Title" left-aligned | "N.M Title" | 1. Introduction, 2.1 Policy Formation |
| IEEE | "N. TITLE" ALL CAPS + Roman numerals | A. Subtitle title case | I. INTRODUCTION, II. RELATED WORK, A. Datasets |
| MLA 9 | Unnumbered left-aligned bold | Same | Literature Review (no prefix) |
APA 7 L1 headings are centered, bold, unnumbered; L2 are flush-left bold; L3 flush-left bold italic; L4/L5 run-in. Do NOT prefix APA headings with 1. / 2. — that is Chicago/IEEE convention. IEEE wants ALL CAPS with Roman numerals (I. INTRODUCTION); inside each section, use A./B./C. sub-headings (title case). Arabic-numbered body sections are Chicago-style only.
Exception for all four: References / Bibliography / Works Cited / Acknowledgments are unnumbered regardless of style — omit the N. prefix.
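Because the prefix is hand-written, it can help to generate it instead of typing it. The sketch below is a minimal shell helper and not part of officecli; the function names to_roman and heading_text are hypothetical, and it only covers the H1 shapes from the table above.

```sh
# Hypothetical helpers (not officecli commands): build H1 text per citation style.

# to_roman N: print the upper-case Roman numeral for N (1..3999).
# The ( ) body runs in a subshell, so its variables stay local.
to_roman() (
  n=$1
  out=""
  for pair in 1000:M 900:CM 500:D 400:CD 100:C 90:XC 50:L 40:XL 10:X 9:IX 5:V 4:IV 1:I; do
    value=${pair%%:*}
    glyph=${pair#*:}
    while [ "$n" -ge "$value" ]; do
      out="$out$glyph"
      n=$((n - value))
    done
  done
  printf '%s\n' "$out"
)

# heading_text STYLE N TITLE: print the Heading1 text in that style's shape.
heading_text() {
  case "$1" in
    ieee)    printf '%s. %s\n' "$(to_roman "$2")" "$(printf '%s' "$3" | tr '[:lower:]' '[:upper:]')" ;;
    chicago) printf '%s. %s\n' "$2" "$3" ;;
    apa|mla) printf '%s\n' "$3" ;;
    *)       printf 'unknown style: %s\n' "$1" >&2; return 1 ;;
  esac
}
```

The result can be passed straight into the usual paragraph command, for example --prop text="$(heading_text ieee 2 "Related Work")" together with --prop style=Heading1. Unnumbered sections such as References should bypass the helper and use the plain title.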
FILE="paper.docx"
officecli create "$FILE"
officecli open "$FILE"
officecli set "$FILE" / --prop defaultFont="Times New Roman"
officecli add "$FILE" /body --type paragraph --prop text="Remote Work and Team Cohesion" --prop align=center --prop size=20pt --prop bold=true --prop spaceAfter=24pt
officecli add "$FILE" /body --type paragraph --prop text="Alice Chen" --prop align=center --prop size=12pt
officecli add "$FILE" /body --type paragraph --prop text="Department of Psychology, Stanford University" --prop align=center --prop size=11pt --prop spaceAfter=24pt
officecli add "$FILE" /body --type paragraph --prop text="Abstract" --prop align=center --prop size=14pt --prop bold=true --prop spaceBefore=12pt --prop spaceAfter=6pt
officecli add "$FILE" /body --type paragraph --prop text="This study examines remote-work adoption on team cohesion across 18 months..." --prop size=12pt --prop lineSpacing=2x --prop spaceAfter=12pt
officecli add "$FILE" /body --type paragraph --prop text="Keywords: remote work, team cohesion, psychological safety" --prop italic=true --prop size=11pt --prop spaceAfter=18pt
officecli add "$FILE" /body --type paragraph --prop text="Introduction" --prop style=Heading1 --prop align=center --prop size=20pt --prop bold=true --prop spaceBefore=18pt --prop spaceAfter=12pt
officecli add "$FILE" /body --type paragraph --prop text="Remote-work research (Smith, 2024) has expanded since 2020..." --prop size=12pt --prop lineSpacing=2x --prop firstLineIndent=720
officecli add "$FILE" /body --type paragraph --prop text="References" --prop style=Heading1 --prop align=center --prop size=20pt --prop bold=true --prop spaceBefore=18pt --prop spaceAfter=12pt
officecli add "$FILE" /body --type paragraph --prop text="Smith, J. (2024). Remote work and cohesion. Journal of Applied Psychology, 109(3), 412-430." --prop size=12pt --prop lineSpacing=2x --prop indent=720 --prop hangingIndent=720
officecli add "$FILE" / --type footer --prop type=default --prop align=center --prop size=10pt --prop field=page
officecli close "$FILE"
officecli validate "$FILE"
Minimal APA-style skeleton. Real papers grow by adding more body paragraphs, more bibliography entries (each with the same indent=720 hangingIndent=720 pair), figures / tables with captions, and a TOC if there are 3+ Heading1s. The Quick Start validates clean; the sections below elaborate each dimension.
Four mainstream families. Pick one at project start; every downstream decision follows. Per-style decision table:
| Style | In-text shape | Reference list order | Body line spacing | Footnotes? |
|---|---|---|---|---|
| APA 7 | (Smith, 2024) or Smith (2024) | Alphabetical by author | 2x (double) | Rare (content notes only) |
| Chicago 17 (Notes-Bib) | Superscript footnote number | Alphabetical by author | 1.5x-2x | Primary (full citation in footnote) |
| IEEE | [1], [2], ..., [N] | Order of first citation | 1.15x-1.5x, 2-col | Rare |
| MLA 9 | (Smith 412) page-number | Alphabetical by author, "Works Cited" | 2x | Rare |
Shared defaults across all four: reference-list paragraphs use indent=720 hangingIndent=720 (hanging indent 0.5"); add a live TOC if 3+ Heading1s (→ see docx v2 §Table of Contents); static TOC fallback if recipient cannot recalculate (→ see docx v2 §Report-level recipes (f)).
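Indent props in this file are given in twips (720 for 0.5"). As a convenience, the sketch below converts inches or points to twips using the fixed ratios 1 inch = 1440 twips and 1 pt = 20 twips. The function names are hypothetical and not part of officecli.

```sh
# Hypothetical helpers: convert layout units to twips for indent props.

# in_to_twips INCHES: 1 inch = 1440 twips, rounded to the nearest integer.
in_to_twips() {
  awk -v inches="$1" 'BEGIN { printf "%d\n", inches * 1440 + 0.5 }'
}

# pt_to_twips POINTS: 1 pt = 20 twips, rounded to the nearest integer.
pt_to_twips() {
  awk -v pt="$1" 'BEGIN { printf "%d\n", pt * 20 + 0.5 }'
}
```

For example, in_to_twips 0.5 gives the value used for the hanging-indent pair, so a venue that asks for a different indent can be translated without mental arithmetic.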
- In-text: (Author, Year) parenthetical, or Author (Year) for narrative. Page number required on direct quotes: (Smith, 2024, p. 15). Three or more authors: (Smith et al., 2024) from the first citation.
- Reference entry: Author, A. A., & Co-Author, B. B. (Year). Title of article. Journal Name, Volume(Issue), pages. DOI preferred over URL; present as https URL, not doi: prefix.
- Spacing: double-spaced throughout (lineSpacing=2x) including abstract and references. Body first-line indent = 0.5" (firstLineIndent=720).
# Body paragraph with parenthetical citation
officecli add "$FILE" /body --type paragraph --prop text="Remote work adoption accelerated during the pandemic (Kramer & Kramer, 2020)." --prop size=12pt --prop lineSpacing=2x --prop firstLineIndent=720
# Reference entry with hanging indent
officecli add "$FILE" /body --type paragraph --prop text="Kramer, A., & Kramer, K. Z. (2020). The potential impact of the Covid-19 pandemic on occupational status. Journal of Vocational Behavior, 119, 103442." --prop size=12pt --prop lineSpacing=2x --prop indent=720 --prop hangingIndent=720
# DOI hyperlink appended to the reference paragraph
officecli add "$FILE" "/body/p[last()]" --type hyperlink --prop url="https://doi.org/10.1016/j.jvb.2020.103442" --prop text="https://doi.org/10.1016/j.jvb.2020.103442"
QA: officecli query "$FILE" 'paragraph[hangingIndent]' returns every reference entry; zero references with first-line indent instead of hanging.
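- First footnote for a source: full citation (Timothy Brook, The Troubled Empire (Cambridge, MA: Harvard UP, 2010), 142.); shortened form thereafter (Brook, Troubled Empire, 150.).
- Immediate repeat of the same source: Ibid. (same page) or Ibid., 22. (different page). Later repeats: shortened form (Brook, Troubled Empire, 150.), NOT op. cit. Chicago 17 drops op. cit.; use the shortened form every time except for immediate repeats.
- Footnote formatting: the footnote element exposes only text; size is not settable per footnote, so trust renderer defaults.
- Bibliography: may be split into Primary Sources and Secondary Sources as two Heading2s under a single Bibliography Heading1. Book titles italic in both footnotes and bibliography.
- Chicago Author-Date variant: parenthetical in-text citations instead of footnotes (e.g. (Smith 2024)).
# Body paragraph that will anchor a footnote, then the footnote itself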
officecli add "$FILE" /body --type paragraph --prop text="The Ming dynasty's 海禁 policy shaped coastal trade for two centuries." --prop size=12pt --prop lineSpacing=1.5x --prop firstLineIndent=720
officecli add "$FILE" "/body/p[last()]" --type footnote --prop text="Timothy Brook, The Troubled Empire: China in the Yuan and Ming Dynasties (Cambridge, MA: Harvard University Press, 2010), 142."
# Next footnote — shortened form
officecli add "$FILE" "/body/p[last()]" --type footnote --prop text="Brook, Troubled Empire, 150."
# Bibliography section split — primary sources first
officecli add "$FILE" /body --type paragraph --prop text="Bibliography" --prop style=Heading1 --prop size=20pt --prop bold=true --prop spaceBefore=18pt
officecli add "$FILE" /body --type paragraph --prop text="Primary Sources" --prop style=Heading2 --prop size=14pt --prop bold=true --prop spaceBefore=12pt
officecli add "$FILE" /body --type paragraph --prop text="Ming Shilu 明實錄. Taipei: Academia Sinica, 1966." --prop size=12pt --prop indent=720 --prop hangingIndent=720
officecli add "$FILE" /body --type paragraph --prop text="Secondary Sources" --prop style=Heading2 --prop size=14pt --prop bold=true --prop spaceBefore=12pt
officecli add "$FILE" /body --type paragraph --prop text="Brook, Timothy. The Troubled Empire: China in the Yuan and Ming Dynasties. Cambridge, MA: Harvard University Press, 2010." --prop size=12pt --prop indent=720 --prop hangingIndent=720
QA: officecli query "$FILE" 'footnote' count ≥ body-paragraph citation count.
- In-text: bracketed numerals [1], [2]. Numbered in order of first appearance, not alphabetical. Reuse the same number for repeat citations. [1, p. 15] for page refs, [1]-[3] for a range.
- Reference entry: [1] A. Smith and B. Jones, "Title," IEEE Trans. X, vol. 5, no. 3, pp. 1-10, 2024, doi: .... Authors are initial-first; journal names abbreviated per IEEE list (IEEE Trans. Neural Netw., not full name).
- Body first-line indent: about 0.2" (firstLineIndent=288 twips ≈ 14pt). Smaller than APA's 0.5" because the 2-col width is narrower.
- Section headings: Roman numerals + ALL CAPS: I. INTRODUCTION, II. RELATED WORK, III. METHOD. Sub-sections A. Datasets, B. Baselines in title case. Do NOT use 1. Introduction (Arabic) for IEEE; that is Chicago style.
- Tables: Roman numerals: Table I, Table II, Table III. Figures remain Arabic (Fig. 1, Fig. 2). The SEQ Table field emits Arabic cached values; for IEEE, patch the cached <w:t> to Roman manually (see §SEQ cached-value trap), or accept Arabic and note in the cover letter.
# Body citing reference 1
officecli add "$FILE" /body --type paragraph --prop text="Attention-based anomaly detection has been applied to industrial sensor data [1], [2]." --prop size=10pt --prop lineSpacing=1.15x
# Reference list entry — number in the text
officecli add "$FILE" /body --type paragraph --prop text="[1] A. Smith and B. Jones, \"Attention for anomaly detection,\" IEEE Trans. Neural Netw., vol. 35, no. 2, pp. 412-430, 2024." --prop size=10pt --prop indent=720 --prop hangingIndent=720
officecli add "$FILE" /body --type paragraph --prop text="[2] C. Lee, \"Time-series anomaly survey,\" in Proc. ICML, 2023, pp. 1200-1215." --prop size=10pt --prop indent=720 --prop hangingIndent=720
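QA: the highest [N] in body must equal the number of reference-list entries. Grep: officecli view "$FILE" text | grep -oE '\[[0-9]+\]' | tr -d '[]' | sort -nu | tail -n 1. The numeric sort matters: a plain sort -u orders [10] before [2], so its tail does not show the highest number.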
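The highest-number rule can be checked mechanically on any plain-text dump of the paper. The sketch below reads text on stdin, so it works on the output of officecli view "$FILE" text or on a saved file. The function names max_cite and ieee_refs_ok are hypothetical, and the pattern only sees simple [N] markers, so forms like [1, p. 15] are not counted.

```sh
# Hypothetical helpers: check IEEE numbering against the reference-list size.

# max_cite: read plain text on stdin, print the highest bracketed number [N].
max_cite() {
  grep -oE '\[[0-9]+\]' | tr -d '[]' | sort -n | tail -n 1
}

# ieee_refs_ok ENTRY_COUNT: succeed when the highest [N] on stdin equals ENTRY_COUNT.
ieee_refs_ok() (
  max=$(max_cite)
  [ "${max:-0}" -eq "$1" ]
)
```

A typical call would be officecli view "$FILE" text | ieee_refs_ok 12 for a paper with twelve reference entries; the entry count itself still has to come from the reference list.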
Diff vs APA: in-text is (Author Page) no comma (e.g. (Smith 412)); direct quotes always carry the page number. Reference section titled Works Cited (not References / Bibliography). Entries alphabetical by surname, hanging indent, 2x spacing, nine "core elements" separated by periods: Author. Title. Container, Other Contributors, Version, Number, Publisher, Date, Location. — skip any that don't apply. Book titles italic; article titles in quotes. Otherwise identical to APA paragraph setup.
--type equation parses a LaTeX-ish formula into OMML. Two modes, selected by --prop mode=:
| Mode | XML | Visual | Use |
|---|---|---|---|
| display (default) | <m:oMathPara> at /body | Standalone centered block | Numbered equations, theorem statements |
| inline | <m:oMath> appended to a run inside a paragraph | Runs with the text | if $x > 0$ style in prose |
# Display equation (own paragraph, centered) — explicitly set mode=display for clarity
officecli add "$FILE" /body --type equation --prop mode=display --prop formula="x^2 + y^2 = z^2"
# Display equation with Greek / subscript / integral — verify rendering below
officecli add "$FILE" /body --type equation --prop mode=display --prop formula="\\lambda_1 + \\alpha"
officecli add "$FILE" /body --type equation --prop mode=display --prop formula="\\frac{1}{2\\pi} \\int_0^{\\infty} e^{-x^2} dx"
# Inline equation INSIDE prose — required whenever variables like x_{t+1}, \lambda, etc. appear in a body paragraph:
officecli add "$FILE" /body --type paragraph --prop text="Given the weight " --prop size=11pt
officecli add "$FILE" "/body/p[last()]" --type equation --prop mode=inline --prop formula="W_t"
officecli add "$FILE" "/body/p[last()]" --type run --prop text=" we define the loss..."
Verify equations render as OMML math, not plain-text LaTeX tokens. After close, run:
officecli view "$FILE" text | head -20 # λ₁ + α, ∫₀∞, x² must appear as unicode math (verified renders)
officecli raw "$FILE" /document | grep -c '<m:oMathPara' # ≥ 1 per display equation
If the body prose contains raw lambda_1, x_{t+1}, \alpha or similar plain-text tokens (i.e., you typed them into a paragraph --prop text= instead of wrapping with --type equation --prop mode=inline), downstream viewers will render them as literal ASCII. Rule: every mathematical variable / Greek letter / subscript in prose goes through --type equation mode=inline, never through paragraph --prop text=.
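A rough way to spot such leaks is to scan the plain-text dump for LaTeX-looking tokens. The sketch below is a heuristic only: the function name find_math_leaks is hypothetical, it reads text on stdin, and it will also flag ordinary snake_case identifiers, so treat every hit as something to inspect rather than as a confirmed defect.

```sh
# Hypothetical helper: list raw LaTeX-looking tokens found in prose.
# Matches \command, name_sub or name_{sub}, and x^sup or x^{sup}.
# Exit status follows grep: 0 when at least one token is found, 1 when none.
find_math_leaks() {
  grep -oE '\\[A-Za-z]+|[A-Za-z]+_\{?[A-Za-z0-9+-]+\}?|[A-Za-z]\^\{?[A-Za-z0-9+-]+\}?'
}
```

Piping officecli view "$FILE" text into the function prints one suspicious token per line; properly inserted equations should not contribute hits if they read back as unicode math.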
LaTeX subset pitfalls (non-negotiable):
- \left(...\right) / \left[...\right] + sub/superscript inside → cast error crash. Use plain (, ), [, ]; OMML auto-sizes delimiters in display mode.
- \mathcal{L} → invalid OMML. Use \mathit{L} or plain uppercase letters.
- move on /body/oMathPara[N] does not reliably reposition. Workaround: add at target position, remove the original.

Equation numbering: no native \eqno. Add the display equation, then add a right-aligned paragraph "(1)" immediately after with spaceBefore=0 spaceAfter=6pt. Separate line, works in 2-col.

Do NOT place --type equation directly in a table cell tc[N]; it emits oMathPara as a direct <w:tc> child (illegal OOXML). Target tc[N]/p[1] with mode=inline if you need equations in cells.
Full equation schema: officecli help docx equation.
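The pitfalls above can be screened before a formula ever reaches the CLI. The sketch below is a conservative pre-flight check, not an officecli feature: the function name lint_formula is hypothetical, and it flags every \left( / \left[ / \right) / \right] and every \mathcal, even in cases that might happen to work.

```sh
# Hypothetical helper: warn about LaTeX constructs listed as pitfalls.
# Prints one line per finding; exit status 1 if anything was flagged.
# The ( ) body runs in a subshell, so "status" stays local.
lint_formula() (
  status=0
  case "$1" in
    *'\left('*|*'\left['*|*'\right)'*|*'\right]'*)
      printf '%s\n' 'avoid \left / \right delimiters: use plain ( ) [ ]'
      status=1 ;;
  esac
  case "$1" in
    *'\mathcal'*)
      printf '%s\n' 'avoid \mathcal: use \mathit or a plain uppercase letter'
      status=1 ;;
  esac
  exit "$status"
)
```

Pass the formula in single quotes so the shell leaves the backslashes alone, for example lint_formula '\frac{1}{2\pi}'. Arrows such as \rightarrow are deliberately not flagged, because only the delimiter forms are matched.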
Two primitives, both native fieldTypes (verified against officecli help docx field v1.0.63): seq for auto-numbered caption counters, pageref for "see Fig. 2 on page 7" back-references. Native fields insert correctly, but their cached rendered values need a one-shot raw-set patch per field (see §SEQ cached-value trap below) — otherwise downstream viewers that don't recompute cached fields will show every figure as "Fig. 1".
A SEQ field is a counter with a name (identifier). Every SEQ Figure increments the Figure counter on recalc; every SEQ Table increments the Table counter.
⚠️ SEQ cached-value trap (verified on v1.0.63). The CLI emits every SEQ field with cached result 1, so a document with 3 Figure captions reads back as Figure 1 / Figure 1 / Figure 1 via view text or query field[fieldType=seq], and any downstream viewer that doesn't recompute cached fields will display the same three identical labels. Word and WPS recompute on open when w:updateFields=true is set in settings. Two must-do steps per paper with multiple figures/tables:
1. Set updateFields=true in settings once per document (right after create). Position matters: the OOXML CT_Settings schema rejects <w:updateFields> as the first child; insert it before <w:compat>:
officecli raw-set "$FILE" /settings --xpath '//w:compat' --action insertbefore \
--xml '<w:updateFields xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" w:val="true"/>'
2. Patch the cached <w:t> after each SEQ field so the artifact reads correctly in every viewer:
# After adding the Nth SEQ Figure caption, override cached "1" to the real number N:
officecli raw-set "$FILE" /document \
--xpath "(//w:p[.//w:instrText[contains(text(),'SEQ Figure')]])[N]//w:fldChar[@w:fldCharType='separate']/following::w:t[1]" \
--action replace \
--xml '<w:t xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xml:space="preserve">N</w:t>'
Use the same pattern with SEQ Table for tables. After patching, officecli view "$FILE" text will show Figure 1 / Figure 2 / Figure 3, and downstream viewers will too.
# Figure with caption BELOW the image. Caption = "Figure <seq>: title" + optional bookmark for cross-ref.
officecli add "$FILE" /body --type picture --prop src=arch.png --prop width=5in
officecli set "$FILE" "/body/p[last()]/r[last()]" --prop alt="Model architecture: attention over time-series sensors"
# Caption paragraph (below the figure, per academic convention)
officecli add "$FILE" /body --type paragraph --prop text="Figure " --prop style=Caption --prop size=10pt --prop italic=true --prop align=center
officecli add "$FILE" "/body/p[last()]" --type field --prop fieldType=seq --prop identifier=Figure
officecli add "$FILE" "/body/p[last()]" --type run --prop text=": Attention-based anomaly detection model."
# Bookmark the caption so other paragraphs can PAGEREF it
officecli add "$FILE" /body --type bookmark --prop name=fig_arch
# Patch cached value — this is Figure 1 (first SEQ Figure in doc)
officecli raw-set "$FILE" /document \
--xpath "(//w:p[.//w:instrText[contains(text(),'SEQ Figure')]])[1]//w:fldChar[@w:fldCharType='separate']/following::w:t[1]" \
--action replace --xml '<w:t xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xml:space="preserve">1</w:t>'
# Cross-ref paragraph: "see Figure 1 on page X"
officecli add "$FILE" /body --type paragraph --prop text="As shown in Figure 1 (see page " --prop size=11pt --prop lineSpacing=1.5x
officecli add "$FILE" "/body/p[last()]" --type field --prop fieldType=pageref --prop name=fig_arch
officecli add "$FILE" "/body/p[last()]" --type run --prop text=")."
# Caption first (ABOVE the table), THEN the table
officecli add "$FILE" /body --type paragraph --prop text="Table " --prop style=Caption --prop size=10pt --prop italic=true --prop spaceAfter=6pt
officecli add "$FILE" "/body/p[last()]" --type field --prop fieldType=seq --prop identifier=Table
officecli add "$FILE" "/body/p[last()]" --type run --prop text=": Participant demographics (N=47)."
officecli add "$FILE" /body --type table --prop rows=5 --prop cols=4 --prop width=100%
# ... fill header + rows per docx v2 §Tables
# At least one SEQ Figure or SEQ Table in the body document part
officecli raw "$FILE" /document | grep -c 'w:instrText[^>]*>[^<]*SEQ' # expect ≥ 1
officecli raw "$FILE" /document | grep -c 'w:instrText[^>]*>[^<]*PAGEREF' # 0 ok if no cross-refs
Live fields carry cached values that render stale until a human presses F9 in Word. Expect "Figure 1" to show as 1, 2, ... immediately after recalc; before recalc, some viewers render 0 or blank. Judge field presence by fldChar existence, not by visible digit (→ see docx v2 §Field / cached-value spot-check).
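Since the cached-value patch earlier in this section has to be repeated once per caption, the XPath and replacement XML can be produced by two small string builders. This is a sketch under the assumption that the XPath shape shown in this section is the one to reuse; the function names seq_patch_xpath and seq_patch_xml are hypothetical and only print strings.

```sh
# Hypothetical helpers: build the arguments for the SEQ cached-value patch.

# seq_patch_xpath IDENTIFIER N: XPath of the cached <w:t> in the Nth SEQ IDENTIFIER caption.
seq_patch_xpath() {
  printf "(//w:p[.//w:instrText[contains(text(),'SEQ %s')]])[%s]//w:fldChar[@w:fldCharType='separate']/following::w:t[1]\n" "$1" "$2"
}

# seq_patch_xml VALUE: replacement <w:t> element carrying VALUE as the cached text.
seq_patch_xml() {
  printf '<w:t xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" xml:space="preserve">%s</w:t>\n' "$1"
}

# Usage sketch (commented out because it needs a real document):
# for n in 1 2 3; do
#   officecli raw-set "$FILE" /document --xpath "$(seq_patch_xpath Figure "$n")" \
#     --action replace --xml "$(seq_patch_xml "$n")"
# done
```

For IEEE tables, VALUE can be a Roman numeral string such as II while N stays the Arabic position of the caption in the document.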
Footnote — sits at the bottom of the page where its anchor paragraph lives. Used for source citations in Chicago Notes-Bib, content asides in any style.
Endnote — sits at the end of the document (or before the bibliography). Used by some venues in place of footnotes, or for long contextual notes that would clutter the page.
# Footnote anchored to paragraph N
officecli add "$FILE" "/body/p[3]" --type footnote --prop text="Smith et al. reported similar findings in their 2023 review."
# Endnote
officecli add "$FILE" /endnotes --type endnote --prop text="Extended derivation of equation (4) is available at the project repository."
Both appear as empty-string runs in view annotated output (r[N] "") — the run carries a <w:footnoteReference> XML element, not visible text. Confirm insertion with officecli query "$FILE" 'footnote' or officecli get "$FILE" "/footnotes/footnote[N]". Footnotes do NOT shift paragraph indices; add them in any order after body content is in place. Full schema: officecli help docx footnote / officecli help docx endnote.
Every academic paper ends with a reference list. The name of the section depends on the style (References for APA / IEEE / Chicago Author-Date; Bibliography for Chicago Notes-Bib; Works Cited for MLA). Each entry is a separate paragraph with hanging indent.
# Section heading — same as body Heading1 (excluded from body numbering by convention)
officecli add "$FILE" /body --type paragraph --prop text="References" --prop style=Heading1 --prop size=20pt --prop bold=true --prop spaceBefore=18pt --prop spaceAfter=12pt
# Each entry: hanging indent 720 twips (0.5"), with indent=720 as the partner (first line flush, wraps indented)
officecli add "$FILE" /body --type paragraph --prop text="Smith, J. (2024). Remote work and cohesion. Journal of Applied Psychology, 109(3), 412-430." --prop size=12pt --prop lineSpacing=2x --prop indent=720 --prop hangingIndent=720
# DOI hyperlink on its own run appended to the entry paragraph
officecli add "$FILE" "/body/p[last()]" --type hyperlink --prop url="https://doi.org/10.1037/apl0001123" --prop text="https://doi.org/10.1037/apl0001123"
Verified: --prop indent=720 --prop hangingIndent=720 is the canonical hanging-indent pair per officecli help docx paragraph. The old ind.firstLine=-720 form (negative first-line indent) is NOT canonical and fails schema on emit — → see docx v2 §Schema-invalid-on-emit.
Round-trip QA. Count in-text citation markers (APA (Author, Year), IEEE [N], MLA (Author N)) vs reference-list entries. See Delivery Gate 4 below. Every cited key must resolve; every listed entry should be cited at least once.
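Counting markers needs a different pattern per style. The sketch below shows one possible set of patterns; the function name cite_markers is hypothetical, and the regular expressions are rough approximations. The APA pattern misses narrative citations such as Smith (2024), and the MLA pattern also matches non-citation text like (Figure 1), so review the output rather than trusting the count blindly.

```sh
# Hypothetical helper: list unique in-text citation markers for one style.
# Reads plain text on stdin, prints one marker per line (sorted, de-duplicated).
cite_markers() {
  case "$1" in
    apa)  pattern='\([A-Z][A-Za-z&. -]+, [0-9]{4}[a-z]?(, p\. [0-9]+)?\)' ;;
    mla)  pattern='\([A-Z][A-Za-z&. -]+ [0-9]+\)' ;;
    ieee) pattern='\[[0-9]+\]' ;;
    *)    printf 'unknown style: %s\n' "$1" >&2; return 1 ;;
  esac
  grep -oE "$pattern" | sort -u
}
```

A count for the gate can then be taken with officecli view "$FILE" text | cite_markers apa | wc -l and compared against the number of reference-list entries.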
IEEE and many engineering / physics journals render body text in two columns with a single-column abstract above. The mechanism: a section break with type=continuous and columns=2, then another section break at the end to revert to single-column.
The reversion step is not optional whenever later content (references, appendices) must be single-column. Without it, the rest of the document, including references, renders as two columns. This is the single most common multi-column failure.
FILE="ieee.docx"
officecli create "$FILE"
officecli open "$FILE"
# 1. Title, authors, affiliation — single-column (the default first section)
officecli add "$FILE" /body --type paragraph --prop text="Attention-Based Anomaly Detection for Industrial Time Series" --prop align=center --prop size=18pt --prop bold=true --prop spaceAfter=12pt
officecli add "$FILE" /body --type paragraph --prop text="Alice Chen, Bob Martinez" --prop align=center --prop size=11pt
officecli add "$FILE" /body --type paragraph --prop text="Department of CS, Stanford University" --prop align=center --prop size=10pt --prop spaceAfter=18pt
# 2. Abstract — still single-column, block-style
officecli add "$FILE" /body --type paragraph --prop text="Abstract" --prop align=center --prop size=12pt --prop bold=true --prop spaceAfter=6pt
officecli add "$FILE" /body --type paragraph --prop text="We present an attention-based model for detecting anomalies in industrial sensor time series..." --prop size=10pt --prop lineSpacing=1.15x --prop spaceAfter=12pt
# 3. Section break + two-column from here on
# CRITICAL: `/section[last()]` is REJECTED on v1.0.63 (cast-error). Count sections first, use explicit /section[N].
officecli add "$FILE" /body --type section --prop type=continuous
SECTION_COUNT=$(officecli query "$FILE" section --json | jq '.data.results | length')
# After the add, SECTION_COUNT should be 2 — [1] is pre-break, [2] is post-break (2-col body area).
officecli set "$FILE" "/section[$SECTION_COUNT]" --prop columns=2 --prop columnSpace=1cm
# 4. Body: IEEE wants Roman numerals + ALL CAPS section titles.
officecli add "$FILE" /body --type paragraph --prop text="I. INTRODUCTION" --prop style=Heading1 --prop size=10pt --prop bold=true
officecli add "$FILE" /body --type paragraph --prop text="Industrial anomaly detection has been studied since [1]..." --prop size=10pt --prop lineSpacing=1.15x --prop firstLineIndent=360
# 5. At the end of 2-column body, ANOTHER section break + revert to single column for references / appendices
# (Skip step 5 only if references should stay in 2-col, as in most IEEE papers.)
# officecli add "$FILE" /body --type section --prop type=continuous
# Then re-count and use the new explicit /section[N], NOT /section[last()]:
# officecli set "$FILE" "/section[3]" --prop columns=1
# 6. Footer, close, validate
officecli add "$FILE" / --type footer --prop type=default --prop align=center --prop size=9pt --prop field=page
officecli close "$FILE"
officecli validate "$FILE"
Visual verify. Run officecli view "$FILE" html and Read the returned HTML to audit the rendered output. The abstract must render as full-width and the introduction onward as two columns. If the abstract wraps into two narrow columns, the first section break landed before the abstract — move it.
Section index bookkeeping. Each add /body --type section inserts one empty paragraph into /body (the section-break marker). All subsequent p[N] indices shift by +1 per section break. Plan section breaks in advance; after adding a break, officecli get "$FILE" /body --depth 1 to re-index before continuing.
Full section schema (columns, columnSpace, orientation, pageNumFmt, titlePage, lineNumbers): officecli help docx section.
First-page metadata stack: title (centered 20-22pt bold) → authors (centered 12pt, superscript ^1 ^2 for multi-affiliation) → affiliations (centered 11pt, keyed to superscripts) → submission target / date → Abstract heading (14pt bold) → abstract body (block-style, NO firstLineIndent, 150-300 words) → keywords line (italic 11pt). Same "cover ≥ 60% filled" rule as docx v2.
# Superscript affiliation markers (multi-institution paper)
officecli add "$FILE" /body --type paragraph --prop text="Alice Chen" --prop align=center --prop size=12pt
officecli add "$FILE" "/body/p[last()]" --type run --prop text="1" --prop superscript=true
officecli add "$FILE" "/body/p[last()]" --type run --prop text=", Bob Martinez"
officecli add "$FILE" "/body/p[last()]" --type run --prop text="2" --prop superscript=true
# Running header (skip on cover via type=first empty header — see docx v2 §headers)
officecli add "$FILE" / --type header --prop type=default --prop align=right --prop size=9pt --prop text="Short Running Title"
Nature-family 2-col abstract is rare — if required, open a section type=continuous columns=2 BEFORE the abstract heading; short abstracts (<100 words) leave ragged columns. Mirrored odd/even headers need <w:evenAndOddHeaders/> in settings via raw-set — not exposed by high-level API on 1.0.63; deliver without mirroring or inject the flag manually. Full header schema: officecli help docx header.
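The 150-300 word range for the abstract body in the first-page stack is easy to check before the text is added. The sketch below is a minimal helper with a hypothetical name, abstract_len_ok, that counts whitespace-separated words on stdin; venues with different limits would need other bounds.

```sh
# Hypothetical helper: succeed when the abstract on stdin has 150..300 words.
# The ( ) body runs in a subshell, so "words" stays local.
abstract_len_ok() (
  words=$(wc -w | tr -d ' ')
  [ "$words" -ge 150 ] && [ "$words" -le 300 ]
)
```

For example, abstract_len_ok < abstract.txt can gate the officecli add call that inserts the abstract paragraph, where abstract.txt is an assumed scratch file holding the draft text.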
Assume there are problems. Your job is to find them. First render is almost never correct. Run this block before declaring done.
→ see docx v2 §Delivery Gate. Schema validate, token leak grep, live PAGE field structure. Copy-paste the docx v2 gate block first. Every check must print its success message.
Every in-text citation key should resolve to a bibliography entry. Count mismatches = REJECT.
# IEEE example (bracketed numerics). Adjust regex for APA (Author, Year) or MLA (Author Page).
CITATIONS=$(officecli view "$FILE" text | grep -oE '\[[0-9]+\]' | sort -u | wc -l)
ENTRIES=$(officecli query "$FILE" 'paragraph[hangingIndent]' --json | jq '.data.results | length')
echo "In-text citation markers: $CITATIONS | Bibliography entries: $ENTRIES"
# REJECT when citations exceed entries (cites without references). Entries > citations is allowed by some venues.
[ "$CITATIONS" -le "$ENTRIES" ] && echo "Gate 4 OK" || { echo "REJECT Gate 4: $CITATIONS in-text markers but only $ENTRIES bibliography entries"; exit 1; }
If the paper has any numbered figure or table, the body must carry live SEQ fields AND their cached values must show distinct ascending numbers (else view text and downstream viewers that don't recompute cached fields will show "Figure 1" for all).
# Count SEQ fields via query (raw-grep collapses multi-matches on one XML line → undercounts).
SEQ_COUNT=$(officecli query "$FILE" 'field[fieldType=seq]' --json | jq '.data.results | length')
VISIBLE_FIG=$(officecli view "$FILE" text | grep -cE '(Figure|Table) [0-9]+')
if [ "$VISIBLE_FIG" -gt 0 ] && [ "$SEQ_COUNT" -eq 0 ]; then
echo "REJECT Gate 5a: $VISIBLE_FIG visible Figure/Table labels but 0 SEQ fields."
exit 1
fi
# Cached values must be distinct (CLI emits "1" per field by default → all three would show "Figure 1").
# After the raw-set patches in §SEQ, view text should show Figure 1 / Figure 2 / Figure 3:
DISTINCT=$(officecli view "$FILE" text | grep -oE '(Figure|Table) [0-9]+' | sort -u | wc -l)
[ "$SEQ_COUNT" -le "$DISTINCT" ] && echo "Gate 5a OK (SEQ=$SEQ_COUNT, distinct=$DISTINCT)" || { echo "REJECT Gate 5a: $SEQ_COUNT SEQ fields but only $DISTINCT distinct rendered labels — patch cached <w:t> after each SEQ field"; exit 1; }
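A stricter variant of the distinct-label check is to require that caption labels run 1, 2, 3 in document order. The sketch below assumes that the text dump prints each caption on its own line starting with the label (for example "Figure 2: ..."), which should be confirmed against real view text output; the function name labels_ascending is hypothetical.

```sh
# Hypothetical helper: succeed when caption labels on stdin run 1, 2, 3, ...
# KIND is the label word (Figure or Table). Only labels at the start of a line
# are considered, so in-sentence cross-references are ignored.
labels_ascending() {
  grep -oE "^$1 [0-9]+" | awk '{ if ($NF != NR) bad = 1 } END { exit bad + 0 }'
}
```

With unpatched cached values the captions read Figure 1 / Figure 1 / Figure 1, and the second caption already fails the comparison. Input without any matching caption passes, so pair this with the SEQ count check above.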
Gates 1–5a catch schema, token leaks, live-field presence, citation counts. They do NOT catch physical assembly defects — scrambled page order, a duplicated Abstract mid-document, three figures all labeled "Fig. 1" despite SEQ field presence, equation variables rendering as plain-text LaTeX (lambda_1, x_{t+1}) instead of math. Do not skip — Gates 1–5a pass ≠ visual OK.
Run officecli view "$FILE" html and Read the returned HTML path. For every page of the paper, answer:
(a) Are pages in logical academic sequence? (Title → Abstract → Keywords → Introduction → body → References; no forward jumps, no backward leaks.)
(b) Does the Abstract appear exactly once, not duplicated mid-document?
(c) Are Figure N / Table N labels distinct and ascending? (Fig. 1, Fig. 2, Fig. 3, not all "Fig. 1". Same for tables.)
(d) Do equations render as math? (Italicized variables, Greek letters like λ / α, proper integrals / fractions, NOT plain-text lambda_1, x_{t+1}, \int.)
(e) For IEEE papers: are section titles ALL CAPS with Roman numerals (I. INTRODUCTION)? Are tables Roman (Table I, Table II)?
(f) For APA papers: are Level-1 headings centered bold and unnumbered (not 1. Introduction)?
(g) Does every in-text "see Fig. N" / "see Table N" resolve to a figure/table that actually carries that number?
(h) Heading hierarchy visually distinct (size + weight) across H1 / H2 / H3?
Report every instance. If even one defect is present → REJECT; do not deliver until fixed.
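Checks (b) and (d) can be pre-screened mechanically before reading the rendered HTML. The sketch below is a heuristic only, under two assumptions: the Abstract heading appears as a line containing just the word "Abstract", and plain-text LaTeX shows up as a backslash command, an _{ / ^{ group, or a name_digit pair. The prescreen function is a hypothetical helper, and it does not replace the page-by-page review.

```shell
# Reads plain text on stdin, e.g.: officecli view "$FILE" text | prescreen
prescreen() {
  text=$(cat)
  # (b) Lines that consist only of the word "Abstract" (assumed heading form).
  abstracts=$(printf '%s\n' "$text" | grep -cE '^[[:space:]]*Abstract[[:space:]]*$' || true)
  # (d) Lines with LaTeX residue: \command, _{ or ^{ groups, or name_digit such as lambda_1.
  latex=$(printf '%s\n' "$text" | grep -cE '\\[A-Za-z]+|[_^]\{|[A-Za-z]+_[0-9]' || true)
  echo "abstract_headings=$abstracts latex_residue_lines=$latex"
}

# Demo on a hypothetical sample with a duplicated Abstract and one leaked formula.
printf 'Abstract\nWe study lambda_1 and x_{t+1}.\nAbstract\n' | prescreen
```

An abstract_headings value other than 1, or any latex_residue_lines, is a prompt to look closer, not a verdict: identifiers such as run_1 in body text will trip the residue pattern, and a correctly rendered equation can still be misplaced.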
Human preview (optional). If you want the user to visually preview the paper, run officecli watch "$FILE" for a live preview the user can open at their own discretion, or have them open the .docx directly in Word / WPS / Pages. For final visual verification, open the file in the target viewer.
validate catches schema errors, not academic-style errors. A document passes validate with APA citations in an IEEE paper, footnotes in a style that forbids them, or figures with hardcoded numbers that drift when a new figure is inserted. The gates above — especially Gate 4 (round-trip) and Gate 5 (SEQ presence) — are how you catch what validate cannot.
→ Base pitfalls (shell escape, \$ \t \n literals, table cell formatting order, pageBreakBefore belt-and-suspenders, shd.fill / ind.firstLine schema-invalid forms, TOC cached values, watermark two-step): see docx v2 §Known Issues & Pitfalls.
Academic-specific:
- \left(...\right) / \left[...\right] + sub/superscript crashes. Cast error. Use plain (, ), [, ]; OMML auto-sizes in display mode.
- \mathcal{L} emits invalid OMML. Use \mathit{L} or plain uppercase. \mathbf, \mathit, \mathbb work; \mathcal does not.
- move on /body/oMathPara[N] not reliable. Do not rely on move to reposition display equations. Workaround: add at the target position, remove the original.
- add /body --type section inserts one empty paragraph into /body. All p[N] indices after the break shift by +1. Plan breaks; after any add section, officecli get "$FILE" /body --depth 1 to re-index.
- /section[last()] is REJECTED on v1.0.63 (cast-error, same family as pptx's /slide[last()]). Always resolve to an explicit /section[N]:
SECTION_COUNT=$(officecli query "$FILE" section --json | jq '.data.results | length')
# then use /section[2], /section[3], ..., NEVER /section[last()]
add /body --type section increments the count. Re-query after every break.
- After a columns=2 section, you must add another section break and explicitly set columns=1 on the new /section[N] (N = post-revert count); otherwise the rest of the document, including references, renders as two columns. Verify with officecli get "$FILE" "/section[N]" for each N.
- --type equation targeting a tc[N] path emits illegal OOXML. Inside a table cell, target tc[N]/p[1] with --prop mode=inline instead. Display equations (oMathPara) are not legal as direct <w:tc> children.
- Bibliography hanging indent: indent=720 hangingIndent=720. Not ind.firstLine=-720. The dotted form emits <w:ind> after <w:jc> and fails schema on emit.
- Footnote references do not show in view annotated. The <w:footnoteReference> XML element has no visible text on the reference side; the note body lives in /footnotes/footnote[N]. Confirm with officecli query "$FILE" 'footnote', not by eyeballing view text.
- validate will not catch it.

→ see docx v2 §Renderer quirks. PAGE / TOC cached values, OMML baseline shifts, scheme colors: all identical quirks apply to academic papers. Before calling an equation or a citation marker broken, open the file in the user's target viewer (Word, WPS, Pages); if it renders correctly there, it is a viewer quirk, not a skill defect.
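Two of the academic-specific pitfalls above lend themselves to small guard helpers. The following is a sketch under stated assumptions: sanitize_latex and section_paths are hypothetical helper names, the formula is a single-line string, and only the \left( / \left[ / \right) / \right] / \mathcal cases named in the list are rewritten. How the cleaned formula is then passed to officecli is left to officecli help docx equation.

```shell
# Rewrite the LaTeX constructs listed as pitfalls into the forms the list calls safe:
#   \left( \left[ \right) \right]  ->  plain ( [ ) ]
#   \mathcal{...}                  ->  \mathit{...}
sanitize_latex() {
  printf '%s\n' "$1" | sed -E \
    -e 's/\\left([[(])/\1/g' \
    -e 's/\\right([])])/\1/g' \
    -e 's/\\mathcal\{/\\mathit{/g'
}

# Print explicit /section[1] .. /section[N] paths for a given count,
# so no command ever needs /section[last()].
section_paths() {
  n=$1
  i=1
  while [ "$i" -le "$n" ]; do
    printf '/section[%d]\n' "$i"
    i=$((i + 1))
  done
}

sanitize_latex '\left(x_{t+1}\right)^2 + \mathcal{L}'
section_paths 3

# Possible use with the commands shown earlier (re-query the count after every break):
# SECTION_COUNT=$(officecli query "$FILE" section --json | jq '.data.results | length')
# section_paths "$SECTION_COUNT" | while read -r path; do officecli get "$FILE" "$path"; done
```

Deriving the paths from a freshly queried count keeps every check on an explicit index, which is the behavior the /section[last()] pitfall calls for. The sanitizer is deliberately narrow; other delimiters (for example \left\{) are passed through unchanged and still need manual review.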
When in doubt: officecli help docx, officecli help docx <element>, officecli help docx <element> --json. Help is the authoritative schema; this skill is the decision guide for academic deltas on top of docx v2.