songsee

skills/media/songsee/SKILL.md

Generate spectrograms and multi-panel audio feature visualizations from audio files.

Prerequisites

Requires Go:

```bash
go install github.com/steipete/songsee/cmd/songsee@latest
```

Optional: ffmpeg for formats beyond WAV/MP3.
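
As a quick sanity check after installation, a minimal sketch like the one below reports which of the tools named above are on `PATH`. The helper names `have_tool` and `check_prereqs` are hypothetical and are not part of songsee.

```bash
#!/usr/bin/env bash
# have_tool (hypothetical helper): succeed if the named command is on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# check_prereqs (hypothetical helper): print one status line per tool.
# go and songsee are required; ffmpeg is optional per the note above.
check_prereqs() {
  local tool
  for tool in go songsee ffmpeg; do
    if have_tool "$tool"; then
      echo "$tool: found"
    else
      echo "$tool: missing"
    fi
  done
}

check_prereqs
```

If `songsee` is reported missing right after `go install`, the usual cause is that Go's binary directory (`$(go env GOPATH)/bin`, or `GOBIN` if set) is not on `PATH`.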

Quick Start

```bash
# Basic spectrogram
songsee track.mp3

# Save to specific file
songsee track.mp3 -o spectrogram.png

# Multi-panel visualization grid
songsee track.mp3 --viz spectrogram,mel,chroma,hpss,selfsim,loudness,tempogram,mfcc,flux

# Time slice (start at 12.5s, 8s duration)
songsee track.mp3 --start 12.5 --duration 8 -o slice.jpg

# From stdin
cat track.mp3 | songsee - --format png -o out.png
```

Visualization Types

Use --viz with comma-separated values:

| Type | Description |
| --- | --- |
| `spectrogram` | Standard frequency spectrogram |
| `mel` | Mel-scaled spectrogram |
| `chroma` | Pitch class distribution |
| `hpss` | Harmonic/percussive separation |
| `selfsim` | Self-similarity matrix |
| `loudness` | Loudness over time |
| `tempogram` | Tempo estimation |
| `mfcc` | Mel-frequency cepstral coefficients |
| `flux` | Spectral flux (onset detection) |

Multiple --viz types render as a grid in a single image.
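
When the list of panels is assembled in a script, a typo in one name is easy to miss. The sketch below checks each name against the types listed above and joins them with commas. `build_viz` and `KNOWN_VIZ` are hypothetical names introduced for illustration, and how songsee itself reacts to an unknown type is not covered here.

```bash
#!/usr/bin/env bash
# Type names copied from the Visualization Types list above.
KNOWN_VIZ="spectrogram mel chroma hpss selfsim loudness tempogram mfcc flux"

# build_viz (hypothetical helper): validate each argument against KNOWN_VIZ,
# then print the names joined by commas. Returns 1 on the first unknown name.
build_viz() {
  local name out=""
  for name in "$@"; do
    # Reject empty names and anything that is not plain lowercase letters.
    case "$name" in
      ""|*[!a-z]*) echo "unknown viz type: $name" >&2; return 1 ;;
    esac
    case " $KNOWN_VIZ " in
      *" $name "*) out="${out:+$out,}$name" ;;
      *) echo "unknown viz type: $name" >&2; return 1 ;;
    esac
  done
  printf '%s\n' "$out"
}

# Usage (assumes songsee is installed):
#   viz=$(build_viz mel chroma loudness) && songsee track.mp3 --viz "$viz" -o panels.png
```

Assigning the result to a variable first, as in the usage comment, keeps songsee from running when validation fails.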

Common Flags

| Flag | Description |
| --- | --- |
| `--viz` | Visualization types (comma-separated) |
| `--style` | Color palette: classic, magma, inferno, viridis, gray |
| `--width` / `--height` | Output image dimensions |
| `--window` / `--hop` | FFT window and hop size |
| `--min-freq` / `--max-freq` | Frequency range filter |
| `--start` / `--duration` | Time slice of the audio |
| `--format` | Output format: jpg or png |
| `-o` | Output file path |
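
Several of these flags are typically combined in one call. The sketch below only prints the command it would run, so it can be inspected before use. `out_name` and `render_slice` are hypothetical helpers, and the defaults chosen here (`magma`, `png`) are illustrative picks from the values listed above, not songsee's own defaults.

```bash
#!/usr/bin/env bash
# out_name (hypothetical helper): derive an image file name from the
# input path and the output format, e.g. mixes/track.mp3 + png -> track.png.
out_name() {
  local base
  base=$(basename "$1")
  printf '%s.%s\n' "${base%.*}" "$2"
}

# render_slice (hypothetical helper): print, rather than run, a songsee
# command for a time slice. Remove the leading "echo" to execute it.
# Arguments: input start duration [style] [format]
render_slice() {
  local input=$1 start=$2 duration=$3 style=${4:-magma} format=${5:-png}
  echo songsee "$input" --style "$style" --start "$start" \
    --duration "$duration" --format "$format" \
    -o "$(out_name "$input" "$format")"
}

render_slice mixes/track.mp3 12.5 8
# prints: songsee mixes/track.mp3 --style magma --start 12.5 --duration 8 --format png -o track.png
```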

Notes

  • WAV and MP3 are decoded natively; other formats require ffmpeg
  • Output images can be inspected with vision_analyze for automated audio analysis
  • Useful for comparing audio outputs, debugging synthesis, or documenting audio processing pipelines
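
The first note can be turned into a pre-flight check before a batch run. This is a rough sketch that assumes the file extension reflects the actual format, which may not match how songsee detects formats internally. `needs_ffmpeg` and `preflight` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# needs_ffmpeg (hypothetical helper): succeed when the file extension is
# anything other than wav or mp3 (case-insensitive). Extension-based only.
needs_ffmpeg() {
  local ext
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    wav|mp3) return 1 ;;
    *) return 0 ;;
  esac
}

# preflight (hypothetical helper): fail early if a file would need ffmpeg
# and ffmpeg is not on PATH.
preflight() {
  if needs_ffmpeg "$1" && ! command -v ffmpeg >/dev/null 2>&1; then
    echo "ffmpeg required for: $1" >&2
    return 1
  fi
}

# Usage (assumes songsee is installed):
#   for f in renders/*; do preflight "$f" && songsee "$f" -o "${f%.*}.png"; done
```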