What We Learned From 1 Million Chord Annotations

Why measure this at all?

A chord recognizer has to make user experience decisions that look small on the surface but matter a lot in practice. Should a bare fifth be treated as a chord, or as an interval? Which altered colors deserve first-class names? Is it better to add more exotic templates, or to improve the ranking and spelling of the common ones?

Those questions cannot be answered by counting possible pitch-class sets. There are only 4,096 (2^12) possible sets in 12-tone equal temperament (12-TET), counting the empty set, and most of them are not useful chord names. The more relevant question is which chord labels musicians, transcribers, and datasets actually use when describing real music.

That is where engineering discipline helps. A recognition engine can still be shaped by musical taste, but the roadmap should not depend only on intuition. We wanted external validation: compare WhatChord's current chord vocabulary with a large collection of existing chord annotations, then use the result to decide where more work would actually help players.

The data source

The comparison used ChoCo, a large linked-data chord corpus that gathers annotations from many existing datasets and formats. It includes material derived from sources such as Isophonics, RWC Pop, Weimar Jazz, USPop2002, Wikifonia, iReal Pro, and others.

That mix is exactly what makes the corpus useful. It is not a perfect model of what someone will play into a MIDI keyboard, and it is not a benchmark for live recognition accuracy. It is a broad snapshot of chord-symbol language across real annotated music.

The analysis looked at converted JAMS (JSON Annotated Music Specification) files from ChoCo and extracted Harte-style chord labels such as C:maj, D:min7, G:7(b9), and F#:hdim7. Then it removed the root from each label, so C major, E-flat major, and F-sharp major all count as the same chord body: major.
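The root-removal step can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the analysis: the function name `chord_body` is hypothetical, and the choice to drop any `/bass` suffix together with the root is an assumption the article does not confirm.

```python
from typing import Optional

# Labels that Harte-style annotations use for "no chord" and "unknown harmony",
# plus the empty string.
NON_CHORD_LABELS = {"", "N", "X"}


def chord_body(label: str) -> Optional[str]:
    """Return the root-independent part of a Harte-style chord label.

    Returns None for no-chord, unknown, or empty labels.
    Assumption: a "/bass" suffix is discarded along with the root.
    """
    label = label.strip()
    if label in NON_CHORD_LABELS:
        return None
    # Drop an optional bass note such as the "/b7" in "D:min7/b7".
    head = label.split("/", 1)[0]
    _root, sep, body = head.partition(":")
    # In Harte syntax a bare root such as "C" implies a major triad.
    return body if sep and body else "maj"


bodies = [chord_body(x) for x in ("C:maj", "Eb:maj", "F#:maj", "G:7(b9)", "N")]
```

With this sketch, `C:maj`, `Eb:maj`, and `F#:maj` all map to the body `maj`, while `G:7(b9)` keeps its alteration as `7(b9)` and `N` is skipped.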

The goal was not to train WhatChord from ChoCo or copy corpus-specific behavior into the app. The goal was to ask a narrower question: does WhatChord have names for the chord families that show up most often in a large real-world corpus?

The headline result

After excluding labels for no chord, unknown harmony, and empty values, the snapshot contained 1,097,701 chord observations. Those observations collapsed to 350 distinct chord bodies after root removal.

Measure	Count
JAMS files scanned	16,249
Chord observations	1,097,701
Distinct full chord labels	4,798
Distinct chord bodies after root removal	350
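
The counting behind a table like this can be sketched as follows. This is a hedged illustration, not the extraction script from the research note. It assumes each JAMS file is JSON with a top-level `annotations` list whose entries carry a `namespace` and a `data` list of observations with a `value` field. The `chord` namespace prefix filter and the function name `count_chord_labels` are assumptions made for the example.

```python
import json
from collections import Counter

# Values excluded from the totals: empty, no chord, unknown harmony.
SKIPPED = {"", "N", "X"}


def count_chord_labels(jams_text: str) -> Counter:
    """Count chord labels in one JAMS document given as a JSON string."""
    doc = json.loads(jams_text)
    counts: Counter = Counter()
    for annotation in doc.get("annotations", []):
        # Assumption: chord annotations use a namespace starting with "chord".
        if not str(annotation.get("namespace", "")).startswith("chord"):
            continue
        for obs in annotation.get("data", []):
            value = str(obs.get("value") or "").strip()
            if value not in SKIPPED:
                counts[value] += 1
    return counts


# A tiny made-up document in the assumed shape.
example = json.dumps({
    "annotations": [{
        "namespace": "chord_harte",
        "data": [
            {"time": 0.0, "duration": 2.0, "value": "C:maj"},
            {"time": 2.0, "duration": 2.0, "value": "N"},
            {"time": 4.0, "duration": 1.0, "value": "G:7(b9)"},
            {"time": 5.0, "duration": 1.0, "value": "C:maj"},
        ],
    }]
})
counts = count_chord_labels(example)
```

Summing such counters across files gives the observation total, and the number of distinct keys gives the count of distinct full labels.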

Compared with WhatChord's current templates and extension handling, the result was encouraging:

Coverage basis	Supported	Unsupported	Coverage
Observations	1,085,031	12,670	98.85%
Duration	3,082,437	37,074	98.81%

In plain English: nearly 99% of the chord language in this large mixed corpus, whether measured by observation count or by duration, is already inside WhatChord's recognition vocabulary. The current set of supported chord families covers the overwhelming majority of real annotated chord material.
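
The two coverage figures are simple ratios, and they can be checked directly from the totals in the table. The sketch below also shows one way the per-observation tally might be computed. The function names and the tiny sample data are hypothetical, and the actual analysis may have been organized differently.

```python
from collections import Counter


def coverage_percent(supported: float, unsupported: float) -> float:
    """Share of the total that is supported, as a percentage."""
    total = supported + unsupported
    return 100.0 * supported / total if total else 0.0


def coverage_by_count_and_duration(observations, supported_bodies):
    """Return (coverage by observation count, coverage by duration).

    `observations` is an iterable of (chord_body, duration) pairs.
    """
    count: Counter = Counter()
    duration: Counter = Counter()
    for body, length in observations:
        key = "supported" if body in supported_bodies else "unsupported"
        count[key] += 1
        duration[key] += length
    return (
        coverage_percent(count["supported"], count["unsupported"]),
        coverage_percent(duration["supported"], duration["unsupported"]),
    )


# Check the published percentages against the table totals.
by_observations = round(coverage_percent(1_085_031, 12_670), 2)  # 98.85
by_duration = round(coverage_percent(3_082_437, 37_074), 2)      # 98.81

# Made-up sample: two supported bodies, one unsupported.
sample = [("maj", 2.0), ("(1,5)", 1.0), ("min7", 1.0)]
sample_count_cov, sample_duration_cov = coverage_by_count_and_duration(
    sample, {"maj", "min7"}
)
```

Weighting by duration matters because a rarely used label held for a long time counts for more than its observation count suggests. In this corpus the two measures land within a few hundredths of a percentage point of each other.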

The common chords were the expected ones

The highest-frequency chord bodies were not surprising, which is a good sign. Major, dominant seventh, minor, minor seventh, major seventh, diminished, sixth, ninth, and altered dominant material all appeared near the top. Those are exactly the chord families a practical recognizer should handle well.

Chord body	Observations	WhatChord status
maj	375,189	supported
7	169,546	supported
min	123,885	supported
min7	71,362	supported
maj7	40,493	supported
dim	24,612	supported
9	15,367	supported
7(b9)	6,131	supported

This matters because development time is finite. A chord recognizer can always grow a longer list of labels, but every new label affects ranking. Add too many marginal templates and the app can become worse at naming common voicings, especially when several chord readings share the same notes.
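
A standard example of chord readings that share the same notes is C6 and Am7. The sketch below uses two illustrative interval templates to show that both names describe the same pitch-class set. The template table and function name are hypothetical and are not WhatChord's internal representation.

```python
# Natural note names mapped to pitch classes (C = 0).
NOTE_TO_PC = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}

# Illustrative templates as semitone offsets from the root.
TEMPLATES = {
    "6": (0, 4, 7, 9),      # major triad plus major sixth
    "min7": (0, 3, 7, 10),  # minor triad plus minor seventh
}


def pitch_classes(root: str, quality: str) -> frozenset:
    """Pitch-class set produced by a root and a template."""
    base = NOTE_TO_PC[root]
    return frozenset((base + offset) % 12 for offset in TEMPLATES[quality])


c6 = pitch_classes("C", "6")      # C, E, G, A
am7 = pitch_classes("A", "min7")  # A, C, E, G
same_notes = c6 == am7            # True: identical pitch-class sets
```

Because the notes alone cannot separate the two readings, a recognizer has to rank them using other evidence such as the bass note. Every additional template increases the number of ties of this kind.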

The missing labels were instructive

The most common unsupported bodies were not missing mainstream chord families. They were mostly omitted-tone labels, fifth-only sonorities, and root-only annotations. Those labels are meaningful in a corpus, but they do not necessarily make good live chord names.

Unsupported body	Observations	What it describes
(3,5)	6,008	omitted third and fifth
(*3,5)	1,595	fifth-only sonority
(5)	710	fifth-only sonority
(1,3,5)	623	root-only sonority

This supports an existing WhatChord design choice: dyads are reported as intervals rather than promoted into chord templates. WhatChord previously supported a power-fifth chord label, but that made ranking worse for a piano-focused app. The corpus result did not argue for bringing it back. It argued for restraint.
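
Reporting a dyad as an interval can be as simple as the sketch below. It is an illustration of the idea only, not WhatChord's implementation. The function name is hypothetical, and naming by semitone count ignores enharmonic spelling, so six semitones is reported as a tritone rather than an augmented fourth or diminished fifth.

```python
# Interval names indexed by size in semitones, reduced to within one octave.
INTERVAL_NAMES = [
    "octave or unison", "minor second", "major second", "minor third",
    "major third", "perfect fourth", "tritone", "perfect fifth",
    "minor sixth", "major sixth", "minor seventh", "major seventh",
]


def name_dyad(note_a: int, note_b: int) -> str:
    """Name the interval between two MIDI note numbers, measured upward
    from the lower note and reduced to within one octave."""
    low, high = sorted((note_a, note_b))
    return INTERVAL_NAMES[(high - low) % 12]


fifth = name_dyad(60, 67)   # C4 and G4 -> "perfect fifth"
fourth = name_dyad(67, 72)  # G4 and C5 -> "perfect fourth"
```

Under this approach a bare C and G is shown as a perfect fifth instead of competing with full chord templates for the same notes.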

The first unsupported labels that looked more like candidate chord qualities were minor sharp-five forms. They are real, but much rarer than the common chord families that dominate the corpus.

What this means for WhatChord

The practical lesson is not "the app is done." It is that the next highest-impact improvements are probably not a long list of new chord templates.

The data points toward three priorities:

Keep improving ranking for common ambiguous voicings, because those are the cases players will hit most often.
Keep improving spelling and explanations, because the same recognized chord can be more or less useful depending on whether the symbol matches musical convention.
Track rare but real chord families, such as minor sharp-five, without letting them disrupt the common cases.

That balance is central to WhatChord's approach. More recognition is only better when it improves the answer a musician sees. Sometimes the disciplined choice is to say no to a label, or at least not yet.

What the numbers do not prove

Corpus coverage is not the same as live analyzer accuracy. A supported chord body means WhatChord has the vocabulary to name that kind of chord. It does not prove every voicing from the source material would rank exactly the same way in the app.

The source material also mixes audio annotations, score-derived annotations, lead sheets, and converted symbolic formats. That breadth is useful, but it also means some labels encode conventions from their original source rather than universal chord-symbol practice.

So the corpus is best understood as a reality check, not a product spec. It helps keep the recognition roadmap grounded in music people actually annotate and play, while leaving room for the musical judgment that real-time chord naming still requires.

Read the research note.

The longer write-up includes the extraction method, source paths, reproduction commands, and additional unsupported-label details.

