
capella audio2score pro 5

Score Generating Software By Robin Bigwood
Published July 2024

audio2score in full flight, in main program view, showing the detected harmonic content for an orchestral recording in a piano‑roll editor, and the resulting live notation below.

audio2score uses AI to turn recorded music into sheet music — but just how good are the results?

While headline‑grabbing so-called Artificial Intelligence (AI) technology fills our social media feeds with fantasy artwork and formulaic poetry and prose, a much more focused version of it has been powering various transcription‑oriented music software for years. Tools that scan printed music into live notation, or that extract note information from melodies or individual tracks, are commonplace now, ranging from cheap iOS apps to Celemony's Melodyne.

However, software that can ingest audio of a multi‑instrument mix and then attempt to separate out and notate all its constituent parts is still comparatively rare. For good reason: it’s really difficult to do, even for expert‑level human musicians. But it’s exactly this that the (self‑consciously lower‑case) capella audio2score pro 5.0 aims to do, on Windows (10 or later) and macOS (10.13 or later). In fact the app goes beyond mere transcription, as we’ll see in a minute, and journeys well into arrangement territory. It’s more capable and flexible than you might at first assume.

Analysis

audio2score pro 5 (which I’ll call A2S for the rest of this review, for grammatical sanity) runs in a single window on your computer, and has a pleasant immediacy about it from the off. After launch you’re prompted to open an audio file (an 8‑ or 16‑bit WAV, an MP3 or, on Windows, a WMA), or an A2S project in capasp format. Then it’s straight down to business.

Assuming you’ve fed in new audio, you’re met with a dialogue box that asks you to specify what it contains: the choices are Piano only, Classical (without vocals), Pop (without vocals), and Pop (with vocals). An analysis process then takes place, with a window showing the app’s progress as it calculates ‘spectrum’ and ‘sound tissue’, then recognises tuning, key and notes. On my M1 Mac this typically took about half the duration of the audio, varying according to complexity. And then, boom, a notated score appears.

It is just possible, at this stage, especially with simple musical material, that the software will have nailed it. You can gauge that visually if you’re a music reader, or to some extent via A2S’s embedded SoundFont‑based playback module. Playback controls are found in a strip at the bottom of the window, and one of them is a useful crossfader that outputs your original audio file (in mono) when set to the left, sample playback from the extracted data to the right, or a sync’ed mix of the two in between.

If all is up to snuff you’ll have already reached the ultimate point of the exercise, which is to export a PDF score, a type 0 or 1 MIDI file, or a MusicXML document for passing on to another notation application such as Sibelius. If you use capella software’s own notation application (capella) then the transcription can be transferred to that directly or via the specialised CapXML format. There is one last option, to save the A2S project itself, and that lets you return to unfinished work without having to re-run the analysis process.

Reality

Although audio2score’s initial analysis often gets some things right, the chances of complete accuracy are actually slim. And while playback of extracted data frequently sounds reasonably plausible, in a ball‑park sort of way, the notation may well be off, ranging from clunky, inelegant and scored for a different instrumentation, via strangely full of holes and misallocated voices, to downright bizarre.

The vast majority of features in the application are concerned with refining the accuracy of the data extraction, and hence the coherence of the musical content and the readability of a score. It all happens at two markedly different levels of complexity.

In what’s known as the Assistant, where you’ll begin and end your A2S journeys, you can get a lot done. Your analysis will have begun in the Start tab, and Export is the ultimate goal, but in between are several others: Recognition, Score, Barlines and Layout. There’s a point to this ordering, and tweaks you make to the relatively few parameters in each tab can have a significant bearing on the outcome.

All A2S work starts in the simplified, tabbed Assistant view, and in some cases can finish there too, if little or no editing is required.

In Note by Note mode the resulting score is subject to a fairly restrictive list of instrumentations. To go further, manual editing is required.

Most fundamental of all is the so‑called ‘Scheme of recognition’. A ‘Note by note’ scheme does indeed attempt to extract and represent the maximum amount of original audio file content, without adding anything that wasn’t there to begin with. Interestingly, though, and perhaps surprisingly, A2S is by design not at all hung up on sticking with the instrumentation in your audio file, and indeed frequently merrily discards it.

After ingesting a piano piece, for example, choosing the ‘Piano’ instrumentation preset might give you a faithfully‑notated result. But you could just as easily select ‘Guitar’ to get an instant arrangement for that instrument, notated on an appropriate octave‑down single treble stave and with a playable pitch range. Conversely, you might choose to analyse a guitar performance and have A2S score it for classical organ, with a pedal part and feasible voicing and hand‑stretches for a bass and treble stave above. Eleven preset instrumentations completely disregard the instrumentation in the input file, while a further seven do attempt to recognise timbre, but only within the predefined combinations of piano, strings and wind, which I think is a bit restrictive. For vocal pop music transcription, which is a new ability in version 5 of the app, some alternative options appear here, tilted towards transcription in a lead‑sheet style. You get to specify the...
