ACE Studio is an AI‑powered virtual vocalist that’s frighteningly close to the real thing.
These audio examples accompany my review of ACE Studio from the SOS November 2024 issue.
www.soundonsound.com/reviews/ace-studio
To give an impression of what is possible with ACE Studio, I’ve provided three audio examples here. Each is based around a different musical style and, in all cases, consists of a simple backing track (not arranged, mixed or mastered; just enough to provide the musical context) and vocals produced within ACE Studio. In the final versions you hear, I have, however, added a quick dash of EQ (mostly just to trim the low end), compression, and reverb/delay to the various AI‑generated vocals.
These examples were created after I’d been using ACE Studio off and on over about a two‑week period, within which I was working on the review itself; as with the audio examples I prepared when reviewing Dreamtonics’ Synthesizer V, this would put me in the ‘new user’ category, but perhaps not the ‘novice’ one. The workflow most certainly became faster as I gained familiarity with the various controls and options over that period and I’m sure there is scope to improve that further with extended use.
To provide a (hopefully) realistic impression of what’s possible based upon this level of experience, I then deliberately limited the amount of time I spent on each of these examples. For the first two examples, the backing tracks were quickly roughed out (mostly from various audio loops) within Cubase and a MIDI‑based melody was added using a virtual piano sound. Both backing track (as a stereo audio file) and melody (as a MIDI file) were then exported from Cubase and imported into an empty ACE Studio project set to the correct tempo. For comparison, both of these first two examples were tracks that I had also generated using Synthesizer V and which are also available to audition on the SOS website as part of that review. The third example uses a different approach as detailed below.
I then spent around 60 minutes creating each of the vocals. First, I selected a suitable AI voice database for the MIDI‑based ‘singer’ track. I then entered some suitable lyrics (no prizes for originality!) for each of the melody notes. Finally — and where the most time was spent — I worked my way through the performance, adjusting note timings, pitch data and pronunciation/emphasis of the individual phonemes as required. I also experimented with the various automatable parameters available for the voice to provide some contrasts between the different parts of each example.
ACE Studio Audio Example 01.mp3
This example is based around a trance/EDM backing track and features one of the native English female AI voices. The only difficulty I encountered was with editing the pronunciation of one or two words. I can easily imagine EDM songwriters using this approach as part of their demo process.
ACE Studio Audio Example 02.mp3
This example is based around a simple solo piano and vocal arrangement, providing one of the most exposed contexts in which ACE Studio’s vocals might be placed. Again, I used a native English female voice for this but also blended in elements of two other female voices using the Voice Mix feature described in the main text to get something that I thought suited the style more closely.
ACE Studio Audio Example 03.mp3
This third example makes use of ACE Studio’s impressive a cappella‑to‑MIDI conversion process. As before, I quickly put together a backing track within Cubase (just a combination of drum, bass and guitar loops) but, this time, I then sang a scratch vocal idea (with apologies for making you endure that!) with the first random lyrics I could think of. Both backing track and scratch demo vocal were then exported from Cubase and imported into an empty ACE Studio project set to the correct tempo.
I then configured a Singer Track using an English‑native male AI vocalist. I dragged and dropped my scratch demo vocal audio to this track and, a couple of clicks later, ACE Studio analysed the audio performance, converted it into suitable MIDI notes, created a pitch curve based upon the original performance and attempted to interpret the words within the original and add them as lyrics to the appropriate notes within the Singer Track. This conversion process took less than 60 seconds. I was then able to make any necessary edits, swap the vocalist in/out, and use any of the other features of ACE Studio to experiment with the vocal performance created.
The audio example file you can hear demonstrates some of the steps involved and is divided into six sections, each of which contains a playthrough of the same 16 bars of the backing track. These six sections are organised as follows:
- (a) The backing track with my scratch demo vocal.
- (b) The backing track with the AI vocal created by ACE Studio’s automatic a cappella‑to‑MIDI conversion process. This is without any further editing applied and, while there are a few spots where the lyrics have not been interpreted correctly, and one very obvious glitch in the pitch, it’s an impressive result that captures the timing, pitch and lyrical content with minimal effort on the part of the user.
- (c) As for (b) but with the addition of a few minutes’ work to tidy up the most obvious lyric and pitch issues that required attention after the automatic conversion process. Your mileage may vary, but to my ears at least, the AI‑generated vocal is delivered with considerably more conviction than my original scratch part.
- (d) As for (c) but with the addition of a pair of vocal double tracks added using ACE Studio’s automatic Vocal Double feature. Other than adjusting their respective volumes to provide a suitable balance, no other editing has been applied. Again, you could tweak this to taste, or only add the doubles in selected spots but, as with doubling a human vocal, the result is added weight to the overall vocal performance created.
- (e) As for (d) but in this case, all the male AI vocalists have been swapped for female vocalists and, for the central lead vocal, the pitch of the MIDI notes within the melody in the second half of the clip has been raised by an octave to give it more energy and impact. To hear the scratch vocal from (a) transformed into a female vocal in (e) is quite a thing... and, again, the application for songwriters when demo’ing new song ideas is obvious.
- (f) As for (e) but in this case, the doubles have been removed and both the melody of the female lead vocal and some of the lyrics have been edited to adjust the performance (I didn’t adjust the lengths of notes or the timing in this case but that could also be done). I think I prefer the original version but, again, it illustrates the principle of editing the vocal performance as part of what might be an evolving song project.
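
For anyone curious about what the octave lift in (e) amounts to in MIDI terms, here is a minimal Python sketch. It is purely illustrative: within ACE Studio the edit is made in the note editor, not in code, and the `Note` class, the `raise_second_half` function and the assumption of 4/4 time (so 16 bars equals 64 beats) are all hypothetical and not taken from ACE Studio itself. The only fixed fact it relies on is that an octave is 12 semitones and MIDI note numbers run from 0 to 127.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Note:
    """A hypothetical, simplified melody note (not ACE Studio's own format)."""
    start_beat: float    # onset, in beats from the start of the clip
    length_beats: float  # duration in beats
    pitch: int           # MIDI note number (60 = middle C)
    lyric: str


def raise_second_half(notes, clip_beats, semitones=12):
    """Return new notes, transposing those that start in the clip's second half."""
    midpoint = clip_beats / 2
    shifted = []
    for note in notes:
        if note.start_beat >= midpoint:
            new_pitch = note.pitch + semitones
            if not 0 <= new_pitch <= 127:
                raise ValueError(f"pitch {new_pitch} is outside the MIDI range")
            note = replace(note, pitch=new_pitch)
        shifted.append(note)
    return shifted


# Assumed 4/4 time: 16 bars x 4 beats = 64 beats, so the midpoint is beat 32.
melody = [
    Note(0.0, 1.0, 60, "la"),
    Note(32.0, 1.0, 62, "la"),
    Note(40.0, 2.0, 64, "oh"),
]
lifted = raise_second_half(melody, clip_beats=64)
print([n.pitch for n in lifted])  # [60, 74, 76]
```

Only the pitch values change; timings, note lengths and lyrics are left untouched, which matches the kind of edit described in (e).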