The Null Device

Vocaloid

The NYTimes has a piece on Vocaloid, the new singing voice-synthesis program that could automate the last part of music performance still done by humans. Vocaloid is interesting because voices are stored as interchangeable "fonts" of vast numbers of samples and articulation data. The first fonts coming out (from British samplemongers Zero-G) are a pair of soul-singer voices, Leon and Lola:

In the case of Leon and Lola, session singers were hired to record what Mr. Stratton calls "generic soul-singing voices." The decision to start with soul was purely a marketing calculation: Mr. Stratton figured that the most common use of Vocaloid, at least in its early stages, would be to serve as background singers. With a soulful sound, the company could target a commercial market that ranges from Justin Timberlake to Jay-Z.

(Bugger soul singers, I say, just give me Liz Fraser. Or Ian Curtis. A generic French-accented female voice could also be useful for all the post-Stereolab acts.)

The process, of course, could be exploited for mischief, as described below, though doing so would require a vast amount of raw data, work and expertise to prepare the voice font, something beyond the reach of casual pranksters.

What's to stop dilettantes from creating their own fonts? Could it be long before falsified but entirely convincing clips of Britney Spears begging for Justin's forgiveness circulate on the Web, to say nothing of George Bush conspiring with Tony Blair about weapons of mass destruction?

The major market will be celebrity voices, undoubtedly priced beyond the reach of mere mortals, and giving Fortune 500 corporations that touch of class that comes with having Frank Sinatra sing the company song:

Licensing Elvis for Vocaloid would be a different matter, though, says Gary Hovey, vice-president of entertainment for Elvis Presley Enterprises. "If someone came to us and said, 'We want Elvis to sing this new song,' we'd have a lot to contemplate," he said. "We tried to retain the integrity of his original song with the remixes. Now you're talking about a whole new vocal performance of a song he never sang or knew? How do we know he'd want to sing it?"

"Believe me, that would go all the way to Lisa," he added, referring to Elvis's daughter, Lisa Marie Presley, who owns Elvis's estate.

Once a full palette of vocal fonts is available (or once Yamaha allows users to create their own), the possibilities become mind-boggling: a chorus of Billie Holiday, Louis Armstrong and Frank Sinatra; Marilyn Manson singing show tunes and Barbra Streisand covering Iron Maiden. And how long before a band takes the stage with no human at the mike, but boasting an amazing voice, regardless?

The article then points out that, with this in place, the entire process of song production could be automated. Lyrics could be pieced together from a database of stock phrases or generated by a narrative engine, though given how well songs can succeed without their lyrics making sense (look at any 90s Eurodance hit), that may not even be necessary. Instruments can be synthesised; this includes guitars, and I have in my collection a program named Virtual Guitarist which does just that, passably if inflexibly in places, though certainly well enough for pop songs. The mixing can be automated too. Finally, the hit quality of the finished product can be mathematically assessed using the Hit Song Science algorithm, and a genetic algorithm used to evolve the catchiest song. All stages of the process (from instrumentation/lyrical content to final scoring) could be tweaked using market research ("Electroclash is out, booty bass is coming back ironically, chip tunes are the dog's bollocks, and 90s grunge retro is due any day now"). And then we may all end up living in a Greg Egan story.
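
For the curious, the last step (scoring songs and evolving the catchiest one) can be sketched in a few lines of Python. This is a toy under loud assumptions: `hit_score` is a made-up stand-in that merely likes stepwise melodies ending on the tonic, and has nothing to do with the real Hit Song Science algorithm, whose workings I don't know; the melody representation (sixteen scale degrees) and all the names are likewise hypothetical.

```python
import random

NOTES = list(range(8))   # scale degrees 0..7; a stand-in for real musical material
LENGTH = 16              # notes per melody

def hit_score(melody):
    """Toy stand-in for a hit-prediction score (NOT the real Hit Song Science).

    Rewards small steps between neighbouring notes and ending on the tonic (0).
    """
    steps = sum(1 for a, b in zip(melody, melody[1:]) if abs(a - b) <= 2)
    return steps + (5 if melody[-1] == 0 else 0)

def mutate(melody, rng, rate=0.1):
    # Replace each note with a random one with probability `rate`.
    return [rng.choice(NOTES) if rng.random() < rate else n for n in melody]

def crossover(a, b, rng):
    # Single-point crossover: the head of one parent, the tail of the other.
    cut = rng.randrange(1, LENGTH)
    return a[:cut] + b[cut:]

def evolve(generations=50, pop_size=40, seed=0):
    rng = random.Random(seed)
    population = [[rng.choice(NOTES) for _ in range(LENGTH)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=hit_score, reverse=True)
        parents = population[: pop_size // 4]   # top quarter survives unchanged
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - len(parents))
        ]
        population = parents + children
    return max(population, key=hit_score)

best = evolve()
print(best, hit_score(best))
```

Swap in a scoring function fed by market research and you have the whole dystopia; the evolutionary loop itself would not need to change.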
