The Null Device

Posts matching tags 'speech synthesis'

2005/3/20

Another online speech synthesizer demo; this one (ScanSoft's rVoice), however, has multiple accents, including British (i.e., RP), Scottish, Australian (only in sheila, though, and not bloke), Spanish, and not only American but also Valley Girl (more formally known as "Southern California").

Which is rather nifty; it's good to be able to get synthesized speech that doesn't sound either generic-American or (occasionally) RP-British (which some call the BBC accent, except for the fact that nobody on the BBC talks like that these days).

Apparently one of their markets is call centres and voice-response systems (and some of the voices have normal and call-centre modes of diction). Which could explain the presence of a Scottish accent; apparently, studies in Britain found that Scottish accents are considered the most soothing/least aggravating to call centre callers.

accents speech synthesis tech web toys websites

2003/11/25

The NYTimes has a piece on Vocaloid, the new singing voice-synthesis program that could automate the last part of music performance still done by humans. Vocaloid is interesting because voices are stored as interchangeable "fonts" of vast numbers of samples and articulation data. The first fonts coming out (from British samplemongers Zero-G) are a pair of soul-singer voices, Leon and Lola:

In the case of Leon and Lola, session singers were hired to record what Mr. Stratton calls "generic soul-singing voices." The decision to start with soul was purely a marketing calculation: Mr. Stratton figured that the most common use of Vocaloid, at least in its early stages, would be to serve as background singers. With a soulful sound, the company could target a commercial market that ranges from Justin Timberlake to Jay-Z.

(Bugger soul singers, I say, just give me Liz Fraser. Or Ian Curtis. A generic French-accented female voice could also be useful for all the post-Stereolab acts.)

The process, of course, could be exploited for mischief, as described below. Though doing so would require a vast amount of raw data, work and expertise to prepare the voice font, something beyond the reach of casual pranksters.

What's to stop dilettantes from creating their own fonts? Could it be long before falsified but entirely convincing clips of Britney Spears begging for Justin's forgiveness circulate on the Web, to say nothing of George Bush conspiring with Tony Blair about weapons of mass destruction?

The major market will be celebrity voices, undoubtedly priced beyond the reach of mere mortals, and giving Fortune 500 corporations that touch of class that comes with having Frank Sinatra sing the company song:

Licensing Elvis for Vocaloid would be a different matter, though, says Gary Hovey, vice-president of entertainment for Elvis Presley Enterprises. "If someone came to us and said, `We want Elvis to sing this new song,' we'd have a lot to contemplate," he said. "We tried to retain the integrity of his original song with the remixes. Now you're talking about a whole new vocal performance of a song he never sang or knew? How do we know he'd want to sing it?" "Believe me, that would go all the way to Lisa," he added, referring to Elvis's daughter, Lisa Marie Presley, who owns Elvis's estate.

Once a full palette of vocal fonts is available (or once Yamaha allows users to create their own), the possibilities become mind-boggling: a chorus of Billie Holiday, Louis Armstrong and Frank Sinatra; Marilyn Manson singing show tunes and Barbra Streisand covering Iron Maiden. And how long before a band takes the stage with no human at the mike, but boasting an amazing voice, regardless?

The article then points out that, with this in place, the entire process of song production could be automated. Lyrics could be pieced together from a database of stock phrases or generated by a narrative engine (though, given how songs can succeed without the lyrics making sense, as any 90s Eurodance hit shows, that may not be necessary). Instruments can be synthesised; this includes guitars, and I have in my collection a program named Virtual Guitarist which does just that, passably if inflexibly in places, though certainly well enough for pop songs. The mixing can be automated too. Finally, the hit quality of the finished product can be mathematically assessed using the Hit Song Science algorithm, and a genetic algorithm used to evolve the catchiest song. All stages of the process (from instrumentation and lyrical content to final scoring) could be tweaked using market research ("Electroclash is out, booty bass is coming back ironically, chip tunes are the dog's bollocks, and 90s grunge retro is due any day now"). And then we may all end up living in a Greg Egan story.
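For the curious, the "evolve the catchiest song" step might look something like the toy sketch below. Everything in it is made up for illustration: the phrase list is a hypothetical stand-in for a lyric database, and toy_hit_score is emphatically not Hit Song Science (whose workings I don't know); it just rewards repetition, which seems about right for pop. The genetic algorithm itself is the plain textbook kind: keep the best half, breed the rest by crossover, and mutate a little.

```python
import random

# Hypothetical stock phrases; a real system would draw on a far larger database.
STOCK_PHRASES = [
    "baby", "tonight", "feel the rhythm", "in my heart",
    "on the dance floor", "never let go", "oh yeah", "all night long",
]


def toy_hit_score(song):
    """Stand-in fitness function (an assumption, not Hit Song Science):
    counts how many lines repeat an earlier phrase."""
    return len(song) - len(set(song))


def evolve_song(length=8, population_size=30, generations=40,
                mutation_rate=0.1, seed=0):
    """Evolve a list of `length` phrases that maximises toy_hit_score."""
    rng = random.Random(seed)
    # Start from a population of random songs.
    population = [
        [rng.choice(STOCK_PHRASES) for _ in range(length)]
        for _ in range(population_size)
    ]
    for _ in range(generations):
        # Selection: keep the better-scoring half unchanged (elitism).
        population.sort(key=toy_hit_score, reverse=True)
        survivors = population[: population_size // 2]
        children = []
        while len(survivors) + len(children) < population_size:
            # Crossover: splice two surviving songs at a random cut point.
            mum, dad = rng.sample(survivors, 2)
            cut = rng.randrange(1, length)
            child = mum[:cut] + dad[cut:]
            # Mutation: occasionally swap a line for a random stock phrase.
            child = [
                rng.choice(STOCK_PHRASES) if rng.random() < mutation_rate else line
                for line in child
            ]
            children.append(child)
        population = survivors + children
    return max(population, key=toy_hit_score)


if __name__ == "__main__":
    for line in evolve_song():
        print(line)
```

Swapping in a different scoring function (or weights from this week's market research) is the only change needed to steer what comes out, which is rather the point of the article's scenario.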

computer music softsynths software speech synthesis tech vocaloid

2003/10/2

If you can read this, then we're back. A routine machine relocation didn't go quite to plan, but it's all fixed now (hopefully).

And below is the backlog of blog items that didn't get posted to The Null Device over the past few days:

chavs consumerism gibson's law google google file system retrocomputing scotland society speech synthesis tech terrorism the long siege videogames warren ellis websites

2003/9/8

First there were pocket-sized USB flash disks, then USB flash disks with built-in MP3 players (for those whose music collections fit in 128MB), and now, if an ad on the front page of the Computer Trader (a cheaply printed monthly paper of classifieds and price lists) is to be believed, there are USB flash disks with text-to-speech. It doesn't say exactly how it works, but I presume that you copy text files to it and it reads them to you while you drive/jog/catch the bus. Which could be useful, depending on other things: how listenable the voice is, how easy it is to navigate through texts, and what file formats it can read (plain text? MS Word? Unicode?).

gadgets speech synthesis tech

2003/4/26

Yamaha have developed a program for synthesizing sung vocals. Named Vocaloid, the program uses libraries of vocal fragments and articulation algorithms to synthesise realistic singing. It currently comes with a "Soul Vocalist" data set, for all your throaty dance vocal needs. Windows-only, I'm afraid, and no word of VST compatibility; there's a screenshot here. (via Found)

computer music softsynths speech synthesis vocaloid

2002/3/21

A company has developed speech synthesis with user-selectable accents, including an Australian accent and a Scottish brogue. Wonder on which platforms this technology will be available.

accents speech synthesis tech
