Text-to-speech recognition

Rick Knight (rknight@platinum.com)
Fri, 25 Jul 97 10:37:49 CST


Danny (a.k.a. Some Dumb Kid) ;-> wrote:

All they (text-to-speech programs) need to do is to get someone to
record their voice for each word, probably a few different times, with
different stresses each time. The problem comes in assembling the
grammar to fit which stressed words should be used. Does typical
grammar apply here or are there new forms of language
composition/evaluation that can be applied?
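
A rough sketch of the scheme described above, assuming each word has
been recorded in a handful of stress variants. The variant names
("weak", "normal", "strong", "rising"), the function-word list and the
rule itself are all made up for illustration; real prosody needs far
more than this.

```python
# Hypothetical sketch: choose one pre-recorded variant per word with a crude rule.
# The variant names and the word list below are assumptions, not a real product's data.
FUNCTION_WORDS = {"a", "an", "the", "of", "to", "in", "on", "and", "or", "is", "it"}

def pick_clips(sentence):
    """Return (word, variant) pairs naming which recording to play for each word."""
    words = [w.strip(".,!?;:\"'").lower() for w in sentence.split()]
    words = [w for w in words if w]
    is_question = sentence.rstrip().endswith("?")
    # The last content word carries the main stress of the sentence.
    content = [i for i, w in enumerate(words) if w not in FUNCTION_WORDS]
    last_content = content[-1] if content else None
    clips = []
    for i, w in enumerate(words):
        if w in FUNCTION_WORDS:
            variant = "weak"
        elif i == last_content:
            variant = "rising" if is_question else "strong"
        else:
            variant = "normal"
        clips.append((w, variant))
    return clips

print(pick_clips("The dog sat on the mat."))
```

Even this toy rule needs to know where the sentence ends and which
words matter, which hints at why a pile of recordings alone is not
enough.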

Rick replies:

Well, if the text-to-speech freebie (TextO'LE) I got with my
Soundblaster AWE64 is any indication, the algorithms for constructing
speech would need to be a little more robust than just a few
pre-recorded inflections of each conceivable word. This program hits
the mark fairly often, but it still tends to sound like a Russian with
a Swedish accent on barbiturates. And the female voice sounds like me
on helium and shrooms.

I was "listening to" this program's recital of "The Long Boom", which
I downloaded from Wired's web page (more to see if it could tackle
some of the verbiage). "Internet" and "cyberspace" were very curiously
pronounced before I added inflection correction (is that like
conjunction junction?).
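
I don't know how TextO'LE stores its corrections, but the general idea
of a per-word override table can be sketched like this. The
respellings and the function name are invented for the example; a real
engine would more likely take phoneme strings than respelled English.

```python
import re

# Hypothetical respellings for words the engine mangles (assumed, not TextO'LE's format).
RESPELL = {
    "internet": "inter net",
    "cyberspace": "sigh ber space",
}

def apply_corrections(text, table=RESPELL):
    """Swap each listed word for its respelling before the text reaches the engine."""
    def swap(match):
        word = match.group(0)
        # Lookup is case-insensitive; unlisted words pass through untouched.
        return table.get(word.lower(), word)
    return re.sub(r"[A-Za-z']+", swap, text)

print(apply_corrections("The Internet and cyberspace."))
```

A table like this only helps with words that always sound the same,
which is exactly where it runs out of steam.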

I assume it would take more than lookup tables to decide when to
pronounce "lead" as LEED or LED, or "live" with a short or long "i",
since the spelling alone doesn't tell you which one is meant. That's
just one of its shortcomings (but it WAS free). I was going to go
looking for a better program. If I could readily understand (and
better enjoy) the inflections and timbre, AND it hit the mark more
often without requiring so much correction, I could "listen" to
articles and books while making my bed or doing some other mundane
task. Seems like the blind would appreciate this as well.
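
To show why a plain lookup table falls short on "lead" and "live",
here is a minimal sketch that peeks at the following word before
choosing a pronunciation. The cue lists and the LEED/LED/LIV/LYVE
respellings are my own guesses for illustration; a serious system
would want part-of-speech tagging or something smarter.

```python
# Illustrative only: (default, alternate, next-word cues that trigger the alternate).
# The cue words are assumptions chosen for the example, not a tested rule set.
RULES = {
    "lead": ("LEED", "LED", {"pipe", "pipes", "paint", "pencil", "poisoning"}),
    "live": ("LIV", "LYVE", {"broadcast", "music", "show", "wire", "concert"}),
}

def pronounce(words, i):
    """Pick a pronunciation for words[i] using the word that follows it."""
    word = words[i].lower()
    if word not in RULES:
        return word
    default, alternate, next_cues = RULES[word]
    nxt = words[i + 1].lower() if i + 1 < len(words) else ""
    return alternate if nxt in next_cues else default

def transcribe(sentence):
    """Split a sentence into words and resolve each known homograph."""
    words = [w.strip(".,!?;:") for w in sentence.split()]
    words = [w for w in words if w]
    return [pronounce(words, i) for i in range(len(words))]

print(transcribe("They live near a lead pipe."))
print(transcribe("Lead us to the live show."))
```

One word of context already separates these two sentences, but it is
easy to write a sentence that fools it ("the lead in the play"), so
take it as a starting point only.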

If anyone has ideas on existing software out there, or on how to build
a better mousetrap, I'm listening.

Rick