Creating an ASP.NET application that converts text to speech

By default, ASP.NET applications don’t run with sufficient permissions to access speech synthesis, so running Larsenal’s code directly inside the web application will fail with a security error.

I got around this in one app by running a separate WCF service on the server, hosted in a regular Windows service. The ASP.NET application then communicated with that service, which simply wrapped Larsenal’s code: given a string of text, it returned an array of bytes.
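A minimal sketch of what such a setup could look like, under stated assumptions: the contract name ITextReader matches the service code further down, but the class names SpeechReader and SpeechReaderWindowsService, the named-pipe binding, and the address net.pipe://localhost/SpeechReader are all hypothetical choices for illustration, not details of the original app.

```csharp
using System.IO;
using System.ServiceModel;
using System.ServiceProcess;
using System.Speech.Synthesis;

// Contract shared by the service and the ASP.NET client.
[ServiceContract]
public interface ITextReader
{
    [OperationContract]
    byte[] SpeakText(string text);
}

// Wraps the speech synthesis call; runs inside the Windows service,
// not inside the ASP.NET worker process.
public class SpeechReader : ITextReader
{
    public byte[] SpeakText(string text)
    {
        using (var synth = new SpeechSynthesizer())
        using (var ms = new MemoryStream())
        {
            synth.SetOutputToWaveStream(ms);
            synth.Speak(text);
            return ms.ToArray();
        }
    }
}

// Windows service that hosts the WCF endpoint (names are assumptions).
public class SpeechReaderWindowsService : ServiceBase
{
    private ServiceHost host;

    protected override void OnStart(string[] args)
    {
        host = new ServiceHost(typeof(SpeechReader));
        // Assumed binding and address; the original could equally have used app.config.
        host.AddServiceEndpoint(
            typeof(ITextReader),
            new NetNamedPipeBinding(),
            "net.pipe://localhost/SpeechReader");
        host.Open();
    }

    protected override void OnStop()
    {
        if (host != null)
        {
            host.Close();
            host = null;
        }
    }

    public static void Main()
    {
        ServiceBase.Run(new SpeechReaderWindowsService());
    }
}
```

The project would need references to System.ServiceModel, System.ServiceProcess, and System.Speech, and the service still has to be installed (for example with installutil or sc) before it can be started.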

Also, one megabyte of text? That’s a good-sized novel.

Edit, 11-12-09, answering some comments:

System.Speech can either write its output to a stream, which you can turn into an array of bytes, or save it to a WAV file. Either way, you can then feed the audio to a media player embedded on the user’s page. When I built my talking web page, it worked like this:

1) Page.aspx includes an ‘embed’ tag that puts a Windows Media Player on the page. The source is “PlayText.aspx?Textid=whatever”.
2) PlayText.aspx loads the appropriate text and passes it (via WCF) to the SpeechReader service to be read.
3) The SpeechReader service creates a MemoryStream, calls SpeechSynthesizer.SetOutputToWaveStream, speaks the text, and returns the stream’s contents as a single array of bytes. PlayText.aspx then writes this array to the client with Response.BinaryWrite().
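Step 2 and the client half of step 3 might look roughly like the code-behind below. This is only a sketch: the binding, the endpoint address, the "audio/wav" content type, and the LoadText helper are assumptions made for illustration, and the contract is repeated here only so that the snippet stands alone.

```csharp
using System;
using System.ServiceModel;
using System.Web.UI;

// Client-side copy of the service contract.
[ServiceContract]
public interface ITextReader
{
    [OperationContract]
    byte[] SpeakText(string text);
}

public partial class PlayText : Page
{
    protected void Page_Load(object sender, EventArgs e)
    {
        string text = LoadText(Request.QueryString["Textid"]);

        // Assumed binding and address; they must match whatever the service exposes.
        // Note that WCF's default message-size quotas are small, so longer texts
        // will need them raised on the binding.
        var factory = new ChannelFactory<ITextReader>(
            new NetNamedPipeBinding(),
            new EndpointAddress("net.pipe://localhost/SpeechReader"));

        byte[] wav;
        try
        {
            ITextReader reader = factory.CreateChannel();
            wav = reader.SpeakText(text);
            factory.Close();
        }
        catch
        {
            // A faulted channel cannot be closed cleanly, so abort instead.
            factory.Abort();
            throw;
        }

        // Send the raw WAV bytes; Response.Write() is for text, not byte arrays.
        Response.Clear();
        Response.ContentType = "audio/wav";
        Response.BinaryWrite(wav);
    }

    // Hypothetical placeholder: a real page would look the text up by its id.
    private static string LoadText(string textId)
    {
        return "Text number " + textId;
    }
}
```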

Here’s the meat of the SpeechReader service:

    byte[] ITextReader.SpeakText(string text)
    {
        using (SpeechSynthesizer s = new SpeechSynthesizer())
        {
            using (MemoryStream ms = new MemoryStream())
            {
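                // Direct the synthesizer's WAV output into the in-memory stream.
                s.SetOutputToWaveStream(ms);
                s.Speak(text);
                // ToArray() copies only the bytes actually written; GetBuffer()
                // would also return the unused tail of the internal buffer.
                return ms.ToArray();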
            }
        }
    }

I suspect that on the wire this gets serialized as an enormous XML array of bytes, which would be horribly inefficient. I built it only as a proof of concept, so I didn’t research that. If you intend to use this in production, make sure that it’s not internally returning something like this:

    <byte>23</byte>
    <byte>42</byte>
    <byte>117</byte>
    ...
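One way to sidestep the question is to pick the message encoding explicitly rather than rely on defaults. The sketch below assumes the web application and the service run on the same machine, where NetNamedPipeBinding (which uses a binary encoding) is an option. The helper name and the 50 MB cap are assumptions for illustration, and the same binding settings would have to be used on both ends.

```csharp
using System.ServiceModel;

// Hypothetical helper, not part of the original service.
public static class SpeechReaderBindings
{
    // Assumed upper bound for one synthesized WAV payload.
    private const int MaxWavBytes = 50 * 1024 * 1024;

    public static NetNamedPipeBinding Create()
    {
        var binding = new NetNamedPipeBinding();

        // The default quotas (65,536 bytes per message, 16,384 per array)
        // are too small for all but the shortest audio clips.
        binding.MaxReceivedMessageSize = MaxWavBytes;
        binding.ReaderQuotas.MaxArrayLength = MaxWavBytes;
        return binding;
    }
}
```

To see what actually goes over the wire, WCF message logging can be switched on in the service's configuration file before deciding whether any of this is necessary.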
