Making Machines Sound Human

Computer speech just strings some syllables together, right? Yet we've been working on it for more than half a century. Maybe one of our most natural, organic, and human behaviors isn't that easy to simulate.

Recently I wrote about a study that had tested the potential for automated speech writing. It uses a rule-based approach that was developed after researching thousands of successful and unsuccessful speeches down the years.

Meanwhile, there are also advances being made in speech analytics, with a Google Glass-like invention giving speakers live feedback on everything from their pitch to their cadence.

Whilst this may sound as though we’re rapidly hurtling towards a time where robots will be able to deliver speeches, the actual voice of the machine is still something that researchers are struggling to crack.

Making Machines Sound Human

It’s a problem that IBM attempted to tackle when they developed a voice for Watson.  Early attempts didn’t really sound particularly human, but they weren’t quite as foreboding as HAL either.

Since those pioneering days, we’ve had voice added to a range of computerized platforms, from your GPS device to the Siri-like personal assistant on your mobile phone.

Synthesized voices are also increasingly deployed in robotic assistants for the home, the factory, and various medical environments.

Developments in this area revolve around what are known as ‘conversational agents’: programs that can both understand natural language and respond in kind.

Alas, the field is still a long way from developing a device whose voice is indistinguishable from that of a human. We’re a long way from a machine being able to pass an audible Turing test, for instance.

There is also the issue of the ‘uncanny valley’, which describes our revulsion at things that are broadly similar to us but still noticeably not us. So as robots come close to seeming human without quite getting there, we tend to be more turned off, not less.

Mixing and Matching

At the moment, most synthesized speech is generated using a huge database of words and other subsets of speech that can then be put together into something sensible sounding.

This database consists of humans recording those words, but even then there are distinct challenges involved in the inflection used to portray emotions and context in particular circumstances.  Simply having one iteration of each word is therefore not enough, and this is before things like accents, dialects and slang are taken into account.
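To make the mix-and-match idea concrete, here is a minimal sketch of how stitching recorded units together might look. It is an illustration only, not how any particular product works: the `UNITS` table, the inflection labels, and the `select_unit` and `synthesize` helpers are all hypothetical, and short lists of numbers stand in for real audio recordings.

```python
from typing import Dict, List, Tuple

# Hypothetical unit database: (word, inflection) -> recorded audio samples.
# Real systems store waveforms; short lists of numbers stand in for them here.
UNITS: Dict[Tuple[str, str], List[float]] = {
    ("hello", "neutral"): [0.1, 0.2],
    ("hello", "excited"): [0.3, 0.6],
    ("world", "neutral"): [0.4, 0.5],
}


def select_unit(word: str, inflection: str) -> List[float]:
    """Return the recording matching the inflection, else the neutral one."""
    key = (word, inflection)
    if key in UNITS:
        return UNITS[key]
    fallback = (word, "neutral")
    if fallback in UNITS:
        return UNITS[fallback]
    raise KeyError(f"no recording for {word!r}")


def synthesize(text: str, inflection: str = "neutral") -> List[float]:
    """Concatenate the selected recordings for each word, in order."""
    samples: List[float] = []
    for word in text.lower().split():
        samples.extend(select_unit(word, inflection))
    return samples


if __name__ == "__main__":
    # "world" has no excited recording, so the neutral one is used instead.
    print(synthesize("Hello world", inflection="excited"))
```

The fallback to a neutral recording shows why one iteration of each word is not enough: wherever the database lacks the right inflection, the output shifts tone mid-sentence.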

This remains a challenge that the industry has not managed to overcome, so whilst synthesized speech is largely functional, it is still some way from really reflecting our own speech.

Whether we really want synthesized speech to become lifelike, however, is another matter. I mentioned at the start of this post a project designed to automate speech writing, and it doesn’t seem a stretch to think that once both the content and the delivery are automated, it could have some severe implications.

For instance, the Israeli tech company Imperson are believed to be considering a foray into politics, with politicians deploying an avatar developed by Imperson to represent them online.

So you could have a digital Donald Trump cut loose on Twitter, talking in the same way as the Donald does in real life.

There is already evidence that organizations are using bots online to engage with stakeholders, but giving those bots the ability to talk fluently and convincingly gives them a whole new level of power.

Published at DZone with permission of Adi Gaskell, DZone MVB. See the original article here.

Opinions expressed by DZone contributors are their own.
