Archive for October, 2017

Controlling my App using Voice

October 15th, 2017 by Heather Maloney

Adding voice recognition to my mobile app
In order for the apps on your smartphone to be voice controlled, they need to be specifically programmed that way.

Some of the more common voice-enabled apps you are likely to find on your smartphone are:

  • Calendar – ask your smartphone the time of your next / first appointment, on a particular day, and it will tell you the answer and automatically show your calendar appointments for that day on screen
  • Phone – tell your smartphone to call person X, or send a text message to person Y, and it will take care of these tasks, prompting you for the details as required
  • Alarm – set an alarm to go off at a particular date and time
  • Search – ask your phone to search for a topic, and it will display a clickable list of search results

Voice recognition technologies have improved significantly over the last few years, providing numerous options with regard to voice enabling mobile apps, including:

  1. The Android operating system for wearables (e.g. Galaxy watch), smartphones and tablets includes built-in voice actions for carrying out commonly used tasks such as writing a note. It also allows an app to declare its own “intents”, which listen for voice activation once the user has launched the app. Finally, it includes methods that let the user enter free-form text by voice for processing by your app.
  2. Google Voice Interactions API – a code library provided by Google which allows an app to be triggered via the Google Now interface – that’s what you’re using when you say ‘Okay Google’ and then say a command.
  3. Apple devices such as iPhones and iPads run the iOS operating system (the Apple Watch runs the related watchOS). Native iOS apps are written in either Objective-C or Swift (a more recent language). With the launch of iOS 10, Apple added a Speech framework that makes it easier for developers to listen for voice commands and convert speech into text for use within apps.
  4. SiriKit was released in 2016, providing a toolkit for iOS developers to add voice interaction through Siri into their iOS 10 apps.
  5. Cross platform apps need to use 3rd party libraries to interface with the native speech recognition functions.

It’s important to know that with these options the user’s speech is sent to Apple’s or Google’s servers for processing, and the result is then returned to the mobile device, so some lag may be noticed, particularly with longer bursts of speech. Sending speech off the device also raises privacy considerations for your users.

3rd party APIs exist which are completely contained within the mobile device, meaning that the user doesn’t need to have an internet connection to use them, and the privacy issues are reduced. An example of such a 3rd party API is the CMU Sphinx – Speech Recognition Toolkit. The downside of using such a library is that you can’t avail yourself of the amazingly accurate voice recognition the large players have developed over time, including for many different languages.

Obvious apps which provide the user with significant benefit from the use of voice control include:

  • An app which improves or assists the job of a hands-on task e.g. chefs, surgeons, artists, hairdressers …
  • An app which is needed while a person is driving e.g. navigation, finding locations, dictating ideas on-the-go …
  • An app needed by a person with disability.
  • An app which involves the entry of lots of text.

We expect to see more and more support for voice in all sorts of applications in future. What would you like to be able to achieve through voice commands?


Can voice input be added to my web form?

October 13th, 2017 by Heather Maloney

Given the recent proliferation of ads about Google Home, it’s now common knowledge that you can easily talk to electronic devices and instruct them to do things such as search the web, play your favourite tune, give you the weather forecast, call a friend, or tell you the time of your first appointment on a particular day. Google Now is the technology that enables voice control of Google and Android devices, and Siri powers voice control on Apple devices. Windows 10 provided Cortana to do the same.

When you are using a smartphone to interact with a form on a web page, then you can usually fill in a form using voice … how easy or hard that is depends on your device. On an iPhone (and an iPad) when you bring up the keyboard in a form, there’s an additional ‘microphone’ icon that you simply need to tap in order to speak your entry. If you are using an Android Samsung Galaxy phone, you can switch your entry from keyboard to voice by swiping down from the top of the screen and choosing Change Keyboard, and then choosing Google Voice … yes, that’s 3 steps :-(.

When it comes to using a PC or Mac, filling in a form usually relies on typing. Now that I am getting used to talking to electronic devices, I find myself looking for more ways that I can use my voice to control the device rather than having to type everything. Talking, even for me as a very fast touch-typist, is quicker than typing. Plus, speech control enables you to control your device when you need to be using it hands-free.

What about my web form?
In answer to the question posed by this blog article: yes! Voice input can be added to your web form, even when you are entering text on a PC or a Mac. To demonstrate, we’ve added a very simple voice entry capability to the enquiry form on the Contactpoint home page. Please note: this example only works in the Chrome web browser, and of course you must have a microphone on your PC or Mac in order to speak to fill out the form. To use the voice input:

  1. click or press on the microphone icon beside a field
  2. click to Allow access to the microphone (you will only need to do this the first time)
  3. talk to complete the field!

As you are speaking you will see that there’s a red recording icon pulsing in the browser tab. When you stop talking, the recording will also stop, and then what you said will appear in the box.

From a programming point of view, there are several ways to implement voice input in a web form. The example on the Contactpoint home page uses a very simple method involving JavaScript and webkitSpeechRecognition, which is Chrome’s vendor-prefixed version of the speech recognition interface in the Web Speech API. It gives the page access to the microphone (after the user has specifically allowed it) and then handles voice input very nicely. Google’s team has spent many years refining speech recognition, and this API gives you quick and free access to their powerful functionality.
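To give a feel for how little code the simple method needs, here is a minimal sketch. It is not the code running on the Contactpoint home page. The function name attachVoiceInput, the element ids and the "en-AU" language setting are assumptions made up for illustration; only the recognition properties and events (lang, interimResults, continuous, onresult, onerror, start) come from the browser API itself.

```javascript
// Minimal sketch: connect a microphone button to one form field.
// `attachVoiceInput`, `field` and `button` are hypothetical names for illustration.
function attachVoiceInput(field, button, Recognition) {
  if (!Recognition) {
    // No speech recognition in this browser: the field stays typing-only.
    return null;
  }

  const recognition = new Recognition();
  recognition.lang = "en-AU";          // assumed language; change to suit your users
  recognition.interimResults = false;  // only deliver the final transcript
  recognition.continuous = false;      // stop listening when the speaker pauses

  recognition.onresult = (event) => {
    // Take the most likely alternative of the first result.
    field.value = event.results[0][0].transcript;
  };

  recognition.onerror = (event) => {
    console.warn("Speech recognition error:", event.error);
  };

  // Start listening each time the microphone button is clicked.
  button.addEventListener("click", () => recognition.start());
  return recognition;
}

// Browser-only wiring; the element ids below are assumptions.
if (typeof window !== "undefined") {
  const Recognition = window.SpeechRecognition || window.webkitSpeechRecognition;
  const field = document.querySelector("#enquiry-name");
  const button = document.querySelector("#enquiry-name-mic");
  if (field && button) {
    attachVoiceInput(field, button, Recognition);
  }
}
```

Passing the recognition constructor in as a parameter keeps the browser check in one place, and lets the form fall back to plain typing when the browser has no speech support.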

Other JavaScript libraries have been developed to enable much more sophistication in the way you can use voice to interact with a web form. Annyang is a great example, whereby specific parts of your web form can have tailored voice interactions enabled so that whatever you say has context, e.g. choosing from a drop-down list in a form can take the allowed options into account, and match the voice input with one or more of those options. Due to the additional sophistication, there’s obviously more effort involved in using this library. Because Annyang builds on the browser’s own speech recognition support, voice commands only work in browsers that provide it; in other browsers the form simply keeps working by typing.
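The drop-down idea can be sketched without any library. The example below is a simplified illustration of matching a spoken phrase against a list of allowed options. It is not Annyang’s own code, and the function names and the sample options are assumptions.

```javascript
// Lower-case the text and drop punctuation so "Mobile Apps!" and "mobile apps" compare equal.
function normalise(text) {
  return text.toLowerCase().replace(/[^a-z0-9 ]/g, "").trim();
}

// Return every allowed option that plausibly matches what was said.
// `matchOptions` is a hypothetical helper, not part of any library.
function matchOptions(transcript, options) {
  const spoken = normalise(transcript);
  if (spoken === "") {
    return [];
  }

  // Prefer an exact match if there is one.
  const exact = options.filter((option) => normalise(option) === spoken);
  if (exact.length > 0) {
    return exact;
  }

  // Otherwise accept options that contain the phrase, or are contained in it.
  return options.filter((option) => {
    const candidate = normalise(option);
    return candidate !== "" &&
      (candidate.includes(spoken) || spoken.includes(candidate));
  });
}

// Sample options are made up for illustration.
const services = ["Web development", "Mobile apps", "Email marketing"];
const matches = matchOptions("mobile apps please", services);
```

If exactly one option comes back it can be selected straight away; if several match, the form could ask the user to choose between them.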

If you would like to improve the usability of your web forms by enabling speech entry, feel free to get in touch!

Handy Hints for voice entry of text:
If you speak your text message without including punctuation, paragraphs and the like, it can be a lot harder for the recipient to understand your message. But have no fear, the following list will have your text messages reading just as if you had typed them!
“full stop” – if you pause and then say “full stop” Google Now and Siri will type in a ‘.’
“exclamation mark” – if you say “exclamation mark” Google Now and Siri will type in a ‘!’
“question mark” – if you say “question mark” Google Now and Siri will type in a ‘?’
“new line” – if you pause and say “new line” Google Now and Siri will move the cursor down to the next line.
“comma” – if you pause and say “comma” Google Now and Siri will type in a ‘,’
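Where the speech recognition behind a web form returns these phrases as plain words rather than symbols, the same hints can be applied in your own code. The sketch below assumes that situation; the function name and the phrase table are illustrative, and it takes no account of cases where someone genuinely wants the words “full stop” to appear.

```javascript
// Spoken phrases from the hints above and the characters they stand for.
const SPOKEN_PUNCTUATION = [
  ["full stop", "."],
  ["exclamation mark", "!"],
  ["question mark", "?"],
  ["comma", ","],
  ["new line", "\n"],
];

// Replace spoken punctuation phrases in a transcript with the symbols themselves.
// `applySpokenPunctuation` is a hypothetical helper for illustration.
function applySpokenPunctuation(transcript) {
  let text = transcript;
  for (const [phrase, symbol] of SPOKEN_PUNCTUATION) {
    // Match the phrase as whole words and swallow the spaces around it.
    const pattern = new RegExp("\\s*\\b" + phrase + "\\b\\s*", "gi");
    // Symbols are followed by a space; a new line is not.
    const replacement = symbol === "\n" ? "\n" : symbol + " ";
    text = text.replace(pattern, replacement);
  }
  return text.trim();
}

const tidy = applySpokenPunctuation("hello comma how are you question mark");
```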
