Voice Recognition Security ID

Wanted: Adaptive Multi-Factor Authentication

The not-so-recent OPM data breach has resulted in some critical repercussions, not the least of which is the exposure of millions of government employees and contractors to identity theft. The breach has raised the ire of Congress with numerous hearings about what happened, why it happened, and what are “you” going to do about it. These hearings have not only been with OPM but with other agencies affected by the breach – namely our military departments and intelligence agencies. Besides the threat of identity theft, workers for the federal government may now be exposed to blackmail, become targets for phishing attempts, and risk reprisals in the field.

One of the least mentioned casualties of the breach is the leak of fingerprint files contained in the OPM databases. As a method to identify a person, fingerprints are often used as a means to implement multi-factor authentication (MFA). Now, it seems that this MFA approach may have some vulnerability for those affected by the breach. Agencies need to start scrambling for new adaptive multi-factor authentication techniques. So what are the alternatives? There are several other physiological biometrics that offer a possible replacement for fingerprints including:

  • Iris scans
  • Retina scans
  • Voice
  • Vein structure analysis
  • Facial recognition

Each of these approaches comes with its own pros and cons. But if I had to pick one, it would be using voice biometrics for authentication.

“All I have is a voice.” – W.H. Auden

While the other biometrics schemes necessitate a variety of extra equipment along with physical presence to capture the biometric, voice authentication can be conveniently delivered remotely over the omnipresent phone. As the number of enterprise mobile workers continues to grow, this convenience is essential. Given the use of the correct analytical techniques, a person’s voiceprint can be as unique as any other biometric characteristic, and has the added benefit of being less personally intrusive than subjecting the person to a retinal or fingerprint scan. Also, there is both a physiological biometric component (for example, voice tone and pitch) and a behavioral component (for example, accent) that together are very useful for biometric authentication.

Voice technology is applied in a variety of use cases, and it helps to distinguish three of them. Speech recognition systems, of which Siri is probably the best known, attempt to determine what a person is saying; the same technology is applied in voice-to-text systems such as Dragon. Speech recognition does not attempt to give any information as to the identity of the speaker. Voice identification systems attempt to identify the speaker, but not necessarily for purposes of authentication. For example, law enforcement agencies use voice identification systems when wiretapping a phone. Voice (or speech) authentication differs from both: it attempts to verify that the individual speaking is, in fact, who they claim to be. This is normally accomplished by comparing an individual’s voice with a previously recorded “voiceprint” sample of their speech.

All types of biometric systems have problems, such as false accepts and false rejects, that are not found in password or token authentication systems. This is because biometrics is based on pattern recognition rather than exact match. In general, a biometric system senses and digitizes a live sample. A feature extraction module then computes significant “landmarks” in this sample, and a decision is made based upon a “degree of match.” The system can therefore make errors, and the tradeoffs between the various error rates must be considered. Voice biometrics are sometimes criticized as being susceptible to false accepts and rejects due to voice changes caused by colds, or by loss of “data” due to the compression and restricted frequency ranges imposed by microphones and digital phones. However, studies have shown that it is the lower pitched, voiced phonemes, the ones that are most dependent on the physical structure of the vocal tract (and thus most relevant for authentication), that provide the greatest use in voice verification. These phonemes are least affected by factors such as coughs, colds or mouth injuries. Most phonemes also use only frequencies between 300Hz and 3400Hz, which is the range telephones are designed to transmit, so little of the information needed for a match is lost over the phone. Voiceprint collection and analysis techniques, such as feature analysis, can also be tuned to ensure higher accuracy for matches.
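To make the tradeoff between error rates concrete, here is a minimal Python sketch. The match scores are made-up illustrative values and the function name `error_rates` is my own; a real system would compute scores from extracted voice features, which this example does not attempt.

```python
from typing import Sequence, Tuple


def error_rates(
    genuine: Sequence[float], impostor: Sequence[float], threshold: float
) -> Tuple[float, float]:
    """Return (false_accept_rate, false_reject_rate) at one decision threshold.

    A score at or above the threshold is treated as an accepted match.
    """
    false_accepts = sum(1 for score in impostor if score >= threshold)
    false_rejects = sum(1 for score in genuine if score < threshold)
    return false_accepts / len(impostor), false_rejects / len(genuine)


# Hypothetical "degree of match" scores in [0, 1]; higher means a closer match.
genuine_scores = [0.91, 0.85, 0.78, 0.66, 0.95]   # true speaker attempts
impostor_scores = [0.30, 0.55, 0.72, 0.41, 0.20]  # other speakers' attempts

for t in (0.5, 0.7, 0.9):
    far, frr = error_rates(genuine_scores, impostor_scores, t)
    print(f"threshold={t:.1f}  FAR={far:.2f}  FRR={frr:.2f}")
```

With these sample scores, raising the threshold from 0.5 to 0.9 drops the false accept rate from 0.40 to 0.00 but raises the false reject rate from 0.00 to 0.60. Choosing the operating point is a policy decision about which error costs more.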

Morphing Biometrics for Resiliency and Privacy

Another problem with biometric authentication of any type is the re-issuance of identity tokens. The OPM issue with fingerprints is a case in point. For authentication based on physical possessions, e.g., keys and badges, a token can be easily cancelled and the user can be assigned a new one. Similarly, logical entities, such as user IDs and passwords, can be changed as often as required. Yet a user has only a limited number of biometrics: one face, ten fingers, one voice, and two eyes. You can’t exactly reissue one of these. Furthermore, a biometric authentication system uses private details of users, so there is an immediate privacy concern about misuse of this information. There are different approaches that can be applied to offset these issues when it comes to voice authentication. One adaptive method is called “cancellable biometrics,” which uses morphing techniques to ensure the privacy of enrolled users as well as provide a means for “reissuance” of biometric “tokens.” In the case of voice authentication, the voiceprint is digitized and divided into time segments for analysis. A morphing function is applied to re-sequence the different voice segments, and the resulting scrambled voiceprint is used as the stored version for queries and matching. Note that for voice scrambling, only minimal registration of the query voiceprint with the enrolled voiceprint is needed, such as aligning the onset time. The scrambled voiceprint can later be deprecated and a new scrambled voiceprint can be “reissued” if needed without another enrollment.
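The re-sequencing idea can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the voiceprint is treated as a plain list of samples, the hypothetical `scramble_segments` function derives the segment order from an integer key, and Python's `random.Random` stands in for the morphing function. It is not cryptographically secure and is not how any particular product does it.

```python
import random
from typing import List, Sequence


def scramble_segments(samples: Sequence[float], segment_len: int, key: int) -> List[float]:
    """Split samples into fixed-length time segments and reorder them.

    The order is a permutation derived from `key`, so the same key always
    yields the same scrambled template, and a new key "reissues" a new one.
    """
    segments = [
        list(samples[i:i + segment_len])
        for i in range(0, len(samples), segment_len)
    ]
    order = list(range(len(segments)))
    random.Random(key).shuffle(order)  # deterministic for a given key
    return [value for index in order for value in segments[index]]


# Stand-in for a digitized voiceprint: 12 samples, 3 segments of 4.
voiceprint = [float(i) for i in range(12)]

enrolled = scramble_segments(voiceprint, segment_len=4, key=2015)
query = scramble_segments(voiceprint, segment_len=4, key=2015)
print(enrolled == query)  # the same key scrambles a query the same way
```

To revoke a compromised template under this sketch, the stored copy is discarded and a new key is chosen. Matching then takes place between scrambled versions, so the original ordering never needs to be stored.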

Moving Forward With Voice

There are several companies that have offerings in the field of voice biometrics and voice authentication, and one of the clear leaders is Nuance. As the supplier for the Siri and Dragon speech recognition engines, Nuance has a large market presence in speech recognition. Nuance’s VocalPassword offering is also the world’s most widely deployed voice biometric solution. VocalPassword is an adaptive voice authentication offering. It provides flexibility in enrollment and use through three methods of voice authentication:

  • Text-Dependent Biometric Engine – this method requires speaking a fixed passphrase such as “My Voice is my Password” or a set of digits.
  • Text-independent Biometric Engine – this method operates on a live conversation.
  • Text-Prompted Biometric Engine – in this method, a random passphrase is used, typically a set of digits, letters or both.

VocalPassword can provide authentication layers for additional protection depending on how you set it up. For example, VocalPassword can be used for Knowledge-Based Verification by enrolling text-dependent voiceprints that contain the answers to verification questions. VocalPassword’s ASR add-on can be used to validate the speaker’s answers. This multi-factor authentication approach (what you are, what you know) instills greater confidence in voice authentication. VocalPassword also supports proactive methods to mitigate known threats, such as:

  • Liveness detection – Liveness detection significantly reduces the threat from recordings. Following text-dependent verification, this method uses text-independent voice biometrics technology to compare the voice sample captured during the text-dependent verification process with an additional sample captured by prompting the speaker to repeat a random or semi-random sentence.
  • Prompted password verification – Prompted verification requires the user to repeat a random phrase that is a subset of the speech atoms (digits/words) trained during enrollment. Prompted verification provides protection against interception and playback attacks, as each session uses a different subset of the trained speech atoms.
  • Playback detection – VocalPassword’s patented playback detection algorithm runs as part of the verification process and identifies audio segments that unnaturally match audio segments that were previously used for verification/enrollment.
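The prompted verification idea in particular is easy to sketch. The Python below is a generic illustration, not VocalPassword’s API. The names `make_prompt` and `transcript_matches` are hypothetical, and the recognized words are assumed to come from some speech recognition step that is not shown.

```python
import secrets
from typing import List, Sequence


def make_prompt(trained_atoms: Sequence[str], length: int) -> List[str]:
    """Pick a random subset of the enrolled speech atoms for this session."""
    rng = secrets.SystemRandom()  # OS-backed randomness, hard to predict
    return rng.sample(list(trained_atoms), length)


def transcript_matches(prompt: Sequence[str], recognized: Sequence[str]) -> bool:
    """Check that the speaker said exactly the prompted atoms, in order."""
    return [w.lower() for w in recognized] == [w.lower() for w in prompt]


# Hypothetical atoms trained during enrollment.
atoms = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]

prompt = make_prompt(atoms, length=4)
print("Please say:", " ".join(prompt))
```

Because each session draws a fresh subset, a recording of an earlier session is unlikely to contain the right words in the right order. The voice match itself would still be a separate check on top of this.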

VocalPassword also supports adaptive methods to ensure accuracy is maintained during enrollment and post-enrollment. By using new audio to update existing voice templates, VocalPassword allows each speaker to maintain an accurate voiceprint according to changing background noises and voice tones that shift with age. It also enables fully automated enhancement of voiceprints based on the analysis of failed authentication attempts by legitimate users.
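How VocalPassword updates its templates is not something I can show here, but the general idea of adapting a voiceprint with new audio can be sketched with a simple moving average. Everything in this Python example is an assumption for illustration: the voiceprint is a short list of feature values, and `update_template` and its `rate` parameter are hypothetical names.

```python
from typing import List, Sequence


def update_template(
    template: Sequence[float], new_features: Sequence[float], rate: float = 0.1
) -> List[float]:
    """Blend features from a newly verified sample into the stored template.

    A small `rate` lets the template drift slowly with the speaker's voice
    without letting any single sample dominate it.
    """
    if len(template) != len(new_features):
        raise ValueError("template and new sample must have the same length")
    return [(1 - rate) * old + rate * new for old, new in zip(template, new_features)]


# Hypothetical two-value feature vectors.
stored = [1.0, 2.0]
latest = [3.0, 4.0]
print(update_template(stored, latest, rate=0.5))
```

In a sketch like this, the update should only run on samples that have already passed verification. Otherwise an impostor could gradually pull the template toward their own voice.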

So you can see from this example of a voice authentication offering, voice biometrics can provide a strong approach to plug the MFA hole caused by the OPM breach. With the improvements in accuracy over the last 10 years, voice authentication has entered the mainstream as a verification technology. In fact, the voice biometrics industry is growing faster than ever. High-profile data breaches such as OPM, financial losses from fraud, customer experience optimization, mobile devices, and deployments by the world’s largest enterprises are driving voice authentication into the mainstream. The market is also demanding multi-factor authentication that combines voice biometrics with other factors. For example, some customers will want to geo-locate their users as an additional factor for voice authentication. This will necessitate other systems to integrate easily and securely with voice authentication systems. Tools such as Resilient Network Systems Trust Broker that can combine claims or assertions from different authentication systems come to mind as part of an overall solution. Cloud-based voice authentication and speech recognition systems will also play a larger role as third party identity providers begin to voice-enable their applications and provide MFA methods for mobile and other devices. The Internet of Things is also a new area where voice authentication approaches are beginning to penetrate, such as using voice authentication to access devices in cars and homes.

To conclude, I see voice authentication approaches helping to provide a few more nails in the coffin of passwords and userIDs at last. Let me know your experience with voice biometrics and adaptive ways that voice biometrics is being employed by your enterprise or product offering. Thanks for reading and check out my latest eBook on Active Cyber Defense if you haven’t already done so – you can find it here.