msm6679al-110 Oki Semiconductor, msm6679al-110 Datasheet - Page 20

Manufacturer Part Number: msm6679al-110

Description: SI/SD Voice Recognizer, Recorder/Player, and Speech

Manufacturer: Oki Semiconductor

Datasheet: 1.MSM6679AL-110.pdf (46 pages)


A typical target accuracy of 97% is achieved with a 3% $E_{RATE}$, composed of a 1.5% $E_{SUB}$ rate and a 3% $E_{REJ}$ rate.

SD Recognition

In SD recognition mode, the MSM6679AL-110 can be trained to recognize up to 61 words. The MSM6679AL-110 can support multiple speakers by switching vocabularies, but only one speaker’s vocabulary should be active at a time.

The end user enrolls a phrase in the MSM6679AL-110’s vocabulary by recording the phrase three or more times. The host microcontroller unit (MCU) controls the number of times each phrase is enrolled. Generally, each additional enrollment yields higher recognition accuracy. Pronouncing each phrase slightly differently during initial enrollment makes the word set more robust.

In addition to enrollment training, adaptive template updating can drive the accuracy toward 100%. The host MCU updates templates by first asking the speaker to confirm a recognized phrase with a “yes” or “no” response and then updating the template for the corresponding words. The use of name tags (see next paragraph) facilitates this process.
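
The confirm-then-update flow described above can be sketched on the host side as follows. This is only an illustration of the control logic: `confirm_and_update`, `ask_yes_no`, and `update_template` are hypothetical names, not MSM6679AL-110 commands, and leaving the template untouched on a "no" response is an assumption, since the text does not say what happens in that case.

```python
def confirm_and_update(recognized_word, ask_yes_no, update_template):
    """Update a template only after the speaker confirms the recognition.

    ask_yes_no and update_template are hypothetical host-side callbacks that
    stand in for whatever the host MCU actually does to prompt the speaker
    and to request a template update.
    """
    confirmed = ask_yes_no(f"Did you say {recognized_word!r}?")
    if confirmed:
        # Assumption: only confirmed recognitions feed back into the template.
        update_template(recognized_word)
    return confirmed


# Usage with stand-in callbacks: one confirmed result, one rejected result.
updated = []
confirm_and_update("Stop", lambda prompt: True, updated.append)
confirm_and_update("Top", lambda prompt: False, updated.append)
print(updated)  # ['Stop']
```

Passing the prompt and update steps in as callbacks keeps the sketch independent of any particular command interface.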

MSM6679AL-110 Voice Recognition Processor

The samples should be generated from a randomly ordered list, with each word spoken twice and with a dummy word at the beginning and end. There must be more than 2 seconds between samples for accurate data processing. To provide the audio fidelity required for high-quality recognition training, a DAT recorder, together with the microphone that will be used in the final application, is required. To ensure data integrity, data is submitted to Oki for initial screening after samples have been collected from the first 20 speakers. If the data is acceptable, the remaining collection may proceed.
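
As a rough illustration of the prompt list described above, the sketch below repeats each word twice, shuffles the result, and adds a dummy word at both ends. The function name, the `"dummy"` placeholder, and the `seed` parameter are assumptions made for this example; the datasheet does not prescribe a tool or a specific dummy word.

```python
import random


def build_prompt_list(words, dummy="dummy", repeats=2, seed=None):
    """Return a randomly ordered prompt list with a dummy word at each end.

    Each vocabulary word appears `repeats` times (twice by default).
    """
    rng = random.Random(seed)
    prompts = [word for word in words for _ in range(repeats)]
    rng.shuffle(prompts)
    # The dummy words bracket the list and are not part of the vocabulary.
    return [dummy] + prompts + [dummy]


prompts = build_prompt_list(["Stop", "Top", "Halt", "First"], seed=0)
print(len(prompts))  # 10
```

The list only fixes the order of the prompts. The pause of more than 2 seconds between samples still has to be observed during the recording itself.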

If substitution errors are possible, collecting spare words during the initial collection is recommended. For example, alternate words for “Stop” and “Top” could be “Halt” and “First.”

Collections should contain a wide variety of the background sound conditions that will exist during actual usage. For example, if the collection is for use in an automobile, conditions such as vehicle speed, road conditions, various window opening positions, heater or AC blower speeds, and radio volumes should be varied during the collection. The signal-to-noise ratio should be maintained at 20 dB.
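
One way to check the 20 dB figure during a collection is to compare RMS amplitudes of a speech segment and a background-only segment. The sketch below uses the standard amplitude-ratio definition of SNR. It assumes the two segments are available as separate lists of samples, and the helper names `rms` and `snr_db` are illustrative rather than taken from the datasheet.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech, noise):
    """Signal-to-noise ratio in dB: 20 * log10(rms_speech / rms_noise)."""
    return 20.0 * math.log10(rms(speech) / rms(noise))


# A speech segment with 10x the RMS amplitude of the noise gives 20 dB.
speech = [10.0, -10.0] * 4
noise = [1.0, -1.0] * 4
print(round(snr_db(speech, noise), 1))  # 20.0
```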

To achieve high accuracy rates, phrase selection, data collection, background initialization strategy, and control software need careful consideration. There are no published standards for recognition accuracy.

Oki defines accuracy by:

$$\text{Accuracy} = 100\% - E_{RATE}, \qquad E_{RATE} = E_{SUB} + \tfrac{1}{2} E_{REJ}$$

with the following definitions:

Parameters for Recognition Accuracy

| Name | Symbol | Condition |
|---|---|---|
| Substitution Error | SUB | Most critical type of error, e.g., say "Five", recognize "Nine" |
| Rejection Error | REJ | Word not recognized; opportunity for operator to repeat |
| Gap Error | GAP | Word spoken before recognizer is ready |
| Time-Out Error | TME | Word length is too long |
| Spurious Response Error | SPU | Sound or invalid word classified as a valid word (e.g., dropping the handset or speaking the wrong word) |
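
As a quick arithmetic check of the definition (accuracy = 100% - E_RATE, with E_RATE = E_SUB + 1/2 E_REJ), the sketch below plugs in the typical figures quoted on this page: a 1.5% substitution rate and a 3% rejection rate. The function names are illustrative only.

```python
def error_rate(e_sub, e_rej):
    """Overall error rate in percent: E_RATE = E_SUB + 1/2 * E_REJ."""
    return e_sub + 0.5 * e_rej


def accuracy(e_sub, e_rej):
    """Recognition accuracy in percent: 100% - E_RATE."""
    return 100.0 - error_rate(e_sub, e_rej)


# Typical target: 1.5% substitution errors and 3% rejection errors.
print(error_rate(1.5, 3.0))  # 3.0
print(accuracy(1.5, 3.0))    # 97.0
```

Rejections count at half weight in this definition, which is consistent with the table's note that a rejection gives the operator a chance to repeat the word.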
