KR970060043A


Info

Publication number: KR970060043A
Application number: KR1019960000108A
Authority: KR
Inventors: 김민성
Original assignee: 구자홍; Lg 전자주식회사
Priority date: 1996-01-05
Filing date: 1996-01-05
Publication date: 1997-08-12
Anticipated expiration: 2016-01-05
Also published as: KR0176882B1

Abstract

Translated from Korean

The present invention relates to a search method for connected speech recognition. The conventional search method traces every sentence path that the grammar can generate, selects the path with the highest similarity, and recognizes the input speech as the word sequence that was compared along that path. Because every path is traced, the search takes a long time.
To solve this problem, the present invention provides a search method for connected speech recognition comprising three steps. In the first step, when T frames of speech are input, features are extracted, an approximate word sequence is found, and an average similarity (Psc) is obtained by running the recognition process on that approximate word sequence. In the second step, the Viterbi recognition process runs from the first frame up to a predetermined frame (S), the matching similarity of each path searched so far is compared with the average similarity (Psc), and paths whose similarity is lower than Psc are removed. In the third step, once repetition of the second step has carried the Viterbi recognition process to the last frame of the input speech, the path with the maximum similarity among the paths that were not removed is found, and a word sequence backtracking process recovers the word sequence compared along that path and outputs it as the recognized sentence. In effect, an approximate word sequence is selected by summing only the quantization errors of the input, and any path that matches with lower similarity than this approximate word sequence is removed during the search, which reduces recognition time.
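The pruning idea in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: it assumes that similarities are log scores, that HMM states stand in for grammar paths, and that "lower than Psc" means the accumulated score divided by the number of frames processed falls below Psc. The names `pruned_viterbi`, `emit`, `trans`, `psc`, and `block` are hypothetical.

```python
import math

def pruned_viterbi(emit, trans, psc, block):
    """Viterbi search that drops paths scoring below a threshold every `block` frames.

    emit[t][s]  : log similarity of frame t to state s (assumed representation)
    trans[p][s] : log transition score from state p to state s
    psc         : per-frame average similarity used as the pruning threshold
    block       : number of frames S between pruning checks
    Returns (state_path, total_score), or None if every path was pruned.
    """
    neg_inf = -math.inf
    n_frames, n_states = len(emit), len(emit[0])
    score = list(emit[0])
    back = [[-1] * n_states]  # back[t][s] = best predecessor of state s at frame t

    for t in range(1, n_frames):
        new_score = [neg_inf] * n_states
        ptr = [-1] * n_states
        for s in range(n_states):
            for p in range(n_states):
                cand = score[p] + trans[p][s] + emit[t][s]
                if cand > new_score[s]:
                    new_score[s], ptr[s] = cand, p
        score = new_score
        back.append(ptr)
        # Pruning check: remove paths whose per-frame average is below Psc.
        if (t + 1) % block == 0:
            score = [v if v / (t + 1) >= psc else neg_inf for v in score]

    best = max(range(n_states), key=lambda s: score[s])
    if score[best] == neg_inf:
        return None
    # Backtrack from the best surviving end state to recover the path.
    path = [best]
    for t in range(n_frames - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, score[best]
```

In this sketch a threshold that is too strict removes every path and the function returns `None`, so the quality of the approximate word sequence that supplies Psc determines how aggressively the search can be pruned.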

Description

Translated from Korean

Search method for connected speech recognition

This record is a publication of the essential parts only, so the full text is not included.

FIG. 3 is a schematic flowchart explaining the search operation of connected speech recognition according to the present invention.

FIG. 4 is a structural diagram explaining the path removal operation in the present invention.

Claims

Translated from Korean

A search method for connected speech recognition, comprising: a first step of, when T frames of speech are input, extracting features, finding an approximate word sequence, and obtaining an average similarity (Psc) by running the recognition process on that approximate word sequence; a second step of running the Viterbi recognition process from the first frame up to a predetermined frame (S), comparing the matching similarity of each path searched so far with the average similarity (Psc), and removing paths whose similarity is lower than the average similarity (Psc); and a third step of, when repetition of the second step has completed the Viterbi recognition process up to the last frame of the input speech, finding the path with the maximum similarity among the paths that were not removed, and finding, through a word sequence backtracking process, the word sequence compared along that path and outputting it as the recognized sentence.

The method of claim 1, wherein the approximate word sequence recognition comprises: a first step of obtaining a vector quantizer from training speech through a quantizer design process, obtaining the N most frequently occurring codewords for each word, and storing the minimum and maximum speech length for each word; and a second step of, when speech is input in that state, first extracting and quantizing features, and then selecting a word sequence that approximately matches the input speech from the quantized codewords of the input and the errors incurred when quantizing to each codeword.

The method of claim 1, wherein the word matching process comprises: a first step of, when speech is input, summing the quantization errors of the codewords that frequently occur for a given word over a segment that begins at the start point of the input and spans between that word's minimum and maximum length, and dividing the sum by the segment length to obtain the similarity for that word; and a second step of performing the first step for all words and selecting the word with the minimum accumulated error, and repeating this selection up to the end of the input to obtain the approximate word sequence for the entire speech.
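The approximate word matching described in the last two claims can be sketched as a greedy segmentation. This is an illustration under several assumptions that the record does not confirm: the error of a frame against a word is taken as the smallest squared Euclidean distance to any of that word's N frequent codewords, a lower average error is treated as a higher similarity, and ties are broken in favor of the longer segment. The names `frame_error`, `approximate_word_sequence`, and `word_models` are hypothetical.

```python
def frame_error(frame, codewords):
    """Smallest squared distance from one feature vector to a word's codewords."""
    return min(sum((a - b) ** 2 for a, b in zip(frame, c)) for c in codewords)

def approximate_word_sequence(frames, word_models):
    """Greedily segment `frames` into words by minimum average quantization error.

    frames      : list of feature vectors (tuples of floats)
    word_models : {word: (codewords, min_len, max_len)}, where codewords are the
                  word's N frequent codewords and the lengths are in frames
    """
    start, words = 0, []
    while start < len(frames):
        best = None  # (average error, -length, word); smaller is better
        for word, (codewords, min_len, max_len) in word_models.items():
            total = 0.0
            for length in range(1, max_len + 1):
                end = start + length
                if end > len(frames):
                    break
                # Accumulate the error of the newest frame in the segment.
                total += frame_error(frames[end - 1], codewords)
                if length >= min_len:
                    cand = (total / length, -length, word)
                    if best is None or cand < best:
                        best = cand
        if best is None:
            break  # remaining frames are shorter than every word's minimum length
        words.append(best[2])
        start += -best[1]  # advance past the chosen segment
    return words
```

For example, three frames near the codewords of a hypothetical word `"a"` followed by three frames near those of `"b"` would be segmented into `["a", "b"]` when both words allow lengths of 2 to 4 frames. A greedy choice like this can commit to a poor early segment, which is acceptable here because the result only supplies a pruning threshold for the full Viterbi search.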

※ Note: This publication is made according to the contents of the original application.