JP2001228891A


Info

Publication number: JP2001228891A
Application number: JP2000038701A
Authority: JP
Inventors: Keisuke Watanabe (渡邉 圭輔); Yasushi Ishikawa (石川 泰)
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2000-02-16
Filing date: 2000-02-16
Publication date: 2001-08-24

Abstract

(57) [Abstract]

[Problem] To obtain a spoken dialogue apparatus that can maintain a dialogue with the user by identifying the category of an unknown word from dialogue context information and executing an appropriate dialogue action according to that category.

[Solution] A speech recognition unit 1 performs speech recognition and unknown-word detection on input speech using recognition vocabulary/syntax knowledge. When the recognition result contains an unknown word, an unknown word category estimation unit 5 estimates the category of the unknown word from the rules defined in system operation determination knowledge and from the dialogue context information stored in a dialogue context information storage unit 4. A system operation determination unit 6 then refers to the system operation determination knowledge to determine a system operation from the recognition result and the estimated category, and a system operation execution unit 7 executes that operation.

Description

Translated from Japanese

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a spoken dialogue apparatus used for natural-language man-machine interfaces.

[0002]

[Description of the Related Art] Spoken dialogue apparatuses, which let a user obtain needed information through spoken dialogue, have become increasingly important in recent years. Such an apparatus requires the vocabulary targeted by speech recognition to be designed in advance. However carefully the vocabulary is designed, it cannot cover every proper noun and the like, so there is always a possibility that the user will input a word that is not in the recognition vocabulary (hereinafter called an unknown word). It is therefore important that the apparatus maintain the dialogue with the user, and not impair the user's convenience, even when such an unknown word is input.

[0003] For input that contains an unknown word, methods have been proposed that detect the unknown word using an unknown word model. A method has also been proposed that identifies the semantic category of the detected unknown word from the other parts of the input speech that contains it, and uses that information to carry out the dialogue. FIG. 17 is a flowchart showing the flow of operation in a conventional spoken dialogue apparatus disclosed in, for example, Japanese Patent Application Laid-Open No. 7-5891.

[0004] When processing starts, step ST1 first predicts the vocabulary and syntax that the user will utter first. When the user inputs speech in step ST2, step ST3 performs speech recognition on that speech. Step ST4 then checks whether the input speech contains an unknown word and, if it does, detects the position of the unknown-word portion in the speech. Step ST5 then extracts the user's utterance intention and executes the processing the user intended, such as a database search. If an unknown word was detected in step ST4, step ST6 uses the result of the dialogue processing of step ST5 to decide a policy for resolving the unknown word.

[0005] Once the unknown word resolution policy has been decided in step ST6, processing proceeds to step ST7, which changes the recognition vocabulary and syntax using the result of the dialogue processing of step ST5. Step ST8 then determines whether the detected unknown-word portion should be re-evaluated. If re-evaluation is needed, processing branches to step ST9, where the unknown-word portion is recognized again using the recognition vocabulary and syntax changed in step ST7. If re-evaluation is not needed, processing branches to step ST10, where the content of the response to the user is computed from the results of steps ST5 and ST6 and presented to the user by screen display, speech synthesis, or the like.

[0006] In extracting the user's utterance intention during the dialogue processing of step ST5, the semantic category of the unknown word is identified from the other parts of the input speech that contains it. For example, if the recognition result of step ST3 is "I want to know about hot springs in (unknown word)", the fact that "in" (にある) is a function word indicating a place yields the information that the user is referring to a place, but that the place is an unknown word. The category of an unknown word is also identified by using "(unknown word) + (word indicating a category)" as a model of the unknown word. For example, if "(unknown word) hot spring" is registered in the vocabulary as a single word, the category of the unknown word can be identified as "hot spring" when the recognition result of step ST3 is "I want to know about (unknown word) hot spring".
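The two in-utterance cues described in this paragraph can be sketched roughly as follows. This is only an illustration under stated assumptions, not the method of the cited publication: the `<UNK>` marker, the romanized tokens, the two lookup tables, and the function name `category_from_utterance` are all hypothetical and introduced here.

```python
from typing import List, Optional

UNK = "<UNK>"  # assumed marker for an unknown word in the recognizer output

# Hypothetical cue tables built only from the two examples in the text.
CATEGORY_WORDS = {"onsen": "hot spring"}  # "(unknown word) + (category word)"
PLACE_MARKERS = {"ni-aru": "place"}       # function word that marks a place


def category_from_utterance(tokens: List[str]) -> Optional[str]:
    """Guess the unknown word's category from the word that follows it."""
    for i, token in enumerate(tokens[:-1]):
        if token != UNK:
            continue
        following = tokens[i + 1]
        if following in CATEGORY_WORDS:
            return CATEGORY_WORDS[following]
        if following in PLACE_MARKERS:
            return PLACE_MARKERS[following]
    return None  # the utterance itself carries no usable cue
```

An utterance in which the unknown word is followed by neither kind of cue returns `None`, which is the situation the invention sets out to handle.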

[0007] Other documents describing matters related to such conventional spoken dialogue apparatuses include Japanese Patent Application Laid-Open No. 3-263260, which estimates the attributes of proper nouns from contextual co-occurrence relations, and Japanese Patent Application Laid-Open No. 10-319989, which concerns the creation of a statistical language model used for speech recognition.

[0008]

[Problems to be Solved by the Invention] Because the conventional spoken dialogue apparatus is configured as described above, it cannot identify the category of an unknown word when the information needed to do so is absent from the other parts of the input speech. For example, even if "(unknown word) hotel" is registered as an unknown word model, when the hotel name "New Grand" is not in the recognition vocabulary, the recognition result for the utterance "Is New Grand available?" is "Is (unknown word) available?". In this case the category of the unknown word could be "hotel name", "date and time", "room type", and so on, and it cannot be identified.

[0009] The present invention has been made to solve the above problem. Its object is to obtain a spoken dialogue apparatus that can maintain a dialogue with the user by identifying the category of an unknown word from dialogue context information and executing an appropriate dialogue action according to the identification result.

[0010]

[Means for Solving the Problems] A spoken dialogue apparatus according to the present invention comprises a speech recognition unit, a recognition vocabulary/syntax knowledge storage unit, a system operation determination knowledge storage unit, a dialogue context information storage unit, an unknown word category estimation unit, a system operation determination unit, and a system operation execution unit. The speech recognition unit performs speech recognition and unknown-word detection on input speech using the recognition vocabulary/syntax knowledge held in the recognition vocabulary/syntax knowledge storage unit. When the recognition result contains an unknown word, the unknown word category estimation unit estimates the category of the unknown word from the rules defined in the system operation determination knowledge held in the system operation determination knowledge storage unit and from the dialogue context information stored in the dialogue context information storage unit. The system operation determination unit refers to the system operation determination knowledge to determine a system operation from the recognition result of the speech recognition unit and the category estimated by the unknown word category estimation unit, and the system operation execution unit executes that operation.

[0011] In the spoken dialogue apparatus according to the present invention, the system operation determination knowledge specifies that, among the parameters required for a system operation, those that have never been assigned a value since the start of the dialogue are the category candidates for the unknown word.

[0012] In the spoken dialogue apparatus according to the present invention, the system operation determination knowledge specifies that, among the parameters required for a system operation, those that have already been assigned a value are the category candidates for the unknown word.

[0013] In the spoken dialogue apparatus according to the present invention, the system operation determination knowledge specifies that, among the parameters required for a system operation that have already been assigned a value, the parameter whose value was assigned most recently is the category candidate for the unknown word. To provide the time information this requires, the dialogue context information storage unit also stores the time at which its contents were updated.
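The three candidate-selection rules of paragraphs [0011] to [0013] differ only in how they filter the parameters required by a system operation. A minimal sketch is given below, assuming a simple per-parameter store of the latest value and its update time; the class `DialogueContext`, the three function names, and the use of floating-point timestamps are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class DialogueContext:
    """Parameter assignments since the dialogue started (assumed layout)."""
    # parameter name -> (value, time of the last update)
    assigned: Dict[str, Tuple[str, float]] = field(default_factory=dict)

    def assign(self, name: str, value: str, timestamp: float) -> None:
        self.assigned[name] = (value, timestamp)


def never_assigned(required: List[str], ctx: DialogueContext) -> List[str]:
    """Rule of [0011]: required parameters that have had no value so far."""
    return [p for p in required if p not in ctx.assigned]


def already_assigned(required: List[str], ctx: DialogueContext) -> List[str]:
    """Rule of [0012]: required parameters that already hold a value."""
    return [p for p in required if p in ctx.assigned]


def latest_assigned(required: List[str], ctx: DialogueContext) -> Optional[str]:
    """Rule of [0013]: the assigned parameter with the most recent update."""
    held = already_assigned(required, ctx)
    if not held:
        return None
    return max(held, key=lambda p: ctx.assigned[p][1])
```

Only the third rule needs the stored update time, which is why the text adds that requirement to the dialogue context information storage unit for that variant alone.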

[0014]

[Embodiments of the Invention] An embodiment of the present invention is described below. Embodiment 1. FIG. 1 is a block diagram showing the configuration of a spoken dialogue apparatus according to Embodiment 1 of the present invention. In the figure, reference numeral 1 denotes a speech recognition unit that performs speech recognition and unknown-word detection on speech input by the user, using the recognition vocabulary/syntax knowledge described below, and that outputs a recognition result containing the unknown word when one is detected. Reference numeral 2 denotes a recognition vocabulary/syntax knowledge storage unit that holds the recognition vocabulary/syntax knowledge, which defines the vocabulary and syntax to be recognized by the speech recognition unit 1. Reference numeral 3 denotes a system operation determination knowledge storage unit that holds system operation determination knowledge defining: the system operations that can be taken for a recognition result of the speech recognition unit 1; for each system operation, a rule that designates parameters as category candidates for an unknown word on the basis of the dialogue context information; and, for each system operation, the system operation newly determined from the unknown word category. In this embodiment, the parameters required for a system operation that have never been assigned a value since the start of the dialogue are designated as the category candidates for the unknown word.

[0015] Reference numeral 4 denotes a dialogue context information storage unit that stores the history of recognition results from the speech recognition unit 1, together with the history of the system operations executed by the system operation execution unit described below and of the parameter values used to execute them. Reference numeral 5 denotes an unknown word category estimation unit that, when the recognition result of the speech recognition unit 1 contains an unknown word, estimates and outputs the category of the unknown word from the rules defined in the system operation determination knowledge storage unit 3 and the dialogue context information stored in the dialogue context information storage unit 4. Reference numeral 6 denotes a system operation determination unit that determines a system operation by referring to the system operation determination knowledge storage unit 3, based on the recognition result of the speech recognition unit 1 and, when that result contains an unknown word, the category estimated by the unknown word category estimation unit 5. Reference numeral 7 denotes a system operation execution unit that executes the system operation determined by the system operation determination unit 6, for example responding to the user or searching a database.

[0016] The operation is described next, taking as a concrete example the case where the spoken dialogue apparatus is used as a hotel reservation spoken dialogue apparatus. With this apparatus, the user converses by voice to input conditions such as location, price, and date in order to search for a desired hotel, and then inputs the items needed for a reservation, such as the hotel name and date, to reserve the chosen hotel.

[0017] FIG. 2 is a transition diagram showing an example of the recognition vocabulary/syntax knowledge defined in the recognition vocabulary/syntax knowledge storage unit 2; it represents the sentences that the speech recognition unit 1 can accept as a network. The sentences generated by paths from node S to node E are those the speech recognition unit 1 can accept. The arcs of the network are assigned parameters for system operations, such as <redundant word>, <search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, <number of nights>, and <intention>. FIG. 3 is an explanatory diagram showing an example of the vocabulary defined for each of these parameters: <redundant word> has {hello, ah, um, ...}, <hotel name> has {Kamakura Prince, Tsurugaoka Kaikan, ...}, <date and time> has {today, tomorrow, January 1, ...}, <room type> has {1 room, 2 rooms, ...}, and so on. An arc with no parameter assigned means an unconditional transition from node to node without a corresponding word. Thus, for example, "Um, I'd like to reserve the Kamakura Prince Hotel" and "Is a single available for one night tomorrow?" are sentences the speech recognition unit 1 can accept.
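The per-parameter vocabulary of FIG. 3 can be pictured as a mapping from parameter to word set. The sketch below is a simplification under stated assumptions: the entries are English renderings of the examples quoted in the text, the names `VOCABULARY`, `parameters_for`, and `is_out_of_vocabulary` are hypothetical, and a plain lexical lookup stands in for the unknown word model, which the apparatus applies during speech recognition rather than on text.

```python
from typing import Dict, List, Set

# Vocabulary per parameter, limited to examples quoted from FIG. 3.
VOCABULARY: Dict[str, Set[str]] = {
    "redundant_word": {"hello", "ah", "um"},
    "hotel_name": {"Kamakura Prince", "Tsurugaoka Kaikan"},
    "date": {"today", "tomorrow", "January 1"},
}


def parameters_for(word: str) -> List[str]:
    """Return every parameter whose vocabulary contains the word, sorted."""
    return sorted(name for name, words in VOCABULARY.items() if word in words)


def is_out_of_vocabulary(word: str) -> bool:
    """True when no parameter's vocabulary lists the word."""
    return not parameters_for(word)
```

A hotel name such as "New Grand" that is missing from every word set is out of vocabulary, so nothing in the vocabulary itself says which parameter it belongs to.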

[0018] FIG. 4 is an explanatory diagram showing an example of a dialogue between the hotel reservation spoken dialogue apparatus and a user; the operation is described below using this example. In response to access by the user, the dialogue starts with the response from the apparatus (hereinafter called a system response) S11, "Hello, this is the hotel reservation center." Having received system response S11, the user inputs the first utterance U11, "I'm looking for one twin room for tomorrow." Utterance U11 is input to the speech recognition unit 1, which performs speech recognition on it and outputs the parameters defined in the syntax paired with the recognition result for each parameter. For example, as the recognition result for utterance U11 it outputs "<date and time> = {tomorrow}, <room type> = {twin}, <number of rooms> = {1 room}, <intention> = {I'm looking for}".

[0019] The operation of the system operation determination unit 6 is described next. FIGS. 5, 6, and 7 are explanatory diagrams showing an example of the system operation determination knowledge defined in the system operation determination knowledge storage unit 3. FIG. 5 defines the system operations that can be taken for a recognition result of the speech recognition unit 1; FIG. 6 defines, for each system operation, the rule for identifying the category of an unknown word from the dialogue context information; and FIG. 7 defines, for each system operation, the system operation newly determined from the unknown word category.

[0020] When "<date and time> = {tomorrow}, <room type> = {twin}, <number of rooms> = {1 room}, <intention> = {I'm looking for}" is input as the recognition result of the speech recognition unit 1 for utterance U11, the system operation determination unit 6 refers to the system operation determination knowledge of FIG. 5 defined in the system operation determination knowledge storage unit 3 and obtains the system operations "search" and "condition question" for the intention "I'm looking for". A condition is defined on the parameters of each system operation. At the time utterance U11 is input, the parameter condition for "search", namely "the <search condition> parameter has a value", is not satisfied, so the unit selects and outputs the system operation "condition question".
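The selection between "search" and "condition question" can be sketched as an ordered list of operations, each guarded by a condition on the parameters. This is one possible encoding of the FIG. 5 knowledge, not the patent's own representation: the identifiers are hypothetical, and the always-true condition on `condition_question` is an assumption, since the text states only the condition for "search".

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, str]
Rule = Tuple[str, Callable[[Params], bool]]

# intention -> candidate operations in priority order, each with its condition
OPERATION_RULES: Dict[str, List[Rule]] = {
    "looking_for": [
        ("search", lambda p: "search_condition" in p),
        ("condition_question", lambda p: True),  # assumed fallback
    ],
}


def decide_operation(intention: str, params: Params) -> str:
    """Return the first operation whose parameter condition is satisfied."""
    for operation, condition in OPERATION_RULES[intention]:
        if condition(params):
            return operation
    raise LookupError(f"no operation applies for intention {intention!r}")
```

With the parameters recognized from utterance U11 (date, room type, number of rooms) there is no search condition yet, so the sketch falls through to `condition_question`, matching the behavior described above.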

[0021] On receiving this, the system operation execution unit 7 executes the system operation "condition question" and outputs to the user, as speech, the system response S12 shown in FIG. 4, "Do you have any preferences as to price, location, and so on?"

[0022] For utterance U11 and system response S12, the dialogue context information storage unit 4 stores the history of recognition results from the speech recognition unit 1, and the history of the system operations executed by the system operation execution unit 7 and of the parameter values used to execute them. FIG. 8 is an explanatory diagram showing an example of the stored state of the recognition result history and of the system operation and system operation parameter history. In this case, the recognition result history holds <date and time> = {tomorrow}, <room type> = {twin}, <number of rooms> = {1 room}, <intention> = {I'm looking for}, and the system operation and system operation parameter history holds <system operation> = {greeting} followed by <system operation> = {condition question}, <question parameter> = {<price>, <location>}.
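The stored state of FIG. 8 amounts to two append-only histories. The sketch below shows one way to hold them, under stated assumptions: the class `ContextStore`, its method names, and the dictionary layout are hypothetical, and the entries are English renderings of the values quoted in the text.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class ContextStore:
    """Recognition results and executed operations, in order of occurrence."""
    recognition_history: List[Dict[str, str]] = field(default_factory=list)
    operation_history: List[Dict[str, Any]] = field(default_factory=list)

    def record_recognition(self, result: Dict[str, str]) -> None:
        self.recognition_history.append(dict(result))

    def record_operation(self, operation: str, **params: Any) -> None:
        self.operation_history.append({"operation": operation, **params})

    def assigned_parameters(self) -> Set[str]:
        """Parameters the user has given a value so far (intention excluded)."""
        return {
            name
            for result in self.recognition_history
            for name in result
            if name != "intention"
        }


# State after system response S11, utterance U11, and system response S12.
store = ContextStore()
store.record_operation("greeting")
store.record_recognition(
    {"date": "tomorrow", "room_type": "twin", "rooms": "1 room",
     "intention": "looking_for"}
)
store.record_operation("condition_question",
                       question_parameters=["price", "location"])
```

At this point `store.assigned_parameters()` contains the date, room type, and number of rooms but no hotel name.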

[0023] Next, in response to system response S12, "Do you have any preferences as to price, location, and so on?", the user inputs utterance U12, "Let me see, is New Grand available?", which contains the unknown word "New Grand". The speech recognition unit 1 performs speech recognition on utterance U12, detects "New Grand" as an unknown word by means of the unknown word model, and recognizes the input as "Let me see, is <<unknown word>> available?". From the structure of the network shown in FIG. 2, the unknown word could belong to any of six categories: <search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, and <number of nights>. The unit therefore outputs, as the recognition result containing the unknown word, "<<unknown word>> = {<search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, <number of nights>}, <intention> = {is available?}".

[0024] When the recognition result containing the unknown word, "<<unknown word>> = {<search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, <number of nights>}, <intention> = {is available?}", is input from the speech recognition unit 1, the unknown word category estimation unit 5 identifies the category of the unknown word as one of these six. It first refers to the system operation determination knowledge shown in FIG. 5, set in the system operation determination knowledge storage unit 3, and obtains the system operation "vacancy check" for the intention "is available?". From the conditions on the parameters of "vacancy check", it then finds that the parameters essential to this system operation are <hotel name> and <date and time>.

[0025] Next, by referring to the system operation determination knowledge shown in FIG. 6, set in the system operation determination knowledge storage unit 3, the unit obtains the unknown word category identification rule for "vacancy check": "the parameters among <hotel name> and <date and time> that have never been assigned a value since the start of the dialogue". By then referring to the recognition result history shown in FIG. 8, stored in the dialogue context information storage unit 4, it finds that the parameter never assigned a value since the start of the dialogue is <hotel name>. The unknown word category estimation unit 5 therefore identifies the category of the unknown word, out of the six candidates, as <hotel name> and outputs it.
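The estimation in this worked example can be traced in a few lines: take the parameters required by the operation, keep those the grammar allows for the unknown word, and drop those that already appear in the recognition history. The sketch below is only an illustration of that reasoning; the identifiers and the dictionary encoding of the FIG. 5, FIG. 6, and FIG. 8 contents are assumptions made here.

```python
from typing import Dict, List

# Categories the FIG. 2 network allows for the unknown word in utterance U12.
GRAMMAR_CANDIDATES = ["search_condition", "hotel_name", "date",
                      "room_type", "rooms", "nights"]

# Parameters essential to each system operation (only the case in the text).
REQUIRED_PARAMETERS: Dict[str, List[str]] = {
    "vacancy_check": ["hotel_name", "date"],
}


def estimate_category(operation: str,
                      candidates: List[str],
                      history: List[Dict[str, str]]) -> List[str]:
    """Required parameters that are candidates and have never had a value."""
    assigned = {name for result in history for name in result}
    return [p for p in REQUIRED_PARAMETERS[operation]
            if p in candidates and p not in assigned]


# Recognition history after utterance U11 (the FIG. 8 state).
history = [{"date": "tomorrow", "room_type": "twin", "rooms": "1 room",
            "intention": "looking_for"}]
print(estimate_category("vacancy_check", GRAMMAR_CANDIDATES, history))
```

The date was already supplied in U11, so the only required parameter left without a value is the hotel name, which becomes the estimated category.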

[0026] When the unknown word category <hotel name> is input from the unknown word category estimation unit 5, the system operation determination unit 6 refers to the system operation determination knowledge shown in FIG. 7, set in the system operation determination knowledge storage unit 3, determines the new system operation "out-of-data hotel response" for the system operation "vacancy check" as the system operation to execute, and outputs it.

[0027] When the system operation "out-of-data hotel response" is input from the system operation determination unit 6, the system operation execution unit 7 outputs to the user the system response S13, "We are sorry, but we do not handle that hotel."
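The last two steps, choosing a new system operation from the estimated category and producing the response, can be sketched as two lookups. The table contents below cover only the single case described in the text; the identifiers and the English wording of the response are illustrative assumptions, and the remaining entries of FIG. 7 are not reproduced.

```python
from typing import Dict, Tuple

# (original operation, estimated unknown word category) -> new operation
FOLLOW_UP: Dict[Tuple[str, str], str] = {
    ("vacancy_check", "hotel_name"): "out_of_data_hotel_response",
}

# new operation -> text output to the user (English rendering of S13)
RESPONSES: Dict[str, str] = {
    "out_of_data_hotel_response":
        "We are sorry, but we do not handle that hotel.",
}


def respond_to_unknown(operation: str, category: str) -> str:
    """Look up the follow-up operation and return its response text."""
    new_operation = FOLLOW_UP[(operation, category)]
    return RESPONSES[new_operation]
```

Because the response depends on the category, an unknown word estimated to be a date rather than a hotel name would need its own table entry and could lead to a different follow-up operation.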

[0028] As described above, according to Embodiment 1, the dialogue context information storage unit 4 stores the history of recognition results, of system operations, and of the parameters used in system operations since the start of the dialogue; the speech recognition unit 1 detects the unknown word in input that contains one; and the unknown word category estimation unit 5 refers to the dialogue context information stored in the dialogue context information storage unit 4 and identifies, as the category of the unknown word, the parameter required for the system operation that has never been assigned a value since the start of the dialogue. Consequently, even when the information needed to identify the category of the unknown word is absent from the other parts of the input speech, the category can be identified from the dialogue context, and executing an appropriate dialogue action according to the identification result makes it possible to maintain the dialogue with the user.

[0029] Embodiment 2. Embodiment 2 of the present invention is described next. The spoken dialogue apparatus according to Embodiment 2 differs from Embodiment 1 in the definition of the system operation determination knowledge and in the operation of the system operation determination unit 6; it is otherwise the same as Embodiment 1. The operation of the hotel reservation spoken dialogue apparatus whose configuration is shown in FIG. 1 is described below, focusing on the operation of the system operation determination unit 6 using the system operation determination knowledge defined in the system operation determination knowledge storage unit 3. FIG. 9 is an explanatory diagram showing an example of a dialogue between this hotel reservation spoken dialogue apparatus and a user; the operation is described below following this example.

[0030] FIGS. 10, 11, and 12 are explanatory diagrams showing an example of the system operation determination knowledge defined in the system operation determination knowledge storage unit 3. FIG. 10 defines the system operations that can be taken for a recognition result; FIG. 11 defines, for each system operation, the rule for identifying the category of an unknown word from the dialogue context information; and FIG. 12 defines, for each system operation, the system operation newly determined from the unknown word category.

[0031] In this case too, as in Embodiment 1, the dialogue starts with system response S21 from the hotel reservation spoken dialogue apparatus, "Hello, this is the hotel reservation center." Because the user's utterances U21 and U22 contain no unknown words, the apparatus operates as in Embodiment 1 and the dialogue proceeds until system response S23 is output, at which point the state of the dialogue context information storage unit 4 is as shown in FIG. 13.

[0032] FIG. 13 is an explanatory diagram showing an example of the speech recognition result history and the history of system operations and system operation parameters recorded in the dialogue context information storage unit 4. In this case, the speech recognition result history stores <intention> = {I would like to make a reservation}, <hotel name> = {Kamakura Prince}; and <intention> = {it is}, <date and time> = {April 29}. The system operation and system operation parameter history stores <system operation> = {greeting}; <system operation> = {reservation parameter question}, <question parameter> = {<date and time>}; <system operation> = {vacancy status search}, <date and time> = {April 29}, <hotel name> = {Kamakura Prince}; and <system operation> = {fully booked response}, <date and time> = {April 29}, <hotel name> = {Kamakura Prince}.
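The two histories listed for FIG. 13 can be pictured as a pair of ordered logs. The following is a minimal sketch, assuming a simple in-memory representation; the class `DialogueContext`, its method names, and the snake_case keys such as `hotel_name` and `date_time` are hypothetical and do not appear in the patent.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueContext:
    """Hypothetical in-memory stand-in for the dialogue context information storage unit 4."""
    recognition_history: list = field(default_factory=list)
    operation_history: list = field(default_factory=list)

    def record_recognition(self, slots):
        # One entry per recognized utterance: parameter name -> recognized value.
        self.recognition_history.append(dict(slots))

    def record_operation(self, operation, params=None):
        # One entry per executed system operation, with the parameters it used.
        self.operation_history.append({"system_operation": operation, **(params or {})})


def build_fig13_context():
    """Replay the dialogue up to system response S23, as listed for FIG. 13."""
    ctx = DialogueContext()
    ctx.record_operation("greeting")
    ctx.record_recognition({"intention": "I would like to make a reservation",
                            "hotel_name": "Kamakura Prince"})
    ctx.record_operation("reservation parameter question",
                         {"question_parameter": "date_time"})
    ctx.record_recognition({"intention": "it is", "date_time": "April 29"})
    booking = {"date_time": "April 29", "hotel_name": "Kamakura Prince"}
    ctx.record_operation("vacancy status search", booking)
    ctx.record_operation("fully booked response", booking)
    return ctx


context = build_fig13_context()
```

Keeping the two histories as separate lists mirrors the way the paragraph lists them; any other storage layout that preserves the same entries would serve equally well.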

[0033] In response to the system response S23, when the user inputs the utterance U23 "Well then, please change it to May Day," which contains the unknown word "May Day," the speech recognition unit 1 performs speech recognition processing, detects "May Day" as an unknown word by means of the unknown word model, and recognizes the input as "Well then, please change <<unknown word>>." According to the network configuration of FIG. 2, this unknown word can belong to any of six categories: <search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, and <number of nights>. The speech recognition unit 1 therefore outputs, as the speech recognition result containing the unknown word, "<<unknown word>> = {<search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, <number of nights>}, <intention> = {please change}."

[0034] When the speech recognition result containing the unknown word, "<<unknown word>> = {<search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, <number of nights>}, <intention> = {please change}," is input from the speech recognition unit 1, the unknown word category estimation unit 5 performs processing to identify the category of the unknown word as one of the above six types. First, it refers to the system operation determination knowledge storage unit 3 and, from the system operation determination knowledge shown in FIG. 10, obtains the system operation "parameter change" for the intention "please change." Further, from the conditions on the parameters of the system operation "parameter change," it obtains that the parameter required for this system operation is one of <hotel name>, <date and time>, <room type>, <number of rooms>, and <number of nights>.
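The first step of the estimation, looking up the system operation for the recognized intention and the parameters that operation can take, can be sketched as a table lookup. This is only an illustration under stated assumptions: the contents of FIG. 10 are not reproduced in the text, so the table `OPERATION_KNOWLEDGE`, the function `narrow_by_operation`, and the key names are hypothetical, and treating the lookup as an intermediate narrowing from six candidates to five is an interpretation of the paragraph above.

```python
# Hypothetical excerpt of the FIG. 10 knowledge:
# intention -> (system operation, parameters the operation may require).
OPERATION_KNOWLEDGE = {
    "please change": (
        "parameter change",
        ["hotel_name", "date_time", "room_type", "room_count", "nights"],
    ),
}


def narrow_by_operation(intention, candidate_categories):
    """Return the system operation for an intention, together with the
    unknown-word candidates that the operation can take as a parameter."""
    operation, required = OPERATION_KNOWLEDGE[intention]
    narrowed = [c for c in candidate_categories if c in required]
    return operation, narrowed


# The six candidate categories output by the recognizer for the unknown word.
candidates = ["search_condition", "hotel_name", "date_time",
              "room_type", "room_count", "nights"]
operation, narrowed = narrow_by_operation("please change", candidates)
```

In this sketch `search_condition` drops out because "parameter change" does not take it as a parameter, leaving five candidates for the context-based rule to work on.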

[0035] Next, by referring to the system operation determination knowledge storage unit 3, the unknown word category estimation unit 5 obtains, from the system operation determination knowledge shown in FIG. 11, the unknown word category identification rule for the system operation "parameter change": "among <hotel name>, <date and time>, <room type>, <number of rooms>, and <number of nights>, the parameters that have been assigned a value since the start of the dialogue." Then, by referring to the contents of the dialogue context information storage unit 4 shown in FIG. 13, it finds that the parameters assigned values since the start of the dialogue are <hotel name> and <date and time>. It accordingly narrows the category of the unknown word from the six candidates down to the two categories <hotel name> and <date and time>, and outputs them.
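The identification rule of this embodiment amounts to intersecting the parameters the operation can take with the parameters that already hold a value in the dialogue history. A minimal sketch, assuming the recognition history is a list of dictionaries keyed by parameter name (the function names and key names are hypothetical):

```python
def assigned_parameters(recognition_history):
    """Names of parameters that received a value at any point since the dialogue started."""
    assigned = []
    for result in recognition_history:
        for name in result:
            # The intention is not a reservation parameter, so it is skipped.
            if name != "intention" and name not in assigned:
                assigned.append(name)
    return assigned


def estimate_categories(required, recognition_history):
    """Keep only the required parameters that already hold a value."""
    assigned = assigned_parameters(recognition_history)
    return [p for p in required if p in assigned]


# Speech recognition result history as listed for FIG. 13.
history = [
    {"intention": "I would like to make a reservation", "hotel_name": "Kamakura Prince"},
    {"intention": "it is", "date_time": "April 29"},
]
required = ["hotel_name", "date_time", "room_type", "room_count", "nights"]
categories = estimate_categories(required, history)
```

With the FIG. 13 history, only `hotel_name` and `date_time` survive, which matches the two categories the paragraph arrives at.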

[0036] When the multiple unknown word categories <hotel name> and <date and time> are input from the unknown word category estimation unit 5, the system operation determination unit 6 refers to the system operation determination knowledge storage unit 3 and, from the system operation determination knowledge shown in FIG. 12, determines the new system operation "parameter confirmation" for the system operation "parameter change" as the system operation to be executed, and outputs it.

[0037] When the system operation "parameter confirmation" determined by the system operation determination unit 6 is input, the system operation execution unit 7 outputs to the user the system response S24 "Sorry, would you like to change the hotel or the date and time?"
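The decision made from the FIG. 12 knowledge and the resulting response can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the multiple-category branch follows the paragraphs above, the single-category branch follows what the third embodiment describes for the same FIG. 12 knowledge, and the function names, the `CATEGORY_PHRASES` table, and the behavior for an empty category list are hypothetical.

```python
# Hypothetical phrases used to name each category in a confirmation question.
CATEGORY_PHRASES = {"hotel_name": "the hotel", "date_time": "the date and time"}


def decide_follow_up(operation, categories):
    """Choose the operation to execute once the unknown word's categories are estimated."""
    if operation != "parameter change" or not categories:
        # Assumption: nothing to disambiguate, so keep the original operation.
        return operation
    if len(categories) > 1:
        # Several categories remain: ask the user which one they meant.
        return "parameter confirmation"
    # Exactly one category remains: ask for the value again in a standard form.
    return "parameter request in standard form"


def confirmation_question(categories):
    """Render a confirmation question listing the remaining categories."""
    phrases = [CATEGORY_PHRASES.get(c, c) for c in categories]
    return "Sorry, would you like to change " + " or ".join(phrases) + "?"


follow_up = decide_follow_up("parameter change", ["hotel_name", "date_time"])
question = confirmation_question(["hotel_name", "date_time"])
```

Here `question` reproduces the wording of system response S24 for the two categories estimated in this embodiment.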

[0038] As described above, according to the second embodiment, the dialogue context information storage unit 4 stores the history of speech recognition results since the start of the dialogue, the history of system operations, and the history of the parameters used for the system operations. The speech recognition unit 1 detects the unknown word in an input that contains one, and the unknown word category estimation unit 5, by referring to the dialogue context information stored in the dialogue context information storage unit 4, identifies as the category of the unknown word those parameters, among the parameters required for the system operation, that have already been assigned a value. Therefore, even when the information needed to identify the category of the unknown word is not present in any other part of the input speech, the category can be identified from the dialogue context, and by executing an appropriate dialogue operation according to the identification result, the dialogue with the user can be maintained.

[0039] Embodiment 3. Next, a third embodiment of the present invention will be described. The voice dialogue device according to the third embodiment differs from the second embodiment in the definition of the system operation determination knowledge, the contents stored in the dialogue context information storage unit 4, and the operation of the system operation determination unit 6; in all other respects it is the same as the second embodiment. Hereinafter, the operation of the hotel reservation voice dialogue device whose configuration is shown in FIG. 1 will be described, focusing on the operation of the system operation determination unit 6 using the system operation determination knowledge defined in the system operation determination knowledge storage unit 3. FIG. 14 is an explanatory diagram showing an example of a dialogue between such a hotel reservation voice dialogue device and a user, and the operation will be described below according to this example.

[0040] FIG. 15 is an explanatory diagram showing an example of the rules, within the system operation determination knowledge defined in the system operation determination knowledge storage unit 3, for identifying the category of an unknown word from the dialogue context information for each system operation. In this system operation determination knowledge, the system operations that can be taken for a speech recognition result are defined in the same way as in the second embodiment shown in FIG. 10, and the system operations newly determined from the unknown word category for each system operation are defined in the same way as in the second embodiment shown in FIG. 12.

[0041] In this case too, the dialogue between the hotel reservation voice dialogue device and the user starts with the system response S31 "Yes, this is the hotel reservation center." Since the user's utterances U31 and U32 contain no unknown words, the device operates as in the second embodiment and the dialogue proceeds until the system response S33 is output, at which point the state of the dialogue context information storage unit 4 is as shown in FIG. 16.

[0042] FIG. 16 is an explanatory diagram showing an example of the speech recognition result history and the history of system operations and system operation parameters recorded in the dialogue context information storage unit 4. It shows the speech recognition result history and the system operation and system operation parameter history recorded in the dialogue context information storage unit 4 once the dialogue has progressed from system response S31 through S33. In the third embodiment, the time at which each history entry was updated is also recorded.

[0043] That is, in the illustrated case, the speech recognition result history stores <intention> = {I would like to make a reservation}, <hotel name> = {Kamakura Prince} with its update time "14:00:05"; and <intention> = {it is}, <date and time> = {April 29} with its update time "14:00:15". The system operation and system operation parameter history stores <system operation> = {greeting} with its update time "14:00:00"; <system operation> = {reservation parameter question}, <question parameter> = {<date and time>} with its update time "14:00:10"; <system operation> = {vacancy status search}, <date and time> = {April 29}, <hotel name> = {Kamakura Prince} with its update time "14:00:18"; and <system operation> = {fully booked response}, <date and time> = {April 29}, <hotel name> = {Kamakura Prince} with its update time "14:00:25".

[0044] In response to the system response S33, when the user inputs the utterance U33 containing the unknown word "May Day," the speech recognition unit 1 performs speech recognition processing as in the second embodiment, detects "May Day" as an unknown word by means of the unknown word model, and recognizes the input as "Well then, please change <<unknown word>>." It outputs, as the speech recognition result containing the unknown word, "<<unknown word>> = {<search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, <number of nights>}, <intention> = {please change}."

[0045] When the speech recognition result containing the unknown word, "<<unknown word>> = {<search condition>, <hotel name>, <date and time>, <room type>, <number of rooms>, <number of nights>}, <intention> = {please change}," is input from the speech recognition unit 1, the unknown word category estimation unit 5 refers to the system operation determination knowledge storage unit 3 and, from the system operation determination knowledge shown in FIG. 10, obtains the system operation "parameter change" for the intention "please change." Further, from the conditions on the parameters of the system operation "parameter change," it obtains that the parameter required for this system operation is one of <hotel name>, <date and time>, <room type>, <number of rooms>, and <number of nights>.

[0046] Next, by referring to the system operation determination knowledge storage unit 3, the unknown word category estimation unit 5 obtains, from the system operation determination knowledge shown in FIG. 15, the unknown word category identification rule for the system operation "parameter change": "among <hotel name>, <date and time>, <room type>, <number of rooms>, and <number of nights>, and among the parameters that have been assigned a value since the start of the dialogue, the parameter to which a value was assigned most recently." Then, by referring to the contents of the dialogue context information storage unit 4 shown in FIG. 16, it finds that the parameters assigned values since the start of the dialogue are <hotel name> and <date and time>, and that the parameter to which a value was assigned most recently is <date and time>. It therefore identifies the category of the unknown word, out of the six candidates, as <date and time>, and outputs it.
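Compared with the second embodiment, the only change to the rule is that the recorded update times are used to pick a single parameter. A minimal sketch, assuming each history entry carries its recognized slots and an update time (the entry layout, key names, and function name are hypothetical; times of day are compared directly, which ignores dialogues that cross midnight):

```python
from datetime import time


def latest_assigned_parameter(required, recognition_history):
    """Pick the required parameter whose value was assigned most recently."""
    latest_name, latest_time = None, None
    for entry in recognition_history:
        for name in entry["slots"]:
            if name not in required:
                continue  # skips the intention and anything the operation cannot take
            if latest_time is None or entry["updated"] >= latest_time:
                latest_name, latest_time = name, entry["updated"]
    return latest_name


# Speech recognition result history with update times, as listed for FIG. 16.
history = [
    {"slots": {"intention": "I would like to make a reservation",
               "hotel_name": "Kamakura Prince"},
     "updated": time(14, 0, 5)},
    {"slots": {"intention": "it is", "date_time": "April 29"},
     "updated": time(14, 0, 15)},
]
required = ["hotel_name", "date_time", "room_type", "room_count", "nights"]
category = latest_assigned_parameter(required, history)
```

With the FIG. 16 history, `date_time` (updated at 14:00:15) is chosen over `hotel_name` (updated at 14:00:05), so a single category is passed on to the system operation determination unit 6.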

[0047] When the unknown word category <date and time> is input from the unknown word category estimation unit 5, the system operation determination unit 6 refers to the system operation determination knowledge storage unit 3 and, from the system operation determination knowledge shown in FIG. 12, determines the new system operation "parameter request in standard form" for the system operation "parameter change" as the system operation to be executed, and outputs it.

[0048] When the system operation "parameter request in standard form" determined by the system operation determination unit 6 is input, the system operation execution unit 7 outputs to the user the system response S34 "Sorry, could you say it in the form of a month and a day?"

[0049] As described above, according to the third embodiment, the dialogue context information storage unit 4 stores the history of speech recognition results since the start of the dialogue, the history of system operations, the history of the parameters used for the system operations, and their update times. The speech recognition unit 1 detects the unknown word in an input that contains one, and the unknown word category estimation unit 5, by referring to the dialogue context information stored in the dialogue context information storage unit 4, identifies as the category of the unknown word the parameter, among the parameters required for the system operation, to which a value was assigned most recently. Therefore, even when the information needed to identify the category of the unknown word is not present in any other part of the input speech, the category can be identified from the dialogue context, and by executing an appropriate dialogue operation according to the identification result, the dialogue with the user can be maintained.

【００５０】[0050]

[Effects of the Invention] As described above, according to the present invention, recognition vocabulary and syntax knowledge is held in the recognition vocabulary/syntax knowledge storage unit, and the speech recognition unit uses it to perform speech recognition processing and unknown word detection processing on the input speech. When the speech recognition result contains an unknown word, the unknown word category estimation unit estimates the category of the unknown word from the rules defined in the system operation determination knowledge held by the system operation determination knowledge storage unit and from the dialogue context information stored in the dialogue context information storage unit. The system operation determination unit then refers to the system operation determination knowledge to determine a system operation from the speech recognition result and the estimated category of the unknown word, and the system operation execution unit executes it. With this configuration, the category of an unknown word can be identified on the basis of the dialogue context information, and by executing an appropriate dialogue operation according to the identification result, a voice dialogue device capable of maintaining the dialogue with the user is obtained.

[0051] According to the present invention, the system operation determination knowledge specifies that, among the parameters required for a system operation, those that have never been assigned a value since the start of the dialogue are candidates for the category of the unknown word. As a result, among the parameters required for the system operation, a parameter that has never been assigned a value since the start of the dialogue is identified as the category of the unknown word. Even when the information needed to identify the category of the unknown word is not present in any other part of the input speech, the category can be identified from the dialogue context, and by executing an appropriate dialogue operation according to the identification result, the dialogue with the user can be maintained.

[0052] According to the present invention, the system operation determination knowledge specifies that, among the parameters required for a system operation, those that have already been assigned a value are candidates for the category of the unknown word. As a result, among the parameters required for the system operation, a parameter that has already been assigned a value is identified as the category of the unknown word. Even when the information needed to identify the category of the unknown word is not present in any other part of the input speech, the category can be identified from the dialogue context, and by executing an appropriate dialogue operation according to the identification result, the dialogue with the user can be maintained.

[0053] According to the present invention, the dialogue context information storage unit also stores the update time of its stored contents, and the system operation determination knowledge specifies that, among the parameters required for a system operation that have already been assigned a value, the parameter to which a value was assigned most recently is the candidate for the category of the unknown word. As a result, among the parameters required for the system operation, the parameter to which a value was assigned most recently is identified as the category of the unknown word. Even when the information needed to identify the category of the unknown word is not present in any other part of the input speech, the category can be identified from the dialogue context, and by executing an appropriate dialogue operation according to the identification result, the dialogue with the user can be maintained.

【図面の簡単な説明】[Brief description of the drawings]

FIG. 1 is a block diagram showing the configuration of a voice dialogue device according to the present invention.

FIG. 2 is a transition diagram showing an example of the recognition vocabulary and syntax knowledge defined in the recognition vocabulary/syntax knowledge storage unit according to the present invention.

FIG. 3 is an explanatory diagram showing an example of the vocabulary defined for each parameter in the present invention.

FIG. 4 is an explanatory diagram showing an example of a dialogue between a user and the voice dialogue device according to the first embodiment of the present invention.

FIG. 5 is an explanatory diagram showing the system operations that can be taken for a speech recognition result, as defined in the system operation determination knowledge of the first embodiment.

FIG. 6 is an explanatory diagram showing the rules for identifying the category of an unknown word from the dialogue context information, as defined in the system operation determination knowledge of the first embodiment.

FIG. 7 is an explanatory diagram showing the system operations newly determined from the unknown word category, as defined in the system operation determination knowledge of the first embodiment.

FIG. 8 is an explanatory diagram showing the contents of the dialogue context information storage unit in the first embodiment.

FIG. 9 is an explanatory diagram showing an example of a dialogue between a user and the voice dialogue device according to the second embodiment of the present invention.

FIG. 10 is an explanatory diagram showing the system operations that can be taken for a speech recognition result, as defined in the system operation determination knowledge of the second and third embodiments.

FIG. 11 is an explanatory diagram showing the rules for identifying the category of an unknown word from the dialogue context information, as defined in the system operation determination knowledge of the second embodiment.

FIG. 12 is an explanatory diagram showing the system operations newly determined from the unknown word category, as defined in the system operation determination knowledge of the second and third embodiments.

FIG. 13 is an explanatory diagram showing the contents of the dialogue context information storage unit in the second embodiment.

FIG. 14 is an explanatory diagram showing an example of a dialogue between a user and the voice dialogue device according to the third embodiment of the present invention.

FIG. 15 is an explanatory diagram showing the rules for identifying the category of an unknown word from the dialogue context information, as defined in the system operation determination knowledge of the third embodiment.

FIG. 16 is an explanatory diagram showing the contents of the dialogue context information storage unit in the third embodiment.

FIG. 17 is a flowchart showing the flow of operation processing in a conventional voice dialogue device.

【符号の説明】[Explanation of symbols]

1: speech recognition unit; 2: recognition vocabulary/syntax knowledge storage unit; 3: system operation determination knowledge storage unit; 4: dialogue context information storage unit; 5: unknown word category estimation unit; 6: system operation determination unit; 7: system operation execution unit.

Claims

【特許請求の範囲】[Claims]

[Claim 2] The voice dialogue device according to claim 1, wherein the system operation determination knowledge held by the system operation determination knowledge storage unit specifies that, among the parameters required for a system operation, a parameter that has never been assigned a value since the start of the dialogue is a candidate for the category of the unknown word.

[Claim 3] The voice dialogue device according to claim 1 or claim 2, wherein the system operation determination knowledge held by the system operation determination knowledge storage unit specifies that, among the parameters required for a system operation, a parameter that has already been assigned a value is a candidate for the category of the unknown word.

[Claim 4] The voice dialogue device according to claim 3, wherein the dialogue context information storage unit also stores the time at which its stored contents were updated, and the system operation determination knowledge held by the system operation determination knowledge storage unit specifies that, among the parameters required for a system operation that have already been assigned a value, the parameter to which a value was assigned most recently is a candidate for the category of the unknown word.