JP2007336344A

Info

Publication number: JP2007336344A
Application number: JP2006167321A
Authority: JP
Inventors: Hideo Fushimoto; 秀雄伏本
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2006-06-16
Filing date: 2006-06-16
Publication date: 2007-12-27

Abstract

Problem: To realize a comfortable TV phone system and TV conference system in which a continuous conversation is not interrupted.
Solution: The device comprises an external storage unit 24 that stores data associated with search keywords; a voice input unit 15 that inputs, as voice data, voice uttered to at least one of the device itself and an external device; and a control unit 11 that extracts a keyword from the voice data, reads the data related to the extracted keyword from the external storage unit 24, displays the read data on a display output unit 17, and transmits the data to the external device via a communication network 23.
Selected drawing: Figure 1

Description

Translated from Japanese

The present invention relates to an information terminal device configured to transmit and receive various types of data, including audio data and image data, to and from an external device via a transmission path, to a method for driving the device, and to a program for causing a computer to execute the driving method.

TV phone / TV conference systems that exchange various types of information, such as image data and audio data, between terminal devices via a network have been proposed (see, for example, Patent Document 1 below).

In recent years, TV phones have made it possible to converse while viewing video of the other party by transmitting and receiving image data and audio data between terminal devices. Furthermore, with the large increase in the transmission capacity of network transmission paths, TV phones can now do more than exchange video and audio: they can also perform, among multiple sites, the file transfer, exchange of drawing information, sharing of application software, and similar functions proposed for TV conference systems. Systems that allow conferences to proceed efficiently are thus provided.

In the systems described above, not only in TV conference systems but also in personal TV phones, users can converse while image data, document data, and other data stored in a terminal device are displayed on each other's terminal devices. Various types of information can thus be shared during a conversation between remote locations, and systems that come closer to the atmosphere of a real meeting have been realized.

Patent Document 1: JP 2003-224836 A

In the systems described above, users converse while looking at the display screen of the terminal device and, as needed, operate the terminal device to search for, output, and transmit the information to be shared.

However, in the systems described above, the user must operate an input means such as a keyboard provided on the terminal device to identify the storage means in which the desired data is stored, search it for the desired data, display the data on the display device of the user's own device, and instruct transmission to the other party's terminal device. When a conversation is in progress, it is therefore interrupted during these operations.

In addition, because the conversation takes place while both parties' video is checked on the display device, and the camera that captures the video is usually mounted on the display device, the speaker must converse from a position some distance away from the display device. When the operations described above are performed on a terminal device that includes the display device, the speaker must move close to the terminal device to operate it, so the conversation is again temporarily interrupted.

The present invention has been made in view of the problems described above, and its object is to provide an information terminal device that realizes a comfortable TV phone system and TV conference system without interrupting a continuous conversation, together with a driving method and a program for the device.

The information terminal device of the present invention is configured to transmit and receive various types of data, including audio data and image data, to and from an external device via a transmission path, and comprises: first storage means for storing data associated with search keywords; voice input means for inputting, as voice data, voice uttered to at least one of the information terminal device and the external device; extraction means for extracting a keyword based on the voice data input by the voice input means; reading means for reading the data related to the keyword extracted by the extraction means from the first storage means; display means for displaying the data read by the reading means on a display medium; and transmission means for transmitting the data read by the reading means to the external device via the transmission path.

The method of the present invention is a method for driving an information terminal device that is configured to transmit and receive various types of data, including audio data and image data, to and from an external device via a transmission path and that comprises first storage means for storing data associated with search keywords. The method comprises: a voice input step of inputting, as voice data, voice uttered to at least one of the information terminal device and the external device; an extraction step of extracting a keyword based on the voice data input in the voice input step; a reading step of reading the data related to the keyword extracted in the extraction step from the first storage means; a display step of displaying the data read in the reading step on a display medium; and a transmission step of transmitting the data read in the reading step to the external device via the transmission path.

The program of the present invention causes a computer to execute each step of the method for driving the information terminal device.

According to the present invention, a comfortable TV phone system and TV conference system can be realized without interrupting a continuous conversation.

Embodiments of the present invention are described below with reference to the drawings.

(First embodiment)
FIG. 1 is a block diagram illustrating the hardware configuration of the information terminal device according to the first embodiment.
In FIG. 1, the control unit 11 controls the entire system of the information terminal device and executes, for example, the information search output control and information transmission control described later. The ROM 12 stores the programs and other data needed when the control unit 11 performs its various controls. The RAM 13 temporarily stores information such as input data and data used during program execution.

The image input unit 14 comprises, for example, a camera device, and inputs captured video (images) as video data (image data) in the TV phone / TV conference system. The voice input unit 15 comprises, for example, a microphone device, and inputs uttered voice as voice data in the TV phone / TV conference system. In this embodiment, the voice input unit 15 inputs as voice data not only voice uttered to this information terminal device but also voice uttered to the other party's information terminal device.

The operation input unit 16 comprises, for example, a keyboard device, and functions as a means for various settings, control, and data input in the information terminal device.

The display output unit 17 comprises, for example, a liquid crystal display terminal or a TV display device, and functions as a means for outputting images. The audio output unit 18 comprises, for example, a speaker device, and functions as a means for outputting audio. The display output unit 17 and the audio output unit 18 output and reproduce the faces and conversation of the users of both information terminal devices that communicate with each other, as well as shared data. The display output unit 17 can normally output multiple pieces of information at once, and is configured to output simultaneously the sender's video data, the other party's video data, shared data, and information held in each information terminal device.

The image encoding/decoding processing unit 19 and the audio encoding/decoding processing unit 20 encode and decode the video, the audio, and other information, respectively, when these are exchanged between the transmitting-side and receiving-side information terminal devices. The multiplexing/separation processing unit 21 separates the image data and audio data and then, via the line interface processing unit 22, converts them according to the desired communication protocol into a form that can be transmitted over the communication network 23. Through this series of processes, various types of data including image data and audio data are transmitted and received between the transmitting-side and receiving-side information terminal devices.

The external storage unit 24 stores various types of data, such as images and documents, in association with search keywords. The voice recognition processing unit 25 converts the voice data input from the voice input unit 15 into text information using a conventionally proposed recognition process, and this text information is stored sequentially in the keyword holding unit 26 as keywords.
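The sequential storage of recognized text as keywords can be pictured with a minimal Python sketch. It assumes the recognizer has already produced a text string; the class name KeywordHolder, the word-splitting rule, and the bounded capacity are all hypothetical and are not taken from the patent.

```python
from collections import deque
import re


class KeywordHolder:
    """Hypothetical stand-in for the keyword holding unit 26: a bounded FIFO."""

    def __init__(self, capacity=32):
        # Assumption: the oldest keywords are dropped once capacity is reached.
        self._queue = deque(maxlen=capacity)

    def store_text(self, recognized_text):
        """Split recognized text into candidate keywords and store them in order."""
        for token in re.findall(r"\w+", recognized_text.lower()):
            self._queue.append(token)

    def next_keyword(self):
        """Return the oldest stored keyword, or None if nothing is pending."""
        return self._queue.popleft() if self._queue else None


holder = KeywordHolder()
holder.store_text("Show me the Kyoto photos")
first = holder.next_keyword()
```

A first-in, first-out queue is used here only because the text says keywords are stored sequentially; a real device might filter out common words before storing them.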

The control unit 11 uses the keywords stored in the keyword holding unit 26 to search the image data and other data stored in advance in the external storage unit 24 in association with keywords. The control unit 11 then reads the retrieved data from the external storage unit 24 and displays it on the display output unit 17 or transmits it to the other party's information terminal device via the line interface processing unit 22.
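As an illustration of this search, display, and transmit sequence, the following Python sketch models the external storage unit 24 as an in-memory list of items tagged with keywords. The names ExternalStore, StoredItem, and show_and_send, and the use of plain callbacks for display and transmission, are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StoredItem:
    name: str
    keywords: frozenset


@dataclass
class ExternalStore:
    """Hypothetical model of the external storage unit 24."""
    items: list = field(default_factory=list)

    def add(self, name, keywords):
        self.items.append(StoredItem(name, frozenset(k.lower() for k in keywords)))

    def search(self, keyword):
        """Return every item whose registered keywords include `keyword`."""
        key = keyword.lower()
        return [item for item in self.items if key in item.keywords]


def show_and_send(store, keyword, display, send):
    """Read the matching items, display each locally, and send each to the peer."""
    matches = store.search(keyword)
    for item in matches:
        display(item.name)  # stands in for the display output unit 17
        send(item.name)     # stands in for the line interface processing unit 22
    return matches


store = ExternalStore()
store.add("001.jpg", ["kyoto", "trip"])
shown, sent = [], []
show_and_send(store, "Kyoto", shown.append, sent.append)
```

Passing the display and send operations as callbacks keeps the lookup logic separate from the hardware-facing parts, which the sketch does not attempt to model.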

FIG. 2 is a diagram illustrating an example of the output display of the display output unit 17 in a conventional example.
The display output unit 17 illustrated in FIG. 2 is provided with a transmission-side video section 27, which displays the image captured by the image input unit 14 of the transmitting-side information terminal device, and a counterpart-side video section 28, which displays the image captured by the other party's information terminal device and transmitted via the communication network 23. In this case, a conversation by video and audio is carried out via the communication network 23 while voice data is transmitted and received by way of the voice input unit 15.

As described above, when the users are conversing by video and audio and wish to discuss data such as images or documents stored in the external storage unit 24 of the transmitting-side information terminal device, a data list 29 showing how the data is stored in the external storage unit 24 is displayed in the lower part of the display output unit 17. This data list 29 normally has a hierarchical structure.

While conversing, the operator uses the operation input unit 16 to select the desired data "001" 30 from the data list 29 and instruct its transmission, whereupon the data "001" is output and displayed on the display output unit of the other party's information terminal device via the communication network 23. At this time, the transmitting-side information terminal device compresses the data "001" 30 in the image encoding/decoding processing unit 19 and transmits it to the other party's information terminal device via the line interface processing unit 22. The data "001" 30 is also displayed in the data display section 31 of the display output unit 17 of the transmitting-side information terminal device. In this way, the sender and the other party can converse while both checking the displayed data "001".

FIG. 3 is a flowchart showing the output display operation of the display output unit 17 in the conventional example shown in FIG. 2. The processing shown in FIG. 3 is performed by the control unit 11 of FIG. 1.

First, in step S101, communication start processing is performed so that the transmitting-side information terminal device and the counterpart (receiving-side) information terminal device can transmit and receive various types of data, including image data and audio data, via the communication network 23.

Next, when the operator is to transmit data in the external storage unit 24 to the other party's information terminal device, the data list 29 shown in the lower part of the display output unit 17 in FIG. 2 is displayed in step S102.

Next, in step S103, it is determined whether the operator has selected from the data list 29 the desired data to be transmitted to the other party's information terminal device (data "001" 30 in the example of FIG. 2) and instructed its transmission. If no transmission instruction has been given, the process proceeds to step S107. If a transmission instruction has been given, the process proceeds to step S104.

In step S104, the original data of the data "001" 30 whose transmission was instructed in step S103 is read from the external storage unit 24, compressed by the image encoding/decoding processing unit 19, and transmitted to the other party's information terminal device via the communication network 23. Next, in step S105, as shown in FIG. 2, the data "001" 30 is composited into the display in the data display section 31 of the display output unit 17.

Next, in step S106, it is determined whether the operator has selected other data from the data list 29 to be transmitted to the other party's information terminal device and instructed its transmission. If so, the process returns to step S104, where the other data is read, and the conversation continues while the other data is displayed in step S105. If not, the process proceeds to step S107.

Next, in step S107, communication disconnection processing is executed to end the conversation. The output display operation of the display output unit 17 shown in FIG. 2 is carried out through steps S101 to S107 above.

In the conventional example shown in FIGS. 2 and 3, the operator has to operate the operation input unit 16 to search for the desired data while watching the counterpart-side video section 28 of the display output unit 17. Consequently, when the structure in which the desired data is stored is complex, operating the operation input unit 16 takes time and the conversation is interrupted. A method for driving the information terminal device according to the first embodiment of the present invention, which solves this problem, is described below with reference to FIGS. 4 to 6.

FIG. 4 is a diagram illustrating an example of the output display of the display output unit 17 in the first embodiment.
In this embodiment, while a conversation takes place with the transmission-side video section 27 and the counterpart-side video section 28 of FIG. 4 displayed on the same screen, the voice recognition processing unit 25 sequentially recognizes the voice data of the conversation and continuously extracts keywords.

Then, for each of the continuously extracted keywords, the data associated with a matching keyword is read in turn from among the multiple items of data stored in advance together with keywords in the external storage unit 24, and is displayed in turn in the data display section 33 of the display output unit 17.

As a result, while the conversation continues, information related to it is displayed in turn in the data display section 33 and shared, realizing a smooth and efficient TV conference system.

FIG. 5 is a flowchart showing the method for driving the information terminal device according to the first embodiment. Specifically, FIG. 5 is a flowchart showing the output display operation of the display output unit 17 shown in FIG. 4. The processing shown in FIG. 5 is performed by the control unit 11 of FIG. 1.

First, in step S201, communication start processing is performed so that the transmitting-side information terminal device and the counterpart (receiving-side) information terminal device can transmit and receive various types of data, including image data and audio data, via the communication network 23.

Next, in step S202, it is determined whether the voice recognition mode, in which data search keywords are extracted from the voice of the conversation, has been set. If the voice recognition mode has not been set, the process proceeds to step S210. If it has been set, the process proceeds to step S203.

If it is determined in step S202 that the voice recognition mode is set, then in step S203 the voice recognition processing unit 25 sequentially performs voice recognition on the speech of the sender or the other party, and it is determined whether a keyword to be searched for has been uttered. If no such keyword has been uttered, the process waits in step S203 until one is uttered. If such a keyword has been uttered, the process proceeds to step S204.

In step S204, keyword recognition processing is performed. Next, in step S205, a search is performed among the data stored in advance together with keywords in the external storage unit 24. Next, in step S206, it is determined whether the external storage unit 24 contains original data associated with a keyword that matches the keyword recognized in step S204. If no such original data exists in the external storage unit 24, the process returns to step S203. If such original data exists, the process proceeds to step S207.

In step S207, the original data is read. Next, in step S208, the original data read in step S207 is displayed in the data display section 33 of the display output unit 17. The original data read in step S207 is also transmitted to the other party's information terminal device and is output and displayed on the display output unit of that device.

Next, in step S209, it is determined whether the next search keyword has been uttered. If it has, the process returns to step S204 and the subsequent processing is repeated. If it has not, the process proceeds to step S210.

In step S210, communication disconnection processing is executed to end the conversation. The output display operation of the display output unit 17 shown in FIG. 4 is carried out through steps S201 to S210 above.
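The flow of steps S201 to S210 can be traced with a short Python sketch. It is a simplification under stated assumptions: the words are taken to be already recognized, the store is a plain dictionary from keyword to data name, and the end of the input list stands in for the end of the conversation (the flowchart instead waits in step S203). The function name run_call is hypothetical.

```python
def run_call(recognized_words, store, recognition_mode=True):
    """Trace the S201-S210 flow over words already produced by a recognizer.

    `store` maps a keyword to the name of the data registered under it.
    Returns a list of (step, detail) events, ending with disconnection.
    """
    events = [("S201", "connect")]
    if recognition_mode:                   # S202: is the recognition mode set?
        for word in recognized_words:      # S203 / S209: next uttered keyword
            keyword = word.lower()         # S204: keyword recognition
            data = store.get(keyword)      # S205: search the store
            if data is None:               # S206: no match, keep listening
                continue
            events.append(("S207", data))  # S207: read the original data
            events.append(("S208", data))  # S208: display locally and send
    events.append(("S210", "disconnect"))
    return events


trace = run_call(["hello", "Kyoto"], {"kyoto": "001.jpg"})
```

With the recognition mode off, the trace contains only the connect and disconnect events, which mirrors the direct branch from step S202 to step S210.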

FIG. 6 is a diagram illustrating another example of the output display of the display output unit 17 in the first embodiment. That is, FIG. 6 shows a different layout of the output display of the display output unit 17 shown in FIG. 4.

In FIG. 6, the lower part of the display output unit 17 displays multiple items of reduced data 32, representing data scheduled for use in the conversation or part of the data in the external storage unit 24. In the example of FIG. 6, data is designated while the conversation takes place with these multiple items of reduced data 32 displayed.

In the example of FIG. 6, the data "a" 34 that matches the keyword extracted by voice recognition of the conversation is highlighted, and its original data "A" is read from the external storage unit 24 and displayed in the data display section 33 of the display output unit 17. The instruction sections 35a and 35b are operated to switch the candidates shown as reduced data 32.

According to the first embodiment, the operator can converse while checking in advance the list of data needed for the conversation, and reproducing the same display state on the other party's side enables a more effective conversation. A comfortable TV phone system and TV conference system can thus be realized without interrupting a continuous conversation.

(Second embodiment)
FIG. 7 is a diagram illustrating an example of the output display of the display output unit 17 in the second embodiment.
In the first embodiment, only the data stored in the external storage unit (first storage means) 24 of the transmitting-side information terminal device is searched. In the second embodiment, the data stored in the external storage unit (second storage means) of the other party's information terminal device is also searched.

In FIG. 7, the lower part of the display output unit 17 of the transmitting-side information terminal device displays the transmitting-side reduced data (reduced image data) 36 stored in that device and the counterpart-side reduced data (reduced image data) 37 stored in the other party's information terminal device. These items of reduced data are displayed simultaneously on both information terminal devices via the communication network 23. Data matching a keyword recognized in the conversation is therefore searched for in the external storage unit of each information terminal device; the matching data "b" 38 is highlighted, and its original data "B" is displayed in the data display section 33 of the display output unit 17.
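A minimal Python sketch of searching both storage units follows. The patent does not say how the other party's storage is queried over the network, so the sketch assumes each side's keyword table is available as a dictionary from keyword to a list of data names; the function name search_both and the "local" / "remote" labels are hypothetical.

```python
def search_both(keyword, local_store, remote_store):
    """Search the local and remote keyword tables and tag each hit with its origin."""
    key = keyword.lower()
    hits = []
    for origin, table in (("local", local_store), ("remote", remote_store)):
        for name in table.get(key, []):
            hits.append((origin, name))
    return hits


local = {"kyoto": ["a.jpg"]}
remote = {"kyoto": ["b.jpg"], "osaka": ["c.jpg"]}
hits = search_both("Kyoto", local, remote)
```

Tagging each hit with its origin lets the display code decide which thumbnail row (reduced data 36 or 37) to highlight.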

(Third embodiment)
FIG. 8 is a diagram illustrating an example of the output display of the display output unit 17 in the third embodiment. The third embodiment is a further application of the second embodiment.

In the second embodiment, the lists of reduced data stored in the external storage units of the transmitting-side and counterpart-side information terminal devices are displayed in the respective data display sections. However, the data stored in each information terminal device may include data that the user does not want to show to the other party. In the third embodiment, therefore, as shown in FIG. 8, prohibited reduced data 40a and 40b, which the other party's information terminal device does not want displayed on the transmitting-side information terminal device, are displayed in the data display section 33 in such a way that they cannot be viewed from the transmitting-side information terminal device.

That is, in the third embodiment, when the data matching a keyword recognized in the conversation is the prohibited reduced data 40a, the original data is displayed on the other party's information terminal device, which stores the data, but is not displayed in the data display section 33 of the transmitting-side information terminal device, which does not store it, as shown in FIG. 8.

As a specific form of the third embodiment, for example, when display prohibition has been set for data read in step S207 of FIG. 5 from the external storage unit (second storage means) of the partner-side information terminal device, that data is not displayed on the display output unit 17 in step S208. Likewise, when display prohibition has been set for data read in step S207 from the external storage unit (first storage means) 24 of the own device, that data is not transmitted to the partner-side information terminal device in step S207.
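The two display-prohibition rules above can be illustrated with a short sketch. The `StoredData` record and the `plan_actions` function are hypothetical names. The sketch also assumes that data read from the partner's storage never needs to be sent back to the partner, which the text does not state.

```python
from dataclasses import dataclass


@dataclass
class StoredData:
    """Hypothetical record for data read in step S207."""
    name: str
    source: str                      # "local" (first storage) or "remote" (second storage)
    display_prohibited: bool = False


def plan_actions(data: StoredData) -> dict:
    """Decide whether to display the data locally and whether to send it."""
    if data.source == "remote":
        # Read from the partner's storage: do not display it here if prohibited.
        # Assumption: it is not sent back, since the partner already holds it.
        return {"display": not data.display_prohibited, "send": False}
    # Read from this terminal's own storage: display it locally, but do not
    # transmit it to the partner if display prohibition is set.
    return {"display": True, "send": not data.display_prohibited}
```

For example, `plan_actions(StoredData("memo", "local", display_prohibited=True))` yields a plan that displays the data on the own terminal but does not send it.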

According to the third embodiment, when a conversation is held on the basis of a series of reduced data, data that a user does not want the other party to see can be prevented from being shown to that party by mistake; such data is automatically concealed from the other party. Even when data has been designated in advance as not viewable, the device may be configured so that, after the original data matched by the keyword search is displayed on the user's own information terminal device, the user can perform a specific operation to transmit it to the other party, making the original data displayable on the other party's side.

(Fourth embodiment)
FIG. 9 is a flowchart showing a method of driving the information terminal device according to the fourth embodiment. Specifically, the processing shown in FIG. 9 is performed by the control unit 11 in FIG. 1.

Subsequently, in step S302, it is determined whether a voice recognition mode, in which keywords for data search are extracted from the speech in the conversation, has been set. If the voice recognition mode has not been set, the process proceeds to step S311. If it has been set, the process proceeds to step S303.

When it is determined in step S302 that the voice recognition mode is set, in step S303 the voice recognition processing unit 25 sequentially performs voice recognition on the conversation of the sender or the other party, and it is determined whether a keyword to be searched for has been uttered. If no such keyword has been uttered, the process waits in step S303 until one is uttered. If a keyword to be searched for has been uttered, the process proceeds to step S304.

In step S304, keyword recognition processing is performed. The recognized keyword is then stored in the keyword holding unit 26 together with the number of times it has been recognized.

Subsequently, in step S305, it is determined whether the keyword recognized in step S304 and stored in the keyword holding unit 26 has been uttered N or more times (N is a natural number). If the keyword has not been uttered N or more times, the process returns to step S303. If it has been uttered N or more times, the process proceeds to step S306.

In step S306, a search is performed among the data stored in advance in the external storage unit 24 together with keywords. Subsequently, in step S307, it is determined whether original data associated with a keyword that matches the keyword recognized in step S304 exists in the external storage unit 24. If no such original data exists in the external storage unit 24, the process returns to step S303. If such original data exists, the process proceeds to step S308.

In step S308, the original data is read out. Subsequently, in step S309, the original data read in step S308 is displayed in the data display unit 33 of the display output unit 17. In addition, the original data read in step S308 is transmitted to the partner-side information terminal device and is output and displayed on the display output unit of that device.

Subsequently, in step S310, it is determined whether the next search keyword has been uttered. If it has, the process returns to step S304 and the subsequent processing is repeated. If no further search keyword has been uttered, the process proceeds to step S311.

In step S311, communication disconnection processing is executed to end the series of conversations. Through the processing of steps S301 to S311 described above, the output display operation of the display output unit 17 in the fourth embodiment is performed.
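The control flow of steps S302 to S311 can be summarized in a small sketch. It is a simplification under stated assumptions: speech recognition is taken as already done, so the input is a sequence of recognized keyword strings; a `Counter` stands in for the keyword holding unit 26 and a dictionary for the external storage unit 24; `run_session` and the action names are hypothetical.

```python
from collections import Counter


def run_session(recognized_keywords, storage, n, recognition_enabled=True):
    """Trace steps S302 to S311 for a sequence of already-recognized keywords.

    recognized_keywords: iterable of keyword strings (assumed recognizer output)
    storage: dict mapping keyword -> original data (stand-in for storage unit 24)
    n: number of utterances required before a search is triggered
    Returns a list of (action, payload) tuples.
    """
    log = []
    if recognition_enabled:                   # S302: is the recognition mode set?
        counts = Counter()                    # stand-in for keyword holding unit 26
        for keyword in recognized_keywords:   # S303 / S310: next uttered keyword
            counts[keyword] += 1              # S304: store keyword with its count
            if counts[keyword] < n:           # S305: uttered fewer than N times
                continue
            data = storage.get(keyword)       # S306, S307: search for original data
            if data is None:
                continue
            log.append(("display", data))     # S308, S309: read out and display
            log.append(("send", data))        # S309: transmit to the partner side
    log.append(("disconnect", None))          # S311: end the communication
    return log


print(run_session(["sea", "sea", "sky"], {"sea": "photo_sea.jpg"}, n=2))
```

With `n=2`, the first utterance of "sea" is only counted; the second triggers the search, display, and transmission, while "sky" is uttered once and ignored.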

According to the fourth embodiment, among the different keywords uttered successively in a conversation, only a keyword uttered a predetermined number of times (N times) triggers a search for its original data. This avoids problems such as slow processing and the reading of unnecessary original data, and thereby improves the accuracy of the original data retrieved. In this embodiment, the utterances of the keyword may be counted separately for the transmission side and the partner side.

(Fifth embodiment)
FIG. 10 is a flowchart illustrating a method of driving the information terminal device according to the fifth embodiment. Specifically, the processing shown in FIG. 10 is performed by the control unit 11 in FIG. 1.

In the display example of the display output unit 17 shown in FIG. 6, the group of reduced data to be searched had to be designated in advance before the conversation. In the fifth embodiment, each data group to be searched is stored in advance in association with a speaker. When a speaker starts a conversation at the transmission-side information terminal device, the voice recognition processing unit 25 identifies the speaker, and the data group corresponding to that speaker is called up.

First, in step S401, communication start processing is performed so that the transmission-side information terminal device and the partner-side (reception-side) information terminal device can transmit and receive various data, including image data and audio data, via the communication network 23.

Subsequently, in step S402, it is determined whether a voice recognition mode, in which keywords for data search are extracted from the speech in the conversation, has been set. If the voice recognition mode has not been set, the process proceeds to step S411. If it has been set, the process proceeds to step S403.

When it is determined in step S402 that the voice recognition mode is set, in step S403 the voice recognition processing unit 25 sequentially performs voice recognition on the conversation of the sender or the other party, and it is determined whether a keyword to be searched for has been uttered. If no such keyword has been uttered, the process waits in step S403 until one is uttered. If a keyword to be searched for has been uttered, the process proceeds to step S404.

In step S404, keyword recognition processing is performed. The recognized keyword is then stored in the keyword holding unit 26 together with information on the speaker.

Subsequently, in step S405, the speaker is identified by referring to the keyword holding unit 26 on the basis of the keyword recognized in step S404. Once the speaker is identified, in step S406 the data group stored in advance in the external storage unit 24 in association with that speaker is searched, and the reduced data 32 shown in FIG. 6 is displayed.
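The speaker-specific lookup in steps S405 to S408 might be sketched as below. Speaker identification itself is assumed to have already produced a speaker ID. The nested dictionary standing in for the external storage unit 24 and the function name `search_for_speaker` are hypothetical.

```python
def search_for_speaker(speaker_id, keyword, groups):
    """Return (thumbnails, original) for the data group tied to one speaker.

    groups: dict mapping speaker_id -> dict of keyword -> (thumbnail, original),
            a stand-in for data groups stored in association with speakers.
    """
    group = groups.get(speaker_id, {})             # S406: data group for the speaker
    thumbnails = [thumb for thumb, _ in group.values()]  # reduced data to display
    hit = group.get(keyword)                       # S407: does matching data exist?
    original = hit[1] if hit else None             # S408: read out the original data
    return thumbnails, original


groups = {"alice": {"trip": ("t", "T"), "party": ("p", "P")}}
print(search_for_speaker("alice", "trip", groups))
```

Keying the storage by speaker first narrows the search to that speaker's data group, which mirrors the stated aim of not having to designate the reduced data group by hand before the conversation.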

Subsequently, in step S407, it is determined whether original data associated with a keyword that matches the keyword recognized in step S404 exists in the external storage unit 24. If no such original data exists in the external storage unit 24, the process returns to step S403. If such original data exists, the process proceeds to step S408.

In step S408, the original data is read out. Subsequently, in step S409, the original data read in step S408 is displayed in the data display unit 33 of the display output unit 17. In addition, the original data read in step S408 is transmitted to the partner-side information terminal device and is output and displayed on the display output unit of that device.

Subsequently, in step S410, it is determined whether the next search keyword has been uttered. If it has, the process returns to step S404 and the subsequent processing is repeated. If no further search keyword has been uttered, the process proceeds to step S411.

In step S411, communication disconnection processing is executed to end the series of conversations. Through the processing of steps S401 to S411 described above, the output display operation of the display output unit 17 in the fifth embodiment is performed.

According to the fifth embodiment, the speaker is identified together with keyword recognition by using the function of the voice recognition processing unit 25, so that speaker-specific data can be searched and the conversation becomes even more effective. Needless to say, in this embodiment the other party's speaker may also be recognized, and the data group associated in advance with that speaker may be read out.

Each means in FIG. 1 constituting the information terminal device according to the embodiments described above, and each step in FIGS. 5, 9, and 10 showing the driving method of the information terminal device, can be realized by running a program stored in a RAM, ROM, or the like of a computer. The program and a computer-readable storage medium on which the program is recorded are included in the present invention.

Specifically, the program is provided to a computer by being recorded on a storage medium such as a CD-ROM, or via various transmission media. Besides a CD-ROM, a flexible disk, a hard disk, a magnetic tape, a magneto-optical disk, a nonvolatile memory card, or the like can be used as the storage medium for recording the program. As the transmission medium for the program, a communication medium in a computer network system (a LAN, a WAN such as the Internet, a wireless communication network, or the like) for propagating and supplying program information as a carrier wave can be used. Examples of such a communication medium include wired lines, such as optical fiber, and wireless lines.

The program is included in the present invention not only when the functions of the information terminal device according to each embodiment are realized by a computer executing the supplied program, but also when those functions are realized by the program in cooperation with an OS (operating system) or other application software running on the computer, or when all or part of the processing of the supplied program is performed by a function expansion board or function expansion unit of the computer.

FIG. 1 is a block diagram showing the hardware configuration of the information terminal device according to the first embodiment.
FIG. 2 is a diagram showing an example of the output display of the display output unit in a conventional example.
FIG. 3 is a flowchart showing the output display operation of the display output unit in the conventional example shown in FIG. 2.
FIG. 4 is a diagram showing an example of the output display of the display output unit in the first embodiment.
FIG. 5 is a flowchart showing the method of driving the information terminal device according to the first embodiment.
FIG. 6 is a diagram showing another example of the output display of the display output unit in the first embodiment.
FIG. 7 is a diagram showing an example of the output display of the display output unit in the second embodiment.
FIG. 8 is a diagram showing an example of the output display of the display output unit in the third embodiment.
FIG. 9 is a flowchart showing the method of driving the information terminal device according to the fourth embodiment.
FIG. 10 is a flowchart showing the method of driving the information terminal device according to the fifth embodiment.

Explanation of symbols

11: Control unit
12: ROM
13: RAM
14: Image input unit
15: Voice input unit
16: Operation input unit
17: Display output unit
18: Voice output unit
19: Image encoding/decoding processing unit
20: Voice encoding/decoding processing unit
21: Multiplexing/demultiplexing processing unit
22: Line interface processing unit
23: Communication network
24: External storage unit
25: Voice recognition processing unit
26: Keyword holding unit
27: Transmission-side video section
28: Partner-side video section
29: Data list
30, 34, 38: Data
31, 33: Data display unit
32: Reduced data
35a, 35b: Instruction unit
36: Transmission-side reduced data
37: Partner-side reduced data
40a, 40b: Prohibited reduced data

Claims

An information terminal device configured to be able to transmit and receive various data, including audio data and image data, to and from an external device via a transmission path, the information terminal device comprising:
first storage means for storing data associated with search keywords;
voice input means for inputting, as voice data, voice uttered to at least one of the information terminal device and the external device;
extraction means for extracting a keyword based on the voice data input by the voice input means;
reading means for reading data relating to the keyword extracted by the extraction means from the first storage means;
display means for displaying the data read by the reading means on a display medium; and
transmission means for transmitting the data read by the reading means to the external device via the transmission path.

The information terminal device according to claim 1, wherein the external device includes second storage means for storing data associated with search keywords, and
the reading means reads the data relating to the keyword extracted by the extraction means from the first storage means and the second storage means.

The information terminal device according to claim 2, wherein, when display prohibition has been set for data read from the second storage means by the reading means, the display means does not display that data on the display medium.

The information terminal device according to claim 2 or 3, wherein, when display prohibition has been set for data read from the first storage means by the reading means, the transmission means does not transmit that data to the external device.

The information terminal device according to any one of claims 1 to 4, wherein the reading means reads the data relating to a keyword when the number of times that keyword has been extracted by the extraction means reaches a predetermined number.

The information terminal device according to any one of claims 2 to 4, wherein the data stored in the first storage means and the second storage means is associated with a voice attribute code of the voice data relating to a speaker, and
the extraction means extracts a keyword corresponding to the voice attribute code.

A method of driving an information terminal device that is configured to be able to transmit and receive various data, including audio data and image data, to and from an external device via a transmission path, and that includes first storage means for storing data associated with search keywords, the method comprising:
a voice input step of inputting, as voice data, voice uttered to at least one of the information terminal device and the external device;
an extraction step of extracting a keyword based on the voice data input in the voice input step;
a reading step of reading data relating to the keyword extracted in the extraction step from the first storage means;
a display step of displaying the data read in the reading step on a display medium; and
a transmission step of transmitting the data read in the reading step to the external device via the transmission path.

A program for causing a computer to execute each step of the method of driving an information terminal device according to claim 7.