JP7306439B2


Info

Publication number: JP7306439B2
Application number: JP2021183024A
Authority: JP
Inventors: 隆史園田
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2017-03-30
Filing date: 2021-11-10
Publication date: 2023-07-11
Anticipated expiration: 2037-03-30
Also published as: JP2022028769A

Description

The present invention relates to an information processing device, an information processing method, an information processing program, and an information processing system.

In the above technical field, Patent Document 1 discloses a technique that acquires emotion information about a person appearing in acquired video information based on that video information, acquires context information based on acquired audio information, and generates media expression document information based on the acquired emotion information and context information. Patent Document 2 discloses a technique that uses a fisheye lens and an omnidirectional microphone to determine the direction of a sound source (the speaker direction) and cuts out the image in that direction (the image of the speaker) to generate a video signal.

JP 2007-299255 A
JP-A-11-331827

However, with the techniques described in the above documents, a person watching the video cannot share the sense of presence at the scene.

An object of the present invention is to provide a technique that solves the above problem.

To achieve the above object, an information processing device according to the present invention includes:
video acquisition means for acquiring video capturing a predetermined area and at least one subject;
information collection means for collecting subject information about the subject and surrounding environment information, which is information about the environment around the subject;
subject situation recognition means for recognizing the situation of the subject based on the video of the subject and the subject information; and
display means for displaying, on a single display screen, the video of the predetermined area including the video of the subject, the situation of the subject, the subject information, and the surrounding environment information, so that a person viewing the display screen can share the sense of presence of the predetermined area.

To achieve the above object, an information processing method according to the present invention includes:
a video acquisition step of acquiring video capturing a predetermined area and at least one subject;
an information collection step of collecting subject information about the subject and surrounding environment information, which is information about the environment around the subject;
a subject situation recognition step of recognizing the situation of the subject based on the video of the subject and the subject information; and
a display step of displaying, on a single display screen, the video of the predetermined area including the video of the subject, the situation of the subject, the subject information, and the surrounding environment information, so that a person viewing the display screen can share the sense of presence of the predetermined area.

To achieve the above object, an information processing program according to the present invention causes a computer to execute:
a video acquisition step of acquiring video capturing a predetermined area and at least one subject;
an information collection step of collecting subject information about the subject and surrounding environment information, which is information about the environment around the subject;
a subject situation recognition step of recognizing the situation of the subject based on the video of the subject and the subject information; and
a display step of displaying, on a single display screen, the video of the predetermined area including the video of the subject, the situation of the subject, the subject information, and the surrounding environment information, so that a person viewing the display screen can share the sense of presence of the predetermined area.

To achieve the above object, an information processing system according to the present invention includes:
imaging means for capturing video of a predetermined area and at least one subject;
information collection means for collecting subject information about the subject captured by the imaging means and surrounding environment information, which is information about the environment around the subject;
subject situation recognition means for recognizing the situation of the subject based on the video of the subject and the subject information; and
display means for displaying, on a single display screen, the video of the predetermined area including the video of the subject, the situation of the subject, the subject information, and the surrounding environment information, so that a person viewing the display screen can share the sense of presence of the predetermined area.

According to the present invention, a person watching the video can share the sense of presence at the scene.

FIG. 1 is a block diagram showing the configuration of an information processing device according to a first embodiment of the present invention.
FIG. 2 is a diagram for explaining the configuration of an information processing system according to a second embodiment of the present invention.
FIG. 3 is a block diagram showing the configuration of an information processing device included in the information processing system according to the second embodiment of the present invention.
FIG. 4 is a diagram showing an example of a situation table held by the information processing device included in the information processing system according to the second embodiment of the present invention.
FIG. 5 is a diagram showing the hardware configuration of the information processing device included in the information processing system according to the second embodiment of the present invention.
FIG. 6 is a flowchart explaining the processing procedure of the information processing device included in the information processing system according to the second embodiment of the present invention.
FIG. 7 is a diagram for explaining the configuration of an information processing system according to a third embodiment of the present invention.

Embodiments for carrying out the present invention are described below in detail, by way of example, with reference to the drawings. However, the configurations, numerical values, processing flows, functional elements, and the like described in the following embodiments are merely examples and may be freely modified or changed; they are not intended to limit the technical scope of the present invention to the following description.

[First Embodiment]
An information processing device 100 according to a first embodiment of the present invention will be described with reference to FIG. 1. The information processing device 100 is a device that recognizes and displays the situation of a subject appearing in captured video.

As shown in FIG. 1, the information processing device 100 includes a video acquisition unit 101, a subject information collection unit 102, a situation recognition unit 103, and a display unit 104.

The video acquisition unit 101 acquires video from imaging means that captures a predetermined area. The subject information collection unit 102 collects subject information about a subject appearing in the captured video. The situation recognition unit 103 recognizes the situation of the subject based on the video of the subject and the subject information. The display unit 104 displays the recognized situation in an identifiable manner.

According to this embodiment, a person watching the video can share the sense of presence at the scene.

[Second Embodiment]
Next, an information processing system according to a second embodiment of the present invention will be described with reference to FIGS. 2 to 6. FIG. 2 is a diagram for explaining an example of the configuration of the information processing system according to this embodiment.

The information processing system 200 includes an information processing device 210, a display unit 220, and a camera 230. The information processing system 200 also includes various sensors (not shown), such as an audio information acquisition sensor (microphone), a vital data acquisition sensor, and an environment information acquisition sensor.

The information processing system 200 is a system that, for example in assembly halls and concert venues where many people gather, hospitals, prisons, and ordinary homes, recognizes the situation of subjects appearing in video and displays the recognized situation in an identifiable manner, thereby allowing people watching the video to share the sense of presence at the scene. The information processing system also expresses the emotions of the people captured by the camera, including emotions beyond the basic categories of joy, anger, sorrow, and pleasure. Furthermore, the information processing system 200 is applicable not only to event venues where many people gather and to well-equipped hospitals, but also to home medical care where equipment is limited, prisons where many inmates must be controlled, demonstration marches, and the like.

The information processing device 210 acquires video, captured by the camera 230, of a predetermined area in which a plurality of subjects are present, and acquires subject information about the subjects appearing in the acquired video. The subject information is, for example, audio information, vital data, motion information, and facial expression information. The audio information is, for example, the subject's voice, voiceprint, and utterance content. The vital data includes, for example, body temperature, heart rate, pulse, hunger, thirst, nausea, urge to urinate, pain, pupils, brain waves, fingerprints, respiratory rate, amount of perspiration, and iris. The motion information is, for example, information about the subject's movement, such as the subject's behavior. The facial expression information is, for example, the subject's facial expression, facial movement, and eye movement.

The information processing device 210 also acquires information about the environment around the subject. The surrounding environment information is, for example, the temperature, humidity, seismic intensity, noise level, illuminance, luminance, water level, water volume, speed, angle, position, degree of maturity, and time at the place where the subject is located.

The information processing device 210 then recognizes the situation of the subject based on the acquired video and subject information. The situation of the subject is, for example, the subject's emotions, state of mind, and physical condition.

The display unit 220 displays, in a display area 221, the video captured by the camera 230 together with the subject's vital data and the like, for example in graph form. The graphs and the like may, for example, be superimposed on the video captured by the camera 230 (for example, as an AR (Augmented Reality) display) so that a person watching the video from the camera 230 can understand them visually.

The display unit 220 also displays the subject's situation and emotions in a display area 222. The subject's emotions and the like are displayed, for example, as text or pictograms, but the method of displaying the subject's emotions is not limited to these.

Furthermore, the display unit 220 displays conversation content and utterance content in a display area 223 as the subject's conversation information. The conversation information is displayed, for example, as text, and the text size, font, and the like may be changed for each speaker. The text size may also be changed according to the loudness of the voice. The text may also be colored differently for each speaker; the text display method is not limited to these. The display unit 220 also displays additional information about the subject in a display area 224. The additional information includes vital data such as body temperature, heart rate, pulse, hunger, thirst, nausea, urge to urinate, pain, behavior, pupils, facial expression, brain waves, and biometric information (fingerprint, face, voiceprint, iris).
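The per-speaker text styling described above could be sketched as follows. This is only an illustrative sketch: the class name `ConversationStyler`, the colour palette, the point-size range, and the 40 to 90 dB loudness range are all assumptions and do not come from the patent text.

```python
from itertools import cycle
from typing import Dict, Iterator

# Hypothetical palette; the source only says text "may be colored" per speaker.
PALETTE = ["red", "blue", "green", "orange"]


class ConversationStyler:
    """Assigns a stable colour per speaker and a text size per voice loudness."""

    def __init__(self, min_pt: int = 12, max_pt: int = 36) -> None:
        self.min_pt = min_pt
        self.max_pt = max_pt
        self._colors: Dict[str, str] = {}
        self._palette: Iterator[str] = cycle(PALETTE)

    def color_for(self, speaker: str) -> str:
        # The first time a speaker is seen, take the next palette colour.
        if speaker not in self._colors:
            self._colors[speaker] = next(self._palette)
        return self._colors[speaker]

    def size_for(self, volume_db: float) -> int:
        # Map an assumed 40..90 dB range linearly onto min_pt..max_pt, clamped.
        ratio = (volume_db - 40.0) / 50.0
        ratio = max(0.0, min(1.0, ratio))
        return round(self.min_pt + ratio * (self.max_pt - self.min_pt))
```

A renderer for display area 223 would call `color_for` and `size_for` once per utterance before drawing the text.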

The display unit 220 also displays the surrounding environment information in a display area 225. The surrounding environment information displayed includes temperature, humidity, seismic intensity, noise level, illuminance, luminance, water level, water volume, speed, angle, position, degree of maturity, time, and the like. Each type of information displayed on the display unit 220 can be turned ON or OFF as needed.
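The ON/OFF switching of displayed information could be modelled as a simple filter, as in the minimal sketch below. The function name `visible_items` and the item names are hypothetical, and defaulting unlisted items to ON is an assumption.

```python
from typing import Any, Dict


def visible_items(items: Dict[str, Any], enabled: Dict[str, bool]) -> Dict[str, Any]:
    """Return only the display items whose toggle is ON.

    Items with no explicit toggle are treated as ON (an assumption).
    """
    return {name: value for name, value in items.items() if enabled.get(name, True)}
```

For example, passing `{"conversation": False}` as `enabled` hides the conversation text while leaving the other areas visible.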

The information processing system 200 may, for example, issue an alert to a person watching the video when the situation of a subject appearing in the video captured by the camera 230 changes.

FIG. 3 is a block diagram showing the configuration of the information processing device included in the information processing system according to this embodiment. The information processing device 210 has a video acquisition unit 301, a subject information collection unit 302, a surrounding environment information acquisition unit 303, a situation recognition unit 304, a display unit 305, and an alert notification unit 306.

The video acquisition unit 301 acquires video of a predetermined area captured by an imaging device such as the camera 230. The camera 230 is typically, for example, a security camera or surveillance camera installed in a facility, but is not limited to these.

The subject information collection unit 302 collects subject information about the subjects appearing in the video captured by the camera 230. The subject information collection unit 302 extracts subjects such as people from the video acquired by the video acquisition unit 301 and collects subject information about the extracted subjects. The subject information is, for example, the subject's audio information, biometric information, motion information, and facial expression information. The audio information includes the subject's voice, voice volume, voiceprint, utterance content, conversation content, and the like. The audio information is acquired from a microphone attached to the camera 230, a microphone installed in the facility, or the microphone of a mobile terminal, such as a smartphone, carried by the subject.

The biometric information is what is commonly called vital data and includes, for example, body temperature, heart rate, pulse, hunger, thirst, nausea, urge to urinate, pain, respiratory rate, brain waves, and amount of perspiration. The biometric information is acquired, for example, from wearable devices worn by the subject, such as watch-type, eyeglass-type, or underwear-type devices, or from medical equipment, but the method of acquiring biometric information is not limited to these.

The motion information is information about the subject's movement, that is, about what kind of movement the subject is making. The motion information is, for example, information about whether the subject is walking, sitting, running, standing still, or moving an arm, but is not limited to these.

The facial expression information is information about the subject's facial expression, facial features, and the like. The facial expression information is, for example, information about whether the subject's expression is a smiling expression or an angry expression, but is not limited to these.

In addition to these, the subject information may include the subject's build, clothing, sex, age, height, hairstyle, whether the subject wears glasses, and the like.

The surrounding environment information acquisition unit 303 acquires information about the environment around the subject in the video captured by the camera 230. The surrounding environment information is, for example, temperature, humidity, seismic intensity, noise level, illuminance, luminance, water level, water volume, speed, angle, position, degree of maturity, and time. The surrounding environment information is acquired, for example, from a mobile terminal carried by the subject, such as a smartphone or smartwatch, from sensors installed in the facility where the subject is located, or from other sensors, infrared cameras, thermography, and the like, but the acquisition method is not limited to these.

The situation recognition unit 304 recognizes the situation of the subject based on the video of the subject, the subject information, information about changes in the subject information, and the like. For example, the situation recognition unit 304 recognizes the subject's emotions and state of mind from the values of vital data such as body temperature and heart rate and from information about changes in the vital data.
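The idea of recognizing a situation from both the current vital data values and their change could be sketched as below. The source does not specify any recognition algorithm, so this rule-based function, its name `recognize_situation`, the dictionary keys, the thresholds, and the situation labels are all hypothetical placeholders.

```python
from typing import Dict


def recognize_situation(previous: Dict[str, float], current: Dict[str, float]) -> str:
    """Classify a subject's situation from vital data values and their change.

    All thresholds and labels here are illustrative assumptions only.
    """
    heart_rate = current.get("heart_rate", 0.0)
    # Change relative to the previous sample; zero if no previous value exists.
    heart_rate_delta = heart_rate - previous.get("heart_rate", heart_rate)
    body_temp = current.get("body_temp_c", 36.5)

    if body_temp >= 38.0:
        return "feverish"
    if heart_rate >= 110.0 and heart_rate_delta >= 20.0:
        return "agitated"
    if heart_rate_delta <= -20.0:
        return "declining"
    return "calm"
```

A practical system would likely combine such signals with video, facial expression, and audio features rather than rely on fixed thresholds.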

In addition to vital data, the situation recognition unit 304 also recognizes the situation of the subject from information supplied by sensors capable of measuring environmental parameters such as temperature, humidity, seismic intensity, noise level, illuminance, and luminance. The subject's emotions and state of mind are considered to change in response to these factors as well, and the situation recognition unit 304 takes this into account when recognizing the situation of the subject. The situation recognition unit 304 also recognizes the situation of the subject based on changes in the subject's facial expression, movement, amount of perspiration, and voice, changes in the environment information, and the like.

The display unit 305 displays the recognized situation in an identifiable manner. For example, the display unit 305 expresses the recognized situation, emotions, and state of mind in some form, including text. For example, when the subject is angry, the display unit 305 expresses this in a form that makes the anger apparent, and when the subject is calm, in a form that makes the calmness apparent. The display unit 305 may superimpose, for example, the subject's situation, the subject's utterance content, vital data, and surrounding environment information on the video captured by the camera 230.

The alert notification unit 306 issues an alert based on the recognized situation. The alert notification unit 306 issues the alert, for example, to a person looking at the display unit 220. For example, when a change indicating an abnormality appears in the vital data of a person who is a subject in the video, the alert notification unit 306 may issue an alert to that effect. It may also issue an alert when the subject's emotions change, for example when a subject who had been calm suddenly becomes enraged, or when a subject sitting in a chair suffers a sudden drop in consciousness.

The alert is issued by, for example, blinking the screen, displaying an icon or the like indicating the content of the alert, displaying text indicating the content of the alert, sounding an alarm, vibrating the display unit 220, or blinking a lamp.
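The alert behaviour described above, where a change in the recognized situation triggers a notification, could be sketched as follows. The set of states treated as abnormal, the labels themselves, and the message format are assumptions made for illustration; the source leaves these open.

```python
from typing import Optional

# Hypothetical labels for situations that should raise an alert.
ABNORMAL_STATES = {"agitated", "unconscious"}


def alert_for_transition(previous: str, current: str) -> Optional[str]:
    """Return alert text when the situation changes into an abnormal state.

    Returns None when the situation is unchanged or the new state is not
    in ABNORMAL_STATES.
    """
    if current != previous and current in ABNORMAL_STATES:
        return f"ALERT: subject changed from {previous} to {current}"
    return None
```

The returned text could then be routed to any of the notification means listed above, such as an on-screen message or an alarm sound.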

Note that the system may predict what could happen next from the subject's situation and emotions, changes in the environment, and the like, and report preventive measures and countermeasures so that, for example, the person watching the video can carry them out.

FIG. 4 is a diagram showing an example of the situation table held by the information processing device included in the information processing system according to this embodiment. The situation table 401 stores subject information 412, surrounding environment information 413, a situation 414, and an alert 415 in association with a subject ID (Identifier) 411.

The subject ID 411 is an identifier (identification information) that identifies a subject in the video of the predetermined area captured by the camera 230. The subject information 412 is information about the subject, for example the subject's biometric information, motion information, audio information, and facial expression information. The surrounding environment information 413 is information about the environment around the subject, for example temperature, humidity, seismic intensity, noise level, illuminance, and luminance. The situation 414 represents the subject's situation, for example the subject's emotions and state of mind. The alert 415 is the content of the alert to be issued.
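A table of this shape could be represented in memory as in the sketch below. The class names `SituationRecord` and `SituationTable`, the field types, and the example keys are assumptions; the source only states which items are stored in association with the subject ID.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class SituationRecord:
    """One row of a situation table, keyed by subject ID (hypothetical layout)."""
    subject_id: str
    subject_info: Dict[str, float] = field(default_factory=dict)      # e.g. vital data
    environment_info: Dict[str, float] = field(default_factory=dict)  # e.g. temperature
    situation: str = "unknown"                                         # e.g. emotion label
    alert: Optional[str] = None                                        # alert content, if any


class SituationTable:
    """In-memory table associating each subject ID with its latest record."""

    def __init__(self) -> None:
        self._rows: Dict[str, SituationRecord] = {}

    def upsert(self, record: SituationRecord) -> None:
        # Insert a new row or overwrite the existing row for this subject.
        self._rows[record.subject_id] = record

    def get(self, subject_id: str) -> Optional[SituationRecord]:
        return self._rows.get(subject_id)
```

Keying the rows by subject ID mirrors the association described for the situation table 401.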

FIG. 5 is a block diagram explaining the hardware configuration of the information processing device 210 included in the information processing system according to this embodiment. A CPU (Central Processing Unit) 510 is a processor for arithmetic control and implements the functional components of the information processing device 210 in FIG. 3 by executing programs. The CPU 510 may have a plurality of processors and execute different programs, modules, tasks, threads, and the like in parallel. A ROM (Read Only Memory) 520 stores fixed data, such as initial data and programs, as well as other programs. A network interface 530 communicates with other devices and the like via a network. The CPU 510 is not limited to a single CPU; there may be a plurality of CPUs, or a GPU (Graphics Processing Unit) for image processing may be included. The network interface 530 desirably has a CPU independent of the CPU 510 and writes transmission and reception data to, or reads it from, an area of a RAM (Random Access Memory) 540. It is also desirable to provide a DMAC (Direct Memory Access Controller) that transfers data between the RAM 540 and a storage 550 (not shown). Furthermore, an input/output interface 560 desirably has a CPU independent of the CPU 510 and writes input/output data to, or reads it from, an area of the RAM 540. The CPU 510 therefore processes data upon recognizing that the data has been received in or transferred to the RAM 540. The CPU 510 also prepares processing results in the RAM 540 and leaves the subsequent transmission or transfer to the network interface 530, the DMAC, or the input/output interface 560.

The RAM 540 is a random access memory that the CPU 510 uses as a work area for temporary storage. An area for storing the data needed to implement this embodiment is reserved in the RAM 540. A subject ID 541 is data that identifies a subject in the video captured by the camera 230. Subject information 542 is information about the subject. Surrounding environment information 543 is information about the environment around the subject. A subject situation 544 is data about the situation of the subject. Alert content 545 is data about the alert to be issued. These data and information are loaded, for example, from the situation table 401.

Input/output data 546 is data that is input and output via the input/output interface 560. Transmission/reception data 547 is data that is transmitted and received via the network interface 530. The RAM 540 also has an application execution area 548 for executing various application modules.

The storage 550 stores a database, various parameters, and the following data and programs needed to implement this embodiment. The storage 550 stores the situation table 401. The situation table 401 is the table shown in FIG. 4 that manages the relationship between the subject ID 411 and the situation 414 and other items.

The storage 550 further stores a video acquisition module 551, a subject information collection module 552, a surrounding environment information acquisition module 553, a situation recognition module 554, a display module 555, and an alert notification module 556.

The video acquisition module 551 is a module that acquires video of the predetermined area captured by the camera 230. The subject information collection module 552 is a module that collects information about the subjects in the video captured by the camera 230. The surrounding environment information acquisition module 553 is a module that acquires information about the environment around the subject. The situation recognition module 554 is a module that recognizes the situation of the subject based on the video of the subject, the subject information, and the surrounding environment information. The display module 555 is a module that displays the recognized situation of the subject in an identifiable manner. The alert notification module 556 is a module that issues an alert based on the recognized situation. These modules 551 to 556 are read by the CPU 510 into the application execution area 548 of the RAM 540 and executed. A control program 557 is a program for controlling the information processing device 210 as a whole.

The input/output interface 560 handles data exchanged with input/output devices. A display unit 561 and an operation unit 562 are connected to the input/output interface 560. A storage medium 564 may also be connected to the input/output interface 560. In addition, a speaker 563 serving as an audio output unit, a microphone (not shown) serving as an audio input unit, or a GPS position determination unit may be connected. Note that programs and data relating to general-purpose functions and other realizable functions of the information processing apparatus 210 are not shown in the RAM 540 or the storage 550 in FIG. 5.

FIG. 6 is a flowchart illustrating the processing procedure of the information processing apparatus 210 according to this embodiment. The CPU 510 in FIG. 5 executes this flowchart using the RAM 540, thereby implementing the functional components of the information processing apparatus 210 in FIG. 3.

In step S601, the information processing apparatus 210 acquires an image of the predetermined area captured by the camera 230. In step S603, the information processing apparatus 210 collects subject information about a subject in the captured image. In step S605, the information processing apparatus 210 acquires surrounding environment information, which is information about the subject's surrounding environment. In step S607, the information processing apparatus 210 recognizes the situation of the subject based on the image of the subject, the subject information, and the surrounding environment information.

In step S609, the information processing apparatus 210 displays the recognized situation. In step S611, the information processing apparatus 210 determines, based on the recognized information, whether an alert needs to be issued. If no alert is needed (NO in step S611), the information processing apparatus 210 ends the process. If an alert is needed (YES in step S611), the information processing apparatus 210 proceeds to step S613. In step S613, the information processing apparatus 210 issues the alert by a predetermined notification method.
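The flow of steps S601 to S613 described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the implementation of the embodiment: the function names, the `Situation` type, the threshold-based recognition rule, and the alert criterion are all hypothetical and were introduced here only to make the control flow concrete.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Situation:
    subject_id: str
    label: str  # hypothetical labels: "calm" or "distressed"


def recognize_situation(video: bytes, subject_info: Dict, environment: Dict) -> Situation:
    # Placeholder for step S607. The thresholds are assumptions, not from the source.
    distressed = (
        subject_info.get("heart_rate", 0) > 120
        or environment.get("temperature_c", 0) > 35
    )
    return Situation(subject_info["subject_id"], "distressed" if distressed else "calm")


def process_once(
    acquire_video: Callable[[], bytes],
    collect_subject_info: Callable[[bytes], Dict],
    acquire_environment: Callable[[], Dict],
    display: Callable[[Situation], None],
    notify: Callable[[str], None],
) -> Situation:
    video = acquire_video()                                            # S601
    subject_info = collect_subject_info(video)                         # S603
    environment = acquire_environment()                                # S605
    situation = recognize_situation(video, subject_info, environment)  # S607
    display(situation)                                                 # S609
    if situation.label == "distressed":                                # S611 (assumed criterion)
        notify(f"Alert: subject {situation.subject_id} is {situation.label}")  # S613
    return situation                                                   # end of process
```

Passing the camera, sensors, display, and notifier as callables keeps the sketch independent of any particular hardware; in the embodiment these roles are played by the modules 551 to 556.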

According to this embodiment, people watching the video can share the sense of presence and the situation at the site. A person watching the video can also grasp and share the situation of the subject. Even a person who starts watching the video partway through can share the sense of presence at the site. Furthermore, because an alert is issued, a person watching the video can follow the alert and respond to what is happening at the site.

[Third embodiment]
Next, an information processing system according to a third embodiment of the present invention will be described with reference to FIG. 7. FIG. 7 is a diagram for explaining the configuration of the information processing system according to this embodiment. The information processing system according to this embodiment differs from the second embodiment in that it is applied to a conference system. The other configurations and operations are the same as in the second embodiment, so the same configurations and operations are denoted by the same reference numerals and their detailed description is omitted.

The information processing system (conference system) 700 acquires audio information, such as the utterances and conversations of conference participants, from the terminals 701 and 702 with microphone speakers. The information processing system 700 also acquires, from the various sensors 703, video of the conference, vital data of the conference participants, environmental data of the conference room, and the like.

The information processing system 700 identifies the speaker by voiceprint recognition or the like, based on audio information from the microphones of the terminals 701 and 702 with microphone speakers. Alternatively, it identifies the speaker by face authentication or the like, based on video information from a camera. The speaker may also be identified by combining voiceprint authentication and face authentication.

Further, for example, when one terminal 701 with a microphone speaker is shared by the conference participants, the speaker can be identified with higher accuracy by also using video from the various sensors 703, such as a 360-degree camera or a camera with a fisheye lens, and making a comprehensive judgment. For example, mouth-movement recognition, face authentication, and identification of the speaker's direction are performed on the camera video, and the voiceprint is determined from the audio collected by the microphone. By judging such multiple pieces of information together, the speaker can be identified with higher accuracy for each utterance or conversation.
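One simple way to combine voiceprint and face evidence, as described above, is a weighted sum of per-candidate scores. The sketch below assumes that separate recognizers have already produced similarity scores in the range 0 to 1; the function name `identify_speaker`, the `voice_weight` parameter, and the linear fusion rule are assumptions made for illustration, since the source does not specify how the judgment is made.

```python
from typing import Mapping


def identify_speaker(
    voice_scores: Mapping[str, float],
    face_scores: Mapping[str, float],
    voice_weight: float = 0.5,
) -> str:
    """Return the candidate with the highest fused score (hypothetical rule)."""
    candidates = set(voice_scores) | set(face_scores)
    if not candidates:
        raise ValueError("no candidates to choose from")

    def fused(name: str) -> float:
        # A candidate missing from one modality contributes 0 for that modality.
        return (
            voice_weight * voice_scores.get(name, 0.0)
            + (1.0 - voice_weight) * face_scores.get(name, 0.0)
        )

    # Sorting first makes ties resolve deterministically by name.
    return max(sorted(candidates), key=fused)
```

With a shared microphone the voiceprint scores of two participants may be close, and the face or mouth-movement score then decides the result, which is the situation the paragraph above describes.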

The information processing system 700, for example, converts the identified speaker and the content of the utterance into text as a set, based on information from the terminals 701 and 702 with microphone speakers and the various sensors 703. The conversion to text may be performed by an application installed on the terminal 704. When the application installed on the terminal 704 performs the conversion, audio information that has not yet passed through the network can be used as input, which improves the accuracy of the conversion.

The information processing system 700 then performs conversation analysis (utterance analysis) and also expresses, as text, additional information about the conversation (forceful, weak, laughter, anger, sadness, and so on) that helps a reader picture the scene of the conference. In this way, even a person participating in the conference without a microphone or speaker can share the content of the conference, the sense of presence, and so on.
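The pairing of speaker, utterance text, and additional information can be pictured as a small record that is rendered to one line of text. The `Utterance` type, its field names, and the output layout below are hypothetical; the source only states that the speaker, the content, and annotations such as "forceful" or "laughter" are expressed together as text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Utterance:
    speaker: str
    text: str
    annotations: List[str] = field(default_factory=list)  # e.g. "forceful", "laughter"


def render(utterance: Utterance) -> str:
    """Format one utterance as 'speaker: text (annotation, ...)'."""
    suffix = f" ({', '.join(utterance.annotations)})" if utterance.annotations else ""
    return f"{utterance.speaker}: {utterance.text}{suffix}"
```

For example, `render(Utterance("Participant A", "We should ship next week.", ["forceful"]))` yields `Participant A: We should ship next week. (forceful)`.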

In addition to the microphones and speakers, the information processing system 700 adds changes and movements detected by the various sensors 703, for example in temperature, humidity, noise, screams, vital information, and measuring instruments, so that even in places other than a conference room, people who are not at the place can share the sense of presence.

In an environment without a microphone or speaker, text entered on the terminal 704 by a person participating in the conference by text may be converted by speech synthesis and played through each participant's speaker.

As for places other than a conference room, even in a poorly equipped environment such as home medical care, rather than a well-equipped hospital, people who are not present can share the sense of presence.

In addition, because the speakers have been identified, the information processing system 700 can record the text as the minutes of the conference. The information processing system 700 can also follow up actively by sending the recorded minutes to the conference participants by e-mail or the like.

Because utterances are converted into text, the text can also be translated in real time or afterwards. This makes it possible to produce minutes in the language requested by a person who wishes to obtain them.

Furthermore, by analyzing the text of the utterances, the information processing system 700 can automatically perform actions corresponding to their content. The information processing system 700 can automatically perform product ordering, various adjustments, searches, responses, alarms, transmissions, stops, and the like. For example, the information processing system 700 can notify the person in charge of an alert such as "Bring towels to room 302 immediately."
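Deriving an action from utterance text can be illustrated with a small keyword rule table. The rules, the recipient names, and the `alert_for` function below are hypothetical; the source does not say how the utterance analysis is performed, and a real system might use a more capable language-analysis component instead of regular expressions.

```python
import re
from typing import Dict, Optional

# Hypothetical rules: (pattern to look for, person or team to notify).
RULES = [
    (re.compile(r"\btowels?\b", re.IGNORECASE), "housekeeping"),
    (re.compile(r"\border\b", re.IGNORECASE), "purchasing"),
]


def alert_for(utterance_text: str) -> Optional[Dict[str, str]]:
    """Return an alert for the first matching rule, or None if nothing matches."""
    for pattern, recipient in RULES:
        if pattern.search(utterance_text):
            return {"recipient": recipient, "message": utterance_text}
    return None
```

Under these assumed rules, the utterance "Bring towels to room 302 immediately" produces an alert addressed to `housekeeping`, while an utterance that matches no rule produces no alert.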

According to this embodiment, even people who are not attending the conference can share the sense of presence at the conference site. Even a person who joins the conference partway through can share the sense of presence at the conference site. Furthermore, even in a poorly equipped environment such as home medical care, rather than a well-equipped hospital, a person watching the video can share the situation of the subject and the sense of presence at the site. Also, in situations where many people gather, such as concert venues, prisons, and demonstration marches, a person who is not there or who is watching the video can share the situation of the participants and the sense of presence at the site.

[Other embodiments]
Although the present invention has been described above with reference to the embodiments, the present invention is not limited to those embodiments. Various changes that can be understood by those skilled in the art can be made to the configuration and details of the present invention within its scope. A system or apparatus that combines, in any way, the separate features included in the respective embodiments is also included in the scope of the present invention.

The present invention may be applied to a system composed of a plurality of devices or to a single apparatus. The present invention is also applicable when an information processing program that implements the functions of the embodiments is supplied to a system or apparatus directly or remotely. Accordingly, a program installed on a computer to implement the functions of the present invention on that computer, a medium storing the program, and a WWW (World Wide Web) server from which the program is downloaded are also included in the scope of the present invention. In particular, at least a non-transitory computer readable medium storing a program that causes a computer to execute the processing steps included in the above-described embodiments is included in the scope of the present invention.

[Other expressions of the embodiment]
Some or all of the above embodiments can also be described as in the following appendices, but are not limited to them.
(Appendix 1)
An information processing apparatus comprising:
image acquisition means for acquiring an image from imaging means that images a predetermined area;
subject information collection means for collecting subject information of a subject appearing in the captured image;
situation recognition means for recognizing a situation of the subject based on the image of the subject and the subject information; and
display means for displaying the recognized situation in an identifiable manner.
(Appendix 2)
The information processing apparatus according to appendix 1, further comprising surrounding environment information acquisition means for acquiring surrounding environment information, which is information on the surrounding environment of the subject,
wherein the situation recognition means recognizes the situation of the subject further based on the surrounding environment information.
(Appendix 3)
The information processing apparatus according to appendix 2, wherein the surrounding environment information includes at least one of temperature, humidity, seismic intensity, noise level, illuminance, and luminance.
(Appendix 4)
The information processing apparatus according to any one of appendices 1 to 3, further comprising alert notification means for issuing an alert based on the recognized situation.
(Appendix 5)
The information processing apparatus according to any one of appendices 1 to 4, wherein the subject information includes at least one of voice information, biological information, motion information, and facial expression information of the subject.
(Appendix 6)
The information processing apparatus according to appendix 5, wherein the biological information includes at least one of body temperature, heartbeat, pulse, hunger, thirst, nausea, urge to urinate, pain, respiratory rate, brain waves, and amount of perspiration.
(Appendix 7)
The information processing apparatus according to any one of appendices 1 to 6, wherein the situation includes at least an emotion of the subject.
(Appendix 8)
The information processing apparatus according to any one of appendices 1 to 7, wherein the display means displays text representing the situation.
(Appendix 9)
An information processing method comprising:
an image acquisition step of acquiring an image from imaging means that images a predetermined area;
a subject information collection step of collecting subject information of a subject appearing in the captured image;
a situation recognition step of recognizing a situation of the subject based on the image of the subject and the subject information; and
a display step of displaying the recognized situation in an identifiable manner.
(Appendix 10)
An information processing program that causes a computer to execute:
an image acquisition step of acquiring an image from imaging means that images a predetermined area;
a subject information collection step of collecting subject information of a subject appearing in the captured image;
a situation recognition step of recognizing a situation of the subject based on the image of the subject and the subject information; and
a display step of displaying the recognized situation in an identifiable manner.
(Appendix 11)
An information processing system comprising:
imaging means for capturing an image of a predetermined area;
subject information collection means for collecting subject information of a subject included in the image of the predetermined area captured by the imaging means;
situation recognition means for recognizing a situation of the subject based on the image of the subject and the subject information; and
display means for displaying the recognized situation in an identifiable manner.

Claims

Translated from Japanese

1. An information processing apparatus comprising:
image acquisition means for acquiring an image of a predetermined area and at least one subject;
information collection means for collecting subject information about the subject and surrounding environment information, which is information on the surrounding environment of the subject;
subject situation recognition means for recognizing a situation of the subject based on the image of the subject and the subject information; and
display means for displaying, on a single display screen, the image of the predetermined area including the image of the subject, the situation of the subject, the subject information, and the surrounding environment information, so that a person viewing the display screen can share the sense of presence of the predetermined area.

2. The information processing apparatus according to claim 1, wherein the display means further displays the subject information in graph form.

3. The information processing apparatus according to claim 1 or 2, wherein the subject situation recognition means recognizes the situation of the subject based on the image of the subject and the subject information and, further, on the surrounding environment information.

4. The information processing apparatus according to any one of claims 1 to 3, wherein the surrounding environment information includes at least one of temperature, humidity, seismic intensity, noise level, illuminance, and luminance.

5. The information processing apparatus according to any one of claims 1 to 4, wherein the subject information includes at least one of voice information of the subject, biological information measured from the subject, motion information of the subject, and facial expression information of the subject.

6. The information processing apparatus according to claim 5, wherein the biological information measured from the subject includes at least one of body temperature, heartbeat, pulse, hunger, thirst, nausea, urge to urinate, pain, respiratory rate, brain waves, and amount of perspiration.

7. The information processing apparatus according to any one of claims 1 to 6, wherein the predetermined area is any one of an assembly hall, a concert hall, a hospital, a prison, a home, and a conference room.

8. An information processing method comprising:
an image acquisition step of acquiring an image of a predetermined area and at least one subject;
an information collection step of collecting subject information about the subject and surrounding environment information, which is information on the surrounding environment of the subject;
a subject situation recognition step of recognizing a situation of the subject based on the image of the subject and the subject information; and
a display step of displaying, on a single display screen, the image of the predetermined area including the image of the subject, the situation of the subject, the subject information, and the surrounding environment information, so that a person viewing the display screen can share the sense of presence of the predetermined area.

9. An information processing program that causes a computer to execute:
an image acquisition step of acquiring an image of a predetermined area and at least one subject;
an information collection step of collecting subject information about the subject and surrounding environment information, which is information on the surrounding environment of the subject;
a subject situation recognition step of recognizing a situation of the subject based on the image of the subject and the subject information; and
a display step of displaying, on a single display screen, the image of the predetermined area including the image of the subject, the situation of the subject, the subject information, and the surrounding environment information, so that a person viewing the display screen can share the sense of presence of the predetermined area.

10. An information processing system comprising:
imaging means for capturing an image of a predetermined area and at least one subject;
information collection means for collecting subject information about the subject imaged by the imaging means and surrounding environment information, which is information on the surrounding environment of the subject;
subject situation recognition means for recognizing a situation of the subject based on the image of the subject and the subject information; and
display means for displaying, on a single display screen, the image of the predetermined area including the image of the subject, the situation of the subject, the subject information, and the surrounding environment information, so that a person viewing the display screen can share the sense of presence of the predetermined area.