JP2004247844A

Info

Publication number: JP2004247844A
Application number: JP2003033736A
Authority: JP
Inventors: Kenji Otoi; 研二乙井; Shunichi Sekiguchi; 俊一関口; Yoshimi Moriya; 芳美守屋; Hirobumi Nishikawa; 博文西川; Junichi Yokosato; 純一横里; Yoshiaki Kato; 嘉明加藤; Fuminobu Ogawa; 文伸小川; Kotaro Asai; 光太郎浅井
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2003-02-12
Filing date: 2003-02-12
Publication date: 2004-09-02

Abstract

【課題】メタデータの処理負荷を低減する。また、映像再生端末の種別・能力を意識することなく映システムサイドの適応化によって映像データを配信する。
【解決手段】コンテンツ購入処理サーバ２では、ユーザの好みを示すユーザプロファイル情報１３と、メタデータ５の要約情報であるアダプテーションヒント７とにより、ユーザの好みに合うメタデータ５のみを選別し統合して各ユーザお勧めのコンテンツメニューであるＲＣＭ９を作成して、携帯端末１へ送る。携帯端末１はＲＣＭ９を参照して、コンテンツの購入要求１０をコンテンツ購入処理サーバ２へ送信し、コンテンツの視聴を許可するコンテンツ視聴券１２を得て、コンテンツ視聴券１２を有する映像端末１２から端末情報１６およびコンテンツ視聴ステイタ情報１５をコンテンツ配信サーバ３に送信し、当該映像端末４に最適なデータフォーマットの映像データ１７を受信する。
【選択図】図１
An object of the present invention is to reduce the processing load of metadata, and to distribute video data through adaptation on the system side, without the user having to be aware of the type and capability of the video playback terminal.
A content purchase processing server 2 uses user profile information 13, which indicates the user's preferences, and an adaptation hint 7, which is summary information of metadata 5, to select and integrate only the metadata 5 that matches the user's preferences. It thereby creates an RCM 9, a recommended content menu for each user, and sends it to a mobile terminal 1. The mobile terminal 1 refers to the RCM 9, transmits a content purchase request 10 to the content purchase processing server 2, and obtains a content viewing ticket 12 that permits viewing of the content. A video terminal 4 holding the content viewing ticket 12 transmits terminal information 16 and content viewing status information 15 to a content distribution server 3, and receives video data 17 in the data format best suited to that video terminal 4.
[Selection diagram] Fig. 1

Description

Translated from Japanese

【０００１】
【発明の属する技術分野】
この発明は、メタデータに関連付けられて管理される映像コンテンツの効率的伝送、蓄積、管理、運用を行うシステムに供することのできるメタデータ選別処理方法、メタデータ選択統合処理方法、メタデータ選択統合処理プログラム、映像再生方法、コンテンツ購入処理方法、コンテンツ購入処理サーバ、コンテンツ配信サーバに関するものである。具体的な応用例としては、ネット配信型コンテンツサービスの提供、映像利用型監視システムの構築、などがある。
【０００２】
【従来の技術】
近年、ＭＰＥＧ−２，ＭＰＥＧ−４などデジタル映像コンテンツの圧縮技術の普及を背景として、ＤＶＤやハードディスクレコーダ、映像再生機能付携帯電話など、デジタル映像コンテンツを再生・記録する機器・装置が爆発的な普及を見せている。また、インターネットの広帯域化や広帯域モバイル網の整備に伴い、オンラインでデジタル映像コンテンツを伝送、蓄積、管理、運用するシステムの重要性が増している。一方で、いまもってデジタル映像コンテンツはオンラインで直接やりとりするには文書情報などに比してかさばるデータ量であること、かつ、意味内容が時系列変化する情報であることから、その内容を遠隔から的確かつ迅速に確認するためにメタデータを付与することの意義が大きい。このような映像コンテンツへのアクセスを効率化させる目的で、ＭＰＥＧ−７（ＩＳＯ／ＩＥＣ１５９３８）、ＴＶＡｎｙｔｉｍｅ（ｈｔｔｐ：／／ｗｗｗ．ｔｖ−ａｎｙｔｉｍｅ．ｏｒｇ／）などの国際標準・業界標準の映像メタデータフォーマットが定められている。このような標準化されたメタデータによれば、オンラインでの映像コンテンツ管理・運用を行うシステムは、ネット上で映像コンテンツそのものにアクセスすることなく外部からのアクセス（検索、ブラウジング、フィルタリング）に効率的に対処することができる。
【０００３】
なお、このようなメタデータを開示した従来技術として、例えば、特開平２００１−０２８７２２号公報に開示されている動画像管理装置がある。この動画像管理装置では、ビデオデータを複数のシーンに分割し、各シーンに対して、そのシーンの再生に必要な区間情報と、シーン番号と、シーンを代表する画像との集まりであるメタデータを作成し、各インデックスに検索目的に応じたタイトルを付与する。ユーザはタイトルを用いてメタデータを検索し、インデックスに含まれるシーン番号の順番と区間情報により、ビデオデータの中の必要なシーンだけをつなぎ合わせて再生することができる。
【０００４】
【特許文献１】
特開平２００１−０２８７２２号公報
【０００５】
【発明が解決しようとする課題】
しかしながら、映像コンテンツの総数が爆発的に増大すると、映像コンテンツのメタデータそのものを管理するデータベースや、メタデータ検索エンジンの処理負荷も大きくなり、結果的にコンテンツアクセスの効率低下を招くという問題が発生する。
【０００６】
特に、映像コンテンツは、時系列により内容が変わるコンテンツであるので、アクセスの主体（ユーザなど）側からの要求として、時系列中の特定の箇所に迅速にアクセスを行いたい、という要求があり、そのためには時系列を所定のシーン単位に分割してメタデータを付与して管理する、といった必要性が生じる。このため、メタデータ自体のデータ量が映像の長さに比例して増大する可能性がある。メタデータに基づいてアクセスの主体が欲する映像コンテンツを見つけようとすると、時系列の最後までメタデータの内容を確認しなければ所望の映像シーンを抽出することができない、という問題がある。
【０００７】
さらに、従来のデジタル映像コンテンツ管理・運用システムにおいては、映像配信側が、視聴に利用可能な映像再生端末の限定を行っており、コンテンツを視聴するには、ユーザ側がシステムの仕様に合わせて端末を用意する必要があった。
【０００８】
本発明は、上記課題を解決するためになされたものであり、メタデータの処理負荷を低減できるメタデータ選別処理方法、メタデータ選択統合処理方法、メタデータ選択統合処理プログラム、映像再生方法、およびユーザが映像再生端末の種別・能力を意識することなくシステムサイドの適応化によって映像データの伝送を行うことのできるコンテンツ購入処理方法、コンテンツ購入処理サーバ、コンテンツ配信サーバを提供することを目的とする。
【０００９】
【課題を解決するための手段】
上記課題を解決するため、本発明では、映像コンテンツのメタデータ中にどのような情報が含まれているかを記述したメタデータ要約情報に基づいて、上記メタデータを選別することを基本とする。
【００１０】
【発明の実施の形態】
実施の形態１．（ネット配信型コンテンツ提供サービスモデル）
本実施の形態１では、本発明を利用した映像コンテンツ管理運用システムの事例として、ネット配信型コンテンツ提供サービスを例に挙げてその構成について述べる。
【００１１】
図１に、本実施の形態１におけるネット配信型コンテンツ提供サービスの全体システム構成を示す。同図において、携帯電話１と、コンテンツ購入処理サーバ２とが携帯電話回線上に構成されるＩＰ網を介して接続されている。また、コンテンツ配信サーバ３と、映像端末４とが公衆ＩＰ網を介して接続されている。携帯電話１と映像端末４とは、例えば赤外線通信経由で接続されているが、動画再生機能付き携帯電話のように、携帯電話１と映像端末４とが同一装置で構成されていても勿論かまわない。なお、以下、すべての実施の形態において、映像データとは、映像のみのデータだけでなく、それに付随するオーディオデータを含んだオーディオビジュアルデータも含むものとする。
【００１２】
メタデータ５は、映像データ６に関連付けられており、少なくとも、映像データ６の例えばシーンのタイトル、シーンの開始点・長さなどのシーン構成と、個々のシーンの映像、音声等のメディアデータに関する特徴量を記述したデータである。
【００１３】
図２に、メタデータ５の構成例を示す。メタデータ５は、例えば、図２に示すように、映像コンテンツのタイトルや、ジャンル、符号化方式等のコンテンツ全体の属性と、各シーン毎に、開始時刻や、シーン時間長、キーワード、特徴量などのメタデータが記述されている。また、シーンによっては、その下位に幾つかのサブシーンが構成され、このサブシーンにも、開始時刻や、シーン時間長、キーワード、特徴量などのメタデータが記述される。なお、これらのメタデータ５は、映像データ６に関して、コンテンツ提供者や本実施の形態１のサービスをおこなう者によってあらかじめ作成されていても良いし、本実施の形態１のサービス提供中に動的に作成されても良い。
【００１４】
動的に作成するメタデータ５の例としては、例えば、ライブ映像コンテンツを提供するケースを想定し、サービス加入者がオンラインで即時視聴できない場合に、ある程度サービス提供側で映像データ６をキャッシュしておき、あらかじめ用意しておくライブ映像固有の映像特徴あるいは音声特徴のテンプレート、例えば、スポーツのライブ映像であれば、サッカーのゴールシーンや、野球のホームランシーン、ゴルフのショットシーンなどのテンプレートとのマッチングを取ることで、映像データ６中の特徴的な映像シーンの抽出を行うなどの事例が考えられる。
【００１５】
また、事前に付与されるメタデータ５の例としては、たとえば「俳優○○が出ているシーン」などといったキーワード的なデータ要素が考えられる。
【００１６】
アダプテーションヒント７は、メタデータ５に関連付けられており、個々のメタデータ５に実際にどのようなデータが含まれているかを記述したメタデータの要約情報である。アダプテーションヒント７も、メタデータ５に関してコンテンツ提供者や本実施の形態１のサービスをおこなう者によって、あらかじめ作成されていても良いし、本実施の形態１のサービスをおこなっている最中に動的に作成されても良い。同図において、メタデータ５、アダプテーションヒント７は、コンテンツ購入処理サーバ２の一部として蓄積されているものとして記載されているが、これらはインターネット上の任意のサーバに配置されていてもよい。
【００１７】
以下、図１のシステムの、一連の動作フローを説明する。
▲１▼．コンテンツ購入処理
コンテンツ購入処理サーバ２は、例えばネット配信型コンテンツサービスプロバイダが運用するものであり、サービス加入ユーザごとにカスタマイズされた、お勧めのコンテンツメニュー９（ＲｅｃｏｍｍｅｎｄｅｄＣｏｎｔｅｎｔｓＭｅｎｕ、以下ＲＣＭと略す）を携帯電話１に送出する。ＲＣＭ９は、ＲＣＭ生成部８によって生成されるが、その詳細動作は後述する。
【００１８】
ＲＣＭ９は、システムがユーザに勧めるコンテンツの一覧情報であり、例えば各コンテンツの代表画像のサムネイルや、サムネイルへリンクとして埋め込まれるコンテンツの識別情報（ＵＲＬなど）、コンテンツのタイトル、概要説明情報といったデータが含まれる。ＲＣＭ９はあらゆるデータ表現形式を取りうる。例えば、携帯電話がＷＥＢブラウザを搭載している場合はＨＴＭＬデータであってもよいし、携帯電話がＳＭＩＬブラウザを搭載している場合はＳＭＩＬデータとして、携帯電話がＪａｖａ（登録商標）実行環境を有する場合にはＪａｖａ（登録商標）プログラムとして実装されてもよい。
【００１９】
ユーザは、サービスプロバイダ等のコンテンツ購入処理サーバ２から送られてくるＲＣＭ９をディスプレイ上に表示して、自分が視聴したいコンテンツを選択し、購入要求１０を送出する。その際、ユーザインタフェースとしては、例えば、ＲＣＭ９を構成する情報の一部としてコンテンツの代表画像（サムネイル）をディスプレイ上に表示して、それを選択し、購入ボタンを押す、などの処理で購入要求１０を送出する。
【００２０】
購入要求１０は、ユーザが何を選択したかの情報となるため、選択されたコンテンツに関連付けられているメタデータからタイトル、ジャンル、キーワードなどの情報を吸い上げ、サービスプロバイダ側の個々の加入ユーザのプロファイル情報１３として蓄積・更新しておき、以降のＲＣＭ９供給時に反映させる。この機構により、ＲＣＭ作成部８は、更新されたユーザプロファイル情報１３を用いて、常にユーザの最新の嗜好を反映したＲＣＭ９を送信することができる。つまり、ユーザプロファイル情報１３は、ＲＣＭ９生成のためのトリガとなるクエリ情報として用いられる。
【００２１】
コンテンツ購入処理サーバ２は、その内部の機能ブロックである購入処理部１１において、適切なユーザ認証のもとで購入要求１０を処理し、ユーザが選択したコンテンツを視聴する権利情報であるコンテンツ視聴券１２を当該ユーザの携帯電話１へ送出する。この際の認証情報としては、携帯電話の電話番号を用いる、などが考えられる。
【００２２】
図３に、コンテンツ視聴券１２の一例を示す。コンテンツ視聴券１２は、例えば、図３に示すように、ユーザが購入したコンテンツのタイトル（Ｔｉｔｌｅ）や、概要情報（Ａｂｓｔｒａｃｔ）、購入日時（ＰｕｒｃｈａｓｅＴｉｍｅ／Ｄａｔｅ）、コンテンツの内容である映像データ６のＵＲＬ等のロケータ情報、コンテンツの総再生時間（ＴｏｔａｌＴｉｍｅ）、映像データ６のどの時刻から再生すればよいかの情報（ＳｔａｒｔＴｉｍｅ）、ユーザが改変・操作できないデータ形式で記述されたコンテンツ視聴有効期限の情報（ＥｘｐｉｒｅＴｉｍｅ／Ｄａｔｅ）のいずれかを含むデジタルデータである。なお、コンテンツ視聴有効期限の情報（ＥｘｐｉｒｅＴｉｍｅ／Ｄａｔｅ）はユーザが改変・操作できるデータ形式で記述されていても良いし、また、コンテンツ視聴券１２にはこのコンテンツ視聴有効期限の情報（ＥｘｐｉｒｅＴｉｍｅ／Ｄａｔｅ）が記述されてなく、コンテンツ購入処理部１２等のサーバ側でのみユーザが改変・操作できないように管理するようにしても勿論よい。また、コンテンツ視聴有効期限の情報（ＥｘｐｉｒｅＴｉｍｅ／Ｄａｔｅ）だけでなく、映像データ６をどの時刻から再生すればよいかの情報（ＳｔａｒｔＴｉｍｅ）以外の更新されない情報は、ユーザが改変・操作できないデータ形式で記述されていることが望ましい。また、このコンテンツ視聴券１２がなければ、コンテンツの内容である映像データ６のロケータ情報が入手できないように、このコンテンツ視聴券１２に、このロケータ情報を入手するための暗号キーなどが記述されていても勿論良い。携帯電話１は、コンテンツ視聴有効期限情報（ＥｘｐｉｒｅＴｉｍｅ／Ｄａｔｅ）が示す有効期限が残っている間は、コンテンツ視聴券１０を自端末内に記憶しておくようにする。また、有効期限が過ぎたコンテンツ視聴券１０は自端末から抹消する。コンテンツ購入処理サーバ２から送信された時点では、再生すべき時刻は、映像データ６の始まる時刻である。なお、再生開始時刻を、ユーザが以前の再生時刻に設定できず、自動的に再生時間分だけ自動的にインクリメントのみするように制御することにより、１回限りのコンテンツ視聴許可に制限すること可能となる。勿論このような１回限りのコンテンツ視聴許可に制限しなくても、本実施の携帯１では、コンテンツ視聴有効期限情報（ＥｘｐｉｒｅＴｉｍｅ／Ｄａｔｅ）により、コンテンツの視聴期限は制限している。
【００２３】
ＲＣＭ作成部８は、例えばネット上もしくはローカルの大容量ディスクに蓄積されたサービス提供対象となる映像コンテンツに関連付けて蓄積される膨大なメタデータ５を入力して、該入力メタデータそれぞれについて付与されているアダプテーションヒント７と、ユーザプロファイル１３を用いてターゲットとなるメタデータの部位をフィルタリングし、それらを統合することによってお勧めのコンテンツメニューであるＲＣＭ９を作成し送出する。具体的には、ターゲットとなるメタデータの部位とは、ユーザ嗜好としてユーザプロファイル１３に登録されてキーワードに合致する映像シーンのメタデータなどである。
【００２４】
ここで、アダプテーションヒント７は、対応するメタデータ５の特定のデータ種別に着目して、メタデータ５の実データ（以下、インスタンスと呼ぶ）に、着目する種別のデータがどの程度含まれているかを示す概要・要約情報である。例えば、着目するデータ種別の例として、ＭＰＥＧ−７で規定される、映像シーンを特定の観点から見た場合の重要度を記述するデータである「ＰｏｉｎｔＯｆＶｉｅｗ記述子」を考える。ＰｏｉｎｔＯｆＶｉｅｗ記述子は、以下のようなデータシンタックスで表現される。
【００２５】

【００２６】
ｖｉｅｗｐｏｉｎｔとは、上記の「観点」を記述するデータであり、いわば映像シーンをある側面から見た場合の特徴的なキーワードの情報と言い換えることができる。
【００２７】
Ｉｍｐｏｒｔａｎｃｅとは、０〜１の範囲で定義され、ｖｉｅｗｐｏｉｎｔから映像シーンを見た場合に、該ｖｉｅｗｐｏｉｎｔの意味で当該映像シーンがどの程度重要度が高いシーンかを示す情報である。上記のデータ例では、ある映像シーンを「スポーツ」という観点から見た場合に、その重要度は０．５、すなわち１００％を最重要とすれば５０％程度のそこそこの程度である、ということを記述していることになる。ＲＣＭ作成部８は、アダプテーションヒント７としてＭＰＥＧ−７で規定されるＰｏｉｎｔＯｆＶｉｅｗ記述子のｖｉｅｗｐｏｉｎｔの内容と、ユーザのプロファイル情報１３とのマッチングによって、ユーザが望むコンテンツのメタデータを抽出・特定することができる。
【００２８】
なお、ＭＰＥＧ−７規格の標準準拠性の観点からは、ＭＰＥＧ−７で規定される記述子は、スキーマ（シンタックス定義）によってメタデータ５のインスタンス内に含まれることが規定されても、実際のインスタンスに記述として現れるかどうかは、インスタンスを確認する以外に知る手段がない。したがって、映像コンテンツのように時系列で長時間の映像データでは、＜ＰｏｉｎｔＯｆＶｉｅｗ記述子＞のような映像シーン単位に出現の状況が異なるような記述子がメタデータ５のインスタンス内にどの程度含まれているか、サブシーン階層構造を有する映像コンテンツの場合にどの階層に含まれているかなどの情報は、メタデータ５のみでは事前に知ることができないので、上記例のように「スポーツ」を含む映像シーンの存在を＜ＰｏｉｎｔＯｆＶｉｅｗ＞でチェックするにはメタデータすべての解析作業が必要となる。この解析処理の処理量を低減する目的で、メタデータ要約情報としてのアダプテーションヒント７を用いることになる。アダプテーションヒント７は、例えば解析対象のメタデータ５に何個の＜ＰｏｉｎｔＯｆＶｉｅｗ＞が含まれるか、どの階層に＜ＰｏｉｎｔＯｆＶｉｅｗ＞が含まれるか、などのメタデータのインスタンスに関する要約情報を記述する。したがって、端的な例を示せば、複数のメタデータ５をチェックする前にアダプテーションヒント１４を確認して、個々のメタデータに＜ＰｏｉｎｔＯｆＶｉｅｗ＞が少なくとも１つ含まれるかどうかを知ることができれば、＜ＰｏｉｎｔＯｆＶｉｅｗ＞が含まれるメタデータについてのみ、そのインスタンスの記述内容を解析すればよいため、ＲＣＭ９を生成するためのユーザプロファイル情報１３とメタデータ５とのマッチング処理量を大幅に削減することができる。
【００２９】
▲２▼．映像データ配信要求・再生
以上の手順で、ユーザは自分の携帯電話１にコンテンツ購入サーバ２よりコンテンツ視聴券を獲得し、その有効期限が来るまで保持しておき、どこにでも持ち歩くことができる。持ち歩いたコンテンツ視聴券１２は、その内容を解釈できる映像端末４へ赤外線通信や、記録媒体等を介して伝達することにより、映像端末４が自身の映像再生能力に応じてコンテンツ配信サーバ３にユーザが購入した映像コンテンツを映像データ６として要求する。勿論、携帯電話１と映像端末４とが同一であれば、このようなコンテンツ視聴券の伝達は不要である。このため、映像端末４は、コンテンツ視聴券１２の情報を標準的に定められた手順で解釈し、ユーザが有する権利に応じて正当な映像データ６を要求し、再生することが可能な機能処理部を備える。このような機能を有する映像端末４と、携帯電話１および映像端末４と分離してのコンテンツ視聴券１２の流通により、ユーザは携帯電話１を持ち歩くだけで、いつでもどこでも利用可能な映像端末４を用いて、自宅でビデオ機器を操作するがごとくコンテンツを視聴することが可能となる。つまり、コンテンツ購入処理サーバ２から正当に取得したコンテンツ視聴券１２を自己の携帯電話１等に保持している限り、自己の映像端末４とは性能の異なる他人の映像端末４を借りてそこに自己のコンテンツ視聴券１２をセットする限りは、他人の映像端末４を借りても、そのコンテンツ視聴券１２により許可された映像コンテンツをコンテンツ配信サーバ３より正常に受信して使用できることになる。
【００３０】
ここで、映像端末４は、コンテンツ視聴ステイタス情報１５、端末情報１６をコンテンツ配信サーバ３に送出することにより、映像端末４自身の映像再生能力に応じた映像コンテンツを映像データ６として要求する。コンテンツ視聴ステイタス情報１５とは、コンテンツ視聴の状態を保持した情報であり、少なくともコンテンツのＵＲＬなどのロケータ情報、コンテンツを再生開始からどこまで視聴したかを示す再生時刻情報を含む。コンテンツ視聴ステイタス情報１５は、ロケータ情報と、再生時刻情報とにより、ユーザは映像端末４を特定してコンテンツ視聴券１２を送信したあと、コンテンツを再生したいポイントから瞬時に再生開始させることができる。これによれば、ある映像端末４で視聴していたコンテンツの視聴をある時刻で中止した場合、その時刻の次の時刻が再生時刻情報としてコンテンツ視聴券１２に更新されるので、同一のあるいは別の映像端末４でそのコンテンツを視聴する場合でも、前回中止した時点の次からコンテンツの視聴を継続できるというメリットがある。
【００３１】
また、前に別の映像端末４で視聴した時点の記憶があいまいで、今回の視聴を始める前に以前の映像のストーリーを確認したい場合には、あらかじめ再生再開時刻よりも遡った時刻から高速再生またはダイジェスト再生などを行わせることによって、手早く視聴記憶を取り戻して再生を再開させることもできる。この際、高速再生やダイジェスト再生の途中でストーリー上のキーになる映像箇所では代表画像を静止画として表示したり、といったインタフェースも可能である。
【００３２】
端末情報１６は、映像端末４に固有で、標準的なデータ定義に従って記述される情報として映像端末４から標準的な手順に従ってコンテンツ配信サーバ３に送信される。端末情報１６は、少なくとも、映像端末４の表示可能な時間・空間解像度や、受信可能な映像データフォーマット、例えば符号化方式やビットレートなどを含んでいる。
【００３３】
コンテンツ配信サーバ３では、映像フォーマット判定部１８が端末情報１６に従って、映像端末４に送信すべき映像データ６の最適なデータフォーマット、すなわち適応化映像データ１７のデータフォーマットを決定する。データフォーマットの種別には、例えばＭＰＥＧ−４、ＭＰＥＧ−２、Ｍｏｔｉｏｎ−ＪＰＥＧ、ＪＰＥＧ、Ｗｉｎｄｏｗｓ（登録商標）ＭｅｄｉａＶｉｄｅｏ，ＤｉｖＸなどがある。映像フォーマット判定部１８、決定した送出映像データフォーマットに対して、コンテンツ配信サーバ３自身からそのまま送出可能な映像データか、自分自身でリアルタイムに映像データフォーマットの変換を行いながら送出すべきデータか、あるいは他の連携して動作する別サーバから送出することが適当な映像データか、などの判断を行い、要求元の映像端末４にとって最適な解像度、映像データフォーマットにしたがって、要求された映像データ６を適応化映像データ１７として要求された再生開始時刻から配信する。
【００３４】
なお、映像端末４では、映像再生を中断した場合、中断した時刻でのコンテンツ視聴券１２を携帯電話１に送信する。すると、携帯電話１は、中断した時刻を更新したコンテンツ視聴券１２を記憶する。なお、コンテンツ視聴券１２への中断した時刻の更新は、携帯電話１だけでなく、映像端末４等が行うようにしても勿論よい。そして、携帯電話１が、コンテンツ視聴券１２を別の映像端末４に送信すれば、上記同様の手順によって、別の映像端末４では映像再生を中断した時刻から、再び映像を再生することができる。
【００３５】
▲３▼．映像データ蓄積配信
コンテンツ配信サーバ３は、映像端末４からの端末情報１６と、コンテンツ視聴ステイタス情報１５とを入力して、コンテンツ視聴券１２に対応した映像データ６を、再生すべき時刻から、端末情報１６に記述された映像端末４で再生可能な映像データフォーマット、例えばＭＰＥＧ−４やＭＰＥＧ−２、Ｍｏｔｉｏｎ−ＪＰＥＧ、ＪＰＥＧなどに変換し、変換結果の適応化映像データ１７として、当該映像端末４へ配信する。
【００３６】
その際、コンテンツ配信サーバ３内では、映像フォーマット判定部１８が端末情報１６の入力から映像端末４で再生可能なフォーマットを判定し、必要に応じて映像フォーマット変換部２０にフォーマット情報１９を送信し、コンテンツ視聴ステイタス情報１５に含まれる映像データのロケータ情報と再生開始時刻とを映像データサーバ２１に送信する。
【００３７】
映像データサーバ２１は、映像データのロケータ情報で指示される場所に格納される映像データ６を映像フォーマット変換部２０に送信する。
【００３８】
映像フォーマット変換部２０は、映像データサーバ２１からの映像データ６を、映像フォーマット判定部１８からのフォーマット情報１９で指定される映像データフォーマットに変換したものを、適応化映像データ１７として映像端末に送信する。
【００３９】
従って、本実施の形態１によれば、ユーザの好みを示すユーザプロファイル情報１３と、メタデータ５の要約情報であるアダプテーションヒント７とにより、ユーザの好みに合うメタデータ５のみを選別し統合して各ユーザお勧めのコンテンツメニューであるＲＣＭ９を作成するようにしたので、映像コンテンツのメタデータ５のようにメタデータ５の量が莫大であっても、効率良くユーザの好みにあうメタデータ５を抽出して、ＲＣＭ９を作成することができる。
【００４０】
また、本実施の形態１によれば、コンテンツ視聴券１２を保持した映像端末１２のみがコンテンツを視聴できるようにしたので、インターネット上などの分散環境に蓄積される映像コンテンツを、いつでもどこでも、動的に変化する視聴条件、例えば端末種別や、場所、時間、視聴形態に関する嗜好などに適応させて視聴可能とする、視聴する権利をユーザが購入可能となるシステムを構築することが可能である。
【００４１】
この結果、本実施の形態１によれば、コンテンツアクセスや、メディア伝送の処理に関し、ユーザがシステム仕様を意識することなく、システムが、利用ユーザの個々のコンテンツアクセス環境に適応することが可能となる。具体的には、個々のメタデータを自動生成したり、メタデータに基づいた検索などの処理を行ったりする際に、コンテンツ購入処理サーバ２等のサーバ側で、メタデータ自体の要約情報であるアダプテーションヒント７を自動的に生成し、該要約情報あるアダプテーションヒント７を利用することにより、以降のメタデータの処理、例えばメタデータに基づく所望のコンテンツの検索や、所望のコンテンツに類似したコンテンツの抽出、所望の映像シーンのピックアップと、その結果の提供などの処理実行ステップを低減させることができる。また、映像端末４に、映像のコンテンツ視聴券１２を与えて、映像端末４が端末情報１６やコンテンツ視聴ステイタス情報１５を介し自律的に映像コンテンツ配信サーバ３とネゴシエーションを行うようにしたので、ユーザのコンテンツ視聴環境に依存しない、すなわちどのようなスペックの映像端末４を変えたり、あるいは時間的に間を空けたしても、コンテンツ視聴券１２があって端末情報１６やコンテンツ視聴ステイタス情報１５をコンテンツ配信サーバ３に送信することができる限り、どこでも、何時でも継続的なコンテンツの配信を実現することが可能となる。
【００４２】
なお、本実施の形態１においては、携帯電話１と映像端末４とを別々の端末として説明したが、本発明では、携帯電話１と映像端末４とを同一の端末として構成しても良いし、また携帯電話１とした部分は、持ち運び可能なＰＣやＰＤＡに置き換えても良い。その場合、コンテンツ購入処理サーバ２と携帯電話１の間の通信、すなわちＲＣＭ９や購入要求１０、コンテンツ視聴券１２のやり取りは、携帯電話網ではなく、有線、無線のＩＰ網を通じておこなっても良い。また、携帯電話１と映像端末４間の、コンテンツ視聴券１２の送受信は、赤外線通信を想定しているが、Ｂｌｕｅｔｏｏｔｈ（登録商標）等の無線通信や、あるいはＳＤカード（登録商標）などのカードメディア、家庭用ＬＡＮなどのＩＰ網を通じて行っても良い。
【００４３】
また、コンテンツ配信サーバ３と映像端末４間の端末情報１６やコンテンツ視聴ステイタス情報１５、適応化映像データ１７の伝送は、本実施の形態１では一般のＩＰ網を通じて行うことを想定しているが、ＡＤＳＬやＣＡＴＶケーブル、光ファイバなど一般に利用可能な加入者アクセス網も勿論利用可能である。
【００４４】
また、コンテンツ購入サーバ２とコンテンツ配信サーバ３とを別々のサーバとして説明したが、コンテンツ購入サーバ２とコンテンツ配信サーバ３とを同一のサーバにより構成しても勿論よい。
【００４５】
さらに、ＲＣＭ作成部８や購入処理部１１、映像フォーマット判定部１８等のコンテンツ購入サーバ２やコンテンツ配信サーバ３における処理は、ハードウエアにより構成するだけでなくなく、プログラムの実行によりソフトウエア的に処理するように構成するようにしても勿論良い。このことは、携帯電話１および映像端末４でも同様で、それらの機能をハードウエアにより構成するだけでなくなく、プログラムの実行によりソフトウエア的に処理するように構成するようにしても勿論良い。
【００４６】
実施の形態２．（監視システムモデル）
実施の形態２では、本発明を利用した映像コンテンツ管理運用システムの事例として、映像監視システムを例に挙げてその構成について述べる。
【００４７】
図４に、本実施の形態における、映像監視システムのシステム構成を示す。
実施の形態２の映像監視システムは、図４に示すように、監視映像イベント検出サーバ３９と、監視映像集配信サーバ４９とから構成されており、これらと監視端末４０とが通信する。、監視映像イベント検出サーバ３９は、特徴量抽出処理部３４、メタデータサーバ３７と、アダプテーションヒントサーバ３８、ＲＣＭ作成部４３、および再生イベント決定部４５を有している。監視映像集配深部４９は、映像データサーバ３０、映像フォーマット判定部４７、および映像フォーマット変換部５１を有している。
以下、図４に示すシステムの、一連の動作を説明する。
【００４８】
▲１▼．監視映像蓄積・イベント抽出
映像データサーバ３０は、例えば、監視サービスセンター（図示せず）が運用するものであり、監視カメラ３１から伝送される監視映像を映像データ３２として蓄積する。勿論、監視カメラ３１は、１台でも複数台配置されていてもよい。映像データ３２に関連付けられて撮影条件情報３３も監視カメラ３１より送信されて蓄積される。撮影条件情報３３は、映像データ３２が撮影された日時・時刻の情報を含み、監視カメラ３１が複数台ある場合は、どのカメラかを識別する情報も含む。映像データサーバ３０は、映像データ３２、撮影条件情報３３を、監視映像イベント検出サーバ３９の特徴量抽出処理部３４に送出する。
【００４９】
監視映像イベント検出サーバ３９の特徴量抽出処理部３４は、映像データ３２、撮影条件情報３３を入力して、メタデータ３５、アダプテーションヒント３６を作成し、夫々をメタデータサーバ３７、アダプテーションヒントサーバ３８に蓄積する。メタデータ３５は、撮影条件情報３３、映像のシーン構造情報、映像データ３２中に発生するイベント、例えば映像特徴や音の抑揚に応じた監視映像内の他の映像シーンに対する特異性を、色、動き、音などのメディア特徴量で記述したデータの情報を含んでいる。また、アダプテーションヒント３６は、実施の形態１でも説明した通り、メタデータ３５に関連付けられていて、個々のメタデータ３５に実際にどのような内容、特に本実施の形態２ではイベントの内容が含まれているかを記述したメタデータ３５の要約情報である。
【００５０】
なお、映像データ３２を各カメラ３１のローカルな蓄積可能領域、例えばハードディスクなどに蓄積し、特徴抽出処理部３４の機能を当該カメラ３２の内部に含むように構成してもよい。この場合は、上記のイベント情報の検出に必要な特徴抽出処理を各カメラ３１で実施し、結果としてメタデータ３５およびアダプテーションヒント３６を出力するように構成されることになる。このような構成によれば、各カメラ３１内部の実装は複雑になるが、特徴抽出処理を行う目的での映像データ３２の伝送の手間を削減することができる。
【００５１】
▲２▼．監視映像イベント検出
監視映像イベント検出サーバ３９は、例えば監視サービスセンター（図示せず）が運用するものであり、監視端末４０が監視指定情報４１を送信すると、監視端末４０からその監視指定情報４１を受け取り、監視者ごとにカスタマイズされたＲＣＭ４２を監視端末４０に送出する。
【００５２】
監視指定情報４１は、ＲＣＭ４２を生成するためのトリガとなるクエリ情報であり、確認を行いたい監視カメラ３１を特定したり、確認したい記録時刻を指定する情報である。なお、監視指定情報４１は、監視者が明示的に情報を入力して送出するだけでなく、システムが監視端末４０の状態を自動的に検知して設定するように構成してもよい。例えば、イベントが発生した現場にガードマン等が携帯可能な監視端末４０をもって向かうケースを想定した場合には、システムが監視端末４０からのアクセス時刻を検知するとともに、ＧＰＳなどの位置計測システムによって特定される監視端末４０の現在位置に基づいてもっとも近くにある監視カメラ３１を自動的に特定するように構成してもよい。このように構成すれば、緊急時にガードマンが煩雑な端末操作を行うことなく、ガードマンが確認すべき現場のカメラで撮影されたイベントをＲＣＭ４２の情報として迅速に確認することが可能となる。
【００５３】
本実施の形態２で述べるような監視映像システムにおいては、日々、複数の監視カメラ３１で撮影された映像が映像データ３２として映像データサーバ３０に格納される。したがって、これらの監視映像アーカイブを効率よく管理・運用するため、特徴量抽出処理部３４においてメタデータ３５を生成し、個々の監視映像に関連付けておく。
【００５４】
ＲＣＭ作成部４３は、このように一つ一つが監視映像データに関連付けられた複数のメタデータ３５を入力として、アダプテーションヒント３６と、監視指定情報４１を用いてフィルタリングし、ＲＣＭ４２を作成、送出する。その際、ＲＣＭ作成部４３は、アダプテーションヒント３６に含まれるメタデータ３５の要約情報を用いて、不要なメタデータ３５の検索を防ぎ、検索を効率化できる。事例としては、例えば、アダプテーションヒント３６が記述すべき要約の項目として、監視映像中のイベントシーンを記述するための特徴量、例えば映像の色特徴や、動き特徴、音などが考えられる。具体的には、服装の色や、人物の動きの速さ、音の大きさ等の特徴量が考えられ、これらにより、例えば店舗内の不審者等を特定することが可能になる。類似イベントを他のカメラの映像も含めて確認を行いたい場合、類似イベントが色特徴によって特徴付けられるのであれば、アダプテーションヒント３６として映像中の色特徴記述子に関する要約情報を記述しておけばよい。
【００５５】
監視者は、監視端末４０によりＲＣＭ作成部４３からＲＣＭ４２を受け取ると、ＲＣＭ４２の情報として提供されるイベント発生区間の代表画像のサムネイルから再生したい画像を選択した再生映像指定情報の送信や、類似イベントとして検索したいイベントを表すサムネイルから検索したい画像を選択した検索キーの送信である再生映像指定情報・検索キー４４を再生イベント決定部４５に送信する。
【００５６】
再生イベント決定部４５は、再生映像指示情報・検索キー４４が再生映像指示情報である場合は、映像再生要求４６を映像フォーマット判定部４７に送信する。映像再生要求４６は、少なくとも、映像データ３２のロケータ情報と、その再生すべき時刻、例えば再生開始時刻や再生時間を含む。また、再生映像指示情報・検索キー４４が検索キーである場合、再生イベント決定部４５は、指定された監視カメラ３１の種類や時間区間などに含まれるすべてのメタデータ３５から、選択されたイベントに類似したイベントを含むメタデータ３５を検索する。この際、再生イベント決定部４５は、ＲＣＭ生成部４３と同様、アダプテーションヒント３６に含まれるメタデータ３５の要約情報を用いて、不要なメタデータ３５の検索を防ぎ、検索を効率化することができる。
【００５７】
▲３▼．映像データ配信要求・再生
監視端末４０は、映像データ配信要求をする場合、まず、端末情報４８を監視映像集配信サーバ４９に送出する。端末情報４８は、少なくとも、監視端末４０の表示解像度や、受信可能な画像フォーマット、例えばＭＰＥＧ−４、ＭＰＥＧ−２、Ｍｏｔｉｏｎ−ＪＰＥＧ、ＪＰＥＧ、Ｗｉｎｄｏｗｓ（登録商標）ＭｅｄｉａＶｉｄｅｏ，ＤｉｖＸなどに関する情報を含む。監視端末４０は、監視映像集配信サーバ４９から送出される適応化映像データ５０を受信し、映像を再生する。適応化映像データ５０については後述する。監視端末４０としては、監視センタ内のＰＣでもよいし、ガードマンが携帯するＰＤＡ、ＰＣ、携帯電話などでもよい。
【００５８】
▲４▼．監視映像集配信サーバ
監視映像集配信サーバ４９は、例えば監視サービスセンター（図示せず）が運用するものであり、監視端末４０からの端末情報４８、再生イベント決定部４５からの映像再生要求４６の入力に基づき、指定再生開始時刻からの要求映像データ３２を、端末情報４８に記述された監視端末４０で再生可能な映像フォーマット、例えばＭＰＥＧ−４、ＭＰＥＧ−２、Ｍｏｔｉｏｎ−ＪＰＥＧ、ＪＰＥＧ、Ｗｉｎｄｏｗｓ（登録商標）ＭｅｄｉａＶｉｄｅｏ，ＤｉｖＸなどに変換して、適応化映像データ５０を配信する。
【００５９】
映像フォーマット判定部４７は、監視端末４０から入力した端末情報４８に基づいて監視端末４０で再生可能なフォーマットを判定し、映像フォーマット変換部５１に対しフォーマット情報５２を送信する一方、映像データサーバ３０に対し映像再生要求４６に対応した映像データ３２のロケータ情報と、その再生開始時刻・再生時間等を指定した映像再生要求４６を送る。
【００６０】
映像データサーバ３０は、映像フォーマット判定部４７からの映像再生要求４６に対応した映像データ３２を映像フォーマット変換部５１に送信する。映像フォーマット変換部５１は、映像データサーバ３０からの入力映像データ３２のフォーマットを映像フォーマット判定部４７からのフォーマット情報５２に基づいて変換し、フォーマットに変換したものを適応化映像データ５０として監視端末４０に送信する。監視端末４０は、監視映像集配信サーバ４９から送信されてきた適応化映像データ５０を受信して再生する。
【００６１】
従って、本実施の形態２の映像監視システムによれば、監視カメラ３１から撮像した大量の映像データに対して、撮像時の条件情報とともに映像中で異常検知された箇所をイベントとしてメタデータ３５で統一的に管理することができ、所望の時刻や類似イベントを含む映像シーンに対して効率的にアクセスを行うことが可能となる。
【００６２】
特に、メタデータ３５は、すべてのイベントを記述しているので、映像再生機能を工夫することによって、イベント映像シーンのみを集約して高速再生したり、スキップ再生したりといった特殊再生機能も容易に実現できる。
【００６３】
また、映像再生端末の情報に基づいて伝送時の映像データフォーマットを自動的に変換することにより、監視センタ内の映像端末、現場に向かうガードマンが携帯する携帯端末、例えば映像再生機能付携帯電話やＰＤＡなどの端末種別にかかわらず、いつでもどこでも所望の映像シーンへの迅速なアクセスが可能となる。
【００６４】
なお、本実施の形態２では、監視映像イベント検出サーバ３９と、監視映像集配信サーバ４９とを別々のサーバで構成したが、勿論同一サーバで構成しても良い。また、監視端末４０は、映像の再生まで行うように説明したが、実施の形態１のように携帯電話１と映像端末４との別々の端末で構成しても勿論かまわない。
【００６５】
また、実施の形態１と同様に、ＲＣＭ作成部４３や特徴量抽出処理部３４、再生イベント決定部４５、映像フォーマット判定部４７等のコンテンツ購入サーバ２やコンテンツ配信サーバ３における処理は、ハードウエアにより構成するだけでなくなく、プログラムの実行によりソフトウエア的に処理するように構成するようにしても勿論良い。このことは、監視端末４０でも同様で、監視端末４０の機能をハードウエアにより構成するだけでなくなく、プログラムの実行によりソフトウエア的に処理するように構成するようにしても勿論良い。
【００６６】
【発明の効果】
以上説明したように、本発明によれば、映像コンテンツのメタデータ中にどのような情報が含まれているかを記述したメタデータ要約情報に基づいて、上記メタデータを選別するようにしので、映像コンテンツのメタデータのようにメタデータの量が莫大であっても、効率良く目的とするメタデータを抽出することができる。
【図面の簡単な説明】
【図１】実施の形態１のネット配信型コンテンツ提供サービスを実現する全体システム構成を示す図である。
【図２】実施の形態１で用いるメタデータの一実現例を示す図である。
【図３】実施の形態１で用いるコンテンツ購入券のデータフォーマットの一例を示す図である。
【図４】実施の形態２の映像監視システムの全体システム構成を示す図である。
【符号の説明】
１携帯電話、２コンテンツ購入処理サーバ、３コンテンツ配信サーバ、４映像端末、５メタデータ、６映像データ、７アダプテーションヒント、８ＲＣＭ作成部、９ＲＣＭ、１０購入要求、１１購入処理部、１２コンテンツ視聴券、１５コンテンツ視聴ステイタス情報、１６端末情報、１７適応化映像データ、１８映像フォーマット判定部、２０映像フォーマット変換部、２１映像データサーバ。
[0001]
TECHNICAL FIELD OF THE INVENTION
The present invention relates to a metadata selection processing method, a metadata selection and integration processing method, a metadata selection and integration processing program, a video reproduction method, a content purchase processing method, a content purchase processing server, and a content distribution server, all of which can be used in a system that efficiently transmits, stores, manages, and operates video content managed in association with metadata. Specific applications include providing a network distribution type content service and building a video-based monitoring system.
[0002]
[Prior art]
In recent years, as compression technologies for digital video content such as MPEG-2 and MPEG-4 have spread, devices that play back and record digital video content, such as DVD equipment, hard disk recorders, and mobile phones with a video playback function, have become explosively popular. In addition, with the growth in Internet bandwidth and the build-out of broadband mobile networks, systems that transmit, store, manage, and operate digital video content online are becoming increasingly important. On the other hand, digital video content is still bulky compared with document information when exchanged directly online, and its semantic content changes over time, so attaching metadata that allows the content to be checked accurately and quickly from a remote location is of great value. To make access to such video content more efficient, international and industry standard video metadata formats such as MPEG-7 (ISO/IEC 15938) and TV Anytime (http://www.tv-anytime.org/) have been defined. With such standardized metadata, a system that manages and operates video content online can efficiently handle external access (search, browsing, filtering) without accessing the video content itself over the network.
[0003]
A conventional technique that discloses such metadata is, for example, the moving image management device disclosed in Japanese Patent Application Laid-Open No. 2001-028722. This device divides video data into a plurality of scenes, creates for each scene metadata consisting of the section information needed to reproduce the scene, a scene number, and an image representing the scene, and assigns each index a title suited to the search purpose. The user searches the metadata by title and, using the order of the scene numbers in the index and the section information, can join and reproduce only the necessary scenes in the video data.
[0004]
[Patent Document 1]
JP 2001-028722 A
[0005]
[Problems to be solved by the invention]
However, when the total number of video contents grows explosively, the processing load on the database that manages the metadata of the video contents and on the metadata search engine also increases, which ultimately reduces the efficiency of content access.
[0006]
In particular, since video content is content that changes in time series, there is a request from the subject of access (such as a user) to quickly access a specific part in the time series. For this purpose, there is a need to divide the time series into predetermined scene units and to assign and manage metadata. Therefore, the data amount of the metadata itself may increase in proportion to the length of the video. When trying to find the video content desired by the accessing subject based on the metadata, there is a problem that a desired video scene cannot be extracted unless the content of the metadata is checked until the end of the time series.
[0007]
Furthermore, in conventional digital video content management and operation systems, the video distribution side restricts the video playback terminals that can be used for viewing, so to view content the user had to prepare a terminal that matched the system specifications.
[0008]
The present invention has been made to solve the above problems. Its object is to provide a metadata selection processing method, a metadata selection and integration processing method, a metadata selection and integration processing program, and a video reproduction method that can reduce the metadata processing load, and to provide a content purchase processing method, a content purchase processing server, and a content distribution server that allow video data to be transmitted through adaptation on the system side, without the user having to be aware of the type and capability of the video playback terminal.
[0009]
[Means for Solving the Problems]
In order to solve the above-mentioned problem, the present invention is based on selecting metadata based on metadata summary information that describes what information is included in metadata of video content.
[0010]
BEST MODE FOR CARRYING OUT THE INVENTION
Embodiment 1 (network distribution type content providing service model)
In the first embodiment, as an example of a video content management and operation system using the present invention, the configuration of a network distribution type content providing service will be described as an example.
[0011]
FIG. 1 shows the overall system configuration of the network distribution type content providing service according to the first embodiment. In FIG. 1, a mobile phone 1 and a content purchase processing server 2 are connected via an IP network built on a mobile phone line. A content distribution server 3 and a video terminal 4 are connected via a public IP network. The mobile phone 1 and the video terminal 4 are connected via, for example, infrared communication, but they may of course be a single device, such as a mobile phone with a video playback function. In all the embodiments below, video data includes not only video-only data but also audiovisual data that contains accompanying audio data.
[0012]
The metadata 5 is associated with the video data 6 and describes at least the scene structure of the video data 6, such as scene titles and the start point and length of each scene, and feature values of the media data, such as the video and audio, of each scene.
[0013]
FIG. 2 shows a configuration example of the metadata 5. As shown in FIG. 2, the metadata 5 describes, for example, attributes of the entire content, such as the title, genre, and encoding method of the video content, and, for each scene, metadata such as a start time, a scene duration, keywords, and feature values. Some scenes contain sub-scenes beneath them, and each sub-scene likewise carries metadata such as a start time, a scene duration, keywords, and feature values. The metadata 5 may be created in advance for the video data 6 by the content provider or by the provider of the service of the first embodiment, or it may be created dynamically while the service is being provided.
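The layout described for the metadata 5 (content-level attributes plus a scene hierarchy with optional sub-scenes) can be pictured with a small data structure. The following is only a sketch under stated assumptions: the class names, field names, and sample values are hypothetical and are not taken from FIG. 2.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Scene:
    """One scene or sub-scene entry (field names are illustrative)."""
    start_time: float                      # seconds from the start of the content
    duration: float                        # scene length in seconds
    keywords: List[str] = field(default_factory=list)
    features: dict = field(default_factory=dict)      # e.g. color, motion, audio
    sub_scenes: List["Scene"] = field(default_factory=list)


@dataclass
class Metadata:
    """Attributes of the whole content plus its scene hierarchy."""
    title: str
    genre: str
    encoding: str
    scenes: List[Scene] = field(default_factory=list)


def iter_scenes(scenes, depth=0):
    """Yield (depth, scene) for every scene and sub-scene, depth-first."""
    for scene in scenes:
        yield depth, scene
        yield from iter_scenes(scene.sub_scenes, depth + 1)


# Hypothetical sample content with one sub-scene.
example = Metadata(
    title="Evening match",
    genre="sports",
    encoding="MPEG-4",
    scenes=[
        Scene(0.0, 600.0, ["kickoff"], sub_scenes=[Scene(120.0, 30.0, ["goal"])]),
        Scene(600.0, 300.0, ["interview"]),
    ],
)

flattened = [(depth, s.keywords[0]) for depth, s in iter_scenes(example.scenes)]
print(flattened)
```

Walking the hierarchy this way also shows why long content is costly to search: every scene and sub-scene has to be visited to learn what the metadata contains.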
[0014]
As an example of dynamically created metadata 5, consider a case in which live video content is provided and a service subscriber cannot watch it online immediately. The service provider caches a certain amount of the video data 6 and matches it against templates of video or audio features specific to the live video that are prepared in advance. For live sports video, for example, these would be templates of soccer goal scenes, baseball home run scenes, and golf shot scenes. The matching extracts the characteristic video scenes in the video data 6.
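The template matching mentioned here is not specified further in the text. As one possible illustration, the sketch below compares a feature vector for each cached segment with named templates using cosine similarity and labels the segment when the score passes a threshold. The similarity measure, the threshold, the three-element feature vectors, and all names are assumptions made for this example.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def detect_scenes(segments, templates, threshold=0.9):
    """Label each (start_time, features) segment with its best template, if any."""
    detected = []
    for start_time, features in segments:
        best_label, best_score = None, threshold
        for label, template in templates.items():
            score = cosine(features, template)
            if score >= best_score:
                best_label, best_score = label, score
        if best_label is not None:
            detected.append((start_time, best_label))
    return detected


# Hypothetical templates and cached segments.
templates = {"goal": [1.0, 0.0, 0.0], "home_run": [0.0, 1.0, 0.0]}
segments = [
    (0.0, [0.1, 0.1, 1.0]),    # resembles neither template
    (42.0, [0.9, 0.1, 0.0]),   # close to the "goal" template
    (90.0, [0.0, 1.0, 0.1]),   # close to the "home_run" template
]

detected = detect_scenes(segments, templates)
print(detected)
```

Each detected pair could then become a scene entry, with a start time and a keyword, in dynamically generated metadata.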
[0015]
As an example of the metadata 5 added in advance, a keyword-like data element such as "scenes in which actor XX appears" can be considered.
[0016]
The adaptation hint 7 is associated with the metadata 5 and is summary information that describes what data each piece of metadata 5 actually contains. Like the metadata 5, the adaptation hint 7 may be created in advance by the content provider or by the provider of the service of the first embodiment, or it may be created dynamically while the service is running. In FIG. 1, the metadata 5 and the adaptation hint 7 are shown as being stored as part of the content purchase processing server 2, but they may be placed on any server on the Internet.
[0017]
Hereinafter, a series of operation flows of the system of FIG. 1 will be described.
(1) Content purchase processing
The content purchase processing server 2 is operated by, for example, a network distribution type content service provider, and sends a recommended content menu 9 (Recommended Contents Menu, hereinafter abbreviated as RCM), customized for each subscribing user, to the mobile phone 1. The RCM 9 is generated by the RCM creation unit 8, whose detailed operation is described later.
[0018]
The RCM 9 is a list of the contents that the system recommends to the user. It includes, for example, a thumbnail of a representative image of each content, content identification information (such as a URL) embedded in the thumbnail as a link, the title of the content, and summary description information. The RCM 9 can take any data representation format. For example, it may be HTML data when the mobile phone has a web browser, SMIL data when the mobile phone has a SMIL browser, or a Java (registered trademark) program when the mobile phone has a Java (registered trademark) execution environment.
[0019]
The user displays the RCM 9 sent from the content purchase processing server 2 of the service provider or the like, selects the content he or she wants to view, and sends a purchase request 10. As a user interface, for example, a representative image (thumbnail) of the content is shown on the display as part of the information making up the RCM 9, and the user selects it and presses a purchase button to send the purchase request 10.
[0020]
Because the purchase request 10 indicates what the user has selected, information such as the title, genre, and keywords is extracted from the metadata associated with the selected content, accumulated and updated on the service provider side as the profile information 13 of each subscribing user, and reflected in subsequent deliveries of the RCM 9. With this mechanism, the RCM creation unit 8 can use the updated user profile information 13 to always send an RCM 9 that reflects the user's latest preferences. In other words, the user profile information 13 is used as the query information that triggers generation of the RCM 9.
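The profile update described above can be sketched as follows. This is a minimal illustration that assumes the profile is kept as simple counters. The dictionary layout, the function names, and the sample purchases are hypothetical, and the text does not say how the profile information 13 is actually stored.

```python
from collections import Counter


def update_profile(profile, purchased_metadata):
    """Fold the title, genre, and keywords of a purchased content into the profile."""
    profile["titles"].append(purchased_metadata["title"])
    profile["genres"][purchased_metadata["genre"]] += 1
    profile["keywords"].update(purchased_metadata["keywords"])
    return profile


def top_keywords(profile, n=2):
    """Return the n most frequent keywords, usable as the query for the next menu."""
    return [keyword for keyword, _ in profile["keywords"].most_common(n)]


# Hypothetical purchase history for one subscriber.
profile = {"titles": [], "genres": Counter(), "keywords": Counter()}
update_profile(profile, {"title": "Evening match", "genre": "sports",
                         "keywords": ["soccer", "goal"]})
update_profile(profile, {"title": "Cup final", "genre": "sports",
                         "keywords": ["soccer", "interview"]})

print(top_keywords(profile, 1))
```

Counting occurrences is only one way to rank preferences. A real service might weight recent purchases more heavily.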
[0021]
In its internal functional block, the purchase processing unit 11, the content purchase processing server 2 processes the purchase request 10 under appropriate user authentication and sends a content viewing ticket 12, which is rights information for viewing the content selected by the user, to that user's mobile phone 1. The telephone number of the mobile phone, for example, could be used as the authentication information.
[0022]
FIG. 3 shows an example of the content viewing ticket 12. As shown in FIG. 3, the content viewing ticket 12 is digital data that includes, for example, any of the following: the title (Title) of the content purchased by the user; summary information (Abstract); the purchase date and time (Purchase Time / Date); locator information, such as a URL, for the video data 6 that is the substance of the content; the total playback time (Total Time) of the content; information indicating the time point from which the video data 6 should be played back (Start Time); and content viewing expiration date information (Expire Time / Date) described in a data format that the user cannot modify or manipulate. Alternatively, the expiration date information (Expire Time / Date) may be described in a data format that the user can modify and manipulate, or it may be omitted from the content viewing ticket 12 and managed only on the server side, for example by the purchase processing unit 11, so that the user cannot modify or manipulate it. In addition to the expiration date information (Expire Time / Date), it is desirable that the information that is never updated, that is, everything other than the Start Time, also be described in a data format that the user cannot modify or manipulate. The content viewing ticket 12 may of course also describe an encryption key or the like for obtaining the locator information, so that the locator information of the video data 6 cannot be obtained without the content viewing ticket 12. The mobile phone 1 stores the content viewing ticket 12 in its own terminal for as long as the expiration date indicated by the Expire Time / Date has not passed, and deletes from its own terminal any content viewing ticket 12 whose expiration date has passed. When the ticket is transmitted from the content purchase processing server 2, the Start Time is the time at which the video data 6 begins. It is also possible to limit the ticket to a one-time viewing permission by controlling the playback start time so that the user cannot set it back to an earlier time and it is only ever incremented automatically. Of course, the mobile phone 1 of the present embodiment is not limited to such one-time viewing permission; the viewing period is limited by the content viewing expiration date information (Expire Time / Date).
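As an illustration only, the ticket fields described above, the expiry check, and a playback start time that never moves backward can be sketched in Python as follows. The class name `ContentViewingTicket`, the field names, the use of seconds for playback times, and the helper `purge_expired` are assumptions made for this sketch; the specification does not prescribe a concrete data format.

```python
from dataclasses import dataclass, replace
from datetime import datetime


@dataclass(frozen=True)  # frozen: fields cannot be changed in place
class ContentViewingTicket:
    title: str
    abstract: str
    purchase_time: datetime
    locator: str          # e.g. URL of the video data
    total_time: float     # total playback time, in seconds (assumed unit)
    start_time: float     # point to resume from, in seconds (assumed unit)
    expire_time: datetime

    def is_valid(self, now: datetime) -> bool:
        """The ticket is usable only before its expiration date."""
        return now < self.expire_time

    def advance(self, stopped_at: float) -> "ContentViewingTicket":
        """Return a new ticket whose start time only ever moves forward."""
        new_start = min(max(self.start_time, stopped_at), self.total_time)
        return replace(self, start_time=new_start)


def purge_expired(tickets, now):
    """Keep only the tickets whose expiration date has not passed."""
    return [t for t in tickets if t.is_valid(now)]
```

Because `advance` ignores any stop time earlier than the current start time, repeated use models the one-time viewing restriction; a system relying only on the expiration date would simply allow the start time to be reset.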
[0023]
The RCM creation unit 8 takes as input, for example, the huge amount of metadata 5 that is stored on a network or on a local large-capacity disk in association with the video content offered by the service. Using the adaptation hint 7 assigned to each piece of input metadata together with the user profile 13, it filters out the target portions of the metadata, integrates them to create an RCM 9 as a recommended content menu, and transmits it. Specifically, a target portion of the metadata is, for example, the metadata of a video scene that matches a keyword registered as a user preference in the user profile 13.
[0024]
Here, the adaptation hint 7 focuses on a specific data type of the corresponding metadata 5 and is summary information indicating how much data of that type is contained in the actual data (hereinafter referred to as an instance) of the metadata 5. As an example of a data type of interest, consider the "PointOfView descriptor", which is defined by MPEG-7 and describes the importance of a video scene when viewed from a specific viewpoint. The PointOfView descriptor is represented by the following data syntax.
[0025]

[0026]
The viewpoint is the data describing the "viewpoint" mentioned above; it can be thought of as characteristic keyword information for the video scene when viewed from a certain aspect.
[0027]
Importance is defined in the range of 0 to 1 and indicates how important a video scene is with respect to the given viewpoint. The data example above describes a video scene whose importance, when viewed from the viewpoint of "sports", is 0.5; that is, taking 100% as the most important, the scene has a moderate importance of about 50%. The RCM creation unit 8 can extract and identify the metadata of the content desired by the user by matching the viewpoint content of the MPEG-7 PointOfView descriptor, used as the adaptation hint 7, against the user profile information 13.
[0028]
From the standpoint of MPEG-7 conformance, even if the schema (syntax definition) specifies that a descriptor defined by MPEG-7 may be included in an instance of the metadata 5, there is no way to know whether it actually appears in a given instance other than by examining that instance. Consequently, for data with a long time series such as video content, the metadata 5 alone cannot tell in advance whether its instance contains descriptors, such as <PointOfView>, whose presence differs from one video scene to the next, or whether the video content has a hierarchical structure of subscenes. To check whether a scene with <PointOfView> exists, all of the metadata would have to be analyzed. The adaptation hint 7, as summary information of the metadata, is used to reduce this analysis load. The adaptation hint 7 describes summary information about an instance of the metadata, for example how many <PointOfView> descriptors the metadata 5 to be analyzed contains and at which layer they appear. As a simple example, if the adaptation hint 7 is checked before the multiple pieces of metadata 5 are examined, and it thereby becomes known whether each piece of metadata contains at least one <PointOfView>, then only the metadata containing <PointOfView> needs to have its instance description analyzed. This significantly reduces the amount of matching between the user profile information 13 and the metadata 5 needed to generate the RCM 9.
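The saving described in this paragraph can be sketched as follows: the hint is consulted first, and only metadata whose hint reports at least one PointOfView is parsed and matched against the user profile. This is a minimal sketch under stated assumptions. The XML layout (`Metadata`, `Scene`, and a `PointOfView` element with `viewpoint` and `importance` attributes) is a simplified stand-in and not the MPEG-7 schema, and `AdaptationHint`, `build_rcm`, and the tuple-based menu entries are hypothetical names introduced for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class AdaptationHint:
    metadata_id: str
    descriptor_counts: dict  # descriptor name -> number of occurrences


def build_rcm(hints, metadata_store, profile_keywords):
    """Collect scenes whose viewpoint matches a profile keyword.

    Returns (entries, parsed), where parsed counts how many metadata
    instances actually had to be analyzed.
    """
    entries = []
    parsed = 0
    for hint in hints:
        # Cheap check on the summary: skip instances without PointOfView.
        if hint.descriptor_counts.get("PointOfView", 0) == 0:
            continue
        parsed += 1
        root = ET.fromstring(metadata_store[hint.metadata_id])
        for scene in root.iter("Scene"):
            # Only descriptors attached directly to this scene.
            for pov in scene.findall("PointOfView"):
                if pov.get("viewpoint") in profile_keywords:
                    entries.append((hint.metadata_id,
                                    scene.get("id"),
                                    float(pov.get("importance"))))
    return entries, parsed
```

With two metadata instances of which only one contains a PointOfView, `parsed` is 1: the second instance is never handed to the XML parser.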
[0029]
(2). Video data distribution request / reproduction
With the above procedure, the user can acquire a content viewing ticket 12 from the content purchase processing server 2 on his or her mobile phone 1, hold it until its expiration date, and carry it anywhere. When the content viewing ticket 12 carried by the user is transmitted, via infrared communication or a recording medium, to a video terminal 4 that can interpret it, the video terminal 4 requests from the content distribution server 3 the video content purchased by the user as video data 6 suited to its own video reproduction capability. Of course, if the mobile phone 1 and the video terminal 4 are the same device, no such transmission of the content viewing ticket is needed. To this end, the video terminal 4 has a function unit that interprets the information in the content viewing ticket 12 according to a standardized procedure, requests the appropriate video data 6 in accordance with the rights held by the user, and reproduces it. By deploying video terminals 4 with this function and by keeping the content viewing ticket 12 separable from the mobile phone 1 and the video terminal 4, the user can, simply by carrying the mobile phone 1, view content on whatever video terminal 4 is available, anytime and anywhere, as if operating a video device at home. In other words, as long as a content viewing ticket 12 properly obtained from the content purchase processing server 2 is held in the user's own mobile phone 1 or the like and is set in the terminal, the video content permitted by the content viewing ticket 12 can be received normally from the content distribution server 3 and used even on a borrowed video terminal 4 belonging to someone else, whose performance differs from that of the user's own video terminal 4.
[0030]
Here, the video terminal 4 sends the content viewing status information 15 and the terminal information 16 to the content distribution server 3 in order to request the video content as video data 6 suited to its own video reproduction capability. The content viewing status information 15 holds the viewing state of the content and includes at least locator information, such as the URL of the content, and playback time information indicating how far the content has been viewed from the start of playback. Because the content viewing status information 15 is based on the locator information and the playback time information, the user can designate a video terminal 4, transmit the content viewing ticket 12, and immediately begin playback from the desired point. When viewing on a certain video terminal 4 is stopped at some point, the time following that point is written back to the content viewing ticket 12 as the playback time information. This has the advantage that, even if the user next views the content on a different video terminal 4, viewing can continue from just after the point where it last stopped.
[0031]
In addition, if the user's memory of the previous viewing on another video terminal 4 is hazy and the user wants to review the story so far before resuming, the viewing memory can be quickly refreshed by starting high-speed playback or digest playback from a point somewhat earlier than the resume time, and then resuming normal playback. An interface is also possible in which, during high-speed or digest playback, representative images are displayed as still images at key points in the story.
[0032]
The terminal information 16 is information specific to the video terminal 4, described according to a standard data definition, and is transmitted from the video terminal 4 to the content distribution server 3 according to a standard procedure. The terminal information 16 includes at least the temporal and spatial display resolution of the video terminal 4 and the video data formats it can receive, for example the encoding method and bit rate.
[0033]
In the content distribution server 3, the video format determination unit 18 determines, according to the terminal information 16, the optimal data format of the video data 6 to be transmitted to the video terminal 4, that is, the data format of the adapted video data 17. Examples of data formats include MPEG-4, MPEG-2, Motion-JPEG, JPEG, Windows (registered trademark) Media Video, and DivX. The video format determination unit 18 judges whether video data in the determined format can be transmitted directly from the content distribution server 3 itself, should be transmitted while the server converts the video data format in real time, or is better transmitted from another server operating in cooperation. The requested video data 6 is then converted to the resolution and video data format best suited to the requesting video terminal 4 and distributed as the adapted video data 17 from the requested playback start time.
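The three-way judgment described here (send as stored, convert in real time, or leave the job to a cooperating server) can be sketched as below. This is only one possible reading: the specification gives no decision rule, so the criteria used here (format membership, resolution, and bit rate limits), the names `TerminalInfo`, `StoredVideo`, `determine_format`, and the `convertible` set of formats this server can produce are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class TerminalInfo:
    width: int
    height: int
    formats: list          # receivable formats, most preferred first
    max_bitrate_kbps: int


@dataclass
class StoredVideo:
    fmt: str
    width: int
    height: int
    bitrate_kbps: int


def fit_resolution(sw, sh, tw, th):
    """Shrink (sw, sh) to fit inside (tw, th), keeping the aspect ratio."""
    if sw <= tw and sh <= th:
        return sw, sh
    if sw * th >= sh * tw:          # width is the binding constraint
        return tw, sh * tw // sw
    return sw * th // sh, th        # height is the binding constraint


def determine_format(terminal, stored, convertible):
    fits = (stored.fmt in terminal.formats
            and stored.width <= terminal.width
            and stored.height <= terminal.height
            and stored.bitrate_kbps <= terminal.max_bitrate_kbps)
    if fits:
        # The stored data can be sent without any conversion.
        return {"action": "direct", "format": stored.fmt,
                "width": stored.width, "height": stored.height,
                "bitrate_kbps": stored.bitrate_kbps}
    # First receivable format that this server is able to produce.
    target = next((f for f in terminal.formats if f in convertible), None)
    if target is None:
        # Nothing this server can produce: hand over to another server.
        return {"action": "delegate"}
    w, h = fit_resolution(stored.width, stored.height,
                          terminal.width, terminal.height)
    return {"action": "convert", "format": target, "width": w, "height": h,
            "bitrate_kbps": min(stored.bitrate_kbps,
                                terminal.max_bitrate_kbps)}
```

The returned dictionary plays the role of the format information passed on to the conversion stage; integer arithmetic is used for the scaled resolution so the result does not depend on floating-point rounding.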
[0034]
In addition, when video playback is interrupted on the video terminal 4, the content viewing ticket 12 as of the interruption time is transmitted to the mobile phone 1, and the mobile phone 1 stores the content viewing ticket 12 updated with the interruption time. The interruption time may be written to the content viewing ticket 12 not only by the mobile phone 1 but also by the video terminal 4 or the like. If the mobile phone 1 then transmits the content viewing ticket 12 to another video terminal 4, playback can be resumed on that terminal from the point of interruption by the same procedure as described above.
[0035]
(3). Video data storage and delivery
The content distribution server 3 receives the terminal information 16 and the content viewing status information 15 from the video terminal 4, converts the video data 6 corresponding to the content viewing ticket 12, from the time at which playback should start, into a video data format described in the terminal information 16 that the video terminal 4 can reproduce (for example MPEG-4, MPEG-2, Motion-JPEG, or JPEG), and delivers the result of the conversion to the video terminal 4 as adapted video data 17.
[0036]
At this time, in the content distribution server 3, the video format determination unit 18 determines from the input terminal information 16 a format that the video terminal 4 can reproduce and, as necessary, sends format information 19 to the video format conversion unit 20. It also sends the locator information and the playback start time of the video data contained in the content viewing status information 15 to the video data server 21.
[0037]
The video data server 21 transmits the video data 6 stored at the location specified by the locator information to the video format conversion unit 20.
[0038]
The video format conversion unit 20 converts the video data 6 from the video data server 21 into the video data format specified by the format information 19 from the video format determination unit 18, and sends it to the video terminal 4 as the adapted video data 17.
[0039]
Therefore, according to the first embodiment, only the metadata 5 that matches the user's preferences is selected and integrated, using the user profile information 13 indicating those preferences and the adaptation hint 7, which is summary information of the metadata 5. Consequently, even when the amount of metadata 5 is enormous, as with the metadata 5 of video content, the metadata 5 matching the user's preferences can be extracted efficiently to create the RCM 9.
[0040]
Further, according to the first embodiment, only a video terminal 4 holding the content viewing ticket 12 can view the content. This makes it possible to build a system in which the user can purchase, anytime and anywhere, the right to view video content stored in a distributed environment such as the Internet, and can view it in a manner adapted to dynamically changing viewing conditions such as terminal type, place, time, and viewing-mode preferences.
[0041]
As a result, according to the first embodiment, the system can adapt content access and media transmission to each user's individual content access environment without the user having to be aware of the system specifications. Specifically, when individual metadata is generated automatically or a search or similar process is performed based on the metadata, the server side, such as the content purchase processing server 2, automatically generates the adaptation hint 7, which is summary information of the metadata itself. Using this adaptation hint 7 reduces the number of processing steps in subsequent metadata processing, for example searching for desired content based on the metadata, extracting content similar to the desired content, picking out a desired video scene, and presenting the results. In addition, because the video terminal 4 is given the content viewing ticket 12 and negotiates autonomously with the content distribution server 3 by means of the terminal information 16 and the content viewing status information 15, continuous content distribution can be achieved anywhere and at any time, whatever the specification of the video terminal 4 and whatever the interval between viewings, as long as the content viewing ticket 12 is available and the terminal information 16 and the content viewing status information 15 can be transmitted to the content distribution server 3.
[0042]
Although the mobile phone 1 and the video terminal 4 are described as separate terminals in the first embodiment, in the present invention they may be configured as the same terminal. The mobile phone 1 may also be replaced with a portable PC or PDA. In that case, the communication between the content purchase processing server 2 and the mobile phone 1, that is, the exchange of the RCM 9, the purchase request 10, and the content viewing ticket 12, may take place over a wired or wireless IP network instead of the mobile phone network. The content viewing ticket 12 is assumed to be exchanged between the mobile phone 1 and the video terminal 4 by infrared communication, but the exchange may instead use wireless communication such as Bluetooth (registered trademark), card media such as an SD card (registered trademark), or an IP network such as a home LAN.
[0043]
In the first embodiment, the terminal information 16, the content viewing status information 15, and the adapted video data 17 are assumed to be exchanged between the content distribution server 3 and the video terminal 4 over a general IP network. Of course, commonly available subscriber access networks such as ADSL, CATV cable, and optical fiber can also be used.
[0044]
Although the content purchase processing server 2 and the content distribution server 3 have been described as separate servers, they may of course be implemented as the same server.
[0045]
Further, the processing in the content purchase processing server 2 and the content distribution server 3, such as that of the RCM creation unit 8, the purchase processing unit 11, and the video format determination unit 18, may of course be implemented not only in hardware but also in software by executing a program. The same applies to the mobile phone 1 and the video terminal 4, whose functions may be implemented not only in hardware but also in software by executing a program.
[0046]
Embodiment 2 (Monitoring system model)
In the second embodiment, the configuration of a video monitoring system will be described as an example of a video content management and operation system using the present invention.
[0047]
FIG. 4 shows a system configuration of the video monitoring system in the present embodiment.
As shown in FIG. 4, the video surveillance system according to the second embodiment includes a surveillance video event detection server 39 and a surveillance video collection and distribution server 49, which communicate with the monitoring terminal 40. The surveillance video event detection server 39 includes a feature extraction processing unit 34, a metadata server 37, an adaptation hint server 38, an RCM creation unit 43, and a playback event determination unit 45. The surveillance video collection and distribution server 49 includes the video data server 30, the video format determination unit 47, and the video format conversion unit 51.
Hereinafter, a series of operations of the system shown in FIG. 4 will be described.
[0048]
(1). Monitoring video storage and event extraction
The video data server 30 is operated by, for example, a monitoring service center (not shown), and accumulates surveillance video transmitted from the monitoring camera 31 as video data 32. Of course, there may be one or more monitoring cameras 31. Shooting condition information 33 is also transmitted from the monitoring camera 31 and stored in association with the video data 32. The shooting condition information 33 includes the date and time at which the video data 32 was shot and, when there are multiple monitoring cameras 31, information identifying the camera. The video data server 30 sends the video data 32 and the shooting condition information 33 to the feature extraction processing unit 34 of the surveillance video event detection server 39.
[0049]
The feature extraction processing unit 34 of the surveillance video event detection server 39 receives the video data 32 and the shooting condition information 33, creates metadata 35 and an adaptation hint 36, and stores them in the metadata server 37 and the adaptation hint server 38, respectively. The metadata 35 includes the shooting condition information 33, scene structure information of the video, and information on events occurring in the video data 32, for example data that describes, in terms of media features such as color, motion, and sound, how a scene stands out from other scenes in the surveillance video because of its visual features or changes in the sound. The adaptation hint 36 is associated with the metadata 35 as described in the first embodiment; it is summary information of the metadata 35 that describes the actual content of each individual piece of metadata 35, in particular, in the second embodiment, whether event content is described in it.
[0050]
Note that the video data 32 may be stored in a local storage area of each camera 31, for example a hard disk, and the function of the feature extraction processing unit 34 may be built into the camera 31. In this case, the feature extraction needed to detect event information is performed by each camera 31, which outputs the metadata 35 and the adaptation hint 36 as a result. This configuration makes the implementation inside each camera 31 more complex, but it avoids having to transmit the video data 32 merely to perform feature extraction.
[0051]
(2). Surveillance video event detection
The surveillance video event detection server 39 is operated by, for example, a monitoring service center (not shown). When the monitoring terminal 40 transmits monitoring designation information 41, the surveillance video event detection server 39 receives it and sends to the monitoring terminal 40 an RCM 42 customized for each such request.
[0052]
The monitoring designation information 41 is query information that triggers generation of the RCM 42; it specifies the monitoring camera 31 to be checked and the recording time to be checked. Besides being entered and transmitted explicitly by the person monitoring, the monitoring designation information 41 may be set automatically by the system from the detected state of the monitoring terminal 40. For example, supposing a security guard takes a portable monitoring terminal 40 to the site where an event has occurred, the system may detect the access time from the monitoring terminal 40 and automatically identify the nearest monitoring camera 31 based on the current position of the monitoring terminal 40 as determined by a positioning system such as GPS. With this configuration, in an emergency the guard can promptly check, as information in the RCM 42, the events captured by the camera at the site in question without complicated terminal operations.
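The automatic setting described above can be sketched as follows, assuming that camera positions are known as latitude and longitude pairs and that the recording time of interest is a fixed window ending at the access time. Both assumptions, the window length `window_s`, and the names `haversine_m` and `build_monitoring_designation` are introduced only for this sketch; the specification states only that the nearest camera and the access time are detected.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points (spherical Earth)."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * r * math.asin(math.sqrt(a))


def build_monitoring_designation(terminal_pos, access_time, cameras,
                                 window_s=600.0):
    """Pick the camera nearest to the terminal and a time window to check.

    terminal_pos: (lat, lon) of the monitoring terminal
    access_time:  access time in seconds (any consistent epoch)
    cameras:      dict mapping camera id -> (lat, lon)
    """
    camera_id = min(
        cameras,
        key=lambda cid: haversine_m(*terminal_pos, *cameras[cid]),
    )
    return {"camera_id": camera_id,
            "from": access_time - window_s,
            "to": access_time}
```

The resulting dictionary stands in for the monitoring designation information: a camera to check and a recording time to check, filled in without any input from the guard.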
[0053]
In a video surveillance system such as that of the second embodiment, video captured by multiple monitoring cameras 31 is stored day after day in the video data server 30 as video data 32. To manage and operate this surveillance video archive efficiently, the feature extraction processing unit 34 generates the metadata 35 and associates it with each surveillance video.
[0054]
The RCM creation unit 43 takes as input the multiple pieces of metadata 35, each associated with surveillance video data, filters them using the adaptation hint 36 and the monitoring designation information 41, and creates and sends the RCM 42. In doing so, the RCM creation unit 43 can use the summary information of the metadata 35 contained in the adaptation hint 36 to avoid searching unnecessary metadata 35, which makes the search more efficient. Summary items that the adaptation hint 36 might describe include features that characterize an event scene in the surveillance video, for example color features, motion features, or sound. Concretely, features such as the color of clothing, the speed at which a person moves, or the loudness of a sound can be used, for example, to identify a suspicious person in a store. If similar events, including those in video from other cameras, are to be checked and the events are characterized by color, summary information about the color feature descriptors in the video may be described as the adaptation hint 36.
[0055]
When the person monitoring receives the RCM 42 from the RCM creation unit 43 on the monitoring terminal 40, he or she transmits playback video designation information / search key 44 to the playback event determination unit 45. This is either playback video designation information, which selects the video to be played from the thumbnails of representative images of event occurrence sections provided in the RCM 42, or a search key, which selects from the thumbnails an image representing the event to be searched for.
[0056]
When the playback video designation information / search key 44 is playback video designation information, the playback event determination unit 45 transmits a video playback request 46 to the video format determination unit 47. The video playback request 46 includes at least the locator information of the video data 32 and the time span to be played, for example the playback start time and the playback time. When the playback video designation information / search key 44 is a search key, the playback event determination unit 45 searches all the metadata 35 belonging to the specified monitoring camera 31 or time section for metadata 35 containing an event similar to the selected event. Like the RCM creation unit 43, the playback event determination unit 45 can use the summary information of the metadata 35 contained in the adaptation hint 36 to avoid searching unnecessary metadata 35 and so improve search efficiency.
[0057]
(3). Video data distribution request / reproduction
When making a video data distribution request, the monitoring terminal 40 first sends terminal information 48 to the surveillance video collection and distribution server 49. The terminal information 48 includes at least the display resolution of the monitoring terminal 40 and the video formats it can receive, such as MPEG-4, MPEG-2, Motion-JPEG, JPEG, Windows (registered trademark) Media Video, and DivX. The monitoring terminal 40 receives the adapted video data 50 transmitted from the surveillance video collection and distribution server 49 and plays the video. The adapted video data 50 is described later. The monitoring terminal 40 may be a PC in a monitoring center, or a PDA, PC, mobile phone, or the like carried by a security guard.
[0058]
(4). Surveillance video distribution server
The surveillance video collection and distribution server 49 is operated by, for example, a monitoring service center (not shown). Based on the terminal information 48 from the monitoring terminal 40 and the video playback request 46 from the playback event determination unit 45, it converts the designated video data 32, from the playback start time, into a video format described in the terminal information 48 that the monitoring terminal 40 can play, for example MPEG-4, MPEG-2, Motion-JPEG, JPEG, Windows (registered trademark) Media Video, or DivX, and distributes it as adapted video data 50.
[0059]
The video format determination unit 47 determines a format that the monitoring terminal 40 can play based on the terminal information 48 input from the monitoring terminal 40 and sends format information 52 to the video format conversion unit 51. It also sends to the video data server 30 the video playback request 46, which specifies the locator information of the corresponding video data 32, the playback start time, and the playback time.
[0060]
The video data server 30 transmits the video data 32 corresponding to the video playback request 46 from the video format determination unit 47 to the video format conversion unit 51. The video format conversion unit 51 converts the format of the video data 32 input from the video data server 30 based on the format information 52 from the video format determination unit 47, and sends the result to the monitoring terminal 40 as the adapted video data 50. The monitoring terminal 40 receives and plays the adapted video data 50 transmitted from the surveillance video collection and distribution server 49.
[0061]
Therefore, with the video surveillance system according to the second embodiment, for the large amount of video data captured by the monitoring cameras 31, the locations in the video where an abnormality was detected can be managed uniformly as events in the metadata 35, together with the condition information at the time of capture, and video scenes at a desired time or containing similar events can be accessed efficiently.
[0062]
In particular, since all events are described in the metadata 35, special playback functions such as high-speed playback or skip playback of only the event scenes can easily be realized by suitably designing the video playback function.
[0063]
In addition, because the video data format is converted automatically at transmission time based on the information about the video playback terminal, the desired video scene can be accessed quickly, anytime and anywhere, regardless of the type of terminal, whether a video terminal in the monitoring center or a mobile terminal carried by a security guard on site, such as a mobile phone with a video playback function or a PDA.
[0064]
In the second embodiment, the surveillance video event detection server 39 and the surveillance video collection and distribution server 49 are configured as separate servers, but they may be configured as the same server. Also, although the monitoring terminal 40 has been described as handling everything up to video playback, it may of course be split into separate terminals, like the mobile phone 1 and the video terminal 4 of the first embodiment.
[0065]
Further, as in the first embodiment, the processing in the surveillance video event detection server 39 and the surveillance video collection and distribution server 49, such as that of the RCM creation unit 43, the feature extraction processing unit 34, the playback event determination unit 45, and the video format determination unit 47, may be implemented not only in hardware but also in software by executing a program. The same applies to the monitoring terminal 40, whose functions may be implemented not only in hardware but also in software by executing a program.
[0066]
[Effect of the invention]
As described above, according to the present invention, metadata is selected based on metadata summary information that describes what information is contained in the metadata of video content. Therefore, even when the amount of metadata is enormous, as with the metadata of video content, the target metadata can be extracted efficiently.
[Brief description of the drawings]
FIG. 1 is a diagram showing an overall system configuration for realizing a network distribution type content providing service according to a first embodiment.
FIG. 2 is a diagram illustrating an example of realizing metadata used in the first embodiment;
FIG. 3 is a diagram showing an example of a data format of a content purchase ticket used in the first embodiment.
FIG. 4 is a diagram illustrating an overall system configuration of a video surveillance system according to a second embodiment.
[Explanation of symbols]
1 mobile phone, 2 content purchase processing server, 3 content distribution server, 4 video terminal, 5 metadata, 6 video data, 7 adaptation hint, 8 RCM creation unit, 9 RCM, 10 purchase request, 11 purchase processing unit, 12 content viewing ticket, 15 content viewing status information, 16 terminal information, 17 adapted video data, 18 video format determination unit, 20 video format conversion unit, 21 video data server.

Claims


A metadata selection processing method for selecting metadata based on metadata summary information that describes what information is contained in the metadata of video content.

The metadata selection processing method according to claim 1, wherein the metadata includes information describing a scene structure of the video content and information describing the characteristic semantic content of each scene, and metadata in which information describing the characteristic semantic content of each scene is present is selected based on the metadata summary information.

3. The metadata selection processing method according to claim 1, wherein the occurrence of a distinctive change in a video feature quantity in the video is detected as an event, wherein the metadata includes information describing a scene structure of the video content and information describing the events of each scene, and wherein metadata in which the information describing the events of each scene is present is selected based on the metadata summary information.

4. A metadata selection and integration processing method comprising: receiving as input metadata summary information describing what information is included in metadata of video content, and query information for extracting a part or the whole of desired video content; identifying, based on the metadata summary information, metadata in which a data element corresponding to the query information is present, and extracting that data element; and integrating the extracted data elements according to a predetermined rule.

5. The metadata selection and integration processing method according to claim 4, wherein the metadata includes information describing a scene structure of the video content and information describing characteristic semantic content of each scene, and wherein, based on the metadata summary information, metadata in which the information describing the semantic content of each scene is present is identified, the data element is extracted, and the extracted data elements are integrated according to the predetermined rule.

6. The metadata selection and integration processing method according to claim 4, wherein the occurrence of a distinctive change in a video feature quantity in the video is detected as an event, wherein the metadata includes information describing a scene structure of the video content and information describing the events of each scene, and wherein, based on the metadata summary information, metadata in which the information describing the events of each scene is present is identified, the data element is extracted, and the extracted data elements are integrated according to the predetermined rule.

7. A video playback method comprising: receiving as input metadata summary information describing what information is included in metadata of video content, and query information for extracting a part or the whole of desired video; identifying, based on the metadata summary information, metadata in which a data element corresponding to the query information is present, and extracting that data element; integrating the extracted data elements according to a predetermined rule; and identifying the video scenes that contain the extracted data elements and playing them back continuously.

8. A content purchase processing method comprising: transmitting a content menu customized for each user to a user terminal; and, when a content purchase request made with reference to the content menu is received from the user terminal and viewing of the content is to be permitted, transmitting to the user terminal a content viewing ticket that describes at least locator information of the video data constituting the content, the total playback time of the content, information indicating the time from which the video data should be played back, and a viewing expiration date of the content.

9. The content purchase processing method according to claim 8, wherein: upon receiving the content viewing ticket, the user terminal itself, or another video playback terminal to which the user terminal has passed the content viewing ticket, transmits to a content distribution server, based on the content viewing ticket, content viewing status information that includes the locator information of the content and playback time information indicating how far the content has been viewed from the start of playback, together with terminal information indicating at least a video data format that the terminal can receive; upon receiving the content viewing status information and the terminal information from the terminal, the content distribution server determines, according to the terminal information, an optimal data format for the video data to be transmitted to the terminal, and transmits the requested content to the terminal in the determined optimal data format, from the beginning or from partway through according to the content viewing status information; and the terminal plays back the content, from the beginning or from partway through, as transmitted in the data format optimal for the terminal.

10. The content purchase processing method according to claim 9, wherein, after playing back the content transmitted from the beginning or from partway through in the data format optimal for the terminal, the terminal, upon ending playback, updates the playback time information of the content viewing ticket based on the playback time.

11. A content purchase processing server that transmits a content menu customized for each user to a user terminal and, when a content purchase request made with reference to the content menu is received from the user terminal and viewing of the content is to be permitted, transmits to the user terminal a content viewing ticket that describes at least locator information of the video data constituting the content, the total playback time of the content, information indicating the time from which the video data should be played back, and a viewing expiration date of the content.

12. A content distribution server which, upon receiving, from a user terminal that has itself received a content viewing ticket describing locator information of the video data constituting content, the total playback time of the content, information indicating the time from which the video data should be played back, and a viewing expiration date of the content, or from another video playback terminal to which the content viewing ticket has been passed, content viewing status information that is based on the content viewing ticket and includes the locator information of the content and playback time information indicating how far the content has been viewed from the start of playback, together with terminal information indicating at least a video data format that the terminal can receive: determines, according to the terminal information, an optimal data format for the video data to be transmitted to the terminal; and transmits the requested content to the terminal in the determined optimal data format, from the beginning or from partway through according to the content viewing status information, thereby causing the terminal to play back the content, from the beginning or from partway through, in the data format optimal for the terminal.

13. A video playback terminal that receives information of a content viewing ticket describing locator information of the video data constituting content, the total playback time of the content, information indicating the time from which the video data should be played back, and a viewing expiration date of the content, and that, based on the information of the content viewing ticket and on terminal information indicating at least a video data format that the terminal itself can receive and play back, requests that the content to be viewed be transmitted in an optimal video data format that the terminal can receive and play back.

14. A metadata selection and integration processing program that causes a computer to execute: receiving as input metadata summary information describing what information is included in metadata of video content, and query information for extracting a part or the whole of desired video content; identifying, based on the metadata summary information, metadata in which a data element corresponding to the query information is present, and extracting that data element; and integrating the extracted data elements according to a predetermined rule.