JPH10304247A


Info

Publication number: JPH10304247A
Application number: JP9112719A
Authority: JP
Inventors: Takashi Sato (佐藤隆); Yoshinobu Tonomura (外村佳伸)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1997-04-30
Filing date: 1997-04-30
Publication date: 1998-11-13
Anticipated expiration: 2017-04-30
Also published as: JP3503797B2

Abstract

PROBLEM TO BE SOLVED: To prevent over-detection of objects other than telops and missed detection of telops caused by noise, by extracting telop candidate pixels in units of pixels or groups of pixels, storing them in a three-dimensional buffer made up of the vertical and horizontal spatial axes and a time axis, and merging the telop candidate pixels in the buffer. SOLUTION: A telop candidate pixel extraction unit 102 obtains an edge image from the input video using an image processing operator such as the Laplacian. The edge image is then projected in the vertical direction to compute a frequency profile, and the range exceeding a threshold is obtained. The same processing is applied in the horizontal direction, the results are combined, and telop candidate pixels, whose value is 1 inside the obtained range and 0 elsewhere, are output and stored in a buffer 103. A merging unit 104 convolves a three-dimensional Gaussian filter with the buffer 103. Dilation and erosion are also applied to repair pixel defects such as holes and gaps. Finally, a determination unit 106 determines and outputs a representative frame showing the telop.

Description

Translated from Japanese

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of the Invention] The present invention relates to a method and an apparatus for detecting telops (superimposed captions) in video.

[0002]

[Description of the Related Art] Conventional apparatuses for detecting telops in video detect a telop using local features taken from one to several frame images.

[0003] For example, one method exploits the large luminance difference around a telop. It first examines changes in the luminance and hue distributions between frames to find the frame in which a telop appears, and then takes the difference between the frames before and after the telop appears to extract the telop (for example, Nemoto et al., "Retrieval of Material Video by Recognition of Telop," 1994 IEICE Spring Conference, D-427, 1994).

[0004] Another method operates on a single frame image and uses the property that a telop is brighter than the background, so that its edges are easy to extract. Edges are extracted from the image by first-order differentiation, and the edge image is projected in the vertical and horizontal directions to detect the telop (for example, Mogi et al., "Indexing Articles Based on Character Recognition in News Videos," IEICE Technical Report IE95-153, 1996).

[0005] There is also an apparatus that uses the properties that a telop is stationary and has high luminance. It finds the portions with no motion between two frames and then detects, as the caption portion, the regions whose luminance is at or above a predetermined value (for example, Japanese Unexamined Patent Application Publication No. H8-331456, "Caption moving device").

[0006] Furthermore, in video encoded using inter-frame correlation, such as MPEG, pixels that are encoded using inter-frame correlation but without motion compensation tend to concentrate, in both time and space, in telop regions. There is an apparatus that exploits this property and detects telops by counting, over a certain time interval, how often such pixels occur (Sato et al., "Method of Extracting Telop Regions from MPEG Video," 1996 IEICE Information and Systems Society Conference, D-273).

[0007]

[Problems to Be Solved by the Invention] The conventional techniques described above detect telops using temporally local information, namely one or two frame images. Consequently, when a subject other than a telop has telop-like characteristics, such as being stationary, having high luminance, or having large high-frequency components, it is erroneously detected as a telop.

[0008] Conversely, if a telop that stays on screen for a long time temporarily moves or has its outline blurred because of image degradation, noise, or the like, that portion is missed. As a result, what is really one continuous telop is detected redundantly as separate telops spanning several time intervals.

[0009] In other words, because the conventional techniques decide whether a telop is present by looking at one short interval, they can hardly avoid over-detecting non-telop subjects and missing telops because of noise. When they are used to build a list of the telops in a video, non-telop subjects are often displayed by mistake and a single telop is often displayed more than once.

[0010] An object of the present invention is to provide a video telop detection method and apparatus that detect the telops in a video without excess or omission.

[0011]

[Means for Solving the Problems] To achieve the above object, the video telop detection method has an extraction step of extracting telop candidate pixels in units of pixels or sets of pixels and storing them in a three-dimensional buffer consisting of the vertical and horizontal spatial axes and a time axis, and a merging step of merging the telop candidate pixels in the buffer.

[0012] The video telop detection apparatus of the present invention has a three-dimensional buffer consisting of the vertical and horizontal spatial axes and a time axis, extraction means for extracting telop candidate pixels in units of pixels or sets of pixels and storing them in the buffer, and merging means for merging the telop candidate pixels in the buffer.

[0013] Extracting telop candidate pixels from the video in units of pixels or sets of pixels and storing them in a three-dimensional buffer consisting of the vertical and horizontal spatial axes and a time axis makes it possible to process video spanning a longer time than the conventional techniques. In addition, merging the telop candidate pixels in the buffer makes it possible to ignore small short-lived changes and remove the influence of noise.

[0014] According to an embodiment of the present invention, the extraction means has edge generation means for obtaining the edges of the video, projection means for projecting the edge values in the vertical and horizontal directions, and comparison means for determining the telop candidate pixels from the result of comparing the projected values with a threshold.

[0015] In the extraction means, the edge generation means obtains the edges of the video. Because a telop has large high-frequency components, this yields an image in which edges are concentrated around the telop. Next, the projection means projects the edge values in the vertical and horizontal directions, which makes it possible to evaluate the degree of edge concentration in one dimension. The comparison means compares the projected values with a threshold and thereby detects the portions where edges are concentrated. The telop candidate pixels are obtained in this way.

[0016] According to an embodiment of the present invention, the extraction means has counting means that, from video data encoded using inter-frame correlation, counts within a counting interval, for each pixel position, the number of pixels encoded using inter-frame correlation and without motion compensation.

[0017] The counting means counts, from video data encoded using inter-frame correlation and for each pixel position, the number of pixels encoded using inter-frame correlation and without motion compensation within a certain counting interval. Because such pixels tend to concentrate in telops, large counts are obtained only for telop pixels. This yields telop candidate pixels whose value is larger the more likely the pixel is to belong to a telop.

[0018] According to an embodiment of the present invention, the merging means has smoothing means for smoothing the telop candidate pixels with a three-dimensional smoothing filter.

[0019] In the merging means, the smoothing means smooths the telop candidate pixels with a three-dimensional smoothing filter, so that neighboring telop candidate pixels are merged with one another and small isolated telop candidate pixels disappear.

[0020] According to an embodiment of the present invention, the merging means has dilation means for replacing a telop candidate pixel with the maximum value of its neighboring pixels and erosion means for replacing a telop candidate pixel with the minimum value of its neighboring pixels.

[0021] In the merging means, the dilation means replaces each telop candidate pixel with the maximum value of its neighboring pixels, so that neighboring telop candidate pixels are merged with one another. The erosion means replaces each telop candidate pixel with the minimum value of its neighboring pixels, so that small isolated pixels disappear. Combining these two means merges neighboring telop candidate pixels and removes small isolated telop candidate pixels.

[0022] According to an embodiment of the present invention, the video telop detection apparatus further has determination means that takes the frame before or after a time period containing no telop candidate pixels as a representative frame containing a telop.

[0023] Because the determination means takes the frame before or after a time interval containing no telop candidate pixels as the representative frame of a telop, a representative frame showing the telop can be obtained.

[0024] According to an embodiment of the present invention, the video telop detection apparatus further has labeling means for assigning labels to the connected components of the merged telop candidate pixels, and determination means that takes a frame containing labeled telop candidate pixels as a frame containing a telop.

[0025] Because the labeling means assigns labels to the connected components of the merged telop candidate pixels, individual telops can be distinguished. Because the determination means takes a frame containing labeled telop candidate pixels as the representative frame of a telop, a representative frame can be obtained for each telop without excess or omission.

[0026]

[Embodiments of the Invention] Next, embodiments of the present invention are described with reference to the drawings.

[0027] FIG. 1 is a block diagram of a video telop detection apparatus according to a first embodiment of the present invention.

[0028] The video telop detection apparatus of this embodiment consists of a buffer 103; a telop candidate pixel extraction unit 102 that detects, from the video input at input terminal 101, the pixels or sets of pixels that are candidates for telop regions and accumulates them in buffer 103; and a merging unit 104 that merges the telop candidates accumulated in buffer 103 and outputs them to output terminal 105.

[0029] A block of 8×8 to 16×16 pixels can be used as the set of pixels. Buffer 103 is three-dimensional and, as shown in FIG. 2, is described by two axes x and y parallel to the screen and an axis t perpendicular to it. For example, when 16×16 blocks are used on a 720×480-pixel screen, buffer 103 has width W = 45 and height H = 30. The depth T of buffer 103 is the duration of the target video divided by the temporal resolution. For example, when a 10-minute video is processed at 0.5-second intervals, T is 1200.
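The buffer dimensions in the example above follow from simple division. The sketch below reproduces that arithmetic; the function name `buffer_shape` and its parameter names are illustrative and do not come from the source.

```python
def buffer_shape(frame_w, frame_h, block, duration_s, interval_s):
    """Return (W, H, T) for the 3D candidate buffer.

    frame_w, frame_h: frame size in pixels
    block:            side length of the square pixel block
    duration_s:       length of the video in seconds
    interval_s:       temporal resolution (sampling interval) in seconds
    """
    W = frame_w // block                      # blocks per row
    H = frame_h // block                      # blocks per column
    T = int(round(duration_s / interval_s))   # time slices
    return W, H, T


# 720x480 frame, 16x16 blocks, 10 minutes sampled every 0.5 s
print(buffer_shape(720, 480, 16, 10 * 60, 0.5))  # (45, 30, 1200)
```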

[0030] FIG. 3 is a block diagram of the telop candidate pixel extraction unit 102. Edge generation unit 202 computes an edge image from the video input at input terminal 201 and stores it in buffer 203. An image processing operator such as the Laplacian or the Roberts operator can be used to compute the edges. Next, vertical projection unit 204 projects the image in the vertical direction and takes the frequency. As shown by v in FIG. 4, the frequency is high where edges are concentrated, so comparison unit 205 compares it with the threshold supplied at input terminal 206 and obtains the range (x0 to x1) at or above the threshold. Restricted to the range (x0 to x1), horizontal projection unit 207 then projects the edge image in the horizontal direction and obtains the frequency. Comparison unit 208 compares this with the threshold supplied at input terminal 209 and obtains the range (y0 to y1) at or above the threshold. Combining unit 210 outputs to output terminal 211 the telop candidate pixels, in which pixels inside the range (x0 to x1, y0 to y1) obtained above have the value 1 and all others have the value 0.
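The projection procedure can be sketched as follows. This is a minimal illustration that assumes the edge image is already binarized (0/1) and that the thresholds are plain pixel counts; the names `runs_at_or_above` and `telop_candidates` are hypothetical. Every range found in the vertical projection receives its own horizontal projection.

```python
def runs_at_or_above(profile, threshold):
    """Return inclusive (start, end) index ranges where profile >= threshold."""
    runs, start = [], None
    for i, v in enumerate(profile):
        if v >= threshold and start is None:
            start = i
        elif v < threshold and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(profile) - 1))
    return runs


def telop_candidates(edge, th_v, th_h):
    """edge: 2D list of 0/1 edge pixels (rows = y, columns = x).

    Returns a mask of the same size with 1 inside each detected
    (x0..x1, y0..y1) rectangle and 0 elsewhere.
    """
    h, w = len(edge), len(edge[0])
    mask = [[0] * w for _ in range(h)]
    # Vertical projection: sum each column over y, giving a profile along x.
    col_profile = [sum(edge[y][x] for y in range(h)) for x in range(w)]
    for x0, x1 in runs_at_or_above(col_profile, th_v):
        # Horizontal projection limited to columns x0..x1: profile along y.
        row_profile = [sum(edge[y][x0:x1 + 1]) for y in range(h)]
        for y0, y1 in runs_at_or_above(row_profile, th_h):
            for y in range(y0, y1 + 1):
                for x in range(x0, x1 + 1):
                    mask[y][x] = 1
    return mask


edge = [
    [0, 0, 0, 0, 0, 0, 0, 1],   # one isolated edge pixel
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 1, 1, 0, 0],   # dense edges, as around caption text
    [0, 0, 1, 1, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
mask = telop_candidates(edge, th_v=2, th_h=3)
print(sum(map(sum, mask)))  # 8: only the 4x2 dense block is kept
```

The isolated pixel in the top row never reaches the column threshold, so it is excluded from the mask.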

[0031] When the vertical projection yields more than one range at or above the threshold, the horizontal projection is performed for each range. The order of vertical projection unit 204 and horizontal projection unit 207 may also be swapped.

[0032] FIG. 5 is a block diagram of another example of the telop candidate pixel extraction unit 102. This example targets video data encoded using inter-frame correlation, such as MPEG. For the encoded video data input at input terminal 301, position decoding unit 302 decodes the pixel position and outputs it to the address (A) of counter 304. Likewise, type decoding unit 303 decodes the encoding type of the pixel. Type decoding unit 303 outputs "1" only when the pixel was encoded using inter-frame correlation and without motion compensation, and outputs "0" otherwise. This signal controls the incrementing and decrementing of counter 304. The values of counter 304 are incremented or decremented during the counting period and are then output unchanged to output terminal 305 as telop candidate pixels. After output, all values of counter 304 are reset to 0.
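The counting scheme can be illustrated with a small sketch. It does not parse a real MPEG stream: each frame is assumed to be already decoded into hypothetical `(x, y, is_inter, uses_mc)` tuples, one per block, and the counter is only incremented, which is a simplification of the increment/decrement control described above.

```python
def count_static_inter_blocks(frames, width, height):
    """Count, per block position, how often a block was coded with
    inter-frame prediction but without motion compensation.

    frames: iterable of frames; each frame is a list of
            (x, y, is_inter, uses_mc) tuples (hypothetical representation).
    Returns a height x width grid of counts for one counting interval.
    """
    counter = [[0] * width for _ in range(height)]
    for frame in frames:
        for x, y, is_inter, uses_mc in frame:
            if is_inter and not uses_mc:   # the "1" case of the type decoder
                counter[y][x] += 1
    return counter


frames = [
    [(0, 0, True, False), (1, 0, False, False)],  # (1,0) intra-coded
    [(0, 0, True, False), (1, 0, True, True)],    # (1,0) motion-compensated
    [(0, 0, True, False), (1, 0, True, False)],
]
print(count_static_inter_blocks(frames, width=2, height=2))  # [[3, 1], [0, 0]]
```

A caller would invoke the function once per counting interval, which corresponds to resetting the counter after each output.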

[0033] Next, the merging unit 104 is described.

[0034] First, the merging unit 104 that uses a three-dimensional smoothing filter is described. As the three-dimensional smoothing filter, consider the following three-dimensional Gaussian filter.

[0035]

[Equation 1]

This filter is convolved with buffer 103 (B(x, y, t)).

[0036]

[Equation 2]

Alternatively, the one-dimensional Gaussian filter

[0037]

[Equation 3]

may be convolved in turn along each of the three axes x, y, and t. That is, the following is used.

[0038]

[Equation 4]
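Equations 1 to 4 are not reproduced in this text. As a reading aid only, the following shows the standard forms they would take under the assumption of an isotropic Gaussian with a single standard deviation σ and a discrete convolution over the buffer. The symbols σ, B′, and g, the normalization, and the use of one σ for all three axes are assumptions and may differ from the original equations.

```latex
\begin{align}
G(x,y,t) &= \frac{1}{(2\pi)^{3/2}\sigma^{3}}
            \exp\!\left(-\frac{x^{2}+y^{2}+t^{2}}{2\sigma^{2}}\right) \\
B'(x,y,t) &= (G * B)(x,y,t)
           = \sum_{u}\sum_{v}\sum_{w} G(u,v,w)\,B(x-u,\;y-v,\;t-w) \\
g(s) &= \frac{1}{\sqrt{2\pi}\,\sigma}
        \exp\!\left(-\frac{s^{2}}{2\sigma^{2}}\right) \\
B'(x,y,t) &= g(t) *_{t} \Bigl( g(y) *_{y} \bigl( g(x) *_{x} B(x,y,t) \bigr) \Bigr)
\end{align}
```

Under these assumptions the two routes give the same result because the kernel factorizes, G(x, y, t) = g(x) g(y) g(t), so three one-dimensional passes replace one three-dimensional pass at lower cost.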

[0039] Next, the merging unit 104 that uses dilation and erosion is described. Dilation sets a pixel B(x, y, t) to the maximum value of the points contained in its neighborhood R(x, y, t). That is, the pixel B(x, y, t) is given the value Be(x, y, t) defined by the following equation.

[0040]

[Equation 5]

Dilation fills holes and gaps whose width, height, or depth is smaller than R. For example, applying 4-neighborhood dilation (the four pixels touching the pixel of interest above, below, left, and right) to FIG. 6(1) gives FIG. 6(2). The gap between the two black regions disappears, and the white hole inside the black region is also filled. Erosion sets a pixel B(x, y, t) to the minimum value of the points contained in its neighborhood R(x, y, t). That is, the pixel B(x, y, t) is given the value Be(x, y, t) defined by the following equation.

[0041]

[Equation 6]

Erosion removes regions whose width, height, or depth is smaller than R. For example, applying 4-neighborhood erosion to FIG. 6(1) gives FIG. 6(3); the black region of height 2 has disappeared. Applying erosion to FIG. 6(2), the result of the earlier dilation, gives FIG. 6(4). Comparing FIG. 6(1) with FIG. 6(4) shows that the holes and gaps are gone while the size is preserved. That is, dilation followed by erosion compensates for missing pixels such as holes and gaps. Likewise, applying dilation to FIG. 6(3), the result of erosion, gives FIG. 6(5). Comparing FIG. 6(1) with FIG. 6(5) shows that the small regions have disappeared while the size is preserved. That is, erosion followed by dilation removes noise.
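The two operations and their combinations can be sketched on the three-dimensional buffer as follows. This is a minimal illustration that assumes a 6-neighborhood (the cell plus its face neighbors along x, y, and t) as the neighborhood R and ignores neighbors that fall outside the buffer; both choices, and the function names, are assumptions rather than details taken from the source.

```python
def _morph(buf, pick):
    """Apply pick (max or min) over each cell and its face neighbours.

    buf is indexed as buf[t][y][x]; neighbours outside the buffer are ignored.
    """
    T, H, W = len(buf), len(buf[0]), len(buf[0][0])
    offsets = [(0, 0, 0), (1, 0, 0), (-1, 0, 0),
               (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    out = [[[0] * W for _ in range(H)] for _ in range(T)]
    for t in range(T):
        for y in range(H):
            for x in range(W):
                vals = [buf[t + dt][y + dy][x + dx]
                        for dt, dy, dx in offsets
                        if 0 <= t + dt < T and 0 <= y + dy < H and 0 <= x + dx < W]
                out[t][y][x] = pick(vals)
    return out


def dilate(buf):
    return _morph(buf, max)        # neighbourhood maximum


def erode(buf):
    return _morph(buf, min)        # neighbourhood minimum


def close_gaps(buf):
    return erode(dilate(buf))      # dilation then erosion: fills holes and gaps


def remove_noise(buf):
    return dilate(erode(buf))      # erosion then dilation: drops small regions


# One block position over 7 time slices: a one-slice dropout at t = 2.
dropout = [[[v]] for v in [1, 1, 0, 1, 1, 0, 0]]
print([p[0][0] for p in close_gaps(dropout)])   # [1, 1, 1, 1, 1, 0, 0]

# A one-slice false detection at t = 1 followed by a real telop at t = 4..6.
blip = [[[v]] for v in [0, 1, 0, 0, 1, 1, 1]]
print([p[0][0] for p in remove_noise(blip)])    # [0, 0, 0, 0, 1, 1, 1]
```

In the first example the temporary dropout is filled without lengthening the telop; in the second the single-slice detection is removed while the longer run keeps its extent.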

[0042] Another example of merging unit 104 is configured as shown in FIG. 7 so that the order of the dilation-erosion processing and the erosion-dilation processing can be changed. Telop candidate pixels input at input terminal 401 are processed by dilation units 402 and 405 and erosion units 403 and 404 via four ganged switches 406, and are output to output terminal 407. When the contacts of switches 406 are in the upper position, the dilation-erosion processing is performed first, followed by the erosion-dilation processing. When the contacts of switches 406 are in the lower position, the order is reversed.

[0043] Performing the dilation-erosion processing first gives priority to filling defects; performing the erosion-dilation processing first gives priority to removing noise.

[0044] FIG. 8 is a block diagram of an embodiment in which a determination unit is added to the first embodiment of FIG. 1. Telop candidate pixel extraction unit 102 detects, from the video input at input terminal 101, the pixels or sets of pixels that are candidates for telop regions and accumulates them in buffer 103, and merging unit 104 merges the telop candidate pixels. From the merged telop candidate pixels, determination unit 106 determines the representative frame of each telop and outputs it to output terminal 105.

[0045] Next, two examples of determination unit 106 are described. The first takes the frame before or after a time interval containing no telop candidate pixels as the representative frame of a telop.

[0046] For example, suppose telops A to G are arranged in time as shown in FIG. 9. In this figure the horizontal axis is the time axis (t axis) and the vertical axis is the x or y axis. b₁ to b₄ mark the frames that follow a time interval containing no telop candidate pixels, and f₁ to f₄ mark the frames that precede such an interval.

[0047] If b₁ to b₄ are used as the representative frames, telops A, B, D, F, and G are reflected, but telops such as C and E, which appear while another telop is on screen, are not. If f₁ to f₄ are used instead, telops A, B, C, D, and F are reflected, but telops such as E and G, which disappear while another telop is on screen, are not. Because intervals containing no telop candidate pixels are relatively easy to detect, this method has the advantage of simplicity.
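The gap-based selection can be sketched as follows, assuming the merged buffer has been reduced to one flag per time slice that says whether any telop candidate pixel is present. The function name and the treatment of the start and end of the video as empty intervals are assumptions.

```python
def representative_frames(occupied):
    """occupied[t] is True when time slice t holds at least one candidate pixel.

    Returns (after_gap, before_gap):
      after_gap  - slices that directly follow an empty interval (the b_i frames)
      before_gap - slices that directly precede an empty interval (the f_i frames)
    The start and end of the video are treated as empty intervals.
    """
    after_gap, before_gap = [], []
    for t, on in enumerate(occupied):
        prev_on = occupied[t - 1] if t > 0 else False
        next_on = occupied[t + 1] if t + 1 < len(occupied) else False
        if on and not prev_on:
            after_gap.append(t)
        if on and not next_on:
            before_gap.append(t)
    return after_gap, before_gap


occupied = [False, True, True, False, False, True, True, True, False]
print(representative_frames(occupied))  # ([1, 5], [2, 7])
```

Because the flag only records whether any candidate is present, a telop that starts or ends while another one is on screen produces no transition, which is the limitation noted above for telops C, E, and G.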

[0048] Next, an example of determination unit 106 that uses labeling is described.

[0049] FIG. 10 is a block diagram of the determination unit 106 that uses labeling. For the telop candidate pixels input at input terminal 501, labeling unit 502 finds the connected components with neighboring pixels and stores them in buffer 503 as label information. The label information is managed in a table such as the one shown in FIG. 11(1). Here, as shown in FIG. 11(2), the position of each label is expressed by the coordinates of its circumscribed cuboid. Determination unit 504 selects a t in the range t₀ ≤ t ≤ t₁ and outputs it to output terminal 505 as the representative frame.
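The labeling step can be sketched as a breadth-first search over the three-dimensional buffer. The sketch assumes a binary buffer indexed as `buf[t][y][x]` and 6-connectivity; the table layout (a list of dictionaries with a `box` tuple) is a hypothetical stand-in for the table of FIG. 11(1).

```python
from collections import deque


def label_components(buf):
    """Label 6-connected components in buf[t][y][x] (values 0/1).

    Returns one table row per component with its circumscribed cuboid
    as (x0, x1, y0, y1, t0, t1).
    """
    T, H, W = len(buf), len(buf[0]), len(buf[0][0])
    seen = [[[False] * W for _ in range(H)] for _ in range(T)]
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    table = []
    for t in range(T):
        for y in range(H):
            for x in range(W):
                if not buf[t][y][x] or seen[t][y][x]:
                    continue
                box = [x, x, y, y, t, t]
                queue = deque([(t, y, x)])
                seen[t][y][x] = True
                while queue:
                    ct, cy, cx = queue.popleft()
                    box = [min(box[0], cx), max(box[1], cx),
                           min(box[2], cy), max(box[3], cy),
                           min(box[4], ct), max(box[5], ct)]
                    for dt, dy, dx in steps:
                        nt, ny, nx = ct + dt, cy + dy, cx + dx
                        if (0 <= nt < T and 0 <= ny < H and 0 <= nx < W
                                and buf[nt][ny][nx] and not seen[nt][ny][nx]):
                            seen[nt][ny][nx] = True
                            queue.append((nt, ny, nx))
                table.append({"label": len(table) + 1, "box": tuple(box)})
    return table


# Two telops that overlap in time but sit at different x positions.
buf = [
    [[1, 0, 0]],   # t = 0
    [[1, 0, 1]],   # t = 1
    [[1, 0, 1]],   # t = 2
    [[0, 0, 1]],   # t = 3
]
for row in label_components(buf):
    t0, t1 = row["box"][4], row["box"][5]
    print(row["label"], "appears at t =", t0, "and ends at t =", t1)
```

Taking `t0` from each row as the representative frame gives one frame per telop even when telops overlap in time.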

[0050] As an example, FIG. 12 shows the same temporal arrangement of telops as FIG. 9. This embodiment can distinguish each of the telops A to G and obtain its time range. Here the frame in which each telop appears (t₀) is taken as the representative frame, and the times t₁ to t₆ are output.

[0051] The frame just before the telop disappears (t₁) may be used as the representative frame instead, as may a frame midway between t₀ and t₁.

[0052] The present invention can be modified in various ways without departing from its spirit. For example, the telop detection results can be used to display the representative frames and build a list of the telops in a video.

[0053]

[Effects of the Invention] As described above, the present invention removes false detections caused by the brief appearance of telop-like subjects and compensates for temporary missed detections caused by image degradation, noise, and the like, so that telops can be detected without excess or omission.

[Brief Description of the Drawings]

FIG. 1 is a block diagram of a video telop extraction apparatus according to an embodiment of the present invention.

FIG. 2 is an explanatory diagram of the three-dimensional buffer 103.

FIG. 3 is a block diagram of an example of the telop candidate pixel extraction unit 102.

FIG. 4 is an explanatory diagram of the principle of telop detection.

FIG. 5 is a block diagram of another example of the telop candidate pixel extraction unit 102.

FIG. 6 is an illustration of dilation and erosion of telop candidate pixels.

FIG. 7 is a block diagram of an example of the telop candidate pixel merging unit 104.

FIG. 8 is a block diagram of a video telop extraction apparatus according to another embodiment of the present invention.

FIG. 9 is an illustration of the determination result of an embodiment that determines the frames before and after time intervals containing no telop candidate pixels.

FIG. 10 is a block diagram of an example of the determination unit 106 that uses labeling.

FIG. 11 is an illustration of the label information.

FIG. 12 is an illustration of the determination result of an example of the determination unit 106 that uses labeling.

[Explanation of Symbols]

101 Input terminal
102 Telop candidate pixel extraction unit
103 Buffer
104 Merging unit
105 Output terminal
106 Determination unit
201 Input terminal
202 Edge generation unit
203 Buffer
204 Vertical projection unit
205 Comparison unit
206 Input terminal
207 Horizontal projection unit
208 Comparison unit
209 Input terminal
210 Combining unit
211 Output terminal
301 Input terminal
302 Position decoding unit
303 Type decoding unit
304 Counter
305 Output terminal
401 Input terminal
402, 405 Dilation units
403, 404 Erosion units
406 Switch
407 Output terminal
501 Input terminal
502 Labeling unit
503 Buffer
504 Determination unit
505 Output terminal

Claims


[Claims]

[Claim 1] A video telop detection method for detecting telops in video, comprising: an extraction step of extracting telop candidate pixels in units of pixels or sets of pixels and storing them in a three-dimensional buffer consisting of the vertical and horizontal spatial axes and a time axis; and a merging step of merging the telop candidate pixels in the buffer.

[Claim 2] A video telop detection apparatus for detecting telops in video, comprising: a three-dimensional buffer consisting of the vertical and horizontal spatial axes and a time axis; extraction means for extracting telop candidate pixels in units of pixels or sets of pixels and storing them in the buffer; and merging means for merging the telop candidate pixels in the buffer.

[Claim 3] The video telop detection apparatus according to claim 2, wherein the extraction means has edge generation means for obtaining the edges of the video, projection means for projecting the edge values in the vertical and horizontal directions, and comparison means for determining the telop candidate pixels from the result of comparing the projected values with a threshold.

[Claim 4] The video telop detection apparatus according to claim 2, wherein the extraction means has counting means that, from video data encoded using inter-frame correlation, counts within a counting interval, for each pixel position, the number of pixels encoded using inter-frame correlation and without motion compensation.

[Claim 5] The video telop detection apparatus according to any one of claims 2 to 4, wherein the merging means has smoothing means for smoothing the telop candidate pixels with a three-dimensional smoothing filter.

[Claim 6] The video telop detection apparatus according to any one of claims 2 to 4, wherein the merging means has dilation means for replacing a telop candidate pixel with the maximum value of its neighboring pixels and erosion means for replacing a telop candidate pixel with the minimum value of its neighboring pixels.

[Claim 7] The video telop detection apparatus according to any one of claims 2 to 6, further comprising determination means that takes the frame before or after a time interval containing no telop candidate pixels as a representative frame of a telop.

[Claim 8] The video telop detection apparatus according to any one of claims 2 to 6, further comprising labeling means for assigning labels to the connected components of the merged telop candidate pixels, and determination means that takes a frame containing labeled telop candidate pixels as a representative frame of a telop.