JP7633357B1


Info

Publication number: JP7633357B1
Application number: JP2023195624A
Authority: JP
Inventors: 隆史武藤; 秀幸田中; 惠介清水; 愛池上; 有輝荻田; 征斗村田; 廣野北川
Original assignee: Dentsu Inc
Current assignee: Dentsu Group Inc
Priority date: 2023-11-17
Filing date: 2023-11-17
Publication date: 2025-02-19
Anticipated expiration: 2043-11-17
Also published as: JP2025082852A; WO2025105114A1; JP2025082362A

Abstract

Translated from Japanese

【課題】ユーザの意図した表情のキャラクターを含むコマで構成されたコンテンツを容易に生成すること。
【解決手段】コンテンツ生成システムは、キャラクターの動画像又は静止画像を含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成システムであって、第１の二次元操作領域を含むユーザインタフェースと、ユーザが編集するコマのプレビュー画像と、を表示部に表示する表示制御部と、第１の二次元操作領域上のポインタの位置に応じて、プレビュー画像上のキャラクターの表情を決定する表情決定部と、ユーザが編集したコマを含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成部と、を備える。
【選択図】図３

[Problem] To easily generate content made up of frames that include a character with the facial expression intended by a user.
[Solution] A content generation system generates content consisting of one or more frames including a moving image or a still image of a character. The system includes: a display control unit that displays, on a display unit, a user interface including a first two-dimensional operation area and a preview image of a frame to be edited by the user; a facial expression determination unit that determines the facial expression of the character on the preview image according to the position of a pointer on the first two-dimensional operation area; and a content generation unit that generates content consisting of one or more frames including the frame edited by the user.
[Selected figure] Figure 3

Description


本発明は、コンテンツ生成システム、方法、プログラム及び記憶媒体に関する。The present invention relates to a content generation system, method, program, and storage medium.

従来、マルチモーダルモデルを使用して、仮想キャラクター（又は「アバター」）を制御するための方法が知られている（特許文献１参照）。Conventionally, a method for controlling a virtual character (or "avatar") using a multimodal model is known (see Patent Document 1).

特表２０２２－５３４７０８号公報 Japanese Translation of PCT International Application Publication No. 2022-534708

しかしながら、従来、ユーザの意図した表情のキャラクターを含むコマで構成されたコンテンツを生成することについては、何ら提案されていない。However, there has been no previous proposal for generating content consisting of frames containing characters with facial expressions as intended by the user.

本発明が解決しようとする課題は、ユーザの意図した表情のキャラクターを含むコマで構成されたコンテンツを容易に生成することである。The problem that this invention aims to solve is to easily generate content made up of frames that include characters with facial expressions that the user intends.

［１］一態様に係るコンテンツ生成システムは、
キャラクターの動画像又は静止画像を含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成システムであって、
第１の二次元操作領域を含むユーザインタフェースと、ユーザが編集するコマのプレビュー画像と、を表示部に表示する表示制御部と、
前記第１の二次元操作領域上のポインタの位置に応じて、前記プレビュー画像上のキャラクターの表情を決定する表情決定部と、
前記ユーザが編集した前記コマを含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成部と、
を備える。[1] A content generation system according to one aspect is a content generation system that generates content consisting of one or more frames including a moving image or a still image of a character, the system comprising:
a display control unit that displays, on a display unit, a user interface including a first two-dimensional operation area and a preview image of a frame to be edited by a user;
a facial expression determination unit that determines a facial expression of the character on the preview image in accordance with a position of a pointer on the first two-dimensional operation area; and
a content generation unit that generates content consisting of one or more frames including the frame edited by the user.

［２］一態様に係るコンテンツ生成システムは、上記［１］に記載のコンテンツ生成システムにおいて、
前記ユーザインタフェースは、第２の二次元操作領域を含み、該第２の二次元操作領域上のポインタの位置に応じて、前記プレビュー画像上のキャラクターの目線を決定する目線決定部をさらに備える。[2] A content generation system according to one aspect is the content generation system according to the above [1],
wherein the user interface includes a second two-dimensional operation area, and the system further includes a line of sight determination unit that determines a line of sight of the character on the preview image according to a position of a pointer on the second two-dimensional operation area.

［３］一態様に係るコンテンツ生成システムは、上記［１］又は［２］に記載のコンテンツ生成システムにおいて、
前記ユーザによる前記ユーザインタフェースの操作に応じて、前記プレビュー画像上の前記キャラクターのモーションを決定するモーション決定部をさらに備える。[3] A content generation system according to one aspect is the content generation system according to the above [1] or [2],
and further includes a motion determination unit that determines a motion of the character on the preview image in response to an operation of the user interface by the user.

［４］一態様に係るコンテンツ生成システムは、上記［３］に記載のコンテンツ生成システムにおいて、
前記ユーザインタフェースは、一次元操作領域を含み、
前記モーション決定部は、前記一次元操作領域上のノブの位置に応じて、前記モーションの再生範囲を決定する。[4] A content generation system according to one aspect is the content generation system according to the above [3],
wherein the user interface includes a one-dimensional operation area, and
the motion determination unit determines a playback range of the motion according to a position of a knob on the one-dimensional operation area.

［５］一態様に係るコンテンツ生成システムは、上記［１］～［４］のいずれかに記載のコンテンツ生成システムにおいて、
前記第１の二次元操作領域上のポインタの位置に応じて、前記プレビュー画像上の前記キャラクターのモーションを決定するモーション決定部をさらに備える。[5] A content generation system according to one aspect is the content generation system according to any one of [1] to [4] above,
and further includes a motion determination unit that determines a motion of the character on the preview image in accordance with a position of a pointer on the first two-dimensional operation area.

［６］一態様に係るコンテンツ生成システムは、上記［１］～［５］のいずれかに記載のコンテンツ生成システムにおいて、
前記プレビュー画像上に表示するキャラクターを決定するキャラクター決定部をさらに備える。[6] A content generation system according to one aspect is the content generation system according to any one of [1] to [5] above,
and further includes a character determination unit that determines a character to be displayed on the preview image.

［７］一態様に係るコンテンツ生成システムは、上記［１］～［６］のいずれかに記載のコンテンツ生成システムにおいて、
前記プレビュー画像の背景画像を決定する背景画像決定部をさらに備える。[7] A content generation system according to one aspect is the content generation system according to any one of [1] to [6] above,
and further includes a background image determination unit that determines a background image of the preview image.

［８］一態様に係るコンテンツ生成システムは、上記［１］～［７］のいずれかに記載のコンテンツ生成システムにおいて、
前記ユーザによる前記ユーザインタフェースの操作に応じて、前記プレビュー画像に表示するテキスト、該テキストを表示する吹き出しの種別、前記テキストの色、前記テキストのサイズ、のうち少なくとも一つを決定するテキスト情報決定部をさらに備える。[8] A content generation system according to one aspect is the content generation system according to any one of [1] to [7] above,
and further includes a text information determination unit that determines at least one of the text to be displayed in the preview image, the type of speech bubble in which the text is displayed, the color of the text, and the size of the text, in response to an operation of the user interface by the user.

［９］一態様に係るコンテンツ生成システムは、上記［１］～［８］のいずれかに記載のコンテンツ生成システムにおいて、
前記第１の二次元操作領域が、前記キャラクターの感情を設定するためのユーザインタフェースである。[9] A content generation system according to one aspect is the content generation system according to any one of [1] to [8] above,
The first two-dimensional operation area is a user interface for setting an emotion of the character.

［１０］一態様に係るコンテンツ生成システムは、上記［１］～［９］のいずれかに記載のコンテンツ生成システムにおいて、
キャラクター画像と該キャラクター画像に関連付けられた音声データとを学習データに用いた音声モデルに、前記ユーザが選択したキャラクターの画像データを入力することにより、キャラクターの音声データを生成する音声生成部をさらに備える。[10] A content generation system according to one aspect is the content generation system according to any one of [1] to [9] above,
The system further includes a voice generation unit that generates voice data for a character by inputting image data of the character selected by the user into a voice model that uses a character image and voice data associated with the character image as learning data.

［１１］一態様に係るコンテンツ生成システムは、上記［１］～［１０］のいずれかに記載のコンテンツ生成システムにおいて、
キャラクターの画像データ及び／又は音声データを学習データの一部として用いて学習された言語モデルに、前記ユーザの発話データを入力することにより、前記ユーザの発話データに応答するキャラクターの発話データを生成する発話データ生成部をさらに備える。[11] A content generation system according to one aspect is the content generation system according to any one of [1] to [10] above,
The system further includes a speech data generation unit that generates character speech data responsive to the user's speech data by inputting the user's speech data into a language model trained using the character's image data and/or voice data as part of training data.

［１２］一態様に係るコンテンツ生成システムは、上記［１］～［１１］のいずれかに記載のコンテンツ生成システムにおいて、
前記ユーザが入力した画像データを学習済みの画像処理モデルに入力することにより、キャラクター画像を生成するキャラクター画像生成部をさらに備える。[12] A content generation system according to one aspect is the content generation system according to any one of [1] to [11] above,
and further includes a character image generation unit that generates a character image by inputting image data input by the user into a trained image processing model.

［１３］一態様に係るコンテンツ生成システムは、上記［１］～［１２］のいずれかに記載のコンテンツ生成システムにおいて、
前記ユーザが入力した画像データを学習済みの画像処理モデルに入力することにより、背景画像を生成する背景画像生成部をさらに備える。[13] A content generation system according to one aspect is the content generation system according to any one of [1] to [12] above,
and further includes a background image generation unit that generates a background image by inputting image data input by the user into a trained image processing model.

［１４］一態様に係るコンテンツ生成システムは、上記［１］～［１３］のいずれかに記載のコンテンツ生成システムにおいて、
前記キャラクター画像生成部及び前記背景画像生成部が生成した画像データは三次元の画像データであり、メタバース空間上に配置される。[14] A content generation system according to one aspect is the content generation system according to any one of [1] to [13] above,
The image data generated by the character image generation unit and the background image generation unit is three-dimensional image data and is placed in the metaverse space.

［１５］一態様に係るコンテンツ生成システムは、上記［１］～［１４］のいずれかに記載のコンテンツ生成システムにおいて、
前記コンテンツ生成部が生成したコンテンツ及び／又は該コンテンツ内のキャラクターの画像データに関するＮＦＴを発行するＮＦＴ発行部をさらに備える。[15] A content generation system according to one aspect is the content generation system according to any one of [1] to [14] above,
and further includes an NFT issuing unit that issues an NFT relating to the content generated by the content generation unit and/or to image data of a character within the content.

［１６］一態様に係る方法は、
キャラクターの動画像又は静止画像を含む一以上のコマで構成されるコンテンツを生成する方法であって、
第１の二次元操作領域を含むユーザインタフェースと、ユーザが編集するコマのプレビュー画像と、を表示部に表示する表示制御ステップと、
前記第１の二次元操作領域上のポインタの位置に応じて、前記プレビュー画像上のキャラクターの表情を決定する表情決定ステップと、
前記ユーザが編集した前記コマを含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成ステップと、
を有する。[16] A method according to one aspect is a method for generating content consisting of one or more frames including a moving image or a still image of a character, the method comprising:
a display control step of displaying, on a display unit, a user interface including a first two-dimensional operation area and a preview image of a frame to be edited by a user;
a facial expression determination step of determining a facial expression of the character on the preview image in accordance with a position of a pointer on the first two-dimensional operation area; and
a content generation step of generating content consisting of one or more frames including the frame edited by the user.

［１７］一態様に係るプログラムは、
コンピュータに、キャラクターの動画像又は静止画像を含む一以上のコマで構成されるコンテンツを生成する方法を実行させるためのプログラムであって、前記方法は、
第１の二次元操作領域を含むユーザインタフェースと、ユーザが編集するコマのプレビュー画像と、を表示部に表示する表示制御ステップと、
前記第１の二次元操作領域上のポインタの位置に応じて、前記プレビュー画像上のキャラクターの表情を決定する表情決定ステップと、
前記ユーザが編集した前記コマを含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成ステップと、
を有する。[17] A program according to one aspect is a program for causing a computer to execute a method for generating content consisting of one or more frames including a moving image or a still image of a character, the method comprising:
a display control step of displaying, on a display unit, a user interface including a first two-dimensional operation area and a preview image of a frame to be edited by a user;
a facial expression determination step of determining a facial expression of the character on the preview image in accordance with a position of a pointer on the first two-dimensional operation area; and
a content generation step of generating content consisting of one or more frames including the frame edited by the user.

［１８］一態様に係る記憶媒体は、
上記［１７］に記載のプログラムを格納したコンピュータで読み取り可能な記憶媒体である。[18] A storage medium according to one aspect is a computer-readable storage medium storing the program described in [17] above.

［１９］一態様に係るコンテンツ生成システムは、
キャラクターの動画像又は静止画像を含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成システムであって、
ユーザインタフェースと、ユーザが編集するコマのプレビュー画像と、を表示部に表示する表示制御部と、
前記ユーザによる前記ユーザインタフェースの操作に応じて、前記プレビュー画像上に表示するキャラクターを決定するキャラクター決定部と、
前記ユーザによる前記ユーザインタフェースの操作に応じて、前記キャラクターの表情を決定する表情決定部と、
前記ユーザによる前記ユーザインタフェースの操作に応じて、前記キャラクターの目線を決定する目線決定部と、
前記ユーザによる前記ユーザインタフェースの操作に応じて、前記キャラクターのモーションを決定するモーション決定部と、
前記ユーザによる前記ユーザインタフェースの操作に応じて、前記プレビュー画像の背景画像を決定する背景決定部と、
前記ユーザによる前記ユーザインタフェースの操作に応じて、前記プレビュー画像に表示するテキスト、該テキストを表示する吹き出しの種別、前記テキストの色、前記テキストのサイズ、のうち少なくとも一つを決定するテキスト情報決定部と、
前記ユーザが編集した前記コマを含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成部と、
を備える。[19] A content generation system according to one aspect is a content generation system that generates content consisting of one or more frames including a moving image or a still image of a character, the system comprising:
a display control unit that displays, on a display unit, a user interface and a preview image of a frame to be edited by a user;
a character determination unit that determines a character to be displayed on the preview image in response to an operation of the user interface by the user;
a facial expression determination unit that determines a facial expression of the character in response to an operation of the user interface by the user;
a line of sight determination unit that determines a line of sight of the character in response to an operation of the user interface by the user;
a motion determination unit that determines a motion of the character in response to an operation of the user interface by the user;
a background determination unit that determines a background image of the preview image in response to an operation of the user interface by the user;
a text information determination unit that determines at least one of the text to be displayed in the preview image, the type of speech bubble in which the text is displayed, the color of the text, and the size of the text, in response to an operation of the user interface by the user; and
a content generation unit that generates content consisting of one or more frames including the frame edited by the user.

本発明によれば、ユーザの意図した表情のキャラクターを含むコマで構成されたコンテンツを容易に生成することができる。The present invention makes it easy to generate content made up of frames containing characters with facial expressions as intended by the user.

本実施形態に係るコンテンツ生成システム１のシステム構成の一例を示す図である。Fig. 1 is a diagram showing an example of the system configuration of the content generation system 1 according to the present embodiment.
本実施形態に係るコンテンツ生成システム１のハードウェア構成の一例を示すブロック図である。Fig. 2 is a block diagram showing an example of the hardware configuration of the content generation system 1 according to the present embodiment.
本実施形態に係るコンテンツ生成システム１の機能構成の一例を示すブロック図である。Fig. 3 is a block diagram showing an example of the functional configuration of the content generation system 1 according to the present embodiment.
本実施形態に係るコンテンツ生成システム１の動作の一例を示すフローチャートである。Fig. 4 is a flowchart showing an example of the operation of the content generation system 1 according to the present embodiment.
本実施形態に係るコンテンツ作成画面の一例を示す図である。Fig. 5 is a diagram showing an example of a content creation screen according to the present embodiment.
本実施形態に係るコンテンツ作成画面の一例を示す図であって、キャラクターを決定した後の状態を示す図である。Fig. 6 is a diagram showing an example of the content creation screen according to the present embodiment, in the state after a character has been determined.
本実施形態に係るコンテンツ作成画面の一例を示す図であって、キャラクターの表情を決定した後の状態を示す図である。Fig. 7 is a diagram showing an example of the content creation screen according to the present embodiment, in the state after the character's facial expression has been determined.
本実施形態に係るコンテンツ作成画面の一例を示す図であって、キャラクターの目線を決定した後の状態を示す図である。Fig. 8 is a diagram showing an example of the content creation screen according to the present embodiment, in the state after the character's line of sight has been determined.
本実施形態に係るコンテンツ作成画面の一例を示す図であって、キャラクターのモーションを選択中の状態を示す図である。Fig. 9 is a diagram showing an example of the content creation screen according to the present embodiment, in the state where a character motion is being selected.
本実施形態に係るコンテンツ作成画面の一例を示す図であって、キャラクターのモーションを決定した後の状態を示す図である。Fig. 10 is a diagram showing an example of the content creation screen according to the present embodiment, in the state after the character's motion has been determined.
本実施形態に係るコンテンツ作成画面の一例を示す図であって、キャラクターの配置を決定した後の状態を示す図である。Fig. 11 is a diagram showing an example of the content creation screen according to the present embodiment, in the state after the character's placement has been determined.
本実施形態に係るコンテンツ作成画面の一例を示す図であって、背景画像を決定した後の状態を示す図である。Fig. 12 is a diagram showing an example of the content creation screen according to the present embodiment, in the state after a background image has been determined.
本実施形態に係るコンテンツ作成画面の一例を示す図であって、テキスト情報を決定した後の状態を示す図である。Fig. 13 is a diagram showing an example of the content creation screen according to the present embodiment, in the state after text information has been determined.
本実施形態に係る表情設定部の一例を示す図である。Fig. 14 is a diagram showing an example of the facial expression setting unit according to the present embodiment.
本実施形態に係る目線設定部の一例を示す図である。Fig. 15 is a diagram showing an example of the line of sight setting unit according to the present embodiment.
本実施形態に係るモーション設定部の一例を示す図である。Fig. 16 is a diagram showing an example of the motion setting unit according to the present embodiment.
本実施形態に係る表情設定部の他の例を示す図である。Fig. 17 is a diagram showing another example of the facial expression setting unit according to the present embodiment.
本実施形態に係る目線設定部の他の例を示す図である。Fig. 18 is a diagram showing another example of the line of sight setting unit according to the present embodiment.

以下に、添付の図面を参照して、本発明の実施の形態を詳細に説明する。なお、各図において同等の機能を有する構成要素には同一の符号を付し、同一符号の構成要素の詳しい説明は繰り返さない。Below, an embodiment of the present invention will be described in detail with reference to the attached drawings. Note that in each drawing, components having equivalent functions are given the same reference numerals, and detailed descriptions of components with the same reference numerals will not be repeated.

（コンテンツ生成システムの概要）
本実施形態に係るコンテンツ生成システムは、ユーザが、マンガやアニメ等の一以上のコマから構成されるコンテンツを生成するために用いられる。以下、本実施形態に係るコンテンツ生成システムの詳細について説明する。(Overview of the content generation system)
The content generation system according to this embodiment is used by a user to generate content, such as manga or anime, that consists of one or more frames. The content generation system according to this embodiment is described in detail below.

（コンテンツ生成システムの構成）
図１は、本実施形態に係るコンテンツ生成システム１の概略的な構成を示す図である。図１に示すように、本実施形態に係るコンテンツ生成システム１は、ユーザが使用するユーザ端末２（端末装置）と、サーバ３と、を備えている。(Configuration of Content Generation System)
Fig. 1 is a diagram showing a schematic configuration of the content generation system 1 according to the present embodiment. As shown in Fig. 1, the content generation system 1 according to the present embodiment includes a user terminal 2 (terminal device) used by a user, and a server 3.

ユーザ端末２とサーバ３とは、インターネット等のネットワーク４を介して互いに通信可能に接続されている。ネットワーク４は、有線回線と無線回線のいずれでもよく、回線の種類や形態は問わない。ユーザ端末２及びサーバ３の少なくとも一部は、コンピュータ（情報処理装置）により実現される。ユーザ端末２は、例えば、パーソナルコンピュータやスマートフォン、タブレット端末等の端末装置である。ユーザ端末２は、コンテンツ生成システム１を利用するユーザの数に応じて、多数存在しているものとする。The user terminal 2 and the server 3 are connected to each other so that they can communicate with each other via a network 4 such as the Internet. The network 4 may be either a wired line or a wireless line, and the type and form of the line are not important. At least a part of the user terminal 2 and the server 3 is realized by a computer (information processing device). The user terminal 2 is, for example, a terminal device such as a personal computer, a smartphone, or a tablet terminal. There are a large number of user terminals 2, depending on the number of users who use the content generation system 1.

（ハードウェア構成）
次に、本実施形態に係るコンテンツ生成システム１のハードウェア構成について説明する。図２は、本実施形態に係るコンテンツ生成システム１に含まれるユーザ端末２とサーバ３のハードウェア構成の一例を示すブロック図である。(Hardware configuration)
Next, a description will be given of the hardware configuration of the content generation system 1 according to this embodiment. Fig. 2 is a block diagram showing an example of the hardware configuration of the user terminal 2 and the server 3 included in the content generation system 1 according to this embodiment.

ユーザ端末２において、ＣＰＵ２０１は、このユーザ端末２全体の動作を制御する処理装置である。ＲＯＭ２０２は、ＣＰＵ２０１が実行する制御プログラムや各種のデータを格納する不揮発性メモリである。ＲＡＭ２０３は、ＣＰＵ２０１が実行するプログラムのロード領域やワーク領域に使用する揮発性メモリである。記憶装置２０４は、各種の情報を記憶するための記憶手段であり、ユーザ端末２本体に内蔵されているものでも、記憶媒体が着脱可能なものでもよい。入力装置２０５は、ユーザ端末２のユーザが情報を入力するための装置であり、例えば、キーボードやマウス、タッチパネル、マイクロフォン等である。ディスプレイ２０６は、各種の情報（ユーザインタフェース等）を表示する表示装置である。撮像素子２０７は、被写体の像を撮像する光電変換素子である。通信Ｉ／Ｆ（インタフェース）２０８は、ネットワーク４と接続するためのインタフェースである。バス２０９は、上記各構成要素を相互に接続するバスラインである。In the user terminal 2, the CPU 201 is a processing device that controls the operation of the entire user terminal 2. The ROM 202 is a non-volatile memory that stores the control program executed by the CPU 201 and various data. The RAM 203 is a volatile memory used as a load area and a work area for the program executed by the CPU 201. The storage device 204 is a storage means for storing various information, and may be built into the main body of the user terminal 2 or may have a removable storage medium. The input device 205 is a device for the user of the user terminal 2 to input information, and is, for example, a keyboard, a mouse, a touch panel, a microphone, etc. The display 206 is a display device that displays various information (user interface, etc.). The image sensor 207 is a photoelectric conversion element that captures an image of a subject. The communication I/F (interface) 208 is an interface for connecting to the network 4. The bus 209 is a bus line that connects the above components to each other.

サーバ３において、ＣＰＵ３０１は、このサーバ３全体の動作を制御する処理装置である。ＲＯＭ３０２は、ＣＰＵ３０１が実行する制御プログラムや各種のデータを格納する不揮発性メモリである。ＲＡＭ３０３は、ＣＰＵ３０１が実行するプログラムのロード領域やワーク領域に使用する揮発性メモリである。記憶装置３０４は、各種の情報を記憶するための記憶手段であり、サーバ３本体に内蔵されているものでも、記憶媒体が着脱可能なものでもよい。通信Ｉ／Ｆ（インタフェース）３０５は、ネットワーク４と接続するためのインタフェースである。バス３０６は、上記各構成要素を相互に接続するバスラインである。In server 3, CPU 301 is a processing device that controls the operation of the entire server 3. ROM 302 is a non-volatile memory that stores the control programs executed by CPU 301 and various data. RAM 303 is a volatile memory used as a load area and work area for the programs executed by CPU 301. Storage device 304 is a storage means for storing various information, and may be built into the server 3 main body or may have a removable storage medium. Communication I/F (interface) 305 is an interface for connecting to network 4. Bus 306 is a bus line that connects the above components to each other.

（機能構成）
次に、本実施形態に係るコンテンツ生成システム１の機能構成について説明する。図３は、本実施形態に係るコンテンツ生成システム１の機能構成の一例を示す図である。(Functional configuration)
Next, a functional configuration of the content generation system 1 according to the present embodiment will be described below. Fig. 3 is a diagram showing an example of the functional configuration of the content generation system 1 according to the present embodiment.

まず、ユーザ端末２の機能構成について説明する。図３に示すように、ユーザ端末２は、通信部２１と、ユーザ端末２の全体の動作を制御する制御部２２と、ユーザが各種情報を入力する入力部２３と、各種情報を出力する出力部２４と、被写体の像を撮像する撮像部２５と、各種情報を記憶する記憶部２６と、を有している。First, the functional configuration of the user terminal 2 will be described. As shown in FIG. 3, the user terminal 2 has a communication unit 21, a control unit 22 that controls the overall operation of the user terminal 2, an input unit 23 into which the user inputs various information, an output unit 24 that outputs various information, an imaging unit 25 that captures an image of a subject, and a storage unit 26 that stores various information.

通信部２１は、ユーザ端末２とネットワーク４との間の通信インタフェースである。通信部２１は、ネットワーク４を介してユーザ端末２とサーバ３との間で情報を送受信する。The communication unit 21 is a communication interface between the user terminal 2 and the network 4. The communication unit 21 transmits and receives information between the user terminal 2 and the server 3 via the network 4.

制御部２２は、表示制御部２２ａと、コマ決定部２２ｂと、キャラクター決定部２２ｃと、表情決定部２２ｄと、目線決定部２２ｅと、モーション決定部２２ｆと、キャラクター配置決定部２２ｇと、背景画像決定部２２ｈと、テキスト情報決定部２２ｉと、を有している。The control unit 22 has a display control unit 22a, a frame determination unit 22b, a character determination unit 22c, a facial expression determination unit 22d, a line of sight determination unit 22e, a motion determination unit 22f, a character placement determination unit 22g, a background image determination unit 22h, and a text information determination unit 22i.

表示制御部２２ａは、コンテンツ生成システム１に係るコンテンツ生成アプリ（プログラム）のコンテンツ作成画面（操作画面）を後述の出力部２４（表示部）に表示する。The display control unit 22a displays a content creation screen (operation screen) of the content generation app (program) related to the content generation system 1 on the output unit 24 (display unit) described below.

図５は、表示制御部２２ａが出力部２４（表示部）に表示したコンテンツ作成画面５０（操作画面）の一例を示す図である。図５に示すように、コンテンツ作成画面５０には、コンテンツを構成するコマを編集するための複数のＧＵＩ（Graphical User Interface）が表示される。具体的には、コンテンツ作成画面５０は、編集項目を切り替えるためのタブ５１と、コンテンツを構成するコマのサムネイル画像が一覧表示され、編集するコマの選択等を行うコマ一覧エリア５２と、編集中のコマのプレビュー画像が表示されるプレビュー表示エリア５３と、プレビュー表示エリア中の少なくとも一部の領域であって背景画像が表示される背景画像エリア５３ａと、キャラクターを選択するキャラクター選択エリア５４と、キャラクターの表情設定部５５と、キャラクターの目線設定部５６と、キャラクターのモーション設定部５７と、ユーザが各ＧＵＩの選択操作を行うためのカーソルＣを含む。FIG. 5 is a diagram showing an example of a content creation screen 50 (operation screen) displayed by the display control unit 22a on the output unit 24 (display unit). As shown in FIG. 5, the content creation screen 50 displays a plurality of GUIs (Graphical User Interfaces) for editing the frames that make up the content. Specifically, the content creation screen 50 includes a tab 51 for switching editing items, a frame list area 52 in which thumbnail images of the frames that make up the content are displayed in a list and in which the frame to be edited is selected, a preview display area 53 in which a preview image of the frame being edited is displayed, a background image area 53a which is at least a part of the preview display area in which a background image is displayed, a character selection area 54 for selecting a character, a character facial expression setting section 55, a character line of sight setting section 56, a character motion setting section 57, and a cursor C for the user to select each GUI.

図３に戻って、コマ決定部２２ｂは、コンテンツを構成する一以上のコマのうち、ユーザが編集するコマを決定する。具体的には、コマ決定部２２ｂは、コマ一覧エリア５２（図５参照）に表示されたコマのサムネイル画像をユーザがカーソルＣを用いて選択する操作に応じて、ユーザが編集するコマを決定する。図５は、コマ一覧エリアに表示されたコマ１～コマ４のサムネイル画像のうち、コマ１がユーザによって選択され、コマ決定部２２ｂによって、コマ１が編集するコマとして決定された状態を示している。Returning to FIG. 3, the frame determination unit 22b determines which frame the user will edit from among one or more frames that make up the content. Specifically, the frame determination unit 22b determines which frame the user will edit in response to the user's operation of using cursor C to select a thumbnail image of a frame displayed in the frame list area 52 (see FIG. 5). FIG. 5 shows a state in which frame 1 has been selected by the user from the thumbnail images of frames 1 to 4 displayed in the frame list area, and frame determination unit 22b has determined that frame 1 is the frame to be edited.

なお、コマ一覧エリア５２の内部に表示された「＋」ボタン５２ａを選択することでコマ一覧エリア５２にコマを追加することができる。また、「－」ボタン５２ｂを選択することで、選択中のコマ（図５に示す例では、コマ１）をコマ一覧エリア５２から削除することができる。また、再生ボタン５２ｃを選択することにより、コマ一覧エリア５２に表示された一以上のコマ（図５に示す例では、コマ１～４）で構成されたコンテンツが後述のコンテンツ生成部３２ｃによって生成され、生成されたコンテンツがプレビュー表示エリアに表示される。例えば、図５に示す例では、コマ一覧エリア５２にサムネイル表示されたコマ１～４が、スライドショーのように、所定時間間隔で連続的に切り替えながらプレビュー表示エリア５３に表示される。Note that a frame can be added to the frame list area 52 by selecting the "+" button 52a displayed within the frame list area 52. Also, the selected frame (frame 1 in the example shown in FIG. 5) can be deleted from the frame list area 52 by selecting the "-" button 52b. Also, by selecting the play button 52c, content made up of one or more frames (frames 1 to 4 in the example shown in FIG. 5) displayed in the frame list area 52 is generated by the content generation unit 32c described below, and the generated content is displayed in the preview display area. For example, in the example shown in FIG. 5, frames 1 to 4 displayed as thumbnails in the frame list area 52 are displayed in the preview display area 53 while being continuously switched at a predetermined time interval, like a slide show.
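The frame list behaviour described above ("+" button 52a, "-" button 52b, play button 52c) can be sketched as follows. This is only an illustrative sketch: the class name `FrameList`, its methods, the choice to select the first added frame automatically, and the fixed `interval_s` parameter are all assumptions made for the example and are not taken from the specification, which only states that frames are switched at a predetermined time interval.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional, Tuple


@dataclass
class FrameList:
    """Hypothetical model of the frame list area 52."""
    frames: List[str] = field(default_factory=list)
    selected: Optional[int] = None  # index of the frame being edited

    def add(self, name: str) -> None:
        """'+' button 52a: append a frame to the list."""
        self.frames.append(name)
        if self.selected is None:
            self.selected = 0  # assumption: first frame becomes selected

    def select(self, index: int) -> None:
        """Select the frame to edit (role of the frame determination unit 22b)."""
        if not 0 <= index < len(self.frames):
            raise IndexError("no such frame")
        self.selected = index

    def remove_selected(self) -> None:
        """'-' button 52b: delete the currently selected frame."""
        if self.selected is None:
            return
        del self.frames[self.selected]
        if not self.frames:
            self.selected = None
        else:
            self.selected = min(self.selected, len(self.frames) - 1)

    def playback(self, interval_s: float) -> Iterator[Tuple[float, str]]:
        """Play button 52c: yield (start time, frame) pairs, slide-show style."""
        for i, name in enumerate(self.frames):
            yield (i * interval_s, name)
```

For example, adding four frames, deleting the selected first frame, and calling `playback(2.0)` yields the remaining three frames at 0, 2, and 4 seconds.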

図３に戻って、キャラクター決定部２２ｃは、ユーザが編集中のコマに表示するキャラクターを決定する。具体的には、キャラクター決定部２２ｃは、キャラクター選択エリア５４（図５参照）に表示されたキャラクターのうち、ユーザがカーソルＣを用いて選択したキャラクターをユーザが編集中のコマに表示するコマとして決定する。キャラクター決定部２２ｃが決定したキャラクターの画像は、プレビュー表示エリア５３に表示されたコマのプレビュー画像上に重畳表示される。図５に示す例のように、キャラクター選択エリア５４から「非表示」ボタンを選択した状態では、プレビュー表示エリア５３にはキャラクターは表示されないが、キャラクター選択エリア５４から任意のキャラクターを選択すると、キャラクター決定部２２ｃによってキャラクターの選択が決定され、図６に示す例のように、プレビュー表示エリア５３に表示されたコマのプレビュー画像上にユーザが選択したキャラクターＡの画像が重畳表示される。プレビュー表示エリア５３に表示されたキャラクターＡは、カーソルＣを用いて、プレビュー表示エリア５３上で移動したり、拡大縮小したりすることができる。例えば、ユーザ端末２の入力装置２０５としてマウスを用いている場合には、マウスでカーソルＣを操作し、キャラクターＡをプレビュー表示エリア５３上でドラッグすることでキャラクターＡを任意の位置に移動してもよい。また、ドラッグ操作中にマウスホイールを操作することにより、キャラクターＡを拡大又は縮小してもよい。Returning to FIG. 3, the character determination unit 22c determines the character to be displayed in the frame being edited by the user. Specifically, the character determination unit 22c determines the character selected by the user using the cursor C from among the characters displayed in the character selection area 54 (see FIG. 5) as the frame to be displayed in the frame being edited by the user. The image of the character determined by the character determination unit 22c is superimposed on the preview image of the frame displayed in the preview display area 53. As shown in the example in FIG. 5, when the "Hide" button is selected from the character selection area 54, no character is displayed in the preview display area 53, but when an arbitrary character is selected from the character selection area 54, the character selection is determined by the character determination unit 22c, and as shown in the example in FIG. 6, the image of the character A selected by the user is superimposed on the preview image of the frame displayed in the preview display area 53. The character A displayed in the preview display area 53 can be moved or enlarged or reduced on the preview display area 53 using the cursor C. 
For example, if a mouse is used as the input device 205 of the user terminal 2, character A may be moved to any position by operating cursor C with the mouse and dragging character A on the preview display area 53. Character A may also be enlarged or reduced in size by operating the mouse wheel during the drag operation.
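The drag and mouse-wheel operations on character A can be illustrated with a small sketch. The names `Placement`, `drag`, and `wheel`, the multiplicative zoom step of 1.1 per wheel notch, and the scale limits are assumptions chosen for the example; the specification only says that dragging moves the character and that the mouse wheel enlarges or reduces it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placement:
    """Hypothetical position and scale of a character in the preview area."""
    x: float
    y: float
    scale: float = 1.0


def drag(p: Placement, dx: float, dy: float) -> Placement:
    """Move the character by the cursor displacement (dx, dy)."""
    return Placement(p.x + dx, p.y + dy, p.scale)


def wheel(p: Placement, notches: int, step: float = 1.1,
          min_scale: float = 0.1, max_scale: float = 10.0) -> Placement:
    """Enlarge (positive notches) or reduce (negative notches) the character.

    The zoom factor and the clamping range are illustrative assumptions.
    """
    scale = p.scale * (step ** notches)
    return Placement(p.x, p.y, max(min_scale, min(max_scale, scale)))
```

Returning a new `Placement` on each operation keeps every intermediate state available, which would make it straightforward to redraw the preview or undo a step.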

なお、プレビュー表示エリア５３に表示されるキャラクターＡの画像は、動画像又は静止画像のいずれでもよいが、後述のモーション決定部によりキャラクターのモーション（動作）として「何もしない」が決定された場合には、プレビュー表示エリア５３に表示されるキャラクターの画像は静止画像となる。The image of character A displayed in the preview display area 53 may be either a moving image or a still image, but if the motion determination unit described below determines that the character's motion (action) is to "do nothing," the image of the character displayed in the preview display area 53 will be a still image.

表情決定部２２ｄは、ユーザによる表情設定部５５に対する操作に応じて、プレビュー表示エリア５３に表示されたキャラクターＡの「表情」を決定する。The facial expression determination unit 22d determines the "facial expression" of character A displayed in the preview display area 53 in response to the user's operation on the facial expression setting unit 55.

FIG. 14 is a diagram showing an example of the facial expression setting unit 55. As shown in FIG. 14, the facial expression setting unit 55 includes a two-dimensional operation area 55a (first two-dimensional operation area) and a pointer 55b that can be moved on the two-dimensional operation area 55a. The facial expression setting unit 55 sets the facial expression of character A displayed in the preview display area 53 (joy, anger, sadness, or enjoyment; 喜怒哀楽) according to the position of the pointer 55b on the two-dimensional operation area 55a. Specifically, the further the pointer 55b is moved toward the "enjoyment" (楽) side (+X direction in FIG. 14), the more strongly the facial expression of character A expresses enjoyment. Conversely, the further the pointer 55b is moved toward the "sadness" (哀) side (-X direction in FIG. 14), the more strongly the facial expression expresses sadness. Likewise, the further the pointer 55b is moved toward the "joy" (喜) side (+Y direction in FIG. 14), the more strongly the facial expression expresses joy, and the further it is moved toward the "anger" (怒) side (-Y direction in FIG. 14), the more strongly the facial expression expresses anger. In this way, the facial expression determination unit 22d determines the "facial expression" of character A displayed in the preview display area 53 according to the position of the pointer 55b on the two-dimensional operation area 55a of the facial expression setting unit 55. This allows the user of the content generation system 1 to set the facial expression of a character in the content through an intuitive operation.
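The mapping from pointer position to facial expression described above can be sketched as follows. This is only an illustrative sketch, not the method disclosed in the specification: the normalized coordinate range of -1 to 1, the linear weighting, and the names `ExpressionWeights` and `expression_from_pointer` are all assumptions. The four labels 喜, 怒, 哀, 楽 are written here as joy, anger, sadness, and enjoyment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpressionWeights:
    """Hypothetical per-emotion strengths in the range 0.0 to 1.0."""
    joy: float        # +Y side of the operation area
    anger: float      # -Y side
    enjoyment: float  # +X side
    sadness: float    # -X side


def clamp(value: float, low: float = -1.0, high: float = 1.0) -> float:
    """Keep the pointer inside the operation area."""
    return max(low, min(high, value))


def expression_from_pointer(x: float, y: float) -> ExpressionWeights:
    """Map a pointer position (assumed normalized to [-1, 1]) to weights.

    The further the pointer is from the center along an axis, the
    stronger the emotion on that side; the opposite side stays at 0.
    """
    x = clamp(x)
    y = clamp(y)
    return ExpressionWeights(
        joy=max(y, 0.0),
        anger=max(-y, 0.0),
        enjoyment=max(x, 0.0),
        sadness=max(-x, 0.0),
    )
```

A renderer could use such weights to blend between expression images or blend shapes, although the specification does not state how the preview image is actually updated.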

FIG. 7 shows the state after the user has operated the facial expression setting unit 55 with the cursor C and the "facial expression" of character A has been determined. As shown in FIG. 7, because the pointer 55b on the two-dimensional operation area 55a of the facial expression setting unit 55 has been moved toward the "joy" (喜) side, the facial expression of character A displayed in the preview display area 53 has changed to one expressing joy.

Returning to FIG. 3, the line of sight determination unit 22e determines the "line of sight" of character A displayed in the preview display area 53 in response to the user's operation of the line of sight setting unit 56.

FIG. 15 is a diagram showing an example of the line of sight setting unit 56. As shown in FIG. 15, the line of sight setting unit 56 includes a two-dimensional operation area 56a (second two-dimensional operation area) and a pointer 56b that can be moved on the two-dimensional operation area 56a. The line of sight setting unit 56 sets the line of sight (gaze direction) of character A displayed in the preview display area 53 according to the position of the pointer 56b on the two-dimensional operation area 56a. Specifically, the further the pointer 56b is moved toward the "right" side (+X direction in FIG. 15), the further rightward the line of sight of character A turns (the character's eyes move to the right as one faces the screen). Conversely, the further the pointer 56b is moved toward the "left" side (-X direction in FIG. 15), the further leftward the line of sight turns (the character's eyes move to the left as one faces the screen). Likewise, the further the pointer 56b is moved toward the "up" side (+Y direction in FIG. 15), the further upward the line of sight turns, and the further it is moved toward the "down" side (-Y direction in FIG. 15), the further downward the line of sight turns. In this way, the line of sight determination unit 22e determines the "line of sight" of character A displayed in the preview display area 53 according to the position of the pointer 56b on the two-dimensional operation area 56a of the line of sight setting unit 56. This allows the user of the content generation system 1 to set the line of sight of a character in the content through an intuitive operation.
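As an illustration of the line of sight setting described above, the following sketch converts a pointer position in widget pixels into a gaze offset. It assumes a widget whose pixel origin is at the top left (so the pixel Y axis is flipped to make +Y mean "up") and assumes maximum eye angles of 30 and 20 degrees; the function names and these limits are hypothetical and do not come from the specification.

```python
def normalize_pointer(px: float, py: float, width: float, height: float):
    """Convert widget pixel coordinates to (x, y) in [-1, 1], with +Y up."""
    if width <= 0 or height <= 0:
        raise ValueError("operation area must have a positive size")
    x = 2.0 * px / width - 1.0
    y = 1.0 - 2.0 * py / height  # pixel Y grows downward, so flip it
    return (max(-1.0, min(1.0, x)), max(-1.0, min(1.0, y)))


def gaze_offset(x: float, y: float,
                max_yaw_deg: float = 30.0,
                max_pitch_deg: float = 20.0):
    """Scale a normalized pointer position to assumed eye angles.

    Positive yaw means the eyes move to the right as one faces the
    screen; positive pitch means the eyes look upward.
    """
    return (x * max_yaw_deg, y * max_pitch_deg)
```

For example, a pointer at the center of the operation area gives a gaze offset of zero in both directions, which corresponds to the character looking straight ahead.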

FIG. 8 shows the state after the user has operated the line of sight setting unit 56 with the cursor C and the "line of sight" of character A has been determined. As shown in FIG. 8, because the pointer 56b on the two-dimensional operation area 56a of the line of sight setting unit 56 has been moved toward the "right" side, the line of sight of character A displayed in the preview display area 53 has turned rightward (to the right as one faces the screen).

Returning to FIG. 3, the motion determination unit 22f determines the motion (movement) of character A displayed in the preview display area 53 in response to the user's operation of the motion setting unit 57.

FIG. 16 is a diagram showing an example of the motion setting unit 57. As shown in FIG. 16, the motion setting unit 57 includes a pull-down menu 571 for selecting the character's motion type and a playback range setting unit 572 for setting the playback range of the selected motion type. The playback range setting unit 572 includes a first shaft 572a, a first knob 572b movable along the first shaft 572a, a second shaft 572c, and a second knob 572d movable along the second shaft 572c.

When the pull-down menu 571 of the motion setting unit 57 is selected with the cursor C (not shown in FIG. 16), the motion types that can be set are displayed as a list. When the user selects a desired motion type from the list, the motion determination unit 22f determines the character's motion.

The shafts of the playback range setting unit 572 (the first shaft 572a and the second shaft 572c) represent the time axis of a preset motion. By moving the knobs (the first knob 572b and the second knob 572d) along the shafts, the user can set the motion to play over a desired range.
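The playback range setting can be illustrated with a short sketch. The specification does not state which knob marks the start and which marks the end, so this sketch assumes that each knob position is a fraction between 0 and 1 along its shaft and simply treats the smaller value as the start. The function name `playback_range` and the use of seconds are assumptions.

```python
def playback_range(duration_s: float, first_knob: float, second_knob: float):
    """Return (start_s, end_s) of the motion segment to play.

    duration_s  -- length of the preset motion in seconds
    first_knob  -- position of the first knob, as a fraction of its shaft
    second_knob -- position of the second knob, as a fraction of its shaft
    """
    if duration_s <= 0:
        raise ValueError("motion duration must be positive")
    # Keep both knobs on their shafts.
    a = min(max(first_knob, 0.0), 1.0)
    b = min(max(second_knob, 0.0), 1.0)
    # Assumption: the smaller fraction is the start, the larger the end.
    start, end = sorted((a, b))
    return (start * duration_s, end * duration_s)
```

With a 4-second motion and knobs at 0.25 and 0.75, this selects the segment from 1 second to 3 seconds.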

FIG. 9 is a diagram showing the motion types displayed as a list after the pull-down menu 571 of the motion setting unit 57 has been selected with the cursor C. For example, when "wave hand" (手を振る) is selected from the list of motion types shown in FIG. 9, the motion determination unit 22f sets the character's motion to "wave hand", and, as shown in FIG. 10, character A displayed in the preview display area 53 changes dynamically to perform a hand-waving motion (animation).

Returning to FIG. 3, the character placement determination unit 22g determines the placement of character A displayed in the preview display area 53 in response to the user's operation of the character placement setting unit 58 (see FIG. 11). As shown in FIG. 11, the character placement setting unit 58 includes, for example, the setting items "camera", "character position", "rotation", and "zoom out/in". The "camera" item sets the framing that corresponds to the camera's angle of view, such as the character's "whole body", "upper body", or "head". The "character position" item sets the display position of the character in the preview display area 53. The "rotation" item rotates character A in the preview display area 53 in the yaw direction or the pitch direction in predetermined angular steps (for example, steps of 90 degrees). The "zoom out/in" item enlarges or reduces character A displayed in the preview display area 53 when the user moves a knob along a shaft.
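The "rotation" and "zoom out/in" items can be sketched as small state updates. This is a hedged illustration only: the `CharacterPlacement` record, its field names, and the scale limits of 0.5 to 2.0 are assumptions, and only the 90-degree step is taken from the example given in the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CharacterPlacement:
    """Hypothetical record of the character placement settings."""
    camera: str = "whole body"   # "whole body", "upper body", or "head"
    position: tuple = (0.0, 0.0)
    yaw_deg: int = 0
    pitch_deg: int = 0
    scale: float = 1.0


def rotate_yaw(p: CharacterPlacement, steps: int, step_deg: int = 90):
    """Rotate in fixed angular steps, wrapping around at 360 degrees."""
    return replace(p, yaw_deg=(p.yaw_deg + steps * step_deg) % 360)


def zoom_from_knob(p: CharacterPlacement, knob: float,
                   min_scale: float = 0.5, max_scale: float = 2.0):
    """Map a knob fraction (0 to 1) linearly onto an assumed scale range."""
    knob = min(max(knob, 0.0), 1.0)
    return replace(p, scale=min_scale + knob * (max_scale - min_scale))
```

Returning a new record for each change, rather than mutating the old one, makes it easy to keep the previous placement for an undo operation, although the specification does not describe such a feature.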

Returning to FIG. 3, the background image determination unit 22h determines the background image to be displayed in the preview display area 53 in response to the user's operation of the background image setting unit 59 (see FIG. 12).

FIG. 12 is a diagram showing the state after a background image has been set by a user operation on the background image setting unit 59 and determined by the background image determination unit 22h. As shown in FIG. 12, the background image setting unit 59 includes, for example, the setting items "background", "background position", "background size", and "lighting".

In the "background" item, the user can select one of a plurality of background images prepared in advance (two-dimensional or three-dimensional images; still images or videos) or select "solid color" (no background image). By selecting the "add" button, the user can add any background image of their own. Furthermore, by selecting the "auto-generate" button, a background image can be generated automatically using a trained image processing model. Specifically, when the "auto-generate" button is selected, a background image generation unit (not shown) of the control unit 32 inputs the character selected by the user, the character's facial expression, line of sight, and motion, the text information described later, and the like into the trained image processing model as input data, and a background image is generated. This image processing model is trained using, for example, generative adversarial networks (GAN). A still image prepared in advance by the user, such as a landscape photograph, may also be input into the image processing model as input data to generate the background image. This makes it possible to automatically generate a background that suits the user's preferences. When the background image is generated as three-dimensional data (a three-dimensional image), the three-dimensional data of the background image may be placed in a metaverse space together with the three-dimensional data of the character generated by the character image generation unit 32a.

The "background position" item sets the position of the background image area 53a in the preview display area 53. Specifically, the user sets this position by moving a pointer on a two-dimensional operation area. The "background size" item sets the size of the background image area 53a in the preview display area 53. Specifically, the user enlarges or reduces the background image area 53a by moving a knob along a shaft. In the "lighting" item, the user can select "daytime", "evening", "nighttime", and so on from a pull-down menu to set how the background image is rendered for that time of day.

Returning to FIG. 3, the text information determination unit 22i determines the text information to be displayed in the preview display area 53 in response to the user's operation of the text information setting unit 60 (see FIG. 13).

FIG. 13 is a diagram showing the state after text information has been set by a user operation on the text information setting unit 60 and determined by the text information determination unit 22i. As shown in FIG. 13, the text information setting unit 60 includes, for example, the setting items "speech bubble", "font", "text", "text color", and "character size". In the "speech bubble" item, the user can select one of a plurality of speech bubble images prepared in advance, or select "hidden" (no text information) or "text only" (no speech bubble image). In the "font" item, the font of the text to be displayed in the preview display area 53 is selected from a pull-down menu. In the "text" item, the text to be displayed in the preview display area 53 is entered by the user through the input unit 23 described later. The "text color" item sets the color of the text to be displayed in the preview display area 53, and the "character size" item sets its size. When text information is set with the text information setting unit 60, the text information determination unit 22i determines the text information, and this text information is displayed in the preview display area 53. In the example shown in FIG. 13, the preview display area 53 displays a speech bubble image 61 in which the speech bubble image selected by the user in the "speech bubble" item is combined with the text "Hello" (こんにちは) entered by the user in the "text" item. The speech bubble image 61 can be moved within the preview display area 53 by a user operation with the cursor C (for example, a drag operation). The size of the speech bubble image 61 can be changed (enlarged or reduced) by operating (for example, dragging) a resize icon D displayed near the speech bubble image 61. When the user selects an area outside the preview display area 53 with the cursor C, the position and size of the speech bubble image 61 in the preview display area 53 are fixed, and the resize icon D disappears.

Returning to FIG. 3, the input unit 23 is the element through which the user of the user terminal 2 inputs information, and is, for example, a keyboard, a mouse, a touch panel, a microphone, or a gesture input device.

The output unit 24 is an interface that outputs various information (images and sound) from the user terminal 2 to the user, and is, for example, a video display device (display unit) such as a liquid crystal display, or a speaker. When the output unit 24 is configured as a display unit, the display control unit 22a displays on it a GUI for accepting operations from the user.

The imaging unit 25 captures an image of a subject using a camera module of the user terminal 2 that includes an image sensor. The image captured by the imaging unit 25 is output to the control unit 22 and subjected to various kinds of image processing. It is then displayed on the display unit (output unit 24) by the display control unit 22a, or transmitted to the server 3 by the communication unit 21 via the network 4.

The storage unit 26 is data storage such as an internal memory or an external memory (an SD memory card or the like). The storage unit 26 stores various data handled by the control unit 22, various information downloaded from the server 3 by the communication unit 21 via the network 4, images captured by the imaging unit 25, and the like. The storage unit 26 does not necessarily have to be provided in the user terminal 2; part or all of the storage unit 26 may be provided in another device that is communicably connected to the user terminal 2 via the network 4.

Next, the functional configuration of the server 3 will be described. As shown in FIG. 3, the server 3 has a communication unit 31, a control unit 32, and a storage unit 33.

The communication unit 31 is a communication interface between the server 3 and the network 4. The communication unit 31 transmits and receives information between the server 3 and the user terminal 2 via the network 4.

The control unit 32 controls the overall operation of the server 3. The control unit 32 has a character image generation unit 32a, a voice generation unit 32b, a content generation unit 32c, a speech data generation unit 32d, and an NFT issuing unit 32e.

The character image generation unit 32a generates image data of a character (two-dimensional or three-dimensional image data) based on an image received from the user terminal 2 (for example, a face image or full-body image of the user captured with the imaging unit 25, or a character image prepared in advance). For example, an image generation AI may be used as the character image generation unit 32a to generate the character image. Specifically, the character image generation unit 32a may generate the character image by inputting image data supplied by the user via the user terminal 2 into an image processing model trained using, for example, generative adversarial networks (GAN). The generated character image may be stored in the character image DB 33a of the storage unit 33 described later, or transmitted to the user terminal 2 by the communication unit 31 via the network 4.

The voice generation unit 32b generates voice data corresponding to the character image (two-dimensional or three-dimensional image) generated by the character image generation unit 32a. Specifically, the voice generation unit 32b inputs the character image generated by the character image generation unit 32a into a trained voice synthesis model and has the model output voice data. This makes it possible to automatically generate, from the character image, voice data with a tone of voice that matches the impression of the character's appearance, atmosphere, and personality. The voice synthesis model is trained using, for example, a large number of character images and the voice data associated with them (such as lines spoken by animated characters) as training data.

The voice generation unit 32b also performs text-to-speech synthesis. Using the voice data it has already generated, it synthesizes speech for the text determined by the text information determination unit 22i or the text generated by the speech data generation unit 32d described later. The speech synthesized by the voice generation unit 32b is output as an audio file.

The content generation unit 32c generates content (manga, animation, and the like) made up of one or more frames edited by user operations on the various interfaces displayed on the display unit (output unit 24) of the user terminal 2. For example, the content generation unit 32c generates the content data when the play button 52c in the frame list area 52 of the content creation screen 50 shown in FIG. 5 is selected. Here, the content data is data in which the one or more frames edited by the user are encapsulated in a single file. It is, for example, a video file such as MP4 or AVI, or a still image file that can store multiple images, such as HEIF (High Efficiency Image File Format) or AVIF (AV1 Image File Format), and it includes the audio file when one has been generated by the voice generation unit 32b. The generated content data is stored in the content DB 33d of the storage unit 33 described later and is transmitted to the user terminal 2 by the communication unit 31 via the network 4.

The speech data generation unit 32d generates speech data of the character in response to the user's speech data (voice data or text data) input through the input unit 23 of the user terminal 2. This allows the character generated by the character image generation unit 32a to function as a conversational bot. Specifically, the speech data generation unit 32d converts the user's speech data input through the input unit 23 of the user terminal 2 into text data, inputs the text data into a language model (large language model), and has the model output the character's speech data responding to the user's speech. Here, the language model may be one trained using the character's image data and/or voice data as part of its training data. In that case, inputting the character image and voice data into the language model together with the user's speech data makes it possible to generate speech data that matches the impression of the character's appearance, atmosphere, and personality. The character's speech data generated by the speech data generation unit 32d is transmitted to the user terminal 2 by the communication unit 31 via the network 4 and is output as sound from the output unit 24 (speaker) of the user terminal 2.
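The flow of the speech data generation unit 32d (convert the user's speech to text, then pass the text to a language model) can be sketched as below. The specification names no particular speech recognizer or language model, so both are passed in as plain callables; `transcribe`, `language_model`, and the function name are hypothetical placeholders, not real library APIs.

```python
from typing import Callable, Union


def generate_character_utterance(
    user_input: Union[str, bytes],
    transcribe: Callable[[bytes], str],
    language_model: Callable[[str], str],
) -> str:
    """Return the character's reply text for the user's speech data.

    user_input     -- text data (str) or voice data (bytes)
    transcribe     -- placeholder for any speech-to-text function
    language_model -- placeholder for any text-in, text-out model
    """
    # Voice data is first converted to text; text data is used as is.
    if isinstance(user_input, bytes):
        text = transcribe(user_input)
    else:
        text = user_input
    return language_model(text)
```

Injecting the two components as arguments keeps the sketch independent of any specific service, and also allows simple stand-in functions to be substituted when testing.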

The NFT issuing unit 32e issues an NFT (Non-Fungible Token) for the content data generated by the content generation unit 32c or for the image data of a character in the content (the data of the character image, two-dimensional or three-dimensional, generated by the character image generation unit 32a). In other words, the NFT issuing unit 32e converts the content generated by the content generation unit 32c and/or the image data of a character in the content into an NFT. Specifically, for example, when the NFT issuing unit 32e receives an instruction to convert content data or character image data into an NFT through an operation on an interface (not shown) displayed on the display unit (output unit 24) of the user terminal 2, it generates an NFT ID linked to that content data or character image data. The NFT issuing unit 32e then generates code having metadata that includes the generated NFT ID and transmits the code to a blockchain network (not shown). In this way, content data and the data of character images in the content can be converted into NFTs. The user can then buy and sell the NFT-converted content data or character image data on an NFT marketplace, or exchange it for other NFT-converted data.
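The step of generating an NFT ID and metadata can be illustrated with the sketch below. The specification does not say how the NFT ID is derived or what the metadata contains, so deriving the ID from a SHA-256 hash of the data, the field names, and the function names are all assumptions. Submission to a blockchain network is deliberately left out.

```python
import hashlib
import json


def build_nft_metadata(content: bytes, title: str) -> dict:
    """Build hypothetical metadata linked to the given content data.

    The NFT ID here is derived from the content hash purely as an
    illustrative assumption; a real system may assign IDs differently.
    """
    digest = hashlib.sha256(content).hexdigest()
    return {
        "nft_id": "nft-" + digest[:16],
        "title": title,
        "content_sha256": digest,
    }


def encode_for_submission(metadata: dict) -> str:
    """Serialize the metadata as JSON with a stable key order."""
    return json.dumps(metadata, sort_keys=True, ensure_ascii=False)
```

Tying the ID to a hash of the data means the same content always yields the same ID, which is one simple way to express that the ID is "linked to" the content data or character image data.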

The storage unit 33 is data storage such as an internal memory or an external memory (an SD memory card or the like). The storage unit 33 stores various data handled by the control unit 32, various information received from the user terminal 2 by the communication unit 31 via the network 4, various databases (DBs), and the like. The databases include a character image DB 33a, a background image DB 33b, a speech bubble image DB 33c, and a content DB 33d.

The character image DB 33a stores image data of a plurality of characters prepared in advance (including 2D or 3D model data). The character image DB 33a also stores character images generated by the character image generation unit 32a of the control unit 32. The character images stored in the character image DB 33a may be associated with a plurality of motion patterns. In response to a request from the user terminal 2, a character image stored in the character image DB 33a is transmitted to the user terminal 2 by the communication unit 31 via the network 4 and displayed on the display unit (output unit 24) by the display control unit 22a of the user terminal 2.

The background image DB 33b stores image data of a plurality of background images prepared in advance. In response to a request from the user terminal 2, a background image stored in the background image DB 33b is transmitted to the user terminal 2 by the communication unit 31 via the network 4 and displayed on the display unit (output unit 24) by the display control unit 22a of the user terminal 2.

The speech bubble image DB 33c stores image data of a plurality of speech bubble images prepared in advance. In response to a request from the user terminal 2, a speech bubble image stored in the speech bubble image DB 33c is transmitted to the user terminal 2 by the communication unit 31 via the network 4 and displayed on the display unit (output unit 24) by the display control unit 22a of the user terminal 2.

The content DB 33d stores the content data generated by the content generation unit 32c. In response to a request from the user terminal 2, the content data stored in the content DB 33d is transmitted to the user terminal 2 by the communication unit 31 via the network 4 and displayed on the display unit (output unit 24) by the display control unit 22a of the user terminal 2.

The storage unit 33 does not necessarily have to be provided in the server 3; part or all of the storage unit 33 may be provided in another device that is communicably connected to the server 3 via the network 4.

(Example of operation)
Next, an example of the operation of the content generation system 1 will be described. FIG. 4 is a flowchart showing the operation by which a user of the content generation system 1 generates content made up of one or more frames using the user interface displayed on the display unit (output unit 24) of the user terminal 2.

まず、ユーザ端末２において、表示制御部２２ａが表示部（出力部２４）にユーザインタフェースを表示する（ステップＳ１）。First, in the user terminal 2, the display control unit 22a displays a user interface on the display unit (output unit 24) (step S1).

次に、ユーザ端末２において、表示部（出力部２４）に表示されたユーザインタフェースに対するユーザ操作に応じて、コマ決定部２２ｂが編集するコマを決定する（ステップＳ２）。Next, in the user terminal 2, the frame determination unit 22b determines the frame to be edited in response to user operations on the user interface displayed on the display unit (output unit 24) (step S2).

次に、ユーザ端末２において、表示部（出力部２４）に表示されたユーザインタフェースに対するユーザ操作に応じて、キャラクター決定部２２ｃがキャラクターを決定する（ステップＳ３）。Next, in the user terminal 2, the character determination unit 22c determines a character in response to a user operation on the user interface displayed on the display unit (output unit 24) (step S3).

次に、ユーザ端末２において、表示部（出力部２４）に表示されたユーザインタフェース（表情設定部５５）に対するユーザ操作に応じて、表情決定部２２ｄがキャラクターの表情を決定する（ステップＳ４）。Next, in the user terminal 2, the facial expression determination unit 22d determines the facial expression of the character in response to user operations on the user interface (facial expression setting unit 55) displayed on the display unit (output unit 24) (step S4).

次に、ユーザ端末２において、表示部（出力部２４）に表示されたユーザインタフェース（目線設定部５６）に対するユーザ操作に応じて、目線決定部２２ｅがキャラクターの目線を決定する（ステップＳ５）。Next, in the user terminal 2, the line of sight determination unit 22e determines the line of sight of the character in response to a user operation on the user interface (line of sight setting unit 56) displayed on the display unit (output unit 24) (step S5).

次に、ユーザ端末２において、表示部（出力部２４）に表示されたユーザインタフェース（モーション設定部５７）に対するユーザ操作に応じて、モーション決定部２２ｆがキャラクターのモーションを決定する（ステップＳ６）。Next, in the user terminal 2, the motion determination unit 22f determines the character's motion in response to user operations on the user interface (motion setting unit 57) displayed on the display unit (output unit 24) (step S6).

次に、ユーザ端末２において、表示部（出力部２４）に表示されたユーザインタフェース（キャラクター配置設定部５８）に対するユーザ操作に応じて、キャラクター配置決定部２２ｇがキャラクターの配置を決定する（ステップＳ７）。Next, in the user terminal 2, the character placement determination unit 22g determines the placement of the characters in response to user operations on the user interface (character placement setting unit 58) displayed on the display unit (output unit 24) (step S7).

次に、ユーザ端末２において、表示部（出力部２４）に表示されたユーザインタフェース（背景画像設定部５９）に対するユーザ操作に応じて、背景画像決定部２２ｈが背景画像を決定する（ステップＳ８）。Next, in the user terminal 2, the background image determination unit 22h determines a background image in response to a user operation on the user interface (background image setting unit 59) displayed on the display unit (output unit 24) (step S8).

次に、ユーザ端末２において、表示部（出力部２４）に表示されたユーザインタフェース（テキスト情報設定部６０）に対するユーザ操作に応じて、テキスト情報決定部２２ｉがテキスト情報を決定（ステップＳ９）。Next, in the user terminal 2, the text information determination unit 22i determines text information in response to user operations on the user interface (text information setting unit 60) displayed on the display unit (output unit 24) (step S9).

次に、サーバ３において、コンテンツ生成部３２ｃが一以上のコマを含むコンテンツを生成する（ステップＳ１０）。Next, on the server 3, the content generation unit 32c generates content including one or more frames (step S10).
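
Steps S1 to S10 can be sketched as a small pipeline, under stated assumptions. The `FrameSettings` fields, the functions `edit_frame` and `generate_content`, and all sample values are hypothetical names chosen for illustration; the specification does not define a data format. Steps S2 to S9 are modeled as user choices collected on the user terminal 2, and step S10 as the server 3 assembling the edited frames.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class FrameSettings:
    """Hypothetical per-frame record of the choices made in steps S2 to S9."""
    frame_index: int                              # S2: frame to edit
    character: str = ""                           # S3
    expression: Tuple[float, float] = (0.5, 0.5)  # S4: pointer position (assumed form)
    gaze: Tuple[float, float] = (0.5, 0.5)        # S5: pointer position (assumed form)
    motion: str = ""                              # S6
    placement: str = ""                           # S7
    background: str = ""                          # S8
    text: str = ""                                # S9


def edit_frame(frame_index: int, choices: Dict[str, object]) -> FrameSettings:
    """Collect the user's choices for one frame (terminal side, steps S2 to S9)."""
    settings = FrameSettings(frame_index=frame_index)
    for name, value in choices.items():
        if not hasattr(settings, name):
            raise KeyError(f"unknown setting: {name}")
        setattr(settings, name, value)
    return settings


def generate_content(frames: List[FrameSettings]) -> Dict[str, object]:
    """Assemble the edited frames into one content item (server side, step S10)."""
    if not frames:
        raise ValueError("content needs at least one frame")
    ordered = sorted(frames, key=lambda f: f.frame_index)
    return {"frame_count": len(ordered), "frames": ordered}
```

Keeping every setting in one record per frame mirrors the flowchart, where each step only adds one decision to the frame selected in step S2.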

以上説明したとおり、本実施形態によればコンテンツ生成システム１は、キャラクターの動画像又は静止画像を含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成システム１であって、二次元操作領域５５ａを含むユーザインタフェース（表情設定部５５）と、ユーザが編集するコマのプレビュー画像（プレビュー表示エリア５３）と、を表示部（出力部２４）に表示する表示制御部２２ａと、二次元操作領域５５ａ上のポインタ５５ｂの位置に応じて、プレビュー画像上のキャラクターの表情を決定する表情決定部２２ｄと、ユーザが編集したコマを含む一以上のコマで構成されるコンテンツを生成するコンテンツ生成部３２ｃと、を備えるので、ユーザの意図した表情のキャラクターを含むコマで構成されたコンテンツを容易に生成することができる。As described above, the content generation system 1 according to this embodiment generates content consisting of one or more frames including a moving or still image of a character, and includes: a display control unit 22a that displays, on the display unit (output unit 24), a user interface (facial expression setting unit 55) including a two-dimensional operation area 55a and a preview image (preview display area 53) of the frame being edited by the user; a facial expression determination unit 22d that determines the facial expression of the character in the preview image according to the position of the pointer 55b on the two-dimensional operation area 55a; and a content generation unit 32c that generates content consisting of one or more frames including the frame edited by the user. Content consisting of frames that include a character with the facial expression intended by the user can therefore be generated easily.
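
How the pointer position is converted into a facial expression is not spelled out in this passage. One plausible reading, shown here only as a sketch, places the four emotions at the corners of the rectangular two-dimensional operation area 55a and blends them bilinearly. The corner layout, the normalized coordinates, and the name `expression_weights` are all assumptions made for illustration.

```python
from typing import Dict

# Hypothetical corner layout of the rectangular area 55a (an assumption):
# (x, y) is normalized to [0, 1] x [0, 1] with the origin at the top-left.
CORNERS = {
    "joy": (0.0, 0.0),
    "anger": (1.0, 0.0),
    "sorrow": (0.0, 1.0),
    "pleasure": (1.0, 1.0),
}


def expression_weights(x: float, y: float) -> Dict[str, float]:
    """Return bilinear blend weights of the four corner emotions at (x, y)."""
    # Clamp the pointer so that positions outside the area map to its edge.
    x = min(max(x, 0.0), 1.0)
    y = min(max(y, 0.0), 1.0)
    weights = {}
    for emotion, (cx, cy) in CORNERS.items():
        wx = 1.0 - abs(x - cx)  # 1 at the emotion's own edge, 0 at the far edge
        wy = 1.0 - abs(y - cy)
        weights[emotion] = wx * wy
    return weights
```

With this layout the weights always sum to 1, so a pointer at a corner selects a single emotion and a pointer at the center mixes all four equally.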

なお、本実施形態では、一例として、表情設定部５５によって「喜」「怒」「哀」「楽」の４つの感情に対応する表情を設定する例を説明するが、４つ以上の感情に対応する表情を設定できるようにしてもよい。例えば、「怒り」「恐れ」「期待」「驚き」「喜び」「悲しみ」「信頼」「嫌悪」の８つ感情（プルチックモデル）に対応する表情を設定するようにしてもよい。In this embodiment, the facial expression setting unit 55 sets facial expressions corresponding to the four emotions of "joy", "anger", "sorrow", and "pleasure" as one example, but it may instead be configured to set facial expressions corresponding to four or more emotions. For example, facial expressions corresponding to the eight emotions of the Plutchik model ("anger", "fear", "anticipation", "surprise", "joy", "sadness", "trust", and "disgust") may be set.

また、本実施形態では、一例として、プルダウンメニュー５７１を用いてキャラクターのモーションを設定する例を説明したが、表情設定部５５によって設定可能な表情に対応するモーションを予め関連付けておき、表情設定部５５によって設定された表情に対応するモーションが自動的に設定されるようにしてもよい。In this embodiment, the character's motion is set using the pull-down menu 571 as one example. Alternatively, a motion may be associated in advance with each facial expression that can be set by the facial expression setting unit 55, so that the motion corresponding to the facial expression set by the facial expression setting unit 55 is set automatically.

また、本実施形態では、一例として、表情設定部５５とモーション設定部５７とを別個に設け、それぞれの設定部により「表情」及び「モーション」を設定する例を説明したが、単一の設定部により「表情」及び「モーション」を設定するようにしてもよい。例えば、ユーザインタフェースとして複数の感情を設定可能な二次元操作領域を有する感情設定部を一つ設け、この感情設定部によって設定した感情に対応するキャラクターの「表情」及び「モーション」を予め関連付けておき、感情設定部によって設定された感情に対応する「表情」及び「モーション」が自動的に設定されるようにしてもよい。ここで、感情設定部により設定される感情の種類は、例えば、「喜」「怒」「哀」「楽」の４つの感情でもよいし、「怒り」「恐れ」「期待」「驚き」「喜び」「悲しみ」「信頼」「嫌悪」の８つ感情（プルチックモデル）でもよいし、あるいは、これら以上の感情の種類であってもよい。In this embodiment, the facial expression setting unit 55 and the motion setting unit 57 are provided separately and set the "facial expression" and the "motion", respectively, as one example. However, both may be set by a single setting unit. For example, the user interface may provide one emotion setting unit having a two-dimensional operation area in which multiple emotions can be set; a "facial expression" and a "motion" of the character are associated in advance with each emotion, and the pair corresponding to the emotion set by the emotion setting unit is set automatically. Here, the emotions set by the emotion setting unit may be, for example, the four emotions of "joy", "anger", "sorrow", and "pleasure", the eight emotions of the Plutchik model ("anger", "fear", "anticipation", "surprise", "joy", "sadness", "trust", and "disgust"), or a larger set of emotions.
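
The advance association between an emotion and its "facial expression" and "motion" can be sketched as a lookup table. The table contents, the `Preset` type, and the function `apply_emotion` are hypothetical placeholders; the specification does not name concrete expressions or motions.

```python
from typing import NamedTuple


class Preset(NamedTuple):
    """Expression and motion registered in advance for one emotion."""
    expression: str
    motion: str


# Hypothetical association table; every identifier here is a placeholder.
EMOTION_PRESETS = {
    "joy": Preset("smile", "jump"),
    "anger": Preset("frown", "stomp"),
    "sorrow": Preset("tearful", "slump"),
    "pleasure": Preset("relaxed", "sway"),
}


def apply_emotion(emotion: str) -> Preset:
    """Return the expression and motion associated in advance with an emotion."""
    try:
        return EMOTION_PRESETS[emotion]
    except KeyError:
        raise ValueError(f"no preset registered for emotion: {emotion}") from None
```

A single table lookup sets both values at once, which is what allows one setting unit to replace the separate facial expression and motion setting units.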

また、本実施形態では、表情設定部５５が矩形の二次元操作領域５５ａを有する例を説明したが、二次元操作領域の形状はこれに限定されない。例えば、図１７に示すように、表情設定部５５が円形の二次元操作領域５５ｃを有し、この円形の二次元操作領域５５ｃ内でドット状のポインタ５５ｄを移動させることによって、喜怒哀楽等の各表情を設定するようにしてもよいし、二次元操作領域の形状を矩形や円形以外の形状にし、当該形状の内部でポインタを移動させることによって各表情を設定するようにしてもよい。In addition, in this embodiment, an example has been described in which the facial expression setting unit 55 has a rectangular two-dimensional operation area 55a, but the shape of the two-dimensional operation area is not limited to this. For example, as shown in FIG. 17, the facial expression setting unit 55 may have a circular two-dimensional operation area 55c, and each facial expression such as joy, anger, sorrow, and pleasure may be set by moving a dot-shaped pointer 55d within this circular two-dimensional operation area 55c. Alternatively, the two-dimensional operation area may have a shape other than a rectangle or a circle, and each facial expression may be set by moving the pointer within that shape.
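
For a circular two-dimensional operation area such as 55c, one possible mapping (an assumption, not taken from the specification) divides the circle into equal angular sectors and uses the distance from the center as an intensity. The sketch below uses Plutchik's eight primary emotions in wheel order; the sector placement and the name `emotion_from_pointer` are hypothetical.

```python
import math
from typing import Tuple

# Plutchik's eight primary emotions in wheel order. Assigning them to
# 45-degree sectors of area 55c is an assumption made for this sketch.
WHEEL = ["joy", "trust", "fear", "surprise",
         "sadness", "disgust", "anger", "anticipation"]


def emotion_from_pointer(dx: float, dy: float, radius: float) -> Tuple[str, float]:
    """Map a pointer offset from the circle center to (emotion, intensity).

    dx, dy: offset of the pointer 55d from the center of the circle.
    radius: radius of the circle, in the same units as dx and dy.
    The intensity is the distance from the center, clamped to [0, 1].
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    intensity = min(math.hypot(dx, dy) / radius, 1.0)
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    # Shift by half a sector so that each emotion is centered on its axis.
    sector = int((angle + 22.5) // 45.0) % len(WHEEL)
    return WHEEL[sector], intensity
```

Clamping the intensity keeps the result well defined even when the pointer is dragged past the edge of the circle.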

また、本実施形態では、目線設定部５６が矩形の二次元操作領域５６ａを有する例を説明したが、二次元操作領域の形状はこれに限定されない。例えば、図１８に示すように、目線設定部５６が円形の二次元操作領域５６ｃを有し、この円形の二次元操作領域５６ｃ内でドット状のポインタ５６ｄを移動させることによって、各方向の目線を設定するようにしてもよいし、二次元操作領域の形状を矩形や円形以外の形状にし、当該形状の内部でポインタを移動させることによって各方向の目線を設定するようにしてもよい。In addition, in this embodiment, an example has been described in which the gaze setting unit 56 has a rectangular two-dimensional operation area 56a, but the shape of the two-dimensional operation area is not limited to this. For example, as shown in FIG. 18, the gaze setting unit 56 may have a circular two-dimensional operation area 56c, and the gaze in each direction may be set by moving a dot-shaped pointer 56d within this circular two-dimensional operation area 56c, or the shape of the two-dimensional operation area may be a shape other than a rectangle or circle, and the gaze in each direction may be set by moving the pointer within the shape.
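
The gaze setting with a circular area such as 56c can be sketched as converting the pointer position into a direction vector inside the unit circle. The screen-coordinate convention, the clamping behavior, and the name `gaze_offset` are assumptions for illustration only.

```python
import math
from typing import Tuple


def gaze_offset(px: float, py: float,
                cx: float, cy: float, radius: float) -> Tuple[float, float]:
    """Convert the position of pointer 56d into a gaze vector in the unit circle.

    (px, py): pointer position; (cx, cy): center of area 56c; radius: its radius.
    Screen y is assumed to grow downward, so it is flipped to make "up" positive.
    """
    if radius <= 0:
        raise ValueError("radius must be positive")
    gx = (px - cx) / radius
    gy = (cy - py) / radius
    length = math.hypot(gx, gy)
    if length > 1.0:
        # Pointer is outside the circle: clamp the vector to the edge.
        gx, gy = gx / length, gy / length
    return gx, gy
```

A pointer at the center yields `(0.0, 0.0)`, which could be read as the character looking straight ahead, while a pointer on the edge yields a unit vector in the chosen direction.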

本明細書で述べた各機能部の任意の一部または全部をプログラムによって実現するようにしてもよい。本明細書で言及したプログラムは、コンピュータ読み取り可能な記録媒体に非一時的に記録して頒布されてもよいし、インターネットなどの通信回線（無線通信も含む）を介して頒布されてもよいし、任意の端末にインストールされた状態で頒布されてもよい。Any part or all of the functional units described in this specification may be realized by a program. The programs mentioned in this specification may be distributed recorded in a non-transitory manner on a computer-readable recording medium, distributed via a communication line (including wireless communication) such as the Internet, or distributed pre-installed on any terminal.

上記の記載に基づいて、当業者であれば、本発明の追加の効果や種々の変形例を想到できるかもしれないが、本発明の態様は、上述した個々の実施形態には限定されるものではない。特許請求の範囲に規定された内容およびその均等物から導き出される本発明の概念的な思想と趣旨を逸脱しない範囲で種々の追加、変更および部分的削除が可能である。Based on the above description, a person skilled in the art may be able to conceive additional effects and various modifications of the present invention, but the aspects of the present invention are not limited to the individual embodiments described above. Various additions, modifications, and partial deletions are possible within the scope that does not deviate from the conceptual idea and intent of the present invention derived from the contents defined in the claims and their equivalents.

例えば、本明細書において１台の装置（あるいは部材、以下同じ）として説明されるもの（図面において１台の装置として描かれているものを含む）を複数の装置によって実現してもよい。逆に、本明細書において複数の装置として説明されるもの（図面において複数の装置として描かれているものを含む）を１台の装置によって実現してもよい。あるいは、ある装置（例えばサーバ）に含まれるとした手段や機能の一部または全部が、他の装置（例えばユーザ端末）に含まれるようにしてもよい。For example, what is described in this specification as one device (or component, the same applies below) (including what is depicted in the drawings as one device) may be realized by multiple devices. Conversely, what is described in this specification as multiple devices (including what is depicted in the drawings as multiple devices) may be realized by one device. Alternatively, some or all of the means and functions included in a certain device (e.g. a server) may be included in another device (e.g. a user terminal).

また、本明細書に記載された事項の全てが必須の要件というわけではない。特に、本明細書に記載され、特許請求の範囲に記載されていない事項は任意の付加的事項ということができる。Furthermore, not all of the matters described in this specification are essential requirements. In particular, matters described in this specification but not in the claims can be considered optional additional matters.

なお、本出願人は本明細書の「先行技術文献」欄の文献に記載された文献公知発明を知っているにすぎず、本発明は必ずしも同文献公知発明における課題を解決することを目的とするものではないことにも留意されたい。本発明が解決しようとする課題は本明細書全体を考慮して認定されるべきものである。例えば、本明細書において、特定の構成によって所定の効果を奏する旨の記載がある場合、当該所定の効果の裏返しとなる課題が解決されるということもできる。ただし、必ずしもそのような特定の構成を必須の要件とする趣旨ではない。Please note that the applicant is only aware of the inventions disclosed in the documents in the "Prior Art Documents" column of this specification, and that the present invention does not necessarily aim to solve the problems in the disclosed inventions. The problems that the present invention aims to solve should be identified by considering this specification as a whole. For example, if this specification describes that a specific effect is achieved by a specific configuration, it can also be said that the present invention solves a problem that is the reverse of the specific effect. However, it is not necessarily intended that such a specific configuration is an essential requirement.

１コンテンツ生成システム
２ユーザ端末
２１通信部
２２制御部
２２ａ表示制御部
２２ｂコマ決定部
２２ｃキャラクター決定部
２２ｄ表情決定部
２２ｅ目線決定部
２２ｆモーション決定部
２２ｇキャラクター配置決定部
２２ｈ背景画像決定部
２２ｉテキスト情報決定部
２３撮像部
２４入力部
２５出力部（表示部）
２６記憶部
３サーバ
３１通信部
３２制御部
３２ａキャラクター画像生成部
３２ｂ音声生成部
３２ｃコンテンツ生成部
３２ｄ発話データ生成部
３２ｅＮＦＴ発行部
３３記憶部
３３ａキャラクター画像ＤＢ
３３ｂモーションパターンＤＢ
３３ｃ背景画像ＤＢ
３３ｄ吹き出しＤＢ
３３ｅコンテンツＤＢ
1 Content generation system
2 User terminal
21 Communication unit
22 Control unit
22a Display control unit
22b Frame determination unit
22c Character determination unit
22d Facial expression determination unit
22e Line of sight determination unit
22f Motion determination unit
22g Character placement determination unit
22h Background image determination unit
22i Text information determination unit
23 Imaging unit
24 Input unit
25 Output unit (display unit)
26 Storage unit
3 Server
31 Communication unit
32 Control unit
32a Character image generation unit
32b Voice generation unit
32c Content generation unit
32d Speech data generation unit
32e NFT issuing unit
33 Storage unit
33a Character image DB
33b Motion pattern DB
33c Background image DB
33d Speech bubble DB
33e Content DB