KR102858127B1


Info

Publication number: KR102858127B1
Application number: KR1020190023901A
Authority: KR
Inventors: 김주영; 이현우
Original assignee: 삼성전자주식회사
Priority date: 2018-03-12
Filing date: 2019-02-28
Publication date: 2025-09-10
Anticipated expiration: 2039-02-28
Also published as: KR20190118108A; EP3698258A4; EP3698258A1; CN111902812A

Abstract


The present disclosure relates to an artificial intelligence (AI) system that uses an AI model trained according to at least one of machine learning, neural network, or deep learning algorithms, and to applications thereof. The present disclosure provides a method for controlling an electronic device. The method includes: obtaining text based on a user input; determining a plurality of key words from the obtained text; obtaining a plurality of first illustrations corresponding to the plurality of key words; obtaining a second illustration by synthesizing at least two of the plurality of first illustrations; and outputting the obtained second illustration.

Description


Electronic apparatus and controlling method thereof {ELECTRONIC APPARATUS AND CONTROLLING METHOD THEREOF}

The present disclosure relates to an electronic device and a method for controlling the same, and more particularly, to an electronic device that generates an image associated with text and a method for controlling the same.

In addition, the present disclosure relates to an artificial intelligence (AI) system that uses a machine learning algorithm to simulate functions of the human brain such as cognition and judgment, and to applications thereof.

Recently, artificial intelligence systems that implement human-level intelligence have come into use in various fields. Unlike an existing rule-based smart system, an AI system is one in which the machine learns, makes decisions, and becomes smarter on its own. The more an AI system is used, the more its recognition rate improves and the more accurately it understands user preferences, so existing rule-based smart systems are gradually being replaced by deep learning-based AI systems.

Artificial intelligence technology consists of machine learning (e.g., deep learning) and element technologies that utilize machine learning.

Machine learning is an algorithmic technology that classifies and learns the characteristics of input data on its own. Element technologies use machine learning algorithms such as deep learning to simulate functions of the human brain such as cognition and judgment, and span technical fields such as linguistic understanding, visual understanding, inference/prediction, knowledge representation, and motion control.

The various fields in which artificial intelligence technology is applied are as follows. Linguistic understanding is a technology for recognizing and applying/processing human language and text, and includes natural language processing, machine translation, dialogue systems, question answering, and speech recognition/synthesis. Visual understanding is a technology for recognizing and processing objects as human vision does, and includes object recognition, object tracking, image search, person recognition, scene understanding, spatial understanding, and image enhancement. Inference/prediction is a technology for judging information and logically inferring and predicting from it, and includes knowledge/probability-based inference, optimization prediction, preference-based planning, and recommendation. Knowledge representation is a technology for automatically processing human experience information into knowledge data, and includes knowledge construction (data generation/classification) and knowledge management (data utilization). Motion control is a technology for controlling the autonomous driving of vehicles and the movement of robots, and includes movement control (navigation, collision, driving) and manipulation control (behavior control).

Meanwhile, to convey information efficiently, illustrations can be inserted alongside text when producing books, newspapers, advertisements, presentation materials, and the like. Conventionally, however, illustrations matching the text had to be found one by one, so finding the desired illustration took a long time, and it was also difficult to unify the design of the illustrations within a single document.

The present disclosure has been made to solve the above-described problems, and an object of the present disclosure is to provide an electronic device that generates an image associated with text using an artificial intelligence model, and a control method thereof.

According to an embodiment of the present disclosure, a method for controlling an electronic device includes: obtaining text based on a user input; determining a plurality of key words from the obtained text; obtaining a plurality of first illustrations corresponding to the plurality of key words; obtaining a second illustration by synthesizing at least two of the plurality of first illustrations; and outputting the obtained second illustration.

In addition, an electronic device according to an embodiment of the present disclosure includes a memory storing one or more instructions and at least one processor executing the one or more instructions stored in the memory, wherein the at least one processor obtains text based on a user input, determines a plurality of key words from the obtained text, obtains a plurality of first illustrations corresponding to the plurality of key words, synthesizes at least two of the plurality of first illustrations to obtain a second illustration, and outputs the obtained second illustration.
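The steps of the control method summarized above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions: the `CATALOGUE` mapping, the `Illustration` type, and all function names are hypothetical, key word determination is reduced to a simple catalogue lookup rather than an AI model, and "synthesis" merely records which first illustrations were combined.

```python
from dataclasses import dataclass

# Hypothetical catalogue mapping a key word to an illustration file (an assumption).
CATALOGUE = {"teamwork": "teamwork.svg", "success": "success.svg"}


@dataclass(frozen=True)
class Illustration:
    sources: tuple  # file names of the illustration(s) this one is built from


def determine_key_words(text):
    """Stand-in for key word determination: keep words that have a catalogue entry."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return [w for w in words if w in CATALOGUE]


def obtain_first_illustrations(key_words):
    """Obtain one first illustration per key word."""
    return [Illustration((CATALOGUE[w],)) for w in key_words]


def synthesize(first_illustrations):
    """Combine at least two first illustrations into a single second illustration."""
    if len(first_illustrations) < 2:
        raise ValueError("need at least two first illustrations")
    merged = tuple(s for ill in first_illustrations for s in ill.sources)
    return Illustration(merged)


def control_method(text):
    """Text -> key words -> first illustrations -> second illustration."""
    key_words = determine_key_words(text)
    firsts = obtain_first_illustrations(key_words)
    return synthesize(firsts)
```

Calling `control_method("Great teamwork leads to success")` would combine the two catalogue entries into one `Illustration`; a real implementation would compose image data instead of file names.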

FIG. 1 is a diagram for explaining an illustration providing method according to an embodiment of the present disclosure;
FIGS. 2a and 2b are flowcharts for explaining a control method of an electronic device according to an embodiment of the present disclosure;
FIG. 3 is a diagram illustrating an example of a learning method using a generative adversarial network according to an embodiment of the present disclosure;
FIG. 4 is a diagram for explaining an illustration search method according to an embodiment of the present disclosure, using a database composed of illustrations matched with tag information;
FIGS. 5 to 8 are diagrams for explaining an embodiment of the present disclosure in which a composite illustration is obtained by synthesizing a plurality of illustrations;
FIGS. 9 to 11 are diagrams for explaining an embodiment of the present disclosure in which a plurality of composite illustrations synthesized in various combinations are provided;
FIG. 12 is a diagram for explaining an embodiment of the present disclosure in which an illustration that is related to text and corresponds to the design of a presentation image is obtained;
FIGS. 13 to 16 are diagrams for explaining user interfaces for providing illustrations according to various embodiments of the present disclosure;
FIGS. 17 to 18a are diagrams for explaining various embodiments of the present disclosure in which the illustration generation function is applied to a messenger program;
FIG. 18b is a diagram for explaining an embodiment in which the illustration generation function of the present disclosure is applied to a keyboard program;
FIG. 19 is a block diagram for explaining the configuration of an electronic device according to an embodiment of the present disclosure;
FIG. 20a is a flowchart of a network system using a recognition model according to various embodiments of the present disclosure;
FIG. 20b is a flowchart of a network system using an artificial intelligence model according to an embodiment of the present disclosure;
FIG. 20c is a configuration diagram of a network system according to an embodiment of the present disclosure;
FIG. 21 is a block diagram for explaining an electronic device for training and using a recognition model according to an embodiment of the present disclosure; and
FIGS. 22 and 23 are block diagrams for explaining a learning unit and an analysis unit according to various embodiments of the present disclosure.

Hereinafter, various embodiments of this document are described with reference to the accompanying drawings. However, this is not intended to limit the technology described in this document to specific embodiments, and it should be understood to include various modifications, equivalents, and/or alternatives of the embodiments of this document. In connection with the description of the drawings, similar reference numerals may be used for similar components.

In this document, expressions such as "has," "may have," "includes," or "may include" indicate the presence of the corresponding feature (e.g., a numerical value, function, operation, or component such as a part) and do not exclude the presence of additional features.

In this document, expressions such as "A or B," "at least one of A or/and B," or "one or more of A or/and B" may include all possible combinations of the items listed together. For example, "A or B," "at least one of A and B," or "at least one of A or B" may refer to all of the cases of (1) including at least one A, (2) including at least one B, or (3) including both at least one A and at least one B.

Expressions such as "1st," "2nd," "first," or "second" used in this document may modify various components regardless of order and/or importance, and are used only to distinguish one component from another without limiting the components. For example, a first user device and a second user device may represent different user devices regardless of order or importance. For example, without departing from the scope of rights described in this document, a first component may be named a second component, and similarly, a second component may be renamed a first component.

Terms such as "module," "unit," and "part" used in this document refer to components that perform at least one function or operation, and such components may be implemented as hardware or software, or as a combination of hardware and software. In addition, a plurality of "modules," "units," "parts," and the like may be integrated into at least one module or chip and implemented as at least one processor, except where each needs to be implemented as separate, specific hardware.

When a component (e.g., a first component) is referred to as being "(operatively or communicatively) coupled with/to" or "connected to" another component (e.g., a second component), it should be understood that the component may be directly connected to the other component or may be connected via yet another component (e.g., a third component). In contrast, when a component (e.g., a first component) is referred to as being "directly coupled to" or "directly connected to" another component (e.g., a second component), it may be understood that no other component (e.g., a third component) exists between the component and the other component.

The expression "configured to" used in this document may be used interchangeably with, for example, "suitable for," "having the capacity to," "designed to," "adapted to," "made to," or "capable of," depending on the situation. The term "configured to" does not necessarily mean only "specifically designed to" in hardware. Instead, in some situations, the expression "a device configured to" may mean that the device is "capable of" operating together with other devices or components. For example, the phrase "a processor configured to perform A, B, and C" may mean a dedicated processor for performing the corresponding operations (e.g., an embedded processor), or a generic-purpose processor (e.g., a CPU or an application processor) capable of performing the corresponding operations by executing one or more software programs stored in a memory device.

The terms used in this document are used only to describe specific embodiments and may not be intended to limit the scope of other embodiments. A singular expression may include the plural expression unless the context clearly indicates otherwise. Terms used herein, including technical or scientific terms, may have the same meaning as commonly understood by a person of ordinary skill in the technical field described in this document. Among the terms used in this document, terms defined in general dictionaries may be interpreted as having the same or similar meaning as they have in the context of the related technology, and are not interpreted in an idealized or overly formal sense unless explicitly so defined in this document. In some cases, even terms defined in this document cannot be interpreted to exclude the embodiments of this document.

An electronic device according to various embodiments of this document may include, for example, at least one of a smartphone, a tablet personal computer (PC), a mobile phone, a video phone, an e-book reader, a desktop PC, a laptop PC, a netbook computer, a workstation, a server, a personal digital assistant (PDA), a portable multimedia player (PMP), an MP3 player, a mobile medical device, a camera, or a wearable device. According to various embodiments, the wearable device may include at least one of an accessory type (e.g., a watch, a ring, a bracelet, an anklet, a necklace, glasses, contact lenses, or a head-mounted device (HMD)), a fabric or clothing-integrated type (e.g., electronic clothing), a body-attached type (e.g., a skin pad or a tattoo), or a bio-implantable type (e.g., an implantable circuit).

In some embodiments, the electronic device may be a home appliance. The home appliance may include, for example, at least one of a television, a digital video disk (DVD) player, an audio device, a refrigerator, an air conditioner, a vacuum cleaner, an oven, a microwave oven, a washing machine, an air purifier, a set-top box, a home automation control panel, a security control panel, a TV box (e.g., Samsung HomeSync™, Apple TV™, or Google TV™), a game console (e.g., Xbox™, PlayStation™), an electronic dictionary, an electronic key, a camcorder, or an electronic picture frame.

In another embodiment, the electronic device may include at least one of various medical devices (e.g., various portable medical measuring devices (such as a blood glucose meter, a heart rate monitor, a blood pressure monitor, or a body temperature monitor), magnetic resonance angiography (MRA), magnetic resonance imaging (MRI), computed tomography (CT), an imaging device, or an ultrasound device), a navigation device, a global navigation satellite system (GNSS), an event data recorder (EDR), a flight data recorder (FDR), an automotive infotainment device, marine electronic equipment (e.g., a marine navigation device, a gyrocompass, etc.), avionics, a security device, a vehicle head unit, an industrial or home robot, an automatic teller's machine (ATM) of a financial institution, a point of sales (POS) terminal of a store, or an internet of things device (e.g., a light bulb, various sensors, an electric or gas meter, a sprinkler device, a fire alarm, a thermostat, a streetlight, a toaster, exercise equipment, a hot water tank, a heater, a boiler, etc.).

According to some embodiments, the electronic device may include at least one of furniture or a part of a building/structure, an electronic board, an electronic signature receiving device, a projector, or various measuring devices (e.g., water, electricity, gas, or radio wave measuring devices). In various embodiments, the electronic device may be one of the various devices described above or a combination of more than one of them. The electronic device according to some embodiments may be a flexible electronic device. In addition, the electronic device according to the embodiments of this document is not limited to the devices described above, and may include new electronic devices that emerge as technology advances.

Hereinafter, the present disclosure will be described in detail with reference to the accompanying drawings.

FIG. 1 is a diagram for explaining an illustration providing method according to an embodiment of the present disclosure.

Referring to FIG. 1, when a user inputs text (10), which is a presentation script, through a presentation program such as Microsoft PowerPoint, an illustration (20) corresponding to the text (10) may be provided. According to the present disclosure, artificial intelligence technology may be used to identify the meaning of the text (10) and provide an illustration (20) that matches it.

Such an illustration providing function may be provided as a plug-in or additional feature (add-in, add-on) for presentation software such as Microsoft PowerPoint or Keynote, or may be provided as separate software.

The illustration providing function according to the present disclosure can be applied not only to presentation materials but also to any field that uses images matching text, such as books, newspapers, advertisements, magazines, electronic postcards, e-mails, and instant messengers.

The term 'illustration' used in the present disclosure may, depending on the case, also be referred to by terms such as pictogram, flaticon, Isotype (International System Of Typographic Picture Education), infographic, image (moving image or still image), photograph, or emoticon.

The illustrations used in the present disclosure may be created directly by the entity providing the service, or may be collected from external sources. In the case of externally collected illustrations, the entity providing the service should collect and use only illustrations whose copyright issues have been resolved. If copyrighted illustrations are used in order to offer higher-quality illustrations in the service, the entity providing the service must resolve the copyright issues, and may charge users an additional fee for providing that service.

The illustration providing function according to various embodiments of the present disclosure may be implemented through an electronic device. Hereinafter, a method for controlling an electronic device according to an embodiment of the present disclosure will be described with reference to FIGS. 2a and 2b.

FIGS. 2a and 2b are flowcharts for explaining a control method of an electronic device according to an embodiment of the present disclosure.

As illustrated in FIG. 2a, an electronic device according to an embodiment of the present disclosure obtains text based on a user input (S210). Specifically, the electronic device may provide a presentation image, receive input of text for the presentation image, and obtain the text for the presentation image based on the user input.

Here, the presentation image is a screen provided by running presentation software, and may be, for example, a screen such as the one illustrated in FIG. 1. The presentation image may be displayed through a display built into the electronic device, or through an external display device connected to the electronic device. The text for the presentation image may also be referred to as a script, presentation text, or the like.

For example, as illustrated in FIG. 1, text (10) may be input into a text input window provided on the screen on which the presentation image is displayed. The electronic device may receive the text for the presentation image through an input device. The input device may include, for example, a keyboard, a touchpad, a mouse, or buttons. The input device may be built into the electronic device or may be an external input device connected to the electronic device.

Meanwhile, the user input may be a voice input produced by the user's utterance. Specifically, the electronic device may receive the user's voice input, obtain the user's utterance information by analyzing the received voice input, and obtain text corresponding to the obtained utterance information. Once the text is obtained, the electronic device determines a plurality of key words from the obtained text (S220). Then, once the plurality of key words are determined, the electronic device obtains a plurality of first illustrations corresponding to the plurality of key words (S230).

Specifically, the electronic device may input information about the design of the presentation image, together with the text, into a first artificial intelligence model trained by an artificial intelligence algorithm, and thereby obtain a plurality of first illustrations that are related to the text and correspond to the design of the presentation image. For example, referring to FIG. 1, the text (10) "Great teamwork leads to success" may be input into an artificial intelligence model trained by an artificial intelligence algorithm to obtain an illustration (20) related to the text (10).

Meanwhile, according to an embodiment of the present disclosure, the artificial intelligence model may be trained by a generative adversarial network (GAN). The main concept of GAN technology is that a generative model and a discriminative model oppose each other and thereby gradually improve each other's performance. FIG. 3 illustrates an example of a learning method using a generative adversarial network.

Referring to FIG. 3, the generative model (310) generates an arbitrary image (a fake image) from random noise, and the discriminative model (320) distinguishes between real images (or training data) and the fake images generated by the generative model. The generative model (310) is trained so that the discriminative model (320) gradually becomes unable to tell real images from fake images, whereas the discriminative model (320) is trained to tell real images from fake images well. As training progresses, the generative model (310) becomes able to generate fake images that closely resemble real images. The generative model (310) trained in this way may be used as the artificial intelligence model in step S230.
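The opposing training directions described above can be made concrete with the loss functions commonly used for GANs. The sketch below is an assumption layered on the text: the patent does not give loss formulas, so it uses the standard binary cross-entropy discriminator loss and the widely used non-saturating generator loss. The inputs `d_real` and `d_fake` are hypothetical lists of discriminator outputs, that is, probabilities strictly between 0 and 1 that an image is real.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Loss minimized by the discriminative model.

    Low when real images are scored near 1 and fake images near 0.
    """
    real_term = -sum(math.log(p) for p in d_real) / len(d_real)
    fake_term = -sum(math.log(1.0 - p) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating loss minimized by the generative model.

    Low when the discriminative model scores fake images near 1,
    i.e. when it can no longer tell them apart from real images.
    """
    return -sum(math.log(p) for p in d_fake) / len(d_fake)
```

In a training loop the two losses would be minimized alternately, each with respect to its own model's parameters, which is what drives the gradual mutual improvement the text describes.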

According to another embodiment of the present disclosure, in step S230, at least one key word may be obtained from the text using an artificial intelligence model, and an illustration corresponding to the obtained key word may be retrieved from a pre-stored database. For example, an artificial intelligence model for natural language processing may be provided and used to perform morphological analysis, extract key words, and identify the meanings of and relationships between keywords (e.g., identify homonyms, background words, and key words).

For example, a database storing illustrations matched with tag information may be provided, as illustrated in FIG. 4. In this case, if the text "Recently, with the rapid advancement of artificial intelligence technology, the number of AI-related startups is increasing" is input, the text may be input into an artificial intelligence model for natural language processing (NLP) to obtain "artificial intelligence," "startup," and "increase" as key words, and illustrations matched with tag information containing those key words may be retrieved from the database.
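The tag-based retrieval described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `Illustration` record, the in-memory `DATABASE`, and `search_illustrations` are hypothetical names, and the key words are assumed to have already been extracted by an upstream NLP model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Illustration:
    name: str        # hypothetical identifier of a stored illustration
    tags: frozenset  # tag information matched with the illustration


# Hypothetical stand-in for the tag-matched database of FIG. 4.
DATABASE = [
    Illustration("robot_head", frozenset({"artificial intelligence", "robot"})),
    Illustration("rocket", frozenset({"startup", "launch"})),
    Illustration("rising_chart", frozenset({"increase", "growth"})),
    Illustration("handshake", frozenset({"teamwork"})),
]


def search_illustrations(keywords, database=DATABASE):
    """Return, for each key word, the illustrations whose tags contain it."""
    return {kw: [ill for ill in database if kw in ill.tags] for kw in keywords}


# Key words assumed to come from the NLP model for the example sentence.
results = search_illustrations(["artificial intelligence", "startup", "increase"])
```

Exact tag matching is used here for brevity; a real system would more likely match on synonyms or embeddings.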

According to one embodiment of the present disclosure, when an entire sentence is input, the sentence may be divided into phrases/clauses, and illustrations corresponding to the phrases/clauses that carry the main meaning of the sentence may be generated and provided in sequence.

Meanwhile, the electronic device may input the text into the first artificial intelligence model to obtain a plurality of first illustrations that are related to the text and share the same graphic effect.

When the plurality of first illustrations are obtained, the electronic device obtains a second illustration by synthesizing at least two of the first illustrations (S240). Specifically, the electronic device may input information about the design of the presentation video and the plurality of first illustrations into a trained second artificial intelligence model, and may obtain and output a second illustration in which at least two of the first illustrations are modified to correspond to the design of the presentation video. In other words, according to another embodiment of the present disclosure, the electronic device may input the text into an artificial intelligence model to obtain a plurality of first illustrations, and may obtain a second illustration, synthesized from the plurality of first illustrations, as the illustration associated with the text. That is, a composite illustration synthesized from several illustrations may be provided.

For example, using an artificial intelligence model, a plurality of key words may be determined from the text, a plurality of first illustrations corresponding to the key words may be obtained, and a second illustration may be obtained by arranging and synthesizing the first illustrations according to the context of the key words.

Meanwhile, the electronic device may obtain the second illustration by arranging and synthesizing the plurality of first illustrations according to the context of the plurality of key words.

When the second illustration is obtained through the process described above, the electronic device outputs the obtained second illustration (S250). Specifically, the electronic device may control the display to show the obtained second illustration and output it through the display.

Meanwhile, as described above, the electronic device may obtain text based on user input without displaying a presentation video, or it may display a presentation video and obtain text for that presentation video. An embodiment in which text for a presentation video is obtained and an illustration is obtained based on that text is described below with reference to FIG. 2b. Since each step has already been described in detail with reference to FIG. 2a, redundant description is omitted.

Referring to FIG. 2b, a presentation video is first displayed (S210-1). The presentation video is a screen provided by executing presentation software and may be, for example, a screen such as the one illustrated in FIG. 1.

When the presentation video is displayed, the electronic device receives text for the presentation video (S220-1). For example, the electronic device may receive the text (10) in a text input window provided on the screen on which the presentation video is displayed, as illustrated in FIG. 1.

When the text for the presentation video is input, the electronic device inputs the text into an artificial intelligence model trained by an artificial intelligence algorithm to obtain at least one illustration related to the text (S230-1). For example, referring to FIG. 1, the text (10) "Great teamwork leads to success" may be input into an artificial intelligence model trained by an artificial intelligence algorithm to obtain an illustration (20) related to the text (10).

When at least one illustration related to the text is obtained, the electronic device displays, in the presentation video, the illustration selected by the user from among the obtained illustrations (S240-1).

FIGS. 5 to 8 are diagrams for explaining an embodiment of obtaining a composite illustration by synthesizing a plurality of illustrations.

Referring to FIG. 5, if, for example, the text "Recently, with the rapid advancement of artificial intelligence technology, the number of AI-related startups is increasing" is input, "artificial intelligence," "startup," "rapid advancement," and "increase" may be obtained as key words using an artificial intelligence model, and the relationships among them may be determined. Each relationship may be expressed as a degree-of-association value (a percentage) between the words.

Referring to FIG. 6, the context of the obtained key words may be determined. Determining the context includes determining the role of each key word within the sentence, for example, whether it is a word describing the background, a word describing a phenomenon/result, or the central word of the sentence.

Referring to FIG. 7, a plurality of illustrations corresponding to the obtained key words may be obtained. The illustrations may be classified according to the relationships and context of the key words. For example, at least one illustration corresponding to key words describing the background and at least one illustration corresponding to key words describing a phenomenon/result may be classified separately.

Referring to FIG. 8, the plurality of illustrations may be arranged and synthesized according to the relationships and context of the key words. For example, an illustration corresponding to a background word may be placed behind the other illustrations and given a higher transparency than the other illustrations. An illustration corresponding to the central word or to a word describing a phenomenon/result may be given a lower transparency than the other illustrations and drawn with thicker lines. The user may use the composite illustration of FIG. 8 as is, or may individually modify the illustrations within it (size, graphic effect, position, etc.) to create a new composite illustration.
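The role-based arrangement above can be sketched as a small layering routine. This is only an illustration under stated assumptions: the `Layer` record, the role names, and the numeric style values in `ROLE_STYLE` are hypothetical, and transparency is expressed as opacity (higher transparency means lower opacity).

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str                # hypothetical illustration identifier
    role: str                # "background", "central", or "result"
    z_index: int = 0         # lower values are drawn further back
    opacity: float = 1.0     # 1.0 is fully opaque
    line_width: float = 1.0


# Hypothetical style values per contextual role.
ROLE_STYLE = {
    "background": dict(z_index=0, opacity=0.4, line_width=1.0),
    "central": dict(z_index=1, opacity=1.0, line_width=2.0),
    "result": dict(z_index=1, opacity=1.0, line_width=2.0),
}


def compose(layers):
    """Style each layer by its role and return the layers in back-to-front order."""
    styled = [Layer(l.name, l.role, **ROLE_STYLE[l.role]) for l in layers]
    return sorted(styled, key=lambda layer: layer.z_index)


composite = compose([
    Layer("rising_chart", "result"),
    Layer("ai_circuit", "background"),
    Layer("rocket", "central"),
])
```

Because the result is a list of independent layers, a user edit to one illustration (size, effect, position) can be applied to a single entry without recomputing the others.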

According to one embodiment of the present disclosure, a plurality of composite illustrations synthesized in various combinations may be provided. This embodiment is described with reference to FIGS. 9 to 11.

Referring to FIG. 9, if, for example, the text "Recently, with the rapid advancement of artificial intelligence technology, the number of AI-related startups is increasing" is input, key words may be extracted using an artificial intelligence model, and a plurality of illustrations corresponding to each key word may be obtained. For example, as shown in FIG. 9, illustrations corresponding to the key word "artificial intelligence," illustrations corresponding to the key word "startup," and illustrations corresponding to the key word "increase" may each be obtained. Referring to FIG. 10, the illustrations for the key words may then be grouped into various combinations. In this case, an artificial intelligence model may be used to provide the combinations in consideration of the similarity between the type of illustration and the type of presentation video, the similarity among the illustrations, and the like. Referring to FIG. 11, the illustrations in each combination may be arranged and synthesized based on the context of the key words, and the resulting composite illustrations may be provided in the form of a recommendation list. In this case, a first database of illustration arrangement templates defined according to the type of relationship between words, and a second database of illustration arrangement templates defined according to the type of relationship between phrases/clauses, may be used. Templates may be loaded from these databases to arrange the illustrations.
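Building and ranking the combinations can be sketched as below. This is a hedged illustration only: the candidate names, the `STYLE` labels, and `combination_score` (a simple style-match fraction standing in for the similarity the artificial intelligence model would estimate) are all hypothetical.

```python
from itertools import product

# Hypothetical candidate illustrations per key word (cf. FIG. 9).
candidates = {
    "artificial intelligence": ["ai_1", "ai_2"],
    "startup": ["startup_1", "startup_2"],
    "increase": ["increase_1"],
}

# Hypothetical style label for each candidate illustration.
STYLE = {
    "ai_1": "flat", "ai_2": "line",
    "startup_1": "flat", "startup_2": "line",
    "increase_1": "flat",
}


def combination_score(combo, presentation_style):
    """Fraction of illustrations in the combination that match the presentation style."""
    return sum(STYLE[name] == presentation_style for name in combo) / len(combo)


def recommend(candidates, presentation_style, top_k=3):
    """Enumerate one-illustration-per-key-word combinations and keep the best top_k."""
    combos = product(*candidates.values())
    ranked = sorted(
        combos,
        key=lambda c: combination_score(c, presentation_style),
        reverse=True,
    )
    return ranked[:top_k]


recommendations = recommend(candidates, "flat")
```

Each returned tuple holds one illustration per key word; arranging the tuple with a template is a separate step.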

The user may select and use a desired composite illustration from the recommendation list. Alternatively, instead of using a provided composite illustration as is, the user may individually modify the illustrations within it (size, graphic effect, position, etc.) to create a new composite illustration. A weight is assigned to the composite illustration selected by the user, that is, to the combination the user selected, and the artificial intelligence model may be retrained using this weight. In other words, reinforcement learning may be used.
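How a user selection might be turned into a weight can be sketched as follows. This is a deliberately simple preference update, not a full reinforcement learning algorithm and not the retraining procedure of the disclosure; `SelectionFeedback`, its learning rate, and the update rule are assumptions made for illustration.

```python
from collections import defaultdict


class SelectionFeedback:
    """Keeps a running preference weight per combination (hypothetical helper)."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.weights = defaultdict(float)

    def record(self, shown, selected):
        """Move the selected combination's weight toward 1 and the others toward 0."""
        for combo in shown:
            target = 1.0 if combo == selected else 0.0
            self.weights[combo] += self.learning_rate * (target - self.weights[combo])

    def rank(self, combos):
        """Order combinations by learned weight, highest first."""
        return sorted(combos, key=lambda c: self.weights[c], reverse=True)


feedback = SelectionFeedback()
feedback.record(shown=["A", "B", "C"], selected="B")
ranking = feedback.rank(["A", "B", "C"])
```

In a fuller system these weights would serve as the reward signal or sample weights when the model is retrained.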

According to one embodiment of the present disclosure, the text and information about the design of the presentation video may be input into an artificial intelligence model to obtain at least one illustration that is related to the text and corresponds to the design of the presentation video. The information about the design of the presentation video may include the theme, background style, colors, fonts, graphic effects, brightness, contrast, and transparency of the presentation video, or a screen capture of the entire current presentation video.

In this case, the artificial intelligence model may include a first artificial intelligence model that generates a basic form of the illustration and a second artificial intelligence model that modifies the basic form to correspond to the design of the presentation video. The basic form of the illustration may be a form with no color or design effects applied, a line-only drawing, a black-and-white drawing, or the like. This is described with reference to FIG. 12.

Referring to FIG. 12, the first artificial intelligence model (1210) generates an illustration corresponding to text and is trained using text and images as training data. The second artificial intelligence model (1220) modifies an image to correspond to the design of the presentation video and is trained using information about presentation videos and images as training data. The information about the design of the presentation video may be information about its theme, background style, colors, fonts, graphic effects, brightness, contrast, transparency, and the like.

The second artificial intelligence model (1220) may modify the input image to fit the design of the presentation video with respect to theme, line style, line thickness, color, size, graphic effect, brightness, contrast, shape, arrangement, synthesis, and the like. For example, the second artificial intelligence model (1220) may list the colors used in the design of the presentation video, calculate color theme information for the presentation video using the frequency, area, and the like of each color as weights, and color the illustration using colors from the calculated color theme. Alternatively, in addition to color information, the second artificial intelligence model (1220) may define the style of the presentation video from design elements such as the line style, line thickness, frequency of curves, and corner treatment used in its design, and may change the graphic effect of the illustration using that information.
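The color theme calculation described above can be sketched as an area-weighted tally. This is a minimal illustration under assumptions: the `(color, area)` input format, the `color_theme` helper, and the `top_k` cutoff are hypothetical, and the disclosure does not specify the actual weighting.

```python
from collections import Counter


def color_theme(regions, top_k=3):
    """Return the top_k colors with their normalized weights.

    regions: iterable of (color, area) pairs, one per design element.
    A color used by several elements accumulates weight, so both
    frequency and area contribute.
    """
    weight = Counter()
    for color, area in regions:
        weight[color] += area
    total = sum(weight.values())
    return [(color, w / total) for color, w in weight.most_common(top_k)]


# Hypothetical design elements of one slide: (hex color, area in pixels).
slide = [("#1F3A93", 600), ("#FFFFFF", 300), ("#1F3A93", 50), ("#F39C12", 50)]
theme = color_theme(slide)
```

The highest-weighted colors would then be used to color the basic-form illustration.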

The second artificial intelligence model may also add dynamic motion or sound effects to an illustration. A specific part of the illustration may rotate, blink, shake, or repeatedly grow and shrink beyond a certain size, and when the illustration appears, a sound effect or short piece of music that suits the illustration may be provided along with it.

According to one embodiment, the text for the presentation video may be input into the first artificial intelligence model (1210) to obtain at least one first illustration (1211). The first artificial intelligence model (1210) can perform natural language processing, so it can extract key words from the text and identify the meaning of each key word and the relationships among the key words. The form of the first illustration (1211) is generated according to the meanings of the key words. The first illustration (1211) may be formed by arranging and synthesizing a plurality of illustrations according to the relationships among the key words and the meaning of the context, and the size, position, transparency, and the like of each of those illustrations may be determined according to the importance of its key word (that is, according to whether it is a background word, a main word, an auxiliary word, etc.).

Then, the information about the design of the presentation video and the at least one first illustration (1211) may be input into the second artificial intelligence model (1220) to obtain at least one second illustration (1221), which is the first illustration (1211) modified to correspond to the design of the presentation video.

Since the design of the presentation video may differ from slide to slide, illustrations that fit the design of the current slide may be generated.

According to another embodiment, even when there is no information about the design of the presentation video, the design of a new illustration may be determined to match the design of previously generated illustrations.

According to one embodiment of the present disclosure, if, after an illustration obtained using the artificial intelligence model has been applied to the presentation video, the user edits the design of the entire presentation video so that the design changes, the graphic effect of the illustration may be changed automatically to match the new design. As another example, if the user changes the graphic effect of one of the illustrations applied to the presentation video, the graphic effects of the other illustrations may be changed automatically to match the modified graphic effect.

According to the embodiments described above, illustrations that suit the design of the presentation video and have a unified look can be obtained, so the user can produce presentation materials with a more polished design.

Meanwhile, although it is important for illustrations to suit the design of the presentation video, the illustrations also need to suit one another. According to another embodiment of the present disclosure, illustrations may be generated so that their designs are similar to one another. For example, the text for the presentation video may be input into an artificial intelligence model to obtain a plurality of illustrations that are related to the text and share the same graphic effect. The graphic effect may include a shadow effect, a reflection effect, a neon sign effect, a three-dimensional effect, a 3D rotation effect, and the like.

In this case, the illustrations obtained from a single sentence or paragraph, or from a sentence or paragraph designated by the user, may be generated with mutually similar designs, or all illustrations in the same presentation material may be generated with mutually similar designs.

Referring again to FIG. 2a, the illustration selected by the user from among the at least one illustration obtained according to the various embodiments described above is displayed in the presentation video (S240).

For example, the at least one illustration obtained according to the embodiments described above may be presented in a partial area of the screen on which the presentation video is displayed, and the illustration selected there may be displayed in the presentation video. As another example, the obtained illustration may be displayed in the presentation video directly, without a user selection.

An illustration displayed in the presentation video may be edited through further user manipulation.

FIGS. 13 to 16 are diagrams for explaining user interfaces for providing illustrations according to various embodiments of the present disclosure.

Referring to FIG. 13, the illustration generation function according to the present disclosure may be included in presentation software. When the illustration generation menu (1310) is selected from among the function menus provided by the presentation software, a UI (1320) for illustration search is displayed. When text is entered in the text input area (1321) of the UI (1320) and search (1323) is selected, a search result (1325) including at least one illustration related to the text may be provided.

In the search result (1325), several illustrations may be listed in order of a score evaluated from factors such as the number of times other users have used each illustration and how well it suits the design.

An illustration selected by the user from among the illustrations in the search result (1325) may be displayed in the presentation video (1330). The user may cause the illustration to be displayed in the presentation video (1330) through an operation such as a click, drag and drop, or long touch, using an input device such as a mouse or touch pad.

FIGS. 14 and 15 are diagrams for explaining a method of providing illustrations according to another embodiment of the present disclosure.

Referring to FIG. 14, an illustration generation button (1410) may be provided in a script input window (1400) on the screen provided by the presentation software. When the user enters text in the script input window (1400) and selects the illustration generation button (1410), at least one illustration (1421) related to the text may be displayed on the presentation video (1420).

Referring to FIG. 15, when the user selects a block of text for which an illustration is to be generated (e.g., by dragging over the text) and selects the illustration generation button (1510), at least one illustration (1531) related to the selected text (1520) may be displayed on the presentation video (1530). According to this embodiment, an illustration can be generated for each selected sentence.

FIG. 16 illustrates a method of providing illustrations according to another embodiment of the present disclosure. Referring to FIG. 16, when the user selects a block of text for which an illustration is to be generated (e.g., by dragging over the text) and performs a specific operation on the selected text (e.g., pressing the right mouse button or long-pressing a touch pad), a menu (1640) is displayed. When the illustration generation item in the menu (1640) is selected, the selected text is entered in the text input area (1610) of the UI (1600) for illustration search. When the user then selects search (1620), a search result (1630) including at least one illustration related to the selected text may be provided. In the search result (1630), several illustrations may be listed in order of a score evaluated from factors such as the number of times other users have used each illustration and how well it suits the design.

According to one embodiment of the present disclosure, the text may be input into an artificial intelligence model that performs natural language processing to extract a plurality of key words from the text and rank them by priority. The key words and their priorities may then be defined as a keyword vector and input into an artificial intelligence model trained for illustration generation to generate the form of an illustration.

Specifically, when the user inputs a sentence, the sentence is input into an artificial intelligence model that performs natural language processing, which splits the sentence into phrases/clauses and identifies the meaning of each. The relationships between the phrases/clauses are then defined (background/phenomenon, cause/result, contrast, claim and grounds, etc.). Next, the words within each phrase/clause are separated, and each word is assigned a priority according to its contribution to the meaning of the phrase/clause that contains it. The words of each phrase/clause are then sorted by priority, and only the N highest-priority words (e.g., two) are used for illustration generation, while the rest may be ignored.

Then, the relationships between the N main words within each prioritized phrase/clause (subject and predicate, predicate and object, subject-predicate-object, etc.) and the degree of connection between the words are defined. For example, if the phrase "Steadily increasing startup challenges" is input, key words may be extracted and prioritized as "increasing (1)," "startup (2)," "challenge (3)," and "steadily (4)."
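The selection of the N highest-priority words can be sketched as follows. This is a minimal illustration: `top_keywords` is a hypothetical helper, and the (word, priority) pairs are assumed to come from an upstream natural language processing model, with 1 as the highest priority.

```python
def top_keywords(prioritized_words, n=2):
    """Keep only the n highest-priority words of a phrase/clause.

    prioritized_words: list of (word, priority) pairs; 1 is the highest priority.
    """
    ranked = sorted(prioritized_words, key=lambda pair: pair[1])
    return [word for word, _ in ranked[:n]]


# Priorities assumed for the example phrase "Steadily increasing startup challenges".
clause = [("steadily", 4), ("startup", 2), ("increasing", 1), ("challenge", 3)]
selected = top_keywords(clause, n=2)
```

With N = 2, only "increasing" and "startup" would be passed on to illustration generation; the remaining words are ignored.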

In this case, illustrations may be formed starting from the narrowest-scope concepts. For example, illustration arrangement templates defined according to the type of relationship between words may be stored in a first database, and illustration arrangement templates defined according to the type of relationship between phrases/clauses may be stored in a second database. Illustrations matching the meaning of each word are searched for, and the illustration with the highest matching probability may be selected. A template is loaded internally according to the relationship between the words, and the illustration matched to each word is inserted into the template to generate primary illustrations. Then, according to the relationships between the phrases/clauses from which the primary illustrations were generated, the primary illustrations are inserted into a template loaded from the second database to generate a secondary illustration. The secondary illustration generated in this way may be defined as the basic form.
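The two-level template filling described above can be sketched as below. This is a hedged illustration only: the template contents, slot names, relation labels, and the `fill` and `build_basic_form` helpers are hypothetical, and each template is reduced to an ordered list of slot names.

```python
# Hypothetical first database: arrangement templates keyed by word relation.
WORD_TEMPLATES = {
    "subject-predicate": ["left", "right"],
    "predicate-object": ["top", "bottom"],
}

# Hypothetical second database: arrangement templates keyed by clause relation.
CLAUSE_TEMPLATES = {
    "background-phenomenon": ["back", "front"],
}


def fill(slots, items):
    """Insert items into a template's slots, in order."""
    return dict(zip(slots, items))


def build_basic_form(clauses, clause_relation):
    """Build primary illustrations per clause, then combine them into a secondary one."""
    primary = [fill(WORD_TEMPLATES[c["relation"]], c["illustrations"]) for c in clauses]
    return fill(CLAUSE_TEMPLATES[clause_relation], primary)


# Illustrations assumed to be the best match for each clause's top words.
clauses = [
    {"relation": "subject-predicate", "illustrations": ["ai_chip", "rocket_up"]},
    {"relation": "predicate-object", "illustrations": ["rising_chart", "startup_office"]},
]
basic_form = build_basic_form(clauses, "background-phenomenon")
```

The nested dictionary mirrors the narrowest-scope-first order: word-level slots are filled before clause-level slots.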

Then, the graphic effect of the basic form of the illustration is changed automatically using the basic form and the design of the current presentation video. For example, the colors used in the design of the current presentation video may be listed, color theme information for the current presentation video may be calculated using the frequency, area, and the like of each color as weights, and the basic-form illustration may be colored using colors from the calculated color theme. Alternatively, in addition to color information, the design of the current presentation may be defined from design elements such as the line style, line thickness, frequency of curves, and corner treatment used in the design of the presentation video, and the graphic effect of the illustration may be changed using that information.

The user can post-edit an illustration generated in this way, and the illustration can be regenerated to match when the design of the presentation video changes. The user can make a choice at each step, such as template selection or primary illustration search, and these choices can be scored and used for reinforcement learning of the artificial intelligence model. Using the concept of reinforcement learning, the system can progressively learn and present the template or illustration search results that users in general, or an individual user, prefer.
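As a rough illustration of scoring user choices, the sketch below keeps a running preference score per candidate and ranks candidates by it. This is a deliberately simple exponential moving average, not a full reinforcement-learning algorithm, and the class name, the reward values (1.0 for the chosen item, 0.0 for the others shown), and the learning rate are all assumptions.

```python
class PreferenceScores:
    """Hypothetical per-user (or global) preference tracker for templates or illustrations."""

    def __init__(self, learning_rate=0.1):
        self.learning_rate = learning_rate
        self.scores = {}

    def record(self, shown, chosen):
        """Move each shown candidate's score toward its reward (1.0 if chosen, else 0.0)."""
        for candidate in shown:
            reward = 1.0 if candidate == chosen else 0.0
            old = self.scores.get(candidate, 0.0)
            self.scores[candidate] = old + self.learning_rate * (reward - old)

    def rank(self, candidates):
        """Order candidates by learned score, highest first; ties keep their input order."""
        return sorted(candidates, key=lambda c: self.scores.get(c, 0.0), reverse=True)
```

Keeping one instance per user and one shared instance would cover both the individual and the aggregate preference cases mentioned above.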

한편, 본 개시의 다양한 실시 예들은 메신저 프로그램에서도 적용될 수 있다. 도 17 내지 도 18a은 본 개시의 삽화 생성 기능이 메신저 프로그램에 적용된 실시 예를 설명하기 위한 도면이다.Meanwhile, various embodiments of the present disclosure can also be applied to messenger programs. FIGS. 17 to 18a are diagrams illustrating embodiments in which the illustration generation function of the present disclosure is applied to a messenger program.

Referring to FIG. 17, when a user selects the emoticon button provided in the messenger UI of a messenger program and enters text describing the emoticon to be created, at least one emoticon related to the entered text may be generated and displayed. The user can select a desired emoticon from among the generated emoticons and send it to the conversation partner. In addition to emoticons, an illustration that matches the message text can also be generated and sent to the other party.

도 18a를 참고하면, 메신저 프로그램의 메신저 UI에 마련된 특정 버튼(편지 봉투 모양)을 선택하면 입력된 텍스트와 어울리는 배경 이미지가 생성될 수 있다. 그리고 텍스트 창에 입력된 텍스트가 배경 이미지에 삽입될 수 있다. 텍스트의 위치는 터치 및 드래그 등의 사용자 조작에 의해 변경 가능하다. 이와 같이 배경 이미지에 텍스트가 삽입된 이미지 형태의 메시지가 대화 상대방에게 전송될 수 있다.Referring to Figure 18a, selecting a specific button (shaped like an envelope) in the messenger program's messenger UI can generate a background image that matches the entered text. Furthermore, the text entered in the text window can be inserted into the background image. The text's position can be changed through user manipulations such as touch and drag. In this way, an image-based message with text inserted into the background image can be sent to the conversation partner.

한편, 본 개시의 다양한 실시 예들은 키보드 프로그램에서도 적용될 수 있다. 도 18b는 본 개시의 삽화 생성 기능이 키보드 프로그램에 적용된 실시 예를 설명하기 위한 도면이다.Meanwhile, various embodiments of the present disclosure can also be applied to a keyboard program. FIG. 18b is a diagram illustrating an embodiment in which the illustration generation function of the present disclosure is applied to a keyboard program.

도 18b를 참고하면, 사용자가 키보드 프로그램의 UI에 마련된 삽화 생성 버튼(1810)을 선택하고 생성하고자 삽화에 대한 텍스트를 입력하면, 입력된 텍스트와 관련된 적어도 하나의 삽화가 생성되어 표시될 수 있다. 상기 삽화가 생성되는 과정은 도1 내지 도 12를 통해서 설명된 방법으로 수행될 수 있다. 상기 키보드 프로그램은 다양한 다른 프로그램과 연동하여 작동할 수 있다. 예를 들어, 상기 키보드 프로그램은 웹 브라우저 프로그램, 문서 작성 프로그램, 채팅 프로그램, 메신저 프로그램 등과 연동하여 동작할 수 있다. 즉, 상기 키보드 프로그램으로 입력된 텍스트와 관련된 삽화 정보를 획득하여, 상기 삽화 정보를 상기 웹 브라우저 프로그램, 상기 문서 작성 프로그램, 상기 채팅 프로그램, 또는 상기 메신저 프로그램으로 전달할 수 있다.Referring to Fig. 18b, when a user selects an illustration creation button (1810) provided on the UI of the keyboard program and inputs text for an illustration to be created, at least one illustration related to the input text may be created and displayed. The process of creating the illustration may be performed in the manner described through Figs. 1 to 12. The keyboard program may operate in conjunction with various other programs. For example, the keyboard program may operate in conjunction with a web browser program, a document writing program, a chat program, a messenger program, etc. That is, the keyboard program may acquire illustration information related to text inputted into the keyboard program and transmit the illustration information to the web browser program, the document writing program, the chat program, or the messenger program.

도 19는 본 개시의 일 실시 예에 따른 전자 장치(100)의 구성을 설명하기 위한 블록도이다. 전자 장치(100)는 도 1 내지 도 18a를 참고하여 상술한 실시 예들의 동작의 전부 또는 일부를 수행할 수 있는 장치이다.FIG. 19 is a block diagram illustrating the configuration of an electronic device (100) according to an embodiment of the present disclosure. The electronic device (100) is a device capable of performing all or part of the operations of the embodiments described above with reference to FIGS. 1 to 18a.

도 19를 참고하면, 전자 장치(100)는 메모리(110), 프로세서(120)를 포함한다.Referring to FIG. 19, the electronic device (100) includes a memory (110) and a processor (120).

예를 들면, 메모리(110)는 내장 메모리 또는 외장 메모리를 포함할 수 있다. 내장 메모리는, 예를 들면, 휘발성 메모리(예: DRAM(dynamic RAM), SRAM(static RAM), 또는 SDRAM(synchronous dynamic RAM) 등), 비휘발성 메모리(non-volatile Memory)(예: OTPROM(one time programmable ROM), PROM(programmable ROM), EPROM(erasable and programmable ROM), EEPROM(electrically erasable and programmable ROM), mask ROM, flash ROM, 플래시 메모리(예: NAND flash 또는 NOR flash 등), 하드 드라이브, 또는 솔리드 스테이트 드라이브(solid state drive(SSD)) 중 적어도 하나를 포함할 수 있다.For example, the memory (110) may include built-in memory or external memory. The built-in memory may include, for example, at least one of volatile memory (e.g., dynamic RAM (DRAM), static RAM (SRAM), or synchronous dynamic RAM (SDRAM)), non-volatile memory (e.g., one time programmable ROM (OTPROM), programmable ROM (PROM), erasable and programmable ROM (EPROM), electrically erasable and programmable ROM (EEPROM), mask ROM, flash ROM, flash memory (e.g., NAND flash or NOR flash), hard drive, or solid state drive (SSD)).

외장 메모리는 플래시 드라이브(flash drive), 예를 들면, CF(compact flash), SD(secure digital), Micro-SD(micro secure digital), Mini-SD(mini secure digital), xD(extreme digital), MMC(multi-media card) 또는 메모리 스틱(memory stick) 등을 포함할 수 있다. 외장 메모리는 다양한 인터페이스를 통하여 전자 장치(100)와 기능적으로 및/또는 물리적으로 연결될 수 있다.The external memory may include a flash drive, for example, a compact flash (CF), a secure digital (SD), a micro secure digital (Micro-SD), a mini secure digital (Mini-SD), an extreme digital (xD), a multi-media card (MMC), or a memory stick. The external memory may be functionally and/or physically connected to the electronic device (100) via various interfaces.

The memory (110) is accessed by the processor (120), which can read, write, modify, delete, and update the data stored in it. In the present disclosure, the term "memory" may include the memory (110), a ROM or RAM within the processor (120), or a memory card (e.g., a micro SD card or a memory stick) mounted on the electronic device (100).

메모리(110)는 도 1 내지 도 18a을 참고하여 상술한 실시 예들에 따른 제어방법을 수행하기 위한 컴퓨터 실행가능 명령어(computer executable instructions)를 저장할 수 있다.The memory (110) can store computer executable instructions for performing the control method according to the embodiments described above with reference to FIGS. 1 to 18a.

메모리(110)는 프레젠테이션 소프트웨어, 메신저 소프트웨어 등을 저장할 수 있다.The memory (110) can store presentation software, messenger software, etc.

The memory (110) can store the artificial intelligence model according to the embodiments described above with reference to FIGS. 1 to 18a. The artificial intelligence model can be trained on an external server and provided to the electronic device (100). The electronic device (100) can download the artificial intelligence model from the external server and store it in the memory (110), and when the artificial intelligence model is updated (or retrained), the electronic device (100) can receive the updated model from the external server and store it. The electronic device (100) can connect to such an external server via a local area network (LAN), the Internet, or the like.

메모리(110)는 태그 정보가 매칭된 삽화들로 구성된 데이터 베이스, 문장 내 단어들의 연관 관계에 따라 삽화들의 배치 형태를 정의한 템플릿으로 구성된 데이터 베이스, 문장의 구/절끼리의 연관 관계에 따라 삽화들의 배치 형태를 정의한 템플릿으로 구성된 데이터 베이스 등 다양한 데이터 베이스를 저장할 수 있다.The memory (110) can store various databases, such as a database composed of illustrations with matching tag information, a database composed of templates defining the arrangement of illustrations according to the relationship between words in a sentence, and a database composed of templates defining the arrangement of illustrations according to the relationship between phrases/clauses in a sentence.

일 실시 예에 따르면, 메모리(110)는 클라우드 서버와 같은 전자 장치(100) 외부의 서버로 구현될 수도 있다.According to one embodiment, the memory (110) may be implemented as a server external to the electronic device (100), such as a cloud server.

프로세서(120)는 전자 장치(100)의 전반적인 동작을 제어하기 위한 구성이다. 프로세서(120)는 예컨대, CPU, ASIC, SoC, MICOM 등으로 구현될 수 있다. 프로세서(120)는 운영 체제 또는 응용 프로그램을 구동하여 프로세서(120)에 연결된 다수의 하드웨어 또는 소프트웨어 구성요소들을 제어할 수 있고, 각종 데이터 처리 및 연산을 수행할 수 있다. 한 실시 예에 따르면, 프로세서(120)는 GPU(graphic processing unit) 및/또는 이미지 신호 프로세서(image signal processor)를 더 포함할 수 있다.The processor (120) is a component for controlling the overall operation of the electronic device (100). The processor (120) may be implemented as, for example, a CPU, an ASIC, an SoC, a MICOM, etc. The processor (120) may control a number of hardware or software components connected to the processor (120) by driving an operating system or an application program, and may perform various data processing and calculations. According to one embodiment, the processor (120) may further include a GPU (graphics processing unit) and/or an image signal processor.

프로세서(120)는 메모리(110)에 저장된 컴퓨터 실행가능 명령어를 실행함으로써, 도 1 내지 도 18a에서 설명한 실시 예들 전부 또는 일부에 따른 기능을 전자 장치(100)가 수행할 수 있도록 한다.The processor (120) executes computer-executable instructions stored in the memory (110), thereby enabling the electronic device (100) to perform all or part of the functions according to the embodiments described in FIGS. 1 to 18a.

프로세서(120)는 메모리(110)에 저장된 하나 이상의 인스트럭션들을 실행함으로써, 사용자 입력을 기반으로 텍스트를 획득하고, 획득된 텍스트로부터 복수의 핵심 단어를 결정하며, 복수의 핵심 단어에 대응되는 복수의 제1 삽화를 획득하고, 복수의 제1 삽화들 중에서 적어도 2개 이상의 제1 삽화를 합성하여 제2 삽화를 획득하고, 획득된 제2 삽화를 출력할 수 있다.The processor (120) can obtain text based on user input by executing one or more instructions stored in the memory (110), determine a plurality of key words from the obtained text, obtain a plurality of first illustrations corresponding to the plurality of key words, obtain a second illustration by synthesizing at least two first illustrations among the plurality of first illustrations, and output the obtained second illustration.
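The five steps performed by the processor (120) can be sketched as a small pipeline. Everything concrete here is an assumption made for illustration: the stop-word filter stands in for the key-word determination that the disclosure assigns to an artificial intelligence model, `ILLUSTRATION_DB` is a toy lookup table, and "synthesis" is reduced to stacking two illustrations as layers.

```python
# Hypothetical stand-ins; the disclosure uses a trained AI model and a real database.
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "in", "is", "for"}
ILLUSTRATION_DB = {"growth": "chart.svg", "team": "people.svg", "idea": "bulb.svg"}


def determine_key_words(text):
    """Step 2: pick key words from the text (here: every non-stop word)."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return [w for w in words if w and w not in STOP_WORDS]


def obtain_first_illustrations(key_words):
    """Step 3: look up one first illustration per key word that has a match."""
    return [ILLUSTRATION_DB[w] for w in key_words if w in ILLUSTRATION_DB]


def synthesize(first_illustrations):
    """Step 4: combine at least two first illustrations into a second illustration."""
    if len(first_illustrations) < 2:
        return None
    return {"layers": first_illustrations[:2]}


def control_method(user_input):
    """Steps 1 to 5: text in, second illustration out (None if synthesis is not possible)."""
    text = user_input                      # Step 1: obtain text from user input
    key_words = determine_key_words(text)
    first = obtain_first_illustrations(key_words)
    return synthesize(first)               # Step 5: the caller outputs the result
```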

또한, 프로세서(120)는 프레젠테이션 영상을 제공하고, 프레젠테이션 영상에 대한 텍스트가 입력되면 인공지능 알고리즘에 의해 학습된 인공지능 모델에 텍스트를 입력하여 텍스트와 관련된 적어도 하나의 삽화를 획득하고, 획득된 적어도 하나의 삽화 중 사용자에 의해 선택된 삽화를 프레젠테이션 영상 상에 제공할 수 있다.In addition, the processor (120) provides a presentation video, and when text for the presentation video is input, the text is input into an artificial intelligence model learned by an artificial intelligence algorithm to obtain at least one illustration related to the text, and an illustration selected by the user from among the at least one obtained illustration can be provided on the presentation video.

일 실시 예에 따르면, 전자 장치(100)는 텍스트와 관련된 삽화를 획득하기 위하여 인공지능 전용 프로그램(또는 인공지능 에이전트, Artificial intelligence agent)인 개인 비서 프로그램을 이용할 수 있다. 이때, 개인 비서 프로그램은 AI(Artificial Intelligence) 기반의 서비스를 제공하기 위한 전용 프로그램으로서, 프로세서(120)에 의해 실행될 수 있다. 프로세서(120)는 범용 프로세서 또는 별도의 AI 전용 프로세서일 수 있다.According to one embodiment, the electronic device (100) may utilize a personal assistant program, which is an artificial intelligence (AI)-specific program (or AI agent), to obtain illustrations related to text. In this case, the personal assistant program is a dedicated program for providing AI (Artificial Intelligence)-based services and may be executed by a processor (120). The processor (120) may be a general-purpose processor or a separate AI-specific processor.

According to one embodiment of the present disclosure, the electronic device (100) includes its own display, and the processor (120) can control the display to show various images. According to another embodiment, the electronic device (100) can be connected to an external display device and output an image signal to it so that various images are shown on the external display device. In the latter case, the electronic device (100) can be connected to the external display device by wire or wirelessly. For example, the electronic device (100) may include at least one of a component input jack, an HDMI (High-Definition Multimedia Interface) input port, a USB port, or an RGB, DVI, DP (DisplayPort), or Thunderbolt port, and can be connected to the external display device through such a port. As another example, the electronic device (100) may be connected to the external display device through a communication method such as WiFi (Wireless Fidelity), WiDi (Wireless Display), WiHD (WirelessHD), WHDI (Wireless Home Digital Interface), Miracast, Wi-Fi Direct, Bluetooth (e.g., Bluetooth Classic or Bluetooth Low Energy), AirPlay, or Zigbee.

전자 장치(100)에 포함된 디스플레이 또는 전자 장치(100)와 연결되는 외부 디스플레이 장치는 예를 들면, 액정 디스플레이(liquid crystal display(LCD)), 발광 다이오드(light-emitting diode(LED)) 디스플레이, 유기 발광 다이오드(organic light-emitting diode(OLED)) 디스플레이(예컨대 AMOLED(active-matrix organic light-emitting diode), PMOLED(passive-matrix OLED)), 또는 마이크로 전자기계 시스템(microelectromechanical systems(MEMS)) 디스플레이, 또는 전자종이(electronic paper) 디스플레이, 터치 스크린을 포함할 수 있다.The display included in the electronic device (100) or the external display device connected to the electronic device (100) may include, for example, a liquid crystal display (LCD), a light-emitting diode (LED) display, an organic light-emitting diode (OLED) display (e.g., an active-matrix organic light-emitting diode (AMOLED) or a passive-matrix OLED (PMOLED)), or a microelectromechanical systems (MEMS) display, or an electronic paper display, or a touch screen.

본 개시에서 프로세서(120)가 영상, 삽화, 아이콘 등을 "제공"하는 것은 전자 장치(100)의 내부 디스플레이를 제어하여 영상 또는 삽화를 내부 디스플레이를 통해 표시하거나, 또는 전자 장치(100)의 외부 디스플레이 장치로 영상, 삽화 등에 대한 영상 신호를 출력하는 것을 포함한다.In the present disclosure, “providing” an image, illustration, icon, etc. by the processor (120) includes controlling the internal display of the electronic device (100) to display the image or illustration through the internal display, or outputting an image signal for the image, illustration, etc. to an external display device of the electronic device (100).

본 개시의 일 실시 예에 따르면, 전자 장치(100)는 자체적으로 입력 장치를 포함하고, 입력 장치 통해 다양한 사용자 입력을 수신할 수 있다. 상기 입력 장치는 예컨대, 터치 패널, 터치 스크린, 버튼, 모션 입력을 수신할 수 있는 센서, 카메라 또는 음성 입력을 수신할 수 있는 마이크 등을 포함할 수 있다.According to one embodiment of the present disclosure, an electronic device (100) includes an input device and can receive various user inputs through the input device. The input device may include, for example, a touch panel, a touch screen, a button, a sensor capable of receiving motion input, a camera, or a microphone capable of receiving voice input.

또 다른 실시 예에 따르면, 전자 장치(100)는 외부 입력 장치와 연결되어 외부 입력 장치를 통해 다양한 사용자 입력을 수신할 수 있다. 예컨대, 외부 입력 장치는 키보드, 마우스 또는 리모컨 등을 포함할 수 있다. 전자 장치(100)는 외부 입력 장치와 무선 또는 유선으로 연결될 수 있다. 예컨대 전자 장치(100)는 USB 포트 등을 통해 외부 입력 장치와 유선으로 연결될 수 있다. 또 다른 예로, 전자 장치(100)는 적외선 통신(IrDA, infrared Data Association), RFID(Radio Frequency Identification), WiFi(Wireless Fidelity), 와이파이 다이렉트(Wi-Fi Direct), 블루투스(ex. 블루투스 클래식(Bluetooth Classic), 블루투스 저 에너지(Bluetooth Low Energy)), 지그비(Zigbee) 등의 통신 방식을 통해 외부 입력 장치와 무선으로 연결될 수 있다.According to another embodiment, the electronic device (100) may be connected to an external input device and receive various user inputs through the external input device. For example, the external input device may include a keyboard, a mouse, or a remote control. The electronic device (100) may be connected to the external input device wirelessly or by wire. For example, the electronic device (100) may be connected to the external input device by wire through a USB port, etc. As another example, the electronic device (100) may be connected wirelessly to the external input device through a communication method such as infrared communication (IrDA, infrared Data Association), RFID (Radio Frequency Identification), WiFi (Wireless Fidelity), Wi-Fi Direct, Bluetooth (e.g., Bluetooth Classic, Bluetooth Low Energy), Zigbee, etc.

전자 장치(100)는 자체적으로 포함하는 입력 장치 또는 외부 입력 장치를 통해 삽화 생성을 위한 텍스트, 삽화를 선택하기 위한 사용자 입력 등 다양한 사용자 입력을 수신할 수 있다.The electronic device (100) can receive various user inputs, such as text for generating illustrations, user input for selecting illustrations, etc., through an input device included in the device itself or an external input device.

일 실시 예에 따르면, 프로세서(120)는 도 1에 도시된 것과 같이 텍스트 입력창이 마련된 화면을 제공할 수 있고, 텍스트 입력창에 텍스트가 입력되면 프로세서(120)는 인공지능 모델에 상기 텍스트를 입력하여 상기 텍스트와 관련된 적어도 하나의 삽화를 획득할 수 있다.According to one embodiment, the processor (120) may provide a screen having a text input window as illustrated in FIG. 1, and when text is entered into the text input window, the processor (120) may input the text into an artificial intelligence model to obtain at least one illustration related to the text.

또 다른 실시 예에 따르면, 프로세서(120)는 도 13에 도시된 것과 같은 화면을 제공할 수 있고, 삽화 생성 메뉴(1310)가 선택되면 삽화 검색을 위한 UI(1320)를 제공할 수 있다. 그리고 삽화 검색을 위한 UI(1320)에 마련된 텍스트 입력 영역(1321)에 텍스트가 입력되고 검색(1323)이 선택되면, 프로세서(120)는 텍스트를 인공지능 모델에 입력하여, 텍스트와 연관된 적어도 하나의 삽화를 포함하는 검색 결과(1325)를 제공할 수 있다. 그리고 프로세서(120)는 검색 결과(1325)에서 선택된 삽화를 프레젠테이션 영상(1330)에 제공할 수 있다.According to another embodiment, the processor (120) may provide a screen as illustrated in FIG. 13, and when an illustration creation menu (1310) is selected, a UI (1320) for illustration search may be provided. When text is input into a text input area (1321) provided in the UI (1320) for illustration search and a search (1323) is selected, the processor (120) may input the text into an artificial intelligence model, and provide a search result (1325) including at least one illustration related to the text. The processor (120) may then provide an illustration selected from the search result (1325) in a presentation video (1330).

또 다른 실시 예에 따르면, 프로세서(120)는 도 14에 도시된 것과 같은 화면을 제공할 수 있고, 스크립트 입력창(1400)에 텍스트가 입력되고 삽화 생성 버튼(1410)이 선택되면, 프로세서(120)는 입력된 텍스트를 인공지능 모델에 입력하여 텍스트와 연관된 적어도 하나의 삽화(1421)를 제공할 수 있다.According to another embodiment, the processor (120) may provide a screen as illustrated in FIG. 14, and when text is entered into a script input window (1400) and an illustration generation button (1410) is selected, the processor (120) may input the entered text into an artificial intelligence model to provide at least one illustration (1421) associated with the text.

또 다른 실시 예에 따르면, 프로세서(120)는 도 15에 도시된 것과 같은 화면을 제공할 수 있고, 텍스트 지정을 위한 사용자 입력 및 삽화 생성 버튼(1510)을 선택하는 사용자 입력이 수신되면, 프로세서(120)는 지정된 텍스트(1520)를 인공지능 모델에 입력하여 텍스트와 연관된 적어도 하나의 삽화(1531)를 제공할 수 있다.According to another embodiment, the processor (120) may provide a screen as illustrated in FIG. 15, and when a user input for specifying text and selecting an illustration generation button (1510) is received, the processor (120) may input the specified text (1520) into an artificial intelligence model to provide at least one illustration (1531) associated with the text.

According to another embodiment, as illustrated in FIG. 16, when text is block-selected and a specific user operation on the selected text is input, the processor (120) provides a menu (1640). When the illustration generation item in the menu (1640) is selected, the processor (120) provides a UI (1600) for illustration search in which the selected text is already entered in the text input area (1610). When search (1620) is selected, the processor (120) inputs the selected text into the artificial intelligence model and provides a search result (1630) including at least one illustration associated with the text. The processor (120) can then provide the illustration selected from the search result (1630) on the presentation video (1330).

본 개시의 일 실시 예에 따르면, 프로세서(120)는 프레젠테이션 영상의 디자인에 대한 정보 및 텍스트를 인공지능 모델에 입력하여, 텍스트와 관련되며 프레젠테이션 영상의 디자인과 대응되는 적어도 하나의 삽화를 획득할 수 있다.According to one embodiment of the present disclosure, the processor (120) can input information and text about the design of a presentation video into an artificial intelligence model to obtain at least one illustration related to the text and corresponding to the design of the presentation video.

예컨대, 도 12를 참고하여 설명하자면, 프로세서(120)는 텍스트를 제1 인공지능 모델(1210)에 입력하여 적어도 하나의 제1 삽화(1211)를 획득하고, 프레젠테이션 영상의 디자인에 대한 정보 및 적어도 하나의 제1 삽화(1211)를 제2 인공지능 모델(1220)에 입력하여, 적어도 하나의 제1 삽화(1211)가 프레젠테이션 영상의 디자인과 대응되도록 수정된 적어도 하나의 제2 삽화(1221)를 획득할 수 있다.For example, referring to FIG. 12, the processor (120) may input text into a first artificial intelligence model (1210) to obtain at least one first illustration (1211), and input information about the design of the presentation video and at least one first illustration (1211) into a second artificial intelligence model (1220) to obtain at least one second illustration (1221) in which at least one first illustration (1211) is modified to correspond to the design of the presentation video.
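The chaining of the two models in FIG. 12 can be sketched as two functions applied in sequence. Both functions below are hypothetical placeholders for the trained models (1210) and (1220): illustrations are represented as plain dictionaries, and "matching the design" is reduced to copying one color from an assumed `design_info` dictionary.

```python
def first_model(text):
    """Placeholder for the first AI model (1210): text -> first illustrations (1211)."""
    return [{"subject": word, "color": "black"} for word in text.split()]


def second_model(design_info, illustration):
    """Placeholder for the second AI model (1220): restyle one illustration to fit the design."""
    styled = dict(illustration)  # copy so the first illustration is left unchanged
    styled["color"] = design_info["primary_color"]
    return styled


def styled_illustrations(text, design_info):
    """Run the first model, then pass each result through the second model (-> 1221)."""
    firsts = first_model(text)
    return [second_model(design_info, illustration) for illustration in firsts]
```

Keeping the two stages separate means the content model does not need to be rerun when only the presentation design changes; only the second stage has to be reapplied.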

일 실시 예에 따르면, 프로세서(120)는 텍스트를 인공지능 모델에 입력하여 텍스트와 관련되며 서로 동일한 그래픽 효과를 갖는 복수의 삽화를 획득할 수 있다.According to one embodiment, the processor (120) may input text into an artificial intelligence model to obtain a plurality of illustrations related to the text and having the same graphic effects.

According to one embodiment, the processor (120) may input text into an artificial intelligence model to obtain a plurality of first illustrations, and obtain a second illustration synthesized from the plurality of first illustrations as the illustration associated with the text. For example, the processor (120) may obtain an illustration synthesized from a plurality of illustrations using an artificial intelligence model, as described with reference to FIGS. 5 to 11.

According to one embodiment of the present disclosure, the memory (110) may store a database of illustrations matched with tag information, as described with reference to FIG. 4. In this case, the processor (120) may input text into an artificial intelligence model to obtain at least one key word from the text, and search the database stored in the memory (110) for an illustration corresponding to the obtained key word. The database may also be stored on a server external to the electronic device (100).
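A tag-matched lookup of this kind can be sketched as a set-overlap search. The database contents, the function name, and the ranking rule (number of shared tags, then name) are assumptions chosen for illustration; the disclosure does not specify how matches are ordered.

```python
# Hypothetical tag database: illustration id -> set of tags.
TAGGED_ILLUSTRATIONS = {
    "sun.svg": {"sun", "weather", "summer"},
    "umbrella.svg": {"rain", "weather"},
    "beach.svg": {"summer", "sea", "vacation"},
}


def search_by_key_words(key_words, database=TAGGED_ILLUSTRATIONS):
    """Return ids of illustrations sharing at least one tag with the key words, best first."""
    wanted = set(key_words)
    hits = [
        (len(tags & wanted), name)
        for name, tags in database.items()
        if tags & wanted
    ]
    # Sort by descending tag overlap, then by name for a deterministic order.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [name for _, name in hits]
```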

본 개시의 일 실시 예에 따르면, 프로세서(120)는 인공지능 모델을 이용해 획득된 적어도 하나의 삽화 중 사용자에 의해 선택된 삽화에 대한 정보를 포함하는 피드백 데이터를 상기 인공지능 모델에 적용하여 상기 인공지능 모델을 재학습시킬 수 있다.According to one embodiment of the present disclosure, the processor (120) can retrain the artificial intelligence model by applying feedback data including information about an illustration selected by a user from among at least one illustration obtained using the artificial intelligence model to the artificial intelligence model.

According to another embodiment of the present disclosure, the processor (120) may input text entered in the UI of a running messenger program into an artificial intelligence model and, for example, provide an emoticon related to the text as described with reference to FIG. 17, or provide a background image as described with reference to FIG. 18a.

한편, 상술한 실시 예들에선 전자 장치(100) 하나의 장치만이 이용되는 것으로 설명하였으나, 여러 대의 장치를 통해 상술한 실시 예가 구현될 수도 있다. 이와 관련하여선 도 20a, 도 20b 및 도 20c를 참고하여 설명하도록 한다.Meanwhile, although the above-described embodiments have been described as utilizing only one electronic device (100), the above-described embodiments may be implemented using multiple devices. In this regard, a description will be given with reference to FIGS. 20a, 20b, and 20c.

FIG. 20a is a flowchart of a network system using an artificial intelligence model according to various embodiments of the present disclosure.

Referring to FIG. 20a, a network system using an artificial intelligence model may include a first component (2010a) and a second component (2020a). For example, the first component (2010a) may be an electronic device such as a desktop, smartphone, or tablet PC, and the second component (2020a) may be a server storing an artificial intelligence model, a database, and the like. Alternatively, the first component (2010a) may be a general-purpose processor and the second component (2020a) may be an artificial intelligence-dedicated processor. Alternatively, the first component (2010a) may be at least one application and the second component (2020a) may be an operating system (OS). That is, the second component (2020a) may be more integrated, more specialized, lower in delay, higher in performance, or richer in resources than the first component (2010a), and may therefore process the many operations required to create, update, or apply a model more quickly and effectively than the first component (2010a).

제1 구성 요소(2010a) 및 제2 구성 요소(2020a) 간에 데이터를 송/수신하기 위한 인터페이스가 정의될 수 있다.An interface for transmitting/receiving data between the first component (2010a) and the second component (2020a) may be defined.

일 예로, 모델에 적용할 학습 데이터를 인자 값(또는, 매개 값 또는 전달 값)으로 갖는 API(application program interface)가 정의될 수 있다. API는 어느 하나의 프로토콜(예로, 제1 구성 요소(2010a)에서 정의된 프로토콜)에서 다른 프로토콜(예로, 제2 구성 요소(2020a)에서 정의된 프로토콜)의 어떤 처리를 위해 호출할 수 있는 서브 루틴 또는 함수의 집합으로 정의될 수 있다. 즉, API를 통하여 어느 하나의 프로토콜에서 다른 프로토콜의 동작이 수행될 수 있는 환경을 제공될 수 있다.For example, an application program interface (API) may be defined that has training data to be applied to a model as argument values (or, parameter values or pass values). The API may be defined as a set of subroutines or functions that can be called from one protocol (e.g., the protocol defined in the first component (2010a)) to perform some processing on another protocol (e.g., the protocol defined in the second component (2020a)). In other words, an environment in which operations of another protocol can be performed on one protocol may be provided through the API.

도 20a를 참고하면, 먼저, 제1 구성요소(2010a)는 텍스트를 입력받을 수 있다(S2001a). 제1 구성요소(2010a)는 키보드, 터치 스크린 등 다양한 입력 장치를 통해 텍스트를 입력받을 수 있다. 또는, 제1 구성 요소(2010a)는 음성을 입력받아 이를 텍스트로 변환할 수 있다. 여기서 텍스트는 프레젠테이션 영상에 대한 스크립트 또는 메신저 프로그램의 텍스트 입력창에 입력되는 텍스트 등일 수 있다.Referring to Figure 20a, first, the first component (2010a) can receive text input (S2001a). The first component (2010a) can receive text input through various input devices such as a keyboard or touch screen. Alternatively, the first component (2010a) can receive voice input and convert it into text. Here, the text may be a script for a presentation video or text entered into a text input window of a messenger program.

그리고 제1 구성요소(2010a)는 입력된 텍스트를 제2 구성요소(2020a)로 전송할 수 있다(S2003a). 예컨대, 제1 구성요소(2010a)는 근거리 통신망(LAN: Local Area Network) 및 인터넷망을 통해 제2 구성요소(2020a)에 접속될 수 있고, 또는 무선 통신(예를 들어, GSM, UMTS, LTE, WiBRO 등의 무선 통신) 방식에 의해서 제2 구성요소(2020a)에 접속될 수 있다.And the first component (2010a) can transmit the input text to the second component (2020a) (S2003a). For example, the first component (2010a) can be connected to the second component (2020a) via a local area network (LAN) and the Internet, or can be connected to the second component (2020a) via a wireless communication method (e.g., wireless communication such as GSM, UMTS, LTE, WiBRO, etc.).

제1 구성요소(2010a)는 입력된 텍스트 그대로를 제2 구성요소(2020a)에 전송하거나, 또는 입력된 텍스트에 자연어 처리를 하여 제2 구성요소(2020a)에 전송할 수 있다. 이 경우, 제1 구성요소(2010a)는 자연어 처리를 위한 인공 지능 모델을 저장하고 있을 수 있다.The first component (2010a) may transmit the input text as is to the second component (2020a), or may perform natural language processing on the input text and transmit it to the second component (2020a). In this case, the first component (2010a) may store an artificial intelligence model for natural language processing.

제2 구성요소(2020a)는 수신된 텍스트를 인공지능 모델에 입력하여 텍스트와 연관된 적어도 하나의 삽화를 획득할 수 있다(S2005a). 제2 구성요소(2020a)는 인공지능 모델 및 삽화 생성에 필요한 다양한 데이터를 포함한 데이터 베이스를 저장할 수 있다. 제2 구성요소(2020a)는 상술한 다양한 실시 예에 따른 인공지능 모델을 이용한 동작을 수행할 수 있다.The second component (2020a) can input the received text into an artificial intelligence model to obtain at least one illustration associated with the text (S2005a). The second component (2020a) can store a database containing various data necessary for generating the artificial intelligence model and the illustration. The second component (2020a) can perform operations using the artificial intelligence model according to the various embodiments described above.

그리고 제2 구성요소(2020a)는 획득된 적어도 하나의 삽화를 제1 구성요소(2010a)로 전송할 수 있다(S2007a). 이 경우, 예컨대, 제2 구성요소(2020a)는 획득된 적어도 하나의 삽화를 이미지 파일 형태로 제1 구성요소(2010a)에 전송할 수 있다. 또 다른 예로, 제2 구성요소(2020a)는 획득된 적어도 하나의 삽화의 저장 주소(예컨대, URL 주소)에 대한 정보를 제1 구성요소(2010a)로 전송할 수 있다.The second component (2020a) can then transmit at least one acquired illustration to the first component (2010a) (S2007a). In this case, for example, the second component (2020a) can transmit the at least one acquired illustration to the first component (2010a) in the form of an image file. As another example, the second component (2020a) can transmit information about the storage address (e.g., URL address) of the at least one acquired illustration to the first component (2010a).

제1 구성요소(2010a)는 제2 구성요소(2020a)로부터 수신한 삽화를 제공할 수 있다(S2009a). 예컨대 제1 구성요소(2010a)는 자체적으로 포함한 디스플레이 또는 외부 디스플레이 장치를 통해 수신한 적어도 하나의 삽화를 표시할 수 있다. 사용자는 표시된 적어도 하나의 삽화 중 사용을 원하는 삽화를 선택하여 이용할 수 있다. 예컨대, 삽화는 프레젠테이션 영상 제작을 위해 이용될 수 있고, 메신저 프로그램에서 대화 상대방에게 보낼 이모티콘, 배경화면 등으로 이용될 수 있다.The first component (2010a) can provide an illustration received from the second component (2020a) (S2009a). For example, the first component (2010a) can display at least one illustration received through its own display or an external display device. The user can select and use the desired illustration from among the displayed at least one illustration. For example, the illustration can be used to create a presentation video, or as an emoticon or background image to send to a conversation partner in a messenger program.
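The exchange in steps S2001a to S2009a can be sketched with two in-process classes. This is only a schematic: the class and field names are hypothetical, the network transport is replaced by a direct method call, the model inference is a placeholder, and `example.invalid` is a reserved placeholder host rather than a real storage address.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IllustrationResponse:
    """One illustration, sent either inline as image bytes or by reference as a URL."""
    image_bytes: Optional[bytes] = None
    url: Optional[str] = None


class SecondComponent:
    """Stand-in for the server holding the AI model and database (S2005a, S2007a)."""

    def __init__(self, send_as_url=True):
        self.send_as_url = send_as_url

    def handle(self, text):
        # Placeholder for running the AI model on the received text.
        name = text.strip().lower().replace(" ", "_")
        if self.send_as_url:
            return [IllustrationResponse(url=f"https://example.invalid/{name}.png")]
        return [IllustrationResponse(image_bytes=name.encode("utf-8"))]


class FirstComponent:
    """Stand-in for the user-facing device (S2001a, S2003a, S2009a)."""

    def __init__(self, server):
        self.server = server

    def request_illustrations(self, text):
        responses = self.server.handle(text)  # send text, receive illustrations
        # Provide whichever form the server chose: a URL to fetch or raw image bytes.
        return [r.url if r.url is not None else r.image_bytes for r in responses]
```

Sending a URL keeps the response small and lets the first component fetch the image only when the user actually selects it, while sending the image file avoids a second round trip.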

상술한 바와 같은 인공지능 모델은 인공지능 알고리즘 기반으로 학습된 판단 모델로서, 예로, 신경망(Neural Network)을 기반으로 하는 모델일 수 있다. 학습된 인공지능 모델은 인간의 뇌 구조를 컴퓨터상에서 모의하도록 설계될 수 있으며 인간의 신경망의 뉴런(neuron)을 모의하는, 가중치를 가지는 복수의 네트워크 노드들을 포함할 수 있다. 복수의 네트워크 노드들은 뉴런이 시냅스(synapse)를 통하여 신호를 주고받는 뉴런의 시냅틱(synaptic) 활동을 모의하도록 각각 연결 관계를 형성할 수 있다. 또한, 학습된 인공지능 모델은, 일 예로, 신경망 모델, 또는 신경망 모델에서 발전한 딥 러닝 모델을 포함할 수 있다. 딥 러닝 모델에서 복수의 네트워크 노드들은 서로 다른 깊이(또는, 레이어)에 위치하면서 컨볼루션(convolution) 연결 관계에 따라 데이터를 주고받을 수 있다. 학습된 인공지능 모델의 예에는 DNN(Deep Neural Network), RNN(Recurrent Neural Network), BRDNN(Bidirectional Recurrent Deep Neural Network) 등이 있을 수 있으나 이에 한정되지 않는다.The artificial intelligence model described above is a judgment model learned based on an artificial intelligence algorithm, and may be, for example, a neural network-based model. The learned artificial intelligence model may be designed to simulate the structure of the human brain on a computer and may include multiple network nodes with weights that simulate the neurons of a human neural network. The multiple network nodes may each form a connection relationship to simulate the synaptic activity of neurons that exchange signals through synapses. In addition, the learned artificial intelligence model may include, for example, a neural network model or a deep learning model developed from a neural network model. In a deep learning model, multiple network nodes may be located at different depths (or layers) and exchange data according to convolutional connection relationships. Examples of learned artificial intelligence models may include, but are not limited to, a DNN (Deep Neural Network), an RNN (Recurrent Neural Network), and a BRDNN (Bidirectional Recurrent Deep Neural Network).

일 실시 예에 따르면, 제1 구성요소(2010a)는 상술한 바와 같이 텍스트와 관련된 삽화를 획득하기 위하여 인공지능 전용 프로그램(또는 인공지능 에이전트, Artificial intelligence agent)인 개인 비서 프로그램을 이용할 수 있다. 이때, 개인 비서 프로그램은 AI(Artificial Intelligence) 기반의 서비스를 제공하기 위한 전용 프로그램으로서, 기존의 범용 프로세서 또는 별도의 AI 전용 프로세서에 의해 실행될 수 있다. According to one embodiment, the first component (2010a) may use a personal assistant program, which is an AI-dedicated program (or artificial intelligence agent), to obtain illustrations related to text as described above. In this case, the personal assistant program is a dedicated program for providing AI (Artificial Intelligence)-based services and may be executed by an existing general-purpose processor or a separate AI-dedicated processor.

구체적으로, 기설정된 사용자 입력(예를 들어, 개인 비서 챗봇에 대응되는 아이콘 터치, 기설정된 단어를 포함하는 사용자 음성 등)이 입력되거나 제1 구성요소(2010a)에 구비된 버튼(예를 들어, 인공지능 에이전트를 실행하기 위한 버튼)이 눌러지는 경우, 인공지능 에이전트가 동작(또는 실행)할 수 있다. 그리고 인공지능 에이전트는 텍스트를 제2 구성요소(2020a)로 전송하고 제2 구성요소(2020a)로부터 수신된 적어도 하나의 삽화를 제공할 수 있다.Specifically, when a preset user input (e.g., touching an icon corresponding to a personal assistant chatbot, a user voice including a preset word, etc.) is input or a button provided in the first component (2010a) is pressed (e.g., a button for executing an artificial intelligence agent), the artificial intelligence agent may operate (or execute). In addition, the artificial intelligence agent may transmit text to the second component (2020a) and provide at least one illustration received from the second component (2020a).

물론, 화면상에 기설정된 사용자 입력이 감지되거나 제1 구성요소(2010a)에 구비된 버튼(예를 들어, 인공지능 에이전트를 실행하기 위한 버튼)이 눌러지면, 인공지능 에이전트가 동작할 수도 있다. 또는, 인공지능 에이전트는 기설정된 사용자 입력이 감지되거나 제1 구성요소(2010a)에 구비된 버튼이 선택되기 이전에 기 실행된 상태일 수 있다. 이 경우, 기설정된 사용자 입력이 감지되거나 제1 구성요소(2010a)에 구비된 버튼이 선택된 이후에는 제1 구성요소(2010a)의 인공지능 에이전트가 텍스트를 바탕으로 삽화를 획득할 수 있다. 또한, 인공지능 에이전트는 기설정된 사용자 입력이 감지되거나 제1 구성요소(2010a)에 구비된 버튼이 선택되기 이전에 대기 상태일 수 있다. 여기서, 대기 상태란, 인공지능 에이전트의 동작 시작을 제어하기 위해 미리 정의된 사용자 입력이 수신되는 것을 감지하는 상태이다. 인공지능 에이전트가 대기 상태인 동안 기설정된 사용자 입력이 감지되거나 제1 구성요소(2010a)에 구비된 버튼이 선택되면, 제1 구성요소(2010a)는 인공지능 에이전트를 동작시키고, 텍스트를 바탕으로 획득된 삽화를 제공할 수 있다. The artificial intelligence agent may also start operating when a preset user input is detected on the screen or when a button provided on the first component (2010a) (e.g., a button for executing the artificial intelligence agent) is pressed. Alternatively, the artificial intelligence agent may already be running before the preset user input is detected or the button provided on the first component (2010a) is selected. In this case, after the preset user input is detected or the button is selected, the artificial intelligence agent of the first component (2010a) may obtain an illustration based on the text. The artificial intelligence agent may also be in a standby state before the preset user input is detected or the button is selected. Here, the standby state is a state in which the agent monitors for the predefined user input that controls the start of its operation. If the preset user input is detected or the button provided on the first component (2010a) is selected while the artificial intelligence agent is in the standby state, the first component (2010a) may start the artificial intelligence agent and provide an illustration obtained based on the text.
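
The agent states described above (already running, or in standby until a trigger arrives) can be sketched as a small state machine. This is a minimal illustration, assuming hypothetical names (`AgentState`, `Agent`, `on_trigger`) and a caller-supplied function that stands in for the actual illustration lookup.

```python
from enum import Enum, auto


class AgentState(Enum):
    STANDBY = auto()   # only watching for the predefined trigger
    RUNNING = auto()   # agent has been started


class Agent:
    """Hypothetical AI agent on the first component."""

    def __init__(self, get_illustration, state=AgentState.STANDBY):
        self.state = state
        self._get_illustration = get_illustration  # stand-in for the real lookup

    def on_trigger(self, text):
        """Handle the preset user input or the dedicated button press."""
        if self.state is AgentState.STANDBY:
            self.state = AgentState.RUNNING  # start the agent first
        return self._get_illustration(text)
```

An agent constructed with `state=AgentState.RUNNING` models the case where the agent was already executing before the trigger.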

본 개시의 또 다른 실시 예로, 제1 구성요소(2010a)가 직접 인공지능 모델을 이용하여 텍스트와 관련한 적어도 하나의 삽화를 획득하는 경우 인공지능 에이전트는 인공지능 모델을 제어하여 텍스트와 관련한 적어도 하나의 삽화를 획득할 수 있다. 이때, 인공지능 에이전트는 상술한 제2 구성요소(2020a)의 동작을 수행할 수 있다.In another embodiment of the present disclosure, if the first component (2010a) directly uses an AI model to obtain at least one illustration related to the text, the AI agent may control the AI model to obtain at least one illustration related to the text. In this case, the AI agent may perform the operations of the second component (2020a) described above.

도 20b는 본 개시의 일 실시 예에 따른 인공지능 모델을 이용하는 네트워크 시스템의 흐름도이다.FIG. 20b is a flowchart of a network system using an artificial intelligence model according to an embodiment of the present disclosure.

도 20b를 참조하면, 인공지능 모델을 이용하는 네트워크 시스템은 제1 구성 요소(2010b), 제2 구성 요소(2020b) 및 제3 구성 요소(2030b)를 포함할 수 있다. 예컨대, 제1 구성 요소(2010b)는 데스크톱, 스마트폰, 태블릿 PC 등과 같은 전자 장치이고 제2 구성 요소(2020b)는 마이크로소프트 파워포인트 또는 키노트 등과 같은 프레젠테이션 소프트웨어를 구동하는 서버일 수 있으며, 제3 구성 요소(2030b)는 자연어 처리를 수행하는 인공지능 모델 등이 저장된 서버일 수 있다. Referring to FIG. 20b, a network system utilizing an artificial intelligence model may include a first component (2010b), a second component (2020b), and a third component (2030b). For example, the first component (2010b) may be an electronic device such as a desktop, smartphone, or tablet PC, the second component (2020b) may be a server running presentation software such as Microsoft PowerPoint or Keynote, and the third component (2030b) may be a server storing an artificial intelligence model performing natural language processing.

제1 구성 요소(2010b), 제2 구성 요소(2020b) 및 제3 구성 요소(2030b) 간에 데이터를 송/수신하기 위한 인터페이스가 정의될 수 있다.An interface for transmitting/receiving data between the first component (2010b), the second component (2020b), and the third component (2030b) may be defined.

도 20b를 참고하면, 먼저, 제1 구성요소(2010b)는 텍스트를 입력받을 수 있다(S2001b). 제1 구성요소(2010b)는 키보드, 터치 스크린 등 다양한 입력 장치를 통해 텍스트를 입력받을 수 있다. 또는, 제1 구성 요소(2010b)는 음성을 입력받아 이를 텍스트로 변환할 수 있다.Referring to Figure 20b, first, the first component (2010b) can receive text input (S2001b). The first component (2010b) can receive text input through various input devices such as a keyboard or touch screen. Alternatively, the first component (2010b) can receive voice input and convert it into text.

이후, 제1 구성요소(2010b)는 입력된 텍스트를 제3 구성요소(2030b)로 전송할 수 있다(S2003b). 예컨대, 제1 구성요소(2010b)는 근거리 통신망(LAN: Local Area Network) 및 인터넷망을 통해 제3 구성요소(2030b)에 접속될 수 있고, 또는 무선 통신(예를 들어, GSM, UMTS, LTE, WiBRO 등의 무선 통신) 방식에 의해서 제3 구성요소(2030b)에 접속될 수 있다.Thereafter, the first component (2010b) can transmit the input text to the third component (2030b) (S2003b). For example, the first component (2010b) can be connected to the third component (2030b) via a local area network (LAN) and the Internet, or can be connected to the third component (2030b) via a wireless communication method (e.g., wireless communication such as GSM, UMTS, LTE, WiBRO, etc.).

제3 구성요소(2030b)는 수신된 텍스트를 인공지능 모델에 입력하여 텍스트와 연관된 적어도 하나의 핵심 단어 및 핵심 단어 사이의 연관 관계를 획득할 수 있다(S2005b). 제3 구성요소(2030b)는 핵심 단어 및 핵심 단어 사이의 연관 관계를 제2 구성요소(2020b)로 전송할 수 있다(S2007b).The third component (2030b) can input the received text into an artificial intelligence model to obtain at least one key word associated with the text and the associations between the key words (S2005b). The third component (2030b) can transmit the key words and the associations between the key words to the second component (2020b) (S2007b).

제2 구성요소(2020b)는 수신된 핵심 단어 및 핵심 단어 사이의 연관 관계를 이용하여, 합성 삽화를 생성할 수 있다(S2009b). 제2 구성요소(2020b)는 생성된 합성 삽화를 제1 구성 요소(2010b)로 전달할 수 있다(S2011b). 예컨대 제1 구성요소(2010b)는 자체적으로 포함한 디스플레이 또는 외부 디스플레이 장치를 통해 수신한 적어도 하나의 삽화를 표시할 수 있다. 예컨대, 삽화는 프레젠테이션 영상 제작을 위해 이용될 수 있고, 메신저 프로그램에서 대화 상대방에게 보낼 이모티콘, 배경화면 등으로 이용될 수 있다.The second component (2020b) can generate a synthetic illustration using the received key words and the relationships between the key words (S2009b). The second component (2020b) can transmit the generated synthetic illustration to the first component (2010b) (S2011b). For example, the first component (2010b) can display at least one illustration received through its own display or an external display device. For example, the illustration can be used to create a presentation video, or as an emoticon or background image to be sent to a conversation partner in a messenger program.
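
The flow of FIG. 20b (text to the third component, key words and relations to the second component, composite illustration back to the first component) can be sketched as three functions. Everything here is an assumption for illustration: the stop-word filter merely stands in for the natural language processing model, "adjacent key words are related" is a made-up relation rule, and the returned dictionary stands in for an actual composite image.

```python
STOP_WORDS = {"a", "an", "the", "on", "in", "of", "and"}


def third_component(text):
    """Stand-in for S2005b: return key words and relations between them."""
    words = [w.strip(".,").lower() for w in text.split()]
    keys = [w for w in words if w and w not in STOP_WORDS]
    # Hypothetical relation rule: adjacent key words are treated as related.
    relations = list(zip(keys, keys[1:]))
    return keys, relations


def second_component(keys, relations):
    """Stand-in for S2009b: describe a composite illustration."""
    return {"layers": keys, "links": relations}


def run_flow(text):
    """Chain the steps: S2003b to S2007b, then S2009b to S2011b."""
    keys, relations = third_component(text)
    return second_component(keys, relations)
```

In the described system each function would run on a separate device or server; the single-process chaining is only for readability.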

도 20c는 본 개시의 일 실시 예에 따른 네트워크 시스템의 구성도이다.FIG. 20c is a configuration diagram of a network system according to an embodiment of the present disclosure.

도 20c를 참조하면, 인공지능 모델을 이용하는 네트워크 시스템은 제1 구성 요소(2010c) 및 제2 구성 요소(2020c)를 포함할 수 있다. 예컨대, 제1 구성 요소(2010c)는 데스크톱, 스마트폰, 태블릿 PC 등과 같은 전자 장치이고 제2 구성 요소(2020c)는 인공지능 모델, 데이터 베이스 등이 저장된 서버일 수 있다.Referring to FIG. 20c, a network system utilizing an artificial intelligence model may include a first component (2010c) and a second component (2020c). For example, the first component (2010c) may be an electronic device such as a desktop, smartphone, or tablet PC, and the second component (2020c) may be a server storing an artificial intelligence model, a database, and the like.

제1 구성 요소(2010c)는 입력부(2012c) 및 출력부(2014c)를 포함할 수 있다. 입력부(2012c)는 입력 장치를 통해서 텍스트를 입력 받을 수 있다. 입력 장치는 예컨대, 키보드, 터치 패드, 마우스, 버튼 등을 포함할 수 있다. 입력 장치는 제1 구성 요소(2010c)에 내장되어 있거나 또는 제1 구성 요소(2010c)와 연결된 외부 입력 장치일 수 있다. 출력부(2014c)는 출력 장치를 통해서 영상을 출력할 수 있다. 예를 들어, 출력부(2014c)는 출력 장치를 통해서, 제2 구성 요소(2020c)로부터 수신한 정보를 바탕으로 삽화를 출력할 수 있다. 출력 장치는 예를 들어, 액정 디스플레이(LCD), 발광 다이오드(LED) 디스플레이, 유기 발광 다이오드(OLED) 디스플레이, 또는 마이크로 전자기계 시스템(MEMS) 디스플레이, 또는 전자종이(electronic paper) 디스플레이, 터치 스크린을 포함할 수 있다. 출력장치는 제1 구성 요소(2010c)에 내장되어 있거나 또는 제1 구성 요소(2010c)와 연결된 외부 출력 장치일 수 있다.The first component (2010c) may include an input unit (2012c) and an output unit (2014c). The input unit (2012c) may receive text input through an input device. The input device may include, for example, a keyboard, a touchpad, a mouse, buttons, etc. The input device may be built into the first component (2010c) or may be an external input device connected to the first component (2010c). The output unit (2014c) may output an image through the output device. For example, the output unit (2014c) may output an illustration based on information received from the second component (2020c) through the output device. The output device may include, for example, a liquid crystal display (LCD), a light emitting diode (LED) display, an organic light emitting diode (OLED) display, a micro electro mechanical system (MEMS) display, an electronic paper display, or a touch screen. The output device may be built into the first component (2010c) or may be an external output device connected to the first component (2010c).

제2 구성 요소(2020c)는 자연어 처리부(2022c), 데이터베이스(2026c) 및 삽화 생성부(2024c)를 포함할 수 있다.The second component (2020c) may include a natural language processing unit (2022c), a database (2026c), and an illustration generation unit (2024c).

자연어 처리부(2022c)는 수신된 텍스트가 입력되면, 이를 인공 지능 모델을 이용하여 핵심 단어를 추출하고, 핵심 단어들 사이의 연관 관계 및 문맥을 파악할 수 있다.When the received text is input, the natural language processing unit (2022c) can extract key words using an artificial intelligence model and identify the relationships and context between the key words.

데이터 베이스(2026c)는 태그 정보와 매칭된 삽화들을 저장할 수 있다. 예를 들어, 자연어 처리부(2022c)에서 출력된 핵심 단어들을 포함하는 태그 정보와 매칭된 삽화들을 데이터 베이스로부터 검색할 수 있다.The database (2026c) can store illustrations matched with tag information. For example, illustrations matched with tag information containing key words output from the natural language processing unit (2022c) can be retrieved from the database.

삽화 생성부(2024c)는 수신된 핵심 단어들 및 이들의 연관 관계를 바탕으로, 데이터 베이스(2026c)로부터 검색된 복수의 삽화들을 조합하여 합성 삽화를 생성할 수 있다.The illustration generation unit (2024c) can generate a synthetic illustration by combining multiple illustrations retrieved from a database (2026c) based on the received key words and their relationships.
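
The tag-matched lookup in the database (2026c) and the combination step of the illustration generation unit (2024c) might look roughly like the following. The records, file names, and the "first match per key word" rule are all hypothetical; a list of file names stands in for the synthesized image.

```python
DATABASE = [
    {"file": "growth_chart.png", "tags": {"growth", "chart"}},
    {"file": "rocket.png", "tags": {"rocket", "launch"}},
    {"file": "tree.png", "tags": {"growth", "tree"}},
]


def search(keyword, database=DATABASE):
    """Return the files whose tag information contains the key word."""
    return [entry["file"] for entry in database if keyword in entry["tags"]]


def generate_composite(keywords, database=DATABASE):
    """Pick the first match per key word; the list stands in for a composite."""
    parts = []
    for keyword in keywords:
        hits = search(keyword, database)
        if hits and hits[0] not in parts:
            parts.append(hits[0])
    return parts
```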

본 개시의 일 실시 예에 따른 자연어 처리부(2022c) 및 삽화 생성부(2024c)는 하나의 서버에 포함되는 것으로 도시되었으나, 이는 일 실시 예에 불과하다. 예를 들어, 자연어 처리부(2022c)와 삽화 생성부(2024c)는 별도의 서버에 포함될 수도 있으며, 제1 구성 요소(2010c)에 포함될 수도 있다.Although the natural language processing unit (2022c) and the illustration generation unit (2024c) according to one embodiment of the present disclosure are depicted as being included in a single server, this is merely an example. For example, the natural language processing unit (2022c) and the illustration generation unit (2024c) may be included in separate servers, or may be included in the first component (2010c).

도 21은 본 개시의 일 실시 예에 따른 인공지능 모델을 학습하고 이용하기 위한 전자 장치의 구성을 나타내는 블록도이다.FIG. 21 is a block diagram showing the configuration of an electronic device for learning and using an artificial intelligence model according to one embodiment of the present disclosure.

도 21을 참조하면, 전자 장치(2100)는 학습부(2110) 및 판단부(2120) 중 적어도 하나를 포함할 수 있다. 도 21의 전자 장치(2100)는 도 19의 전자 장치(100), 도 20a의 제2 구성요소(2020a)에 대응될 수 있다.Referring to FIG. 21, the electronic device (2100) may include at least one of a learning unit (2110) and a judgment unit (2120). The electronic device (2100) of FIG. 21 may correspond to the electronic device (100) of FIG. 19 and the second component (2020a) of FIG. 20a.

학습부(2110)는 학습 데이터를 이용하여 텍스트와 관련된 적어도 하나의 영상(삽화, 이모티콘 등)를 획득하기 위한 기준을 갖는 인공지능 모델을 생성 또는 학습시킬 수 있다. 학습부(2110)는 수집된 학습 데이터를 이용하여 판단 기준을 갖는 인공지능 모델을 생성할 수 있다.The learning unit (2110) can create or train an artificial intelligence model having criteria for acquiring at least one image (illustration, emoticon, etc.) related to text using learning data. The learning unit (2110) can create an artificial intelligence model having judgment criteria using the collected learning data.

일 예로, 학습부(2110)는 텍스트와 영상을 학습 데이터로 하여, 텍스트와 연관된 영상을 획득하도록 인공지능 모델을 생성, 학습 또는 재학습시킬 수 있다. 또한, 학습부(2110)는 영상과 프레젠테이션의 디자인에 대한 정보를 학습 데이터로서 이용하여, 영상을 프레젠테이션의 디자인과 대응되도록 수정하기 위한 인공지능 모델을 생성, 학습 또는 재학습시킬 수 있다.For example, the learning unit (2110) may create, train, or retrain an artificial intelligence model to acquire images associated with text using text and images as learning data. Furthermore, the learning unit (2110) may use information about the image and presentation design as learning data to create, train, or retrain an artificial intelligence model to modify the image to match the presentation design.

판단부(2120)는 소정의 데이터를 학습된 인공지능 모델의 입력 데이터로 사용하여, 텍스트와 연관된 영상을 획득할 수 있다.The judgment unit (2120) can use certain data as input data for a learned artificial intelligence model to obtain an image associated with the text.

일 예로, 판단부(2120)는 텍스트를 학습된 인공지능 모델의 입력 데이터로 사용하여 텍스트와 관련된 영상을 획득할 수 있다. 또 다른 예로, 판단부(2120)는 영상과 프레젠테이션의 디자인에 대한 정보를 인공지능 모델의 입력 데이터로 사용하여 영상을 프레젠테이션의 디자인과 대응되도록 수정할 수 있다.For example, the judgment unit (2120) may use text as input data for a trained artificial intelligence model to obtain an image related to the text. As another example, the judgment unit (2120) may use information about the image and the presentation design as input data for the artificial intelligence model to modify the image to match the presentation design.

학습부(2110)의 적어도 일부 및 판단부(2120)의 적어도 일부는, 소프트웨어 모듈로 구현되거나 적어도 하나의 하드웨어 칩 형태로 제작되어 전자 장치(100) 또는 제2 구성요소(2020)에 탑재될 수 있다. 예를 들어, 학습부(2110) 및 판단부(2120) 중 적어도 하나는 인공 지능(AI; artificial intelligence)을 위한 전용 하드웨어 칩 형태로 제작될 수도 있고, 또는 기존의 범용 프로세서(예: CPU 또는 application processor) 또는 그래픽 전용 프로세서(예: GPU)의 일부로 제작되어 각종 전자 장치에 탑재될 수도 있다. 이때, 인공 지능을 위한 전용 하드웨어 칩은 확률 연산에 특화된 전용 프로세서로서, 기존의 범용 프로세서보다 병렬처리 성능이 높아 기계 학습과 같은 인공 지능 분야의 연산 작업을 빠르게 처리할 수 있다. 학습부(2110) 및 판단부(2120)가 소프트웨어 모듈(또는, 인스트럭션(instruction) 포함하는 프로그램 모듈)로 구현되는 경우, 소프트웨어 모듈은 컴퓨터로 읽을 수 있는 판독 가능한 비일시적 판독 가능 기록매체(non-transitory computer readable media)에 저장될 수 있다. 이 경우, 소프트웨어 모듈은 OS(Operating System)에 의해 제공되거나, 소정의 애플리케이션에 의해 제공될 수 있다. 또는, 소프트웨어 모듈 중 일부는 OS(Operating System)에 의해 제공되고, 나머지 일부는 소정의 애플리케이션에 의해 제공될 수 있다. At least a part of the learning unit (2110) and at least a part of the judgment unit (2120) may be implemented as a software module or manufactured in the form of at least one hardware chip and mounted on the electronic device (100) or the second component (2020). For example, at least one of the learning unit (2110) and the judgment unit (2120) may be manufactured in the form of a dedicated hardware chip for artificial intelligence (AI), or may be manufactured as a part of an existing general-purpose processor (e.g., CPU or application processor) or a graphics-only processor (e.g., GPU) and mounted on various electronic devices. In this case, the dedicated hardware chip for artificial intelligence is a dedicated processor specialized in probability calculation, and has higher parallel processing performance than an existing general-purpose processor, so it can quickly process calculation tasks in the field of artificial intelligence such as machine learning. When the learning unit (2110) and the judgment unit (2120) are implemented as software modules (or program modules including instructions), the software modules may be stored in a non-transitory computer-readable recording medium. In this case, the software modules may be provided by an operating system (OS) or by a predetermined application. Alternatively, some of the software modules may be provided by the operating system (OS), and the remaining parts may be provided by a predetermined application.

이 경우, 학습부(2110) 및 판단부(2120)는 하나의 전자 장치에 탑재될 수도 있으며, 또는 별개의 전자 장치들에 각각 탑재될 수도 있다. 또한, 학습부(2110) 및 판단부(2120)는 유선 또는 무선으로 통하여, 학습부(2110)가 구축한 모델 정보를 판단부(2120)로 제공할 수도 있고, 판단부(2120)로 입력된 데이터가 추가 학습 데이터로서 학습부(2110)로 제공될 수도 있다. In this case, the learning unit (2110) and the judgment unit (2120) may be mounted on a single electronic device, or may be mounted on separate electronic devices. In addition, over a wired or wireless connection, model information constructed by the learning unit (2110) may be provided to the judgment unit (2120), and data input to the judgment unit (2120) may be provided to the learning unit (2110) as additional learning data.

도 22 내지 도 23은 다양한 실시예에 따른 학습부(2110) 및 판단부(2120)의 블록도이다. FIGS. 22 and 23 are block diagrams of a learning unit (2110) and a judgment unit (2120) according to various embodiments.

도 22를 참고하면, 일부 실시 예에 따른 학습부(2110)는 학습 데이터 획득부(2110-1) 및 모델 학습부(2110-4)를 포함할 수 있다. 또한, 학습부(2110)는 학습 데이터 전처리부(2110-2), 학습 데이터 선택부(2110-3) 및 모델 평가부(2110-5) 중 적어도 하나를 선택적으로 더 포함할 수 있다. Referring to FIG. 22, a learning unit (2110) according to some embodiments may include a learning data acquisition unit (2110-1) and a model learning unit (2110-4). In addition, the learning unit (2110) may optionally further include at least one of a learning data preprocessing unit (2110-2), a learning data selection unit (2110-3), and a model evaluation unit (2110-5).

학습 데이터 획득부(2110-1)는 텍스트와 관련된 적어도 하나의 영상을 획득하기 위한 인공지능 모델에 필요한 학습 데이터를 획득할 수 있다. 본 개시의 실시 예로, 학습 데이터 획득부(2110-1)는 텍스트, 영상, 프레젠테이션의 디자인에 대한 정보 등을 학습 데이터로서 획득할 수 있다. 학습 데이터는 학습부(2110) 또는 학습부(2110)의 제조사가 수집 또는 테스트한 데이터가 될 수도 있다.The learning data acquisition unit (2110-1) can acquire learning data required for an artificial intelligence model to acquire at least one image related to text. In an embodiment of the present disclosure, the learning data acquisition unit (2110-1) can acquire information regarding text, images, presentation design, etc. as learning data. The learning data may be data collected or tested by the learning unit (2110) or the manufacturer of the learning unit (2110).

모델 학습부(2110-4)는 학습 데이터를 이용하여, 인공지능 모델이 텍스트와 연관된 영상을 획득하는 기준을 갖도록 학습시킬 수 있다. 예로, 모델 학습부(2110-4)는 학습 데이터 중 적어도 일부를 텍스트와 연관된 영상을 획득하기 위한 기준으로 이용하는 지도 학습(supervised learning)을 통하여, 인공지능 모델을 학습시킬 수 있다. 또는, 모델 학습부(2110-4)는, 예를 들어, 별다른 지도 없이 학습 데이터를 이용하여 스스로 학습함으로써, 텍스트와 연관된 영상을 획득하기 위한 기준을 발견하는 비지도 학습(unsupervised learning)을 통하여, 인공지능 모델을 학습시킬 수 있다. 예컨대, 모델 학습부(2110-4)는 GAN(Generative Adversarial Network) 기술을 이용하여 인공지능 모델을 학습시킬 수 있다. 또한, 모델 학습부(2110-4)는, 예를 들어, 학습에 따른 판단 결과가 올바른지에 대한 피드백을 이용하는 강화 학습(reinforcement learning)을 통하여, 인공지능 모델을 학습시킬 수 있다. 또한, 모델 학습부(2110-4)는, 예를 들어, 오류 역전파법(error back-propagation) 또는 경사 하강법(gradient descent)을 포함하는 학습 알고리즘 등을 이용하여 인공지능 모델을 학습시킬 수 있다. The model learning unit (2110-4) can use learning data to train the artificial intelligence model so that it has a criterion for acquiring images associated with text. For example, the model learning unit (2110-4) can train the artificial intelligence model through supervised learning, which uses at least some of the learning data as a criterion for acquiring images associated with text. Alternatively, the model learning unit (2110-4) can train the artificial intelligence model through unsupervised learning, which discovers a criterion for acquiring images associated with text by learning on its own from the learning data without separate supervision. For example, the model learning unit (2110-4) can train the artificial intelligence model using Generative Adversarial Network (GAN) technology. In addition, the model learning unit (2110-4) can train the artificial intelligence model through reinforcement learning, which uses feedback on whether a judgment result obtained through learning is correct. The model learning unit (2110-4) can also train the artificial intelligence model using a learning algorithm including, for example, error back-propagation or gradient descent.
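
As a toy illustration of the gradient descent mentioned above, the following sketch fits a single weight to labeled pairs in a supervised setting. It is not the model of the disclosure: the one-parameter model `y ≈ w * x`, the function name `fit_weight`, the learning rate, and the step count are all assumptions chosen to keep the example small.

```python
def fit_weight(samples, lr=0.1, steps=200):
    """Fit y ≈ w * x by gradient descent on the mean squared error."""
    w = 0.0
    n = len(samples)
    for _ in range(steps):
        # d/dw of mean((w*x - y)^2) is mean(2 * (w*x - y) * x).
        grad = sum(2 * (w * x - y) * x for x, y in samples) / n
        w -= lr * grad  # step against the gradient
    return w
```

With multi-layer networks the same update rule applies per weight, with error back-propagation supplying the gradients.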

또한, 모델 학습부(2110-4)는 입력 데이터를 이용하여 텍스트와 관련된 영상을 획득하기 위하여 어떤 학습 데이터를 이용해야 하는지에 대한 선별 기준을 학습할 수도 있다.Additionally, the model learning unit (2110-4) can learn selection criteria for which learning data should be used to obtain an image related to text using input data.

모델 학습부(2110-4)는 미리 구축된 인공지능 모델이 복수 개가 존재하는 경우, 입력된 학습 데이터와 기본 학습 데이터의 관련성이 큰 인공지능 모델을 학습할 인공지능 모델로 결정할 수 있다. 이 경우, 기본 학습 데이터는 데이터의 타입별로 기 분류되어 있을 수 있으며, 인공지능 모델은 데이터의 타입별로 미리 구축되어 있을 수 있다. 예를 들어, 기본 학습 데이터는 학습 데이터가 생성된 지역, 학습 데이터가 생성된 시간, 학습 데이터의 크기, 학습 데이터의 장르, 학습 데이터의 생성자, 학습 데이터 내의 오브젝트의 종류 등과 같은 다양한 기준으로 미리 분류되어 있을 수 있다. If multiple pre-built AI models exist, the model learning unit (2110-4) can select, as the model to be trained, the AI model whose basic training data is most relevant to the input training data. In this case, the basic training data may be pre-classified by data type, and an AI model may be pre-built for each data type. For example, the basic training data may be pre-classified based on various criteria, such as the region where the training data was generated, the time the training data was generated, the size of the training data, the genre of the training data, the creator of the training data, and the type of objects within the training data.
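
One way to picture "relevance between the input training data and the basic training data" is a set-overlap score over descriptive labels. The disclosure does not specify a relevance measure; the Jaccard index, the label sets, and the names `jaccard` and `pick_model` below are assumptions for illustration only.

```python
def jaccard(a, b):
    """Overlap of two label sets: |a ∩ b| / |a ∪ b| (0.0 if both are empty)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pick_model(input_labels, models):
    """Return the name of the pre-built model whose base data is most relevant.

    `models` maps a model name to the labels describing its basic training data.
    """
    return max(models, key=lambda name: jaccard(input_labels, models[name]))
```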

인공지능 모델이 학습되면, 모델 학습부(2110-4)는 학습된 인공지능 모델을 저장할 수 있다. 예컨대, 모델 학습부(2110-4)는 학습된 인공지능 모델을 전자 장치(100)의 메모리(110), 또는 제2 구성요소(2020)의 메모리에 저장할 수 있다.Once the artificial intelligence model is trained, the model training unit (2110-4) can store the trained artificial intelligence model. For example, the model training unit (2110-4) can store the trained artificial intelligence model in the memory (110) of the electronic device (100) or the memory of the second component (2020).

텍스트와 영상 세트로부터 학습된 인공지능 모델은 텍스트가 의미하는 내용에 대한 영상 형태 특징이 학습되어 있다.An AI model trained from a set of text and images has learned image-like features that reflect the meaning of the text.

프레젠테이션 영상의 디자인에 대한 정보와 영상 세트로부터 학습된 인공지능 모델은 프레젠테이션 영상의 디자인에 대해 영상이 어떤 특징을 가지는지 학습되어 있다. An artificial intelligence model trained on a set of images paired with information about the design of presentation images has learned what characteristics an image has for a given presentation design.

학습부(2110)는 인공지능 모델의 판단 결과를 향상시키거나, 인공지능 모델의 생성에 필요한 자원 또는 시간을 절약하기 위하여, 학습 데이터 전처리부(2110-2) 및 학습 데이터 선택부(2110-3)를 더 포함할 수도 있다.The learning unit (2110) may further include a learning data preprocessing unit (2110-2) and a learning data selection unit (2110-3) to improve the judgment results of the artificial intelligence model or to save resources or time required for creating the artificial intelligence model.

학습 데이터 전처리부(2110-2)는 텍스트와 관련된 영상을 획득하기 위한 학습에 획득된 데이터가 이용될 수 있도록, 획득된 데이터를 전처리할 수 있다. 학습 데이터 전처리부(2110-2)는 모델 학습부(2110-4)가 텍스트와 관련된 영상을 획득하기 위하여 획득된 데이터를 이용할 수 있도록, 획득된 데이터를 기 설정된 포맷으로 가공할 수 있다. 예를 들어, 학습 데이터 전처리부(2110-2)는 입력된 텍스트 중 인공지능 모델이 응답을 제공할 때 필요없는 텍스트(예를 들어, 부사, 감탄사 등)를 제거할 수 있다. The learning data preprocessing unit (2110-2) can preprocess the acquired data so that it can be used for learning to acquire text-related images. The learning data preprocessing unit (2110-2) can process the acquired data into a preset format so that the model learning unit (2110-4) can use it to acquire text-related images. For example, the learning data preprocessing unit (2110-2) can remove from the input text any text that the artificial intelligence model does not need in order to provide a response (e.g., adverbs, interjections, etc.).
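
The removal of unneeded words during preprocessing can be sketched with a fixed word list. A real system would more likely rely on part-of-speech tagging; the `FILLER_WORDS` set and the function name `preprocess` are hypothetical stand-ins.

```python
FILLER_WORDS = {"wow", "oh", "very", "really", "quite"}  # hypothetical list


def preprocess(text, filler=FILLER_WORDS):
    """Lowercase, strip punctuation, and drop filler adverbs and interjections."""
    tokens = [token.strip(".,!?").lower() for token in text.split()]
    return [token for token in tokens if token and token not in filler]
```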

학습 데이터 선택부(2110-3)는 학습 데이터 획득부(2110-1)에서 획득된 데이터 또는 학습 데이터 전처리부(2110-2)에서 전처리된 데이터 중에서 학습에 필요한 데이터를 선택할 수 있다. 선택된 학습 데이터는 모델 학습부(2110-4)에 제공될 수 있다. 학습 데이터 선택부(2110-3)는 기 설정된 선별 기준에 따라, 획득되거나 전처리된 데이터 중에서 학습에 필요한 학습 데이터를 선택할 수 있다. 또한, 학습 데이터 선택부(2110-3)는 모델 학습부(2110-4)에 의한 학습에 의해 기 설정된 선별 기준에 따라 학습 데이터를 선택할 수도 있다. The learning data selection unit (2110-3) can select data required for learning from among the data acquired by the learning data acquisition unit (2110-1) or the data preprocessed by the learning data preprocessing unit (2110-2). The selected learning data can be provided to the model learning unit (2110-4). The learning data selection unit (2110-3) can select the learning data required for learning from among the acquired or preprocessed data according to preset selection criteria. The learning data selection unit (2110-3) can also select learning data according to selection criteria established in advance through learning by the model learning unit (2110-4).

학습부(2110)는 인공지능 모델의 판단 결과를 향상시키기 위하여, 모델 평가부(2110-5)를 더 포함할 수도 있다.The learning unit (2110) may further include a model evaluation unit (2110-5) to improve the judgment results of the artificial intelligence model.

모델 평가부(2110-5)는 인공지능 모델에 평가 데이터를 입력하고, 평가 데이터로부터 출력되는 판단 결과가 소정 기준을 만족하지 못하는 경우, 모델 학습부(2110-4)로 하여금 다시 학습하도록 할 수 있다. 이 경우, 평가 데이터는 인공지능 모델을 평가하기 위한 기 정의된 데이터일 수 있다.The model evaluation unit (2110-5) inputs evaluation data into the artificial intelligence model, and if the judgment result output from the evaluation data does not meet a predetermined standard, it can cause the model learning unit (2110-4) to relearn the model. In this case, the evaluation data may be predefined data for evaluating the artificial intelligence model.

예를 들어, 모델 평가부(2110-5)는 평가 데이터에 대한 학습된 인공지능 모델의 판단 결과 중에서, 판단 결과가 정확하지 않은 평가 데이터의 개수 또는 비율이 미리 설정된 임계치를 초과하는 경우 소정 기준을 만족하지 못한 것으로 평가할 수 있다.For example, the model evaluation unit (2110-5) may evaluate that a predetermined standard is not satisfied if the number or ratio of evaluation data for which the judgment result of the learned artificial intelligence model for the evaluation data is inaccurate exceeds a preset threshold.
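
The threshold check described for the model evaluation unit (2110-5) can be written in a few lines. The function name `meets_criterion` and the default threshold of 0.2 are assumptions; the disclosure only states that the criterion fails when the number or ratio of incorrect results exceeds a preset threshold.

```python
def meets_criterion(predictions, expected, max_error_ratio=0.2):
    """Return False when the share of wrong results exceeds the threshold."""
    if not expected or len(predictions) != len(expected):
        raise ValueError("predictions and expected must be non-empty and equal in length")
    errors = sum(p != e for p, e in zip(predictions, expected))
    return errors / len(expected) <= max_error_ratio
```

A `False` result would be the signal for the model learning unit (2110-4) to train again.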

한편, 학습된 인공지능 모델이 복수 개가 존재하는 경우, 모델 평가부(2110-5)는 각각의 학습된 인공지능 모델에 대하여 소정 기준을 만족하는지를 평가하고, 소정 기준을 만족하는 모델을 최종 인공지능 모델로서 결정할 수 있다. 이 경우, 소정 기준을 만족하는 모델이 복수 개인 경우, 모델 평가부(2110-5)는 평가 점수가 높은 순으로 미리 설정된 어느 하나 또는 소정 개수의 모델을 최종 인공지능 모델로서 결정할 수 있다.Meanwhile, if there are multiple trained artificial intelligence models, the model evaluation unit (2110-5) can evaluate whether each trained artificial intelligence model satisfies a predetermined criterion and determine the model that satisfies the predetermined criterion as the final artificial intelligence model. In this case, if there are multiple models that satisfy the predetermined criterion, the model evaluation unit (2110-5) can determine one or a predetermined number of models, in descending order of evaluation scores, as the final artificial intelligence model.
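
Choosing the final model or models among several trained candidates can be sketched as a filter followed by a ranking. The names `select_final_models`, `passing_score`, and `count` are hypothetical, and a single numeric evaluation score per model is assumed.

```python
def select_final_models(scores, passing_score, count=1):
    """Keep models that satisfy the criterion, then take the top `count` by score.

    `scores` maps a model name to its evaluation score (higher is better).
    """
    passing = {name: s for name, s in scores.items() if s >= passing_score}
    ranked = sorted(passing, key=passing.get, reverse=True)
    return ranked[:count]
```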

도 23을 참조하면, 일부 실시예에 따른 판단부(2120)는 입력 데이터 획득부(2120-1) 및 판단 결과 제공부(2120-4)를 포함할 수 있다.Referring to FIG. 23, a judgment unit (2120) according to some embodiments may include an input data acquisition unit (2120-1) and a judgment result provision unit (2120-4).

또한, 판단부(2120)는 입력 데이터 전처리부(2120-2), 입력 데이터 선택부(2120-3) 및 모델 갱신부(2120-5) 중 적어도 하나를 선택적으로 더 포함할 수 있다.Additionally, the judgment unit (2120) may optionally further include at least one of an input data preprocessing unit (2120-2), an input data selection unit (2120-3), and a model update unit (2120-5).

입력 데이터 획득부(2120-1)는 텍스트와 연관된 적어도 하나의 영상을 획득하기 위해 필요한 데이터를 획득할 수 있다. 판단 결과 제공부(2120-4)는 입력 데이터 획득부(2120-1)에서 획득된 입력 데이터를 입력 값으로 학습된 인공지능 모델에 적용하여 텍스트와 연관된 적어도 하나의 영상을 획득할 수 있다. 판단 결과 제공부(2120-4)는 후술할 입력 데이터 전처리부(2120-2) 또는 입력 데이터 선택부(2120-3)에 의해 선택된 데이터를 입력 값으로 인공지능 모델에 적용하여 판단 결과를 획득할 수 있다. The input data acquisition unit (2120-1) can acquire data necessary to acquire at least one image associated with the text. The judgment result provision unit (2120-4) can apply the input data acquired by the input data acquisition unit (2120-1) as input values to the trained artificial intelligence model to acquire at least one image associated with the text. The judgment result provision unit (2120-4) can also obtain a judgment result by applying, as input values to the artificial intelligence model, data selected by the input data preprocessing unit (2120-2) or the input data selection unit (2120-3) described below.

일 실시 예로, 판단 결과 제공부(2120-4)는 입력 데이터 획득부(2120-1)에서 획득한 텍스트를 학습된 인공지능 모델에 적용하여 텍스트와 연관된 적어도 하나의 영상을 획득할 수 있다. In one embodiment, the judgment result provision unit (2120-4) can obtain at least one image associated with the text by applying the text acquired by the input data acquisition unit (2120-1) to the trained artificial intelligence model.

판단부(2120)는 인공지능 모델의 판단 결과를 향상시키거나, 판단 결과의 제공을 위한 자원 또는 시간을 절약하기 위하여, 입력 데이터 전처리부(2120-2) 및 입력 데이터 선택부(2120-3)를 더 포함할 수도 있다.The judgment unit (2120) may further include an input data preprocessing unit (2120-2) and an input data selection unit (2120-3) to improve the judgment results of the artificial intelligence model or save resources or time for providing the judgment results.

입력 데이터 전처리부(2120-2)는 텍스트와 연관된 적어도 하나의 영상을 획득하기 위해 획득된 데이터가 이용될 수 있도록, 획득된 데이터를 전처리할 수 있다. 입력 데이터 전처리부(2120-2)는 판단 결과 제공부(2120-4)가 텍스트와 연관된 적어도 하나의 영상을 획득하기 위하여 획득된 데이터를 이용할 수 있도록, 획득된 데이터를 기 정의된 포맷으로 가공할 수 있다.The input data preprocessing unit (2120-2) can preprocess the acquired data so that the acquired data can be used to obtain at least one image associated with the text. The input data preprocessing unit (2120-2) can process the acquired data into a predefined format so that the judgment result providing unit (2120-4) can use the acquired data to obtain at least one image associated with the text.

입력 데이터 선택부(2120-3)는 입력 데이터 획득부(2120-1)에서 획득된 데이터 또는 입력 데이터 전처리부(2120-2)에서 전처리된 데이터 중에서 응답 제공에 필요한 데이터를 선택할 수 있다. 선택된 데이터는 판단 결과 제공부(2120-4)에게 제공될 수 있다. 입력 데이터 선택부(2120-3)는 응답 제공을 위한 기 설정된 선별 기준에 따라, 획득되거나 전처리된 데이터 중에서 일부 또는 전부를 선택할 수 있다. 또한, 입력 데이터 선택부(2120-3)는 모델 학습부(2110-4)에 의한 학습에 의해 기 설정된 선별 기준에 따라 데이터를 선택할 수도 있다. The input data selection unit (2120-3) can select data required for providing a response from among the data acquired by the input data acquisition unit (2120-1) or the data preprocessed by the input data preprocessing unit (2120-2). The selected data can be provided to the judgment result provision unit (2120-4). The input data selection unit (2120-3) can select some or all of the acquired or preprocessed data according to preset selection criteria for providing a response. The input data selection unit (2120-3) can also select data according to selection criteria established in advance through learning by the model learning unit (2110-4).

The model update unit (2120-5) may control the artificial intelligence model to be updated based on an evaluation of the judgment result provided by the judgment result providing unit (2120-4). For example, the model update unit (2120-5) may provide the judgment result from the judgment result providing unit (2120-4) to the model learning unit (2110-4), thereby requesting that the model learning unit (2110-4) further train or update the artificial intelligence model. In particular, the model update unit (2120-5) may retrain the artificial intelligence model based on feedback information derived from user input.
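One way to picture this feedback loop is a small collector that buffers user feedback and hands a batch to a retraining callback. This is a sketch under stated assumptions: the `ModelUpdater` class, the `Feedback` tuple layout, and the batch-size trigger are invented for illustration, and the disclosure does not describe when or how retraining is requested.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical feedback record: (input text, image id, accepted by user).
Feedback = Tuple[str, str, bool]


@dataclass
class ModelUpdater:
    """Buffers feedback and requests retraining once a batch is full."""
    request_retraining: Callable[[List[Feedback]], None]
    batch_size: int = 3
    _pending: List[Feedback] = field(default_factory=list)

    def record(self, text: str, image_id: str, accepted: bool) -> bool:
        """Store one feedback item; return True if retraining was requested."""
        self._pending.append((text, image_id, accepted))
        if len(self._pending) < self.batch_size:
            return False
        # Hand the full batch to the learning side and start a fresh buffer.
        batch, self._pending = self._pending, []
        self.request_retraining(batch)
        return True
```

Here `request_retraining` stands in for whatever interface the model learning unit (2110-4) exposes.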

According to the various embodiments described above, when a user creates a document such as presentation material, a newspaper, or a book, images, illustrations, and the like that suit the script, article, or book content can be generated immediately, sparing the user the effort of searching for them separately. Furthermore, because the artificial intelligence model can obtain illustrations that are similar in design, the resulting material has a unified look.

Meanwhile, it will be apparent to those skilled in the art that the method of producing presentation material described in the above embodiments can be applied to any field that requires images suited to text, such as the production of books, magazines, newspapers, advertisements, and web pages.

The various embodiments described above may be implemented in software, hardware, or a combination thereof. In a hardware implementation, the embodiments described in the present disclosure may be implemented using at least one of application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), processors, controllers, micro-controllers, microprocessors, and other electrical units for performing functions. In a software implementation, embodiments such as the procedures and functions described in this specification may be implemented as separate software modules. Each of the software modules may perform one or more of the functions and operations described in this specification.

Various embodiments of the present disclosure may be implemented as software including instructions stored in a machine-readable storage medium, that is, a storage medium readable by a machine (e.g., a computer). The machine is a device that can call the stored instructions from the storage medium and operate according to the called instructions, and may include an electronic device according to the disclosed embodiments (e.g., the electronic device (100)). When an instruction is executed by a processor, the processor may perform the function corresponding to the instruction either directly or by using other components under the control of the processor. The instructions may include code generated or executed by a compiler or an interpreter. The machine-readable storage medium may be provided in the form of a non-transitory storage medium. Here, "non-transitory" means only that the storage medium does not include a signal and is tangible; it does not distinguish between data being stored in the storage medium semi-permanently or temporarily.

According to one embodiment, the methods according to the various embodiments disclosed in this document may be provided as part of a computer program product. The computer program product may be traded as a commodity between a seller and a buyer. The computer program product may be distributed in the form of a machine-readable storage medium (e.g., compact disc read only memory (CD-ROM)) or online through an application store (e.g., Play Store™). In the case of online distribution, at least a portion of the computer program product may be at least temporarily stored, or temporarily generated, in a storage medium such as the memory of a manufacturer's server, an application store's server, or a relay server.

Each component (e.g., a module or a program) according to the various embodiments may be composed of a single entity or multiple entities, and some of the sub-components described above may be omitted, or other sub-components may be further included, in various embodiments. Alternatively or additionally, some components (e.g., modules or programs) may be integrated into a single entity that performs the same or similar functions as each of the respective components performed before integration. Operations performed by a module, program, or other component according to the various embodiments may be executed sequentially, in parallel, iteratively, or heuristically; at least some operations may be executed in a different order or omitted; or other operations may be added.

Although preferred embodiments of the present disclosure have been illustrated and described above, the present disclosure is not limited to the specific embodiments described. Various modifications may be made by a person having ordinary skill in the art to which the present disclosure pertains without departing from the gist of the present disclosure as claimed in the claims, and such modifications should not be understood separately from the technical idea or outlook of the present disclosure.

100: Electronic device
110: Memory
120: Processor