KR20250147482A

Info

Publication number: KR20250147482A
Application number: KR1020240046014A
Authority: KR
Inventors: 이승현; 정판준; 최정규
Original assignee: 인스피언 주식회사
Filing date: 2024-04-04
Publication date: 2025-10-13

Abstract

Translated from Korean

One embodiment of the present invention provides a data management method comprising: extracting personal information from collected data; encrypting the personal information and storing it as first data; checking whether each field included in the stored first data is encrypted; when a first unencrypted field is detected among the fields included in the first data, storing detailed information about the first unencrypted field; and re-encrypting the first data.

Description

DATA MANAGEMENT DEVICE, DATA MANAGEMENT METHOD AND A COMPUTER-READABLE RECORDING MEDIUM STORING A COMPUTER PROGRAM FOR EXECUTING THE DATA MANAGEMENT METHOD ON A COMPUTER

The present invention relates to a data management device, a data management method, and a computer-readable recording medium storing a computer program for executing the data management method on a computer.

As technology advances and the volume and variety of data grow, data management is becoming increasingly important. Data management refers to the way a business or organization effectively collects, stores, analyzes, and uses data.

Enterprise Resource Planning (ERP) systems, which manage a company's core functions in an integrated manner, are widely used as tools for efficiently managing important data within the company. They integrate key corporate functions, such as budget management, production management, inventory management, and purchasing management, so that these functions can be operated effectively within a single system. SAP products are a representative example of such ERP systems.

In existing systems, however, it can be difficult to directly understand the meaning of packets generated during communication with a server using general analysis methods. In particular, when data is exchanged using complex data structures and specific protocols, as in an SAP system, extracting or interpreting meaningful values from the packets is even more difficult. A concrete method for analyzing such packets is therefore needed.

In addition, log data originates from a variety of sources. For example, log data is recorded to monitor the operation of a system or application, to monitor the security status of a system and track user activity, and to track and debug errors occurring in an application.

Log data is also used in business analysis to support decision-making. For example, it can be used for customer behavior analysis, marketing effectiveness analysis, and quality control. For these reasons, large volumes of log data are generated, and processing and analyzing this data efficiently becomes an important task.

Accordingly, the present invention aims to provide a data management device, a data management method, and a computer-readable recording medium storing a computer program for executing the data management method on a computer, in order to solve the problems of the existing technology.

One disclosed embodiment provides a data management method comprising: extracting personal information from collected data; encrypting the personal information and storing it as first data; checking whether each field included in the stored first data is encrypted; when a first unencrypted field is detected among the fields included in the first data, storing detailed information about the first unencrypted field; and re-encrypting the first data.
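The steps of the method can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the `enc:` prefix, the phone-number pattern, the `Store` class, and the `reason` text are all hypothetical, and `encrypt` is a base64 stand-in for a real cipher and provides no security.

```python
import base64
import re
from dataclasses import dataclass, field

ENC_PREFIX = "enc:"  # hypothetical marker that tags an encrypted value

# Hypothetical personal-information pattern: a hyphenated phone number.
PHONE_RE = re.compile(r"\b\d{3}-\d{3,4}-\d{4}\b")


def encrypt(value: str) -> str:
    """Stand-in for a real cipher (base64 only, NOT secure)."""
    return ENC_PREFIX + base64.b64encode(value.encode("utf-8")).decode("ascii")


def is_encrypted(value: str) -> bool:
    """Treat a value as encrypted if it carries the marker prefix."""
    return value.startswith(ENC_PREFIX)


def extract_personal_info(collected: dict[str, str]) -> dict[str, str]:
    """Keep only the fields whose value matches the personal-information pattern."""
    return {k: v for k, v in collected.items() if PHONE_RE.search(v)}


@dataclass
class Store:
    first_data: dict[str, str] = field(default_factory=dict)
    unencrypted_details: list[dict[str, str]] = field(default_factory=list)


def inspect_and_reencrypt(store: Store) -> None:
    """Check every field, record details of plaintext fields, then re-encrypt them."""
    for name, value in list(store.first_data.items()):
        if not is_encrypted(value):
            store.unencrypted_details.append(
                {"field": name, "reason": "plaintext value found"}
            )
            store.first_data[name] = encrypt(value)
```

In this sketch the detail record is written before the field is re-encrypted, so the reason a field was found in plaintext remains available afterwards.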

The data management method further comprises checking whether the first data is encrypted in real time when the first data is master data, and checking whether the first data is encrypted at a preset interval when the first data is transaction data.
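The distinction between the two inspection timings might be expressed as below. This is a sketch only: the class and method names are hypothetical, the inspection itself is reduced to recording the record identifier, and the clock value is passed in explicitly so that the behavior is deterministic.

```python
from enum import Enum


class DataKind(Enum):
    MASTER = "master"
    TRANSACTION = "transaction"


class InspectionScheduler:
    """Inspect master data immediately and transaction data once per period."""

    def __init__(self, period_seconds: float) -> None:
        self.period_seconds = period_seconds
        self._last_batch = 0.0          # time of the most recent periodic run
        self._pending: list[str] = []   # transaction records awaiting inspection
        self.inspected: list[str] = []  # record IDs inspected so far

    def on_write(self, record_id: str, kind: DataKind) -> None:
        if kind is DataKind.MASTER:
            # Master data: inspect in real time, as soon as it is written.
            self.inspected.append(record_id)
        else:
            # Transaction data: defer to the next periodic run.
            self._pending.append(record_id)

    def tick(self, now: float) -> list[str]:
        """Run the periodic inspection if the preset period has elapsed."""
        if now - self._last_batch < self.period_seconds:
            return []
        batch, self._pending = self._pending, []
        self._last_batch = now
        self.inspected.extend(batch)
        return batch
```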

The re-encrypting of the first data includes encrypting the data automatically or manually according to a user setting, and the data management method includes recording and storing a log corresponding to the process of re-encrypting the first data.
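One way to picture the automatic and manual modes together with the log is the following sketch. The `ReencryptionJob` class, its method names, and the log message format are assumptions made for illustration; the actual re-encryption is supplied by the caller as a callback.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class ReencryptionJob:
    auto: bool                                                  # user setting
    log: list[str] = field(default_factory=list)               # re-encryption log
    awaiting_approval: list[str] = field(default_factory=list)

    def handle(self, field_name: str, reencrypt: Callable[[str], None]) -> None:
        """Re-encrypt at once in automatic mode; otherwise queue for an operator."""
        if self.auto:
            reencrypt(field_name)
            self.log.append(f"auto re-encrypted {field_name}")
        else:
            self.awaiting_approval.append(field_name)
            self.log.append(f"queued {field_name} for manual re-encryption")

    def approve(self, field_name: str, reencrypt: Callable[[str], None]) -> None:
        """Manual mode: an operator triggers re-encryption of a queued field."""
        self.awaiting_approval.remove(field_name)
        reencrypt(field_name)
        self.log.append(f"manually re-encrypted {field_name}")
```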

One disclosed embodiment provides a data management device comprising: a database that stores data; and a processor that processes the data, wherein the processor extracts personal information from collected data, encrypts the personal information and stores it as first data, checks whether each field included in the stored first data is encrypted, and, when a first unencrypted field is detected among the fields included in the first data, stores detailed information about the first unencrypted field and re-encrypts the first data.

The processor checks whether the first data is encrypted in real time when the first data is master data, and checks whether the first data is encrypted at a preset interval when the first data is transaction data.

The processor re-encrypts the first data automatically or manually according to a user setting, and records and stores a log corresponding to the process of re-encrypting the first data.

One disclosed embodiment provides a computer-readable recording medium storing a computer program for causing a computer to execute a data management method that extracts personal information from collected data, encrypts the personal information and stores it as first data, checks whether each field included in the stored first data is encrypted, and, when a first unencrypted field is detected among the fields included in the first data, stores detailed information about the first unencrypted field and re-encrypts the first data.

According to one embodiment of the present invention, a technology for data management in an ERP system is provided that can satisfy enterprise requirements for data security and monitoring.

In addition, according to one embodiment of the present invention, all access records for the server are generated as logs, which secures the stability of the log records.

In addition, according to one embodiment of the present invention, the accuracy of personal information extraction can be increased through a multidimensional extraction method that uses personal information metadata and an exception handling list.

In addition, according to one embodiment of the present invention, user actions can be mapped to all log data that contains personal information.

In addition, according to one embodiment of the present invention, a large amount of personal information data can be decrypted and re-encrypted in real time.

In addition, according to one embodiment of the present invention, unencrypted personal information can be identified in real time and re-encrypted, and the reason it had not been re-encrypted can be determined.

In addition, according to one embodiment of the present invention, users who find regular expressions difficult to write are assisted with parts of the input, making it easier for them to complete the entire regular expression.

In addition, according to one embodiment of the present invention, a convenience function for scanning previously registered parsers can be provided to increase the reusability of registered parsers.

In addition, according to one embodiment of the present invention, a large amount of data can be searched simultaneously and the results can be checked in real time.

In addition, among all of the compressed log data, only the blocks containing at least one item of log data requested by a user or client can be decompressed.

In addition, according to one embodiment of the present invention, compression information and block offset information are used so that log data can be queried without decompressing all of it, which can greatly shorten the export/import process.
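The idea of decompressing only the block that holds a requested log entry can be sketched as follows. This is an illustration under assumptions, not the storage format of the embodiment: it uses Python's standard `zlib` module, a fixed number of lines per block, log lines without embedded newlines, and a hypothetical index of `(first_line_number, byte_offset, byte_length)` tuples.

```python
import zlib


def compress_blocks(
    lines: list[str], lines_per_block: int
) -> tuple[bytes, list[tuple[int, int, int]]]:
    """Compress log lines block by block and build a block offset index.

    Each index entry is (first_line_number, byte_offset, byte_length).
    """
    blob = bytearray()
    index: list[tuple[int, int, int]] = []
    for start in range(0, len(lines), lines_per_block):
        raw = "\n".join(lines[start:start + lines_per_block]).encode("utf-8")
        packed = zlib.compress(raw)
        index.append((start, len(blob), len(packed)))
        blob += packed
    return bytes(blob), index


def read_line(
    blob: bytes,
    index: list[tuple[int, int, int]],
    lines_per_block: int,
    line_no: int,
) -> str:
    """Return one log line, decompressing only the block that contains it."""
    first, offset, length = index[line_no // lines_per_block]
    block = zlib.decompress(blob[offset:offset + length])
    return block.decode("utf-8").split("\n")[line_no - first]
```

Because each block is compressed independently, a lookup touches only `length` bytes at `offset` rather than the whole archive, which is the property the export/import shortening relies on.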

FIG. 1 is a diagram disclosing hardware for explaining the data management device of the present invention.
FIG. 2 is a diagram disclosing one embodiment of the data management device of the present invention.
FIG. 3 is a diagram disclosing one embodiment of the data management platform of the present invention.
FIG. 4 is a diagram disclosing one embodiment of the data management method of the present invention.
FIG. 5 is a diagram disclosing an embodiment of extracting personal information from data included in an analyzed packet of the present invention.
FIG. 6 is a diagram disclosing an embodiment of visualizing data included in an analyzed packet of the present invention.
FIG. 7 is a diagram disclosing an embodiment of searching data included in an analyzed packet of the present invention.
FIG. 8 is a diagram illustrating an embodiment in which the data management platform of the present invention stores and monitors audit logs.
FIG. 9 is a diagram illustrating an embodiment in which the data management method of the present invention distributes collected packets.
FIG. 10 is a diagram illustrating an embodiment in which the data management platform of the present invention analyzes HTTPS-based packets.
FIG. 11 is a diagram illustrating an embodiment in which the data management method of the present invention analyzes HTTPS-based packets.
FIG. 12 is a diagram illustrating an embodiment of extracting and storing personal information in the data management platform of the present invention.
FIG. 13 is a diagram illustrating an example of personal information metadata defined in the data management platform of the present invention.
FIG. 14 is a diagram illustrating an embodiment of extracting personal information by distinguishing the architecture type in the data management platform of the present invention.
FIG. 15 is a diagram illustrating an embodiment of creating a personal information extraction rule in the data management platform of the present invention.
FIG. 16 is a diagram illustrating an embodiment of creating a personal information exception handling list in the data management platform of the present invention.
FIG. 17 is a diagram illustrating an embodiment of creating an exception filter of the data management platform of the present invention.
FIG. 18 is a diagram illustrating another embodiment in which the data management method of the present invention extracts and stores personal information.
FIG. 19 is a diagram illustrating an embodiment of collecting and mapping user behavior in the data management platform of the present invention.
FIG. 20 is a diagram illustrating an embodiment of generating user behavior metadata in the data management platform of the present invention.
FIG. 21 is a diagram illustrating an embodiment of mapping user behavior metadata and log data in the data management platform of the present invention.
FIG. 22 is a diagram illustrating an embodiment of encrypting personal information in the data management platform of the present invention.
FIG. 23 is a diagram illustrating an embodiment of encrypting personal information in the data management platform of the present invention.
FIG. 24 is a diagram illustrating the mapping information table and the business table of the present invention.
FIG. 25 is a diagram illustrating an embodiment of generating a new encryption key in the data management platform of the present invention.
FIG. 26 is a diagram illustrating an embodiment of adding new business data in the data management platform of the present invention.
FIG. 27 is a diagram illustrating an embodiment in which the data management platform of the present invention normalizes log data.
FIG. 28 is a diagram illustrating an embodiment in which the data management method of the present invention generates a parser.
FIG. 29 is a diagram illustrating one embodiment of the parser generation screen of the present invention.
FIG. 30 is a diagram illustrating an embodiment in which the data management method of the present invention generates a conversion rule.
FIG. 31 is a diagram illustrating an embodiment in which the data management method of the present invention generates a collection path rule.
FIG. 32 is a diagram illustrating an embodiment in which the data management method of the present invention searches an event log.
FIG. 33 is a diagram illustrating an embodiment of searching logs in the data management platform of the present invention.
FIG. 34 is a diagram illustrating an embodiment of the external merge sort algorithm of the present invention.
FIG. 35 is a diagram illustrating an embodiment of analyzing data in the data management platform of the present invention.
FIG. 36 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.
FIG. 37 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.
FIG. 38 is a diagram illustrating the user interface of the analysis task editor provided by the data management platform of the present invention.
FIG. 39 is a diagram illustrating an embodiment in which the data management platform of the present invention stores and retrieves log data.
FIG. 40 is a diagram illustrating another embodiment in which the data management platform of the present invention stores and retrieves log data.
FIG. 41 is a conceptual diagram disclosing the layer-by-layer elements of a data management device according to an embodiment.
FIG. 42 is a diagram illustrating an embodiment in which a data management device according to an embodiment monitors an agent.
FIG. 43 is a flowchart of a data management method according to an embodiment.
FIG. 44 is a diagram illustrating an embodiment in which a proxy server according to an embodiment is connected to another cloud service.
FIG. 45 is a diagram illustrating a function of a proxy server according to an embodiment.
FIG. 46 is a diagram illustrating another function of a proxy server according to an embodiment.
FIG. 47 is a diagram illustrating an embodiment of separating services within a proxy server according to an embodiment.
FIG. 48 is a diagram illustrating an embodiment of scaling out a proxy server according to an embodiment.
FIG. 49 is a diagram illustrating an embodiment in which a proxy server according to an embodiment processes the NI protocol.
FIG. 50 is a diagram illustrating a proxy data collection process of a data management device according to an embodiment.
FIG. 51 is a diagram illustrating an embodiment in which a proxy server according to an embodiment is connected to another cloud service.
FIG. 52 is a flowchart illustrating a data management method according to an embodiment.
FIG. 53 is a diagram illustrating a data management device according to an embodiment.
FIG. 54 is a diagram illustrating an example of the structure of WEBGUI data according to an embodiment.
FIG. 55 is a diagram showing HTML analysis results for WEBGUI data according to an embodiment.
FIG. 56 is a diagram showing HTML analysis results for WEBGUI data according to an embodiment.
FIG. 57 is a diagram showing the original text of WEBGUI data according to an embodiment.
FIG. 58 is a diagram showing XML data from which unnecessary information has been removed according to an embodiment.
FIG. 59 is a diagram illustrating a data management method according to an embodiment.
FIG. 60 is a diagram disclosing an example in which a data management platform according to an embodiment performs access control for a computing system.
FIG. 61 is a diagram disclosing another example in which a data management platform according to an embodiment performs access control for a computing system.
FIG. 62 is a diagram disclosing an example of an access control information input screen according to an embodiment.
FIG. 63 is a diagram disclosing an example of an event processing rule screen according to an embodiment.
FIG. 64 is a diagram disclosing an example of a correlation rule input screen according to an embodiment.
FIG. 65 is a diagram disclosing an example of a correlation rule activation screen according to an embodiment.
FIG. 66 is a diagram disclosing an example of a control data setting screen according to an embodiment.
FIG. 67 is a flowchart disclosing an example in which a data management method according to an embodiment converts a data format.
FIG. 68 is a diagram disclosing an example in which a data management platform according to an embodiment performs indexing and compression of log data.
FIG. 69 is a diagram disclosing another example in which a data management platform according to an embodiment performs indexing and compression of log data.
FIG. 70 is a flowchart disclosing an example in which a data management method according to an embodiment performs indexing and compression of log data.
FIG. 71 is a diagram disclosing an example in which a data management platform according to an embodiment performs statistical query generation and statistical data visualization.
FIG. 72 is a diagram disclosing an example of a query creation screen according to an embodiment.
FIG. 73 is a diagram disclosing an example of a TopN query setting screen according to an embodiment.
FIG. 74 is a diagram disclosing an example of a Time Series query setting screen according to an embodiment.
FIG. 75 is a diagram disclosing an example of a Group By query setting screen according to an embodiment.
FIG. 76 is a diagram disclosing an example of a statistical search screen according to an embodiment.
FIG. 77 is a diagram disclosing an example of a visualization chart selection screen according to an embodiment.
FIG. 78 is a diagram disclosing an example of a field mapping screen according to an embodiment.
FIG. 79 is a flowchart disclosing an example in which a data management method according to an embodiment performs statistical query generation and statistical data visualization.
FIG. 80 is a diagram disclosing an example in which a data management platform according to an embodiment performs time window-based correlation analysis.
FIG. 81 is a diagram disclosing another example in which a data management platform according to an embodiment performs time window-based correlation analysis.
FIG. 82 is a diagram disclosing an example of a correlation analysis rule registration screen according to an embodiment.
FIG. 83 is a diagram disclosing an example of exceeding a count when a target field is not used according to an embodiment.
FIG. 84 is a diagram disclosing an example of exceeding a count when a target field is used according to an embodiment.
FIG. 85 is a diagram disclosing an example of a suppression condition according to an embodiment.
FIG. 86 is a diagram disclosing an example of non-occurrence detection according to an embodiment.
FIG. 87 is a diagram disclosing an example of a consecutive condition according to an embodiment.
FIG. 88 is a flowchart disclosing an example in which a data management method according to an embodiment performs time window-based correlation analysis.
FIG. 89 is a diagram disclosing an example in which a data management platform according to an embodiment performs a log data search according to user type.
FIG. 90 is a diagram disclosing an example of a search condition setting screen according to an embodiment.
FIG. 91 is a diagram disclosing an example of a search condition UI according to an embodiment.
FIG. 92 is a diagram disclosing an example of a search query according to an embodiment.
FIG. 93 is a diagram disclosing an example of a general search result display area according to an embodiment.
FIG. 94 is a diagram disclosing an example of a search option setting screen according to an embodiment.
FIG. 95 is a diagram disclosing an example of an advanced search result display area according to an embodiment.
FIG. 96 is a flowchart disclosing an example in which a data management method according to an embodiment performs a log data search according to user type.
FIG. 97 is a diagram illustrating an embodiment of managing agents in the data management platform of the present invention.
FIG. 98 is a diagram illustrating an embodiment of installing an agent in the data management platform of the present invention.
FIG. 99 is a diagram illustrating an embodiment of approving an agent in the data management platform of the present invention.
FIG. 100 is a diagram illustrating an embodiment of managing agents in the data management platform of the present invention.
FIG. 101 is a diagram illustrating a method of managing agents in the data management method of the present invention.
FIG. 102 is a diagram illustrating a method of managing agent versions in the data management method of the present invention.
FIG. 103 is a diagram illustrating an embodiment in which the data management method of the present invention monitors the status of an agent.
FIG. 104 is a diagram illustrating a method of communicating with an agent in the data management method of the present invention.
FIG. 105 is a diagram illustrating a method of operating an agent in the data management method of the present invention.
FIG. 106 is a diagram illustrating a method of operating an agent in the data management method of the present invention.
FIG. 107 is a diagram illustrating a data management device according to an embodiment.
FIG. 108 is a diagram illustrating a method of detecting an unencrypted field according to an embodiment.
FIG. 109 is a diagram illustrating a method of performing re-encryption after an unencrypted field is detected according to an embodiment.
FIG. 110 is a diagram showing the result of detecting an unencrypted field according to an embodiment.
FIG. 111 is a diagram showing a change history log for re-encrypted data according to an embodiment.
FIG. 112 is a flowchart illustrating a data management method according to an embodiment.

Hereinafter, various embodiments are described in detail with reference to the drawings. The embodiments described below may be modified and implemented in various different forms. To explain the features of the embodiments more clearly, detailed descriptions of matters widely known to those of ordinary skill in the art to which the embodiments pertain are omitted.

In this specification, when a component is said to be "connected" to another component, this includes not only the case where it is "directly connected" but also the case where it is "connected with another component in between." Furthermore, when a component is said to "include" another component, this means that, unless specifically stated otherwise, it may further include other components rather than excluding them.

In addition, terms that include ordinal numbers, such as "first" or "second," may be used herein to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another.

In addition, the function units included in each module are a logical structure for describing the functions the module performs, so the function performed by each function unit can of course be performed by the module. That is, a module need not include every function unit described as belonging to it, and may include at least one function unit for performing its function.

Hereinafter, embodiments are described in detail by way of example with reference to the accompanying drawings. In the embodiments, a framework, a module, an application program interface, and the like may be implemented as a device combined with a physical device or as software. When an embodiment is implemented as software, it may be stored on a storage medium, installed on a computer or the like, and executed by a processor.

도 1은 본 발명의 데이터 관리 장치를 설명하기 위한 하드웨어를 개시하는 도면이다.FIG. 1 is a drawing disclosing hardware for explaining a data management device of the present invention.

본 도면은 후술하는 데이터 관리 플랫폼과 관련된 하드웨어 구성 요소 간의 데이터 송수신에 대한 실시 예를 설명한다.This drawing illustrates an embodiment of data transmission and reception between hardware components related to the data management platform described below.

본 발명은 플랫폼을 통하여 사용자에게 제공될 수 있다. 이를 위하여 사용자/클라이언트(1000)는 웹 브라우저 등을 통하여 네트워크 장치(1001)에 접속하여 컴퓨팅 서버(1004)의 제어 하에 별도의 스토리지/데이터베이스(1003)에 저장된 데이터와 애플리케이션을 이용할 수 있게 된다.The present invention can be provided to users through a platform. To this end, a user/client (1000) can access a network device (1001) via a web browser or the like and utilize data and applications stored in a separate storage/database (1003) under the control of a computing server (1004).

More specifically, the user/client (1000) may request a service (for example, searching, modifying, or deleting data) through a client terminal, and may display on screen the data computed by the computing server (1004) and received through the network device (1001). In one embodiment, the user/client (1000) may include any entity that accesses the data management platform of the present invention. That is, in this drawing the user/client (1000) may access the data management platform using the network device (1001), and the type of network device (1001) is not restricted.

The network device (1001) relays data between the user/client (1000) and the computing server (1004). Here, the network device (1001) may include a router, a TAP (Test Access Point), a switch, and the like. A router forwards data using IP addresses, and a switch forwards data using MAC addresses. A network device (1001) such as a router can use SPAN (Switched Port Analyzer) mode to mirror only a specific port and deliver the mirrored packets to the data management platform (10000). This is described in detail below.

TAP(Test Access Point)은 네트워크 상에서 데이터를 수집하는 용도로 사용될 수 있다. 보다 상세하게는, TAP은 네트워크 장치(1001) 중 하나로, 네트워크의 백-본(back-bone) 라인에 추가되어 미러링만 전문적으로 해주는 장치에 대응한다. 즉, Network TAP(Test Access Point, 이하, TAP)을 통해 본 발명의 데이터 관리 플랫폼(10000)에 패킷을 전달할 수 있다. 이에 따라, 네트워크 상에서 송수신되는 데이터 패킷의 흐름에 영향을 주지 않고 패킷을 복사하여 데이터 관리 플랫폼에 전달할 수 있다.A Test Access Point (TAP) can be used to collect data on a network. More specifically, a TAP is a network device (1001) that is added to the network backbone line and corresponds to a device that specializes in mirroring. In other words, packets can be transmitted to the data management platform (10000) of the present invention through a Network Test Access Point (TAP, hereinafter referred to as "TAP"). Accordingly, packets can be copied and transmitted to the data management platform without affecting the flow of data packets transmitted and received on the network.

스토리지/데이터베이스(1003)는 데이터를 저장하고, 관리할 수 있다. 스토리지는 주로 하드 디스크나 SSD를 이용하여 데이터를 저장하고, 데이터베이스는 구조화된 데이터를 관리하며, 검색 및 수정 등의 작업을 수행할 수 있다.Storage/database (1003) can store and manage data. Storage primarily uses hard disks or SSDs to store data, while the database manages structured data and can perform tasks such as searching and modifying it.

The computing server (1004) is responsible for data processing. It processes the tasks requested by the client (1000) and returns the results to the client (1000). To do so, it can perform computation using hardware such as a central processing unit (CPU) and RAM. The computing server (1004) also controls the input and output of various data and can store data processed by the data management platform (10000) in the storage/database (1003). The functions of the data management platform (10000) of the present invention can be performed by the processor of the computing server (1004). In addition, the computing server (1004) may include a system manager that monitors and controls the status of the hardware components or of the modules within the data management platform.

이하, 본 발명을 실시하기 위하여 본 도면의 하드웨어 구성요소가 사용될 수 있으며, 상기 하드웨어 구성 요소 간의 데이터 처리(processing) 방법이 포함됨은 물론이다.Hereinafter, the hardware components of this drawing can be used to implement the present invention, and of course, a data processing method between the hardware components is included.

도 2는 본 발명의 데이터 관리 장치의 일 실시 예를 개시하는 도면이다.FIG. 2 is a drawing disclosing one embodiment of a data management device of the present invention.

The embodiment of this drawing illustrates a data management device and includes the physical devices and logical components used to describe it.

본 발명은 일 실시 예에서 SaaS(Software as a Service) 플랫폼을 통해 사용자에게 제공될 수 있다. SaaS 플랫폼은 클라우드 컴퓨팅 기술을 이용하여 네트워크를 통해 사용자에게 서비스로 제공되는 소프트웨어를 뜻한다. 이를 위하여, 스토리지/데이터베이스(1003), 컴퓨팅 서버(1004) 및 컨테이너 플랫폼(1005)은 사용자/클라이언트(1000)가 데이터 관리 플랫폼(10000) 및 데이터 관리 소프트웨어 패키지(20000)를 클라우드 상에서 이용할 수 있도록 지원할 수 있다.In one embodiment, the present invention may be provided to users via a Software as a Service (SaaS) platform. A SaaS platform refers to software provided as a service to users via a network using cloud computing technology. To this end, a storage/database (1003), a computing server (1004), and a container platform (1005) may support users/clients (1000) to utilize a data management platform (10000) and a data management software package (20000) in the cloud.

본 발명은 데이터 처리를 위하여 사용자/클라이언트(1000), 스토리지/데이터베이스(1003), 애플리케이션 서버(1002), 컴퓨팅 서버(1004), 컨테이너 플랫폼(1005), 데이터 관리 플랫폼(10000) 및 데이터 관리 소프트웨어 패키지(20000)를 사용할 수 있다.The present invention can use a user/client (1000), a storage/database (1003), an application server (1002), a computing server (1004), a container platform (1005), a data management platform (10000), and a data management software package (20000) for data processing.

이때, 스토리지/데이터베이스(1003) 및 컴퓨팅 서버(1004)는 하드웨어일 수 있으며, 애플리케이션 서버(1002), 컨테이너 플랫폼(1005), 데이터 관리 플랫폼(10000) 및 데이터 관리 소프트웨어 패키지(20000)는 소프트웨어에 대응할 수 있다. 하드웨어에 대하여는 상술한 내용을 참고하도록 하고, 이 도면을 참조하여 데이터 관리 장치의 실시 예를 설명하면 다음과 같다.At this time, the storage/database (1003) and computing server (1004) may be hardware, and the application server (1002), container platform (1005), data management platform (10000), and data management software package (20000) may correspond to software. For hardware, refer to the above-described content, and with reference to this drawing, an embodiment of the data management device will be described as follows.

The user/client (1000) can access the data management software package (20000) for data processing.

컨테이너 플랫폼(1005)은 OS(Operating System), 컨테이너(container), 도커(docker) 등으로 구성되어 데이터 처리를 위한 가상 환경을 제공할 수 있다.The container platform (1005) can provide a virtual environment for data processing by being composed of an operating system (OS), a container, and docker.

The data management platform (10000) can control at least one engine or module included in the data management software package (20000). To this end, the data management platform (10000) can manage data using technologies such as an internal database (here, the database refers to the internal database within the data management software package (20000)), storage, and a distributed file system. In addition, the data management platform (10000) can include a system manager or management console for managing at least one engine or module included in the data management software package (20000).

The data management software package (20000) may include at least one of a collection module (20001), an analysis module (20002), a key management module (20003), a personal information management module (20004), a monitoring module (20005), and an AI engine (20006). However, the modules and engines included in the data management software package (20000) are not essential components; they are elements used to explain the present invention. Modules with other names that carry out the embodiments may therefore also be included.

수집 모듈(20001)은 다양한 소스에서 데이터를 수집하고 데이터를 처리 파이프 라인으로 전송할 수 있다. 수집 모듈(20001)은 로그, 이벤트, 센서, 웹 서버 등 다양한 소스에서 데이터(예를 들어, 패킷)를 수집할 수 있다. 특히, 수집 모듈(20001)은 로그 수집을 위해 에이전트를 사용할 때 에이전트를 중앙 관리할 수 있다.The collection module (20001) can collect data from various sources and transmit the data to a processing pipeline. The collection module (20001) can collect data (e.g., packets) from various sources, such as logs, events, sensors, and web servers. In particular, the collection module (20001) can centrally manage agents when using agents for log collection.

The analysis module (20002) can be used to analyze data and derive valuable insights. The analysis module (20002) can analyze collected packets to extract the data they contain. If the extracted data is personal information, functions related to personal information protection can be provided through the personal information management module (20004). In addition, the extracted data can be mapped to previously collected action information (for example, view, delete, add, change, and print).

키 관리 모듈(20003)은 데이터 암호화 및 복호화를 위한 키를 생성, 저장 및 관리할 수 있다. 키 관리 모듈(20003)은 토큰, 대칭 키, 공개 키, 디지털 인증서 등의 기술을 사용하여 키를 관리할 수 있다. 키 관리 모듈(20003)은 데이터에 개인정보가 포함되어 있는 경우, 개인정보를 암호화 및 복호화를 위한 키를 생성, 저장 및 관리할 수 있다. 또한, 키 관리 모듈(20003)은 데이터에 개인정보가 포함되어 있지 않더라도 데이터의 보안을 위하여 토큰, 대칭 키, 공개 키, 디지털 인증서 등의 기술을 사용할 수 있다.The key management module (20003) can generate, store, and manage keys for data encryption and decryption. The key management module (20003) can manage keys using technologies such as tokens, symmetric keys, public keys, and digital certificates. If the data contains personal information, the key management module (20003) can generate, store, and manage keys for encrypting and decrypting the personal information. In addition, the key management module (20003) can use technologies such as tokens, symmetric keys, public keys, and digital certificates to ensure data security even if the data does not contain personal information.

개인정보 관리 모듈(20004)은 개인정보 보호와 관련된 기능을 제공할 수 있다. 개인정보 관리 모듈(20004)은 데이터에 개인정보가 포함되어 있는 경우, 개인정보의 수집, 추출, 암호화, 저장, 처리, 검색, 삭제 등을 제어할 수 있다.The personal information management module (20004) can provide functions related to personal information protection. If the data contains personal information, the personal information management module (20004) can control the collection, extraction, encryption, storage, processing, retrieval, and deletion of such personal information.

모니터링 모듈(20005)은 데이터의 검색 및 탐지를 수행할 수 있다. 모니터링 모듈(20005)은 데이터 처리 및 데이터 처리 환경을 모니터링하고 문제를 식별할 수 있다. 또한, 모니터링 모듈(20005)은 로그, 성능 지표, 이벤트 등을 모니터링하고 경고를 생성할 수 있다.The monitoring module (20005) can perform data retrieval and detection. It can monitor data processing and the data processing environment and identify issues. Furthermore, it can monitor logs, performance indicators, events, and generate alerts.

AI 엔진(20006)은 인공지능 기술(기계 학습을 포함한다.)을 이용하여 데이터 처리 및 데이터 분석 작업을 수행할 수 있다. 특히, 수집되어 저장된 로그에 텍스트가 포함되어 있는 경우, AI 엔진(20006)은 수집된 텍스트를 인공지능을 기반으로 행위를 분류할 수 있다.The AI engine (20006) can perform data processing and analysis tasks using artificial intelligence technology (including machine learning). Specifically, if the collected and stored logs contain text, the AI engine (20006) can classify the collected text into actions based on artificial intelligence.

도 3은 본 발명의 데이터 관리 플랫폼의 일 실시 예를 개시하는 도면이다.FIG. 3 is a diagram disclosing one embodiment of the data management platform of the present invention.

The embodiment of this drawing illustrates a data management platform (10000) and includes physical devices and logical components. In particular, the data management platform (10000) in this drawing may be broader in scope than the data management platform described above. For example, the data management platform (10000) can include at least one of the modules implemented in the data management software package (20000) described above, and can include an application programming interface (API) that runs on a physical device. For the physical devices, refer to the description above.

데이터 관리 플랫폼(10000)은 컴퓨팅 서버(1004)와 스토리지/데이터베이스(1003)의 리소스(resource)를 이용하여 데이터 관리 플랫폼(10000) 내에 포함된 모듈의 기능을 수행할 수 있다. 이때, 시스템 매니저는 데이터 관리 플랫폼(10000) 내의 모듈 또는 엔진 중 적어도 하나를 제어할 수 있으며, 시스템 매니저는 컴퓨팅 서버(1004) 내에 위치하거나 별도로 위치하여 데이터 관리 플랫폼(10000)을 제어할 수 있다.The data management platform (10000) can perform the functions of the modules included in the data management platform (10000) by utilizing the resources of the computing server (1004) and the storage/database (1003). At this time, the system manager can control at least one of the modules or engines within the data management platform (10000), and the system manager can be located within the computing server (1004) or located separately to control the data management platform (10000).

데이터 관리 플랫폼(10000)은 수집 모듈(20001), 분석 모듈(20002), 키 관리 모듈(20003), 개인정보 관리 모듈(20004), 모니터링 모듈(20005) 및 AI 엔진(20006) 중 적어도 하나를 포함할 수 있다. 각각의 모듈에 대한 설명은 상술한 바와 같다.The data management platform (10000) may include at least one of a collection module (20001), an analysis module (20002), a key management module (20003), a personal information management module (20004), a monitoring module (20005), and an AI engine (20006). The description of each module is as described above.

데이터 관리 플랫폼(10000)은 사용자/클라이언트(1000)와 데이터를 송수신하며, 송수신된 데이터에 대하여 수집 모듈(20001), 분석 모듈(20002), 키 관리 모듈(20003), 개인정보 관리 모듈(20004), 모니터링 모듈(20005) 및 AI 엔진(20006)이 수행하는 적어도 하나의 기능을 적용할 수 있다. 이때, 데이터 관리 플랫폼(10000)은 사용자/클라이언트(1000)의 요청에 의해 데이터 관리 플랫폼(10000) 내에 포함된 개별적인 모듈을 단독으로 사용할 수 있다. 예를 들어, 사용자/클라이언트(1000)는 데이터 관리 플랫폼(10000) 내의 키 관리 모듈(20003) 또는 개인정보 관리 모듈(20004)의 기능만을 선택적으로 사용할 수 있다.The data management platform (10000) transmits and receives data with a user/client (1000), and can apply at least one function performed by a collection module (20001), an analysis module (20002), a key management module (20003), a personal information management module (20004), a monitoring module (20005), and an AI engine (20006) to the transmitted and received data. At this time, the data management platform (10000) can independently use individual modules included in the data management platform (10000) at the request of the user/client (1000). For example, the user/client (1000) can selectively use only the functions of the key management module (20003) or the personal information management module (20004) within the data management platform (10000).

또한, 도면에 도시되지는 않았으나 데이터 관리 플랫폼(10000)은 내부에 포함된 모듈의 기능을 수행하기 위하여 외부 스토리지/데이터베이스(1003)과는 다른 내부 데이터베이스를 사용할 수 있다.Additionally, although not shown in the drawing, the data management platform (10000) may use an internal database different from the external storage/database (1003) to perform the functions of the modules included therein.

도 4는 본 발명의 데이터 관리 방법의 일 실시 예를 개시하는 도면이다.FIG. 4 is a drawing disclosing one embodiment of a data management method of the present invention.

단계(S1010)에서, 패킷을 수집할 수 있다. 일 실시 예에서, 패킷을 수집하는 방법은 에이전트(agent) 방식의 클라우드 기반의 패킷 수집 방법 또는 패킷 미러 방식의 패킷 수집 방법을 이용할 수 있다. 자세한 설명은 후술하도록 한다.In step (S1010), packets can be collected. In one embodiment, the method for collecting packets may utilize an agent-based, cloud-based packet collection method or a packet mirror-based packet collection method. A detailed description will be provided below.

단계(S1020)에서, 수집된 패킷에 필터를 적용할 수 있다. 보다 상세하게는, 본 발명의 데이터 관리 방법은 수집된 패킷을 재조합하고 필터링하여 분석 프로세스로 분배할 수 있다.In step (S1020), a filter may be applied to the collected packets. More specifically, the data management method of the present invention can reassemble and filter the collected packets and distribute them to an analysis process.

단계(S1030)에서, 필터가 적용된 패킷을 분석할 수 있다.In step (S1030), packets to which a filter is applied can be analyzed.

여기에서, 패킷의 기반이 되는 프로토콜의 종류에 따라 다르게 분석할 수 있다. 보다 상세하게는, 단계(S1031)에서, RFC(Remote Function Call) 프로토콜 기반의 패킷을 분석하고, 단계(S1032)에서, GUI 프로토콜 기반의 패킷을 분석하고, 단계(S1033)에서, HTTP/HTTPS 기반의 패킷을 분석할 수 있다.Here, packets can be analyzed differently depending on the type of protocol they are based on. More specifically, in step (S1031), packets based on the RFC (Remote Function Call) protocol can be analyzed, in step (S1032), packets based on the GUI protocol can be analyzed, and in step (S1033), packets based on HTTP/HTTPS can be analyzed.

단계(S1031)에서, RFC 프로토콜 기반의 패킷 분석은 네트워크에서 RFC 프로토콜을 사용하는 패킷을 분석하는 과정을 의미한다. RFC는 분산 환경에서 서로 다른 시스템 또는 컴퓨터 간에 함수 호출을 수행하기 위한 프로토콜과 메커니즘으로, RFC는 클라이언트와 서버 모델을 기반으로 작동하며 클라이언트가 원격 시스템에 있는 서버의 함수를 호출하여 원격에서 실행할 수 있도록 한다. 즉, RFC 프로토콜은 원격 함수 호출을 위한 통신 프로토콜이기 때문에 RFC 프로토콜을 사용하는 패킷은 이러한 원격 함수 호출에 대한 정보를 담고 있다.In step (S1031), packet analysis based on the RFC protocol refers to the process of analyzing packets using the RFC protocol on a network. RFC is a protocol and mechanism for performing function calls between different systems or computers in a distributed environment. RFC operates based on a client-server model, allowing a client to call a function on a server in a remote system and execute it remotely. In other words, since the RFC protocol is a communication protocol for remote function calls, packets using the RFC protocol contain information about these remote function calls.

단계(S1032)에서, GUI 프로토콜 기반의 패킷 분석은 클라이언트와 애플리케이션 서버 간에 통신하는 GUI 프로토콜 기반의 패킷을 수집하여, 패킷에 포함된 클라이언트 IP또는 Port, 서버 IP 또는 Port, 패킷 데이터(byte stream) 등을 추출하는 방식이다.In step (S1032), packet analysis based on the GUI protocol collects packets based on the GUI protocol that communicate between the client and the application server, and extracts the client IP or Port, server IP or Port, packet data (byte stream), etc. included in the packets.

단계(S1033)에서, HTTP/HTTPS 기반의 패킷 분석은 웹 브라우저와 서버간 송수신하는 패킷을 미러링하거나 SSL 프로세싱하여 패킷에 포함된 데이터를 추출하는 방식이다.In step (S1033), HTTP/HTTPS-based packet analysis is a method of extracting data contained in packets by mirroring or SSL processing packets transmitted and received between a web browser and a server.

각각에 대한 자세한 분석 방법은 후술하도록 한다.Detailed analysis methods for each will be described later.
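
As an illustration only, the per-protocol branching of steps S1031 to S1033 could be sketched as a dispatch table. The Packet type, the protocol labels, and the analyzer functions are hypothetical names introduced for this sketch; the specification does not define them, and real analyzers would decode the payload rather than report its size.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    protocol: str   # "RFC", "GUI", "HTTP" or "HTTPS" (labels assumed for illustration)
    client_ip: str
    client_port: int
    server_ip: str
    server_port: int
    payload: bytes  # reassembled byte stream


def analyze_rfc(packet):
    # S1031: a real analyzer would decode the remote function call here.
    return {"step": "S1031", "payload_size": len(packet.payload)}


def analyze_gui(packet):
    # S1032: pull out the endpoints and the raw byte stream.
    return {
        "step": "S1032",
        "client": (packet.client_ip, packet.client_port),
        "server": (packet.server_ip, packet.server_port),
        "data": packet.payload,
    }


def analyze_http(packet):
    # S1033: HTTPS payloads are assumed to be decrypted upstream already.
    return {"step": "S1033", "payload_size": len(packet.payload)}


ANALYZERS = {
    "RFC": analyze_rfc,
    "GUI": analyze_gui,
    "HTTP": analyze_http,
    "HTTPS": analyze_http,
}


def analyze(packet):
    """Route a filtered packet to the analyzer for its protocol (S1030)."""
    handler = ANALYZERS.get(packet.protocol.upper())
    return handler(packet) if handler else None
```

A packet whose protocol has no registered analyzer returns None here; whether such packets are dropped or logged is a design choice the text leaves open.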

단계(S1040)에서, 분석된 패킷으로부터 개인정보 메타데이터를 활용하여 개인정보를 추출할 수 있다. 일 실시 예에서, 분석된 패킷에 개인정보가 포함되어 있는지 확인하기 위하여, 저장된 개인정보 메타데이터를 사용하여 개인정보를 추출할 수 있다. 이에 대하여는, 후술하도록 한다.In step (S1040), personal information can be extracted from the analyzed packets using personal information metadata. In one embodiment, to determine whether the analyzed packets contain personal information, the personal information can be extracted using stored personal information metadata. This will be described later.

단계(S1050)에서, 로그를 저장할 수 있다. 이때, 분석된 패킷에 포함된 유의미한 정보는 로그로 저장될 수 있다. 일 실시 예에서, 데이터 관리 방법은 분석된 정보를 감사 로그(Audit log)에 저장할지 여부를 결정할 수 있다. 또한, 개인정보가 포함된 경우, 개인정보는 패턴화 되어 데이터베이스에 저장될 수 있다. 마지막으로, 데이터 관리 방법은 로그(log) 저장 속도를 높이기 위해 멀티 쓰레딩(multithreading) 방식으로 동작하며, 일시적으로 데이터베이스에 접근이 되지 않는 경우에 대비하여 메모리 큐잉(queuing) 및 파일 큐잉을 수행할 수 있다.In step S1050, a log can be stored. At this time, meaningful information contained in the analyzed packets can be stored as a log. In one embodiment, the data management method can determine whether to store the analyzed information in an audit log. Additionally, if personal information is included, the personal information can be patterned and stored in a database. Finally, the data management method operates in a multithreaded manner to increase the log storage speed, and can perform memory queuing and file queuing in case the database is temporarily unavailable.
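
The multithreaded log storage with memory queuing and file queuing described for step S1050 might look roughly like the following sketch. The LogWriter class, the store callable, and the JSON-lines spool file are assumptions made for illustration; the specification does not say how the queues are implemented or how a database outage is signalled (a ConnectionError is assumed here).

```python
import json
import os
import queue
import threading


class LogWriter:
    """Worker threads drain an in-memory queue; failed writes spill to a file."""

    def __init__(self, store, spool_path, workers=2):
        self.store = store            # callable(record); assumed to raise ConnectionError on outage
        self.spool_path = spool_path  # file queue used while the database is unreachable
        self.pending = queue.Queue()  # memory queue
        self.file_lock = threading.Lock()
        for _ in range(workers):
            threading.Thread(target=self._work, daemon=True).start()

    def submit(self, record):
        self.pending.put(record)

    def _work(self):
        while True:
            record = self.pending.get()
            try:
                self.store(record)
            except ConnectionError:
                # Database temporarily unavailable: append to the file queue.
                with self.file_lock:
                    with open(self.spool_path, "a", encoding="utf-8") as f:
                        f.write(json.dumps(record) + "\n")
            finally:
                self.pending.task_done()

    def drain(self):
        """Block until every submitted record was stored or spooled."""
        self.pending.join()

    def replay(self):
        """Store spooled records again; returns how many were replayed."""
        with self.file_lock:
            if not os.path.exists(self.spool_path):
                return 0
            with open(self.spool_path, encoding="utf-8") as f:
                records = [json.loads(line) for line in f if line.strip()]
            for record in records:
                self.store(record)
            os.remove(self.spool_path)
        return len(records)
```

In this sketch the spool file is removed only after every spooled record has been stored, so a failure during replay leaves the file in place for a later retry, at the cost of possible duplicate writes.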

단계(S1060)에서, 저장된 로그를 이용하여 이상행위를 감지할 수 있다. 이때, 이상행위가 감지된 경우, 이상행위 감지에 대한 정보를 기록한 새로운 로그를 생성하여 다시 단계(S1050)를 통해 로그를 저장할 수 있다. 또한, 데이터 관리 방법은 사용자로부터 감사 로그(Audit Log)를 요청받는 경우, 수집된 데이터의 화면을 재구현할 수 있다. 보다 상세하게는, GUI 프로토콜은 그래픽 사용자 인터페이스를 표시하고 상호 작용하는데 사용되는 프로토콜로, 이러한 프로토콜을 분석함으로써 사용자의 작업 흐름, 입력, 출력 등을 시각적으로 이해할 수 있으며 시스템의 동작 상태를 파악하고 문제를 진단하는데 도움을 줄 수 있다. 이에 대하여는 이하의 도면에서 자세히 설명하도록 한다.In step (S1060), abnormal behavior can be detected using the stored log. At this time, if abnormal behavior is detected, a new log recording information about the abnormal behavior detection can be created and the log can be saved again through step (S1050). In addition, the data management method can re-implement the screen of the collected data when an audit log is requested from the user. More specifically, the GUI protocol is a protocol used to display and interact with a graphical user interface. By analyzing such protocol, the user's work flow, input, output, etc. can be visually understood, and the operating status of the system can be identified and problems can be diagnosed. This will be described in detail in the drawings below.

도 5는 본 발명의 분석된 패킷에 포함된 데이터 중 개인정보를 추출하는 일 실시 예를 개시하는 도면이다.FIG. 5 is a diagram disclosing an embodiment of extracting personal information from data included in an analyzed packet of the present invention.

일 실시 예에서, 데이터 관리 플랫폼(10000)은 개인정보 메타데이터를 사용하여 로그 데이터(log data)와 인덱스(index)를 포함하는 감사 로그(Audit log, 1033)로부터 개인정보를 추출할 수 있다.In one embodiment, the data management platform (10000) can extract personal information from an audit log (1033) including log data and an index using personal information metadata.

보다 상세하게는, 감사 로그(1033)는 네트워크 상에서 발생하는 이벤트들을 기록한 로그의 집합으로, 로그 데이터(log data) 및 인덱스(index)를 포함할 수 있다. 여기에서, 감사 로그(1033)는 인덱스 처리가 완료된 로그 데이터의 집합일 수 있다.More specifically, the audit log (1033) is a collection of logs that record events occurring on a network and may include log data and an index. Here, the audit log (1033) may be a collection of log data for which index processing has been completed.

데이터 관리 플랫폼(10000)은 감사 로그에서 개인 정보를 추출하기 위하여 개인정보 메타데이터를 사용할 수 있다. 여기에서, 개인정보 메타데이터는 개인정보가 어떤 형태로 저장되어 있는지, 어떤 필드에 저장되어 있는지, 개인정보 유형 별로 사용되는 정규식 패턴 및 마스킹 패턴 등이 정의되어 있다. 예를 들어, 이름, 주소, 전화번호 등의 개인정보 유형에 대한 메타데이터를 구성할 수 있다. 이에 따라, 분석 모듈은 개인정보 메타데이터를 기반으로 로그 데이터에서 개인정보를 식별하고 추출할 수 있다.The data management platform (10000) can use personal information metadata to extract personal information from audit logs. Here, personal information metadata defines the format in which personal information is stored, the fields in which it is stored, and the regular expression patterns and masking patterns used for each type of personal information. For example, metadata can be configured for personal information types such as name, address, and phone number. Accordingly, the analysis module can identify and extract personal information from log data based on the personal information metadata.
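
A minimal sketch of metadata-driven extraction follows, assuming the metadata is held as a list of rules that each name the personal information type, the fields to scan, a regular expression pattern, and a masking setting. The rule layout, the field names, and the two patterns are illustrative assumptions, not the formats defined by the specification.

```python
import re

# Hypothetical personal information metadata: type, fields to scan,
# regular expression pattern, and how many leading characters stay visible.
PII_METADATA = [
    {
        "type": "resident_registration_number",
        "fields": ("event_value",),
        "regex": r"\b\d{6}-\d{7}\b",
        "keep": 6,
    },
    {
        "type": "phone_number",
        "fields": ("event_value", "title_main"),
        "regex": r"\b01\d-\d{3,4}-\d{4}\b",
        "keep": 3,
    },
]


def mask(value, keep):
    """Keep the first `keep` characters and star out later letters and digits."""
    tail = "".join("*" if ch.isalnum() else ch for ch in value[keep:])
    return value[:keep] + tail


def extract_personal_info(log_record):
    """Return one finding per match of any metadata rule in the log record."""
    findings = []
    for rule in PII_METADATA:
        pattern = re.compile(rule["regex"])
        for field in rule["fields"]:
            for match in pattern.finditer(log_record.get(field, "")):
                findings.append({
                    "type": rule["type"],
                    "field": field,
                    "value": match.group(),
                    "masked": mask(match.group(), rule["keep"]),
                })
    return findings
```

For example, a record whose event_value contains "900101-1234567" yields a finding whose masked form is "900101-*******". Because the patterns live in data rather than code, new personal information types can be supported by adding a rule.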

일 실시 예에서, 추출된 개인정보는 개인정보 유형 별로 기 설정된 방식에 따라 암호화되어 다시 감사 로그(1033) 또는 데이터 관리 플랫폼(10000) 내부 데이터베이스 상에 저장될 수 있다. 자세한 내용은 후술하도록 한다.In one embodiment, the extracted personal information may be encrypted according to a preset method for each type of personal information and stored again in an audit log (1033) or an internal database of the data management platform (10000). Further details will be provided below.

일 실시 예에서, 데이터 관리 플랫폼(10000)은 추출된 개인 정보를 사용하여 특정 기간 동안의 로그 데이터를 검색할 수 있고, 관리자나 사용자가 필요할 때 검색하거나 검색된 로그 데이터를 확인할 수 있다. 이에 대하여는 후술하도록 한다.In one embodiment, the data management platform (10000) can use the extracted personal information to search log data for a specific period of time, and allow administrators or users to search or check the searched log data when necessary. This will be described later.

도 6은 본 발명의 분석된 패킷에 포함된 데이터를 시각화한 일 실시 예를 개시하는 도면이다.FIG. 6 is a diagram disclosing an embodiment of visualizing data included in an analyzed packet of the present invention.

상술한 실시 예를 통하여 분석된 패킷에 포함되는 데이터는 다음과 같다.The data included in the packet analyzed through the above-described example is as follows.

(1) Session information: start time, duration time, Log ID, Session ID, Context ID

(2) Connection information: server IP, server port, server MAC, client IP, client port, client MAC

(3) SAP information: SID, protocol, SAP instance, client

(4) Program information: OK Code, T-code, Title App, Title Main

(5) CUA (Central User Administration) information: CUA Name, CUA Status

(6) User information: user ID, user UID, user name

(7) Event information: event category, event code, event name, event description, event value, notification level

(8) User-defined information: event, alert, event type, number of events, number of alerts, architecture, presence of personal information, number of personal information types, number of personal information items

일 실시 예에서, 데이터 관리 플랫폼은 분석된 패킷으로부터 위와 같은 데이터를 추출할 수 있다. 이때, 데이터 관리 플랫폼은 패킷으로부터 추출된 데이터에 개인정보가 포함된 경우, 개인정보에 대한 암호화를 수행하여 저장할 수 있다. 특히, 데이터 관리 플랫폼은 개인정보에 대하여는 별도의 화면으로 출력할 수 있다.In one embodiment, the data management platform can extract data, such as the above, from analyzed packets. If the data extracted from the packets contains personal information, the data management platform can encrypt and store the personal information. In particular, the data management platform can display the personal information on a separate screen.

In this way, every item required by law and by certification (account information, access date and time, access origin, information on the data subject whose data was processed, and the task performed) can be collected. More specifically, account information can be determined from the user ID, employee number, and organization name in the user information; the access date and time from the start time in the session information; the access origin from the client IP or port in the connection information; the data subject whose data was processed from the user-defined information (presence of personal information, number of personal information items, resident registration number, alien registration number, passport information, card information, and so on); and the task performed from the program information and the user-defined information.
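
The mapping from extracted packet data to the required audit items can be sketched as below. The nested dictionary layout and key names such as user_id, start_time, and t_code are assumptions chosen to mirror the data groups listed above, and only one representative field is used per item.

```python
REQUIRED_ITEMS = ("account", "accessed_at", "access_origin", "data_subject", "task")


def build_audit_record(data):
    """Derive the five required audit items from the extracted data groups."""
    user = data.get("user", {})
    session = data.get("session", {})
    connection = data.get("connection", {})
    program = data.get("program", {})
    custom = data.get("custom", {})
    return {
        "account": user.get("user_id"),                   # user information
        "accessed_at": session.get("start_time"),         # session information
        "access_origin": connection.get("client_ip"),     # connection information
        "data_subject": custom.get("personal_info_types"),  # user-defined information
        "task": program.get("t_code"),                    # program information
    }


def missing_items(record):
    """List the required items that could not be derived."""
    return [item for item in REQUIRED_ITEMS if record.get(item) is None]
```

Calling missing_items on each record gives a quick completeness check before the record is written to the audit log.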

도 7은 본 발명의 분석된 패킷에 포함된 데이터를 검색하는 일 실시 예를 개시하는 도면이다.FIG. 7 is a diagram disclosing one embodiment of searching data included in an analyzed packet of the present invention.

일 실시 예에서, 데이터 관리 플랫폼은 상술한 데이터들을 검색하여 사용자에게 검색 결과를 제공할 수 있다.In one embodiment, the data management platform can search the data described above and provide search results to the user.

이를 위하여, 본 도면의 (a)와 같이 데이터 관리 플랫폼은 사용자가 쉽게 검색할 수 있도록 검색 인터페이스를 제공할 수 있다. 데이터 관리 플랫폼은 분석된 데이터의 필드를 기준으로 데이터를 검색할 수 있다. 예를 들어, 데이터 필드는 프로토콜, 시스템, 계정, 사번, 서버 IP, 서버 Port, 클라이언트 IP, 클라이언트 Port, 인스턴스명, 트랜잭션명, 프로그램명 등을 포함할 수 있다. 이때, 데이터 관리 플랫폼은 사용자로부터 입력 받은 필드 중 적어도 하나를 기준으로 데이터를 검색하고, 검색 결과를 출력할 수 있다.To this end, the data management platform can provide a search interface, as shown in (a) of the drawing, to enable users to easily search. The data management platform can search data based on fields of the analyzed data. For example, data fields may include protocol, system, account, employee number, server IP, server port, client IP, client port, instance name, transaction name, program name, etc. In this case, the data management platform can search data based on at least one of the fields input by the user and output the search results.

In another embodiment, as shown in (b) of the drawing, the data management platform may provide a query search function. For example, if a user enters protocol_type=GUI as a first query and server_ip=175.117.145.125 as a second query, the data management platform may search the data based on the first query and the second query.
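The query search described above can be illustrated with a minimal sketch. It assumes that each query is a `field=value` string and that multiple queries are combined with AND semantics; the description does not state how the first and second queries are combined, and the record layout, the `parse_query` and `search` names, and the sample accounts are hypothetical.

```python
from typing import Dict, List, Tuple


def parse_query(query: str) -> Tuple[str, str]:
    """Split a 'field=value' query string into (field, value)."""
    field, _, value = query.partition("=")
    return field.strip(), value.strip()


def search(records: List[Dict[str, str]], queries: List[str]) -> List[Dict[str, str]]:
    """Return the records that satisfy every query (assumed AND semantics)."""
    conditions = [parse_query(q) for q in queries]
    return [r for r in records if all(r.get(f) == v for f, v in conditions)]


# Hypothetical analyzed records; only the field names come from the description.
records = [
    {"protocol_type": "GUI", "server_ip": "175.117.145.125", "account": "user01"},
    {"protocol_type": "HTTP", "server_ip": "175.117.145.125", "account": "user02"},
    {"protocol_type": "GUI", "server_ip": "10.0.0.7", "account": "user03"},
]

hits = search(records, ["protocol_type=GUI", "server_ip=175.117.145.125"])
```

With the sample data, only the first record satisfies both queries.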

In one embodiment, the search results may be output in the “screen reproduction” form or the “data” form, as described above.

FIG. 8 is a diagram illustrating an embodiment in which the data management platform of the present invention stores and monitors audit logs.

In one embodiment, the data management platform (10000) may generate an audit log (1033) by converting and processing the original data, extracted according to the embodiments described above, according to defined field rules. The generated audit log (1033) may be stored in a database (20007).

In addition, the data management platform (10000) may extract personal information from the processed data using personal information metadata. Accordingly, the data management platform (10000) may store the audit log (1033) and the personal information separately in the database (20007). To this end, the analysis module (20002) may further include an audit log storage unit (2008).

In addition, the analysis module (20002) may further include a correlation analysis rule generation unit (2009). The data management platform (10000) may generate correlation analysis rules through the analysis module (20002). Here, a correlation rule is a rule for determining abnormal behavior. For example, the data management platform (10000) may treat an input rate of 1 to 2 records per second as normal behavior and an input rate of 10 or more records per second as abnormal behavior. The term “correlation” here refers to the correlation between logs. The data management platform (10000) may generate a further event (log) according to a correlation rule, and that log may be defined as an incident. That is, a subject who wants to monitor through the data management platform (10000) creates a correlation rule, and the data management platform (10000) analyzes the logs based on that rule and generates another log, namely an incident.
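As an illustration of the rate-based example above, the following sketch counts audit-log records per one-second window and emits an incident record when the count reaches 10. Grouping by account, the `timestamp` field, and the shape of the incident record are assumptions made for the example; the description only gives the thresholds.

```python
from collections import Counter
from typing import Dict, Iterable, List

ABNORMAL_THRESHOLD = 10  # records per second, taken from the example in the text


def detect_incidents(logs: Iterable[Dict], threshold: int = ABNORMAL_THRESHOLD) -> List[Dict]:
    """Emit one incident per (account, second) bucket holding >= threshold records."""
    counts = Counter((log["account"], int(log["timestamp"])) for log in logs)
    return [
        {"type": "incident", "account": account, "second": second, "count": n}
        for (account, second), n in sorted(counts.items())
        if n >= threshold
    ]


# Hypothetical logs: user01 enters 12 records within one second, user02 enters 2.
logs = [{"account": "user01", "timestamp": 100.0 + i * 0.05} for i in range(12)]
logs += [
    {"account": "user02", "timestamp": 100.2},
    {"account": "user02", "timestamp": 100.7},
]

incidents = detect_incidents(logs)
```

Here only user01 produces an incident; the two records from user02 fall in the range the text treats as normal.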

In addition, a correlation analysis rule may correspond to a rule predefined by another platform. Accordingly, the data management platform (10000) may analyze the audit log (1033) using the generated correlation analysis rules and generate abnormal behavior events. Information on the generated correlation analysis rules and on the abnormal behavior events may be stored in the database (20007).

The monitoring module (20005) of the data management platform (10000) may further include an abnormal behavior monitoring unit (2010). The abnormal behavior monitoring unit (2010) may search or monitor the stored audit logs (1033), personal information, and the like based on the generated abnormal behavior events.

Accordingly, the data management platform (10000) may extract personal information from the data that has been collected, analyzed, and stored, and may compile statistics on user behavior using the extracted personal information. If a violation is then found among the collected user behavior, the data management platform (10000) may block the user or send a warning to the administrator.

FIG. 9 is a diagram illustrating an embodiment in which the data management method of the present invention distributes collected packets.

In step (S10010), the data management method may collect packets and apply a filter. The packets may be collected through network equipment such as a NIC. The data management method may then reassemble the collected packets, combining the individual network packets into a form that can be analyzed.

In one embodiment, the data management method may decide whether to analyze a packet based on filtering rules. The filtering rules may be determined by the data management platform. More specifically, the data management method may collect packets through network equipment such as the NIC or router described above. Because analyzing every collected packet could cause performance problems, the packets to be analyzed may be selected by the filtering rules. For example, packets based on the HTTP or SAP GUI protocol may need to be analyzed, while packets collected over other protocols may not. The data management method may filter the collected packets by port or IP so that only packets of the desired protocols are analyzed. The user may also set the filtering rules directly through the data management method.
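A minimal sketch of such a filtering rule follows. The `FilterRule` and `PacketMeta` types, the rule fields, and the port and IP values are hypothetical; the description only states that packets are filtered by protocol, port, or IP.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FilterRule:
    protocol: str
    port: Optional[int] = None       # None matches any server port
    server_ip: Optional[str] = None  # None matches any server IP


@dataclass(frozen=True)
class PacketMeta:
    protocol: str
    server_ip: str
    server_port: int


def should_analyze(packet: PacketMeta, rules: List[FilterRule]) -> bool:
    """Return True when at least one rule matches the packet."""
    return any(
        packet.protocol == rule.protocol
        and rule.port in (None, packet.server_port)
        and rule.server_ip in (None, packet.server_ip)
        for rule in rules
    )


# Illustrative rules: analyze HTTP on port 80 and SAP GUI traffic to one server.
rules = [
    FilterRule("HTTP", port=80),
    FilterRule("SAP_GUI", port=3200, server_ip="10.0.0.5"),
]

http_packet = PacketMeta("HTTP", "10.0.0.9", 80)
dns_packet = PacketMeta("DNS", "10.0.0.9", 53)
```

Packets that match no rule, such as `dns_packet` here, would be dropped before the more expensive protocol analysis.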

In step (S10020), the data management method may identify the protocol based on TCP session information. As described above, the data management method may classify the collected packets by protocol and analyze them accordingly.

The data management method may analyze a packet in a different way depending on the protocol identified in step (S10020). The analysis method for each protocol is as described above.

In step (S10030), the data management method may analyze RFC protocol-based packets.

In step (S10040), the data management method may analyze GUI/SNC protocol-based packets.

In step (S10050), the data management method may analyze HTTP/HTTPS-based packets.

In steps (S10030) to (S10050), the data management method may parse the packet and then apply processing rules for analysis. The data management method may analyze the binary data of the packet to extract various data (account data, SID, screen input data, screen output data, and the like). The processing rules are settings that control whether the data included in the analyzed packet is stored in the audit log (1033), whether that data is processed or logged, and whether events and warnings are generated. In addition, the data management method may apply filter rules that filter out unnecessary logs and load rules that load the data needed to apply the processing rules.

In step (S10060), the data management method may generate an audit log (1033) based on the analyzed packet. To increase the log storage speed, the data management method may operate in a multi-threaded manner, and it may perform memory queuing and file queuing in case the database becomes temporarily inaccessible.

In step (S10070), the data management method may extract personal information from the generated audit log (1033) using personal information metadata.

In step (S10080), the data management method may store the audit log (1033). The audit log (1033) and the personal information may be stored separately.

In step (S10090), the data management method may monitor abnormal behavior.

FIG. 10 is a diagram illustrating an embodiment in which the data management platform of the present invention analyzes HTTPS-based packets.

User activity in a web browser protected by HTTPS (SSL) cannot be logged with existing network traffic mirroring technology. In particular, logging is not possible when the Diffie-Hellman algorithm is used for key exchange during the HTTPS connection. Here, the Diffie-Hellman algorithm is a key exchange method that lets two parties securely agree on the shared key used for symmetric-key encryption.

However, monitoring abnormal behavior and the misuse or abuse of personal information within a company requires that user activity in the web browser be logged and monitored.

According to one embodiment of the present invention, user activity in a web browser can be logged even when the Diffie-Hellman algorithm is used for key exchange, so that abnormal behavior and the misuse or abuse of personal information within a company can be monitored.

To this end, the data management platform (10000) makes it possible to monitor user activity even when the Diffie-Hellman algorithm is used.

More specifically, the analysis module (20002) of the data management platform (10000) may further include a proxy server configuration unit (2011), an SSL configuration unit (2012), an HTTP request/response data analysis unit (2013), and a message data generation unit (2014).

Here, the proxy server configuration unit (2011) may configure a proxy server (1031) between the web browser (1030) and the web server (1034). The proxy server (1031) is a server that relays network communication between a client and a server; a client that uses the proxy server (1031) does not communicate with the web server (1034) directly but communicates with it indirectly through the proxy server (1031).

The SSL configuration unit (2012) may configure SSL on the proxy server (1031). More specifically, with the proxy server (1031) configured, the SSL configuration unit (2012) may use the client SSL settings to set up an SSL environment between the web browser (1030) and the proxy server (1031) and process HTTPS requests and responses. In one embodiment, the SSL configuration unit (2012) may use an SSL certificate for the SSL configuration. This SSL certificate may be different from the certificate used to analyze packets as described above; that is, it serves to support the SSL configuration.

The proxy server (1031) that receives the HTTPS request data may temporarily terminate the SSL connection, process the HTTP request/response data through the HTTP request/response data analysis unit (2013), and then establish the SSL connection again.

Accordingly, the message data generation unit (2014) may combine the HTTP request/response data to generate message data (1035) and store the generated message data (1035) in a queue (1032). The queue (1032) may include a memory queue and a file queue.

The data management platform (10000) may then transmit the message data (1035) stored in the queue (1032) to the outside.

Finally, the data management platform (10000) may use the server SSL settings to set up an SSL environment between the proxy server (1031) and the web server (1034), process HTTPS requests and responses, and deliver the HTTPS response data.

In this way, user activity in a web browser can be logged even when the Diffie-Hellman algorithm is used.

FIG. 11 is a diagram illustrating an embodiment in which the data management method of the present invention analyzes HTTPS-based packets.

To collect HTTPS-based packets, the data management method may configure a proxy server and set up client SSL in step (S20010), and configure the proxy server and set up server SSL in step (S20020). That is, the data management method may configure a proxy server, use the server SSL settings to set up an SSL environment between the proxy server and the web server, process HTTPS requests and responses, and deliver the HTTPS response data.

Once the proxy server is configured and SSL is set up, the data management method may collect HTTPS packets in step (S20030). Unlike the description above, the data management method of this drawing is described using the collection of HTTPS packets as an example.

An HTTPS connection uses the Diffie-Hellman algorithm for key exchange. Because existing network traffic mirroring technology cannot log user activity when packets are encrypted with keys exchanged through the Diffie-Hellman algorithm, the present invention proposes the following method.

In step (S20040), the data management method may process HTTPS requests and responses. That is, the data management method may configure a proxy server and use the client SSL settings to set up an SSL environment between the web browser and the proxy server in order to process HTTPS requests and responses. The proxy server that receives the HTTPS request data may then terminate the SSL connection, process the HTTP request/response data, and establish the SSL connection again.

In step (S20050), the data management method may combine the HTTP request/response data to generate message data (1035), filtering out unnecessary data as it does so. More specifically, data that is not needed for logging (for example, image data) may be left out of the message data (1035).
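The combination and filtering in step (S20050) might look like the sketch below. The dictionary layout of the request and response, the use of the Content-Type header to recognize image data, and the `build_message` name are assumptions for illustration; the description does not specify the message data format.

```python
from typing import Dict, Optional

# Assumption: responses whose Content-Type starts with one of these are not logged.
SKIPPED_CONTENT_PREFIXES = ("image/",)


def build_message(request: Dict, response: Dict) -> Optional[Dict]:
    """Combine one HTTP request/response pair into message data, or return None to skip it."""
    content_type = response.get("headers", {}).get("Content-Type", "")
    if content_type.startswith(SKIPPED_CONTENT_PREFIXES):
        return None  # unnecessary for logging, e.g. image data
    return {
        "method": request["method"],
        "url": request["url"],
        "request_body": request.get("body", ""),
        "status": response["status"],
        "response_body": response.get("body", ""),
    }


page_message = build_message(
    {"method": "POST", "url": "/login", "body": "id=user01"},
    {"status": 200, "headers": {"Content-Type": "text/html"}, "body": "<html></html>"},
)
image_message = build_message(
    {"method": "GET", "url": "/logo.png"},
    {"status": 200, "headers": {"Content-Type": "image/png"}},
)
```

In this sketch `image_message` is `None`, so the image exchange never reaches the queue.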

In step (S20060), the data management method may load the generated message data (1035) into at least one of a memory queue and a file queue. More specifically, the data management method may use both a memory queue and a file queue to hold the message data (1035). Using only a file queue risks lower performance, and using only a memory queue risks data loss, so the two queues may be used in combination.
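One possible way to combine the two queues is sketched below: messages go to a bounded in-memory queue first and spill to a JSON Lines file once the memory limit is reached. The description does not say how the queues are combined, so the spill-over policy, the `HybridQueue` class, and the file format are assumptions.

```python
import json
import os
import tempfile
from collections import deque
from typing import Deque, Dict, List


class HybridQueue:
    """Bounded memory queue that overflows to a file (illustrative policy)."""

    def __init__(self, spill_path: str, memory_limit: int = 1000) -> None:
        self._memory: Deque[Dict] = deque()
        self._memory_limit = memory_limit
        self._spill_path = spill_path

    def put(self, message: Dict) -> None:
        if len(self._memory) < self._memory_limit:
            self._memory.append(message)  # fast path
        else:
            # Memory is full: append to the file queue instead of dropping the message.
            with open(self._spill_path, "a", encoding="utf-8") as f:
                f.write(json.dumps(message) + "\n")

    def drain(self) -> List[Dict]:
        """Return all queued messages, memory first, then the file, and empty both."""
        messages = list(self._memory)
        self._memory.clear()
        if os.path.exists(self._spill_path):
            with open(self._spill_path, encoding="utf-8") as f:
                messages.extend(json.loads(line) for line in f if line.strip())
            os.remove(self._spill_path)
        return messages


spill_file = os.path.join(tempfile.mkdtemp(), "message_queue.jsonl")
queue = HybridQueue(spill_file, memory_limit=2)
for i in range(5):
    queue.put({"id": i})
drained = queue.drain()
```

With a memory limit of 2, the first two messages stay in memory and the remaining three are written to the file; `drain` returns all five in insertion order.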

In step (S20070), the data management method may deliver the message data (1035) loaded in the queue to a repository and store it there. Here, the repository corresponds to the database included in the data management platform described above.

FIG. 12 is a diagram illustrating an embodiment in which the data management platform of the present invention extracts and stores personal information.

Personal information is usually extracted with regular expressions. However, a regular expression alone may extract values incorrectly.

To address this, the present invention can increase the accuracy of personal information extraction through a multidimensional extraction method that uses personal information metadata and an exception handling list in addition to regular expressions.

The personal information management module (20004) included in the data management platform (10000) of the present invention may define personal information types in order to extract personal information from the audit log (1033). Here, the audit log (1033) may include log data that has already been indexed. The personal information types may include, for example, resident registration numbers, credit card numbers, and account numbers.

More specifically, the personal information type analysis unit (2019) of the personal information management module (20004) may analyze the personal information types and extraction methods used in the stored audit log (1033) in order to generate the personal information metadata (1036) used for extraction, and may store the analysis results in the personal information metadata (1036). The personal information management module (20004) may define the fields included in the personal information metadata (1036) based on the architecture type, where the architecture includes the application development environment and the screen user interface (UI). Here, the architecture type corresponds to information that identifies which parser analyzes the variables and values of the log data. In one embodiment, the personal information management module (20004) may distinguish the architecture type from the field information in the log data, using the protocol type and URL as criteria, and generate the field information of the index accordingly. In addition, the personal information metadata (1036) may include the architecture type (screen type), screen information (level 1, level 2), and personal information extraction rules.

To this end, the personal information pattern storage unit (2020) of the personal information management module (20004) may store the patterns used for each personal information type. The personal information management module (20004) may define personal information types and store the regular expression pattern and masking pattern used for each type. The personal information management module (20004) defines the types and stores the regular expression patterns per type in order to generate the information included in the personal information metadata (1036).
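A per-type pattern store of this kind could be sketched as follows. The `PersonalInfoPattern` type, the two regular expressions, and the masking templates are illustrative assumptions and not the patterns used by the invention; the sample values are made up.

```python
import re
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class PersonalInfoPattern:
    regex: str          # finds candidate values of this type
    mask_regex: str     # groups mark the parts that stay visible
    mask_template: str  # replacement applied when the value is displayed


# Hypothetical patterns for two of the types named in the description.
PATTERNS: Dict[str, PersonalInfoPattern] = {
    "resident_registration_number": PersonalInfoPattern(
        regex=r"\b\d{6}-\d{7}\b",
        mask_regex=r"(\d{6}-\d)\d{6}",
        mask_template=r"\1******",
    ),
    "credit_card_number": PersonalInfoPattern(
        regex=r"\b\d{4}-\d{4}-\d{4}-\d{4}\b",
        mask_regex=r"(\d{4})-\d{4}-\d{4}-(\d{4})",
        mask_template=r"\1-****-****-\2",
    ),
}


def find_values(info_type: str, text: str) -> List[str]:
    """Return every substring of text that matches the stored pattern for info_type."""
    return re.findall(PATTERNS[info_type].regex, text)


def mask(info_type: str, value: str) -> str:
    """Apply the stored masking pattern for info_type to a value."""
    pattern = PATTERNS[info_type]
    return re.sub(pattern.mask_regex, pattern.mask_template, value)


found = find_values("resident_registration_number", "id 900101-1234567 ok")
masked_rrn = mask("resident_registration_number", "900101-1234567")
masked_card = mask("credit_card_number", "1234-5678-9012-3456")
```

Keeping the search pattern and the masking pattern together per type means a new personal information type can be added by registering one more entry.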

In short, by defining personal information types and storing the regular expression pattern used for each type, the data management platform (10000) can generate the personal information metadata (1036). It can then generate personal information extraction rules using the personal information metadata (1036) and the personal information exception handling list (1037) described below, and extract personal information using those rules.

In addition, the personal information exception handling list generation unit (2021) of the personal information management module (20004) may generate the personal information exception handling list (1037), and the personal information extraction rule generation unit (2022) may generate the personal information extraction rules. To generate the exception handling list (1037), the personal information management module (20004) may use the personal information metadata (1036) to check which personal information items would be extracted from the audit log (1033) in a trial extraction, and may define the exception handling rules included in the exception handling list (1037) based on the extracted values or the analyzed variables.

In another embodiment, the personal information hash value collection unit (2023) of the personal information management module (20004) may collect hash values for each personal information type and generate a value filter for each type. Here, a value filter is a filter built with a Bloom filter data structure over the hash values of personal information values. The value filter has the advantage that it can quickly check whether a given value is contained in a large data set.

More specifically, the personal information management module (20004) may collect hash values of personal information values and store them by personal information type. The personal information value filter generation unit (2024) of the personal information management module (20004) may generate a value filter file for each personal information type from the collected personal information data.

Thereafter, the personal information extraction unit (2025) of the personal information management module (20004) may extract personal information, the personal information encryption unit (2026) may encrypt the extracted personal information, and the personal information storage unit (2027) may store the encrypted personal information. More specifically, the personal information management module (20004) may analyze the log data included in the audit log (1033) by architecture type to extract variables and values, and may then extract personal information by applying the personal information metadata (1036), which contains the personal information extraction rules, to the extracted values.

In one embodiment, the personal information management module (20004) may verify whether the hash value of an extracted personal information value is contained in the value filter. If the hash value is contained in the value filter, the personal information management module (20004) may store the extracted personal information in the index of the audit log (1033). If it is not, the personal information management module (20004) may conclude that the value was extracted incorrectly and discard it.
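The verification step can be sketched with a small Bloom filter, as below. The filter size, the number of hash functions, the use of SHA-256, and the names `ValueFilter`, `hash_value`, and `verify` are assumptions for the example, and the sample values are made up. A Bloom filter never reports a stored value as missing, but it can occasionally report a value that was never stored as present, so a positive result means "possibly a real value".

```python
import hashlib
from typing import Iterable, Iterator, List


def hash_value(value: str) -> str:
    """Hash a personal information value (SHA-256 is an assumption)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


class ValueFilter:
    """Minimal Bloom filter over hash values of one personal information type."""

    def __init__(self, size_bits: int = 8192, num_hashes: int = 4) -> None:
        self._size = size_bits
        self._num_hashes = num_hashes
        self._bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item: str) -> Iterator[int]:
        # Derive num_hashes bit positions by salting the item with a seed.
        for seed in range(self._num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self._size

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self._bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item: str) -> bool:
        return all(self._bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(item))


def verify(candidates: Iterable[str], value_filter: ValueFilter) -> List[str]:
    """Keep only the extracted values whose hash passes the value filter."""
    return [c for c in candidates if value_filter.might_contain(hash_value(c))]


rrn_filter = ValueFilter()
rrn_filter.add(hash_value("900101-1234567"))  # a known, previously collected value
kept = verify(["900101-1234567", "123456-7654321"], rrn_filter)
```

Values that fail the check would be treated as incorrect extractions and discarded, as the paragraph above describes.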

Later, when encrypted personal information is searched, the personal information management module (20004) may decrypt it and output the personal information after applying the masking rules.

Hereinafter, the functions performed by the personal information management module (20004) may be described as being performed by the data management platform (10000).

In this way, the accuracy of personal information extraction can be improved. The present invention is described in detail below with reference to the following drawings.

FIG. 13 is a diagram illustrating an example of the personal information metadata (1036) defined in the data management platform of the present invention.

In one embodiment, the data management platform may extract the personal information contained in an audit log using the personal information metadata (1036). To this end, the data management platform may generate and store the personal information metadata (1036).

The data management platform may define, for each architecture type, the personal information types to extract and the extraction method, and store them in the personal information metadata (1036). Here, the architecture type indicates, for example, whether the data is GUI data, JSON data, or WEBGUI data. That is, the data management platform may define the personal information types and extraction method for GUI data and store them in the personal information metadata (1036), and may likewise define the personal information types and extraction method for JSON data.

More specifically, the personal information metadata (1036) may include the architecture type, screen information (level 1), screen information (level 2), and personal information extraction rules.

Here, the architecture type corresponds to the screen type, as described above, and indicates which screen is actually displayed. In one embodiment, the personal information extraction rules may be determined based on the architecture type. The personal information metadata (1036) may also include the level 1 screen information and level 2 screen information corresponding to the architecture type.

A personal information extraction rule may include an extraction method and a value. Examples of extraction methods include value extraction, variable extraction, value content extraction, full-text regular expression extraction, and composite personal information extraction. The extraction methods are described below.

The value extraction method extracts personal information based on the extracted value itself. In this case, the data management platform of the present invention can use a personal information type list for the value of the value extraction method.

The variable extraction method extracts personal information based on a variable included in the architecture type. In this case, the data management platform of the present invention can use both a personal information type list and a variable name list for the value of the variable extraction method. Here, the variable name list is a list of variable names for each personal information type. For example, on screen A (type A) the variable name list may include "주민" (resident), and on screen B (type B) it may include "SSN".
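The variable extraction method described above can be sketched as follows. The dictionary layout, the type label `resident_registration_number`, and the function name are assumptions made for illustration; only the two variable names ("주민" on screen A, "SSN" on screen B) come from the description.

```python
# Hypothetical variable-name lists, keyed by architecture type and then by
# personal information type.
VARIABLE_NAMES = {
    "screen_A": {"resident_registration_number": {"주민"}},
    "screen_B": {"resident_registration_number": {"SSN"}},
}


def extract_by_variable(architecture_type, fields):
    """Return (pii_type, variable, value) for every parsed variable whose
    name is on the variable name list of the given architecture type."""
    names_by_type = VARIABLE_NAMES.get(architecture_type, {})
    hits = []
    for variable, value in fields.items():
        for pii_type, names in names_by_type.items():
            if variable in names:
                hits.append((pii_type, variable, value))
    return hits
```

Because the lists are scoped per architecture type, a variable named "SSN" on screen A is ignored even though it would be extracted on screen B.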

The full-text regular expression extraction method extracts personal information from the full text when no variables exist and the text cannot be parsed. In this case, the data management platform of the present invention can use both a personal information type list and a pattern list for the value of the full-text regular expression extraction method. Here, the pattern list may include not only the regular expression for the value itself but also variants in which a pattern is added before or after that regular expression. For example, when the regular expression matches a resident registration number, the data management platform can include in the pattern list the case in which ":" precedes the resident registration number.
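A minimal sketch of a pattern list with a prefix, assuming a simplified resident registration number format of six digits, an optional hyphen, and seven digits. The regular expression, the type label, and the function name are illustrative assumptions, not the patent's actual patterns.

```python
import re

# Simplified value pattern (assumption): 6 digits, optional hyphen, 7 digits.
RRN = r"\d{6}-?\d{7}"

# Pattern list per personal information type. The single entry requires a ":"
# before the number, as in the example in the description.
PATTERNS = {
    "resident_registration_number": [
        re.compile(r":\s*(" + RRN + r")(?!\d)"),
    ],
}


def extract_from_full_text(text):
    """Scan unparsed text and return (pii_type, matched_value) pairs."""
    hits = []
    for pii_type, patterns in PATTERNS.items():
        for pattern in patterns:
            hits.extend((pii_type, m.group(1)) for m in pattern.finditer(text))
    return hits
```

Requiring the ":" prefix means a bare 13-digit number elsewhere in the text, such as an order number, is not reported as a resident registration number.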

The composite personal information extraction method extracts personal information based on two or more items of personal information rather than a single item. In one embodiment, the data management platform can, for each architecture type, treat data as personal information only when both a "name" and a "resident registration number" are present. Conversely, when only one of "name" or "resident registration number" is present, the data management platform can determine, for that architecture type, that the data is not personal information.
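The composite rule can be sketched as a simple all-or-nothing check. The function name, the candidate representation as `(pii_type, value)` pairs, and the type labels are assumptions for illustration.

```python
def composite_extract(candidates,
                      required=("name", "resident_registration_number")):
    """Keep candidates only if every required personal information type
    was found in the same record; otherwise keep nothing."""
    found = {pii_type for pii_type, _ in candidates}
    if all(pii_type in found for pii_type in required):
        return [c for c in candidates if c[0] in required]
    return []
```

With this rule a record that contains only a name yields no personal information, which matches the "both must be present" condition in the description.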

In this way, the data management platform of the present invention can extract personal information based on the information included in the personal information metadata (1036).

FIG. 14 is a diagram illustrating an embodiment in which the data management platform of the present invention distinguishes architecture types and extracts personal information.

In one embodiment, the architecture type analysis unit (2029) of the data management platform (10000) can identify the architecture type from the audit log (1033), after which the personal information extraction unit (2025) can extract personal information using the personal information metadata (1036) defined for each architecture type.

More specifically, the data management platform (10000) can define the fields included in the personal information metadata (1036) based on the architecture type. Here, the architecture type is the information that determines which parser analyzes the variables and values of the log data. In other words, the parsing method used to analyze variables and values differs by architecture type. Therefore, the data management platform (10000) must define the fields included in the personal information metadata (1036) based on the architecture type.

In one embodiment, the architecture type analysis unit (2029) of the data management platform (10000) can identify the architecture type from the field information of the log data in the audit log (1033), based on the protocol type and the URL, and can generate the field information of the index accordingly.
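A rough sketch of classification by protocol type and URL is shown below. The description does not state the actual decision rules, so the protocol labels, the "/webgui" path check, and the fallback label "UNKNOWN" are all assumptions chosen only to make the idea concrete.

```python
from urllib.parse import urlparse


def classify_architecture(protocol, url=""):
    """Pick an architecture type from the protocol type and URL of a log
    record. All rules here are hypothetical placeholders."""
    protocol = protocol.lower()
    if protocol in ("http", "https"):
        path = urlparse(url).path.lower()
        if "/webgui" in path:
            return "WEBGUI"
        return "JSON"
    if protocol == "gui":
        return "GUI"
    return "UNKNOWN"
```

The returned label would then select the parser and the personal information metadata that apply to the record.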

Accordingly, the personal information extraction unit (2025) of the data management platform (10000) can extract personal information using at least one of the personal information metadata (1036) defined for each architecture type and the exception handling list (1037) described above.

Thereafter, the data management platform can encrypt the extracted personal information and store the encrypted personal information.

FIG. 15 is a diagram illustrating an embodiment of creating a personal information extraction rule in the data management platform of the present invention.

In one embodiment, the data management platform can create a personal information extraction rule. This figure describes the user interface used to create such a rule. To create a personal information extraction rule, the user interface may include at least one of basic information, value regular expression extraction information, and variable-based extraction information.

More specifically, the basic information indicates the target to which the personal information extraction rule is applied. Here, the basic information may include at least one of a log type, level 1 (screen, transaction), and level 2 (program, service). Accordingly, the user can enter at least one of a log type (for example, SAP GUI log), level 1, and level 2 as the target to which the personal information extraction rule is applied.

In addition, the value regular expression extraction information may include the personal information to be extracted for each personal information extraction method. More specifically, the full set of personal information types is available, and the data management platform can receive from the user a selection of the personal information to be extracted. Accordingly, the data management platform can extract values from the log according to each extraction method and then compare them with regular expression patterns to determine whether they are personal information. To this end, the data management platform can use the regular expression pattern file stored for each personal information type, as described in the embodiment above.

In addition, the data management platform can create a personal information extraction rule using the values of the variables analyzed for each log type. To this end, the data management platform can receive variable-based extraction information from the user.

FIG. 16 is a diagram illustrating an embodiment of creating a personal information exception handling list in the data management platform of the present invention.

In one embodiment, the data management platform can generate a personal information exception handling list (1037). More specifically, the data management platform can check the personal information items that are provisionally extracted from the audit log (1033) using the personal information metadata (1036). That is, the data management platform can provisionally extract personal information using the items included in the personal information metadata (1036) (for example, the key/value and screen name/field name items described above). Here, the provisionally extracted personal information items are candidate data output from the original log data on the basis of the items of the personal information metadata (1036) generated according to an embodiment of the present invention.

The data management platform can generate the exception handling list (1037) based on extracted values or analyzed variables. More specifically, the data management platform can generate the exception handling list (1037) based on the items included in the personal information metadata (1036).

In this case, the data management platform can generate the exception handling list (1037) using at least one of the architecture type, screen information, and personal information extraction rule included in the personal information metadata (1036). For example, when creating the exception handling list, the data management platform can register entries such as "in the first architecture (UI) type, this key is not personal information and is therefore not extracted" or "in the second architecture (UI) type, this value is not personal information and is therefore not extracted."

For example, the data management platform may provisionally extract an "account number" from the audit log (1033) as personal information. Here, the "account number" is candidate data. In addition, the screen name/field name items of the personal information metadata (1036) may include product A/serial number. Assume that the serial number of product A is not personal information. In this case, if the serial number of product A has the same form as the "account number", the data management platform can determine that the extracted "account number" is not personal information and register it in the exception handling list (1037).

That is, the data management platform can check the provisionally extracted personal information by referring to the personal information metadata (1036) defined according to the embodiment described above. Because the personal information at this stage is candidate data rather than confirmed personal information, the exception handling list (1037) can be generated for the candidate data based on the items included in the personal information metadata (1036).

In one embodiment, when extracting personal information from the audit log (1033), the data management platform can exclude the items included in the exception handling list (1037).

This has the advantage of lowering the probability that data is incorrectly extracted as personal information.

FIG. 17 is a diagram illustrating an embodiment of creating an exception filter in the data management platform of the present invention.

In one embodiment, the data management platform can generate the exception handling list by the method described above. This figure describes the exception filter user interface used to generate the exception handling list. The exception filter user interface may include at least one personal information item.

In one embodiment, the data management platform can receive exception filter information for a personal information item from the user. Here, the exception filter information may include at least one of an architecture type (UI type), level 1 (screen, transaction), a value, level 2 (program, service), and a variable.

In the example shown in this figure, the data management platform can receive from the user the exception filter information "architecture type=SAP GUI", "level 1=SE16", "level 2=SAPMF02D-7230", "variable=RDTMS_VAL2", and "value=8201301037329".

Accordingly, the data management platform can generate the exception handling list using the exception filter information entered by the user.
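The exception filter from the figure, and its use to drop candidates during extraction, could be sketched as follows. The class, field, and function names are assumptions; the filter values are the ones given in the example above, and `OTHER_VAR` is a made-up variable name used only to show a non-matching candidate.

```python
from dataclasses import dataclass
from typing import Optional

FIELDS = ("architecture_type", "level1", "level2", "variable", "value")


@dataclass(frozen=True)
class ExceptionFilter:
    architecture_type: Optional[str] = None
    level1: Optional[str] = None
    level2: Optional[str] = None
    variable: Optional[str] = None
    value: Optional[str] = None

    def matches(self, candidate: dict) -> bool:
        # None means "not specified". Every specified field must match, so a
        # filter with no fields set would match every candidate.
        for name in FIELDS:
            expected = getattr(self, name)
            if expected is not None and candidate.get(name) != expected:
                return False
        return True


def apply_exceptions(candidates, filters):
    """Drop every candidate that matches at least one exception filter."""
    return [c for c in candidates if not any(f.matches(c) for f in filters)]


figure_filter = ExceptionFilter(
    architecture_type="SAP GUI",
    level1="SE16",
    level2="SAPMF02D-7230",
    variable="RDTMS_VAL2",
    value="8201301037329",
)
```

Leaving a field unset widens the filter: a filter with only `architecture_type` and `variable` set would exclude that variable on every screen of the architecture type.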

FIG. 18 is a diagram illustrating another embodiment in which the data management method of the present invention extracts and stores personal information.

Personal information extracted by regular expression patterns may be extracted incorrectly. Therefore, in addition to the embodiments described above, the present invention can build a Bloom filter from the personal information that actually exists, in order to reduce the risk of incorrect extraction. That is, the Bloom filter is applied to each extracted value, and the value is kept only when it is determined to be personal information, which improves extraction accuracy. This is described in detail below.

In step (S50010), the data management method can collect hash values of personal information. More specifically, the data management method can collect hash values for various types of personal information values from a system or platform that manages personal information, such as a personal information encryption solution. Here, the system or platform that manages personal information may reside on an external server. In one embodiment, the data management method can store the collected hash values by personal information type.

In step (S50020), the data management method can generate a value filter for each personal information type. Here, the value filter is a filter built with a Bloom filter data structure over the hash values of the personal information values.

In step (S50030), the data management method can extract personal information. In one embodiment, the log data in the audit log is analyzed by architecture type to extract variables and values, and the personal information extraction rule is applied to the extracted values to extract personal information, as described above.

In step (S50040), the data management method can verify the personal information values. That is, in addition to the processing described above, the value filter can be used to verify the extracted personal information values.

If, in step (S50050), the hash value of the personal information is contained in the value filter, then in step (S50060) the data management method can store the extracted personal information in the index. More specifically, when the hash value of the extracted personal information is contained in the value filter built with the Bloom filter data structure, the data management method can determine that the extracted personal information is real personal information. Accordingly, the data management method can store, in the index, an indication that the personal information extracted from the audit log is real personal information.

If, in step (S50070), the hash value of the personal information is not contained in the value filter, then in step (S50080) the data management method can remove the extracted personal information. More specifically, when the hash value of the extracted personal information is not contained in the value filter built with the Bloom filter data structure, the data management method can determine that the extracted personal information is fake personal information. Accordingly, the data management method can determine that the value was extracted incorrectly and remove it. Here, removing the extracted personal information may mean that the value is no longer treated as personal information, not that the data itself is deleted.

Accordingly, by using the Bloom filter data structure, the present invention can quickly compare values extracted as personal information against personal information that is managed in bulk. In addition, errors can be removed for types of personal information whose values are otherwise difficult to verify.
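Steps S50010 to S50080 can be sketched with a small hand-written Bloom filter. The patent does not specify the hash function, filter size, or number of hash functions, so SHA-256, 8192 bits, and 4 hashes are assumptions, as are all names. Note that a Bloom filter never reports a stored item as absent but can occasionally report an absent item as present, so a "real" verdict is probabilistic.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits=8192, num_hashes=4):
        self.size = size_bits
        self.k = num_hashes
        self.bits = bytearray(size_bits // 8 + 1)

    def _positions(self, item):
        # Derive k bit positions by salting SHA-256 with the hash index.
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))


def sha256_hex(value):
    """Assumed hash used by the system that manages the personal information."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def build_value_filters(hashes_by_type):
    """S50020: build one value filter per personal information type from the
    hash values collected in S50010."""
    filters = {}
    for pii_type, hashes in hashes_by_type.items():
        value_filter = BloomFilter()
        for h in hashes:
            value_filter.add(h)
        filters[pii_type] = value_filter
    return filters


def verify(pii_type, extracted_value, filters):
    """S50040 to S50080: True means keep and index the value as real personal
    information; False means treat it as incorrectly extracted."""
    value_filter = filters.get(pii_type)
    return value_filter is not None and sha256_hex(extracted_value) in value_filter
```

Only hash values cross the boundary from the external system, so the filter can be built without the platform ever receiving the plaintext personal information.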

FIG. 19 is a diagram illustrating an embodiment of collecting and mapping user actions in the data management platform of the present invention.

The work actions that a user performs in an application are expressed differently depending on the developer's preferences and development standards. The present invention therefore aims to make it easy to distinguish user actions on data, such as "view, delete, add, change, print", on the basis of the user's work actions.

Accordingly, the present invention collects the text of an application's menus and icons and classifies it automatically using artificial intelligence, so that user actions can be distinguished at the level of detail required by law.

An audit log is a log that records information such as events occurring in a system and user activities, and can be used for security and audit trails. Here, the audit log may include log data and an index corresponding to the log data.

Log data is generally divided into information such as time, event/action, user/subject, target/object, and result. Each item of log data forms one record, and multiple records can be recorded consecutively.

Audit log data is generally indexed and managed in a database or similar system. To search and analyze log data efficiently, an index for each log record can be managed together with the data. Here, the index usually includes the fields used for searching and information such as keys that improve search speed.

For example, in an audit log with fields such as time, event/action, user/subject, and target/object, an index can be created by combining the time field and the target field in order to search for records of a user deleting a specific file. Using such an index has the advantage of greatly reducing search time.

Furthermore, mapping user actions to the index allows log data to be classified by user action, which facilitates security analysis and monitoring.

The data management platform of the present invention indexes the log data contained in the audit log by user actions classified using artificial intelligence, so that a user can search the log data by user action. The present invention is described in detail below.

The data management platform (10000) of the present invention can use a collection module (20001), an analysis module (20002), a monitoring module (20005), and an AI engine (20006) to collect user actions, map them to log data, and index the log data, and can then provide the log data indexed by user action in response to a log data query request.

More specifically, the collection module (20001) can be used to collect the keys and text corresponding to user actions. In this case, the collection module (20001) can connect to the application development environment to collect user actions. The application development environment holds the keys and text of the menus and icons with which user actions are selected. Accordingly, the collection module (20001) included in the data management platform (10000) of the present invention can collect the keys and text corresponding to these user actions.

The AI engine (20006) can classify the collected text into user actions using artificial intelligence (AI). For example, user actions may include work actions such as view, delete, add, change, and print. To this end, the AI engine (20006) can use methods such as machine learning, deep learning, natural language processing (NLP), and a rule-based approach.

The analysis module (20002) can generate user action metadata (1038) by separating the data on the classified user actions into a key and a user action. The user action metadata (1038) can be stored in the internal database (20007) of the data management platform (10000).

The analysis module (20002) can map the stored user action metadata (1038) to the log data contained in the audit log (1033). To do so, when indexing the log data, the analysis module (20002) can match the field that serves as a key for each application development environment type against the key information in the user action metadata (1038), and store the result in the user action field of the index.

Here, the key information is identification information that represents a user action. In one embodiment, the data management platform (10000) can store a user action not as text but as an abbreviated ID that identifies the user action. For example, the data management platform (10000) can store "R" as the key information when the user action is "view", "D" when the user action is "delete", "U" when the user action is "modify", and "P" when the user action is "print".
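Of the classification methods the description lists, the rule-based approach is the simplest to illustrate. The sketch below maps menu or icon text to the abbreviated action keys given above; the keyword lists and function name are invented for illustration, and a real AI engine could use a trained model instead. No key is given in the description for "add", so it is left out.

```python
import re

# Action keys as given in the description.
ACTION_KEYS = {"view": "R", "delete": "D", "modify": "U", "print": "P"}

# Hypothetical keyword lists per action (English and Korean menu text).
KEYWORDS = {
    "view": ("display", "view", "조회"),
    "delete": ("delete", "del", "remove", "삭제"),
    "modify": ("change", "edit", "modify", "수정", "변경"),
    "print": ("print", "프린트", "인쇄"),
}


def classify_action(text):
    """Return the action key for a menu or icon text, or None if no keyword
    matches. Matching is on whole tokens to avoid hits inside other words."""
    tokens = set(re.findall(r"\w+", text.lower()))
    for action, words in KEYWORDS.items():
        if any(word in tokens for word in words):
            return ACTION_KEYS[action]
    return None
```

Storing the one-letter key rather than the text keeps the index field compact and independent of the wording each application uses.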

A user can view logs through the monitoring module (20005). In one embodiment, when providing the log data contained in the audit log (1033) in the database (20007), the monitoring module (20005) can refer to the user action field of the index to provide information to the user.

Hereinafter, the functions performed by each module within the data management platform (10000) are described as being performed by the data management platform (10000).

FIG. 20 is a diagram illustrating an embodiment of generating user action metadata in the data management platform of the present invention.

In one embodiment, the data management platform (10000) can collect the keys and text corresponding to user actions and, through the AI engine described above, classify the collected text into user actions using artificial intelligence. Accordingly, the data management platform (10000) can generate user action metadata (1038) for the collected keys and text corresponding to the user actions.

More specifically, to generate the user action metadata (1038), the data management platform (10000) can map the key for each application development environment type to a user action key. That is, the user action metadata (1038) can have, as fields, the key for each application development environment type, the user action key, and the user action.

Here, the application development environment type may include the architecture type and the screen user interface type described above. That is, the application development environment type represents the information fields within the application development environment that can be used as user work actions, for example the menu icons, OK icon, and cancel icon present on a screen. The user action key is the ID that identifies a user action, as described above. Accordingly, the data management platform (10000) can map the key for each application development environment type to the user action key within the user action metadata (1038).

For example, if the key for an application development environment type is "the del key in a first application", the data management platform (10000) can, through the AI engine and analysis module described above, map it to the user action key "D" and generate the user action metadata (1038). In this case, the data management platform (10000) can also store in the user action metadata (1038) that the user action for the application development environment key "the del key in a first application" and the user action key "D" is "delete".

도 21은 본 발명의 데이터 관리 플랫폼에서 사용자 행위 메타데이터와 로그 데이터를 매핑하는 실시 예를 설명하는 도면이다.FIG. 21 is a diagram illustrating an embodiment of mapping user behavior metadata and log data in the data management platform of the present invention.

일 실시 예에서, 데이터 관리 플랫폼(10000)은 저장된 사용자 행위 메타데이터(1038)와 감사 로그(1033)에 포함된 로그 데이터를 매핑할 수 있다.In one embodiment, the data management platform (10000) can map stored user behavior metadata (1038) and log data included in an audit log (1033).

보다 상세하게는, 데이터 관리 플랫폼(10000)은 로그 데이터와 사용자 행위 메타데이터(1038)을 매핑하기 위하여, 로그 데이터를 인덱스 처리할 수 있다. 구체적으로, 데이터 관리 플랫폼(10000)은 상술한 실시 예를 통하여 사용자 행위 메타데이터(1038) 내의 애플리케이션 개발환경 유형 별 키 필드와 사용자 행위 키 필드를 매핑할 수 있다. 또한, 일 실시 예에서, 데이터 관리 플랫폼(10000)은 인덱스의 사용자 행위 필드에 사용자 행위 메타데이터(1038)의 사용자 행위를 저장할 수 있다.More specifically, the data management platform (10000) can index log data to map log data and user behavior metadata (1038). Specifically, the data management platform (10000) can map key fields for each application development environment type within the user behavior metadata (1038) and user behavior key fields through the above-described embodiment. Furthermore, in one embodiment, the data management platform (10000) can store user behaviors of the user behavior metadata (1038) in the user behavior field of the index.

이에 따라, 감사 로그(1033)에 저장된 제 1 로그 데이터가 사용자가 제 1 애플리케이션 화면에서 del 키를 누르는 로그를 나타내는 경우, 데이터 관리 플랫폼(10000)은 제 1 로그 데이터를 조회하는 요청을 수신하는 경우, 매핑된 사용자 행위인 “삭제”를 바로 제공할 수 있다.Accordingly, if the first log data stored in the audit log (1033) indicates a log in which a user presses the del key on the first application screen, the data management platform (10000) can immediately provide the mapped user action “delete” when receiving a request to view the first log data.
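
A minimal sketch of this index enrichment follows. It assumes each log entry is a dictionary with hypothetical `log_id`, `app`, and `ui_key` fields, and that the metadata is a plain in-memory dictionary; the patent does not specify the index structure.

```python
from collections import defaultdict

# Hypothetical metadata: (application, UI key) -> (user action key, user action)
USER_ACTION_META = {
    ("app1", "del"): ("D", "delete"),
    ("app1", "ok"): ("C", "confirm"),
}


def index_log(log):
    """Return a copy of the log entry with the mapped user action stored in the index."""
    action_key, action = USER_ACTION_META.get(
        (log["app"], log["ui_key"]), (None, "unknown")
    )
    return {**log, "user_action_key": action_key, "user_action": action}


def group_by_action(indexed_logs):
    """Classify log IDs by user action, as a security analyst might query them."""
    groups = defaultdict(list)
    for entry in indexed_logs:
        groups[entry["user_action"]].append(entry["log_id"])
    return dict(groups)
```

Because the user action is written into the index when the log is processed, a later query for a log entry can return "delete" without repeating the metadata lookup.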

이렇게 인덱스에 사용자 행위를 매핑하면, 로그 데이터를 사용자 행위를 기준으로 분류할 수 있어 보안 분석이나 모니터링에 용이하다는 장점이 있다.Mapping user behavior to the index in this way allows log data to be classified by user behavior, which facilitates security analysis and monitoring.

도 22은 본 발명의 데이터 관리 플랫폼에서 개인정보를 암호화하는 실시 예를 설명하는 도면이다.Figure 22 is a drawing illustrating an embodiment of encrypting personal information in the data management platform of the present invention.

개인정보 암호화 키는 법적 기준을 준수하기 위해 주기적으로 변경해야 한다. 이때, 암호화를 위한 키 값은 해당 키를 변경하게 되면 기존 키를 복호화하고 새로운 키로 재 암호화해야 한다. 즉, 키를 변경하는 경우 수천만 내지 수억 건의 데이터를 복호화하고, 재 암호화하기 위해 많은 시간이 소요된다. 따라서, 대부분의 회사들은 법적 기준을 준수해야 함에도 불구하고 암호화 키 변경을 수행하고 있지 않는 경우가 많다.Personal information encryption keys must be changed periodically to comply with legal standards. When a key is changed, the data encrypted under the existing key must be decrypted and re-encrypted with the new key. In other words, a key change means decrypting and re-encrypting tens to hundreds of millions of records, which takes a long time. As a result, many companies do not change their encryption keys even though they are legally required to do so.

본 발명은 암호화 값에 대한 토큰(token, 대체 값)과 암호화 값에 대한 키 ID를 별도로 저장하는 방안을 제안하고자 한다. 이를 통하여 신규로 생성된 키 값으로 실시간으로 암호화할 수 있고, 새로운 키 값으로 운영 서버에 영향을 주지 않고 복호화 및 재암호화하는 기능을 제공할 수 있다.The present invention proposes a method for separately storing a token (replacement value) for an encrypted value and a key ID for the encrypted value. This allows for real-time encryption with a newly generated key value, and provides the ability to decrypt and re-encrypt data without affecting the operating server.

또한, 상술한 점 이외에도 애플리케이션 서버의 요구에 따라 암호화 데이터를 처리해야 할 필요가 있다. 이때, 데이터베이스의 암호화 방식을 사용하는 경우, 데이터가 로드될 때 복호화되어 원문(original text)으로 전송되기 때문에 다양한 방법에 의해서 원문이 유출될 수 있다. 따라서, 서버 메모리 내에서 암호화 데이터를 처리하고 필요할 때만 복호화 해야 한다.In addition to the above, encrypted data must be processed according to the application server's requirements. When using database encryption, data is decrypted and transmitted in its original form when loaded, which can lead to the original text being leaked through various means. Therefore, encrypted data must be processed within server memory and decrypted only when necessary.

본 발명의 데이터 관리 플랫폼에서는 암호화 키가 변경되더라도 데이터의 복호화 및 재암호화가 내부에서 일어나기 때문에 원문의 유출이 없다는 장점이 있다.The data management platform of the present invention has the advantage of preventing leakage of original text because data decryption and re-encryption occur internally even if the encryption key is changed.

이하 본 발명에 대하여 자세히 설명하도록 한다.The present invention will be described in detail below.

본 발명의 일 실시 예에서, 데이터 관리 플랫폼(10000)은 키 관리 모듈(20003)을 통하여 암호화 키 및 키 ID를 생성하고, 생성된 암호화 키를 사용하여 개인정보를 암호화하고, 암호화된 개인정보에 대응하는 토큰 값을 생성할 수 있다.In one embodiment of the present invention, the data management platform (10000) can generate an encryption key and a key ID through a key management module (20003), encrypt personal information using the generated encryption key, and generate a token value corresponding to the encrypted personal information.

이를 위하여, 데이터 관리 플랫폼(10000)의 키 관리 모듈(20003)은 암호화 키 및 키 ID 생성부(2015), 개인정보 암호화부(2016), 토큰 값 생성부(2017) 및 매핑 정보 저장부(2018)를 포함할 수 있다. 여기에서, 키 관리 모듈(20003)은 Java 및 RFC를 사용할 수 있고, 개인정보의 암호화 및 복호화를 담당할 수 있다.To this end, the key management module (20003) of the data management platform (10000) may include an encryption key and key ID generation unit (2015), a personal information encryption unit (2016), a token value generation unit (2017), and a mapping information storage unit (2018). Here, the key management module (20003) may use Java and RFC, and may be responsible for encryption and decryption of personal information.

여기에서, 암호화 키 및 키 ID 생성부(2015)는 개인정보 암호화를 위한 암호화 키 및 키 ID를 생성 및 관리할 수 있다. 일 실시 예에서, 암호화 키 및 키 ID 생성부(2015)는 사용자의 제어에 기초하여 암호화 키 및 키 ID를 새로 생성할 수 있다. 이때, 암호화 키는 신규 키 ID가 발급됨으로써 변경될 수 있다.Here, the encryption key and key ID generation unit (2015) can generate and manage encryption keys and key IDs for personal information encryption. In one embodiment, the encryption key and key ID generation unit (2015) can generate new encryption keys and key IDs based on user control. At this time, the encryption key can be changed by issuing a new key ID.

예를 들어, 사용자는 데이터 관리 플랫폼(10000)이 제공하는 일괄 암호화 키 변경 기능을 선택할 수 있고, 이에 따라 암호화 키 및 키 ID 생성부(2015)는 암호화 키 및 키 ID를 새로운 값으로 생성할 수 있다. 또한, 다른 일 실시 예에서, 암호화 키 및 키 ID 생성부(2015)는 사용자 제어가 없더라도 기 설정된 주기에 기초하여 암호화 키 및 키 ID를 업데이트할 수 있다.For example, a user may select a batch encryption key change function provided by the data management platform (10000), and accordingly, the encryption key and key ID generation unit (2015) may generate the encryption key and key ID with new values. In addition, in another embodiment, the encryption key and key ID generation unit (2015) may update the encryption key and key ID based on a preset cycle even without user control.

개인정보 암호화부(2016)는 암호화 키를 이용하여 개인정보를 암호화할 수 있다. 개인정보 암호화부(2016)는 데이터 관리 플랫폼(10000) 내에 저장된 가장 최신 암호화 키를 이용하여 개인정보를 암호화할 수 있다.The personal information encryption unit (2016) can encrypt personal information using an encryption key. The personal information encryption unit (2016) can encrypt personal information using the most recent encryption key stored within the data management platform (10000).

토큰 값 생성부(2017)는 암호화된 개인정보에 대응하는 토큰 값(대체 값)을 생성할 수 있다.The token value generation unit (2017) can generate a token value (replacement value) corresponding to encrypted personal information.

정보 저장부(2018)는 생성된 토큰 값, 암호화된 개인정보에 대응하는 암호본, 키 ID를 매핑 정보 테이블(1039)에 저장할 수 있다. 또한, 정보 저장부(2018)는 토큰 값을 업무 테이블(1040)에 별도로 저장할 수 있다.The mapping information storage unit (2018) can store the generated token value, the ciphertext corresponding to the encrypted personal information, and the key ID in the mapping information table (1039). In addition, the mapping information storage unit (2018) can separately store the token value in the business table (1040).

즉, 암호화 키 및 키 ID 생성부(2015)를 통하여 암호화 키가 변경되더라도 업무 테이블(1040)에 저장된 값은 변하지 않기 때문에, 시스템 운영을 중단하지 않으면서 암호화 키를 변경할 수 있다.That is, even if the encryption key is changed through the encryption key and key ID generation unit (2015), the value stored in the business table (1040) does not change, so the encryption key can be changed without stopping system operation.

도 23는 본 발명의 데이터 관리 플랫폼에서 개인정보를 암호화하는 실시 예를 설명하는 도면이다.Figure 23 is a drawing illustrating an embodiment of encrypting personal information in the data management platform of the present invention.

일 실시 예에서, 데이터 관리 플랫폼은 개인정보를 수집할 수 있다. 데이터 관리 플랫폼은 수집된 패킷으로부터 분석한 데이터 또는 직접적으로 수신한 데이터 중 개인정보를 추출할 수 있다. 또한, 데이터 관리 플랫폼은 개인정보 자체(예를 들어, 주민등록번호 “73****-*******”를 수신할 수 있다.In one embodiment, the data management platform may collect personal information. The data management platform may extract personal information from data analyzed from collected packets or directly received data. Additionally, the data management platform may receive the personal information itself (e.g., resident registration number "73****-*******").

일 실시 예에서, 데이터 관리 플랫폼은 상술한 암호화 키 및 키 ID 생성부를 통하여 생성된 암호화 키와 키 ID를 이용하여 개인정보를 암호화할 수 있다.In one embodiment, the data management platform can encrypt personal information using an encryption key and key ID generated through the encryption key and key ID generation unit described above.

본 도면을 예로 들어 설명하면, 데이터 관리 플랫폼은 생성된 제 1 암호화 키 및 키 ID는 “KEY001”를 사용하여 개인정보를 암호화할 수 있다. 여기에서, 암호화 키는 binary 형태로 생성될 수 있다. 예를 들어, 제 1 암호화 키는 “01001010 00110001 00110001 00110011 00111000 00110000 00110001 01011010 00111101 00111101”에 대응할 수 있다. 상술한 예를 들어 설명하면, 개인정보인 주민등록번호 “73****-*******”은 생성된 암호화 키에 의해 암호화될 수 있다.For example, using this drawing, the data management platform can encrypt personal information using the generated first encryption key and key ID “KEY001.” Here, the encryption key can be generated in binary format. For example, the first encryption key can correspond to “01001010 00110001 00110001 00110011 00111000 00110000 00110001 01011010 00111101 00111101.” Using the above example, the personal information, resident registration number “73****-*******,” can be encrypted using the generated encryption key.

일 실시 예에서, 암호화는 개인정보 전체를 암호화하거나, 일부를 암호화하는 방법으로 진행될 수 있다. 특히, 본 발명은 개인정보 중 일부를 암호화하는 것을 특징으로 한다. 이때, 키 ID는 암호화 알고리즘이 매핑되어 있다. 일 실시 예에서, 사용 가능한 암호화 알고리즘은 양방향 알고리즘으로 SEED, ARIA128, ARIA192, ARIA256, AES128, AES192, DES, TDES를 사용할 수 있고, 단방향 알고리즘으로 SHA-256을 사용할 수 있다. 이때, 키 ID에 기초하여 암호화되는 알고리즘이 결정될 수 있다.In one embodiment, encryption may be performed on all or only part of the personal information. In particular, the present invention is characterized by encrypting part of the personal information. Each key ID is mapped to an encryption algorithm. In one embodiment, the available two-way algorithms are SEED, ARIA128, ARIA192, ARIA256, AES128, AES192, DES, and TDES, and the available one-way algorithm is SHA-256. The algorithm used for encryption may therefore be determined from the key ID.

개인정보가 암호화된 이후, 데이터 관리 플랫폼은 암호화된 개인정보에 대응하는 토큰 값을 생성할 수 있다. 본 도면의 예에서 토큰 값은 “abcxxf”에 대응한다. 이때, 토큰 값은 암호화된 개인정보의 자리 수 및 형식(format)을 유지할 수 있다. 예를 들어, 개인정보인 주민등록번호 “73****-*******”중 뒤 6자리를 부분 암호화하는 경우, 이를 대체하는 토큰 값은 동일한 자리 수 및 형식인 “abcxxf”에 대응할 수 있다.After personal information is encrypted, the data management platform can generate a token value corresponding to the encrypted personal information. In the example shown in this diagram, the token value corresponds to "abcxxf." In this case, the token value can maintain the number of digits and format of the encrypted personal information. For example, if the last six digits of the personal information, such as the resident registration number "73****-*******," are partially encrypted, the token value replacing it can correspond to "abcxxf," which has the same number of digits and format.
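
The length-preserving token can be sketched as follows. This is a hedged illustration only: the function names, the use of random lowercase letters, and the sample number "730101-1234567" are assumptions made here, and the patent does not state how token characters are chosen.

```python
import secrets
import string


def make_token(length):
    """Generate a random replacement value with exactly `length` characters."""
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))


def tokenize_tail(value, n=6):
    """Split off the last n characters for encryption and build a same-length token.

    Returns (kept_prefix, part_to_encrypt, token).
    """
    kept_prefix, part_to_encrypt = value[:-n], value[-n:]
    return kept_prefix, part_to_encrypt, make_token(len(part_to_encrypt))


# Made-up sample number, not taken from the patent.
prefix, secret_part, token = tokenize_tail("730101-1234567")
print(prefix, token)
```

Keeping the token the same length as the replaced portion lets it fit a fixed-length business-table column without schema changes.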

일 실시 예에서, 데이터 관리 플랫폼은 생성된 토큰 값, 암호화된 개인정보에 대응하는 암호본, 키 ID를 매핑 정보 테이블에 저장하고, 토큰 값을 업무 테이블에 저장할 수 있다. 매핑 정보 테이블과 업무 테이블에 대하여는 후술하도록 한다.In one embodiment, the data management platform may store the generated token value, the ciphertext corresponding to the encrypted personal information, and the key ID in a mapping information table, and store the token value in a business table. The mapping information table and the business table are described below.

도 24는 본 발명의 매핑 정보 테이블과 업무 테이블을 설명하는 도면이다.FIG. 24 is a diagram illustrating the mapping information table and the business table of the present invention.

본 도면은 매핑 정보 테이블(1039)과 업무 테이블(1040)를 예시하는 도면이다. 본 도면에서 매핑 정보 테이블(1039)과 업무 테이블(1040)은 본 발명과 관련이 있는 필드만을 나타낸 것으로, 이외의 필드를 더 포함할 수 있음은 물론이다.This drawing illustrates a mapping information table (1039) and a business table (1040). Only the fields relevant to the present invention are shown for the mapping information table (1039) and the business table (1040); both tables may of course include additional fields.

일 실시 예에서, 매핑 정보 테이블(1039)은 토큰 값, 암호본 및 키 ID를 포함할 수 있다. 토큰 값, 암호본, 키 ID는 각각의 필드로 구성되어 있으며, 매핑 정보 테이블(1039)은 매핑 값을 각각 필드에 맞게 포함할 수 있다. 상술한 실시 예를 예로 들어 설명하면, 키 ID “KEY001”를 이용하여 개인정보를 암호화할 수 있고, 암호화된 개인정보에 대응하는 암호본은 “HJ113801Z==”이고, 이에 대응하는 토큰 값이 “abcxxf”인 경우, 매핑 정보 테이블(1039)은 토큰 값, 암호본 및 키 ID를 동일한 행으로 매핑하여 저장할 수 있다.In one embodiment, the mapping information table (1039) may include a token value, a ciphertext, and a key ID. The token value, ciphertext, and key ID each form a separate field, and the mapping information table (1039) holds the mapped values in the corresponding fields. Taking the above-described embodiment as an example, if personal information is encrypted using the key ID “KEY001”, the resulting ciphertext is “HJ113801Z==”, and the corresponding token value is “abcxxf”, the mapping information table (1039) may store the token value, ciphertext, and key ID together in the same row.

일 실시 예에서, 업무 테이블(1040)은 토큰 값을 포함할 수 있다. 여기에서, 업무 테이블(1040)은 필드의 길이가 정해져 있기 때문에 암호본이나 키 ID를 직접 저장할 수 없기 때문에 데이터 관리 플랫폼은 토큰에 매핑되어 있는 암호본과 키 ID를 매핑 정보 테이블(1039)에 별도로 저장할 수 있다.In one embodiment, the business table (1040) may include a token value. Because the field lengths of the business table (1040) are fixed, it cannot store the ciphertext or key ID directly; the data management platform may therefore store the ciphertext and key ID mapped to the token separately in the mapping information table (1039).

이때, 권한 있는 사용자가 개인정보를 확인하기 위해서는, 매핑 정보 테이블(1039)에 포함된 암호본을 이용할 수 있다. 상술한 예를 들어 설명하면, 권한 있는 사용자가 데이터 관리 플랫폼에게 개인정보의 복호화를 요청하는 경우, 데이터 관리 플랫폼은 업무 테이블(1040)에 포함된 토큰 값 “abcxxf”을 이용해 매핑 정보 테이블(1039)에 있는 암호본 “HJ113801Z==”을 추출하고, 암호본을 이용하여 개인정보를 복호화할 수 있다. 특히, 본 발명은 매핑 정보 테이블(1039)와 업무 테이블(1040)이 저장된 저장소와 암호화 키 및 키 ID가 저장된 저장소를 별도로 구분된 것을 특징으로 한다. 또한, 본 발명의 데이터 관리 플랫폼은 매핑 정보 테이블(1039)와 업무 테이블(1040)이 저장된 시스템과 암호화 키 및 키 ID가 저장된 시스템을 별도로 구비할 수 있다. 이를 통해, 법적인 보안 요구 조건을 만족할 수 있다.For an authorized user to view personal information, the ciphertext held in the mapping information table (1039) can be used. Taking the above example, when an authorized user asks the data management platform to decrypt personal information, the platform uses the token value “abcxxf” from the business table (1040) to look up the ciphertext “HJ113801Z==” in the mapping information table (1039), and then decrypts that ciphertext to recover the personal information. In particular, the present invention is characterized in that the storage holding the mapping information table (1039) and the business table (1040) is kept separate from the storage holding the encryption key and key ID. In addition, the data management platform of the present invention may provide one system for the mapping information table (1039) and the business table (1040) and a separate system for the encryption key and key ID. This allows legal security requirements to be satisfied.
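
The store-and-look-up flow above can be sketched end to end as below. This is a sketch under stated assumptions: `xor_stand_in` is a deliberately insecure placeholder for the real algorithm (ARIA, AES, and so on), and the three module-level containers merely imitate the separate key storage, the mapping information table (1039), and the business table (1040). All function and variable names are hypothetical.

```python
import base64
import hashlib
import secrets
import string


def xor_stand_in(key: bytes, data: bytes) -> bytes:
    """NOT real encryption: a reversible placeholder for the actual cipher."""
    stream = hashlib.sha256(key).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))


KEY_STORE = {"KEY001": secrets.token_bytes(16)}  # kept apart from the tables
MAPPING_TABLE = {}    # token -> {"ciphertext": ..., "key_id": ...}
BUSINESS_TABLE = []   # holds tokens only


def new_token(length=6):
    return "".join(secrets.choice(string.ascii_lowercase) for _ in range(length))


def protect(plaintext: str, key_id: str = "KEY001") -> str:
    """Encrypt, create a token, and record both tables; returns the token."""
    raw = xor_stand_in(KEY_STORE[key_id], plaintext.encode("utf-8"))
    token = new_token()
    while token in MAPPING_TABLE:          # avoid reusing an existing token
        token = new_token()
    MAPPING_TABLE[token] = {
        "ciphertext": base64.b64encode(raw).decode("ascii"),
        "key_id": key_id,
    }
    BUSINESS_TABLE.append(token)
    return token


def reveal(token: str) -> str:
    """Look up the ciphertext and key ID by token, then decrypt."""
    row = MAPPING_TABLE[token]
    raw = base64.b64decode(row["ciphertext"])
    return xor_stand_in(KEY_STORE[row["key_id"]], raw).decode("utf-8")
```

The business table never sees the ciphertext or the key ID, which is what allows the key to change later without touching business rows.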

상술한 실시 예에 따라, 데이터 관리 플랫폼은 업무 테이블(1040)에 토큰 값을 저장할 수 있다. 여기에서, 데이터 관리 플랫폼은 키 ID나 개인정보 암호화 값이 변경되더라도 토큰 값은 변경하지 않은 상태로 유지할 수 있다.According to the above-described embodiment, the data management platform can store the token value in the business table (1040). Here, the data management platform can maintain the token value unchanged even if the key ID or personal information encryption value is changed.

이를 통하여, 토큰 값은 변하지 않은 상태에서 암호화 키와 키 ID를 새롭게 변경할 수 있다.Through this, the encryption key and key ID can be changed while the token value remains unchanged.

도 25은 본 발명의 데이터 관리 플랫폼에서 신규 암호화 키를 생성하는 실시 예를 설명하는 도면이다.FIG. 25 is a diagram illustrating an embodiment of generating a new encryption key in the data management platform of the present invention.

일 실시 예에서, 암호화 키 및 키 ID 생성부(2015)는 새로운 암호화 키 및 키 ID를 생성할 수 있다. 예를 들어, 암호화 키 및 키 ID 생성부(2015)는 제 2 암호화 키 및 키 ID “KEY002”를 생성할 수 있다. 여기에서, 제 2 암호화 키는 상술한 실시 예와 마찬가지로 binary로 표현될 수 있다.In one embodiment, the encryption key and key ID generation unit (2015) can generate a new encryption key and key ID. For example, the encryption key and key ID generation unit (2015) can generate a second encryption key and key ID “KEY002.” Here, the second encryption key can be expressed in binary, similar to the embodiment described above.

이에 따라, 데이터 관리 플랫폼은 기존의 매핑 정보 테이블(1039)에 포함된 암호본과 키 ID를 새로운 암호본 및 키 ID로 업데이트할 수 있다.Accordingly, the data management platform can update the ciphertext and key ID held in the existing mapping information table (1039) with the new ciphertext and key ID.

보다 상세하게는, 데이터 관리 플랫폼은 새롭게 생성된 키 ID에 기초하여 암호화 알고리즘을 결정하고, 새로운 암호화 키를 이용하여 개인정보 암호화 값을 변경할 수 있다. 이를 위하여, 데이터 관리 플랫폼은 기존 암호본 및 키 ID를 이용하여 개인정보를 복호화한 뒤, 새롭게 생성된 암호화 키 및 키 ID를 이용하여 개인정보를 재암호화할 수 있다.More specifically, the data management platform can determine an encryption algorithm based on the newly generated key ID and use the new encryption key to change the encrypted value of the personal information. To do this, the data management platform can decrypt the personal information using the existing ciphertext and key ID, and then re-encrypt the personal information using the newly generated encryption key and key ID.

이에 따라, 매핑 정보 테이블(1039)에 포함된 토큰 값은 유지가 되지만, 개인정보 암호화 값과 키 ID는 새로 생성된 값으로 변경된다. 이때, 업무 테이블(1040)에 포함된 토큰 값은 변함이 없다.Accordingly, the token value included in the mapping information table (1039) is maintained, but the personal information encryption value and key ID are changed to newly generated values. At this time, the token value included in the business table (1040) remains unchanged.
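
A minimal sketch of this key change follows, assuming an in-memory mapping table and a toy reversible function in place of the real cipher. The names `rotate_keys` and `xor_stand_in` and the demo key bytes are assumptions for illustration; only the token "abcxxf" and the key IDs come from the text.

```python
import hashlib


def xor_stand_in(key: bytes, data: bytes) -> bytes:
    """NOT real encryption: a symmetric placeholder (same call encrypts and decrypts)."""
    stream = hashlib.sha256(key).digest()
    return bytes(b ^ stream[i % len(stream)] for i, b in enumerate(data))


def rotate_keys(mapping_table, key_store, new_key_id, cipher=xor_stand_in):
    """Re-encrypt every row not yet under new_key_id; tokens are never modified."""
    changed = 0
    for row in mapping_table.values():
        if row["key_id"] == new_key_id:
            continue
        plaintext = cipher(key_store[row["key_id"]], row["ciphertext"])  # decrypt with old key
        row["ciphertext"] = cipher(key_store[new_key_id], plaintext)     # encrypt with new key
        row["key_id"] = new_key_id
        changed += 1
    return changed


key_store = {"KEY001": b"first-demo-key", "KEY002": b"second-demo-key"}
mapping_table = {
    "abcxxf": {
        "ciphertext": xor_stand_in(key_store["KEY001"], b"secret-part"),
        "key_id": "KEY001",
    }
}
business_table = ["abcxxf"]

rotate_keys(mapping_table, key_store, "KEY002")
```

After the call, only the ciphertext and key ID columns of the mapping row differ; `business_table` is not read or written at any point.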

도 26은 본 발명의 데이터 관리 플랫폼에서 새로운 업무 데이터를 추가하는 실시 예를 설명하는 도면이다.FIG. 26 is a drawing illustrating an embodiment of adding new business data to the data management platform of the present invention.

본 도면에서는 데이터 관리 플랫폼에서 새로운 개인정보가 추가된 실시 예를 설명한다. 데이터 관리 플랫폼은 새로운 업무 데이터(개인정보인 경우를 예로 한다.)를 수집 또는 수신하는 경우, 가장 최신 암호화 키 및 키 ID를 이용하여 새로운 개인정보를 암호화할 수 있다.This diagram illustrates an example where new personal information is added to the data management platform. When the data management platform collects or receives new business data (e.g., personal information), it can encrypt the new personal information using the most recent encryption key and key ID.

예를 들어, 새로운 개인정보로 계좌번호 “114-910224-12345”가 수신된 경우, 데이터 관리 플랫폼은 가장 최신 암호화 키인 제 2 암호화 키 및 키 ID “KEY002”를 이용하여 개인정보를 암호화할 수 있다. 이후, 데이터 관리 플랫폼은 개인정보에 대응하는 토큰 값을 생성할 수 있다. 예를 들어, 토큰 값은 “hijklm”에 대응한다.For example, if new personal information, such as account number "114-910224-12345," is received, the data management platform can encrypt the personal information using the second encryption key and key ID "KEY002," which is the most recent encryption key. The data management platform can then generate a token value corresponding to the personal information. For example, the token value corresponds to "hijklm."

데이터 관리 플랫폼은 토큰 값, 암호본 및 키 ID를 매핑 정보 테이블(1039)에 저장하고, 토큰 값을 업무 테이블(1040)에 저장할 수 있다.The data management platform can store the token value, ciphertext, and key ID in the mapping information table (1039), and store the token value in the business table (1040).

새로운 개인정보가 추가되기 전 가장 최신 암호화 키 및 키 ID가 반영된 매핑 정보 테이블(1039)은 제 1 행에 토큰 값 “abcxxf”암호본 “29AB3801Z==”및 키 ID “KEY002”를 저장하고 있다. 새로운 개인정보가 추가되면, 매핑 정보 테이블(1039)은 제 2 행에 토큰 값 “hijklm”암호본 ”AQ348701Z==”및 키 ID “KEY002”를 더 포함할 수 있다.Before the new personal information is added, the mapping information table (1039), which already reflects the most recent encryption key and key ID, stores the token value “abcxxf”, the ciphertext “29AB3801Z==”, and the key ID “KEY002” in its first row. Once the new personal information is added, the mapping information table (1039) may additionally hold the token value “hijklm”, the ciphertext “AQ348701Z==”, and the key ID “KEY002” in its second row.

마찬가지로, 새로운 개인정보가 추가되기 전 업무 테이블(1040)은 제 1 행에 “abcxxf”를 저장하고 있고, 새로운 개인정보가 추가되면 제 2 행에 “hijklm”을 저장할 수 있다.Similarly, before new personal information is added, the business table (1040) may store “abcxxf” in the first row, and when new personal information is added, “hijklm” may be stored in the second row.

즉, 본 발명의 데이터 관리 플랫폼은 가장 최신 암호화 키 및 키 ID를 기준으로 매핑 정보 테이블(1039)에 포함된 정보를 업데이트할 수 있고, 키 ID만 변경하면 개인정보의 복호화 및 재암호화를 진행하기 때문에 수천만 건의 데이터를 실시간으로 변경할 수 있다.That is, the data management platform of the present invention can update the information in the mapping information table (1039) based on the most recent encryption key and key ID. Because decryption and re-encryption of personal information proceed once the key ID is changed, tens of millions of records can be changed in real time.

도 27는 본 발명의 데이터 관리 플랫폼이 로그 데이터를 정규화하는 실시 예를 설명하는 도면이다.Figure 27 is a diagram illustrating an embodiment in which the data management platform of the present invention normalizes log data.

일부 사용자들의 경우 로그 데이터로부터 필요한 정보를 추출하기 위한 정규식 입력을 어려워할 수 있다. 본 발명은 로그 데이터의 정규화를 통해 이러한 문제를 해결하고자 한다. 이를 통해 로그 데이터에서 의미 있는 데이터의 일부분을 추출하기 위한 정규식을 포함하는 파서를 손쉽게 생성함으로써 사용자 편의성을 증가시킬 수 있다.Some users may find it difficult to input regular expressions to extract necessary information from log data. The present invention aims to address this issue by normalizing log data. This allows for the easy creation of a parser containing regular expressions for extracting meaningful portions of data from log data, thereby increasing user convenience.

본 발명의 데이터 관리 플랫폼(10000)은 수집 모듈(20001), 분석 모듈(20002) 및 모니터링 모듈(20005)을 통하여 로그 데이터를 정규화할 수 있고, 사용자/클라이언트(1000)의 이벤트 로그 검색 요청에 따라 로그 데이터를 검색할 수 있다.The data management platform (10000) of the present invention can normalize log data through a collection module (20001), an analysis module (20002), and a monitoring module (20005), and can search log data according to an event log search request from a user/client (1000).

구체적으로, 수집 모듈(20001)은 로그 수집부(2040)를 포함할 수 있다. 로그 수집부(2040)는 외부로부터 전송되는 적어도 하나의 로그 데이터를 획득할 수 있다.Specifically, the collection module (20001) may include a log collection unit (2040). The log collection unit (2040) may obtain at least one log data transmitted from the outside.

분석 모듈(20002)은 파서 생성부(2043), 변환 규칙 생성부(2044) 및 수집 규칙 생성부(2045)를 포함할 수 있다. 파서 생성부(2043)는 선택된 텍스트에 일정 빈도 이상으로 사용되는 정규식 패턴 별 매칭 블록을 추출할 수 있다. 여기서, 정규식 패턴은 날짜, 시간, 문자열 등 일정 빈도 이상으로 자주 사용되는 정규식 패턴을 포함할 수 있다. 또한, 매칭 블록은 정규식 패턴과 일치하는 텍스트 정보를 나타낼 수 있다.The analysis module (20002) may include a parser generation unit (2043), a conversion rule generation unit (2044), and a collection rule generation unit (2045). The parser generation unit (2043) may extract, from the selected text, a matching block for each regular expression pattern that is used at or above a certain frequency. Such patterns may include commonly used ones for dates, times, strings, and the like. A matching block represents the text information that matches a given regular expression pattern.

파서 생성부(2043)는 추출된 다수의 매칭 블록 중 사용자 입력에 의해 선택된 매칭 블록에 기반하여, 로그 데이터로부터 텍스트를 추출하기 위한 정규식을 포함하는 파서를 생성할 수 있다. 일 실시 예에서, 파서는 사용자 입력에 기반한 필드명을 포함할 수 있다.The parser generation unit (2043) may generate a parser including a regular expression for extracting text from log data based on a matching block selected by user input from among a plurality of extracted matching blocks. In one embodiment, the parser may include field names based on user input.

일 실시 예에서, 파서 생성부(2043)는 로그 데이터에서 정규식을 테스트하여 정규식을 검증할 수 있다. 즉, 파서 생성부(2043)는 정규식을 이용하여 로그 데이터로부터 각 필드에 해당하는 필드값이 제대로 추출되는지 테스트를 진행할 수 있다.In one embodiment, the parser generation unit (2043) can verify regular expressions by testing them on log data. That is, the parser generation unit (2043) can test whether field values corresponding to each field are properly extracted from log data using regular expressions.
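
The verification step can be sketched as a check that every expected field is extracted from each sample log line. The function name `verify_parser`, the log layout, and the field names used below are assumptions for illustration. The example uses Python's named-group syntax `(?P<name>...)`.

```python
import re


def verify_parser(pattern, sample_logs, required_fields):
    """Return a list of (log_line, reason) for every line the regex fails on."""
    compiled = re.compile(pattern)
    failures = []
    for line in sample_logs:
        match = compiled.search(line)
        if match is None:
            failures.append((line, "no match"))
            continue
        # A field counts as extracted only if its group exists and is non-empty.
        missing = [f for f in required_fields if not match.groupdict().get(f)]
        if missing:
            failures.append((line, f"empty or missing fields: {missing}"))
    return failures
```

An empty result means the regular expression extracted a value for every field on every sample line.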

변환 규칙 생성부(2044)는 정규식에 기반한 파서 및 파서에 대응하는 이벤트 유형을 포함하는 변환 규칙을 생성할 수 있다. 변환 규칙 생성부(2044)는 기 생성된 파서 중 사용자 입력에 기반한 로그 데이터가 매칭되는 정규식에 기반한 파서를 스캔할 수 있다. 변환 규칙 생성부(2044)는 해당 로그 데이터에 대하여 사용 가능한 파서 중 사용자 입력에 기반한 파서를 변환 규칙에 적용하여 저장할 수 있다. 즉, 변환 규칙은 로그 데이터 저장 시 추출된 필드를 어떻게 가공하여 저장할지에 대한 규칙을 나타낼 수 있다.The conversion rule generation unit (2044) can generate a conversion rule that includes a regex-based parser and the event type corresponding to that parser. It can scan the previously generated parsers for regex-based parsers whose regular expressions match the log data supplied by user input. Of the parsers usable for that log data, it can apply the one chosen by user input to the conversion rule and store it. In other words, the conversion rule defines how the extracted fields are processed and stored when the log data is saved.

따라서, 본 발명에 따르면, 정규식 입력을 어려워하는 사용자들에게 부분적으로 정규식 입력을 도와주어 전체 정규식을 완성할 수 있는 편의성을 제공할 수 있다. 또한, 본 발명에 따르면, 등록된 파서의 재사용성을 높일 수 있도록 기 생성된 파서를 스캔하는 편의성을 제공할 수 있다.Therefore, the present invention provides convenience to users who struggle with regular expression input by partially assisting them with entering regular expressions, allowing them to complete the entire regular expression. Furthermore, the present invention provides convenience in scanning previously created parsers, thereby increasing the reusability of registered parsers.

수집 규칙 생성부(2045)는 로그 수집 장치의 로그 데이터에 사용할 변환 규칙을 선택하여 수집 경로 규칙을 생성할 수 있다.The collection rule generation unit (2045) can generate a collection path rule by selecting a conversion rule to be used for log data of a log collection device.

모니터링 모듈(20005)는 검색 요청 처리부(2041), 사용자 입력 처리부(2042) 및 이벤트 로그 검색부(2046)를 포함할 수 있다.The monitoring module (20005) may include a search request processing unit (2041), a user input processing unit (2042), and an event log search unit (2046).

검색 요청 처리부(2041)는 사용자/클라이언트(1000)로부터 이벤트 유형에 대한 이벤트 검색 요청을 수신할 수 있다. 이후, 검색 요청 처리부(2041)는 미리 저장된 적어도 하나의 로그 데이터로부터 추출된 검색 결과를 사용자/클라이언트(1000)에게 송신할 수 있다. 이 경우, 송신된 검색 결과는 사용자/클라이언트(1000)의 화면에 디스플레이될 수 있다.The search request processing unit (2041) may receive an event search request for an event type from a user/client (1000). Thereafter, the search request processing unit (2041) may transmit search results extracted from at least one pre-stored log data to the user/client (1000). In this case, the transmitted search results may be displayed on the screen of the user/client (1000).

사용자 입력 처리부(2042)는 사용자/클라이언트(1000)로부터 사용자 입력에 기반한 정보를 획득할 수 있다. 즉, 사용자 입력 처리부(2042)는 사용자/클라이언트(1000)를 통해 사용자 입력에 의해 선택된 정보를 획득할 수 있다.The user input processing unit (2042) can obtain information based on user input from the user/client (1000). That is, the user input processing unit (2042) can obtain information selected by user input through the user/client (1000).

일 실시 예에서, 사용자 입력에 기반한 정보는 정규식 생성 및 테스트를 위한 로그 데이터, 로그 데이터에 포함된 텍스트, 텍스트 정보를 나타내는 다수의 매칭 블록 중 선택된 매칭 블록, 텍스트에 대한 필드, 파서(parser)에 대한 이벤트 유형, 로그 수집 장치의 등록 정보 및 수집 경로 규칙 중 적어도 하나를 포함할 수 있다.In one embodiment, the information based on user input may include at least one of log data for generating and testing regular expressions, text included in the log data, a matching block selected from a plurality of matching blocks representing text information, a field for the text, an event type for a parser, registration information for a log collection device, and a collection path rule.

이벤트 로그 검색부(2046)는 이벤트 유형에 대한 이벤트 검색 요청을 수신함에 응답하여, 파서, 수집 경로 규칙 및 변환 규칙 중 적어도 하나에 기반한 미리 저장된 적어도 하나의 로그 데이터로부터 추출된 검색 결과를 출력할 수 있다.The event log search unit (2046) may, in response to receiving an event search request for an event type, output a search result extracted from at least one pre-stored log data based on at least one of a parser, a collection path rule, and a conversion rule.

도 28는 본 발명의 데이터 관리 방법이 파서를 생성하는 일 실시 예를 설명하는 도면이다.FIG. 28 is a drawing illustrating an embodiment of a data management method of the present invention for generating a parser.

단계(S18010)에서, 데이터 관리 방법은 로그 데이터를 획득할 수 있다. 일 실시 예에서, 데이터 관리 방법은 사용자/클라이언트(1000)를 통해 사용자에 의해 입력된 로그 데이터를 획득할 수 있다.In step (S18010), the data management method can obtain log data. In one embodiment, the data management method can obtain log data input by a user through a user/client (1000).

단계(S18020)에서, 데이터 관리 방법은 사용자 입력에 기반한 로그 데이터에 포함된 텍스트를 획득할 수 있다. 즉, 데이터 관리 방법은 원본 로그 데이터에서 정규식으로 변환하고자 사용자에 의해 선택된 텍스트를 획득할 수 있다. 일 실시 예에서, 정규식은 '정규 표현식' 또는 이와 동등한 기술적 의미를 갖는 용어로 지칭될 수 있다.In step (S18020), the data management method can obtain text contained in the log data based on user input. That is, the data management method can obtain the text that the user has selected from the original log data for conversion into a regular expression. In one embodiment, the term "regex" (정규식) may be used interchangeably with "regular expression" (정규 표현식) or any term having an equivalent technical meaning.

단계(S18030)에서, 데이터 관리 방법은 텍스트에 대한 텍스트 정보를 나타내는 매칭 블록을 생성할 수 있다. 예를 들어, 텍스트 정보는 1개의 숫자, 공백을 제외한 1개 이상의 숫자, 1개의 글자, 공백을 제외한 1개 이상의 글자 및 쌍따옴표 사이의 모든 글자 등 다양한 텍스트 정보를 나타낼 수 있다.In step (S18030), the data management method can generate matching blocks that represent text information about the text. For example, the text information may be one digit, one or more digits excluding whitespace, one character, one or more characters excluding whitespace, all characters between double quotation marks, and so on.

단계(S18040)에서, 데이터 관리 방법은 매칭 블록에 기반하여 로그 데이터로부터 텍스트를 추출하기 위한 정규식을 생성할 수 있다. 일 실시 예에서, 로그 데이터에 포함된 각 텍스트에 대하여 정규식을 생성하며, 각 정규식을 포함하는 전체 정규식을 생성할 수 있다.In step (S18040), the data management method may generate a regular expression for extracting text from log data based on a matching block. In one embodiment, a regular expression may be generated for each text contained in the log data, and an entire regular expression including each regular expression may be generated.

단계(S18050)에서, 데이터 관리 방법은 사용자 입력에 기반한 텍스트에 대한 필드를 획득할 수 있다. 즉, 데이터 관리 방법은 해당 텍스트에 대한 필드명을 설정하기 위한 필드명 정보를 획득할 수 있다.In step (S18050), the data management method can obtain a field for text based on user input. That is, the data management method can obtain field name information for setting a field name for the corresponding text.

단계(S18060)에서, 데이터 관리 방법은 정규식 및 필드를 포함하는 파서를 생성할 수 있다.In step (S18060), the data management method can generate a parser including a regular expression and a field.
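
Steps S18010 to S18060 can be sketched roughly as follows. The catalogue `MATCHING_BLOCKS`, the function names, and the sample log line are assumptions made for illustration, and anchoring the regular expression on the literal text before the selection is one simple design choice, not necessarily the patent's. Python's named-group syntax `(?P<name>...)` is used.

```python
import re

# Hypothetical catalogue of commonly used patterns (label, regex).
MATCHING_BLOCKS = [
    ("one digit", r"\d"),
    ("one or more digits", r"\d+"),
    ("one non-space character", r"\S"),
    ("one or more non-space characters", r"\S+"),
    ("everything between double quotes", r'"[^"]*"'),
]


def candidate_blocks(text):
    """S18030: list every matching block whose pattern covers the whole selected text."""
    return [(label, pat) for label, pat in MATCHING_BLOCKS if re.fullmatch(pat, text)]


def build_parser(log_line, text, chosen_pattern, field_name):
    """S18040-S18060: turn the chosen block into a named group and wrap it as a parser."""
    start = log_line.index(text)              # first occurrence of the selected text
    prefix = re.escape(log_line[:start])      # literal context before the selection
    regex = f"{prefix}(?P<{field_name}>{chosen_pattern})"
    return {"regex": regex, "fields": [field_name]}
```

For a selection such as "16", the user would be offered the blocks that match it, pick one (for example "one or more digits"), and name the field; the resulting parser then extracts that field from other lines with the same layout.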

도 29은 본 발명의 파서 생성 화면의 일 실시 예를 설명하는 도면이다.Figure 29 is a drawing illustrating one embodiment of a parser generation screen of the present invention.

일 실시 예에서, 파서 생성 화면은 사용자/클라이언트(1000)에 의해 디스플레이될 수 있다. 파서 생성 화면은 로그 데이터(1042), 텍스트(1043), 매칭 블록(1044), 정규식(1045) 및 필드(1046)를 포함할 수 있다.In one embodiment, a parser generation screen may be displayed by a user/client (1000). The parser generation screen may include log data (1042), text (1043), matching blocks (1044), regular expressions (1045), and fields (1046).

사용자에 의해 로그 데이터(1042)가 입력되고, 로그 데이터(1042)에 포함된 텍스트(1043)가 선택되어 입력될 수 있다. 예를 들어, 텍스트는 문자열로 구성될 수 있으며, 16을 나타낼 수 있다.Log data (1042) may be input by a user, and text (1043) contained in the log data (1042) may be selected and input. For example, the text may be a character string such as "16".

입력된 로그 데이터(1042)와 텍스트(1043)가 데이터 관리 플랫폼(10000)에게 전달되면, 데이터 관리 플랫폼(10000)에 의해 생성된 텍스트(1043)에 대응하는 다수의 매칭 블록들이 사용자/클라이언트(1000)의 파서 생성 화면에 표시될 수 있다.When the input log data (1042) and text (1043) are transmitted to the data management platform (10000), a plurality of matching blocks corresponding to the text (1043) generated by the data management platform (10000) can be displayed on the parser generation screen of the user/client (1000).

이 경우, 다수의 매칭 블록들 각각은 색(color)으로 구분될 수 있으며, 각 매칭 블록은 서로 다른 텍스트 정보를 나타낼 수 있다. 이후, 사용자 입력에 의해 다수의 매칭 블록 중 텍스트(1043)의 텍스트 정보를 나타내는 하나의 매칭 블록(1044)이 선택되는 경우, 선택된 매칭 블록(1044)에 대응하는 정규식(1045)이 자동으로 생성될 수 있다. 예를 들어, 정규식(1045)은 (?<REPLACEGROUPNAME>\d+)로 표현될 수 있다.In this case, the matching blocks can be distinguished from one another by color, and each matching block can represent different text information. When user input then selects, from among the matching blocks, the one matching block (1044) that represents the text information of the text (1043), a regular expression (1045) corresponding to the selected matching block (1044) can be generated automatically. For example, the regular expression (1045) can be expressed as (?<REPLACEGROUPNAME>\d+).

또한, 사용자에 의해 해당 텍스트(1043)에 해당하는 필드(1046)가 입력되며, 전체 정규식과 필드가 포함된 파서가 생성될 수 있다.Additionally, a field (1046) corresponding to the text (1043) can be input by the user, and a parser including the entire regular expression and field can be generated.
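The parser generation flow described above can be illustrated with a minimal sketch. It assumes that the selected text occurs once in the log line and that a purely numeric selection is generalized to the digit class `\d+`; the function name `build_field_pattern`, the sample log line, and the field name `fail_count` are hypothetical. Python writes named groups as `(?P<name>...)`, which differs slightly from the `(?<name>...)` form shown in the text.

```python
import re

def build_field_pattern(log_line: str, selected_text: str, field_name: str) -> str:
    """Return a regex for log_line that captures selected_text as field_name."""
    start = log_line.index(selected_text)  # assumes the first occurrence is the selection
    end = start + len(selected_text)
    # Generalize the selection: digits become \d+, anything else \S+ (an assumption).
    token = r"\d+" if selected_text.isdigit() else r"\S+"
    return (
        re.escape(log_line[:start])
        + f"(?P<{field_name}>{token})"
        + re.escape(log_line[end:])
    )

# Hypothetical sample log line; the user selects the text "16".
sample = "2024-04-04 12:00:01 login failed count=16 user=kim"
pattern = build_field_pattern(sample, "16", "fail_count")

# The generated parser now extracts the same field from a similar log line.
other = "2024-04-04 12:00:01 login failed count=7 user=kim"
match = re.fullmatch(pattern, other)
extracted = match.group("fail_count")
```

Escaping the text around the selection keeps the rest of the log line literal, so only the selected field is generalized.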

도 30은 본 발명의 데이터 관리 방법이 변환 규칙을 생성하는 일 실시 예를 설명하는 도면이다.FIG. 30 is a drawing illustrating an embodiment of a data management method of the present invention for generating a conversion rule.

단계(S19010)에서, 데이터 관리 방법은 로그 데이터를 획득할 수 있다. 일 실시 예에서, 데이터 관리 방법은 사용자/클라이언트(1000)를 통해 사용자에 의해 입력된 로그 데이터를 획득할 수 있다.In step (S19010), the data management method can obtain log data. In one embodiment, the data management method can obtain log data input by a user through a user/client (1000).

단계(S19020)에서, 데이터 관리 방법은 로그 데이터에 대응하는 파서를 결정할 수 있다. 즉, 사용자에 의해 입력된 로그 데이터에 대한 스캐닝을 통해 로그 데이터에 대응하는 파서가 결정될 수 있으며, 해당 파서에 대한 파서명, 버전 및 파서 유형이 결정될 수 있다. 이 경우, 상술한 바와 같이 정규식에 기반한 파서인 경우 파서 유형은 정규식을 나타낼 수 있다.In step (S19020), the data management method can determine a parser corresponding to the log data. That is, a parser corresponding to the log data can be determined by scanning the log data input by the user, and the parser name, version, and parser type for the corresponding parser can be determined. In this case, if the parser is based on a regular expression as described above, the parser type can indicate a regular expression.

단계(S19030)에서, 데이터 관리 방법은 사용자 입력에 기반한 파서에 대한 이벤트 유형을 결정할 수 있다. 예를 들어, 이벤트 유형은 개인정보 검색 로그 및 방화벽 로그를 포함할 수 있나, 이에 제한되지 않고 다양한 유형으로 구성될 수 있다.In step (S19030), the data management method can determine an event type for the parser based on user input. For example, the event type may include, but is not limited to, personal information search logs and firewall logs, and may be comprised of various types.

단계(S19040)에서, 데이터 관리 방법은 파서 및 이벤트 유형에 대응하는 적어도 하나의 필드를 결정할 수 있다. 일 실시예에서, 데이터 관리 방법은 변환 규칙의 각 파서의 필드의 일괄 등록을 수행할 수 있다. 예를 들어, 일괄 등록 시 필드, 필드명, 필드 유형(숫자 또는 문자열) 및 데이터 유형이 결정될 수 있다.In step (S19040), the data management method may determine at least one field corresponding to the parser and event type. In one embodiment, the data management method may perform batch registration of fields of each parser of the conversion rule. For example, during batch registration, the field, field name, field type (number or string), and data type may be determined.

In step (S19050), the data management method can generate a conversion rule including the parser, the event type, and the at least one field. That is, according to the present invention, the conversion rule determines which of the actually extracted field values are to be stored and which are not. In other words, the data management method can extract field values from log data using a parser that includes a regular expression, and can determine through the conversion rule which of the extracted field values to store and how to store them.
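As a rough illustration of how a conversion rule could combine a parser, an event type, and a field list, the following sketch filters and types the extracted field values. The class names `FieldSpec` and `ConversionRule`, the `store` flag, and the sample firewall log are assumptions made for this example and do not appear in the specification.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class FieldSpec:
    name: str
    field_type: str      # "number" or "string", as in the batch registration step
    store: bool = True   # hypothetical flag: keep or drop the extracted value

@dataclass
class ConversionRule:
    parser_pattern: str  # regex-based parser with one named group per field
    event_type: str
    fields: List[FieldSpec] = field(default_factory=list)

    def apply(self, log_line: str) -> Optional[dict]:
        """Parse one log line and return only the field values marked for storage."""
        match = re.search(self.parser_pattern, log_line)
        if match is None:
            return None
        record = {"event_type": self.event_type}
        for spec in self.fields:
            if not spec.store:
                continue
            raw = match.group(spec.name)
            record[spec.name] = int(raw) if spec.field_type == "number" else raw
        return record

rule = ConversionRule(
    parser_pattern=r"src=(?P<src_ip>\S+) port=(?P<port>\d+) user=(?P<user>\S+)",
    event_type="firewall",
    fields=[
        FieldSpec("src_ip", "string"),
        FieldSpec("port", "number"),
        FieldSpec("user", "string", store=False),  # extracted but not stored
    ],
)
record = rule.apply("src=10.0.0.5 port=443 user=kim")
```

Here the parser extracts three values, while the rule decides that only two of them are stored and that `port` is stored as a number.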

도 31은 본 발명의 데이터 관리 방법이 수집 경로 규칙을 생성하는 일 실시 예를 설명하는 도면이다.FIG. 31 is a diagram illustrating an embodiment of a data management method of the present invention for generating a collection path rule.

단계(S11110)에서, 데이터 관리 방법은 로그 데이터를 수집하기 위한 로그 수집 장치를 등록할 수 있다. 일 실시예에서, 로그 수집 장치의 등록 시 해당 로그 수집 장치에 대한 장비명, 장비 IP, 장비 유형, OS(operating system) 및 수집 유형이 사용자 입력에 의해 사용자/클라이언트(1000)를 통해 설정될 수 있다.In step (S11110), the data management method may register a log collection device for collecting log data. In one embodiment, when registering a log collection device, the device name, device IP, device type, operating system (OS), and collection type for the log collection device may be set by a user/client (1000) through user input.

단계(S11120)에서, 데이터 관리 방법은 사용자 입력에 의한 로그 수집 장치의 로그 데이터에 사용할 변환 규칙을 선택하여 수집 경로 규칙을 생성할 수 있다. 일 실시 예에서, 수집 경로 규칙은 수집 경로, 수집 유형 및 장비명 중 적어도 하나를 포함할 수 있다.In step (S11120), the data management method may generate a collection path rule by selecting a conversion rule to be used for log data of a log collection device based on user input. In one embodiment, the collection path rule may include at least one of a collection path, a collection type, and an equipment name.

In one embodiment, the collection type may include an agent method and a system log method. Here, the agent method installs an agent program on systems and equipment to transmit the necessary log data, and the system log method collects logs from various security equipment and from network equipment such as switches, routers, and firewalls.

In step (S11130), the data management method can apply the conversion rule to the collection path rule.

도 32는 본 발명의 데이터 관리 방법이 이벤트 로그를 검색하는 일 실시 예를 설명하는 도면이다.FIG. 32 is a drawing illustrating an embodiment of a data management method of the present invention for searching an event log.

단계(S12110)에서, 데이터 관리 방법은 사용자/클라이언트(1000)로부터 이벤트 유형에 대한 이벤트 로그 검색 요청을 수신할 수 있다. 일 실시예에서, 다수의 이벤트 유형들 중 사용자에 의해 선택된 이벤트 유형에 대한 이벤트 로그 검색 요청이 수신될 수 있다.In step (S12110), the data management method may receive an event log search request for an event type from a user/client (1000). In one embodiment, an event log search request for an event type selected by the user among multiple event types may be received.

단계(S12120)에서, 데이터 관리 방법은 이벤트 로그 검색 요청에 응답하여, 수집 경로 규칙, 변환 규칙 및 파서에 기반하여 미리 저장된 적어도 하나의 로그 데이터로부터 추출된 검색 결과를 출력할 수 있다. 일 실시 예에서, 적어도 하나의 로그 데이터는 이벤트 유형에 대응할 수 있으며, 일정 기간 동안 수집되어 미리 저장된 로그 데이터를 포함할 수 있다.In step (S12120), the data management method may, in response to an event log search request, output a search result extracted from at least one pre-stored log data based on a collection path rule, a conversion rule, and a parser. In one embodiment, the at least one log data may correspond to an event type and may include log data collected and pre-stored over a certain period of time.

즉, 본 발명에 따르면, 정규식을 이용하여 로그 데이터에 탑재된 정보들(즉, 필드값)을 파싱시켜 구분시킴으로써, 사용자는 로그 데이터의 분석 시 구분된 로그 데이터에 내재된 정보를 보다 용이하게 확인할 수 있다.That is, according to the present invention, by parsing and distinguishing information (i.e., field values) contained in log data using regular expressions, a user can more easily confirm information inherent in the distinguished log data when analyzing the log data.

단계(S12130)에서, 데이터 관리 방법은 검색 결과를 사용자/클라이언트(1000)에게 송신할 수 있다. 이 경우, 검색 결과는 로그 데이터로부터 파싱된 필드값을 포함할 수 있으며, 사용자에 의한 이벤트 로그 검색 요청에 따라, 파서에 의해 파싱된 필드값이 사용자/클라이언트(1000)의 화면에 대시보드 형태로 표시될 수 있다.In step (S12130), the data management method may transmit search results to the user/client (1000). In this case, the search results may include field values parsed from log data, and in response to an event log search request by the user, the field values parsed by the parser may be displayed in the form of a dashboard on the screen of the user/client (1000).

도 33은 본 발명의 데이터 관리 플랫폼에서 로그를 검색하는 실시 예를 설명하는 도면이다.FIG. 33 is a diagram illustrating an embodiment of searching a log in a data management platform of the present invention.

대용량 로그 검색을 위해서는 검색 결과 전체를 메모리에 올리지 않고도 실시간으로 정렬 및 필터링된 결과를 조회하는 기술이 필요하다. 또한, 검색 중에도 일부 결과를 실시간으로 확인할 수 있어야 한다.For high-volume log searches, a technology is needed to retrieve sorted and filtered results in real time without loading the entire search result into memory. Furthermore, it is also necessary to be able to view some results in real time during the search.

본 발명의 데이터 관리 플랫폼은 대량의 데이터를 효율적으로 처리하고 필요한 결과만을 실시간으로 조회할 수 있도록 한다.The data management platform of the present invention efficiently processes large amounts of data and enables real-time query of only necessary results.

본 발명의 데이터 관리 플랫폼(10000)의 모니터링 모듈(20005)은 로그 검색을 지원하기 위해 코디네이터 제어부(2047)를 더 포함할 수 있다.The monitoring module (20005) of the data management platform (10000) of the present invention may further include a coordinator control unit (2047) to support log search.

여기에서, 코디네이터 제어부(2047)는 사용자(1000)의 로그 검색 요청에 기초하여 서버(1002)의 코디네이터(2048)를 제어할 수 있다.Here, the coordinator control unit (2047) can control the coordinator (2048) of the server (1002) based on the log search request of the user (1000).

Here, the coordinator (2048) can manage a list of the instances that are currently available among the multiple instances (2049a, 2049b, 2049c) of a distributed system, and can distribute search requests to them.

또한, 인스턴스(instance, 2049a, 2049b, 2049c)는 컴퓨팅 환경에서 실행되는 독립적인 단위로 하드웨어 또는 가상화 기술을 통해 프로세서, 메모리, 디스크 등의 자원을 할당받아 동작할 수 있다. 인스턴스(2049a, 2049b, 2049c)는 특정 운영체제와 응용프로그램을 통해 실행될 수 있으며, 일반적으로 서버, 가상 머신, 컨테이너 등의 형태로 구현될 수 있다.Additionally, instances (2049a, 2049b, 2049c) are independent units that run in a computing environment and can operate by being allocated resources such as processors, memory, and disks through hardware or virtualization technology. Instances (2049a, 2049b, 2049c) can be run through specific operating systems and applications, and can generally be implemented in the form of servers, virtual machines, containers, etc.
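The coordinator's role can be sketched as follows. The specification states only that the coordinator manages the list of available instances and distributes search requests; the round-robin policy, the class name `Coordinator`, and its methods are assumptions chosen for illustration.

```python
class Coordinator:
    """Tracks which instances are available and hands out search requests."""

    def __init__(self, instance_ids):
        self._available = {instance_id: True for instance_id in instance_ids}
        self._counter = 0

    def set_available(self, instance_id, available):
        if instance_id not in self._available:
            raise KeyError(instance_id)
        self._available[instance_id] = available

    def available_instances(self):
        return [i for i, ok in self._available.items() if ok]

    def dispatch(self, request):
        """Pick the next available instance (round-robin, an assumed policy)."""
        candidates = self.available_instances()
        if not candidates:
            raise RuntimeError("no instance available")
        chosen = candidates[self._counter % len(candidates)]
        self._counter += 1
        return chosen, request

coordinator = Coordinator(["2049a", "2049b", "2049c"])
coordinator.set_available("2049b", False)  # e.g. the instance is busy or down
assignments = [coordinator.dispatch(f"search-{n}")[0] for n in range(3)]
```

With instance 2049b marked unavailable, requests alternate between the two remaining instances.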

이하를 통하여 본 발명을 자세히 설명하도록 한다.The present invention will be described in detail below.

도 34은 본 발명의 익스터널 머지 소트 알고리즘 실시 예를 설명하는 도면이다.Figure 34 is a drawing illustrating an embodiment of the external merge sort algorithm of the present invention.

익스터널 머지 소트(External Merge Sort) 알고리즘은 대용량의 데이터를 정렬하는데 사용되는 알고리즘으로, 특히, 메모리 용량을 초과하는 데이터를 정렬할 때 효율적으로 작동한다. 익스터널 머지 소트 알고리즘은 주로 디스크나 외부 저장 장치와 같은 보조 메모리를 활용하여 데이터를 정렬할 수 있다.The External Merge Sort algorithm is used to sort large amounts of data, and is particularly efficient when sorting data that exceeds memory capacity. The External Merge Sort algorithm can primarily utilize auxiliary memory, such as disks or external storage devices, to sort data.

단계(S16110)에서, 정렬 필드의 값을 기준으로 데이터의 offset 정보를 저장하는 인덱스 파일을 생성할 수 있다. 여기에서, 정렬 필드의 값은 데이터 안에 포함된 항목을 나타낼 수 있다. 예를 들어, 데이터 안에 포함된 항목은 사용자 이름, 이메일 주소, 시간 등을 포함할 수 있다. 이에 따라, 익스터널 머지 소트 알고리즘은 데이터 안에 포함된 항목을 기준으로 데이터를 정렬할 수 있다. 또한, offset 정보는 정렬 필드의 기준 값, data 파일 내 위치(예를 들어, data 파일 내의 제 1 로그의 위치) 및 길이 정보(예를 들어, data 파일 내의 제 1 로그의 길이 정보)를 포함할 수 있다. 또한, 인덱스 파일은 특정 건수(예를 들어, 1000건) 단위의 offset 목록으로 이미 정렬된 데이터 순서를 가지는 것을 특징으로 한다.In step (S16110), an index file storing offset information of data can be generated based on the value of a sort field. Here, the value of the sort field can represent an item included in the data. For example, the item included in the data can include a user name, an email address, a time, etc. Accordingly, the external merge sort algorithm can sort the data based on the item included in the data. In addition, the offset information can include a reference value of the sort field, a location within a data file (e.g., a location of a first log within a data file), and length information (e.g., length information of a first log within a data file). In addition, the index file is characterized in that it has a data order already sorted as an offset list in units of a specific number of cases (e.g., 1000 cases).

In step (S16120), if the number of data items included in the sort result files for the merge sort is greater than or equal to a first number, the sort result files may be merged to generate a combined sort result file.

More specifically, when an index file is generated, a new index file may be generated if the number of files included in the index file is greater than or equal to the first number. For example, if the number of files included in a first sort result file is greater than or equal to n, a second sort result file may be generated.

In step (S16130), when the search ends or the number of files included in the merge sort file is greater than or equal to n, the newly generated second sort result files may be merged to generate a new third sort result file.

즉, 익스터널 머지 소트 알고리즘을 활용하면 데이터를 분할하고 병합하는 과정을 반복하면서 정렬 작업을 수행하기 때문에 전체적인 성능을 최적화할 수 있다는 장점이 있다.In other words, the external merge sort algorithm has the advantage of optimizing overall performance because it performs sorting operations by repeating the process of dividing and merging data.
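A simplified sketch of this idea follows. It sorts only lightweight index entries (sort key, offset in the data file, length) in fixed-size runs that are spilled to disk, then streams the runs through a single k-way merge with `heapq.merge`. The multi-level merging triggered by the first number and n in the steps above is not reproduced here, and all function names, the JSON run format, and the tiny run size in the demo are assumptions.

```python
import heapq
import json
import os
import tempfile

RUN_SIZE = 1000  # index entries per sorted run, mirroring the "1000 cases" example

def write_run(entries, directory):
    """Sort one batch of (key, offset, length) entries and spill it to disk."""
    fd, path = tempfile.mkstemp(dir=directory, suffix=".idx")
    with os.fdopen(fd, "w") as f:
        for entry in sorted(entries):
            f.write(json.dumps(entry) + "\n")
    return path

def read_run(path):
    """Stream the entries of one sorted run without loading it fully."""
    with open(path) as f:
        for line in f:
            yield tuple(json.loads(line))

def build_sorted_index(data_path, key_of, directory, run_size=RUN_SIZE):
    """Return an iterator of (key, offset, length) over data_path, ordered by key."""
    runs, batch, offset = [], [], 0
    with open(data_path, "rb") as f:
        for raw in f:
            batch.append((key_of(raw), offset, len(raw)))
            offset += len(raw)
            if len(batch) >= run_size:
                runs.append(write_run(batch, directory))
                batch = []
    if batch:
        runs.append(write_run(batch, directory))
    # heapq.merge keeps only one pending entry per run in memory.
    return heapq.merge(*(read_run(p) for p in runs))

with tempfile.TemporaryDirectory() as tmp:
    data_path = os.path.join(tmp, "data.log")
    with open(data_path, "wb") as f:
        f.write(b"3 c\n1 a\n2 b\n")
    # Sort by the leading number; run_size=2 forces two runs for the demo.
    index = list(build_sorted_index(data_path, lambda raw: int(raw.split()[0]), tmp, run_size=2))
    sorted_logs = []
    with open(data_path, "rb") as f:
        for _key, off, length in index:
            f.seek(off)
            sorted_logs.append(f.read(length))
```

Because the index stores offsets rather than the log text itself, the original data file is never rewritten; a sorted view is obtained by seeking to each offset in index order.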

도 35는 본 발명의 데이터 관리 플랫폼에서 데이터를 분석하는 실시 예를 설명하는 도면이다.Figure 35 is a drawing illustrating an example of analyzing data in the data management platform of the present invention.

Systems that analyze log data require a user interface (UI) through which users process stored data. However, for users seeking diverse analyses, the UI provided by the developer is limited. Therefore, a user interface is needed that can perform more complex data analysis or searches, or that accepts query commands in script form to manipulate data and derive meaningful results. To this end, the present invention aims to provide a function that allows data to be manipulated using scripts within the UI, without separate programming for UI development.

본 발명은 분석 작업 편집기(2056)를 통해 스크립트를 이용하여 사용자가 다양한 데이터 분석 및 데이터 조작을 할 수 있도록 한다.The present invention enables a user to perform various data analyses and data manipulations using scripts through an analysis task editor (2056).

이를 위하여, 본 발명의 데이터 관리 플랫폼(10000)의 분석 모듈(20002)은 분석 작업 실행부(2054) 및 분석 작업 관리부(2055)를 포함할 수 있다.To this end, the analysis module (20002) of the data management platform (10000) of the present invention may include an analysis task execution unit (2054) and an analysis task management unit (2055).

여기에서, 분석 작업 실행부(2054)는 분석 작업 편집기(2056)에 입력된 정보를 실행하는 분석 엔진을 포함할 수 있다. 일 실시 예에서, 분석 작업 실행부(2054)는 사용자가 분석 작업 편집기(2056)를 통해 입력한 제 1 분석 작업을 실행하는 명령에 기초하여 분석 작업을 실행할 수 있다. 구체적으로, 사용자가 분석 작업 편집기(2056) 내에서 코드 셀을 실행하면, Lua 스크립트가 분석 작업 실행부(2054) 내의 분석 엔진으로 전송되고, 분석 엔진 내에서 Lua 스크립트를 해석하여 실행하고 결과를 반환할 수 있다. 이를 위하여, 분석 작업 실행부(2054)는 데이터베이스(20007)의 로그 데이터 및 통계 데이터를 활용할 수 있다.Here, the analysis task execution unit (2054) may include an analysis engine that executes information entered in the analysis task editor (2056). In one embodiment, the analysis task execution unit (2054) may execute an analysis task based on a command to execute a first analysis task entered by a user through the analysis task editor (2056). Specifically, when a user executes a code cell within the analysis task editor (2056), a Lua script is transmitted to the analysis engine within the analysis task execution unit (2054), and the Lua script may be interpreted and executed within the analysis engine and a result may be returned. For this purpose, the analysis task execution unit (2054) may utilize log data and statistical data of the database (20007).

The analysis task editor (2056) is an editor that can be run through an application, software, a web browser, or the like, and may include at least one cell. Here, the cells include a markdown cell for adding descriptive data about an execution script and a code cell for adding Lua script information for analysis.

분석 작업 관리부(2055)는 작성된 Lua 스크립트를 등록할 수 있다. 보다 상세하게는, 사용자는 분석 작업 편집기(2056)를 통해 Lua 스크립트를 추가할 수 있고, 추가된 Lua 스크립트를 분석 작업 관리부(2055)에 등록할 수 있다. 등록된 Lua 스크립트는 분석 작업 관리부(2055)에 포함된 분석 작업 스케쥴러에 의해 주기적으로 실행될 수 있다.The analysis task management unit (2055) can register a written Lua script. More specifically, a user can add a Lua script through the analysis task editor (2056) and register the added Lua script in the analysis task management unit (2055). The registered Lua script can be periodically executed by the analysis task scheduler included in the analysis task management unit (2055).

이후, 모니터링 모듈(20005)은 분석 작업 관리부(2055)를 통해 관리된 분석 작업이 주기적으로 실행된 결과를 출력할 수 있다.Afterwards, the monitoring module (20005) can output the results of periodically executing analysis tasks managed through the analysis task management unit (2055).

이를 통하여 사용자는 개발사가 제공하는 사용자 인터페이스를 이용하지 않고도 원하는 데이터를 분석할 수 있다.This allows users to analyze desired data without using the user interface provided by the developer.

도 36은 본 발명의 데이터 관리 플랫폼에서 제공하는 분석 작업 편집기의 사용자 인터페이스를 설명하는 도면이다.Figure 36 is a drawing illustrating a user interface of an analysis task editor provided in the data management platform of the present invention.

본 발명의 분석 작업 편집기(2056)는 마크다운 셀(Markdown cell, 2057)과 코드 셀(Code cell, 2058)을 포함할 수 있다. 분석 작업 편집기(2056)는 여러 개의 스크립트 및 텍스트 블록을 저장하는 단위에 대응할 수 있다.The analysis task editor (2056) of the present invention may include a Markdown cell (2057) and a code cell (2058). The analysis task editor (2056) may correspond to a unit that stores multiple scripts and text blocks.

여기에서, 마크다운 셀(2057)은 설명 블록(description block)에 대응하고, 코드 셀(2058)은 스크립트 블록(script block)에 대응한다.Here, the markdown cell (2057) corresponds to a description block, and the code cell (2058) corresponds to a script block.

일 실시 예에서, 사용자는 마크다운 셀(2057) 및 코드 셀(2058) 중 적어도 하나에 텍스트를 입력할 수 있다. 예를 들어, 사용자는 마크다운 셀(2057)에 코드 셀(2058)에 대한 설명을 기재할 수 있다. 또한, 사용자는 코드 셀(2058)에 실질적으로 실행하고자 하는 코드를 입력할 수 있다. 이에 따라, 분석 작업 편집기(2056)는 코드 셀(2058)에 포함된 코드를 실행할 수 있다.In one embodiment, a user may enter text into at least one of a Markdown cell (2057) and a code cell (2058). For example, the user may enter a description of the code cell (2058) in the Markdown cell (2057). Additionally, the user may enter code to be executed in the code cell (2058). Accordingly, the analysis task editor (2056) may execute the code contained in the code cell (2058).

In one embodiment, when a user presses the “+ button (2059)” provided in the analysis task editor (2056), the analysis task editor (2056) may additionally output an add-markdown-cell button and an add-code-cell button (2060a, 2060b). In one embodiment, the analysis task editor (2056) may output the + button (2059) both above and below a markdown cell (2057), so that the user can determine the position of the markdown cell (2057) or code cell (2058) to be added.

이에 따라, 사용자는 복수 개의 마크다운 셀(2057)을 추가할 수 있으며, 마찬가지로 분석 작업 편집기(2056)에게 복수 개의 코드 셀(2058)을 요청할 수 있다.Accordingly, the user can add multiple markdown cells (2057) and similarly request multiple code cells (2058) from the analysis task editor (2056).
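One possible data model for such an editor is sketched below: an analysis task holds an ordered list of cells, and a new cell can be inserted at the position corresponding to the + button that was pressed. The names `Cell`, `AnalysisTask`, and `add_cell` are hypothetical, and the Lua text stored in the code cell is a placeholder string that this sketch does not execute.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Cell:
    kind: str       # "markdown" (description block) or "code" (script block)
    text: str = ""

@dataclass
class AnalysisTask:
    name: str
    cells: List[Cell] = field(default_factory=list)

    def add_cell(self, kind: str, text: str = "", position: Optional[int] = None) -> Cell:
        """Insert a cell at the given position, or append when position is None."""
        if kind not in ("markdown", "code"):
            raise ValueError(f"unknown cell kind: {kind}")
        cell = Cell(kind, text)
        if position is None:
            self.cells.append(cell)
        else:
            self.cells.insert(position, cell)
        return cell

    def code_cells(self) -> List[Cell]:
        return [c for c in self.cells if c.kind == "code"]

task = AnalysisTask("first analysis task")
task.add_cell("code", "-- Lua script placeholder")
# Pressing the + button above the code cell adds a description before it.
task.add_cell("markdown", "Query statistical data and visualize it as a line chart", position=0)
kinds = [c.kind for c in task.cells]
```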

도 37은 본 발명의 데이터 관리 플랫폼에서 제공하는 분석 작업 편집기의 사용자 인터페이스를 설명하는 도면이다.Figure 37 is a drawing illustrating a user interface of an analysis task editor provided in the data management platform of the present invention.

In one embodiment, when a user requests execution of a displayed code cell (2058), the analysis task editor can execute the analysis task contained in the code cell (2058) through the analysis task execution unit.

예를 들어, 사용자가 제 1 마크다운 셀(2057)에 “통계 데이터를 조회하여 line chart로 시각화”를 입력하고, 이 설명 블록에 대응하는 Lua 스크렙트를 제 1 코드 셀(2058) 상에 입력할 수 있다.For example, a user can enter “Query statistical data and visualize it as a line chart” in the first Markdown cell (2057) and enter a Lua script corresponding to this description block in the first code cell (2058).

이후, 사용자가 출력된 제 1 코드 셀(2058)의 실행을 요청함에 따라, 분석 작업 편집기는 도면과 같이 제 1 라인 차트(2061)를 시각화할 수 있다.Thereafter, when the user requests execution of the first code cell (2058) output, the analysis task editor can visualize the first line chart (2061) as shown in the drawing.

다른 예를 들면, 사용자가 제 2 마크다운 셀에 “외부 database에 리소스 사용률 테이블을 조회하여 line chart로 시각화”를 입력하고, 이 설명 블록에 대응하는 Lua 스크립트를 제 2 코드 셀 상에 입력할 수 있다.As another example, a user could type “Query a resource usage table in an external database and visualize it as a line chart” in a second Markdown cell, and then enter a Lua script corresponding to this description block in a second code cell.

이후, 사용자가 출력된 제 2 코드 셀의 실행을 요청함에 따라, 분석 작업 편집기는 도면과 같이 제 2 라인 차트를 시각화할 수 있다.Afterwards, when the user requests execution of the second code cell output, the analysis task editor can visualize the second line chart as shown in the drawing.

도 38는 본 발명의 데이터 관리 플랫폼에서 제공하는 분석 작업 편집기의 사용자 인터페이스를 설명하는 도면이다.Figure 38 is a drawing illustrating a user interface of an analysis task editor provided in the data management platform of the present invention.

Thereafter, referring to (a) of FIG. 38, the user can save a first analysis task including the first markdown cell, the first code cell, the second markdown cell, and the second code cell. At this time, the user can set a name for the first analysis task to distinguish it.

Additionally, referring to (b) of FIG. 38, the saved first analysis task can be loaded by the same user or a different user. When the user runs the analysis task editor on the loaded first analysis task, the first analysis task can be executed. To this end, the data management platform of the present invention can provide a list (2062) of at least one previously stored analysis task.

도 39은 본 발명의 데이터 관리 플랫폼이 로그 데이터를 저장하고 조회하는 일 실시 예를 설명하는 도면이다.Figure 39 is a drawing illustrating an embodiment of a data management platform of the present invention storing and retrieving log data.

일 실시예에서, 데이터 관리 플랫폼(10000)은 수집 모듈(20001), 분석 모듈(20002) 및 모니터링 모듈(20005)을 통하여 로그 데이터를 압축하여 데이터베이스(20007) 및 아카이브 저장소(20009) 중 적어도 하나에 저장할 수 있고, 사용자/클라이언트(1000)의 로그 조회 요청에 따라 로그 데이터를 조회할 수 있다.In one embodiment, the data management platform (10000) can compress log data through a collection module (20001), an analysis module (20002), and a monitoring module (20005) and store the compressed log data in at least one of a database (20007) and an archive storage (20009), and can retrieve the log data in response to a log query request from a user/client (1000).

구체적으로, 수집 모듈(20001)은 데이터 수집부(2035)를 포함할 수 있다. 데이터 수집부(2035)는 신규 로그 데이터가 입력되는 경우 해당 로그 데이터를 수집할 수 있다. 일 실시 예에서, 데이터 수집부(2035)는 로그 데이터가 일정 기간 입력되어 쌓인 경우 수집된 로그 데이터를 분석 모듈(20002)에 전달할 수 있다. 분석 모듈(20002)은 압축 처리부(2036) 및 아카이빙 수행부(2037)를 포함할 수 있다. 즉, 분석 모듈(20002)은 수집 모듈(20001)로부터 전달받은 로그 데이터를 분석할 수 있다.Specifically, the collection module (20001) may include a data collection unit (2035). The data collection unit (2035) may collect new log data when new log data is input. In one embodiment, the data collection unit (2035) may transmit the collected log data to the analysis module (20002) when log data is input and accumulated for a certain period of time. The analysis module (20002) may include a compression processing unit (2036) and an archiving performing unit (2037). That is, the analysis module (20002) may analyze the log data received from the collection module (20001).

More specifically, the compression processing unit (2036) can store a plurality of input log data as segments of a predefined period unit (e.g., a day). In addition, the compression processing unit (2036) can store the stored segments in chunks of a predefined first size. That is, according to the present invention, by storing a file containing a plurality of log data as multiple files in chunk units, the log data can be compressed in near real time. The log data is not compressed record by record because compressing each record individually would greatly reduce compression efficiency.

또한, 압축 처리부(2036)는 저장된 청크를 미리 정의된 제 2 크기의 블록 단위로 압축할 수 있다. 즉, 본 발명에 따르면, 해당 청크를 블록 단위로 분할하여 압축을 수행함으로써, 압축 효율을 떨어뜨리지 않으면서도 실제 로그 데이터를 꺼내기 위해 압축을 풀 때에 조회 성능을 높일 수 있다. 이 경우,블록 사이즈는 청크 사이즈보다 작을 수 있다.Additionally, the compression processing unit (2036) can compress the stored chunks into blocks of a predefined second size. That is, according to the present invention, by dividing the chunks into block units and performing compression, the query performance can be improved when decompressing to extract actual log data without reducing compression efficiency. In this case, the block size may be smaller than the chunk size.
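The segment and chunk layout described above might look like the following sketch, which groups log records into per-day segments and then splits each segment into chunks bounded by a byte size. The function names, the tiny `CHUNK_SIZE`, and the sample records are assumptions; block-level compression is left out of this example.

```python
from collections import defaultdict
from datetime import datetime

CHUNK_SIZE = 32  # bytes per chunk; a deliberately tiny, hypothetical value

def segment_by_day(records):
    """Group (timestamp, line) records into per-day segments keyed by ISO date."""
    segments = defaultdict(list)
    for timestamp, line in records:
        segments[timestamp.date().isoformat()].append(line)
    return dict(segments)

def split_into_chunks(lines, chunk_size=CHUNK_SIZE):
    """Split a segment's lines into chunks of at most chunk_size bytes.

    A single line longer than chunk_size still gets a chunk of its own.
    """
    chunks, current, size = [], [], 0
    for line in lines:
        if current and size + len(line) > chunk_size:
            chunks.append(current)
            current, size = [], 0
        current.append(line)
        size += len(line)
    if current:
        chunks.append(current)
    return chunks

records = [
    (datetime(2024, 4, 4, 9, 0), b"login ok user=kim\n"),
    (datetime(2024, 4, 4, 9, 5), b"login fail user=lee\n"),
    (datetime(2024, 4, 5, 8, 0), b"logout user=kim\n"),
]
segments = segment_by_day(records)
chunks = {day: split_into_chunks(lines) for day, lines in segments.items()}
```

Splitting on whole lines keeps every log record inside exactly one chunk, which simplifies later offset-based lookup.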

일 실시 예에서, 압축된 블록을 포함하는 세그먼트는 데이터베이스(20007)에 저장될 수 있다. 이 경우, 데이터베이스(20007)는 온라인 상으로 조회가 가능한 세그먼트를 보관하기 위한 저장소를 포함할 수 있다.In one embodiment, segments containing compressed blocks may be stored in a database (20007). In this case, the database (20007) may include storage for storing segments that can be retrieved online.

아카아빙 수행부(2037)는 데이터베이스(20007)에 저장된 세그먼트들 중 아카이브 저장소(20009)에 저장하기 위한 아카이빙 대상 세그먼트를 결정할 수 있다. 아카이빙 대상 세그먼트를 결정하는 내용은 이하에서 자세히 설명하도록 한다. 이후 아카이빙 수행부(2037)는 아카이빙 대상 세그먼트를 아카이브 저장소(20009)에 이관하여 저장할 수 있다.The archiving unit (2037) can determine segments to be archived among the segments stored in the database (20007) to be stored in the archive storage (20009). The process of determining segments to be archived will be described in detail below. Thereafter, the archiving unit (2037) can transfer and store the segments to be archived in the archive storage (20009).

모니터링 모듈(20005)는 압축 해제부(2038) 및 로그 조회부(2039)를 포함할 수 있다. 압축 해제부(2038)는 데이터베이스(20007)에 저장된 세그먼트 및 아카이브 저장소(20009)에 저장된 아카이빙 대상 세그먼트 중 적어도 하나에 포함된 전체 압축된 로그 데이터 중 사용자/클라이언트(1000)로부터 요청 받은 적어도 하나의 로그 데이터가 포함된 블록만을 압축 해제할 수 있다. 해당 블록을 압축 해제하는 내용은 이하에서 자세히 설명하도록 한다.The monitoring module (20005) may include a decompression unit (2038) and a log query unit (2039). The decompression unit (2038) may decompress only a block containing at least one log data requested by a user/client (1000) among the entire compressed log data contained in at least one of the segments stored in the database (20007) and the archiving target segments stored in the archive storage (20009). The details of decompressing the corresponding block will be described in detail below.

The log query unit (2039) can receive a log query request for log data from the user/client (1000) and, in response to the request, can transmit to the user/client (1000) the log data, among the plurality of log data, that has been decompressed within the monitoring module (20005).

도 40은 본 발명의 데이터 관리 플랫폼이 로그 데이터를 저장하고 조회하는 다른 일 실시 예를 설명하는 도면이다.FIG. 40 is a diagram illustrating another embodiment of a data management platform of the present invention storing and retrieving log data.

일 실시 예에서, 압축 처리부(2036)는 신규 입력된 다수의 로그 데이터를 미리 정의된 기간 단위의 세그먼트로 저장하고, 신규 로그 데이터를 저장할 때 세그먼트의 세그먼트 접근 정보가 존재하는지 확인하고 없으면 해당 세그먼트의 세그먼트 접근 정보를 생성할 수 있다. 여기서, 세그먼트 접근 정보는 해당 세그먼트에 대한 메타 데이터를 포함할 수 있다. 예를 들어, 세그먼트 접근 정보는 해당 세그먼트가 저장 가능한 것인지 여부를 나타내는 정보, 해당 세그먼트가 아카이빙된 것인지 여부를 나타내는 정보 및 해당 세그먼트가 아카이브 저장소(20009) 내에서 아카이빙된 위치 정보 중 적어도 하나를 포함할 수 있다.In one embodiment, the compression processing unit (2036) stores a plurality of newly input log data as segments of a predefined period unit, and when storing new log data, checks whether segment access information of the segment exists, and if not, generates segment access information of the segment. Here, the segment access information may include metadata about the segment. For example, the segment access information may include at least one of information indicating whether the segment is storable, information indicating whether the segment is archived, and information on the location where the segment is archived within the archive storage (20009).

압축 처리부(2036)는 세그먼트를 미리 정의된 제 1 크기의 청크 단위로 저장할 수 있다. 일 실시 예에서, 압축 처리부(2036)는 저장된 오프셋(offset)을 기준으로 각 청크에 대한 인덱스(index)를 생성할 수 있다. 이는, 청크가 압축 파일 형식으로 생성되기 때문에, 해당 압축 파일 내에 각 청크가 어디에 위치하는지를 나타내기 위함 일 수 있다.The compression processing unit (2036) may store segments in chunk units of a predefined first size. In one embodiment, the compression processing unit (2036) may generate an index for each chunk based on the stored offset. This may be to indicate where each chunk is located within the compressed file, since the chunks are generated in a compressed file format.

압축 처리부(2036)는 로그 데이터의 변경이 없는 시점에서 저장된 청크를 미리 정의된 제 2 크기의 블록 단위로 압축하고, 블록 오프셋(offset) 정보와 압축 정보를 생성할 수 있다.The compression processing unit (2036) can compress the stored chunks into blocks of a predefined second size at a point in time when there is no change in the log data, and generate block offset information and compression information.

Here, a compressed block may include compressed data, that is, a file obtained by compressing a chunk in block units. The block offset information may include the position of each compressed block within the chunk and its compressed length. The compression information may include the size of the compressed block and the size of the original file before compression.

아카아빙 수행부(2037)는 아카이빙 수행 시 아카이브 대상 세그먼트를 구분하고, 아카이브 저장소(20009)로 구분된 아카이브 대상 세그먼트를 이관하여 저장하고, 아카이브 저장소(20009)에 대한 아카이브 대상 세그먼트의 접근 정보를 생성할 수 있다.The archiving execution unit (2037) can distinguish the archive target segments when performing archiving, transfer and store the distinguished archive target segments to the archive storage (20009), and generate access information for the archive target segments for the archive storage (20009).

압축 해제부(2038)는 사용자/클라이언트(1000)로부터 다수의 로그 데이터 중 적어도 하나의 로그 데이터에 대한 로그 조회 요청이 수신됨에 따라, 세그먼트 및 아카이브 대상 세그먼트 중 적어도 하나의 세그먼트 접근 정보, 블록 오프셋 정보 및 압축 정보 중 적어도 하나를 이용하여 전체 압축된 로그 데이터 중 요청 받은 적어도 하나의 로그 데이터가 위치하는 블록만 압축을 해제하여, 사용자/클라이언트(1000)에 의해 적어도 하나의 로그 데이터가 조회될 수 있도록 한다.When a log inquiry request for at least one log data among a plurality of log data is received from a user/client (1000), the decompression unit (2038) decompresses only the block in which at least one requested log data is located among the entire compressed log data by using at least one of segment access information, block offset information, and compression information of at least one of the segments and archive target segments, so that at least one log data can be queried by the user/client (1000).
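The selective decompression described above can be sketched with Python's standard `zlib` module. Each block of a chunk is compressed independently, and a block table records its offset, compressed length, and original size, so a single log line can be read by decompressing only the block that contains it. The table layout, the `first_line` and `line_count` fields, and the function names are assumptions for illustration; the specification does not name a compression algorithm.

```python
import zlib

BLOCK_SIZE = 40  # raw bytes per block; a tiny, hypothetical value for the demo

def compress_chunk(lines, block_size=BLOCK_SIZE):
    """Compress a chunk block by block; return (payload, block_table)."""
    blocks, current, size = [], [], 0
    for line in lines:
        current.append(line)
        size += len(line)
        if size >= block_size:
            blocks.append(current)
            current, size = [], 0
    if current:
        blocks.append(current)

    payload, table, first_line = bytearray(), [], 0
    for block in blocks:
        raw = b"".join(block)
        packed = zlib.compress(raw)
        table.append({
            "offset": len(payload),     # position of the block in the payload
            "length": len(packed),      # compressed length
            "raw_size": len(raw),       # size before compression
            "first_line": first_line,   # hypothetical lookup helper
            "line_count": len(block),
        })
        payload += packed
        first_line += len(block)
    return bytes(payload), table

def read_line(payload, table, line_no):
    """Decompress only the block that holds line_no and return that line."""
    for entry in table:
        if entry["first_line"] <= line_no < entry["first_line"] + entry["line_count"]:
            packed = payload[entry["offset"]:entry["offset"] + entry["length"]]
            raw = zlib.decompress(packed)
            return raw.splitlines(keepends=True)[line_no - entry["first_line"]]
    raise IndexError(line_no)

lines = [f"log entry {i}\n".encode() for i in range(10)]
payload, table = compress_chunk(lines)
line_5 = read_line(payload, table, 5)
```

In this demo the ten lines fall into three blocks, and reading line 5 touches only the second block; the other blocks stay compressed.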

To retain log data held in a general database for a long period, the data can be archived in a separate location (e.g., the archive storage (20009)). This requires an export process for the log data, and retrieving the archived log data again requires an import process into the database. Compression and decompression during these export and import processes can consume a significant amount of time. Therefore, according to the present invention, the compression information and block offset information make it possible to query log data without decompressing the entire log data, so the export/import process can be significantly shortened.

도 41은 실시 예에 따른 데이터 관리 장치에 대한 레이어 별 요소들을 개시한 개념도이다.Figure 41 is a conceptual diagram disclosing layer-by-layer elements for a data management device according to an embodiment.

네트워크 패킷을 수집할 때, 일반적인 인트라 네트워크 환경에서는 네트워크 미러링을 통해 네트워크 트래픽을 복사하여 모니터링 장비에 전송하는 것이 일반적이다. 그러나 패킷 수집 장비(예를 들어, 네트워크 탭 또는 스위치 등)가 설치가 불가능한 클라우드 환경의 경우에는 네트워크 미러링을 통해 패킷을 수집하는 방법에 어려움이 있다.When collecting network packets, it's common to use network mirroring to copy network traffic and transmit it to monitoring equipment in typical intranet environments. However, in cloud environments where packet collection equipment (e.g., network taps or switches) cannot be installed, collecting packets through network mirroring presents challenges.

즉, 클라우드 환경의 특성상, 네트워크 토폴로지(network topology)가 물리적으로 분리되어 있고, 가상화된 네트워크 인프라를 사용하기 때문에 전통적인 네트워크 미러링 방법을 사용하는 것이 어렵다. 이로 인해 패킷 유실률이 증가하거나 네트워크 트래픽을 정확하게 모니터링하는데 어려움이 있다.In other words, the nature of cloud environments makes it difficult to use traditional network mirroring methods due to physically separated network topologies and the use of virtualized network infrastructure. This can lead to increased packet loss rates and difficulties in accurately monitoring network traffic.

본 발명은 이러한 문제를 해결하기 위하여 리눅스 커널 레이어의 넷필터(Netfilter) 메커니즘을 이용하도록 한다.The present invention utilizes the Netfilter mechanism of the Linux kernel layer to solve this problem.

에이전트 설치 서버는 OS 커널(kernel) 영역과 OS 사용자(user) 영역을 포함할 수 있다. 또한, 커널 영역은 패킷 필터링부(1051) 및 패킷 적재부(1052)를 포함할 수 있고, 사용자 영역은 전송 분배부(1053), 압축 처리부(1054) 및 메시지 전송부(1055)를 포함할 수 있다.The agent installation server may include an OS kernel area and an OS user area. Furthermore, the kernel area may include a packet filtering unit (1051) and a packet loading unit (1052), while the user area may include a transmission distribution unit (1053), a compression processing unit (1054), and a message transmission unit (1055).

일 실시 예에서, 에이전트 설치 서버는 클라우드 서버인 것을 특징으로 한다. 이에 따라, 패킷 수집 장비의 설치가 불가능한 클라우드 서버에 직접 에이전트를 설치하여 패킷을 수집할 수 있도록 한다. 에이전트를 클라우드 서버에서 실행하는 방법은, 데이터 관리 장치의 접속 정보 및 처리 옵션을 설정하고 에이전트를 옵션에 기초하여 실행할 수 있다. 이때, 에이전트 실행 옵션은 데이터 관리 장치의 IP 주소 및 수신 포트 정보, 압축 전송 여부, 재전송 시도 횟수, 큐 최대 적재 수량, 엔진으로 1회 전송 시 최대 패킷 수량, 넷필터 큐의 수신 버퍼 사이즈, 넷필터 큐의 커널 큐 최대 길이, 엔진 전송 큐의 개수(채널 수), 로그 레벨, 로그 경로, 롤링되는 로그 사이즈 등을 포함할 수 있다.In one embodiment, the agent installation server is characterized by being a cloud server. Accordingly, the agent can be installed directly on a cloud server where packet collection equipment cannot be installed to enable packet collection. The method for executing the agent on the cloud server can set connection information and processing options of the data management device and execute the agent based on the options. At this time, the agent execution options can include the IP address and receiving port information of the data management device, whether to perform compressed transmission, the number of retransmission attempts, the maximum queue load, the maximum number of packets per transmission to the engine, the receive buffer size of the netfilter queue, the maximum kernel queue length of the netfilter queue, the number of engine transmission queues (the number of channels), the log level, the log path, the rolling log size, etc.
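The execution options listed above could be collected as in the following sketch. The option names, defaults and the `AgentOptions` type are assumptions made for illustration; the specification names the options but does not define a command-line syntax.

```python
import argparse
from dataclasses import dataclass


@dataclass
class AgentOptions:
    engine_host: str        # IP address of the data management device
    engine_port: int        # receiving port of the data management device
    compress: bool          # whether to compress before transmission
    retry_count: int        # number of retransmission attempts
    queue_max: int          # maximum number of items held in a queue
    batch_max: int          # maximum packets per transmission to the engine
    nfq_recv_buffer: int    # netfilter queue receive buffer size (bytes)
    nfq_queue_len: int      # maximum kernel queue length of the netfilter queue
    channels: int           # number of engine transmission queues
    log_level: str
    log_path: str
    log_roll_size: int      # size at which the log file is rolled (bytes)


def parse_options(argv):
    """Parse hypothetical agent options from a list of arguments."""
    p = argparse.ArgumentParser(description="packet agent (illustrative)")
    p.add_argument("--engine-host", required=True)
    p.add_argument("--engine-port", type=int, required=True)
    p.add_argument("--compress", action="store_true")
    p.add_argument("--retry-count", type=int, default=3)
    p.add_argument("--queue-max", type=int, default=10000)
    p.add_argument("--batch-max", type=int, default=100)
    p.add_argument("--nfq-recv-buffer", type=int, default=4 * 1024 * 1024)
    p.add_argument("--nfq-queue-len", type=int, default=4096)
    p.add_argument("--channels", type=int, default=4)
    p.add_argument("--log-level", default="INFO")
    p.add_argument("--log-path", default="agent.log")
    p.add_argument("--log-roll-size", type=int, default=10 * 1024 * 1024)
    return AgentOptions(**vars(p.parse_args(argv)))
```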

상술한 실시 예와 같이 본 발명의 네트워크 관리 장치는 NIC(Network Interface Card)와 같은 네트워크 장비(1001)를 통하여 패킷을 수집할 수 있다.As in the above-described embodiment, the network management device of the present invention can collect packets through network equipment (1001) such as a NIC (Network Interface Card).

The packet filtering unit (1051) can identify the target packets among the packets collected through the network equipment (1001) by applying packet filter rules configured on an IP/port basis, and can pass them to the packet loading unit (1052). In one embodiment, the packet filtering unit (1051) can forward or skip each network packet according to the packet filter rules. The packet filter rules can be changed by user definition.
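An IP/port rule check of the kind described above might look like the sketch below. The `FilterRule` type and `is_target` function are hypothetical, and the matching is done in user-space Python purely for illustration; in the described device the filtering takes place in the kernel area.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FilterRule:
    network: str          # destination network in CIDR form, e.g. "10.0.0.0/24"
    port: Optional[int]   # destination port; None matches any port


def is_target(rules, dst_ip, dst_port):
    """Return True if the packet matches a rule (forward), False otherwise (skip)."""
    addr = ipaddress.ip_address(dst_ip)
    for rule in rules:
        in_network = addr in ipaddress.ip_network(rule.network)
        if in_network and rule.port in (None, dst_port):
            return True
    return False
```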

패킷 적재부(1052)는 패킷 필터링부(1051)을 통하여 수신한 패킷을 패킷 큐에 적재할 수 있다. 여기에서, 패킷 큐는 넷필터 큐(netfilter queue)에 대응한다.The packet loading unit (1052) can load packets received through the packet filtering unit (1051) into a packet queue. Here, the packet queue corresponds to a netfilter queue.

In one embodiment, the netfilter queue is set up by installing the netfilter queue and configuring a netfilter rule for the target port. The netfilter configuration can then be verified with the iptables --list command. Because a netfilter queue allows packets to be handled in OS user space, user-defined packet processing becomes possible.

전송 분배부(1053)는 패킷 큐에 적재된 패킷 데이터를 데이터 관리 장치로 병렬로 전송하기 위해 IP를 기준으로 전송 큐에 패킷을 분배할 수 있다.The transmission distribution unit (1053) can distribute packets to the transmission queue based on IP in order to transmit packet data loaded in the packet queue in parallel to the data management device.
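The IP-based distribution can be sketched as follows. The specification states only that packets are distributed "based on IP"; taking the integer value of the address modulo the number of channels is an assumption chosen here because it keeps all packets of one address in the same queue. `pick_channel` and `distribute` are hypothetical names.

```python
import ipaddress
from collections import deque


def pick_channel(src_ip, channel_count):
    """Map an IP address to a transmission queue index (assumed scheme)."""
    return int(ipaddress.ip_address(src_ip)) % channel_count


def distribute(packets, channel_count):
    """Place (src_ip, payload) pairs into per-channel queues."""
    queues = [deque() for _ in range(channel_count)]
    for src_ip, payload in packets:
        queues[pick_channel(src_ip, channel_count)].append(payload)
    return queues
```

Keeping one address on one channel preserves packet order per host while still letting the channels be drained in parallel.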

압축 처리부(1054)는 전송 큐에 적재된 여러 패킷 데이터를 하나의 큰 전송 단위로 묶어서 압축한 뒤 메시지 전송부(1055)로 전달할 수 있다. 이때, 패킷 데이터를 병합(merge)하여 압축하면 압축 효율을 높일 수 있다.The compression processing unit (1054) can bundle multiple packet data loaded in the transmission queue into a single large transmission unit, compress it, and then transmit it to the message transmission unit (1055). At this time, if the packet data is merged and compressed, the compression efficiency can be improved.

In one embodiment, the compression processing unit (1054) may compress packets using the LZ4 algorithm to increase compression efficiency. So that packets compressed with LZ4 by the compression processing unit (1054) can later be decompressed by the decompression unit (1056), a compression flag field may be added to the protocol. Specifically, when data is transmitted, the protocol message is divided into a header and a body: the header carries the length of the body data, the compression flag, and the original (uncompressed) length, and the body carries the merged packet data compressed with the LZ4 algorithm. On the receiving side, the body data is decompressed if the compression flag is set, and the decompressed length is compared with the original length to verify the result.

Through this, a compression ratio of 2.101 can be achieved, with benchmark figures of 780 MB/s for compression speed and 4970 MB/s for decompression speed.
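The header/body framing described above can be sketched as follows. Two points are assumptions rather than details from the specification: the field widths of the header (4-byte body length, 1-byte compression flag, 4-byte original length) and the length-prefixed way the packets are merged. zlib from the standard library stands in for LZ4 so that the sketch runs without third-party packages.

```python
import struct
import zlib

# Assumed header layout: body length, compression flag, original length.
HEADER = struct.Struct("!IBI")


def encode_message(packets, compress=True):
    """Merge packets into one unit, optionally compress, and prepend the header."""
    merged = b"".join(struct.pack("!I", len(p)) + p for p in packets)
    body = zlib.compress(merged) if compress else merged
    return HEADER.pack(len(body), int(compress), len(merged)) + body


def decode_message(message):
    """Decompress according to the flag, verify the length, and split the packets."""
    body_len, flag, original_len = HEADER.unpack_from(message)
    body = message[HEADER.size:HEADER.size + body_len]
    merged = zlib.decompress(body) if flag else body
    if len(merged) != original_len:
        raise ValueError("length mismatch after decompression")
    packets, pos = [], 0
    while pos < len(merged):
        (size,) = struct.unpack_from("!I", merged, pos)
        packets.append(merged[pos + 4:pos + 4 + size])
        pos += 4 + size
    return packets
```

Carrying the original length in the header lets the receiver detect a truncated or corrupted body before the packets are handed on for analysis.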

메시지 전송부(1055)는 압축 처리부(1054)로부터 수신한 메시지 정보를 전송 정보와 TCP 통신을 이용하여 데이터 관리 장치의 압축 해제부(1056)로 전송할 수 있다.The message transmission unit (1055) can transmit message information received from the compression processing unit (1054) to the decompression unit (1056) of the data management device using transmission information and TCP communication.

압축 해제부(1056)는 적어도 하나 이상의 에이전트가 설치된 클라우드 서버로부터 TCP 통신을 이용하여 압축된 데이터를 수신할 수 있다. 압축 해제부(1056)는 수신한 데이터를 압축 해제 후 TCP 스트림 단위로 분석하기 위해 IP/Port 기준으로 패킷 데이터를 디스패쳐 큐(Dispatcher queue)에 적재할 수 있다.The decompression unit (1056) can receive compressed data using TCP communication from a cloud server on which at least one agent is installed. The decompression unit (1056) can load packet data into a dispatcher queue based on IP/Port to analyze the received data in TCP stream units after decompressing the data.

In one embodiment, so that packets compressed with the LZ4 algorithm can be decompressed, a compression flag field indicating whether the body is compressed may be added to the protocol between the agent installed on the cloud server and the decompression unit (1056). This is as described above.

패킷 전송부(1057)는 디스패쳐 큐에 적재된 패킷 데이터를 TCP 패킷 분석 프로세스로 파이프(Pipe) 또는 TCP 통신으로 전송할 수 있다. 여기에서, TCP 패킷 분석 프로세스는 조각난 TCP 데이터를 재병합하고 각 프로토콜 별로 분석 처리를 수행하는 프로세스에 대응한다. 각 프로토콜 별로 분석 처리를 수행하는 프로세스는 상술한 실시 예를 참고하도록 한다.The packet transmission unit (1057) can transmit packet data loaded in the dispatcher queue to the TCP packet analysis process via a pipe or TCP communication. Here, the TCP packet analysis process corresponds to a process that reassembles fragmented TCP data and performs analysis processing for each protocol. The process that performs analysis processing for each protocol is described in the above-described embodiment.

이를 통해, 미러링으로 패킷을 수집하기 어려운 상황에서도 패킷 수집이 가능하다는 장점이 있다. 또한, 패킷 수집의 효율을 높이기 위하여 모든 패킷을 수집하는 것이 아닌 목적으로 하는 대상 패킷만을 수집할 수 있다.This provides the advantage of enabling packet collection even in situations where mirroring is difficult. Furthermore, to increase packet collection efficiency, only targeted packets can be collected, rather than all packets.

도 42는 실시 예에 따른 데이터 관리 장치가 에이전트를 모니터링하는 실시 예를 설명하는 도면이다.FIG. 42 is a diagram illustrating an embodiment in which a data management device according to an embodiment monitors an agent.

일 실시 예에서, 데이터 관리 장치는 콘솔(10001, console), 수집 모듈(20001) 및 컨트롤러(20011)를 포함할 수 있다.In one embodiment, the data management device may include a console (10001), a collection module (20001), and a controller (20011).

콘솔(10001)은 데이터 관리 장치의 사용자 인터페이스 모듈의 기능을 수행할 수 있다. 일 실시 예에서, 콘솔(10001)은 에이전트 상태를 수집할지 여부를 설정 받을 수 있다. 콘솔(10001)이 에이전트 상태를 수집하는 것으로 설정된 경우, 수집 모듈 및 컨트롤러를 통해 에이전트의 상태를 수집하고, 에이전트 상태를 모니터링할 수 있다. 이때, 콘솔(10001)은 에이전트 상태를 모니터링하는 사용자 인터페이스를 제공할 수 있다. 여기에서, 에이전트 상태를 모니터링하는 사용자 인터페이스는 에이전트 IP, 에이전트 연결 상태(on/off), 수신 개수, 초당 수신 개수 정보를 포함할 수 있다.The console (10001) can perform the functions of a user interface module of a data management device. In one embodiment, the console (10001) can be configured to collect agent status. If the console (10001) is configured to collect agent status, the console can collect agent status and monitor agent status through a collection module and a controller. At this time, the console (10001) can provide a user interface for monitoring agent status. Here, the user interface for monitoring agent status can include information such as agent IP, agent connection status (on/off), number of receptions, and number of receptions per second.

콘솔(10001)을 통하여 에이전트 상태를 수집하는 것으로 설정된 경우, 컨트롤러(20011)는 수집 모듈(20001)이 적어도 하나의 에이전트의 상태를 수집하도록 제어하는 명령어를 처리할 수 있다. 이때, 컨트롤러(20011)와 수집 모듈(20001) 간의 통신은 명령어 기반으로 이루어질 수 있다. 컨트롤러(20011)는 명령어를 통해 수집 모듈(20001)에 특정 작업을 지시하고, 수집 모듈(20001)은 수집한 에이전트의 상태 정보를 명령어 응답으로 컨트롤러(20011)에 전달할 수 있다.When the agent status is set to be collected through the console (10001), the controller (20011) can process a command that controls the collection module (20001) to collect the status of at least one agent. At this time, communication between the controller (20011) and the collection module (20001) can be based on commands. The controller (20011) can instruct the collection module (20001) to perform a specific task through a command, and the collection module (20001) can transmit the status information of the collected agent to the controller (20011) as a command response.

일 실시 예에서, 수집 모듈(20001)은 컨트롤러(20011)의 제어에 기초하여 상태 조회 명령어를 처리할 수 있다. 이에 따라, 수집 모듈(20001)은 에이전트별 전송 상태를 수집할 수 있다. 에이전트별 전송 상태는 에이전트 IP, 에이전트 연결 상태(on/off), 전체 수신 개수(total received count), 초당 수신 개수(received count per second) 등을 포함할 수 있다.In one embodiment, the collection module (20001) can process a status inquiry command based on the control of the controller (20011). Accordingly, the collection module (20001) can collect transmission status for each agent. The transmission status for each agent can include the agent IP, agent connection status (on/off), total received count, received count per second, etc.

보다 상세하게는, 수집 모듈(20001)은 에이전트(1011, 1012)의 상태를 수집할 수 있다.More specifically, the collection module (20001) can collect the status of the agents (1011, 1012).

예를 들어, 제 1 에이전트(1011)가 정상 동작하는 경우, 수집 모듈(20001)은 제 1 에이전트(1011)의 전송 상태를 수집하여, 컨트롤러(20011)에게 전달할 수 있다.For example, if the first agent (1011) is operating normally, the collection module (20001) can collect the transmission status of the first agent (1011) and transmit it to the controller (20011).

다른 예를 들어, 제 2 에이전트(1012)가 접속이 종료된 경우, 수집 모듈(20001)은 제 2 에이전트의 접속 상태를 갱신할 수 있다. 수집 모듈(20001)은 제 2 에이전트의 접속 종료 상태를 갱신하여 컨트롤러(20011)에게 전달할 수 있다. 수집 모듈(20001)의 다른 기능은 상술한 실시 예를 참고하도록 한다.For another example, if the connection of the second agent (1012) is terminated, the collection module (20001) can update the connection status of the second agent. The collection module (20001) can update the connection termination status of the second agent and transmit it to the controller (20011). For other functions of the collection module (20001), refer to the above-described embodiment.
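The per-agent transmission status handled by the collection module can be sketched as follows. The `AgentStatus` and `StatusTable` types are hypothetical, and computing the per-second figure from the difference between two consecutive status inquiries is an assumption; the specification lists the fields but not how they are derived.

```python
from dataclasses import dataclass


@dataclass
class AgentStatus:
    ip: str
    connected: bool = False
    total_received: int = 0
    received_per_second: float = 0.0


class StatusTable:
    def __init__(self):
        self._agents = {}  # ip -> AgentStatus
        self._last = {}    # ip -> (time of last inquiry, total at that time)

    def on_receive(self, ip, count):
        """Record `count` received items from an agent and mark it connected."""
        status = self._agents.setdefault(ip, AgentStatus(ip))
        status.connected = True
        status.total_received += count

    def on_disconnect(self, ip):
        """Update the connection state when an agent's connection ends."""
        if ip in self._agents:
            self._agents[ip].connected = False

    def snapshot(self, now):
        """Build the response to a status inquiry; `now` is in seconds."""
        for ip, status in self._agents.items():
            prev_time, prev_total = self._last.get(ip, (now, status.total_received))
            elapsed = now - prev_time
            if elapsed > 0:
                status.received_per_second = (status.total_received - prev_total) / elapsed
            self._last[ip] = (now, status.total_received)
        return list(self._agents.values())
```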

컨트롤러(20011)는 수집 모듈(20001)이 에이전트(1011, 1012)로부터 수집한 정보를 콘솔(10001)을 통하여 클라이언트에게 제공할 수 있다.The controller (20011) can provide information collected from the agents (1011, 1012) by the collection module (20001) to the client through the console (10001).

이에 따라, 클라이언트는 본 발명의 데이터 관리 장치의 콘솔(10001)을 통하여 클라우드 서버에 설치된 에이전트의 접속 상태를 모니터링할 수 있다.Accordingly, the client can monitor the connection status of the agent installed on the cloud server through the console (10001) of the data management device of the present invention.

도 43은 실시 예에 따른 데이터 관리 방법의 흐름도이다.Figure 43 is a flowchart of a data management method according to an embodiment.

클라우드 서버에 설치된 적어도 하나의 에이전트로부터 수집된 패킷의 압축을 해제할 수 있다(S101).Packets collected from at least one agent installed on a cloud server can be decompressed (S101).

일 실시 예에서, 적어도 하나의 에이전트로부터 수집된 패킷은 넷필터 큐(netfilter queue)에 적재된 뒤, 제 1 단위로 병합되고(merge) 압축된 것을 특징으로 한다. 여기에서, 제 1 단위로 병합하는 것은 기존에 적재된 여러 패킷 데이터를 하나의 큰 전송 단위로 묶는 것을 의미할 수 있다. 즉, 제 1 단위는 기존에 적재된 패킷 데이터보다 더 큰 전송 단위에 해당한다.In one embodiment, packets collected from at least one agent are loaded into a netfilter queue, and then merged and compressed into a first unit. Here, merging into a first unit may mean bundling multiple previously loaded packet data into a single large transmission unit. In other words, the first unit corresponds to a larger transmission unit than the previously loaded packet data.

일 실시 예에서, 적어도 하나의 에이전트로부터 수집된 패킷은 넷필터 큐에 적재된 뒤 전송 큐에 분배될 수 있고, 전송 큐에 분배된 패킷 데이터는 제 1 단위로 병합(merge)되어 압축될 수 있다.In one embodiment, packets collected from at least one agent may be loaded into a netfilter queue and then distributed to a transmission queue, and packet data distributed to the transmission queue may be merged and compressed as a first unit.

또한, 수집된 패킷은 패킷의 프로토콜의 헤더(header)와 본문(body)이 분리되어 압축되고, 헤더는 본문 데이터의 길이, 압축 여부 및 원본 길이를 포함하고, 본문은 상기 병합된 패킷 데이터를 포함할 수 있다. 이에 대하여는, 도 41에서 상술한 내용을 참고하도록 한다.Additionally, the collected packets are compressed by separating the header and body of the packet protocol, and the header includes the length of the body data, whether it is compressed, and the original length, and the body may include the merged packet data. For this, please refer to the contents described above in Fig. 41.

압축이 해제된 패킷을 패킷의 프로토콜에 기초하여 분석할 수 있다(S103). 이에 대하여는, 도 1 내지 도 11에서 상술한 내용을 참고하도록 한다.The decompressed packet can be analyzed based on the packet's protocol (S103). For this, refer to the contents described above in FIGS. 1 to 11.

일 실시 예에서, 적어도 하나의 에이전트의 상태를 수집할 수 있다. 또한, 수집된 에이전트 상태를 사용자 인터페이스를 통하여 제공할 수 있다. 이때, 에이전트 상태는 에이전트 IP, 에이전트 연결 상태(on/off), 수신 개수, 초당 수신 개수 정보 중 적어도 하나를 포함할 수 있다. 이에 대하여는, 도 42에서 상술한 내용을 참고하도록 한다.In one embodiment, the status of at least one agent can be collected. Furthermore, the collected agent status can be provided through a user interface. The agent status can include at least one of the following information: agent IP address, agent connection status (on/off), number of receptions, and number of receptions per second. For more information, please refer to the details described above in Figure 42.

도 44는 일 실시 예에 따른 프록시 서버가 다른 클라우드 서비스에 연결되는 실시 예를 설명하는 도면이다.FIG. 44 is a diagram illustrating an embodiment in which a proxy server according to one embodiment is connected to another cloud service.

클라이언트는 HTTP, RFC, SAP GUI 프로토콜을 통해 SAP 시스템에 접근할 수 있다.Clients can access SAP systems via HTTP, RFC, and SAP GUI protocols.

본 발명은 클라이언트와 SAP 시스템 사이의 프록시 서버(200)를 구성하여 HTTP, RFC, SAP GUI 프로토콜을 통해 송수신하는 패킷 데이터를 수집하고 분석하고자 한다.The present invention seeks to collect and analyze packet data transmitted and received via HTTP, RFC, and SAP GUI protocols by configuring a proxy server (200) between a client and an SAP system.

More specifically, a client can access a web server corresponding to the SAP system through the user's web browser. A proxy server (200) can be placed between the user's web browser and the web server to collect the packet data and send it to an analysis system for logging. If the packet data is received over HTTPS, the SSL encryption can be removed as described in the above embodiment, after which the HTTP request/response is extracted and sent to the analysis system.

특히, 본 발명에서는 상술한 프록시 서버(200)가 HTTP/HTTPS 프로토콜에 기반한 패킷 데이터 뿐만 아니라, SAP GUI 프로토콜에 의한 경우에도 의미 있는 패킷 데이터를 분석하여 추출할 수 있는 방안을 제공한다. 즉, 본 발명의 프록시 서버(200)는 하나의 프록시 서버(200) 안에서 HTTP 프록시와 SAP GUI 프록시의 기능을 모두 수행할 수 있다.In particular, the present invention provides a method by which the above-described proxy server (200) can analyze and extract meaningful packet data not only based on HTTP/HTTPS protocols but also based on SAP GUI protocols. That is, the proxy server (200) of the present invention can perform the functions of both HTTP proxy and SAP GUI proxy within a single proxy server (200).

일 실시 예에서, 프록시 서버(200)는 프록시 클라이언트 통신부(210), 프록시 데이터 추출부(220), 프록시 서버 통신부(230) 및 프록시 데이터 전송부(240)를 포함할 수 있다.In one embodiment, the proxy server (200) may include a proxy client communication unit (210), a proxy data extraction unit (220), a proxy server communication unit (230), and a proxy data transmission unit (240).

프록시 클라이언트 통신부(210)는 프록시 설정으로 프록시 서비스를 구성하고 접근하는 클라이언트와 프록시 서버(200) 간 통신 유형(예를 들어, HTTP, GUI, RFC 등)별로 처리를 수행하고 프록시 데이터 추출부(220)로 요청 데이터를 전달할 수 있다. 이때, 프록시 설정은 프록시 서비스 포트 정보, OS 서비스 파일 위치, 전송 큐 길이 등을 포함할 수 있다.The proxy client communication unit (210) can configure a proxy service with proxy settings, perform processing according to the communication type (e.g., HTTP, GUI, RFC, etc.) between the accessing client and the proxy server (200), and transmit request data to the proxy data extraction unit (220). At this time, the proxy settings can include proxy service port information, OS service file location, transmission queue length, etc.

또한, 프록시 클라이언트 통신부(210)는 프로토콜이 SAP GUI, RFC 프로토콜인 경우, NI 프로토콜(Network Interface protocol)을 분석하여 SAP 시스템에서 처리하는 단위의 패킷을 추출하여 전달할 수 있다. 여기에서, NI 프로토콜을 분석하여 전달하는 실시 예에 대하여 자세한 내용은 후술하도록 한다.In addition, the proxy client communication unit (210) can analyze the NI protocol (Network Interface protocol) and extract and transmit packets of units processed by the SAP system when the protocol is the SAP GUI or RFC protocol. Here, an embodiment of analyzing and transmitting the NI protocol will be described in detail later.

For the SAP GUI and RFC protocols, the proxy data extraction unit (220) can load the request/response data received from the proxy client communication unit (210) and the proxy server communication unit (230) into a transmission queue packet by packet. For the HTTP protocol, the proxy data extraction unit (220) can combine the request/response data received from the proxy client communication unit (210) and the proxy server communication unit (230) into request/response pairs and load each pair into the transmission queue.

프록시 서버 통신부(230)는 프록시 서비스 별로 프록시 설정 내 대상 시스템(예를 들어, SAP 시스템)으로 클라이언트의 요청을 전달할 수 있다. 일 실시 예에서, 프록시 서버 통신부(230)는 SAP GUI, RFC 프로토콜의 경우, NI 프로토콜 내에서 연결하기 위한 서비스 포트가 대상 SAP 시스템 기준의 문자열로 전달되기 때문에 OS 서비스 파일에서 해당 문자열로 서비스 포트를 식별하여 SAP 시스템의 서비스 포트로 클라이언트 요청을 전달할 수 있다. 이를 위하여, OS 서비스 파일은 포트 정보를 지칭하는 문자열 정보 리스트를 포함할 수 있다.The proxy server communication unit (230) can forward a client request to a target system (e.g., an SAP system) within the proxy settings for each proxy service. In one embodiment, in the case of the SAP GUI and RFC protocols, the proxy server communication unit (230) can forward a client request to the service port of the SAP system by identifying the service port with the corresponding string in the OS service file, since the service port for connection within the NI protocol is forwarded as a string based on the target SAP system. For this purpose, the OS service file can include a list of string information indicating port information.
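The lookup from a service string to a port number can be sketched as follows, assuming the OS service file uses the common services-file layout of `name port/protocol [aliases]` with `#` comments. The functions `parse_services` and `resolve_port` are hypothetical, and the entry names used in the usage example are illustrative only.

```python
def parse_services(text):
    """Parse services-file text into a {(name, protocol): port} table."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # ignore malformed lines
        port, proto = fields[1].split("/", 1)
        for name in [fields[0]] + fields[2:]:  # primary name plus aliases
            table[(name, proto)] = int(port)
    return table


def resolve_port(table, service, proto="tcp"):
    """Return the port for a service string; numeric strings pass through."""
    if service.isdigit():
        return int(service)
    return table[(service, proto)]
```

For example, with a file containing the line `sapdp00 3200/tcp`, `resolve_port(table, "sapdp00")` yields 3200. On a host whose own services file already holds the entries, the standard library call `socket.getservbyname` performs the same lookup.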

또한, 프록시 서버 통신부(230)는 프록시 서버와 대상 시스템 간 통신 유형별로 요청/응답에 대한 처리를 수행하고 프록시 데이터 추출부(220)로 응답 데이터를 전달할 수 있다.Additionally, the proxy server communication unit (230) can process requests/responses according to the communication type between the proxy server and the target system and transmit response data to the proxy data extraction unit (220).

프록시 데이터 전송부(240)는 전송 큐에 적재된 프록시 데이터를 전송 정보를 참조하여 데이터 관리 장치의 프록시 데이터 수신부(250)로 전달할 수 있다. 여기에서, 전송 정보는 프록시 설정 정보를 포함할 수 있다. 예를 들어, 프록시 설정 정보는 데이터 관리 장치 IP, 프록시 데이터 수신부(250) 포트 정보, 전송에 사용할 쓰레드(thread) 수, 1회 전송 시 프록시 데이터 건 수, 전송 큐의 크기 및 전송 시 재시도 횟수 등을 포함할 수 있다.The proxy data transmission unit (240) can transmit proxy data loaded in the transmission queue to the proxy data reception unit (250) of the data management device by referring to the transmission information. Here, the transmission information can include proxy setting information. For example, the proxy setting information can include the data management device IP, proxy data reception unit (250) port information, the number of threads to be used for transmission, the number of proxy data items per transmission, the size of the transmission queue, and the number of retries during transmission.

In one embodiment, the data management device (100) of the present invention may include a packet data collection and filter unit (110), a TCP packet reassembly unit (120), a proxy data receiving unit (250), a protocol analysis unit (130), an audit log generation and transmission unit (140), an RFC information request unit (150), and a certificate management unit (160).

이때, 패킷 데이터 수집 및 필터부(110)는 NIC 및 에이전트로부터 패킷을 수집하고 패킷을 필터링하여 TCP 패킷 재조립부(120)에 전달할 수 있다.At this time, the packet data collection and filter unit (110) can collect packets from the NIC and agent, filter the packets, and transmit them to the TCP packet reassembly unit (120).

TCP 패킷 재조립부(120)는 조각난 TCP 패킷을 재조립하여 프로토콜 분석부(130)에게 전달할 수 있다.The TCP packet reassembly unit (120) can reassemble fragmented TCP packets and transmit them to the protocol analysis unit (130).

프록시 데이터 수신부(250)는 프록시 서버로부터 데이터를 수신할 수 있다. 프록시 데이터 수신부(250)는 데이터 관리 장치(100)의 엔진 서버의 설정 정보에 기초하여 수신된 데이터를 수신 및 분배 처리할 수 있다. 예를 들어, 엔진 서버의 설정 정보는 SAP GUI 프로토콜의 처리 가능한 분석기(analyzer)의 수, SAP RFC 프로토콜의 처리 가능한 분석기 수, 분석기 포트 정보 등을 포함할 수 있다.The proxy data receiving unit (250) can receive data from a proxy server. The proxy data receiving unit (250) can receive and distribute the received data based on the configuration information of the engine server of the data management device (100). For example, the configuration information of the engine server can include the number of analyzers capable of processing the SAP GUI protocol, the number of analyzers capable of processing the SAP RFC protocol, analyzer port information, etc.

프로토콜 분석부(130)는 SAP GUI, RFC, HTTP 프로토콜 별로 수신된 데이터를 분석하고, 수신된 데이터가 SNC/SSL로 암호화되어 있는 경우, 인증서를 이용하여 데이터를 복호화할 수 있다. 또한, 프로토콜 분석부(130)는 RFC 프로토콜에 기반한 데이터를 분석하기 위한 RFC 구조 정보가 없는 경우, RFC 정보 요청 파일을 생성할 수 있다. 프로토콜 분석부(130)는 RFC 구조 정보가 있는 경우, RFC 구조 정보에 기초하여 RFC 프로토콜에 기반한 데이터를 분석할 수 있다. 이를 위하여 프로토콜 분석부(130)는 적어도 하나의 분석기(Analyzer)를 포함할 수 있다.The protocol analysis unit (130) analyzes the received data according to the SAP GUI, RFC, and HTTP protocols, and if the received data is encrypted with SNC/SSL, the data can be decrypted using a certificate. In addition, the protocol analysis unit (130) can generate an RFC information request file if there is no RFC structure information for analyzing data based on the RFC protocol. If there is RFC structure information, the protocol analysis unit (130) can analyze data based on the RFC protocol based on the RFC structure information. For this purpose, the protocol analysis unit (130) can include at least one analyzer.

The audit log generation and transmission unit (140) can generate an audit log by converting and processing the data analyzed for each protocol according to defined field rules, and can transmit the generated audit log to an analysis system. Here, the analysis system refers to the process that stores the audit log, extracts personal information from the audit log, and performs correlation analysis; for that process, refer to the above-described embodiment. The analysis system need not be a separate system and may of course be implemented by some of the modules of the data management device (100).

RFC 정보 요청부(150)는 RFC 정보 요청 파일을 감지하는 경우, 대상 시스템으로 RFC 구조 정보를 요청할 수 있다.When the RFC information request unit (150) detects an RFC information request file, it can request RFC structure information from the target system.

인증서 관리부(160)는 인증서 정보를 데이터 관리 장치의 운영자가 관리하고 등록된 인증서를 프로토콜 분석부(130)에서 사용할 수 있도록 동기화할 수 있다.The certificate management unit (160) can manage certificate information by the operator of the data management device and synchronize the registered certificates so that they can be used in the protocol analysis unit (130).

이를 통해 HTTP, RFC, SAP GUI 프로토콜을 통해 SAP 시스템에 접근하는 사용자들의 로그를 별도의 프로그램 설치 없이 수집할 수 있다는 장점이 있다.This has the advantage of being able to collect logs from users accessing the SAP system via HTTP, RFC, and SAP GUI protocols without installing a separate program.

또한, 프록시 서버(200)를 사용하여 HTTP, RFC, SAP GUI 프로토콜을 통한 데이터를 수집하고 모니터링할 수 있다는 장점이 있다.Additionally, there is an advantage in that data can be collected and monitored through HTTP, RFC, and SAP GUI protocols using a proxy server (200).

도 45는 일 실시 예에 따른 프록시 서버의 기능을 설명하는 도면이다.Figure 45 is a diagram illustrating the function of a proxy server according to one embodiment.

일 실시 예에서, 상술한 기능을 갖는 프록시 서버(200)는 클라이언트 A가 대상 시스템(target system)을 이용할 때 사용될 수 있다. 여기에서, 클라이언트 A는 SAP 시스템을 이용하는 사용자를 예로 들 수 있다. 이때, 클라이언트 A는 하나의 대상 시스템을 하나의 프록시 서버(200)로 이용하는 것이 아닌 복수 개의 대상 시스템을 이용하기 위해 복수 개의 프록시 서버(200)를 이용할 수 있음은 물론이다.In one embodiment, a proxy server (200) having the above-described functionality may be used when Client A utilizes a target system. Here, Client A may be, for example, a user utilizing an SAP system. In this case, it goes without saying that Client A may utilize multiple proxy servers (200) to utilize multiple target systems, rather than utilizing one target system with one proxy server (200).

본 발명의 프록시 서버(200)는 적어도 하나의 프록시 워커(Proxy Worker, 201)와 적어도 하나의 프록시 워커(201)를 관리하는 프록시 와치독(Proxy Watchdog, 202)을 포함할 수 있다.The proxy server (200) of the present invention may include at least one proxy worker (Proxy Worker, 201) and a proxy watchdog (Proxy Watchdog, 202) that manages at least one proxy worker (201).

The proxy worker (201) can act as an intermediary between client A and the target system. The proxy worker (201) can collect the data exchanged between client A and the target system and deliver it to the proxy data receiving unit (250) of the data management device (100).

The proxy watchdog (202) can receive configuration data for the proxy workers (201) from the proxy manager (170) of the console of the data management device (100), and can manage and control the proxy workers (201). In addition, the proxy watchdog (202) can collect and monitor the status and performance metrics of the proxy workers (201). The proxy manager (170) can, through the proxy watchdog (202), manage the settings of each proxy worker (201) and monitor its status.

In one embodiment, the proxy data receiving unit (250) of the data management device (100) may receive the data collected by the proxy workers (201), have it analyzed by the protocol analysis unit (130) within the same engine, and deliver the result to client B. Here, client B is a user of the data management device (100) and corresponds to a system administrator. The data may be delivered to client B through the console by way of a database manager. In addition, client B may check the status of the proxy workers (201) from the console through the proxy manager (170).

도 46는 일 실시 예에 따른 프록시 서버의 다른 기능을 설명하는 도면이다.FIG. 46 is a drawing illustrating another function of a proxy server according to one embodiment.

일 실시 예에서, 상술한 기능을 갖는 프록시 서버(200)는 전달 모드(Bypass mode)를 지원할 수 있다. 여기에서, 전달 모드는 기본적인 프록시 기능(예를 들어, 네트워크 트래픽 중계)만을 수행할 수 있다. 데이터의 수집, 분석 및 기록 등의 추가적인 기능은 비활성화 된다.In one embodiment, a proxy server (200) having the above-described functionality may support a bypass mode. Here, the bypass mode may only perform basic proxy functions (e.g., relaying network traffic). Additional functions, such as data collection, analysis, and recording, are disabled.

In this drawing, (a) illustrates the automatic bypass mode, and (b) illustrates the manual bypass mode (switched manually by the manager).

In the automatic bypass mode, the proxy watchdog (202) can check the status of the proxy worker (201) and report it to the proxy manager (170). In addition, when a problem that may threaten the stability of the system occurs in the proxy worker (201), the proxy watchdog (202) can automatically switch the proxy worker (201) to bypass mode.

반면, 수동 전달 모드에서는, 프록시 매니저(170)를 통하여 프록시 워커(201)를 전달 모드에서 재시작할 수 있다.On the other hand, in manual forwarding mode, the proxy worker (201) can be restarted in forwarding mode through the proxy manager (170).

이를 통해 시스템의 안정성을 보장하고 잠재적인 오류로부터 시스템을 보호할 수 있다.This ensures the stability of the system and protects it from potential errors.

FIG. 47 is a diagram illustrating an embodiment of separating services within a proxy server according to one embodiment.

In one embodiment, the proxy server (200) having the above-described functions may run the service processes within the proxy server (200) separately.

In one embodiment, the proxy server (200) of the present invention may be operated with (a) separation of the proxy itself, (b) separation by service port, or (c) separation by service type.

More specifically, in separation of the proxy itself, each proxy server (for example, 200a, 200b) operates independently, and each proxy server (200a, 200b) may include an SAP GUI/HTTP proxy. The SAP GUI/HTTP proxy receives both packet data based on the SAP GUI protocol and packet data based on the HTTP protocol, but may collect and analyze each in a different way, as described above.

In separation by service port, traffic may be separated on the basis of the service port number. For example, the proxy server (200) includes a first SAP GUI/HTTP proxy and a second SAP GUI/HTTP proxy, and different proxy settings or router rules may be applied to services that use different port numbers.

Finally, in separation by service type, traffic may be separated according to the service type (for example, SAP GUI or HTTP). For example, the proxy server (200) may include a SAP GUI proxy and an HTTP proxy as separate proxies.

Separating processes in this way has the advantage that a problem in one process does not affect the other processes, so the availability of the system as a whole is maintained, and the process in which the problem occurred can be easily identified and isolated.

FIG. 48 is a diagram illustrating an embodiment of scaling out a proxy server according to one embodiment.

The number of proxy servers may be increased or decreased using the scaling features of the platform.

In one embodiment, the platform load balancer may distribute network traffic or requests across multiple proxy servers (200a, 200b, 200c) in the system. Here, the platform load balancer is a function provided by each cloud platform that, for load distribution, spreads the packets of traffic received at a specific address across the proxy servers.

For example, the platform load balancer may distribute received data to the first proxy server (200a), the second proxy server (200b), and the third proxy server (200c). In addition, using the services provided by each platform, the platform load balancer allows the proxy servers to be scaled out on conditions based on the amount of received data and the load on the proxy servers (200a, 200b, 200c).

Here, the proxy manager (170) may manage the proxy watchdog (202a) of the first proxy server (200a), the proxy watchdog (202b) of the second proxy server (200b), and the proxy watchdog (202c) of the third proxy server (200c).

In this way, the system can respond flexibly to load, and the proxy servers (200a, 200b, 200c) that are expanded in a scale-out manner can be managed.

FIG. 49 is a diagram illustrating an embodiment in which a proxy server processes the NI protocol according to one embodiment.

This drawing illustrates the NI protocol processing of the above-described proxy server and the process of sending and receiving client requests.

The proxy server may collect per-service port information based on OS service port information and proxy configuration information. Using this per-service port information, the proxy server may process the NI protocol and send and receive client requests.

More specifically, when the protocol used by the connecting client is the SAP GUI or RFC protocol, the proxy server may process the NI protocol to send and receive the client's requests.

When a SAP GUI/RFC client exchanges data with an SAP system, the NI protocol carries the IP address and port information of the system to which the client's request should be forwarded. The proxy server may analyze this information to identify the target SAP system and forward the request to it.

Here, the per-service port information may be transmitted as a string defined in the client's OS services file. The proxy server may map this string against the per-service port information loaded in the proxy server to identify the port of the target SAP system and forward the request.
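The mapping from a service string to a port described above can be sketched as follows. This is a minimal illustration only, assuming the conventional `name port/protocol` layout of an OS services file; the function names `load_service_ports` and `resolve_target_port` and the sample entries are hypothetical and are not taken from the embodiment.

```python
def load_service_ports(services_text):
    """Parse services-file text into a {service_name: port} mapping."""
    ports = {}
    for line in services_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a "name port/protocol" entry
        port_text, _protocol = fields[1].split("/", 1)
        if port_text.isdigit():
            ports[fields[0]] = int(port_text)
    return ports


def resolve_target_port(service, ports):
    """Return the numeric port for a service string, or None if unknown."""
    if service.isdigit():  # the client already sent a numeric port
        return int(service)
    return ports.get(service)


# Hypothetical sample entries, for illustration only.
SAMPLE_SERVICES = """
sapdp00   3200/tcp   # example dispatcher entry
sapgw00   3300/tcp   # example gateway entry
"""

if __name__ == "__main__":
    table = load_service_ports(SAMPLE_SERVICES)
    print(resolve_target_port("sapdp00", table))
```

A string that is not present in the loaded table resolves to `None` in this sketch, leaving the caller to decide how to handle an unknown service.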

In addition, when an NI PING is delivered to the SAP GUI/RFC client, the proxy server may return an NI PONG to indicate that the network connection is operating normally.

In addition, the proxy data extraction unit may extract the NI protocol request/response data exchanged over the SAP GUI or RFC protocol and load it into a transmission queue whose threshold is set on the basis of data size. Because data belonging to the same session must be handled by the same process, the proxy data extraction unit may assign data to a transmission queue on the basis of the IP and port of the client connected to the proxy server.

FIG. 50 is a diagram illustrating a proxy data collection process of a data management device according to one embodiment.

The data management device of the present invention may start a process for collecting proxy data. Here, the proxy data receiving unit (250), the protocol analysis unit (130), and the like may be implemented such that the proxy data collection process is executed, and its parameters are passed, by a controller within the data management device.

In one embodiment, the controller may execute the proxy data collection process based on the configuration information of the server of the data management device. Once the proxy data collection process has started, the proxy server may collect the data generated between the client and the target system. In particular, because data delivered from the proxy server does not need to go through TCP reassembly, it may be sent directly to the protocol analysis unit (130) through socket communication. For details, refer to the embodiments described above.

In one embodiment, the proxy data transmission unit (240) may transmit the proxy data loaded in the transmission queue to the proxy data receiving unit (250) with reference to the proxy configuration information. In one embodiment, if transmission of the proxy data fails, the proxy data transmission unit (240) may retransmit it up to N times. For a detailed description of the proxy data transmission unit (240), refer to the embodiments described above.
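The retransmission behavior can be sketched as below. This is an illustrative sketch under stated assumptions: `send` is a hypothetical callable that raises `OSError` on failure, and "N retransmissions" is read as one initial attempt followed by at most N further attempts.

```python
def send_with_retry(send, payload, max_retries):
    """Send payload once, then retransmit up to max_retries times on failure.

    Returns the number of attempts used; re-raises the last error if every
    attempt fails. `send` is a hypothetical transport callable.
    """
    if max_retries < 0:
        raise ValueError("max_retries must be zero or greater")
    last_error = None
    for attempt in range(1, max_retries + 2):  # 1 initial try + N retries
        try:
            send(payload)
            return attempt
        except OSError as exc:
            last_error = exc
    raise last_error
```

A production sender would typically also wait between attempts; the source does not specify a delay, so none is shown here.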

The proxy data receiving unit (250) may load the data received from the proxy data transmission unit (240) into an analysis queue, and may deliver the data loaded in the analysis queue to the protocol analysis unit (130).

More specifically, the proxy data receiving unit (250) may convert the IP and the port into numbers, add the two values, and take the remainder after dividing by the number of analysis queues, thereby distributing the NI protocol data of each TCP stream across the analysis queues. The NI protocol data distributed to an analysis queue may then be delivered to the protocol analysis unit (130). For example, if the numeric value of the IP is 15, the numeric value of the port is 13, and there are 4 analysis queues, the sum 15 + 13 = 28 leaves a remainder of 0 when divided by 4, so the proxy data receiving unit (250) may place the NI protocol data of that TCP stream in analysis queue 0 for delivery to the protocol analysis unit (130).
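The remainder-based distribution can be sketched as follows. The source does not state how an IP is converted into a number; using the integer value of the address, as `queue_index_for` does here, is an assumption made only for illustration, and both function names are hypothetical.

```python
import ipaddress


def queue_index(ip_value, port_value, queue_count):
    """Return the analysis queue index: (IP value + port value) mod queue count."""
    return (ip_value + port_value) % queue_count


def queue_index_for(ip, port, queue_count):
    """Same rule, taking the integer value of the address as the IP value (assumed)."""
    return queue_index(int(ipaddress.ip_address(ip)), port, queue_count)


if __name__ == "__main__":
    # Worked example from the text: (15 + 13) mod 4
    print(queue_index(15, 13, 4))
```

Because the index depends only on the IP and port, every packet of a given TCP stream lands in the same queue, which is what allows one process to see a whole session.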

The protocol analysis unit (130) may likewise sum the IP and port values and use the remainder operation to distribute data according to the number of processes. For example, data whose destination port is in the range 3200 to 3299 may be distributed to the SAP GUI protocol analyzer, and data whose destination port is in the range 3300 to 3399 may be distributed to the RFC analyzer.
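The port-range example can be expressed as a small routing function. This is a sketch only; the analyzer labels `"SAP_GUI"` and `"RFC"` and the function name are placeholders chosen for illustration.

```python
def select_analyzer(destination_port):
    """Pick an analyzer label from the destination port ranges in the example."""
    if 3200 <= destination_port <= 3299:
        return "SAP_GUI"
    if 3300 <= destination_port <= 3399:
        return "RFC"
    return None  # the source does not say how other ports are handled
```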

The format of the proxy data that the proxy data transmission unit sends to the proxy data receiving unit is as follows. [Table 1] shows the proxy data transmission format, [Table 2] shows the proxy data response format, and [Table 3] shows the message data format.

[Table 1]
Item                      Length   Note
Total Length              4
Message Count             4
Message                   N

[Table 2]
Item                      Length   Note
Total Length              4
Type                      1        0: SUCCESS, 1: Failed
Message                   N        Failure message

[Table 3]
Item                      Length   Note
Time                      8        Time of occurrence
Direction                 1        0: Request, 1: Response, 2: Close
Source IP                 8        IP address in long format
Destination IP            8        IP address in long format
Source Port               2
Destination Port          2
Source Mac Address        6
Destination Mac Address   6
Data Length               4
Data                      N        Message data
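The message data layout of [Table 3] can be sketched with fixed-width binary packing. The source does not state the unit of length or the byte order, so this sketch assumes lengths are in bytes and fields are big-endian; the function names are hypothetical.

```python
import struct

# Time(8) Direction(1) SrcIP(8) DstIP(8) SrcPort(2) DstPort(2)
# SrcMAC(6) DstMAC(6) DataLength(4); big-endian without padding (assumed).
MESSAGE_HEADER = struct.Struct(">qBqqHH6s6sI")


def pack_message(time_value, direction, src_ip, dst_ip,
                 src_port, dst_port, src_mac, dst_mac, data):
    """Serialize one message: fixed header followed by the variable-length data."""
    header = MESSAGE_HEADER.pack(time_value, direction, src_ip, dst_ip,
                                 src_port, dst_port, src_mac, dst_mac, len(data))
    return header + data


def unpack_message(buffer):
    """Parse one message and return its fields as a tuple ending with the data."""
    fields = MESSAGE_HEADER.unpack_from(buffer, 0)
    data_length = fields[-1]
    data = buffer[MESSAGE_HEADER.size:MESSAGE_HEADER.size + data_length]
    if len(data) != data_length:
        raise ValueError("buffer is shorter than the declared data length")
    return fields[:-1] + (data,)
```

Under these assumptions the fixed part of a message occupies 45 bytes, matching the sum of the fixed lengths in the table.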

FIG. 51 is a diagram illustrating an embodiment in which a proxy server according to one embodiment is connected to another cloud service.

The proxy server of the present invention has the advantage that it can easily be extended to other cloud services. For example, the load balancer of Company A, which provides another cloud service, may distribute network traffic or application requests across multiple servers for processing.

In one embodiment, the state probe of Company A's load balancer may check the status of the proxy server through its service port. Here, the state probe is a mechanism that checks the status of the proxy server against specific criteria; it sends a specific request and a TCP connection request to the proxy server to determine whether the server is operating normally.

In one embodiment, for the HTTP protocol, the state probe may treat the check as failed if the response carries an HTTP status code other than 200. In another embodiment, the check may be treated as failed if there is no response within a preset time (for example, 30 seconds). In addition, if the TCP connection request and response keep failing beyond the time limit, client connections may be terminated and no new TCP sessions may be connected. Once the state has recovered, new TCP session data may be delivered again.
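The two failure conditions can be combined into a single decision function, sketched below. The function name and the representation of "no response" as `None` are assumptions for illustration; the 30-second default follows the example in the text.

```python
def probe_failed(status_code, elapsed_seconds, timeout_seconds=30.0):
    """Return True if a probe result counts as a failure.

    status_code is the HTTP status received, or None if no response arrived.
    """
    if status_code is None or elapsed_seconds > timeout_seconds:
        return True  # no response within the preset time
    return status_code != 200  # any status other than 200 is a failure
```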

In one embodiment, if the first proxy server is not operating properly, the data management device of the present invention may be controlled to receive proxy packet data from a second proxy server and/or a third proxy server included in the same virtual machine cluster as the first proxy server. Accordingly, the data management device may receive, from the second proxy server and/or the third proxy server, the packet data exchanged with the server on which the ERP system is installed (for example, an SAP server).

In addition, to maintain session persistence, the load balancer may distribute traffic using a 2-tuple setting that combines the source IP and the destination IP in order.
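One common way to realize 2-tuple persistence is to hash the ordered pair of addresses and select a backend from the hash. The sketch below illustrates that general technique; it is not the load balancer's actual algorithm, which the source does not describe, and `pick_backend` is a hypothetical name.

```python
import hashlib


def pick_backend(source_ip, destination_ip, backends):
    """Choose a backend from the ordered (source IP, destination IP) pair."""
    key = f"{source_ip}|{destination_ip}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```

As long as the backend list is unchanged, the same address pair always maps to the same proxy server, which is what keeps a session on one server.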

In addition, the virtual machine cluster that manages the proxy servers of the present invention may set a threshold for each virtual machine (VM) on which a proxy server is installed. The virtual machine cluster needs to scale the virtual machines in or out when a virtual machine exceeds its threshold. To this end, the virtual machine cluster may define auto scale rules based on numerical metrics for memory, CPU, and NIC.

In one embodiment, the present invention may create a VM image corresponding to the proxy server of the above-described embodiments and deploy the VM image to the virtual machines in the virtual machine cluster. In addition, the virtual machine cluster may manage VM images and their versions through a VM image gallery.

FIG. 52 is a flowchart illustrating a data management method according to one embodiment.

In one embodiment, the data management method may receive first data from a proxy server that processes data sent and received over one of the HTTP protocol, the SAP GUI protocol, and the RFC protocol (S105). For details, refer to the description of FIG. 44 above.

Here, when data is received over the SAP GUI protocol or the RFC protocol, the proxy server may process the NI (Network Interface) protocol to send and receive client requests. In addition, the proxy server collects per-service port information based on OS service port information and proxy configuration information, where the OS service port information includes string information defined in the client's OS services file, and processes the NI protocol using the per-service port information. For details, refer to the description of FIG. 49 above.

In one embodiment, the data management method may analyze the protocol of the received first data (S107). Here, the data management method may add the numeric values obtained from the IP and port of the received first data and distribute the first data according to the number of analysis queues using the remainder operation. For details, refer to the description of FIG. 50 above.

In one embodiment, the data management method may generate a log from the analyzed first data (S109). The generated log may be used to extract personal information or to generate user behavior metadata according to the embodiments described above.

FIG. 53 is a diagram illustrating a data management device according to an embodiment.

With regard to techniques for reducing the volume of stored logs when recording user logs for a web server, removing unnecessary data (for example, CSS, images, scripts, and patch files) and removing irrelevant regions from the HTML body are important for improving the efficiency of data analysis and saving storage space.

The following proposes a technique that reduces the size of the response body by approximately 95% through SAP WEB GUI data conversion.

This drawing illustrates an embodiment that reduces log volume when the packets collected by the above-described data management platform are analyzed and processed. For the network device (1001), the packet collection unit (1058), and the packet analysis unit (1059), refer to the description of the data management platform above.

In one embodiment, the data management device may include an SSL protocol processing unit (1060), an HTTP protocol analysis unit (1061), an HTTP script processing unit (1062), a WEBGUI data conversion unit (1063), and an audit log storage confirmation unit (1064).

The SSL protocol processing unit (1060) may receive reassembled packets through the packet analysis unit (1059). More specifically, the SSL protocol processing unit (1060) may receive the reassembled TCP packets and, for packets encrypted with the TLS (Transport Layer Security) protocol, extract the fingerprint of the certificate transmitted during the SSL handshake, use the fingerprint to select the matching SSL private certificate from those loaded, and use it to decrypt the RSA-encrypted packets. Here, RSA is an asymmetric encryption scheme that uses a pair consisting of a public key and a private key. The SSL protocol processing unit (1060) may pass the decrypted packets to the HTTP protocol analysis unit (1061).
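The selection of a loaded private key by certificate fingerprint can be sketched as a simple lookup. The source does not name the fingerprint algorithm; SHA-256 over the DER-encoded certificate is assumed here, and `PrivateKeyStore` is a hypothetical helper. The decryption step itself is not shown.

```python
import hashlib


def certificate_fingerprint(certificate_der):
    """Return a hex fingerprint of a DER-encoded certificate (SHA-256 assumed)."""
    return hashlib.sha256(certificate_der).hexdigest()


class PrivateKeyStore:
    """Keeps loaded private keys indexed by their certificate's fingerprint."""

    def __init__(self):
        self._keys = {}

    def load(self, certificate_der, private_key):
        self._keys[certificate_fingerprint(certificate_der)] = private_key

    def find(self, handshake_certificate_der):
        """Return the key matching the certificate seen in the handshake, or None."""
        return self._keys.get(certificate_fingerprint(handshake_certificate_der))
```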

The HTTP protocol analysis unit (1061) may analyze the received packets according to the HTTP protocol, parsing the request line of each request and the status line of each response as a command and parsing each message into its headers and body. In one embodiment, the method of analyzing the HTTP protocol may follow the description above.

The HTTP script processing unit (1062) may take the HTTP data analyzed by the HTTP protocol analysis unit (1061) as input and execute an HTTP processing script to remove unnecessary information.

More specifically, the HTTP script processing unit (1062) may remove unnecessary information using the content-type header of the HTTP data. For example, if the content-type is text/css or text/javascript, the data is sent and received without user intervention and is displayed statically on the screen; if the content-type is application/octet-stream, the data is an upload or download of a program file. In both cases the data may be judged unnecessary for the audit log. Accordingly, the HTTP script processing unit (1062) may remove unnecessary information from the data according to the content-type header.

In addition, the HTTP script processing unit (1062) may extract the file extension from the URL of the HTTP data and apply filtering. For example, if the extension of the file in the URL included in the HTTP data is dll, cab, js, swf, or pdf, the data may be filtered. Here, filtering means that the data is not stored in the audit log.
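The content-type and extension rules can be sketched as a single predicate. The two sets below contain only the values named in the text; the function name `should_skip` is hypothetical, and an actual HTTP processing script may apply further rules.

```python
import posixpath
from urllib.parse import urlsplit

SKIPPED_CONTENT_TYPES = {"text/css", "text/javascript", "application/octet-stream"}
SKIPPED_EXTENSIONS = {"dll", "cab", "js", "swf", "pdf"}


def should_skip(content_type, url):
    """Return True if the HTTP data should not be stored in the audit log."""
    # Ignore parameters such as "; charset=utf-8" when comparing media types.
    media_type = (content_type or "").split(";", 1)[0].strip().lower()
    if media_type in SKIPPED_CONTENT_TYPES:
        return True
    # Take the extension from the URL path only, not from the query string.
    path = urlsplit(url).path
    extension = posixpath.splitext(path)[1].lstrip(".").lower()
    return extension in SKIPPED_EXTENSIONS
```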

In addition, if the HTTP data relates to a system that is not a logging target, the HTTP script processing unit (1062) may refrain from storing that HTTP data in the audit log.

In one embodiment, the HTTP script processing unit (1062) may periodically check whether the HTTP processing script has changed and reload it. The HTTP processing script may be updated by the user.

The WEBGUI data conversion unit (1063) may convert the response body data received from the HTTP script processing unit (1062) and return it as XML. Because the response body of WEBGUI data is large, only the data relating to user actions may be extracted from the response body received as XML and converted into XML form.

Here, the HTTP script processing unit (1062) may filter out entire logs, whereas the WEBGUI data conversion unit (1063) may reduce volume by removing unnecessary information and storing only the information needed for the audit log.

In one embodiment, the log filtering performed by the HTTP script processing unit (1062) and the data conversion performed by the WEBGUI data conversion unit (1063) cannot both apply to the same log. That is, the WEBGUI data conversion unit (1063) may not perform WEBGUI data conversion on logs that have been filtered out by the HTTP processing script of the HTTP script processing unit (1062). Accordingly, the WEBGUI data conversion unit (1063) may identify only the logs corresponding to WEBGUI among the HTTP data that was not filtered by the HTTP script processing unit (1062) and perform data conversion on them.

More specifically, the WEBGUI data conversion unit (1063) may record the log of what the user viewed while removing from the converted XML at least one of the parts related to the start script, the parts related to the attributes of each control, and the parts related to style. This is described in detail with reference to the drawings below.

The audit log storage confirmation unit (1064) may determine whether to store the data in the audit log based on the result of processing by the HTTP processing script.

In this way, data that is not needed for data analysis can be removed.

FIG. 54 is a diagram illustrating an example of the structure of WEBGUI data according to an embodiment.

This drawing shows, as a DOM tree, that within the XML of WEBGUI response data the content of each control-update contains the HTML data related to the screen. The present invention is characterized in that only the control information that makes up the screen is extracted from the HTML, and the remaining data is deleted to reduce size.

For example, start-script and initialized-ids generally do not contain information to be extracted into the audit log. By contrast, control-update is divided by screen element, such as the bottom message (msgarea) and the main screen information (userpanel); its content contains the control information used to display the screen, and the role of each control can be distinguished by its attributes.

For example, a control-update element may contain the following:

(1) webguiPopups: HTML content for pop-up screens

(2) backpackCUA: HTML content for the context menu

(3) backpackUA: HTML content for the system information pop-up

(4) webguiKeys: HTML span tag containing shortcut key information

(5) cuaarea: HTML content corresponding to the top part of the user interface

(6) msgarea: user interface used to output messages

(7) screenarea/userPanel: user interface of the main screen

In one embodiment, to reduce the data size of the audit log, only the data that makes up the screen may be extracted from the HTML, and the remaining data may be deleted. For example, among the control-update elements, "msgarea" and "screenarea/userPanel" make up the screen, so they may be extracted and the remaining data deleted. This is only an example; data that satisfies preset conditions may be extracted and the remaining data deleted in order to reduce the data size of the audit log.
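The extraction of the screen-related control-update elements can be sketched as below. The XML layout in `SAMPLE_RESPONSE`, the use of an `id` attribute to name each control-update, and the identifiers in `KEPT_IDS` are assumptions made for illustration; the actual WEBGUI response structure is shown only schematically in the drawing.

```python
import xml.etree.ElementTree as ET

# Assumed identifiers of the elements that make up the screen.
KEPT_IDS = {"msgarea", "userPanel"}

# Hypothetical, heavily simplified response body.
SAMPLE_RESPONSE = """
<updates>
  <start-script>init();</start-script>
  <initialized-ids>a,b,c</initialized-ids>
  <control-update id="webguiKeys"><content>shortcut keys</content></control-update>
  <control-update id="msgarea"><content>Record saved</content></control-update>
  <control-update id="userPanel"><content>main screen</content></control-update>
</updates>
"""


def reduce_response(xml_text):
    """Keep only the control-update elements whose id is in KEPT_IDS."""
    root = ET.fromstring(xml_text)
    reduced = ET.Element(root.tag)
    for update in root.iter("control-update"):
        if update.get("id") in KEPT_IDS:
            reduced.append(update)
    return ET.tostring(reduced, encoding="unicode")


if __name__ == "__main__":
    print(reduce_response(SAMPLE_RESPONSE))
```

Everything outside the kept elements, including start-script and initialized-ids, is dropped, which is where the size reduction in this sketch comes from.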

도 55는 실시 예에 따른 WEBGUI 데이터 내 HTML 분석 결과를 나타내는 도면이다.Figure 55 is a diagram showing the HTML analysis results in WEBGUI data according to an embodiment.

일반적으로 HTML 내 요소(element)에 ct(content type) 또는 subct(sub content type) 속성이 존재하는 경우, 속성 값에 따라 하위 요소가 동일한 패턴으로 구성된다. 이때, Isdata 속성을 이용하여 ID 및 나타내고자 하는 데이터를 확인할 수 있다.Typically, when an HTML element contains the ct (content type) or subct (sub content type) attribute, sub-elements are structured in the same pattern based on the attribute value. The Isdata attribute can be used to identify the ID and data to be displayed.

본 도면은 웹 브라우저 상에서 WEBGUI가 표시된 부분을 화면과 응답 본문 내에서 추출한 것으로, 추출된 데이터의 ct 값이 R_standards인 경우의 Isdata 및 Isevents를 나타낸다. 여기에서, R_standards는 라디오 버튼을 의미하며, Isdata 내 4번 항목의 값이 화면상 표시되는 라벨을 의미하며, 13번 항목의 SID는 해당 컨트롤을 식별할 수 있는 ID 값을 의미한다.This drawing is an extract from the screen and response body of the portion displayed on the web browser where WEBGUI is displayed, and shows Isdata and Isevents when the ct value of the extracted data is R_standards. Here, R_standards refers to a radio button, the value of item 4 in Isdata refers to the label displayed on the screen, and the SID of item 13 refers to an ID value that can identify the control.

In one embodiment, the data management device may use the Isdata attribute to extract the extractable values from the HTML document according to ct or subct, and then treat the rest as unnecessary information and delete it.

The extractable values for each ct (content type)/subct (sub content type) type, with examples, are as follows.

In one embodiment, only the extractable values are extracted from the original data according to the examples below, and the remaining data is removed. Among the extractable values, “ID” may also be excluded from extraction because, although it can be extracted, it is not used.

(1) If the ct/subct type is CO and the Type is Control, the extractable values are ID and Type; an example value is “wnd[0]/titl, FioriTitleBar”.

(2) If the ct/subct type is B and the Type is Button Control, the extractable values are ID and Type; an example value is “wnd[0]/tbar[1]/btn[7], GuiButton”.

(3) If the ct/subct type is PHT and the Type is Page Header Title, the extractable value is Title; an example value is “HR\x20마스터\x20데이터\x20조회” (“HR master data query”).

(4) If the ct/subct type is T and the Type is Tool Bar, the extractable values are ID and Type; an example value is “wnd[0]/sbar, FioriStatusBar”.

(5) If the ct/subct type is LNC and the Type is Link Choice, the extractable value is Text; an example value is “S42\x20\x28100\x29”.

(6) If the ct/subct type is CBS and the Type is Text Field, the extractable values are ID, Type, and value; an example value is “wnd[0]/usr/ctxtRSRD1-TBMA_VAL, GuiCTextField, MARAX”.

(7) If the ct/subct type is RL and the Type is Raster Layout, the extractable values are ID, Type, and ModalNo; an example value is “wnd[0]/usr, GuiUserArea, 0”.

(8) If the ct/subct type is RLI and the Type is Raster Layout Item, the extractable values are the X coordinate and the Y coordinate; an example value is “168, 8”.

(9) If the ct/subct type is R_standards and the Type is Radio Button, the extractable values are ID, Type, Text, and the checked state; an example value is “wnd[0]/usr/radRSRD1-TBMA, GuiRadioButton, Database\x20table, true”.

The items above are only some examples; the ct/subct types, and the values that can be extracted for the ct/subct types contained in an HTML document, are of course not limited to these.
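The type-to-value mapping in items (1) to (9) can be sketched as a lookup table. This is a minimal Python illustration, not the implementation of the embodiment: the names `EXTRACTABLE_FIELDS` and `extract_values`, the field labels `X`, `Y`, and `Checked`, and the assumption that values arrive as an ordered list are all hypothetical.

```python
# Extractable fields per ct/subct type, following items (1) to (9) above.
# Field labels such as "X", "Y" and "Checked" are illustrative names.
EXTRACTABLE_FIELDS = {
    "CO": ("ID", "Type"),
    "B": ("ID", "Type"),
    "PHT": ("Title",),
    "T": ("ID", "Type"),
    "LNC": ("Text",),
    "CBS": ("ID", "Type", "Value"),
    "RL": ("ID", "Type", "ModalNo"),
    "RLI": ("X", "Y"),
    "R_standards": ("ID", "Type", "Text", "Checked"),
}


def extract_values(ct, values, drop_id=True):
    """Pair positional values with the field names for a ct/subct type.

    Returns None for a type that is not in the table, which the caller
    can treat as unnecessary data. With drop_id=True the "ID" field is
    left out, mirroring the option of excluding the unused ID.
    """
    fields = EXTRACTABLE_FIELDS.get(ct)
    if fields is None:
        return None
    record = dict(zip(fields, values))
    if drop_id:
        record.pop("ID", None)
    return record


record = extract_values(
    "CBS", ["wnd[0]/usr/ctxtRSRD1-TBMA_VAL", "GuiCTextField", "MARAX"]
)
print(record)
```

Keeping the table as data rather than branching code makes it straightforward to add further ct/subct types, which the text notes are not limited to the nine listed.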

FIG. 56 is a diagram showing an HTML analysis result for WEBGUI data according to an embodiment.

The HTML analysis result for WEBGUI data may appear as shown in the drawing. In one embodiment, the analysis result may represent the screen as a tree structure based on the ct or subct attributes.

For example, the left side of the drawing shows the WEBGUI SE11 screen. SE11 is the transaction code for the data dictionary in an SAP system, a tool for creating, managing, and inspecting data-related objects such as tables, views, data types, and structures. WEBGUI provides an interface for accessing the SAP system through a web browser.

In one embodiment, when the raw response for the WEBGUI SE11 screen is analyzed according to the ct or subct attributes, a tree structure is output that has the same structure as the screen data the user viewed on the WEBGUI SE11 screen, as shown on the right side of the drawing.

Accordingly, even when unnecessary information is removed to reduce the data volume, the result remains structurally identical to the screen data the user viewed, which makes analysis easier when extracting personal information.
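The tree representation described for FIG. 56 can be illustrated with a short Python sketch. It assumes, for simplicity, that the markup is well-formed XML (real WEBGUI responses may need an HTML parser instead); the function name `build_ct_tree` and the sample markup are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_ct_tree(element):
    """Return a list of tree nodes for `element`.

    Only elements that carry a ct or subct attribute become nodes;
    elements without either attribute are flattened away and their
    qualifying descendants are attached to the nearest kept ancestor.
    """
    children = []
    for child in element:
        children.extend(build_ct_tree(child))
    ct = element.get("ct") or element.get("subct")
    if ct is None:
        return children  # not a control: pass its children upward
    return [{"ct": ct, "children": children}]


# Hypothetical, simplified markup for illustration only.
SAMPLE = """
<div ct="RL">
  <span>
    <div ct="RLI"><input ct="CBS"/></div>
    <div ct="RLI"><span ct="R_standards"/></div>
  </span>
</div>
"""

tree = build_ct_tree(ET.fromstring(SAMPLE))
print(tree)
```

Because wrapper elements without ct or subct are dropped while parent-child order is preserved, the reduced tree keeps the same nesting as the controls on the screen.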

FIG. 57 is a diagram showing raw WEBGUI data according to an embodiment.

This drawing is an example of the WEBGUI SE11 screen described above output as an XML file. The XML data in the drawing is only an excerpt; the actual raw XML file may contain all of the code needed to render the WEBGUI SE11 screen.

In one embodiment, the response body of the WEBGUI SE11 screen analyzed in the data management device may be encoded in MIME (Multipurpose Internet Mail Extensions) format. In that case, an HTTP parser may be used to analyze the MIME-encoded HTTP response body.

The data management device may then use a parser for raw WEBGUI data to perform an XML conversion test keyed on the ct or subct attributes. The conversion may exclude context menus and popover windows.

In one embodiment, only the parts of the XML conversion result for the WEBGUI SE11 screen whose content type is userpanel or msgarea may be extracted. This is described below.

FIG. 58 is a diagram showing XML data from which unnecessary information has been removed according to an embodiment.

Based on the embodiment described above, the data management device may remove unnecessary information from the converted XML data based on ct or subct.

This drawing shows the converted XML data after everything other than the parts whose content type (ct) is userpanel or msgarea has been judged unnecessary and removed. Here, msgarea is the HTML for the message output part of the user interface and userpanel is the HTML for the main screen part of the user interface, so parts whose content type is msgarea or userpanel are judged to be data needed to make up the screen and may be stored in the audit log.
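As a rough illustration of this filtering step, the Python sketch below keeps only the control-update elements for msgarea and userpanel. The root element name `updates`, the use of an `id` attribute to carry the area name, and the function name `reduce_updates` are assumptions made for the example; the source does not specify the exact layout of the response.

```python
import xml.etree.ElementTree as ET

# Areas judged necessary to make up the screen (compared case-insensitively).
KEEP_AREAS = {"msgarea", "userpanel"}


def reduce_updates(xml_text):
    """Drop every control-update child whose area is not in KEEP_AREAS.

    The area is taken from the last path segment of the (assumed) id
    attribute, so "screenarea/userPanel" is matched as "userpanel".
    """
    root = ET.fromstring(xml_text)
    for update in list(root.findall("control-update")):
        area = update.get("id", "").split("/")[-1].lower()
        if area not in KEEP_AREAS:
            root.remove(update)
    return ET.tostring(root, encoding="unicode")


# Hypothetical response fragment for illustration only.
SAMPLE = (
    "<updates>"
    '<control-update id="msgarea"><m/></control-update>'
    '<control-update id="screenarea/userPanel"><u/></control-update>'
    '<control-update id="toolbar"><t/></control-update>'
    "</updates>"
)

reduced = reduce_updates(SAMPLE)
print(len(SAMPLE), "->", len(reduced))
```

Comparing the lengths before and after gives a simple way to measure the size reduction for a given screen.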

Based on the embodiment described above, the XML sizes of the original and converted data for each screen compare as follows.

Here, the T code (transaction code) is a short keyword or abbreviation used in SAP GUI to reach a specific task or process. The Dynpro (dynamic program) name is the SAP GUI program name, and the Dynpro number is the screen number within that program. As the table shows, an embodiment of the present invention reduces the XML data to between 1% and 8% of its original size.

In addition, because the elements through which data is input or output carry ID information, personal information can be extracted by implementing a parser that extracts key/value pairs keyed on that ID information.

For example, ID information may be expressed in forms such as “wnd[0]/usr/radRSRD1-TBMA”, “wnd[0]/usr/ctxtRSRD1-TBMA_VAL - MARAX”, or “wnd[0]/sbar_msg - MARAX가(이) 없습니다. 이름을 점검하십시오” (“MARAX does not exist. Check the name”). In particular, when values with the same meaning occur more than once, they may be expressed as an array. The present invention can thus extract personal information by analyzing the form in which ID information is expressed.
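A minimal Python sketch of such an ID-keyed key/value extraction follows. It assumes the parser has already produced `(id, value)` pairs; the function name `collect_by_id`, the ID `wnd[0]/usr/txtNAME`, and the sample values other than MARAX are hypothetical.

```python
def collect_by_id(pairs):
    """Build an ID -> value mapping from (id, value) pairs.

    An ID that occurs once maps to its single value; an ID that occurs
    repeatedly maps to a list holding the values in order of appearance.
    """
    result = {}
    for control_id, value in pairs:
        if control_id not in result:
            result[control_id] = value
        elif isinstance(result[control_id], list):
            result[control_id].append(value)
        else:
            result[control_id] = [result[control_id], value]
    return result


pairs = [
    ("wnd[0]/usr/ctxtRSRD1-TBMA_VAL", "MARAX"),
    ("wnd[0]/usr/txtNAME", "first value"),    # hypothetical repeated ID
    ("wnd[0]/usr/txtNAME", "second value"),
]
fields = collect_by_id(pairs)
print(fields)
```

A downstream personal-information detector could then look up specific IDs, or scan the values, without re-parsing the screen markup.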

FIG. 59 is a diagram illustrating a data management method according to an embodiment.

In one embodiment, first data may be received (S111). A packet containing the first data may be received through a network device. For details, refer to the description of FIGS. 1 to 4 and FIG. 53 above.

In one embodiment, the protocol of the received first data may be analyzed (S113), and the first data may be analyzed according to that protocol. For example, if the data is encrypted with TLS, the key may be decrypted using a certificate before the data is analyzed. For details, refer to FIG. 53.

In one embodiment, if the protocol is HTTP, the first data may be filtered using an HTTP processing script (S115). For example, the HTTP processing script may be run with the analyzed HTTP data as input to remove unnecessary information. An example of the HTTP processing script is shown in FIG. 53.

In one embodiment, among the HTTP data in the first data that was not filtered out by the HTTP processing script, the second data is the data corresponding to WEBGUI, and fourth data may be generated by deleting third data contained in the second data (S117). That is, the third data contained in the second data is analyzed and the unnecessary parts are deleted to generate the fourth data. Here, the third data is characterized in that it is a value that can be extracted according to a preset condition. The fourth data may include the data in the second data needed to make up the screen corresponding to the protocol.

More specifically, the data management method may delete the data in the third data other than the control-update elements whose attribute is msgarea or userPanel, which are the attributes needed to make up the screen. Even for control-update elements whose attribute is msgarea or userPanel, the ID among the extractable values may be deleted and the audit log generated from the remaining data only. For details, refer to the description of FIGS. 53 to 58 above.

FIG. 60 is a diagram showing an example in which a data management platform according to an embodiment performs access control for a computing system.

In the illustrated example, the data management platform (10000) may perform event-based access blocking and MAC-address-based access blocking. In one embodiment, the data management platform (10000) may control the access of a user (1000) to the computing system through a collection module (20001), an analysis module (20002), and a monitoring module (20005). For example, the computing system may include an SAP system.

In one embodiment, the collection module (20001) may collect packets transmitted between the user (1000) and the application server (1002). For example, the application server (1002) may include an SAP server that provides an SAP system to the user (1000). In one embodiment, the collection module (20001) may collect packets through a switch based on at least one of an IP address and a MAC address. In one embodiment, the collection module (20001) may collect access information of the user (1000). For example, the access information may include at least one of the IP address and the MAC address of the user (1000).

The analysis module (20002) may perform access control by blocking the login of the user (1000) to the computing system. In one embodiment, the analysis module (20002) may generate access control information for each user account of the computing system and generate an event processing rule based on the access control information. The analysis module (20002) may also determine, based on the event processing rule, whether an access control event has occurred. Here, the access control information may include the reference information used to control access to the computing system, and the event processing rule may include the rule information that controls access based on the access control information.

In one embodiment, the analysis module (20002) may perform access control by analyzing the behavior of the user (1000). The analysis module (20002) may generate a correlation analysis rule based on at least one event of the user (1000) on the computing system. The analysis module (20002) may also perform user behavior analysis based on the correlation analysis rule to determine whether an access control event has occurred.

In one embodiment, the analysis module (20002) may perform access control based on the access information of the user (1000) for an application server (1002) on which a program is installed that collects the access information of the user (1000) and enforces control based on it. The analysis module (20002) manages the access control information for each user account of the computing system, and the monitoring module (20005) synchronizes this information to the application server (1002). The control program installed on the application server (1002) may compare the access information of the user (1000) against this per-account access control information to decide whether to control access.

When an access control event based on at least one of an event processing rule and a correlation analysis rule occurs, the monitoring module (20005) may send an access control request to the application server (1002) using SAP RFC (Remote Function Call) communication. In one embodiment, the monitoring module (20005) may send an access control request to the application server (1002) when access control is decided based on access information.

In one embodiment, upon receiving the access control request over RFC, the application server (1002) may perform access control processing to block the access of the user (1000) to the computing system. In one embodiment, the monitoring module (20005) requests synchronization over RFC whenever the access control information for a user account changes, and the application server (1002), upon receiving the per-account access control information, performs access control information synchronization. When the user (1000) accesses the computing system, the application server (1002) collects the IP/MAC address, compares it with the per-account access control information, and may block the access.

FIG. 61 is a diagram showing another example in which a data management platform according to an embodiment performs access control for a computing system.

In the illustrated example, the data management platform (10000) may include a collection module (20001), an analysis module (20002), and a monitoring module (20005). The application server (1002) may include a login information collection unit (30017) and an access control processing unit (30018).

The collection module (20001) of the data management platform (10000) may include a packet collection unit (30001) and an access information collection unit (30002).

The packet collection unit (30001) may collect packets transmitted between the user (1000) and the application server (1002) through a network switch or a network TAP (Test Access Point, hereinafter TAP). The access information collection unit (30002) may analyze the packets exchanged between the computing system and the user (1000) to collect the IP and MAC address information of the user (1000) on whose machine the computing system GUI is running. The login information collection unit (30017) may be installed on the computing system and may use OS commands to collect the IP and MAC addresses of the user (1000) when the user (1000) logs in.

The analysis module (20002) may include an event analysis unit (30003), a user behavior analysis unit (30004), and an access information analysis unit (30005). In one embodiment, the event analysis unit (30003) may generate access control information for each user account and generate event processing rules that add events to the audit log based on the access control information. For example, the access control information may include at least one of an IP address and IP range, a system, a day of the week, a time range, a rule validity period, and a blocking message. Details are described below.

The event analysis unit (30003) may analyze packets to generate an audit log, compare the log against the event rules, and, when an access is not permitted, add an access control event to the audit log and store it. For example, the audit log may include SAPGUI information. In one embodiment, the event analysis unit (30003) may query the audit log to add an access control event.

The user behavior analysis unit (30004) may generate correlation analysis rules that accumulate events and analyze abnormal user behavior. For example, a correlation analysis rule may cover various user actions on the computing system, such as downloading after viewing personal information.

The user behavior analysis unit (30004) may query the audit log, detect abnormal user behavior according to the correlation analysis rules, and generate an incident.

The access information analysis unit (30005) may perform user access control based on the access information of the user (1000) for the computing system. In one embodiment, the access information analysis unit (30005) manages IP address, MAC address, and validity period information and stores it in a database; the access control request unit (30006) synchronizes this information with the computing system (1002); and per-user access control is performed through the access control processing unit (30018).

The access control processing unit (30018) controls user access by comparing the per-account access control information synchronized from the data management platform (10000) through the access control request unit (30006) with the access information, including at least one of an IP address and a MAC address, collected through the login information collection unit (30017).

The monitoring module (20005) may include an access control request unit (30006). In one embodiment, the access control request unit (30006) checks whether there is an access control event and, if there is, may use SAP RFC to pass the context ID information in the audit log and the blocking message to the application server (1002).

In one embodiment, for a generated incident, the access control request unit (30006) may send a notification (via Simple Mail Transfer Protocol, SMTP) and send an access control request to the application server (1002) over RFC. In one embodiment, when the SAP RFC function is called, the application server (1002) may use the received context ID information to find and block the user session and display the blocking message to the user (1000) in a pop-up.

In one embodiment, the access control request unit (30006) may pass the stored per-user access control information to the application server (1002) over RFC. In one embodiment, the application server (1002) may store the access control information received over SAP RFC in a table in the computing system. In one embodiment, when a user logs in to the computing system, the login information collection unit (30017) may collect access information including at least one of the IP address and the MAC address of the user (1000) on whose machine the computing system GUI is running. The access control processing unit (30018) may compare the access control information with the collected access information and terminate the connection if the access is not permitted. In one embodiment, because event-based access control cannot rely on the MAC address once traffic has passed through several network switches, the MAC-based access control function may be used.

FIG. 62 is a diagram showing an example of an access control information input screen according to an embodiment.

In the illustrated example, the present invention may provide an access control information input screen (30007). The access control information input screen (30007) is a screen for entering the access control information that controls a user's login to the computing system.

In one embodiment, access control information including at least one of a user account, an organization, a blocking message, an IP address, a MAC address, system information, a day of the week, a time range, and a validity period may be entered on the access control information input screen (30007).

The blocking message is the message delivered to the user when access is blocked, for example “Blocked because the IP is not registered”. The IP address field sets the IP address subject to access control, either a specific IP address or an IP range. The MAC address field sets the MAC address subject to access control, either all MAC addresses or a specific MAC address. The system field sets the system subject to access control, either all systems or a specific system. The day-of-week field sets the days on which access is blocked, at least one of Monday through Sunday. The time range field sets the time during which access is controlled, either all times or a specific time range. The validity period field sets the period during which the control applies, either all periods or a specific period.
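The fields described above can be modeled as a small record with a matching function. This is a hedged Python sketch, not the platform's data model: the class `AccessControlInfo`, its field names, and the convention that `None` means "all" are assumptions. The sketch only reports whether a login attempt matches every configured criterion; whether a match leads to allowing or blocking is left to the caller, since that depends on how the rule is used.

```python
from dataclasses import dataclass
from datetime import date, datetime, time
from ipaddress import ip_address, ip_network
from typing import Optional, Tuple


@dataclass
class AccessControlInfo:
    user_account: str
    block_message: str
    ip_range: str                                  # one address or a CIDR range
    mac: Optional[str] = None                      # None = all MAC addresses
    system: Optional[str] = None                   # None = all systems
    weekdays: frozenset = frozenset(range(7))      # 0 = Monday ... 6 = Sunday
    time_range: Optional[Tuple[time, time]] = None # None = all times
    valid_until: Optional[date] = None             # None = no expiry

    def matches(self, user, ip, mac, system, when):
        """Return True if the attempt satisfies every configured criterion."""
        if user != self.user_account:
            return False
        if ip_address(ip) not in ip_network(self.ip_range, strict=False):
            return False
        if self.mac is not None and mac.lower() != self.mac.lower():
            return False
        if self.system is not None and system != self.system:
            return False
        if when.weekday() not in self.weekdays:
            return False
        if self.time_range is not None:
            start, end = self.time_range
            if not (start <= when.time() <= end):
                return False
        if self.valid_until is not None and when.date() > self.valid_until:
            return False
        return True


rule = AccessControlInfo(
    user_account="PT_INSPIEN",
    block_message="Blocked because the IP is not registered",
    ip_range="10.1.1.0/24",
    weekdays=frozenset({0, 1, 2, 3, 4}),
    time_range=(time(9, 0), time(18, 0)),
)
thursday = datetime(2024, 4, 4, 10, 30)
print(rule.matches("PT_INSPIEN", "10.1.1.100", "AA:BB:CC:00:11:22", "S42", thursday))
```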

FIG. 63 is a diagram showing an example of an event processing rule screen according to an embodiment.

In the illustrated example, the present invention may provide an event processing rule screen (30008). The event processing rule screen (30008) is a screen for entering event processing rules based on the access control information.

In one embodiment, the event processing rule screen (30008) may include a rule ID, a rule type, rule settings, and rule code. Here, the rule code may include code that defines the conditions and rules under which access is controlled based on the access control information.

For example, the rule code may be written so that, when new_login is true or transaction_code is WEBGUI_SERVICES, and the user account (sap_userid) is PT_INSPIEN, the IP address does not include 10.1.1.100, the day of the week is not Monday through Friday, and the time is not within the range 09000 to 18000, a CON-MGT event with the message “Blocked because the IP is not registered” is output. In other words, user login can be controlled by event processing rules built on this access control information.
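The example rule can be restated as a Python predicate. This is only a sketch of the logic as the paragraph words it, not the rule code shown in FIG. 63: `new_login`, `transaction_code`, `sap_userid`, and `CON-MGT` come from the text, while the event keys `ip`, `day`, and `time`, the function name `evaluate_rule`, and the shape of the returned event are assumptions. The time bounds are kept as the integers 9000 and 18000 because the text gives them as 09000 and 18000 without stating their encoding.

```python
WORKDAYS = {"Mon", "Tue", "Wed", "Thu", "Fri"}
BLOCK_MESSAGE = "Blocked because the IP is not registered"


def evaluate_rule(event):
    """Return a CON-MGT event dict when the rule fires, otherwise None.

    The conditions are combined with AND, following the order in which
    the paragraph lists them.
    """
    triggered = (
        event.get("new_login") is True
        or event.get("transaction_code") == "WEBGUI_SERVICES"
    )
    if not triggered or event.get("sap_userid") != "PT_INSPIEN":
        return None

    ip_registered = "10.1.1.100" in event.get("ip", "")
    on_workday = event.get("day") in WORKDAYS
    in_time_range = 9000 <= event.get("time", -1) <= 18000

    if not ip_registered and not on_workday and not in_time_range:
        return {"event": "CON-MGT", "message": BLOCK_MESSAGE}
    return None


blocked = evaluate_rule({
    "new_login": True,
    "sap_userid": "PT_INSPIEN",
    "ip": "192.168.0.5",
    "day": "Sat",
    "time": 20000,
})
print(blocked)
```

A deployment that should block when any single condition fails would combine the three checks with `or` instead; the source describes the combined form, so the sketch follows it.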

FIG. 64 is a diagram illustrating an example of a correlation rule input screen according to an embodiment.

In the illustrated example, a correlation rule input screen (30009) may be provided. The correlation rule input screen (30009) is a screen for entering correlation rules that perform access control based on user behavior analysis.

In one embodiment, a correlation rule including at least one of a correlation rule ID, a rule type, a rule name, applicable events, same-target items, and accumulation items may be entered on the correlation rule input screen (30009). The rule type may include a cumulative-excess type, which restricts access once a specific event has accumulated a set number of times. For example, access may be restricted when more than 100 personal information records have accumulated.

The applicable events field selects at least one event to which access control applies. For example, the resident registration number lookup, alien registration number lookup, driver's license number lookup, and passport number lookup events may be selected as the events entered into the correlation rule, so that access control is applied to them.

The same-target items field selects the criteria by which access control is applied. For example, events may be accumulated per employee number, system, protocol name, and organization, and these may be selected as the criteria for applying access control.

The accumulation items may include an accumulation time, a cumulative item, and an excess count. The accumulation time indicates the time window over which events are accumulated. The cumulative item may be an event field that holds a numeric value, for example the number of events (n or more). The excess count sets the threshold (e.g., 100) for the value of the cumulative item summed over the accumulation time.
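A cumulative-excess rule of this kind can be sketched as a sliding-window counter keyed by the same-target items. The class name, the event dictionary keys (`name`, `ts`, `count`), and the eviction policy are assumptions made for illustration; the source does not specify how the window is maintained.

```python
from collections import defaultdict, deque


class CumulativeExcessRule:
    """Hypothetical sketch: sum a numeric item per group over a time window."""

    def __init__(self, events, key_fields, window_seconds, threshold):
        self.events = set(events)            # applicable events
        self.key_fields = tuple(key_fields)  # same-target items
        self.window = window_seconds         # accumulation time
        self.threshold = threshold           # excess count
        self._hits = defaultdict(deque)      # key -> deque of (timestamp, amount)
        self._totals = defaultdict(int)      # key -> running sum inside the window

    def observe(self, event):
        """Return True if this event pushes its group over the threshold."""
        if event["name"] not in self.events:
            return False
        key = tuple(event[f] for f in self.key_fields)
        now, amount = event["ts"], event.get("count", 1)
        hits = self._hits[key]
        hits.append((now, amount))
        self._totals[key] += amount
        # Drop entries that have fallen out of the accumulation window.
        while hits and hits[0][0] <= now - self.window:
            _, old = hits.popleft()
            self._totals[key] -= old
        return self._totals[key] > self.threshold
```

With `threshold=100` and the four lookup events as `events`, `observe` would first return `True` on the 101st qualifying record for one employee and system within the window.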

FIG. 65 is a diagram illustrating an example of a correlation rule activation screen according to an embodiment.

In the illustrated example, a correlation rule activation screen (30010) may be provided. The correlation rule activation screen (30010) is a screen for entering rules that generate an incident by analyzing, in real time, the correlation between received events.

In one embodiment, the correlation rule activation screen (30010) allows the user to specify, in a correlation rule application area, whether a correlation rule entered on the correlation rule input screen (30009) is active. In one embodiment, the activation status of the rule "more than 100 accumulated personal information records" and its correlation analysis rule information may be entered. The correlation rule application area may include a rule ID, a rule name, an analysis interval, and an activation status.

The correlation rule application area may contain multiple rules, and each correlation rule may include a security score, a security score calculation rule, a risk level, a notification rule, and an incident storage setting. For example, when more than 100 personal information records have accumulated, a security score of 30 points and a user access blocking notification rule may be set, and incident storage may be enabled (ON).

FIG. 66 is a diagram illustrating an example of a control data setting screen according to an embodiment.

In the illustrated example, a control data setting screen (30011) may be provided. The control data setting screen (30011) is a screen for entering the control data settings used for MAC-based access control.

In one embodiment, at least one of an account group, an account group administrator, and per-account-group control for a computing system (e.g., SAP) may be set on the control data setting screen (30011).

In one embodiment, an account group may include an account group code and the corresponding account group name. For example, the account group name for the account group code FI may be set to "Financial Accounting." The account group administrator setting may include a system user and an account group. For example, the system administrator may be set to administrator and the account group may be set to ALL. The per-account-group control may include an account for the computing system, an account group, and an account name.

FIG. 67 is a flowchart illustrating an example in which a data management method according to an embodiment converts a data format.

In a data management platform that controls at least one engine or module included in a data management software package, a user's access to a computing system is detected (S30101). In one embodiment, the user's access to the computing system may be detected based on at least one of a network switch, a TAP, and a MAC address. For details, refer to the description given above with reference to FIGS. 60 and 61.

Based on pre-stored access control information, an access control request that blocks the user's access is transmitted to the application server for the computing system (S30103). In one embodiment, an event processing rule may be created for the user's account, and an access control request that blocks the user login corresponding to the detected access may be transmitted to the application server based on the event processing rule.

In one embodiment, a correlation analysis rule for user behavior may be created based on at least one event for the user, and an access control request that blocks the detected access may be transmitted to the application server based on the correlation analysis rule.

In one embodiment, access address information for the user may be obtained, and an access control request that blocks the user's detected access may be transmitted to the application server based on the access address information. For details, refer to the description given above with reference to FIGS. 62 to 66.

FIG. 68 is a diagram illustrating an example in which a data management platform according to an embodiment performs log data indexing and log data compression.

In the illustrated example, the data management platform (10000) may perform log data compression and log data indexing. In one embodiment, the data management platform (10000) may, through the collection module (20001), the analysis module (20002), and the monitoring module (20005), compress original log data in bucket units and perform full-text indexing and bitmap indexing of the log data.

In one embodiment, the collection module (20001) may receive log data. The log data may include event data. In one embodiment, the collection module (20001) may receive normalized event data from a broker. In one embodiment, the broker may include a process that buffers log data between processing stages for distributed processing.

The analysis module (20002) may compress log data in bucket units. In one embodiment, the analysis module (20002) may determine the shard of the segment for the log data. In one embodiment, the analysis module (20002) may store the log data in a bucket in chunk units, where a chunk is the unit in which log data is stored. The original log data may be split into bucket files by size and stored. According to the present invention, the use of buckets can raise the compression ratio. In one embodiment, log data compressed by chunk can be retrieved by decompressing individual chunks, without decompressing the whole segment.

In one embodiment, when the log data being stored contains a new event field type, the analysis module (20002) may store the field information in the event field type information and store the log ID of that log data, which is to be indexed, in a queue (20009). The event field type information may indicate the event field ID of each event field contained in the log data, together with numeric information that identifies the event field ID.

In one embodiment, the analysis module (20002) may generate a log data index including at least one of a full-text index over the original log data corresponding to the log ID and a bitmap index over the event fields of the log data.

In one embodiment, the analysis module (20002) may compress the log data stored in chunk units in a bucket. This is described in detail below.

The monitoring module (20005) may receive, from a user/client (1000), a log query request for at least one item of log data among a plurality of log data. In one embodiment, in response to the log query request, the monitoring module (20005) may identify the target bucket from the log ID found through the log data index, and may use the offset information of the compressed bucket to locate and decompress the chunk in that bucket that contains the log data. In one embodiment, the monitoring module (20005) may transmit to the user/client (1000) the log data decompressed in response to the log query request.

FIG. 69 is a diagram illustrating another example in which a data management platform according to an embodiment performs log data indexing and log data compression.

In the illustrated example, the data management platform (10000) may include a collection module (20001), an analysis module (20002), and a monitoring module (20005).

The collection module (20001) may include a data collection unit (30012). In one embodiment, the data collection unit (30012) may receive, through the broker, normalized log data together with at least one of storage information and shard information indicating where the log data is to be stored.

The analysis module (20002) may include a compression processing unit (30013) and an index generation unit (30014). In one embodiment, the compression processing unit (30013) may determine the target segment from the time information of the event in the log data. The compression processing unit (30013) may also look up the shard of that segment and create the shard if it does not exist.

In one embodiment, the compression processing unit (30013) may encode log data into bytes for storage and store it in a bucket in chunk units. In one embodiment, at least one item of log data among a plurality of log data may be stored in a bucket in chunk units. In one embodiment, a log ID that identifies the location of the event stored in the bucket may be generated.

In one embodiment, based on the bucket information of the shard of the segment in the storage, the compression processing unit (30013) may compress all of the buckets that store log data except the last one. Because log data is still being appended to the last bucket of the current day's log data during collection, the last bucket is compressed only after its collection deadline has passed.

In one embodiment, the compression processing unit (30013) may generate at least one of data information, offset information, and size information for each bucket. The data information may include the log data compressed in chunk units in the bucket. The offset information may include the position of each chunk within the bucket. The size information may include compression-related information, including the bucket size and the chunk size.
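The relationship between data, offset, and size information can be sketched with per-chunk compression. This is a minimal sketch under stated assumptions, not the platform's storage format: zlib as the codec, newline-joined records (so records must not contain newlines), a fixed number of records per chunk, and a record index standing in for the position part of a log ID are all hypothetical choices.

```python
import zlib


def compress_bucket(records, chunk_size=3):
    """Pack encoded records into chunks, compress each chunk, and record offsets."""
    data = bytearray()
    offsets = []  # (start, compressed_length) of each chunk inside `data`
    for i in range(0, len(records), chunk_size):
        chunk = b"\n".join(records[i:i + chunk_size])
        packed = zlib.compress(chunk)
        offsets.append((len(data), len(packed)))
        data += packed
    size = {"bucket_bytes": len(data), "chunk_size": chunk_size}
    return bytes(data), offsets, size


def read_record(data, offsets, chunk_size, index):
    """Decompress only the chunk that holds record `index`."""
    start, length = offsets[index // chunk_size]
    chunk = zlib.decompress(data[start:start + length])
    return chunk.split(b"\n")[index % chunk_size]
```

Because each chunk is compressed independently and its byte range is recorded, a single record can be read back without touching the other chunks in the bucket.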

When an event of the log data being stored contains a new event field type, the compression processing unit (30013) may add it to the event field type information, which includes event field IDs and numeric information identifying each event field ID, and may encode the log data into bytes using the event field type information. After storing the log data in a bucket, the compression processing unit (30013) may generate the log ID corresponding to the log data and store the log ID in a queue for index processing.

The index generation unit (30014) may use the log ID delivered through the queue to reference the original log data in the bucket, and may generate and store a bitmap index for the fields of the event and a full-text index for the event's log data, that is, the raw data.

In one embodiment, the index generation unit (30014) may use a pre-stored event transformation rule language to extract the event field IDs and field values, which are the targets of bitmap indexing, from the text of the original log data, which is the target of full-text indexing.

In one embodiment, the event transformation rule language may include an event transform (Event Transform) and expressions (Expression). An expression is a formula used within a statement. For example, an expression may include at least one of a variable, a string, NULL, a number, a string concatenation expression, a type, a conversion expression, and a function call expression.

An event transform may be determined by statements that include at least one of a variable assignment statement (Assignment), a log statement (Log Statement), and a match statement (Match Statement). For example, a variable assignment statement may assign a variable name. A log statement may assign a field value based on at least one of the event's field name and the storage.

A match statement executes the desired statements depending on whether the data (a string) matches a specific pattern. In one embodiment, a match statement may execute based on at least one of a variable name, a pattern type, a pattern string, and option information. The variable name may be omitted, in which case the match applies to the original log data.

The pattern type may include regular expression (regx), comma-separated values (csv), tab-separated values (tsv), labeled tab-separated values (ltsv), and fixed length. For example, with the comma-separated type, the event is converted by entering comma-separated field names in the pattern expression. With the tab-separated type, the event is converted by entering tab-separated field names in the pattern expression. With the labeled tab-separated type, the labeled tab-separated values in the event are parsed directly, so no pattern expression is needed. With the fixed-length type, for events of fixed length, each field value is identified by taking as many characters as the length of its field.
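How the four non-regex pattern types turn a raw line into named fields can be sketched as follows. The function names and the `layout` argument of `(name, width)` pairs are hypothetical; the actual syntax of the rule language is not given in the text.

```python
import csv


def match_csv(line, field_names):
    """Comma-separated: the pattern expression lists the field names."""
    values = next(csv.reader([line]))
    return dict(zip(field_names.split(","), values))


def match_tsv(line, field_names):
    """Tab-separated: the pattern expression lists tab-separated field names."""
    return dict(zip(field_names.split("\t"), line.split("\t")))


def match_ltsv(line):
    """Labeled tab-separated: each item carries its own label, so no pattern is needed."""
    return dict(item.split(":", 1) for item in line.split("\t") if ":" in item)


def match_fixed(line, layout):
    """Fixed length: slice each field by its declared width."""
    fields, pos = {}, 0
    for name, width in layout:
        fields[name] = line[pos:pos + width].strip()
        pos += width
    return fields
```

For instance, `match_ltsv("user:alice\tip:10.0.0.1")` yields a dictionary with the keys `user` and `ip` without any field list being supplied.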

Depending on the pattern type, the pattern string may be omitted from the match statement. The option information may include an encoding option, and the encoding of the log data may be converted according to that option.

In one embodiment, the event transformation rule language may be based on Lua script. In this case, a function named transform, which is called when the Lua script performs a transformation, may be defined. For example, the first parameter of the function may be the input event, and the second parameter may be an object used to store the log. Because Lua is a more flexible general-purpose language, complex requirements can be handled. A match function may be defined to extract event fields from the event passed into the Lua script. In one embodiment, the match function may include conversion functions such as regular expression (regx), comma-separated values (csv), tab-separated values (tsv), labeled tab-separated values (ltsv), fixed length, and delimiter-separated values (dsv). For example, with the delimiter-separated type, the event is converted by specifying the character that separates fields in the text together with the field information.

In one embodiment, a privacy function may also be defined to extract personal information from input events. For example, once the personal information type, a regular expression that extracts the personal information, and a validation expression for the personal information are defined in the privacy function, the finder function included in the privacy function can be used to extract personal information from an input event.
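The combination of a type, an extraction regex, and a validation check can be sketched as below. Everything here is illustrative: `PrivacyRule`, `find_personal_info`, the `\d{6}-\d{7}` pattern, and the simple month/day plausibility check are assumptions and are not a real validation of any identifier format.

```python
import re
from dataclasses import dataclass
from typing import Callable


@dataclass
class PrivacyRule:
    info_type: str                   # personal information type
    pattern: re.Pattern              # regex that extracts candidates
    is_valid: Callable[[str], bool]  # validation applied to each candidate


def plausible_birth_date(candidate: str) -> bool:
    """Illustrative check only: digits 3-6 must look like a month and a day."""
    month, day = int(candidate[2:4]), int(candidate[4:6])
    return 1 <= month <= 12 and 1 <= day <= 31


def find_personal_info(text, rules):
    """Return (type, value) pairs for every candidate that passes validation."""
    found = []
    for rule in rules:
        for m in rule.pattern.finditer(text):
            if rule.is_valid(m.group()):
                found.append((rule.info_type, m.group()))
    return found


RULES = [
    PrivacyRule("resident_registration_number",
                re.compile(r"\b\d{6}-\d{7}\b"),
                plausible_birth_date),
]
```

The validation step is what keeps arbitrary digit runs that happen to match the regex from being reported as personal information.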

The monitoring module (20005) may include a decompression unit (30015) and a log query unit (30016). In one embodiment, the decompression unit (30015) may decompress, from the log data compressed in bucket units, only the log data corresponding to the log ID of an event found through at least one of the full-text index and the bitmap index.

In one embodiment, the log query unit (30016) may use the bitmap index to perform a conditional search on a specific column (e.g., a field type) of the log data. That is, the log query unit (30016) may look up the log IDs that match a search value for an event field covered by the bitmap index.
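A bitmap index over event fields can be sketched with one bitset per (field, value) pair, where bit i is set when log ID i has that value. The `BitmapIndex` class, the use of Python integers as bitsets, and small integer log IDs are assumptions for illustration; a production index would typically use a compressed bitmap format.

```python
from collections import defaultdict


class BitmapIndex:
    """Hypothetical sketch: one integer bitset per (field, value) pair."""

    def __init__(self):
        self._bitmaps = defaultdict(int)

    def add(self, log_id, fields):
        """Set bit `log_id` in the bitmap of every field value of the event."""
        for field, value in fields.items():
            self._bitmaps[(field, value)] |= 1 << log_id

    def lookup(self, **conditions):
        """Return the log IDs matching all field=value conditions (bitwise AND)."""
        result = None
        for field, value in conditions.items():
            bits = self._bitmaps.get((field, value), 0)
            result = bits if result is None else result & bits
        result = result or 0
        return [i for i in range(result.bit_length()) if (result >> i) & 1]
```

Combining conditions is a bitwise AND of the per-value bitmaps, which is why this kind of index suits conditional searches on low-cardinality fields.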

In one embodiment, the log query unit (30016) may use the full-text index to perform a keyword-based search for log data whose original log data contains specific text.
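A keyword search over raw log text can be sketched with a simple inverted index. The `FullTextIndex` class, the lowercase `\w+` tokenizer, and the all-keywords-must-match semantics are assumptions; the source does not describe the tokenization or ranking used.

```python
import re
from collections import defaultdict


class FullTextIndex:
    """Hypothetical sketch: token -> set of log IDs whose raw text contains it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, log_id, raw_text):
        for token in re.findall(r"\w+", raw_text.lower()):
            self._postings[token].add(log_id)

    def search(self, query):
        """Return the sorted log IDs whose text contains every query keyword."""
        tokens = re.findall(r"\w+", query.lower())
        if not tokens:
            return []
        ids = set.intersection(*(self._postings.get(t, set()) for t in tokens))
        return sorted(ids)
```

The returned log IDs would then be handed to the decompression step, so that only the chunks containing matching records are read.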

In one embodiment, from the log data compressed in bucket units, the decompression unit (30015) may decompress only the chunks that contain at least one item of log data identified by the log IDs that the log query unit (30016) retrieved in response to a request from the user/client (1000). That is, according to the present invention, compressing in bucket units allows more log data to be kept in storage. In addition, because compressed log data is decompressed in chunk units, real-time queries can be served without decompressing an entire segment.

In one embodiment, the log query unit (30016) may receive a log query request for log data from the user/client (1000) and may transmit to the user/client (1000) the log data, among the plurality of log data, that was decompressed in response to the request.

FIG. 70 is a flowchart illustrating an example in which a data management method according to an embodiment performs log data indexing and log data compression.

In a data management platform that controls at least one engine or module included in a data management software package, log data is received (S30201). In one embodiment, normalized log data may be received from a broker. For details, refer to the description given above with reference to FIGS. 68 and 69.

The log data is compressed in bucket units (S30203). In one embodiment, the log data may be encoded and stored in a bucket in chunk units, the log data stored in the bucket in chunk units may be compressed, and at least one of data information, offset information, and compression information for the bucket may be generated. For details, refer to the description given above with reference to FIGS. 68 and 69.

The log data contained in the bucket is decompressed based on the log data index (S30303). In one embodiment, before step S30303, a log data index including at least one of a full-text index over the log data and a bitmap index over the event fields of the log data may be generated. In one embodiment, before step S30303, log ID information for the log data may be stored in a queue, and the log data stored in the bucket may be identified based on the log ID information. For details, refer to the description given above with reference to FIGS. 68 and 69.

FIG. 71 is a diagram illustrating an example in which a data management platform according to an embodiment performs statistical query generation and statistical data visualization.

In the illustrated example, the data management platform (10000) may generate a statistical query according to user input and present the search results visually as a chart or table. In one embodiment, the data management platform (10000) may, through the monitoring module (20005), automatically generate statements in a statistical search language and visualize the statistical data extracted by the statistical search.

In one embodiment, the monitoring module (20005) may include a user input acquisition unit (30019), a query generation unit (30020), a data retrieval unit (30021), and a data visualization unit (30022).

The user input acquisition unit (30019) may acquire user input that specifies a query type through a query generation user interface (UI). In one embodiment, the query generation UI may be presented to the user/client as a pop-up. In one embodiment, the user input acquisition unit (30019) may also acquire user input for creating a visualization chart.

The query generation unit (30020) may generate a statistical query that retrieves statistical data based on the specified query type. In one embodiment, the query type may include at least one of a first query that extracts a specified number of top-ranked data items, a second query that extracts data measured along a time series, and a third query that extracts data by field type. The first query may be referred to as a TopN query, the second as a Time Series query, and the third as a Group By query, or in each case by a term with an equivalent technical meaning. These are described in detail below.
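The three query types can be sketched over an in-memory list of rows. The function names, the count-based TopN ranking, the fixed-width time buckets, and the sum aggregation for Group By are assumptions chosen for illustration; the platform's statistical query language is not shown in the text.

```python
from collections import Counter, defaultdict


def top_n(rows, field, n):
    """TopN: the n most frequent values of `field`, with their counts."""
    return Counter(r[field] for r in rows).most_common(n)


def time_series(rows, ts_field, bucket_seconds):
    """Time Series: number of rows per fixed-width time bucket."""
    series = defaultdict(int)
    for r in rows:
        series[r[ts_field] // bucket_seconds * bucket_seconds] += 1
    return sorted(series.items())


def group_by(rows, field, value_field):
    """Group By: sum of `value_field` for each distinct value of `field`."""
    totals = defaultdict(int)
    for r in rows:
        totals[r[field]] += r[value_field]
    return dict(totals)
```

A query generation UI of the kind described would only need the user to pick one of these shapes plus a field or bucket width, rather than write the query by hand.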

According to the present invention, a simpler and more powerful query language can be provided for a statistical metric information store. The present invention can also support a UX in which simple statistical queries are created interactively with the user, so that a simple statistical query can be generated through UI operations alone, without detailed knowledge of the statistical query syntax.

The data retrieval unit (30021) may retrieve statistical data based on the statistical query. In one embodiment, the data retrieval unit (30021) may retrieve statistical data based on at least one of the data source, the search period, and the query type set by the statistical query.

The data visualization unit (30022) may visualize the statistical data retrieved based on the statistical query. In one embodiment, after the statistical data is retrieved, the data visualization unit (30022) may visualize it using a selected visualization chart. According to the present invention, an analysis chart can be generated by freely interacting with each field of the result set produced by the statistical search. The present invention can also support various chart types, so that several kinds of visualization can easily be generated from the same statistical data.

FIG. 72 is a diagram illustrating an example of a query creation screen according to an embodiment.

In the illustrated example, the present invention may provide a query creation screen (30023). Here, the query creation screen (30023) may be a screen for setting the basic information of a statistical query.

In one embodiment, a data source, a search period, and a query type may be set on the query creation screen (30023). Here, the data source contains the statistical data on which the statistical search is performed. For example, the data source may be set to ResourceCheck.

The search period may include the setting of the period over which the statistical search is performed. For example, the search period may be set from 2024-01-09 00:00 to 2024-01-09 23:59.

The query type sets the type of statistical query used for the statistical search. For example, the query type may be set to one of a TopN query, a Time Series query, and a Group By query.

FIG. 73 is a diagram illustrating an example of a TopN query setting screen according to an embodiment.

In the illustrated example, the present invention may provide a TopN query setting screen (30024). Here, the TopN query setting screen (30024) may be a screen for setting the details of a TopN query, which extracts a specified number of top-ranked data items.

In one embodiment, the TopN query setting screen (30024) may include a threshold count, a time unit, a dimension, and a metric. Here, the threshold count is the number of top-ranked items, out of all measured data, to be displayed in the search result. For example, when the threshold count is set to 10, the top 10 statistical data items among all data are retrieved.

The time unit is the unit of time by which data is retrieved. The dimension is a search criterion that represents an attribute of the retrieved data. For example, the dimension may be set to device_id, the device ID. The metric includes the search item to be retrieved quantitatively and the operation applied to it. In one embodiment, the operation may include at least one of a first operation that computes the sum of the search item values, a second operation that computes their maximum, and a third operation that computes their minimum. For example, the query may be set to compute the sum of Usage per device_id and display 10 results.
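The TopN settings described above can be sketched as a small in-memory aggregation. This is only an illustration under stated assumptions: the function name `top_n`, the operation labels `"sum"`, `"max"`, `"min"`, and the sample rows are hypothetical and are not taken from the specification, which does not disclose how the query is executed.

```python
from collections import defaultdict

# Hypothetical labels for the three operations (sum, maximum, minimum).
AGGREGATIONS = {"sum": sum, "max": max, "min": min}


def top_n(rows, dimension, metric, operation="sum", threshold=10):
    """Return the top `threshold` (dimension value, aggregated metric) pairs."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[dimension]].append(row[metric])
    aggregate = AGGREGATIONS[operation]
    totals = [(key, aggregate(values)) for key, values in grouped.items()]
    # Largest aggregated value first, then keep only the threshold count.
    totals.sort(key=lambda pair: pair[1], reverse=True)
    return totals[:threshold]


# Hypothetical sample data: Usage values reported per device_id.
rows = [
    {"device_id": "dev-a", "Usage": 30},
    {"device_id": "dev-b", "Usage": 50},
    {"device_id": "dev-a", "Usage": 40},
    {"device_id": "dev-c", "Usage": 10},
]
print(top_n(rows, "device_id", "Usage", "sum", threshold=2))
# [('dev-a', 70), ('dev-b', 50)]
```

With the threshold count set to 10 and the dimension set to device_id, the same call would return at most ten devices ranked by total Usage.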

FIG. 74 is a diagram illustrating an example of a Time Series query setting screen according to an embodiment.

In the illustrated example, the present invention may provide a Time Series query setting screen (30025). Here, the Time Series query setting screen (30025) may be a screen for setting the details of a Time Series query, which extracts data measured along a time series.

In one embodiment, the Time Series query setting screen (30025) may include a time unit and a metric. Here, the time unit is the unit of time by which data is retrieved. A time-series order, such as ascending or descending, may also be set for the time unit.

The metric includes the search item to be retrieved quantitatively. For example, the query may be set to compute the sum of Usage per time unit.
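A Time Series query of this kind can be sketched as bucketing records by time unit and summing the metric in each bucket. The sketch assumes that each record carries an integer timestamp in seconds under a hypothetical `ts` key and that the time unit is expressed in seconds; neither detail comes from the specification.

```python
from collections import defaultdict


def time_series(rows, metric, unit_seconds=60, descending=False):
    """Sum `metric` per time bucket and return (bucket start, total) pairs."""
    buckets = defaultdict(int)
    for row in rows:
        # Align the timestamp to the start of its time-unit bucket.
        bucket_start = row["ts"] - row["ts"] % unit_seconds
        buckets[bucket_start] += row[metric]
    # Ascending time order by default; descending on request.
    return sorted(buckets.items(), reverse=descending)


# Hypothetical sample data: "ts" is seconds since an arbitrary origin.
rows = [
    {"ts": 0, "Usage": 5},
    {"ts": 30, "Usage": 7},
    {"ts": 60, "Usage": 1},
    {"ts": 125, "Usage": 4},
]
print(time_series(rows, "Usage", unit_seconds=60))
# [(0, 12), (60, 1), (120, 4)]
```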

FIG. 75 is a diagram illustrating an example of a Group By query setting screen according to an embodiment.

In the illustrated example, the present invention may provide a Group By query setting screen (30026). Here, the Group By query setting screen (30026) may be a screen for setting the details of a Group By query, which extracts data according to field type.

In one embodiment, the Group By query setting screen (30026) may include a time unit, a dimension, and a metric. Here, the time unit is the unit of time by which data is retrieved. The dimension is a search criterion that represents an attribute of the retrieved data. For example, the dimension may be set to device_id, the device ID. The metric includes the search item to be retrieved quantitatively. For example, the query may be set to compute the sum of Usage per device_id for each time unit.
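The Group By example (sum of Usage per device_id for each time unit) can be sketched as grouping on a composite key. As an assumption for illustration, records carry an integer `ts` in seconds, and the function name `group_by` is hypothetical.

```python
from collections import defaultdict


def group_by(rows, dimension, metric, unit_seconds=60):
    """Sum `metric` for every (time bucket, dimension value) combination."""
    totals = defaultdict(int)
    for row in rows:
        bucket_start = row["ts"] - row["ts"] % unit_seconds
        totals[(bucket_start, row[dimension])] += row[metric]
    return dict(totals)


# Hypothetical sample data.
rows = [
    {"ts": 0, "device_id": "dev-a", "Usage": 5},
    {"ts": 10, "device_id": "dev-a", "Usage": 3},
    {"ts": 20, "device_id": "dev-b", "Usage": 2},
    {"ts": 70, "device_id": "dev-a", "Usage": 4},
]
print(group_by(rows, "device_id", "Usage"))
# {(0, 'dev-a'): 8, (0, 'dev-b'): 2, (60, 'dev-a'): 4}
```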

FIG. 76 is a diagram illustrating an example of a statistical search screen according to an embodiment.

In the illustrated example, the present invention may provide a statistical search screen (30027). Here, the statistical search screen (30027) may be a screen for retrieving statistical data based on a statistical query.

In one embodiment, the statistical search screen (30027) may include a query area and a search result area, and a statistical search may be performed from this screen. Here, the query area is the area into which the configured statistical query is entered, and the query may be displayed there in code form according to the query type.

The search result area may show the statistical data returned by the statistical search. In one embodiment, the search result area may include dimension and metric values. For example, when device_id is used as the dimension and the Count value corresponding to the metric is computed with LONGMAX, the maximum Count value per device_id is obtained. For example, for the device_id b25e1d74-57dd-44dd-a8c0-cbb1143bv4c, a corresponding maximum Count value of 174 may be obtained.

In one embodiment, the search result area may include a visualization switch tab for visualizing the statistical data after the statistical search. When a user input on the visualization switch tab is detected, a screen for creating a visualization chart based on that statistical data may be provided as a pop-up. Details are described below.

FIG. 77 is a diagram illustrating an example of a visualization chart selection screen according to an embodiment.

In the illustrated example, the present invention may provide a visualization chart selection screen (30028). Here, the visualization chart selection screen (30028) may be a screen for selecting a visualization chart.

In one embodiment, the visualization chart selection screen (30028) may include various kinds of charts. For example, it may include a line chart, a curve chart, a column chart, and a bar chart, but it is not limited to these and may include other charts.

In one embodiment, the visualization chart selection screen (30028) may include at least one of basic charts, special charts, and combination charts. For example, the special charts may include a special type of stacked chart that displays two or more variables on one axis, but they are not limited to this and may include other special chart types.

The combination charts may include a chart that displays two chart types together and a chart that compares two continuous summary values, but they are not limited to these and may include other combination chart types.

FIG. 78 is a diagram illustrating an example of a field mapping screen according to an embodiment.

In the illustrated example, the present invention may provide a field mapping screen (30029). Here, the field mapping screen (30029) may be a screen for setting the details that fit the specification of the selected chart.

In one embodiment, the field mapping screen (30029) may include data series and axis information according to the selected chart type. Here, the data series may include chart properties such as data type, labels, and legend.

In one embodiment, on the field mapping screen (30029), fields may be mapped to the X axis, the Y axis, and the data series by drag and drop, in accordance with the specified chart specification.

The field mapping screen (30029) may also include visualization details, including whether a title is used, the title text, the title position, the title font size, and a drill-down address.

In one embodiment, when the user clicks the save button, a pop-up is opened; after a visualization name is entered in the pop-up and the confirm button is clicked, the visualization settings are saved.

FIG. 79 is a flowchart illustrating an example in which a data management method according to an embodiment generates a statistical query and visualizes statistical data.

In a data management platform that controls at least one engine or module included in a data management software package, a user input specifying a query type is obtained through a query generation user interface (UI) (S30301). In one embodiment, a user input specifying at least one of a data source, a search period setting, and a query type may be obtained through the query generation user interface. For details, refer to the description given above with reference to FIGS. 71 to 75.

A statistical query that retrieves statistical data is generated based on the specified query type (S30303). In one embodiment, the query type may include at least one of a first query that extracts a specified number of top-ranked data items, a second query that extracts data measured along a time series, and a third query that extracts data according to field type. For details, refer to the description given above with reference to FIGS. 71 to 75.

The statistical data retrieved based on the statistical query is visualized (S30305). In one embodiment, the type of visualization chart for the retrieved statistical data may be determined, and the field types of the statistical data may be mapped to the data series and axis information of the visualization chart. For details, refer to the description given above with reference to FIGS. 76 to 78.

FIG. 80 is a diagram illustrating an example in which a data management platform according to an embodiment performs time-window-based correlation analysis.

In the illustrated example, the data management platform (10000) may perform time-window-based correlation analysis according to correlation analysis rules. In one embodiment, the data management platform (10000) may perform real-time event correlation analysis in a time-window manner through the monitoring module (20005).

In one embodiment, the monitoring module (20005) may generate, based on user input, a correlation analysis rule describing correlations among events and incidents.

In one embodiment, the monitoring module (20005) may register the correlation analysis rule with a distributed scheduler so that correlation analysis is performed periodically.

In one embodiment, the monitoring module (20005) may query the event repository for events that match the search condition. If no correlation analyzer exists for the correlation analysis rule, the monitoring module (20005) may create one. If the correlation analyzer already exists, the monitoring module (20005) may distribute at least one of the queried events and previously generated incidents to that correlation analyzer.

In one embodiment, the monitoring module (20005) may check, through the correlation analyzer, whether the delivered events and incidents are subject to analysis, and may filter them accordingly.

In one embodiment, the monitoring module (20005) may create a reference field for each correlation analysis rule and store events by time window and reference field. In one embodiment, one or more events that have the same reference field value during a given time window may be included in a single time window context.
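One way to picture a time window context is as a bucket keyed by the window and the reference field values. The sketch below assumes fixed, non-overlapping windows and an integer `ts` field in seconds; the specification does not say how windows are aligned, and the function name `build_contexts` is hypothetical.

```python
from collections import defaultdict


def build_contexts(events, reference_fields, window_seconds):
    """Group events into contexts keyed by (window start, reference field values)."""
    contexts = defaultdict(list)
    for event in events:
        window_start = event["ts"] - event["ts"] % window_seconds
        reference_values = tuple(event[field] for field in reference_fields)
        # Events sharing a window and reference values share one context.
        contexts[(window_start, reference_values)].append(event)
    return dict(contexts)


# Hypothetical sample events.
events = [
    {"ts": 3, "host": "a"},
    {"ts": 7, "host": "a"},
    {"ts": 12, "host": "a"},
    {"ts": 4, "host": "b"},
]
contexts = build_contexts(events, ["host"], window_seconds=10)
print(sorted((key, len(items)) for key, items in contexts.items()))
# [((0, ('a',)), 2), ((0, ('b',)), 1), ((10, ('a',)), 1)]
```

Deleting an expired context then amounts to removing its key, which discards the events held in it as well.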

In one embodiment, the monitoring module (20005) may delete, for each correlation analyzer, any time window context whose analysis period has expired. The events and incidents contained in that time window context are deleted along with it.

In one embodiment, the monitoring module (20005) may generate, based on the correlation analysis rule, an incident for at least one event included in a time window context. In one embodiment, the generated incident may be stored in an incident repository. The generated incident may also be fed back recursively into each correlation analyzer, which allows correlation analysis rules to be combined. In this way, according to the present invention, multi-level correlation analysis can be performed based on combinations of correlation analysis rules.

In one embodiment, the monitoring module (20005) may store the events included in a time window context in memory and in file-based storage using MapDB.

FIG. 81 is a diagram illustrating another example in which a data management platform according to an embodiment performs time-window-based correlation analysis.

In the illustrated example, the data management platform (10000) may include a monitoring module (20005). In one embodiment, the monitoring module (20005) may include a correlation analysis rule generation unit (30030), an event collection unit (30031), an incident generation unit (30032), and a service provision unit (30033).

The correlation analysis rule generation unit (30030) may generate a correlation analysis rule based on user input. In one embodiment, a correlation analysis rule takes at least one of the events and incidents in a time window as input and may be used to generate a new incident based on at least one of occurrence count, cumulative count, condition persistence, occurrence suppression and non-occurrence, occurrence order, and state change. Each type of correlation analysis rule is described below.

In one embodiment, the correlation analysis rule generation unit (30030) may generate a correlation analysis rule in either a user interface (UI) type or a code type. In the UI type, the rule is entered by filling in items through the user interface according to user input. In the code type, which is intended for complex rules, the user enters the rule as script-style code. When the user selects items in the UI type and then switches to the code type, the selected items may be converted into the code type.

The event collection unit (30031) may retrieve events from the event repository according to the search condition. The incident generation unit (30032) may generate incidents for the events based on the correlation analysis rule. In one embodiment, events may be filtered according to operators and conditions when correlation analysis is performed. For example, the operators may include AND, OR, NOT, IN, ==, !=, <, <=, >, and >=, among others.
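Filtering with the listed operators can be sketched as a small recursive evaluator. The nested-tuple condition format below is an assumption made for this illustration only; the specification names the operators but does not define a rule syntax.

```python
import operator

# Comparison operators named in the text, mapped to Python equivalents.
COMPARATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "IN": lambda value, options: value in options,
}


def matches(event, condition):
    """Evaluate a nested (operator, ...) tuple against one event dict."""
    op = condition[0]
    if op == "AND":
        return all(matches(event, sub) for sub in condition[1:])
    if op == "OR":
        return any(matches(event, sub) for sub in condition[1:])
    if op == "NOT":
        return not matches(event, condition[1])
    field, expected = condition[1], condition[2]
    return COMPARATORS[op](event[field], expected)


# Hypothetical rule: denied traffic to port 7080 or 8080, below port 9000.
rule = (
    "AND",
    ("==", "action", "deny"),
    ("IN", "dstPort", [7080, 8080]),
    ("NOT", (">=", "dstPort", 9000)),
)
event = {"action": "deny", "dstPort": 7080, "srcIp": "203.0.113.7"}
print(matches(event, rule))  # True
```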

In one embodiment, the incident generation unit (30032) may determine whether to generate an incident and store it as an event, and may specify the field values of the incident.

The service provision unit (30033) may provide an incident response service based on the incident. In one embodiment, the incident response service may include a service that detects and responds to cyber threats, security breaches, and cyber attacks. In one embodiment, the incident response service may include a service that blocks access to a service in response to an unexpected event or unauthorized access.

FIG. 82 is a diagram illustrating an example of a correlation analysis rule registration screen according to an embodiment.

In the illustrated example, the present invention may provide a correlation analysis rule registration screen (30034). Here, the correlation analysis rule registration screen (30034) may be a screen for registering a correlation analysis rule.

In one embodiment, the correlation analysis rule registration screen (30034) may include a rule name, an incident repository, a search filter, a job type, a job cycle, an execution cycle, an initial period, and a delay processing period. In one embodiment, the screen may also provide ON/OFF switches for whether correlation analysis runs in real time and whether logs are stored.

Here, the rule name is the name of the correlation analysis rule. For example, the rule name may be set to "Firewall Session Correlation Analysis". The incident repository is the repository in which incidents produced by the correlation analysis rule are stored. For example, the incident repository may be set to incident. The search filter is the search query used for correlation analysis. For example, the search filter may be set to eventtype:('syslog1' 'syslog2'). The job type may be cron or seconds. For example, the job type may be set to seconds.

The job cycle is the interval, expressed according to the job type, at which the platform checks whether there are execution-cycle jobs on which correlation analysis should be run. For example, if the job cycle is set to 60, then every 60 seconds the platform checks whether any execution-cycle jobs remain unexecuted between the time correlation analysis was last executed and the boundary given by the delay processing period, and executes them.

The execution cycle is the unit of time over which correlation analysis is performed. For example, the execution cycle may be set to 10 seconds. The initial period is the period over which correlation analysis is first run when there is no history of earlier runs. For example, the initial period may be set to 5 minutes.

The delay processing period is the gap between the current time and the events fed into correlation analysis. For example, if the delay processing period is set to 10 seconds, only events that occurred at least 10 seconds before the current time are fed into correlation analysis.
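One possible reading of how the job cycle, execution cycle, initial period, and delay processing period interact is sketched below. On each job-cycle tick, the scheduler lists the execution-cycle intervals that lie between the end of the last run and the current time minus the delay. This interpretation, the function name `pending_intervals`, and the use of plain second counts are all assumptions; the specification does not give the exact procedure.

```python
def pending_intervals(now, last_run_end, execution_cycle, delay, initial_period):
    """List (start, end) intervals that are due for correlation analysis."""
    # With no run history, start from the initial period before `now`.
    start = last_run_end if last_run_end is not None else now - initial_period
    # Events newer than this cutoff are not yet eligible for analysis.
    cutoff = now - delay
    intervals = []
    while start + execution_cycle <= cutoff:
        intervals.append((start, start + execution_cycle))
        start += execution_cycle
    return intervals


# Example tick: execution cycle 10 s, delay 10 s, last run ended at t=960.
print(pending_intervals(now=1000, last_run_end=960, execution_cycle=10,
                        delay=10, initial_period=300))
# [(960, 970), (970, 980), (980, 990)]
```

Calling the function once per job cycle (for example every 60 seconds) and advancing `last_run_end` to the end of the last returned interval would keep the intervals contiguous.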

FIG. 83 is a diagram illustrating an example of a count-over rule that does not use a target field, according to an embodiment.

In the illustrated example, correlation analysis may be performed based on count over without a target field. Here, count over means that an incident is generated when the number of events having the same reference field value exceeds a threshold.

In one embodiment, count over without a target field may be configured with an accumulation time and a count. Here, the accumulation time is the length of time for which events subject to analysis are stored when correlation analysis is performed. The count is the reference number of occurrences at which an incident is generated.

For example, a correlation analysis rule may be configured with the correlation analysis type set to count over, the accumulation time set to 30 seconds, the count set to 100, and the reference field set to srcIp. In this case, when firewall events whose action field is deny and whose destination port (dstPort) is 7080 occur 100 or more times from the same user IP (srcIp), the external attacker IP suspected of a DDoS attack on port 7080 can be identified and a blocking request can be sent to the access control system.
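The count-over rule can be sketched with a per-key sliding window. The sketch assumes events that already passed the rule's filter, an integer `ts` field in seconds, and a "count or more" comparison that follows the worked example above; the general definition speaks of exceeding a threshold, so the boundary is a judgment call. The function name `count_over` is hypothetical.

```python
from collections import defaultdict, deque


def count_over(events, reference_field, accumulation_seconds, count):
    """Return reference values seen `count` or more times within the window."""
    recent = defaultdict(deque)
    flagged = set()
    for event in sorted(events, key=lambda e: e["ts"]):
        window = recent[event[reference_field]]
        window.append(event["ts"])
        # Drop timestamps that fell out of the accumulation window.
        while window[0] <= event["ts"] - accumulation_seconds:
            window.popleft()
        if len(window) >= count:
            flagged.add(event[reference_field])
    return flagged


# Hypothetical deny events: one source sends 5 within 5 s, another only 2.
events = [{"ts": t, "srcIp": "203.0.113.7"} for t in range(5)]
events += [{"ts": 0, "srcIp": "198.51.100.2"}, {"ts": 100, "srcIp": "198.51.100.2"}]
print(count_over(events, "srcIp", accumulation_seconds=30, count=5))
# {'203.0.113.7'}
```

The returned addresses are the candidates that a blocking request to the access control system would name.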

FIG. 84 is a diagram illustrating an example of a count-over rule that uses a target field, according to an embodiment.

In the illustrated example, correlation analysis may be performed based on count over with a target field. Here, count over means that an incident is generated when the number of events having the same reference field value exceeds a threshold.

In one embodiment, count over with a target field may be configured with an accumulation time, a count, and a target field. Here, the accumulation time is the length of time for which events subject to analysis are stored when correlation analysis is performed. The count is the reference number of occurrences at which an incident is generated. The target field is the field whose values are counted when correlation analysis is performed; events whose target field value has already been counted are not counted again.

For example, a correlation analysis rule may be configured with the correlation analysis type set to count over, the accumulation time set to 1 minute, the count set to 100, the reference fields set to srcIp and dstIp, and the target field set to dstPort. In this case, when firewall events whose action field is deny occur 100 or more times from the same user IP (srcIp) to different destination ports (dstPort), the victim IP and the attacker IP of the port scan can be identified and a blocking request can be sent to the access control system. Setting the target field to dstPort excludes from accumulation those events that merely access the same port repeatedly, which would not indicate port scanning.
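With a target field, the count becomes a count of distinct target values per reference key. The sketch below assumes the events passed in already belong to one accumulation window and have passed the rule's filter; the function name `distinct_count_over` is hypothetical.

```python
from collections import defaultdict


def distinct_count_over(events, reference_fields, target_field, count):
    """Return reference keys with `count` or more distinct target values."""
    seen = defaultdict(set)
    for event in events:
        key = tuple(event[field] for field in reference_fields)
        # A set ignores repeated hits on the same target value.
        seen[key].add(event[target_field])
    return {key for key, values in seen.items() if len(values) >= count}


# Hypothetical events: a scanner probes 4 ports, a client retries 1 port.
events = [
    {"srcIp": "203.0.113.7", "dstIp": "10.0.0.5", "dstPort": port}
    for port in (21, 22, 23, 80)
]
events += [
    {"srcIp": "198.51.100.2", "dstIp": "10.0.0.5", "dstPort": 443}
    for _ in range(10)
]
print(distinct_count_over(events, ["srcIp", "dstIp"], "dstPort", count=4))
# {('203.0.113.7', '10.0.0.5')}
```

The ten repeated hits on port 443 contribute a single distinct value, so that pair is not flagged.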

FIG. 85 is a diagram illustrating an example of a suppression condition according to an embodiment.

In the illustrated example, correlation analysis may be performed based on a suppression condition (suppress). Here, the suppression condition means that once an incident has been generated by events having the same reference field value, the generation of new incidents is suppressed for a certain period of time.

In one embodiment, the suppression condition may be configured with a suppression time. Here, the suppression time is the period after an incident is generated during which new incidents are suppressed.

For example, when the input data is a memory overuse incident, a correlation analysis rule may be configured with the correlation analysis type set to suppression condition, the suppression time set to 30 minutes, and the reference field set to host. In this case, when a memory overuse incident occurs, a memory anomaly incident is generated, and no further memory anomaly incident is generated for the same host for the next 30 minutes.
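Suppression can be sketched by remembering when an incident was last emitted for each reference value. The sketch assumes an integer `ts` field in seconds and that the suppression window restarts only when an incident is actually emitted; the specification does not state which of these behaviors applies, and the function name `suppress` is hypothetical.

```python
def suppress(incidents, reference_field, suppress_seconds):
    """Emit an incident per key, then drop repeats until the window ends."""
    last_emitted = {}
    emitted = []
    for incident in sorted(incidents, key=lambda i: i["ts"]):
        key = incident[reference_field]
        if key in last_emitted and incident["ts"] - last_emitted[key] < suppress_seconds:
            continue  # still inside the suppression window for this key
        last_emitted[key] = incident["ts"]
        emitted.append(incident)
    return emitted


# Hypothetical memory overuse incidents; suppression time 30 min (1800 s).
incidents = [
    {"ts": 0, "host": "a"},
    {"ts": 100, "host": "b"},
    {"ts": 600, "host": "a"},
    {"ts": 1800, "host": "a"},
    {"ts": 1900, "host": "a"},
]
print([(i["ts"], i["host"]) for i in suppress(incidents, "host", 1800)])
# [(0, 'a'), (100, 'b'), (1800, 'a')]
```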

도 86은 실시 예에 따른 미발생 감지의 예를 개시하는 도면Figure 86 is a drawing disclosing an example of non-occurrence detection according to an embodiment.

도시한 예에서, 미발생 감지(not occur)에 기반한 상관 분석이 수행될 수 있다. 여기서, 미발생 감지는 기준 필드가 동일한 값을 가지는 이벤트가 일정 시간 동안 발생하지 않을 경우 인시던트가 발생하는 것을 의미할 수 있다.In the illustrated example, correlation analysis can be performed based on non-occurrence detection. Here, non-occurrence detection can mean that an incident occurs when an event with the same value of a reference field does not occur for a certain period of time.

일 실시 예에서, 미발생 감지는 미발생 시간에 기반하여 설정될 수 있다. 여기서, 미발생 시간은 해당 이벤트가 발생되지 않는 일정 시간을 의미할 수 있다.In one embodiment, non-occurrence detection may be set based on a non-occurrence time, where the non-occurrence time may refer to a period of time during which the corresponding event does not occur.

예를 들어, 입력 데이터가 에이전트 PING 이벤트이고, 에이전트에서 설정 시 10초에 하나의 이벤트가 발생되는 경우, 상관 분석 규칙은 상관분석 유형이 미발생 감지, 미발생 시간은 30초, 기준 필드는 host로 설정될 수 있다. 이 경우, 10초마다 발생해야 하는 PING 이벤트가 30초 이상 발생되지 않는 경우 인시던트를 생성하여 이상이 있다고 판단할 수 있다. For example, if the input data is an agent PING event and the agent is configured to generate one event every 10 seconds, a correlation analysis rule may be set with a correlation analysis type of non-occurrence detection, a non-occurrence time of 30 seconds, and a reference field of host. In this case, if a PING event that should occur every 10 seconds does not occur for 30 seconds or more, an incident can be generated and an abnormality can be determined.
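As a non-limiting illustration, non-occurrence detection for the PING example could be sketched as a periodic check over the last-seen time of each host. The mapping `last_ping` and the function name `find_silent_hosts` are assumptions of this sketch.

```python
def find_silent_hosts(last_ping, now, timeout_sec=30):
    """Return hosts whose most recent PING is timeout_sec or more in the past.

    last_ping maps each host (the reference field) to the timestamp, in
    seconds, of the last PING event received from it.
    """
    return sorted(host for host, ts in last_ping.items() if now - ts >= timeout_sec)
```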

도 87은 실시 예에 따른 연속 조건의 예를 개시하는 도면Figure 87 is a drawing disclosing an example of a continuous condition according to an embodiment.

도시한 예에서, 연속 조건(successive)에 기반한 상관 분석이 수행될 수 있다. 여기서, 연속 조건은 축적 시간 내에 지정된 이벤트가 순서대로 발생하는 경우 인시던트가 발생하는 것을 의미할 수 있다.In the illustrated example, correlation analysis can be performed based on a successive condition. Here, a successive condition can mean that an incident occurs when specified events occur in sequence within an accumulated time.

일 실시 예에서, 연속 조건은 축적 시간 및 이벤트 순서에 기반하여 설정될 수 있다. 여기서, 축적 시간은 수행 시에 분석 대상 이벤트를 저장하는 시간 단위를 의미할 수 있다. 또한, 이벤트 순서는 이벤트가 발생되는 순서를 의미할 수 있다. In one embodiment, the successive condition may be set based on an accumulation time and an event order. Here, the accumulation time may refer to the time unit for which events to be analyzed are stored when the analysis is performed. In addition, the event order may refer to the order in which the events occur.

예를 들어, 입력 데이터가 개인정보 과다 다운로드 인시던트 발생과 usb 저장장치 사용 이벤트인 경우, 상관 분석 규칙은 상관분석 유형이 연속 조건, 축적 시간은 30초, 이벤트 순서는 개인정보 과다 다운로드 후 usb 저장장치 사용, 기준 필드는 사번 및 사용자 IP로 설정될 수 있다. 이 경우, 개인정보 과다 다운로드 후 30초 내에 usb 저장장치를 사용하는 내부 개인정보 유출의심자 및 사용 PC를 인식할 수 있다. For example, if the input data is an excessive personal information download incident and a USB storage device usage event, a correlation analysis rule may be set with a correlation analysis type of successive condition, an accumulation time of 30 seconds, an event order of excessive personal information download followed by USB storage device usage, and reference fields of employee ID and user IP. In this case, a person suspected of leaking internal personal information who uses a USB storage device within 30 seconds after an excessive download, and the PC used, can be identified.
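As a non-limiting illustration, the successive condition could be sketched as an ordered two-step match per reference-field key. The field names `ts`, `type`, `empNo`, and `srcIp`, the event type labels, and the function name `detect_sequence` are assumptions of this sketch.

```python
def detect_sequence(events, first, second, window_sec=30):
    """Return (empNo, srcIp) keys where `second` follows `first` within window_sec."""
    pending = {}  # key -> timestamp of the latest `first` event
    incidents = []
    for ev in sorted(events, key=lambda e: e["ts"]):
        key = (ev["empNo"], ev["srcIp"])  # reference fields
        if ev["type"] == first:
            pending[key] = ev["ts"]
        elif ev["type"] == second and key in pending:
            if ev["ts"] - pending[key] <= window_sec:
                incidents.append(key)
            # The pending `first` event is consumed whether or not it matched.
            del pending[key]
    return incidents
```

A `second` event that arrives with no preceding `first` event is ignored, which is what makes the rule order-sensitive.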

또한, 본 발명에 따르면, 다양한 상관분석 유형에 따라 상관 분석이 수행될 수 있다. 일 실시 예에서, 누적 초과(sum over)에 기반한 상관 분석이 수행될 수 있다. 여기서, 누적 초과는 기준 필드가 동일한 값을 가지는 이벤트의 대상 필드의 누적 값이 임계값을 초과하는 경우 인시던트를 발생시키는 것을 의미할 수 있다. Furthermore, according to the present invention, correlation analysis can be performed according to various correlation analysis types. In one embodiment, correlation analysis based on sum over can be performed. Here, sum over may mean generating an incident when the accumulated value of the target field, over events whose reference field has the same value, exceeds a threshold.

일 실시 예에서, 누적 초과는 축적 시간, 누적 초과 기준 및 대상 필드에 기반하여 설정될 수 있다. 여기서, 축적 시간은 상관 분석 수행 시에 분석 대상 이벤트를 저장하는 시간 단위를 의미할 수 있다. 또한, 누적 초과 기준은 인시던트를 발생시키는 기준 누적 값을 의미할 수 있다. 또한, 대상 필드는 상관 분석을 수행하는 값을 누적할 필드를 의미할 수 있다. In one embodiment, sum over may be set based on an accumulation time, a sum-over criterion, and a target field. Here, the accumulation time may refer to the time unit for which events to be analyzed are stored when correlation analysis is performed. The sum-over criterion may refer to the reference accumulated value that triggers an incident. The target field may refer to the field whose values are accumulated for the correlation analysis.

예를 들어, 입력 데이터가 DRM 이벤트인 경우, 상관 분석 규칙은 상관분석 유형이 누적 초과, 지속 시간은 24시간, 누적초과 기준은 100,000, 대상 필드는 DRM 해제 건수 및 기준 필드는 사번으로 설정될 수 있다. 이 경우, 내부 개인정보를 유출하는 APT 공격 의심자로서 사번 별로 24시간 내 해제건수가 100,000건 이상인 사원에 대한 알림을 받을 수 있다. For example, if the input data is a DRM event, a correlation analysis rule may be set with a correlation analysis type of sum over, a duration of 24 hours, a sum-over criterion of 100,000, a target field of DRM release count, and a reference field of employee ID. In this case, an alert can be received for any employee whose DRM release count, aggregated per employee ID, is 100,000 or more within 24 hours, as a suspect in an APT attack leaking internal personal information.
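As a non-limiting illustration, the DRM example could be sketched as a per-employee sum over events that already fall within the 24-hour window. The field names `empNo` and `drmReleaseCount` and the function name `sum_over` are assumptions of this sketch; the comparison follows the example's "100,000 or more" wording.

```python
from collections import defaultdict


def sum_over(events, threshold=100_000):
    """Return employee IDs whose summed DRM release count reaches the threshold.

    `events` is assumed to contain only events inside the accumulation window.
    """
    totals = defaultdict(int)
    for ev in events:
        totals[ev["empNo"]] += ev["drmReleaseCount"]  # accumulate the target field
    return sorted(emp for emp, total in totals.items() if total >= threshold)
```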

또한, 본 발명에 따르면, 조건 유지(sustain)에 기반한 상관 분석이 수행될 수 있다. 여기서, 조건 유지는 기준 필드가 동일한 값을 가지는 이벤트가 일정 시간 동안 특정 조건을 충족하는 경우 인시던트가 발생하는 것을 의미할 수 있다. Additionally, according to the present invention, correlation analysis based on a sustained condition can be performed. Here, a sustained condition may mean that an incident occurs when events whose reference field has the same value satisfy a specific condition for a certain period of time.

일 실시 예에서, 조건 유지는 유지 시간 및 특정 조건에 기반하여 설정될 수 있다. 여기서, 유지 시간은 인시던트가 발생하는 해당 특정 조건의 유지 시간을 의미할 수 있다. 또한, 특정 조건은 해당 필드에 대한 다양한 조건을 의미할 수 있다. In one embodiment, the sustained condition may be set based on a sustain time and a specific condition. Here, the sustain time may refer to how long the specific condition must hold for an incident to occur. The specific condition may refer to any of various conditions on the corresponding field.

예를 들어, 입력 데이터가 에이전트 메모리 측정량 이벤트인 경우, 상관 분석 규칙은 상관분석 유형이 조건 유지, 유지 시간은 10분, 조건은 usage가 0.9보다 크거나 같은 경우, 기준 필드가 host로 설정될 수 있다. 이 경우, 10분간 메모리 사용률이 90프로 이상인 상태를 유지하는 장비를 확인할 수 있다. For example, if the input data is an agent memory measurement event, a correlation analysis rule may be set with a correlation analysis type of sustained condition, a sustain time of 10 minutes, a condition of usage greater than or equal to 0.9, and a reference field of host. In this case, equipment whose memory usage stays at 90% or higher for 10 minutes can be identified.
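As a non-limiting illustration, the sustained condition could be sketched by tracking, per host, when the current run of high-usage samples began. The field names `ts`, `host`, and `usage` and the function name `sustained_hosts` are assumptions of this sketch.

```python
def sustained_hosts(samples, hold_sec=600, min_usage=0.9):
    """Return hosts whose usage stayed >= min_usage for at least hold_sec."""
    streak_start = {}  # host -> timestamp where the current high-usage run began
    flagged = set()
    for s in sorted(samples, key=lambda s: s["ts"]):
        host = s["host"]
        if s["usage"] >= min_usage:
            start = streak_start.setdefault(host, s["ts"])
            if s["ts"] - start >= hold_sec:
                flagged.add(host)
        else:
            # Any sample below the threshold resets the run.
            streak_start.pop(host, None)
    return sorted(flagged)
```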

또한, 본 발명에 따르면, 상태 변경 감지에 기반한 상관 분석이 수행될 수 있다. 일 실시 예에서, 상태 변경 감지는 축적 시간, 변경 감지 조건, 대상 필드 및 기준 필드에 기반하여 설정될 수 있다.Additionally, according to the present invention, correlation analysis based on state change detection can be performed. In one embodiment, state change detection can be set based on accumulation time, change detection conditions, target fields, and reference fields.

예를 들어, 입력 데이터가 출입게이트 사용 이벤트인 경우, 상관 분석 규칙은 상관분석 유형이 상태 변경 감지, 축적 시간은 1분, 변경 감지 조건은 IN to IN, OUT to OUT, 대상 필드는 출입 방향 및 기준 필드는 사번으로 설정될 수 있다. 이 경우, 같은 출입 카드로 출입 게이트에서 같은 방향으로 두 번 태깅하는 경우, 부정 사용으로 인지할 수 있다.For example, if the input data is an access gate usage event, the correlation analysis rule can be set to a correlation analysis type of state change detection, an accumulation time of 1 minute, a change detection condition of IN to IN and OUT to OUT, an entry/exit direction target field, and a reference field of employee number. In this case, if the same access card is tagged twice in the same direction at the access gate, it can be recognized as fraudulent use.
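As a non-limiting illustration, the IN to IN and OUT to OUT check could be sketched by comparing each gate event with the previous event for the same employee ID. The field names `ts`, `empNo`, and `direction` and the function name `detect_same_direction` are assumptions of this sketch.

```python
def detect_same_direction(events, window_sec=60):
    """Return (empNo, previous, current) for repeated directions within window_sec."""
    last = {}  # empNo -> (timestamp, direction) of the previous gate event
    incidents = []
    for ev in sorted(events, key=lambda e: e["ts"]):
        prev = last.get(ev["empNo"])
        if (
            prev is not None
            and prev[1] == ev["direction"]
            and ev["ts"] - prev[0] <= window_sec
        ):
            incidents.append((ev["empNo"], prev[1], ev["direction"]))
        last[ev["empNo"]] = (ev["ts"], ev["direction"])
    return incidents
```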

도 88은 실시 예에 따른 데이터 관리 방법이 타임 윈도우 기반 상관 분석을 수행하는 예를 개시하는 흐름도Figure 88 is a flowchart disclosing an example of a data management method according to an embodiment performing time window-based correlation analysis.

데이터 관리 소프트웨어 패키지 내에 포함된 적어도 하나의 엔진 또는 모듈을 제어하는 데이터 관리 플랫폼에서, 이벤트를 검색한다(S30401). 일 실시 예에서, 사용자에 의해 설정된 검색 조건에 해당하는 이벤트를 조회할 수 있다. 이에 대하여는 도 80 및 도 81에서 상술한 내용을 참고하도록 한다.In a data management platform that controls at least one engine or module included in a data management software package, an event is searched (S30401). In one embodiment, events corresponding to search conditions set by the user can be searched. For details, refer to the details described above in FIGS. 80 and 81.

미리 정의된 적어도 하나의 상관 분석 규칙에 기반하여 타임 윈도우(time window) 동안의 상기 이벤트에 대한 인시던트를 생성한다(S30403). 일 실시 예에서, 미리 정의된 적어도 하나의 상관 분석 규칙은, 타임 윈도우 동안의 이벤트 및 인시던트 중 적어도 하나의 발생 횟수, 누적 횟수, 조건 유지, 발생 억제 및 미발생, 발생 순서 및 상태 변경 중 적어도 하나에 기반하여 결정될 수 있다.An incident is generated for the event during a time window based on at least one predefined correlation analysis rule (S30403). In one embodiment, the at least one predefined correlation analysis rule may be determined based on at least one of the number of occurrences, cumulative number, condition maintenance, occurrence suppression and non-occurrence, occurrence order, and status change of at least one of the events and incidents during the time window.

일 실시 예에서, S30403 단계 이전에, 타임 윈도우 이외의 시간 구간에 대한 이벤트를 필터링할 수 있다. 일 실시 예에서, 이벤트에 대한 적어도 하나의 상관 분석 규칙 별 기준 필드를 생성하고, 타임 윈도우 동안 기준 필드의 값이 동일한 이벤트를 하나의 타임 윈도우 컨텍스트(Time Window Context)로 생성하며, 미리 정의된 적어도 하나의 상관 분석 규칙에 기반한 타임 윈도우 컨텍스트에 대한 상관 분석을 수행하여 인시던트를 생성할 수 있다. 이에 대하여는 도 82 내지 도 87에서 상술한 내용을 참고하도록 한다.In one embodiment, prior to step S30403, events for time intervals other than the time window can be filtered. In one embodiment, at least one correlation analysis rule-specific reference field for an event is created, events with the same value of the reference field during the time window are created as a single time window context, and correlation analysis is performed on the time window context based on at least one predefined correlation analysis rule to create an incident. For this, please refer to the contents described above in FIGS. 82 to 87.
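As a non-limiting illustration, filtering events to the time window and grouping them into time window contexts by reference-field values could be sketched as follows. The event layout, the half-open window convention, and the function name `build_contexts` are assumptions of this sketch.

```python
from collections import defaultdict


def build_contexts(events, reference_fields, window_start, window_end):
    """Group in-window events by the tuple of their reference-field values."""
    contexts = defaultdict(list)
    for ev in events:
        # Filter out events that fall outside the time window.
        if not (window_start <= ev["ts"] < window_end):
            continue
        key = tuple(ev.get(field) for field in reference_fields)
        contexts[key].append(ev)
    return dict(contexts)
```

Each resulting list plays the role of one time window context, on which a rule such as count over or sum over can then be evaluated.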

생성된 인시던트에 기반하여 인시던트 대응 서비스를 제공한다(S30405). 일 실시 예에서, 인시던트 대응 서비스는 사이버 보안 및 운영 상태 복원에 대응하는 서비스를 포함할 수 있다. 이에 대하여는 도 80 및 도 81에서 상술한 내용을 참고하도록 한다.Incident response services are provided based on generated incidents (S30405). In one embodiment, the incident response services may include services for cybersecurity and operational status restoration. For more information, please refer to the details described in FIGS. 80 and 81.

도 89는 실시 예에 따른 데이터 관리 플랫폼이 사용자 유형에 따른 로그 데이터 검색을 수행하는 예를 개시하는 도면FIG. 89 is a diagram disclosing an example of a data management platform according to an embodiment performing a log data search according to a user type.

도시한 예에서 데이터 관리 플랫폼(10000)은 사용자 유형에 따른 로그 데이터 검색을 수행할 수 있다. 일 실시 예에서, 데이터 관리 플랫폼(10000)은 모니터링 모듈(20005)을 통하여 사용자 유형에 따른 검색 UI(user interface)를 제공하고, 쿼리 검색과 일반 검색을 사용하여 로그 데이터 검색을 수행할 수 있다.In the illustrated example, the data management platform (10000) can perform log data searches based on user type. In one embodiment, the data management platform (10000) provides a search UI (user interface) based on user type through the monitoring module (20005), and can perform log data searches using query searches and general searches.

일 실시 예에서, 모니터링 모듈(20005)은 검색 조건 획득부(30035), 로그 검색부(30036) 및 검색 결과 제공부(30037)를 포함할 수 있다.In one embodiment, the monitoring module (20005) may include a search condition acquisition unit (30035), a log search unit (30036), and a search result provision unit (30037).

검색 조건 획득부(30035)는 사용자 입력에 의한 검색 조건을 획득할 수 있다. 일 실시 예에서, 이벤트 로그 검색 화면을 통하여 검색 조건 설정 화면이 팝업 형식으로 활성화될 수 있다. 이 경우, 해당 검색 조건 설정 화면에 대한 사용자 입력에 의해 검색 조건이 설정될 수 있다.The search condition acquisition unit (30035) can acquire search conditions based on user input. In one embodiment, a search condition setting screen can be activated in a pop-up format through the event log search screen. In this case, search conditions can be set based on user input on the search condition setting screen.

일 실시 예에서, 검색 조건이 입력된 후 검색 조건 설정 화면에서 사용자의 적용 버튼 클릭 시 이벤트 로그 검색 화면에 검색 조건이 적용될 수 있다. 이후 사용자의 검색 요청에 대한 입력(예: 재생 아이콘 클릭)에 따라 로그 데이터의 검색이 시작될 수 있다. 일 실시 예에서, 검색 조건 설정 화면에서 사용자의 검색 시작 버튼 클릭 시 검색 조건이 적용된 후 검색 요청이 바로 수행될 수 있다. 이와 같이, 본 발명에 따르면, 일반 사용자는 검색 쿼리를 알지 못하여도 UI를 통해 단순한 검색을 손쉽게 수행할 수 있다.In one embodiment, after entering search conditions, when the user clicks the Apply button on the search condition setting screen, the search conditions may be applied to the event log search screen. Thereafter, a search of log data may begin based on the user's input of a search request (e.g., clicking the play icon). In one embodiment, when the user clicks the Start Search button on the search condition setting screen, the search conditions may be applied and the search request may be immediately executed. In this way, according to the present invention, general users can easily perform simple searches through the UI even without knowing the search query.

일 실시 예에서, 검색 조건 설정 화면에서 사용자의 쿼리 생성 버튼 클릭 시 검색 조건에 따른 검색 쿼리가 생성될 수 있다. 이와 같이, 본 발명에 따르면, 고급 사용자는 검색 쿼리를 이용하여 복잡한 조건의 검색을 수행할 수 있다.In one embodiment, a search query based on search conditions can be generated when a user clicks the Create Query button on the search condition setting screen. Thus, according to the present invention, advanced users can perform searches with complex conditions using search queries.

로그 검색부(30036)는 검색 조건에 기반하여 로그 데이터를 검색할 수 있다. 일 실시 예에서, 로그 검색부(30036)는 검색 쿼리에 기반하여 로그 데이터를 검색할 수 있다. 이에 대한 상세한 내용은 아래에서 설명된다.The log search unit (30036) can search log data based on search conditions. In one embodiment, the log search unit (30036) can search log data based on a search query. This is described in detail below.

검색 결과 제공부(30037)는 사용자 유형에 따라 검색 결과를 제공할 수 있다. 일 실시 예에서, 검색 결과 제공부(30037)는 검색 요청에 대한 검색 결과 수신 시 일반 사용자 뷰에서 검색 결과를 스크롤링 형태로 표시할 수 있다.The search result provider (30037) may provide search results according to the user type. In one embodiment, the search result provider (30037) may display the search results in a scrolling format in the general user view when receiving search results for a search request.

일 실시 예에서, 검색 결과 제공부(30037)는 일반 사용자 뷰에서 고급 사용자 뷰로 전환할 수 있다. 이 경우, 고급 사용자 뷰는 페이징 방식으로 표시될 수 있다.In one embodiment, the search result provider (30037) can switch from a general user view to an advanced user view. In this case, the advanced user view can be displayed in a paging manner.

일 실시 예에서, 검색 결과 제공부(30037)는 고급 사용자 뷰에서 관심 필드와 선택 필드를 통해 사용자에 의해 선택된 필드들만 선택적으로 표시할 수 있다. 또한, 검색 결과 제공부(30037)는 관심 필드 및 선택 필드의 필드값(예: 숫자)에 대한 사용자 입력을 통해 필드별 검색 데이터 통계를 표시할 수 있다. 이에 대한 상세한 내용은 아래에서 설명된다.In one embodiment, the search result provider (30037) can selectively display only fields selected by the user through the fields of interest and selection fields in the advanced user view. Furthermore, the search result provider (30037) can display field-specific search data statistics based on user input for field values (e.g., numbers) in the fields of interest and selection fields. This is described in detail below.

본 발명에 따르면, 한 개의 화면에서 쿼리 기반 및 UI 기반 검색을 동시 제공하고 UI 기반 검색 키워드를 검색 쿼리와 연계할 수 있다.According to the present invention, query-based and UI-based searches can be provided simultaneously on a single screen, and UI-based search keywords can be linked to search queries.

또한, 본 발명에 따르면, 사용자의 시스템 사용 패턴에 따라 UI를 커스터마이징할 수 있다.Additionally, according to the present invention, the UI can be customized according to the user's system usage pattern.

도 90은 실시 예에 따른 검색 조건 설정 화면의 예를 개시하는 도면FIG. 90 is a drawing disclosing an example of a search condition setting screen according to an embodiment.

도시한 예에서, 본 발명의 검색 조건 설정 화면(30039)을 제공할 수 있다. 여기에서, 검색 조건 설정 화면(30039)은 로그 데이터를 검색하는 검색 조건을 설정하는 화면을 나타낼 수 있다. 일 실시 예에서, 검색 조건 설정 화면(30039)은 이벤트 로그 검색 화면(30038)을 통해 팝업 형태로 활성화될 수 있다.In the illustrated example, a search condition setting screen (30039) of the present invention may be provided. Here, the search condition setting screen (30039) may represent a screen for setting search conditions for searching log data. In one embodiment, the search condition setting screen (30039) may be activated in a pop-up form through the event log search screen (30038).

일 실시 예에서, 검색 조건 설정 화면(30039)은 검색 기간, 이벤트 유형 및 검색 조건을 포함할 수 있다. 여기서, 검색 조건은 사용자에 의해 선택된 이벤트 유형이 포함하고 있는 이벤트 필드 정보가 표시된다. 예를 들어, 이벤트 유형, 호스트(host), 원본 길이 및 장비 ID를 포함할 수 있다.In one embodiment, the search condition setting screen (30039) may include a search period, an event type, and a search condition. Here, the search condition displays event field information included in the event type selected by the user. For example, the search condition may include the event type, host, original length, and device ID.

일 실시 예에서, 사용자 별로 전체 이벤트 목록에서 자주 사용되는 이벤트를 선택하여 자주 사용 이벤트 유형으로 관리될 수 있다. 예를 들어, AGENT, DB, SYSLOG 등의 이벤트 유형은 자주 사용되는 이벤트 유형으로 표시될 수 있다. 일 실시 예에서, 사용자는 이벤트 유형 리스트에 포함된 각 이벤트 유형에 대한 클릭 입력을 통해 해당 이벤트를 이벤트 유형에 추가시킬 수 있다. In one embodiment, frequently used events can be selected from the entire event list for each user and managed as frequently used event types. For example, event types such as AGENT, DB, and SYSLOG can be displayed as frequently used event types. In one embodiment, a user can click an event type in the event type list to add that event to the event type item of the search condition.

일 실시 예에서, 검색 조건 설정 후, 검색 시작에 대한 사용자 입력이 획득되는 경우, 검색 조건을 적용하고 검색 요청을 통해 로그 데이터의 검색이 진행될 수 있다.In one embodiment, after setting search conditions, if a user input for starting a search is obtained, the search conditions can be applied and a search of log data can be performed through a search request.

일 실시 예에서, 검색 조건 설정 후, 적용에 대한 사용자 입력이 획득되는 경우, 사용자에 의해 입력된 검색 조건이 설정되어 이벤트 로그 검색 화면(30038)에 표시될 수 있다. 이에 대한 자세한 내용은 아래에서 설명된다.In one embodiment, after setting search conditions, if user input for application is obtained, the search conditions entered by the user may be set and displayed on the event log search screen (30038). This is described in detail below.

일 실시 예에서, 검색 조건 설정 후, 쿼리 생성에 대한 사용자 입력이 획득되는 경우, 해당 검색 조건이 검색 쿼리의 형태로 생성되어 이벤트 로그 검색 화면(30038)에 표시될 수 있다. 이에 대한 자세한 내용은 아래에서 설명된다.In one embodiment, after setting search conditions, if user input for query generation is obtained, the search conditions may be generated in the form of a search query and displayed on the event log search screen (30038). This is described in detail below.

도 91은 실시 예에 따른 검색 조건 UI의 예를 개시하는 도면Figure 91 is a drawing disclosing an example of a search condition UI according to an embodiment.

도시한 예에서, 본 발명의 검색 조건 UI(30040)는 이벤트 로그 검색 화면을 통해 제공될 수 있다. 일 실시 예에서, 검색 조건 설정 후, 적용에 대한 사용자 입력이 획득되는 경우, 사용자에 의해 입력된 검색 조건이 설정되어 이벤트 로그 검색 화면(30038)에 표시될 수 있다. 즉, 해당 검색 조건이 검색 조건 UI(30040) 형태로 표시되어 사용자가 해당 검색 조건을 용이하게 확인할 수 있다.In the illustrated example, the search condition UI (30040) of the present invention may be provided via an event log search screen. In one embodiment, after setting a search condition, if a user input for application is obtained, the search condition entered by the user may be set and displayed on the event log search screen (30038). That is, the search condition is displayed in the form of a search condition UI (30040), allowing the user to easily check the search condition.

일 실시 예에서, 적어도 하나의 검색 조건 UI(30040)에 대한 사용자의 삭제 입력에 따라 해당 검색 조건만이 선택적으로 삭제될 수 있다. In one embodiment, in response to a user's deletion input on at least one search condition UI (30040), only the corresponding search condition may be selectively deleted.

일 실시 예에서, 재생 아이콘에 대한 사용자 입력을 통해 검색 조건 UI(30040)에 대응하는 검색 요청이 수행될 수 있다. 검색 요청에 따라 로그 데이터의 검색이 수행될 수 있다. 다만, 검색을 요청하는 아이콘의 형태 및 배치는 다양하게 구현될 수 있으며 이에 제한되지 않는다.In one embodiment, a search request corresponding to a search condition UI (30040) may be performed via user input for a play icon. Log data may be searched based on the search request. However, the shape and arrangement of the icon requesting the search may be implemented in various ways and are not limited thereto.

도 92는 실시 예에 따른 검색 쿼리의 예를 개시하는 도면FIG. 92 is a drawing disclosing an example of a search query according to an embodiment.

도시한 예에서, 본 발명의 검색 쿼리(30041)는 이벤트 로그 검색 화면을 통해 제공될 수 있다. 일 실시 예에서, 검색 조건 설정 후, 쿼리 생성에 대한 사용자 입력이 획득되는 경우, 해당 검색 조건이 검색 쿼리(30041)의 형태로 생성되어 이벤트 로그 검색 화면(30038)에 표시될 수 있다.In the illustrated example, the search query (30041) of the present invention may be provided through an event log search screen. In one embodiment, after setting search conditions, if user input for query generation is obtained, the search conditions may be generated in the form of a search query (30041) and displayed on the event log search screen (30038).

일 실시 예에서, 검색 쿼리(30041)는 검색 조건에 대응하는 코드 형태로 생성될 수 있다. 예를 들어, 검색 기간은 time:['2024-01-10T00:00:00'~'2024-01-11T00:00:00'}, 이벤트 유형은 eventtype:'SYSLOG', 호스트는 host:'10.0.2.7'로 표현되며, 각 검색 조건항목은 AND 연산자로 결합될 수 있다.In one embodiment, a search query (30041) may be generated in the form of a code corresponding to a search condition. For example, the search period may be expressed as time:['2024-01-10T00:00:00'~'2024-01-11T00:00:00'}, the event type as eventtype:'SYSLOG', and the host as host:'10.0.2.7', and each search condition item may be combined using an AND operator.
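As a non-limiting illustration, generating a search query string from UI search conditions could be sketched as follows. The function name `build_query` is hypothetical, and the exact delimiters (in particular, closing the time range with a square bracket) and quoting rules are assumptions modeled loosely on the example above, not a definition of the query grammar.

```python
def build_query(start, end, **fields):
    """Join a time range and field conditions with the AND operator."""
    parts = [f"time:['{start}'~'{end}']"]
    # Each remaining search condition item becomes name:'value'.
    parts += [f"{name}:'{value}'" for name, value in fields.items()]
    return " AND ".join(parts)


query = build_query(
    "2024-01-10T00:00:00",
    "2024-01-11T00:00:00",
    eventtype="SYSLOG",
    host="10.0.2.7",
)
```

Because the result is plain text, an advanced user could edit it directly before submitting the search request.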

일 실시 예에서, 검색 쿼리(30041)의 코드는 사용자 입력에 따라 수정될 수 있으며, 이를 통해 고급 사용자는 복잡한 검색 조건을 설정할 수 있다.In one embodiment, the code of the search query (30041) can be modified based on user input, allowing advanced users to set complex search conditions.

일 실시 예에서, 재생 아이콘에 대한 사용자 입력을 통해 검색 쿼리(30041)에 대응하는 검색 요청이 수행될 수 있다. 검색 요청에 따라 로그 데이터의 검색이 수행될 수 있다. 다만, 검색을 요청하는 아이콘의 형태 및 배치는 다양하게 구현될 수 있으며 이에 제한되지 않는다.In one embodiment, a search request corresponding to a search query (30041) may be performed via user input to a play icon. Log data may be searched based on the search request. However, the shape and arrangement of the icon requesting the search may be implemented in various ways and are not limited thereto.

도 93은 실시 예에 따른 일반 검색 결과 표시 영역의 예를 개시하는 도면FIG. 93 is a drawing disclosing an example of a general search results display area according to an embodiment.

도시한 예에서, 본 발명의 일반 검색 결과 표시 영역(30042)은 이벤트 로그 검색 화면을 통해 제공될 수 있다.In the illustrated example, the general search result display area (30042) of the present invention may be provided through an event log search screen.

일 실시 예에서, 일반 검색 결과 표시 영역(30042)은 일반 사용자 뷰에서의 검색 결과 조회를 포함할 수 있다. 이 경우, 검색 결과에 포함된 로그 데이터가 적어도 하나의 필드 별로 구분되어 표시될 수 있다. 예를 들어, 검색 조건이 이벤트 타입이 SYSLOG이고, 호스트가 10.0.2.7인 경우, 이벤트 타입이 SYSLOG이고, 호스트가 10.0.2.7인 로그 데이터가 검색되며, 해당 로그 데이터는 발생 시간, 이벤트 유형, 호스트, 원본 길이, 장비 ID 및 원본 로그와 같은 필드로 구분되어 표시될 수 있다.In one embodiment, the general search results display area (30042) may include search results in a general user view. In this case, log data included in the search results may be displayed by being separated by at least one field. For example, if the search condition is an event type of SYSLOG and the host is 10.0.2.7, log data with an event type of SYSLOG and a host of 10.0.2.7 may be searched, and the log data may be displayed by being separated by fields such as occurrence time, event type, host, original length, equipment ID, and original log.

이와 같이, 본 발명에 따르면, 일반 사용자 뷰를 통해 일반 사용자는 로그 데이터를 칼럼별로 구분하여 용이하게 식별할 수 있다.In this way, according to the present invention, a general user can easily identify log data by dividing it into columns through a general user view.

도 94는 실시 예에 따른 검색 옵션 설정 화면의 예를 개시하는 도면FIG. 94 is a drawing disclosing an example of a search option setting screen according to an embodiment.

도시한 예에서, 본 발명의 검색 옵션 설정 화면(30043)을 제공할 수 있다. 일 실시 예에서, 검색 옵션 설정 화면(30043)은 이벤트 로그 검색 화면을 통해 팝업 형태로 활성화될 수 있다.In the illustrated example, a search option setting screen (30043) of the present invention may be provided. In one embodiment, the search option setting screen (30043) may be activated in a pop-up form through the event log search screen.

일 실시 예에서, 검색 옵션 설정 화면(30043)은 이벤트 포함 활성화 설정, 고속 검색 설정, 고급 사용자 뷰 전환 설정, 구버전 검색 활성화 설정 및 검색결과 수 제한 설정을 포함할 수 있다. 이 경우, 고급 사용자 뷰 전환 설정은 일반 사용자 뷰에서 고급 사용자 뷰로 전환하는 검색 옵션을 의미할 수 있다.In one embodiment, the search option settings screen (30043) may include settings for enabling event inclusion, setting for fast search, setting for switching to an advanced user view, setting for enabling old version search, and setting for limiting the number of search results. In this case, the setting for switching to an advanced user view may refer to a search option for switching from a general user view to an advanced user view.

사용자 입력에 의해 고급 사용자 뷰가 활성화(ON)되는 경우, 이벤트 로그 검색 화면의 일반 검색 결과 표시 영역이 고급 검색 결과 표시 영역으로 전환될 수 있다. 이에 대한 자세한 내용은 아래에서 설명된다.When the Advanced User View is enabled (ON) by user input, the general search results display area of the Event Log Search screen can be switched to the Advanced Search results display area. This is explained in more detail below.

도 95는 실시 예에 따른 고급 검색 결과 표시 영역의 예를 개시하는 도면FIG. 95 is a drawing disclosing an example of an advanced search results display area according to an embodiment.

도시한 예에서, 본 발명의 고급 검색 결과 표시 영역은 이벤트 로그 검색 화면을 통해 제공될 수 있다. 여기서, 고급 검색 결과 표시 영역은 필드 설정 영역(30044) 및 검색 결과 표시 영역(30045)을 포함할 수 있다.In the illustrated example, the advanced search results display area of the present invention may be provided through an event log search screen. Here, the advanced search results display area may include a field setting area (30044) and a search results display area (30045).

필드 설정 영역(30044)은 필드 검색, 선택 필드 및 관심 필드를 포함할 수 있다. 여기서, 필드 검색은 사용자가 직접 필드명을 입력하여 검색하는 영역을 의미할 수 있다. 관심 필드는 로그 데이터에 포함된 다수의 필드를 포함할 수 있다. 이 경우, 필드 검색을 통해 검색된 필드가 관심 필드로 추가될 수 있다.The field setting area (30044) may include field search, selection fields, and fields of interest. Here, field search may refer to an area where a user can directly enter a field name to search. Fields of interest may include multiple fields contained in log data. In this case, fields found through field search may be added as fields of interest.

선택 필드는 관심 필드에 포함된 다수의 필드 중 사용자에 의해 선택된 적어도 하나의 필드를 포함할 수 있다. 이 경우, 선택 필드에 포함된 적어도 하나의 필드가 검색 결과 표시 영역(30045)에 추가될 수 있다.The selection field may include at least one field selected by the user from among multiple fields included in the interest field. In this case, at least one field included in the selection field may be added to the search results display area (30045).

검색 결과 표시 영역(30045)은 선택 필드에 대응하는 로그 데이터의 검색 결과를 표시할 수 있다. 예를 들어, 선택 필드에 device_ip와 이벤트 유형이 포함된 경우, 검색 결과 표시 영역(30045)은 device_ip와 이벤트 유형 필드에 대한 로그 데이터를 표시할 수 있다.The search results display area (30045) can display search results of log data corresponding to the selected field. For example, if the selected field includes device_ip and event type, the search results display area (30045) can display log data for the device_ip and event type fields.

예를 들어, 검색 조건이 이벤트 타입이 SYSLOG이고, 호스트가 10.0.2.7이고, 선택 필드에 device_ip와 이벤트 유형이 포함된 경우, 이벤트 타입이 SYSLOG이고, 호스트가 10.0.2.7인 로그 데이터가 검색되며, 해당 로그 데이터는 발생 시간, device_ip, 이벤트 유형 및 원본 로그와 같은 사용자가 선택한 필드로 구분되어 표시될 수 있다.For example, if the search condition is that the event type is SYSLOG, the host is 10.0.2.7, and the selection field includes device_ip and event type, log data where the event type is SYSLOG and the host is 10.0.2.7 will be searched, and the log data can be displayed separated by user-selected fields such as occurrence time, device_ip, event type, and original log.
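As a non-limiting illustration, restricting each result row to the user-selected fields could be sketched as a simple projection. The row layout, the column names `time` and `raw` for occurrence time and original log, and the function name `project` are assumptions of this sketch.

```python
def project(rows, selected_fields):
    """Keep occurrence time, the selected fields, and the original log for each row."""
    columns = ["time", *selected_fields, "raw"]
    # Fields missing from a row are shown as None rather than raising an error.
    return [{column: row.get(column) for column in columns} for row in rows]
```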

일 실시 예에서, 필드 설정 영역(30044)의 각 필드에 대한 통계 데이터 표시 화면(30046)이 팝업 형태로 활성화될 수 있다. 일 실시 예에서, 통계 데이터 표시 화면(30046)은 선택된 필드값의 유형에 대한 개수 기준 상위 순위에 대한 통계 수치를 나타낼 수 있다. In one embodiment, a statistical data display screen (30046) for each field in the field setting area (30044) may be activated in a pop-up form. In one embodiment, the statistical data display screen (30046) may show statistics for the top-ranked values of the selected field, ranked by the number of occurrences of each value.
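As a non-limiting illustration, the count-based top ranking of field values could be sketched with a frequency counter over the search results. The row layout and the function name `top_values` are assumptions of this sketch.

```python
from collections import Counter


def top_values(rows, field, n=5):
    """Return the n most frequent values of `field` as (value, count) pairs."""
    counts = Counter(row[field] for row in rows if field in row)
    return counts.most_common(n)
```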

도 96은 실시 예에 따른 데이터 관리 방법이 사용자 유형에 따른 로그 데이터 검색을 수행하는 예를 개시하는 흐름도Figure 96 is a flowchart showing an example of a data management method according to an embodiment performing a log data search according to a user type.

데이터 관리 소프트웨어 패키지 내에 포함된 적어도 하나의 엔진 또는 모듈을 제어하는 데이터 관리 플랫폼에서, 사용자 입력에 의한 검색 조건을 획득한다(S30501). 일 실시 예에서, 검색 조건은 검색 기간, 이벤트 유형, 호스트, 원본 길이 및 장비 ID 중 적어도 하나를 포함할 수 있다. 이에 대하여는 도 89 및 도 90에서 상술한 내용을 참고하도록 한다.In a data management platform that controls at least one engine or module included in a data management software package, search conditions are acquired based on user input (S30501). In one embodiment, the search conditions may include at least one of a search period, an event type, a host, an original length, and a device ID. For further details, please refer to the details described above in FIGS. 89 and 90.

검색 조건에 기반하여 로그 데이터를 검색한다(S30503). 일 실시 예에서, 검색 조건에 대응하는 검색 쿼리를 생성하고, 검색 쿼리에 기반하는 로그 데이터를 검색할 수 있다. 이에 대하여는 도 91 및 도 92에서 상술한 내용을 참고하도록 한다.Log data is searched based on search conditions (S30503). In one embodiment, a search query corresponding to the search conditions can be generated, and log data based on the search query can be searched. For details, refer to the contents described above in FIGS. 91 and 92.

사용자 유형에 따라 검색 결과를 제공한다(S30505). 일 실시 예에서, 사용자 유형이 일반 사용자 유형인 경우, 적어도 하나의 필드 별로 구분된 로그 데이터를 포함하는 검색 결과를 제공할 수 있다. 일 실시 예에서, 사용자 유형이 고급 사용자 유형인 경우, 로그 데이터의 다수의 관심 필드 중 사용자 입력에 의해 선택된 선택 필드에 대응하는 검색 결과 및 선택 필드 및 관심 필드에 대한 통계 데이터 중 적어도 하나를 제공할 수 있다. 이에 대하여는 도 93 내지 도 95에서 상술한 내용을 참고하도록 한다.Search results are provided based on the user type (S30505). In one embodiment, if the user type is a general user type, search results including log data categorized by at least one field may be provided. In one embodiment, if the user type is an advanced user type, search results corresponding to a selection field selected by user input among multiple fields of interest in the log data and at least one of statistical data for the selection field and the field of interest may be provided. For this purpose, please refer to the contents described above in FIGS. 93 to 95.

도 97은 본 발명의 데이터 관리 플랫폼에서 에이전트를 관리하는 실시 예를 설명하는 도면이다.Figure 97 is a drawing illustrating an embodiment of managing an agent in the data management platform of the present invention.

로그를 수집하기 위해 에이전트를 사용할 때 다수의 에이전트에 대한 설정 변경, 버전 관리, 상태 모니터링과 같은 작업들은 많은 수작업을 요구하고 복잡성을 증가시킨다. 이로 인해 업무 효율성이 저하될 수 있다. 본 발명은 데이터 수집을 위한 에이전트에 대한 중앙 관리를 통해 이러한 문제를 해결하고자 한다. 이를 통해 에이전트에 대한 패치 적용, 동작 관리, 설정 관리 등의 작업을 자동화하고 중앙화하여 업무의 효율성을 증가시킬 수 있다.When using agents to collect logs, tasks such as changing settings, managing versions, and monitoring the status of multiple agents require significant manual effort and increase complexity. This can reduce work efficiency. The present invention aims to address these issues through centralized management of agents for data collection. This automates and centralizes tasks such as patch application, operation management, and configuration management, thereby increasing work efficiency.

본 발명의 데이터 관리 플랫폼(10000)의 수집 모듈(20001)은 에이전트 관리부(2032)를 통하여 적어도 하나의 에이전트(20008a, 20008b, …)를 관리할 수 있다. 여기에서, 에이전트(Agent)는 컴퓨터 시스템 또는 네트워크 상에서 특정한 작업을 수행하기 위해 설치되고 실행되는 소프트웨어 모듈 또는 프로그램에 대응할 수 있다. 본 발명의 에이전트는 호스트 서버에 설치되어 설정된 기능(예를 들어, 데이터 수집 기능)을 수행할 수 있다.The collection module (20001) of the data management platform (10000) of the present invention can manage at least one agent (20008a, 20008b, …) through the agent management unit (2032). Here, the agent may correspond to a software module or program installed and executed to perform a specific task on a computer system or network. The agent of the present invention can be installed on a host server and perform a set function (e.g., a data collection function).

이를 위하여, 데이터 관리 플랫폼(10000)의 수집 모듈(20001)은 에이전트 관리부(2032)를 포함할 수 있다.For this purpose, the collection module (20001) of the data management platform (10000) may include an agent management unit (2032).

에이전트 관리부(2032)는 에이전트에 대한 버전(version)을 관리할 수 있다. 에이전트 관리부(2032)는 수집 대상에 설치된 에이전트에 대한 버전 정보를 모니터링하며, 신규 버전으로 업데이트하는 명령을 수행할 수 있다. 여기에서, 수집 대상은 본 발명의 데이터 관리 플랫폼(10000)에서 로그를 수집하고자 하는 호스트 서버에 해당한다. 또한, 에이전트 관리부(2032)는 적어도 하나의 에이전트(20008a, 20008b, …)의 수집 상태를 모니터링할 수 있다. The agent management unit (2032) can manage the version of the agent. The agent management unit (2032) monitors the version information of the agent installed on the collection target and can execute a command to update the agent to a new version. Here, the collection target corresponds to a host server from which the data management platform (10000) of the present invention is to collect logs. In addition, the agent management unit (2032) can monitor the collection status of at least one agent (20008a, 20008b, …).

일 실시 예에서, 에이전트 관리부(2032)는 적어도 하나의 에이전트(20008a, 20008b, …)별로 수행에 필요한 명령어를 생성하고, 해당 에이전트에 배포할 수 있다. 예를 들어, 에이전트 관리부(2032)는 제 1 에이전트(20008a)를 위한 로그 데이터 수집 명령어를 생성하고, 제 1 에이전트(20008a)에게 로그 데이터 수집 명령어를 배포할 수 있다. In one embodiment, the agent management unit (2032) may generate, for each of the at least one agent (20008a, 20008b, …), the commands that agent needs to execute and distribute them to the corresponding agent. For example, the agent management unit (2032) may generate a log data collection command for the first agent (20008a) and distribute the log data collection command to the first agent (20008a).
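As a non-limiting illustration, the version monitoring performed by the agent management unit could be sketched as a comparison of each installed agent version against the newest available version. The dotted numeric version format, the mapping `installed`, and the function name `outdated_agents` are assumptions of this sketch.

```python
def outdated_agents(installed, latest):
    """Return hosts whose installed agent version is older than `latest`.

    `installed` maps each collection-target host to a version string such
    as "1.9.3"; versions are compared numerically, component by component.
    """
    def parse(version):
        return tuple(int(part) for part in version.split("."))

    return sorted(host for host, version in installed.items() if parse(version) < parse(latest))
```

Comparing parsed tuples rather than raw strings avoids ordering "1.10.0" before "1.9.3".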

일 실시 예에서, 에이전트 관리부(2032)는 적어도 하나의 에이전트(20008a, 20008b, …)가 설치된 서버에 대한 목록을 관리할 수 있다.In one embodiment, the agent management unit (2032) can manage a list of servers on which at least one agent (20008a, 20008b, …) is installed.

이때, 에이전트는 데이터 수집 기능을 수행하기 위한 적어도 하나의 모듈을 포함할 수 있다. 일 실시 예에서, 제 1 에이전트(20008a)는 와치독 모듈(Watchdog module, 2033) 및 워커 모듈(Worker module, 2034)을 포함할 수 있다. 여기에서, 워커 모듈은 실질적으로 데이터를 수집하는 기능을 수행하며, 에이전트 모듈이라는 이름으로 지칭될 수 있다.At this time, the agent may include at least one module for performing a data collection function. In one embodiment, the first agent (20008a) may include a watchdog module (2033) and a worker module (2034). Here, the worker module actually performs the function of collecting data and may be referred to as an agent module.

여기에서, 와치독 모듈(2033)은 제 1 에이전트(20008a)에서 사용되는 모듈 중 하나로, 에이전트의 예기치 않은 동작 또는 비정상적인 상태를 감지하는 기능을 수행할 수 있다. 일 실시 예에서, 와치독 모듈(2033)은 데이터 관리 플랫폼(10000)과의 통신을 담당하고, 워커 모듈(2034)에 대한 제어를 담당할 수 있다.Here, the watchdog module (2033) is one of the modules used in the first agent (20008a) and can perform a function of detecting unexpected behavior or abnormal states of the agent. In one embodiment, the watchdog module (2033) is responsible for communication with the data management platform (10000) and can be responsible for controlling the worker module (2034).

워커 모듈(2034)은 제 1 에이전트(20008a)의 핵심적인 기능을 수행하는 모듈로 주로 백그라운드에서 동작하며 특정 작업을 처리하고 결과를 반환하는 역할을 담당할 수 있다.The worker module (2034) is a module that performs the core functions of the first agent (20008a) and mainly operates in the background, and can be responsible for processing specific tasks and returning results.

특히, 본 발명에서는, 워커 모듈(2034)은 와치독 모듈(2033)로부터 전달받은 명령어를 수행하고 그 수행 결과를 데이터 관리 플랫폼(10000)으로 반환할 수 있다.In particular, in the present invention, the worker module (2034) can execute a command received from the watchdog module (2033) and return the execution result to the data management platform (10000).

이에 따라, 데이터 관리 플랫폼(10000)은 적어도 하나의 에이전트(20008a, 20008b)로부터 수집된 데이터를 데이터베이스(20007)에 저장할 수 있다.Accordingly, the data management platform (10000) can store data collected from at least one agent (20008a, 20008b) in a database (20007).

특히, 본 도면에서는 2개의 에이전트만을 예로 들어 설명하였으나, 에이전트가 수십 내지 수백개가 되는 경우에도 동일하게 적용할 수 있다. 즉, 수십 내지 수백개의 에이전트를 동시에 업데이트 하기 위하여 정책을 배포할 수 있어 업무의 효율성이 극대화될 수 있다.In particular, while this diagram illustrates only two agents as an example, the same approach can be applied to dozens or even hundreds of agents. In other words, policies can be deployed to update dozens or even hundreds of agents simultaneously, maximizing work efficiency.

도 98는 본 발명의 데이터 관리 플랫폼에서 에이전트를 설치하는 실시 예를 설명하는 도면이다.Figure 98 is a drawing illustrating an embodiment of installing an agent in the data management platform of the present invention.

This drawing illustrates an embodiment of installing an agent.

본 발명의 데이터 관리 플랫폼을 이용하는 사용자는 에이전트를 실행하기 위한 JRE(운영 체제 버전에 맞는 자바 런타임 환경)를 선택할 수 있다. 이때, JRE는 에이전트가 실행될 때 필요한 JRE 설치 파일을 의미한다. 이러한 선택을 통해 사용자는 에이전트 소프트웨어를 다운로드하는 과정을 관리할 수 있다.Users using the data management platform of the present invention can select a JRE (Java Runtime Environment appropriate for the operating system version) to run the agent. The JRE refers to the JRE installation file required for agent execution. This selection allows users to manage the agent software download process.

설치된 에이전트는 로그 데이터를 수집하여 특정 컬렉터(collector)로 수집된 데이터를 전송할 수 있다. 여기서 컬렉터는 설치된 에이전트로부터 로그를 수집하는 대상 기기나 서버를 나타낼 수 있다.An installed agent can collect log data and send it to a specific collector. The collector can represent a target device or server that collects logs from the installed agent.

In particular, the present invention differs from conventional approaches in that it provides a download function for each operating system. Here, the download type includes the compression format of the installation file, and the user can download the agent service with its settings included. For example, the settings may include the JRE, the collector to which data is sent, and the download type.

More specifically, to install the agent, the agent installation file is decompressed on the target server (for example, a host server), and the installation is run by executing the 'setup.bat' file in the 'setup_win' folder on Windows, or the 'setup.sh' file in the 'setup' folder on Linux or Unix. The installation process may include a step of checking permissions so that the agent is registered correctly on each OS; that is, the user checks that the file has execute permission and grants it if necessary.

도 99는 본 발명의 데이터 관리 플랫폼에서 에이전트를 승인하는 실시 예를 설명하는 도면이다.Figure 99 is a drawing illustrating an embodiment of approving an agent in the data management platform of the present invention.

본 도면은 에이전트의 승인 처리 및 설치 프로세스를 설명하는 도면이다.This drawing illustrates the agent's approval processing and installation process.

In the existing process, after an agent was installed on a server, the administrator had to use the generated agent ID to complete the agent's installation manually on the data management platform. This cumbersome process required the administrator to track and register each agent installation individually.

본 발명에 따르면, 에이전트의 와치독 모듈(2033)이 서버에 설치된 이후 실행되면 데이터 관리 플랫폼에 접속하고, 관리자는 승인 시 적용 버전을 선택하여 에이전트의 설정을 완료할 수 있다. 이 과정에서 에이전트 아이디는 자동으로 생성되며, 관리자의 승인이 이루어지면 워커 모듈(2034)이 서버로 다운로드 후 실행되고 데이터 관리 플랫폼에서 활성화된다.According to the present invention, after the agent's watchdog module (2033) is installed on the server and executed, it connects to the data management platform, and the administrator can complete the agent's configuration by selecting the applicable version upon approval. During this process, an agent ID is automatically generated, and upon administrator approval, the worker module (2034) is downloaded to the server, executed, and activated on the data management platform.

일 실시 예에서, 에이전트 승인 팝업은 에이전트 서비스 승인 시에 다운로드하여 설치할 에이전트의 버전 정보, 에이전트 서비스 승인 대기 목록 및 에이전트 서비스 승인 처리 목록을 포함할 수 있다.In one embodiment, the agent approval pop-up may include version information of the agent to be downloaded and installed upon agent service approval, an agent service approval waiting list, and an agent service approval processing list.

Accordingly, an administrator connected to the data management platform can select the version of the agent to be installed on the server, select at least one agent from the agent service approval waiting list, and approve it. The worker module of the version selected at approval is then installed on the server of each approved agent.

일 실시 예에서, 에이전트는 와치독 모듈과 워커 모듈을 포함하고 있으며, 에이전트를 설치할 때는 와치독 모듈만 설치 및 실행되며, 이후 승인 과정을 통해 버전이 선택되면 해당 버전의 워커 모듈이 서버에 다운로드 및 설치될 수 있다. 이후, 설치된 워커 모듈이 실행되어 데이터 관리 플랫폼에서 활성화된다.In one embodiment, the agent includes a watchdog module and a worker module. When installing the agent, only the watchdog module is installed and executed. Afterwards, when a version is selected through an approval process, the worker module of that version can be downloaded and installed on the server. The installed worker module is then executed and activated on the data management platform.
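
The approval flow described above can be pictured with a minimal sketch. The class name, the state names, the use of a random UUID as the automatically generated agent ID, and the `DOWNLOAD_WORKER` command text are all hypothetical; the specification does not define them.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch: a watchdog registers, then an administrator approves it with a worker version. */
final class AgentApprovalSketch {
    enum State { PENDING, APPROVED }

    static final class AgentRecord {
        final String agentId;
        final String host;
        State state = State.PENDING;
        String workerVersion; // stays null until the agent is approved

        AgentRecord(String agentId, String host) {
            this.agentId = agentId;
            this.host = host;
        }
    }

    private final Map<String, AgentRecord> agents = new LinkedHashMap<>();

    /** Called when a newly installed watchdog connects; the ID is generated automatically (UUID is an assumption). */
    AgentRecord register(String host) {
        AgentRecord record = new AgentRecord(UUID.randomUUID().toString(), host);
        agents.put(record.agentId, record);
        return record;
    }

    /** The administrator approves a pending agent and chooses the worker version to install. */
    String approve(String agentId, String version) {
        AgentRecord record = agents.get(agentId);
        if (record == null || record.state != State.PENDING) {
            throw new IllegalStateException("agent is not waiting for approval: " + agentId);
        }
        record.state = State.APPROVED;
        record.workerVersion = version;
        return "DOWNLOAD_WORKER " + version; // assumed command sent back to the watchdog
    }

    public static void main(String[] args) {
        AgentApprovalSketch platform = new AgentApprovalSketch();
        AgentRecord record = platform.register("host-a");
        System.out.println(record.state + " -> " + platform.approve(record.agentId, "1.2.0"));
    }
}
```

Only the watchdog exists on the host before `approve` is called, which mirrors the two-stage installation in this embodiment.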

또한, 일 실시 예에서, 데이터 관리 플랫폼은 에이전트 일괄제어 버튼(xx)을 선택하는 신호를 수신하는 경우, 제 1 에이전트 버전을 적용 대상으로 설정된 에이전트에 일괄적으로 설치할 수 있다.Additionally, in one embodiment, when the data management platform receives a signal for selecting the agent batch control button (xx), the first agent version can be installed in batches on the agents set as the target.

도 100는 본 발명의 데이터 관리 플랫폼에서 에이전트를 관리하는 실시 예를 설명하는 도면이다.Figure 100 is a drawing illustrating an embodiment of managing an agent in a data management platform of the present invention.

본 도면은 에이전트의 채널 유형을 설정할 수 있는 화면을 나타낸다. 이때, 설정은 에이전트에서 설정하는 것이 아닌 데이터 관리 플랫폼의 서버에서 설정하는 것으로, 에이전트는 서버에서 설정된 설정 값을 가져가 적용할 수 있다.This diagram shows a screen where you can configure an agent's channel type. Note that the settings are set on the data management platform's server, not on the agent itself. The agent can then retrieve and apply the settings set on the server.

일 실시 예에서, 에이전트 설정 팝업을 통해 채널 유형을 설정할 수 있다.In one embodiment, the channel type can be set via the agent settings pop-up.

보다 상세하게는, 에이전트 설정은 채널이 구분되어 설정되며 각 설정마다 어댑터(adaptor) 유형이 존재하고, 어댑터의 유형에 기초하여 수집하는 방식이 구분될 수 있다. 이때, 각 채널의 아이디는 중복될 수 없으며 채널 별로 수집 설정이 다르게 동작될 수 있다.More specifically, agent settings are configured by channel, each with its own adapter type. Collection methods can be differentiated based on the adapter type. Each channel ID cannot be duplicated, and collection settings can operate differently for each channel.

또한, 각 채널은 어댑터(PING, SYSMON, WINLOG, FILE)를 지정해야 하고, 각 어댑터 별로 공통 설정 항목 및 별도의 설정 항목이 존재할 수 있다. 이하에서 자세히 설명하도록 한다.Additionally, each channel must specify an adapter (PING, SYSMON, WINLOG, FILE), and each adapter may have common and separate settings. This is explained in detail below.

For example, the common channel settings are the adapter type, the collection interval (task.interval), whether collection is active (task.active), and whether collection runs are logged (task.logging), all of which must be set. The collection interval is specified in milliseconds (thousandths of a second). task.active accepts true or false and defaults to true. task.logging accepts true or false and defaults to false.
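
As an illustration of the common channel settings, the following sketch reads them from a `java.util.Properties` object and applies the stated defaults. The property key `adapter` and the class name are assumptions; only task.interval, task.active, and task.logging are named in the text.

```java
import java.util.Properties;

/** Sketch of the common channel settings, using the defaults stated in the text. */
final class ChannelCommonSettings {
    final String adapter;
    final long intervalMillis;
    final boolean active;
    final boolean logging;

    private ChannelCommonSettings(String adapter, long intervalMillis, boolean active, boolean logging) {
        this.adapter = adapter;
        this.intervalMillis = intervalMillis;
        this.active = active;
        this.logging = logging;
    }

    static ChannelCommonSettings from(Properties p) {
        String adapter = p.getProperty("adapter"); // key name "adapter" is an assumption
        String interval = p.getProperty("task.interval");
        if (adapter == null || interval == null) {
            throw new IllegalArgumentException("adapter and task.interval must be set");
        }
        return new ChannelCommonSettings(
                adapter,
                Long.parseLong(interval.trim()),                             // milliseconds
                Boolean.parseBoolean(p.getProperty("task.active", "true")),   // default true
                Boolean.parseBoolean(p.getProperty("task.logging", "false"))); // default false
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("adapter", "PING");
        p.setProperty("task.interval", "5000");
        ChannelCommonSettings s = from(p);
        System.out.println(s.adapter + " every " + s.intervalMillis + " ms, active=" + s.active
                + ", logging=" + s.logging);
    }
}
```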

일 실시 예에서, 어댑터가 SYSMON, WINLOG, FILE인 경우, 메시지 암호화 옵션(net.secure), 메시지 압축 옵션(net.compress) 및 메시지 무결성 옵션(net.mac)을 설정할 수 있다. 예를 들어, 메시지 암호화 옵션, 메시지 압축 옵션 및 메시지 무결성 옵션의 디폴트 값은 false이며, true 또는 false 값으로 설정할 수 있다.In one embodiment, if the adapter is SYSMON, WINLOG, or FILE, the message encryption option (net.secure), the message compression option (net.compress), and the message integrity option (net.mac) can be set. For example, the default values for the message encryption option, the message compression option, and the message integrity option are false, and can be set to true or false.

또한, PING 어댑터는 에이전트와 서버간 연결이 잘 되는지 테스트하거나 heartbeat 용도로 사용할 수 있다. 예를 들어, PING 어댑터를 heartbeat 용도로 사용할 때에는 상관 분석 및 알람 기능을 사용하여 에이전트 설치 서버가 다운된 경우 알림을 받도록 설정할 수 있다.Additionally, the PING adapter can be used to test the connection between the agent and the server or for heartbeat purposes. For example, when using the PING adapter for heartbeat purposes, you can configure the correlation analysis and alarm features to notify you if the agent installation server is down.

In addition, the SYSMON adapter can be used to collect and transmit CPU, MEM, or DISK usage. Unlike the other adapters, which transmit collected events over TCP, the SYSMON adapter can transmit them over UDP. In one embodiment, the metric setting takes one of CPU, MEM, or DISK, and usage is collected for the metric entered. When the metric is DISK, the file system to collect must be named with the fs option (the file system path of the disk), and the SYSMON adapter collects usage for each such file system. The SYSMON adapter can also use a different log (_raw) format for each metric.

In addition, the WINLOG adapter can collect Windows event logs. The logname setting specifies the log categories to collect (for example, System, Application, Security). Multiple categories are separated by commas (,) and must not contain spaces. The log format can be specified separately through the data.format setting. The data.long setting determines whether a message that is too long is sent in full. Its default value is false, in which case a message that exceeds the preset length can be truncated to a first unit before transmission.
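
The comma-separated, space-free rule for logname can be checked with a small validator. This is a sketch only; the class name and the choice to reject rather than trim invalid entries are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: split a WINLOG logname value into categories, rejecting spaces and empty entries. */
final class WinlogLogname {
    static List<String> parse(String logname) {
        if (logname == null || logname.isEmpty()) {
            throw new IllegalArgumentException("logname must not be empty");
        }
        List<String> categories = new ArrayList<>();
        // The limit -1 keeps trailing empty entries so that "System," is rejected too.
        for (String part : logname.split(",", -1)) {
            if (part.isEmpty() || part.chars().anyMatch(Character::isWhitespace)) {
                throw new IllegalArgumentException("invalid logname entry: '" + part + "'");
            }
            categories.add(part);
        }
        return categories;
    }

    public static void main(String[] args) {
        System.out.println(parse("System,Application,Security"));
    }
}
```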

In addition, the FILE adapter reads and collects the contents of files, and can include four processing methods: two real-time modes and two batch modes.

Here, the common settings of the FILE adapter can include: the file processing mode (mode); the directory path of the files to collect (file.path); the name pattern of the files to collect (file.pattern; the patterns available differ by processing mode); the file encoding (file.encoding; default UTF-8); the parser that splits file contents into records (parser; for example, NEWLINE, MULTILINE, BLOCK, UTMP); the maximum length of one parser line (parser.line_max_size; default 10*1024); and the behavior when a line exceeds that maximum (parser.truncate_overflow). When parser.truncate_overflow is true, the excess content can be discarded. Its default value is false.

여기에서, 파일 처리 모드는 BATCH 모드, PERIODIC 모드, ROTATE 모드, TAIL 모드 및 TIMEPATTERN_TAIL 모드를 포함할 수 있다.Here, the file processing mode can include BATCH mode, PERIODIC mode, ROTATE mode, TAIL mode, and TIMEPATTERN_TAIL mode.

예를 들어, BATCH 모드는 파일의 전체를 읽어 한번 처리하고, 동일 파일은 더 이상 처리하지 않는 모드를 나타낸다. 파일명 패턴은 GLOB 패턴을 따를 수 있다.For example, BATCH mode indicates that the entire file is read and processed once, and the same file is not processed again. The file name pattern can follow the GLOB pattern.

또한, PERIODIC 모드는 특정 시간이나 날짜 간격으로 한 번에 처리해야 하는 경우 사용되는 모드이다. 처리 간격, 지연 간격, 초기 간격을 설정해야 하며, P1D(하루), PT1H(1시간) 등의 표기법을 따를 수 있다. period.interval을 통해 어떤 간격으로 처리해야 하는지 여부를 지정할 수 있고, period.delay를 통해 얼마나 지연하여 처리할 것인지 여부를 지정할 수 있고, period.initial을 통해 초기 간격을 얼마나 설정할 것인지 지정할 수 있다. 파일명 패턴은 날짜 및 시간으로 지정할 수 있다.Additionally, PERIODIC mode is used when processing is required at a specific time or date interval. Processing intervals, delay intervals, and initial intervals must be set, and can follow notations such as P1D (one day) and PT1H (one hour). You can specify the interval at which processing should occur via period.interval , the delay time for processing via period.delay , and the initial interval via period.initial . The filename pattern can be specified using dates and times.
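
The P1D and PT1H notations match the ISO-8601 duration format, which `java.time.Duration.parse` accepts. The sketch below parses the three PERIODIC settings and computes run times under an assumed interpretation: the first run occurs at the start time plus period.initial plus period.delay, and later runs follow at multiples of period.interval. The text does not state how the three values combine, so this arithmetic is illustrative only.

```java
import java.time.Duration;
import java.time.Instant;

/** Sketch: parse period.interval, period.delay, and period.initial as ISO-8601 durations. */
final class PeriodicSchedule {
    final Duration interval;
    final Duration delay;
    final Duration initial;

    PeriodicSchedule(String interval, String delay, String initial) {
        this.interval = Duration.parse(interval); // e.g. "P1D" or "PT1H"
        this.delay = Duration.parse(delay);
        this.initial = Duration.parse(initial);
    }

    /** Assumed semantics: run n (0-based) happens at start + initial + delay + n * interval. */
    Instant runTime(Instant start, long n) {
        return start.plus(initial).plus(delay).plus(interval.multipliedBy(n));
    }

    public static void main(String[] args) {
        PeriodicSchedule schedule = new PeriodicSchedule("PT1H", "PT5M", "PT0S");
        System.out.println(schedule.runTime(Instant.EPOCH, 2));
    }
}
```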

In addition, ROTATE mode is used when records are appended to a file in real time and, once the appended records reach a certain size, the existing files are renamed and new records continue to be added. tail.read_from_head sets whether data is read from the beginning of the file or from the current point.

In addition, whether two files are the same file can be determined from a hash code of part of the file rather than from the file name. file.hash_offset specifies the length of the file over which the hash code is computed; the default is 64 bytes. In one embodiment, when file.hash_offset is 64, files of 64 bytes or less are not read, and no two files may share the same first 64 bytes.
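
Identifying a file by a hash of its leading bytes can be sketched as follows. The use of SHA-256 and the hex encoding are assumptions, since the text does not name a hash algorithm; the rule that files of file.hash_offset bytes or less are skipped follows the embodiment above.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Optional;

/** Sketch: derive a file identity from the hash of its first hashOffset bytes. */
final class FilePrefixIdentity {
    static Optional<String> identity(Path file, int hashOffset) {
        try {
            if (Files.size(file) <= hashOffset) {
                return Optional.empty(); // files of hashOffset bytes or less are not read
            }
            byte[] prefix = new byte[hashOffset];
            try (InputStream in = Files.newInputStream(file)) {
                if (in.readNBytes(prefix, 0, hashOffset) < hashOffset) {
                    return Optional.empty(); // file shrank while being read
                }
            }
            MessageDigest digest = MessageDigest.getInstance("SHA-256"); // algorithm is an assumption
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(prefix)) {
                hex.append(String.format("%02x", b));
            }
            return Optional.of(hex.toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("sample", ".log");
        Files.write(file, new byte[100]);
        System.out.println(identity(file, 64).isPresent());
    }
}
```

Because the identity depends only on the prefix, a rotated file keeps its identity after being renamed, which is why two distinct files must not share the same leading bytes.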

또한, TAIL 모드는 파일에 레코드가 실시간으로 추가되면서 파일 이름이 변경되지 않는 경우에 사용되는 모드이다. tail.read_from_head를 통해 데이터를 처음부터 읽을 것인지 현 시점부터 읽을 것인지 여부를 설정할 수 있다. 또한, ROTATE 모드와 달리 파일 명으로 파일의 동일성 여부를 판단할 수 있다.Additionally, TAIL mode is used when the file name remains unchanged as records are added to the file in real time. You can use tail.read_from_head to configure whether data is read from the beginning or from the current point. Furthermore, unlike ROTATE mode, file identity can be determined based on the file name.

In addition, TIMEPATTERN_TAIL mode is TAIL mode with date pattern expressions applied to the path and file.pattern. When file.tree is true, files in subdirectories of file.path are also collected.

In one embodiment, the common settings of the FTP adapter can include the file processing mode, the file transfer protocol (protocol), the path of the files to collect (file.path), the pattern of the files to collect (file.pattern), the file transfer rate limit (file.max_bytes_per_sec), the file type used for post-processing by the file sender (file.type), the FTP server path where transferred files are stored (file.remote_path), the channel name used when monitoring file collection (log.channel), and, when a date pattern is used, how many days before the current date the date should be (beforeDays).

여기에서, 파일 처리 모드는 FILE 모드, TIMESTAMP_DIRECTORY 모드 및 TIMEPATTERN_DIRECTORY 모드를 포함할 수 있다.Here, the file processing mode can include FILE mode, TIMESTAMP_DIRECTORY mode, and TIMEPATTERN_DIRECTORY mode.

예를 들어, FILE 모드는 특정 경로 내 파일명이 패턴 맞는 경우 해당 파일을 전송하는 모드이다.For example, FILE mode is a mode that transfers a file if the file name in a specific path matches a pattern.

In addition, TIMESTAMP_DIRECTORY mode applies when the directories under the file.path directory are named as timestamps. Each directory name is converted to time information, and among the files in directories created within the past year, those that match file.pattern are transferred.

또한, TIMEPATTERN_DIRECTORY 모드는 파일의 경로(file.path), 원격 경로(file.remote_path)에도 날짜 패턴을 적용하여 $yyyy$, $MM$, $dd$, $mm$, $ss$, $date$ 등의 패턴식이 사용 가능한 모드이다.Additionally, the TIMEPATTERN_DIRECTORY mode applies date patterns to the file path (file.path) and remote path (file.remote_path), so pattern expressions such as $yyyy$, $MM$, $dd$, $mm$, $ss$, and $date$ can be used.
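
Expansion of the date pattern expressions might look like the following sketch. It assumes that each token between dollar signs has the meaning of the same letters in Java's `DateTimeFormatter` (so $mm$ is minutes and $MM$ is months) and that beforeDays is subtracted from the current date before formatting. The format of $date$ is not specified in the text, so the sketch leaves that token unexpanded.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

/** Sketch: expand $yyyy$, $MM$, $dd$, $mm$, and $ss$ in a path template. */
final class TimePatternPath {
    private static final String[] TOKENS = {"yyyy", "MM", "dd", "mm", "ss"};

    static String expand(String template, LocalDateTime now, int beforeDays) {
        LocalDateTime effective = now.minusDays(beforeDays); // assumed meaning of beforeDays
        String result = template;
        for (String token : TOKENS) {
            String value = effective.format(DateTimeFormatter.ofPattern(token));
            result = result.replace("$" + token + "$", value); // literal, non-regex replacement
        }
        return result; // "$date$" is intentionally left as-is
    }

    public static void main(String[] args) {
        System.out.println(expand("/logs/$yyyy$/$MM$/$dd$/app.log", LocalDateTime.now(), 1));
    }
}
```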

In another embodiment, whether to use encrypted communication can be entered in the agent settings. In this case, the data management platform can perform encryption using a key generated when the agent connects and a certificate registered for each agent on the data management platform. The certificate information used for this can also be entered.

도 101는 본 발명의 데이터 관리 방법에서 에이전트를 관리하는 방법을 설명하는 도면이다.Figure 101 is a drawing explaining a method for managing an agent in the data management method of the present invention.

The following steps describe an embodiment in which the data management method of the present invention manages agents installed on host servers. The steps below need not be performed simultaneously and may of course be performed individually.

단계(S12010)에서, 본 발명의 데이터 관리 방법은 에이전트에 대한 버전을 관리할 수 있다. 일 실시 예에서, 데이터 관리 방법은 관리자의 요청에 따라 에이전트에 대한 버전을 관리할 수 있다. 예를 들어, 관리자는 에이전트 관리부를 통하여 복수 개의 에이전트에 대한 버전 업데이트 등을 요청할 수 있다. 이를 통해, 에이전트의 업데이트, 패치 및 롤백을 용이하게 수행할 수 있다.In step S12010, the data management method of the present invention can manage versions for agents. In one embodiment, the data management method can manage versions for agents at the request of an administrator. For example, the administrator can request version updates for multiple agents through the agent management unit. This facilitates agent updates, patches, and rollbacks.

단계(S12020)에서, 본 발명의 데이터 관리 방법은 에이전트 별로 수행에 필요한 명령어를 생성하고 해당 에이전트에 배포할 수 있다. 일 실시 예에서, 데이터 관리 방법은 관리자의 버전 업데이트 요청에 기초하여 에이전트 별로 업데이트 수행에 필요한 명령어를 생성하고 해당 에이전트에 배포할 수 있다. 예를 들어, 관리자가 에이전트 관리부 상에 패치 파일을 업로드하고, 에이전트의 업데이트를 요청하는 경우, 에이전트 관리부는 에이전트 별로 업데이트 수행에 필요한 명령어를 생성할 수 있다. 이후, 에이전트의 와치독 모듈은 명령어를 수신하여 워커 모듈을 재시작할 수 있다. 에이전트의 동작은 이하에서 자세히 설명하도록 한다.In step (S12020), the data management method of the present invention can generate commands required for execution for each agent and distribute them to the corresponding agents. In one embodiment, the data management method can generate commands required for update execution for each agent and distribute them to the corresponding agents based on a version update request from an administrator. For example, if an administrator uploads a patch file to the agent management unit and requests an update for an agent, the agent management unit can generate commands required for update execution for each agent. Thereafter, the watchdog module of the agent can receive the commands and restart the worker module. The operation of the agent will be described in detail below.

단계(S12030)에서, 본 발명의 데이터 관리 방법은 에이전트의 상태를 모니터링할 수 있다. 보다 상세하게는, 데이터 관리 방법은 에이전트의 동작 상태, 연결 상태 등을 실시간으로 모니터링하고 문제 발생 시 대응할 수 있다.In step (S12030), the data management method of the present invention can monitor the status of the agent. More specifically, the data management method can monitor the agent's operating status, connection status, etc. in real time and respond when a problem occurs.

도 102는 본 발명의 데이터 관리 방법에서 에이전트의 버전을 관리하는 방법을 설명하는 도면이다.Figure 102 is a drawing explaining a method for managing the version of an agent in the data management method of the present invention.

The present invention can use a distributed version control system (for example, git) to manage versions of the agent code. A distributed version control system makes it possible to track the change history of the code effectively and to manage multiple versions of the code safely. It also lets developers collaborate on and integrate code changes easily, which greatly improves the efficiency of the development process.

또한, 본 발명에서는 에이전트 실행 파일 내에 'Manifest' 정보를 포함시킬 수 있다. 이 Manifest 파일에는 'Revision' 정보가 포함되어 있으며, 이를 통해 에이전트 소프트웨어의 특정 버전을 명확하게 식별할 수 있다. 데이터 관리 플랫폼의 사용자에 의해 에이전트 소프트웨어가 업로드되거나 업데이트될 때, 이 Revision 정보를 참조하여 각 버전을 정확하게 식별하고 관리할 수 있다.Additionally, the present invention can include "Manifest" information within the agent executable file. This manifest file contains "Revision" information, which allows for the clear identification of a specific version of the agent software. When the agent software is uploaded or updated by a user of the data management platform, this Revision information can be referenced to accurately identify and manage each version.
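
If the agent executable is packaged as a JAR, which is an assumption based on the agent running on a JRE, the Revision value could be read from the manifest with the standard `java.util.jar.Manifest` class. The sketch also assumes that Revision is a main-section attribute with exactly that name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.jar.Manifest;

/** Sketch: read a "Revision" attribute from the main section of a manifest. */
final class ManifestRevision {
    /** Returns the Revision value, or null when the manifest has no such attribute. */
    static String revisionOf(InputStream manifestStream) {
        try {
            Manifest manifest = new Manifest(manifestStream);
            return manifest.getMainAttributes().getValue("Revision"); // attribute name is an assumption
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "Manifest-Version: 1.0\r\nRevision: r1234\r\n\r\n"; // made-up sample value
        System.out.println(revisionOf(new ByteArrayInputStream(text.getBytes(StandardCharsets.UTF_8))));
    }
}
```

For an actual JAR on disk, `java.util.jar.JarFile#getManifest()` returns the same `Manifest` type.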

That is, the version information of the agent software plays an important role during installation and patching. A user of the data management platform can refer to this version information to determine the current state of a specific agent and, if necessary, update it or roll it back to an appropriate version.

이를 통하여 효율적인 코드 관리와 정확한 버전 추적을 가능하게 한다.This enables efficient code management and accurate version tracking.

도 103는 본 발명의 데이터 관리 방법이 에이전트의 상태를 모니터링하는 실시 예를 설명하는 도면이다.Figure 103 is a drawing illustrating an embodiment of a data management method of the present invention for monitoring the status of an agent.

본 발명에서, 에이전트는 수집 서버와 TCP(Transmission Control Protocol)를 사용하여 연결될 수 있다. 이때, 이 연결을 통해 에이전트는 주기적으로 서버와 설정을 동기화하며, 이는 수집 대상 및 방법, 데이터 처리 규칙 등 다양한 설정을 포함할 수 있다. 또한, TCP 연결이 종료될 경우, 에이전트는 자신의 상태를 변경하여 연결 끊김을 반영할 수 있다.In the present invention, the agent can connect to the collection server using TCP (Transmission Control Protocol). Through this connection, the agent periodically synchronizes settings with the server, which may include various settings such as collection targets and methods, data processing rules, etc. Furthermore, when the TCP connection is terminated, the agent can change its status to reflect the disconnection.

일 실시 예에서, 각 에이전트 채널의 동작 상태는 별도로 관리되며, 동기화를 수행할 때 해당 상태 정보가 갱신될 수 있다. 이는 에이전트가 수집 서버와의 연결 상태, 데이터 처리 상태 등을 정확하게 추적하고 관리할 수 있도록 한다.In one embodiment, the operational status of each agent channel is managed separately, and the corresponding status information can be updated when synchronization is performed. This allows the agent to accurately track and manage its connection status with the collection server, data processing status, and other information.

In addition, when the log collection target is a file, the agent channel can record the position up to which it has processed the log in an 'offset' file. The offset file tracks the last position collected within the log file so that subsequent collection proceeds efficiently and without duplication. If the collected data reaches or exceeds a threshold, the agent can reset the offset information through an initialization request.
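
The role of the offset file can be illustrated with a sketch that reads only the bytes appended since the last recorded position. The offset file format (a decimal byte count stored as text), the class name, and the restart-from-zero behavior when the log is shorter than the stored offset are assumptions; record boundaries and very large files are not handled.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Sketch: resume log collection from the position stored in an offset file. */
final class OffsetTracker {
    static long loadOffset(Path offsetFile) {
        try {
            if (!Files.exists(offsetFile)) {
                return 0L;
            }
            String text = Files.readString(offsetFile).trim();
            return text.isEmpty() ? 0L : Long.parseLong(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the content appended since the stored offset and records the new offset. */
    static String readNew(Path logFile, Path offsetFile) {
        try (RandomAccessFile raf = new RandomAccessFile(logFile.toFile(), "r")) {
            long offset = loadOffset(offsetFile);
            long length = raf.length();
            if (offset > length) {
                offset = 0L; // assumed handling: the log was truncated or replaced
            }
            byte[] buffer = new byte[(int) (length - offset)]; // sketch: assumes a small tail
            raf.seek(offset);
            raf.readFully(buffer);
            Files.writeString(offsetFile, Long.toString(length));
            return new String(buffer, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("offset-demo");
        Path log = dir.resolve("app.log");
        Files.writeString(log, "first line\n");
        System.out.print(readNew(log, dir.resolve("app.offset")));
    }
}
```

Resetting the offset, as described for the initialization request, would correspond to deleting the offset file or writing 0 to it in this sketch.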

도 104는 본 발명의 데이터 관리 방법에서 에이전트와의 통신 방법을 설명하는 도면이다.Figure 104 is a drawing explaining a communication method with an agent in the data management method of the present invention.

단계(S12040)에서, 본 발명의 데이터 관리 방법은 에이전트의 와치독 모듈에게 명령어를 전달할 수 있다. 보다 상세하게는, 데이터 관리 방법은 에이전트 별로 동작에 필요한 명령어를 생성하고 에이전트에게 전달할 수 있다. 이에 따라, 에이전트의 와치독 모듈은 외부로부터 명령어를 수신할 수 있다.In step (S12040), the data management method of the present invention can transmit commands to the watchdog module of the agent. More specifically, the data management method can generate commands required for operation for each agent and transmit them to the agent. Accordingly, the watchdog module of the agent can receive commands from an external source.

단계(S12050)에서, 본 발명의 데이터 관리 방법은 에이전트의 워커 모듈로부터 수집된 데이터를 저장할 수 있다. 에이전트의 워커 모듈은 에이전트가 설치된 서버에서 데이터를 수집할 수 있다. 이때, 워커 모듈을 통하여 수집된 데이터들은 데이터 관리 플랫폼의 수집 모듈로 직접 전송될 수 있다. 일 실시 예에서, 워커 모듈을 통하여 수집된 데이터들은 에이전트 별로 암호화되어 데이터 관리 플랫폼의 수집 모듈로 전송될 수 있다. 이때, 에이전트 별로 다른 암호화 방법에 기초하여 수집된 데이터가 전달될 수 있다. 예를 들어, 에이전트에서 데이터 관리 플랫폼으로 전송되는 데이터들은 SSL/TLS (Secure Sockets Layer/Transport Layer Security) 암호화 프로토콜, VPN 암호화 프로토콜, 대칭키 또는 비대칭키 암호화 알고리즘을 통해 암호화되어 전달될 수 있다. 이를 통하여, 개별적인 에이전트에서 수집된 데이터들은 안전하게 암호화되어 데이터 관리 플랫폼으로 전달될 수 있다.In step (S12050), the data management method of the present invention can store data collected from the worker module of the agent. The worker module of the agent can collect data from the server on which the agent is installed. At this time, the data collected through the worker module can be directly transmitted to the collection module of the data management platform. In one embodiment, the data collected through the worker module can be encrypted for each agent and transmitted to the collection module of the data management platform. At this time, the collected data can be transmitted based on different encryption methods for each agent. For example, the data transmitted from the agent to the data management platform can be encrypted and transmitted using the SSL/TLS (Secure Sockets Layer/Transport Layer Security) encryption protocol, the VPN encryption protocol, or a symmetric or asymmetric key encryption algorithm. Through this, the data collected from each agent can be safely encrypted and transmitted to the data management platform.

이후, 데이터 관리 플랫폼의 수집 모듈은 에이전트에서 수집된 데이터를 저장할 수 있다. 일 실시 예에서, 데이터 관리 플랫폼의 수집 모듈은 에이전트에서 수집된 데이터를 데이터베이스에 저장할 수 있다.Thereafter, the collection module of the data management platform may store the data collected from the agent. In one embodiment, the collection module of the data management platform may store the data collected from the agent in a database.

이를 통해, 본 발명의 데이터 관리 방법은 에이전트와의 효율적인 통신, 데이터 수집 및 저장을 수행할 수 있다.Through this, the data management method of the present invention can perform efficient communication with the agent, data collection, and storage.

도 105는 본 발명의 데이터 관리 방법에서 에이전트의 동작 방법을 설명하는 도면이다.Figure 105 is a drawing explaining the operation method of an agent in the data management method of the present invention.

단계(S12060)에서, 에이전트는 데이터 관리 플랫폼의 수집 모듈의 에이전트 관리부와 통신할 수 있다. 에이전트는 에이전트 관리부로부터 수행에 필요한 명령어를 수신할 수 있다.In step (S12060), the agent can communicate with the agent management unit of the collection module of the data management platform. The agent can receive commands required for execution from the agent management unit.

단계(S12070)에서, 에이전트는 에이전트 관리부로부터 수신한 명령어를 워커 모듈에게 전달할 수 있다.In step (S12070), the agent can transmit a command received from the agent management unit to the worker module.

단계(S12080)에서, 에이전트의 워커 모듈에서 명령어를 수행할 수 있다. 워커 모듈은 에이전트의 핵심적인 기능을 수행하는 모듈로 명령어를 수행하고, 결과를 반환할 수 있다.At step (S12080), a command can be executed in the agent's worker module. The worker module is a module that performs the agent's core functions, executes commands, and returns results.

단계(S12090)에서, 에이전트는 워커 모듈을 통하여 수행 결과에 대응하는 데이터를 수집할 수 있다. 일 실시 예에서, 에이전트가 설치된 호스트 서버에서 데이터를 수집하는 기능을 수행할 것을 명령받은 경우, 워커 모듈은 호스트 서버에서 데이터를 수집할 수 있다. 이후, 에이전트는 수집된 데이터를 암호화하여 데이터 관리 플랫폼에게 전송할 수 있다.In step (S12090), the agent can collect data corresponding to the performance results through the worker module. In one embodiment, if the agent is instructed to perform a function of collecting data from the host server on which it is installed, the worker module can collect the data from the host server. Thereafter, the agent can encrypt the collected data and transmit it to the data management platform.
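
Steps S12060 to S12090 can be sketched as a watchdog that forwards commands to a worker, which in turn reports results to a sink standing in for the data management platform. The command string `COLLECT_LOGS`, the result strings, and the class names are hypothetical, and the encryption step is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical sketch of the watchdog-to-worker command flow inside one agent. */
final class AgentFlowSketch {
    /** Worker: executes a command and returns the result toward the platform. */
    static final class Worker {
        private final Consumer<String> platformSink;

        Worker(Consumer<String> platformSink) {
            this.platformSink = platformSink;
        }

        void execute(String command) {
            // Only one made-up command is understood; a real worker would collect host data here.
            String result = command.equals("COLLECT_LOGS")
                    ? "collected:host-logs"
                    : "unsupported:" + command;
            platformSink.accept(result);
        }
    }

    /** Watchdog: receives commands from the agent management unit and passes them on. */
    static final class Watchdog {
        private final Worker worker;

        Watchdog(Worker worker) {
            this.worker = worker;
        }

        void onCommand(String command) {
            worker.execute(command);
        }
    }

    public static void main(String[] args) {
        List<String> received = new ArrayList<>();
        new Watchdog(new Worker(received::add)).onCommand("COLLECT_LOGS");
        System.out.println(received);
    }
}
```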

FIG. 106 is a diagram illustrating an operation method of an agent in the data management method of the present invention.

In step S13010, the data management method of the present invention may monitor version information of at least one agent that collects data. The method may collect data using the embodiments described earlier or using the embodiments of FIGS. 97 to 105. To use the data collection method of FIGS. 97 to 105, step S13010 may include installing an agent on a host server. Unlike the embodiment that collects data from packets transmitted and received over a network, the agent is installed on the host server itself, so all log data generated on the host server can be collected.

In one embodiment, installing the agent on the host server may include installing the agent's watchdog module on the host server, executing the watchdog module, and downloading to the host server a worker module selected based on the version information. For details, refer to the embodiment of FIG. 99.

In step S13020, the data management method may generate the commands needed for an update based on a request to update the version information. For details, refer to the embodiments of FIGS. 97 to 103.

In step S13030, the data management method may distribute the generated commands to the at least one agent. For details, refer to the embodiments of FIGS. 97 to 103.

The present invention relates to a technique for reprocessing and encrypting unencrypted original data, and in particular targets data that requires encryption but, for various reasons, has not been encrypted.

In the system of the present invention, the encryption engine is separated into its own program and therefore must communicate with the system whose data is to be encrypted. Because of this structure, encryption may not take place immediately, owing to communication errors or other causes. In that case, to maintain continuity of the system, the original data may first be delivered to the database of the target system in an unencrypted state.
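A minimal sketch of this fallback behavior, assuming a hypothetical `EngineUnavailable` exception raised by the client of the separate encryption engine and a plain list standing in for the target system's database:

```python
from typing import Callable, Dict, List


class EngineUnavailable(Exception):
    """Raised by the (hypothetical) client when the encryption engine cannot be reached."""


def store_with_fallback(row: Dict[str, object], field: str,
                        encrypt: Callable[[str], str],
                        database: List[Dict[str, object]]) -> bool:
    """Store the row; return True if the field was encrypted, False if it was stored as-is."""
    stored = dict(row)
    try:
        stored[field] = encrypt(row[field])
        encrypted = True
    except EngineUnavailable:
        # Keep the system running; a later scan has to find and fix this row.
        encrypted = False
    database.append(stored)
    return encrypted
```

The return value makes the gap visible to the caller, but the scan described in the following paragraphs does not rely on it: rows stored unencrypted are found again by inspecting the stored values.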

Accordingly, the present invention may include a process of searching all data stored in the system and identifying the portions of it that are not encrypted. This is described in detail below.

FIG. 107 is a diagram illustrating a data management device according to an embodiment.

According to the embodiments described above, the data management platform may encrypt data using a key management module (not shown) and a personal information management module (20004).

The embodiment is described here as being performed in the personal information management module (20004), but components of the key management module may of course be used where needed. For a description of each module, refer to the embodiments described above.

In one embodiment, the personal information management module (20004) may include an encryption target field generation unit (2068), an unencrypted field detection unit (2069), a detection content storage unit (2070), an unencrypted target batch processing unit (2071), an automatic execution management unit (2072), and a log management unit (2073).

The encryption target field generation unit (2068) may manage the personal information that is subject to encryption. For example, encryption targets may include resident registration numbers, card numbers, account numbers, passport numbers, driver's license numbers, alien registration numbers, and biometric information.

To this end, the encryption target field generation unit (2068) may create, for every table in the system, the fields requiring personal information, and may store and manage information about all of the system's tables and fields in a personal information management table. Here, the tables are tables in a database, and they may contain fields for which encryption is legally required.
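One way to picture the personal information management table is as a registry of (table, field) pairs. The sketch below is illustrative only: the `TargetField` type, the `kind` attribute, and the example table and field names are assumptions, not taken from the specification.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TargetField:
    table: str  # database table name
    field: str  # column that must be stored encrypted
    kind: str   # "numeric" or "mixed"; selects the detection method


# Hypothetical contents of the personal information management table.
PERSONAL_INFO_TABLE: List[TargetField] = [
    TargetField("EMPLOYEE", "RESIDENT_NO", "numeric"),
    TargetField("EMPLOYEE", "ACCOUNT_NO", "mixed"),
    TargetField("CUSTOMER", "CARD_NO", "numeric"),
]


def targets_for(table: str,
                registry: List[TargetField] = PERSONAL_INFO_TABLE) -> List[TargetField]:
    """Return the encryption target fields registered for one table."""
    return [t for t in registry if t.table == table]
```

A detector can then iterate over `targets_for(table)` instead of hard-coding which columns to inspect.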

In addition, the tables and fields to be encrypted may be determined in consultation with the security administrator of the target system.

The unencrypted field detection unit (2069) may check, when data is entered into the system, whether the field value has been encrypted. In one embodiment, a field value of an encryption target table is encrypted as soon as it enters the system and is decrypted on retrieval according to the user's authority. Encryption omissions that arise in this process, for example from an omission in a program or a communication error with the encryption engine, may be checked using the personal information management table created by the encryption target field generation unit (2068).

In one embodiment, to make the check more efficient, the unencrypted field detection unit (2069) may perform a full inspection of master data and inspect transaction data only for a specific period. The detection method may differ depending on the type of field. Here, master data means data that is created once at initial registration and then maintained, such as a personnel table or a customer information table. Transaction data means data that arises as work proceeds, such as work records, voucher creation, purchase history, and vacation requests, and that can be created repeatedly.

For example, for a field made up of digits (e.g., a resident registration number or card number), whether the value is encrypted can be determined from the fact that encryption changes part of the value into letters.

More specifically, when a data field normally consists only of digits, a value that contains letters can be regarded as encrypted. For example, if a field such as a resident registration number or card number originally consists only of digits but contains letters when retrieved, this indicates that the data has been encrypted.
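The digit-only heuristic can be sketched as below. The function name and the set of ignored separator characters are assumptions made for illustration; the specification only states that a non-digit character in a normally numeric field indicates encryption.

```python
SEPARATORS = "- "            # assumed formatting characters that do not indicate encryption
ASCII_DIGITS = "0123456789"


def looks_encrypted_numeric(value: str) -> bool:
    """Return True if a normally digit-only value contains any non-digit character."""
    core = [ch for ch in value if ch not in SEPARATORS]
    return any(ch not in ASCII_DIGITS for ch in core)
```

For example, `looks_encrypted_numeric("900101-1234567")` returns `False`, so the value would be flagged as unencrypted, while a value containing letters returns `True`. An empty value also returns `False`; a real detector would probably treat empty fields as a separate case.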

By contrast, for a field that mixes digits and letters (e.g., an account number or driver's license number), whether the value is encrypted can be determined by whether the hash value of the original text is found in the encryption table, or by whether a special character is present.

More specifically, for a field such as an address, in which Korean, digits, and English letters are all mixed, determining whether the value is encrypted is more complicated. To address this, the present invention uses hashing. A hash function is a standard function that takes text or document content as input and returns a value of fixed length, and it can be used to confirm whether the original data has been properly encrypted. Because different original texts produce different hash values, the hash allows the encryption status to be determined indirectly.

In addition, a special character may be inserted so that encrypted data can be distinguished.
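The two checks for mixed fields can be sketched as follows. The specification does not say how the encryption table is keyed or which special character is used, so both are assumptions here: the marker `§` is hypothetical, and the sketch adopts one possible reading in which the encryption table holds SHA-256 hashes of original texts, so that a stored value whose hash is found there is itself an original (unencrypted) value.

```python
import hashlib
from typing import Set

MARKER = "§"  # hypothetical special character prepended to every encrypted value


def sha256_hex(text: str) -> str:
    """Return a fixed-length (64 hex characters) digest of the input text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def classify_mixed(value: str, known_original_hashes: Set[str],
                   marker: str = MARKER) -> str:
    """Classify a mixed-character value as 'encrypted', 'unencrypted', or 'unknown'."""
    if value.startswith(marker):
        return "encrypted"    # special-character check
    if sha256_hex(value) in known_original_hashes:
        return "unencrypted"  # the stored value matches a known original text
    return "unknown"          # neither check is conclusive
```

Returning `"unknown"` rather than guessing keeps inconclusive values visible for manual review.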

The detection content storage unit (2070) may store detailed information about values that the unencrypted field detection unit (2069) has classified as unencrypted. The detailed information serves to identify the location and content of the unencrypted field and may include the table name, domain name, table key value, unencrypted value, detection time, and creator.
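The detailed information listed above maps naturally onto a record type. The `Finding` class and its attribute names below are hypothetical; only the list of items comes from the description.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Finding:
    table_name: str         # table in which the unencrypted value was found
    domain_name: str        # domain name of the field
    key_value: str          # table key value identifying the record
    unencrypted_value: str  # the value that should have been encrypted
    created_by: str         # user or job that created the entry
    detected_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))  # detection time
```

Because each record carries the unencrypted value itself, a store of such records would presumably need the same access controls as the source table.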

After an unencrypted data field has been detected, the unencrypted target batch processing unit (2071) may retrieve the information about it. The unit may then, at the direction of the security administrator, batch-encrypt the unencrypted data in the system based on the detected information.

In one embodiment, the unencrypted target batch processing unit (2071) may analyze the cause of the encryption omission based on the detected unencrypted data fields. This allows the security administrator to identify the root cause of the omission and take appropriate action.

The automatic execution management unit (2072) may monitor the system in real time for unencrypted data and, when such data is found, automatically run the encryption process.

The log management unit (2073) may store and manage logs of the batch encryption applied through the unencrypted target batch processing unit (2071).

In particular, unencrypted data is encrypted either automatically by the automatic execution management unit (2072) or manually through the unencrypted target batch processing unit (2071), and the process is recorded by the log management unit (2073) so that it can be traced.
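A minimal sketch of re-encryption with logging, under several assumptions: rows are held in a dictionary keyed by (table, key value), the `encrypt` callable stands in for the encryption engine, and the log entry layout (including the `mode` value "auto" or "manual") is hypothetical.

```python
from typing import Callable, Dict, List, Tuple

FindingKey = Tuple[str, str, str]  # (table, key value, field name)
RowKey = Tuple[str, str]           # (table, key value)


def reencrypt(findings: List[FindingKey],
              rows: Dict[RowKey, Dict[str, str]],
              encrypt: Callable[[str], str],
              log: List[Dict[str, str]],
              mode: str) -> int:
    """Encrypt each flagged field in place; append one log entry per attempt."""
    done = 0
    for table, key, field_name in findings:
        row = rows[(table, key)]
        try:
            row[field_name] = encrypt(row[field_name])
            status = "S"  # success
            done += 1
        except Exception:
            status = "E"  # record the failure and continue with the rest
        log.append({"table": table, "key": key, "field": field_name,
                    "type": status, "mode": mode})
    return done
```

Logging failed attempts as well as successful ones keeps the trail complete, which matches the stated goal of traceability.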

This increases the reliability of the data and keeps the data secure even when encryption is delayed by system errors or communication problems.

FIG. 108 is a diagram illustrating a method for detecting unencrypted fields according to an embodiment.

In one embodiment, the data management device may encrypt unencrypted personal information across the various tables (2074) stored in the system.

More specifically, the data management device may encrypt unencrypted personal information in a table (2074) stored in the system, using the query date as a condition.

For example, the user may set the query date condition (2075) to at least one of ALL, ERDAT, and SQL. The ALL condition is used when the monitored table has no date (ERDAT) field, and queries all dates. The ERDAT condition is used when the monitored table has a date field. The SQL condition is used when data is extracted with an ABAP SQL statement.
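The three conditions can be illustrated with a small query builder. The specification refers to ABAP SQL; the sketch below uses Python's built-in `sqlite3` module purely as a stand-in, and the function name, parameters, and the `YYYYMMDD` text format for `ERDAT` are assumptions.

```python
import sqlite3
from typing import List, Optional, Tuple


def select_rows(conn: sqlite3.Connection, table: str, mode: str,
                date_range: Optional[Tuple[str, str]] = None,
                where_sql: Optional[str] = None) -> List[tuple]:
    """Select candidate rows according to the ALL, ERDAT, or SQL condition."""
    # NOTE: `table` and `where_sql` are interpolated into the statement, so they
    # must come from trusted configuration, never from end-user input.
    if mode == "ALL":      # no date field: inspect every row
        query, params = f"SELECT * FROM {table}", ()
    elif mode == "ERDAT":  # date field present: restrict to a period
        if date_range is None:
            raise ValueError("ERDAT mode requires date_range")
        query, params = f"SELECT * FROM {table} WHERE ERDAT BETWEEN ? AND ?", date_range
    elif mode == "SQL":    # free-form condition supplied by the administrator
        if not where_sql:
            raise ValueError("SQL mode requires where_sql")
        query, params = f"SELECT * FROM {table} WHERE {where_sql}", ()
    else:
        raise ValueError(f"unknown mode: {mode}")
    return conn.execute(query, params).fetchall()
```

In this reading, ERDAT corresponds to the period-limited inspection described for transaction data, and ALL to the full inspection described for master data.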

FIG. 109 is a diagram illustrating a method for performing re-encryption after detecting unencrypted fields according to an embodiment.

This figure shows a user interface for selecting the options for re-encryption processing after unencrypted fields are detected, according to one embodiment.

More specifically, the user interface may include an unencrypted field storage option (2076), a re-encrypt-after-detection option (2077), an unencrypted data output option (2078), and a re-encryption history output option (2079).

The unencrypted field storage option (2076) determines whether unencrypted data is stored after it is extracted.

The re-encrypt-after-detection option (2077) determines whether, per the embodiment described above, unencrypted personal information data in the target table that matches the query condition (where field) is extracted and re-encrypted.

The unencrypted data output option (2078) determines whether the extracted and stored unencrypted data is output.

The re-encryption history output option (2079) determines whether the execution history is output after unencrypted data is re-encrypted.

Depending on which options are selected, the data management device proceeds with the corresponding process for re-encryption after detection of unencrypted fields.
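How the four options might drive the process can be sketched as below. The `RunOptions` type, the step names, and in particular the ordering of the steps are assumptions; the specification only states what each option decides.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RunOptions:
    save_unencrypted: bool = False         # option 2076
    reencrypt_after_detect: bool = False   # option 2077
    print_unencrypted: bool = False        # option 2078
    print_reencrypt_history: bool = False  # option 2079


def plan_steps(opts: RunOptions) -> List[str]:
    """Return the ordered list of steps implied by the selected options."""
    steps = ["detect"]
    if opts.save_unencrypted:
        steps.append("save_findings")
    if opts.print_unencrypted:
        steps.append("print_findings")
    if opts.reencrypt_after_detect:
        steps.append("reencrypt")
        # A history exists only if re-encryption actually ran.
        if opts.print_reencrypt_history:
            steps.append("print_history")
    return steps
```

With no option selected the plan contains only detection, which corresponds to a dry run that reports nothing and changes nothing.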

FIG. 110 is a diagram showing results of unencrypted field detection according to an embodiment.

Part (a) of the figure shows the case in which no unencrypted data is detected. When the type of a field is shown as S and the message text reads “target extraction encrypt-no encryption target”, all data can be judged to be encrypted.

Part (b) of the figure shows the case in which unencrypted data is detected. When the type of a field is shown as E and the message text is shown as a number, the data in that field can be judged not to be encrypted.

In this case, according to one embodiment of the present invention, encryption may be performed individually on only the selected records, or all values of the table field corresponding to the selected record may be re-encrypted.

The detailed log for re-encryption processing is as follows.

Field          Description
Conversion     Conversion Date
Table Name     Table name
No             Record No
Field Name     Field name
Where Field    Search conditions
Where SQL      Detailed search conditions
Status         Green - Normal / Yellow - Warning / Red - Error
Key Value      Table key value
Type           S: Success / E: Error / W: Warning / I: Info
Message Text   Original text value that requires encryption
Run            Number of data items affected by the task
PIN            Number of PINs generated
Created on     Re-encryption list creation date
Created time   Re-encryption list creation time
Created By     User who created the re-encryption list

FIG. 111 is a diagram showing a change history log for re-encrypted data according to an embodiment.

In one embodiment, the change history log may be searched by conditions such as table name, creation date, table field name, and the status of each table's field information.

This figure shows a change history log (2081) corresponding to a query target table (2080).

The query target table (2080) may include the name of the table whose change history is queried, a description of that table, and the number of change history entries.

The change history log (2081) may include the date and time the setting information was changed, the user who changed it, the table and table field whose settings were changed, the key value of the changed target table, the status of the table's field information, the change sequence, the transaction code of the program that made the change, the changed data, the data before the change, and the date and time the change was executed.

FIG. 112 is a flowchart illustrating a data management method according to an embodiment.

In one embodiment, the data management method may extract personal information from collected data (S40010). For details, refer to the description of FIGS. 12 to 18 above.

In one embodiment, the data management method may encrypt the personal information and store it as first data (S40020). For details, refer to the description of FIGS. 22 to 26 above.

In one embodiment, the data management method may check whether the fields included in the stored first data are encrypted (S40030). When the first data is master data, the check may be performed in real time; when the first data is transaction data, the check may be performed at a preset interval. To this end, the method may create fields requiring personal information for the tables included in the data, manage the fields subject to encryption, and detect unencrypted fields. For details, refer to the description of FIGS. 107 to 109 above.

In one embodiment, when a first unencrypted field is detected among the fields included in the first data, the data management method may store detailed information about the first unencrypted field (S40040). For details, refer to the description of FIGS. 107 and 110 above.

In one embodiment, the data management method may re-encrypt the first data (S40050). The re-encryption may be performed automatically or manually according to a user setting. The method may also record and store a log corresponding to the process of re-encrypting the first data. For details, refer to the description of FIGS. 107 and 111 above.
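Steps S40010 to S40050 can be strung together in a compact sketch. Everything specific here is an assumption made for illustration: the regular expression for a resident registration number, the `§` marker that identifies encrypted values, the `placeholder_encrypt` function (string reversal, which is not encryption), and the `engine_up` flag that simulates an unreachable encryption engine.

```python
import re
from typing import Dict, List

RRN_PATTERN = re.compile(r"\b\d{6}-\d{7}\b")  # assumed shape of a resident registration number
MARKER = "§"                                  # hypothetical marker identifying encrypted values


def placeholder_encrypt(value: str) -> str:
    # Stand-in only; a real system would call its encryption engine here.
    return MARKER + value[::-1]


def run_pipeline(collected: List[str], engine_up: bool) -> Dict[str, list]:
    # S40010: extract personal information from the collected data.
    extracted = [m for line in collected for m in RRN_PATTERN.findall(line)]
    # S40020: encrypt and store as first data (stored as-is if the engine is down).
    first_data = [placeholder_encrypt(v) if engine_up else v for v in extracted]
    # S40030: check whether each stored field is encrypted.
    findings = [i for i, v in enumerate(first_data) if not v.startswith(MARKER)]
    # S40040: store detailed information about each unencrypted field.
    details = [{"index": i, "value": first_data[i]} for i in findings]
    # S40050: re-encrypt and log each action.
    log = []
    for i in findings:
        first_data[i] = placeholder_encrypt(first_data[i])
        log.append({"index": i, "type": "S"})
    return {"first_data": first_data, "details": details, "log": log}
```

When `engine_up` is `True` the check in S40030 finds nothing and the last two steps are no-ops, mirroring case (a) of FIG. 110.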

10000: Data management platform

Claims

A data management method comprising:
extracting personal information from collected data;
encrypting the personal information and storing it as first data;
checking whether a field included in the stored first data is encrypted;
when a first unencrypted field is detected among the fields included in the first data, storing detailed information about the first unencrypted field; and
re-encrypting the first data.

The data management method of claim 1,
further comprising checking in real time whether the first data is encrypted when the first data is master data, and checking at a preset interval whether the first data is encrypted when the first data is transaction data.

The data management method of claim 1,
wherein re-encrypting the first data comprises
encrypting the first data automatically or manually according to a user setting, and
wherein the data management method further comprises
recording and storing a log corresponding to the process of re-encrypting the first data.

A data management device comprising:
a database that stores data; and
a processor that processes the data,
wherein the processor:
extracts personal information from collected data;
encrypts the personal information and stores it as first data;
checks whether a field included in the stored first data is encrypted;
when a first unencrypted field is detected among the fields included in the first data, stores detailed information about the first unencrypted field; and
re-encrypts the first data.

The data management device of claim 4,
wherein the processor
further performs a step of checking in real time whether the first data is encrypted when the first data is master data, and checking at a preset interval whether the first data is encrypted when the first data is transaction data.

The data management device of claim 4,
wherein the processor:
re-encrypts the first data automatically or manually according to a user setting; and
records and stores a log corresponding to the process of re-encrypting the first data.

A computer-readable recording medium storing a computer program for causing a computer to execute a data management method, the data management method comprising:
extracting personal information from collected data;
encrypting the personal information and storing it as first data;
checking whether a field included in the stored first data is encrypted;
when a first unencrypted field is detected among the fields included in the first data, storing detailed information about the first unencrypted field; and
re-encrypting the first data.