Movatterモバイル変換


[0]ホーム

URL:


CN109104487A - Data transmission method based on logstack + kafka - Google Patents

Data transmission method based on logstack + kafka
Download PDF

Info

Publication number
CN109104487A
CN109104487ACN201810947702.9ACN201810947702ACN109104487ACN 109104487 ACN109104487 ACN 109104487ACN 201810947702 ACN201810947702 ACN 201810947702ACN 109104487 ACN109104487 ACN 109104487A
Authority
CN
China
Prior art keywords
data
logstash
business datum
kafka
middleware
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810947702.9A
Other languages
Chinese (zh)
Inventor
颜朋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Co Ltd
Original Assignee
Inspur Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Software Co LtdfiledCriticalInspur Software Co Ltd
Priority to CN201810947702.9ApriorityCriticalpatent/CN109104487A/en
Publication of CN109104487ApublicationCriticalpatent/CN109104487A/en
Pendinglegal-statusCriticalCurrent

Links

Classifications

Landscapes

Abstract

Translated fromChinese

本发明公开了一种基于logstash+kafka数据传输方法,包括业务数据生产端操作和业务数据消费端操作;1)通过业务数据生产端logstash,自动采集业务系统数据库中业务数据,并传输至中间件kafka;2)所述的中间件kafka收集各所述的业务数据生产端传输的数据,统一发送至业务数据消费端logstash;3)所述的业务数据消费端logstash接收所述的中间件kafka的消息,转为数据文件,并存储至目标数据库。本发明和现有技术相比,配置操作简便,处理速度快,能有效实现较为分散的业务数据的采集、处理、存储等操作,可以较为迅速方便的搭建业务数据生产端与消费端传输通道,且扩展性良好,降低开发及运维难度。

The invention discloses a data transmission method based on logstash+kafka, including the operation of the business data production end and the business data consumption end; 1) through the business data production end logstash, the business data in the business system database is automatically collected and transmitted to the middleware kafka; 2) the middleware kafka collects the data transmitted by each of the business data producers and sends them to the business data consumer logstash; 3) the business data consumer logstash receives the middleware kafka Messages are converted into data files and stored in the target database. Compared with the prior art, the present invention has simple configuration and operation, fast processing speed, can effectively realize operations such as collection, processing, and storage of relatively scattered business data, and can quickly and conveniently build a transmission channel between the production end and the consumer end of business data, And the scalability is good, reducing the difficulty of development and operation and maintenance.

Description

One kind being based on logstash+kafka data transmission method
Technical field
The present invention relates to high-speed data acquisition application fields, specifically a kind of to be passed based on logstash+kafka dataTransmission method.
Background technique
In enterprise's application process, there is the information system of oneself in each information enterprise, but the data letter between each enterpriseBreath can not achieve it is shared, also lack a unified data platform come these business datums are integrated, are handled, are excavated with divideAnalysis, it is serious to hinder the whole progress of IT application in enterprise.To solve this problem, begin one's study various data of people passDefeated mode, attempt by not homologous ray data efficient, safety collect together, the basis of data platform in Unified SetOn, further mining analysis, mining data rule, aid decision are carried out to data.Available data transmission mode is broadly divided intoIt is several below:
1) CDN technology
By adding one layer of new network architecture in existing network, the content of website is published to the network closest to user" edge " allows users to the content needed for obtaining nearby, improves the response speed that user accesses website.But disadvantage is also very brightIt is aobvious: it is non real-time, indirect to update to specified object, and there is manual intervention in centre, it need be compared tight, thoughtfulIt arranges;
2) based on the transmission technology of File Transfer Protocol
The effect of FTP remote file transferring agreement is that file is moved on to from a computer there are one computer.Most frequently makeIt is the transmitted in both directions using FTP, i.e., data are transmitted between remote system and local.User can will be on remote computerFile download to user where host on, then copy to again in the terminating machine of user, or be directly downloaded to the end of userIn terminal, additionally it is possible to which the file in the file of host where user or user terminal is transferred on remote computer;
FTPserver need be established using FTP transmission file.Using the FTP of registration user, user and password need be also managed.General host all provides the client of FTP, it is possible to use dedicated FTPclient uses integrated ftp software.According to the peopleBanking software constraint is forbidden to use anonymous ftp transmitting data;
Had using the major defect that FTP mode carries out file transmission: the integrality for transmitting data is unable to get guarantee;Scalability compared withDifference;
3) it is based on mail transfer
File is transmitted using e-mail system.E-mail system is with transmission speed is fast, file type is diversified, transmitting-receiving sideJust, the features such as communicatee is extensive, safe.But the integrality for transmitting data is unable to get guarantee, and efficiency of transmission is lower, andAnd file is transmitted based on lettergram mode, efficiency is lower;
4) it is transmitted based on middleware
Data are transmitted using middlewares such as MQ, MT, has the function of data compression, the big file of transmission, breakpoint transmission etc., may be implementedFile security, reliable transmission.But that there are efficiency is lower for major part middleware transmission mode at present, needs corresponding interface exploitationWorkload, the inflexible disadvantage of data pick-up.
Summary of the invention
Technical assignment of the invention is to provide a kind of based on logstash+kafka data transmission method.
Technical assignment of the invention is realized in the following manner:
One kind being based on logstash+kafka data transmission method, including the operation of the business datum manufacturing side and business datum consumption terminalOperation;
Operating procedure is as follows:
Step 1) passes through business datum manufacturing side logstash, business datum in automatic collection operation system database, and transmitsTo middleware kafka;
Middleware kafka described in step 2 collects the data of each business datum manufacturing side transmission, is uniformly sent to industryBe engaged in data consumption end logstash;
Business datum consumption terminal logstash described in step 3) receives the message of the middleware kafka, switchs to data textPart, and store to target database.
Business datum in step 1) the automatic collection operation system database, comprising:
The business datum of the acquisition is not needed again by specifying in logstash configuration file through development interface program.
The business datum manufacturing side can dispose more logstash tools in the step 1).
The data of the extraction different business of logstash tool described in every.
The more logstash tools concurrently execute data pick-up work.
The middleware kafka clustered deploy(ment) is in multiple servers, equally, business datum consumption terminal logstash according toThe difference for extracting data service range, opens multiple logstash tools, receives different numbers from middleware kafka cluster respectivelyAccording to concurrent processing is data file.
One kind being based on logstash+kafka data transmission device, including the business datum manufacturing side and business datum consumptionEnd;
The business datum manufacturing side is used for the business datum that capturing service system generates;
The business datum consumption terminal is used to handle the data of acquisition, data is focused in unified data platform.
The business datum manufacturing side includes: input module and output module;
The input module is data acquisition, business datum needed for acquiring in service database automatically;
The output module is data transmission, and acquisition data are converted to message and are sent in data manufacturing side kafka queue.
The business datum consumption terminal includes consumption terminal kafka data middleware and consumption terminal logstash;
The consumption terminal kafka data middleware is used for transmission and receives the business datum message that each operation system is sent;
The consumption terminal logstash is converted into data file and deposits in unification for receiving middleware kafka message, processingData platform in.
Compared to the prior art one kind of the invention is based on logstash+kafka data transmission method, data content is reallyIt is fixed to be realized not by development interface, it can be made simply by the content in the logstash configuration file of the data manufacturing sideIt is fixed;When needing operation expanding, data area increases, it is only necessary to modify configuration file;And with the promotion of data volume, can pass throughThe guarantee of logstash and kafka cluster extended to realize performance;Final data processing format can be needed to configure according to enterpriseDifferent logstash configuration files realizations, can be according to profile name rule, to be placed into server designated position.
Detailed description of the invention
Attached drawing 1 is a kind of flow diagram based on logstash+kafka data transmission method.
Specific embodiment
Embodiment 1:
Configuration device:
One kind being based on logstash+kafka data transmission device, including the business datum manufacturing side and business datum consumption terminal;
The business datum manufacturing side is used for the business datum that capturing service system generates;The business datum manufacturing side packetIt includes: input module and output module;The input module is data acquisition, business needed for acquiring in service database automaticallyData;The output module is data transmission, and acquisition data are converted to message and are sent to data manufacturing side kafka queueIn.
The business datum consumption terminal is used to handle the data of acquisition, data is focused in unified data platform.The business datum consumption terminal includes consumption terminal kafka data middleware and consumption terminal logstash;The consumption terminalKafka data middleware is used for transmission and receives the business datum message that each operation system is sent;The consumption terminalLogstash is converted into data file and deposits in unified data platform for receiving middleware kafka message, processing.
Operating method:
Step 1) passes through business datum manufacturing side logstash, and business datum in automatic collection operation system database is describedThe business datum of acquisition is not needed again by development interface program, and be transmitted to centre by specifying in logstash configuration filePart kafka;
The business datum manufacturing side can dispose more logstash tools;Logstash tool described in every extracts differentThe data of business;The more logstash tools concurrently execute data pick-up work.
Middleware kafka described in step 2 collects the data of each business datum manufacturing side transmission, unified to sendTo business datum consumption terminal logstash;
Business datum consumption terminal logstash described in step 3) receives the message of the middleware kafka, switchs to data textPart, and store to target database.
The middleware kafka clustered deploy(ment) is in multiple servers, equally, business datum consumption terminal logstash according toThe difference for extracting data service range, opens multiple logstash tools, receives different numbers from middleware kafka cluster respectivelyAccording to concurrent processing is data file.
Business datum manufacturing side configuration file sample:
input {
jdbc {
jdbc_connection_string => "jdbc:oracle:thin:@xxx.xxx.xxx.xxx:1521:db1"
jdbc_user => "user1"
jdbc_password => "upwd1"
jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
jdbc_driver_library => "/opt/langchao/logstash-5.6.6/jdbcdrivers/ojdbc6.jar"
jdbc_paging_enabled => "false"
jdbc_page_size => "50000"
statement => "SELECT 'table1' as tablename, id, name, to_char(createdate, 'yyyy-MM-dd HH24:mi:ss') as createdate FROM produce where to_char(createdate, 'yyyy-MM-dd HH24:mi:ss') > to_char(:sql_last_value) orderby createdate"
schedule => "0 0 16 * * *"
type => "test"
record_last_run => true
use_column_value => true
parameters => {"createdate" => "2018-02-28 23:59:59"}
tracking_column => "createdate"
tracking_column_type => "numeric"
last_run_metadata_path => "/opt/langchao/logstash-5.6.6/runcache/tables/table1"
clean_run => false
}
}
output {
if [type]== "test" {
kafka {
codec => json_lines {
charset => "UTF-8"
}
topic_id => "test"
bootstrap_servers => "xxx.xxx.xxx.xxx:9092"
}
}
}
Business datum consumption terminal configuration file sample:
input {
kafka {
codec => json {
charset => "UTF-8"
}
auto_offset_reset => "earliest"
topics => ["test"]
bootstrap_servers => "10.1.80.238:9092"
max_poll_records => "10"
request_timeout_ms => "300000"
session_timeout_ms => "180000"
}
}
filter {
date {
match => ["message","UNIX_MS"]
target => "@timestamp"
}
ruby {
code => "event.set('timestamp', event.get('@timestamp').time.localtime + 8*60*60)"
}
ruby {
code => "event.set('@timestamp',event.get('timestamp'))"
}
mutate {
remove_field => ["timestamp"]
}
json {
source => "message"
}
}
output {
if [tabname] == "plm_item" {
csv {
codec => plain {
charset => "UTF-8"
}
path => "/opt/ldatas/data/%{category}/%{+YYYYMMdd}/%{tabname}_%{comid}_%{+YYYYMMdd-H}.csv"
dir_mode => 0777
file_mode => 0777
filename_failure => "/opt/ldatas/failures/filefailures.txt"
fields => ["item_id", "item_name"]
csv_options => {"col_sep" => ","}
}
}
}
The technical personnel in the technical field can readily realize the present invention with the above specific embodiments,.But it should manageSolution, the present invention is not limited to above-mentioned several specific embodiments.On the basis of the disclosed embodiments, the technical fieldTechnical staff can arbitrarily combine different technical features, to realize different technical solutions.

Claims (9)

CN201810947702.9A2018-08-202018-08-20Data transmission method based on logstack + kafkaPendingCN109104487A (en)

Priority Applications (1)

Application NumberPriority DateFiling DateTitle
CN201810947702.9ACN109104487A (en)2018-08-202018-08-20Data transmission method based on logstack + kafka

Applications Claiming Priority (1)

Application NumberPriority DateFiling DateTitle
CN201810947702.9ACN109104487A (en)2018-08-202018-08-20Data transmission method based on logstack + kafka

Publications (1)

Publication NumberPublication Date
CN109104487Atrue CN109104487A (en)2018-12-28

Family

ID=64850446

Family Applications (1)

Application NumberTitlePriority DateFiling Date
CN201810947702.9APendingCN109104487A (en)2018-08-202018-08-20Data transmission method based on logstack + kafka

Country Status (1)

CountryLink
CN (1)CN109104487A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN110401724A (en)*2019-08-222019-11-01北京旷视科技有限公司File management method, ftp server and storage medium
CN110442436A (en)*2019-07-122019-11-12平安普惠企业管理有限公司Process management method and relevant apparatus based on container
CN111753007A (en)*2020-06-162020-10-09国家电网有限公司客户服务中心 A data aggregation system and aggregation method based on pluggable components under multi-system
CN112965702A (en)*2021-02-192021-06-15中国工商银行股份有限公司Data acquisition method and device

Citations (5)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN106294672A (en)*2016-08-082017-01-04杭州玳数科技有限公司The method and system that a kind of daily record represents in real time and inquires about
CN106330963A (en)*2016-10-112017-01-11江苏电力信息技术有限公司 A method for cross-network multi-node log collection
CN106844171A (en)*2016-12-272017-06-13浪潮软件集团有限公司 A method for realizing mass operation and maintenance
US20170272516A1 (en)*2016-03-172017-09-21International Business Machines CorporationProviding queueing in a log streaming messaging system
CN108365985A (en)*2018-02-072018-08-03深圳壹账通智能科技有限公司A kind of cluster management method, device, terminal device and storage medium

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
US20170272516A1 (en)*2016-03-172017-09-21International Business Machines CorporationProviding queueing in a log streaming messaging system
CN106294672A (en)*2016-08-082017-01-04杭州玳数科技有限公司The method and system that a kind of daily record represents in real time and inquires about
CN106330963A (en)*2016-10-112017-01-11江苏电力信息技术有限公司 A method for cross-network multi-node log collection
CN106844171A (en)*2016-12-272017-06-13浪潮软件集团有限公司 A method for realizing mass operation and maintenance
CN108365985A (en)*2018-02-072018-08-03深圳壹账通智能科技有限公司A kind of cluster management method, device, terminal device and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘锴: "海量数据日志系统架构分析与应用", 《长春工业大学学报》*
王力群: "基于日志分析平台的监控系统的设计与实现", 《计算机应用与软件》*

Cited By (5)

* Cited by examiner, † Cited by third party
Publication numberPriority datePublication dateAssigneeTitle
CN110442436A (en)*2019-07-122019-11-12平安普惠企业管理有限公司Process management method and relevant apparatus based on container
CN110401724A (en)*2019-08-222019-11-01北京旷视科技有限公司File management method, ftp server and storage medium
CN110401724B (en)*2019-08-222022-04-12北京旷视科技有限公司File management method, file transfer protocol server and storage medium
CN111753007A (en)*2020-06-162020-10-09国家电网有限公司客户服务中心 A data aggregation system and aggregation method based on pluggable components under multi-system
CN112965702A (en)*2021-02-192021-06-15中国工商银行股份有限公司Data acquisition method and device

Similar Documents

PublicationPublication DateTitle
US11922232B2 (en)Responding to incidents identified by an information technology and security operations application using a mobile application
US11496371B2 (en)Executing custom playbook code in a hybrid security operations application environment
US11855850B2 (en)Systems and methods for networked microservice modeling and visualization
US20230237094A1 (en)Processing ingested data to identify anomalies
US11843622B1 (en)Providing machine learning models for classifying domain names for malware detection
US11799798B1 (en)Generating infrastructure templates for facilitating the transmission of user data into a data intake and query system
US10187461B2 (en)Configuring a system to collect and aggregate datasets
US11573971B1 (en)Search and data analysis collaboration system
CN100568193C (en) Systems and methods for performance management in a multi-tier computing environment
US11516069B1 (en)Aggregate notable events in an information technology and security operations application
US9361203B2 (en)Collecting and aggregating log data with fault tolerance
US9425973B2 (en)Resource-based synchronization between endpoints in a web-based real time collaboration
US9817867B2 (en)Dynamically processing an event using an extensible data model
US9082127B2 (en)Collecting and aggregating datasets for analysis
CN109104487A (en)Data transmission method based on logstack + kafka
US11886844B1 (en)Updating reusable custom functions across playbooks
US20230161760A1 (en)Applying data-determinant query terms to data records with different formats
CN103095819A (en)Data information pushing method and data information pushing system
US12045201B1 (en)Automatically configuring connectors of an information technology and security operations application
US20220245093A1 (en)Enhanced search performance using data model summaries stored in a remote data store
CN105138709A (en)Remote evidence taking system based on physical memory analysis
US7779113B1 (en)Audit management system for networks
CN103780700A (en)Application system and method for achieving compatibility and sharing among multi-source heterogeneous systems
CN105099735B (en)A kind of method and system for obtaining magnanimity more detailed logging
US11792157B1 (en)Detection of DNS beaconing through time-to-live and transmission analyses

Legal Events

DateCodeTitleDescription
PB01Publication
PB01Publication
SE01Entry into force of request for substantive examination
SE01Entry into force of request for substantive examination
RJ01Rejection of invention patent application after publication

Application publication date:20181228

RJ01Rejection of invention patent application after publication

[8]ページ先頭

©2009-2025 Movatter.jp