Summary of the Invention
In view of this, the present application provides a real-time stream processing method and apparatus, by which a user can process real-time stream data through concise class SQL control commands.
Specifically, the present application is achieved by the following technical solutions:
According to a first aspect of the present application, a stream data processing method is provided. The method is applied to a Spark SQL component of a Spark platform, and the method includes:
receiving an externally input class structured query language (class SQL) control command;
parsing the class SQL control command;
if the parsed class SQL control command is a streaming control command, processing subsequently received stream data according to the class SQL control command; and
if the parsed class SQL control command is a native Spark SQL control command, processing, according to the class SQL control command, the external structured data specified by the class SQL control command.
Optionally, after the parsing of the class SQL control command, the method further includes:
determining the type of the parsed class SQL control command based on a key field obtained by parsing the class SQL control command.
Optionally, the determining of the type of the parsed class SQL control command based on the key field obtained by parsing the class SQL control command includes:
if the key field obtained by parsing the class SQL control command matches a first preset specific field, determining that the parsed class SQL control command is the streaming control command; and
if the key field obtained by parsing the class SQL control command matches a second preset specific field, determining that the parsed class SQL control command is the native Spark SQL control command.
Optionally, the class SQL control command carries a data processing keyword;
the processing of the subsequently received stream data according to the class SQL control command includes:
processing the subsequently received stream data according to the data processing keyword carried by the class SQL control command;
the processing of the stream data includes: Join processing, Map processing, Reduce processing, and user-defined processing.
Optionally, the externally input class SQL control command is received, and the stream data is processed, in one or more of the following modes:
an application programming interface (API) mode;
a command line interface (CLI) mode;
a Java database connectivity (JDBC) mode.
According to a second aspect of the present application, a stream data processing apparatus is provided. The apparatus is applied to a Spark SQL component of a Spark platform, and the apparatus includes:
a parsing unit, configured to receive an externally input class structured query language (class SQL) control command and parse the class SQL control command;
a Stream SQL unit, configured to, if the parsed class SQL control command is a streaming control command, process subsequently received stream data according to the class SQL control command; and
a Spark SQL unit, configured to, if the parsed class SQL control command is a native Spark SQL control command, process, according to the class SQL control command, the external structured data specified by the class SQL control command.
Optionally, the parsing unit is further configured to determine the type of the parsed class SQL control command based on a key field obtained by parsing the class SQL control command.
Optionally, the parsing unit is specifically configured to: if the key field obtained by parsing the class SQL control command matches a first preset specific field, determine that the parsed class SQL control command is the streaming control command; and if the key field obtained by parsing the class SQL control command matches a second preset specific field, determine that the parsed class SQL control command is the native Spark SQL control command.
Optionally, the class SQL control command carries a data processing keyword;
the Stream SQL unit is specifically configured to process the subsequently received stream data according to the data processing keyword carried by the class SQL control command;
the data processing keywords cover: Join processing, Map processing, Reduce processing, and user-defined processing.
Optionally, the Spark SQL component receives the externally input class SQL control command, and processes the stream data, in one or more of the following modes:
an application programming interface (API) mode;
a command line interface (CLI) mode;
a Java database connectivity (JDBC) mode.
The present application provides a method for processing real-time stream data. The Spark SQL component of a Spark platform is extended with a Stream SQL unit, so that the Spark SQL component can receive an externally input class SQL control command and parse it. If the parsed class SQL control command is a streaming control command, the Spark SQL component processes subsequently received stream data according to the class SQL control command. If the parsed class SQL control command is a native Spark SQL control command, the Spark SQL component processes, according to the class SQL control command, the external structured data specified by the class SQL control command.
In a first aspect, because the user can perform stream processing on the real-time data received by the Spark platform simply by using concise class SQL control commands, the learning threshold for using the Spark platform is lowered, and the use of the Spark platform is greatly facilitated and simplified, thereby promoting the popularization and application of the Spark platform.
In a second aspect, because the Spark SQL component can automatically determine the command type of the class SQL control command input by the user, the component processes the external structured data specified by the command when the command is a native Spark SQL control command, and performs stream processing on real-time stream data when the command is a streaming control command. The user can therefore process both external structured data and real-time stream data using only the interaction modes provided by the Spark SQL component.
In a third aspect, the user can input class SQL control commands through any of the three interaction modes provided by the Spark SQL component, namely the CLI, the easy-to-call API, and JDBC, to process real-time stream data, thereby extending the interaction modes available for processing real-time stream data.
Specific Embodiments
Exemplary embodiments are described in detail herein, and examples thereof are illustrated in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numerals in different drawings denote the same or similar elements. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present application as detailed in the appended claims.
The terms used in the present application are for the purpose of describing particular embodiments only and are not intended to limit the present application. The singular forms "a", "an", "said", and "the" used in the present application and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" used herein refers to and covers any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in the present application to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the present application, first information may also be referred to as second information, and similarly, second information may also be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".
The Spark platform is a cluster computing platform designed to be fast and general-purpose. Because it offers advantages such as high running speed, good ease of use, wide versatility, and high fault tolerance, the Spark platform is widely used in fields such as big data computing on real-time data.
As the business value of real-time data becomes increasingly important, many users use the Spark platform when performing big data analysis on real-time data. For example, a user may use the Spark platform to perform streaming computation for a business, such as counting the number of visits by users.
When using the Spark platform to write a streaming computing application, the user needs to call the APIs of the Spark platform.
On the one hand, although the Spark platform provides a rich set of APIs, each API has its own usage rules, and the user must master the usage rules of each API before being able to call them to write a streaming computing application. This presents a high learning threshold: only a user who is quite familiar with the relevant technologies of the Spark platform and stream computing, and even with the underlying layers, can write an efficient streaming computing application. This greatly limits the popularization and application of the Spark platform.
On the other hand, the user can only write streaming applications for the Spark platform through the API, so the way in which the Spark platform can be used is overly limited.
In addition, after writing a streaming computing application, the user needs to compile and package it into a jar file and then submit the application. Because the path from programming a streaming computing application to releasing it must go through this complicated and cumbersome process, the usage efficiency of the Spark platform is greatly reduced.
In view of this, the present application provides a method for processing real-time stream data. The Spark SQL component of a Spark platform is extended with a Stream SQL unit, so that the Spark SQL component can receive an externally input class structured query language (class SQL) control command and parse it. If the parsed class SQL control command is a streaming control command, the Spark SQL component processes subsequently received stream data according to the class SQL control command. If the parsed class SQL control command is a native Spark SQL control command, the Spark SQL component processes, according to the class SQL control command, the external structured data specified by the class SQL control command.
In a first aspect, because the user can perform stream processing on the real-time data received by the Spark platform simply by using concise class SQL control commands, the learning threshold for using the Spark platform is lowered, and the use of the Spark platform is greatly facilitated and simplified, thereby promoting the popularization and application of the Spark platform.
In a second aspect, because the Spark SQL component can automatically determine the command type of the class SQL control command input by the user, the component processes the external structured data specified by the command when the command is a native Spark SQL control command, and performs stream processing on real-time stream data when the command is a streaming control command. The user can therefore process both external structured data and real-time stream data using only the interaction modes provided by the Spark SQL component.
In a third aspect, the user can input class SQL control commands through any of the three interaction modes provided by the Spark SQL component, namely the CLI (Command-Line Interface), the easy-to-call API, and JDBC (Java DataBase Connectivity), to process real-time stream data, thereby extending the interaction modes available for processing real-time stream data.
Referring to Fig. 1, Fig. 1 is a schematic diagram of a Spark platform according to an exemplary embodiment of the present application. A Spark platform generally includes: a Spark SQL component, a Spark Streaming component, an MLbase/MLlib (machine learning) component, and a GraphX (graph) component.
The Spark SQL component is a framework component provided by the Spark platform for operating on structured data. Through the Spark SQL component, a user can use SQL statements to query or read external structured data (such as data in JSON, Hive, or Parquet). In addition, third-party intelligence software can connect to Spark SQL through JDBC to perform queries.
The Spark Streaming component is a high-throughput, highly fault-tolerant stream processing system for real-time stream data. It can perform complex operations such as Map, Reduce, and Join on multiple data sources (such as Kafka, Flume, and Twitter) and save the results to an external file system.
The MLbase component is the component of the Spark platform dedicated to machine learning. It contains common machine learning algorithms and utilities, including classification, regression, clustering, and collaborative filtering.
The GraphX component provides the APIs of the Spark platform for graphs and graph-parallel computation.
The method for processing real-time stream data provided by the present application extends the functionality of the above Spark SQL component, for example by enhancing the parsing unit in the Spark SQL component and adding a Stream SQL unit to the Spark SQL component, thereby adding the ability to process real-time stream data. In this way, the user can use class SQL statements not only to process the external structured data that the platform originally supports, but also to process the real-time stream data received.
The Spark SQL component provided by the present application is described in detail below.
Referring to Fig. 2, Fig. 2 is a schematic diagram of a Spark SQL component according to an exemplary embodiment of the present application.
The Spark SQL component provided in the embodiments of the present application may include: a parsing unit, a Spark SQL unit, and a Stream SQL unit.
The parsing unit is configured to parse a class SQL control command input by a user after the command is received, and to determine the type of the class SQL control command from the key fields obtained by parsing. If the class SQL control command is a native Spark SQL control command, the command is sent to the Spark SQL unit. If the class SQL control command is a streaming control command, the command is sent to the Stream SQL unit.
The Spark SQL unit is the core unit of the Spark SQL component for operating on structured data. A user can use SQL statements to query or read the data of external structured data sources (such as JSON, Hive, and Parquet); not only is data querying with SQL statements supported inside Spark programs, but third-party intelligence software can also connect to Spark SQL through JDBC to perform queries. The Stream SQL unit is configured to process real-time stream data received from a data source. The difference from the Spark SQL unit is that the Spark SQL unit processes external structured data, whereas the Stream SQL unit processes stream data received in real time. Processing real-time stream data here can be understood as performing streaming business computation and the like on the real-time stream data according to the business needs of the user. Of course, the processing of real-time stream data is only illustrated here and is not specifically limited.
More specifically, referring to Fig. 3, Fig. 3 is a schematic diagram of real-time stream data task processing according to an exemplary embodiment of the present application.
As can be seen from Fig. 3, the Stream SQL unit further comprises a receiving module, a Stream task processing module, and a sending module.
The receiving module is responsible for receiving the data stream from a data source, which may be any of multiple data sources such as Kafka (a high-throughput distributed publish-subscribe messaging system) or Flume (a highly available, highly reliable, distributed system for massive log collection, aggregation, and transmission, provided by Cloudera). The receiving module may be the receiving module of the Spark Streaming component of the Spark platform, or may be a newly developed receiving module; no specific limitation is imposed here.
The Stream task processing module may be configured to perform task processing on the received real-time stream data according to the data processing keyword in the class SQL control command input by the user.
The sending module may be configured to write the data processed by the Stream task processing module into an external storage tool. The sending module can exchange data directly with the external storage tool: it converts the results of the real-time stream data task processing into the format expected by the connected storage tool, and then stores them in that storage tool.
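The format conversion performed by the sending module can be illustrated with a minimal Python sketch. The sink names ("hive", "console") and the output layouts here are illustrative assumptions for the sketch, not the actual formats used by the component.

```python
import json

# Minimal sketch of the sending module's format conversion: task results
# are converted to the layout expected by the attached storage tool before
# being written out. Sink names and layouts are illustrative assumptions.

def format_for_sink(result_rows, sink):
    if sink == "hive":
        # one record per line, fields separated by tabs (a common Hive text layout)
        return "\n".join("\t".join(str(v) for v in row) for row in result_rows)
    if sink == "console":
        # plain JSON for a console-style sink
        return json.dumps(result_rows)
    raise ValueError("unsupported sink: " + sink)
```

In a real deployment this conversion would be dictated by the storage tool's own input format; the point of the sketch is only that the sending module adapts one result set to whichever sink the class SQL control command specified.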
The method for processing real-time stream data proposed by the present application is described in detail below.
Referring to Fig. 4, Fig. 4 is a flowchart of a method for processing real-time stream data according to an exemplary embodiment of the present application. The method can be applied to the Spark SQL component of a Spark platform, and may include the following steps.
Step 401: receiving an externally input class structured query language (class SQL) control command;
Step 402: parsing the class SQL control command;
Step 403: if the parsed class SQL control command is a streaming control command, processing subsequently received stream data according to the class SQL control command;
Step 404: if the parsed class SQL control command is a native Spark SQL control command, processing, according to the class SQL control command, the external structured data specified by the class SQL control command.
The above stream data can be understood as stream data received in real time from a data source. So-called stream data can be understood as data that a data source generates continuously, such as log files generated in real time by a user using an application, or online shopping data. Such stream data is usually also referred to as real-time stream data.
The above class SQL control command carries a data processing keyword. In terms of type, two classes of data processing keywords can be distinguished: data processing keywords for real-time stream data, and data processing keywords for external structured data. When the class SQL control command is a streaming control command, the data processing keyword it carries may be a data processing keyword for real-time stream data, and the data processing indicated by the keyword may include Join processing, Map processing, Reduce processing, and some user-defined processing modes. Of course, the SQL statement may also include some common computing tasks, such as summation and filtering. When the class SQL control command is a native Spark SQL control command, the data processing operations indicated by the data processing keyword carried in the command may include the data processing operations that native Spark SQL can perform.
In the embodiments of the present application, the user can input the above class SQL control command using the same interaction modes as the Spark SQL component.
For example, the user can input class SQL control commands to the Spark SQL component in three modes: API, CLI, and JDBC.
It should be noted that this API is entirely different from the API provided by the Spark Streaming component of the Spark platform. The API interface provided by Spark Streaming requires the user to understand the interface in detail before being able to write complex programs to call it. In the present application, by contrast, the user can input generic, concise class SQL control commands through the API to process real-time stream data.
In the embodiments of the present application, the Spark SQL component can receive a class SQL control command input by the user and parse the class SQL control command.
After the class SQL control command is parsed, some key fields can be obtained.
In an optional implementation, the Spark SQL component is preset with some specific fields related to streaming control commands, serving as the first specific fields, for example fields such as STREAM. The Spark SQL component is also preset with some specific fields related to native Spark SQL control commands, serving as the second specific fields, for example fields such as table.
It should be noted that the above first specific fields can be understood as a preset set of specific fields related to streaming control commands, and the above second specific fields can be understood as a preset set of specific fields related to native Spark SQL control commands.
The Spark SQL component can determine the type of the parsed class SQL control command based on the key fields obtained by parsing it. If the key fields obtained by parsing match the first specific fields, the Spark SQL component can determine that the parsed class SQL control command is a streaming control command; if the key fields obtained by parsing match the second specific fields, the Spark SQL component can determine that the parsed class SQL control command is a native Spark SQL control command.
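The key-field matching described above can be sketched in a few lines of Python. The contents of the two field sets are illustrative assumptions: the embodiment names only STREAM (a first specific field) and table (a second specific field) as examples.

```python
# Minimal sketch of command-type classification by key-field matching.
# The two field sets are illustrative assumptions; the embodiment gives
# only STREAM and table as example fields.

FIRST_SPECIFIC_FIELDS = {"STREAM"}    # related to streaming control commands
SECOND_SPECIFIC_FIELDS = {"TABLE"}    # related to native Spark SQL control commands

def classify_command(command: str) -> str:
    text = command.upper()
    if any(field in text for field in FIRST_SPECIFIC_FIELDS):
        return "streaming"
    if any(field in text for field in SECOND_SPECIFIC_FIELDS):
        return "native"
    return "unknown"
```

The streaming check is applied first so that a command such as CREATE STREAM, whose text might also mention table-like fields, is dispatched to the Stream SQL unit rather than the Spark SQL unit.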
For example, assume that the class SQL control command input by the user is:
“CREATE TABLE tableName(NAME STRING,AGE INT);
Show tables;”
Here, CREATE TABLE is the command to create a table, tableName is the table name, (NAME STRING, AGE INT) is the data type definition, and Show tables is the command to display the tables.
After parsing this control command, the Spark SQL component can obtain key fields such as CREATE TABLE and Show tables. Finding that these key fields match the second specific fields (fields such as table and show tables), the Spark SQL component can determine that the parsed class SQL control command is a native Spark SQL control command.
As another example, assume that the user inputs a class SQL control command that creates and starts a Streaming stream.
In this command, CREATE STREAM is the command to create a Streaming stream; StreamTable is the name of the created Streaming stream; (NAME STRING, AGE INT) is the data type definition; TBLPROPERTIES indicates that configuration information is stored in key-value form; SOURCES specifies socket as the data source; HOSTNAME is the IP address on which the socket receives data; SINKS specifies hive as the data storage format; insert is the command to start the created Streaming stream; and SELECT is the query command.
After parsing this control command, the Spark SQL component can obtain key fields such as CREATE STREAM and StreamTable. Finding that these key fields match the first specific fields (fields such as STREAM), the Spark SQL component can determine that the parsed class SQL control command is a streaming control command.
Of course, the command type of the class SQL control command may also be determined using other methods; the method for determining the command type of a class SQL control command is only described here by way of example and is not specifically limited.
In the embodiments of the present application, a class SQL control command usually carries data processing keywords. After determining the type of the class SQL control command input by the user, the Spark SQL component can obtain, by parsing the command, the data processing keywords it carries, and can process the real-time stream data according to these data processing keywords.
The above data processing keywords may include CREATE (creation), insert (start), join (Join processing), group by (Reduce processing), and so on.
For example, assume that the class SQL control command input by the user is:
“CREATE TABLE tableName(NAME STRING,AGE INT);
Show tables;”
Here, CREATE TABLE is the command to create a table; tableName is the table name; (NAME STRING, AGE INT) is the data type definition.
After receiving this class SQL control command, the Spark SQL component can parse it and obtain the data processing keywords it carries, such as CREATE and Show tables, and determine that the processing operations corresponding to the command are creating a table and displaying the created table.
As another example, assume that the user inputs a class SQL control command that creates and starts a Streaming stream and groups the received data.
In this command, CREATE STREAM is the command to create a Streaming stream; w is the name of the created Streaming stream; (NAME STRING, AGE INT) is the data type definition; TBLPROPERTIES indicates that configuration information is stored in key-value form; SOURCES specifies socket as the data source; HOSTNAME is the IP address on which the socket receives data; SINKS specifies hive as the data storage format; insert is the command to start the created Streaming stream; and group indicates that the received real-time stream data is to be grouped.
After receiving this class SQL control command, the Spark SQL component can parse it and obtain the data processing keywords carried in the command, such as CREATE, insert, and group by. Based on these keywords, the component can determine that the processing operations corresponding to the command are creating a Streaming stream, starting the Streaming stream, and performing grouped statistics on the received real-time stream data.
In the embodiments of the present application, when the above class SQL control command is a streaming control command, the Spark SQL component can process the received real-time stream data according to the data processing keywords obtained.
Continuing with the above example of grouping a real-time stream: after parsing the class SQL control command, the Spark SQL component can start the processing described above. In this example, the Spark SQL component performs the grouped-statistics operation on the received real-time stream according to the above data processing keywords, and then stores the processing results, together with the intermediate data generated during processing (such as the batches generated while the real-time stream data is processed), in the storage tool specified in the class SQL control command, such as Hive, for the user to query.
In the embodiments of the present application, when the class SQL control command is a native Spark SQL control command, the external structured data specified by the command is processed based on the data processing keywords obtained, and the processing results are stored for the user to query.
In addition, in the embodiments of the present application, the processing of the above stream data may include Join processing, Map processing, Reduce processing, and user-defined processing.
A Join task may be illustrated with the following example.
Assume that the user inputs a class SQL control command that creates two Streaming streams and joins them.
By parsing, the Spark SQL component can obtain data processing keywords such as CREATE STREAM w1, CREATE STREAM w2, insert, and join. From the data processing keywords obtained, the Spark SQL component can determine that the processing to be performed is: create two Streaming streams, named w1 and w2 respectively; fully join w1 and w2; and select from the result set the records with the same age.
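The Join semantics described above — fully joining w1 and w2 and keeping the pairs with equal age — can be sketched as follows. Records are modeled as (name, age) tuples, following the (NAME STRING, AGE INT) schema used throughout the examples; this is a sketch of the operation's meaning, not of the component's implementation.

```python
# Minimal sketch of the Join task: fully join streams w1 and w2 and keep
# the pairs whose age fields are equal. Records are (name, age) tuples.

def join_streams_on_age(w1, w2):
    return [(n1, n2, a1)
            for (n1, a1) in w1
            for (n2, a2) in w2
            if a1 == a2]
```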
A Map task may be illustrated with the following example.
Assume that the user inputs a class SQL control command that creates a Streaming stream and transforms its records.
After parsing the class SQL control command, the Spark SQL component can obtain the data processing keywords it carries, such as CREATE STREAM w, insert, and age+100. From the data processing keywords obtained, the Spark SQL component can determine that the processing to be performed is: create a Streaming stream named w, and add 100 to the real-time stream data corresponding to w.
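The Map semantics of this example — adding 100 to the age field of every record on stream w — can be sketched as follows, again modeling records as (name, age) tuples.

```python
# Minimal sketch of the Map task: add 100 to the age field of every
# record on stream w. Records are (name, age) tuples.

def map_add_to_age(records, delta=100):
    return [(name, age + delta) for name, age in records]
```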
A Reduce task may be illustrated with the following example.
Assume that the user inputs a class SQL control command that creates a Streaming stream and groups its records.
By parsing, the Spark SQL component can obtain the data processing keywords carried by the class SQL control command, such as CREATE STREAM w, insert, and group by. From the data processing keywords obtained, the Spark SQL component can determine that the processing to be performed includes: create a Streaming stream named w, and perform grouped statistics on the real-time stream data corresponding to w according to the value of age.
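The Reduce semantics of this example — grouping the records on stream w by the value of age — can be sketched as a grouped count. Counting is an illustrative choice of aggregate; the embodiment only says "grouped statistics".

```python
from collections import Counter

# Minimal sketch of the Reduce task: group the records on stream w by the
# value of age and count each group. Records are (name, age) tuples;
# counting is an illustrative choice of grouped statistic.

def group_count_by_age(records):
    return dict(Counter(age for _name, age in records))
```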
In addition, the Spark platform provided by the present application also supports user-defined function operations.
For example, suppose the user wants to determine, from the age field, whether a user is young.
The user can write such a user-defined function in advance. Then, when using the user-defined function to process real-time stream data, the user can input class SQL control commands as follows:
add jar youngPeople.jar;
create temporary function isYoung as 'com.spark.YoungPeople';
insert into stream console select name,isYoung(age)from w;
Assume that the received real-time stream data consists of users' names and ages. After parsing, the Spark platform can obtain data processing keywords such as create temporary function, insert, select name, and isYoung(age). From the data processing keywords obtained, the Spark SQL component can determine that the processing to be performed includes applying the YoungPeople function to the real-time stream corresponding to the created stream w, that is, determining from each user's age whether the user is young.
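The user-defined function scenario can be sketched as follows: an isYoung-style predicate is applied to the age field of every record, as in "select name, isYoung(age) from w". The age threshold of 30 is an illustrative assumption — the embodiment does not specify what makes a user "young" — and the actual function would live in the youngPeople.jar mentioned above, not in Python.

```python
# Minimal sketch of the user-defined function scenario. The threshold of
# 30 is an illustrative assumption; the real function would be packaged
# in the jar registered by the "create temporary function" command.

def is_young(age: int) -> bool:
    return age < 30

def apply_udf(records, udf):
    # equivalent in spirit to: select name, isYoung(age) from w
    return [(name, udf(age)) for name, age in records]
```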
The present application provides a method for processing real-time stream data. The Spark SQL component of a Spark platform is extended with a Stream SQL unit, so that the Spark SQL component can receive an externally input class SQL control command and parse it. If the parsed class SQL control command is a streaming control command, the Spark SQL component processes subsequently received stream data according to the class SQL control command. If the parsed class SQL control command is a native Spark SQL control command, the Spark SQL component processes, according to the class SQL control command, the external structured data specified by the class SQL control command.
In a first aspect, because the user can perform stream processing on the real-time data received by the Spark platform simply by using concise class SQL control commands, the learning threshold for using the Spark platform is lowered, and the use of the Spark platform is greatly facilitated and simplified, thereby promoting the popularization and application of the Spark platform.
In a second aspect, because the Spark SQL component can automatically determine the command type of the class SQL control command input by the user, the component processes the external structured data specified by the command when the command is a native Spark SQL control command, and performs stream processing on real-time stream data when the command is a streaming control command. The user can therefore process both external structured data and real-time stream data using only the interaction modes provided by the Spark SQL component.
In a third aspect, the user can input class SQL control commands through any of the three interaction modes provided by the Spark SQL component, namely the CLI, the easy-to-call API, and JDBC, to process real-time stream data, thereby extending the interaction modes available for processing real-time stream data.
Referring to Fig. 5, Fig. 5 is a block diagram of a real-time streaming data processing apparatus shown in an exemplary embodiment of the application. The apparatus is applied to the Spark SQL component of a Spark platform, and the apparatus includes:
a resolution unit 501, configured to receive an externally input class SQL (SQL-like) control command and parse the class SQL control command;
a Stream SQL unit 502, configured to process subsequently received stream data according to the class SQL control command if the parsed class SQL control command is a streaming control command; and
a Spark SQL unit 503, configured to process the external structured data specified by the class SQL control command according to the class SQL control command if the parsed class SQL control command is a native Spark SQL control command.
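The cooperation among units 501 to 503 can be sketched as a minimal dispatcher. This is a hypothetical illustration, under the assumption that a streaming command is recognized by a stream-related leading field; it is not the claimed implementation:

```python
class ResolutionUnit:  # unit 501: parse and classify the command
    # Assumed stream-related specific fields, for illustration only.
    STREAM_FIELDS = ("create stream", "insert into stream")

    def parse(self, command: str) -> str:
        lowered = command.lower().strip()
        if any(lowered.startswith(f) for f in self.STREAM_FIELDS):
            return "streaming"
        return "native"

class StreamSQLUnit:  # unit 502: handle streaming control commands
    def handle(self, command: str) -> str:
        return "stream data processed per: " + command

class SparkSQLUnit:  # unit 503: handle native Spark SQL control commands
    def handle(self, command: str) -> str:
        return "structured data processed per: " + command

def dispatch(command: str) -> str:
    kind = ResolutionUnit().parse(command)
    unit = StreamSQLUnit() if kind == "streaming" else SparkSQLUnit()
    return unit.handle(command)

print(dispatch("create stream w (name string, age int)"))
print(dispatch("select name, age from users"))
```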
Optionally, the resolution unit 501 is further configured to determine the type of the parsed class SQL control command based on the key fields obtained by parsing the class SQL control command.
Optionally, the resolution unit 501 is specifically configured to: determine that the parsed class SQL control command is the streaming control command if the key field matches a first default specific field; and determine that the parsed class SQL control command is the native Spark SQL control command if the key field matches a second default specific field.
Optionally, the class SQL control command carries a data processing keyword;
the Stream SQL unit 502 is specifically configured to process subsequently received stream data according to the data processing keyword carried by the class SQL control command;
the data processing keyword includes: Join processing, Map processing, Reduce processing, and user-defined processing.
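The correspondence between keywords and the kinds of processing just listed can be sketched as a lookup table. Which keyword maps to which kind is an assumption made for illustration only; the application does not fix this mapping:

```python
# Hypothetical mapping from data processing keywords to processing kinds.
KEYWORD_KINDS = {
    "join": "Join processing",
    "select": "Map processing",
    "group by": "Reduce processing",
    "isYoung": "user-defined processing",  # a registered custom function
}

def kinds_for(keywords):
    """Return the kinds of processing implied by the given keywords."""
    return [KEYWORD_KINDS[k] for k in keywords if k in KEYWORD_KINDS]

print(kinds_for(["select", "group by"]))  # ['Map processing', 'Reduce processing']
```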
Optionally, the Spark SQL component receives externally input class SQL control commands and processes stream data in one or more of the following ways:
an application programming interface (API) mode;
a command line interface (CLI) mode;
a Java database connectivity (JDBC) mode.
Accordingly, the application further provides a hardware structure of the apparatus shown in Fig. 5. Referring to Fig. 6, Fig. 6 is a hardware structure diagram of the apparatus shown in Fig. 5 provided by the application. The system includes a processor 601, a memory 602, and a bus 603, wherein the processor 601 and the memory 602 communicate with each other through the bus 603.
The processor 601 may be a CPU, and the memory 602 may be a non-volatile memory. Metadata management logic instructions are stored in the memory 602, and the processor 601 can execute the metadata management logic instructions stored in the memory 602 to realize the above management function for real-time streaming data in the flow shown in Fig. 3.
As shown in Fig. 6, the hardware structure may further include a power supply component 604 configured to perform power management of the apparatus, and an input/output (I/O) interface 605.
For the apparatus embodiment, since it substantially corresponds to the method embodiment, reference may be made to the partial description of the method embodiment for relevant details. The apparatus embodiment described above is merely illustrative, wherein the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, i.e., they may be located in one place or distributed over multiple network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the application. Those of ordinary skill in the art can understand and implement the solution without creative effort.
The foregoing is merely preferred embodiments of the application and is not intended to limit the application. Any modification, equivalent substitution, improvement, and the like made within the spirit and principles of the application shall be included within the protection scope of the application.