Summary of the invention
In view of the above problems, the present invention proposes a method for processing massive GPS data. It adopts a new data processing mechanism that increases the processing speed of GPS data and shortens the time needed for data storage and query, thereby solving the problems of the prior art.
To solve this technical problem, the present invention adopts the following technical solution:
The invention provides a method for processing massive GPS data received by servers. The servers comprise a number of GPS database servers and a number of GPS application servers. The GPS application servers receive the massive GPS data, process it, and distribute it to different GPS database servers; each GPS database server stores the GPS data that the GPS application servers distribute to it. The method comprises the following steps:
Step 1: Set up a number of GPS database servers to form a distributed database server cluster, and set up a number of GPS application servers to form a distributed application server cluster. The database on each GPS database server is an Oracle database.
Step 2: Divide the GPS data reported by the GPS navigation devices into regions according to the position of the positioning terminal, and distribute the divided GPS data to different GPS application servers.
Step 3: After the GPS application servers receive the GPS data, each performs a first classification of that data and sends the classified GPS data to different GPS database servers for storage. The GPS data comprises information such as the positioning terminal identifier, the positioning terminal position (longitude and latitude), the positioning terminal speed, and the positioning time (start time and end time). The positioning terminal may be mounted on a vehicle or on another mobile terminal.
Step 4: When GPS data needs to be queried, a GPS application server receives the user's query request, which comprises at least a positioning terminal identifier and a positioning time. The GPS application server first uses the positioning terminal identifier to find the GPS database server that stores the GPS data, then uses the classification tables of the GPS data to locate the queried record, and finally sends the query result to the user.
Further, in step 2 above, the divided GPS data is distributed to different GPS application servers using the direct routing (LVS-DR) mode together with the weighted least-connection (WLC) scheduling mode.
Specifically, in LVS-DR mode a request packet sent by the client (CIP) is routed hop by hop to the scheduler (VIP). The scheduler then forwards the request packet to an application server cluster node (RS). After the node RS receives the request packet, it builds the response message using an alias network interface whose address is set to the scheduler's VIP and sends the response directly to the client CIP. The response is not forwarded through the scheduler again, which speeds up the response.
The WLC scheduling mode addresses the case in which the performance of the GPS application servers in the application server cluster differs considerably. The scheduler uses the weighted least-connection scheduling algorithm to optimize load balancing, so that servers with higher weights carry a larger share of the active connection load. The scheduler can automatically query the load of the real application servers and dynamically adjust their weights.
By means of the LVS cluster and parallel processing, the data uploaded by all GPS navigation devices is spread across different GPS application servers, which relieves the pressure on any single GPS application server.
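The weighted least-connection selection described above can be illustrated with a minimal Python sketch. It is a simplified model, not the LVS implementation: it counts only active connections, and the names `RealServer`, `pick_wlc`, `dispatch`, `app-1`, and `app-2` are hypothetical and do not appear in the original.

```python
from dataclasses import dataclass


@dataclass
class RealServer:
    name: str          # hypothetical node name
    weight: int        # higher weight = more capable server
    active: int = 0    # number of currently active connections


def pick_wlc(servers):
    """Return the server with the smallest active/weight ratio."""
    candidates = [s for s in servers if s.weight > 0]
    if not candidates:
        raise ValueError("no server with a positive weight")
    # min() keeps the first server on ties, so list order breaks ties.
    return min(candidates, key=lambda s: s.active / s.weight)


def dispatch(servers, n_requests):
    """Assign n_requests long-lived connections one at a time."""
    for _ in range(n_requests):
        pick_wlc(servers).active += 1
    return {s.name: s.active for s in servers}


servers = [RealServer("app-1", weight=3), RealServer("app-2", weight=1)]
shares = dispatch(servers, 8)
print(shares)  # {'app-1': 6, 'app-2': 2}
```

With weights 3 and 1, the eight connections split 6 to 2, which matches the statement that a server with a higher weight carries a larger share of the active connections.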
In step 3, multiple GPS database servers form the database server cluster. Different GPS database servers receive different GPS data distributed by the GPS application servers, so that the GPS database servers process data in parallel.
In addition, the classification of GPS data in step 3 is a two-stage classification, performed as follows:
Step 31: The GPS application server performs the first classification, which has three levels. The GPS data of the positioning terminals is first divided into different blocks according to positioning terminal position, then divided again according to positioning time, and finally divided according to positioning terminal identifier. GPS data of different division levels is stored on different GPS database servers.
Step 32: A second classification is performed on each GPS database server. Using Oracle range partitioning, list partitioning, hash partitioning, composite range-hash partitioning, and composite range-list partitioning, chosen according to actual business requirements, the GPS data is stored in different partitioned tables in the Oracle database of that GPS database server, which makes the data in the database easier to process.
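The two-stage classification can be sketched in Python as follows. This is only an illustration under stated assumptions: the original does not specify the block size, the rule that maps a classification key to a database server, or the partition naming. The one-degree grid, the server names in `DB_SERVERS`, the block-based routing in `choose_db_server`, and the `P_<day>_H<n>` partition names are all hypothetical.

```python
from datetime import datetime

DB_SERVERS = ["gpsdb-0", "gpsdb-1", "gpsdb-2"]  # hypothetical server names


def region_block(lon, lat, cell_deg=1.0):
    """Level 1: map a position to a grid block (assumed 1-degree cells)."""
    return (int(lon // cell_deg), int(lat // cell_deg))


def first_classification(record):
    """Step 31: three-level key of block, then day, then terminal identifier."""
    block = region_block(record["lon"], record["lat"])
    day = record["time"].strftime("%Y%m%d")
    return (block, day, record["terminal_id"])


def choose_db_server(key):
    """Assumption: route by block so one region stays on one server."""
    block, _day, _terminal = key
    return DB_SERVERS[(block[0] * 31 + block[1]) % len(DB_SERVERS)]


def second_classification(key, hash_parts=4):
    """Step 32, modelled on composite range-hash partitioning:
    range by day, then a deterministic hash of the terminal identifier."""
    _block, day, terminal = key
    sub = sum(terminal.encode()) % hash_parts
    return f"P_{day}_H{sub}"


record = {"terminal_id": "T001", "lon": 118.08, "lat": 24.48,
          "time": datetime(2013, 5, 1, 8, 30)}
key = first_classification(record)
print(key, choose_db_server(key), second_classification(key))
```

A byte-sum hash is used instead of Python's built-in `hash()` because the latter is salted per process for strings, which would make the partition choice unstable across runs.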
By adopting the above method, the present invention uses a new data processing mechanism that combines a distributed database server cluster with a distributed application server cluster, which increases the processing speed of GPS data and shortens the time needed for data storage and query. In addition, the two-stage classification of the stored data makes the storage of GPS data more orderly, which improves the accuracy and precision of data processing and speeds up the extraction of valuable information.
Embodiment
The present invention is now further described with reference to the accompanying drawings and an embodiment.
The database management system (DBMS) is the core component of a mass data storage and retrieval system; all control over the data is exercised through the DBMS. The Oracle database management system is widely used and is the relational database management system most commonly chosen for current high-performance storage and retrieval systems, so the mass data storage described here is built on the Oracle database management system.
The database strategies that can be adopted to achieve high-performance mass data storage include the following:
1. Partitioning: to manage and access database objects such as tables, indexes, and index-organized tables at a finer granularity, these objects can be further subdivided. This is known as partitioning.
Oracle partitions a table by means of a "partition key", a set of columns that determines the partition in which a given row resides. Oracle provides three basic data distribution methods: range, list, and hash. Using these methods, a table can be partitioned either as a single-level partitioned table or as a composite-partitioned table. The partitioning techniques that Oracle provides therefore fall mainly into the following types: range partitioning, list partitioning, hash partitioning, composite range-hash partitioning, and composite range-list partitioning. Oracle also provides three types of index for partitioned tables: local indexes, global partitioned indexes, and global non-partitioned indexes. The index partitioning strategy can be chosen according to business requirements so as to obtain the most suitable partitioning for any type of application. Oracle thus provides a powerful set of techniques for partitioning tables, indexes, and index-organized tables. For the storage of mass data, one or more of the above partitioning techniques can be selected, and the partitioned tables can be managed with a complete set of commands, thereby achieving high-performance retrieval. Partitioning achieves the following effects:
1) Improved availability: if one partition of a database table fails, the data of that table in the other partitions remains available.
2) Easier maintenance: if one partition of a database table fails, only the data in the failed partition needs to be repaired; the whole table does not need maintenance.
3) Balanced I/O: different partitions of a database table can be mapped to different disks to balance I/O, which improves the overall performance of the system.
4) Improved query performance: when a user queries a partitioned object, only the partitions of interest need to be searched, which increases query speed.
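The query benefit in item 4 (searching only the relevant partitions, often called partition pruning) can be illustrated with a small in-memory Python model of range partitioning. It is a sketch of the idea only and does not use Oracle; the class `RangePartitionedTable` and its `scanned` counter are hypothetical names introduced for illustration.

```python
import bisect
from datetime import date


class RangePartitionedTable:
    """In-memory model of a table range-partitioned on a date key."""

    def __init__(self, upper_bounds):
        # Each bound is the exclusive upper limit of one partition,
        # in the spirit of a "values less than" range boundary.
        self.bounds = sorted(upper_bounds)
        self.partitions = [[] for _ in self.bounds]
        self.scanned = 0  # partitions touched by the last query

    def _index(self, key):
        i = bisect.bisect_right(self.bounds, key)
        if i == len(self.bounds):
            raise ValueError("key is beyond the last partition bound")
        return i

    def insert(self, day, row):
        self.partitions[self._index(day)].append((day, row))

    def query(self, start, end):
        """Return rows with start <= day <= end, scanning only the
        partitions whose ranges overlap the requested interval."""
        lo, hi = self._index(start), self._index(end)
        self.scanned = hi - lo + 1
        return [row
                for part in self.partitions[lo:hi + 1]
                for day, row in part
                if start <= day <= end]


table = RangePartitionedTable(
    [date(2013, 2, 1), date(2013, 3, 1), date(2013, 4, 1)])
table.insert(date(2013, 1, 10), "a")
table.insert(date(2013, 2, 5), "b")
table.insert(date(2013, 3, 20), "c")
result = table.query(date(2013, 2, 1), date(2013, 2, 28))
print(result, table.scanned)  # ['b'] 1
```

The February query touches one of the three partitions, which is the effect the list above attributes to partitioning.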
2. Parallel processing: to improve system performance, multiple processors can work together to execute a single SQL statement. This is known as parallel processing.
Parallel processing is a core database technology. It means using multiple CPU and I/O resources to perform a single database operation, so that the database can manage and access terabyte-scale data efficiently. Although the mainstream database management systems all claim to provide parallel processing capability, their parallel processing architectures differ in important ways. Parallel processing means that a single task is decomposed into multiple smaller units. Instead of one process doing all the work, the task is parallelized so that multiple processes run simultaneously on the smaller units. This can greatly improve system performance and make the best use of system resources.
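The decomposition of a single task into smaller units can be sketched in Python with the standard `concurrent.futures` module. This is an analogy to parallel execution inside a database, not Oracle's mechanism; the helper names `split`, `partial_max`, and `parallel_max`, and the sample speed values, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split(rows, n_units):
    """Cut the input into at most n_units contiguous chunks."""
    size = max(1, -(-len(rows) // n_units))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def partial_max(chunk):
    """Work done by one worker on its own unit."""
    return max(chunk)


def parallel_max(rows, n_workers=4):
    """Compute max(rows) by combining per-chunk results."""
    if not rows:
        raise ValueError("rows must not be empty")
    chunks = split(rows, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(partial_max, chunks))
    return max(partials)  # merge step, analogous to a coordinator


speeds = [42.0, 61.5, 38.2, 77.9, 55.0, 12.3, 90.1]
top_speed = parallel_max(speeds, n_workers=3)
print(top_speed)  # 90.1
```

Threads are used here only to show the split, work, and merge structure; for CPU-bound work in CPython, a process pool would be the more usual choice.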
Oracle uses a dynamic parallel processing architecture: depending on the characteristics of the current workload and on the importance and load of the query, a data operation can run in parallel on 1 to N Real Application Clusters nodes.
Characteristics of parallel processing: Oracle database parallelism improves database performance and raises both the maximum operating speed and the maximum load of the database. Because the nodes of a parallel system are independent of one another, the failure of one node does not bring down the database; the remaining nodes can recover the failed node while continuing to serve users, so a parallel configuration is more reliable than a single node. Oracle database parallelism also allows database instances to be allocated and released at any time as required, which makes the database highly flexible. Finally, parallelism overcomes memory limits and so allows data services to be provided to more users.
3. LVS load balancing: an LVS cluster uses IP load balancing and content-based request distribution. The scheduler has good throughput, distributes requests evenly to different servers for execution, and automatically masks server failures, so that a group of servers forms one high-performance, highly available virtual server. The structure of the server cluster is transparent to the client, and neither the client programs nor the server programs need to be modified.
On the basis of the above, the invention provides a method for processing massive GPS data received by servers. The servers comprise a number of GPS database servers and a number of GPS application servers. The GPS application servers receive the massive GPS data, process it, and distribute it to different GPS database servers; each GPS database server stores the GPS data that the GPS application servers distribute to it. The method comprises the following steps:
Step 1: Set up a number of GPS database servers to form a distributed database server cluster, and set up a number of GPS application servers to form a distributed application server cluster. The database on each GPS database server is an Oracle database.
Step 2: Divide the GPS data reported by the GPS navigation devices into regions according to the position of the positioning terminal, and distribute the divided GPS data to different GPS application servers.
Step 3: After the GPS application servers receive the GPS data, each performs a first classification of that data and sends the classified GPS data to different GPS database servers for storage. The GPS data comprises information such as the positioning terminal identifier, the positioning terminal position (longitude and latitude), the positioning terminal speed, and the positioning time (start time and end time). The positioning terminal may be mounted on a vehicle or on another mobile terminal.
Step 4: When GPS data needs to be queried, a GPS application server receives the user's query request, which comprises at least a positioning terminal identifier and a positioning time. The GPS application server first uses the positioning terminal identifier to find the GPS database server that stores the GPS data, then uses the classification tables of the GPS data to locate the queried record, and finally sends the query result to the user.
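The lookup in step 4 can be sketched in Python as a two-part search: a directory from terminal identifier to database server, followed by a lookup in per-day tables. The original does not say how the application server knows which database server holds a terminal's data, so the in-memory `directory` maintained at store time is an assumption, and the class `QueryRouter` and the server name `gpsdb-0` are hypothetical.

```python
from collections import defaultdict
from datetime import datetime, timedelta


class QueryRouter:
    """Toy model of an application server routing queries."""

    def __init__(self):
        # Assumption: terminal_id -> set of database servers holding its data.
        self.directory = defaultdict(set)
        # (server, terminal_id, day) -> records; stands in for classified tables.
        self.tables = defaultdict(list)

    def store(self, server, record):
        day = record["time"].date()
        self.directory[record["terminal_id"]].add(server)
        self.tables[(server, record["terminal_id"], day)].append(record)

    def query(self, terminal_id, start, end):
        """Find the servers by terminal identifier, then visit only
        the per-day tables that overlap the requested time range."""
        hits = []
        for server in sorted(self.directory.get(terminal_id, ())):
            day = start.date()
            while day <= end.date():
                rows = self.tables.get((server, terminal_id, day), [])
                hits += [r for r in rows if start <= r["time"] <= end]
                day += timedelta(days=1)
        return sorted(hits, key=lambda r: r["time"])


router = QueryRouter()
router.store("gpsdb-0", {"terminal_id": "T001", "speed": 40.0,
                         "time": datetime(2013, 5, 1, 8, 0)})
router.store("gpsdb-0", {"terminal_id": "T001", "speed": 55.0,
                         "time": datetime(2013, 5, 1, 9, 0)})
router.store("gpsdb-0", {"terminal_id": "T002", "speed": 30.0,
                         "time": datetime(2013, 5, 1, 9, 0)})
found = router.query("T001", datetime(2013, 5, 1, 8, 30),
                     datetime(2013, 5, 1, 10, 0))
print([r["speed"] for r in found])  # [55.0]
```

Because the request carries both the terminal identifier and the positioning time, neither the other terminals' data nor the days outside the requested range are scanned.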
In step 2 above, the divided GPS data is distributed to different GPS application servers using the direct routing (LVS-DR) mode together with the weighted least-connection (WLC) scheduling mode.
Specifically, in LVS-DR mode a request packet sent by the client (CIP) is routed hop by hop to the scheduler (VIP). The scheduler then forwards the request packet to an application server cluster node (RS). After the node RS receives the request packet, it builds the response message using an alias network interface whose address is set to the scheduler's VIP and sends the response directly to the client CIP. The response is not forwarded through the scheduler again, which speeds up the response.
The WLC scheduling mode addresses the case in which the performance of the GPS application servers in the application server cluster differs considerably. The scheduler uses the weighted least-connection scheduling algorithm to optimize load balancing, so that servers with higher weights carry a larger share of the active connection load. The scheduler can automatically query the load of the real application servers and dynamically adjust their weights.
By means of the LVS cluster and parallel processing, the data uploaded by all GPS navigation devices is spread across different GPS application servers, which relieves the pressure on any single GPS application server.
In step 3, multiple GPS database servers form the database server cluster. Different GPS database servers receive different GPS data distributed by the GPS application servers, so that the GPS database servers process data in parallel.
In addition, the classification of GPS data in step 3 is a two-stage classification, performed as follows:
Step 31: The GPS application server performs the first classification, which has three levels. The GPS data of the positioning terminals is first divided into different blocks according to positioning terminal position, then divided again according to positioning time, and finally divided according to positioning terminal identifier. GPS data of different division levels is stored on different GPS database servers.
Step 32: A second classification is performed on each GPS database server. Using Oracle range partitioning, list partitioning, hash partitioning, composite range-hash partitioning, and composite range-list partitioning, chosen according to actual business requirements, the GPS data is stored in different partitioned tables in the Oracle database of that GPS database server, which makes the data in the database easier to process.
With reference to Fig. 1, the processing flow of the present invention is as follows. A number of GPS navigation devices send massive GPS data to the distributed application server cluster; when the number of GPS navigation devices is large, the volume of GPS data transmitted per second is enormous. The LVS cluster and parallel processing are used to spread the GPS data across the GPS application servers, where it is classified and then sent to the GPS database servers for storage.
Although the present invention has been specifically shown and described in connection with a preferred embodiment, those skilled in the art should understand that various changes in form and detail may be made to the invention without departing from the spirit and scope of the invention as defined by the appended claims, and that all such changes fall within the scope of protection of the present invention.