The terms "first," "second," "third," "fourth," and the like in the description and in the claims, as well as in the drawings, if any, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the terms so used are interchangeable under appropriate circumstances such that the embodiments of the invention described herein are, for example, capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements explicitly listed, but may include other steps or elements not explicitly listed or inherent to such process, method, system, article, or apparatus.
The technical solution of the present invention will be described in detail with specific examples. Several of the following embodiments may be combined with each other and some details of the same or similar concepts or processes may not be repeated in some embodiments.
The embodiment of the application provides a data processing method that assigns description labels to the association relationship between the description information of two associated objects, matches the description labels to the corresponding main bodies, and collects statistics on the data of those main bodies, so as to calculate a development index that measures the development status of the object obtained by associating the two objects. The scheme can measure the development status of the object objectively and comprehensively.
The data processing procedure is described in detail below with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a schematic diagram of a data processing flow in the embodiment of the present application. The method comprises the following specific steps:
Step 101, acquiring an association relationship between the description information of the first object and the description information of the second object, and assigning a description label to the association relationship.
In the embodiment of the present application, acquiring an association relationship between description information of a first object and description information of a second object includes:
acquiring all description information of a first object as a first description information set;
acquiring all description information of a second object as a second description information set;
respectively determining whether an association relationship exists between each piece of description information in the first description information set and each piece of description information in the second description information set;
and acquiring the description information in the first description information set and the description information in the second description information set which have the association relationship.
Assigning a description label to the association relationship includes:
determining the technical field according to the description information in the first description information set and the description information in the second description information set corresponding to the association relationship, acquiring related keywords in the corresponding technical field, and taking the keywords as the description labels of the association relationship.
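The sub-steps of step 101 can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the function names `find_associations` and `assign_labels`, the association rule (`known_pairs`), the mapping from a pair to its technical field (`field_by_pair`), and the keyword table (`keywords_by_field`) are all assumptions introduced here, since the embodiment does not specify how association or technical fields are determined.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]


def find_associations(
    first_set: List[str],
    second_set: List[str],
    is_associated: Callable[[str, str], bool],
) -> List[Pair]:
    """Check every (first, second) pair and keep the associated ones."""
    return [(a, b) for a, b in product(first_set, second_set) if is_associated(a, b)]


def assign_labels(
    pairs: List[Pair],
    field_of: Callable[[str, str], str],
    keywords_by_field: Dict[str, List[str]],
) -> Dict[Pair, List[str]]:
    """Use the keywords of each pair's technical field as its description labels."""
    return {pair: keywords_by_field.get(field_of(*pair), []) for pair in pairs}


# Illustrative inputs only; the association rule and keyword table are assumptions.
first_set = ["bank customer service", "commercial banking service"]
second_set = ["artificial intelligence", "big data"]
known_pairs = {
    ("bank customer service", "artificial intelligence"),
    ("commercial banking service", "big data"),
}
field_by_pair = {
    ("bank customer service", "artificial intelligence"): "intelligent customer service",
    ("commercial banking service", "big data"): "bank big data",
}
keywords_by_field = {
    "intelligent customer service": [
        "artificial intelligence", "bank customer service", "intelligent customer service",
    ],
    "bank big data": ["big data", "commercial bank"],
}

pairs = find_associations(first_set, second_set, lambda a, b: (a, b) in known_pairs)
labels = assign_labels(pairs, lambda a, b: field_by_pair[(a, b)], keywords_by_field)
```

The association test and the field lookup are passed in as callables so that any concrete rule (manual curation, text similarity, and so on) could be substituted without changing the pairwise traversal.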
The object in the embodiment of the application can be an industry, an industry sector, and the like.
For example, if the first object is the financial industry and the second object is the emerging technology industry (a technical field), the two objects can be associated with each other, since emerging technologies can be applied in the financial industry. Crossing the two associated objects yields a third object, for example the financial technology industry.
Each object has its own description information, which can be organized into levels (primary, secondary, tertiary, and so on); the levels and the description information at each level are determined according to how the object is actually divided.
The financial industry and the emerging technology industry are again taken as examples below:
the description information of the financial industry includes the description information corresponding to its business chain and to its industry chain. The description information corresponding to the industry chain of the financial industry is shown in Table 1.
TABLE 1
The description of the industry chain of the financial industry in table 1 is divided into two levels, the first level includes 6 types of description information, and the description information corresponding to each first level includes a plurality of second levels of description information.
The description information corresponding to the business chain of the financial industry is shown in table 2, and table 2 is the content corresponding to the description information of the business chain of the financial industry.
TABLE 2
The description of the business chain of the financial industry in table 2 is divided into two levels, the first level includes 6 types of description information, and the description information corresponding to each first level includes a plurality of second levels of description information.
The description information corresponding to the scientific and technological industry chain is shown in table 3, and table 3 shows the content corresponding to the description information of the scientific and technological industry.
TABLE 3
The description information of the science and technology industry chain in Table 3 is divided into three levels.
The description information of the financial industry is crossed with that of the science and technology industry to obtain a complete picture of the financial technology industry. This can be done with a coordinate system: each piece of description information of the financial industry (primary and secondary) is placed on one axis, each piece of description information of the emerging technology industry (primary, secondary, and tertiary) is placed on the other axis, and the description information on the two axes is checked pair by pair to determine whether an association relationship exists.
For example, artificial intelligence in the science and technology industry chain, corresponding to the first level in Table 3, is associated with bank customer service in the financial industry chain, corresponding to the third level in Table 2, and the following labels can be assigned: artificial intelligence, bank customer service, and intelligent customer service. Artificial intelligence can be expanded into terms such as robots, biometric recognition, natural language processing, and knowledge graphs; bank customer service can be expanded into terms such as bank customer service, bank machine customer service, and bank electronic customer service; intelligent customer service can be expanded into terms such as bank intelligent customer service robot and AI customer service.
For another example, the big data industry in the science and technology industry chain, corresponding to the first level in Table 3, is associated with the commercial banking service in the financial industry chain, corresponding to the second level in Table 1, and the labels big data and commercial bank can be assigned. Big data can be expanded into big data analysis, big data marketing, big data risk control, big data credit investigation, big data platform, and so on; commercial bank can be expanded into the services provided for commercial banks or the businesses they carry out, such as deposits, loans, and financing. Big data services related to commercial banks can be expanded into bank big data marketing, bank big data risk control, bank big data middle platform, and the like.
Any enterprise that satisfies both a financial-chain expanded term and a science-and-technology expanded term, or that satisfies a financial-technology expanded term, can be incorporated into the financial technology industry chain ecosystem; this is not limited in the embodiment of the present application.
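The inclusion rule above can be sketched as a small predicate. This is only an illustration under stated assumptions: the function names `contains_any` and `in_fintech_chain` are hypothetical, and matching by case-insensitive substring is one possible choice, since the embodiment does not fix a matching technique.

```python
from typing import Iterable


def contains_any(text: str, terms: Iterable[str]) -> bool:
    """Return True if any term occurs in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return any(term.lower() in lowered for term in terms)


def in_fintech_chain(
    description: str,
    financial_terms: Iterable[str],
    technology_terms: Iterable[str],
    fintech_terms: Iterable[str],
) -> bool:
    """Apply the rule: (financial term AND technology term) OR fintech term."""
    both = contains_any(description, financial_terms) and contains_any(
        description, technology_terms
    )
    return both or contains_any(description, fintech_terms)


# Expanded terms taken from the examples in the text; the descriptions are made up.
financial_terms = ["deposit", "loan", "financing"]
technology_terms = ["big data analysis", "big data marketing"]
fintech_terms = ["bank big data marketing"]

included = in_fintech_chain(
    "Big data analysis for loan approval", financial_terms, technology_terms, fintech_terms
)
excluded = in_fintech_chain(
    "Consumer loan broker", financial_terms, technology_terms, fintech_terms
)
```

A description that matches only a financial term, as in the second call, is not included, because the rule requires the technology term as well unless a combined fintech term is present.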
Step 102, acquiring a main body whose description information matches the description label, and data corresponding to the attribute information of the main body.
In the embodiment of the application, the main body can be an enterprise, a company, an e-commerce business, and the like.
A database is stored in the device or on another server, and the required information can be obtained from the database when needed.
The database stores the identifier of each main body and the data corresponding to the attribute information of the main body.
If the main body is an enterprise, the attribute information of the main body may include industrial and commercial registration data, financial report data, recruitment data, patent data, news about the enterprise, and policy and news information for the location of the enterprise.
As to how the database obtains the above information, the embodiment of the present application is not limited.
Each main body also has its own description information. All description labels obtained in step 101 are matched against this description information; any main body whose description information matches a description label is a main body for which data statistics are to be collected in the embodiment of the present application.
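Step 102 can be illustrated with an in-memory stand-in for the database. Everything below is hypothetical: the `DATABASE` dictionary, its record layout, the identifiers, and the attribute values are made up for illustration, and `select_main_bodies` is a name introduced here rather than part of the embodiment.

```python
from typing import Any, Dict, List

# Hypothetical stand-in for the database: main body identifier -> record.
# All records and numbers are invented for illustration.
DATABASE: Dict[str, Dict[str, Any]] = {
    "E001": {
        "description": "AI customer service robots for banks",
        "attributes": {"patent_count": 12, "employee_count": 300},
    },
    "E002": {
        "description": "Regional furniture retailer",
        "attributes": {"patent_count": 0, "employee_count": 45},
    },
    "E003": {
        "description": "Bank big data marketing platform",
        "attributes": {"patent_count": 5, "employee_count": 120},
    },
}


def select_main_bodies(
    database: Dict[str, Dict[str, Any]], labels: List[str]
) -> Dict[str, Dict[str, Any]]:
    """Return the attribute data of every main body whose description matches a label."""
    lowered = [label.lower() for label in labels]
    return {
        body_id: record["attributes"]
        for body_id, record in database.items()
        if any(label in record["description"].lower() for label in lowered)
    }


selected = select_main_bodies(DATABASE, ["AI customer service", "big data marketing"])
```

The result maps each matched identifier to its attribute data, which is the input that the statistics of step 103 would operate on.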
Step 103, counting the data to obtain a plurality of index values, and performing indexing on the index values.
Taking the financial technology industry as an example, the indexes to be obtained by statistics are the values of 17 third-level indexes (number of enterprises, number of listed enterprises, number of industrial parks, number of employees, revenue of listed enterprises, profit of listed enterprises, financing amount, number of financing deals, financing maturity, number of financial institutions, number of policies, search index, news popularity, number of patents, average patent quality, demand for technical personnel, and number of enterprises holding software copyrights).
Because the indexes differ in unit and magnitude, the index values are indexed (normalized) so that all of them are mapped to values between 0 and 1.
The specific indexing in the embodiments of the present application is implemented as follows:
$$y_{itj} = \frac{x_{itj} - \min(x_{nj})}{\max(x_{nj}) - \min(x_{nj})}$$
where $y_{itj}$ represents the normalized index value of the j-th third-level index for the i-th region/field in the t-th period; $x_{itj}$ is the specific value of the j-th third-level index for the i-th region/field in the t-th period; $\min(x_{nj})$ is the minimum value of the j-th third-level index over all regions/fields with year n as the base; and $\max(x_{nj})$ is the maximum value of the j-th third-level index over all regions/fields with year n as the base. The base year n may be 2009, 2019, and so on, and the data of the corresponding year can be determined according to actual needs.
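The indexing step can be sketched in a few lines, assuming the min-max form (x - min) / (max - min) with the minimum and maximum taken over all regions in the base year, as the symbol definitions suggest. The function name `index_value` and the sample numbers are assumptions made for illustration.

```python
from typing import Iterable


def index_value(x: float, base_year_values: Iterable[float]) -> float:
    """Index one raw value against the base-year range over all regions/fields."""
    base = list(base_year_values)
    lo, hi = min(base), max(base)
    if hi == lo:
        # The formula is undefined when all base-year values coincide.
        raise ValueError("base-year values are all equal, so the range is zero")
    return (x - lo) / (hi - lo)


# Made-up base-year values of one third-level index (for example, number of patents).
base_year = {"region_a": 20.0, "region_b": 50.0, "region_c": 120.0}
indexed = {
    region: index_value(x, base_year.values()) for region, x in base_year.items()
}
```

Within the base year the results lie between 0 and 1. A value from another period that falls outside the base-year range would give a result below 0 or above 1 under this form, which is worth keeping in mind when choosing the base year.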
Step 104, carrying out weighted summation on the indexed index values to obtain a development index value of a third object; the third object is the object obtained after the first object and the second object are associated.
In the embodiment of the application, after the 17 indexed values of the third-level indexes are obtained, they are weighted according to preset weight values to calculate 4 second-level index values, and the development index value of the financial technology industry is then calculated by weighting the 4 second-level index values.
The specific calculation formula is given below:
After the indexed values of the third-level indexes are obtained, each second-level index is calculated by the weighted average method, with the following calculation formula:
$$Y_I = \sum_{i=1}^{m} a_i \, y_i$$
where $Y_I$ denotes the value of the I-th second-level index; $y_i$ is the indexed value of the i-th third-level index under that second-level index (region and period subscripts omitted); $a_i$ is the weight of the i-th third-level index; and m is the number of third-level indexes corresponding to the second-level index being calculated.
After the second-level indexes are obtained, the final financial technology industry development index FII is calculated by weighted average, with the following calculation formula:
$$FII = \sum_{I=1}^{k} A_I \, Y_I$$
where $A_I$ is the weight of the I-th second-level index, and k is the number of second-level indexes.
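The two-stage weighting can be sketched as follows, assuming each stage is a plain weighted sum with preset weights that add up to 1 within each group. The function names and all numbers are illustrative assumptions; the example uses two second-level indexes with two third-level indexes each for brevity, rather than the 4 and 17 of the embodiment.

```python
from typing import List, Sequence, Tuple


def weighted_sum(values: Sequence[float], weights: Sequence[float]) -> float:
    """Return the sum of weight * value over paired entries."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * v for w, v in zip(weights, values))


def development_index(
    groups: List[Tuple[Sequence[float], Sequence[float]]],
    group_weights: Sequence[float],
) -> Tuple[List[float], float]:
    """Aggregate third-level values into second-level indexes, then into one index.

    Each entry of `groups` holds the indexed third-level values of one
    second-level index together with their weights.
    """
    second_level = [weighted_sum(values, weights) for values, weights in groups]
    return second_level, weighted_sum(second_level, group_weights)


# Made-up indexed values and weights, for illustration only.
groups = [
    ([0.25, 0.75], [0.5, 0.5]),
    ([1.0, 0.0], [0.25, 0.75]),
]
second_level, fii = development_index(groups, [0.5, 0.5])
```

With these inputs the second-level values are 0.5 and 0.25, and the overall index is 0.375. Keeping the weights as data rather than constants makes it straightforward to substitute the weights of an actual weighting table.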
A specific numerical example is given below, see Table 4, where Table 4 shows the composition and weight of specific indices of financial science and technology development index.
TABLE 4
In a specific implementation, the development index value of the financial technology industry can also be calculated directly by weighting the 17 indexed values, without first calculating the second-level index values. This is not limited in the embodiments of the present application.
Step 105, determining the development status of the third object according to the development index value.
In the embodiment of the present application, the larger the development index value, the better the development status of the object.
In the embodiment of the application, the determined development index value of the object can be used to compare the development status of the object in different time periods, the development status of the same object in different regions, and the like.
The first application example compares the development status of the same object in different time periods:
acquiring an association relation between description information of a first object and description information of a second object, and distributing a description label for the association relation;
acquiring a main body of which the description information is matched with the description label and data corresponding to the attribute information of the main body;
counting the data within a first preset time to obtain a plurality of index values, and performing indexing on the index values; weighting and summing the indexed index values to obtain a development index value corresponding to the third object in first preset time;
counting the data within a second preset time to obtain a plurality of index values, and performing indexing on the index values; weighting and summing the indexed index values to obtain a development index value corresponding to the third object in second preset time;
comparing the development index value of the third object at a second preset time with the development index value at a first preset time;
when the development index value of the third object corresponding to the second preset time is determined to be larger than the development index value corresponding to the first preset time, determining that the development status of the third object in the second preset time is better than that in the first preset time;
when the development index value of the third object corresponding to the second preset time is determined to be smaller than the development index value corresponding to the first preset time, determining that the development status of the third object in the first preset time is better than that in the second preset time;
when the development index value of the third object corresponding to the second preset time is determined to be equal to the development index value corresponding to the first preset time, determining that the development status of the third object in the first preset time is the same as that in the second preset time.
The second application example compares the development status of the same object in different regions:
acquiring an association relation between description information of a first object and description information of a second object, and distributing a description label for the association relation;
acquiring a main body of which the description information is matched with the description label and data corresponding to the attribute information of the main body;
counting the data in a first preset region to obtain a plurality of index values, and performing indexing on the index values; weighting and summing the indexed index values to obtain a development index value corresponding to the third object in a first preset region;
counting the data in a second preset region to obtain a plurality of index values, and performing indexing on the index values; weighting and summing the indexed index values to obtain a development index value corresponding to the third object in a second preset region;
comparing the development index value of the third object corresponding to the second preset region with the development index value corresponding to the first preset region;
when the development index value of the third object in the second preset region is determined to be larger than the development index value in the first preset region, determining that the development condition of the third object in the second preset region is better than that in the first preset region;
when the development index value of the third object corresponding to the second preset region is determined to be smaller than the development index value corresponding to the first preset region, determining that the development condition of the third object in the first preset region is better than that in the second preset region;
when it is determined that the development index value of the third object corresponding to the second preset region is equal to the development index value corresponding to the first preset region, it is determined that the development condition of the third object in the first preset region is the same as the development condition in the second preset region.
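Both application examples reduce to comparing two development index values of the same object, whether the two values come from two time periods or from two regions. A minimal sketch follows; the function name `compare_development`, the returned strings, and the optional `tolerance` parameter for floating-point equality are assumptions added here, and the index values are made up.

```python
def compare_development(first: float, second: float, tolerance: float = 0.0) -> str:
    """Compare two development index values of the same object.

    `first` and `second` are the index values for the first and second
    preset time (or region). Values within `tolerance` are treated as equal.
    """
    if abs(second - first) <= tolerance:
        return "same"
    return "second better" if second > first else "first better"


# Made-up index values for two periods (or two regions).
result_growth = compare_development(0.42, 0.57)
result_decline = compare_development(0.57, 0.42)
result_equal = compare_development(0.50, 0.50)
```

With the default tolerance of 0.0 the function follows the three cases in the text exactly; a small positive tolerance may be preferable when the index values are computed in floating point.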
To sum up, the present application obtains data on the full set of enterprises by sorting out the key links of the entire financial technology industry chain. The platform uses artificial intelligence and big data methods to determine the industry label and business label of each enterprise from the description text of the enterprise. The enterprises of the traditional financial industry chain and the enterprises of the high and new technology industry chain are combined to obtain the financial technology industry chain, the enterprises belonging to that chain are identified in bulk, and the development status of the object formed by fusing the two industries is obtained and measured through statistics on the data of the related enterprises.
Based on the same inventive concept, the embodiment of the application also provides a data processing device. Fig. 2 is a schematic structural diagram of a device to which the above technology is applied in the embodiment of the present application. The device includes: a first acquisition unit 201, an allocation unit 202, a second acquisition unit 203, a statistics unit 204, a calculation unit 205, and a determination unit 206;
the first acquisition unit 201 is configured to acquire an association relationship between description information of a first object and description information of a second object;
the allocation unit 202 is configured to allocate a description label to the association relationship acquired by the first acquisition unit 201;
the second acquisition unit 203 is configured to acquire a main body whose description information matches the description label allocated by the allocation unit 202, and data corresponding to attribute information of the main body;
the statistics unit 204 is configured to count the data acquired by the second acquisition unit 203 to obtain a plurality of index values, and to perform indexing on the index values;
the calculation unit 205 is configured to perform weighted summation on the index values indexed by the statistics unit 204 to obtain a development index value of a third object; the third object is the object obtained after the first object and the second object are associated;
the determination unit 206 is configured to determine a development status of the third object according to the development index value obtained by the calculation unit 205.
Preferably,
the first acquisition unit 201 is specifically configured to, when acquiring the association relationship between the description information of the first object and the description information of the second object, perform the following: acquiring all description information of the first object as a first description information set; acquiring all description information of the second object as a second description information set; respectively determining whether an association relationship exists between each piece of description information in the first description information set and each piece of description information in the second description information set; and acquiring the description information in the first description information set and the description information in the second description information set which have the association relationship.
Preferably,
the allocation unit 202 is specifically configured to, when allocating the description label to the association relationship, perform the following: determining the technical field according to the description information in the first description information set and the description information in the second description information set corresponding to the association relationship, acquiring related keywords in the corresponding technical field, and taking the keywords as the description labels of the association relationship.
The units of the above embodiments may be integrated into one body, or may be separately deployed; may be combined into one unit or further divided into a plurality of sub-units.
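The division into units can be pictured as a pipeline of interchangeable components. The sketch below is an assumption-laden illustration, not the structure of the device: the class `DataProcessingDevice`, its field names, and the stub callables are all hypothetical, and each stub merely returns made-up data so that the flow from unit 201 to unit 206 can be followed.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class DataProcessingDevice:
    """Hypothetical composition of the six units as injected callables."""

    first_acquisition: Callable[[], Any]      # unit 201: association relationships
    allocation: Callable[[Any], Any]          # unit 202: description labels
    second_acquisition: Callable[[Any], Any]  # unit 203: main bodies and their data
    statistics: Callable[[Any], Any]          # unit 204: index values, indexed
    calculation: Callable[[Any], float]       # unit 205: development index value
    determination: Callable[[float], Any]     # unit 206: development status

    def run(self) -> Any:
        """Pass the output of each unit to the next one in order."""
        associations = self.first_acquisition()
        labels = self.allocation(associations)
        data = self.second_acquisition(labels)
        indexed = self.statistics(data)
        index_value = self.calculation(indexed)
        return self.determination(index_value)


# Stub units returning made-up data, for illustration only.
device = DataProcessingDevice(
    first_acquisition=lambda: [("bank customer service", "artificial intelligence")],
    allocation=lambda pairs: ["intelligent customer service"],
    second_acquisition=lambda labels: {"E001": {"patent_count": 12}},
    statistics=lambda data: [0.25, 0.75],
    calculation=lambda indexed: sum(indexed) / len(indexed),
    determination=lambda value: f"development index {value:.2f}",
)
status = device.run()
```

Because each unit is supplied separately, the same pipeline could run with all units in one process or with some of them replaced by calls to separately deployed services.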
In another embodiment, an electronic device is also provided, comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the steps of the data processing method when executing the program.
In another embodiment, a computer readable storage medium is also provided, having stored thereon computer instructions, which when executed by a processor, may implement the steps in the data processing method.
Fig. 3 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention. As shown in Fig. 3, the electronic device may include: a processor 310, a communication interface 320, a memory 330, and a communication bus 340, wherein the processor 310, the communication interface 320, and the memory 330 communicate with each other via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform the following method:
acquiring an association relation between description information of a first object and description information of a second object, and distributing a description label for the association relation;
acquiring a main body of which the description information is matched with the description label and data corresponding to the attribute information of the main body;
counting the data to obtain a plurality of index values, and indexing the index values;
carrying out weighted summation on the indexed index values to obtain a development index value of a third object; the third object is the object obtained after the first object and the second object are associated;
determining a development status of the third object according to the development index value.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other various media capable of storing program codes.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.