Disclosure of Invention
In order to solve the technical problem, the invention provides a data classification method and system based on keywords, wherein the data classification method comprises the following steps:
a keyword pool establishing step: classifying given keywords and establishing a plurality of keyword pools;
a child node building step: expanding child nodes on the basis of the keyword pools and constructing a configuration table;
a data storage step: establishing a data table, and storing data to be classified into the data table;
a keyword storage step: storing the keywords in a doubly linked list according to the priority in the configuration table;
a classification step: classifying the data to be classified in the data table according to the doubly linked list.
In the keyword-based data classification method, the child node building step comprises:
expanding the child nodes according to the keyword pool, summarizing the keywords of the child nodes, and setting the priority of the child nodes.
In the keyword-based data classification method, the information stored by each node of the doubly linked list includes: node name, node keyword, parent node and child node.
In the keyword-based data classification method, each keyword pool is matched with one doubly linked list.
In the keyword-based data classification method, the classification step comprises:
a keyword pool classifying step: classifying the data to be classified into a keyword pool according to the priority of the keyword pool in the configuration table;
a node classifying step: within that keyword pool, classifying the data to be classified into one node according to the node priority of the keyword pool in the configuration table.
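By way of illustration only, the two classification sub-steps above may be sketched in Python as follows. The pool names, node names and keywords, the helper names `matches` and `classify`, substring matching as the match condition, and the convention that a smaller priority number is checked first are all assumptions made for this example; they are not taken from the configuration table of the invention.

```python
from typing import Optional

# Hypothetical configuration: every pool and node carries keywords and a
# priority. Checking the smaller priority number first is an assumption.
POOLS = [
    {"name": "milk powder", "priority": 1,
     "keywords": ["milk powder", "formula"],
     "nodes": [
         {"name": "infant", "priority": 1, "keywords": ["infant", "baby"]},
         {"name": "adult", "priority": 2, "keywords": ["adult"]},
     ]},
    {"name": "diapers", "priority": 2,
     "keywords": ["diaper"],
     "nodes": [
         {"name": "pull-ups", "priority": 1, "keywords": ["pull-up"]},
     ]},
]


def matches(text: str, keywords: list[str]) -> bool:
    """Return True if any keyword occurs in the text (case-insensitive)."""
    lowered = text.lower()
    return any(k.lower() in lowered for k in keywords)


def classify(text: str, pools: list[dict]) -> tuple[Optional[str], Optional[str]]:
    """Pick the first matching pool, then the first matching node in it."""
    for pool in sorted(pools, key=lambda p: p["priority"]):
        if matches(text, pool["keywords"]):
            for node in sorted(pool["nodes"], key=lambda n: n["priority"]):
                if matches(text, node["keywords"]):
                    return pool["name"], node["name"]
            return pool["name"], None  # pool hit, but no node matched
    return None, None  # no pool matched
```

In this sketch a record that matches a pool but none of its nodes is reported with the pool name only; how such records should be handled in practice is a business decision that the text leaves open.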
The invention also provides a data classification system based on keywords, which comprises:
a keyword pool establishing module, which classifies given keywords and establishes a plurality of keyword pools;
a child node building module, which expands child nodes on the basis of the keyword pools and constructs a configuration table;
a data storage module, which establishes a data table and stores the data to be classified into the data table;
a keyword storage module, which stores the keywords in a doubly linked list according to the priority in the configuration table;
a classification module, which classifies the data to be classified in the data table according to the doubly linked list.
In the keyword-based data classification system, the child node building module expands the child nodes according to the keyword pool, summarizes the keywords of the child nodes, and sets the priority of the child nodes.
In the keyword-based data classification system, the information stored by each node of the doubly linked list includes: node name, node keyword, parent node and child node.
In the keyword-based data classification system, each keyword pool is matched with one doubly linked list.
In the keyword-based data classification system, the classification module comprises:
a keyword pool classifying unit, which classifies the data to be classified into a keyword pool according to the priority of the keyword pool in the configuration table;
a node classifying unit, which, within that keyword pool, classifies the data to be classified into one node according to the node priority of the keyword pool in the configuration table.
The invention has the following beneficial effects:
1. When the volume of data is large, the program operation efficiency is improved.
2. The method is flexible and adaptable, and the accuracy of data classification is improved.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms referred to herein shall have the ordinary meaning as understood by those of ordinary skill in the art to which this application belongs. References to "a," "an," "the," and similar words throughout this application are not to be construed as limiting in number, and may refer to the singular or the plural. The terms "including," "comprising," "having," and any variations thereof, as used in the present application, are intended to cover non-exclusive inclusions; for example, a process, method, system, article, or apparatus that comprises a list of steps or modules (elements) is not limited to the listed steps or elements, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. References to "connected," "coupled," and the like in this application are not intended to be limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The term "plurality" as referred to herein means two or more. "And/or" describes an association relationship of associated objects, meaning that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. The character "/" generally indicates that the former and latter associated objects are in an "or" relationship. The terms "first," "second," "third," and the like herein merely distinguish similar objects and do not denote a particular ordering of the objects.
The present invention is described in detail with reference to the embodiments shown in the drawings, but it should be understood that these embodiments are not intended to limit the present invention, and those skilled in the art should understand that functional, methodological, or structural equivalents or substitutions made by these embodiments are within the scope of the present invention.
Before describing in detail the various embodiments of the present invention, the core inventive concepts of the present invention are summarized and described in detail by the following several embodiments.
Referring to fig. 1, fig. 1 is a flowchart of a data classification method based on keywords. As shown in fig. 1, the data classification method based on keywords of the present invention includes:
keyword pool establishing step S1: classifying the given keywords and establishing a plurality of keyword pools;
child node building step S2: expanding child nodes on the basis of the keyword pools and constructing a configuration table;
data storage step S3: establishing a data table, and storing data to be classified into the data table;
keyword storage step S4: storing the keywords in a doubly linked list according to the priority in the configuration table;
classification step S5: classifying the data to be classified in the data table according to the doubly linked list.
The child node building step comprises: expanding the child nodes according to the keyword pool, summarizing the keywords of the child nodes, and setting the priority of the child nodes.
The information stored by each node of the doubly linked list includes: node name, node keyword, parent node and child node.
Each keyword pool is matched with one doubly linked list.
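As a non-limiting sketch, the node layout and the per-pool doubly linked list described above might look as follows in Python. The class names `Node` and `KeywordList`, the `priority` field, the `prev`/`next` links and the choice to keep the list in ascending priority order are assumptions made for illustration; the text itself names only the node name, node keyword, parent node and child node.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # Fields named in the text.
    name: str
    keywords: list[str]
    parent: Optional[str] = None
    children: list[str] = field(default_factory=list)
    # Assumed additions: a priority copied from the configuration table
    # and the two links that make the list doubly linked.
    priority: int = 0
    prev: Optional["Node"] = field(default=None, repr=False, compare=False)
    next: Optional["Node"] = field(default=None, repr=False, compare=False)


class KeywordList:
    """One doubly linked list per keyword pool, kept in ascending priority."""

    def __init__(self, pool_name: str) -> None:
        self.pool_name = pool_name
        self.head: Optional[Node] = None
        self.tail: Optional[Node] = None

    def insert(self, node: Node) -> None:
        """Insert a node before the first node with a larger priority."""
        cur = self.head
        while cur is not None and cur.priority <= node.priority:
            cur = cur.next
        if cur is None:  # append at the tail (or into an empty list)
            node.prev, node.next = self.tail, None
            if self.tail is not None:
                self.tail.next = node
            else:
                self.head = node
            self.tail = node
        else:  # splice in front of cur
            node.prev, node.next = cur.prev, cur
            if cur.prev is not None:
                cur.prev.next = node
            else:
                self.head = node
            cur.prev = node

    def forward(self) -> list[str]:
        """Node names from head to tail."""
        names, cur = [], self.head
        while cur is not None:
            names.append(cur.name)
            cur = cur.next
        return names

    def backward(self) -> list[str]:
        """Node names from tail to head."""
        names, cur = [], self.tail
        while cur is not None:
            names.append(cur.name)
            cur = cur.prev
        return names
```

Keeping the list sorted at insertion time means that the classification step can simply walk from the head and stop at the first node that matches.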
Referring to fig. 2, fig. 2 is a flowchart illustrating the sub-steps of step S5 in fig. 1. As shown in fig. 2, the classification step S5 includes:
keyword pool classifying step S51: classifying the data to be classified into a keyword pool according to the priority of the keyword pool in the configuration table;
node classification step S52: classifying the data to be classified into one node according to the node priority of the keyword pool in the configuration table based on the keyword pool.
Hereinafter, the keyword-based data classification method of the present invention will be described in detail with reference to examples.
Example one:
This example discloses a specific implementation of a statistics-based, keyword-based data classification method (hereinafter "method").
1. According to business needs, the given keywords are first classified and a plurality of keyword pools are established. Child nodes are expanded from each pool, the keywords of each child node are collected, and a priority is set for each child node. Each child node may in turn be expanded into a plurality of child nodes; the number of levels of child nodes is determined by business needs. A configuration table is then established in the database, as shown in fig. 3, and the classified keywords are stored in it.
2. A data table is established, and the data to be classified is stored in the database.
3. In the program, a doubly linked list is used to store the keyword configuration. The information stored by each node of the linked list comprises: node name, node keyword, parent node and child node. The nodes are stored in the linked list according to the priority in the configuration table. Each pool corresponds to one linked list.
4. The data is traversed, and it is judged whether the current data meets the condition of a certain pool; if so, the data is filtered using that pool.
The nodes are then checked one by one in linked-list order to judge whether the current data meets the requirement of a node; once a node matches, checking stops, and the classification of the data is obtained from the information of that node.
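Steps 1 to 4 above can be sketched end to end with Python's built-in `sqlite3` module. This is only an illustration under stated assumptions: the table names `keyword_config` and `data_to_classify`, their columns, the comma-separated keyword storage and the sample rows are hypothetical and do not reproduce the configuration table of fig. 3; the pool condition is assumed to be "the pool name occurs in the content"; and a plain priority-ordered Python list stands in for each pool's doubly linked list to keep the sketch short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Steps 1 and 2: a configuration table and a data table (hypothetical layout).
conn.executescript("""
    CREATE TABLE keyword_config (
        pool TEXT, node TEXT, keywords TEXT, priority INTEGER
    );
    CREATE TABLE data_to_classify (
        id INTEGER PRIMARY KEY, content TEXT, category TEXT
    );
""")
conn.executemany(
    "INSERT INTO keyword_config VALUES (?, ?, ?, ?)",
    [
        ("milk powder", "infant", "infant,baby", 1),
        ("milk powder", "adult", "adult,senior", 2),
        ("diapers", "pull-ups", "pull-up", 1),
    ],
)
conn.executemany(
    "INSERT INTO data_to_classify (content) VALUES (?)",
    [("baby milk powder 800g",), ("senior milk powder",), ("usb cable",)],
)

# Step 3: load the configuration, one priority-ordered sequence per pool.
pools: dict[str, list[tuple[str, list[str]]]] = {}
for pool, node, keywords, _ in conn.execute(
    "SELECT pool, node, keywords, priority FROM keyword_config "
    "ORDER BY pool, priority"
):
    pools.setdefault(pool, []).append((node, keywords.split(",")))

# Step 4: traverse the data; filter by pool, then stop at the first node hit.
rows = conn.execute("SELECT id, content FROM data_to_classify").fetchall()
for row_id, content in rows:
    category = None
    for pool, nodes in pools.items():
        if pool not in content:  # assumed pool condition
            continue
        for node, keywords in nodes:
            if any(k in content for k in keywords):
                category = f"{pool}/{node}"
                break  # stop judging once a node matches
        break  # only the first matching pool is used
    conn.execute(
        "UPDATE data_to_classify SET category = ? WHERE id = ?",
        (category, row_id),
    )

result = conn.execute(
    "SELECT content, category FROM data_to_classify ORDER BY id"
).fetchall()
```

With the sample rows above, `result` pairs each record with a `pool/node` label, or with `None` when no pool applies.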
As shown in fig. 6:
The data to be classified is matched against the keywords in order of priority from small to large. If the data matches the milk powder pool, it enters the milk powder pool, and matching continues to determine which level 1 child node is hit; the same steps are repeated to enter a level 2 child node and then a level 3 child node. The number of child node levels is determined by business needs.
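The level-by-level descent just described may be illustrated with the following minimal Python sketch. The `TreeNode` class, the `descend` function, the sample node names under the milk powder pool and substring matching as the hit condition are assumptions introduced for the example only.

```python
from dataclasses import dataclass, field


@dataclass
class TreeNode:
    name: str
    keywords: list[str]
    priority: int  # smaller value is tried first, as in the text
    children: list["TreeNode"] = field(default_factory=list)


def descend(text: str, candidates: list[TreeNode]) -> list[str]:
    """Return the path of matched node names, from the pool downwards."""
    path: list[str] = []
    while candidates:
        # Try the candidates of the current level in ascending priority.
        hit = next(
            (n for n in sorted(candidates, key=lambda n: n.priority)
             if any(k in text for k in n.keywords)),
            None,
        )
        if hit is None:
            break  # nothing matched on this level
        path.append(hit.name)
        candidates = hit.children  # repeat on the next level
    return path


# Hypothetical three-level configuration for the milk powder pool.
milk_powder = TreeNode("milk powder", ["milk powder"], 1, [
    TreeNode("infant", ["infant", "baby"], 1, [
        TreeNode("stage 1", ["stage 1"], 1),
        TreeNode("stage 2", ["stage 2"], 2),
    ]),
    TreeNode("adult", ["adult"], 2),
])
pools = [milk_powder, TreeNode("diapers", ["diaper"], 2)]
```

The loop ends either when a level yields no hit or when the matched node has no further children, so the depth of the configured tree, and not the code, determines how many levels are examined.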
Example two:
Referring to fig. 4, fig. 4 is a schematic structural diagram of the keyword-based data classification system according to the present invention. As shown in fig. 4, the keyword-based data classification system of the present invention includes:
a keyword pool establishing module, which classifies given keywords and establishes a plurality of keyword pools;
a child node building module, which expands child nodes on the basis of the keyword pools and constructs a configuration table;
a data storage module, which establishes a data table and stores the data to be classified into the data table;
a keyword storage module, which stores the keywords in a doubly linked list according to the priority in the configuration table;
a classification module, which classifies the data to be classified in the data table according to the doubly linked list.
The child node building module expands the child nodes according to the keyword pool, summarizes the keywords of the child nodes, and sets the priority of the child nodes.
The information stored by each node of the doubly linked list includes: node name, node keyword, parent node and child node.
Each keyword pool is matched with one doubly linked list.
Wherein the classification module comprises:
the keyword pool classifying unit classifies the data to be classified into a keyword pool according to the priority of the keyword pool in the configuration table;
and the node classifying unit classifies the data to be classified into one node according to the node priority of the keyword pool in the configuration table based on the keyword pool.
Example three:
Referring to fig. 5, this example discloses a specific implementation of a computer device. The computer device may comprise a processor 81 and a memory 82 in which computer program instructions are stored.
Specifically, the processor 81 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
The memory 82 may include, among other things, mass storage for data or instructions. By way of example, and not limitation, the memory 82 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory 82 may include removable or non-removable (or fixed) media, where appropriate. The memory 82 may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 82 is a non-volatile memory. In particular embodiments, the memory 82 includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory (FLASH), or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode Dynamic Random-Access Memory (FPMDRAM), an Extended Data Out Dynamic Random-Access Memory (EDODRAM), a Synchronous Dynamic Random-Access Memory (SDRAM), and the like.
The memory 82 may be used to store or cache various data files for processing and/or communication use, as well as possible computer program instructions executed by the processor 81.
The processor 81 reads and executes computer program instructions stored in the memory 82 to implement any of the above-described embodiments of the statistics-based, keyword-based data classification method.
In some of these embodiments, the computer device may also include a communication interface 83 and a bus 80. As shown in fig. 5, the processor 81, the memory 82, and the communication interface 83 are connected via the bus 80 to complete communication with one another.
The communication interface 83 is used for implementing communication between modules, devices, units and/or equipment in the embodiments of the present application. The communication interface 83 may also carry out data communication with other components, such as external equipment, image/data acquisition equipment, a database, external storage, an image/data processing workstation and the like.
The bus 80 includes hardware, software, or both to couple the components of the computer device to each other. The bus 80 includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example, and not limitation, the bus 80 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. The bus 80 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The computer device may execute the statistics-based, keyword-based data classification method, thereby implementing the methods described in conjunction with fig. 1-2.
In addition, in combination with the statistics-based, keyword-based data classification method in the foregoing embodiments, embodiments of the present application may provide a computer-readable storage medium to implement the method. The computer-readable storage medium has computer program instructions stored thereon; the computer program instructions, when executed by a processor, implement any of the above embodiments of the statistics-based, keyword-based data classification method.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
In summary, the invention provides a statistics-based, keyword-based data classification method with the following beneficial effects: when the volume of data is large, the program operation efficiency is improved; and the method is flexible and adaptable, improving the accuracy of data classification.
The above-mentioned embodiments express only several implementations of the present application, and their description is specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, and these fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.