URL filtering system and method for filtering URLs
Technical field
The present invention relates to the field of communications, and in particular to a URL (Uniform/Universal Resource Locator) filtering system and a method for filtering URLs.
Background technology
A URL, also called a web page address, is the standard resource address on the Internet: an identification scheme that fully describes the address of a web page or other resource on the Internet. Every web page on the Internet is identified by a unique name, usually called its URL address. The address may point to a local disk or to a computer on a local area network, but most often it points to a website on the Internet. Put simply, a URL is a Web address, commonly known as a "website address".
With the spread of networks, information on the Internet brings ever more convenience to people's life and work, and the number of teenagers who go online keeps growing. The quality of online information is uneven, however; in particular, quite a few websites promote harmful content such as pornography, violence, and superstition. To present teenagers with healthy, positive websites, the URLs they visit need to be filtered so that unhealthy and illegal websites are blocked, thereby protecting teenagers' healthy development.
At present there are three main URL filtering methods:
First, storing URL information in a hash table. This method suits lookups of URLs with different domain names; when the domain names are identical, lookup takes longer;
Second, using a string matching algorithm. This method suits keyword search, but lookup is slow;
Third, using a regular-expression matching algorithm. This method suits lookups of URLs that are not fully determined, and its lookup is also slow.
The lookup speed of these existing methods drops significantly as the number of URL records in the URL list grows, so they cannot meet the URL management needs of today's high-throughput networks.
Summary of the invention
The object of the present invention is to provide a URL filtering system and a method for filtering URLs, so as to solve the problem of slow URL lookup in the prior art.
The invention provides a method for filtering URLs, comprising the following steps:
Generating, from a user-defined URL list, a URL rule file that the URL filtering system can recognize, and loading the URL rule file into memory;
When the system receives a message, scanning the message and judging whether it is an HTTP (Hypertext Transfer Protocol) message; if so,
Scanning the URL information in the message and matching it against the URL information in the URL rule file in memory;
Passing or filtering the HTTP message according to the matching result.
Further, after the step of loading the URL rule file into memory, the method also comprises:
Judging whether the user-defined URL list has changed; if so, regenerating a URL rule file that the system can recognize from the changed user-defined URL list, and loading the newly generated URL rule file into memory;
After loading completes, the system uses the new URL rule file in memory for URL information matching and at the same time deletes the old URL rule file from memory.
Further, when the system judges that a received message is not an HTTP message, it passes the message directly.
Further, the user-defined URL list is a blacklist or a white list.
Further, the step of passing or filtering the HTTP message according to the matching result specifically comprises:
When the user-defined URL list is a blacklist: if the URL information of the received HTTP message matches the URL information in the URL rule file in memory, the HTTP message is filtered; if the match fails, the HTTP message is passed;
When the user-defined URL list is a white list: if the URL information of the received HTTP message matches the URL information in the URL rule file in memory, the HTTP message is passed; if the match fails, the HTTP message is filtered.
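The blacklist and white-list rules above reduce to a small decision table. The following Python sketch illustrates it; the names `ListType` and `decide` and the string results "pass" and "filter" are illustrative assumptions, not terms from the invention.

```python
from enum import Enum


class ListType(Enum):
    """Kind of user-defined URL list (hypothetical names for illustration)."""
    BLACKLIST = "blacklist"
    WHITELIST = "whitelist"


def decide(list_type: ListType, matched: bool) -> str:
    """Return "filter" or "pass" for an HTTP message.

    `matched` is True when the URL information of the message matched an
    entry in the URL rule file held in memory.
    """
    if list_type is ListType.BLACKLIST:
        # Blacklist: a match means the URL is forbidden.
        return "filter" if matched else "pass"
    # White list: only matching URLs are allowed through.
    return "pass" if matched else "filter"
```

Non-HTTP messages never reach this decision; as stated earlier, they are passed directly.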
The present invention also provides a URL filtering system, comprising a recognition unit and a memory unit, and further comprising a rule unit, a scanning unit, and a matching unit.
The recognition unit is used to recognize whether a received message is an HTTP message and to send the recognition result to the scanning unit;
The rule unit is used to generate, from a user-defined URL list, a URL rule file that the system can recognize, and to load the URL rule file into the memory unit;
The scanning unit is used to scan a received message and send it to the recognition unit, or to scan the URL information in an HTTP message and send that URL information to the matching unit; and to pass or filter the received message according to the recognition result returned by the recognition unit and the matching result returned by the matching unit;
The matching unit is used to match the received URL information against the URL information in the URL rule file in the memory unit and to send the matching result to the scanning unit.
Further, the rule unit is also used to judge whether the user-defined URL list has changed and, when it has changed, to regenerate a URL rule file that the system can recognize from the changed user-defined URL list, load the newly generated URL rule file into the memory unit, and, once loading succeeds, notify the matching unit to use the new URL rule file for URL information matching.
Further, the matching unit is also used, after receiving the notification from the rule unit, to use the new URL rule file for URL information matching and to delete the old URL rule file from the memory unit.
The present invention further provides a gateway that comprises the URL filtering system described above.
The present invention converts the user-defined URL list into a URL rule file that the hardware of the URL filtering system can recognize and loads it into memory. When a message is received, the system can quickly match the HTTP message against the URL rule file in memory and return a matching result; the scanning and matching speed can reach at least 2 Gbps. There is no need to distinguish between types of URL, which removes the complicated and tedious URL classification and lookup of the existing methods and speeds up URL processing. The present invention supports URL filtering over large data volumes and is suitable for network devices such as an ISG (Integrated Service Gateway), a WAP gateway, and a WEB gateway.
Description of drawings
The accompanying drawings described here provide a further understanding of the present invention and form a part of it. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. In the accompanying drawings:
Fig. 1 is a flowchart of the method for filtering URLs according to the present invention;
Fig. 2 is a block diagram of the URL filtering system according to the present invention;
Fig. 3 is a block diagram of the gateway according to the present invention.
Embodiment
To make the technical problem to be solved, the technical solution, and the beneficial effects of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Fig. 1 shows the flowchart of the method for filtering URLs according to the present invention. This embodiment assumes that the user-defined URL list is a blacklist, and specifically comprises the following steps:
Step S001: generate, from the user-defined blacklist, a URL rule file that the URL filtering system can recognize;
Step S002: load the URL rule file into memory;
Step S003: the system receives a message;
Step S004: scan the message;
Step S005: judge whether the message is an HTTP message; if so, execute step S006; otherwise, execute step S010;
Step S006: scan the URL information in the message;
Step S007: match it against the URL information in the URL rule file in memory;
Step S008: judge whether the match succeeded; if so, execute step S009; otherwise, execute step S010;
Step S009: filter the message;
Step S010: pass the message.
The messages passed in step S010 include both HTTP messages and non-HTTP messages.
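Steps S001 to S010 can be sketched in software as follows. This is only an illustration under stated assumptions: the invention describes a hardware-recognizable rule file, whereas here the rule file is modeled as an in-memory set of exact "host + path" strings; recognizing HTTP by the request method at the start of the message and reading the URL from the request line and the Host header are also assumptions, as are all function names.

```python
# Request methods used to recognize an HTTP request (an assumed heuristic).
HTTP_METHODS = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ", b"OPTIONS ")


def build_rules(blacklist):
    """S001-S002: stand-in for generating the rule file and loading it into memory."""
    return frozenset(u.strip().lower() for u in blacklist if u.strip())


def is_http(message: bytes) -> bool:
    """S005: judge whether the message looks like an HTTP request."""
    return message.startswith(HTTP_METHODS)


def extract_url(message: bytes) -> str:
    """S006: scan the URL information (Host header value plus request path)."""
    head = message.split(b"\r\n\r\n", 1)[0].decode("latin-1")
    lines = head.split("\r\n")
    path = lines[0].split(" ")[1]
    host = ""
    for line in lines[1:]:
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            host = value.strip()
            break
    return (host + path).lower()


def handle(message: bytes, rules) -> str:
    """S004-S010 for a blacklist: return "filter" or "pass"."""
    if not is_http(message):
        return "pass"                      # S010: non-HTTP messages are passed
    if extract_url(message) in rules:      # S007-S008: match against the rules
        return "filter"                    # S009
    return "pass"                          # S010
```

For example, with `rules = build_rules(["bad.example/banned"])`, a request for `/banned` on host `bad.example` is filtered, while a request for any other path, or a non-HTTP message, is passed.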
In other embodiments, when the user-defined URL list is a white list: if the URL information of the received HTTP message matches the URL information in the URL rule file in memory, the HTTP message is passed; if the match fails, the HTTP message is filtered.
In the present invention, while the system is processing messages it also judges whether the user-defined URL list has changed. If it has, the system regenerates a recognizable URL rule file from the changed user-defined URL list and loads the newly generated URL rule file into memory. After loading completes, the system uses the new URL rule file for URL information matching and deletes the old URL rule file. This allows the URL rule file to be updated in real time without interrupting the scanning and matching service.
In a concrete embodiment, two memory areas A and B can be reserved. If the old URL rule file is stored in memory A, then after the user-defined URL list changes, the newly generated URL rule file is loaded into memory B. Once loading completes, the system uses the URL rule file in memory B for URL information matching and, meanwhile, deletes the URL rule file in memory A. When the user-defined URL list changes again, the newly generated URL rule file is loaded into memory A, and so on. In other words, the system carries out two tasks at the same time: one handles the received messages, and the other detects whether the user-defined URL list has changed.
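The alternation between memory A and memory B can be sketched as a two-slot store. This is a minimal single-threaded illustration, assuming the rule file can be modeled as a set of URL strings; the class name `RuleStore` and its members are hypothetical. In a system where matching and reloading really run at the same time, the switch of the active slot would additionally have to be made atomic with respect to readers.

```python
class RuleStore:
    """Two reserved memory slots, A and B; only one is active for matching."""

    def __init__(self, initial_rules):
        self._slots = {"A": frozenset(initial_rules), "B": None}
        self.active = "A"

    def match(self, url: str) -> bool:
        """Match URL information against the rule file in the active slot."""
        return url in self._slots[self.active]

    def reload(self, new_rules) -> None:
        """Load new rules into the spare slot, switch to it, delete the old rules."""
        spare = "B" if self.active == "A" else "A"
        # Load the new rule file completely before it is used for matching.
        self._slots[spare] = frozenset(new_rules)
        old = self.active
        # From here on, matching uses the newly loaded rule file.
        self.active = spare
        # The old rule file is deleted, freeing its slot for the next reload.
        self._slots[old] = None
```

Each call to `reload` flips the active slot, so successive updates land in B, then A, then B again, and matching always sees a fully loaded rule set.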
The present invention is a hardware-based filtering method; compared with the existing software-based methods, it increases the speed at which HTTP messages are processed.
Fig. 2 shows the block diagram of the URL filtering system according to the present invention, which comprises a scanning unit 01, a recognition unit 02, a rule unit 03, a matching unit 04, and a memory unit 05;
The scanning unit 01 is used to scan a received message and send it to the recognition unit 02, or to scan the URL information in an HTTP message and send that URL information to the matching unit 04; and to pass or filter the received message according to the recognition result returned by the recognition unit 02 and the matching result returned by the matching unit 04;
The recognition unit 02 is used to recognize whether a received message is an HTTP message and to send the recognition result to the scanning unit 01;
The rule unit 03 is used to generate, from a user-defined URL list, a URL rule file that the system can recognize, and to load the URL rule file into the memory unit 05; it is also used to judge whether the user-defined URL list has changed and, when it has changed, to regenerate a recognizable URL rule file from the changed user-defined URL list, load the newly generated URL rule file into the memory unit 05, and, after loading completes, notify the matching unit 04 to use the new URL rule file for URL information matching;
The matching unit 04 is used to match the received URL information against the URL information in the URL rule file in the memory unit 05 and to send the matching result to the scanning unit 01; or, on receiving the notification from the rule unit 03, to use the newly loaded URL rule file in the memory unit 05 for URL information matching and to delete the old URL rule file from the memory unit 05.
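The division of work among the five units can be illustrated by wiring them together as small classes. This sketch assumes a blacklist, models the rule file as a set of "host + path" strings, and uses hypothetical class and method names; it shows only which unit talks to which, not the hardware implementation.

```python
class MemoryUnit:
    """Unit 05: holds the loaded URL rule file (modeled as a frozenset)."""
    def __init__(self):
        self.rules = frozenset()


class RuleUnit:
    """Unit 03: turns the user-defined URL list into rules and loads them."""
    def __init__(self, memory: MemoryUnit):
        self.memory = memory

    def load(self, url_list) -> None:
        self.memory.rules = frozenset(url_list)


class RecognitionUnit:
    """Unit 02: reports whether a message is an HTTP message (assumed heuristic)."""
    def is_http(self, message: bytes) -> bool:
        return message.startswith((b"GET ", b"POST ", b"HEAD "))


class MatchingUnit:
    """Unit 04: matches URL information against the rules in the memory unit."""
    def __init__(self, memory: MemoryUnit):
        self.memory = memory

    def match(self, url: str) -> bool:
        return url in self.memory.rules


class ScanningUnit:
    """Unit 01: scans messages and passes or filters them (blacklist assumed)."""
    def __init__(self, recognition: RecognitionUnit, matching: MatchingUnit):
        self.recognition = recognition
        self.matching = matching

    def _url(self, message: bytes) -> str:
        # Host header value plus the request path from the request line.
        lines = message.decode("latin-1").split("\r\n")
        path = lines[0].split(" ")[1]
        host = next((l.split(":", 1)[1].strip() for l in lines[1:]
                     if l.lower().startswith("host:")), "")
        return host + path

    def process(self, message: bytes) -> str:
        if not self.recognition.is_http(message):
            return "pass"
        return "filter" if self.matching.match(self._url(message)) else "pass"
```

Because the rule unit and the matching unit share the same memory unit, rules loaded by the former are immediately visible to the latter.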
Fig. 3 shows the block diagram of the gateway according to the present invention. The gateway comprises the URL filtering system shown in Fig. 2, which in turn comprises the scanning unit 01, the recognition unit 02, the rule unit 03, the matching unit 04, and the memory unit 05. The function of each unit is as described above for Fig. 2 and is not repeated here.
The above description shows and describes preferred embodiments of the present invention. As stated earlier, however, it should be understood that the invention is not limited to the forms disclosed herein, and these forms should not be regarded as excluding other embodiments. The invention can be used in various other combinations, modifications, and environments, and can be changed, within the scope of the inventive concept described herein, through the above teachings or the skill or knowledge of the related art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the scope of protection of the appended claims.