The present invention relates to a network data identification and management system and a method thereof, and in particular to a system and method that determine a program name by analyzing the behavior feature values of an application and use that determination as the basis for subsequent network management decisions.
Since computers became widespread, the main channel for disseminating information has shifted from printed pages to web pages. Entertainment and communication have also changed radically, and today almost any message can be delivered over a network.
This convenience comes at a cost. Managing a local area network, for example, is difficult because an administrator cannot directly know whether managed users are using the network improperly and thereby harming their study or work efficiency.
Conventional management practices are numerous, such as blocking specific IP addresses or specific ports, or using antivirus software to prohibit access to certain computer functions. However, these approaches are easily circumvented. For example, certain programs can use network relays to evade IP and port rules. Port-based blocking also frequently fails when the original port is changed, or mistakenly blocks other connections, which leads to management disputes. Managing uniformly through antivirus software is likewise prone to circumvention, compatibility conflicts, and false positives, adding to the burden on both administrators and managed users.
A more accurate approach is Deep Packet Inspection (DPI), which inspects whether the packets a managed user exchanges with the outside over the network contain specific strings, and thereby determines whether the managed user is using the network improperly. Because this approach reaches a determination only after finding key information, it reduces false positives and the trouble of configuring blocking rules. It nevertheless has fatal shortcomings, for example:
First, the string content of packets often changes when an application is updated, so as versions accumulate, the error rate of matching a specific string gradually rises. In practice, the rule database used by DPI must be updated manually. Applications today are countless, and chasing version updates by hand is clearly impractical.
Second, in response to concerns over personal data protection and the theft of valuable information, data encryption is now widespread, and DPI faces both technical and legal challenges. DPI must read a packet's contents before it can determine whether a specific string is present. If the packet was encrypted from the outset, then even if it contains a matching string, the conventional method is forced to ignore the encrypted packet, so detection accuracy drops sharply or the method fails entirely.
Furthermore, the specific string is usually set to the application's name or abbreviation. By tampering with specific information in the packet, the string can be made to disappear, which evades most detection. Therefore, even if the manpower cost described above could be eliminated, this loophole still leaves DPI with a difficulty that is hard to overcome.
In view of these shortcomings, network management practice urgently needs a data identification and management solution that suits a wide range of management needs, is not subject to the false positives, software compatibility problems, or encryption limits described above, and does not require continuous manual management and maintenance.
According to one embodiment of the present invention, a network data identification and management method is provided, which comprises the following steps: defining a plurality of network environment conditions according to a packet round-trip time; operating a local terminal and executing at least one application under each network environment condition, thereby generating at least one behavior feature value for each network environment condition and application; compiling a program name of the application, the network environment condition, and the behavior feature values corresponding to that network environment condition into a feature set; and storing the feature set in a database.
This embodiment applies to the field of information technology. The packet round-trip time (RTT) refers to the total time taken by the whole process in which (1) a packet is transmitted from the sending end to the target end, and (2) the sending end subsequently receives the response packet returned by the target end. The packet round-trip time mentioned in the following embodiments has the same meaning. In addition, adjusting the network bandwidth so that the packet round-trip time is lengthened or shortened can be achieved with techniques well known in the art, so the details are not described here.
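The relationship between a measured round-trip time and a network environment condition can be sketched as follows. This is a minimal illustration only: the boundary values in `RTT_BOUNDARIES_MS` and both function names are assumptions, since the specification does not state how many conditions are defined or where their limits lie.

```python
import bisect

# Hypothetical RTT boundaries in milliseconds (not given in the specification).
# They split RTT values into four network environment conditions: 0, 1, 2, 3.
RTT_BOUNDARIES_MS = [50, 150, 300]


def round_trip_time_ms(sent_at_ms, reply_received_at_ms):
    """RTT: time from sending a packet to receiving the response packet."""
    if reply_received_at_ms < sent_at_ms:
        raise ValueError("the reply cannot arrive before the packet was sent")
    return reply_received_at_ms - sent_at_ms


def network_condition(rtt_ms):
    """Map an RTT to the index of the network environment condition it falls in."""
    return bisect.bisect_left(RTT_BOUNDARIES_MS, rtt_ms)
```

With these assumed boundaries, an exchange sent at 1000 ms and answered at 1040 ms has an RTT of 40 ms and falls under condition 0, while an RTT of 200 ms falls under condition 2.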
As the above embodiment shows, the present invention defines the network environment conditions according to the different packet round-trip times experienced by the local terminal. The significance of this step is as follows. The feature set stores behavior feature values, and the present invention does not use the string scanning of DPI. When the same application runs under different packet round-trip times, the number of packets, the packet size, and the packet transmission time observed over the same period are all greatly affected.
Classifying the feature sets by packet round-trip time therefore allows subsequent packet inspection to rest on a common standard, so that detection results are not skewed by applying mismatched network environment conditions.
The behavior feature values of the foregoing embodiment may include a packet count, a packet size, or a packet transmission time corresponding to the application. The packet transmission time differs from the packet round-trip time described above: it is the latency from the moment one terminal sends a packet to the moment the packet arrives at another terminal.
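The three kinds of behavior feature value named above can be computed from captured packet records roughly as follows. This is a sketch under stated assumptions: the `Packet` record, its field names, and the choice of summing sizes and averaging latencies are illustrative, as the specification does not say how the values are aggregated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical record of one observed packet."""
    sent_at_ms: int      # when one terminal sent the packet
    arrived_at_ms: int   # when the packet arrived at the other terminal
    size_bytes: int


def behavior_features(packets):
    """Summarize packets observed for one application in one time window."""
    if not packets:
        raise ValueError("at least one packet is required")
    count = len(packets)
    total_bytes = sum(p.size_bytes for p in packets)
    # One-way latency per packet, averaged over the window (an assumption).
    mean_latency = sum(p.arrived_at_ms - p.sent_at_ms for p in packets) / count
    return {
        "packet_count": count,
        "total_bytes": total_bytes,
        "mean_latency_ms": mean_latency,
    }
```

Note that the latency here is one-way, in line with the distinction the text draws between packet transmission time and packet round-trip time.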
The network data identification and management method of the foregoing embodiment may further comprise: defining a transmission protocol between the application and a target terminal, the transmission protocol including at least a source port value of the application. The method may also comprise: using the source port value to look up the program name of the application, and storing the program name in the corresponding feature set.
The foregoing network data identification and management method may further comprise: defining the transmission protocol to include a source address value of the local terminal, a destination address value of the target terminal, and a destination port value.
The foregoing network data identification and management method may further comprise: executing a plurality of applications on the local terminal, and classifying the behavior feature values according to the respective transmission protocols between those applications and the target terminal, so that behavior feature values having the same source port value correspond to the same application.
Through the above steps, the network data identification and management method of this embodiment can execute multiple applications under each network environment condition. These applications may be the same or different, and even under a single network environment condition, multiple instances of the same application may be executed.
In addition, because every application is assigned a corresponding source port value on the local terminal, the program names of all running applications can be looked up by source port value, which facilitates the subsequent storage of the feature sets in the database.
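The grouping by source port and the name lookup described above can be sketched as follows. The `PORT_OWNERS` table is a hypothetical stand-in for whatever facility the local terminal uses to report which program owns a port, and the function names and record layout are likewise assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical table of source port -> program name. On a real local terminal
# this information would come from the operating system; it is hard-coded here.
PORT_OWNERS = {50001: "video_player", 50002: "chat_client"}


def group_by_source_port(records):
    """Group (source_port, feature_value) pairs so one port maps to one application."""
    groups = defaultdict(list)
    for source_port, value in records:
        groups[source_port].append(value)
    return dict(groups)


def build_feature_sets(records, condition, port_owners=PORT_OWNERS):
    """Combine program name, network environment condition, and feature values."""
    feature_sets = []
    for source_port, values in group_by_source_port(records).items():
        name = port_owners.get(source_port)
        if name is None:
            continue  # owner unknown: skip rather than guess a program name
        feature_sets.append(
            {"program_name": name, "condition": condition, "features": values}
        )
    return feature_sets
```

Because the names are resolved on the terminal that actually runs the applications, the resulting labels do not depend on anything carried inside the packets themselves.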
According to another embodiment of the present invention, a network data identification and management method is provided for monitoring at least one user terminal, which comprises the following steps: defining a plurality of network environment conditions according to a packet round-trip time; operating a local terminal and executing at least one application under each network environment condition, thereby generating at least one behavior feature value for each network environment condition and application; compiling a program name of the application, the network environment condition, and the behavior feature values corresponding to that network environment condition into a feature set; storing the feature set in a database; monitoring the user terminal and calculating at least one feature value under test generated when the user terminal executes a program under test under a network condition under test; transmitting the network condition under test and the feature value under test to the database; and selecting one of the feature sets from the database and comparing whether its network environment condition and behavior feature values match the network condition under test and the feature value under test, respectively, and if so, determining that the program under test is the application.
Once the feature sets have been stored in the database, detection and comparison can proceed through the subsequent steps. Feature sets are selected from the database mainly by the network condition under test, because the behavior feature values of a network environment condition that matches the network condition under test more accurately reflect the feature values that can be expected under equivalent conditions. This selection logic is given for illustration only. Any other selection method that allows the behavior feature values and the feature values under test to be compared correctly is also permitted by this embodiment.
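The selection and comparison logic can be sketched as follows. The specification only requires that the values "match", so the relative `tolerance` used here is an assumption, as are the function names and the dictionary layout of a feature set.

```python
def features_match(reference, measured, tolerance=0.1):
    """True if every reference feature is within a relative tolerance of the measurement."""
    for key, ref_value in reference.items():
        if key not in measured:
            return False
        if abs(measured[key] - ref_value) > tolerance * abs(ref_value):
            return False
    return True


def identify_program(feature_sets, condition_under_test, features_under_test,
                     tolerance=0.1):
    """Return the program name of the first matching feature set, or None."""
    for feature_set in feature_sets:
        # Select only feature sets recorded under the same network condition.
        if feature_set["condition"] != condition_under_test:
            continue
        if features_match(feature_set["features"], features_under_test, tolerance):
            return feature_set["program_name"]
    return None
```

Filtering on the network condition before comparing feature values reflects the reasoning in the text: values recorded under a different round-trip time are not a fair reference for the measurement.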
The network data identification and management method of this other embodiment may further comprise: defining a transmission protocol between the application and a target terminal, the transmission protocol including at least a source port value of the application. It may also comprise: using the source port value to look up the program name of the application, and storing the program name in the corresponding feature set.
The network data identification and management method of this other embodiment may further comprise: defining the transmission protocol to include a source address value of the local terminal, a destination address value of the target terminal, and a destination port value.
The network data identification and management method of this other embodiment may further comprise: executing a plurality of applications on the local terminal, and classifying the behavior feature values according to the respective transmission protocols between those applications and the target terminal, so that behavior feature values having the same source port value correspond to the same application.
In addition, the network data identification and management method of this embodiment may further comprise: when the program under test is determined to be the application, suspending the outbound transmission of the program under test at the user terminal. When this determination holds, the program name of the application may also be transmitted to a management terminal.
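The two follow-up actions can be sketched as a small decision function. The action labels and the `suspend` and `report` switches are illustrative assumptions. How transmission is actually suspended, or how the name reaches the management terminal, is not specified in this passage.

```python
def management_actions(identified_name, suspend=True, report=True):
    """List the follow-up actions for an identification result.

    identified_name is the program name returned by the comparison step,
    or None when the program under test matched no stored feature set.
    """
    actions = []
    if identified_name is None:
        return actions  # nothing was identified, so nothing is done
    if suspend:
        # Suspend the program's outbound transmission at the user terminal.
        actions.append(("suspend_outbound", identified_name))
    if report:
        # Transmit the program name to the management terminal.
        actions.append(("report_to_management_terminal", identified_name))
    return actions
```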
With the embodiments disclosed above, the present invention can first compile a plurality of feature sets. Because the behavior feature values, network environment conditions, and program names in the feature sets are all generated on the local terminal, the correctness of this data is assured, and tampering with packets or program names need not be considered. On this principle, the present invention can further combine the above determination process with back-end management operations, such as reducing or blocking the outbound connection bandwidth of a specific program, or transmitting the determination result to other management units.
According to a further embodiment of the present invention, a network data identification and management system is provided, which comprises a training module and a database.
The training module of this embodiment sets a plurality of network environment conditions and includes a local terminal, which executes at least one application under each network environment condition and transmits outward at least one behavior feature value generated by executing the application. The database is in signal connection with the training module, and receives and stores at least one feature set from the training module. The feature set includes the network environment condition, a program name of the corresponding application, and the behavior feature values.
When the network data identification and management system is in its initial state, the database does not yet contain any feature set. The training module therefore first executes the applications outside the database, establishes the behavior feature values needed for subsequent comparison, and records the program names of the applications. It then compiles this data, together with the network environment conditions, into feature sets, and finally sends the feature sets to the database for use.
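The hand-off from the training module to an initially empty database can be sketched with Python's standard `sqlite3` module. The table layout, column names, and the use of an in-memory database are assumptions for illustration. The specification does not describe the database schema or product.

```python
import sqlite3


def open_database():
    """Create an empty in-memory database, mirroring the system's initial state."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE feature_sets ("
        " program_name TEXT NOT NULL,"
        " rtt_condition INTEGER NOT NULL,"
        " packet_count INTEGER,"
        " total_bytes INTEGER,"
        " mean_latency_ms REAL)"
    )
    return conn


def store_feature_set(conn, program_name, rtt_condition, features):
    """Store one feature set produced by the training module."""
    conn.execute(
        "INSERT INTO feature_sets VALUES (?, ?, ?, ?, ?)",
        (program_name, rtt_condition, features["packet_count"],
         features["total_bytes"], features["mean_latency_ms"]),
    )
    conn.commit()


def feature_sets_for(conn, rtt_condition):
    """Fetch the feature sets recorded under one network environment condition."""
    cursor = conn.execute(
        "SELECT program_name, packet_count, total_bytes, mean_latency_ms"
        " FROM feature_sets WHERE rtt_condition = ? ORDER BY program_name",
        (rtt_condition,),
    )
    return cursor.fetchall()
```

Keying the table on the network environment condition keeps later lookups restricted to feature sets recorded under comparable round-trip times.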
In the network data identification and management system of this embodiment, the network environment conditions may be defined by a packet round-trip time. The database may be a cloud server. That is, the training module and the database need not be connected by wire, and the two may exchange data over a network.
The behavior feature values may include a packet count, a packet size, or a packet transmission time corresponding to the application. A transmission protocol exists between the application and a target terminal and includes at least a source port value of the application. The transmission protocol may further include a source address value of the local terminal, a destination address value of the target terminal, and a destination port value. The packet transmission time and the transmission protocol are as described in the first embodiment and are not explained again here.
In the foregoing network data identification and management system, the training module may further include a traffic processor in signal connection with the local terminal. When the local terminal executes a plurality of applications, the traffic processor classifies the behavior feature values according to the transmission protocols between the respective applications and the target terminal, so that at least one behavior feature value having the same source port value corresponds to the same application.
In the foregoing network data identification and management system, the training module may further include an identification module, which looks up the program name of the application by the source port value and stores the program name in the feature set corresponding to that application.
In the foregoing network data identification and management system, the training module may further include a data aggregation module, which stores the network environment conditions as packet round-trip times and provides the feature sets for the database to receive.
In the above embodiment of the network data identification and management system, the traffic processor first classifies the behavior feature values by the source port values contained in the different transmission protocols, ensuring that the behavior feature values in each feature set correspond to the correct application. Because the data traffic originates from the local terminal, the identification module can use the source port values to look up the program names. The results are then handed to the data aggregation module, which converts them into the form required by the database.
According to yet another embodiment of the present invention, a network data identification and management system is provided for monitoring at least one user terminal, which comprises a training module, a database, a traffic analysis module, and a parsing module.
The training module sets a plurality of network environment conditions and includes a local terminal, which executes at least one application under each network environment condition and transmits outward at least one behavior feature value generated by executing the application. The database is in signal connection with the training module, and receives and stores at least one feature set from the training module. The feature set includes the network environment condition, a program name of the corresponding application, and the behavior feature values. The traffic analysis module is connected to the database over a network and monitors the user terminal. It calculates at least one feature value under test generated when the user terminal executes a program under test under a network condition under test, and transmits the network condition under test and the feature value under test to the database. The parsing module is in signal connection with the database. It selects one of the feature sets from the database and compares the network environment condition and behavior feature values with the network condition under test and the feature value under test. If the network environment condition and the behavior feature values match the network condition under test and the feature value under test, respectively, the parsing module determines that the program under test is the application.
In the network data identification and management system of this embodiment, the architecture of the training module and the database is the same as in the further embodiment described above, so it is not described again.
The traffic analysis module monitors the outbound connections of the user terminal. It works much as the training module does when generating feature sets through the local terminal, and it transmits its calculation results to the database. The parsing module then selects feature sets from the database for comparison, and the comparison result determines the subsequent management operations.
In the network data identification and management system of this embodiment, the network environment conditions may be defined by a packet round-trip time. The database may be a cloud server.
The behavior feature values may include a packet count, a packet size, or a packet transmission time corresponding to the application. A transmission protocol exists between the application and a target terminal and includes at least a source port value of the application. The transmission protocol may further include a source address value of the local terminal, a destination address value of the target terminal, and a destination port value.
In the network data identification and management system of this embodiment, the training module may further include a traffic processor in signal connection with the local terminal. When the local terminal executes a plurality of applications, the traffic processor classifies the behavior feature values according to the transmission protocols between the respective applications and the target terminal, so that at least one behavior feature value having the same source port value corresponds to the same application.
In the network data identification and management system of this embodiment, the training module may further include an identification module, which looks up the program name of the application by the source port value and stores the program name in the feature set corresponding to that application.
In the foregoing network data identification and management system, the training module may further include a data aggregation module, which stores the network environment conditions as packet round-trip times and provides the feature sets for the database to receive.
The network data identification and management system of this embodiment may further include an execution module that monitors the user terminal. When the parsing module determines that the program under test is the application, the traffic analysis module commands the execution module to suspend the outbound transmission of the program under test at the user terminal, or to transmit the program name of the application to a management terminal.
In addition, the traffic analysis module may further include a mirror port, through which it is in signal connection with the user terminal and thereby receives the network condition under test and the feature value under test.
Through the embodiments described above, the network data identification and management system of the present invention can manage a local area network automatically through training, detection, comparison, and decision making.
100‧‧‧Network data identification and management method
200‧‧‧Network data identification and management system
300‧‧‧Training module
310‧‧‧Local terminal
320‧‧‧Packet decoder
330‧‧‧Traffic processor
340‧‧‧Identification module
350‧‧‧Data aggregation module
400‧‧‧Database
410‧‧‧Parsing module
500‧‧‧Traffic detection module
510‧‧‧Mirror terminal
600‧‧‧Execution module
A‧‧‧Application
C‧‧‧Network environment condition
D‧‧‧Packet
E‧‧‧Feature set
F‧‧‧Decision factor
G‧‧‧Management terminal
M‧‧‧Training model
N‧‧‧Program name
O‧‧‧Target terminal
P‧‧‧Transmission protocol
T‧‧‧Data traffic
U‧‧‧User terminal
V‧‧‧Behavior feature value
DP‧‧‧Destination port value
SP‧‧‧Source port value
DIP‧‧‧Destination address value
RTT‧‧‧Packet round-trip time
SIP‧‧‧Source address value
At‧‧‧Program under test
Ct‧‧‧Network condition under test
Vt‧‧‧Feature value under test
S01~S11‧‧‧Steps
FIG. 1A is a flowchart of the steps of the network data identification and management method of the present invention; FIG. 1B is a flowchart of the steps of the network data identification and management method of the present invention; FIG. 2 is a structural block diagram of the network data identification and management system of the present invention; FIG. 3 is a schematic diagram of data training in the network data identification and management system of FIG. 2; FIG. 4A is a schematic diagram of the decision factors of the network data identification and management system of FIG. 2; FIG. 4B is a schematic diagram of the training model of the network data identification and management system of FIG. 2; and FIG. 5 is a schematic diagram of the management architecture of the network data identification and management system of FIG. 2.
FIG. 1A is a flowchart of the steps of the network data identification and management method 100 of the present invention. FIG. 1B is a flowchart of the steps of the network data identification and management method of the present invention. Referring to FIG. 1A and FIG. 1B (for clarity, please also refer to the reference numerals in FIG. 3 and FIG. 5), the network data identification and management method 100 comprises the following steps. Step S01 defines a plurality of network environment conditions C according to a packet round-trip time RTT.
As described above, the packet round-trip time RTT is the total time taken by the whole process in which a packet D is transmitted from the sending end to the target end and a packet D is then returned by the target end to the sending end. Intuitively, the packet round-trip time RTT can be regarded as one indicator of network transmission speed: a shorter RTT usually means a faster network.
In this method, the packet round-trip time RTT can be adjusted freely by several techniques. For example, virtual machine software is used to create two virtual network interface cards on one computer; the computer's physical network interface card is bridged to one of the virtual cards, and the other virtual card is designated as the computer's medium for external connections. The computer's external network traffic can then be changed by modulating the virtual machine's network traffic, which adjusts the RTT. This is a common technique in related experiments and is not the only way to achieve the adjustment, so it is not described further.
Step S02 operates a local terminal 310 and executes a plurality of applications A under each network environment condition C, thereby generating a plurality of behavioral feature values V corresponding to each network environment condition C and each application A.
The applications A executed by the local terminal 310 may be different programs or multiple instances of the same program.
Step S03 defines each behavioral feature value V as including a packet count, a packet size, or a packet transmission time.
A behavioral feature value V refers to information that an application A naturally produces while running or transmitting over the external network, such as the packet count, packet size, or packet transmission time mentioned here.
Step S04 defines a transmission protocol P between each application A and at least one target terminal O, such that each transmission protocol P includes a source address value SIP of the local terminal 310, a source port value SP of the application A, and a destination address value DIP and a destination port value DP of the target terminal O.
The transmission protocol P used in this method is the 5-tuple (Source IP, Destination IP, Source Port, Destination Port, Layer 4 Protocol). This information includes the network addresses (IP) of the source and the destination of a packet D and the ports those addresses use to send and receive the packet D. Because this is common knowledge of the TCP/IP network architecture, only this brief introduction is given.
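As an illustration only, the 5-tuple can be held in a small immutable record. The dictionary layout of the decoded packet and the names FiveTuple and flow_key are assumptions made for this sketch and do not come from the specification.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """The five header fields that identify one transport-layer flow."""
    src_ip: str      # source address value SIP
    dst_ip: str      # destination address value DIP
    src_port: int    # source port value SP
    dst_port: int    # destination port value DP
    protocol: str    # layer 4 protocol, for example "TCP" or "UDP"


def flow_key(packet: dict) -> FiveTuple:
    """Build the 5-tuple from an already-decoded packet (hypothetical dict layout)."""
    return FiveTuple(
        packet["src_ip"], packet["dst_ip"],
        packet["src_port"], packet["dst_port"],
        packet["protocol"],
    )


# Example packet using documentation-reserved addresses.
pkt = {"src_ip": "192.0.2.10", "dst_ip": "198.51.100.7",
       "src_port": 50312, "dst_port": 443, "protocol": "TCP", "size": 1500}
key = flow_key(pkt)
```

Because the record is hashable, it can serve directly as a dictionary key when packets are later grouped into flows.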
Step S05 classifies the behavioral feature values V according to the transmission protocol P between each application A and the target terminal O, so that the at least one behavioral feature value V having the same source port value SP corresponds to the same application A.
With the local terminal 310 as the source of the behavioral feature values V, each source port value SP represents the dedicated window that an application A executed on the local terminal 310 uses to exchange data with the outside. Therefore, if there are 100,000 packets D whose source port values SP take ten distinct values in total, the 100,000 packets D come from ten applications A. Step S05 groups the packets D that share a source port value SP into one class, indicating that those packets D are used by the same application A, and the behavioral feature values V formed by those packets D likewise correspond to that application A.
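The grouping in step S05 might be sketched as follows. The packet layout (timestamp, source port, size) and the three summary fields are assumptions chosen to mirror the packet count, packet size, and packet transmission time named in step S03; a real capture would carry more fields.

```python
from collections import defaultdict


def features_by_source_port(packets):
    """Group packets by source port and summarize each group.

    Each packet is a hypothetical (timestamp_s, src_port, size_bytes) tuple.
    Returns {src_port: {"packet_count", "total_bytes", "duration_s"}}.
    """
    groups = defaultdict(list)
    for timestamp, src_port, size in packets:
        groups[src_port].append((timestamp, size))

    features = {}
    for src_port, items in groups.items():
        times = [t for t, _ in items]
        features[src_port] = {
            "packet_count": len(items),
            "total_bytes": sum(size for _, size in items),
            "duration_s": max(times) - min(times),
        }
    return features


packets = [
    (0.00, 50312, 600),
    (0.05, 50313, 1500),
    (0.10, 50312, 1448),
    (0.30, 50313, 1500),
]
result = features_by_source_port(packets)
```

Here two distinct source ports yield two groups, which the method would attribute to two applications.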
Step S06 uses the source port value SP to look up a program name N of the application A. Because the applications A executed on the local terminal 310 are known to be genuine and untampered, the source port value SP can be used to look up the application A that the local terminal 310 is running and its correct program name N.
Step S07 collects the program name N of the application A, the network environment condition C, and the corresponding behavioral feature values V into a feature set E, and stores the feature set E in a database 400.
At this point, the information related to each application A executed on the local terminal 310 has been organized into feature sets E and stored in the database 400.
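A minimal sketch of steps S06 and S07 under stated assumptions: the port-to-program table stands in for whatever lookup the local terminal provides, the list stands in for the database 400, and the field names and the program name "example_player.exe" are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FeatureSet:
    """One feature set E: program name N, condition C (RTT), and values V."""
    program_name: str
    rtt_ms: float
    packet_count: int
    total_bytes: int
    duration_s: float


def build_feature_sets(rtt_ms, port_features, port_to_program):
    """Join per-port feature values with the program name that owns each port."""
    return [
        FeatureSet(port_to_program[port], rtt_ms, f["packet_count"],
                   f["total_bytes"], f["duration_s"])
        for port, f in port_features.items()
        if port in port_to_program   # skip ports with no known owner
    ]


database = []  # in-memory stand-in for database 400

port_features = {50312: {"packet_count": 150, "total_bytes": 2048, "duration_s": 1.2}}
port_to_program = {50312: "example_player.exe"}  # hypothetical lookup result
database.extend(build_feature_sets(40.0, port_features, port_to_program))
```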
Step S08 monitors a user terminal U and calculates at least one feature value under test Vt generated when the user terminal U executes a program under test At under a network condition under test Ct.
The monitoring in step S08 can be accomplished through a mirror port, a management tool commonly used by network administrators of school dormitories and offices. It mainly serves to duplicate all network traffic in the current local area network so that administrators can carry out management tasks. The data obtained by monitoring the user terminal U reveals the network conditions of the user terminal U, which programs were running at the time, and the data each program sent and received.
Step S09 transmits the network condition under test Ct and the feature value under test Vt to the database 400. Note that step S09 does not, as step S06 does, use the source port value SP of the user terminal U to look up the program name N directly. Although a network administrator could learn a program name N that way, it would yield only the name the user terminal U assigns to the application A, and there is no way to know whether the user of the user terminal U has modified the program name N. That approach is therefore less reliable than a determination based on the behavioral feature values V.
Step S10 selects one of the feature sets E from the database 400 and compares whether its network environment condition C and behavioral feature values V match the network condition under test Ct and the feature value under test Vt, respectively. If so, the program under test At is determined to be the application A.
In this embodiment, the feature set E is selected with the packet round-trip time RTT as the main criterion. As noted above, although the information that the same application A must exchange with the outside is mostly the same, when the RTT differs, the number of packets D (or other features) that different computers send and receive per unit time may differ greatly. Because some behavioral feature values V are affected by network transmission speed, the feature sets E of the same application A at different transmission speeds may deviate from one another. Step S10 therefore uses the RTT to select the feature sets E whose network environment condition C matches the network condition under test Ct, and then compares whether the behavioral feature values V also match the feature value under test Vt. If both comparisons match closely, the program under test At is determined to be the application A stored in step S07.
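The two-stage comparison of step S10 could look like the sketch below. The specification does not say how closely values must agree, so the relative tolerances (20 percent for the RTT, 10 percent for feature values), the dictionary layout, and the program names are all assumptions.

```python
def within(observed, reference, tolerance):
    """True when observed is within a relative tolerance of reference."""
    return abs(observed - reference) <= tolerance * abs(reference)


def identify(test_rtt_ms, test_features, feature_sets,
             rtt_tolerance=0.2, feature_tolerance=0.1):
    """Return the program name of the first matching feature set, else None."""
    for fs in feature_sets:
        # First criterion: the network condition (RTT) must be comparable.
        if not within(test_rtt_ms, fs["rtt_ms"], rtt_tolerance):
            continue
        # Second criterion: every stored feature value must also match.
        if all(within(test_features[name], value, feature_tolerance)
               for name, value in fs["features"].items()):
            return fs["program_name"]
    return None


feature_sets = [
    {"program_name": "example_game", "rtt_ms": 40.0,
     "features": {"packet_count": 150, "total_bytes": 2048}},
    {"program_name": "example_game", "rtt_ms": 120.0,
     "features": {"packet_count": 60, "total_bytes": 2048}},
    {"program_name": "example_chat", "rtt_ms": 40.0,
     "features": {"packet_count": 20, "total_bytes": 512}},
]

name = identify(42.0, {"packet_count": 145, "total_bytes": 2100}, feature_sets)
```

The second stored entry shows why the RTT is checked first: the same program produces a different packet count at a slower network speed, so a measurement taken at 42 ms is never compared against the 120 ms profile.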
Step S11 suspends the external transmission of the program under test At on the user terminal U, or transmits the program name N to a management terminal G. Video sites, game software, and web pages of that kind are generally not permitted in offices or classrooms. The suspension in step S11 can be achieved by setting access rules that block these external connections; alternatively, other decisions such as notifying the management terminal G may be taken.
With the network data identification and management method of the present invention, a network administrator can set various network environment conditions to simulate how a managed user's applications run in different situations, which greatly improves the reliability of the determination. In addition, the method does not require continuous manual tracking of software version updates: because it does not rely on strings inside packets to identify data, it needs no tedious keyword updates. Moreover, using behavioral feature values as the basis for determination means the method is not hindered by data encryption, so both the efficiency and the effectiveness of identification are significantly improved.
FIG. 2 is a structural block diagram of the network data identification and management system 200 of the present invention. Referring to FIG. 2, the network data identification and management system 200 includes a training module 300, a database 400, a traffic analysis module 500, and an execution module 600.
The training module 300 sets a plurality of network environment conditions C and includes a local terminal 310, a packet decoder 320, a flow processor 330, an identification module 340, and a data aggregation module 350. The network environment conditions C are defined by a packet round-trip time RTT, whose meaning is not repeated here.
FIG. 3 is a schematic diagram of data training in the network data identification and management system 200 of FIG. 2. Referring to FIG. 3, under each of the network environment conditions C set by the training module 300, the local terminal 310 executes at least one application A while exchanging a plurality of packets D with a target terminal O. There may also be a plurality of applications A, in which case they may be identical or different. To describe the architecture of the system 200 completely, the following embodiment uses a plurality of applications A.
The packet decoder 320 collects and decodes the packets D generated by the local terminal 310. Once a packet D has been parsed, the transmission protocol P stored in it can be read. The transmission protocol P includes a source address value SIP of the local terminal 310, a source port value SP of the application A to which the packet D belongs, and a destination address value DIP and a destination port value DP of the target terminal O. The destination port value DP works on the same principle as the source port value SP: it is the window that the target terminal O uses to exchange information with the application A.
Because the local terminal 310 executes a plurality of applications A at the same time, the flow processor 330 uses the transmission protocol P to classify the packets D by source port value SP, so that packets D with the same source port value SP are attributed to the same application A. It also calculates a plurality of behavioral feature values V for these packets D (for example, 150 packets D with a size of 2048 bytes, and so on). The behavioral feature values V may include, but are not limited to, a packet count, a packet size, or a packet transmission time.
After this processing, the packets D become a plurality of data flows T classified by application A. The identification module 340 looks up a program name N of each application A according to the source port value SP in the transmission protocol P of the packets D in each data flow T. At this point, the program name N of each application A and its corresponding behavioral feature values V have been obtained.
The data aggregation module 350 receives the data processed by the flow processor 330 and the identification module 340, and combines the network environment condition C (that is, the packet round-trip time RTT) with the program name N and the behavioral feature values V so that they form a complete feature set E.
The database 400 is in signal connection with the training module 300, and receives and stores the feature sets E from the data aggregation module 350. The database 400 may also be a cloud server that takes part in this data exchange over a network.
FIG. 4A is a schematic diagram of the decision factors F of the network data identification and management system 200 of FIG. 2. Referring to FIG. 4A together with FIG. 2, the database 400 includes a parsing module 410. Because the network environment conditions C and behavioral feature values V contained in the feature sets E may differ widely, this embodiment first classifies them with a multi-level decision tree and stores them in the database 400 according to that rule. Taking FIG. 4A as an example, every feature set E in this embodiment contains the packet round-trip time RTT, so when the parsing module 410 classifies, it first passes through a decision factor F that determines the basis of classification, for example dividing the RTT into intervals such as 25 ms or less, 26 to 50 ms, and 51 to 75 ms.
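The first-level decision factor can be illustrated with a sorted list of interval upper bounds. The three bounds come from the example intervals above; the catch-all interval above 75 ms and the label strings are assumptions, since the text only says that further intervals exist.

```python
from bisect import bisect_left

# Upper bounds (ms) of the RTT intervals: <=25, 26-50, 51-75, then a catch-all.
RTT_UPPER_BOUNDS_MS = [25, 50, 75]
RTT_LABELS = ["<=25 ms", "26-50 ms", "51-75 ms", ">75 ms"]


def rtt_interval(rtt_ms):
    """Return the label of the interval that contains rtt_ms."""
    # bisect_left keeps a value equal to an upper bound inside that interval.
    return RTT_LABELS[bisect_left(RTT_UPPER_BOUNDS_MS, rtt_ms)]


labels = [rtt_interval(r) for r in (12, 25, 26, 50, 60, 200)]
```

Adding or moving an interval only requires editing the two lists, which keeps the decision factor easy to retune.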
FIG. 4B is a schematic diagram of the training models M of the network data identification and management system 200 of FIG. 2. As FIG. 4B further illustrates, all feature sets E are classified successively through layer after layer of decision factors F, finally forming a plurality of training models M under the decision tree.
Notably, so that all feature sets E can be classified smoothly, the decision rules of the decision tree can be applied recursively. For example, when a very large number of feature sets E fall into the same outcome of one decision factor F, that outcome can be subdivided again within the decision factor F before the next decision factor F is entered. For instance, where the RTT is 26 to 50 ms and the packet count is 100 to 200, the RTT can be subdivided into two sections, 26 to 38 ms and 39 to 50 ms.
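One way to picture the recursive subdivision is to halve an overfull interval until each piece holds few enough entries. The halving rule and the max_per_bin threshold are assumptions for this sketch; the specification only states that an interval may be subdivided again, and its example split (26 to 38 ms and 39 to 50 ms) happens to coincide with halving the 26 to 50 ms interval.

```python
def subdivide(low, high, values, max_per_bin):
    """Recursively halve the integer interval [low, high] while it is overfull.

    Returns a list of (low, high, members) leaves.
    """
    members = [v for v in values if low <= v <= high]
    # Stop when the interval is small enough or cannot be split further.
    if len(members) <= max_per_bin or low >= high:
        return [(low, high, members)]
    mid = (low + high) // 2
    return (subdivide(low, mid, members, max_per_bin)
            + subdivide(mid + 1, high, members, max_per_bin))


rtts_ms = [27, 30, 33, 36, 41, 44, 47, 50]
leaves = subdivide(26, 50, rtts_ms, max_per_bin=4)
```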
In addition, there may be a plurality of parsing modules 410 in this embodiment (not shown in the figures). When the data aggregation module 350 logs in to the database 400 and sends a transmission request, the database 400 designates one of the parsing modules 410, according to the load of each, to receive the feature set E and perform the classification described above. When the classification is complete, its result is applied to update all of the parsing modules 410.
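The load-based designation can be sketched as choosing the module with the lowest load and then copying its result to every module. The numeric load metric, the class and function names, and the simple two-way classification rule are hypothetical and serve only to make the flow concrete.

```python
class ParsingModule:
    """Minimal stand-in for one parsing module 410."""

    def __init__(self, name, load):
        self.name = name
        self.load = load          # hypothetical load metric, lower means idler
        self.classified = {}      # feature-set id -> class label

    def classify(self, feature_set_id, rtt_ms):
        # Placeholder rule; a real module would walk the decision tree.
        return "rtt<=50ms" if rtt_ms <= 50 else "rtt>50ms"


def dispatch(modules, feature_set_id, rtt_ms):
    """Classify on the least-loaded module, then replicate the result to all."""
    worker = min(modules, key=lambda m: m.load)
    label = worker.classify(feature_set_id, rtt_ms)
    for module in modules:
        module.classified[feature_set_id] = label
    return worker.name, label


modules = [ParsingModule("p1", 0.8), ParsingModule("p2", 0.3), ParsingModule("p3", 0.5)]
chosen, label = dispatch(modules, "E-001", 40.0)
```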
In this way, the configuration of the database 400 and the training module 300 allows this embodiment to implement remote data training and subsequent management in a large-scale network architecture.
The packet count, packet size, and packet transmission time mentioned above are only examples for the purpose of explanation; the behavioral feature values V used in the present invention in fact include more kinds of information.
For example, one request and response between the local terminal 310 and the target terminal O is called a round. The number, size, or time taken by the packets D transmitted in different rounds will differ, so the pattern of the behavioral feature values V across rounds also has reference value. As another example, the target terminal O may be configured to delay for a fixed period after receiving specific information and only then respond to it. The behavioral feature values V referred to in this embodiment should therefore not be limited to the packet count, packet size, and packet transmission time.
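Per-round values might be derived as in the sketch below. It assumes that each packet is tagged with a direction and that a new round begins whenever an outbound packet follows an inbound one; the specification defines a round only as one request and its response, so this boundary rule is an assumption.

```python
def split_rounds(packets):
    """Split (direction, size_bytes) packets into request/response rounds.

    direction is "out" (local terminal to target) or "in" (target to local).
    A new round starts when an "out" packet follows an "in" packet.
    """
    rounds, current, previous = [], [], None
    for direction, size in packets:
        if direction == "out" and previous == "in":
            rounds.append(current)
            current = []
        current.append((direction, size))
        previous = direction
    if current:
        rounds.append(current)
    return rounds


def round_profile(packets):
    """Per-round (packet_count, total_bytes), usable as extra feature values."""
    return [(len(r), sum(size for _, size in r)) for r in split_rounds(packets)]


trace = [("out", 120), ("in", 1500), ("in", 900),
         ("out", 80), ("in", 300)]
profile = round_profile(trace)
```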
FIG. 5 is a schematic diagram of the management architecture of the network data identification and management system 200 of FIG. 2. Referring to FIG. 5, the traffic analysis module 500 monitors the user terminal U and is in signal connection with both the database 400 and the execution module 600.
The traffic analysis module 500 includes a capture terminal 510. All external network transmissions of the user terminal U are mirrored to the capture terminal 510. This is a common arrangement in network management practice, so its implementation is not detailed.
For the network administrator, the data mirrored from the user terminal U is still a collection of unsorted, scattered packets D. Before these packets D are transmitted to the database 400, they must be organized by the traffic analysis module 500 into at least one feature value under test Vt of a program under test At under a network condition under test Ct. For the specific conversion process of the traffic analysis module 500, refer to the descriptions of the packet decoder 320, the flow processor 330, and the data aggregation module 350 above.
After the organized data has been transmitted to the database 400, the parsing module 410 applies the decision tree and its decision factors F to compare the network condition under test Ct and the feature value under test Vt.
Taking FIG. 4A as an example, the parsing module 410 first classifies by the packet round-trip time RTT and then continues comparing in the manner of FIG. 4B until it finds the training model M that matches the network condition under test Ct and the feature value under test Vt. The RTT serves as the first-level decision factor F mainly because, as noted above, the examination of packets D is reliable only when performed under the same standard; this embodiment does not, however, restrict the order of the decision factors F.
After the parsing module 410 completes the comparison, it returns the program name N of the probable application A to the traffic analysis module 500, and the traffic analysis module 500 controls the execution module 600 to monitor the user terminal U. The relationship between the traffic analysis module 500 and the execution module 600 may follow a software-defined networking (SDN) architecture: the data analysis, comparison, and subsequent decisions are handled by the traffic analysis module 500 at the control layer, while the execution module 600 only applies data control to the end applications according to the former's instructions.
This embodiment uses the SDN architecture. Besides making data transmission paths easy to plan, the architecture can be expanded in parallel indefinitely, so that a single traffic analysis module 500 can control multiple execution modules 600 at once according to its internal settings. As in the example of FIG. 5, the traffic analysis module 500 can perform various actions through the execution module 600, including suspending the external connections of the program under test At or all external connections of the user terminal U; it can also instruct the execution module 600 to send the determination result for the program under test At to a management terminal G.
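The split between a deciding control layer and instruction-following execution modules can be sketched as follows. This is not an SDN controller API; the classes, the policy table, the default action, and the terminal identifiers are hypothetical and only show one controller driving several execution modules.

```python
from enum import Enum


class Action(Enum):
    BLOCK_PROGRAM = "block_program"      # suspend the program's connections
    BLOCK_TERMINAL = "block_terminal"    # suspend all of the terminal's connections
    NOTIFY = "notify"                    # report the result to management terminal G


class ExecutionModule:
    """Stand-in for an execution module 600; it only records instructions."""

    def __init__(self):
        self.log = []

    def apply(self, action, terminal, program):
        self.log.append((action, terminal, program))


class TrafficAnalysisController:
    """Control-layer stand-in: maps identified programs to actions."""

    def __init__(self, policy, executors, default=Action.NOTIFY):
        self.policy = policy          # program name -> Action (hypothetical)
        self.executors = executors    # terminal id -> ExecutionModule
        self.default = default

    def handle(self, terminal, program):
        action = self.policy.get(program, self.default)
        self.executors[terminal].apply(action, terminal, program)
        return action


executors = {"U1": ExecutionModule(), "U2": ExecutionModule()}
controller = TrafficAnalysisController(
    {"example_game": Action.BLOCK_PROGRAM}, executors)
first = controller.handle("U1", "example_game")
second = controller.handle("U2", "example_chat")
```

Keeping the policy in the controller means an execution module needs no knowledge of program names; adding another module is a matter of registering it under a new terminal identifier.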
Although the present invention has been disclosed above by way of embodiments, they are not intended to limit it. Anyone skilled in the art may make various changes and modifications without departing from the spirit and scope of the present invention, and the scope of protection is therefore defined by the appended claims.
400‧‧‧Database
410‧‧‧Parsing module
500‧‧‧Traffic analysis module
510‧‧‧Capture terminal
600‧‧‧Execution module
D‧‧‧Packet
G‧‧‧Management terminal
N‧‧‧Program name
U‧‧‧User terminal
At‧‧‧Program under test
Ct‧‧‧Network condition under test
Vt‧‧‧Feature value under test
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| TW104123570A | 2015-07-21 | 2015-07-21 | Data recognition system for internet and method thereof |
| Publication Number | Publication Date |
|---|---|
| TW201705721A | 2017-02-01 |
| TWI569606B | 2017-02-01 |