NFStream: a Flexible Network Data Analysis Framework.
NFStream is a multiplatform Python framework providing fast, flexible, and expressive data structures designed to make working with online or offline network data easy and intuitive. It aims to be Python's fundamental high-level building block for doing practical, real-world network flow data analysis. Additionally, it has the broader goal of becoming a unifying network data analytics framework for researchers providing data reproducibility across experiments.
- Table of Contents
- Main Features
- How to get it?
- How to use it?
- Building from sources
- Contributing
- Ethics
- Credits
- Publications that use NFStream
- License
- Performance: NFStream is designed to be fast: AF_PACKET_V3/FANOUT on Linux, multiprocessing, native CFFI based computation engine, and PyPy full support.
- Encrypted layer-7 visibility: NFStream deep packet inspection is based on nDPI. It allows NFStream to perform reliable encrypted application identification and metadata fingerprinting (e.g., TLS, SSH, DHCP, HTTP).
- System visibility: NFStream probes the monitored system's kernel to obtain information on open Internet sockets and collects guaranteed ground-truth (process name, PID, etc.) at the application level.
- Statistical features extraction: NFStream provides state-of-the-art flow-based statistical feature extraction. It includes post-mortem statistical features (e.g., minimum, mean, standard deviation, and maximum of packet size and inter-arrival time) and early flow features (e.g., sequence of first n packets' sizes, inter-arrival times, and directions).
- Flexibility: NFStream is easily extensible using NFPlugins. It allows the creation of a new flow feature within a few lines of Python.
- Machine Learning oriented: NFStream aims to make Machine Learning approaches for network traffic management reproducible and deployable. By using NFStream as a common framework, researchers ensure that models are trained using the same feature computation logic, and thus, a fair comparison is possible. Moreover, trained models can be deployed and evaluated on live networks using NFPlugins.
Binary installers for the latest released version are available on PyPI.

```bash
pip install nfstream
```
Windows Notes: NFStream does not include capture drivers on Windows (license restrictions). You must install the Npcap drivers before installing NFStream. If Wireshark is already installed on Windows, then the Npcap drivers are already installed, and you do not need to perform any additional action.

Dealing with a big pcap file and want to aggregate it into labeled network flows? NFStream makes this path easier in a few lines:
```python
from nfstream import NFStreamer
# We display all streamer parameters with their default values.
# See documentation for detailed information about each parameter.
# https://www.nfstream.org/docs/api#nfstreamer
my_streamer = NFStreamer(source="facebook.pcap",  # or live network interface
                         decode_tunnels=True,
                         bpf_filter=None,
                         promiscuous_mode=True,
                         snapshot_length=1536,
                         idle_timeout=120,
                         active_timeout=1800,
                         accounting_mode=0,
                         udps=None,
                         n_dissections=20,
                         statistical_analysis=False,
                         splt_analysis=0,
                         n_meters=0,
                         max_nflows=0,
                         performance_report=0,
                         system_visibility_mode=0,
                         system_visibility_poll_ms=100)

for flow in my_streamer:
    print(flow)  # print it.
```
```python
# See documentation for each feature detailed description.
# https://www.nfstream.org/docs/api#nflow
NFlow(id=0,
      expiration_id=0,
      src_ip='192.168.43.18',
      src_mac='30:52:cb:6c:9c:1b',
      src_oui='30:52:cb',
      src_port=52066,
      dst_ip='66.220.156.68',
      dst_mac='98:0c:82:d3:3c:7c',
      dst_oui='98:0c:82',
      dst_port=443,
      protocol=6,
      ip_version=4,
      vlan_id=0,
      tunnel_id=0,
      bidirectional_first_seen_ms=1472393122365,
      bidirectional_last_seen_ms=1472393123665,
      bidirectional_duration_ms=1300,
      bidirectional_packets=19,
      bidirectional_bytes=5745,
      src2dst_first_seen_ms=1472393122365,
      src2dst_last_seen_ms=1472393123408,
      src2dst_duration_ms=1043,
      src2dst_packets=9,
      src2dst_bytes=1345,
      dst2src_first_seen_ms=1472393122668,
      dst2src_last_seen_ms=1472393123665,
      dst2src_duration_ms=997,
      dst2src_packets=10,
      dst2src_bytes=4400,
      application_name='TLS.Facebook',
      application_category_name='SocialNetwork',
      application_is_guessed=0,
      application_confidence=4,
      requested_server_name='facebook.com',
      client_fingerprint='bfcc1a3891601edb4f137ab7ab25b840',
      server_fingerprint='2d1eb5817ece335c24904f516ad5da12',
      user_agent='',
      content_type='')
```
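To make the relationship between the core flow fields concrete, the sketch below re-derives a few of them from the sample output above using plain Python. It is only an illustration: the `derive_totals` helper is hypothetical and not part of NFStream, and the relations it checks are read off this one sample flow rather than taken from the documentation.

```python
# Values copied from the sample NFlow output above.
sample_flow = {
    "bidirectional_first_seen_ms": 1472393122365,
    "bidirectional_last_seen_ms": 1472393123665,
    "bidirectional_duration_ms": 1300,
    "bidirectional_packets": 19,
    "bidirectional_bytes": 5745,
    "src2dst_packets": 9,
    "src2dst_bytes": 1345,
    "dst2src_packets": 10,
    "dst2src_bytes": 4400,
}


def derive_totals(flow):
    """Hypothetical helper: recompute totals from per-direction fields."""
    return {
        # Duration is the gap between the first and last observed packet.
        "duration_ms": flow["bidirectional_last_seen_ms"]
        - flow["bidirectional_first_seen_ms"],
        # Bidirectional counters are the sum of both directions.
        "packets": flow["src2dst_packets"] + flow["dst2src_packets"],
        "bytes": flow["src2dst_bytes"] + flow["dst2src_bytes"],
        # Average packet size over the whole flow.
        "mean_packet_size": flow["bidirectional_bytes"]
        / flow["bidirectional_packets"],
    }


derived = derive_totals(sample_flow)
print(derived)
```

For this sample, the derived duration, packet count, and byte count match the reported `bidirectional_duration_ms`, `bidirectional_packets`, and `bidirectional_bytes` values.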
NFStream probes the monitored system's kernel to obtain information on open Internet sockets and collects guaranteed ground-truth (process name, PID, etc.) at the application level.
```python
from nfstream import NFStreamer
my_streamer = NFStreamer(source="Intel(R) Wi-Fi 6 AX200 160MHz",  # Live capture mode.
                         # Disable L7 dissection for readability purpose only.
                         n_dissections=0,
                         system_visibility_poll_ms=100,
                         system_visibility_mode=1)

for flow in my_streamer:
    print(flow)  # print it.
```
```python
# See documentation for each feature detailed description.
# https://www.nfstream.org/docs/api#nflow
NFlow(id=0,
      expiration_id=0,
      src_ip='192.168.43.18',
      src_mac='30:52:cb:6c:9c:1b',
      src_oui='30:52:cb',
      src_port=59339,
      dst_ip='184.73.244.37',
      dst_mac='98:0c:82:d3:3c:7c',
      dst_oui='98:0c:82',
      dst_port=443,
      protocol=6,
      ip_version=4,
      vlan_id=0,
      tunnel_id=0,
      bidirectional_first_seen_ms=1638966705265,
      bidirectional_last_seen_ms=1638966706999,
      bidirectional_duration_ms=1734,
      bidirectional_packets=98,
      bidirectional_bytes=424464,
      src2dst_first_seen_ms=1638966705265,
      src2dst_last_seen_ms=1638966706999,
      src2dst_duration_ms=1734,
      src2dst_packets=22,
      src2dst_bytes=2478,
      dst2src_first_seen_ms=1638966705345,
      dst2src_last_seen_ms=1638966706999,
      dst2src_duration_ms=1654,
      dst2src_packets=76,
      dst2src_bytes=421986,
      # The process that generated this reported flow.
      system_process_pid=14596,
      system_process_name='FortniteClient-Win64-Shipping.exe')
```
NFStream extracts 48 post-mortem flow statistical features, including detailed TCP flags analysis and the minimum, mean, maximum, and standard deviation of both packet size and inter-arrival time in each direction.
```python
from nfstream import NFStreamer
my_streamer = NFStreamer(source="facebook.pcap",
                         # Disable L7 dissection for readability purpose.
                         n_dissections=0,
                         statistical_analysis=True)

for flow in my_streamer:
    print(flow)
```
```python
# See documentation for each feature detailed description.
# https://www.nfstream.org/docs/api#nflow
NFlow(id=0,
      expiration_id=0,
      src_ip='192.168.43.18',
      src_mac='30:52:cb:6c:9c:1b',
      src_oui='30:52:cb',
      src_port=52066,
      dst_ip='66.220.156.68',
      dst_mac='98:0c:82:d3:3c:7c',
      dst_oui='98:0c:82',
      dst_port=443,
      protocol=6,
      ip_version=4,
      vlan_id=0,
      tunnel_id=0,
      bidirectional_first_seen_ms=1472393122365,
      bidirectional_last_seen_ms=1472393123665,
      bidirectional_duration_ms=1300,
      bidirectional_packets=19,
      bidirectional_bytes=5745,
      src2dst_first_seen_ms=1472393122365,
      src2dst_last_seen_ms=1472393123408,
      src2dst_duration_ms=1043,
      src2dst_packets=9,
      src2dst_bytes=1345,
      dst2src_first_seen_ms=1472393122668,
      dst2src_last_seen_ms=1472393123665,
      dst2src_duration_ms=997,
      dst2src_packets=10,
      dst2src_bytes=4400,
      bidirectional_min_ps=66,
      bidirectional_mean_ps=302.36842105263156,
      bidirectional_stddev_ps=425.53315715259754,
      bidirectional_max_ps=1454,
      src2dst_min_ps=66,
      src2dst_mean_ps=149.44444444444446,
      src2dst_stddev_ps=132.20354676701294,
      src2dst_max_ps=449,
      dst2src_min_ps=66,
      dst2src_mean_ps=440.0,
      dst2src_stddev_ps=549.7164925870628,
      dst2src_max_ps=1454,
      bidirectional_min_piat_ms=0,
      bidirectional_mean_piat_ms=72.22222222222223,
      bidirectional_stddev_piat_ms=137.34994188549086,
      bidirectional_max_piat_ms=398,
      src2dst_min_piat_ms=0,
      src2dst_mean_piat_ms=130.375,
      src2dst_stddev_piat_ms=179.72036811192467,
      src2dst_max_piat_ms=415,
      dst2src_min_piat_ms=0,
      dst2src_mean_piat_ms=110.77777777777777,
      dst2src_stddev_piat_ms=169.51458475436397,
      dst2src_max_piat_ms=409,
      bidirectional_syn_packets=2,
      bidirectional_cwr_packets=0,
      bidirectional_ece_packets=0,
      bidirectional_urg_packets=0,
      bidirectional_ack_packets=18,
      bidirectional_psh_packets=9,
      bidirectional_rst_packets=0,
      bidirectional_fin_packets=0,
      src2dst_syn_packets=1,
      src2dst_cwr_packets=0,
      src2dst_ece_packets=0,
      src2dst_urg_packets=0,
      src2dst_ack_packets=8,
      src2dst_psh_packets=4,
      src2dst_rst_packets=0,
      src2dst_fin_packets=0,
      dst2src_syn_packets=1,
      dst2src_cwr_packets=0,
      dst2src_ece_packets=0,
      dst2src_urg_packets=0,
      dst2src_ack_packets=10,
      dst2src_psh_packets=5,
      dst2src_rst_packets=0,
      dst2src_fin_packets=0)
```
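As a rough illustration of what the `*_min_ps`, `*_mean_ps`, `*_stddev_ps`, and `*_max_ps` features summarize, the following sketch computes the same four statistics over a list of packet sizes with the standard library. The `summarize` helper is hypothetical, the input is the `splt_ps` sequence shown in the SPLT example of this README, and the choice of the sample standard deviation is an assumption; NFStream's native engine may use a different estimator.

```python
import statistics


def summarize(values):
    """Hypothetical helper: min/mean/stddev/max of a numeric sequence."""
    return {
        "min": min(values),
        "mean": statistics.mean(values),
        # Assumption: sample standard deviation; falls back to 0.0 for a
        # single observation, where it is undefined.
        "stddev": statistics.stdev(values) if len(values) > 1 else 0.0,
        "max": max(values),
    }


# Sizes (in bytes) of the first 10 packets of the sample flow.
packet_sizes = [74, 74, 66, 262, 66, 1454, 66, 1454, 66, 463]

summary = summarize(packet_sizes)
print(summary)
```

Because only the first 10 packets are used here, the mean and standard deviation differ from the flow-level values reported above, which cover all 19 packets.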
NFStream performs early (up to 255 packets) flow statistical features extraction (referred to as SPLT analysis in the literature). It is summarized as a sequence of these packets' directions, sizes, and inter-arrival times.
```python
from nfstream import NFStreamer
my_streamer = NFStreamer(source="facebook.pcap",
                         # We disable l7 dissection for readability purpose.
                         n_dissections=0,
                         splt_analysis=10)

for flow in my_streamer:
    print(flow)
```
```python
# See documentation for each feature detailed description.
# https://www.nfstream.org/docs/api#nflow
NFlow(id=0,
      expiration_id=0,
      src_ip='192.168.43.18',
      src_mac='30:52:cb:6c:9c:1b',
      src_oui='30:52:cb',
      src_port=52066,
      dst_ip='66.220.156.68',
      dst_mac='98:0c:82:d3:3c:7c',
      dst_oui='98:0c:82',
      dst_port=443,
      protocol=6,
      ip_version=4,
      vlan_id=0,
      tunnel_id=0,
      bidirectional_first_seen_ms=1472393122365,
      bidirectional_last_seen_ms=1472393123665,
      bidirectional_duration_ms=1300,
      bidirectional_packets=19,
      bidirectional_bytes=5745,
      src2dst_first_seen_ms=1472393122365,
      src2dst_last_seen_ms=1472393123408,
      src2dst_duration_ms=1043,
      src2dst_packets=9,
      src2dst_bytes=1345,
      dst2src_first_seen_ms=1472393122668,
      dst2src_last_seen_ms=1472393123665,
      dst2src_duration_ms=997,
      dst2src_packets=10,
      dst2src_bytes=4400,
      # The sequence of 10 first packet direction, size and inter arrival time.
      splt_direction=[0, 1, 0, 0, 1, 1, 0, 1, 0, 1],
      splt_ps=[74, 74, 66, 262, 66, 1454, 66, 1454, 66, 463],
      splt_piat_ms=[0, 303, 0, 0, 313, 0, 0, 0, 0, 1])
```
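The idea behind the three SPLT sequences can be sketched in a few lines of plain Python. This is not NFStream's implementation: the `splt` function, the `(timestamp_ms, direction, size)` tuple layout, and the `-1` padding for flows shorter than `n` packets are all assumptions made for illustration.

```python
def splt(packets, n, pad=-1):
    """Hypothetical sketch of SPLT extraction.

    packets: list of (timestamp_ms, direction, size) tuples in arrival order,
             where direction is 0 (src to dst) or 1 (dst to src).
    Returns three lists of length n: directions, sizes, inter-arrival times.
    """
    directions, sizes, piats = [], [], []
    previous_ts = None
    for timestamp_ms, direction, size in packets[:n]:
        directions.append(direction)
        sizes.append(size)
        # The first packet has no predecessor, so its inter-arrival time is 0.
        piats.append(0 if previous_ts is None else timestamp_ms - previous_ts)
        previous_ts = timestamp_ms
    # Assumption: pad unobserved positions so every flow yields n entries.
    missing = n - len(directions)
    return (directions + [pad] * missing,
            sizes + [pad] * missing,
            piats + [pad] * missing)


# Four packets whose sizes and gaps mirror the start of the sample output.
observed = [(0, 0, 74), (303, 1, 74), (303, 0, 66), (303, 0, 262)]
splt_direction, splt_ps, splt_piat_ms = splt(observed, n=6)
print(splt_direction, splt_ps, splt_piat_ms)
```

Fixed-length sequences like these are convenient as model inputs, since every flow produces vectors of the same size regardless of how many packets it contains.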
NFStream natively supports Pandas as an export interface.
```python
# See documentation for more details.
# https://www.nfstream.org/docs/api#pandas-dataframe-conversion
from nfstream import NFStreamer
my_dataframe = NFStreamer(source='teams.pcap').to_pandas()[["src_ip",
                                                            "src_port",
                                                            "dst_ip",
                                                            "dst_port",
                                                            "protocol",
                                                            "bidirectional_packets",
                                                            "bidirectional_bytes",
                                                            "application_name"]]
my_dataframe.head(5)
```
NFStream natively supports CSV file format as an export interface.
```python
# See documentation for more details.
# https://www.nfstream.org/docs/api#csv-file-conversion
from nfstream import NFStreamer
flows_count = NFStreamer(source='facebook.pcap').to_csv(path=None,
                                                        columns_to_anonymize=(),
                                                        flows_per_file=0,
                                                        rotate_files=0)
```
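Once flows are exported, the CSV can be post-processed without NFStream. The sketch below totals bytes per application with the standard `csv` module. It assumes the exported file has a header row whose column names match the flow feature names (`application_name`, `bidirectional_bytes`); the `bytes_per_application` helper and the sample rows are hypothetical, and an in-memory string stands in for the real file.

```python
import csv
import io
from collections import defaultdict


def bytes_per_application(csv_file):
    """Hypothetical helper: sum bidirectional_bytes per application_name."""
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["application_name"]] += int(row["bidirectional_bytes"])
    return dict(totals)


# Made-up rows for illustration; a real export contains many more columns.
sample_csv = io.StringIO(
    "id,application_name,bidirectional_bytes\n"
    "0,TLS.Facebook,5745\n"
    "1,DNS,180\n"
    "2,TLS.Facebook,1255\n"
)

totals = bytes_per_application(sample_csv)
print(totals)
```

With a real export, replace the `io.StringIO` object with `open(path, newline="")` pointing at the generated file.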
Didn't find a specific flow feature? Add a plugin to NFStream in a few lines:
```python
from nfstream import NFPlugin, NFStreamer

class MyCustomPktSizeFeature(NFPlugin):
    def on_init(self, packet, flow):
        # flow creation with the first packet
        if packet.raw_size == self.custom_size:
            flow.udps.packet_with_custom_size = 1
        else:
            flow.udps.packet_with_custom_size = 0

    def on_update(self, packet, flow):
        # flow update with each packet belonging to the flow
        if packet.raw_size == self.custom_size:
            flow.udps.packet_with_custom_size += 1


extended_streamer = NFStreamer(source='facebook.pcap',
                               udps=MyCustomPktSizeFeature(custom_size=555))

for flow in extended_streamer:
    # see your dynamically created metric in generated flows
    print(flow.udps.packet_with_custom_size)
```
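To see how the two hooks cooperate without capturing any traffic, the following sketch replays a list of packet sizes through a stand-in class with the same `on_init` and `on_update` methods. Nothing here is NFStream code: `PacketSizeCounter`, `replay`, and the `SimpleNamespace` objects are hypothetical substitutes, and the dispatch order (first packet goes to `on_init` only, later packets go to `on_update`) is an assumption based on the comments in the plugin example.

```python
from types import SimpleNamespace


class PacketSizeCounter:
    """Stand-in with the same hooks as the plugin example (not an NFPlugin)."""

    def __init__(self, custom_size):
        self.custom_size = custom_size

    def on_init(self, packet, flow):
        # Called once, with the packet that creates the flow.
        matched = packet.raw_size == self.custom_size
        flow.udps.packet_with_custom_size = 1 if matched else 0

    def on_update(self, packet, flow):
        # Called for each later packet of the same flow.
        if packet.raw_size == self.custom_size:
            flow.udps.packet_with_custom_size += 1


def replay(plugin, raw_sizes):
    """Hypothetical driver: feed packets of one flow through the hooks."""
    flow = SimpleNamespace(udps=SimpleNamespace())
    for index, size in enumerate(raw_sizes):
        packet = SimpleNamespace(raw_size=size)
        if index == 0:
            plugin.on_init(packet, flow)
        else:
            plugin.on_update(packet, flow)
    return flow


flow = replay(PacketSizeCounter(custom_size=555), [555, 60, 555, 1500])
print(flow.udps.packet_with_custom_size)
```

A small driver like this can be handy for unit-testing plugin logic before attaching it to a real `NFStreamer`.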
The following simplistic example demonstrates how to train and deploy a machine-learning approach for traffic flow categorization. We want to run a classification of Social Network category flows based on bidirectional_packets and bidirectional_bytes as input features. For the sake of brevity, we decide to predict only at the flow expiration stage.
```python
from nfstream import NFPlugin, NFStreamer
import numpy
from sklearn.ensemble import RandomForestClassifier

df = NFStreamer(source="training_traffic.pcap").to_pandas()
X = df[["bidirectional_packets", "bidirectional_bytes"]]
y = df["application_category_name"].apply(lambda x: 1 if 'SocialNetwork' in x else 0)
model = RandomForestClassifier()
model.fit(X, y)


class ModelPrediction(NFPlugin):
    def on_init(self, packet, flow):
        flow.udps.model_prediction = 0

    def on_expire(self, flow):
        # You can do the same in on_update entrypoint and force expiration with custom id.
        to_predict = numpy.array([flow.bidirectional_packets,
                                  flow.bidirectional_bytes]).reshape((1, -1))
        flow.udps.model_prediction = self.my_model.predict(to_predict)


ml_streamer = NFStreamer(source="eth0", udps=ModelPrediction(my_model=model))

for flow in ml_streamer:
    print(flow.udps.model_prediction)
```
More NFPlugin examples and details are provided in the official documentation. You can also test NFStream without installation using our live demo notebook.

To build NFStream from sources, please read the installation guide provided in the official documentation.

Please read Contributing for details on our code of conduct and the process for submitting pull requests to us.

NFStream is intended for network data research and forensics. Researchers and network data scientists can use this framework to build reliable datasets and train and evaluate network-applied machine learning models. As with any packet monitoring tool, NFStream could be misused. Do not run it on any network that you do not own or administrate.

The NFStream paper is published in Computer Networks (COMNET). If you use NFStream in a scientific publication, we would appreciate citations to the following article:
```latex
@article{AOUINI2022108719,
  title = {NFStream: A flexible network data analysis framework},
  author = {Aouini, Zied and Pekar, Adrian},
  doi = {10.1016/j.comnet.2021.108719},
  issn = {1389-1286},
  journal = {Computer Networks},
  pages = {108719},
  year = {2022},
  publisher = {Elsevier},
  volume = {204},
  url = {https://www.sciencedirect.com/science/article/pii/S1389128621005739}
}
```

The following people contributed to NFStream:
- Zied Aouini: Creator and core developer.
- Adrian Pekar: Datasets generation and storage.
- Romain Picard: MDNS and DHCP plugins implementation.
- Radion Bikmukhamedov: Initial work on SPLT analysis NFPlugin.
- Jorge Casajús-Setién: JA4 signature plugin implementation.
The following organizations supported NFStream:
- SoftAtHome: Supporter of NFStream development.
- Technical University of Košice: Hardware and infrastructure for datasets generation and storage.
- ntop: Technical support of nDPI integration.
- The Nmap Project: Technical support of Npcap integration (NPCAP OEM installer on Windows CI).
- Google OSS Fuzz: Continuous fuzz testing support for the NFStream project.

More than 100 research papers have already used NFStream as part of their processing pipelines.

This project is licensed under the LGPLv3 License - see the License file for details.