(Note: For further information on other OARC data, please consult the OARC Data Catalog. Information about contributing to DITL can be found below.)
An Overview
DNS-OARC collects DNS packet captures from busy and interesting DNS nameservers through various means, including the annual Day In The Life of the Internet (DITL) collection effort. OARC also holds data collections in other formats, such as BIND query logs.
The Day in the Life collection is an annual event where contributors record traffic from DNS servers they manage for the same 50-hour period. This provides an annual two-day slice of DNS traffic across a broad swath of the Internet which is useful for research into things like "typical" behavior, trends over time, and other subjects.
OARC offers access to these data to researchers and OARC members through the use of a number of analysis servers. Approved users/individuals are given logins on these servers to conduct whatever studies are necessary for their work. Data contributed to DNS-OARC under our Data Sharing Agreement requires that the original data be kept on servers under OARC's control, and restricts data extracted from these servers to highly aggregated and anonymized data, synthesized from the original, and suitable for publication.
Organizations wishing to become an OARC Member, in order to access these data, should consult the Joining and Participating in DNS-OARC page.
A Day in the Life of the Internet is a large-scale data collection project, initially undertaken by CAIDA, and managed by OARC every year since 2006. If you would like to participate by collecting and contributing DNS packet captures, please email us.
Participation Requirements
There are no strict participation requirements. OARC is happy to accept data from members and non-members alike. We particularly encourage participation from:
- root server operators
- TLD operators
- AS112 operators and other large reverse zone operators (e.g. Regional Internet Registries, ISPs)
- medium sized recursive operators (e.g. large universities or enterprises, regional ISPs)
For recursive server contributions, we expect the data collection to be done on the network interface "above" the recursive server, capturing traffic to authoritative servers instead of traffic directly from individual clients. This avoids privacy issues with personally identifiable information.
Any organization that wishes to contribute data to DITL should contact staff to coordinate setup.
Types of DNS Data
DITL contributions are typically PCAP files (from tools like dnscap or tcpdump). OARC has an established system to receive a stream of compressed PCAP files from contributors, live during the collection. Contributing organizations will need a login for DNS-OARC's data collection system to upload data. Contact OARC staff if you wish to participate and do not already have a login for your organization.
In some cases in the past we have accepted data in other formats, such as query logs. If you wish to contribute data in a format other than PCAP, please contact us to make other arrangements.
Technical Information for Contributors
Pre-collection Checklist
- Please make sure that your collection hosts are time-synchronized with NTP. Do not simply use date to check a clock, as timezone offsets can go unnoticed. Use ntpd (or a similar long-running daemon) to keep your clocks in sync, or use ntpdate like this:
$ ntpdate -q time.google.com
server 204.152.184.72, stratum 1, offset 0.002891, delay 0.02713
Pick a time server that makes sense for your network.
The reported offset should normally be very small (less than one second). If not, your clock is probably not synchronized with NTP.
- Be sure to do some "dry runs" before the actual collection time. This will test your procedures and give you a sense of how much data you'll be collecting. OARC runs an official test window two weeks before the main collection, but other testing is welcome anytime up to a few days before.
- Carefully consider your local storage options. Do you have enough local space to store all the DITL data? Or will you need to upload it as it is being collected? If you have enough space, perhaps you'll find it easier to collect first and upload after, rather than trying to manage both at the same time.
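The clock check and the storage question in the checklist above can both be scripted. The following is a minimal sketch, not part of ditl-tools: `offset_ok` and `estimate_ditl_bytes` are hypothetical helper names, the first assumes `ntpdate -q` output in the format shown above, and the second assumes that dry-run traffic is representative of the full 50-hour window (it may not be, since traffic varies by time of day).

```shell
# Hypothetical helpers for the pre-collection checklist; not part of ditl-tools.

# Succeed (exit 0) if the "offset" value in one line of `ntpdate -q` output
# is under one second in absolute value. Exit 1 if it is larger, and exit 2
# if the line contains no offset field.
offset_ok() {
    printf '%s\n' "$1" | awk '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "offset") {
                    off = $(i + 1)
                    sub(/,$/, "", off)   # drop the trailing comma
                    off += 0             # force a numeric value
                    if (off < 0) off = -off
                    found = 1
                    small = (off < 1.0)
                }
            }
        }
        END { exit (found ? (small ? 0 : 1) : 2) }
    '
}

# Scale the size of a dry run linearly to the 50-hour collection window.
# $1: bytes captured during the dry run, $2: dry-run length in minutes.
estimate_ditl_bytes() {
    echo $(( $1 * 50 * 60 / $2 ))
}
```

For example, `estimate_ditl_bytes 1073741824 60` scales a one-hour, 1 GiB dry run to 53687091200 bytes (50 GiB) for the full collection. Treat the result as a rough lower bound and leave headroom.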
Collecting Data with dnscap
If you don't already have your own collection system for DNS traffic, we recommend using dnscap, with some shell scripts that we provide specifically for DITL:
- Install the most recent version of dnscap, available from the OARC package repository.
- Next, download the ditl-tools package. This provides scripts for automatic capture and upload using either dnscap, or tcpdump with tcpdump-split.
In most cases dnscap should be the easiest option. The tcpdump method is included for sites that would prefer it or cannot use dnscap for some reason. Note that the settings.sh configuration file described below includes variables for both dnscap and tcpdump. Some variables are common to both, while some are unique to each. By default these will store pcap files in the current directory. You may want to copy these scripts to a different directory where you have plenty of free disk space.
- Copy settings.sh.default to settings.sh.
- Open settings.sh in a text editor.
- Set the IFACES variable to the names of your network interfaces carrying DNS data.
IMPORTANT: For recursive servers this should be the interface where outgoing queries toward authoritative servers or upstream forwarders exit the system, and not the interface where incoming client queries are received. If these are the same interface, then additional filters will be required to ensure only queries sourced from the local server are captured.
- Set the NODENAME variable (or leave it commented to use the output of hostname as the NODENAME). Please make sure that each instance of dnscap that you run has a unique NODENAME!
- Set the OARC_MEMBER variable to your OARC-assigned contributor login. The provided scripts automatically prepend oarc- to the login name before connecting, so just give the short version here. The scripts assume your OARC ssh upload key is at /root/.ssh/oarc_id_ed25519 unless the settings are changed.
- Look over the remaining variables in settings.sh. Read the comments in capture-dnscap.sh to understand what all the variables mean.
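For the recursive-server case noted under IFACES, where client-facing and outgoing traffic share one interface, one possible "additional filter" is a pcap filter expression that matches only queries sourced from the server's own addresses. The sketch below builds such an expression. `local_query_filter` is a hypothetical helper, the addresses are documentation examples, and how the expression is handed to the capture scripts is not covered here (check the comments in settings.sh and the capture script).

```shell
# Hypothetical helper: build a pcap filter expression that matches DNS
# queries sent from the given local addresses (IPv4 or IPv6) to port 53.
local_query_filter() {
    [ "$#" -gt 0 ] || return 1   # need at least one local address
    filter=""
    for addr in "$@"; do
        if [ -n "$filter" ]; then
            filter="$filter or "
        fi
        filter="${filter}src host $addr"
    done
    printf '(%s) and dst port 53\n' "$filter"
}

# Example with documentation addresses standing in for the resolver's own:
local_query_filter 192.0.2.53 2001:db8::53
```

This matches outgoing queries only. If responses from authoritative servers should also be captured, a mirrored clause on `dst host` and `src port 53` would be needed.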
Here is an example of a customized settings.sh file:
# Settings that you should customize
#
IFACES="fxp0"
NODENAME="lgh"
OARC_MEMBER="test"
# START_T='2011-04-12 11:00:00'
# STOP_T='2011-04-14 13:00:00'
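Before starting a capture, it can help to sanity-check the customized values. The following is a minimal sketch, assuming settings.sh is a plain shell fragment of variable assignments that is safe to source. `check_settings` is a hypothetical helper and not part of ditl-tools.

```shell
# Hypothetical helper: source a settings file in a subshell and report
# common mistakes. $1 is the path to the file (include a slash, e.g.
# ./settings.sh, so the shell does not search PATH for it).
check_settings() (
    IFACES= NODENAME= OARC_MEMBER=
    . "$1" || exit 2
    status=0
    [ -n "$IFACES" ] || { echo "IFACES is not set" >&2; status=1; }
    [ -n "$OARC_MEMBER" ] || { echo "OARC_MEMBER is not set" >&2; status=1; }
    case $OARC_MEMBER in
        oarc-*)
            # The capture scripts prepend oarc- themselves
            echo "OARC_MEMBER should be the short login, without oarc-" >&2
            status=1
            ;;
    esac
    # A commented-out NODENAME falls back to the output of hostname
    echo "node name: ${NODENAME:-$(hostname)}"
    exit "$status"
)
```

Run it as `check_settings ./settings.sh`. A zero exit status means the basic values look plausible. It does not verify that the interface exists or that the upload key works.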
When you're done customizing the settings, run capture-dnscap.sh as root:
$ sudo sh capture-dnscap.sh
When it's time to do the actual DITL data collection, uncomment the START_T and STOP_T variables in settings.sh. The date settings for each year's DITL collection are communicated to the contributors in early February in order to give plenty of notice for testing and setup.
With the date values set, the script will automatically start and stop capturing data at the correct times.
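A quick way to catch a mistyped date is to confirm that the two values span the expected 50 hours. The following is a minimal sketch, assuming GNU date (the `-d` option) and reading both timestamps as UTC for the purpose of the subtraction. `window_hours` is a hypothetical helper, and the timezone the capture scripts themselves apply is not covered here.

```shell
# Hypothetical helper: whole hours between two timestamps written in the
# 'YYYY-MM-DD HH:MM:SS' form used by START_T and STOP_T.
# Assumes GNU date; both values are interpreted as UTC.
window_hours() {
    start=$(date -u -d "$1" +%s) || return 1
    stop=$(date -u -d "$2" +%s) || return 1
    echo $(( (stop - start) / 3600 ))
}

# The sample values from the settings.sh example span 50 hours:
window_hours '2011-04-12 11:00:00' '2011-04-14 13:00:00'
```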
You can run the scripts from within a terminal session manager like screen or tmux to avoid terminal disconnections from prematurely ending your collection.
Collecting Data with tcpdump and tcpdump-split
Another collection option is to use tcpdump and our tcpdump-split program. The instructions are similar to the above:
- Download and install the ditl-tools package.
- Follow the instructions in the ditl-tools README.md file for compiling and installing tcpdump-split.
- Copy settings.sh.default to settings.sh and bring it up in a text editor.
- Set the IFACES variable to the single network interface to collect DNS data from.
- Set NODENAME and OARC_MEMBER as above.
- Set DESTINATIONS if desired.
- Start the capture with:
$ sudo sh capture-tcpdump.sh
- Set and uncomment the START_T and STOP_T values, and use screen or tmux for the main collection event, also as above.
Uploading Data Manually
If the scripts are interrupted and need to be restarted, or if files accumulated on your local server without being uploaded (for example, because of a bad SSH key), you can still upload that data manually. To do so, invoke the pcap-submit-to-oarc.sh script in a shell loop, like so:
for F in *.pcap.gz; do /your/path/to/pcap-submit-to-oarc.sh "$F"; done
If you happen to have files that have not been compressed, compress those manually and then invoke the script as above.
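The manual compression step can be sketched as follows, assuming uncompressed captures end in .pcap and that gzip is the expected compression (it matches the .pcap.gz names used in the upload loop). `compress_pending` is a hypothetical helper and not part of ditl-tools.

```shell
# Hypothetical helper: gzip any leftover uncompressed captures in a
# directory so they match the *.pcap.gz pattern used by the upload loop.
# $1: directory holding the capture files.
compress_pending() {
    for f in "$1"/*.pcap; do
        [ -e "$f" ] || continue   # the glob matched nothing
        gzip -- "$f"
    done
}
```

After `compress_pending .` finishes, the upload loop shown above picks up the newly compressed files. Make sure no capture process is still writing to a file before compressing it.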
Anonymization
Data providers may wish to anonymize their PCAP data prior to upload due to privacy concerns, corporate policy or local legal requirements. There are a number of tools available for anonymizing PCAP data in ways that are still scientifically useful. OARC staff can put new contributors in contact with existing contributors who are already anonymizing their contributions for pointers and assistance.
Contact
Contact DNS-OARC staff with any questions about DITL.

