- Notifications
You must be signed in to change notification settings - Fork7
This project consists of implementations of several kNN algorithms for road networks (aka finding nearest points of interest) and the experimental framework to compare them from a research paper published in PVLDB 2016. You can use it to add new methods and/or queries or reproduce our experimental results.
License
tenindra/RN-kNN-Exp
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
This project consists of implementations of several kNN algorithms for road networks and the experimental framework to compare them. This has primarily been released to allow readers to reproduce results from a paper to appear at VLDB 2016 (details to be updated) and to use in future studies. If you use the code, please cite ourpaper. Please follow the Requirements below carefully. If you have any issues please check theFAQ andcontact us.
Before the executable can be compiled the following packages/libraries must be installed. Note that we were able to install METIS from the Ubuntu repositories if you are using that operating system.
- g++ version 4.9 or higher
- Boost version 1.57 or higher for serialization
- [METIS] (http://glaros.dtc.umn.edu/gkhome/metis/metis/download) version 5.1 or higher
- CMake version 2.8 or higher
- gnuplot for figure generation
- Optional: [Google Sparsehash] (http://code.google.com/p/sparsehash/) for comparison of G-tree hashtables implementations
Note: We found Boost versions lower than 1.57 did not correctly support serialization of some STL data structures (such asstd::unordered_map
) thus they cannot be used here. This means you need to remove any existing Boost version, e.g. by uninstallinglibboost-all-dev
through your package manager.
We recommend a 64-bit OS due to the size of indexes (and we have not tested on 32-bit). Furthermore 32GB RAM is required to reproduce all experiments as they are in the paper. Results can be reproduced using a smaller amount of RAM by omitting some of the larger datasets (see below). Also ensure that there is 200GB of disk space freely available on the hard disk storing indexes (for both travel distance and travel time experiments).
The C++ source code can be found in the top-levelRN-kNN-Exp
directory, and in the subdirectoriescommand
,external
,processing
,queue
,tuple
andutility
. All bash scripts for various tasks such as setting up the directories, running experiments, and generating figures are found in thescripts
subdirectory.
Change
CMAKE_C_COMPILER
andCMAKE_CXX_COMPILER
in CMakeLists.txt to point to the correctg++
version (i.e. 4.9 or higher).CMakeLists.txt
can be found in theRN-kNN-Exp
directory.Create a directory called
build
in the top-level (e.g. RN-kNN-Exp/) directorymkdir build
Change into this directory and generate makefiles using
CMake
cd buildcmake -G "Unix Makefiles" ../CMakeLists.txt ..
Compile all executables using make
make
Open the
globalVariables
bash script inscripts
directory in your favourite editorChange the
output_path
variable to the full-path of anexisting (i.e. create it first) location where you wish to store all data (e.g. indexes, objects sets, figures, etc...)Note: This can be different to the path containing the code (recommended)
Change the
exe_path
variable to the full path of the build directory created above (e.g./home/user/Downloads/rn_knn/build
)Run
resetExperimentalSetup
in thescripts
directory to create all necessary sub-directoriescd scriptsbash resetExperimentalSetup
Download the DIMACS distance edge-weight graph files (with extension .gr.gz and prefixed by USA-road-d) and coordinate files (with extension .co.gz) for DE, VT, ME and NW fromhttps://github.com/tenindra/RN-kNN-Exp-Data to the
output_path/data/dimacs
directoryDownload the DIMACS distance edge-weight graph files and coordinate files for COL, CAL, E, W, CTR, USA fromhttp://www.dis.uniroma1.it/challenge9/ to the
output_path/data/dimacs
directoryDownload the node (with extension .cnode) and edge (with extension .cedge) files for North America (NA) fromhttp://www.cs.utah.edu/~lifeifei/SpatialDataset.htm to the
output_path/data/tpq
directoryNote: Youmust gzip these files so that
output_path/data/tpq
directory contains NA.cnode.gz and NA.cedge.gzDownload the real_world_pois.tar.gz fromhttps://github.com/tenindra/RN-kNN-Exp-Data to the
output_path
directoryUnzip the real_world_pois.tar.gz archive (this should create a
output_path/real_world_pois
directory populated with subdirectories with POI sets for several road networks)
Experiments can still be run with less than 32GB RAM, but for fewer datasets. To do this go to theglobalVariables
file and modify theroad_networks
array. The road networks are in size order, so simply remove the road networks from the end. E.g. to run experiments up to the Eastern US dataset change it toroad_networks=("DE" "VT" "ME" "COL" "FLA" "CAL" "E")
.
Note 1: All in-depth experiments on the US dataset will not be possible, of course. In this case it is best to comment out those experiments in therunPaperExperiments
script.
Note 2: SILC (index used by Distance Browsing) can only be built upto the default NW dataset. SILC on the NW requires at least 20GB of memory. Less than this will cause all default experiments to be missing Distance Browsing comparisons. In this case we suggest changing the default road network to COL (for which SILC only requires 8GB), by changingdefault_network
anddefault_parameters
to COL.
The follow instructions can be followed to re-create all figures from the paper. Assuming your are already in thescripts
directory (otherwise cd into scripts):
Clean the DIMACS and TPQ datasets for errors and redundancy
bash transformInputData
Build the binary files for the basic graph representations
bash buildBinaryGraphs
Build all road network indexes (this may take a while... ~15 hours on our machine)
bash buildRoadNetworkIndexes
Generate query sets
bash generateQuerySets
Generate random object sets and build corresponding object indexes
bash buildObjectIndexes
Note: You can run
resetExperimentalSetup
again to remove all object indexes and query setsRun all paper experiments and produce figures
bash runPaperExperiments
Note: The above command also creates figures, but if you wish to recreate figures without running experiments again (which can take sometime), you can use:
bash createPaperFigures
Note: All above commands may be batched in the shell terminal, just enter each separated by semi-colon ";"
Travel times experiments must be reproduced indepedently, as they require different indexes. These can be re-produced using the following procedure:
Create a new directory to store data for travel time experiments (e.g. figures etc...) somewhere
Change the
output_path
variable in globalVariables the full-path of this new locationChange the
edge_type
variable toedge_type=t
Note: The
edge_type
variable must be changed back to be "d" to run travel distance experiments againRun the
resetExperimentalSetup
to create all necessary sub-directoriesbash resetExperimentalSetup
Download the DIMACS travel time edge-weight graph files (with extension .gr.gz and prefixed by USA-road-t) and coordinate files (with extension .co.gz) for DE, VT, ME and NW fromhttps://github.com/tenindra/RN-kNN-Exp-Data to the
output_path/data/dimacs
directoryDownload the DIMACS travel time edge-weight graph files and coordinate files for COL, CAL, E, W, CTR, USA fromhttp://www.dis.uniroma1.it/challenge9/ to the
output_path/data/dimacs
directoryDownload the real_world_pois.tar.gz fromhttps://github.com/tenindra/RN-kNN-Exp-Data to the
output_path/ directory
Unzip the real_world_pois.tar.gz archive (this should create a
output_path/real_world_pois
directory populated with subdirectories with POI sets for several road networks)Clean the DIMACS datasets for errors and redundancy for travel time edge weights
bash transformInputData
Build the binary files for the basic graph representations for travel time edge weights
bash buildBinaryGraphs
As in steps 3-5 in "Running Travel Distance Experiments", execute:
bash buildRoadNetworkIndexesbash generateQuerySetsbash buildObjectIndexes
Run all travel time experiments and produce figures
bash runTravelTimeExperiments
Note: The above command also creates figures, but if you wish to recreate figures without running experiments again (which can take sometime), you can use:
bash createTravelTimeFigures
Parts of open source projects associated with the following publications were used in our project:
- Takuya Akiba, Yoichi Iwata, Ken-ichi Kawarabayashi, and Yuki Kawata, Fast Shortest-path Distance Queries on Road Networks by Pruned Highway Labeling. In ALENEX 2014. (Code)
- Lingkun Wu, Xiaokui Xiao, Dingxiong Deng, Gao Cong, Andy Diwen Zhu, Shuigeng Zhou: Shortest Path and Distance Queries on Road Networks: An Experimental Evaluation. PVLDB 5(5): 406-417 2012. (Code)
Road Network kNN Experimental Evaluation is free software; you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version.
Road Network kNN Experimental Evaluation is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with Road Network kNN Experimental Evaluation; see LICENSE.txt; if not, seehttp://www.gnu.org/licenses/.
About
This project consists of implementations of several kNN algorithms for road networks (aka finding nearest points of interest) and the experimental framework to compare them from a research paper published in PVLDB 2016. You can use it to add new methods and/or queries or reproduce our experimental results.