Fast CSV parser and writer for Modern C++
p-ranav/csv2
```cpp
#include <csv2/reader.hpp>

int main() {
  csv2::Reader<csv2::delimiter<','>,
               csv2::quote_character<'"'>,
               csv2::first_row_is_header<true>,
               csv2::trim_policy::trim_whitespace> csv;

  if (csv.mmap("foo.csv")) {
    const auto header = csv.header();
    for (const auto row: csv) {
      for (const auto cell: row) {
        // Do something with cell value
        // std::string value;
        // cell.read_value(value);
      }
    }
  }
}
```
This benchmark measures the average execution time (of 5 runs after 3 warmup runs) for `csv2` to memory-map the input CSV file and iterate over every cell in the CSV. See `benchmark/main.cpp` for more details.
```bash
cd benchmark
g++ -I../include -O3 -std=c++11 -o main main.cpp
./main <csv_file>
```
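The measurement scheme described above (a few untimed warmup runs followed by timed runs whose durations are averaged) can be sketched with the standard library alone. This is an illustrative sketch, not the code in `benchmark/main.cpp`; the function name `average_seconds` and its parameters are assumptions made for this example.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of csv2): run `work` a few times untimed,
// then time it `timed_runs` times and return the mean duration in seconds.
template <typename Fn>
double average_seconds(Fn&& work,
                       std::size_t warmup_runs = 3,
                       std::size_t timed_runs = 5) {
  // Warmup runs are executed but not measured.
  for (std::size_t i = 0; i < warmup_runs; ++i) {
    work();
  }

  if (timed_runs == 0) {
    return 0.0;  // Nothing to average.
  }

  using clock = std::chrono::steady_clock;
  double total_seconds = 0.0;
  for (std::size_t i = 0; i < timed_runs; ++i) {
    const auto start = clock::now();
    work();
    const auto stop = clock::now();
    total_seconds += std::chrono::duration<double>(stop - start).count();
  }
  return total_seconds / static_cast<double>(timed_runs);
}
```

In a real benchmark, `work` would be a callable that memory-maps the file and iterates over every cell; a monotonic clock such as `std::chrono::steady_clock` is used so that system clock adjustments do not distort the result.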
Type | Value |
---|---|
Processor | 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz 3.50 GHz |
Installed RAM | 32.0 GB (31.9 GB usable) |
SSD | ADATA SX8200PNP |
OS | Ubuntu 20.04 LTS running on WSL in Windows 11 |
C++ Compiler | g++ (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 |
Dataset | File Size | Rows | Cols | Time |
---|---|---|---|---|
Denver Crime Data | 111 MB | 479,100 | 19 | 0.102s |
AirBnb Paris Listings | 196 MB | 141,730 | 96 | 0.170s |
2015 Flight Delays and Cancellations | 574 MB | 5,819,079 | 31 | 0.603s |
StackLite: Stack Overflow questions | 870 MB | 17,203,824 | 7 | 0.911s |
Used Cars Dataset | 1.4 GB | 539,768 | 25 | 0.947s |
Title-Based Semantic Subject Indexing | 3.7 GB | 12,834,026 | 4 | 2.867s |
Bitcoin tweets - 16M tweets | 4 GB | 47,478,748 | 9 | 3.290s |
DDoS Balanced Dataset | 6.3 GB | 12,794,627 | 85 | 6.963s |
Seattle Checkouts by Title | 7.1 GB | 34,892,623 | 11 | 7.698s |
SHA-1 password hash dump | 11 GB | 262,974,241 | 2 | 10.775s |
DOHUI NOH scaled_data | 16 GB | 496,782 | 3213 | 16.553s |
Here is the public API available to you:
```cpp
template <class delimiter = delimiter<','>,
          class quote_character = quote_character<'"'>,
          class first_row_is_header = first_row_is_header<true>,
          class trim_policy = trim_policy::trim_whitespace>
class Reader {
public:
  // Use this if you'd like to mmap and read from file
  bool mmap(string_type filename);

  // Use this if you have the CSV contents in std::string already
  bool parse(string_type contents);

  // Shape
  size_t rows() const;
  size_t cols() const;

  // Row iterator
  // If first_row_is_header, row iteration will start
  // from the second row
  RowIterator begin() const;
  RowIterator end() const;

  // Access the first row of the CSV
  Row header() const;
};
```
Here's the `Row` class:
```cpp
// Row class
class Row {
public:
  // Get raw contents of the row
  void read_raw_value(Container& value) const;

  // Cell iterator
  CellIterator begin() const;
  CellIterator end() const;
};
```
and here's the `Cell` class:
```cpp
// Cell class
class Cell {
public:
  // Get raw contents of the cell
  void read_raw_value(Container& value) const;

  // Get converted contents of the cell
  // Handles escaped content, e.g.,
  // """foo""" => ""foo""
  void read_value(Container& value) const;
};
```
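To make the `"""foo""" => ""foo""` example concrete, here is a minimal standard-library sketch of that kind of conversion: each doubled quote character is collapsed into a single one, scanning left to right. The helper `collapse_doubled_quotes` is hypothetical and is not csv2's implementation; it only reproduces the transformation shown in the comment above.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of csv2): collapse every pair of
// consecutive quote characters into a single quote character.
std::string collapse_doubled_quotes(const std::string& raw, char quote = '"') {
  std::string out;
  out.reserve(raw.size());
  for (std::size_t i = 0; i < raw.size(); ++i) {
    out.push_back(raw[i]);
    // If this quote is immediately followed by another quote,
    // skip the second one so the pair becomes a single quote.
    if (raw[i] == quote && i + 1 < raw.size() && raw[i + 1] == quote) {
      ++i;
    }
  }
  return out;
}
```

Under this sketch, the nine-character input `"""foo"""` becomes the seven-character `""foo""`, and input without any doubled quotes is returned unchanged.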
This library also provides a basic `csv2::Writer` class, one that can be used to write CSV rows to a file. Here's a basic usage example:
```cpp
#include <csv2/writer.hpp>
#include <fstream>  // std::ofstream
#include <string>
#include <vector>
using namespace csv2;

int main() {
  std::ofstream stream("foo.csv");
  Writer<delimiter<','>> writer(stream);

  std::vector<std::vector<std::string>> rows = {
    {"a", "b", "c"},
    {"1", "2", "3"},
    {"4", "5", "6"}
  };

  writer.write_rows(rows);
  stream.close();
}
```
Here is the public API available to you:
```cpp
template <class delimiter = delimiter<','>>
class Writer {
public:
  // Construct using an std::ofstream
  Writer(output_file_stream stream);

  // Use this to write a single row to file
  void write_row(container_of_strings row);

  // Use this to write a list of rows to file
  void write_rows(container_of_rows rows);
};
```
To build and run the tests:

```bash
mkdir build && cd build
cmake -DCSV2_BUILD_TESTS=ON ..
make
cd test
./csv2_test
```
To generate the single-header version:

```bash
python3 utils/amalgamate/amalgamate.py -c single_include.json -s .
```
Contributions are welcome; have a look at the CONTRIBUTING.md document for more information.
The project is available under the MIT license.