Fast CSV parser and writer for Modern C++
p-ranav/csv2
```cpp
#include <csv2/reader.hpp>

int main() {
  csv2::Reader<csv2::delimiter<','>,
               csv2::quote_character<'"'>,
               csv2::first_row_is_header<true>,
               csv2::trim_policy::trim_whitespace> csv;

  if (csv.mmap("foo.csv")) {
    const auto header = csv.header();
    for (const auto row: csv) {
      for (const auto cell: row) {
        // Do something with cell value
        // std::string value;
        // cell.read_value(value);
      }
    }
  }
}
```
This benchmark measures the average execution time (of 5 runs after 3 warmup runs) for `csv2` to memory-map the input CSV file and iterate over every cell in the CSV. See `benchmark/main.cpp` for more details.
```bash
cd benchmark
g++ -I../include -O3 -std=c++11 -o main main.cpp
./main <csv_file>
```
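The measurement scheme described above (a few untimed warmup runs, then the mean of several timed runs) can be sketched as follows. This is a minimal illustration, not the code in `benchmark/main.cpp`; the helper name `average_seconds` and its defaults are assumptions chosen to match the 3 warmup and 5 timed runs mentioned in the text.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of csv2): calls `work` `warmup` times without
// timing it, then `runs` more times, and returns the mean wall-clock time of
// the timed calls in seconds.
template <typename Work>
double average_seconds(Work&& work, std::size_t warmup = 3, std::size_t runs = 5) {
  for (std::size_t i = 0; i < warmup; ++i) {
    work();  // untimed: lets caches and the OS page cache settle
  }

  using clock = std::chrono::steady_clock;
  double total = 0.0;
  for (std::size_t i = 0; i < runs; ++i) {
    const auto start = clock::now();
    work();
    const auto stop = clock::now();
    total += std::chrono::duration<double>(stop - start).count();
  }
  return runs == 0 ? 0.0 : total / static_cast<double>(runs);
}
```

Passing a lambda that memory-maps a file and walks every cell to `average_seconds` would produce a figure comparable in spirit to the timings in the results table, although the exact numbers depend on the machine and the state of the file cache.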
Type | Value |
---|---|
Processor | 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz 3.50 GHz |
Installed RAM | 32.0 GB (31.9 GB usable) |
SSD | ADATA SX8200PNP |
OS | Ubuntu 20.04 LTS running on WSL in Windows 11 |
C++ Compiler | g++ (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 |
Dataset | File Size | Rows | Cols | Time |
---|---|---|---|---|
Denver Crime Data | 111 MB | 479,100 | 19 | 0.102s |
AirBnb Paris Listings | 196 MB | 141,730 | 96 | 0.170s |
2015 Flight Delays and Cancellations | 574 MB | 5,819,079 | 31 | 0.603s |
StackLite: Stack Overflow questions | 870 MB | 17,203,824 | 7 | 0.911s |
Used Cars Dataset | 1.4 GB | 539,768 | 25 | 0.947s |
Title-Based Semantic Subject Indexing | 3.7 GB | 12,834,026 | 4 | 2.867s |
Bitcoin tweets - 16M tweets | 4 GB | 47,478,748 | 9 | 3.290s |
DDoS Balanced Dataset | 6.3 GB | 12,794,627 | 85 | 6.963s |
Seattle Checkouts by Title | 7.1 GB | 34,892,623 | 11 | 7.698s |
SHA-1 password hash dump | 11 GB | 262,974,241 | 2 | 10.775s |
DOHUI NOH scaled_data | 16 GB | 496,782 | 3213 | 16.553s |
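To put the timings in perspective, the table rows can be converted into rough throughput figures. The helpers below are hypothetical (they are not part of csv2), and because the file sizes in the table are rounded, the results are only approximate.

```cpp
#include <cstdint>

// Hypothetical helper: throughput in MB/s from a file size in MB and a time in seconds.
inline double throughput_mb_per_s(double size_mb, double seconds) {
  return size_mb / seconds;
}

// Hypothetical helper: cells visited per second, given the table's row and column counts.
inline double cells_per_second(std::uint64_t rows, std::uint64_t cols, double seconds) {
  return static_cast<double>(rows) * static_cast<double>(cols) / seconds;
}
```

For the Denver Crime Data row, `throughput_mb_per_s(111.0, 0.102)` comes to roughly 1088 MB/s and `cells_per_second(479100, 19, 0.102)` to roughly 89 million cells per second.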
Here is the public API available to you:
```cpp
template <class delimiter = delimiter<','>,
          class quote_character = quote_character<'"'>,
          class first_row_is_header = first_row_is_header<true>,
          class trim_policy = trim_policy::trim_whitespace>
class Reader {
public:
  // Use this if you'd like to mmap and read from file
  bool mmap(string_type filename);

  // Use this if you have the CSV contents in std::string already
  bool parse(string_type contents);

  // Shape
  size_t rows() const;
  size_t cols() const;

  // Row iterator
  // If first_row_is_header, row iteration will start
  // from the second row
  RowIterator begin() const;
  RowIterator end() const;

  // Access the first row of the CSV
  Row header() const;
};
```
Here's the `Row` class:
```cpp
// Row class
class Row {
public:
  // Get raw contents of the row
  void read_raw_value(Container& value) const;

  // Cell iterator
  CellIterator begin() const;
  CellIterator end() const;
};
```
and here's the `Cell` class:
```cpp
// Cell class
class Cell {
public:
  // Get raw contents of the cell
  void read_raw_value(Container& value) const;

  // Get converted contents of the cell
  // Handles escaped content, e.g.,
  // """foo""" => ""foo""
  void read_value(Container& value) const;
};
```
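The difference between `read_raw_value` and `read_value` is easiest to see on the example from the comment, where `"""foo"""` becomes `""foo""`. One transformation that reproduces this mapping is collapsing each pair of consecutive quote characters into a single one. The sketch below illustrates that idea only; the function name `collapse_doubled_quotes` is hypothetical, and this is an assumption about the behavior rather than the library's actual code.

```cpp
#include <cstddef>
#include <string>

// Illustrative only, not csv2's implementation: collapses each pair of
// consecutive quote characters into one. This is an assumed reading of the
// `"""foo""" => ""foo""` example in the Cell class comment.
inline std::string collapse_doubled_quotes(const std::string& raw, char quote = '"') {
  std::string out;
  out.reserve(raw.size());
  for (std::size_t i = 0; i < raw.size(); ++i) {
    // When two quotes appear back to back, skip the first and keep the second.
    if (raw[i] == quote && i + 1 < raw.size() && raw[i + 1] == quote) {
      ++i;
    }
    out.push_back(raw[i]);
  }
  return out;
}
```

With this sketch, the input `"""foo"""` yields `""foo""`, and text without doubled quotes passes through unchanged.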
This library also provides a basic `csv2::Writer` class that can be used to write CSV rows to a file. Here's a basic usage example:
```cpp
#include <csv2/writer.hpp>
#include <fstream>  // std::ofstream
#include <string>
#include <vector>
using namespace csv2;

int main() {
  std::ofstream stream("foo.csv");
  Writer<delimiter<','>> writer(stream);

  std::vector<std::vector<std::string>> rows = {
    {"a", "b", "c"},
    {"1", "2", "3"},
    {"4", "5", "6"}
  };

  writer.write_rows(rows);
  stream.close();
}
```
Here is the public API available to you:
```cpp
template <class delimiter = delimiter<','>>
class Writer {
public:
  // Construct using an std::ofstream
  Writer(output_file_stream stream);

  // Use this to write a single row to file
  void write_row(container_of_strings row);

  // Use this to write a list of rows to file
  void write_rows(container_of_rows rows);
};
```
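To make the shape of `write_row` and `write_rows` concrete, here is a self-contained sketch of a delimiter-joined row writer built on the standard library alone. It is a hypothetical stand-in, not `csv2::Writer`: the names `write_row_sketch`, `write_rows_sketch`, and `rows_to_string` are invented for illustration, and the sketch assumes cells are written verbatim with no quoting or escaping.

```cpp
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration; not csv2::Writer itself.
// Writes the cells of one row separated by `delim`, followed by a newline.
// Assumption: cells are emitted as-is, without quoting or escaping.
inline void write_row_sketch(std::ostream& out,
                             const std::vector<std::string>& row,
                             char delim = ',') {
  for (std::size_t i = 0; i < row.size(); ++i) {
    if (i != 0) {
      out << delim;
    }
    out << row[i];
  }
  out << '\n';
}

// Writes every row in order using write_row_sketch.
inline void write_rows_sketch(std::ostream& out,
                              const std::vector<std::vector<std::string>>& rows,
                              char delim = ',') {
  for (const auto& row : rows) {
    write_row_sketch(out, row, delim);
  }
}

// Convenience wrapper that collects the output in a string instead of a file.
inline std::string rows_to_string(const std::vector<std::vector<std::string>>& rows,
                                  char delim = ',') {
  std::ostringstream out;
  write_rows_sketch(out, rows, delim);
  return out.str();
}
```

Because the functions take a `std::ostream&`, the same sketch works with a `std::ofstream` for files or a `std::ostringstream` for in-memory output.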
To build and run the tests:

```bash
mkdir build && cd build
cmake -DCSV2_BUILD_TESTS=ON ..
make
cd test
./csv2_test
```
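To generate the single-header version using the configuration in `single_include.json`:

```bash
python3 utils/amalgamate/amalgamate.py -c single_include.json -s .
```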
Contributions are welcome. Have a look at the `CONTRIBUTING.md` document for more information.

The project is available under the MIT license.