
Fast CSV parser and writer for Modern C++


p-ranav/csv2


csv2


CSV Reader

#include <csv2/reader.hpp>

int main() {
  csv2::Reader<csv2::delimiter<','>,
               csv2::quote_character<'"'>,
               csv2::first_row_is_header<true>,
               csv2::trim_policy::trim_whitespace> csv;

  if (csv.mmap("foo.csv")) {
    const auto header = csv.header();
    for (const auto row: csv) {
      for (const auto cell: row) {
        // Do something with cell value
        // std::string value;
        // cell.read_value(value);
      }
    }
  }
}

Performance Benchmark

This benchmark measures the average execution time (of 5 runs after 3 warmup runs) for csv2 to memory-map the input CSV file and iterate over every cell in the CSV. See benchmark/main.cpp for more details.

cd benchmark
g++ -I../include -O3 -std=c++11 -o main main.cpp
./main <csv_file>
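The measurement scheme described above (a few untimed warmup runs followed by averaged timed runs) can be sketched with the standard library alone. This is only an illustration of the scheme, not the code in benchmark/main.cpp; the helper name `average_seconds` and its parameters are assumptions made for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper (not part of csv2): calls `work` `warmup` times
// untimed, then `runs` times timed, and returns the mean seconds per timed run.
template <typename Fn>
double average_seconds(std::size_t warmup, std::size_t runs, Fn&& work) {
  assert(runs > 0 && "need at least one timed run");

  for (std::size_t i = 0; i < warmup; ++i) {
    work();  // warmup: the duration of these calls is discarded
  }

  using clock = std::chrono::steady_clock;
  std::chrono::duration<double> total{0.0};
  for (std::size_t i = 0; i < runs; ++i) {
    const auto start = clock::now();
    work();
    total += clock::now() - start;  // accumulate elapsed seconds
  }
  return total.count() / static_cast<double>(runs);
}
```

Calling `average_seconds(3, 5, work)`, where `work` memory-maps the file and visits every cell, would correspond to the 3 warmup runs and 5 timed runs mentioned above.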

System Details

| Type | Value |
| --- | --- |
| Processor | 11th Gen Intel(R) Core(TM) i9-11900KF @ 3.50GHz 3.50 GHz |
| Installed RAM | 32.0 GB (31.9 GB usable) |
| SSD | ADATA SX8200PNP |
| OS | Ubuntu 20.04 LTS running on WSL in Windows 11 |
| C++ Compiler | g++ (Ubuntu 10.3.0-1ubuntu1~20.04) 10.3.0 |

Results (as of 23 SEP 2022)

| Dataset | File Size | Rows | Cols | Time |
| --- | --- | --- | --- | --- |
| Denver Crime Data | 111 MB | 479,100 | 19 | 0.102s |
| AirBnb Paris Listings | 196 MB | 141,730 | 96 | 0.170s |
| 2015 Flight Delays and Cancellations | 574 MB | 5,819,079 | 31 | 0.603s |
| StackLite: Stack Overflow questions | 870 MB | 17,203,824 | 7 | 0.911s |
| Used Cars Dataset | 1.4 GB | 539,768 | 25 | 0.947s |
| Title-Based Semantic Subject Indexing | 3.7 GB | 12,834,026 | 4 | 2.867s |
| Bitcoin tweets - 16M tweets | 4 GB | 47,478,748 | 9 | 3.290s |
| DDoS Balanced Dataset | 6.3 GB | 12,794,627 | 85 | 6.963s |
| Seattle Checkouts by Title | 7.1 GB | 34,892,623 | 11 | 7.698s |
| SHA-1 password hash dump | 11 GB | 262,974,241 | 2 | 10.775s |
| DOHUI NOH scaled_data | 16 GB | 496,782 | 3213 | 16.553s |

Reader API

Here is the public API available to you:

template <class delimiter = delimiter<','>,
          class quote_character = quote_character<'"'>,
          class first_row_is_header = first_row_is_header<true>,
          class trim_policy = trim_policy::trim_whitespace>
class Reader {
public:
  // Use this if you'd like to mmap and read from file
  bool mmap(string_type filename);

  // Use this if you have the CSV contents in std::string already
  bool parse(string_type contents);

  // Shape
  size_t rows() const;
  size_t cols() const;

  // Row iterator
  // If first_row_is_header, row iteration will start
  // from the second row
  RowIterator begin() const;
  RowIterator end() const;

  // Access the first row of the CSV
  Row header() const;
};

Here's the Row class:

// Row class
class Row {
public:
  // Get raw contents of the row
  void read_raw_value(Container& value) const;

  // Cell iterator
  CellIterator begin() const;
  CellIterator end() const;
};

and here's the Cell class:

// Cell class
class Cell {
public:
  // Get raw contents of the cell
  void read_raw_value(Container& value) const;

  // Get converted contents of the cell
  // Handles escaped content, e.g.,
  // """foo""" => ""foo""
  void read_value(Container& value) const;
};
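To make the escaped-content example in the Cell comment concrete, here is a minimal standard-library sketch of one conversion that produces the same result: every adjacent pair of quote characters is replaced by a single quote. It is not csv2's implementation, and `collapse_doubled_quotes` is a hypothetical name used only here; how `read_value` behaves on other inputs should be checked against the library itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not part of csv2): replaces every adjacent pair of
// quote characters with a single quote character, scanning left to right.
std::string collapse_doubled_quotes(const std::string& raw, char quote = '"') {
  std::string out;
  out.reserve(raw.size());
  for (std::size_t i = 0; i < raw.size(); ++i) {
    out.push_back(raw[i]);
    // If this quote is immediately followed by another, skip the second one.
    if (raw[i] == quote && i + 1 < raw.size() && raw[i + 1] == quote) {
      ++i;
    }
  }
  return out;
}

// Self-check against the example in the Cell class comment:
// """foo""" becomes ""foo"".
inline void check_collapse_example() {
  assert(collapse_doubled_quotes("\"\"\"foo\"\"\"") == "\"\"foo\"\"");
}
```

Note that in this sketch the surrounding quotes are not stripped; only doubled quotes are collapsed, which matches the documented example.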

CSV Writer

This library also provides a basic csv2::Writer class - one that can be used to write CSV rows to file. Here's a basic usage example:

#include <csv2/writer.hpp>
#include <fstream>
#include <string>
#include <vector>
using namespace csv2;

int main() {
    std::ofstream stream("foo.csv");
    Writer<delimiter<','>> writer(stream);

    std::vector<std::vector<std::string>> rows =
        {
            {"a", "b", "c"},
            {"1", "2", "3"},
            {"4", "5", "6"}
        };

    writer.write_rows(rows);
    stream.close();
}

Writer API

Here is the public API available to you:

template <class delimiter = delimiter<','>>
class Writer {
public:
  // Construct using an std::ofstream
  Writer(output_file_stream stream);

  // Use this to write a single row to file
  void write_row(container_of_strings row);

  // Use this to write a list of rows to file
  void write_rows(container_of_rows rows);
};
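For readers who want to see what writing a list of rows amounts to, the following is a standard-library-only sketch: cells are joined with the delimiter and each row ends with a newline. It is not csv2::Writer and makes no claim about how the library handles quoting or escaping; `write_rows_sketch` and `rows_to_csv` are hypothetical names introduced for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not csv2::Writer): joins the cells of each row with
// `delimiter` and ends every row with '\n'. No quoting or escaping is done.
void write_rows_sketch(std::ostream& out,
                       const std::vector<std::vector<std::string>>& rows,
                       char delimiter = ',') {
  assert(delimiter != '\n' && "delimiter must differ from the row terminator");
  for (const auto& row : rows) {
    for (std::size_t i = 0; i < row.size(); ++i) {
      if (i > 0) out << delimiter;  // separator between cells, not after the last
      out << row[i];
    }
    out << '\n';
  }
}

// Convenience wrapper: returns the CSV text instead of writing to a stream.
std::string rows_to_csv(const std::vector<std::vector<std::string>>& rows,
                        char delimiter = ',') {
  std::ostringstream buffer;
  write_rows_sketch(buffer, rows, delimiter);
  return buffer.str();
}
```

Because the sketch takes any `std::ostream`, it can target a `std::ofstream` like the one in the usage example or an in-memory stream for testing.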

Compiling Tests

mkdir build && cd build
cmake -DCSV2_BUILD_TESTS=ON ..
make
cd test
./csv2_test

Generating Single Header

python3 utils/amalgamate/amalgamate.py -c single_include.json -s .

Contributing

Contributions are welcome; have a look at the CONTRIBUTING.md document for more information.

License

The project is available under the MIT license.

