# CSV-Parser
A lightweight .NET Standard 2.0 CSV table (RFC 4180-like) parser.
CSV-Parser is available as a [NuGet package "CSV-Parser"](https://www.nuget.org/packages/CSV-Parser/).
Parse a file example:
```cs
CsvTable table = wan24.Data.CsvParser.ParseFile(@"/path/to/file.csv");
foreach(string[] row in table)
{
    Console.WriteLine("Row in CSV table:");
    for(int i = 0; i < table.CountColumns; i++)
    {
        Console.WriteLine($"\t{table.Header[i]}: {row[i]}");
    }
}
```
Parse a file asynchronously:
```cs
CsvTable table = await wan24.Data.CsvParser.ParseFileAsync(@"/path/to/file.csv");
```
These static methods are available:

- `ParseFile` and `ParseFileAsync` for parsing a CSV file
- `ParseStream` and `ParseStreamAsync` for parsing a CSV stream
- `ParseString` for parsing a CSV string
- `CountRowsFromFile` and `CountRowsFromFileAsync` for counting the rows of a CSV file
- `CountRowsFromStream` and `CountRowsFromStreamAsync` for counting the rows of a CSV stream
- `CountRowsFromString` for counting the rows of a CSV string
- `ParseHeaderFromFile` and `ParseHeaderFromFileAsync` for parsing column headers from a CSV file
- `ParseHeaderFromStream` and `ParseHeaderFromStreamAsync` for parsing column headers from a CSV stream
- `ParseHeaderFromString` for parsing column headers from a CSV string
- `EnumerateFile` for enumerating each row from a CSV file (see the sketch after this list)
- `EnumerateStream` for enumerating each row from a CSV stream
- `EnumerateString` for enumerating each row from a CSV string
- `CreateMap` for creating mapping information
- `Map` for mapping a row to an object
- `Unmap` for mapping an object to a row
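Row enumeration avoids loading the whole table into memory at once. A minimal sketch, assuming `EnumerateFile` yields one `string[]` per row:

```cs
// Lazily enumerate rows instead of building a full CsvTable
foreach(string[] row in wan24.Data.CsvParser.EnumerateFile(@"/path/to/file.csv"))
{
    Console.WriteLine(string.Join(", ", row));
}
```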
You may adjust these details using additional parameters (see the sketch after this list):

- If the first line contains the column headers (default is `true`)
- The field delimiter character (default is comma (`,`))
- The string value delimiter character (default is double quotes (`"`))
- The string encoding to use (default is the .NET default encoding)
- If the stream should be left open (default is `false`)
- Buffer size in bytes (needs to be large enough to hold all header columns; default is 80K)
- Chunk size in bytes (for filling the buffer; default is 4K)
- Desired row offset (zero-based index of the first row to include in the result)
- Maximum number of rows to include in the result (beginning at the row offset)
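For instance, a semicolon-separated file without a header line might be parsed like this. This is only a sketch: the parameter order shown is an assumption, so check the actual `ParseFile` overloads:

```cs
// Assumed overload order: ParseFile(fileName, header, fieldDelimiter, stringDelimiter)
CsvTable table = wan24.Data.CsvParser.ParseFile(@"/path/to/file.csv", false, ';', '"');
```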
The resulting CSV table object holds the parsed table data (a usage sketch follows the list):

- `CountColumns`: column count
- `CountRows`: row count
- `Header`: column headers
- `Rows`: row data
- `Objects`: objects from rows having their type name in the first field
- `AsDictionaries`: rows as dictionaries (having the headers as keys)
- `Mapping`: row <-> object mapping
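A short usage sketch, assuming `AsDictionaries` yields one dictionary per row, keyed by the column headers:

```cs
CsvTable table = wan24.Data.CsvParser.ParseFile(@"/path/to/file.csv");
Console.WriteLine($"{table.CountColumns} columns, {table.CountRows} rows");
foreach(var row in table.AsDictionaries)
{
    // "AnyColumnHeader" is a placeholder for one of your header names
    Console.WriteLine(row["AnyColumnHeader"]);
}
```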
The overloaded `ToString` method creates CSV data from a CSV table. Other methods are (a short sketch follows the list):
- `CreateHeaders`: create automatic headers (0..n)
- `AddColumn`: add/insert a column (optionally using a field value factory)
- `RemoveColumn`: remove a column
- `MoveColumn`: move a column to another position
- `SwapColumn`: swap two columns
- `ReorderColumns`: apply a new column order
- `AddRow`: add a validated row
- `Validate`: validate the CSV table
- `Clear`: clear row (and header) data
- `AsDictionary`: get a row as a dictionary
- `Clone`: create a copy of the CSV table object
- `AsObject`: get a row mapped as/to an object
- `AsObjects`: enumerate all rows as objects
- `AddObjects`: map objects to new rows
- `CreateMapping`: create a mapping from the column headers
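A hedged sketch of a few of these methods; the exact signatures (e.g. whether `AddRow` takes a `params string[]`) are assumptions:

```cs
table.CreateHeaders();             // create automatic headers (0..n)
table.AddRow("1", "two", "3");     // assumed params string[] overload
table.Validate();                  // validate the table
string csvData = table.ToString(); // serialize back to CSV data
```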
For memory-saving stream operations, you might want to use the `CsvStream`:
```cs
// Reading
using(CsvStream csv = new CsvStream(File.OpenRead(@"path\to\data.csv")))
{
    csv.SkipHeader(); // Or ReadHeader (optional, if any)
    foreach(string[] row in csv.Rows) // Or use ReadObjects
    {
        // Enumerate rows or use ReadRow or ReadObject instead
        ...
    }
}

// Writing
using(CsvStream csv = new CsvStream(File.OpenWrite(@"path\to\data.csv")))
{
    csv.WriteHeader(new string[] {...}); // Optional
    csv.WriteRow(...); // Or use WriteObject(s)
}
```
All methods are also available in asynchronous versions, having the `Async` postfix.
For working with dictionaries, you can use the property `AsDictionaries` or the methods `ReadDictionary` and `ReadDictionaryAsync`.
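A sketch of dictionary-based reading, assuming `ReadDictionary` returns `null` at the end of the stream (verify against the actual API):

```cs
using(CsvStream csv = new CsvStream(File.OpenRead(@"path\to\data.csv")))
{
    csv.ReadHeader(); // the headers become the dictionary keys
    Dictionary<string, string> row;
    while((row = csv.ReadDictionary()) != null)
    {
        // Process one row, keyed by header
    }
}
```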
By default, the size of a header/row is limited to 80KB when reading. To adjust this limit, you can set these values at construction time:

- `bufferSize`: read buffer size in bytes (= maximum header/row size; default: 80KB)
- `chunkSize`: chunk size in bytes (how many bytes to read before trying to match a header/row from the buffer; default: 4KB)
In order to be able to read/write objects, you need to define a mapping. This mapping is responsible for telling the CSV-Parser from which property to get a row field value, and to which property to write a field value from a row. The mapping also supports value factories which can convert a value, and value validation.
```cs
Dictionary<int, CsvMapping> mapping = CsvParser.CreateMapping(
    new CsvMapping()
    {
        Field = 0,
        PropertyName = "AnyProperty",
        ObjectValueFactory = ..., // Convert from string to property value (optional)
        RowValueFactory = ..., // Convert from property value to string (optional)
        PreValidation = ..., // Validate a string value from the CSV data
        PostValidation = ... // Validate a converted value before setting it as object property value
    },
    ...
);
```
Set this mapping to the `Mapping` property of a `CsvTable`, give it to the `CsvStream` constructor, or pass it as a parameter to one of the object mapping methods, if available.
For value conversion, `CsvParser.ObjectValueFactories` and `CsvParser.RowValueFactories` offer default converter functions for these types:

- `bool`
- `int`
- `float`
- `char`
- `byte[]`

You can extend them with any type.
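As an example, a `DateTime` converter pair might be registered like this. This is a sketch only; the delegate signatures of these dictionaries are assumptions:

```cs
// Assumed: ObjectValueFactories maps Type -> Func<string, object>,
// and RowValueFactories maps Type -> Func<object, string>
CsvParser.ObjectValueFactories[typeof(DateTime)] = (str) => DateTime.Parse(str);
CsvParser.RowValueFactories[typeof(DateTime)] = (value) => ((DateTime)value).ToString("o");
```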
If you want to use the same mapping for a type every time no other mapping was given, you can add a prepared mapping to `CsvParser.TypeMappings`.
In an object you may use the `CsvMappingAttribute` attribute for properties that should be mapped:

```cs
[CsvMapping(0)]
public string PropertyName { get; }
```
The attribute parameter is the index of the related CSV column. Then, for creating a mapping for your object, use `CsvParser.CreateMapping` without parameters. The returned mapping will be stored in `CsvParser.TypeMappings`.
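A sketch of an attribute-mapped type; whether `CreateMapping` is called as a generic method or with a `Type` argument is an assumption (the README only states it takes no mapping parameters):

```cs
// Properties bound to CSV columns via the attribute's column index
public class Person
{
    [CsvMapping(0)]
    public string Name { get; set; }

    [CsvMapping(1)]
    public string Email { get; set; }
}

// Assumed generic call form; the returned mapping is also stored in CsvParser.TypeMappings
Dictionary<int, CsvMapping> mapping = CsvParser.CreateMapping<Person>();
```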
Normally, CSV is used to store a table: each row has a fixed number of fields, and maybe a header row is present. But you can also use CSV to store mixed data - for example, different objects:
```cs
// Assumed all used types are working with CsvMappingAttribute,
// or mappings are prepared in CsvParser.TypeMappings already

// Writing objects
using(CsvStream csv = new CsvStream(File.OpenWrite("objects.csv")))
{
    csv.WriteObjectRows(anyObjectInstance, anotherTypeInstance);
}

// Reading objects
using(CsvStream csv = new CsvStream(File.OpenRead("objects.csv")))
{
    anyObjectInstance = csv.ReadObjectRow() as AnyType;
    anotherTypeInstance = csv.ReadObjectRow() as AnotherType;
}
```
NOTE: The field mapping needs to count from field index zero, because the mapper will get the row without the first field that contains the type name! This ensures that you can re-use the mapping everywhere.
Using the stream's `ObjectRows` property you can also enumerate through the objects from a CSV file.
`CsvTable` implements `AsObject`, `AddObjects` and `Objects` for this purpose.
Usually each row should have a number of fields that equals the number of columns. To ignore rows with a different field count:
```cs
CsvParser.IgnoreErrors = true;
```
This setting will also ignore `null` headers/values, and, when using `ToString`, cases where a string delimiter would be required to produce valid CSV data.
WARNING: Ignoring errors may cause unknown failures and produce invalid CSV data!
Even more lightweight versions of this library are available on request. These can optionally come without:
- dictionary methods
- object mapping
- stream support (and CSV writing support)
That would reduce the functionality to this minimum, which may be enough to support a plain CSV import interface:
- CSV parsing
- CSV header parsing
- CSV file row counting
The resulting DLL file would be smaller than 30KB, if all extended features are excluded.