World's Fastest .NET CSV Parser. Modern, minimal, fast, zero allocation, reading and writing of separated values (`csv`, `tsv` etc.). Cross-platform, trimmable and AOT/NativeAOT compatible.
nietras/Sep
Modern, minimal, fast, zero allocation, reading and writing of separated values (`csv`, `tsv` etc.). Cross-platform, trimmable and AOT/NativeAOT compatible. Featuring an opinionated API design and pragmatic implementation targeted at machine learning use cases.
⭐ Please star this project if you like it. ⭐
- 🌃 Modern - utilizes features such as `Span<T>`, Generic Math (`ISpanParsable<T>`/`ISpanFormattable`), `ref struct`, `ArrayPool<T>` and similar from .NET 7+ and C# 11+ for a modern and highly efficient implementation.
- 🔎 Minimal - a succinct yet expressive API with few options and no hidden changes to input or output. What you read/write is what you get. E.g. by default there is no "automatic" escaping/unescaping of quotes or trimming of spaces. To enable this see SepReaderOptions and Unescaping and Trimming. See SepWriterOptions for Escaping.
- 🚀 Fast - blazing fast with both architecture specific and cross-platform SIMD vectorized parsing incl. 64/128/256/512-bit paths e.g. AVX2, AVX-512 (.NET 8.0+), NEON. Uses csFastFloat for fast parsing of floating points. See detailed benchmarks for cross-platform results.
- 🌪️ Multi-threaded - unparalleled speed with highly efficient parallel CSV parsing that is up to 35x faster than CsvHelper, see ParallelEnumerate and benchmarks.
- 🌀 Async support - efficient `ValueTask` based `async/await` support. Requires C# 13.0+ and for .NET 9.0+ includes `SepReader` implementing `IAsyncEnumerable<>`. See Async Support for details.
- 🗑️ Zero allocation - intelligent and efficient memory management allowing for zero allocations after warmup incl. supporting use cases of reading or writing arrays of values (e.g. features) easily without repeated allocations.
- ✅ Thorough tests - great code coverage and focus on edge case testing incl. randomized fuzz testing.
- 🌐 Cross-platform - works on any platform, any architecture supported by .NET. 100% managed and written in beautiful modern C#.
- ✂️ Trimmable and AOT/NativeAOT compatible - no problematic reflection or dynamic code generation. Hence, fully trimmable and Ahead-of-Time compatible. With a simple console tester program executable possible in just a few MBs. 💾
- 🗣️ Opinionated and pragmatic - conforms to the essentials of RFC-4180, but takes an opinionated and pragmatic approach towards this especially with regards to quoting and line ends. See section RFC-4180.
Example | Naming and Terminology | API | Limitations and Constraints | Comparison Benchmarks | Example Catalogue | RFC-4180 | FAQ | Public API Reference

    var text = """
               A;B;C;D;E;F
               Sep;🚀;1;1.2;0.1;0.5
               CSV;✅;2;2.2;0.2;1.5
               """;

    using var reader = Sep.Reader().FromText(text);   // Infers separator 'Sep' from header
    using var writer = reader.Spec.Writer().ToText(); // Writer defined from reader 'Spec'
                                                      // Use .FromFile(...)/ToFile(...) for files
    var idx = reader.Header.IndexOf("B");
    var nms = new[] { "E", "F" };

    foreach (var readRow in reader)           // Read one row at a time
    {
        var a = readRow["A"].Span;            // Column as ReadOnlySpan<char>
        var b = readRow[idx].ToString();      // Column to string (might be pooled)
        var c = readRow["C"].Parse<int>();    // Parse any T : ISpanParsable<T>
        var d = readRow["D"].Parse<float>();  // Parse float/double fast via csFastFloat
        var s = readRow[nms].Parse<double>(); // Parse multiple columns as Span<T>
                                              // - Sep handles array allocation and reuse
        foreach (ref var v in s) { v *= 10; }

        using var writeRow = writer.NewRow(); // Start new row. Row written on Dispose.
        writeRow["A"].Set(a);                 // Set by ReadOnlySpan<char>
        writeRow["B"].Set(b);                 // Set by string
        writeRow["C"].Set($"{c * 2}");        // Set via InterpolatedStringHandler, no allocs
        writeRow["D"].Format(d / 2);          // Format any T : ISpanFormattable
        writeRow[nms].Format(s);              // Format multiple columns directly
        // Columns are added on first access as ordered, header written when first row written
    }

    var expected = """
                   A;B;C;D;E;F
                   Sep;🚀;2;0.6;1;5
                   CSV;✅;4;1.1;2;15

                   """;                       // Empty line at end is for line ending,
                                              // which is always written.
    Assert.AreEqual(expected, writer.ToString());

    // Above example code is for demonstration purposes only.
    // Short names and repeated constants are only for demonstration.

Sep uses naming and terminology that is not based on RFC-4180, but is more tailored to usage in machine learning or similar. Additionally, Sep takes a pragmatic approach towards names by using short names and abbreviations where it makes sense and there should be no ambiguity given the context. That is, using `Sep` for `Separator` and `Col` for `Column` to keep code succinct.

Term | Description |
---|---|
Sep | Short for separator, also called delimiter. E.g. comma (`,`) is the separator for the separated values in a `csv`-file. |
Header | Optional first row defining names of columns. |
Row | A row is a collection of col(umn)s, which may span multiple lines. Also called record. |
Col | Short for column, also called field. |
Line | Horizontal set of characters until a line ending; `\r\n`, `\r`, `\n`. |
Index | 0-based, that is `RowIndex` will be 0 for the first row (or the header if present). |
Number | 1-based, that is `LineNumber` will be 1 for the first line (as in `notepad`). Given a row may span multiple lines a row can have a From line number and a ToExcl line number matching the C# range indexing syntax `[LineNumberFrom..LineNumberToExcl]`. |

Besides being the succinct name of the library, `Sep` is both the main entry point to using the library and the container for a validated separator. That is, `Sep` is basically defined as:

    public readonly record struct Sep(char Separator);

The separator `char` is validated upon construction and is guaranteed to be within a limited range and not to be a `char` like `"` (quote) or similar. This can be seen in src/Sep/Sep.cs. The separator is also constrained for internal optimizations, so you cannot use any `char` as a separator.

⚠ Note that all types are within the namespace `nietras.SeparatedValues` and not `Sep` since it is problematic to have a type and a namespace with the same name.

To get started you can use `Sep` as the static entry point to building either a reader or writer. That is, for `SepReader`:

    using var reader = Sep.Reader().FromFile("titanic.csv");

where `.Reader()` is a convenience method corresponding to:

    using var reader = Sep.Auto.Reader().FromFile("titanic.csv");

where `Sep? Auto => null;` is a static property that returns `null` for a nullable `Sep` to signify that the separator should be inferred from the first row, which might be a header. If the first row does not contain any of the by default supported separators or there are no rows, the default separator will be used.

⚠ Note Sep uses `;` as the default separator, since this is what was used in an internal proprietary library which Sep was built to replace. This is also to avoid issues with comma `,` being used as a decimal separator in some locales, without having to resort to quoting.

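To make the inference rule concrete, here is a minimal sketch that does not use Sep at all. It is not Sep's actual implementation: the candidate list (`;`, `,`, tab, `|`) and the "most frequent candidate wins" rule are assumptions made for illustration, and `SeparatorSketch.Infer` is a hypothetical name. Only the fallback to `;` is taken from the text above. For simplicity it looks at the first line only, so a first row spanning multiple lines via quotes is not handled.

```csharp
using System;

// Hypothetical helper, not part of Sep: guesses a separator from the first line.
public static class SeparatorSketch
{
    // Assumed candidate set; the README does not list which separators Sep checks.
    static readonly char[] Candidates = { ';', ',', '\t', '|' };

    public static char Infer(string text, char fallback = ';')
    {
        // Only the first line is inspected (up to the first '\r' or '\n').
        ReadOnlySpan<char> firstLine = text.AsSpan();
        int end = firstLine.IndexOfAny('\r', '\n');
        if (end >= 0) { firstLine = firstLine[..end]; }

        char best = fallback;
        int bestCount = 0;
        foreach (char candidate in Candidates)
        {
            int count = 0;
            foreach (char c in firstLine)
            {
                if (c == candidate) { count++; }
            }
            // Strictly greater, so the earlier candidate wins on ties.
            if (count > bestCount) { best = candidate; bestCount = count; }
        }
        // No candidate present, or no rows at all: the fallback is returned.
        return best;
    }
}
```

With this sketch a header such as `A,B,C` yields `,`, while an empty text or a first line without any candidate yields the default `;`.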
If you want to specify the separator you can write:

    using var reader = Sep.New(',').Reader().FromFile("titanic.csv");

or

    var sep = new Sep(',');
    using var reader = sep.Reader().FromFile("titanic.csv");

Similarly, for `SepWriter`:

    using var writer = Sep.Writer().ToFile("titanic.csv");

or

    using var writer = Sep.New(',').Writer().ToFile("titanic.csv");

where you have to specify a valid separator, since it cannot be inferred. To facilitate easy flow of the separator and `CultureInfo` both `SepReader` and `SepWriter` expose a `Spec` property of type `SepSpec` that simply defines those two. This means you can write:

    using var reader = Sep.Reader().FromFile("titanic.csv");
    using var writer = reader.Spec.Writer().ToFile("titanic-survivors.csv");

where the `writer` then will use the separator inferred by the reader, for example.

In general, both reading and writing follow a similar pattern:

    Sep/Spec => SepReaderOptions => SepReader => Row => Col(s) => Span/ToString/Parse
    Sep/Spec => SepWriterOptions => SepWriter => Row => Col(s) => Set/Format

where each continuation flows fluently from the preceding type. For example, `Reader()` is an extension method to `Sep` or `SepSpec` that returns a `SepReaderOptions`. Similarly, `Writer()` is an extension method to `Sep` or `SepSpec` that returns a `SepWriterOptions`.

`SepReaderOptions` and `SepWriterOptions` are optionally configurable. That and the APIs for reader and writer are covered in the following sections.

For a complete example, see the example above or the ReadMeTest.cs.

⚠ Note that it is important to understand that Sep `Row`/`Col`/`Cols` are `ref struct`s (please follow the `ref struct` link and understand how this limits the usage of those). This is due to these types being simple facades or indirections to the underlying reader or writer. That means you cannot use LINQ or create an array of all rows like `reader.ToArray()`. While for .NET 9+ the reader is now `IEnumerable<>`, since `ref struct`s can now be used in interfaces that have `where T: allows ref struct`, this still does not mean it is LINQ compatible. Hence, if you need to store per row state or similar you need to parse or copy to different types instead. The same applies to `Col`/`Cols` which point to internal state that is also reused. This is to avoid repeated allocations for each row and get the best possible performance, while still defining a well structured and straightforward API that guides users to relevant functionality. See Why SepReader Was Not IEnumerable Until .NET 9 and Is Not LINQ Compatible for more.

⚠ For a full overview of public types and methods see Public API Reference.

`SepReader` API has the following structure (in pseudo-C# code):

    using var reader = Sep.Reader(o => o).FromFile/FromText/From...;
    var header = reader.Header;
    var _ = header.IndexOf/IndicesOf/NamesStartingWith...;
    foreach (var row in reader)
    {
        var _ = row[colName/colNames].Span/ToString/Parse<T>...;
        var _ = row[colIndex/colIndices].Span/ToString/Parse<T>...;
    }

That is, to use `SepReader` follow the points below:

- Optionally define `Sep` or use default automatically inferred separator.
- Specify reader with optional configuration of `SepReaderOptions`. For example, if a csv-file does not have a header this can be configured via `Sep.Reader(o => o with { HasHeader = false })`. For all options see SepReaderOptions.
- Specify source e.g. file, text (`string`), `TextReader`, etc. via `From` extension methods.
- Optionally access the header. For example, to get all columns starting with `GT_` use `var colNames = header.NamesStartingWith("GT_");` followed by `var colIndices = header.IndicesOf(colNames);`.
- Enumerate rows. One row at a time.
- Access a column by name or index. Or access multiple columns with names and indices. `Sep` internally handles pooled allocation and reuse of arrays for multiple columns.
- Use `Span` to access the column directly as a `ReadOnlySpan<char>`. Or use `ToString` to convert to a `string`. Or use `Parse<T>` where `T : ISpanParsable<T>` to parse the column `char`s to a specific type.

The following options are available:

    /// <summary>
    /// Specifies the separator used, if `null` then automatic detection
    /// is used based on first row in source.
    /// </summary>
    public Sep? Sep { get; init; } = null;
    /// <summary>
    /// Specifies initial internal `char` buffer length.
    /// </summary>
    /// <remarks>
    /// The length will likely be rounded up to the nearest power of 2. A
    /// smaller buffer may end up being used if the underlying source for <see
    /// cref="System.IO.TextReader"/> is known to be smaller. Prefer to keep the
    /// default length as that has been tuned for performance and cache sizes.
    /// Avoid making this unnecessarily large as that will likely not improve
    /// performance and may waste memory.
    /// </remarks>
    public int InitialBufferLength { get; init; } = SepDefaults.InitialBufferLength;
    /// <summary>
    /// Specifies the culture used for parsing.
    /// May be `null` for default culture.
    /// </summary>
    public CultureInfo? CultureInfo { get; init; } = SepDefaults.CultureInfo;
    /// <summary>
    /// Indicates whether the first row is a header row.
    /// </summary>
    public bool HasHeader { get; init; } = true;
    /// <summary>
    /// Specifies <see cref="IEqualityComparer{T}" /> to use
    /// for comparing header column names and looking up index.
    /// </summary>
    public IEqualityComparer<string> ColNameComparer { get; init; } = SepDefaults.ColNameComparer;
    /// <summary>
    /// Specifies the method factory used to convert a column span
    /// of `char`s to a `string`.
    /// </summary>
    public SepCreateToString CreateToString { get; init; } = SepToString.Direct;
    /// <summary>
    /// Disables using [csFastFloat](https://github.com/CarlVerret/csFastFloat)
    /// for parsing `float` and `double`.
    /// </summary>
    public bool DisableFastFloat { get; init; } = false;
    /// <summary>
    /// Disables checking if column count is the same for all rows.
    /// </summary>
    public bool DisableColCountCheck { get; init; } = false;
    /// <summary>
    /// Disables detecting and parsing quotes.
    /// </summary>
    public bool DisableQuotesParsing { get; init; } = false;
    /// <summary>
    /// Unescape quotes on column access.
    /// </summary>
    /// <remarks>
    /// When true, if a column starts with a quote then the two outermost quotes
    /// are removed and every second inner quote is removed. Note that
    /// unquote/unescape happens in-place, which means the <see
    /// cref="SepReader.Row.Span" /> will be modified and contain "garbage"
    /// state after unescaped cols before next col. This is for efficiency to
    /// avoid allocating secondary memory for unescaped columns. Header
    /// columns/names will also be unescaped.
    /// Requires <see cref="DisableQuotesParsing"/> to be false.
    /// </remarks>
    public bool Unescape { get; init; } = false;
    /// <summary>
    /// Option for trimming spaces (` ` - ASCII 32) on column access.
    /// </summary>
    /// <remarks>
    /// By default no trimming is done. See <see cref="SepTrim"/> for options.
    /// Note that trimming may happen in-place e.g. if also unescaping, which
    /// means the <see cref="SepReader.Row.Span" /> will be modified and contain
    /// "garbage" state for trimmed/unescaped cols. This is for efficiency to
    /// avoid allocating secondary memory for trimmed/unescaped columns. Header
    /// columns/names will also be trimmed. Note that only the space ` ` (ASCII
    /// 32) character is trimmed, not any whitespace character.
    /// </remarks>
    public SepTrim Trim { get; init; } = SepTrim.None;
    /// <summary>
    /// Forwarded to <see
    /// cref="System.Threading.Tasks.ValueTask.ConfigureAwait(bool)"/> or
    /// similar when async methods are called.
    /// </summary>
    public bool AsyncContinueOnCapturedContext { get; init; } = false;

While great care has been taken to ensure Sep unescaping of quotes is both correct and fast, there is always the question of how one responds to invalid input.

The below table tries to summarize the behavior of Sep vs CsvHelper and Sylvan. Note that all do the same for valid input. There are differences for how invalid input is handled. For Sep the design choice has been based on not wanting to throw exceptions and to use a principle that is both reasonably fast and simple.

Input | Valid | CsvHelper | CsvHelper¹ | Sylvan | Sep² |
---|---|---|---|---|---|
a | True | a | a | a | a |
"" | True | | | | |
"""" | True | " | " | " | " |
"""""" | True | "" | "" | "" | "" |
"a" | True | a | a | a | a |
"a""a" | True | a"a | a"a | a"a | a"a |
"a""a""a" | True | a"a"a | a"a"a | a"a"a | a"a"a |
a""a | False | EXCEPTION | a""a | a""a | a""a |
a"a"a | False | EXCEPTION | a"a"a | a"a"a | a"a"a |
·""· | False | EXCEPTION | ·""· | ·""· | ·""· |
·"a"· | False | EXCEPTION | ·"a"· | ·"a"· | ·"a"· |
·"" | False | EXCEPTION | ·"" | ·"" | ·"" |
·"a" | False | EXCEPTION | ·"a" | ·"a" | ·"a" |
a"""a | False | EXCEPTION | a"""a | a"""a | a"""a |
"a"a"a" | False | EXCEPTION | aa"a" | a"a"a | aa"a |
""· | False | EXCEPTION | · | " | · |
"a"· | False | EXCEPTION | a· | a" | a· |
"a"""a | False | EXCEPTION | aa | EXCEPTION | a"a |
"a"""a" | False | EXCEPTION | aa" | a"a<NULL> | a"a" |
""a" | False | EXCEPTION | a" | "a | a" |
"a"a" | False | EXCEPTION | aa" | a"a | aa" |
""a"a"" | False | EXCEPTION | a"a"" | "a"a" | a"a" |
""" | False | EXCEPTION | " | ||
""""" | False | " | " | EXCEPTION | "" |

· (middle dot) is whitespace to make this visible

¹ CsvHelper with `BadDataFound = null`

² Sep with `Unescape = true` in `SepReaderOptions`

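The rule from the `Unescape` remarks (remove the outermost quotes and every second inner quote when a column starts with a quote) fits in a few lines. The sketch below is not Sep's implementation, which unescapes in place for speed. `UnescapeSketch` is a hypothetical helper that allocates a new string and only mirrors the described behavior.

```csharp
using System;
using System.Text;

// Hypothetical helper, not Sep's in-place implementation.
public static class UnescapeSketch
{
    public static string Unescape(string col)
    {
        // Columns that do not start with a quote are returned unchanged.
        if (col.Length == 0 || col[0] != '"') { return col; }

        var sb = new StringBuilder(col.Length);
        int quoteCount = 1; // Counts the opening quote, which is dropped.
        for (int i = 1; i < col.Length; i++)
        {
            char c = col[i];
            if (c == '"')
            {
                quoteCount++;
                // Even-numbered quotes are dropped: the first quote of each
                // escaped pair and the closing quote. Odd-numbered are kept.
                if ((quoteCount & 1) == 0) { continue; }
            }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

For valid input such as `"a""a"` this returns `a"a`. For the invalid input `"a"a"a"` it returns `aa"a` without throwing, which matches the Sep column for that row in the table above.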
Sep supports trimming by the `SepTrim` flags enum, which has two options as documented there. Below the result of both trimming and unescaping is shown in comparison to CsvHelper. Note unescaping is enabled for all results shown. It is possible to trim without unescaping too, of course.

As can be seen Sep supports a simple principle of trimming before and after unescaping, with trimming before unescaping being important for unescaping if there is a starting quote after spaces.

Input | CsvHelper Trim | CsvHelper InsideQuotes | CsvHelper All¹ | Sep Outer | Sep AfterUnescape | Sep All² |
---|---|---|---|---|---|---|
a | a | a | a | a | a | a |
·a | a | ·a | a | a | a | a |
a· | a | a· | a | a | a | a |
·a· | a | ·a· | a | a | a | a |
·a·a· | a·a | ·a·a· | a·a | a·a | a·a | a·a |
"a" | a | a | a | a | a | a |
"·a" | ·a | a | a | ·a | a | a |
"a·" | a· | a | a | a· | a | a |
"·a·" | ·a· | a | a | ·a· | a | a |
"·a·a·" | ·a·a· | a·a | a·a | ·a·a· | a·a | a·a |
·"a"· | a | ·"a"· | a | a | "a" | a |
·"·a"· | ·a | ·"·a"· | a | ·a | "·a" | a |
·"a·"· | a· | ·"a·"· | a | a· | "a·" | a |
·"·a·"· | ·a· | ·"·a·"· | a | ·a· | "·a·" | a |
·"·a·a·"· | ·a·a· | ·"·a·a·"· | a·a | ·a·a· | "·a·a·" | a·a |

· (middle dot) is whitespace to make this visible

¹ CsvHelper with `TrimOptions.Trim | TrimOptions.InsideQuotes`

² Sep with `SepTrim.All = SepTrim.Outer | SepTrim.AfterUnescape` in `SepReaderOptions`

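The "trim before and after unescaping" principle can be sketched without Sep as below. This is an illustration only: `TrimSketch` is a hypothetical stand-in for `SepTrim` (its numeric values are assumptions), and `TrimUnescapeSketch.Read` allocates strings instead of working in place as Sep does. Unescaping is always applied, as in the table above.

```csharp
using System;
using System.Text;

// Hypothetical mirror of the SepTrim flags; the numeric values are assumed.
[Flags]
public enum TrimSketch { None = 0, Outer = 1, AfterUnescape = 2, All = Outer | AfterUnescape }

public static class TrimUnescapeSketch
{
    public static string Read(string col, TrimSketch trim)
    {
        // Only the space character is trimmed, not all whitespace.
        if (trim.HasFlag(TrimSketch.Outer)) { col = col.Trim(' '); }
        col = Unescape(col);
        if (trim.HasFlag(TrimSketch.AfterUnescape)) { col = col.Trim(' '); }
        return col;
    }

    // Drops the opening quote and then every second quote that follows,
    // but only if the column starts with a quote.
    static string Unescape(string col)
    {
        if (col.Length == 0 || col[0] != '"') { return col; }
        var sb = new StringBuilder(col.Length);
        int quoteCount = 1;
        for (int i = 1; i < col.Length; i++)
        {
            char c = col[i];
            if (c == '"' && (++quoteCount & 1) == 0) { continue; }
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

For the input `·"·a·"·` (with `·` being a space) this gives `·a·` for `Outer`, `"·a·"` for `AfterUnescape` and `a` for `All`, the same as the Sep columns in the table. The `AfterUnescape` result keeps its quotes because the column starts with a space, so nothing is unescaped before trimming.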
Debuggability is an important part of any library and while this is still a work in progress for Sep, `SepReader` does have a unique feature when looking at it and its row or cols in a debug context. Given the below example code:

    var text = """
               Key;Value
               A;"1
               2
               3"
               B;"Apple
               Banana
               Orange
               Pear"
               """;
    using var reader = Sep.Reader().FromText(text);
    foreach (var row in reader)
    {
        // Hover over reader, row or col when breaking here
        var col = row[1];
        if (Debugger.IsAttached && row.RowIndex == 2) { Debugger.Break(); }
        Debug.WriteLine(col.ToString());
    }

and you are hovering over `reader` when the break is triggered then this will show something like:

    String Length=55

That is, it will show information of the source for the reader, in this case a string of length 55.

If you are hovering over `row` then this will show something like:

    2:[5..9] = "B;\"Apple\r\nBanana\r\nOrange\r\nPear\""

This has the format shown below.

    <ROWINDEX>:[<LINENUMBERRANGE>] = "<ROW>"

Note how this shows line number range `[FromIncl..ToExcl]`, as in C# range expression, so that one can easily find the row in question in `notepad` or similar. This means Sep has to track line endings inside quotes and is an example of a feature that makes Sep a bit slower but which is a price considered worth paying.

GitHub doesn't show line numbers in code blocks so consider copying the example text to notepad or similar to see the effect.

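To see why a line number range requires tracking line endings inside quotes, consider the following minimal sketch. It does not use Sep and is not how Sep is implemented: `RowLineRangeSketch.Rows` is a hypothetical helper that scans the text one `char` at a time, counts every line ending, and only ends a row on a line ending found outside quotes.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of Sep.
public static class RowLineRangeSketch
{
    // Returns the row index and the 1-based [From..ToExcl) line range per row.
    public static List<(int RowIndex, int From, int ToExcl)> Rows(string text)
    {
        var rows = new List<(int RowIndex, int From, int ToExcl)>();
        int lineNumber = 1;       // 1-based, as in a text editor.
        int rowFrom = 1;          // Line on which the current row started.
        bool inQuotes = false;
        bool rowHasChars = false;
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            if (c == '"') { inQuotes = !inQuotes; rowHasChars = true; continue; }
            if (c != '\r' && c != '\n') { rowHasChars = true; continue; }

            // Treat "\r\n" as a single line ending.
            if (c == '\r' && i + 1 < text.Length && text[i + 1] == '\n') { i++; }
            lineNumber++;
            if (!inQuotes)
            {
                // A line ending outside quotes ends the row.
                rows.Add((rows.Count, rowFrom, lineNumber));
                rowFrom = lineNumber;
                rowHasChars = false;
            }
        }
        // Last row when the text does not end with a line ending.
        if (rowHasChars) { rows.Add((rows.Count, rowFrom, lineNumber + 1)); }
        return rows;
    }
}
```

Applied to the example text above, the third row (index 2) gets the range `[5..9]`, the same as in the debugger display. Without the `inQuotes` check each of the lines `Banana`, `Orange` and `Pear"` would be counted as a row of its own.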
Additionally, if you expand the `row` in the debugger (e.g. via the small triangle) you will see each column of the row similar to below.

    00:'Key' = "B"
    01:'Value' = "\"Apple\r\nBanana\r\nOrange\r\nPear\""

If you hover over `col` you should see:

    "\"Apple\r\nBanana\r\nOrange\r\nPear\""

As mentioned earlier Sep only allows enumeration and access to one row at a time and `SepReader.Row` is just a simple facade or indirection to the underlying reader. This is why it is defined as a `ref struct`. In fact, the following code:

    using var reader = Sep.Reader().FromText(text);
    foreach (var row in reader)
    { }

can also be rewritten as:

    using var reader = Sep.Reader().FromText(text);
    while (reader.MoveNext())
    {
        var row = reader.Current;
    }

where `row` is just a facade for exposing row specific functionality. That is, `row` is still basically the `reader` underneath. Hence, let's look at using LINQ with `SepReader` implementing `IEnumerable<SepReader.Row>` and the `Row` not being a `ref struct`. Then, you would be able to write something like below:

    using var reader = Sep.Reader().FromText(text);
    SepReader.Row[] rows = reader.ToArray();

Given `Row` is just a facade for the reader, this would be equivalent to writing:

    using var reader = Sep.Reader().FromText(text);
    SepReader[] rows = reader.ToArray();

which hopefully makes it clear why this is not a good thing. The array would effectively be the reader repeated several times. If this would have to be supported one would have to allocate memory for each row always, which would basically be no different than a `ReadLine` approach as benchmarked in Comparison Benchmarks.

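The effect can be reproduced without Sep. In the sketch below all types are hypothetical stand-ins: `ReusingReader` plays the role of the reader and `RowView` the role of a row facade that, unlike `SepReader.Row`, is a normal `struct` and can therefore be stored in a list.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a reader that exposes one current row at a time.
public sealed class ReusingReader
{
    readonly string[] _lines;
    int _index = -1;

    public ReusingReader(params string[] lines) => _lines = lines;

    public bool MoveNext()
    {
        if (_index + 1 >= _lines.Length) { return false; }
        _index++;
        return true;
    }

    // The row is only a view of the reader's current position.
    public RowView Current => new RowView(this);

    internal string CurrentLine => _lines[_index];
}

// A facade like this holds no row data of its own, only the reader.
public readonly struct RowView
{
    readonly ReusingReader _reader;
    public RowView(ReusingReader reader) => _reader = reader;
    public override string ToString() => _reader.CurrentLine;
}

public static class FacadeSketch
{
    // Stores every row view first and reads them only after enumeration.
    public static string[] StoreThenRead(params string[] lines)
    {
        var reader = new ReusingReader(lines);
        var stored = new List<RowView>();
        while (reader.MoveNext()) { stored.Add(reader.Current); }

        var result = new string[stored.Count];
        for (int i = 0; i < stored.Count; i++) { result[i] = stored[i].ToString(); }
        return result;
    }
}
```

Calling `FacadeSketch.StoreThenRead("A;1", "B;2", "C;3")` returns `C;3` three times, since every stored view reads from the same reader, which by then is positioned at the last row. Making the row a `ref struct` turns this mistake into a compile-time error.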
This is perhaps also the reason why no other efficient .NET CSV parser (known to the author) implements an API pattern like Sep, but instead lets the reader define all functionality directly and hence only lets you access the current row and cols on that. This API, however, is in this author's opinion not ideal and can be a bit confusing, which is why Sep is designed like it is. The downside is the above caveat.

The main culprit above is that for example `ToArray()` would store a `ref struct` in a heap allocated array, the actual enumeration is not a problem and hence implementing `IEnumerable<SepReader.Row>` is not the problem as such. The problem was that prior to .NET 9 it was not possible to implement this interface with `T` being a `ref struct`, but with C# 13 `allows ref struct` and .NET 9 having annotated such interfaces it is now possible and you can assign `SepReader` to `IEnumerable`, but most if not all of LINQ will still not work as shown below.

    var text = """
               Key;Value
               A;1.1
               B;2.2
               """;
    using var reader = Sep.Reader().FromText(text);
    IEnumerable<SepReader.Row> enumerable = reader;
    // Currently, most LINQ methods do not work for ref types. See below.
    //
    // The type 'SepReader.Row' may not be a ref struct or a type parameter
    // allowing ref structs in order to use it as parameter 'TSource' in the
    // generic type or method 'Enumerable.Select<TSource,
    // TResult>(IEnumerable<TSource>, Func<TSource, TResult>)'
    //
    // enumerable.Select(row => row["Key"].ToString()).ToArray();

Calling `Select` should in principle be possible if this was annotated with `allows ref struct`, but it isn't currently.

If you want to use LINQ or similar you have to first parse or transform the rows into some other type and enumerate it. This is easy to do and instead of counting lines you should focus on how such enumeration can be easily expressed using C# iterators (aka `yield return`). With local functions this can be done inside a method like:

    var text = """
               Key;Value
               A;1.1
               B;2.2
               """;
    var expected = new (string Key, double Value)[] {
        ("A", 1.1),
        ("B", 2.2),
    };

    using var reader = Sep.Reader().FromText(text);
    var actual = Enumerate(reader).ToArray();

    CollectionAssert.AreEqual(expected, actual);

    static IEnumerable<(string Key, double Value)> Enumerate(SepReader reader)
    {
        foreach (var row in reader)
        {
            yield return (row["Key"].ToString(), row["Value"].Parse<double>());
        }
    }

Now, if this is instead refactored into something LINQ-compatible by defining a common `Enumerate` or similar method, it could be:

    var text = """
               Key;Value
               A;1.1
               B;2.2
               """;
    var expected = new (string Key, double Value)[] {
        ("A", 1.1),
        ("B", 2.2),
    };

    using var reader = Sep.Reader().FromText(text);
    var actual = Enumerate(reader,
        row => (row["Key"].ToString(), row["Value"].Parse<double>()))
        .ToArray();

    CollectionAssert.AreEqual(expected, actual);

    static IEnumerable<T> Enumerate<T>(SepReader reader, SepReader.RowFunc<T> select)
    {
        foreach (var row in reader)
        {
            yield return select(row);
        }
    }

In fact, Sep provides such a convenience extension method. And, discounting the `Enumerate` method, this does have less boilerplate, but not really more effective lines of code. The issue here is that this tends to favor factoring code in a way that can become very inefficient quickly. Consider if one wanted to only enumerate rows matching a predicate on `Key` which meant only 1% of rows were to be enumerated e.g.:

    var text = """
               Key;Value
               A;1.1
               B;2.2
               """;
    var expected = new (string Key, double Value)[] {
        ("B", 2.2),
    };

    using var reader = Sep.Reader().FromText(text);
    var actual = reader.Enumerate(
        row => (row["Key"].ToString(), row["Value"].Parse<double>()))
        .Where(kv => kv.Item1.StartsWith('B'))
        .ToArray();

    CollectionAssert.AreEqual(expected, actual);

This means you are still parsing the double (which is orders of magnitude slower than getting just the key) for all rows. Imagine if this was an array of floating points or similar. Not only would you then be parsing a lot of values, you would also be allocating 99x arrays that aren't used after filtering with `Where`.

Instead, you should focus on how to express the enumeration in a way that is both efficient and easy to read. For example, the above could be rewritten as:

    var text = """
               Key;Value
               A;1.1
               B;2.2
               """;
    var expected = new (string Key, double Value)[] {
        ("B", 2.2),
    };

    using var reader = Sep.Reader().FromText(text);
    var actual = Enumerate(reader).ToArray();

    CollectionAssert.AreEqual(expected, actual);

    static IEnumerable<(string Key, double Value)> Enumerate(SepReader reader)
    {
        foreach (var row in reader)
        {
            var keyCol = row["Key"];
            if (keyCol.Span.StartsWith("B"))
            {
                yield return (keyCol.ToString(), row["Value"].Parse<double>());
            }
        }
    }

To accommodate this Sep provides an overload for `Enumerate` that is similar to:

    static IEnumerable<T> Enumerate<T>(this SepReader reader, SepReader.RowTryFunc<T> trySelect)
    {
        foreach (var row in reader)
        {
            if (trySelect(row, out var value))
            {
                yield return value;
            }
        }
    }

With this the above custom `Enumerate` can be replaced with:

    var text = """
               Key;Value
               A;1.1
               B;2.2
               """;
    var expected = new (string Key, double Value)[] {
        ("B", 2.2),
    };

    using var reader = Sep.Reader().FromText(text);
    var actual = reader.Enumerate((SepReader.Row row, out (string Key, double Value) kv) =>
    {
        var keyCol = row["Key"];
        if (keyCol.Span.StartsWith("B"))
        {
            kv = (keyCol.ToString(), row["Value"].Parse<double>());
            return true;
        }
        kv = default;
        return false;
    }).ToArray();

    CollectionAssert.AreEqual(expected, actual);

Note how this is pretty much the same length as the previous custom `Enumerate`. It is also worse due to how C# requires specifying types for `out` parameters, which then requires all parameter types for the lambda to be specified. Hence, in this case the custom `Enumerate` does not take significantly longer to write and is a lot more efficient than using LINQ `.Where` (it also avoids allocating a string for the key for each row) and is easier to debug and perhaps even read. All examples above can be seen in ReadMeTest.cs.

There is a strong case for having an enumerate API though and that is for parallelized enumeration, which will be discussed next.

As discussed in the previous section Sep provides `Enumerate` convenience extension methods, that should be used carefully. Alongside these there are `ParallelEnumerate` extension methods that provide very efficient multi-threaded enumeration. See benchmarks for numbers and Public API Reference.

`ParallelEnumerate` is built on top of LINQ `AsParallel().AsOrdered()` and will return exactly the same as `Enumerate` but with enumeration parallelized. This will use more memory during execution and as many threads as possible via the .NET thread pool. When using `ParallelEnumerate` one should, therefore (as always), be certain the provided delegate does not refer to or change any mutable state.

`ParallelEnumerate` comes with a lot of overhead compared to single-threaded `foreach` or `Enumerate` and should be used carefully based on measuring any potential benefit. Sep goes a long way to make this very efficient by using pooled arrays and parsing multiple rows in batches, but if the source only has a few rows then any benefit is unlikely.

Due to `ParallelEnumerate` being based on batches of rows it is also important not to "abuse" it in place of LINQ `AsParallel`. The idea is to use it for parsing rows, not for doing expensive per row operations like loading an image or similar. In that case, you are better off using `AsParallel()` after `ParallelEnumerate` or `Enumerate` similarly to:

    using var reader = Sep.Reader().FromFile("very-long.csv");
    var results = reader.ParallelEnumerate(ParseRow)
                        .AsParallel().AsOrdered()
                        .Select(LoadData) // Expensive load
                        .ToList();

As a rule of thumb if the time per row exceeds 1 millisecond consider moving the expensive work to after `ParallelEnumerate`/`Enumerate`.

`SepWriter` API has the following structure (in pseudo-C# code):

    using var writer = Sep.Writer(o => o).ToFile/ToText/To...;
    foreach (var data in EnumerateData())
    {
        using var row = writer.NewRow();
        var _ = row[colName/colNames].Set/Format<T>...;
        var _ = row[colIndex/colIndices].Set/Format<T>...;
    }

That is, to use `SepWriter` follow the points below:

- Optionally define `Sep` or use the default separator.
- Specify writer with optional configuration of `SepWriterOptions`. For all options see SepWriterOptions.
- Specify destination e.g. file, text (`string` via `StringWriter`), `TextWriter`, etc. via `To` extension methods.
- MISSING: `SepWriter` currently does not allow you to define the header up front. Instead, header is defined by the order in which column names are accessed/created when defining the row.
- Define new rows with `NewRow`. ⚠ Be sure to dispose any new rows before starting the next! For convenience Sep provides an overload for `NewRow` that takes a `SepReader.Row` and copies the columns from that row to the new row:

      using var reader = Sep.Reader().FromText(text);
      using var writer = reader.Spec.Writer().ToText();
      foreach (var readRow in reader)
      {
          using var writeRow = writer.NewRow(readRow);
      }

- Create a column by selecting by name or index. Or multiple columns via indices and names. `Sep` internally handles pooled allocation and reuse of arrays for multiple columns.
- Use `Set` to set the column value either as a `ReadOnlySpan<char>`, `string` or via an interpolated string. Or use `Format<T>` where `T : IFormattable` to format `T` to the column value.
- Row is written when `Dispose` is called on the row. Note this is to allow a row to be defined flexibly with both column removal, moves and renames in the future. This is not yet supported.

The following options are available:

    /// <summary>
    /// Specifies the separator used.
    /// </summary>
    public Sep Sep { get; init; }
    /// <summary>
    /// Specifies the culture used for parsing.
    /// May be `null` for default culture.
    /// </summary>
    public CultureInfo? CultureInfo { get; init; }
    /// <summary>
    /// Specifies whether to write a header row
    /// before data rows. Requires all columns
    /// to have a name. Otherwise, columns can be
    /// added by indexing alone.
    /// </summary>
    public bool WriteHeader { get; init; } = true;
    /// <summary>
    /// Disables checking if column count is the
    /// same for all rows.
    /// </summary>
    /// <remarks>
    /// When true, the <see cref="ColNotSetOption"/>
    /// will define how columns that are not set
    /// are handled. For example, whether to skip
    /// or write an empty column if a column has
    /// not been set for a given row.
    /// <para>
    /// If any columns are skipped, then columns of
    /// a row may, therefore, be out of sync with
    /// column names if <see cref="WriteHeader"/>
    /// is true.
    /// </para>
    /// As such, any number of columns can be
    /// written as long as done sequentially.
    /// </remarks>
    public bool DisableColCountCheck { get; init; } = false;
    /// <summary>
    /// Specifies how to handle columns that are
    /// not set.
    /// </summary>
    public SepColNotSetOption ColNotSetOption { get; init; } = SepColNotSetOption.Throw;
    /// <summary>
    /// Specifies whether to escape column names
    /// and values when writing.
    /// </summary>
    /// <remarks>
    /// When true, if a column contains a separator
    /// (e.g. `;`), carriage return (`\r`), line
    /// feed (`\n`) or quote (`"`) then the column
    /// is prefixed and suffixed with quotes `"`
    /// and any quote in the column is escaped by
    /// adding an extra quote so it becomes `""`.
    /// Note that escape applies to column names
    /// too, but only the written name.
    /// </remarks>
    public bool Escape { get; init; } = false;
    /// <summary>
    /// Forwarded to <see
    /// cref="System.Threading.Tasks.ValueTask.ConfigureAwait(bool)"/> or
    /// similar when async methods are called.
    /// </summary>
    public bool AsyncContinueOnCapturedContext { get; init; } = false;

Escaping is not enabled by default in Sep, but when it is it gives the same results as other popular CSV libraries as shown below. Although, CsvHelper appears to be escaping spaces as well, which is not necessary.

Input | CsvHelper | Sylvan | Sep¹ |
---|---|---|---|
`` | | | |
· | "·" | · | · |
a | a | a | a |
; | ";" | ";" | ";" |
, | , | , | , |
" | """" | """" | """" |
\r | "\r" | "\r" | "\r" |
\n | "\n" | "\n" | "\n" |
a"aa"aaa | "a""aa""aaa" | "a""aa""aaa" | "a""aa""aaa" |
a;aa;aaa | "a;aa;aaa" | "a;aa;aaa" | "a;aa;aaa" |

- Separator/delimiter is set to semi-colon `;` (default for Sep)
- `·` (middle dot) is whitespace to make this visible
- `\r`, `\n` are carriage return and line feed special characters to make these visible

¹ Sep with `Escape = true` in `SepWriterOptions`
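
The escaping rule described for the `Escape` option can be sketched in a few lines. This is only an illustration of the rule as documented above, not Sep's implementation; `CsvEscape.Escape` is a hypothetical helper name and the separator is passed in explicitly.

```csharp
using System;

// Rows taken from the comparison table above (separator is ';').
Console.WriteLine(CsvEscape.Escape("a;aa;aaa", ';'));   // "a;aa;aaa"
Console.WriteLine(CsvEscape.Escape("a\"aa\"aaa", ';')); // "a""aa""aaa"
Console.WriteLine(CsvEscape.Escape(",", ';'));          // ,

static class CsvEscape
{
    // Hypothetical helper: quotes a column only if it contains the separator,
    // a carriage return, a line feed or a quote, and doubles any quotes.
    public static string Escape(string col, char separator)
    {
        var needsQuotes = false;
        foreach (var c in col)
        {
            if (c == separator || c == '\r' || c == '\n' || c == '"')
            {
                needsQuotes = true;
                break;
            }
        }
        if (!needsQuotes) { return col; }
        return "\"" + col.Replace("\"", "\"\"") + "\"";
    }
}
```

Note that a single space is returned unchanged, matching the Sylvan and Sep columns in the table rather than CsvHelper.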
Sep supports efficient `ValueTask` based asynchronous reading and writing.

However, given both `SepReader.Row` and `SepWriter.Row` are `ref struct`s, as they point to internal state and should only be used one at a time, `async/await` usage is only supported on C# 13.0+ as this has support for "ref and unsafe in iterators and async methods" as covered in What's new in C# 13. Please consult details in that for limitations and constraints due to this.
Similarly, `SepReader` only implements `IAsyncEnumerable<SepReader.Row>` (and `IEnumerable<SepReader.Row>`) for .NET 9.0+/C# 13.0+ since then the interfaces have been annotated with `allows ref struct` for `T`.
Async support is provided on the existing `SepReader` and `SepWriter` types similar to how `TextReader` and `TextWriter` support both sync and async usage. This means you as a developer are responsible for calling async methods and using `await` when necessary. See below for a simple example and consult tests on GitHub for more examples.
```csharp
var text = """
           A;B;C;D;E;F
           Sep;🚀;1;1.2;0.1;0.5
           CSV;✅;2;2.2;0.2;1.5

           """; // Empty line at end is for line ending

using var reader = await Sep.Reader().FromTextAsync(text);
await using var writer = reader.Spec.Writer().ToText();
await foreach (var readRow in reader)
{
    await using var writeRow = writer.NewRow(readRow);
}

Assert.AreEqual(text, writer.ToString());
```
Note how for `SepReader` the `FromTextAsync` is suffixed with `Async` to indicate async creation, this is due to the reader having to read the first row of the source at creation to determine both separator and, if file has a header, column names of the header. The `From*Async` call then has to be `await`ed. After that rows can be enumerated asynchronously simply by putting `await` before `foreach`. If one forgets to do that the rows will be enumerated synchronously.
For `SepWriter` the usage is kind of reversed. `To*` methods have no `Async` variants, since creation is synchronous. That is, `StreamWriter` is created by a simple constructor call. Nothing is written until a header or row is defined and `Dispose`/`DisposeAsync` is called on the row.
For reader nothing needs to be asynchronously disposed, so `using` does not require `await`. However, for `SepWriter` dispose may have to write/flush data to underlying `TextWriter` and hence it should be using `DisposeAsync`, so you must use `await using`.
To support cancellation many methods have overloads that accept a `CancellationToken` like the `From*Async` methods for creating a `SepReader` or for example `NewRow` for `SepWriter`. Consult Public API Reference for full set of available methods.
Additionally, both SepReaderOptions and SepWriterOptions feature the `bool AsyncContinueOnCapturedContext` option that is forwarded to internal `ConfigureAwait` calls, see the ConfigureAwait FAQ for details on that.
Sep is designed to be minimal and fast. As such, it has some limitations and constraints:
- Comments `#` are not directly supported. You can skip a row by:

  ```csharp
  foreach (var row in reader)
  {
      // Skip row if starts with #
      if (!row.Span.StartsWith("#"))
      {
          // ...
      }
  }
  ```

  This does not allow skipping lines before a header row starting with `#` though. In Example Catalogue a full example is given detailing how to skip lines before header.
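
One possible way to deal with comment lines before a header, independent of Sep itself, is to advance the underlying `TextReader` past them before handing it to a parser. The sketch below uses only the base class library; `LeadingComments.Skip` is a hypothetical helper and is not the example from the Example Catalogue. It assumes `Peek()` reliably reports the next character, which holds for `StringReader` but may not hold for every `TextReader` over a non-seekable stream.

```csharp
using System;
using System.IO;

using var textReader = new StringReader("# generated by tool\n# v2\nA;B\n1;2\n");
var skipped = LeadingComments.Skip(textReader);
Console.WriteLine(skipped);                // 2
Console.Write(textReader.ReadToEnd());     // remaining text starts at the header "A;B"

static class LeadingComments
{
    // Hypothetical helper: consumes whole lines while the next character is
    // the comment character, and returns how many lines were skipped.
    public static int Skip(TextReader reader, char commentChar = '#')
    {
        var skipped = 0;
        while (reader.Peek() == commentChar)
        {
            reader.ReadLine();
            skipped++;
        }
        return skipped;
    }
}
```

A reader positioned this way could then be passed on to whatever consumes the CSV content, since the header is now the first line it will see.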
To investigate the performance of Sep it is compared to:
- CsvHelper - the most commonly used CSV library with a staggering number of downloads on NuGet. Fully featured and battle tested.
- Sylvan - is well-known and has previously been shown to be the fastest CSV library for parsing (Sep changes that 😉).
- `ReadLine`/`WriteLine` - basic naive implementations that read line by line and split on separator, and write columns, separators and line endings directly. Does not handle quotes or similar correctly.
All benchmarks are run from/to memory either with:

- `StringReader` or `StreamReader + MemoryStream`
- `StringWriter` or `StreamWriter + MemoryStream`

This is to avoid confounding factors from reading from or writing to disk.
When using `StringReader`/`StringWriter` each `char` counts as 2 bytes when measuring throughput, e.g. `MB/s`. When using `StreamReader`/`StreamWriter` content is UTF-8 encoded and each `char` typically counts as 1 byte, as content is usually limited to 1 byte per char in UTF-8. Note that in .NET for `TextReader` and `TextWriter` data is converted to/from `char`, but for reading such conversion can often be just as fast as `Memmove`.
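
To make the byte counting concrete, the following sketch compares the two ways of counting for the same text. `ThroughputBytes` is a hypothetical helper written for this illustration and is not part of the benchmark code; it only uses `string.Length` and `Encoding.UTF8.GetByteCount`.

```csharp
using System;
using System.Text;

var ascii = "A;B;C\n1;2;3\n"; // 12 chars, all ASCII
var emoji = "Sep;🚀\n";        // 7 chars, the rocket is a surrogate pair

Console.WriteLine($"{ThroughputBytes.Utf16(ascii)} vs {ThroughputBytes.Utf8(ascii)}"); // 24 vs 12
Console.WriteLine($"{ThroughputBytes.Utf16(emoji)} vs {ThroughputBytes.Utf8(emoji)}"); // 14 vs 9

static class ThroughputBytes
{
    // StringReader/StringWriter style: every char counts as 2 bytes.
    public static long Utf16(string text) => (long)text.Length * sizeof(char);

    // StreamReader/StreamWriter style: bytes of the UTF-8 encoded content.
    public static long Utf8(string text) => Encoding.UTF8.GetByteCount(text);
}
```

For pure ASCII content the UTF-8 count is exactly half the `char` based count, which is why the same parsing time shows up as roughly half the `MB/s`.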
By default only `StringReader`/`StringWriter` results are shown, if a result is based on `StreamReader`/`StreamWriter` it will be called out. Usually, results for `StreamReader`/`StreamWriter` are in line with `StringReader`/`StringWriter` but with half the throughput due to 1 byte vs 2 bytes. For brevity they are not shown here.
For all benchmark results, Sep has been defined as the `Baseline` in BenchmarkDotNet. This means `Ratio` will be 1.00 for Sep. For the others `Ratio` will then show how many times faster Sep is than that. Or how many times more bytes are allocated in `Alloc Ratio`.
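
As a rough guide to reading the result tables, the derived columns can be approximated from `Mean` and `Rows`. The sketch below uses the Row-scope means for Sep and Sylvan from the first PackageAssets results table; `BenchmarkMath` is a hypothetical helper. BenchmarkDotNet derives `Ratio` from the measured distributions rather than from the two means, so reported values can differ slightly from this simple division.

```csharp
using System;

// Values copied from the first PackageAssets results table (Scope = Row).
const double sepMeanMs = 3.427;
const double sylvanMeanMs = 4.429;
const int rows = 50_000;

var ratio = BenchmarkMath.Ratio(sylvanMeanMs, sepMeanMs);
var nsPerRow = BenchmarkMath.NanosecondsPerRow(sepMeanMs, rows);

Console.WriteLine(Math.Round(ratio, 2));    // approximately the table's Ratio of 1.29
Console.WriteLine(Math.Round(nsPerRow, 1)); // approximately the table's ns/row of 68.5

static class BenchmarkMath
{
    // How many times slower a method is than the baseline (Sep).
    public static double Ratio(double meanMs, double baselineMeanMs) =>
        meanMs / baselineMeanMs;

    // Mean time in nanoseconds spent per row.
    public static double NanosecondsPerRow(double meanMs, int rows) =>
        meanMs * 1_000_000.0 / rows;
}
```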
Disclaimer: Any comparison made is based on a number of preconditions and assumptions. Sep is a new library written from the ground up to use the latest and greatest features in .NET. CsvHelper has a long history and has to take into account backwards compatibility and still supporting older runtimes, so may not be able to easily utilize more recent features. Same goes for Sylvan. Additionally, Sep has a different feature set compared to the two. Performance is a feature, but not the only feature. Keep that in mind when evaluating results.
The following runtime is used for benchmarking:

- `NET 9.0.X`

NOTE: Garbage Collection DATAS mode is disabled since this severely impacts (e.g. 1.7x slower) performance for some benchmarks due to the bursty accumulated allocations. That is, `GarbageCollectionAdaptationMode` is set to `0`.
The following platforms are used for benchmarking:

- `AMD EPYC 7763` (Virtual) X64 Platform Information
  - `OS=Ubuntu 22.04.5 LTS (Jammy Jellyfish)`
  - `AMD EPYC 7763, 1 CPU, 4 logical and 2 physical cores`
- `AMD Ryzen 7 PRO 7840U` (Laptop on battery) X64 Platform Information
  - `OS=Windows 11 (10.0.22631.4460/23H2/2023Update/SunValley3)`
  - `AMD Ryzen 7 PRO 7840U w/ Radeon 780M Graphics, 1 CPU, 16 logical and 8 physical cores`
- `AMD 5950X` (Desktop) X64 Platform Information
  - `OS=Windows 10 (10.0.19044.2846/21H2/November2021Update)`
  - `AMD Ryzen 9 5950X, 1 CPU, 32 logical and 16 physical cores`
- `Apple M1` (Virtual) ARM64 Platform Information
  - `OS=macOS Sonoma 14.7.1 (23H222) [Darwin 23.6.0]`
  - `Apple M1 (Virtual), 1 CPU, 3 logical and 3 physical cores`

The following reader scenarios are benchmarked:
- NCsvPerf from The fastest CSV parser in .NET
- Floats as for example in machine learning.

Details for each can be found in the following. However, for each of these, 3 different scopes are benchmarked to better ascertain the low-level performance of each library and approach and what parts of the parsing consume the most time:
- Row - for this scope only the row is enumerated. That is, for Sep all that is done is:

  ```csharp
  foreach (var row in reader) { }
  ```

  This should capture parsing both row and columns but without accessing these. Note that some libraries (like Sylvan) will defer work for columns to when these are accessed.
- Cols - for this scope all rows and all columns are enumerated. If possible columns are accessed as spans, if not as strings, which then might mean a string has to be allocated. That is, for Sep this is:

  ```csharp
  foreach (var row in reader)
  {
      for (var i = 0; i < row.ColCount; i++)
      {
          var span = row[i].Span;
      }
  }
  ```
- XYZ - finally the full scope is performed which is specific to each of the scenarios.

Additionally, as Sep supports multi-threaded parsing via `ParallelEnumerate`, benchmark results with `_MT` in the method name are multi-threaded. These show Sep provides unparalleled performance compared to any other CSV parser.
The overhead of Sep async support is also benchmarked and can be seen with `_Async` in the method name. Note that this is the absolute best case for async given there is no real IO involved and hence no actual asynchronous work or continuations (thus no `Task` allocations) since benchmarks run from memory only. This is fine as the main purpose of the benchmark is to gauge the overhead of the async code path.
NCsvPerf from The fastest CSV parser in .NET is a benchmark which in Joel Verhagen's own words was defined with:

> My goal was to find the fastest low-level CSV parser. Essentially, all I wanted was a library that gave me a string[] for each line where each field in the line was an element in the array.

What is great about this work is that it tests a total of 35 different libraries and approaches, providing a great overview of those and their performance on this specific scenario. Given Sylvan is the fastest of those it is used as the one to beat here, while CsvHelper is used to compare to the most commonly used library.
The source used for this benchmark PackageAssetsBench.cs is a PackageAssets.csv with NuGet package information in 25 columns with rows like:
```text
75fcf875-017d-4579-bfd9-791d3e6767f0,2020-11-28T01:50:41.2449947+00:00,Akinzekeel.BlazorGrid,0.9.1-preview,2020-11-27T22:42:54.3100000+00:00,AvailableAssets,RuntimeAssemblies,,,net5.0,,,,,,lib/net5.0/BlazorGrid.dll,BlazorGrid.dll,.dll,lib,net5.0,.NETCoreApp,5.0.0.0,,,0.0.0.0
75fcf875-017d-4579-bfd9-791d3e6767f0,2020-11-28T01:50:41.2449947+00:00,Akinzekeel.BlazorGrid,0.9.1-preview,2020-11-27T22:42:54.3100000+00:00,AvailableAssets,CompileLibAssemblies,,,net5.0,,,,,,lib/net5.0/BlazorGrid.dll,BlazorGrid.dll,.dll,lib,net5.0,.NETCoreApp,5.0.0.0,,,0.0.0.0
75fcf875-017d-4579-bfd9-791d3e6767f0,2020-11-28T01:50:41.2449947+00:00,Akinzekeel.BlazorGrid,0.9.1-preview,2020-11-27T22:42:54.3100000+00:00,AvailableAssets,ResourceAssemblies,,,net5.0,,,,,,lib/net5.0/de/BlazorGrid.resources.dll,BlazorGrid.resources.dll,.dll,lib,net5.0,.NETCoreApp,5.0.0.0,,,0.0.0.0
75fcf875-017d-4579-bfd9-791d3e6767f0,2020-11-28T01:50:41.2449947+00:00,Akinzekeel.BlazorGrid,0.9.1-preview,2020-11-27T22:42:54.3100000+00:00,AvailableAssets,MSBuildFiles,,,any,,,,,,build/Microsoft.AspNetCore.StaticWebAssets.props,Microsoft.AspNetCore.StaticWebAssets.props,.props,build,any,Any,0.0.0.0,,,0.0.0.0
75fcf875-017d-4579-bfd9-791d3e6767f0,2020-11-28T01:50:41.2449947+00:00,Akinzekeel.BlazorGrid,0.9.1-preview,2020-11-27T22:42:54.3100000+00:00,AvailableAssets,MSBuildFiles,,,any,,,,,,build/Akinzekeel.BlazorGrid.props,Akinzekeel.BlazorGrid.props,.props,build,any,Any,0.0.0.0,,,0.0.0.0
```
For `Scope = Asset` the columns are parsed into a `PackageAsset` class, which consists of 25 properties of which 22 are `string`s. Each asset is accumulated into a `List<PackageAsset>`. Each column is accessed as a `string` regardless.
This means this benchmark is dominated by turning columns into `string`s for the decently fast parsers. Hence, the fastest libraries in this test employ string pooling. That is, basically a custom dictionary from `ReadOnlySpan<char>` to `string`, which avoids allocating a new `string` for repeated values. And as can be seen in the csv-file there are a lot of repeated values. Both Sylvan and CsvHelper do this in the benchmark. So does Sep, where this is an optional configuration that has to be explicitly enabled. For Sep this means the reader is created with something like:
```csharp
using var reader = Sep.Reader(o => o with
{
    HasHeader = false,
    CreateToString = SepToString.PoolPerCol(maximumStringLength: 128),
})
.From(CreateReader());
```
What is unique for Sep is that it allows defining a pool per column e.g. via `SepToString.PoolPerCol(...)`. This is based on the fact that often each column has its own set of values or strings that may be repeated without any overlap to other columns. This also allows one to define per column specific handling of `ToString` behavior. Whether to pool or not. Or even to use a statically defined pool.
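
The general idea of string pooling, a lookup from `ReadOnlySpan<char>` to an already allocated `string`, can be sketched as below. This is a deliberately simple illustration and not Sep's implementation, which the text describes as heavily optimized. `SimpleStringPool` and `GetOrAdd` are hypothetical names, and interpreting `maximumStringLength` as "longer values are not pooled" is an assumption made for this sketch.

```csharp
using System;
using System.Collections.Generic;

var pool = new SimpleStringPool(maximumStringLength: 128);
var first = pool.GetOrAdd("net5.0".AsSpan());
var second = pool.GetOrAdd("xnet5.0".AsSpan(1)); // same characters, different source buffer
Console.WriteLine(ReferenceEquals(first, second)); // True, the pooled instance is reused

// Hypothetical illustration only; one instance per column would give a pool per column.
sealed class SimpleStringPool
{
    readonly Dictionary<int, List<string>> _buckets = new();
    readonly int _maximumStringLength;

    public SimpleStringPool(int maximumStringLength) =>
        _maximumStringLength = maximumStringLength;

    public string GetOrAdd(ReadOnlySpan<char> chars)
    {
        if (chars.IsEmpty) { return string.Empty; }
        // Assumption: long values rarely repeat, so they bypass the pool.
        if (chars.Length > _maximumStringLength) { return new string(chars); }

        // Hash the characters directly, without allocating a string first.
        var hash = string.GetHashCode(chars);
        if (!_buckets.TryGetValue(hash, out var candidates))
        {
            candidates = new List<string>(1);
            _buckets.Add(hash, candidates);
        }
        foreach (var candidate in candidates)
        {
            if (chars.SequenceEqual(candidate.AsSpan())) { return candidate; }
        }
        var created = new string(chars);
        candidates.Add(created);
        return created;
    }
}
```

Keeping one such pool per column means each lookup only competes with values from the same column, which fits the observation that columns rarely share values.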
Sep supports unescaping via an option, see SepReaderOptions and Unescaping. Therefore, Sep has two methods being tested. The default `Sep` without unescaping and `Sep_Unescape` where unescaping is enabled. Note that only if there are quotes will there be any unescaping, but to support unescape one has to track extra state during parsing which means there is a slight cost no matter what, most notably for the `Cols` scope. Sep is still the fastest of all (by far in many cases).
The results below show Sep is the fastest .NET CSV Parser (for this benchmark on these platforms and machines 😀). While for pure parsing allocating only a fraction of the memory due to extensive use of pooling and the `ArrayPool<T>`.
This is in many aspects due to Sep having extremely optimized string pooling and optimized hashing of `ReadOnlySpan<char>`, and thus not really due to the csv-parsing itself, since that is not a big part of the time consumed. At least not for a decently fast csv-parser.
With `ParallelEnumerate` (MT) Sep is >2x faster than Sylvan and up to 9x faster than CsvHelper.
AMD.EPYC.7763 - PackageAssets Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 50000 | 3.427 ms | 1.00 | 29 | 8487.1 | 68.5 | 1.02 KB | 1.00 |
Sep_Async | Row | 50000 | 3.656 ms | 1.07 | 29 | 7954.9 | 73.1 | 1.03 KB | 1.00 |
Sep_Unescape | Row | 50000 | 3.473 ms | 1.01 | 29 | 8376.0 | 69.5 | 1.15 KB | 1.12 |
Sylvan___ | Row | 50000 | 4.429 ms | 1.29 | 29 | 6567.1 | 88.6 | 7.67 KB | 7.48 |
ReadLine_ | Row | 50000 | 21.587 ms | 6.30 | 29 | 1347.4 | 431.7 | 88608.3 KB | 86,496.57 |
CsvHelper | Row | 50000 | 63.743 ms | 18.60 | 29 | 456.3 | 1274.9 | 20.12 KB | 19.64 |
Sep______ | Cols | 50000 | 4.758 ms | 1.00 | 29 | 6112.5 | 95.2 | 1.04 KB | 1.00 |
Sep_Unescape | Cols | 50000 | 5.691 ms | 1.20 | 29 | 5110.7 | 113.8 | 1.04 KB | 1.00 |
Sylvan___ | Cols | 50000 | 8.204 ms | 1.72 | 29 | 3545.5 | 164.1 | 7.68 KB | 7.42 |
ReadLine_ | Cols | 50000 | 22.823 ms | 4.80 | 29 | 1274.4 | 456.5 | 88608.31 KB | 85,518.30 |
CsvHelper | Cols | 50000 | 110.312 ms | 23.18 | 29 | 263.7 | 2206.2 | 445.93 KB | 430.38 |
Sep______ | Asset | 50000 | 39.731 ms | 1.00 | 29 | 732.1 | 794.6 | 13803.91 KB | 1.00 |
Sep_MT___ | Asset | 50000 | 28.730 ms | 0.72 | 29 | 1012.4 | 574.6 | 13858.99 KB | 1.00 |
Sylvan___ | Asset | 50000 | 50.605 ms | 1.28 | 29 | 574.8 | 1012.1 | 13963.34 KB | 1.01 |
ReadLine_ | Asset | 50000 | 125.231 ms | 3.16 | 29 | 232.3 | 2504.6 | 102135 KB | 7.40 |
CsvHelper | Asset | 50000 | 127.145 ms | 3.20 | 29 | 228.8 | 2542.9 | 13971.75 KB | 1.01 |
Sep______ | Asset | 1000000 | 850.145 ms | 1.00 | 581 | 684.4 | 850.1 | 266670.16 KB | 1.00 |
Sep_MT___ | Asset | 1000000 | 505.005 ms | 0.59 | 581 | 1152.2 | 505.0 | 276118.02 KB | 1.04 |
Sylvan___ | Asset | 1000000 | 1,035.263 ms | 1.22 | 581 | 562.1 | 1035.3 | 266828.4 KB | 1.00 |
ReadLine_ | Asset | 1000000 | 2,597.222 ms | 3.06 | 581 | 224.0 | 2597.2 | 2038837.9 KB | 7.65 |
CsvHelper | Asset | 1000000 | 2,649.300 ms | 3.12 | 581 | 219.6 | 2649.3 | 266845.35 KB | 1.00 |
AMD.Ryzen.7.PRO.7840U.w.Radeon.780M - PackageAssets Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 50000 | 4.250 ms | 1.00 | 29 | 6865.5 | 85.0 | 1.33 KB | 1.00 |
Sep_Async | Row | 50000 | 4.447 ms | 1.05 | 29 | 6562.5 | 88.9 | 1.32 KB | 0.99 |
Sep_Unescape | Row | 50000 | 4.278 ms | 1.01 | 29 | 6822.0 | 85.6 | 1.18 KB | 0.89 |
Sylvan___ | Row | 50000 | 4.768 ms | 1.07 | 29 | 6119.8 | 95.4 | 7.66 KB | 6.48 |
ReadLine_ | Row | 50000 | 20.959 ms | 4.71 | 29 | 1392.3 | 419.2 | 88608.26 KB | 74,925.56 |
CsvHelper | Row | 50000 | 65.193 ms | 14.64 | 29 | 447.6 | 1303.9 | 20.2 KB | 17.08 |
Sep______ | Cols | 50000 | 6.747 ms | 1.00 | 29 | 4325.2 | 134.9 | 1.19 KB | 1.00 |
Sep_Unescape | Cols | 50000 | 6.739 ms | 1.00 | 29 | 4330.0 | 134.8 | 1.19 KB | 1.00 |
Sylvan___ | Cols | 50000 | 7.509 ms | 1.12 | 29 | 3885.9 | 150.2 | 7.67 KB | 6.46 |
ReadLine_ | Cols | 50000 | 23.536 ms | 3.50 | 29 | 1239.9 | 470.7 | 88608.28 KB | 74,617.50 |
CsvHelper | Cols | 50000 | 107.075 ms | 15.94 | 29 | 272.5 | 2141.5 | 448.88 KB | 378.00 |
Sep______ | Asset | 50000 | 54.052 ms | 1.00 | 29 | 539.9 | 1081.0 | 13803.3 KB | 1.00 |
Sep_MT___ | Asset | 50000 | 35.594 ms | 0.66 | 29 | 819.8 | 711.9 | 13914.86 KB | 1.01 |
Sylvan___ | Asset | 50000 | 62.009 ms | 1.15 | 29 | 470.6 | 1240.2 | 13962.68 KB | 1.01 |
ReadLine_ | Asset | 50000 | 201.825 ms | 3.73 | 29 | 144.6 | 4036.5 | 102134.43 KB | 7.40 |
CsvHelper | Asset | 50000 | 135.566 ms | 2.51 | 29 | 215.3 | 2711.3 | 13972.69 KB | 1.01 |
Sep______ | Asset | 1000000 | 1,047.265 ms | 1.00 | 583 | 557.4 | 1047.3 | 266672.16 KB | 1.00 |
Sep_MT___ | Asset | 1000000 | 492.995 ms | 0.47 | 583 | 1184.2 | 493.0 | 267823.63 KB | 1.00 |
Sylvan___ | Asset | 1000000 | 1,218.367 ms | 1.16 | 583 | 479.2 | 1218.4 | 266825.65 KB | 1.00 |
ReadLine_ | Asset | 1000000 | 3,276.776 ms | 3.13 | 583 | 178.2 | 3276.8 | 2038836.1 KB | 7.65 |
CsvHelper | Asset | 1000000 | 2,683.525 ms | 2.56 | 583 | 217.5 | 2683.5 | 266846.78 KB | 1.00 |
AMD.Ryzen.9.5950X - PackageAssets Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 50000 | 2.230 ms | 1.00 | 29 | 13088.4 | 44.6 | 1.09 KB | 1.00 |
Sep_Async | Row | 50000 | 2.379 ms | 1.07 | 29 | 12264.0 | 47.6 | 1.02 KB | 0.93 |
Sep_Unescape | Row | 50000 | 2.305 ms | 1.03 | 29 | 12657.6 | 46.1 | 1.02 KB | 0.93 |
Sylvan___ | Row | 50000 | 2.993 ms | 1.33 | 29 | 9750.2 | 59.9 | 7.65 KB | 7.52 |
ReadLine_ | Row | 50000 | 12.106 ms | 5.36 | 29 | 2410.5 | 242.1 | 88608.25 KB | 87,077.59 |
CsvHelper | Row | 50000 | 43.313 ms | 19.19 | 29 | 673.7 | 866.3 | 20.04 KB | 19.69 |
Sep______ | Cols | 50000 | 3.211 ms | 1.00 | 29 | 9089.3 | 64.2 | 1.02 KB | 1.00 |
Sep_Unescape | Cols | 50000 | 3.845 ms | 1.20 | 29 | 7589.1 | 76.9 | 1.02 KB | 1.00 |
Sylvan___ | Cols | 50000 | 5.065 ms | 1.58 | 29 | 5760.9 | 101.3 | 7.66 KB | 7.52 |
ReadLine_ | Cols | 50000 | 12.850 ms | 4.00 | 29 | 2270.9 | 257.0 | 88608.25 KB | 86,910.78 |
CsvHelper | Cols | 50000 | 68.999 ms | 21.49 | 29 | 422.9 | 1380.0 | 445.85 KB | 437.31 |
Sep______ | Asset | 50000 | 33.615 ms | 1.00 | 29 | 868.1 | 672.3 | 13802.47 KB | 1.00 |
Sep_MT___ | Asset | 50000 | 20.231 ms | 0.60 | 29 | 1442.4 | 404.6 | 13992.1 KB | 1.01 |
Sylvan___ | Asset | 50000 | 34.762 ms | 1.03 | 29 | 839.5 | 695.2 | 13962.2 KB | 1.01 |
ReadLine_ | Asset | 50000 | 97.204 ms | 2.89 | 29 | 300.2 | 1944.1 | 102133.9 KB | 7.40 |
CsvHelper | Asset | 50000 | 83.550 ms | 2.49 | 29 | 349.3 | 1671.0 | 13970.66 KB | 1.01 |
Sep______ | Asset | 1000000 | 629.552 ms | 1.00 | 583 | 927.3 | 629.6 | 266669.13 KB | 1.00 |
Sep_MT___ | Asset | 1000000 | 261.089 ms | 0.41 | 583 | 2236.0 | 261.1 | 267793.45 KB | 1.00 |
Sylvan___ | Asset | 1000000 | 761.171 ms | 1.21 | 583 | 767.0 | 761.2 | 266825.09 KB | 1.00 |
ReadLine_ | Asset | 1000000 | 1,636.526 ms | 2.60 | 583 | 356.7 | 1636.5 | 2038835.59 KB | 7.65 |
CsvHelper | Asset | 1000000 | 1,754.461 ms | 2.79 | 583 | 332.7 | 1754.5 | 266833.16 KB | 1.00 |
Apple.M1.(Virtual) - PackageAssets Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 50000 | 4.041 ms | 1.00 | 29 | 7196.8 | 80.8 | 1033 B | 1.00 |
Sep_Async | Row | 50000 | 4.385 ms | 1.09 | 29 | 6633.5 | 87.7 | 990 B | 0.96 |
Sep_Unescape | Row | 50000 | 4.449 ms | 1.10 | 29 | 6537.7 | 89.0 | 990 B | 0.96 |
Sylvan___ | Row | 50000 | 21.045 ms | 5.22 | 29 | 1382.1 | 420.9 | 6958 B | 6.74 |
ReadLine_ | Row | 50000 | 21.449 ms | 5.32 | 29 | 1356.1 | 429.0 | 90734895 B | 87,836.30 |
CsvHelper | Row | 50000 | 46.465 ms | 11.52 | 29 | 626.0 | 929.3 | 20692 B | 20.03 |
Sep______ | Cols | 50000 | 5.001 ms | 1.00 | 29 | 5816.4 | 100.0 | 994 B | 1.00 |
Sep_Unescape | Cols | 50000 | 6.269 ms | 1.25 | 29 | 4639.4 | 125.4 | 999 B | 1.01 |
Sylvan___ | Cols | 50000 | 23.746 ms | 4.75 | 29 | 1224.9 | 474.9 | 6958 B | 7.00 |
ReadLine_ | Cols | 50000 | 21.710 ms | 4.34 | 29 | 1339.7 | 434.2 | 90734901 B | 91,282.60 |
CsvHelper | Cols | 50000 | 66.705 ms | 13.34 | 29 | 436.0 | 1334.1 | 457440 B | 460.20 |
Sep______ | Asset | 50000 | 33.390 ms | 1.00 | 29 | 871.1 | 667.8 | 14134046 B | 1.00 |
Sep_MT___ | Asset | 50000 | 22.413 ms | 0.67 | 29 | 1297.7 | 448.3 | 14280628 B | 1.01 |
Sylvan___ | Asset | 50000 | 53.205 ms | 1.60 | 29 | 546.7 | 1064.1 | 14296832 B | 1.01 |
ReadLine_ | Asset | 50000 | 109.717 ms | 3.30 | 29 | 265.1 | 2194.3 | 104585674 B | 7.40 |
CsvHelper | Asset | 50000 | 102.502 ms | 3.08 | 29 | 283.8 | 2050.0 | 14305752 B | 1.01 |
Sep______ | Asset | 1000000 | 657.056 ms | 1.00 | 581 | 885.6 | 657.1 | 273070256 B | 1.00 |
Sep_MT___ | Asset | 1000000 | 572.779 ms | 0.87 | 581 | 1015.9 | 572.8 | 284492848 B | 1.04 |
Sylvan___ | Asset | 1000000 | 1,177.217 ms | 1.80 | 581 | 494.3 | 1177.2 | 273228824 B | 1.00 |
ReadLine_ | Asset | 1000000 | 2,052.148 ms | 3.13 | 581 | 283.5 | 2052.1 | 2087769848 B | 7.65 |
CsvHelper | Asset | 1000000 | 1,733.243 ms | 2.65 | 581 | 335.7 | 1733.2 | 273238320 B | 1.00 |
The package assets benchmark (Scope `Asset`) has a very high base load in the form of the accumulated instances of `PackageAsset` and since Sep is so fast the GC becomes a significant bottleneck for the benchmark, especially for multi-threaded parsing. Switching to SERVER GC can, therefore, provide significant speedup as can be seen below.
With `ParallelEnumerate` and server GC Sep is >4x faster than Sylvan and up to 18x faster than CsvHelper. Breaking 4 GB/s parsing speed on package assets on 5950X.
AMD.EPYC.7763 - PackageAssets Benchmark Results (SERVER GC) (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Asset | 50000 | 33.15 ms | 1.00 | 29 | 877.3 | 663.0 | 13.48 MB | 1.00 |
Sep_MT___ | Asset | 50000 | 16.87 ms | 0.51 | 29 | 1724.3 | 337.4 | 13.55 MB | 1.01 |
Sylvan___ | Asset | 50000 | 45.40 ms | 1.37 | 29 | 640.6 | 908.1 | 13.63 MB | 1.01 |
ReadLine_ | Asset | 50000 | 62.51 ms | 1.89 | 29 | 465.3 | 1250.2 | 99.74 MB | 7.40 |
CsvHelper | Asset | 50000 | 120.39 ms | 3.63 | 29 | 241.6 | 2407.7 | 13.64 MB | 1.01 |
Sep______ | Asset | 1000000 | 680.86 ms | 1.00 | 581 | 854.6 | 680.9 | 260.41 MB | 1.00 |
Sep_MT___ | Asset | 1000000 | 346.45 ms | 0.51 | 581 | 1679.5 | 346.5 | 268.56 MB | 1.03 |
Sylvan___ | Asset | 1000000 | 888.66 ms | 1.31 | 581 | 654.8 | 888.7 | 260.58 MB | 1.00 |
ReadLine_ | Asset | 1000000 | 1,267.46 ms | 1.86 | 581 | 459.1 | 1267.5 | 1991.05 MB | 7.65 |
CsvHelper | Asset | 1000000 | 2,478.49 ms | 3.64 | 581 | 234.8 | 2478.5 | 260.58 MB | 1.00 |
AMD.Ryzen.7.PRO.7840U.w.Radeon.780M - PackageAssets Benchmark Results (SERVER GC) (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Asset | 50000 | 32.88 ms | 1.00 | 29 | 887.6 | 657.5 | 13.48 MB | 1.00 |
Sep_MT___ | Asset | 50000 | 12.08 ms | 0.37 | 29 | 2414.9 | 241.7 | 13.57 MB | 1.01 |
Sylvan___ | Asset | 50000 | 43.15 ms | 1.31 | 29 | 676.3 | 862.9 | 13.63 MB | 1.01 |
ReadLine_ | Asset | 50000 | 65.30 ms | 1.99 | 29 | 446.9 | 1305.9 | 99.74 MB | 7.40 |
CsvHelper | Asset | 50000 | 117.54 ms | 3.58 | 29 | 248.3 | 2350.9 | 13.64 MB | 1.01 |
Sep______ | Asset | 1000000 | 712.70 ms | 1.00 | 583 | 819.1 | 712.7 | 260.41 MB | 1.00 |
Sep_MT___ | Asset | 1000000 | 279.00 ms | 0.39 | 583 | 2092.4 | 279.0 | 262.79 MB | 1.01 |
Sylvan___ | Asset | 1000000 | 920.38 ms | 1.29 | 583 | 634.3 | 920.4 | 260.57 MB | 1.00 |
ReadLine_ | Asset | 1000000 | 1,078.15 ms | 1.52 | 583 | 541.5 | 1078.2 | 1991.05 MB | 7.65 |
CsvHelper | Asset | 1000000 | 2,417.96 ms | 3.40 | 583 | 241.4 | 2418.0 | 260.58 MB | 1.00 |
AMD.Ryzen.9.5950X - PackageAssets Benchmark Results (SERVER GC) (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Asset | 50000 | 20.951 ms | 1.00 | 29 | 1392.9 | 419.0 | 13.48 MB | 1.00 |
Sep_MT___ | Asset | 50000 | 6.614 ms | 0.32 | 29 | 4411.8 | 132.3 | 13.64 MB | 1.01 |
Sylvan___ | Asset | 50000 | 27.761 ms | 1.33 | 29 | 1051.2 | 555.2 | 13.63 MB | 1.01 |
ReadLine_ | Asset | 50000 | 33.516 ms | 1.60 | 29 | 870.7 | 670.3 | 99.74 MB | 7.40 |
CsvHelper | Asset | 50000 | 77.007 ms | 3.68 | 29 | 378.9 | 1540.1 | 13.64 MB | 1.01 |
Sep______ | Asset | 1000000 | 432.887 ms | 1.00 | 583 | 1348.6 | 432.9 | 260.41 MB | 1.00 |
Sep_MT___ | Asset | 1000000 | 119.430 ms | 0.28 | 583 | 4888.1 | 119.4 | 261.39 MB | 1.00 |
Sylvan___ | Asset | 1000000 | 559.550 ms | 1.29 | 583 | 1043.3 | 559.6 | 260.57 MB | 1.00 |
ReadLine_ | Asset | 1000000 | 573.637 ms | 1.33 | 583 | 1017.7 | 573.6 | 1991.05 MB | 7.65 |
CsvHelper | Asset | 1000000 | 1,537.602 ms | 3.55 | 583 | 379.7 | 1537.6 | 260.58 MB | 1.00 |
Apple.M1.(Virtual) - PackageAssets Benchmark Results (SERVER GC) (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Asset | 50000 | 26.05 ms | 1.00 | 29 | 1116.6 | 521.0 | 13.48 MB | 1.00 |
Sep_MT___ | Asset | 50000 | 11.03 ms | 0.42 | 29 | 2636.5 | 220.6 | 13.59 MB | 1.01 |
Sylvan___ | Asset | 50000 | 48.67 ms | 1.87 | 29 | 597.6 | 973.4 | 13.63 MB | 1.01 |
ReadLine_ | Asset | 50000 | 34.94 ms | 1.34 | 29 | 832.4 | 698.9 | 99.74 MB | 7.40 |
CsvHelper | Asset | 50000 | 73.10 ms | 2.81 | 29 | 397.9 | 1461.9 | 13.64 MB | 1.01 |
Sep______ | Asset | 1000000 | 507.79 ms | 1.00 | 581 | 1145.9 | 507.8 | 260.41 MB | 1.00 |
Sep_MT___ | Asset | 1000000 | 204.22 ms | 0.40 | 581 | 2849.3 | 204.2 | 269.28 MB | 1.03 |
Sylvan___ | Asset | 1000000 | 991.41 ms | 1.95 | 581 | 586.9 | 991.4 | 260.57 MB | 1.00 |
ReadLine_ | Asset | 1000000 | 1,083.07 ms | 2.13 | 581 | 537.2 | 1083.1 | 1991.05 MB | 7.65 |
CsvHelper | Asset | 1000000 | 1,924.79 ms | 3.79 | 581 | 302.3 | 1924.8 | 260.58 MB | 1.00 |
`NCsvPerf` does not examine performance in the face of quotes in the csv. This is relevant since some libraries like Sylvan will revert to a slower (not SIMD vectorized) parsing code path if it encounters quotes. Sep was designed to always use SIMD vectorization no matter what.
Since there are two extra `char`s to handle per column, it does have a significant impact on performance, no matter what though. This is expected when looking at the numbers. For each row of 25 columns, there are 24 separators (here `,`) and one set of line endings (here `\r\n`). That's 26 characters. Adding quotes around each of the 25 columns will add 50 characters or almost triple the total to 76.
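
The character count above can be reproduced with a small calculation. `SpecialChars.PerRow` is a hypothetical helper used only to spell out the arithmetic from the paragraph; it counts the structural characters a parser has to detect per row.

```csharp
using System;

var unquoted = SpecialChars.PerRow(cols: 25, lineEndingLength: 2, quoted: false);
var quoted = SpecialChars.PerRow(cols: 25, lineEndingLength: 2, quoted: true);

Console.WriteLine(unquoted); // 26
Console.WriteLine(quoted);   // 76
Console.WriteLine(Math.Round((double)quoted / unquoted, 2)); // almost triple

static class SpecialChars
{
    // Separators between columns, plus line ending characters,
    // plus two quotes per column when every column is quoted.
    public static int PerRow(int cols, int lineEndingLength, bool quoted)
    {
        var separators = cols - 1;
        var quotes = quoted ? 2 * cols : 0;
        return separators + lineEndingLength + quotes;
    }
}
```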
AMD.EPYC.7763 - PackageAssets with Quotes Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 50000 | 10.55 ms | 1.00 | 33 | 3154.2 | 211.0 | 1.06 KB | 1.00 |
Sep_Async | Row | 50000 | 12.02 ms | 1.14 | 33 | 2768.6 | 240.4 | 1.06 KB | 1.00 |
Sep_Unescape | Row | 50000 | 10.44 ms | 0.99 | 33 | 3189.2 | 208.7 | 1.06 KB | 1.00 |
Sylvan___ | Row | 50000 | 25.83 ms | 2.45 | 33 | 1288.3 | 516.7 | 7.74 KB | 7.30 |
ReadLine_ | Row | 50000 | 25.94 ms | 2.46 | 33 | 1282.8 | 518.9 | 108778.82 KB | 102,568.61 |
CsvHelper | Row | 50000 | 77.25 ms | 7.32 | 33 | 430.8 | 1545.0 | 20.29 KB | 19.13 |
Sep______ | Cols | 50000 | 12.01 ms | 1.00 | 33 | 2770.2 | 240.3 | 1.07 KB | 1.00 |
Sep_Unescape | Cols | 50000 | 13.29 ms | 1.11 | 33 | 2505.2 | 265.7 | 1.08 KB | 1.00 |
Sylvan___ | Cols | 50000 | 29.56 ms | 2.46 | 33 | 1126.0 | 591.2 | 7.76 KB | 7.24 |
ReadLine_ | Cols | 50000 | 27.61 ms | 2.30 | 33 | 1205.6 | 552.1 | 108778.82 KB | 101,447.64 |
CsvHelper | Cols | 50000 | 107.02 ms | 8.91 | 33 | 311.0 | 2140.3 | 445.93 KB | 415.88 |
Sep______ | Asset | 50000 | 47.85 ms | 1.00 | 33 | 695.6 | 956.9 | 13802.84 KB | 1.00 |
Sep_MT___ | Asset | 50000 | 30.16 ms | 0.63 | 33 | 1103.4 | 603.2 | 13862.06 KB | 1.00 |
Sylvan___ | Asset | 50000 | 72.55 ms | 1.52 | 33 | 458.8 | 1450.9 | 13963.14 KB | 1.01 |
ReadLine_ | Asset | 50000 | 155.62 ms | 3.25 | 33 | 213.9 | 3112.3 | 122305.53 KB | 8.86 |
CsvHelper | Asset | 50000 | 126.34 ms | 2.64 | 33 | 263.4 | 2526.8 | 13973.89 KB | 1.01 |
Sep______ | Asset | 1000000 | 986.97 ms | 1.00 | 665 | 674.6 | 987.0 | 266670.24 KB | 1.00 |
Sep_MT___ | Asset | 1000000 | 587.82 ms | 0.60 | 665 | 1132.7 | 587.8 | 272038.75 KB | 1.02 |
Sylvan___ | Asset | 1000000 | 1,464.19 ms | 1.48 | 665 | 454.7 | 1464.2 | 266840.84 KB | 1.00 |
ReadLine_ | Asset | 1000000 | 3,142.01 ms | 3.18 | 665 | 211.9 | 3142.0 | 2442321.3 KB | 9.16 |
CsvHelper | Asset | 1000000 | 2,609.30 ms | 2.64 | 665 | 255.2 | 2609.3 | 266834.41 KB | 1.00 |
AMD.Ryzen.7.PRO.7840U.w.Radeon.780M - PackageAssets with Quotes Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 50000 | 10.41 ms | 1.00 | 33 | 3206.3 | 208.2 | 1.21 KB | 1.00 |
Sep_Async | Row | 50000 | 10.75 ms | 1.03 | 33 | 3105.5 | 215.0 | 1.17 KB | 0.97 |
Sep_Unescape | Row | 50000 | 10.32 ms | 0.99 | 33 | 3233.3 | 206.5 | 1.21 KB | 1.00 |
Sylvan___ | Row | 50000 | 26.60 ms | 2.51 | 33 | 1254.7 | 532.0 | 7.72 KB | 6.39 |
ReadLine_ | Row | 50000 | 24.43 ms | 2.30 | 33 | 1366.2 | 488.6 | 108778.79 KB | 90,048.08 |
CsvHelper | Row | 50000 | 71.27 ms | 6.72 | 33 | 468.3 | 1425.5 | 23.22 KB | 19.22 |
Sep______ | Cols | 50000 | 12.22 ms | 1.00 | 33 | 2730.9 | 244.4 | 1.22 KB | 1.00 |
Sep_Unescape | Cols | 50000 | 13.10 ms | 1.07 | 33 | 2547.4 | 262.1 | 1.22 KB | 1.00 |
Sylvan___ | Cols | 50000 | 30.00 ms | 2.46 | 33 | 1112.4 | 600.1 | 7.73 KB | 6.35 |
ReadLine_ | Cols | 50000 | 25.36 ms | 2.08 | 33 | 1316.0 | 507.3 | 108778.78 KB | 89,397.65 |
CsvHelper | Cols | 50000 | 101.58 ms | 8.31 | 33 | 328.6 | 2031.6 | 445.86 KB | 366.42 |
Sep______ | Asset | 50000 | 59.31 ms | 1.00 | 33 | 562.8 | 1186.2 | 13803.31 KB | 1.00 |
Sep_MT___ | Asset | 50000 | 41.53 ms | 0.70 | 33 | 803.6 | 830.7 | 13939.88 KB | 1.01 |
Sylvan___ | Asset | 50000 | 81.00 ms | 1.37 | 33 | 412.1 | 1619.9 | 13962.41 KB | 1.01 |
ReadLine_ | Asset | 50000 | 244.05 ms | 4.12 | 33 | 136.8 | 4881.0 | 122304.87 KB | 8.86 |
CsvHelper | Asset | 50000 | 134.78 ms | 2.27 | 33 | 247.7 | 2695.5 | 13973.52 KB | 1.01 |
Sep______ | Asset | 1000000 | 1,168.49 ms | 1.00 | 667 | 571.4 | 1168.5 | 266670.8 KB | 1.00 |
Sep_MT___ | Asset | 1000000 | 670.66 ms | 0.57 | 667 | 995.6 | 670.7 | 268687.83 KB | 1.01 |
Sylvan___ | Asset | 1000000 | 1,636.51 ms | 1.40 | 667 | 408.0 | 1636.5 | 266825.86 KB | 1.00 |
ReadLine_ | Asset | 1000000 | 4,203.80 ms | 3.60 | 667 | 158.8 | 4203.8 | 2442318.99 KB | 9.16 |
CsvHelper | Asset | 1000000 | 2,561.09 ms | 2.19 | 667 | 260.7 | 2561.1 | 266837.54 KB | 1.00 |
AMD.Ryzen.9.5950X - PackageAssets with Quotes Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 50000 | 7.046 ms | 1.00 | 33 | 4737.2 | 140.9 | 1.04 KB | 1.00 |
Sep_Async | Row | 50000 | 8.137 ms | 1.15 | 33 | 4101.8 | 162.7 | 1.04 KB | 1.00 |
Sep_Unescape | Row | 50000 | 7.473 ms | 1.06 | 33 | 4466.7 | 149.5 | 1.04 KB | 1.00 |
Sylvan___ | Row | 50000 | 17.571 ms | 2.38 | 33 | 1899.5 | 351.4 | 7.69 KB | 7.41 |
ReadLine_ | Row | 50000 | 14.336 ms | 1.94 | 33 | 2328.2 | 286.7 | 108778.75 KB | 104,689.33 |
CsvHelper | Row | 50000 | 52.672 ms | 7.12 | 33 | 633.7 | 1053.4 | 20.05 KB | 19.29 |
Sep______ | Cols | 50000 | 8.126 ms | 1.00 | 33 | 4107.5 | 162.5 | 1.04 KB | 1.00 |
Sep_Unescape | Cols | 50000 | 9.748 ms | 1.20 | 33 | 3424.0 | 195.0 | 1.05 KB | 1.01 |
Sylvan___ | Cols | 50000 | 20.503 ms | 2.52 | 33 | 1628.0 | 410.1 | 7.7 KB | 7.39 |
ReadLine_ | Cols | 50000 | 16.513 ms | 2.03 | 33 | 2021.3 | 330.3 | 108778.76 KB | 104,394.99 |
CsvHelper | Cols | 50000 | 74.224 ms | 9.13 | 33 | 449.7 | 1484.5 | 445.85 KB | 427.88 |
Sep______ | Asset | 50000 | 39.523 ms | 1.00 | 33 | 844.5 | 790.5 | 13802.63 KB | 1.00 |
Sep_MT___ | Asset | 50000 | 23.386 ms | 0.59 | 33 | 1427.2 | 467.7 | 13981.76 KB | 1.01 |
Sylvan___ | Asset | 50000 | 50.803 ms | 1.29 | 33 | 657.0 | 1016.1 | 13962.08 KB | 1.01 |
ReadLine_ | Asset | 50000 | 114.306 ms | 2.89 | 33 | 292.0 | 2286.1 | 122304.45 KB | 8.86 |
CsvHelper | Asset | 50000 | 88.786 ms | 2.25 | 33 | 375.9 | 1775.7 | 13970.43 KB | 1.01 |
Sep______ | Asset | 1000000 | 752.681 ms | 1.00 | 667 | 887.1 | 752.7 | 266669 KB | 1.00 |
Sep_MT___ | Asset | 1000000 | 377.733 ms | 0.50 | 667 | 1767.7 | 377.7 | 267992.5 KB | 1.00 |
Sylvan___ | Asset | 1000000 | 1,091.345 ms | 1.45 | 667 | 611.8 | 1091.3 | 266825.09 KB | 1.00 |
ReadLine_ | Asset | 1000000 | 2,615.390 ms | 3.47 | 667 | 255.3 | 2615.4 | 2442319.06 KB | 9.16 |
CsvHelper | Asset | 1000000 | 1,756.409 ms | 2.33 | 667 | 380.2 | 1756.4 | 266839.53 KB | 1.00 |
Apple.M1.(Virtual) - PackageAssets with Quotes Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 50000 | 12.18 ms | 1.01 | 33 | 2732.2 | 243.6 | 1.09 KB | 1.00 |
Sep_Async | Row | 50000 | 11.74 ms | 0.97 | 33 | 2835.4 | 234.8 | 1 KB | 0.92 |
Sep_Unescape | Row | 50000 | 11.10 ms | 0.92 | 33 | 2997.8 | 222.0 | 1 KB | 0.92 |
Sylvan___ | Row | 50000 | 24.97 ms | 2.06 | 33 | 1332.7 | 499.5 | 6.79 KB | 6.25 |
ReadLine_ | Row | 50000 | 26.23 ms | 2.17 | 33 | 1269.0 | 524.5 | 108778.81 KB | 100,080.42 |
CsvHelper | Row | 50000 | 49.35 ms | 4.08 | 33 | 674.4 | 986.9 | 20.09 KB | 18.49 |
Sep______ | Cols | 50000 | 12.41 ms | 1.00 | 33 | 2681.1 | 248.3 | 1.01 KB | 1.00 |
Sep_Unescape | Cols | 50000 | 14.75 ms | 1.19 | 33 | 2256.8 | 295.0 | 1.01 KB | 1.00 |
Sylvan___ | Cols | 50000 | 26.17 ms | 2.11 | 33 | 1271.6 | 523.5 | 6.79 KB | 6.72 |
ReadLine_ | Cols | 50000 | 25.07 ms | 2.02 | 33 | 1327.5 | 501.4 | 108778.8 KB | 107,622.70 |
CsvHelper | Cols | 50000 | 78.74 ms | 6.35 | 33 | 422.7 | 1574.8 | 446.72 KB | 441.97 |
Sep______ | Asset | 50000 | 39.11 ms | 1.00 | 33 | 851.0 | 782.1 | 13802.77 KB | 1.00 |
Sep_MT___ | Asset | 50000 | 30.33 ms | 0.78 | 33 | 1097.4 | 606.5 | 13876.85 KB | 1.01 |
Sylvan___ | Asset | 50000 | 56.15 ms | 1.44 | 33 | 592.7 | 1123.1 | 13961.25 KB | 1.01 |
ReadLine_ | Asset | 50000 | 127.77 ms | 3.28 | 33 | 260.5 | 2555.5 | 122305.8 KB | 8.86 |
CsvHelper | Asset | 50000 | 80.19 ms | 2.06 | 33 | 415.1 | 1603.7 | 13971.07 KB | 1.01 |
Sep______ | Asset | 1000000 | 794.37 ms | 1.00 | 665 | 838.1 | 794.4 | 266670.09 KB | 1.00 |
Sep_MT___ | Asset | 1000000 | 623.62 ms | 0.79 | 665 | 1067.6 | 623.6 | 275579.94 KB | 1.03 |
Sylvan___ | Asset | 1000000 | 1,218.66 ms | 1.54 | 665 | 546.3 | 1218.7 | 266825.24 KB | 1.00 |
ReadLine_ | Asset | 1000000 | 2,426.00 ms | 3.06 | 665 | 274.4 | 2426.0 | 2442322 KB | 9.16 |
CsvHelper | Asset | 1000000 | 2,183.43 ms | 2.75 | 665 | 304.9 | 2183.4 | 266837.34 KB | 1.00 |
Here again are benchmark results with server garbage collection, which provides significant speedup over workstation garbage collection.
AMD.EPYC.7763 - PackageAssets with Quotes Benchmark Results (SERVER GC) (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Asset | 50000 | 39.11 ms | 1.00 | 33 | 851.0 | 782.2 | 13.48 MB | 1.00 |
Sep_MT___ | Asset | 50000 | 23.02 ms | 0.59 | 33 | 1445.6 | 460.5 | 13.53 MB | 1.00 |
Sylvan___ | Asset | 50000 | 64.15 ms | 1.64 | 33 | 518.8 | 1283.0 | 13.63 MB | 1.01 |
ReadLine_ | Asset | 50000 | 69.84 ms | 1.79 | 33 | 476.5 | 1396.8 | 119.44 MB | 8.86 |
CsvHelper | Asset | 50000 | 119.33 ms | 3.05 | 33 | 278.9 | 2386.7 | 13.64 MB | 1.01 |
Sep______ | Asset | 1000000 | 851.15 ms | 1.00 | 665 | 782.2 | 851.1 | 260.41 MB | 1.00 |
Sep_MT___ | Asset | 1000000 | 433.22 ms | 0.51 | 665 | 1536.9 | 433.2 | 262.82 MB | 1.01 |
Sylvan___ | Asset | 1000000 | 1,328.75 ms | 1.56 | 665 | 501.1 | 1328.7 | 260.57 MB | 1.00 |
ReadLine_ | Asset | 1000000 | 1,477.57 ms | 1.74 | 665 | 450.6 | 1477.6 | 2385.07 MB | 9.16 |
CsvHelper | Asset | 1000000 | 2,519.43 ms | 2.96 | 665 | 264.3 | 2519.4 | 260.59 MB | 1.00 |
AMD.Ryzen.7.PRO.7840U.w.Radeon.780M - PackageAssets with Quotes Benchmark Results (SERVER GC) (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Asset | 50000 | 40.64 ms | 1.00 | 33 | 821.3 | 812.8 | 13.48 MB | 1.00 |
Sep_MT___ | Asset | 50000 | 22.02 ms | 0.54 | 33 | 1515.5 | 440.5 | 13.59 MB | 1.01 |
Sylvan___ | Asset | 50000 | 65.40 ms | 1.61 | 33 | 510.3 | 1308.1 | 13.63 MB | 1.01 |
ReadLine_ | Asset | 50000 | 86.96 ms | 2.14 | 33 | 383.8 | 1739.1 | 119.44 MB | 8.86 |
CsvHelper | Asset | 50000 | 113.09 ms | 2.78 | 33 | 295.1 | 2261.8 | 13.64 MB | 1.01 |
Sep______ | Asset | 1000000 | 837.08 ms | 1.00 | 667 | 797.7 | 837.1 | 260.41 MB | 1.00 |
Sep_MT___ | Asset | 1000000 | 424.07 ms | 0.51 | 667 | 1574.5 | 424.1 | 262.92 MB | 1.01 |
Sylvan___ | Asset | 1000000 | 1,355.96 ms | 1.62 | 667 | 492.4 | 1356.0 | 260.57 MB | 1.00 |
ReadLine_ | Asset | 1000000 | 1,145.13 ms | 1.37 | 667 | 583.1 | 1145.1 | 2385.07 MB | 9.16 |
CsvHelper | Asset | 1000000 | 2,308.37 ms | 2.76 | 667 | 289.3 | 2308.4 | 260.58 MB | 1.00 |
AMD.Ryzen.9.5950X - PackageAssets with Quotes Benchmark Results (SERVER GC) (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Asset | 50000 | 26.42 ms | 1.00 | 33 | 1263.1 | 528.5 | 13.48 MB | 1.00 |
Sep_MT___ | Asset | 50000 | 11.53 ms | 0.44 | 33 | 2894.1 | 230.7 | 13.64 MB | 1.01 |
Sylvan___ | Asset | 50000 | 43.05 ms | 1.63 | 33 | 775.3 | 861.1 | 13.63 MB | 1.01 |
ReadLine_ | Asset | 50000 | 37.30 ms | 1.41 | 33 | 894.8 | 746.0 | 119.44 MB | 8.86 |
CsvHelper | Asset | 50000 | 78.91 ms | 2.99 | 33 | 423.0 | 1578.1 | 13.64 MB | 1.01 |
Sep______ | Asset | 1000000 | 538.48 ms | 1.00 | 667 | 1240.0 | 538.5 | 260.43 MB | 1.00 |
Sep_MT___ | Asset | 1000000 | 213.29 ms | 0.40 | 667 | 3130.5 | 213.3 | 261.37 MB | 1.00 |
Sylvan___ | Asset | 1000000 | 879.04 ms | 1.63 | 667 | 759.6 | 879.0 | 260.57 MB | 1.00 |
ReadLine_ | Asset | 1000000 | 642.57 ms | 1.19 | 667 | 1039.1 | 642.6 | 2385.07 MB | 9.16 |
CsvHelper | Asset | 1000000 | 1,598.79 ms | 2.97 | 667 | 417.6 | 1598.8 | 260.58 MB | 1.00 |
Apple.M1.(Virtual) - PackageAssets with Quotes Benchmark Results (SERVER GC) (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Asset | 50000 | 41.29 ms | 1.03 | 33 | 806.0 | 825.9 | 13.48 MB | 1.00 |
Sep_MT___ | Asset | 50000 | 31.46 ms | 0.78 | 33 | 1058.0 | 629.1 | 13.53 MB | 1.00 |
Sylvan___ | Asset | 50000 | 58.99 ms | 1.47 | 33 | 564.2 | 1179.9 | 13.63 MB | 1.01 |
ReadLine_ | Asset | 50000 | 57.46 ms | 1.43 | 33 | 579.2 | 1149.3 | 119.44 MB | 8.86 |
CsvHelper | Asset | 50000 | 85.16 ms | 2.12 | 33 | 390.8 | 1703.2 | 13.64 MB | 1.01 |
Sep______ | Asset | 1000000 | 696.42 ms | 1.00 | 665 | 956.0 | 696.4 | 260.41 MB | 1.00 |
Sep_MT___ | Asset | 1000000 | 529.92 ms | 0.76 | 665 | 1256.4 | 529.9 | 266.15 MB | 1.02 |
Sylvan___ | Asset | 1000000 | 1,168.96 ms | 1.68 | 665 | 569.6 | 1169.0 | 260.57 MB | 1.00 |
ReadLine_ | Asset | 1000000 | 1,593.69 ms | 2.29 | 665 | 417.8 | 1593.7 | 2385.08 MB | 9.16 |
CsvHelper | Asset | 1000000 | 1,663.19 ms | 2.39 | 665 | 400.3 | 1663.2 | 260.58 MB | 1.00 |
Similar to the benchmark related to quotes, here spaces and quotes `"` are added to relevant columns to benchmark the impact of trimming and unescaping on low-level column access. That is, basically `"` is prepended and appended to each column. This tests the assumed most common case and fast path part of trimming and unescaping in Sep. Sep is about 10x faster than CsvHelper for this. Sylvan does not appear to support automatic trimming and is, therefore, not included.
AMD.EPYC.7763 - PackageAssets with Spaces and Quotes Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep_ | Cols | 50000 | 13.32 ms | 1.00 | 41 | 3128.8 | 266.4 | 1.08 KB | 1.00 |
Sep_Trim | Cols | 50000 | 26.13 ms | 1.96 | 41 | 1595.1 | 522.5 | 1.11 KB | 1.02 |
Sep_TrimUnescape | Cols | 50000 | 19.37 ms | 1.45 | 41 | 2151.4 | 387.4 | 1.11 KB | 1.03 |
Sep_TrimUnescapeTrim | Cols | 50000 | 21.30 ms | 1.60 | 41 | 1956.4 | 426.0 | 1.11 KB | 1.03 |
CsvHelper_TrimUnescape | Cols | 50000 | 145.66 ms | 10.94 | 41 | 286.1 | 2913.2 | 451.86 KB | 418.74 |
CsvHelper_TrimUnescapeTrim | Cols | 50000 | 143.79 ms | 10.80 | 41 | 289.8 | 2875.8 | 446.2 KB | 413.49 |
AMD.Ryzen.7.PRO.7840U.w.Radeon.780M - PackageAssets with Spaces and Quotes Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep_ | Cols | 50000 | 13.52 ms | 1.00 | 41 | 3089.2 | 270.4 | 1.22 KB | 1.00 |
Sep_Trim | Cols | 50000 | 18.53 ms | 1.37 | 41 | 2253.9 | 370.6 | 1.91 KB | 1.57 |
Sep_TrimUnescape | Cols | 50000 | 19.87 ms | 1.47 | 41 | 2102.1 | 397.4 | 1.25 KB | 1.02 |
Sep_TrimUnescapeTrim | Cols | 50000 | 22.41 ms | 1.66 | 41 | 1863.6 | 448.3 | 1.26 KB | 1.03 |
CsvHelper_TrimUnescape | Cols | 50000 | 129.55 ms | 9.58 | 41 | 322.4 | 2591.0 | 451.52 KB | 369.89 |
CsvHelper_TrimUnescapeTrim | Cols | 50000 | 127.82 ms | 9.45 | 41 | 326.8 | 2556.4 | 445.86 KB | 365.25 |
AMD.Ryzen.9.5950X - PackageAssets with Spaces and Quotes Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep_ | Cols | 50000 | 9.467 ms | 1.00 | 41 | 4412.2 | 189.3 | 1.05 KB | 1.00 |
Sep_Trim | Cols | 50000 | 12.972 ms | 1.37 | 41 | 3219.9 | 259.4 | 1.06 KB | 1.01 |
Sep_TrimUnescape | Cols | 50000 | 13.630 ms | 1.44 | 41 | 3064.5 | 272.6 | 1.06 KB | 1.02 |
Sep_TrimUnescapeTrim | Cols | 50000 | 15.502 ms | 1.64 | 41 | 2694.4 | 310.0 | 1.07 KB | 1.03 |
CsvHelper_TrimUnescape | Cols | 50000 | 98.444 ms | 10.40 | 41 | 424.3 | 1968.9 | 451.52 KB | 431.70 |
CsvHelper_TrimUnescapeTrim | Cols | 50000 | 97.110 ms | 10.26 | 41 | 430.1 | 1942.2 | 445.86 KB | 426.29 |
Apple.M1.(Virtual) - PackageAssets with Spaces and Quotes Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep_ | Cols | 50000 | 13.51 ms | 1.00 | 41 | 3085.8 | 270.1 | 1.1 KB | 1.00 |
Sep_Trim | Cols | 50000 | 16.78 ms | 1.24 | 41 | 2484.2 | 335.5 | 1.03 KB | 0.94 |
Sep_TrimUnescape | Cols | 50000 | 17.90 ms | 1.33 | 41 | 2327.7 | 358.1 | 1.37 KB | 1.25 |
Sep_TrimUnescapeTrim | Cols | 50000 | 21.65 ms | 1.60 | 41 | 1924.5 | 433.1 | 1.37 KB | 1.25 |
CsvHelper_TrimUnescape | Cols | 50000 | 95.06 ms | 7.04 | 41 | 438.4 | 1901.1 | 451.6 KB | 410.32 |
CsvHelper_TrimUnescapeTrim | Cols | 50000 | 93.41 ms | 6.92 | 41 | 446.1 | 1868.2 | 445.93 KB | 405.18 |
The `FloatsReaderBench.cs` benchmark demonstrates what Sep is built for, namely parsing 32-bit floating points or features as in machine learning. Here a simple CSV file is randomly generated with `N` ground truth values, `N` predicted result values and nothing else (note this was changed from version 0.3.0, prior to that there were some extra leading columns). `N = 20` here. For example:
```text
GT_Feature0;GT_Feature1;GT_Feature2;GT_Feature3;GT_Feature4;GT_Feature5;GT_Feature6;GT_Feature7;GT_Feature8;GT_Feature9;GT_Feature10;GT_Feature11;GT_Feature12;GT_Feature13;GT_Feature14;GT_Feature15;GT_Feature16;GT_Feature17;GT_Feature18;GT_Feature19;RE_Feature0;RE_Feature1;RE_Feature2;RE_Feature3;RE_Feature4;RE_Feature5;RE_Feature6;RE_Feature7;RE_Feature8;RE_Feature9;RE_Feature10;RE_Feature11;RE_Feature12;RE_Feature13;RE_Feature14;RE_Feature15;RE_Feature16;RE_Feature17;RE_Feature18;RE_Feature19
0.52276427;0.16843422;0.26259267;0.7244084;0.51292276;0.17365117;0.76125056;0.23458846;0.2573214;0.50560355;0.3202332;0.3809696;0.26024464;0.5174511;0.035318818;0.8141374;0.57719684;0.3974705;0.15219308;0.09011261;0.70515215;0.81618196;0.5399706;0.044147138;0.7111546;0.14776127;0.90621275;0.6925897;0.5164137;0.18637845;0.041509967;0.30819967;0.5831603;0.8210651;0.003954861;0.535722;0.8051845;0.7483589;0.3845737;0.14911908
0.6264564;0.11517637;0.24996082;0.77242833;0.2896067;0.6481459;0.14364648;0.044498358;0.6045593;0.51591337;0.050794687;0.42036617;0.7065823;0.6284636;0.21844554;0.013253775;0.36516154;0.2674384;0.06866083;0.71817476;0.07094294;0.46409357;0.012033525;0.7978093;0.43917948;0.5134962;0.4995968;0.008952909;0.82883793;0.012896823;0.0030740085;0.063773096;0.6541431;0.034539033;0.9135142;0.92897075;0.46119377;0.37533295;0.61660606;0.044443816
0.7922863;0.5323656;0.400699;0.29737252;0.9072584;0.58673894;0.73510516;0.019412167;0.88168067;0.9576787;0.33283427;0.7107;0.1623628;0.10314285;0.4521515;0.33324885;0.7761104;0.14854911;0.13469358;0.21566042;0.59166247;0.5128394;0.98702157;0.766223;0.67204326;0.7149494;0.2894748;0.55206;0.9898286;0.65083236;0.02421702;0.34540752;0.92906284;0.027142895;0.21974725;0.26544374;0.03848049;0.2161237;0.59233844;0.42221397
0.10609442;0.32130885;0.32383907;0.7511514;0.8258279;0.00904226;0.0420841;0.84049565;0.8958947;0.23807365;0.92621964;0.8452882;0.2794469;0.545344;0.63447595;0.62532926;0.19230893;0.29726416;0.18304513;0.029583583;0.23084833;0.93346167;0.98742676;0.78163713;0.13521992;0.8833956;0.18670778;0.29476836;0.5599867;0.5562107;0.7124796;0.121927656;0.5981778;0.39144602;0.88092715;0.4449142;0.34820423;0.96379805;0.46364686;0.54301775
```
For `Scope=Floats` the benchmark parses the features as two spans of `float`s; one for ground truth values and one for predicted result values. It then calculates the mean squared error (MSE) of those as an example. For Sep this code is succinct and still incredibly efficient:
```csharp
using var reader = Sep.Reader().From(Reader.CreateReader());

var groundTruthColNames = reader.Header.NamesStartingWith("GT_");
var resultColNames = groundTruthColNames.Select(n =>
    n.Replace("GT_", "RE_", StringComparison.Ordinal))
    .ToArray();

var sum = 0.0;
var count = 0;
foreach (var row in reader)
{
    var gts = row[groundTruthColNames].Parse<float>();
    var res = row[resultColNames].Parse<float>();

    sum += MeanSquaredError(gts, res);
    ++count;
}
return sum / count;
```
Note how one can access and parse multiple columns easily while there are no repeated allocations for the parsed floating points. Sep internally handles a pool of arrays for handling multiple columns and returns spans for them.
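The `MeanSquaredError` helper called in the snippet above is not shown in this section. Below is a minimal sketch of what such a helper could look like, assuming it takes two equal-length spans of `float` and accumulates in `double`. The signature, the accumulation type and the handling of empty input are assumptions for illustration, not taken from the benchmark source.

```csharp
using System;

// Hypothetical helper: the name matches the call in the benchmark snippet,
// but the signature and body are assumptions for illustration.
static double MeanSquaredError(ReadOnlySpan<float> groundTruth, ReadOnlySpan<float> result)
{
    if (groundTruth.Length != result.Length)
    {
        throw new ArgumentException("Spans must have the same length.");
    }
    if (groundTruth.Length == 0) { return 0.0; }

    var sum = 0.0;
    for (var i = 0; i < groundTruth.Length; i++)
    {
        // Widen to double before subtracting to limit rounding error.
        var error = (double)groundTruth[i] - result[i];
        sum += error * error;
    }
    return sum / groundTruth.Length;
}

// Plain arrays stand in for the spans returned by parsing multiple columns.
float[] exampleGts = { 1f, 2f, 3f };
float[] exampleRes = { 1f, 4f, 3f };
Console.WriteLine(MeanSquaredError(exampleGts, exampleRes));
```

Since `Span<float>` converts implicitly to `ReadOnlySpan<float>`, a helper with this shape accepts parsed column spans without copying or allocating.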
The benchmark is based on an assumption of accessing columns by name per row. Ideally, one would look up the indices of the columns by name before enumerating rows, but this is a repeated nuisance to have to handle and Sep was built to avoid this. Hence, the comparison is based on looking up by name for each, even if this ends up adding a bit more code in the benchmark for other approaches.
As can be seen below, the actual low level parsing of the separated values is a tiny part of the total runtime for Sep for which the runtime is dominated by parsing the floating points. Since Sep uses csFastFloat for an integrated fast floating point parser, it is >2x faster than Sylvan for example. If using Sylvan one may consider using csFastFloat if that is an option. With the multi-threaded (MT) `ParallelEnumerate` implementation Sep is up to 23x faster than Sylvan.
CsvHelper suffers from the fact that one can only access the column as a string so this has to be allocated for each column (ReadLine by definition always allocates a string per column). Still CsvHelper is significantly slower than the naive `ReadLine` approach. Sep is >4x faster than CsvHelper and up to 35x faster when using `ParallelEnumerate`.
Note that `ParallelEnumerate` provides significant speedup over single-threaded parsing even though the source is only about 20 MB. This underlines how efficient `ParallelEnumerate` is, but bear in mind that this is for the case of repeated micro-benchmark runs.
It is a testament to how good .NET and the .NET GC are that ReadLine is pretty good compared to CsvHelper despite allocating a lot of strings.
AMD.EPYC.7763 - FloatsReader Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 25000 | 2.913 ms | 1.00 | 20 | 6958.5 | 116.5 | 1.26 KB | 1.00 |
Sylvan___ | Row | 25000 | 3.566 ms | 1.22 | 20 | 5685.3 | 142.6 | 10.71 KB | 8.51 |
ReadLine_ | Row | 25000 | 18.192 ms | 6.24 | 20 | 1114.3 | 727.7 | 73489.7 KB | 58,426.60 |
CsvHelper | Row | 25000 | 38.233 ms | 13.12 | 20 | 530.2 | 1529.3 | 20.06 KB | 15.95 |
Sep______ | Cols | 25000 | 3.950 ms | 1.00 | 20 | 5131.9 | 158.0 | 1.26 KB | 1.00 |
Sylvan___ | Cols | 25000 | 5.911 ms | 1.50 | 20 | 3429.6 | 236.4 | 10.72 KB | 8.48 |
ReadLine_ | Cols | 25000 | 19.574 ms | 4.96 | 20 | 1035.6 | 783.0 | 73489.68 KB | 58,155.66 |
CsvHelper | Cols | 25000 | 41.031 ms | 10.39 | 20 | 494.1 | 1641.3 | 21340.29 KB | 16,887.53 |
Sep______ | Floats | 25000 | 31.469 ms | 1.00 | 20 | 644.2 | 1258.7 | 8.08 KB | 1.00 |
Sep_MT___ | Floats | 25000 | 12.639 ms | 0.40 | 20 | 1604.0 | 505.5 | 67.81 KB | 8.40 |
Sylvan___ | Floats | 25000 | 84.199 ms | 2.68 | 20 | 240.8 | 3368.0 | 19.89 KB | 2.46 |
ReadLine_ | Floats | 25000 | 112.934 ms | 3.59 | 20 | 179.5 | 4517.4 | 73493.2 KB | 9,101.10 |
CsvHelper | Floats | 25000 | 161.035 ms | 5.12 | 20 | 125.9 | 6441.4 | 22062.53 KB | 2,732.14 |
AMD.Ryzen.7.PRO.7840U.w.Radeon.780M - FloatsReader Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 25000 | 3.415 ms | 1.00 | 20 | 5949.8 | 136.6 | 1.41 KB | 1.00 |
Sylvan___ | Row | 25000 | 3.803 ms | 1.11 | 20 | 5343.6 | 152.1 | 10.71 KB | 7.59 |
ReadLine_ | Row | 25000 | 15.853 ms | 4.64 | 20 | 1281.8 | 634.1 | 73489.64 KB | 52,078.47 |
CsvHelper | Row | 25000 | 39.778 ms | 11.65 | 20 | 510.8 | 1591.1 | 20.03 KB | 14.19 |
Sep______ | Cols | 25000 | 4.470 ms | 1.00 | 20 | 4546.3 | 178.8 | 1.42 KB | 1.00 |
Sylvan___ | Cols | 25000 | 5.999 ms | 1.34 | 20 | 3387.4 | 239.9 | 10.71 KB | 7.54 |
ReadLine_ | Cols | 25000 | 17.779 ms | 3.98 | 20 | 1142.9 | 711.2 | 73489.66 KB | 51,756.13 |
CsvHelper | Cols | 25000 | 43.374 ms | 9.70 | 20 | 468.5 | 1735.0 | 21340.41 KB | 15,029.29 |
Sep______ | Floats | 25000 | 32.146 ms | 1.00 | 20 | 632.1 | 1285.8 | 8.2 KB | 1.00 |
Sep_MT___ | Floats | 25000 | 6.082 ms | 0.19 | 20 | 3340.7 | 243.3 | 115.72 KB | 14.11 |
Sylvan___ | Floats | 25000 | 81.398 ms | 2.53 | 20 | 249.6 | 3255.9 | 18.88 KB | 2.30 |
ReadLine_ | Floats | 25000 | 107.332 ms | 3.34 | 20 | 189.3 | 4293.3 | 73493.12 KB | 8,960.23 |
CsvHelper | Floats | 25000 | 157.689 ms | 4.91 | 20 | 128.9 | 6307.6 | 22062.72 KB | 2,689.87 |
AMD.Ryzen.9.5950X - FloatsReader Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 25000 | 2.013 ms | 1.00 | 20 | 10093.4 | 80.5 | 1.25 KB | 1.00 |
Sylvan___ | Row | 25000 | 2.355 ms | 1.17 | 20 | 8627.4 | 94.2 | 10.7 KB | 8.56 |
ReadLine_ | Row | 25000 | 9.787 ms | 4.86 | 20 | 2076.1 | 391.5 | 73489.63 KB | 58,791.71 |
CsvHelper | Row | 25000 | 25.143 ms | 12.49 | 20 | 808.2 | 1005.7 | 20 KB | 16.00 |
Sep______ | Cols | 25000 | 2.666 ms | 1.00 | 20 | 7622.2 | 106.6 | 1.25 KB | 1.00 |
Sylvan___ | Cols | 25000 | 3.702 ms | 1.39 | 20 | 5488.4 | 148.1 | 10.71 KB | 8.54 |
ReadLine_ | Cols | 25000 | 10.544 ms | 3.96 | 20 | 1927.1 | 421.8 | 73489.63 KB | 58,654.23 |
CsvHelper | Cols | 25000 | 27.442 ms | 10.29 | 20 | 740.5 | 1097.7 | 21340.34 KB | 17,032.36 |
Sep______ | Floats | 25000 | 20.297 ms | 1.00 | 20 | 1001.1 | 811.9 | 7.97 KB | 1.00 |
Sep_MT___ | Floats | 25000 | 3.780 ms | 0.19 | 20 | 5375.6 | 151.2 | 179.49 KB | 22.51 |
Sylvan___ | Floats | 25000 | 52.343 ms | 2.58 | 20 | 388.2 | 2093.7 | 18.88 KB | 2.37 |
ReadLine_ | Floats | 25000 | 68.698 ms | 3.38 | 20 | 295.8 | 2747.9 | 73493.12 KB | 9,215.89 |
CsvHelper | Floats | 25000 | 100.913 ms | 4.97 | 20 | 201.4 | 4036.5 | 22061.69 KB | 2,766.49 |
Apple.M1.(Virtual) - FloatsReader Benchmark Results (Sep 0.9.0.0, Sylvan 1.3.9.0, CsvHelper 33.0.1.24)
Method | Scope | Rows | Mean | Ratio | MB | MB/s | ns/row | Allocated | Alloc Ratio |
---|---|---|---|---|---|---|---|---|---|
Sep______ | Row | 25000 | 4.209 ms | 1.00 | 20 | 4815.8 | 168.4 | 1.2 KB | 1.00 |
Sylvan___ | Row | 25000 | 19.401 ms | 4.61 | 20 | 1044.9 | 776.0 | 10.62 KB | 8.87 |
ReadLine_ | Row | 25000 | 15.132 ms | 3.60 | 20 | 1339.7 | 605.3 | 73489.65 KB | 61,381.24 |
CsvHelper | Row | 25000 | 30.200 ms | 7.18 | 20 | 671.3 | 1208.0 | 20.21 KB | 16.88 |
Sep______ | Cols | 25000 | 5.070 ms | 1.00 | 20 | 3998.5 | 202.8 | 1.21 KB | 1.00 |
Sylvan___ | Cols | 25000 | 23.742 ms | 4.68 | 20 | 853.9 | 949.7 | 10.62 KB | 8.74 |
ReadLine_ | Cols | 25000 | 17.569 ms | 3.47 | 20 | 1153.9 | 702.7 | 73489.65 KB | 60,493.09 |
CsvHelper | Cols | 25000 | 34.182 ms | 6.74 | 20 | 593.1 | 1367.3 | 21340.43 KB | 17,566.40 |
Sep______ | Floats | 25000 | 27.363 ms | 1.00 | 20 | 740.8 | 1094.5 | 8.08 KB | 1.00 |
Sep_MT___ | Floats | 25000 | 12.814 ms | 0.47 | 20 | 1582.0 | 512.6 | 67.85 KB | 8.40 |
Sylvan___ | Floats | 25000 | 78.840 ms | 2.88 | 20 | 257.1 | 3153.6 | 18.57 KB | 2.30 |
ReadLine_ | Floats | 25000 | 89.458 ms | 3.27 | 20 | 226.6 | 3578.3 | 73493.2 KB | 9,093.41 |
CsvHelper | Floats | 25000 | 130.793 ms | 4.78 | 20 | 155.0 | 5231.7 | 22061.99 KB | 2,729.76 |
Writer benchmarks are still pending, but Sep is unlikely to be the fastest here since it is explicitly designed to make writing more convenient and flexible. Still efficient, but not necessarily fastest. That is, Sep does not require writing header up front and hence having to keep header column order and row values column order the same. This means Sep does not write columns directly upon definition but defers this until a new row has been fully defined and then is ended.
The following examples are available in `ReadMeTest.cs`.
```csharp
var text = """
           A;B;C;D;E;F
           Sep;🚀;1;1.2;0.1;0.5
           CSV;✅;2;2.2;0.2;1.5

           """; // Empty line at end is for line ending

using var reader = Sep.Reader().FromText(text);
using var writer = reader.Spec.Writer().ToText();
foreach (var readRow in reader)
{
    using var writeRow = writer.NewRow(readRow);
}

Assert.AreEqual(text, writer.ToString());
```
```csharp
var text = """
           A;B;C;D;E;F
           Sep;🚀;1;1.2;0.1;0.5
           CSV;✅;2;2.2;0.2;1.5

           """; // Empty line at end is for line ending

using var reader = await Sep.Reader().FromTextAsync(text);
await using var writer = reader.Spec.Writer().ToText();
await foreach (var readRow in reader)
{
    await using var writeRow = writer.NewRow(readRow);
}

Assert.AreEqual(text, writer.ToString());
```
```csharp
var text = """
           A
           1
           2

           3


           4

           """; // Empty line at end is for line ending
var expected = new[] { 1, 2, 3, 4 };

// Disable col count check to allow empty rows
using var reader = Sep.Reader(o => o with { DisableColCountCheck = true }).FromText(text);
var actual = new List<int>();
foreach (var row in reader)
{
    // Skip empty row
    if (row.Span.Length == 0) { continue; }

    actual.Add(row["A"].Parse<int>());
}
CollectionAssert.AreEqual(expected, actual);
```
Since `SepReader.Row` is a `ref struct` as covered above, one has to avoid referencing it directly in async context for C# prior to 13.0. This can be done in a number of ways, but one way is to use the `Enumerate` extension method to parse/extract data from row like shown below.
```csharp
var text = """
           C
           1
           2
           """;

using var reader = Sep.Reader().FromText(text);
var squaredSum = 0;
// Use Enumerate to avoid referencing SepReader.Row in async context
foreach (var value in reader.Enumerate(row => row["C"].Parse<int>()))
{
    squaredSum += await Task.Run(() => value * value);
}
Assert.AreEqual(5, squaredSum);
```
Another way to avoid referencing `SepReader.Row` directly in async context is to use a custom iterator via `yield return` to parse/extract data from row like shown below.
```csharp
var text = """
           C
           1
           2
           """;

using var reader = Sep.Reader().FromText(text);
var squaredSum = 0;
// Use custom local function Enumerate to avoid referencing
// SepReader.Row in async context
foreach (var value in Enumerate(reader))
{
    squaredSum += await Task.Run(() => value * value);
}
Assert.AreEqual(5, squaredSum);

static IEnumerable<int> Enumerate(SepReader reader)
{
    foreach (var r in reader) { yield return r["C"].Parse<int>(); }
}
```
Below shows how one can skip lines starting with comment `#` since Sep does not have built-in support for this. Note that this presumes lines to be skipped before header do not contain quotes or rather line endings within quotes as that is not handled by the `Peek()` skipping. The rows starting with comment `#` after header are skipped if handling quoting is enabled in Sep options.
```csharp
var text = """
           # Comment 1
           # Comment 2
           A
           # Comment 3
           1
           2
           # Comment 4
           """;

const char Comment = '#';

using var textReader = new StringReader(text);
// Skip initial lines (not rows) before header
while (textReader.Peek() == Comment &&
       textReader.ReadLine() is string line)
{ }

using var reader = Sep.Reader().From(textReader);
var values = new List<int>();
foreach (var row in reader)
{
    // Skip rows starting with comment
    if (row.Span.StartsWith([Comment])) { continue; }

    var value = row["A"].Parse<int>();
    values.Add(value);
}
CollectionAssert.AreEqual(new int[] { 1, 2 }, values);
```
While the RFC-4180 requires `\r\n` (CR,LF) as line ending, the well-known line endings (`\r\n`, `\n` and `\r`) are supported similar to .NET. `Environment.NewLine` is used when writing. Quoting is supported by simply matching pairs of quotes, no matter what.
Note that some libraries will claim conformance but the RFC is, perhaps naturally, quite strict e.g. only comma is supported as separator/delimiter. Sep defaults to using `;` as separator if writing, while auto-detecting supported separators when reading. This is decidedly non-conforming.
The RFC defines the following condensed ABNF grammar:
```text
file = [header CRLF] record *(CRLF record) [CRLF]
header = name *(COMMA name)
record = field *(COMMA field)
name = field
field = (escaped / non-escaped)
escaped = DQUOTE *(TEXTDATA / COMMA / CR / LF / 2DQUOTE) DQUOTE
non-escaped = *TEXTDATA
COMMA = %x2C
CR = %x0D ;as per section 6.1 of RFC 2234 [2]
DQUOTE = %x22 ;as per section 6.1 of RFC 2234 [2]
LF = %x0A ;as per section 6.1 of RFC 2234 [2]
CRLF = CR LF ;as per section 6.1 of RFC 2234 [2]
TEXTDATA = %x20-21 / %x23-2B / %x2D-7E
```
Note how `TEXTDATA` is restricted too, yet many libraries allow any character incl. emojis or similar (which Sep supports), even though this is not in conformance with the RFC.
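To make the `TEXTDATA` restriction concrete, here is a small sketch that checks whether a field could be written as a non-escaped field under the grammar above. The function names are hypothetical and this is not part of Sep. It only encodes the `%x20-21 / %x23-2B / %x2D-7E` ranges from the RFC grammar.

```csharp
using System;

// TEXTDATA = %x20-21 / %x23-2B / %x2D-7E, i.e. printable ASCII
// except double quote (0x22) and comma (0x2C).
// Hypothetical helper names, for illustration only.
static bool IsRfc4180TextData(char c) =>
    c is (>= '\u0020' and <= '\u0021')
      or (>= '\u0023' and <= '\u002B')
      or (>= '\u002D' and <= '\u007E');

// non-escaped = *TEXTDATA, so every char must be TEXTDATA.
static bool IsRfc4180NonEscapedField(ReadOnlySpan<char> field)
{
    foreach (var c in field)
    {
        if (!IsRfc4180TextData(c)) { return false; }
    }
    return true;
}

Console.WriteLine(IsRfc4180NonEscapedField("Sep")); // True
Console.WriteLine(IsRfc4180NonEscapedField("🚀"));  // False, outside ASCII
Console.WriteLine(IsRfc4180NonEscapedField("a,b")); // False, comma needs quoting
```

Under a strict reading of the grammar, the emoji columns used in the examples in this README would therefore not be valid fields at all, quoted or not.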
Quotes inside an escaped field e.g. `"fie""ld"` are only allowed to be double quotes. Sep currently allows any pairs of quotes and quoting doesn't need to be at start of or end of field (col or column in Sep terminology).
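The pair-matching rule described above can be sketched with a simple scalar loop. This is only an illustration under the assumption that every quote toggles an "inside quotes" state and that separators are ignored while inside quotes. It is not Sep's actual implementation, which is SIMD vectorized, and `SplitOutsideQuotes` is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not Sep's parser.
static List<string> SplitOutsideQuotes(string line, char separator = ';', char quote = '"')
{
    var cols = new List<string>();
    var inQuotes = false;
    var start = 0;
    for (var i = 0; i < line.Length; i++)
    {
        var c = line[i];
        if (c == quote)
        {
            // Every quote toggles the state, so quotes are matched in pairs
            // regardless of where in the column they appear.
            inQuotes = !inQuotes;
        }
        else if (c == separator && !inQuotes)
        {
            cols.Add(line[start..i]);
            start = i + 1;
        }
    }
    // Last column; quotes are kept as-is since nothing is unescaped here.
    cols.Add(line[start..]);
    return cols;
}

foreach (var col in SplitOutsideQuotes("a;b\"c;d\"e;\"fie\"\"ld\""))
{
    Console.WriteLine(col);
}
```

For the input `a;b"c;d"e;"fie""ld"` this yields three columns, `a`, `b"c;d"e` and `"fie""ld"`. The quotes in the second column start mid-field, which the RFC grammar does not allow but pair matching accepts.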
All in all Sep takes a pretty pragmatic approach here as the primary use case is not exchanging data on the internet, but for use in machine learning pipelines or similar.
Ask questions on GitHub and this section will be expanded. :)
- Does Sep support object mapping like CsvHelper? No, Sep is a minimal library and does not support object mapping. First, this is usually supported via reflection, which Sep avoids. Second, object mapping often only works well in a few cases without actually writing custom mapping for each property, which then basically amounts to writing the parsing code yourself. If object mapping is a must-have, consider writing your own source generator for it if you want to use Sep. Maybe some day Sep will have a built-in source generator, but not in the foreseeable future.
[assembly:System.CLSCompliant(false)][assembly:System.Reflection.AssemblyMetadata("IsTrimmable","True")][assembly:System.Reflection.AssemblyMetadata("RepositoryUrl","https://github.com/nietras/Sep/")][assembly:System.Resources.NeutralResourcesLanguage("en")][assembly:System.Runtime.CompilerServices.InternalsVisibleTo("Sep.Benchmarks")][assembly:System.Runtime.CompilerServices.InternalsVisibleTo("Sep.ComparisonBenchmarks")][assembly:System.Runtime.CompilerServices.InternalsVisibleTo("Sep.Test")][assembly:System.Runtime.CompilerServices.InternalsVisibleTo("Sep.XyzTest")][assembly:System.Runtime.Versioning.TargetFramework(".NETCoreApp,Version=v9.0",FrameworkDisplayName=".NET 9.0")]namespacenietras.SeparatedValues{publicreadonlystructSep:System.IEquatable<nietras.SeparatedValues.Sep>{publicSep(){}publicSep(charseparator){}publiccharSeparator{get;init;}publicstaticnietras.SeparatedValues.Sep?Auto{get;}publicstaticnietras.SeparatedValues.SepDefault{get;}publicstaticnietras.SeparatedValues.SepNew(charseparator){}publicstaticnietras.SeparatedValues.SepReaderOptionsReader(){}publicstaticnietras.SeparatedValues.SepReaderOptionsReader(System.Func<nietras.SeparatedValues.SepReaderOptions,nietras.SeparatedValues.SepReaderOptions>configure){}publicstaticnietras.SeparatedValues.SepWriterOptionsWriter(){}publicstaticnietras.SeparatedValues.SepWriterOptionsWriter(System.Func<nietras.SeparatedValues.SepWriterOptions,nietras.SeparatedValues.SepWriterOptions>configure){}}publicenumSepColNotSetOption:byte{Throw=0,Empty=1,Skip=2,}publicdelegatenietras.SeparatedValues.SepToStringSepCreateToString(nietras.SeparatedValues.SepReaderHeader?maybeHeader,intcolCount);publicstaticclassSepDefaults{publicstaticSystem.StringComparerColNameComparer{get;}publicstaticSystem.Globalization.CultureInfoCultureInfo{get;}publicstaticcharSeparator{get;}}[System.Diagnostics.DebuggerDisplay("{DebuggerDisplay,nq}")]publicsealedclassSepReader:nietras.SeparatedValues.SepReaderState,System.Collections.Generic.IAsyn
cEnumerable<nietras.SeparatedValues.SepReader.Row>,System.Collections.Generic.IEnumerable<nietras.SeparatedValues.SepReader.Row>,System.Collections.Generic.IEnumerator<nietras.SeparatedValues.SepReader.Row>,System.Collections.IEnumerable,System.Collections.IEnumerator,System.IDisposable{publicnietras.SeparatedValues.SepReader.RowCurrent{get;}publicboolHasHeader{get;}publicboolHasRows{get;}publicnietras.SeparatedValues.SepReaderHeaderHeader{get;}publicboolIsEmpty{get;}publicnietras.SeparatedValues.SepSpecSpec{get;}publicnietras.SeparatedValues.SepReader.AsyncEnumeratorGetAsyncEnumerator(System.Threading.CancellationTokencancellationToken=default){}publicnietras.SeparatedValues.SepReaderGetEnumerator(){}publicboolMoveNext(){}publicSystem.Threading.Tasks.ValueTask<bool>MoveNextAsync(System.Threading.CancellationTokencancellationToken=default){}publicstringToString(intindex){}publicreadonlystructAsyncEnumerator:System.Collections.Generic.IAsyncEnumerator<nietras.SeparatedValues.SepReader.Row>,System.IAsyncDisposable{publicnietras.SeparatedValues.SepReader.RowCurrent{get;}publicSystem.Threading.Tasks.ValueTaskDisposeAsync(){}publicSystem.Threading.Tasks.ValueTask<bool>MoveNextAsync(){}}[System.Diagnostics.DebuggerDisplay("{DebuggerDisplay}")]publicreadonlyrefstructCol{publicSystem.ReadOnlySpan<char>Span{get;}publicTParse<T>()whereT:System.ISpanParsable<T>{}publicoverridestringToString(){}publicT?TryParse<T>()whereT:struct,System.ISpanParsable<T>{}publicboolTryParse<T>(outTvalue)whereT:System.ISpanParsable<T>{}}publicreadonlyrefstructCols{publicintCount{get;}publicnietras.SeparatedValues.SepReader.Colthis[intindex]{get;}publicstringCombinePathsToString(){}publicSystem.ReadOnlySpan<char>Join(System.ReadOnlySpan<char>separator){}publicstringJoinPathsToString(){}publicstringJoinToString(System.ReadOnlySpan<char>separator){}publicSystem.Span<T>Parse<T>()whereT:System.ISpanParsable<T>{}publicvoidParse<T>(System.Span<T>span)whereT:System.ISpanParsable<T>{}publicT[]ParseToArray<
T>() where T : System.ISpanParsable<T> { }
            public System.Span<T> Select<T>(method selector) { }
            public System.Span<T> Select<T>(nietras.SeparatedValues.SepReader.ColFunc<T> selector) { }
            public System.Span<string> ToStrings() { }
            public string[] ToStringsArray() { }
            public System.Span<T?> TryParse<T>() where T : struct, System.ISpanParsable<T> { }
            public void TryParse<T>(System.Span<T?> span) where T : struct, System.ISpanParsable<T> { }
        }
        [System.Diagnostics.DebuggerDisplay("{DebuggerDisplayPrefix,nq}{Span}")]
        [System.Diagnostics.DebuggerTypeProxy(typeof(nietras.SeparatedValues.SepReader.Row.DebugView))]
        public readonly ref struct Row
        {
            public int ColCount { get; }
            public nietras.SeparatedValues.SepReader.Col this[int index] { get; }
            public nietras.SeparatedValues.SepReader.Col this[System.Index index] { get; }
            public nietras.SeparatedValues.SepReader.Col this[string colName] { get; }
            public nietras.SeparatedValues.SepReader.Cols this[System.Range range] { get; }
            public nietras.SeparatedValues.SepReader.Cols this[System.ReadOnlySpan<int> indices] { get; }
            public nietras.SeparatedValues.SepReader.Cols this[System.Collections.Generic.IReadOnlyList<int> indices] { get; }
            public nietras.SeparatedValues.SepReader.Cols this[int[] indices] { get; }
            public nietras.SeparatedValues.SepReader.Cols this[System.ReadOnlySpan<string> colNames] { get; }
            public nietras.SeparatedValues.SepReader.Cols this[System.Collections.Generic.IReadOnlyList<string> colNames] { get; }
            public nietras.SeparatedValues.SepReader.Cols this[string[] colNames] { get; }
            public int LineNumberFrom { get; }
            public int LineNumberToExcl { get; }
            public int RowIndex { get; }
            public System.ReadOnlySpan<char> Span { get; }
            public System.Func<int, string> UnsafeToStringDelegate { get; }
            public override string ToString() { }
        }
        public delegate void ColAction(nietras.SeparatedValues.SepReader.Col col);
        public delegate T ColFunc<T>(nietras.SeparatedValues.SepReader.Col col);
        public delegate void ColsAction(nietras.SeparatedValues.SepReader.Cols col);
        public delegate void RowAction(nietras.SeparatedValues.SepReader.Row row);
        public delegate T RowFunc<T>(nietras.SeparatedValues.SepReader.Row row);
        public delegate bool RowTryFunc<T>(nietras.SeparatedValues.SepReader.Row row, out T value);
    }
    public static class SepReaderExtensions
    {
        public static System.Collections.Generic.IEnumerable<T> Enumerate<T>(this nietras.SeparatedValues.SepReader reader, nietras.SeparatedValues.SepReader.RowFunc<T> select) { }
        public static System.Collections.Generic.IEnumerable<T> Enumerate<T>(this nietras.SeparatedValues.SepReader reader, nietras.SeparatedValues.SepReader.RowTryFunc<T> trySelect) { }
        public static System.Collections.Generic.IAsyncEnumerable<T> EnumerateAsync<T>(this nietras.SeparatedValues.SepReader reader, nietras.SeparatedValues.SepReader.RowFunc<T> select) { }
        public static System.Collections.Generic.IAsyncEnumerable<T> EnumerateAsync<T>(this nietras.SeparatedValues.SepReader reader, nietras.SeparatedValues.SepReader.RowTryFunc<T> trySelect) { }
        public static nietras.SeparatedValues.SepReader From(in this nietras.SeparatedValues.SepReaderOptions options, byte[] buffer) { }
        public static nietras.SeparatedValues.SepReader From(in this nietras.SeparatedValues.SepReaderOptions options, System.IO.Stream stream) { }
        public static nietras.SeparatedValues.SepReader From(in this nietras.SeparatedValues.SepReaderOptions options, System.IO.TextReader reader) { }
        public static nietras.SeparatedValues.SepReader From(in this nietras.SeparatedValues.SepReaderOptions options, string name, System.Func<string, System.IO.Stream> nameToStream) { }
        public static nietras.SeparatedValues.SepReader From(in this nietras.SeparatedValues.SepReaderOptions options, string name, System.Func<string, System.IO.TextReader> nameToReader) { }
        public static System.Threading.Tasks.ValueTask<nietras.SeparatedValues.SepReader> FromAsync(this nietras.SeparatedValues.SepReaderOptions options, byte[] buffer, System.Threading.CancellationToken cancellationToken = default) { }
        public static System.Threading.Tasks.ValueTask<nietras.SeparatedValues.SepReader> FromAsync(this nietras.SeparatedValues.SepReaderOptions options, System.IO.Stream stream, System.Threading.CancellationToken cancellationToken = default) { }
        public static System.Threading.Tasks.ValueTask<nietras.SeparatedValues.SepReader> FromAsync(this nietras.SeparatedValues.SepReaderOptions options, System.IO.TextReader reader, System.Threading.CancellationToken cancellationToken = default) { }
        public static System.Threading.Tasks.ValueTask<nietras.SeparatedValues.SepReader> FromAsync(this nietras.SeparatedValues.SepReaderOptions options, string name, System.Func<string, System.IO.Stream> nameToStream, System.Threading.CancellationToken cancellationToken = default) { }
        public static System.Threading.Tasks.ValueTask<nietras.SeparatedValues.SepReader> FromAsync(this nietras.SeparatedValues.SepReaderOptions options, string name, System.Func<string, System.IO.TextReader> nameToReader, System.Threading.CancellationToken cancellationToken = default) { }
        public static nietras.SeparatedValues.SepReader FromFile(in this nietras.SeparatedValues.SepReaderOptions options, string filePath) { }
        public static System.Threading.Tasks.ValueTask<nietras.SeparatedValues.SepReader> FromFileAsync(this nietras.SeparatedValues.SepReaderOptions options, string filePath, System.Threading.CancellationToken cancellationToken = default) { }
        public static nietras.SeparatedValues.SepReader FromText(in this nietras.SeparatedValues.SepReaderOptions options, string text) { }
        public static System.Threading.Tasks.ValueTask<nietras.SeparatedValues.SepReader> FromTextAsync(this nietras.SeparatedValues.SepReaderOptions options, string text, System.Threading.CancellationToken cancellationToken = default) { }
        public static System.Collections.Generic.IEnumerable<T> ParallelEnumerate<T>(this nietras.SeparatedValues.SepReader reader, nietras.SeparatedValues.SepReader.RowFunc<T> select) { }
        public static System.Collections.Generic.IEnumerable<T> ParallelEnumerate<T>(this nietras.SeparatedValues.SepReader reader, nietras.SeparatedValues.SepReader.RowTryFunc<T> trySelect) { }
        public static System.Collections.Generic.IEnumerable<T> ParallelEnumerate<T>(this nietras.SeparatedValues.SepReader reader, nietras.SeparatedValues.SepReader.RowFunc<T> select, int degreeOfParallism) { }
        public static System.Collections.Generic.IEnumerable<T> ParallelEnumerate<T>(this nietras.SeparatedValues.SepReader reader, nietras.SeparatedValues.SepReader.RowTryFunc<T> trySelect, int degreeOfParallism) { }
        public static nietras.SeparatedValues.SepReaderOptions Reader(this nietras.SeparatedValues.Sep sep) { }
        public static nietras.SeparatedValues.SepReaderOptions Reader(this nietras.SeparatedValues.Sep? sep) { }
        public static nietras.SeparatedValues.SepReaderOptions Reader(this nietras.SeparatedValues.SepSpec spec) { }
        public static nietras.SeparatedValues.SepReaderOptions Reader(this nietras.SeparatedValues.Sep sep, System.Func<nietras.SeparatedValues.SepReaderOptions, nietras.SeparatedValues.SepReaderOptions> configure) { }
        public static nietras.SeparatedValues.SepReaderOptions Reader(this nietras.SeparatedValues.Sep? sep, System.Func<nietras.SeparatedValues.SepReaderOptions, nietras.SeparatedValues.SepReaderOptions> configure) { }
        public static nietras.SeparatedValues.SepReaderOptions Reader(this nietras.SeparatedValues.SepSpec spec, System.Func<nietras.SeparatedValues.SepReaderOptions, nietras.SeparatedValues.SepReaderOptions> configure) { }
    }
    public sealed class SepReaderHeader
    {
        public System.Collections.Generic.IReadOnlyList<string> ColNames { get; }
        public bool IsEmpty { get; }
        public static nietras.SeparatedValues.SepReaderHeader Empty { get; }
        public int IndexOf(System.ReadOnlySpan<char> colName) { }
        public int IndexOf(string colName) { }
        public int[] IndicesOf(System.Collections.Generic.IReadOnlyList<string> colNames) { }
        public int[] IndicesOf([System.Runtime.CompilerServices.ParamCollection] [System.Runtime.CompilerServices.ScopedRef] System.ReadOnlySpan<string> colNames) { }
        public int[] IndicesOf(params string[] colNames) { }
        public void IndicesOf(System.ReadOnlySpan<string> colNames, System.Span<int> colIndices) { }
        public System.Collections.Generic.IReadOnlyList<string> NamesStartingWith(string prefix, System.StringComparison comparison = 4) { }
        public override string ToString() { }
        public bool TryIndexOf(System.ReadOnlySpan<char> colName, out int colIndex) { }
        public bool TryIndexOf(string colName, out int colIndex) { }
    }
    public readonly struct SepReaderOptions : System.IEquatable<nietras.SeparatedValues.SepReaderOptions>
    {
        public SepReaderOptions() { }
        public SepReaderOptions(nietras.SeparatedValues.Sep? sep) { }
        public bool AsyncContinueOnCapturedContext { get; init; }
        public System.Collections.Generic.IEqualityComparer<string> ColNameComparer { get; init; }
        public nietras.SeparatedValues.SepCreateToString CreateToString { get; init; }
        public System.Globalization.CultureInfo? CultureInfo { get; init; }
        public bool DisableColCountCheck { get; init; }
        public bool DisableFastFloat { get; init; }
        public bool DisableQuotesParsing { get; init; }
        public bool HasHeader { get; init; }
        public int InitialBufferLength { get; init; }
        public nietras.SeparatedValues.Sep? Sep { get; init; }
        public nietras.SeparatedValues.SepTrim Trim { get; init; }
        public bool Unescape { get; init; }
    }
    public class SepReaderState : System.IDisposable
    {
        public void Dispose() { }
    }
    public static class SepReaderWriterExtensions
    {
        public static void CopyTo(this nietras.SeparatedValues.SepReader.Row readerRow, nietras.SeparatedValues.SepWriter.Row writerRow) { }
        public static nietras.SeparatedValues.SepWriter.Row NewRow(this nietras.SeparatedValues.SepWriter writer, nietras.SeparatedValues.SepReader.Row rowToCopy) { }
        public static nietras.SeparatedValues.SepWriter.Row NewRow(this nietras.SeparatedValues.SepWriter writer, nietras.SeparatedValues.SepReader.Row rowToCopy, System.Threading.CancellationToken cancellationToken) { }
    }
    public readonly struct SepSpec : System.IEquatable<nietras.SeparatedValues.SepSpec>
    {
        public SepSpec() { }
        public SepSpec(nietras.SeparatedValues.Sep sep, System.Globalization.CultureInfo? cultureInfo) { }
        public SepSpec(nietras.SeparatedValues.Sep sep, System.Globalization.CultureInfo? cultureInfo, bool asyncContinueOnCapturedContext) { }
        public bool AsyncContinueOnCapturedContext { get; init; }
        public System.Globalization.CultureInfo? CultureInfo { get; init; }
        public nietras.SeparatedValues.Sep Sep { get; init; }
    }
    public abstract class SepToString : System.IDisposable
    {
        protected SepToString() { }
        public virtual bool IsThreadSafe { get; }
        public static nietras.SeparatedValues.SepCreateToString Direct { get; }
        public void Dispose() { }
        protected virtual void Dispose(bool disposing) { }
        public abstract string ToString(System.ReadOnlySpan<char> colSpan, int colIndex);
        public static nietras.SeparatedValues.SepCreateToString OnePool(int maximumStringLength = 32, int initialCapacity = 64, int maximumCapacity = 4096) { }
        public static nietras.SeparatedValues.SepCreateToString PoolPerCol(int maximumStringLength = 32, int initialCapacity = 64, int maximumCapacity = 4096) { }
        public static nietras.SeparatedValues.SepCreateToString PoolPerColThreadSafe(int maximumStringLength = 32, int initialCapacity = 64, int maximumCapacity = 4096) { }
        public static nietras.SeparatedValues.SepCreateToString PoolPerColThreadSafeFixedCapacity(int maximumStringLength = 32, int capacity = 2048) { }
    }
    [System.Flags]
    public enum SepTrim : byte
    {
        None = 0,
        Outer = 1,
        AfterUnescape = 2,
        All = 3,
    }
    [System.Diagnostics.DebuggerDisplay("{DebuggerDisplay,nq}")]
    public sealed class SepWriter : System.IAsyncDisposable, System.IDisposable
    {
        public nietras.SeparatedValues.SepWriterHeader Header { get; }
        public nietras.SeparatedValues.SepSpec Spec { get; }
        public void Dispose() { }
        public System.Threading.Tasks.ValueTask DisposeAsync() { }
        public void Flush() { }
        public System.Threading.Tasks.Task FlushAsync(System.Threading.CancellationToken cancellationToken = default) { }
        public nietras.SeparatedValues.SepWriter.Row NewRow() { }
        public nietras.SeparatedValues.SepWriter.Row NewRow(System.Threading.CancellationToken cancellationToken) { }
        public override string ToString() { }
        public readonly ref struct Col
        {
            public void Format<T>(T value) where T : System.ISpanFormattable { }
            public void Set(System.ReadOnlySpan<char> span) { }
            public void Set([System.Runtime.CompilerServices.InterpolatedStringHandlerArgument("")] ref nietras.SeparatedValues.SepWriter.Col.FormatInterpolatedStringHandler handler) { }
            public void Set(System.IFormatProvider? provider, [System.Runtime.CompilerServices.InterpolatedStringHandlerArgument(new string?[]?[] { "", "provider" })] ref nietras.SeparatedValues.SepWriter.Col.FormatInterpolatedStringHandler handler) { }
            [System.Runtime.CompilerServices.InterpolatedStringHandler]
            public ref struct FormatInterpolatedStringHandler
            {
                public FormatInterpolatedStringHandler(int literalLength, int formattedCount, nietras.SeparatedValues.SepWriter.Col col) { }
                public FormatInterpolatedStringHandler(int literalLength, int formattedCount, nietras.SeparatedValues.SepWriter.Col col, System.IFormatProvider? provider) { }
                public void AppendFormatted(System.ReadOnlySpan<char> value) { }
                public void AppendFormatted(string? value) { }
                public void AppendFormatted(System.ReadOnlySpan<char> value, int alignment = 0, string? format = null) { }
                public void AppendFormatted(object? value, int alignment = 0, string? format = null) { }
                public void AppendFormatted(string? value, int alignment = 0, string? format = null) { }
                public void AppendFormatted<T>(T value) { }
                public void AppendFormatted<T>(T value, int alignment) { }
                public void AppendFormatted<T>(T value, string? format) { }
                public void AppendFormatted<T>(T value, int alignment, string? format) { }
                public void AppendLiteral(string value) { }
            }
        }
        public readonly ref struct Cols
        {
            public int Count { get; }
            public nietras.SeparatedValues.SepWriter.Col this[int colIndex] { get; }
            public void Format<T>(System.Collections.Generic.IReadOnlyList<T> values) where T : System.ISpanFormattable { }
            public void Format<T>([System.Runtime.CompilerServices.ParamCollection] [System.Runtime.CompilerServices.ScopedRef] System.ReadOnlySpan<T> values) where T : System.ISpanFormattable { }
            public void Format<T>(System.Span<T> values) where T : System.ISpanFormattable { }
            public void Format<T>(T[] values) where T : System.ISpanFormattable { }
            public void Format<T>(System.ReadOnlySpan<T> values, nietras.SeparatedValues.SepWriter.ColAction<T> format) { }
            public void Set(System.Collections.Generic.IReadOnlyList<string> values) { }
            public void Set([System.Runtime.CompilerServices.ParamCollection] [System.Runtime.CompilerServices.ScopedRef] System.ReadOnlySpan<string> values) { }
            public void Set(string[] values) { }
            public void Set(nietras.SeparatedValues.SepReader.Cols cols) { }
        }
        public ref struct Row : System.IAsyncDisposable, System.IDisposable
        {
            public nietras.SeparatedValues.SepWriter.Col this[int colIndex] { get; }
            public nietras.SeparatedValues.SepWriter.Col this[string colName] { get; }
            public nietras.SeparatedValues.SepWriter.Cols this[System.ReadOnlySpan<int> indices] { get; }
            public nietras.SeparatedValues.SepWriter.Cols this[System.ReadOnlySpan<string> colNames] { get; }
            public nietras.SeparatedValues.SepWriter.Cols this[System.Collections.Generic.IReadOnlyList<string> colNames] { get; }
            public nietras.SeparatedValues.SepWriter.Cols this[string[] colNames] { get; }
            public void Dispose() { }
            public System.Threading.Tasks.ValueTask DisposeAsync() { }
        }
        public delegate void ColAction(nietras.SeparatedValues.SepWriter.Col col);
        public delegate void ColAction<T>(nietras.SeparatedValues.SepWriter.Col col, T value);
        public delegate void RowAction(nietras.SeparatedValues.SepWriter.Row row);
    }
    public static class SepWriterExtensions
    {
        public static nietras.SeparatedValues.SepWriter To(in this nietras.SeparatedValues.SepWriterOptions options, System.IO.Stream stream) { }
        public static nietras.SeparatedValues.SepWriter To(in this nietras.SeparatedValues.SepWriterOptions options, System.IO.TextWriter writer) { }
        public static nietras.SeparatedValues.SepWriter To(in this nietras.SeparatedValues.SepWriterOptions options, System.Text.StringBuilder stringBuilder) { }
        public static nietras.SeparatedValues.SepWriter To(in this nietras.SeparatedValues.SepWriterOptions options, System.IO.Stream stream, bool leaveOpen) { }
        public static nietras.SeparatedValues.SepWriter To(in this nietras.SeparatedValues.SepWriterOptions options, System.IO.TextWriter writer, bool leaveOpen) { }
        public static nietras.SeparatedValues.SepWriter To(in this nietras.SeparatedValues.SepWriterOptions options, string name, System.Func<string, System.IO.Stream> nameToStream, bool leaveOpen = false) { }
        public static nietras.SeparatedValues.SepWriter To(in this nietras.SeparatedValues.SepWriterOptions options, string name, System.Func<string, System.IO.TextWriter> nameToWriter, bool leaveOpen = false) { }
        public static nietras.SeparatedValues.SepWriter ToFile(in this nietras.SeparatedValues.SepWriterOptions options, string filePath) { }
        public static nietras.SeparatedValues.SepWriter ToText(in this nietras.SeparatedValues.SepWriterOptions options) { }
        public static nietras.SeparatedValues.SepWriter ToText(in this nietras.SeparatedValues.SepWriterOptions options, int capacity) { }
        public static nietras.SeparatedValues.SepWriterOptions Writer(this nietras.SeparatedValues.Sep sep) { }
        public static nietras.SeparatedValues.SepWriterOptions Writer(this nietras.SeparatedValues.SepSpec spec) { }
        public static nietras.SeparatedValues.SepWriterOptions Writer(this nietras.SeparatedValues.Sep sep, System.Func<nietras.SeparatedValues.SepWriterOptions, nietras.SeparatedValues.SepWriterOptions> configure) { }
        public static nietras.SeparatedValues.SepWriterOptions Writer(this nietras.SeparatedValues.SepSpec spec, System.Func<nietras.SeparatedValues.SepWriterOptions, nietras.SeparatedValues.SepWriterOptions> configure) { }
    }
    [System.Diagnostics.DebuggerDisplay("{DebuggerDisplay,nq}")]
    [System.Diagnostics.DebuggerTypeProxy(typeof(nietras.SeparatedValues.SepWriterHeader.DebugView))]
    public sealed class SepWriterHeader
    {
        public void Add(System.Collections.Generic.IReadOnlyList<string> colNames) { }
        public void Add([System.Runtime.CompilerServices.ParamCollection] [System.Runtime.CompilerServices.ScopedRef] System.ReadOnlySpan<string> colNames) { }
        public void Add(string colName) { }
        public void Add(string[] colNames) { }
        public void Write() { }
        public System.Threading.Tasks.ValueTask WriteAsync(System.Threading.CancellationToken cancellationToken = default) { }
    }
    public readonly struct SepWriterOptions : System.IEquatable<nietras.SeparatedValues.SepWriterOptions>
    {
        public SepWriterOptions() { }
        public SepWriterOptions(nietras.SeparatedValues.Sep sep) { }
        public bool AsyncContinueOnCapturedContext { get; init; }
        public nietras.SeparatedValues.SepColNotSetOption ColNotSetOption { get; init; }
        public System.Globalization.CultureInfo? CultureInfo { get; init; }
        public bool DisableColCountCheck { get; init; }
        public bool Escape { get; init; }
        public nietras.SeparatedValues.Sep Sep { get; init; }
        public bool WriteHeader { get; init; }
    }
}