karrick/gobls
Gobls is a buffered line scanner for Go.

It is similar to `bufio.Scanner`, but wraps `bufio.Reader.ReadLine` so lines of arbitrary length can be scanned. It uses a hybrid approach so that in most cases, when lines are not unusually long, the fast code path is taken. When lines are unusually long, it uses the per-scanner pre-allocated byte slice to reassemble the fragments into a single slice of bytes.
When you have an `io.Reader` that you want to enumerate, normally you wrap it in `bufio.Scanner`. This library is a drop-in replacement for this particular circumstance: you can change from `bufio.NewScanner(r)` to `gobls.NewScanner(r)` and no longer have to worry about token too long errors.
```go
var lines, characters int

ls := gobls.NewScanner(os.Stdin)
for ls.Scan() {
	lines++
	characters += len(ls.Bytes())
}
if err := ls.Err(); err != nil {
	fmt.Fprintln(os.Stderr, "cannot scan:", err)
}
fmt.Println("Counted", lines, "lines and", characters, "characters.")
```
If you already have a slice of bytes that you want to enumerate lines for, it is much more performant to wrap that byte slice with `gobls.NewBufferScanner(buf)` than to wrap the slice in an `io.Reader` and call either the above or `bufio.NewScanner`.
```go
var lines, characters int

ls := gobls.NewBufferScanner(buf)
for ls.Scan() {
	lines++
	characters += len(ls.Bytes())
}
if err := ls.Err(); err != nil {
	fmt.Fprintln(os.Stderr, "cannot scan:", err)
}
fmt.Println("Counted", lines, "lines and", characters, "characters.")
```
The `BufferScanner` is faster than `bufio.Scanner` for all benchmarks. However, on my test system, the regular `Scanner` takes from 2% to nearly 40% longer than bufio scanner, depending on the length of the lines to be scanned. The 40% longer times were only observed when line lengths were `bufio.MaxScanTokenSize` bytes long. Usually the performance penalty is 2% to 15% of bufio measurements.
Run `go test -bench=. -benchmem` on your system for comparison. I'm sure the testing method could be improved. Suggestions are welcomed.
For circumstances where there is no concern about enumerating lines whose lengths are longer than the max token length from `bufio`, I recommend using the standard library.
On the other hand, if you already have a slice of bytes, this library is much more performant than the equivalent `bufio.NewScanner(bytes.NewReader(buf))`.
```
$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/karrick/gobls
BenchmarkScanner/0064/bufio-8                30000000        43.7 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0064/reader-8               20000000        59.2 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0064/buffer-8               50000000        33.7 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0128/bufio-8                30000000        54.5 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0128/reader-8               20000000        70.5 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0128/buffer-8               30000000        38.9 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0256/bufio-8                20000000        79.8 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0256/reader-8               20000000        94.9 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0256/buffer-8               30000000        50.2 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0512/bufio-8                10000000         123 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0512/reader-8               10000000         144 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/0512/buffer-8               20000000        79.0 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/1024/bufio-8                10000000         210 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/1024/reader-8               10000000         227 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/1024/buffer-8               10000000         119 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/2048/bufio-8                 5000000         382 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/2048/reader-8                3000000         413 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/2048/buffer-8                5000000         272 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/4096/bufio-8                 2000000         701 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/4096/reader-8                2000000         733 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/4096/buffer-8                3000000         517 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/excessively_long/bufio-8      200000       11681 ns/op     0 B/op    0 allocs/op
BenchmarkScanner/excessively_long/reader-8     100000       14464 ns/op     2 B/op    0 allocs/op
BenchmarkScanner/excessively_long/buffer-8     200000        8688 ns/op     0 B/op    0 allocs/op
PASS
ok      github.com/karrick/gobls    256.191s
```