karrick/gobls

Gobls is a buffered line scanner for Go.

Description

Similar to bufio.Scanner, but wraps bufio.Reader.ReadLine so lines of arbitrary length can be scanned. It uses a hybrid approach so that in most cases, when lines are not unusually long, the fast code path is taken. When lines are unusually long, it uses the per-scanner pre-allocated byte slice to reassemble the fragments into a single slice of bytes.

Example

Enumerating lines from an io.Reader (drop in replacement for bufio.Scanner)

When you have an io.Reader that you want to enumerate, normally you wrap it in bufio.Scanner. This library is a drop-in replacement for this particular circumstance, and you can change from bufio.NewScanner(r) to gobls.NewScanner(r), and no longer have to worry about token too long errors.

```go
var lines, characters int
ls := gobls.NewScanner(os.Stdin)
for ls.Scan() {
	lines++
	characters += len(ls.Bytes())
}
if err := ls.Err(); err != nil {
	fmt.Fprintln(os.Stderr, "cannot scan:", err)
}
fmt.Println("Counted", lines, "lines and", characters, "characters.")
```

Enumerating lines from []byte

If you already have a slice of bytes that you want to enumerate lines for, it is much more performant to wrap that byte slice with gobls.NewBufferScanner(buf) than to wrap the slice in an io.Reader and call either the above or bufio.NewScanner.

```go
var lines, characters int
ls := gobls.NewBufferScanner(buf)
for ls.Scan() {
	lines++
	characters += len(ls.Bytes())
}
if err := ls.Err(); err != nil {
	fmt.Fprintln(os.Stderr, "cannot scan:", err)
}
fmt.Println("Counted", lines, "lines and", characters, "characters.")
```
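One plausible reason scanning an in-memory slice can be cheaper is that each line can be returned as a sub-slice of the original buffer, with no intermediate reader and no copying. The sketch below shows that idea using only the standard library. It is an assumption about the general technique, not the gobls implementation, and `countLines` is a hypothetical helper.

```go
package main

import (
	"bytes"
	"fmt"
)

// countLines walks buf line by line without copying. It is a hypothetical
// helper for illustration and is not part of gobls.
func countLines(buf []byte) (lines, characters int) {
	for len(buf) > 0 {
		var line []byte
		if i := bytes.IndexByte(buf, '\n'); i >= 0 {
			// The line is a view into buf; advance past the newline.
			line, buf = buf[:i], buf[i+1:]
		} else {
			// Final line with no trailing newline.
			line, buf = buf, nil
		}
		// Drop a trailing carriage return so CRLF input counts the same.
		line = bytes.TrimSuffix(line, []byte("\r"))
		lines++
		characters += len(line)
	}
	return lines, characters
}

func main() {
	lines, characters := countLines([]byte("ab\ncde\r\nf"))
	fmt.Println("Counted", lines, "lines and", characters, "characters.")
}
```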

Performance

The BufferScanner is faster than bufio.Scanner for all benchmarks. However, on my test system, the regular Scanner takes from 2% to nearly 40% longer than bufio scanner, depending on the length of the lines to be scanned. The 40% longer times were only observed when line lengths were bufio.MaxScanTokenSize bytes long. Usually the performance penalty is 2% to 15% of bufio measurements.

Run go test -bench=. -benchmem on your system for comparison. I'm sure the testing method could be improved. Suggestions are welcomed.

When there is no concern about enumerating lines longer than the max token length from bufio, I recommend using the standard library.

On the other hand, if you already have a slice of bytes, this library is much more performant than the equivalent bufio.NewScanner(bytes.NewReader(buf)).

```
$ go test -bench=. -benchmem
goos: linux
goarch: amd64
pkg: github.com/karrick/gobls
BenchmarkScanner/0064/bufio-8               30000000   43.7  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0064/reader-8              20000000   59.2  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0064/buffer-8              50000000   33.7  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0128/bufio-8               30000000   54.5  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0128/reader-8              20000000   70.5  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0128/buffer-8              30000000   38.9  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0256/bufio-8               20000000   79.8  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0256/reader-8              20000000   94.9  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0256/buffer-8              30000000   50.2  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0512/bufio-8               10000000    123  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0512/reader-8              10000000    144  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/0512/buffer-8              20000000   79.0  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/1024/bufio-8               10000000    210  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/1024/reader-8              10000000    227  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/1024/buffer-8              10000000    119  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/2048/bufio-8                5000000    382  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/2048/reader-8               3000000    413  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/2048/buffer-8               5000000    272  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/4096/bufio-8                2000000    701  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/4096/reader-8               2000000    733  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/4096/buffer-8               3000000    517  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/excessively_long/bufio-8     200000  11681  ns/op  0  B/op  0  allocs/op
BenchmarkScanner/excessively_long/reader-8    100000  14464  ns/op  2  B/op  0  allocs/op
BenchmarkScanner/excessively_long/buffer-8    200000   8688  ns/op  0  B/op  0  allocs/op
PASS
ok      github.com/karrick/gobls        256.191s
```
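To compare your own run against bufio, it can help to express each result as a percentage relative to the bufio baseline. The sketch below does that arithmetic for three rows copied from the output above; the `result` type and `relative` function are hypothetical names used only for this illustration.

```go
package main

import "fmt"

// result holds ns/op figures for one line length, copied from the
// benchmark output above.
type result struct {
	name                  string
	bufio, reader, buffer float64
}

var results = []result{
	{"0064", 43.7, 59.2, 33.7},
	{"1024", 210, 227, 119},
	{"4096", 701, 733, 517},
}

// relative returns how much longer (positive) or shorter (negative) the
// candidate takes than the baseline, as a percentage of the baseline.
func relative(candidate, baseline float64) float64 {
	return (candidate - baseline) / baseline * 100
}

func main() {
	for _, r := range results {
		fmt.Printf("%s: reader %+.1f%%, buffer %+.1f%% vs bufio\n",
			r.name, relative(r.reader, r.bufio), relative(r.buffer, r.bufio))
	}
}
```

Replace the rows in `results` with figures from your own system to see where the reader-based Scanner's overhead falls for the line lengths you care about.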
