use simd masking for amd64&arm64 #326

Merged
nhooyr merged 26 commits into coder:dev from wdvxdr1123:patch-simd-mask
Feb 22, 2024
Commits
5df0303 mask.go: Use SIMD masking for amd64 and arm64 (wdvxdr1123, Jan 24, 2022)
cda2170 Refactor and compile masking code again (nhooyr, Oct 19, 2023)
f5397ae mask_asm.go: Disable AVX2 (nhooyr, Oct 19, 2023)
14172e5 Benchmark pure go masking algorithm separately from assembly (nhooyr, Oct 19, 2023)
685a56e Update README.md to indicate assembly websocket masking (nhooyr, Oct 19, 2023)
cb7509a mask_amd64.s: Remove AVX2 fully (nhooyr, Oct 19, 2023)
3f8c9e0 mask_amd64.s: Minor improvements (nhooyr, Oct 19, 2023)
367743d mask_amd64.sh: Cleanup (nhooyr, Oct 19, 2023)
27f80cb mask.go: Cleanup assembly and add nbio benchmark (nhooyr, Oct 19, 2023)
369d641 mask_arm64.s: Cleanup (nhooyr, Oct 20, 2023)
fb13df2 ci/bench.sh: Benchmark masking on arm64 with QEMU (nhooyr, Oct 20, 2023)
ecf7dec ci/bench.sh: Install QEMU on CI (nhooyr, Oct 20, 2023)
d34e5d4 wsjson: Add json.Encoder vs json.Marshal benchmark (nhooyr, Oct 20, 2023)
e25d968 ci/bench.sh: Don't profile by default (nhooyr, Oct 20, 2023)
640e3c2 ci/bench.sh: Try function instead of alias (nhooyr, Oct 20, 2023)
0596e7a wsjson: Extend benchmark with multiple sizes (nhooyr, Oct 20, 2023)
30447a3 ci/bench.sh: Just symlink the expected qemu-aarch64 binary name (nhooyr, Oct 20, 2023)
f4e61e5 ci/fmt.sh: Error if changes on CI (nhooyr, Oct 21, 2023)
f533f43 mask.go: Reorganize (nhooyr, Oct 21, 2023)
a1bb441 ci: Fix dev coverage output (nhooyr, Feb 7, 2024)
fee3739 mask_asm: Note implementation may not be perfect (nhooyr, Feb 7, 2024)
68fc887 mask.go: Revert my changes (nhooyr, Feb 22, 2024)
f62cef3 test.sh: Test assembly masking on arm64 (nhooyr, Feb 22, 2024)
92acb74 internal/xcpu: Vendor golang.org/x/sys/cpu (nhooyr, Feb 22, 2024)
17e1b86 mask_asm: Disable AVX2 (nhooyr, Feb 22, 2024)
2cd18b3 README.md: Link to assembly benchmark results (nhooyr, Feb 22, 2024)
mask_amd64.s: Minor improvements
nhooyr committed Oct 26, 2023
commit 3f8c9e07bcaa0a223d092b618c34ca7dba3521db
frame.go (2 changes: 2 additions & 0 deletions)

@@ -184,6 +184,8 @@ func writeFrameHeader(h header, w *bufio.Writer, buf []byte) (err error) {
 // to be in little endian.
 //
 // See https://github.com/golang/go/issues/31586
+//
+//lint:ignore U1000 mask.go
 func maskGo(key uint32, b []byte) uint32 {
     if len(b) >= 8 {
         key64 := uint64(key)<<32 | uint64(key)
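
The hunk above shows only the first lines of maskGo. As a rough illustration of why the 32-bit key is widened into key64, here is a minimal sketch, not the repository's actual function body: maskSketch and maskBytewise are hypothetical names, and the sketch assumes the key is held in little-endian order, as the comment in the diff states. It XORs eight bytes at a time with the doubled key and finishes the tail one byte at a time.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
)

// maskBytewise is a plain reference (hypothetical name): XOR each byte with
// the lowest key byte, then rotate the key so the next key byte comes up.
func maskBytewise(key uint32, b []byte) uint32 {
	for i := range b {
		b[i] ^= byte(key)
		key = bits.RotateLeft32(key, -8)
	}
	return key
}

// maskSketch (hypothetical name) applies the same transformation but handles
// eight bytes per step by repeating the 32-bit key twice in a uint64.
func maskSketch(key uint32, b []byte) uint32 {
	if len(b) >= 8 {
		key64 := uint64(key)<<32 | uint64(key)
		for len(b) >= 8 {
			v := binary.LittleEndian.Uint64(b)
			binary.LittleEndian.PutUint64(b, v^key64)
			b = b[8:]
		}
	}
	// At most 7 bytes remain; 8 is a multiple of 4, so the key rotation
	// is still in phase with the data here.
	for i := range b {
		b[i] ^= byte(key)
		key = bits.RotateLeft32(key, -8)
	}
	return key
}

func main() {
	msg := []byte("hello, websocket")
	const key = 0x04030201

	maskSketch(key, msg)
	fmt.Printf("masked:   % x\n", msg)

	// XOR with the same key undoes the mask.
	maskSketch(key, msg)
	fmt.Printf("unmasked: %s\n", msg)
}
```

The returned key is the input key rotated by the number of bytes processed modulo 4, so a caller can continue masking the next chunk of the same frame with it.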
mask_amd64.s (12 changes: 6 additions & 6 deletions)

@@ -17,8 +17,8 @@ TEXT ·maskAsm(SB), NOSPLIT, $0-28
     SHLQ $32, DI
     ORQ DX, DI

-    CMPQ CX, $15
-    JLE less_than_16
+    CMPQ CX, $7
+    JLE less_than_8
     CMPQ CX, $63
     JLE less_than_64
     CMPQ CX, $128
@@ -58,7 +58,7 @@ unaligned_loop:
     JMP sse

 sse:
-    CMPQ CX, $0x40
+    CMPQ CX, $64
     JL less_than_64
     MOVQ DI, X0
     PUNPCKLQDQ X0, X0
@@ -76,9 +76,9 @@ sse_loop:
     MOVOU X2, 1*16(AX)
     MOVOU X3, 2*16(AX)
     MOVOU X4, 3*16(AX)
-    ADDQ $0x40, AX
-    SUBQ $0x40, CX
-    CMPQ CX, $0x40
+    ADDQ $64, AX
+    SUBQ $64, CX
+    CMPQ CX, $64
     JAE sse_loop

 less_than_64:
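
For readers less used to Go assembly, the control flow of sse_loop can be approximated in Go. This is only a sketch under assumptions: xorBlocks64 and blockSize are hypothetical names, and the sketch XORs eight bytes at a time where the assembly uses 16-byte SSE registers. It processes whole 64-byte blocks while at least 64 bytes remain and hands the rest to a tail path, which is what the CMPQ CX, $64 / JAE sse_loop pair expresses. The value 64 is the same number the old code wrote as 0x40.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// blockSize is the number of bytes handled per loop iteration
// (hypothetical constant; 64 == 0x40).
const blockSize = 64

// xorBlocks64 (hypothetical name) XORs every whole 64-byte block of b with
// the doubled key and returns the unprocessed tail, which is shorter than
// 64 bytes.
func xorBlocks64(key64 uint64, b []byte) []byte {
	for len(b) >= blockSize { // roughly: CMPQ CX, $64 ; JAE sse_loop
		for off := 0; off < blockSize; off += 8 {
			v := binary.LittleEndian.Uint64(b[off:])
			binary.LittleEndian.PutUint64(b[off:], v^key64)
		}
		b = b[blockSize:] // roughly: ADDQ $64, AX ; SUBQ $64, CX
	}
	return b
}

func main() {
	key := uint32(0x04030201)
	key64 := uint64(key)<<32 | uint64(key) // same widening as SHLQ $32 / ORQ

	buf := make([]byte, 150)
	tail := xorBlocks64(key64, buf)

	// 150 bytes = two full blocks (128 bytes) plus a 22-byte tail.
	fmt.Println(len(tail)) // 22
}
```

The tail returned here would go to the smaller-size paths (less_than_64 and below in the assembly).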