- Notifications
You must be signed in to change notification settings - Fork1k
Commit03ed951
authored
fix: Disable compression for websocket dRPC transport (#145)
There is a race condition in the interop between the websocket and `dRPC`:https://github.com/coder/coder/runs/5038545709?check_suite_focus=true#step:7:117 - it seems both the websocket and dRPC feel like they own the `byte[]` being sent between them. This can lead to data races, in which both `dRPC` and the websocket are writing.This is just tracking some experimentation to fix that race condition## Run results: ##- Run 1: peer test failure- Run 2: peer test failure- Run 3: `TestWorkspaceHistory/CreateHistory` -https://github.com/coder/coder/runs/5040858460?check_suite_focus=true#step:8:45```status code 412: The provided project history is running. Wait for it to complete importing!````- Run 4: `TestWorkspaceHistory/CreateHistory` -https://github.com/coder/coder/runs/5040957999?check_suite_focus=true#step:7:176``` workspacehistory_test.go:122: Error Trace:workspacehistory_test.go:122 Error: Condition never satisfied Test: TestWorkspaceHistory/CreateHistory```- Run 5: peer failure- Run 6: Pass ✅ - Run 7: Peer failure## Open Questions: ##### Is `dRPC` or `websocket` at fault for the data race?It looks like this condition is specifically happening when `dRPC` decides to [`SendError`]). This constructs a new byte payload from [`MarshalError`](https://github.com/storj/drpc/blob/f6e369438f636b47ee788095d3fc13062ffbd019/drpcwire/error.go#L15) - so `dRPC` has created this buffer and owns it.From `dRPC`'s perspective, the callstack looks like this:- [`sendPacket`](https://github.com/storj/drpc/blob/f6e369438f636b47ee788095d3fc13062ffbd019/drpcstream/stream.go#L253) - [`writeFrame`](https://github.com/storj/drpc/blob/f6e369438f636b47ee788095d3fc13062ffbd019/drpcwire/writer.go#L65) - [`AppendFrame`](https://github.com/storj/drpc/blob/f6e369438f636b47ee788095d3fc13062ffbd019/drpcwire/packet.go#L128) - with finally the data race happening here:```go// AppendFrame appends a marshaled form of the frame to the provided buffer.func AppendFrame(buf []byte, fr Frame) []byte {...out := bufout = append(out, control). // <---------```This should be fine, since `dPRC` create this buffer, and is taking the byte buffer constructed from `MarshalError` and tacking a bunch of headers on it to create a proper frame.Once `dRPC` is done writing, it _hangs onto the buffer and resets it here__:https://github.com/storj/drpc/blob/f6e369438f636b47ee788095d3fc13062ffbd019/drpcwire/writer.go#L73However... the websocket implementation, once it gets the buffer, it runs a `statelessDeflate` [here](https://github.com/nhooyr/websocket/blob/8dee580a7f74cf1713400307b4eee514b927870f/write.go#L180), which compresses the buffer on the fly. This functionality actually [mutates the buffer in place](https://github.com/klauspost/compress/blob/a1a9cfc821f00faf2f5231beaa96244344d50391/flate/stateless.go#L94), which is where get our race.In the case where the `byte[]` aren't being manipulated anywhere else, this compress-in-place operation would be safe, and that's probably the case for most over-the-wire usages. In this case, though, where we're plumbing `dRPC` -> websocket, they both are manipulating it (`dRPC` is reusing the buffer for the next `write`, and `websocket` is compressing on the fly).### Why does cloning on `Read` fail?Get a bunch of errors like:```2022/02/02 19:26:10 [WARN] yamux: frame for missing stream: Vsn:0 Type:0 Flags:0 StreamID:0 Length:02022/02/02 19:26:25 [ERR] yamux: Failed to read header: unexpected EOF2022/02/02 19:26:25 [ERR] yamux: Failed to read header: unexpected EOF2022/02/02 19:26:25 [WARN] yamux: frame for missing stream: Vsn:0 Type:0 Flags:0 StreamID:0 Length:0```# UPDATE:We decided we could disable websocket compression, which would avoid the race because the in-place `deflate` operaton would no longer be run. Trying that out now:- Run 1: ✅ - Run 2:https://github.com/coder/coder/runs/5042645522?check_suite_focus=true#step:8:338- Run 3: ✅ - Run 4:https://github.com/coder/coder/runs/5042988758?check_suite_focus=true#step:7:168- Run 5: ✅1 parent2a76b60 commit03ed951
1 file changed
+4
-1
lines changedOriginal file line number | Diff line number | Diff line change | |
---|---|---|---|
| |||
48 | 48 |
| |
49 | 49 |
| |
50 | 50 |
| |
51 |
| - | |
| 51 | + | |
| 52 | + | |
| 53 | + | |
| 54 | + | |
52 | 55 |
| |
53 | 56 |
| |
54 | 57 |
| |
|
0 commit comments
Comments
(0)