Commit e2d141d
set thread_work_size to 4 for unrolled kernel (#154541)
set thread_work_size to 4 for unrolled kernel (#152396)

Previous PRs enabling 8-vectorization inadvertently regressed unrolled kernel perf.

Pull Request resolved: #152396
Approved by: https://github.com/BoyuanFeng, https://github.com/msaroufim, https://github.com/malfet, https://github.com/Aidyn-A, https://github.com/atalman

(cherry picked from commit adebb8b)

Co-authored-by: Natalia Gimelshein <ngimel@meta.com>

1 parent 1214198, commit e2d141d
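The changed code itself is not shown here, so the following is only a rough sketch of what a per-thread work size controls in an elementwise launch. It assumes the block work size is `thread_work_size` times the number of threads per block. The value `kThreadsPerBlock = 128` and the helpers `block_work_size`, `blocks_needed`, and `print_launch_shape` are hypothetical names chosen for illustration and are not taken from the commit.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical threads-per-block value, chosen only for illustration.
constexpr int kThreadsPerBlock = 128;

// Elements handled by one block, assuming each thread handles
// `thread_work_size` elements.
std::int64_t block_work_size(int thread_work_size) {
  assert(thread_work_size > 0);
  return static_cast<std::int64_t>(thread_work_size) * kThreadsPerBlock;
}

// Number of blocks needed to cover `numel` elements (ceiling division).
std::int64_t blocks_needed(std::int64_t numel, int thread_work_size) {
  assert(numel >= 0);
  const std::int64_t bws = block_work_size(thread_work_size);
  return (numel + bws - 1) / bws;
}

// Prints the launch shape for a given element count and per-thread work size.
void print_launch_shape(std::int64_t numel, int thread_work_size) {
  std::printf("thread_work_size=%d block_work_size=%lld blocks=%lld\n",
              thread_work_size,
              static_cast<long long>(block_work_size(thread_work_size)),
              static_cast<long long>(blocks_needed(numel, thread_work_size)));
}
```

Under these assumptions, calling `print_launch_shape(1000000, 4)` and `print_launch_shape(1000000, 8)` shows that halving the per-thread work size roughly doubles the number of blocks. Whether that accounts for the regression mentioned in the commit message is not stated in the source.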
1 file changed, 11 additions and 2 deletions
[Diff body not recovered; only line numbers survive. First hunk: 8 lines added after original line 85 (new lines 86-93). Second hunk: original line 339 replaced by new lines 347-348, and original line 341 replaced by new line 350.]