Tracking SSE2 Optimisations #3370
I just tested how to use the SSE blitters on my recent PR. If you say
PurityLake commented Oct 9, 2022 • edited
Did some testing with blitting a surface 10000 times using various blend types.

Here are the results:

SSE2 without optimisations:

SSE2 with optimisations:

Code used in testing:

```python
import random
from timeit import Timer

import pygame
from pygame.locals import *

pygame.init()

width = 800
height = 600
screen = pygame.display.set_mode((width, height))
surface = pygame.image.load("Pygame.png").convert_alpha()

blend_types = {
    "BLEND_RGBA_ADD": BLEND_RGBA_ADD,
    "BLEND_RGB_ADD": BLEND_RGB_ADD,
    "BLEND_RGBA_MULT": BLEND_RGBA_MULT,
    "BLEND_RGB_MULT": BLEND_RGB_MULT,
    "BLEND_RGBA_SUB": BLEND_RGBA_SUB,
    "BLEND_RGB_SUB": BLEND_RGB_SUB,
    "BLEND_RGBA_MAX": BLEND_RGBA_MAX,
    "BLEND_RGB_MAX": BLEND_RGB_MAX,
    "BLEND_RGBA_MIN": BLEND_RGBA_MIN,
    "BLEND_RGB_MIN": BLEND_RGB_MIN,
}

BLITS_TO_DO = 100000
positions = [
    (random.randint(0, width - 50), random.randint(0, height - 50))
    for _ in range(BLITS_TO_DO)
]


def do_the_blits(item):
    # One blit per entry in positions; the destination is always (50, 50).
    for pos in positions:
        screen.blit(surface, (50, 50), special_flags=item)


for key, value in blend_types.items():
    print(f"Testing {key}:")
    print(Timer(lambda: do_the_blits(value)).timeit(number=1))
```
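Since the benchmark above times each blend type with a single run (`number=1`), one noisy run can skew a comparison. A minimal sketch of one way to reduce that, assuming it is acceptable to repeat each measurement and keep the fastest; `best_time` is a hypothetical helper and was not part of the original test:

```python
import timeit


def best_time(func, repeats=5):
    """Return the fastest of `repeats` single runs of func(), in seconds.

    Hypothetical helper: taking the minimum filters out runs that were
    slowed down by other activity on the machine.
    """
    return min(timeit.repeat(func, number=1, repeat=repeats))


if __name__ == "__main__":
    # Stand-in workload so the sketch runs without pygame. In the benchmark
    # above, the callable would be: lambda: do_the_blits(value)
    print(best_time(lambda: sum(range(100_000))))
```

Swapping the `Timer(...).timeit(number=1)` call for a helper like this would multiply the total run time by `repeats`, so a smaller `BLITS_TO_DO` may be preferable when using it.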
PurityLake's test numbers as percentage improvements:

This is better than I was expecting. I'm especially suspicious of the MULT ones though, they seem too good to be true. I'll also see if I can replicate these testing numbers locally.
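For reference, turning two timings into a percentage improvement can be sketched as below. This assumes the common definition (time saved relative to the baseline time); the thread does not say which formula was actually used, and `percent_improvement` is a hypothetical name.

```python
def percent_improvement(baseline_s, optimised_s):
    """Percentage of the baseline time saved by the optimised run.

    Assumed definition: (baseline - optimised) / baseline * 100.
    A positive result means the optimised run was faster.
    """
    if baseline_s <= 0:
        raise ValueError("baseline time must be positive")
    return (baseline_s - optimised_s) / baseline_s * 100.0


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the tests in this thread.
    print(percent_improvement(2.0, 1.5))  # 25.0
```

Under this definition, a result above roughly 50% means the optimised path runs in less than half the baseline time, which is the kind of figure worth double-checking against build flags.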
My Test Data

I used a 460 x 261 image as my surface basis.

Main (AVX) | Main (SSE) | This PR (SSE)

Percent improvement in SSE blit mode performance in this PR over main

In my testing, I see that this PR achieves a 15-20% performance improvement with these blit modes over main (when just using SSE), as expected. I believe @PurityLake forgot to turn AVX2 off for one of their tests.
This looks good to me, thanks!
LGTM 👍
Refers to #3358
This PR tracks progress on optimising the SSE2 blitters, based on Starbuck's AVX optimisations.
The optimisations for SSE2 are currently implemented.