This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.
Re: [PATCH 0/2] Multiarch hooks for memcpy variants
Siddhesh Poyarekar wrote:

> The first part is not true for falkor since its implementation is a good
> 10-15% faster on the falkor chip due to its design differences. glibc
> makes pretty extensive use of memcpy throughout, but I don't have data
> on how much difference a core-specific memcpy will make there, so I
> don't have enough grounds for a generic change.

66% of memcpy calls are <=16 bytes. Assuming you can even get a 15% gain
for these small sizes (there is very little you can do different), that's at
most 1 cycle faster, so the PLT indirection is going to be more expensive.

> Your last point about hurting everything else is very valid though; it's
> very likely that adding an extra indirection in cases where
> __memcpy_generic is going to be called anyway is going to be expensive
> given that a bulk of the memcpy calls will be for small sizes of less
> than 1k.

Note that the falkor version does quite well in memcpy-random across several
microarchitectures so I think parts of it could be moved into the generic code.

> Allowing a PLT only for __memcpy_chk and mempcpy would need a test case
> waiver in check_localplt and that would become a blanket OK for PLT
> usage for memcpy, which we don't want. Hence my patch is probably the
> best compromise, especially since there is precedent for the approach in
> x86.

I still can't see any reason to even support these entry points in GLIBC, let
alone optimize them using ifuncs. The _chk functions should obviously be
inlined to avoid all the target specific complexity for no benefit. I think this
could trivially be done via the GLIBC headers already. (That's assuming they
are in any way performance critical.)

Wilco
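
For illustration only, here is a rough sketch of what "inlining the _chk function in a header" could look like. It is not how glibc's headers are actually written. The names sketch_memcpy_chk and SKETCH_MEMCPY are hypothetical, and the sketch assumes a GCC-compatible compiler that provides __builtin_object_size. The bounds check is done inline and the copy goes straight to plain memcpy, so no separate __memcpy_chk entry point (and no ifunc for it) is involved.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a glibc symbol: check the destination size
   inline, then forward to the ordinary memcpy.  */
static inline void *
sketch_memcpy_chk (void *dst, const void *src, size_t n, size_t dst_size)
{
  /* dst_size is (size_t) -1 when the compiler cannot determine the
     object size, in which case this check can never fail.  */
  if (n > dst_size)
    abort ();
  return memcpy (dst, src, n);
}

/* Hypothetical wrapper macro: ask the compiler for the size of the
   destination object (GCC/Clang builtin) and pass it to the helper.  */
#define SKETCH_MEMCPY(dst, src, n) \
  sketch_memcpy_chk ((dst), (src), (n), __builtin_object_size ((dst), 0))
```

When both the destination size and n are compile-time constants, an optimizing compiler can typically fold the check away entirely, leaving only the call to memcpy.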