
This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.


Re: Re: [PATCH] Add math benchmark latency test

From: Arjan van de Ven <arjan at linux dot intel dot com>
To: Siddhesh Poyarekar <siddhesh at gotplt dot org>, Wilco Dijkstra <Wilco dot Dijkstra at arm dot com>, "libc-alpha at sourceware dot org" <libc-alpha at sourceware dot org>
Cc: nd <nd at arm dot com>
Date: Wed, 16 Aug 2017 07:23:08 -0700
Subject: Re: Re: [PATCH] Add math benchmark latency test
Authentication-results: sourceware.org; auth=none
References: <0e008f2e-f41a-1bb8-803c-2f798e2c3541@gotplt.org>

On 8/16/2017 6:07 AM, Siddhesh Poyarekar wrote:

I didn't notice this earlier, but shouldn't throughput be iterations/cycle and not the other way around?  That is, throughput should be the inverse of latency.

Well, not really...

I've been working on making expf() faster for x86 (see HJ's email earlier), and with a massively out-of-order, pipelined CPU, latency and throughput are very distinct things. expf() can run at a throughput of somewhere in the 10 to 11 cycles range, while the latency can be in the 45 to 55 cycles range. (Not trying to do benchmarking here, just wanting to show an order of magnitude.)

The latency is then the number of cycles it takes to get a result (on an empty CPU) through from end to end, e.g.

    printf("%e", expf(f1))

while throughput is the cost if you put multiple consecutive calls through the CPU, like

    printf("%e", expf(f1) + expf(f2) + expf(f3) + expf(f4))

(using "printf" as a proxy for a 'make externally visible' sync point; of course in reality it could be many other things).

The out-of-order CPU will start execution of the second, third and fourth expf() in parallel with the first, which will hide the latency (so the result time is not 4x45 + time of 4 adds, but much less, closer to 45 + 3x11 + time of 4 adds).

I picked 4 expf()s here but theoretically throughput would be measured with the asymptote of 4...
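To make the arithmetic in the message concrete, here is a minimal sketch of the two cost estimates. It is only a back-of-the-envelope model, not a benchmark: the function names are hypothetical, the cost of the adds is ignored, and it assumes the CPU can overlap all independent calls perfectly, which real hardware will only approximate.

```c
#include <assert.h>

/* Estimated cycles when every call has to wait for the previous one
   to finish (no overlap): n_calls * latency. */
unsigned long serial_cycles(unsigned long n_calls, unsigned long latency)
{
    return n_calls * latency;
}

/* Estimated cycles when the calls are independent and overlap ideally:
   the first result appears after `latency` cycles, and each further
   result follows `throughput` cycles after the previous one.
   This is an idealized assumption, not a measured behaviour. */
unsigned long overlapped_cycles(unsigned long n_calls,
                                unsigned long latency,
                                unsigned long throughput)
{
    assert(n_calls > 0);
    return latency + (n_calls - 1) * throughput;
}

/* Average cost per call under the same idealized model.  As n_calls
   grows, this approaches `throughput` from above. */
double overlapped_cycles_per_call(unsigned long n_calls,
                                  unsigned long latency,
                                  unsigned long throughput)
{
    return (double) overlapped_cycles(n_calls, latency, throughput)
           / (double) n_calls;
}
```

With the illustrative figures above (latency 45, throughput 11), serial_cycles(4, 45) gives 180 and overlapped_cycles(4, 45, 11) gives 78, matching the 4x45 versus 45 + 3x11 comparison. The per-call average only gets close to 11 once the number of calls is large, which is why a throughput figure is an asymptote rather than something read off a handful of calls.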

Follow-Ups:
- Re: [PATCH] Add math benchmark latency test
  - From: Siddhesh Poyarekar

References:
- Re: [PATCH] Add math benchmark latency test
  - From: Siddhesh Poyarekar
