Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings
This repository was archived by the owner on Jan 23, 2023. It is now read-only.
/coreclrPublic archive

Override Stream.CopyTo on FileStream#17978

Conversation

@tdinucci
Copy link
Member

@tdinuccitdinucci commentedMay 13, 2018
edited
Loading

Addresseshttps://github.com/dotnet/corefx/issues/29479

This is relevant to Windows only, just as CopyToAsync is.

A PR for unit tests is about to be submitted to corefx.

@jkotas
Copy link
Member

I assume that we are doing this because of the handcrafted implementation is faster. Could you please collect some numbers to prove it?

@jkotasjkotas requested a review fromstephentoubMay 13, 2018 04:06
@jkotas
Copy link
Member

For reference -dotnet/corefx#11569 was PR with the handcrafted async implementation.

@stephentoub
Copy link
Member

I should have included more details on my thoughts in the issue I opened.

My specific thought was that at a minimum FileStream.CopyTo should delegate to CopyToAsync in the case where it's in async mode. FileStream is a strange beast in that it's configured into one of two modes at construction time, either sync mode or async mode, the difference being whether it's created to use overlapped I/O on Windows, which needs to be specified at the time the handle is created. Then all subsequent operations on the stream need to match, so if an async operation is performed on a sync stream, the implementation just queues a work item to the thread pool that calls the synchronous method, and conversely if a sync method is called on an async stream, it delegates to the XxAsync method and then blocks waiting for it to complete. There is overhead associated with setting up each async operation, but there's even more overhead when a sync method is called on an async stream, as you not only get that per-operation async overhead, you also get the overhead of the blocking waiting for the result. FileStream's CopyToAsync override is there to minimize the async overhead for each operation when in async mode. With CopyTo not overridden, it just uses the base Stream's CopyTo, which will end up repeatedly calling Read and Write, each of which on an async FileStream will end up incurring the maximum overhead. So at a minimum I was thinking we should do (pseudo-code):

publicoverridevoidCopyTo(Streamdestination,intbufferSize){if(_asyncMode&&GetType()==typeof(FileStream)){CopyToAsync(destination,bufferSize).GetAwaiter().GetResult();}else{base.CopyTo(destination,bufferSize);}}

so that in the case where CopyTo is used on an async FileStream, we only incur the blocking and async overheads once rather than per read and write operation.

It's possible all of the additional duplication this PR provides also has benefits, but that's much less likely to be a given, and as Jan suggests would really need to be measured (as would my suggested change, but I'm more inclined to believe it'll have measurable impact).

@tdinucci
Copy link
MemberAuthor

Thanks for the feedback.

When I had my nose in this I was actually wondering where the perceived benefit was. Other than saving on a few jumps and bypassing some fairly inexpensive logic there's not much difference. Now knowing your thoughts, things make more sense but it does raise a question for me - and I now realise I should have asked more questions earlier ;)

I wonder if this change is effectively promoting sync over async and would contribute further to FileStream being a "strange beast". I wonder if rather than silently helping people do the wrong thing it would be better to work on a way towards helping them not get into this position in the first place and at the same time work towards having a less complicated internal implementation.

I'm just spitballing here but the type of thing I'm wondering is if having ISyncStream and IAsyncStream interfaces would help. The current Stream type would have to implement both interfaces implicitly however additional implementations (e.g. SyncFileStream, AsyncMemoryStream, etc) would implement one of the interfaces implicitly and the other explicitly. They'd then bring in "base" functionality through composition rather than inheritance. The explicit implementations could probably be kept trivial - I'm not sure how often anyone would want to cast between I*Stream types?

Basically, I wonder if with this type of thing you'd end up with a new set of stream types that are more difficult to use incorrectly and at the same time have simpler internals. I would fully accept that if I cast an AsyncFileStream to ISyncStream that these synchronous operations are sub-optimal but at least I'd have made the conscious decision to do this. Put another way, the API is making it easier for me to do the right thing than it is to potentially make a mistake.

Having said all this, I'm perfectly happy to continue with what's been asked for.

@stephentoub
Copy link
Member

The other sync and async methods on FileStream already do this, so this isn't introducing a new concept, rather it's just doing the consistent thing in CopyTo that also happens to be significantly more efficient when used in this situation. I don't think introducing new interfaces would provide a benefit in this case.

@tdinucci
Copy link
MemberAuthor

Cool, understood. I'll update PR and post figures in the next day or so.

@tdinucci
Copy link
MemberAuthor

@jkotas and@stephentoub, I've done some further testing and submitted a new PR based on the feedback above.

Unfortunately, My machine isn't suitable for this level of performance testing due to the amount of background noise and I'd say someone else would have to run perf tests on this before reaching any level of confidence.

From my perspective there is perhaps up to a 5% improvement with large files (10MB to 64MB) however there seemed to be a more noticeable degradation for smaller files (maybe ~40% slower for 64KB files).

The tests I ran on the original PR did not show a marked difference.

Copy link
Member

@stephentoubstephentoub left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM. Thanks.

@stephentoub
Copy link
Member

From my perspective there is perhaps up to a 5% improvement with large files (10MB to 64MB) however there seemed to be a more noticeable degradation for smaller files (maybe ~40% slower for 64KB files).

Interesting. I would have expected a larger impact than that. I'll take a look...

@tdinucci
Copy link
MemberAuthor

👍 I'd be very interested to hear how things look for you

@stephentoub
Copy link
Member

stephentoub commentedMay 17, 2018
edited
Loading

So from what I can see, the problem with smaller files is that this change really doesn't buy much and incurs additional cost in the process. CopyTo{Async} by default uses an ~80K buffer, which means the "old" implementation would end up doing a single read. The benefits of CopyToAsync are that it does the equivalent setup for a read but then reuses that set up across all subsequent operations, so if it would have only done a single read, it's really not providing any benefit. And in the process it's incurring additional overheads, like turningWrites on the destination stream intoWriteAsyncs, which have additional overhead. For larger files, there's definitely a win, primarily in the form of decreased allocation, so the benefits are more visible when part of a larger application. But it's not clear that it's worth it, and if it mattered to an application, that application could of course just call CopyToAsync rather than CopyTo.

@tdinucci
Copy link
MemberAuthor

Thanks for the clarification.

Perhaps I should try adding a test for the streams length and if it's less than some multiple of the buffer size (figured out through testing) call through to the base implementation, otherwise call through to CopyToAsync?

@tdinucci
Copy link
MemberAuthor

But it's not clear that it's worth it, and if it mattered to an application, that application could of course just call CopyToAsync rather than CopyTo.

I agree. I've just done a little more testing on this on files ranging from 10KB to 1GB. today I'm seeing a slight degradation on the smaller files and a slight improvement with bigger files. The degradation today isn't as pronounced as I was seeing the other day, <10% (which proves my machine isn't suitable for this type of testing) and I'm also not sure it's worth adding additional complexity for a potentially tiny gain.

I'll bow out of this one for now.

@stephentoub
Copy link
Member

Thanks for your efforts on this,@tdinucci. Let's go ahead and close this and the issue for now; seems like there's not much of a "there there". Good investigation, though.

@tdinuccitdinucci deleted the FileStream_override_Stream.CopyTo branchMay 20, 2018 11:58
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.

Reviewers

1 more reviewer

@stephentoubstephentoubstephentoub approved these changes

Reviewers whose approvals may not affect merge requirements

Assignees

No one assigned

Labels

None yet

Projects

None yet

Milestone

No milestone

Development

Successfully merging this pull request may close these issues.

3 participants

@tdinucci@jkotas@stephentoub

[8]ページ先頭

©2009-2025 Movatter.jp