gh-81322: support multiple separators in Stream.readuntil #16429


Merged
gvanrossum merged 4 commits into python:main from bmerry:multi-separator on Apr 8, 2024

Conversation

@bmerry (Contributor) commented Sep 26, 2019 (edited by bedevere-app bot)

Allow Stream.readuntil to take an iterable of separators and match any
of them. The earliest match endpoint wins (which ensures that results
do not depend on the chunking of the incoming data), and on ties the
shortest separator wins (which only matters if the user has supplied a
redundant set like [b'\r\n', b'\n'] and the limit is reached).

It's also implemented for the deprecated StreamReader, just because the
code for the two implementations was the same except for one line and it
seemed easier to keep them in sync than leaving two different versions
to maintain.

https://bugs.python.org/issue37141
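For readers who want to see the intended semantics in action, here is a minimal sketch using a bare StreamReader fed by hand. It assumes Python 3.13+, where the feature eventually landed accepting a tuple of separators (this PR originally accepted any iterable); the data and separators are made up for illustration.

```python
import asyncio

async def main():
    # A bare StreamReader fed manually, just to illustrate the matching rule.
    reader = asyncio.StreamReader()
    reader.feed_data(b"abc;def|ghi;")
    reader.feed_eof()

    seps = (b";", b"|")
    # Earliest match endpoint wins: b";" ends at offset 4, before b"|" at 8.
    print(await reader.readuntil(seps))  # b'abc;'
    print(await reader.readuntil(seps))  # b'def|'
    print(await reader.readuntil(seps))  # b'ghi;'

asyncio.run(main())
```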

@bmerry (Contributor, Author) commented

A few questions about this:

  • I've implemented it for both StreamReader and Stream, because it seemed like having two identical (apart from _ensure_can_read) implementations is better for maintainability than two implementations that have drifted out of sync. But if one would prefer not to make any changes to StreamReader I can revert that. I suggest doing review discussion on the Stream implementation.
  • Because the existing functionality is implemented using the general case, there might be a very small loss of performance. Are there any benchmarks I should run to test that?
  • For matches that end in the same position I decided that the shortest should win for the purpose of LimitOverrunError, but it was an arbitrary choice. It could just as easily be longest-wins or first-in-list-wins.

break

# see upper comment for explanation.
offset = buflen + 1 - seplen
offset = buflen + 1 - max_seplen
@bmerry (Contributor, Author):

Hmm, I see this can make offset negative if max_seplen > min_seplen. I'll add a test to catch it and add a fix.
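For context, a tiny standalone sketch of a clamp that would prevent the negative offset mentioned above; buflen and max_seplen are stand-in values here, not the actual variables from the patch, and the exact fix that landed may differ.

```python
# Stand-in values: a short buffer and a longer longest-separator.
buflen = 2       # bytes currently in the buffer
max_seplen = 4   # length of the longest separator supplied

# Without a clamp this would be -1; clamping keeps the resume offset valid.
offset = max(0, buflen + 1 - max_seplen)
assert offset == 0
```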

@brandtbucher (Member) left a comment

Thanks for taking the time to draft this PR @bmerry, and welcome to CPython! 😎

I'm not super familiar with this module, but I found a couple of things that could be tidied up. Other than that, the code looks good! This will definitely need updated docs, though.

# Makes sure shortest matches wins, and supports arbitrary iterables
separator = sorted(separator, key=lambda sep: len(sep))
if not separator:
    raise ValueError('Separator list should contain at least one element')
@brandtbucher (Member):

See above.

Suggested change
raise ValueError('Separator list should contain at least one element')
raise ValueError('Separator should contain at least one element')

@@ -1672,26 +1709,35 @@ def _feed_data(self, data):
# messages :)

# `offset` is the number of bytes from the beginning of the buffer
# where there is no occurrence of `separator`.
# where there is no occurrence of any separator.
@brandtbucher (Member):

See above.

Suggested change
# where there is no occurrence of any separator.
# where there is no occurrence of any `separator`.

offset = 0

# Loop until we find `separator` in the buffer, exceed the buffer size,
# Loop until we find a separator in the buffer, exceed the buffer size,
@brandtbucher (Member):

See above.

Suggested change
# Loop until we find a separator in the buffer, exceed the buffer size,
# Loop until we find a `separator` in the buffer, exceed the buffer size,

    separator = [separator]
else:
    # Makes sure shortest matches wins, and supports arbitrary iterables
    separator = sorted(separator, key=lambda sep: len(sep))
@brandtbucher (Member):

Sorry, missed one. Here too!

Suggested change
separator = sorted(separator, key=lambda sep: len(sep))
separator = sorted(separator, key=len)
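For reference, a quick interpreter check (not from the patch) showing that the two spellings are equivalent and that sorting by length puts the shortest separator first, which is what the tie-breaking rule relies on:

```python
seps = [b"\r\n", b"\n"]
assert sorted(seps, key=lambda sep: len(sep)) == sorted(seps, key=len)
print(sorted(seps, key=len))  # [b'\n', b'\r\n'] -- shortest first
```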

@1st1 (Member) left a comment

Unfortunately it was decided to revert the current streams implementation from 3.8. See https://bugs.python.org/issue38242 for more details. I'm really sorry, but we'll need to rebase this work on the new API we later add to 3.9 :(

@bedevere-bot commented

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

And if you don't make the requested changes, you will be poked with soft cushions!

@bmerry (Contributor, Author) commented

@1st1 not a problem - definitely worth getting things right before they go into stdlib. Do you think the revert of the new stream API will conflict much with my changes to the old StreamReader class? If not, perhaps the way forward is for me to remove my changes to Stream, address the PR comments on StreamReader, and then hopefully it shouldn't need much work to rebase after Stream is removed?

@1st1 (Member) commented

@bmerry I'm not yet sure. I'll be working on a revert later today or tomorrow and will probably know more. I'd keep everything as is as it's not going to be an easy revert anyways.

@bmerry (Contributor, Author) commented

@1st1 it looks like you've done the Streams reversion now, and I've merged that into my branch. So hopefully this is ready for review now.

@brandtbucher thanks for the suggestions. I've applied all the ones that GitHub was still showing after the merge from master - let me know if anything got lost along the way.

@brandtbucher (Member) left a comment (edited)

One more tiny clarification from me, otherwise this looks good. Still needs updated docs though!

@bmerry (Contributor, Author) commented

Still needs updated docs though!

@brandtbucher which docs need updating?

bmerry requested a review from 1st1 on October 7, 2019, 15:51
@brandtbucher (Member) commented

Sorry about the delay @bmerry. The docs I'm referring to are located in Doc/library/asyncio-stream.rst. Specifically, here:

   .. coroutinemethod:: readuntil(separator=b'\\n')

      Read data from the stream until *separator* is found.

      On success, the data and separator will be removed from the
      internal buffer (consumed). Returned data will include the
      separator at the end.

      If the amount of data read exceeds the configured stream limit, a
      :exc:`LimitOverrunError` exception is raised, and the data
      is left in the internal buffer and can be read again.

      If EOF is reached before the complete separator is found,
      an :exc:`IncompleteReadError` exception is raised, and the internal
      buffer is reset.  The :attr:`IncompleteReadError.partial` attribute
      may contain a portion of the separator.

      .. versionadded:: 3.5.2

The signature and description should be updated. Also, something like this should be included below the .. versionadded:: 3.5.2 bit:

      .. versionchanged:: 3.9
         The *separator* parameter may now be an :term:`iterable` of separators.

@brandtbucher (Member) left a comment

One tiny change to the NEWS entry due to the API revert:

@bmerry (Contributor, Author) commented

The docs I'm referring to are located in Doc/library/asyncio-stream.rst

Thanks. I'd assumed the library docs were generated from the docstring, which is why I missed this. I'll update it.

@brandtbucher (Member) commented

Looks good to me! @1st1?

@1st1 (Member) commented Oct 9, 2019

We will soon start a discussion about the new streaming API for 3.9. I'll update this issue with a link when that happens; please give us some time before making a decision on this one -- Python 3.9 is still relatively far away.


gvanrossum changed the title from "bpo-37141: support multiple separators in Stream.readuntil" to "gh-81322: support multiple separators in Stream.readuntil" on Apr 7, 2024
@gvanrossum (Member) commented

Thanks, I will review in the coming week!

@gvanrossum (Member) left a comment

One comment nit, I'll merge that myself and then land it. Thanks for your contribution, and on behalf of the asyncio maintainers I apologize for the delay!

len(separator) would be the number of separators, which makes no sense.
gvanrossum merged commit 775912a into python:main on Apr 8, 2024
bmerry added a commit to bmerry/cpython that referenced this pull request on Apr 8, 2024
PR python#16429 introduced support for an iterable of separators in Stream.readuntil. Since bytes-like types are themselves iterable, this can introduce ambiguities in deciding whether the argument is an iterator of separators or a singleton separator. In python#16429, only 'bytes' was considered a singleton, but this will break code that passes other buffer object types.

The Python library docs don't indicate what separator types were permitted in Python <= 3.12, but comments in typeshed indicate that it would work with types that implement the buffer protocol and provide a len(). To keep those cases working the way they did before, I've changed the detection logic to consider any instance of collections.abc.Buffer as a singleton separator.

There may still be corner cases where this doesn't do what the user wants, e.g. a numpy array of byte strings will implement the buffer protocol and hence be treated as a singleton; but at least those corner cases should behave the same in 3.13 as they did in 3.12.

Relates to python#81322.
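A rough sketch of the detection rule described in that commit message, assuming Python 3.12+ for collections.abc.Buffer; the helper name _split_singleton is made up here and is not the code that landed.

```python
from collections.abc import Buffer  # available since Python 3.12

def _split_singleton(separator):
    """Sketch: treat any buffer-protocol object as a single separator."""
    if isinstance(separator, Buffer):
        # bytes, bytearray, memoryview, array.array, ... stay singletons.
        return [bytes(separator)]
    # Anything else is assumed to be an iterable of separators.
    return sorted((bytes(s) for s in separator), key=len)

print(_split_singleton(b"\r\n"))              # [b'\r\n']
print(_split_singleton(memoryview(b"\n")))    # [b'\n']
print(_split_singleton((b"\r\n", b"\n")))     # [b'\n', b'\r\n']
```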
bmerry deleted the multi-separator branch on April 8, 2024, 19:09
bmerry added a commit to bmerry/cpython that referenced this pull request on Apr 10, 2024
PR python#16429 introduced support for an iterable of separators in Stream.readuntil. Since bytes-like types are themselves iterable, this can introduce ambiguities in deciding whether the argument is an iterator of separators or a singleton separator. In python#16429, only 'bytes' was considered a singleton, but this will break code that passes other buffer object types.

Fix it by only supporting tuples rather than arbitrary iterables.

Closes python#117722.
gvanrossum pushed a commit that referenced this pull request on Apr 11, 2024
gh-16429 introduced support for an iterable of separators in Stream.readuntil. Since bytes-like types are themselves iterable, this can introduce ambiguities in deciding whether the argument is an iterator of separators or a singleton separator. In gh-16429, only 'bytes' was considered a singleton, but this will break code that passes other buffer object types.

Fix it by only supporting tuples rather than arbitrary iterables.

Closes gh-117722.
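In other words, after the follow-up fix only a tuple is interpreted as a collection of separators, while a plain bytes-like object keeps behaving as one separator even though bytes is itself iterable. A small sketch, assuming Python 3.13+:

```python
import asyncio

async def main():
    reader = asyncio.StreamReader()
    reader.feed_data(b"one\ntwo\r\nthree\n")
    reader.feed_eof()

    # Plain bytes: a single separator, never split into 1-byte separators.
    print(await reader.readuntil(b"\n"))             # b'one\n'

    # A tuple: multiple alternative separators.
    print(await reader.readuntil((b"\r\n", b"\n")))  # b'two\r\n'

asyncio.run(main())
```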
diegorusso pushed a commit to diegorusso/cpython that referenced this pull request on Apr 17, 2024

… (python#117723)

pythongh-16429 introduced support for an iterable of separators in Stream.readuntil. Since bytes-like types are themselves iterable, this can introduce ambiguities in deciding whether the argument is an iterator of separators or a singleton separator. In pythongh-16429, only 'bytes' was considered a singleton, but this will break code that passes other buffer object types.

Fix it by only supporting tuples rather than arbitrary iterables.

Closes pythongh-117722.
Reviewers

@1st1 requested changes

@gvanrossum approved these changes

@brandtbucher approved these changes

@willingc — awaiting requested review (code owner)

Assignees
No one assigned
Labels
topic-asyncio, type-feature (A feature request or enhancement)
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

8 participants
@bmerry, @bedevere-bot, @1st1, @brandtbucher, @aeros, @gvanrossum, @the-knights-who-say-ni, @ezio-melotti
