Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

Implement Buffer Protocol#2226

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
youknowone merged 11 commits intoRustPython:masterfromqingshi163:buffer_protocol
Oct 10, 2020

Conversation

qingshi163
Copy link
Contributor

I add a slot for implement the buffer protocol, more or less like what CPython doing. I am not sure is the right way for RustPython. Also I am confusing about the slots seem do not inherit from the base? So how we control the behavour of the subclass?

@youknowone
Copy link
Member

Slots is a data field of type (PyClass). For now, we typically iterate mro to find proper field of slots.

use crate::{bytesinner::try_as_bytes, pyobject::IntoPyObject};
use crossbeam_utils::atomic::AtomicCell;

pub trait BufferProtocol: Debug + Sync + Send {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Instead ofSend + Sync here, you probably want to usePyThreadingConstraint frompyobject

@youknowone
Copy link
Member

related to#2125 and#2195, right?

@qingshi163
Copy link
ContributorAuthor

qingshi163 commentedSep 24, 2020
edited
Loading

#2195 I have done it, I will push out when I passed the memoryview unittest.
#2125 I don't think it works with buffer, we have to do something inside objbytes.
@youknowone

@youknowone
Copy link
Member

youknowone commentedSep 25, 2020
edited
Loading

@qingshi163 I attached a commit to change Box to plain fnc0ecc35 . Because this is using read-only slot unlike other ones, It needed different storing method. I couldn't catch it on the previous comments.

@@ -376,3 +381,14 @@ impl PyComparisonOp {
}
}
}

#[pyimpl]
pub trait Bufferable: PyValue {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think the original nameAsBuffer you used was better. "hashable object" is a term in python(or cpython?) but bufferable is not.

Though "comparable object" is also nothing in python.@coolreader18 any idea about naming?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think AsBuffer is good, "buffer" isn't a verb so it doesn't really make sense asVerbable and there's precedence forAsNoun as a trait name in Rust, e.g.AsRawFd,AsRef

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

AsBuffer is good, but the trait function name I changed it to get_buffer as same as CPython, also because it did return a new BufferProtocol. so we should go back to AsBuffer or something like GetBuffer? what we perfer?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I personally think we can keep it asAsBuffer

Copy link
Member

@youknowoneyouknowoneSep 26, 2020
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That sounds like the implementation trait name can beBuffer and the desriptor trait name can beBufferProtocol. Like,

  • AsBuffer(Bufferable) -> BufferProtocol
  • BufferProtocol -> Buffer

Also I personally prefer to reuse CPython terms if there is no special reason not to do. Because we are practically read and port CPython code everyday. So always +1 for following CPython naming.

coolreader18 and qingshi163 reacted with thumbs up emoji
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That makes sense, since a trait is inherently a "protocol", so there's no need to specify that stuff that a buffer can do is a protocol; it's redundant

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

That sounds good!

Comment on lines +31 to +65

fn as_contiguous(&self) -> Option<BorrowedValue<[u8]>> {
let options = self.get_options();
if !options.contiguous {
return None;
}
Some(self.obj_bytes())
}

fn as_contiguous_mut(&self) -> Option<BorrowedValueMut<[u8]>> {
let options = self.get_options();
if !options.contiguous {
return None;
}
Some(self.obj_bytes_mut())
}

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

the consumer should call as_contiguous if they want just a plain buffer. Should we also need to add a function here for consuming a non-contiguous buffer like a sliced memoryview? Like a iter() return the unpacked object?

Comment on lines 24 to 48
pub trait BufferProtocol: Debug + PyThreadingConstraint {
// TODO: return reference to avoid copy
fn get_options(&self) -> BufferOptions;
fn obj_bytes(&self) -> BorrowedValue<[u8]>;
fn obj_bytes_mut(&self) -> BorrowedValueMut<[u8]>;
fn release(&self);
fn is_resizable(&self) -> bool;
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I renamed as_bytes to obj_bytes because it should always return the full memory range for the original object. as_bytes may easier to be miss using.

Comment on lines 57 to 68
impl FormatCode {
fn unit_size(&self) -> usize {
// XXX: size of l L q Q is platform depended?
match self.code {
'x' | 'c' | 'b' | 'B' | '?' | 's' | 'p' => 1,
'h' | 'H' => 2,
'i' | 'l' | 'I' | 'L' | 'f' => 4,
'q' | 'Q' | 'd' => 8,
'i' | 'I' | 'f' => 4,
'l' | 'L' | 'q' | 'Q' | 'd' => 8,
'n' | 'N' | 'P' => std::mem::size_of::<usize>(),
c => {
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I have to change the unit size for 'l' and 'L' to 8, because that is how array.array doing. I don't know is any compatible problem, but in my machine CPython 'l' and 'L' is 8 bytes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

isn't it architecture dependent value?

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

    /* Integers */    case 'h':        intsize = sizeof(short);        is_signed = 1;        break;    case 'H':        intsize = sizeof(short);        is_signed = 0;        break;    case 'i':        intsize = sizeof(int);        is_signed = 1;        break;    case 'I':        intsize = sizeof(int);        is_signed = 0;        break;    case 'l':        intsize = sizeof(long);        is_signed = 1;        break;    case 'L':        intsize = sizeof(long);        is_signed = 0;        break;    case 'q':        intsize = sizeof(long long);        is_signed = 1;        break;    case 'Q':        intsize = sizeof(long long);        is_signed = 0;        break;

This is from CPython, should we put it somewhere together? because we have to make sure array.array and struct have the same size for the type.

Copy link
Member

@youknowoneyouknowoneSep 26, 2020
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

For common platforms, usingstd::mem::size_of::<isize> or cfg target_pointer_width work. It fits for common platforms but not perfect. using libc always will fit as it defined in CPython. Trystd::mem::size_of::<libc::c_int>

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think we should do it in another PR and try unify the reflect from format code to size and the type.

youknowone reacted with thumbs up emoji
Comment on lines 168 to 169
drop(s.1);
read_rwlock(s.0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Suggested change
drop(s.1);
read_rwlock(s.0)
s.1

Self::Ref(r) => BorrowedValue::Ref(f(r)),
Self::MuLock(m) => BorrowedValue::MappedMuLock(PyMutexGuard::map(m, |x| unsafe {
#[allow(mutable_transmutes, clippy::transmute_ptr_to_ptr)]
std::mem::transmute(f(x))
Copy link
Member

@coolreader18coolreader18Sep 26, 2020
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This isn't sound -- somebody could match to get the mutex guard out of the enum, and then treat it as mutable. I ran into the same problem, and I'm not sure exactly what the solution is. Maybe try something in cell? PyImmutableMappedMutexGuard or something?

Copy link
Member

@coolreader18coolreader18Sep 26, 2020
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I could try making that later today if you don't feel confident about it; I think it would be like{ data: *const T, raw: &'a parking_lot::RawMutex, _marker: PhantomData<(&'a T, RawMutex::GuardMarker)> }

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

If you can fix that will be perfect, I have no clue to get it soundless..

coolreader18 reacted with thumbs up emoji
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

@coolreader18 will you merge#2239 before or after the pr?

@qingshi163qingshi163force-pushed thebuffer_protocol branch 2 times, most recently from3403be7 to9e5032fCompareOctober 1, 2020 09:39
@coolreader18
Copy link
Member

Oooh, here's an idea:Buffer could requirePyObjectPayload as a supertrait, and thenget_buffer() could return aPyObjectRc<dyn Buffer>, and then the Buffer and theobj in the memoryview could be the same thing.

@qingshi163
Copy link
ContributorAuthor

But the obj and the buffer in the memoryview could be different, the obj should always point to the original data source and the buffer is the controller shows how can we visit the data. That is how I implement it as the memoryview build from a memoryview the obj is cloned from original object.

@qingshi163qingshi163 marked this pull request as ready for reviewOctober 1, 2020 18:45
@qingshi163qingshi163 changed the title[WIP] Implement Buffer ProtocolImplement Buffer ProtocolOct 1, 2020
@qingshi163
Copy link
ContributorAuthor

I want to know if it is ok to merge, it is not fully completed but works.

@coolreader18
Copy link
Member

Oh, I suppose that's true. Nevermind

@@ -114,11 +112,13 @@ def setitem(key, value):
self.assertRaises(TypeError, setitem, "a", b"a")
# Not implemented: multidimensional slices
slices = (slice(0,1,1), slice(0,1,2))
self.assertRaises(NotImplementedError, setitem, slices, b"a")
# TODO: RUSTPYTHON
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I know this PR enabled a lot of parts from this test, but unless it is fully resolved, I prefer to keep the whole test as expectedFailure before merge. this is only trackable by code, but expectedFailure is trackable by test result. Once the feature is fully done, it will be detected if it is expectedFailiure but not if it is commented out.

@@ -6,7 +6,7 @@
a = memoryview(obj)
assert a.obj == obj

assert a[2:3] == b"c"
#assert a[2:3] == b"c"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

is this intended?

l @ PyList => l.to_byte_inner(vm),
// TODO: PyTyple
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

PyTuple?

@qingshi163qingshi163force-pushed thebuffer_protocol branch 2 times, most recently from6c72e69 to9c3d97bCompareOctober 2, 2020 13:27
Comment on lines 263 to 271
op: PyComparisonOp,
vm: &VirtualMachine,
) -> PyComparisonValue {
// TODO: bytes can compare with any object implemented buffer protocol
// but not memoryview, and not equal if compare with unicode str(PyStr)
PyComparisonValue::from_option(
try_bytes_like(vm, other, |other| {
op.eval_ord(self.elements.as_slice().cmp(other))
Copy link
ContributorAuthor

@qingshi163qingshi163Oct 2, 2020
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Now we have different behavour between (memoryview op bytes) and (bytes op memoryview), Is any solution better than check the type here?
@coolreader18

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Does memory view not implement the buffer protocol? I think that would maybe fix the issue(?)

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

But memoryview is implemented buffer protocol, we will have much more issue if not so.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Huh, that's strange, I think that's probably fine for now.

Comment on lines 49 to 54
// TODO: generic way from &[PyObjectRef]
l @ PyList => l.to_byte_inner(vm),
t @ PyTuple => t.to_bytes_inner(vm),
obj => {
let iter = vm.get_method_or_type_error(obj.clone(), "__iter__", || {
format!("a bytes-like object is required, not {}", obj.class())
Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think we could have a trait for the objects that can be borrowed as &[PyObjectRef], so we can have a generic and efficiency way to iter.

coolreader18 reacted with thumbs up emoji
@qingshi163
Copy link
ContributorAuthor

What that random ci fail comes from?

@qingshi163qingshi163force-pushed thebuffer_protocol branch 2 times, most recently fromdcca305 to4e08961CompareOctober 7, 2020 15:36
@coolreader18
Copy link
Member

@qingshi163 I rebased this to master with the new mutex/BorrowedValue::map implementations; it put me as a committer for all the commits but hopefully you can just rebase it yourself and it would get rid of that.

@qingshi163
Copy link
ContributorAuthor

@coolreader18 Thanks, I think I need review for merge now.

coolreader18 reacted with thumbs up emoji

Copy link
Member

@coolreader18coolreader18 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I think this overall looks really good!! I'm fine with merging it and fixing leftover issues later, since this is so big already.@youknowone what do you think?

@coolreader18
Copy link
Member

And don't worry about that CI failure; I can fix that when I merge.

qingshi163 reacted with thumbs up emoji

@youknowone
Copy link
Member

oops, after a few days of resting, I totally forgot to merge this first before other PRs. I agree to merge this and fix other stuffs later. Let me fix the conflicts.

qingshi163 and coolreader18 reacted with thumbs up emoji

@youknowone
Copy link
Member

youknowone commentedOct 10, 2020
edited
Loading

I fixed a few commit messages because fisrt two 'fix test' commits are not fixing test but disabling it.
I also reordered commits a bit to merge "update memoryview eq"s to be single commit and also "Implement buffer protocol" and "Rename to AsBuffer". The contents of them are not changed except for conflict resolution.

@youknowoneyouknowone merged commit54cfdf2 intoRustPython:masterOct 10, 2020
@qingshi163qingshi163 deleted the buffer_protocol branchDecember 7, 2020 06:20
Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@youknowoneyouknowoneyouknowone left review comments

@coolreader18coolreader18coolreader18 approved these changes

Assignees
No one assigned
Labels
None yet
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

3 participants
@qingshi163@youknowone@coolreader18

[8]ページ先頭

©2009-2025 Movatter.jp