Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

bpo-24905: Support BLOB incremental I/O in sqlite module#271

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Closed
palaviv wants to merge18 commits intopython:mainfrompalaviv:issue-24905-sqlite-blob

Conversation

palaviv
Copy link
Contributor

@palavivpalaviv commentedFeb 24, 2017
edited by bedevere-bot
Loading

This PR adds support in BLOB incremental I/O at the sqlite module. As asked by@serhiy-storchaka and@berkerpeksag I will try to get some more developers to give their input on the wanted API. I am tagging some people that are active in the ghaering/pysqlite and rogerbinns/apsw.
@ghaering,@rianhunter,@rogerbinns,@phdru. Please look at the PR and give your notes.

https://bugs.python.org/issue24905

bozhinov, graingert, erlend-aasland, and chrisprobst reacted with thumbs up emojiSomeoneSerge, chrisprobst, and DavidBuchanan314 reacted with heart emoji
@rogerbinns
Copy link

The APSW doc for reference is athttps://rogerbinns.github.io/apsw/blob.html

Does havinglen make sense? Files don't have that method. It is also confusing - should len return the size from the current seek offset?

The documentation should make clearer that you cannot change the size of a blob, and mention zeroblob as the means to make a blob in a query without having to fill it in.

It may be worth mentioning that another approach is to store large data in a file, and only store the filename in the database. (This comes up on the sqlite-users mailing list quite a lot.)

@phdru
Copy link

I cannot remember I ever was in need to read/write a part of a BLOB; it was always "all or nothing" for me. So I never used BLOB APIs; instead I always SELECT/INSERT/UPDATE BLOB columns; in Postgres they are not even BLOB columns — I always use BYTEA type.

So I'm -0 on exposing BLOB API for SQLite.

@rogerbinns
Copy link

@phdru SQLite is the same with regular queries: you can only read or write blobs in their entirety. That for example means that if you store a 25MB blob then you must read or write 25MB at once.

SQLite has the "incremental blob" API for accessing just portions of blobs. The motivation comes from "Lite" in the name - developers use SQLite because it is lighter weight (amongst other reasons). DBAPI doesn't specify incremental blob I/O so only developers intending to use SQLite directly and not another database would use it. Should they be able to?

@phdru
Copy link

-0 from me means: I don't care and if there will be such an API I'm not gonna use it. That's all.

@palaviv
Copy link
ContributorAuthor

Thanks for the input@rogerbinns.

Does having len make sense? Files don't have that method. It is also confusing - should len return the size from the current seek offset?

What is the difference between implementing__len__ to the methodlength APSW blob has?

@rogerbinns
Copy link

@palaviv there is no difference between thevalue returned bylen and length or similar methods. It is however very uncommon to have alen method on file like objects - I couldn't find an example of any! For example StringIO is closest and has nolen. Hence my recommendation to avoidlen in favour of another method name.

@serhiy-storchaka
Copy link
Member

Themmap.mmap() object is an example of file-like object supportinglen().

@rogerbinns
Copy link

@serhiy-storchaka good example. They don'tdocument it though, and there is a size() method although it is returning something slightly different. There also seems to be a correlation between types that havelen and those that can you can array access.

In any event my recommendation is to avoid breaking new ground with alen method since that seems not to be normal practise for this kind of thing that provides a file like interface.

@palaviv
Copy link
ContributorAuthor

I actually think that we should use__len__ as by thedefinition this is the length of the object. TheBlob object is a representation of the BLOB and that is the BLOB length.

@serhiy-storchaka
Copy link
Member

Pull request conversation is purposed for discussing the code.

It would be better to continue the design discussion on the bug tracker or mailing list.

@palaviv
Copy link
ContributorAuthor

@serhiy-storchaka I have implemented the sequence protocol but I have a few questions:

  1. Do I need bothPySequenceMethods.sq_item,PySequenceMethods.sq_ass_item andPyMappingMethods.mp_subscript,PyMappingMethods.mp_ass_subscript.
  2. I can't make__contains__ work. Could you point me to how fix that?

@palaviv
Copy link
ContributorAuthor

I think that thecontains operation should not be supported for blobs. As blobs can be very large looking for a subset of bytes inside them will be a very inefficient process in memory or in compute.

@palavivpalaviv requested a review froma team as acode ownerApril 18, 2018 13:32
The BLOB size cannot be changed using the :class:`Blob` class. Use
``zeroblob`` to create the blob in the wanted size in advance.

.. versionadded:: 3.7

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

3.8

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Done

Blob Objects
------------

.. versionadded:: 3.7

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

3.8

Copy link
ContributorAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Done

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

you can retarget this for py 3.9

@matrixise
Copy link
Member

Hi@palaviv

Would you be interested to upgrade your PR to the last master?

Thank you

@palavivpalavivforce-pushed theissue-24905-sqlite-blob branch 2 times, most recently fromefac873 to765545eCompareMay 9, 2019 09:55
Copy link

@auvipyauvipy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Retarget for python 3.9

Blob Objects
------------

.. versionadded:: 3.8

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

3.9

@eamanu
Copy link
Contributor

Hi@palaviv There is some plan with this PR?
Was open 3 years ago. Are you still interest on this patch?
If yes, could you fix the conflict. please?

@palaviv
Copy link
ContributorAuthor

Hi@eamanu,
This patch exist since January 2016 and I kind of given up on it ever going into CPython as there is no core developer that works on sqlite. I would recommend you to use apsw that support this feature. In case any core developer would be in interested in working on this I would gladly fix any needed changes.

@simonw
Copy link
Contributor

I'd love to see this land in Python. I think there's a strong case for it: SQLite lets you store up to 2GB of data in a BLOB, and reading an entire 2GB value into memory at once isn't nearly as pleasant as reading it incrementally, which is what this would let us do.

saulshanabrook, SomeoneSerge, and DavidBuchanan314 reacted with thumbs up emoji

@palaviv
Copy link
ContributorAuthor

Thanks for the review@berkerpeksag. I have made the requested changes; please review again.

bozhinov reacted with thumbs up emoji

@bedevere-bot
Copy link

Thanks for making the requested changes!

@berkerpeksag: please review the changes made to this pull request.

@nightlark
Copy link
Contributor

nightlark commentedAug 18, 2021
edited
Loading

Other than rebasing (due new conflicts arising over time), is there anything that can be done to help move this PR along?

(@palaviv do you want to do the rebase? if you'd like or are too busy I can do the rebase, though I'd need to open a new PR since I don't think I can modify yours)

erlend-aasland reacted with thumbs up emoji

@erlend-aasland
Copy link
Contributor

erlend-aasland commentedAug 18, 2021
edited
Loading

@nightlark,@palaviv: Here' a short list from the top-of-my head of what is needed to rebase this ontomain:

  • the test suite has been normalized; we now use snake casetest_foo_bar method names
  • Argument Clinic
  • use heap types iso. static types
  • exception types are accessed through the (temporary) global state; for Connection objects, it's available throughself->state

If you want to try to land this, Ryan, please give Aviv a week or so to respond before opening a new PR :)

nightlark reacted with thumbs up emoji

@nightlark
Copy link
Contributor

@erlend-aasland Okay — I think I understand how to use argument clinic. Is there a guide to what iso. static types (or heap types)? Is the iso. a prefix for the types or an abbreviation? If it’s an abbreviation maybe that’s the missing a search term I should be using to find relevant resources.

erlend-aasland reacted with thumbs up emoji

@erlend-aasland
Copy link
Contributor

erlend-aasland commentedAug 20, 2021
edited
Loading

@erlend-aasland Okay — I think I understand how to use argument clinic.

Great! AC is nice once you get into it. Feel free to ask if you get stuck :)

Is there a guide to what iso. static types (or heap types)?

There's some info in thedocs, but you can also check the PR's that converted the existing types:

Don't hesitate to ask if you need more pointers.

Is the iso. a prefix for the types or an abbreviation?

Sorry, it's an abbreviation:instead of :) I have the bad habit of using it too much.

@erlend-aasland
Copy link
Contributor

Regarding heap types, take a look at Victor's blog:https://vstinner.github.io/isolate-subinterpreters.html

@erlend-aasland
Copy link
Contributor

@nightlark Are you still working on this PR?

@CoolCat467
Copy link

It's 2022 now woooo

@dholth
Copy link
Contributor

+1

@github-actions
Copy link

This PR is stale because it has been open for 30 days with no activity.

@github-actionsgithub-actionsbot added the staleStale PR or inactive for long period of time. labelMar 19, 2022
erlend-aasland pushed a commit to erlend-aasland/cpython that referenced this pull requestApr 14, 2022
erlend-aasland pushed a commit to erlend-aasland/cpython that referenced this pull requestApr 14, 2022
@JelleZijlstra
Copy link
Member

I just merged#30680, a simplified version of this PR. Blobs will be in Python 3.11.

saulshanabrook and DavidBuchanan314 reacted with hooray emoji

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Reviewers

@ZackerySpytzZackerySpytzZackerySpytz left review comments

@auvipyauvipyauvipy requested changes

@berkerpeksagberkerpeksagAwaiting requested review from berkerpeksagberkerpeksag is a code owner

Assignees
No one assigned
Labels
awaiting change reviewstaleStale PR or inactive for long period of time.
Projects
None yet
Milestone
No milestone
Development

Successfully merging this pull request may close these issues.

19 participants
@palaviv@rogerbinns@phdru@serhiy-storchaka@matrixise@eamanu@simonw@berkerpeksag@bedevere-bot@nightlark@erlend-aasland@CoolCat467@dholth@JelleZijlstra@auvipy@ZackerySpytz@brettcannon@Mariatta@the-knights-who-say-ni

[8]ページ先頭

©2009-2025 Movatter.jp