DEV Community
TaiKedz

Posted on • Edited on

     

Python: open().read()->str , but for big files

Image (C) Tai Kedzierski

We just had a use case where we needed to POST a file over to a server. The naive implementation for posting with requests is to do

    with open("my_file.bin", 'rb') as fh:
        requests.post(url, data={"bytes": fh.read()})

Job done! Well. If the file is reeeally big, that .read() operation will attempt to load the entire file into memory, before passing the loaded bytes to requests.post(...)

Clearly, this is going to hurt. A lot.

Use mmap

A quick search yielded a solution using mmap to create a "memory mapped" object, which would behave like a string, whilst being backed by a file that only gets read in chunks as needed.

As ever, I like making things re-usable, and easy to slot-in. I adapted the example into a contextual object that can be used in-place of a normal call to open()
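To make "behaves like a string" concrete, here is a minimal sketch using only the standard library. The helper names `peek` and `find_marker`, and the throwaway file contents, are made up for illustration; they are not part of the original solution.

```python
import mmap
import os
import tempfile


def peek(path, start, length):
    """Return `length` bytes starting at offset `start`."""
    with open(path, "rb") as fh:
        # Length 0 maps the whole file; ACCESS_READ keeps the mapping read-only.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing a mmap returns a bytes object, just like slicing bytes.
            return mm[start:start + length]


def find_marker(path, marker):
    """Return the offset of `marker` in the file, or -1 if it is absent."""
    with open(path, "rb") as fh:
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # mmap objects offer find(), much like bytes objects do.
            return mm.find(marker)


# Hypothetical demo file: a header, a megabyte of padding, then a trailer.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"HEAD" + b"\0" * 1_000_000 + b"TAIL")
    demo_path = tmp.name

print(peek(demo_path, 0, 4))            # b'HEAD'
print(find_marker(demo_path, b"TAIL"))  # 1000004
os.remove(demo_path)
```

The operating system pages the file in as the mapping is touched, so `peek` does not need the whole file in memory to return four bytes.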

    # It's a tiny snippet, but go on.
    # Delivered to You under MIT Expat License, aka "Do what you want"
    # I'm not even fussy about attribution.
    import mmap

    class StringyFileReader:
        def __init__(self, file_name, mode):
            if mode not in ("r", "rb"):
                raise ValueError(f"Invalid mode '{mode}'. Only read-modes are supported")

            self._fh = open(file_name, mode)
            # A length of 0 means "whatever the size of the file actually is".
            # Python's mmap accepts 0 on both Unix and Windows; an empty file cannot be mapped.
            fsize = 0
            self._mmap = mmap.mmap(self._fh.fileno(), fsize, access=mmap.ACCESS_READ)

        def __enter__(self):
            return self

        def read(self):
            return self._mmap

        def __exit__(self, *args):
            self._mmap.close()
            self._fh.close()

Which then lets us simply tweak the original naive example to:

    with StringyFileReader("my_file.bin", 'rb') as fh:
        requests.post(url, data={"bytes": fh.read()})

Job. Done.

EDIT: we've discovered through further use that requests is pretty stupid. It still tries to read the entire file into memory - possibly by doing a copy of the "string" it receives during one of its internal operations. So this solution seems to only stand in limited cases...
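One likely cause: a dict passed as data= gets form-encoded, which means building the whole request body in memory first. A possible workaround, sketched here under assumptions rather than tested against our server, is to hand over the file in pieces. The helper name `iter_chunks` and the 64 KiB chunk size are made up for illustration.

```python
import os
import tempfile


def iter_chunks(path, chunk_size=64 * 1024):
    """Yield the file's bytes in pieces of at most `chunk_size` bytes."""
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                # An empty read means end of file.
                break
            yield chunk


# Hypothetical demo file, just to show that only one chunk is held at a time.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 200_000)
    demo_path = tmp.name

sizes = [len(chunk) for chunk in iter_chunks(demo_path)]
print(sizes)  # [65536, 65536, 65536, 3392]
os.remove(demo_path)
```

The requests documentation describes streaming uploads when data= is given an open file object, and chunk-encoded requests when it is given a generator, so something like requests.post(url, data=iter_chunks("my_file.bin")) should avoid the full read. Note that this changes what the server receives: the raw file bytes as the body, not a form field named "bytes", so it only fits if the receiving end accepts that.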

Top comments (4)

xtofl

That's so terribly simple!

I was anticipating some multipart chunked transferring, but this makes excellent use of the machinery offered by the operating system.

Ever considered using contextlib?

    import contextlib
    import mmap

    @contextlib.contextmanager
    def readmm(file_name, mode):
        _fh = open(file_name, mode)
        fsize = 0
        _mmap = mmap.mmap(_fh.fileno(), fsize, access=mmap.ACCESS_READ)
        try:
            yield _mmap
        finally:
            _fh.close()
            _mmap.close()

    with readmm("my_file.bin", "rb") as data:
        requests.post(url, data={"bytes": data})
TaiKedz

That does make it even more concise!

That said, I'm not sure how I feel about the enter/exit context being implicit behind this; as in, it reduces the amount of code, but I feel that reading it back is unintuitive.

xtofl

Mind, it's not implicit! It's extracted into the @contextmanager function.

I respect that you phrase it as unintuitive, since intuition is learned. Indeed, to (very) many, the extracted form of the code sandwich is intuitive. You can observe the movement from explicit sandwiches to extracted in many languages (e.g. Scope.Exit, using in C#).

The power it brings is that the developer cannot possibly forget to clean up, so the reader doesn't have to wonder whether they did. Assuming code is read 10 times more than it is written, inner peace will be your part after growing this intuition.

TaiKedz

Indeed. I guess it's something I just have to get used to - can be regarded as analogous to the with keyword which, unless you've learned and used it properly, can look oddly incomplete.
