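io: Core tools for working with streams

Source code: Lib/io.py


Overview

The io module provides Python's main facilities for dealing with various types of I/O. There are three main types of I/O: text I/O, binary I/O and raw I/O. These are generic categories, and various backing stores can be used for each of them. A concrete object belonging to any of these categories is called a file object. Other common terms are stream and file-like object.

Independent of its category, each concrete stream object will also have various capabilities: it can be read-only, write-only, or read-write. It can also allow arbitrary random access (seeking forwards or backwards to any location), or only sequential access (for example in the case of a socket or pipe).

All streams are careful about the type of data you give to them. For example, giving a str object to the write() method of a binary stream will raise a TypeError. So will giving a bytes object to the write() method of a text stream.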
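
The type checking described above can be observed with the in-memory stream classes. The following is a minimal sketch; the helper name try_write is hypothetical and only exists for this illustration.

```python
import io

binary_stream = io.BytesIO()
text_stream = io.StringIO()


def try_write(stream, data):
    """Return the name of the TypeError raised by stream.write(data), or None."""
    try:
        stream.write(data)
    except TypeError as exc:
        return type(exc).__name__
    return None


str_into_binary = try_write(binary_stream, "text")    # str into a binary stream
bytes_into_text = try_write(text_stream, b"bytes")    # bytes into a text stream
matching_types = try_write(binary_stream, b"bytes")   # matching type is accepted
```

Only the call whose data type matches the stream category succeeds; the other two are rejected before anything is written.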

Changed in version 3.3: Operations that used to raise IOError now raise OSError, since IOError is now an alias of OSError.

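Text I/O

Text I/O expects and produces str objects. This means that whenever the backing store is natively made of bytes (such as in the case of a file), encoding and decoding of data is made transparently, as well as optional translation of platform-specific newline characters.

The easiest way to create a text stream is with open(), optionally specifying an encoding:

    f = open("myfile.txt", "r", encoding="utf-8")

In-memory text streams are also available as StringIO objects:

    f = io.StringIO("some initial text data")

The text stream API is described in detail in the documentation of TextIOBase.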

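Binary I/O

Binary I/O (also called buffered I/O) expects bytes-like objects and produces bytes objects. No encoding, decoding, or newline translation is performed. This category of streams can be used for all kinds of non-text data, and also when manual control over the handling of text data is desired.

The easiest way to create a binary stream is with open() with 'b' in the mode string:

    f = open("myfile.jpg", "rb")

In-memory binary streams are also available as BytesIO objects:

    f = io.BytesIO(b"some initial binary data:\x00\x01")

The binary stream API is described in detail in the documentation of BufferedIOBase.

Other library modules may provide additional ways to create text or binary streams. See socket.socket.makefile() for example.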

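Raw I/O

Raw I/O (also called unbuffered I/O) is generally used as a low-level building-block for binary and text streams; it is rarely useful to directly manipulate a raw stream from user code. Nevertheless, you can create a raw stream by opening a file in binary mode with buffering disabled:

    f = open("myfile.jpg", "rb", buffering=0)

The raw stream API is described in detail in the documentation of RawIOBase.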

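Text Encoding

The default encoding of TextIOWrapper and open() is locale-specific (locale.getencoding()).

However, many developers forget to specify the encoding when opening text files encoded in UTF-8 (e.g. JSON, TOML, Markdown, etc.) since most Unix platforms use a UTF-8 locale by default. This causes bugs because the locale encoding is not UTF-8 for most Windows users. For example:

    # May not work on Windows when the file contains non-ASCII characters.
    with open("README.md") as f:
        long_description = f.read()

Accordingly, it is highly recommended that you specify the encoding explicitly when opening text files. If you want to use UTF-8, pass encoding="utf-8". To use the current locale encoding, encoding="locale" is supported since Python 3.10.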
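
As a rough illustration of why the explicit encoding matters, the sketch below writes a UTF-8 file into a temporary directory and reads it back twice: once with encoding="utf-8" and once with "cp1252", which stands in here for a non-UTF-8 locale encoding (an assumption made for the example, not a statement about any particular system).

```python
import os
import tempfile

text = "naïve café"

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "README.md")

    # Write and read with the same explicit encoding: the text round-trips.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    with open(path, encoding="utf-8") as f:
        round_tripped = f.read()

    # Decode the same bytes with a different codec: no error is raised,
    # but the non-ASCII characters come back garbled.
    with open(path, encoding="cp1252") as f:
        mis_decoded = f.read()
```

The second read is the dangerous case: it succeeds silently, so the mismatch may only be noticed much later.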

See also

Python UTF-8 Mode

Python UTF-8 Mode can be used to change the default encoding to UTF-8 from the locale-specific encoding.

PEP 686

Python 3.15 will make Python UTF-8 Mode the default.

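Opt-in EncodingWarning

Added in version 3.10: See PEP 597 for more details.

To find where the default locale encoding is used, you can enable the -X warn_default_encoding command line option or set the PYTHONWARNDEFAULTENCODING environment variable, which will emit an EncodingWarning when the default encoding is used.

If you are providing an API that uses open() or TextIOWrapper and passes encoding=None as a parameter, you can use text_encoding() so that callers of the API will emit an EncodingWarning if they don't pass an encoding. However, please consider using UTF-8 by default (i.e. encoding="utf-8") for new APIs.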

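High-level Module Interface

io.DEFAULT_BUFFER_SIZE

An int containing the default buffer size used by the module's buffered I/O classes. open() uses the file's blksize (as obtained by os.stat()) if possible.

io.open(file, mode='r', buffering=-1, encoding=None, errors=None, newline=None, closefd=True, opener=None)

This is an alias for the builtin open() function.

This function raises an auditing event open with arguments path, mode and flags. The mode and flags arguments may have been modified or inferred from the original call.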

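io.open_code(path)

Opens the provided file with mode 'rb'. This function should be used when the intent is to treat the contents as executable code.

path should be a str and an absolute path.

The behavior of this function may be overridden by an earlier call to PyFile_SetOpenCodeHook(). However, assuming that path is a str and an absolute path, open_code(path) should always behave the same as open(path, 'rb'). Overriding the behavior is intended for additional validation or preprocessing of the file.

Added in version 3.8.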

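io.text_encoding(encoding, stacklevel=2, /)

This is a helper function for callables that use open() or TextIOWrapper and have an encoding=None parameter.

This function returns encoding if it is not None. Otherwise, it returns "locale" or "utf-8" depending on UTF-8 Mode.

This function emits an EncodingWarning if sys.flags.warn_default_encoding is true and encoding is None. stacklevel specifies where the warning is emitted. For example:

    def read_text(path, encoding=None):
        encoding = io.text_encoding(encoding)  # stacklevel=2
        # Pass encoding by keyword; the second positional argument of open() is mode.
        with open(path, encoding=encoding) as f:
            return f.read()

In this example, an EncodingWarning is emitted for the caller of read_text().

See Text Encoding for more information.

Added in version 3.10.

Changed in version 3.11: text_encoding() returns "utf-8" when UTF-8 mode is enabled and encoding is None.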

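exception io.BlockingIOError

This is a compatibility alias for the builtin BlockingIOError exception.

exception io.UnsupportedOperation

An exception inheriting OSError and ValueError that is raised when an unsupported operation is called on a stream.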
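
Because of this double inheritance, a handler written for either OSError or ValueError also catches UnsupportedOperation. A small sketch, using fileno() on a BytesIO object (which has no file descriptor) as the unsupported operation:

```python
import io

stream = io.BytesIO(b"data")

try:
    stream.fileno()  # an in-memory stream has no OS-level file descriptor
except io.UnsupportedOperation as exc:
    caught = exc

is_os_error = isinstance(caught, OSError)
is_value_error = isinstance(caught, ValueError)
```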

See also

sys

Contains the standard IO streams: sys.stdin, sys.stdout, and sys.stderr.

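Class hierarchy

The implementation of I/O streams is organized as a hierarchy of classes. First come the abstract base classes (ABCs), which are used to specify the various categories of streams, then concrete classes providing the standard stream implementations.

Note

The abstract base classes also provide default implementations of some methods in order to help implementation of concrete stream classes. For example, BufferedIOBase provides unoptimized implementations of readinto() and readline().

At the top of the I/O hierarchy is the abstract base class IOBase. It defines the basic interface to a stream. Note, however, that there is no separation between reading and writing to streams; implementations are allowed to raise UnsupportedOperation if they do not support a given operation.

The RawIOBase ABC extends IOBase. It deals with the reading and writing of bytes to a stream. FileIO subclasses RawIOBase to provide an interface to files in the machine's file system.

The BufferedIOBase ABC extends IOBase. It deals with buffering on a raw binary stream (RawIOBase). Its subclasses, BufferedWriter, BufferedReader, and BufferedRWPair buffer raw binary streams that are writable, readable, and both readable and writable, respectively. BufferedRandom provides a buffered interface to seekable streams. Another BufferedIOBase subclass, BytesIO, is a stream of in-memory bytes.

The TextIOBase ABC extends IOBase. It deals with streams whose bytes represent text, and handles encoding and decoding to and from strings. TextIOWrapper, which extends TextIOBase, is a buffered text interface to a buffered raw stream (BufferedIOBase). Finally, StringIO is an in-memory stream for text.

Argument names are not part of the specification, and only the arguments of open() are intended to be used as keyword arguments.

The following table summarizes the ABCs provided by the io module: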

ABC | Inherits | Stub Methods | Mixin Methods and Properties

IOBase | (none) | fileno, seek, and truncate | close, closed, __enter__, __exit__, flush, isatty, __iter__, __next__, readable, readline, readlines, seekable, tell, writable, and writelines

RawIOBase | IOBase | readinto and write | Inherited IOBase methods, read, and readall

BufferedIOBase | IOBase | detach, read, read1, and write | Inherited IOBase methods, readinto, and readinto1

TextIOBase | IOBase | detach, read, readline, and write | Inherited IOBase methods, encoding, errors, and newlines
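
To make the split between stub and mixin methods concrete, here is a minimal sketch of a raw stream that implements only readable() and readinto() and relies on the mixin methods for everything else. The class PayloadRaw is hypothetical and exists only for this illustration.

```python
import io


class PayloadRaw(io.RawIOBase):
    """Hypothetical raw stream that serves bytes from a fixed payload."""

    def __init__(self, payload):
        super().__init__()
        self._payload = bytes(payload)
        self._pos = 0

    def readable(self):
        return True

    def readinto(self, b):
        # Copy as many bytes as fit into the caller's buffer; 0 signals EOF.
        n = min(len(b), len(self._payload) - self._pos)
        b[:n] = self._payload[self._pos:self._pos + n]
        self._pos += n
        return n


raw = PayloadRaw(b"line one\nline two\n")
first = raw.read(4)          # mixin read(), built on readinto()
rest = raw.readline()        # mixin readline() from IOBase
remaining = raw.readall()    # mixin readall() from RawIOBase
lines = list(PayloadRaw(b"a\nb\n"))  # iteration via __iter__/__next__
```

The mixin implementations are generic rather than fast, so a real implementation would usually override more of them.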

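I/O Base Classes

class io.IOBase

The abstract base class for all I/O classes.

This class provides empty abstract implementations for many methods that derived classes can override selectively; the default implementations represent a file that cannot be read, written or seeked.

Even though IOBase does not declare read() or write() because their signatures will vary, implementations and clients should consider those methods part of the interface. Also, implementations may raise a ValueError (or UnsupportedOperation) when operations they do not support are called.

The basic type used for binary data read from or written to a file is bytes. Other bytes-like objects are accepted as method arguments too. Text I/O classes work with str data.

Note that calling any method (even inquiries) on a closed stream is undefined. Implementations may raise ValueError in this case.

IOBase (and its subclasses) supports the iterator protocol, meaning that an IOBase object can be iterated over yielding the lines in a stream. Lines are defined slightly differently depending on whether the stream is a binary stream (yielding bytes), or a text stream (yielding character strings). See readline() below.

IOBase is also a context manager and therefore supports the with statement. In this example, file is closed after the with statement's suite is finished, even if an exception occurs:

    with open('spam.txt', 'w') as file:
        file.write('Spam and eggs!')

IOBase provides these data attributes and methods: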

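close()

Flush and close this stream. This method has no effect if the file is already closed. Once the file is closed, any operation on the file (e.g. reading or writing) will raise a ValueError.

As a convenience, it is allowed to call this method more than once; only the first call, however, will have an effect.

closed

True if the stream is closed.

fileno()

Return the underlying file descriptor (an integer) of the stream if it exists. An OSError is raised if the IO object does not use a file descriptor.

flush()

Flush the write buffers of the stream if applicable. This does nothing for read-only and non-blocking streams.

isatty()

Return True if the stream is interactive (i.e., connected to a terminal/tty device).

readable()

Return True if the stream can be read from. If False, read() will raise OSError.

readline(size=-1, /)

Read and return one line from the stream. If size is specified, at most size bytes will be read.

The line terminator is always b'\n' for binary files; for text files, the newline argument to open() can be used to select the line terminator(s) recognized.

readlines(hint=-1, /)

Read and return a list of lines from the stream. hint can be specified to control the number of lines read: no more lines will be read if the total size (in bytes/characters) of all lines so far exceeds hint.

hint values of 0 or less, as well as None, are treated as no hint.

Note that it's already possible to iterate on file objects using for line in file: ... without calling file.readlines().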

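seek(offset, whence=os.SEEK_SET, /)

Change the stream position to the given byte offset, interpreted relative to the position indicated by whence, and return the new absolute position. Values for whence are:

  • os.SEEK_SET or 0 -- start of the stream (the default); offset should be zero or positive

  • os.SEEK_CUR or 1 -- current stream position; offset may be negative

  • os.SEEK_END or 2 -- end of the stream; offset is usually negative

Added in version 3.1: The SEEK_* constants.

Added in version 3.3: Some operating systems could support additional values, like os.SEEK_HOLE or os.SEEK_DATA. The valid values for a file could depend on it being open in text or binary mode.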
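
The three standard whence values can be tried out on an in-memory binary stream. This is only a sketch with BytesIO; other stream types may restrict which combinations are allowed.

```python
import io
import os

stream = io.BytesIO(b"0123456789")

pos_set = stream.seek(3, os.SEEK_SET)    # absolute position from the start
chunk = stream.read(2)                   # reads b"34", position moves to 5
pos_cur = stream.seek(-4, os.SEEK_CUR)   # relative to the current position
pos_end = stream.seek(-2, os.SEEK_END)   # relative to the end of the stream
tail = stream.read()
```

Each seek() call returns the new absolute position, so the return value can be used instead of a separate tell() call.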

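seekable()

Return True if the stream supports random access. If False, seek(), tell() and truncate() will raise OSError.

tell()

Return the current stream position.

truncate(size=None, /)

Resize the stream to the given size in bytes (or the current position if size is not specified). The current stream position isn't changed. This resizing can extend or reduce the current file size. In case of extension, the contents of the new file area depend on the platform (on most systems, additional bytes are zero-filled). The new file size is returned.

Changed in version 3.5: Windows will now zero-fill files when extending.

writable()

Return True if the stream supports writing. If False, write() and truncate() will raise OSError.

writelines(lines, /)

Write a list of lines to the stream. Line separators are not added, so it is usual for each of the lines provided to have a line separator at the end.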
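
A short sketch of the point about separators, using StringIO so the result can be inspected directly:

```python
import io

words = ["alpha", "beta"]

# Without trailing separators the items are simply concatenated.
joined = io.StringIO()
joined.writelines(words)
joined_value = joined.getvalue()

# The caller is responsible for supplying the line separators.
separated = io.StringIO()
separated.writelines(word + "\n" for word in words)
separated_value = separated.getvalue()
```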

__del__()

Prepare for object destruction. IOBase provides a default implementation of this method that calls the instance's close() method.

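class io.RawIOBase

Base class for raw binary streams. It inherits from IOBase.

Raw binary streams typically provide low-level access to an underlying OS device or API, and do not try to encapsulate it in high-level primitives (this functionality is done at a higher level in buffered binary streams and text streams, described later in this page).

RawIOBase provides these methods in addition to those from IOBase:

read(size=-1, /)

Read up to size bytes from the object and return them. As a convenience, if size is unspecified or -1, all bytes until EOF are returned. Otherwise, only one system call is ever made. Fewer than size bytes may be returned if the operating system call returns fewer than size bytes.

If 0 bytes are returned, and size was not 0, this indicates end of file. If the object is in non-blocking mode and no bytes are available, None is returned.

The default implementation defers to readall() and readinto().

readall()

Read and return all the bytes from the stream until EOF, using multiple calls to the stream if necessary.

readinto(b, /)

Read bytes into a pre-allocated, writable bytes-like object b, and return the number of bytes read. For example, b might be a bytearray. If the object is in non-blocking mode and no bytes are available, None is returned.

write(b, /)

Write the given bytes-like object, b, to the underlying raw stream, and return the number of bytes written. This can be less than the length of b in bytes, depending on specifics of the underlying raw stream, and especially if it is in non-blocking mode. None is returned if the raw stream is set not to block and no single byte could be readily written to it. The caller may release or mutate b after this method returns, so the implementation should only access b during the method call.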

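class io.BufferedIOBase

Base class for binary streams that support some kind of buffering. It inherits from IOBase.

The main difference with RawIOBase is that methods read(), readinto() and write() will try (respectively) to read as much input as requested or to consume all given output, at the expense of making perhaps more than one system call.

In addition, those methods can raise BlockingIOError if the underlying raw stream is in non-blocking mode and cannot take or give enough data; unlike their RawIOBase counterparts, they will never return None.

Besides, the read() method does not have a default implementation that defers to readinto().

A typical BufferedIOBase implementation should not inherit from a RawIOBase implementation, but wrap one, like BufferedWriter and BufferedReader do.

BufferedIOBase provides or overrides these data attributes and methods in addition to those from IOBase:

raw

The underlying raw stream (a RawIOBase instance) that BufferedIOBase deals with. This is not part of the BufferedIOBase API and may not exist in some implementations.

detach()

Separate the underlying raw stream from the buffer and return it.

After the raw stream has been detached, the buffer is in an unusable state.

Some buffers, like BytesIO, do not have the concept of a single raw stream to return from this method. They raise UnsupportedOperation.

Added in version 3.1.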

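read(size=-1, /)

Read and return up to size bytes. If the argument is omitted, None, or negative, data is read and returned until EOF is reached. An empty bytes object is returned if the stream is already at EOF.

If the argument is positive, and the underlying raw stream is not interactive, multiple raw reads may be issued to satisfy the byte count (unless EOF is reached first). But for interactive raw streams, at most one raw read will be issued, and a short result does not imply that EOF is imminent.

A BlockingIOError is raised if the underlying raw stream is in non-blocking mode, and has no data available at the moment.

read1(size=-1, /)

Read and return up to size bytes, with at most one call to the underlying raw stream's read() (or readinto()) method. This can be useful if you are implementing your own buffering on top of a BufferedIOBase object.

If size is -1 (the default), an arbitrary number of bytes are returned (more than zero unless EOF is reached).

readinto(b, /)

Read bytes into a pre-allocated, writable bytes-like object b and return the number of bytes read. For example, b might be a bytearray.

Like read(), multiple reads may be issued to the underlying raw stream, unless the latter is interactive.

A BlockingIOError is raised if the underlying raw stream is in non-blocking mode, and has no data available at the moment.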
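
A common use of readinto() is to reuse one pre-allocated buffer across many reads. The sketch below does this against a BytesIO source; the 5-byte buffer size is arbitrary and chosen only to make the chunks visible.

```python
import io

source = io.BytesIO(b"hello world")
buf = bytearray(5)   # reusable, writable bytes-like object
chunks = []

# readinto() returns the number of bytes stored in buf; 0 means EOF.
while (n := source.readinto(buf)):
    # Only the first n bytes are valid for this iteration.
    chunks.append(bytes(buf[:n]))
```

The final chunk is shorter than the buffer, which is why the returned count, not len(buf), has to be used when slicing.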

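readinto1(b, /)

Read bytes into a pre-allocated, writable bytes-like object b, using at most one call to the underlying raw stream's read() (or readinto()) method. Return the number of bytes read.

A BlockingIOError is raised if the underlying raw stream is in non-blocking mode, and has no data available at the moment.

Added in version 3.5.

write(b, /)

Write the given bytes-like object, b, and return the number of bytes written (always equal to the length of b in bytes, since if the write fails an OSError will be raised). Depending on the actual implementation, these bytes may be readily written to the underlying stream, or held in a buffer for performance and latency reasons.

When in non-blocking mode, a BlockingIOError is raised if the data needed to be written to the raw stream but it couldn't accept all the data without blocking.

The caller may release or mutate b after this method returns, so the implementation should only access b during the method call.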

Raw File I/O

class io.FileIO(name, mode='r', closefd=True, opener=None)

A raw binary stream representing an OS-level file containing bytes data. It inherits from RawIOBase.

The name can be one of two things:

  • a character string or bytes object representing the path to the file which will be opened. In this case closefd must be True (the default), otherwise an error will be raised.

  • an integer representing the number of an existing OS-level file descriptor to which the resulting FileIO object will give access. When the FileIO object is closed this fd will be closed as well, unless closefd is set to False.

The mode can be 'r', 'w', 'x' or 'a' for reading (default), writing, exclusive creation or appending. The file will be created if it doesn't exist when opened for writing or appending; it will be truncated when opened for writing. FileExistsError will be raised if it already exists when opened for creating. Opening a file for creating implies writing, so this mode behaves in a similar way to 'w'. Add a '+' to the mode to allow simultaneous reading and writing.

The read() (when called with a positive argument), readinto() and write() methods on this class will only make one system call.

A custom opener can be used by passing a callable as opener. The underlying file descriptor for the file object is then obtained by calling opener with (name, flags). opener must return an open file descriptor (passing os.open as opener results in functionality similar to passing None).

The newly created file is non-inheritable.

See the open() built-in function for examples on using the opener parameter.
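
As a rough sketch of the opener hook, the example below passes a callable that records the (name, flags) pair it receives and then defers to os.open with restrictive permissions. The function name logging_opener, the file name data.bin and the 0o600 mode are assumptions made for this illustration.

```python
import io
import os
import tempfile

opener_calls = []


def logging_opener(name, flags):
    # Record what FileIO passes in, then return an open file descriptor.
    opener_calls.append((name, flags))
    return os.open(name, flags, 0o600)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "data.bin")

    with io.FileIO(path, "w", opener=logging_opener) as f:
        written = f.write(b"abc")

    # Reading back without a custom opener.
    with io.FileIO(path, "r") as f:
        data = f.read()
```

The flags value is computed by FileIO from the mode string, so the opener does not need to parse the mode itself.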

Changed in version 3.3: The opener parameter was added. The 'x' mode was added.

Changed in version 3.4: The file is now non-inheritable.

FileIO provides these data attributes in addition to those from RawIOBase and IOBase:

mode

The mode as given in the constructor.

name

The file name. This is the file descriptor of the file when no name is given in the constructor.

Buffered Streams

Buffered I/O streams provide a higher-level interface to an I/O device than raw I/O does.

class io.BytesIO(initial_bytes=b'')

A binary stream using an in-memory bytes buffer. It inherits from BufferedIOBase. The buffer is discarded when the close() method is called.

The optional argument initial_bytes is a bytes-like object that contains initial data.

BytesIO provides or overrides these methods in addition to those from BufferedIOBase and IOBase:

getbuffer()

Return a readable and writable view over the contents of the buffer without copying them. Also, mutating the view will transparently update the contents of the buffer:

    >>> b = io.BytesIO(b"abcdef")
    >>> view = b.getbuffer()
    >>> view[2:4] = b"56"
    >>> b.getvalue()
    b'ab56ef'

Note

As long as the view exists, the BytesIO object cannot be resized or closed.

Added in version 3.2.

getvalue()

Return bytes containing the entire contents of the buffer.

read1(size=-1, /)

In BytesIO, this is the same as read().

Changed in version 3.7: The size argument is now optional.

readinto1(b, /)

In BytesIO, this is the same as readinto().

Added in version 3.5.

class io.BufferedReader(raw, buffer_size=DEFAULT_BUFFER_SIZE)

A buffered binary stream providing higher-level access to a readable, non-seekable RawIOBase raw binary stream. It inherits from BufferedIOBase.

When reading data from this object, a larger amount of data may be requested from the underlying raw stream, and kept in an internal buffer. The buffered data can then be returned directly on subsequent reads.

The constructor creates a BufferedReader for the given readable raw stream and buffer_size. If buffer_size is omitted, DEFAULT_BUFFER_SIZE is used.

BufferedReader provides or overrides these methods in addition to those from BufferedIOBase and IOBase:

peek(size=0, /)

Return bytes from the stream without advancing the position. At most one single read on the raw stream is done to satisfy the call. The number of bytes returned may be less or more than requested.
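
The following sketch shows peek() leaving the position untouched. For brevity it wraps a BytesIO object as a stand-in for a real raw stream, which is an assumption of this example; in practice the wrapped object would typically be a FileIO.

```python
import io

reader = io.BufferedReader(io.BytesIO(b"abcdef"))

peeked = reader.peek(2)            # may return more (or fewer) than 2 bytes
position_after_peek = reader.tell()

data = reader.read(3)              # the peeked bytes are still available
position_after_read = reader.tell()
```

Because the returned length is not guaranteed, callers that need exactly two bytes should slice the result, for example peeked[:2].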

read(size=-1, /)

Read and return size bytes, or if size is not given or negative, until EOF or if the read call would block in non-blocking mode.

read1(size=-1, /)

Read and return up to size bytes with only one call on the raw stream. If at least one byte is buffered, only buffered bytes are returned. Otherwise, one raw stream read call is made.

Changed in version 3.7: The size argument is now optional.

class io.BufferedWriter(raw, buffer_size=DEFAULT_BUFFER_SIZE)

A buffered binary stream providing higher-level access to a writable, non-seekable RawIOBase raw binary stream. It inherits from BufferedIOBase.

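When writing to this object, data is normally placed into an internal buffer. The buffer will be written out to the underlying RawIOBase object under various conditions, including:

  • when the buffer gets too small for all pending data;

  • when flush() is called;

  • when a seek() is requested (for BufferedRandom objects);

  • when the BufferedWriter object is closed or destroyed.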

The constructor creates a BufferedWriter for the given writable raw stream. If the buffer_size is not given, it defaults to DEFAULT_BUFFER_SIZE.

BufferedWriter provides or overrides these methods in addition to those from BufferedIOBase and IOBase:

flush()

Force bytes held in the buffer into the raw stream. A BlockingIOError should be raised if the raw stream blocks.

write(b, /)

Write the bytes-like object, b, and return the number of bytes written. When in non-blocking mode, a BlockingIOError is raised if the buffer needs to be written out but the raw stream blocks.

class io.BufferedRandom(raw, buffer_size=DEFAULT_BUFFER_SIZE)

A buffered binary stream providing higher-level access to a seekable RawIOBase raw binary stream. It inherits from BufferedReader and BufferedWriter.

The constructor creates a reader and writer for a seekable raw stream, given in the first argument. If the buffer_size is omitted it defaults to DEFAULT_BUFFER_SIZE.

BufferedRandom is capable of anything BufferedReader or BufferedWriter can do. In addition, seek() and tell() are guaranteed to be implemented.

class io.BufferedRWPair(reader, writer, buffer_size=DEFAULT_BUFFER_SIZE, /)

A buffered binary stream providing higher-level access to two non-seekable RawIOBase raw binary streams, one readable and the other writable. It inherits from BufferedIOBase.

reader and writer are RawIOBase objects that are readable and writable respectively. If the buffer_size is omitted it defaults to DEFAULT_BUFFER_SIZE.

BufferedRWPair implements all of BufferedIOBase's methods except for detach(), which raises UnsupportedOperation.

Warning

BufferedRWPair does not attempt to synchronize accesses to its underlying raw streams. You should not pass it the same object as reader and writer; use BufferedRandom instead.

Text I/O

class io.TextIOBase

Base class for text streams. This class provides a character and line based interface to stream I/O. It inherits from IOBase.

TextIOBase provides or overrides these data attributes and methods in addition to those from IOBase:

encoding

The name of the encoding used to decode the stream's bytes into strings, and to encode strings into bytes.

errors

The error setting of the decoder or encoder.

newlines

A string, a tuple of strings, or None, indicating the newlines translated so far. Depending on the implementation and the initial constructor flags, this may not be available.

buffer

The underlying binary buffer (a BufferedIOBase instance) that TextIOBase deals with. This is not part of the TextIOBase API and may not exist in some implementations.

detach()

Separate the underlying binary buffer from the TextIOBase and return it.

After the underlying buffer has been detached, the TextIOBase is in an unusable state.

Some TextIOBase implementations, like StringIO, may not have the concept of an underlying buffer and calling this method will raise UnsupportedOperation.

Added in version 3.1.

read(size=-1, /)

Read and return at most size characters from the stream as a single str. If size is negative or None, reads until EOF.

readline(size=-1, /)

Read until newline or EOF and return a single str. If the stream is already at EOF, an empty string is returned.

If size is specified, at most size characters will be read.

seek(offset, whence=SEEK_SET, /)

Change the stream position to the given offset. Behaviour depends on the whence parameter. The default value for whence is SEEK_SET.

  • SEEK_SET or 0: seek from the start of the stream (the default); offset must either be a number returned by TextIOBase.tell(), or zero. Any other offset value produces undefined behaviour.

  • SEEK_CUR or 1: "seek" to the current position; offset must be zero, which is a no-operation (all other values are unsupported).

  • SEEK_END or 2: seek to the end of the stream; offset must be zero (all other values are unsupported).

Return the new absolute position as an opaque number.

Added in version 3.1: The SEEK_* constants.

tell()

Return the current stream position as an opaque number. The number does not usually represent a number of bytes in the underlying binary storage.

write(s, /)

Write the string s to the stream and return the number of characters written.

class io.TextIOWrapper(buffer, encoding=None, errors=None, newline=None, line_buffering=False, write_through=False)

A buffered text stream providing higher-level access to a BufferedIOBase buffered binary stream. It inherits from TextIOBase.

encoding gives the name of the encoding that the stream will be decoded or encoded with. In UTF-8 Mode, this defaults to UTF-8. Otherwise, it defaults to locale.getencoding(). encoding="locale" can be used to specify the current locale's encoding explicitly. See Text Encoding for more information.

errors is an optional string that specifies how encoding and decoding errors are to be handled. Pass 'strict' to raise a ValueError exception if there is an encoding error (the default of None has the same effect), or pass 'ignore' to ignore errors. (Note that ignoring encoding errors can lead to data loss.) 'replace' causes a replacement marker (such as '?') to be inserted where there is malformed data. 'backslashreplace' causes malformed data to be replaced by a backslashed escape sequence. When writing, 'xmlcharrefreplace' (replace with the appropriate XML character reference) or 'namereplace' (replace with \N{...} escape sequences) can be used. Any other error handling name that has been registered with codecs.register_error() is also valid.

newline controls how line endings are handled. It can be None, '', '\n', '\r', and '\r\n'. It works as follows:

  • When reading input from the stream, if newline is None, universal newlines mode is enabled. Lines in the input can end in '\n', '\r', or '\r\n', and these are translated into '\n' before being returned to the caller. If newline is '', universal newlines mode is enabled, but line endings are returned to the caller untranslated. If newline has any of the other legal values, input lines are only terminated by the given string, and the line ending is returned to the caller untranslated.

  • When writing output to the stream, if newline is None, any '\n' characters written are translated to the system default line separator, os.linesep. If newline is '' or '\n', no translation takes place. If newline is any of the other legal values, any '\n' characters written are translated to the given string.
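
The newline rules above can be checked against an in-memory buffer. This is a minimal sketch; the helper read_lines and the sample bytes are made up for the illustration, and the "ascii" encoding is chosen only to keep the example platform-independent.

```python
import io

raw_bytes = b"one\r\ntwo\rthree\n"


def read_lines(newline):
    """Decode raw_bytes with the given newline setting and return the lines."""
    wrapper = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="ascii", newline=newline)
    return wrapper.readlines()


translated = read_lines(None)    # universal newlines, translated to '\n'
untranslated = read_lines("")    # universal newlines, endings kept as-is
only_lf = read_lines("\n")       # only '\n' terminates a line

# When writing, '\n' is translated to the given newline string.
out = io.BytesIO()
writer = io.TextIOWrapper(out, encoding="ascii", newline="\r\n")
writer.write("a\nb\n")
writer.flush()
written = out.getvalue()
```

Note that with newline="\n" the lone '\r' is treated as ordinary data, so "two" and "three" end up on the same line.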

If line_buffering is True, flush() is implied when a call to write contains a newline character or a carriage return.

If write_through is True, calls to write() are guaranteed not to be buffered: any data written on the TextIOWrapper object is immediately handed to its underlying binary buffer.

Changed in version 3.3: The write_through argument has been added.

Changed in version 3.3: The default encoding is now locale.getpreferredencoding(False) instead of locale.getpreferredencoding(). Don't temporarily change the locale encoding using locale.setlocale(); use the current locale encoding instead of the user preferred encoding.

Changed in version 3.10: The encoding argument now supports the "locale" dummy encoding name.

TextIOWrapper provides these data attributes and methods in addition to those from TextIOBase and IOBase:

line_buffering

Whether line buffering is enabled.

write_through

Whether writes are passed immediately to the underlying binary buffer.

Added in version 3.7.

reconfigure(*, encoding=None, errors=None, newline=None, line_buffering=None, write_through=None)

Reconfigure this text stream using new settings for encoding, errors, newline, line_buffering and write_through.

Parameters not specified keep current settings, except errors='strict' is used when encoding is specified but errors is not specified.

It is not possible to change the encoding or newline if some data has already been read from the stream. On the other hand, changing encoding after write is possible.

This method does an implicit stream flush before setting the new parameters.

Added in version 3.7.

Changed in version 3.11: The method supports the encoding="locale" option.
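
A small sketch of reconfigure() on a write-only wrapper, showing that the encoding can change after data has been written and that errors falls back to 'strict' when only encoding is given. The starting settings ("ascii" with errors="replace") are arbitrary choices for the example.

```python
import io

out = io.BytesIO()
writer = io.TextIOWrapper(out, encoding="ascii", errors="replace")

writer.write("abc")                      # encoded as ASCII
writer.reconfigure(encoding="utf-8")     # implicit flush, then switch codec
writer.write("é")                        # encoded as UTF-8
writer.flush()

result = out.getvalue()
new_encoding = writer.encoding
new_errors = writer.errors               # reset because errors was not passed
```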

seek(cookie, whence=os.SEEK_SET, /)

Set the stream position. Return the new stream position as an int.

Four operations are supported, given by the following argument combinations:

  • seek(0, SEEK_SET): Rewind to the start of the stream.

  • seek(cookie, SEEK_SET): Restore a previous position; cookie must be a number returned by tell().

  • seek(0, SEEK_END): Fast-forward to the end of the stream.

  • seek(0, SEEK_CUR): Leave the current stream position unchanged.

Any other argument combinations are invalid, and may raise exceptions.

tell()

Return the stream position as an opaque number. The return value of tell() can be given as input to seek(), to restore a previous stream position.
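
The save-and-restore pattern with tell() and seek() might look like the sketch below. The sample text is made up, and the cookie is deliberately treated as opaque rather than as a character or byte count.

```python
import io

payload = "héllo\nwörld\n".encode("utf-8")
reader = io.TextIOWrapper(io.BytesIO(payload), encoding="utf-8", newline="\n")

first = reader.readline()
cookie = reader.tell()        # opaque position after the first line
second = reader.readline()

reader.seek(cookie)           # restore the saved position
second_again = reader.readline()

reader.seek(0)                # rewind to the start of the stream
first_again = reader.readline()
```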

class io.StringIO(initial_value='', newline='\n')

A text stream using an in-memory text buffer. It inherits from TextIOBase.

The text buffer is discarded when the close() method is called.

The initial value of the buffer can be set by providing initial_value. If newline translation is enabled, newlines will be encoded as if by write(). The stream is positioned at the start of the buffer which emulates opening an existing file in a w+ mode, making it ready for an immediate write from the beginning or for a write that would overwrite the initial value. To emulate opening a file in an a+ mode ready for appending, use f.seek(0, io.SEEK_END) to reposition the stream at the end of the buffer.

The newline argument works like that of TextIOWrapper, except that when writing output to the stream, if newline is None, newlines are written as \n on all platforms.
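
The difference between the default start-of-buffer position and the a+ style emulation can be seen in this short sketch:

```python
import io

# Default: the stream starts at position 0, so writes overwrite the initial value.
overwrite = io.StringIO("hello world")
overwrite.write("HELLO")
overwritten_value = overwrite.getvalue()

# Seek to the end first to append instead.
append = io.StringIO("hello")
append.seek(0, io.SEEK_END)
append.write(" world")
appended_value = append.getvalue()
```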

StringIO provides this method in addition to those from TextIOBase and IOBase:

getvalue()

Return a str containing the entire contents of the buffer. Newlines are decoded as if by read(), although the stream position is not changed.

Example usage:

    import io

    output = io.StringIO()
    output.write('First line.\n')
    print('Second line.', file=output)

    # Retrieve file contents -- this will be
    # 'First line.\nSecond line.\n'
    contents = output.getvalue()

    # Close object and discard memory buffer --
    # .getvalue() will now raise an exception.
    output.close()

class io.IncrementalNewlineDecoder

A helper codec that decodes newlines for universal newlines mode. It inherits from codecs.IncrementalDecoder.

Performance

This section discusses the performance of the provided concrete I/O implementations.

Binary I/O

By reading and writing only large chunks of data even when the user asks for a single byte, buffered I/O hides any inefficiency in calling and executing the operating system's unbuffered I/O routines. The gain depends on the OS and the kind of I/O which is performed. For example, on some modern OSes such as Linux, unbuffered disk I/O can be as fast as buffered I/O. The bottom line, however, is that buffered I/O offers predictable performance regardless of the platform and the backing device. Therefore, it is almost always preferable to use buffered I/O rather than unbuffered I/O for binary data.

Text I/O

Text I/O over a binary storage (such as a file) is significantly slower than binary I/O over the same storage, because it requires conversions between unicode and binary data using a character codec. This can become noticeable when handling huge amounts of text data like large log files. Also, tell() and seek() are both quite slow due to the reconstruction algorithm used.

StringIO, however, is a native in-memory unicode container and will exhibit similar speed to BytesIO.

Multi-threading

FileIO objects are thread-safe to the extent that the operating system calls (such as read(2) under Unix) they wrap are thread-safe too.

Binary buffered objects (instances of BufferedReader, BufferedWriter, BufferedRandom and BufferedRWPair) protect their internal structures using a lock; it is therefore safe to call them from multiple threads at once.

TextIOWrapper objects are not thread-safe.

Reentrancy

Binary buffered objects (instances of BufferedReader, BufferedWriter, BufferedRandom and BufferedRWPair) are not reentrant. While reentrant calls will not happen in normal situations, they can arise from doing I/O in a signal handler. If a thread tries to re-enter a buffered object which it is already accessing, a RuntimeError is raised. Note this doesn't prohibit a different thread from entering the buffered object.

The above implicitly extends to text files, since the open() function will wrap a buffered object inside a TextIOWrapper. This includes standard streams and therefore affects the built-in print() function as well.