File Descriptors in IronPython

Pavel Koneski edited this page Jan 19, 2025 · 2 revisions

Windows

The conceptual picture of file descriptor (FD) usage on Windows, for the most interesting case of FileStream:

graph LR;FileIO --> StreamBox --> FileStream --> Handle(Handle) --> OSFile[OS File];FD(FD) <--> StreamBox;

Conceptually, the relationship between FD (a number) and StreamBox (a class) is bidirectional because PythonFileManager (a global singleton) maintains the association between the two, so obtaining one from the other is essentially cost-free. The FD is not the same as the handle, which is created by the OS. The FD is an emulated (fake) file descriptor, assigned by the PythonFileManager for the purpose of supporting the Python API. The descriptors are allocated lazily, i.e. only if the user code makes an API call that accesses one. Once assigned, the descriptor does not change. The FD number is released once the FD is closed (or the associated FileIO is closed and had closefd set to true).

It is possible to have the structure above without FileIO, for instance when an OS file is opened with one of the low-level functions in os, or when an existing FD is duplicated. It is also possible to associate one FD with several FileIO objects. In such cases it is the responsibility of the user code to ensure that the FD is closed at the right time.
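As a rough user-level illustration (a sketch written against the standard Python API and checked against CPython semantics, not a view into IronPython internals), the following obtains a descriptor without any FileIO and then wraps it in two FileIO objects that do not own it. The helper name share_fd_between_fileio and the use of tempfile.mkstemp are choices made for this example only.

```python
import io
import os
import tempfile


def share_fd_between_fileio():
    # mkstemp returns a low-level descriptor opened for reading and writing;
    # no FileIO object exists at this point.
    fd, path = tempfile.mkstemp()
    try:
        # Two FileIO objects wrap the same descriptor. closefd=False means
        # closing them leaves the descriptor open.
        a = io.FileIO(fd, "r+", closefd=False)
        b = io.FileIO(fd, "r+", closefd=False)
        a.write(b"abc")
        b.write(b"def")  # continues at the position left by the first write
        a.close()
        b.close()
        # The descriptor is still usable; closing it is the caller's job.
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 16)
    finally:
        os.close(fd)
        os.remove(path)
```

Because neither FileIO owns the descriptor, the explicit os.close in the finally clause is what actually releases it.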

When an FD is duplicated (using dup or dup2), the associated StreamBox is duplicated too (there is always a 1-to-1 relationship between FD and StreamBox), but the underlying FileStream object remains the same, as does the underlying OS handle. The new FD may be used to create a FileIO (or several, just as for the original FD). All read/seek/write operations on both descriptors go through the same FileStream object and the same OS handle.

graph LR;FD1(FD1) <--> StreamBox --> FileStream --> Handle(Handle) --> OSFile[OS File];FD2(FD2) <--> StreamBox2[StreamBox] --> FileStream;

The descriptors can be closed independently, and the underlying FileStream is closed when the last StreamBox using it is closed.
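Seen from user code, independent closing can be sketched as follows. This is a minimal example using only os and tempfile from the standard library; the function name is hypothetical, and the behaviour shown is what CPython does and what the design above is meant to reproduce.

```python
import os
import tempfile


def close_original_keep_duplicate():
    fd, path = tempfile.mkstemp()
    fd2 = os.dup(fd)  # second descriptor for the same open file
    try:
        os.write(fd, b"first ")
        os.close(fd)  # closes only the original descriptor
        # The duplicate still refers to the open file and continues
        # writing where the original left off.
        os.write(fd2, b"second")
        os.lseek(fd2, 0, os.SEEK_SET)
        return os.read(fd2, 32)
    finally:
        os.close(fd2)  # last descriptor closed: the file is released
        os.remove(path)
```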

POSIX

On Unix-like systems (Linux, macOS), FileStream uses the actual file descriptor as the handle. In the past, IronPython ignored this and still issued its own fake file descriptors, as it does on Windows. Now, however, the genuine FD is extracted from the handle and used as the FD at the PythonFileManager level, ensuring that clients of the Python API obtain the genuine FD.

graph LR;FileIO --> StreamBox --> FileStream --> FDH(FD) --> OSFile[OS File];FD(FD) <--> StreamBox;

When a file descriptor FD is duplicated, the actual OS call is made to create the duplicate FD2. In order to use FD2 directly, a new Stream object has to be created around it.

Straightforward Mechanism

The straightforward solution is to create another FileStream using the constructor that accepts an already opened file descriptor.

graph LR;FD1(FD1) <--> StreamBox --> FileStream --> FDH1(FD1) --> OSFile[OS File];FD2(FD2) <--> StreamBox2[StreamBox] --> FileStream2[FileStream] --> FDH2(FD2) --> OSFile;

In this way, the file descriptor on the PythonFileManager level is the same as the file descriptor used by FileStream.

Unfortunately, on .NET, for some reason, two FileStream instances using the same file descriptor will have two independent read/write positions. This is not how duplicated file descriptors should work: both descriptors should point to the same file description structure and share the read/seek/write position. In practice, on .NET, writing through the second file object will overwrite data already written through the first file object. In regular Unix applications (incl. CPython), subsequent writes append data, regardless of which file object is used. The same principle should apply to reads.
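The expected behaviour of duplicated descriptors can be checked with a short script. The sketch below uses only standard os calls; the function name dup_shares_position is invented for the example. It shows what CPython does, which is the reference that the two-FileStream arrangement fails to match.

```python
import os
import tempfile


def dup_shares_position():
    fd, path = tempfile.mkstemp()
    fd2 = os.dup(fd)
    try:
        os.write(fd, b"hello")
        os.write(fd2, b"world")  # appends after "hello", does not overwrite
        # Both descriptors report the same position because they share
        # one open file description.
        pos1 = os.lseek(fd, 0, os.SEEK_CUR)
        pos2 = os.lseek(fd2, 0, os.SEEK_CUR)
        # Seeking through one descriptor moves the position for the other.
        os.lseek(fd, 0, os.SEEK_SET)
        data = os.read(fd2, 10)
        return pos1, pos2, data
    finally:
        os.close(fd2)
        os.close(fd)
        os.remove(path)
```

With independent positions, the second write would start at offset 0 and the file would contain only "world".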

Also unfortunately, on Mono, the FileStream constructor accepts only descriptors opened by another call to a FileStream constructor[1]. Descriptors obtained from direct OS calls, such as open, creat, dup, or dup2, are therefore rejected.

Solution on .NET 8+

On .NET, the FileStream that was backing an open FileIO, or an open FD from a direct call to os.open, has been replaced by PosixFileStream. This class operates directly on the given file descriptor, providing unbuffered file access and replicating CPython's behaviour. A duplicated file descriptor therefore looks like the following diagram:

graph LR;FD1(FD1) <--> StreamBox --> PosixFileStream --> FDH1(FD1) --> OSFile[OS File];FD2(FD2) <--> StreamBox2[StreamBox] --> PosixFileStream2[PosixFileStream] --> FDH2(FD2) --> OSFile;

Workaround on .NET 6

The solution on .NET 6 is the same as on .NET 8: PosixFileStream is used instead of FileStream. However, an issue arises when an mmap object is requested for a given FD. The mmap implementation is backed by MemoryMappedFile from the .NET library. On .NET 8, a MemoryMappedFile instance can be created from a given FD. .NET 6 lacks this constructor and only accepts a FileStream (for maps that are backed by a regular file). Therefore, for the purpose of supporting MemoryMappedFile, a dedicated FileStream is created around the given FD. This instance of FileStream is not registered with PythonFileManager but is managed directly by MmapDefault, which implements mmap.

graph LR;FD(FD) <--> StreamBox --> PosixFileStream --> FDH(FD) --> OSFile[OS File];MmapDefault --> FileStream2[FileStream] --> FDH;
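From the Python side, the request that triggers this path is an ordinary mmap.mmap call on a file descriptor. The following is a user-level sketch only (the function name is hypothetical, and nothing here reflects how MmapDefault is written). It maps a file through its FD, modifies it through the map, and reads the result back through the same FD.

```python
import mmap
import os
import tempfile


def map_via_fd():
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"hello mmap")
        # Length 0 maps the whole file; the default access is read/write.
        with mmap.mmap(fd, 0) as m:
            m[0:5] = b"HELLO"
            m.flush()
        # The descriptor stays open and usable after the map is closed.
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, 32)
    finally:
        os.close(fd)
        os.remove(path)
```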

Mono Workaround

To use system-opened file descriptors on Mono, UnixStream could be used instead of FileStream.

graph LR;FD1(FD1) <--> StreamBox --> FileStream --> FDH1(FD1) --> OSFile[OS File];FD2(FD2) <--> StreamBox2[StreamBox] --> UnixStream --> FDH2(FD2) --> OSFile;

Since FileIO works with various types of underlying Stream, using UnixStream should be OK.

Although UnixStream is available in .NET through the Mono.Posix package, this solution still does not work around the desynchronized read/write position, which the FileStream using the original FD1 must somehow maintain independently.

Another problem with using UnixStream is that this class is unsuitable for creating a MemoryMappedFile, which on Mono (as on .NET before 8.0) has to be created from a FileStream (for file-backed mmaps). Therefore, on Mono, FileStream is used as the backing for FileIO and for a naked FD, just as on Windows. The difference from Windows, however, is that PythonFileManager uses actual FDs when managing files, not emulated ones. When those actual descriptors are duplicated, the code first tries to use FileStream to access the duplicated descriptor. This leads to the situation described in the "Straightforward Mechanism" section, with all the caveats listed there. If using FileStream fails, UnixStream is employed, as presented in the diagram above.

As mentioned before, using UnixStream may lead to problems when such an FD is used to create an mmap, but an mmap created on a file opened regularly (not duplicated) will work.

Special Case: Double Stream

In Python, a file can be opened with mode "ab+". The file is opened for appending to the end (and created if it does not exist), and the + means that it is also opened for updating, i.e. reading and writing. The file pointer is initially set at the end of the file (ready to append) but can be moved around to read already existing data. However, each write will append data to the end and reset the read/write pointer to the end again.
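The semantics of this mode can be illustrated with a short script. This is a sketch of the reference behaviour as observed on CPython; the function name is made up, and buffering=0 is used so that the object returned by open is a raw FileIO with no buffering layer in between.

```python
import os
import tempfile


def append_update_demo():
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"12345")
        with open(path, "ab+", buffering=0) as f:
            start = f.tell()   # pointer starts at the end of the file
            f.seek(0)
            head = f.read(3)   # existing data can still be read
            f.write(b"67")     # appended at the end, not at offset 3
            after = f.tell()   # pointer is at the end again
        with open(path, "rb") as f:
            content = f.read()
        return start, head, after, content
    finally:
        os.remove(path)
```

The write lands at the end of the file even though the pointer was at offset 3 when it was issued.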

This opening mode is not supported by FileStream. On platforms that don't rely on FileStream (.NET 6.0+/POSIX), this is not an issue, as PosixFileStream handles it the same way as CPython. On other platforms (Windows on all frameworks, and Mono), mode "ab+" is simulated by using two file streams, one for reading and one for writing. Both are maintained in a single StreamBox but will have different file handles (on Mono: file descriptors).

graph LR;FileIO --> StreamBox --> FileStreamR["FileStream (R)"] --> HandleR("Handle (R)") --> OSFile[OS File];StreamBox --> FileStreamW["FileStream (W)"] --> HandleW("Handle (W)") --> OSFile;FD(FD) <--> StreamBox;

On Windows, since the file descriptor is emulated, this does not create problems. The question might arise which FileStream should be used as the backing for MemoryMappedFile, but it is not relevant, since a file opened in mode "a" is not suitable for use with mmap anyway.

On Mono, the file descriptor reported by such a combo is the genuine descriptor of the write stream. When the descriptor is duplicated, it is the write stream's descriptor that gets duplicated, with the exception that if the target FD (using dup2) is 0 (stdin), the read stream's descriptor gets duplicated.
