rhashimoto/wa-sqlite

New demos for OPFS access handle #84

rhashimoto announced in Announcements

rhashimoto
Apr 23, 2023
Maintainer

There are a couple of new demos showing off a synthesis of two ideas that have their own discussion threads:

Implementing a VFS with a pool of pre-opened OPFS access handles, and
Implementing a service shared across multiple browser tabs when the service won't work in SharedWorker.
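
To illustrate the first idea: `createSyncAccessHandle()` is asynchronous, so opening handles ahead of time lets later file opens be handed out synchronously. The following is only a minimal sketch of that pooling concept, not the AccessHandlePoolVFS implementation. `HandlePool`, `open`, `remove`, and `available` are hypothetical names, and the type parameter `T` stands in for a pre-opened access handle.

```typescript
// Hypothetical sketch of a pool of pre-opened handles.
// T stands in for a FileSystemSyncAccessHandle opened ahead of time.
class HandlePool<T> {
  private free: T[];
  private inUse: { [path: string]: T } = {};

  constructor(preOpened: T[]) {
    this.free = preOpened.slice();
  }

  // Synchronously associate a logical path with a pooled handle.
  open(path: string): T {
    if (Object.prototype.hasOwnProperty.call(this.inUse, path)) {
      return this.inUse[path];
    }
    const handle = this.free.pop();
    if (handle === undefined) {
      throw new Error("pool exhausted: no pre-opened handle available");
    }
    this.inUse[path] = handle;
    return handle;
  }

  // Disassociate a path and return its handle to the pool for reuse.
  remove(path: string): boolean {
    if (!Object.prototype.hasOwnProperty.call(this.inUse, path)) {
      return false;
    }
    this.free.push(this.inUse[path]);
    delete this.inUse[path];
    return true;
  }

  // Number of handles not currently associated with a path.
  available(): number {
    return this.free.length;
  }
}
```

Because the pool has a fixed size in this sketch, it must be filled asynchronously before any synchronous file operations are attempted.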

Combining these ideas makes possible an OPFS-backed database service that:

Allows access from multiple contexts (which the OPFS access handle API doesn't support well),
Doesn't have the WASM size or performance penalties of Asyncify,
Doesn't have the COOP/COEP headers or performance penalties of Atomics/SharedArrayBuffer, and
Never invalidates its cache.

The new demos are really just copies of existing demos, except using a shared connection to AccessHandlePoolVFS for database queries:

ahp-demo - This is a copy of the main demo that lets you compose and execute SQL.
ahp-contention - This is a copy of the contention demo (described here) that measures transactions per second across multiple browser tabs.

Both of these demos use the same SharedService name, so you can open multiple browser tabs of both and they all will share a single database connection. That means, for example, that after running a contention test you can compose SQL to dig deeper into the results, say to see if the same tab ever gets consecutive transactions.

Note that these demos are implemented with ES6 module Workers (because I'm too lazy to bundle them), which are enabled by default in Firefox Nightly. You can run them on Firefox beta and stable by enabling dom.workers.modules.enabled at about:config, but you may need to avoid this recently fixed bug. The demos work fine on the current versions of Safari and Chrome (and I assume other Chromium-based browsers, e.g. Edge).

Early (and possibly premature) conclusions from these demos:

The two ideas work beautifully together. I'm on record saying that this is the way to use OPFS, and the demos don't contradict that.
Contention from multiple tabs doesn't reduce total transactions per second. In fact, the performance can increase, possibly because with a single tab the database is idle while communicating between Window and Worker.
Transactions per second performance is much lower for Chrome than Safari and Firefox. I had similar results back when writing IndexedDB VFS classes, and in that case Chrome was reportedly being more conservative with durability than Firefox, so this may be a result of the same policies. UPDATE 2023-04-25: This appears to be specific to Apple devices. See the next post in this thread.

Shared AHP contention results on my 2014 Mac mini on various browsers for 1, 8, and 32 (!) browser tabs:

browser	total tx/s 1 tab	total tx/s 8 tabs	total tx/s 32 tabs
Chrome 112	24.3	24.5	24.0
Firefox 114.0a1	434.1	446.9	442.1
Safari 16.4.1	818.2	1020.9	1022.0
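
To put the table in relative terms, the sketch below computes each browser's throughput as a multiple of Chrome's at the same tab count. The numbers are copied from the table above. The names `results` and `ratiosVersus` are only illustrative.

```typescript
type TabResults = [number, number, number]; // total tx/s at 1, 8, and 32 tabs

// Values copied from the results table above (2014 Mac mini).
const results: { [browser: string]: TabResults } = {
  "Chrome 112": [24.3, 24.5, 24.0],
  "Firefox 114.0a1": [434.1, 446.9, 442.1],
  "Safari 16.4.1": [818.2, 1020.9, 1022.0],
};

// Ratio of each browser's tx/s to the baseline browser's tx/s,
// per tab count, rounded to one decimal place.
function ratiosVersus(baseline: string): { [browser: string]: number[] } {
  const base = results[baseline];
  const out: { [browser: string]: number[] } = {};
  Object.keys(results).forEach((browser) => {
    out[browser] = results[browser].map(
      (tx, i) => Math.round((tx / base[i]) * 10) / 10
    );
  });
  return out;
}

const vsChrome = ratiosVersus("Chrome 112");
// vsChrome["Firefox 114.0a1"] is [17.9, 18.2, 18.4]
// vsChrome["Safari 16.4.1"]   is [33.7, 41.7, 42.6]
```

On this machine, Firefox comes out at roughly 18x Chrome and Safari at roughly 34x to 43x, depending on the number of tabs.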

Replies: 2 comments 7 replies

rhashimoto
Apr 25, 2023
Maintainer Author

Transactions per second performance is much lower for Chrome than Safari and Firefox. I had similar results back when writing IndexedDB VFS classes, and in that case Chrome was reportedly being more conservative with durability than Firefox, so this may be a result of the same policies.

I went down a rabbit hole trying to figure this out. First I wrote a micro-benchmark to time a bunch of OPFS writes and flushes to verify that was indeed where the difference was. Then I went on a long internet journey, and found that Chrome has a special flush implementation on Apple devices. Instead of calling fsync(), Chrome is using an Apple-specific file control flag, F_FULLFSYNC:

if (!HANDLE_EINTR(fcntl(file_.get(), F_FULLFSYNC))) return true;
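
A micro-benchmark of the kind mentioned above, timing OPFS writes separately from flushes, might look roughly like the following. This is a sketch under assumptions, not the benchmark that was actually run: `SyncHandleLike`, `FlushTiming`, and `timeWritesAndFlushes` are hypothetical names, and the interface covers only the `write()` and `flush()` methods of `FileSystemSyncAccessHandle` so the function can also be exercised with a stand-in object.

```typescript
// Minimal shape of the FileSystemSyncAccessHandle methods used here.
interface SyncHandleLike {
  write(buffer: Uint8Array, options?: { at?: number }): number;
  flush(): void;
}

interface FlushTiming {
  iterations: number;
  writeMs: number; // total time spent in write()
  flushMs: number; // total time spent in flush()
}

// Write one block and flush, repeatedly, accumulating the time spent
// in each call separately. `now` is injectable so a higher-resolution
// clock can be supplied where available.
function timeWritesAndFlushes(
  handle: SyncHandleLike,
  iterations: number,
  blockSize: number = 4096,
  now: () => number = Date.now
): FlushTiming {
  const block = new Uint8Array(blockSize);
  let writeMs = 0;
  let flushMs = 0;
  for (let i = 0; i < iterations; i++) {
    const t0 = now();
    handle.write(block, { at: i * blockSize });
    const t1 = now();
    handle.flush();
    const t2 = now();
    writeMs += t1 - t0;
    flushMs += t2 - t1;
  }
  return { iterations, writeMs, flushMs };
}

// In a dedicated Worker, a real handle could be obtained along these
// lines (the file name "bench.bin" is made up for illustration):
//   const root = await navigator.storage.getDirectory();
//   const file = await root.getFileHandle("bench.bin", { create: true });
//   const handle = await file.createSyncAccessHandle();
//   const timing = timeWritesAndFlushes(handle, 1000, 4096, () => performance.now());
//   handle.close();
```

Comparing `flushMs` across browsers on the same machine is what would show whether flushing, rather than writing, accounts for the difference.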

What does F_FULLFSYNC do? Basically it does what fsync() is supposed to do, but doesn't:

Note that while fsync() will flush all data from the host to the drive (i.e. the "permanent storage device"), the drive itself may not physically write the data to the platters for quite some time and it may be written in an out-of-order sequence.
Specifically, if the drive loses power or the OS crashes, the application may find that only some or none of their data was written. The disk drive may also re-order the data so that later writes may be present, while earlier writes are not.
This is not a theoretical edge case. This scenario is easily reproduced with real world workloads and drive power failures.
For applications that require tighter guarantees about the integrity of their data, Mac OS X provides the F_FULLFSYNC fcntl.

I haven't been able to confirm how Firefox and Safari do syncing, but I strongly suspect that they aren't using F_FULLFSYNC. That would obviously be a serious bug, right? Maybe not.

At one time, the lack of using F_FULLFSYNC on Mac was actually filed as a Firefox bug, but it was eventually marked WONTFIX. One of the commenters on the bug is Dr. Richard Hipp, the author of SQLite:

We added the fullsync option at Apple's request. But it turns out that they don't use it themselves. The fullsync requires the OS to reset the disk controller, so it causes a significant performance impact, not only to the application using SQLite, but all other processes running on the machine that happen to be using the disk drive.
It is true that the fsync() system call does not work as advertised on MacOS (nor Linux) but in practice this does not normally cause too much trouble. The data is stored in the disk drives track buffers and it normally makes it to oxide without too much delay. My understanding is that problems caused by fsync() not working are probably no more common that problems caused by head crashes or other hard disk failures.
I would recommend that you not enable fullsync. Or if you do, at least make it a configuration option so that users can turn it off if they want.

So even Apple doesn't use F_FULLFSYNC and the SQLite folks don't recommend it (at least not back in 2008); it can be enabled in SQLite (on a native Apple build) with a pragma, but the default is off. This SQLite forum message reports that interrupting SQLite connections not using F_FULLFSYNC could affect durability, but it did not find any database corruption with the specific devices and software versions tested.

It appears that all the browsers are making an informed decision about the risks and rewards here and have arrived at different conclusions, and that's the reason why Chrome is dramatically slower on OPFS flush (and therefore on OPFS VFS write transactions) on macOS and iOS. Application developers considering an OPFS VFS should be aware that:

Chromium browsers will be slower than other browsers on Apple devices - how much slower will depend on the workload, but more than 10x in the worst case.
On other browsers on Apple devices the expected durability is lower and the chance of database corruption is potentially higher (though very low in absolute terms).

tantaman Sep 29, 2023

Is the Chrome team aware of the problem? I wonder how much of an intentional choice it was on their part to use F_FULLFSYNC.

rhashimoto Sep 29, 2023
Maintainer Author

@tantaman I filed a chromium issue, but there seem to be differing opinions on whether and what to do about it. They asked for input from the SQLite folks but there was no public response (maybe there was a private response). The last comment on the issue references more discussion in another chromium issue which used to be publicly visible but no longer is. IIRC that issue involved occasional corruption in a SQLite DB and they were looking at ways to detect and log that, which might also be useful in running an experiment to see if dropping F_FULLFSYNC made corruption any more likely.

chanon Apr 18, 2024

Does this mean all OPFS file writes on chromium browser on Mac OS are affected, not just SQLite related? And they will be up to 10x slower?

If this is not fixed then I wonder if OPFS will actually be usable in any write-heavy real world applications as Chrome users on Mac OS seem to be a pretty important demographic of users.

I looked through the thread, and it looks like the chromium devs' last recommendation is to raise an issue with the standards body at https://github.com/WICG/file-system-access/issues?

EDIT: Just tried running the wa-sqlite benchmarks on my Mac in various browsers. Saw that for Test 1 AccessHandlePool is extremely fast on Firefox and Safari but extremely slow on Chrome. Even using locking_mode=EXCLUSIVE barely helps.

Another observation is that AccessHandlePool in Firefox and Safari on Mac OS is a lot faster than in Firefox and Chrome on Windows.

rhashimoto Apr 18, 2024
Maintainer Author

Does this mean all OPFS file writes on chromium browser on Mac OS are affected, not just SQLite related? And they will be up to 10x slower?

The problem is not exactly with writes but with flushes. It does apply to all pages that use OPFS on macOS, but it especially affects applications that need to write atomically and durably, like databases, because they require more flushes.

If this is not fixed then I wonder if OPFS will actually be usable in any write-heavy real world applications as Chrome users on Mac OS seem to be a pretty important demographic of users.

You can write at high rates. You just can't flush at high rates. I don't think users will notice; it's developers who will run up against it and not write apps that do that. I believe it's unfortunate and unnecessary but I think I've done all I can do.
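
One way to read the distinction between write rate and flush rate: if an application can tolerate a whole batch becoming durable together, it can issue many writes and pay for a single flush. The sketch below illustrates that under assumptions. `FlushableHandle` and `appendBatch` are hypothetical names, and the interface mirrors only the `write()` and `flush()` methods of `FileSystemSyncAccessHandle`.

```typescript
// Minimal shape of the access handle methods used here.
interface FlushableHandle {
  write(buffer: Uint8Array, options?: { at?: number }): number;
  flush(): void;
}

// Write all records contiguously starting at `offset`, then flush once.
// Returns the offset just past the last byte written.
// Nothing in the batch should be considered durable until flush() returns.
function appendBatch(
  handle: FlushableHandle,
  records: Uint8Array[],
  offset: number
): number {
  let at = offset;
  for (const record of records) {
    const written = handle.write(record, { at: at });
    at += written;
  }
  handle.flush(); // one flush for the whole batch
  return at;
}
```

With SQLite the analogous move is grouping several statements into one transaction, so that the flushes required for a commit are amortized over more work.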

schickling
Jul 8, 2024

First of all: I really appreciate your fantastic work on all of this!

@rhashimoto Looks like you've recently shipped v1.0 (congrats!) and, as part of the preparations, deleted the above-mentioned ahp demos in #174. Are there any plans to bring those back?

rhashimoto Jul 8, 2024
Maintainer Author

@schickling I don't have any plans to restore those demos. I was thinking that most applications that would previously have used AccessHandlePoolVFS would now instead use OPFSCoopSyncVFS, which is similar but supports multiple connections. The SharedService implementations and simple demos showing the basic concept of sharing a Worker (without SQLite) are still around.

Did you have a particular interest in one or both of those demos?

schickling Jul 8, 2024

Thanks a lot for that additional context. For now I'll use the deleted demo scripts as a reference to use SQLite from a SharedWorker.

(🤞 that one day OPFS will be supported in a SharedWorker natively)

rhashimoto Jul 8, 2024
Maintainer Author

@schickling Just so you're aware, there are some newer VFS options now that can address some of the motivations to share a dedicated Worker, in addition to OPFSCoopSyncVFS.

You can actually use OPFS in a shared worker, just not the synchronous access handles that optimize performance. OPFSAnyContextVFS implements such a VFS without context restrictions. It may be suitable if slow write performance can be tolerated.

IndexedDB is available in a shared worker, so IDBBatchAtomicVFS is always an option. IndexedDB has slower I/O than OPFS, but that is offset by not needing message passing to a Worker. And if performance is critical, the new IDBMirrorVFS is even faster than OPFS (with the limitation that your database must fit in available memory).
