unum-cloud/UCall

Web Serving and Remote Procedure Calls at 50x lower latency and 70x higher bandwidth than FastAPI, implementing JSON-RPC & REST over io_uring ☎️

License

Apache-2.0 license

UCall

JSON Remote Procedure Calls Library
Up to 100x Faster than FastAPI

Most modern networking is built either on slow and ambiguous REST APIs or unnecessarily complex gRPC. FastAPI, for example, looks very approachable. We aim to be just as simple to use, or even simpler.

FastAPI:

    pip install fastapi uvicorn

    from fastapi import FastAPI
    import uvicorn

    server = FastAPI()

    @server.get('/sum')
    def sum(a: int, b: int):
        return a + b

    uvicorn.run(...)

UCall:

    pip install ucall

    from ucall.posix import Server
    # from ucall.uring import Server on 5.19+

    server = Server()

    @server
    def sum(a: int, b: int):
        return a + b

    server.run()
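To call the `sum` procedure without any client library, a plain JSON-RPC 2.0 request sent over HTTP should be enough. The following is a minimal sketch using only the Python standard library. The helper names (`build_request`, `parse_response`, `call`) are hypothetical, and the default address is an assumption (port 8545 is borrowed from the CLI example in this README), so adjust it to your server.

```python
import json
import urllib.request


def build_request(method: str, params: dict, request_id: int = 1) -> bytes:
    """Serialize a JSON-RPC 2.0 request with named parameters."""
    payload = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    return json.dumps(payload).encode("utf-8")


def parse_response(raw: bytes):
    """Return the `result` member, or raise if the reply carries an error."""
    reply = json.loads(raw)
    if "error" in reply:
        raise RuntimeError(reply["error"])
    return reply["result"]


def call(method: str, params: dict, url: str = "http://localhost:8545/"):
    """POST the request to a running server (the address is an assumption)."""
    request = urllib.request.Request(
        url,
        data=build_request(method, params),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_response(response.read())


# With the UCall server from the example above running locally:
# print(call("sum", {"a": 2, "b": 3}))
```

Keeping serialization and parsing apart from the network call makes it easy to inspect the exact bytes that would be sent.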

It takes over a millisecond to handle a trivial FastAPI call on a recent 8-core CPU. In that time, light could have traveled 300 km through optics to the neighboring city or country, in my case. How does UCall compare to FastAPI and gRPC?

Setup	🔁	Server	Latency with 1 client	Throughput with 32 clients
FastAPI over REST	❌	🐍	1'203 μs	3'184 rps
FastAPI over WebSocket	✅	🐍	86 μs	11'356 rps ¹
gRPC ²	✅	🐍	164 μs	9'849 rps

UCall with POSIX	❌	C	62 μs	79'000 rps
UCall with io_uring	✅	🐍	40 μs	210'000 rps
UCall with io_uring	✅	C	22 μs	231'000 rps

Table legend

All benchmarks were conducted on AWS on general purpose instances with the Ubuntu 22.10 AMI. It is the first major AMI to come with Linux Kernel 5.19, featuring much wider io_uring support for networking operations. These specific numbers were obtained on beefy c7g.metal instances with Graviton 3 chips.

The 🔁 column marks whether the TCP/IP connection is reused across subsequent requests.
The "server" column defines the programming language in which the server was implemented.
The "latency" column reports the time between sending a request and receiving a response. μ stands for micro, so μs means microseconds.
The "throughput" column reports the number of Requests Per Second when querying the same server application from multiple client processes running on the same machine.

¹ FastAPI couldn't process concurrent requests with WebSockets.

² We tried generating C++ backends with gRPC, but its numbers, suspiciously, weren't better. There is also an async gRPC option, which we did not try.
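The headline speedups can be checked against the table with a few lines of arithmetic. This sketch compares the slowest row (FastAPI over REST) with the fastest one (UCall with io_uring and a C server). The numbers are copied from the table above, and the choice of rows to compare is an assumption about how the headline figures were derived.

```python
# Numbers copied from the benchmark table above.
fastapi_rest = {"latency_us": 1203, "throughput_rps": 3184}
ucall_uring_c = {"latency_us": 22, "throughput_rps": 231_000}

latency_ratio = fastapi_rest["latency_us"] / ucall_uring_c["latency_us"]
throughput_ratio = ucall_uring_c["throughput_rps"] / fastapi_rest["throughput_rps"]

print(f"latency: {latency_ratio:.1f}x lower")
print(f"throughput: {throughput_ratio:.1f}x higher")
```

Both ratios land a little above 50x and 70x respectively, which is consistent with the figures quoted in the project description.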

How is that possible?!

How can a tiny pet project with just a couple thousand lines of code compete with two of the most established networking libraries? UCall stands on the shoulders of giants:

io_uring for interrupt-less IO.
- io_uring_prep_read_fixed on 5.1+.
- io_uring_prep_accept_direct on 5.19+.
- io_uring_register_files_sparse on 5.19+.
- IORING_SETUP_COOP_TASKRUN optional on 5.19+.
- IORING_SETUP_SINGLE_ISSUER optional on 6.0+.
SIMD-accelerated parsers with manual memory control.
- simdjson to parse JSON faster than gRPC can unpack ProtoBuf.
- Turbo-Base64 to decode binary values from a Base64 form.
- picohttpparser to navigate HTTP headers.
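JSON has no binary type, which is why binary values travel as Base64 text and need a fast decoder on the server. The sketch below shows the round trip with Python's standard `base64` module. It only illustrates the encoding itself: the `echo` method and the `data` field are hypothetical placeholders, and nothing here reflects how Turbo-Base64 is wired into UCall.

```python
import base64
import json

raw = bytes(range(8))  # arbitrary binary payload
encoded = base64.b64encode(raw).decode("ascii")

# Hypothetical request carrying the binary value as a Base64 string.
request = json.dumps(
    {"jsonrpc": "2.0", "method": "echo", "params": {"data": encoded}, "id": 1}
)

# The receiving side reverses the steps: parse the JSON, then decode the string.
decoded = base64.b64decode(json.loads(request)["params"]["data"])
assert decoded == raw

# Base64 maps every 3 bytes to 4 characters, so payloads grow by about a third.
print(len(raw), "bytes ->", len(encoded), "characters")
```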

You have already seen the round-trip latency and the throughput in requests per second. Want to see the bandwidth? Try it yourself!

    @server
    def echo(data: bytes):
        return data
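One way to turn the `echo` procedure into a bandwidth number is to time repeated round trips of a fixed-size payload. The harness below is a sketch under stated assumptions: `measure_bandwidth` is a hypothetical helper, and `local_echo` is an in-process stand-in that should be replaced with a real client call to the server.

```python
import time


def measure_bandwidth(echo, payload_size: int, repeats: int = 100) -> float:
    """Return bytes per second moved through `echo`, counting both directions."""
    payload = bytes(payload_size)
    start = time.perf_counter()
    for _ in range(repeats):
        reply = echo(payload)
        assert len(reply) == payload_size
    # Guard against a zero reading on very coarse clocks.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return 2 * payload_size * repeats / elapsed


# Stand-in for a remote call; swap in a real client call to the `echo` procedure.
def local_echo(data: bytes) -> bytes:
    return data


rate = measure_bandwidth(local_echo, payload_size=1 << 20)
print(f"{rate / 1e9:.2f} GB/s")
```

The in-process stand-in only measures Python call overhead, so the printed figure is meaningful only once a real network call is plugged in.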

More Functionality than FastAPI

FastAPI supports native types, while UCall also supports numpy.ndarray, PIL.Image, and other custom types. This comes in handy when you build real applications or want to deploy multimodal AI, like we do with UForm.

    import numpy
    import PIL.Image
    import uform
    from ucall.rich_posix import Server

    server = Server()
    model = uform.get_model('unum-cloud/uform-vl-multilingual')

    @server
    def vectorize(description: str, image: PIL.Image.Image) -> numpy.ndarray:
        # The parameter is named `image` to match the client and CLI calls in this README.
        image = model.preprocess_image(image)
        tokens = model.preprocess_text(description)
        joint_embedding = model.encode_multimodal(image=image, text=tokens)
        return joint_embedding.cpu().detach().numpy()

We also have our own optional Client class that helps with those custom types.

    from ucall.client import Client

    client = Client()
    # Explicit JSON-RPC call:
    response = client({
        'method': 'vectorize',
        'params': {
            'description': description,
            'image': image,
        },
        'jsonrpc': '2.0',
        'id': 100,
    })
    # Or the same with syntactic sugar:
    response = client.vectorize(description=description, image=image)
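The two call styles are equivalent because attribute access can be translated into a JSON-RPC method name. The sketch below shows one way such sugar could be built with `__getattr__`. `SugarClient` and `fake_transport` are hypothetical names, and this is not the implementation of `ucall.client.Client`, only an illustration of the idea.

```python
import itertools
from typing import Any, Callable


class SugarClient:
    """Hypothetical client: attribute access becomes a JSON-RPC method name."""

    def __init__(self, transport: Callable[[dict], dict]):
        self._transport = transport      # sends a request dict, returns a reply dict
        self._ids = itertools.count(1)   # a fresh id for every sugared request

    def __call__(self, request: dict) -> dict:
        # Explicit form: the caller supplies the complete request.
        return self._transport(request)

    def __getattr__(self, method: str):
        # Only reached for names that are not regular attributes of the object.
        def remote(**params: Any) -> dict:
            return self({
                "method": method,
                "params": params,
                "jsonrpc": "2.0",
                "id": next(self._ids),
            })
        return remote


# In-memory transport standing in for the network: it echoes the parameter names.
def fake_transport(request: dict) -> dict:
    return {"jsonrpc": "2.0", "id": request["id"], "result": sorted(request["params"])}


client = SugarClient(fake_transport)
explicit = client({
    "method": "vectorize",
    "params": {"description": "x", "image": "y"},
    "jsonrpc": "2.0",
    "id": 100,
})
sugared = client.vectorize(description="x", image="y")
```

Both calls produce the same `result`; only the `id` differs, since the sugared form generates it automatically.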

CLI like cURL

Aside from the Python Client, we provide an easy-to-use Command Line Interface, which comes with pip install ucall. It allows you to call a remote server and upload files, with direct support for images and NumPy arrays. Translating the previous example into a Bash command that calls the server on the same machine:

    ucall vectorize description='Product description' -i image=./local/path.png

To address a remote server:

    ucall vectorize description='Product description' -i image=./local/path.png --uri 0.0.0.0 --port 8545

To print the docs, use ucall -h:

    usage: ucall [-h] [--uri URI] [--port PORT] [-f [FILE ...]] [-i [IMAGE ...]] [--positional [POSITIONAL ...]] method [kwargs ...]

    UCall Client CLI

    positional arguments:
      method                method name
      kwargs                method arguments

    options:
      -h, --help            show this help message and exit
      --uri URI             server uri
      --port PORT           server port
      -f [FILE ...], --file [FILE ...]
                            method positional arguments
      -i [IMAGE ...], --image [IMAGE ...]
                            method positional arguments
      --positional [POSITIONAL ...]
                            method positional arguments

You can also explicitly annotate types, to distinguish integers, floats, and strings, to avoid ambiguity.

    ucall auth id=256
    ucall auth id:int=256
    ucall auth id:str=256
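The `name:type=value` syntax can be understood as a small parsing rule. The sketch below is a hypothetical parser written for illustration, not the code behind the `ucall` CLI. In particular, the fallback for arguments without an annotation (try an integer, then a float, then keep the string) is an assumption.

```python
def parse_kwarg(argument: str):
    """Split `name[:type]=value` into a (name, converted value) pair."""
    key, _, text = argument.partition("=")
    name, _, type_name = key.partition(":")
    converters = {"int": int, "float": float, "str": str}
    if type_name:
        # An explicit annotation removes any ambiguity.
        return name, converters[type_name](text)
    # Assumption: without an annotation, try int, then float, then keep the string.
    for convert in (int, float):
        try:
            return name, convert(text)
        except ValueError:
            continue
    return name, text


for argument in ("id=256", "id:int=256", "id:str=256"):
    print(argument, "->", parse_kwarg(argument))
```

With the explicit `str` annotation, the value stays the text `'256'` instead of becoming the number 256, which is exactly the ambiguity the annotations are meant to resolve.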

Free Tier Throughput

We will leave bandwidth measurements to enthusiasts, but will share some more numbers. The general logic is that you can't squeeze high performance from Free-Tier machines. Currently, AWS provides the following options: t2.micro and t4g.small, on older Intel and newer Graviton 2 chips. This library is so fast that it doesn't need more than 1 core, so you can run a fast server even on a tiny Free-Tier instance!

Setup	🔁	Server	Clients	`t2.micro`	`t4g.small`
FastAPI over REST	❌	🐍	1	328 rps	424 rps
FastAPI over WebSocket	✅	🐍	1	1'504 rps	3'051 rps
gRPC	✅	🐍	1	1'169 rps	1'974 rps

UCall with POSIX	❌	C	1	1'082 rps	2'438 rps
UCall with io_uring	✅	C	1	-	5'864 rps
UCall with POSIX	❌	C	32	3'399 rps	39'877 rps
UCall with io_uring	✅	C	32	-	88'455 rps

In this case, every server was bombarded by requests from 1 or a fleet of 32 other instances in the same availability zone. If you want to reproduce those benchmarks, check out the sum examples on GitHub.

Quick Start

For Python:

pip install ucall

For CMake projects:

    include(FetchContent)
    FetchContent_Declare(
        ucall
        GIT_REPOSITORY https://github.com/unum-cloud/ucall
        GIT_SHALLOW TRUE
    )
    FetchContent_MakeAvailable(ucall)
    include_directories(${ucall_SOURCE_DIR}/include)

The C usage example is a mouthful compared to Python. We wanted to make it as lightweight as possible and to allow optional arguments without dynamic allocations and named lookups. So unlike the Python layer, we expect the user to manually extract the arguments from the call context with ucall_param_named_i64() and its siblings.

    #include <cstdint> // int64_t
    #include <cstdio>  // snprintf

    #include <ucall/ucall.h>

    static void sum(ucall_call_t call, ucall_callback_tag_t) {
        int64_t a{}, b{};
        char printed_sum[256]{};
        // Extract the named arguments from the call context.
        bool got_a = ucall_param_named_i64(call, "a", 0, &a);
        bool got_b = ucall_param_named_i64(call, "b", 0, &b);
        if (!got_a || !got_b)
            return ucall_call_reply_error_invalid_params(call);

        // Serialize the result; "%lld" expects a long long argument.
        int len = snprintf(printed_sum, 256, "%lld", static_cast<long long>(a + b));
        ucall_call_reply_content(call, printed_sum, len);
    }

    int main(int argc, char** argv) {
        ucall_server_t server{};
        ucall_config_t config{};

        ucall_init(&config, &server);
        ucall_add_procedure(server, "sum", &sum, NULL);
        ucall_take_calls(server, 0);
        ucall_free(server);
        return 0;
    }
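For readers less familiar with the C API, the control flow of the handler above can be restated in Python: look up each named argument, reply with an "invalid params" error if one is missing, otherwise reply with the result. This is a sketch, with `handle_sum` as a hypothetical function operating on already-parsed dictionaries. The error code and message follow the JSON-RPC 2.0 specification; the exact reply UCall produces is not shown in this README.

```python
INVALID_PARAMS = -32602  # error code reserved by the JSON-RPC 2.0 specification


def handle_sum(request: dict) -> dict:
    """Extract `a` and `b` by name and reply with their sum or an error."""
    params = request.get("params", {})
    if not isinstance(params, dict):
        params = {}  # this sketch only handles named parameters
    a, b = params.get("a"), params.get("b")
    reply = {"jsonrpc": "2.0", "id": request.get("id")}
    # bool is a subclass of int in Python, so exclude it explicitly.
    valid = all(isinstance(v, int) and not isinstance(v, bool) for v in (a, b))
    if valid:
        reply["result"] = a + b
    else:
        reply["error"] = {"code": INVALID_PARAMS, "message": "Invalid params"}
    return reply


ok = handle_sum({"jsonrpc": "2.0", "method": "sum", "params": {"a": 2, "b": 3}, "id": 1})
bad = handle_sum({"jsonrpc": "2.0", "method": "sum", "params": {"a": 2}, "id": 2})
```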

Roadmap

Batch Requests
JSON-RPC over raw TCP sockets
JSON-RPC over TCP with HTTP
Concurrent sessions
NumPy array and Pillow serialization
HTTPS support
Batch-capable endpoints for ML
Zero-ETL relay calls
Integrating with UKV
WebSockets for web interfaces
AF_XDP and UDP-based analogs on Linux

Want to affect the roadmap and request a feature? Join the discussions on Discord.

Why JSON-RPC?

Transport independent: UDP, TCP, bring what you want.
Application layer is optional: use HTTP or not.
Unlike REST APIs, there is just one way to pass arguments.
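The first two points can be illustrated by building one request body and framing it in two ways. This is a sketch with hypothetical helper names (`encode`, `over_raw_tcp`, `over_http`). How a server detects message boundaries on a raw TCP stream is not covered here and would be an additional assumption.

```python
import json


def encode(method: str, params: dict, request_id: int) -> bytes:
    """Build one request body, regardless of the transport."""
    payload = {"jsonrpc": "2.0", "method": method, "params": params, "id": request_id}
    return json.dumps(payload).encode("utf-8")


def over_raw_tcp(body: bytes) -> bytes:
    """Without an application layer, the body is sent as is."""
    return body


def over_http(body: bytes, host: str = "localhost") -> bytes:
    """With HTTP, the same body is wrapped in a POST request."""
    head = (
        "POST / HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/json\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
    )
    return head.encode("ascii") + body


body = encode("sum", {"a": 1, "b": 2}, request_id=1)
print(over_http(body).decode("utf-8"))
```

The arguments live in a single place, the `params` member of the body, whichever framing is chosen.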

About

Website: unum-cloud.github.io/UCall/

Latest release: v0.5.8 (Sep 16, 2025), 21 releases in total.
