v4; motivation and initial thoughts #951


Draft: mgravell wants to merge 114 commits into main from v4

@mgravell (Member) commented Sep 6, 2022 (edited)

This PR covers some initial exploration into v4

Key Motivations

  1. improve AOT support
  2. improve performance
  3. support additional memory usage scenarios
  4. smaller outputs

2 and 3 are most likely by way of a new reader/writer API with additional optimizations; 1 is most likely via new build tools which integrate with the outputs from 2 and 3

Improve AOT Support

Currently the core engine is focused on runtime reflection-based IL emit. The library conceptually supports AOT scenarios, including library separation of the core and reflection-based aspects, and attribute based annotation support for manually-written serializers, but none of the tools currently generate code-based serializers. We aim to provide both code-first and contract-first AOT scenarios, typically using Roslyn generators (either based on the discovered code model, or the .proto files parsed - the machinery ahead of these bits already exists).

Additionally:

  1. runtime reflection-based emit is slow at the initial usage, requiring lots of additional system assemblies, lots of type discovery, and consideration of a complicated system, and the actual emit; this impacts cold-start performance, particularly relevant for serverless scenarios where the process is typically short-lived
  2. runtime reflection discovery and emit is not well supported on all platforms, in particular impacting "unity" etc (also: IL2CPP doesn't support all IL scenarios, and isn't perfect in some of the relevant cases)
  3. runtime reflection discovery and emit demands a wide graph of system assemblies; this impacts "pruning", meaning either we need to retain a lot of libraries, or it won't work properly; this impacts "blazor" in particular
  4. runtime reflection discovery and emit is hard to debug, maintain, and extend; if we want to add radical new features (a new core reader/writer API, async, etc) it is prohibitively expensive to implement this in the existing design, and demands very niche skills (reducing the ability of people to contribute)

Improve performance

Profiling has shown that the existing API is sub-optimal; discovery work has been done ahead of this PR to investigate a "from first principles" re-imagining of the core reader/writer API. It is fundamentally not possible to achieve all of the aims here without a new API, although it may be possible to reuse the new API from within the older API, as a wrapper layer.

These changes include:

  • reworking the data buffer to remove unnecessary operations
  • using CPU primitives where profiling shows it to be useful
  • using better generated serializer code to reduce operations
  • exploiting framework features like list-span access

Support Additional Memory Usage Scenarios

Some models are inherently "allocatey"; consider, for example, a model with a repeated chunk of multiple sub-items, each of which has a bytes payload, resulting in large numbers of small byte[] chunks. The idea is to facilitate more efficient scenarios here; e.g. we could generate ReadOnlyMemory<byte> instead of byte[], and allow multiple leaf levels to be slices of the same underlying oversized buffer. The existing PR explores this scenario. Note, however, that profiling is mixed on the outcome of this. We want to enable this, but as an option, allowing us to play with multiple options with real data.
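The slicing idea can be sketched as follows. This is only an illustration under stated assumptions, not the PR's implementation: `SharedBufferSketch` and `Pack` are hypothetical names, and the input is a list of already-materialized payloads rather than a protobuf reader.

```csharp
using System;
using System.Collections.Generic;

public static class SharedBufferSketch
{
    // Copies each payload into one oversized buffer and returns slices of it,
    // so N payloads cost one array allocation instead of N byte[] instances.
    public static List<ReadOnlyMemory<byte>> Pack(IReadOnlyList<byte[]> payloads)
    {
        int total = 0;
        foreach (byte[] payload in payloads)
        {
            total += payload.Length;
        }

        byte[] shared = new byte[total];
        var slices = new List<ReadOnlyMemory<byte>>(payloads.Count);
        int offset = 0;
        foreach (byte[] payload in payloads)
        {
            payload.CopyTo(shared, offset);
            // Each slice is a view over the shared array; no copy is made here.
            slices.Add(new ReadOnlyMemory<byte>(shared, offset, payload.Length));
            offset += payload.Length;
        }
        return slices;
    }
}
```

One trade-off of this layout is that a single surviving slice keeps the whole shared buffer alive, which is one reason to offer it as an option rather than a default.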

Smaller Outputs

Right now the runtime library needs to contain chains for things it might need - niche random code paths for obscure and esoteric models. Because this discovery is done via reflection, these edge-cases are largely not trimmable (in the AOT sense), because discovering whether they are reached or not is basically impossible. By moving to an AOT path, without all the reflection gunk, it is very clear at build time what code is reached - there is no reflection gunk. This means we don't need all the reflection dependencies, and we don't need all the dependencies for all the stuff that isn't used by the model. This saving can be significant.


Likely implementation

We need to consider code-first and contract-first separately here. Let's consider a simple scenario:

syntax = "proto3";
message Foo {
  repeated string bar = 1;
}

Currently, this can be used to generate something akin to the same contract, as seen from a code-first perspective:

using System.Collections.Generic;
using ProtoBuf;

[ProtoContract]
public partial class Foo
{
    [ProtoMember(1)]
    public List<string> Bars { get; } = new();
}

What we want to achieve is that whether starting code-first or contract-first, we generate code that includes the actual serialization code, either at the same time as generating the code (contract-first), or in an additional partial-class (code-first). Typical output code is shown in the exploration work in the PR.

The key point here, though, is that code-first and contract-first start from completely different code models - contract-first (and the existing code-gen) starts from the FileDescriptorSet view, whereas code-first starts from a Roslyn view. The actual code-gen should not have to contend with this, and we do not intend duplication, so: the proposal instead is to create a new source-agnostic API that the new code-gen tools should use, and populate the source-agnostic API from the specific scenarios.

For example, we could have:

class CodeGenerationModel
  List<CodeGenerationFile> Files
class CodeGenerationFile
  string Name
  List<CodeGenerationType> Types
class CodeGenerationType
  string Name, OriginalName // takes Name when null
  string Namespace
  ReadOnlyMemory<string> ParentTypes
  List<CodeGenerationMember> Members
  // flags and other helpers; is it an enum? value-type?
  // what are we generating for this type? members? serializer?
  // note: we expect inbuilt primitives to exist as CodeGenerationType,
  // for example, maybe `static CodeGenerationType.String`
class CodeGenerationMember
  string Name, OriginalName // takes Name when null
  string BackingMember
  int FieldNumber
  CodeGenerationType Type
  // data format? wire-type?
  // repeated? if so, what kind? other flags?

So here, we would generate the equivalent of

var model = new CodeGenerationModel {
  Files = {
    new() {
      Name = "my.generated.cs",
      Types = {
        new() {
          Name = "Foo",
          Generate = /* serializer+members for contract-first; serializer for code-first */,
          Members = {
            new() {
              FieldNumber = 1,
              Name = "Bars", OriginalName = "bar",
              Type = CodeGenerationType.String,
              MemberType = Repeated
            }
          }
        }
      }
    }
  }
};
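For anyone wanting to experiment, the outline above can be turned into compilable C# roughly as below. This is a sketch under assumptions, not the PR's actual object model: the `GenerateOptions` and `MemberKind` enums and the `ModelSketch.BuildFoo` helper are hypothetical (the proposal names `Generate` and `MemberType` but does not define their types), and `Namespace`, `ParentTypes` and `BackingMember` are omitted for brevity.

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// Hypothetical enums; the proposal does not specify these shapes.
[Flags] public enum GenerateOptions { None = 0, Members = 1, Serializer = 2 }
public enum MemberKind { Single, Repeated }

public sealed class CodeGenerationModel
{
    public List<CodeGenerationFile> Files { get; } = new();
}

public sealed class CodeGenerationFile
{
    public string Name { get; set; } = "";
    public List<CodeGenerationType> Types { get; } = new();
}

public sealed class CodeGenerationType
{
    private string? _originalName;

    // Inbuilt primitive exposed as a shared instance, as the outline suggests.
    public static CodeGenerationType String { get; } = new() { Name = "string" };

    public string Name { get; set; } = "";
    // Falls back to Name when no original (schema) name was supplied.
    public string OriginalName { get => _originalName ?? Name; set => _originalName = value; }
    public GenerateOptions Generate { get; set; }
    public List<CodeGenerationMember> Members { get; } = new();
}

public sealed class CodeGenerationMember
{
    private string? _originalName;

    public string Name { get; set; } = "";
    public string OriginalName { get => _originalName ?? Name; set => _originalName = value; }
    public int FieldNumber { get; set; }
    public CodeGenerationType Type { get; set; } = CodeGenerationType.String;
    public MemberKind MemberType { get; set; }
}

public static class ModelSketch
{
    // Builds the model for the Foo example; contract-first also emits the members.
    public static CodeGenerationModel BuildFoo(bool contractFirst) => new()
    {
        Files = { new() { Name = "my.generated.cs", Types = { new() {
            Name = "Foo",
            Generate = contractFirst
                ? GenerateOptions.Members | GenerateOptions.Serializer
                : GenerateOptions.Serializer,
            Members = { new() {
                FieldNumber = 1, Name = "Bars", OriginalName = "bar",
                Type = CodeGenerationType.String, MemberType = MemberKind.Repeated } }
        } } } }
    };
}
```

Both front ends (Roslyn and FileDescriptorSet) would then only need to produce a `CodeGenerationModel`, and the emitter would never see which one it came from.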

So; the initial work items:

  1. define a rough skeleton model for the above new API
  2. parse the Roslyn code-first model to populate the new model
  3. parse the FileDescriptorSet contract-first model to populate the new model
  4. emit new model+serializer code from the new model, against the new serializer API
  5. implement the new serializer API

It is not a goal of the current stage to emit code for the old serializer API from the new model; while that might be a nice feature in the future, it is not seen as solving an immediate need, and will only add support costs.


High level tasks

  • setup test skeleton
    • parse .proto to FileDescriptorSet
    • parse C# to Roslyn model
  • setup new working model
  • populate working model from FileDescriptorSet
  • populate working model from Roslyn model
  • basic DTO output from working model
  • serializer output from working model
  • complete the reader/writer API

Test skeleton; somehow setup multi-input test (folder-based?) that takes a corpus of examples

jasper-d, aeroastro, raulsntos, PanzerFowst, psychonic, and PaulusParssinen reacted with rocket emoji
Signed-off-by: Marc Gravell <marc.gravell@gmail.com>
# Conflicts:
#   protobuf-net.sln
#   src/Benchmark/Benchmark.csproj
#   src/BenchmarkBaseline/BenchmarkBaseline.csproj
#   src/BuildToolsUnitTests/BuildToolsUnitTests.csproj
#   src/Directory.Build.props
#   src/Examples/Examples.csproj
#   src/LongDataTests/LongDataTests.csproj
#   src/NativeGoogleTests/NativeGoogleTests.csproj
#   src/protobuf-net.AspNetCore/protobuf-net.AspNetCore.csproj
#   src/protobuf-net.BuildTools.Legacy/protobuf-net.BuildTools.Legacy.csproj
#   src/protobuf-net.BuildTools/protobuf-net.BuildTools.csproj
#   src/protobuf-net.Core/protobuf-net.Core.csproj
#   src/protobuf-net.FSharp.Test/protobuf-net.FSharp.Test.fsproj
#   src/protobuf-net.FSharp/protobuf-net.FSharp.csproj
#   src/protobuf-net.MSBuild.Test/protobuf-net.MSBuild.Test.csproj
#   src/protobuf-net.MSBuild/protobuf-net.MSBuild.csproj
#   src/protobuf-net.MessagePipes/protobuf-net.MessagePipes.csproj
#   src/protobuf-net.NodaTime/protobuf-net.NodaTime.csproj
#   src/protobuf-net.Protogen/protobuf-net.Protogen.csproj
#   src/protobuf-net.Reflection.Test/protobuf-net.Reflection.Test.csproj
#   src/protobuf-net.ServiceModel/protobuf-net.ServiceModel.csproj
#   src/protobuf-net.Test/protobuf-net.Test.csproj
#   src/protobuf-net/protobuf-net.csproj
#   src/protogen.site/protogen.site.csproj
#   src/protogen/protogen.csproj
@listepo

Hey @mgravell, thanks for your work; is there any news about it?

Dona278 reacted with eyes emoji

@Dona278

Hi @mgravell, I know that you have a lot going on (work + family + combating criminals at night), but I think this is the best protobuf library for dotnet, and since .NET 8 Microsoft has been pushing hard on performance + trimming + AOT + source generators, so I want to ask:

  • After all these years, is there any ETA for this work?
  • Is there any chance of getting help from Microsoft to support this project, as already happened with Grpc.AspNetCore?

Anyway thank you for your work!

@mgravell
Member, Author

Hi; no hard ETA, but definitely still in progress; I'm very aware of the AOT work, and the hope is for the Dapper.AOT learnings to lead into the protobuf-net work; there exists an AOT branch for the analyzer pieces, but I think a lot of it will need some significant rework, but: I'm also a little distracted by Google's recent discussion of "edition 2024", and the "group" changes, which I also want to integrate (parser now works, so... yay!). This is relevant because the "editions" work and the "AOT" work need to interact, so understanding both pieces at the same time is essential.

As for MSFT time: my MSFT time is focused on cache work at the moment, but: let's see how it goes a little later in the year.

@michaldobrodenka

About AOT: it seems that AssemblyBuilder.Save will work in .NET 9. I know generating C# code is the better solution, but would this be supported? That is, generating serializer assemblies for AOT in some "model.csproj" as an after-build step?

@mgravell
Member, Author

@michaldobrodenka if AssemblyBuilder.Save starts working, I'll happily light up that API, and if that unblocks some scenarios: great! However, that will be tangential to the intended AOT route, which I hope to be codegen based.

michaldobrodenka reacted with heart emoji

@tuga001-sme

Any news?

@PanzerFowst

First off, thank you for your work! It is great!

I know this is not a rushed change (family, day job, etc.), but I was curious what could be done to help this PR along? Are there API improvements of code generators in .NET 9 that can be taken advantage of now?

@mgravell
Member, Author

The APIs haven't changed hugely (I don't think interceptors give us much); but I do need to revisit this from the ground up, using our learnings here as a foundation - the object model needs a lot of rework based on my learnings from Roslyn incremental generators over the last few years; the approach here is naive. Doable: yes. But it needs dedicated time.

PanzerFowst reacted with rocket emoji

@PanzerFowst

Thanks for getting back so quickly, Marc!

Ah, I see. So then would there be an issue / milestone with TODOs etc. to give a roadmap of what needs to be done so that we could help contribute where able?

@michaldobrodenka commented Apr 16, 2025 (edited)

I started to play with generators and created a demo for protobuf generated serializers/deserializers from protobuf-net attributes.

https://github.com/michaldobrodenka/GProtobuf

It's far from usable: only deserialization is supported, with only a handful of types. Not tested/used. Just a proof of concept. Maybe I will return to it sometime. But when it's working, deserialization is crazy fast.

PanzerFowst reacted with thumbs up emoji

@PanzerFowst

That's neat, @michaldobrodenka!

I am working on converting some code to be NativeAoT compliant and unfortunately haven't found a way to keep the NativeAoT runtime from trimming away protobuf-net. The only thing I have found so far is to use Google.Protobuf and manually create a .proto file for my DTOs, and it just ends up really messy...

But it did give me the idea (I haven't looked too deeply at this repo to see how feasible it is)--what if the [ProtoContract] and [ProtoMember(n)] attributes could create the .proto files automatically and add the <Protobuf Include="car.proto" /> to the .csproj to generate the Google.Protobuf code that can then be used to automagically accomplish the same behavior in a NativeAoT context?

I am sure there are reasons that this wouldn't work, but with .NET 9 giving full NativeAoT support for iOS, I am seeing a lot of movement towards NativeAoT to get off of MonoAoT.

@mgravell
Member, Author

Eesh, I should just dust this off and ship something, even if it is incomplete. My plans are wider than my calendar, it seems.

bdovaz, PanzerFowst, and KybernetikGames reacted with thumbs up emoji
PanzerFowst and psychonic reacted with rocket emoji
bdovaz and PanzerFowst reacted with eyes emoji

@KybernetikGames

Is there any chance v4 could bring back support for AsReference that was in v2 which allowed a full object graph to be serialized with multiple fields referencing the same object?

I'm trying to find a good serializer for Unity and ProtoBuf v2 is the only one I've found which meets all my needs except that I can't seem to use it in Android builds due to IL2CPP requiring AOT compilation so it would be a huge shame to find a solution to that problem only to lose such a useful feature.

@Dona278

@KybernetikGames did you look at the Cysharp repos? They develop games with Unity and they are the creators of R3 (observables) and [Message/Memory]Pack (serializers), both developed to be compatible with Unity.

KybernetikGames reacted with thumbs up emoji

@michaldobrodenka commented Apr 26, 2025 (edited)

> Is there any chance v4 could bring back support for AsReference that was in v2 which allowed a full object graph to be serialized with multiple fields referencing the same object?
>
> I'm trying to find a good serializer for Unity and ProtoBuf v2 is the only one I've found which meets all my needs except that I can't seem to use it in Android builds due to IL2CPP requiring AOT compilation so it would be a huge shame to find a solution to that problem only to lose such a useful feature.

If you need a solution now, you can check my protobuf-net 2 fork - with precompile you can prepare the serializer in a post-build step as a dll. I'm using it in production. And you don't need the old .NET Framework. It works with net6+: https://github.com/michaldobrodenka/protobuf-net

KybernetikGames reacted with thumbs up emoji

@KybernetikGames

@Dona278 I briefly tried MessagePack and MemoryPack but ran into issues with each of them (here and here) which would have required me to refactor quite a bit of my code base. ProtoBuf v2 seemed like a silver bullet which handled everything I need to do with it right up until I tried to use it in a runtime build. But if I can't get it going then I'll definitely be revisiting the cysharp systems.

@michaldobrodenka I found your repo earlier today and have been trying to get it to work in Unity with no success so far and there's no Issues page so I wasn't sure how to contact you. Do you have a preferred contact method?

@michaldobrodenka

@KybernetikGames have you checked the aot-net6 branch?
I have added issues to this project, but I don't plan to maintain it much further; I'm just using it until I find a replacement. It works on all my projects and I'm looking for a more modern solution - using Span and generated code. Something like my GProtobuf, which is only a proof of concept now.

KybernetikGames reacted with thumbs up emoji

@mgravell
Member, Author

I genuinely do have plans to revisit the AOT work. I just need the world to switch to a 36 hour day so I have enough hours in each...

KybernetikGames, bdovaz, rafayahmed317, PanzerFowst, Dona278, PaulusParssinen, and BrunoJuchli reacted with laugh emoji

@PanzerFowst

Well, I just wanted to ask if you maybe had an outline of the work (that you know of so far) that needed to be done, so that anyone who has the time and could contribute would be able to help? (I have been looking into IIncrementalGenerator and experimenting.)

I know I am certainly interested in contributing.

@listepo

@mgravell I understand you very well, but are there any deadlines?
