Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

PoC TLS resume on Linux client#64369

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to ourterms of service andprivacy statement. We’ll occasionally send you account related emails.

Already on GitHub?Sign in to your account

Merged
wfurt merged 17 commits intodotnet:mainfromwfurt:tlsClientResume
Mar 25, 2022
Merged

Conversation

wfurt
Copy link
Member

This complements#57079 and contributes to#22977

Unlike the server part, this is somewhat more complicated. It may still need some work but I would like to get early feedback for the overall strategy.

Lets start with some numbers:

|                           Method | Toolchain | protocol |        Mean |      Error |     StdDev |      Median |         Min |         Max | Ratio | RatioSD |  Gen 0 | Allocated | Alloc Ratio ||--------------------------------- |---------- |--------- |------------:|-----------:|-----------:|------------:|------------:|------------:|------:|--------:|-------:|----------:|------------:|| DefaultHandshakeContextIPv4Async |        pr |        ? |   965.19 us |  15.799 us |  14.779 us |   964.67 us |   934.66 us |   989.94 us |  1.00 |    0.00 |      - |   17897 B |        1.00 || DefaultHandshakeContextIPv4Async |      main |        ? | 5,377.33 us |  40.143 us |  33.521 us | 5,370.95 us | 5,330.34 us | 5,431.63 us |  5.57 |    0.11 |      - |   19848 B |        1.11 || DefaultHandshakeContextIPv6Async |        pr |        ? |   949.92 us |  20.529 us |  23.641 us |   949.12 us |   918.10 us |   995.11 us |  1.00 |    0.00 |      - |   17896 B |        1.00 || DefaultHandshakeContextIPv6Async |      main |        ? | 5,171.75 us | 114.255 us | 126.995 us | 5,114.88 us | 5,051.99 us | 5,456.90 us |  5.44 |    0.16 |      - |   19921 B |        1.11 ||        DefaultHandshakeIPv4Async |        pr |        ? | 5,070.35 us | 171.510 us | 197.512 us | 5,034.52 us | 4,759.74 us | 5,492.81 us |  1.00 |    0.00 |      - |   23679 B |        1.00 ||        DefaultHandshakeIPv4Async |      main |        ? | 5,572.56 us | 101.193 us |  89.705 us | 5,552.17 us | 5,464.39 us | 5,792.33 us |  1.10 |    0.05 |      - |   22730 B |        0.96 ||        DefaultHandshakeIPv6Async |        pr |        ? | 5,562.64 us | 148.510 us | 165.069 us | 5,527.18 us | 5,260.93 us | 5,832.12 us |  1.00 |    0.00 |      - |   23721 B |        1.00 ||        DefaultHandshakeIPv6Async |      main |        ? | 5,550.79 us | 108.666 us |  96.330 us | 5,506.31 us | 5,454.39 us | 5,768.76 us |  1.00 |    0.03 |      - |   23401 B |        0.99 |

WhenSslStreamCertificateContext is used on server side, TLS resume is possible and we get ~ 5.5x boost.
This should get us on par with Windows but I don't have to same machines to verify this.
On server side#57079 linked the cache to theSslStreamCertificateContext. So when server will reuse it it will also provide ability to resume previous session. When the context is Disposed, it will also dispose the SSL_CTX and the cache.

This creates first complication for client. It also needs to use sameSSL_CTX but since there is typically no certificate, there is nothing to hook to.
To solve this I added separate dictionary to hold contexts for client. That allows to reuse previous context and save native allocations. But the native contexts will be preserved during duration of the process. Complete test runs forSslStream creates 13 instance, in ideal case there will be one e.g.SslProtocols.None. e.g. it depends on unique combinations ofSslProtocols used by the client.

To make it more complicated, OpenSSL pushes responsibility for setting the sessions to caller e.g.SslStream.
So we use theTargetHost e.g. SNI host to lookup previous session. Since there may be multiple series running on same name this is not perfect. But since this is just offer to the server, it is up to the server to decide if the session is valid or not. Based on my testing, this is how SCHANNEL operates. Some implementations use IP/Port info for the lookup but SslStream does not have access to it in general case. Even with that, this can be just load balancer and the SSL may be terminated or different servers. Once again, if the session is offered to "wrong" server or if the server was restored or cleaned up the session, it should proceed with full handshake.

The rest is mostly technicality. In order to process the sessions, we need to register callbacks and OpenSSL will inform us when it wants to add or remove session. There is some plumbing to map that to .NET and originalTargetHost.
There can be multiple sessions for given host but the current strategy is to keep the last one.

I added some methods to SafeSslContextHandle to allocate cache and do operations. TheGCHandle can possibly be stored in native SSL_CTX but I felt it would be easier for debug to keep it on managed side. Since the sessions are not directly used anywhere in managed code, I useIntPtr instead of SafeHandle.

stephentoub and CaringDev reacted with heart emoji
@wfurtwfurt added area-System.Net.Security os-linuxLinux OS (any supported distro) labelsJan 27, 2022
@wfurtwfurt added this to the7.0.0 milestoneJan 27, 2022
@wfurtwfurt requested review fromstephentoub,bartonjs anda teamJanuary 27, 2022 04:47
@wfurtwfurt self-assigned thisJan 27, 2022
@ghost
Copy link

Tagging subscribers to this area: @dotnet/ncl,@vcsjones
See info inarea-owners.md if you want to be subscribed.

Issue Details

This complements#57079 and contributes to#22977

Unlike the server part, this is somewhat more complicated. It may still need some work but I would like to get early feedback for the overall strategy.

Lets start with some numbers:

|                           Method | Toolchain | protocol |        Mean |      Error |     StdDev |      Median |         Min |         Max | Ratio | RatioSD |  Gen 0 | Allocated | Alloc Ratio ||--------------------------------- |---------- |--------- |------------:|-----------:|-----------:|------------:|------------:|------------:|------:|--------:|-------:|----------:|------------:|| DefaultHandshakeContextIPv4Async |        pr |        ? |   965.19 us |  15.799 us |  14.779 us |   964.67 us |   934.66 us |   989.94 us |  1.00 |    0.00 |      - |   17897 B |        1.00 || DefaultHandshakeContextIPv4Async |      main |        ? | 5,377.33 us |  40.143 us |  33.521 us | 5,370.95 us | 5,330.34 us | 5,431.63 us |  5.57 |    0.11 |      - |   19848 B |        1.11 || DefaultHandshakeContextIPv6Async |        pr |        ? |   949.92 us |  20.529 us |  23.641 us |   949.12 us |   918.10 us |   995.11 us |  1.00 |    0.00 |      - |   17896 B |        1.00 || DefaultHandshakeContextIPv6Async |      main |        ? | 5,171.75 us | 114.255 us | 126.995 us | 5,114.88 us | 5,051.99 us | 5,456.90 us |  5.44 |    0.16 |      - |   19921 B |        1.11 ||        DefaultHandshakeIPv4Async |        pr |        ? | 5,070.35 us | 171.510 us | 197.512 us | 5,034.52 us | 4,759.74 us | 5,492.81 us |  1.00 |    0.00 |      - |   23679 B |        1.00 ||        DefaultHandshakeIPv4Async |      main |        ? | 5,572.56 us | 101.193 us |  89.705 us | 5,552.17 us | 5,464.39 us | 5,792.33 us |  1.10 |    0.05 |      - |   22730 B |        0.96 ||        DefaultHandshakeIPv6Async |        pr |        ? | 5,562.64 us | 148.510 us | 165.069 us | 5,527.18 us | 5,260.93 us | 5,832.12 us |  1.00 |    0.00 |      - |   23721 B |        1.00 ||        DefaultHandshakeIPv6Async |      main |        ? | 5,550.79 us | 108.666 us |  96.330 us | 5,506.31 us | 5,454.39 us | 5,768.76 us |  1.00 |    0.03 |      - |   23401 B |        0.99 |

WhenSslStreamCertificateContext is used on server side, TLS resume is possible and we get ~ 5.5x boost.
This should get us on par with Windows but I don't have to same machines to verify this.
On server side#57079 linked the cache to theSslStreamCertificateContext. So when server will reuse it it will also provide ability to resume previous session. When the context is Disposed, it will also dispose the SSL_CTX and the cache.

This creates first complication for client. It also needs to use sameSSL_CTX but since there is typically no certificate, there is nothing to hook to.
To solve this I added separate dictionary to hold contexts for client. That allows to reuse previous context and save native allocations. But the native contexts will be preserved during duration of the process. Complete test runs forSslStream creates 13 instance, in ideal case there will be one e.g.SslProtocols.None. e.g. it depends on unique combinations ofSslProtocols used by the client.

To make it more complicated, OpenSSL pushes responsibility for setting the sessions to caller e.g.SslStream.
So we use theTargetHost e.g. SNI host to lookup previous session. Since there may be multiple series running on same name this is not perfect. But since this is just offer to the server, it is up to the server to decide if the session is valid or not. Based on my testing, this is how SCHANNEL operates. Some implementations use IP/Port info for the lookup but SslStream does not have access to it in general case. Even with that, this can be just load balancer and the SSL may be terminated or different servers. Once again, if the session is offered to "wrong" server or if the server was restored or cleaned up the session, it should proceed with full handshake.

The rest is mostly technicality. In order to process the sessions, we need to register callbacks and OpenSSL will inform us when it wants to add or remove session. There is some plumbing to map that to .NET and originalTargetHost.
There can be multiple sessions for given host but the current strategy is to keep the last one.

I added some methods to SafeSslContextHandle to allocate cache and do operations. TheGCHandle can possibly be stored in native SSL_CTX but I felt it would be easier for debug to keep it on managed side. Since the sessions are not directly used anywhere in managed code, I useIntPtr instead of SafeHandle.

Author:wfurt
Assignees:wfurt
Labels:

area-System.Net.Security,os-linux

Milestone:7.0.0

Copy link
Member

@rzikmrzikm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I left some comments, I am not confident in how TLS resumption works so it would be best if somebody else added their review as well

string? name = Marshal.PtrToStringAnsi(serverName);
if (!string.IsNullOrEmpty(name))
{
Interop.Ssl.SessionSetHostname(session, serverName);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

It feels really weird to me that SetHostname would be inside TryAddSession.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I need something to find the session in removal. Associating name with it allows me to get the string and then do lookup. It would be great if we can come up with something that allows to lookup by both IntPtr and Name.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Sure... it just feels like calling SetHostname'd be the responsibility of the caller. Doesn't it have to be done in the case where TryAddSession returns false?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This is ony only for the lookup - there is no functional difference. And I tried to hide all this inside the handle.
If we can lookup/remove the entry just from the IntPtr session, we would not need to do that. But I'm not sure if there is good way.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I put in comment and made changes to make it clear. I also moved complementing removal so both parts are done inside the SafeHandle.

{
Interop.Ssl.SessionSetHostname(session, serverName);

lock (_sslSessions)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

The locking around _sslSessions makes sense, since you're manipulating state depending on how the dictionary performed.

But, since you're already locking it, it feels like you want a non-Concurrent dictionary.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

agreed. I was also thinking about grabbing extra reference on the session. That would allow me to use ConcurrentDictionary without locking as the session would never be released in the middle.
Do you have preference/recommendation@bartonjs ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

grabbing extra reference on the session

Something like

Interop.Ssl.SslSessionUpRef(session);if (!_sslSessions.TryAdd(...)){    // Undo the upref since it's not in the dictionary    Interop.Ssl.SslSessionFree(session);}

? (Upref inside has a race condition with the cleanup in ReleaseHandle)

That would get a little weird since in the cleanup you'd need to call free twice, I think?

The fact that we wrote ConcurrentDictionary suggests that it gives better perf (on average) than manual locking... but if the code to interact with it is doing memory/lifetime management and it becomes unreadable with the gymnastics... then locking is better for maintainability. (If it's clean code and more performant, than by all means use upref+ConcurrentDictionary)

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

yes. and then call free twice. I'm inclined to good with the lock and better maintainability as the perf does not depend on this. This happens one in while - not even for each SSL session. I started with ConcurrentDictionary but you are right - we don't need it at this point.

@wfurt
Copy link
MemberAuthor

Test failures are related. It seems like bug in old OpenSSL. TheServerAsyncAuthenticate_SniSetVersion_Success will first negotiate Tls1.1. And when we offer ticket in next round (the client side did not change e.g. should advertise 1.1 and 1.2) but the negotiation fails because the client is trying to restore only 1.1. But the server is now configured to only use 1.2 and that fails with TLS Alert.

publicasyncTaskServerAsyncAuthenticate_SniSetVersion_Success(SslProtocolsversion){varserverOptions=newSslServerAuthenticationOptions(){ServerCertificate=_serverCertificate,EnabledSslProtocols=version};varclientOptions=newSslClientAuthenticationOptions(){TargetHost=_serverCertificate.GetNameInfo(X509NameType.SimpleName,forIssuer:false),EnabledSslProtocols=SslProtocols.Tls11|SslProtocols.Tls12};

the PAL shim also turned out more complicated for old OpenSSL than I really wanted.
I'm inclined to enable client resume only for 1.1.1. That should be OK for all modern distributions (including current RH7), leaving behind Ubuntu 16 that is out of support anyway according tohttps://wiki.ubuntu.com/Releases

Any thoughts on that@bartonjs? (and support for 1.0.1 in general?)

@bartonjs
Copy link
Member

Any thoughts on that@bartonjs? (and support for 1.0.1 in general?)

The OpenSSL project no longer considers 1.0.x to be in support. Ubuntu 16.04 ESM, and other distros older than 2018 that are in a mode other than full-EoL, presumably is custom-patching their source base to try to maintain parity with 1.1.1's security patches.

Ubuntu 18.04 for x86 and x64 upgraded from 1.1.0 to 1.1.1 in Dec 2018.https://packages.ubuntu.com/bionic/libssl1.1 says that other architectures of Ubuntu 18.04 still use 1.1.0... but I don't know what our support story is for .NET on any of those architectures on 18.04.

That said, I think it's fine to say that a feature only works on an OS with a new-enough version of OpenSSL.

@wfurt
Copy link
MemberAuthor

This should be ready for another review pass with following changes

  • I sprinkled more Asserts and comments through the code
  • Removed ConcurrentDictionary for sessions
  • no resume on old OpenSSL versions (may be revisited in future but the impact seems minimal)

Copy link
Member

@rzikmrzikm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

LGTM, only minor suggestions.

@wfurt
Copy link
MemberAuthor

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@karelz
Copy link
Member

@bartonjs can you please finalize your Code Review? Is there something you consider blocking?

Comment on lines 603 to 604
if (newSessionCb != NULL || removeSessionCb != NULL)
if (newSessionCb != NULL)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

This logically nested (but not indented) if is really just the latter condition. One of these two lines should be deleted.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

removed 603. I probably started something I end up not needing.


const char* CryptoNative_SslSessionGetHostname(SSL_SESSION *session)
{
#ifdef SSL_SESSION_get0_hostname
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I don't believe theifdef will evaluate to true on a non-portable build. (Since it's a function identifier then)

This should probably instead be based on the NEED_OPENSSL_x_y values.

Try building with./build-native.sh -portableBuild=false and see if it works as expected.

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I did build on Ubuntu 16.04 with OpenSSL 1.0.1e with portableBuildtrue andfalse. And it did not fail. Same on Ubuntu 20.04 with OpenSSL 1.1.1.
Would NEED_OPENSSL_1_1 cover also 3.0? I can still change that if you prefer it.

Copy link
Member

@bartonjsbartonjsMar 3, 2022
edited
Loading

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I did build on Ubuntu 16.04 with OpenSSL 1.0.1e with portableBuild true and false. And it did not fail.

That's... really surprising. Looking at the headers over time, it looks like it was added in 1.1.0, and has always been a function (vs a macro acting like a function), so what I'd expect is

1.0.1e in non-portable:

The compile doesn't see any definition of SSL_SESSION_get0_hostname at all, turns this into#if false and removes the code.

1.1.1 in non-portable:

SSL_SESSION_get0_hostname is a function, so the preprocessor doesn't see it, so this still turns into#if false and goes away.

In both cases the compile would succeed, but I'd expect 1.1.1-direct to fail to run (remember that./build-native.sh doesn't update the testhost, so you need to manually copy over it or make it a symlink -- or just load up the compile output in a debugger and see if there's a function body or not)

Would NEED_OPENSSL_1_1 cover also 3.0?

No, you'd need#if defined NEED_OPENSSL_1_1 || defined NEED_OPENSSL_3_0

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

you are right. updated. It is interesting that we don't really haveany test coverage for non-portable builds, right? (aside from compilation)

@wfurt
Copy link
MemberAuthor

The build stuff should be resolved@bartonjs. can you please take another look?

Comment on lines +725 to +729
return 1;
}

// OpenSSL will destroy session.
return 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

Are these interesting from a logging perspective (if so, that can easily be in a future change)?

Copy link
MemberAuthor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others.Learn more.

I was thinking about logging but there are conditions that make it normal. If we hit this, there should be no functional change as we simply won't do the caching & resume.

@wfurtwfurt merged commitf8d5706 intodotnet:mainMar 25, 2022
@wfurtwfurt deleted the tlsClientResume branchMarch 25, 2022 22:08
radekdoulik pushed a commit to radekdoulik/runtime that referenced this pull requestMar 30, 2022
* TLS resume on client* shim functions introduced in 1.1.1* add missing struct* disable resume on old OpenSSL* feedback from review* fix source build* update comment* feedback from review* feedback from review* avoild client resume on old OpenSSL
@ghostghost locked asresolvedand limited conversation to collaboratorsApr 25, 2022
Sign up for freeto subscribe to this conversation on GitHub. Already have an account?Sign in.
Reviewers

@bartonjsbartonjsbartonjs approved these changes

@rzikmrzikmrzikm approved these changes

@stephentoubstephentoubAwaiting requested review from stephentoub

Assignees

@wfurtwfurt

Labels
area-System.Net.Securityos-linuxLinux OS (any supported distro)
Projects
None yet
Milestone
7.0.0
Development

Successfully merging this pull request may close these issues.

4 participants
@wfurt@bartonjs@karelz@rzikm

[8]ページ先頭

©2009-2025 Movatter.jp