Movatterモバイル変換


[0]ホーム

URL:


Skip to content

Navigation Menu

Sign in
Appearance settings

Search code, repositories, users, issues, pull requests...

Provide feedback

We read every piece of feedback, and take your input very seriously.

Saved searches

Use saved searches to filter your results more quickly

Sign up
Appearance settings

TaintTracking::FunctionModel taints not shown by hasFlowPath in custom modelling of API#13415

Answeredbyowen-mc
rh-tguittet asked this question inQ&A
Discussion options

Hi,

I am trying to model the Go API for Kubernetes/OpenShift controllers/operators.

UsingTaintTracking::FunctionModel, by copying what was done forencoding/json'sUnmarshal() (here, before theseverely under-documented Models as Data "MaD" were introduced), I managed to have the taint flow I wanted forReader.Get.

The taint tracking logic seem to work fine (on simple examples). However, the flow is not shown byTaintTracking::Configuration'shasFlowPath(), making look like there are gaps in the taint tracking path:

image

A simple example with Json'sUnmarshal() does not have this issue. Looking at the code for it all I can't figure out why since it doesn't look like there is more dataflow/taint tracking logic to it. I also tried to force the taint step but that didn't change anything.

Here is my minimal OpenShift sample operator with codeql db:https://github.com/rh-tguittet/sample-openshift-operator

The minimal Kubernetes QLL:

privateimport gomodule K8sControllerClient{abstractclassRangeextends TaintTracking::FunctionModel,Method{abstractFunctionInputgetAnInput();abstractFunctionOutputgetOutput();overridepredicatehasTaintFlow(FunctionInputinput,FunctionOutputoutput){input=getAnInput()andoutput=getOutput()}}/**     * Signature: Get(ctx context.Context, key ObjectKey, obj Object, opts ...GetOption) error     * https://github.com/kubernetes-sigs/controller-runtime/blob/v0.15.0/pkg/client/interfaces.go#L54     */privateclassKubernetesControllerClientGetFunctionextendsRange{KubernetesControllerClientGetFunction(){this.hasQualifiedName("sigs.k8s.io/controller-runtime/pkg/client.Client","Get")}// Marking this arg as input to avoid having to go deeper (down to the network) in the call stack; because what is behind Get() does not matter in this instanceoverrideFunctionInputgetAnInput(){result.isParameter(1)}overrideFunctionOutputgetOutput(){result.isParameter(2)}}}

The minimal codeql query:

/** * @kind path-problem */import goimport DataFlowimport PathGraphimport semmle.go.security.CommandInjectionimport kubernetes_controller_clientclassMySourceextendsParameter{MySource(){this.getFunction().getName()="Reconcile"andthis.getType().hasQualifiedName("sigs.k8s.io/controller-runtime/pkg/reconcile","Request")}}classMyConfextends TaintTracking::Configuration{MyConf(){this="MyConf"}overridepredicateisSource(DataFlow::Nodenode){node.asParameter()instanceofMySource}overridepredicateisSink(DataFlow::Nodenode){nodeinstanceof CommandInjection::Sink}}from DataFlow::PathNodesource, DataFlow::PathNodesink,MyConfcfgwherecfg.hasFlowPath(source,sink)selectsink.getNode(),source,sink,"Found path"

Any ideas ?

P.S.: I would also like to taint onlysome of theSample's type fields (i.e.Spec and some of the metadata), is that even possible? or should I flip the issue and create Sanitizers/Barriers to exclude the rest instead?

Thanks in advance!

You must be logged in to vote

I discussed your first point with some colleagues. It would be nice, but it would be a lot of work. We have added it to our list of issues we may work on in the future.

For your second point, I have madea draft PR which should fix/improve the situation (when combined with#13461). So far I have only tested it on your example, where it does indeed add a path step where you expected it. (There is the same backwards flow immediately afterwards, but it isn't shown in the path summary because it is skipped over). There is more work to be done to make sure this doesn't cause any regressions and works in all situations, but I am hopeful we will be able to get this in and improve path summaries …

Replies: 5 comments 10 replies

Comment options

sidshank
Jun 9, 2023
Collaborator

Hi@rh-tguittet ! 👋🏼 Thank you for posting the question and the accompanying clear description. The CodeQL Go team has reviewed your question and will be posting a response. Since it is close to the end of the day in European time zones (where the team works), you may only receive a response with suggestions / solutions only early next week. I just wanted to acknowledge that your question is being reviewed.

You must be logged in to vote
1 reply
@rh-tguittet
Comment options

Thank you Sid, I appreciate you notifying me :)

Comment options

I will look into why nothing appears in this case.

I don't think there is any way to force a node to appear in the path explanation, except addingoverride predicate includeHiddenNodes() { any() } to your Config (though that doesn't seem to work in this case).

I believe you are getting a different result when you useJson.Unmarshal because it is modelled with MaD. Could you use MaD to model this method instead?

Using MaD would also solve your second problem. For the output, instead of "Argument[2]" you can write "Argument[2].Field[pkg.classname.fieldname]" to indicate that the taint flows into theField f which satisfiesf.hasQualifiedName(pkg, className, fieldName). (Other options include "ArrayElement", "Element", "MapKey" and "MapValue".) It seems that none of the Go MaD models use this at the moment, but you can see examples in these two tests:one,two.

You must be logged in to vote
3 replies
@rh-tguittet
Comment options

Thanks for your quick answer Owen!

Going through Models as Data did show one additional taint step, number 5 in the screen capture, but it isn't on the method call:

image

Here are the changes I made to leverage Models as Data:

  1. Add the following two lines in myqlpack.yml
dataExtensions:  -ext/*.model.yml
  1. Createext/sigs.k8s.io.controller-runtime.pkg.client.model.yml:
extensions:  -addsTo:pack:tguittet/kubernetesextensible:summaryModeldata:#- ["sigs.k8s.io/controller-runtime/pkg/client", "Client", True, "Get", "", "", "Argument[1]", "Argument[2].Field[github.com/rh-tguittet/sample-openshift-operator/api/v1.Sample.Annotations]", "taint", "manual"] # annotations only, no results returned#- ["sigs.k8s.io/controller-runtime/pkg/client", "Client", True, "Get", "", "", "Argument[1]", "Argument[2].Field[github.com/rh-tguittet/sample-openshift-operator/api/v1.Sample.Spec]", "taint", "manual"]  # spec only, no results returned      -["sigs.k8s.io/controller-runtime/pkg/client", "Client", True, "Get", "", "", "Argument[1]", "Argument[2]", "taint", "manual"]# catchall
  1. Removeextends TaintTracking::FunctionModel and thehasTaintFlow() predicate from myRange class.

  2. Add some more sinks insample_controller.go (codeql-db updated)

I couldn't get theArgument[x].Field[pkg.classname.fieldname] to work, and I would be interested to know how you debug such deep code because all my efforts were moot :) (I tried fiddling with theparseField() predicate over inExternalFlow.qll but my changes never did seem to be taken into account). Also, I fail to see how to generalize access to the fields for a givenSpec field (I suppose something could be done to check struct "inheritance" formetadata fields).

@owen-mc
Comment options

I can explain why the extra step in the path appears to go backwards. It is because the CodeQL library for Go uses "def-use" flow. (all the other languages use "use-use", which wouldn't exhibit this behaviour.) Suppose you have code like this

x := source() // deff(x) // use1sink(x) // use2

The way the go library has set up data flow is that data flows from the a definition (def) to each use of that value, so in this case from def to use1 and from def to use2. So if we mark use1 of x to be tainted, that taint won't flow to use2. We instead have to mark the def corresponding to use1 as tainted, so that other uses of the value will see that it is tainted. We are aware this is counter-intuitive and there has been discussion about switching to "use-use" flow (though that does have some downsides as well).

I will look further into your other questions when I am able to.

@owen-mc
Comment options

By the way, we have justswitched to use-use flow, so you shouldn't see backwards steps in paths any more.

Comment options

I have had an opportunity to look into this more and talk it through with some colleagues. Let me take your questions in turn.

  1. Why does the path explanation jump straight fromreq in the list of parameters to the use ofsample.Spec.Foo in the call toSink?
    We used to have a problem that our path explanations were too long, and contained lots of extraneous detail. The predicatelocalFlowBigStephere is what breaks up the long list of small steps into a smaller list of big steps. One of the things that we make sure to report is when the flows goes into a call or comes out of a call. However, the way that flow is modelled forFunctionModel skips actually going into the call. You are absolutely right that this is confusing, and I will see if I can fix it.

Note that I mentioned overridingincludeHiddenNodes in your config. For Go, the only nodes which this affects relate to MaD, and they don't have locations, so they show up as the empty steps 3 and 4 in your second post. This is mainly intended as a debugging tool. My apologies for mentioning something which was not helpful.

  1. Why doesjson.Unmarshal show the right thing?
    As I said above, it is because it is modelled with MaD, so the flow does actually go into a call and back out again, solocalFlowBigStep does the correct thing.

  2. Why is taint step 5 not in the method call when using MaD?
    Note that taint step 2 is what you would expect, where the flow goes into the method call. This was missing from the original example. As I explained above, rather than flowing tosample as the third argument to the method call the taint actually flows to the definition whichsample is a use of, which is what step 5 shows. (I should have included the terminology, which is "post-update node".) This is due to Go's use of def-use flow, which we hope to one day replace with use-use flow, which would avoid this problem. In the meantime I will think about whether we could insert an additional node so that the path explanation makes more sense.

  3. Is it possible to only taint particular fields of a struct?
    Yes, but only using MaD, notFunctionModel.

  4. Why does your attempt to use MaD to taint a field work?
    I am still investigating this. To show you how to debug such a thing, let me show you how far I have got. I want to use partial flow to see where the taint is flowing to (and with what content, i.e. is it in a field or an array or something like that). Partial flow is explained in more detailhere. Unfortunately we are in the process of changing the dataflow API, so the instructions given in that blog post don't work. First I converted the dataflow configuration to use the new API:

module MyConf implements DataFlow::ConfigSig {  predicate isSource(DataFlow::Node node) { node.asParameter() instanceof MySource }  predicate isSink(DataFlow::Node node) { node instanceof CommandInjection::Sink }}module MyFlow = TaintTracking::Global<MyConf>;import MyFlow::PathGraphfrom MyFlow::PathNode source, MyFlow::PathNode sinkwhere MyFlow::flowPath(source, sink)select sink.getNode(), source, sink, "Found path"

Then I adapted it to use partial flow (I don't think there is any documentation on how to do this yet - I had to use trial-and-error):

module MyConf implements DataFlow::ConfigSig {  predicate isSource(DataFlow::Node node) { node.asParameter() instanceof MySource }  predicate isSink(DataFlow::Node node) { node instanceof CommandInjection::Sink }}module MyFlow = TaintTracking::Global<MyConf>;int myExplorationLimit() { result = 100 }import MyFlow::FlowExploration<myExplorationLimit/0>import PartialPathGraphfrom PartialPathNode source, PartialPathNode sinkwhere partialFlow(source, sink, _)select sink.getNode(), source, sink, "Found path"

The results tell me that the taint is flowing to all the right uses ofsample, with the right content (the fieldSpec), but for some reason it isn't flowing intosample.Spec (with no content).

I will update you when I have discovered the cause of the problem.

Please let me know if there is a question that I have not addressed.

You must be logged in to vote
3 replies
@rh-tguittet
Comment options

Many thanks for your work on this and your in-depth explanations Owen! It indeed covers all but one interrogation remains for:

2. Why does json.Unmarshal show the right thing?As I said above, it is because it is modelled with MaD, so the flow does actually go into a call and back out again, so localFlowBigStep does the correct thing.

So this got me a bit confused. Here's what I am seeing with a minimaljson.Unmarshal example (notice step 3):

image

However, even when I use MaD forr.Get(), I don't see theequivalent of a step 3. Said another way, the path never shows my updatedsample parameter:

image

(Apologies if this is similar to the what I described in the other thread.)

@owen-mc
Comment options

Step 3 in your first image is there because of the&. I modified your sample code to have a& there and got the same thing. Here is a screenshot.
Screenshot 2023-06-15 at 13 53 38
This is because&x is astore step, andstore steps break up the big steps that are shown in the path summary.

@rh-tguittet
Comment options

Got it, thanks. (From a user-perspective it is nonetheless surprising behaviour.)

Comment options

The reason why the flow doesn't work is thatArgument[2] issample, which is defined to be&cachev1.Sample{}, a pointer to a struct, so it doesn't have any fields. (Go allows you to writesample.Spec instead of(*sample).Spec (orsample->Spec in C or C++), and we deal with this by inserting an implicit dereference node.) There is currently no way to specify what you want. I have madea PR to allow you to writeArgument[2].Dereference.Field[...].

I have also madea PR to address your initial point that when a flow path goes through a function modeled withFunctionModel, the steps you would expect to see are missing.

You must be logged in to vote
3 replies
@rh-tguittet
Comment options

Ah, yes, gotcha. Thank your for looking into it and coming up with a solution, much appreciated! -(Would a fuzzy match be considered useful? Having to spell out the full qualified name when only its name is what appears to be really needed isn't super practical. e.g. couldArgument[x].Field[%.%.Spec] be a thing?)

For#13461, I understand the def-use logic that makes it look like the flow is going backwards. Frankly, that's not an issue, it's not less clear (some documentation would lift residual fog). What would make it clearer however is having thesample parameter for ther.Get() call part of the path.

@owen-mc
Comment options

I discussed your first point with some colleagues. It would be nice, but it would be a lot of work. We have added it to our list of issues we may work on in the future.

For your second point, I have madea draft PR which should fix/improve the situation (when combined with#13461). So far I have only tested it on your example, where it does indeed add a path step where you expected it. (There is the same backwards flow immediately afterwards, but it isn't shown in the path summary because it is skipped over). There is more work to be done to make sure this doesn't cause any regressions and works in all situations, but I am hopeful we will be able to get this in and improve path summaries for CodeQL for Go.

Answer selected byrh-tguittet
@rh-tguittet
Comment options

Many thanks Owen for all your work, patience, and thorough replies. I shall keep an eye out for those PRs to land in a release version! From the clone at HEAD + loose patching with the various PR you made, the results I expect are returned and paths are highlighted without obvious gaps.

This comment was marked as off-topic.

Sign up for freeto join this conversation on GitHub. Already have an account?Sign in to comment
Category
Q&A
Labels
None yet
4 participants
@rh-tguittet@sidshank@owen-mc@gittig23441

[8]ページ先頭

©2009-2025 Movatter.jp