- Notifications
You must be signed in to change notification settings - Fork1.9k
[Java] Clarification Needed on DataFlow::FeatureHasSourceCallContext Behavior#21097
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
-
Hi, To illustrate, I've reproduced the issue in Java: I'm tracking a publicvoidsource(Stringtaint) {returnrandomMethod(taint)}publicObjectrandomMethod(Stringtaint) {returntaint}publicvoidotherMethod(Stringparam) {Objectobj =randomMethod(taint)// continue the tracking / flow} I initially thought that I feel I'm not the only one struggling with this issue:#17241 I'm surprised by this kind of result. I would like a way to ensure that my path goes from source to sink and that it can be called in my program. Thank you |
BetaWas this translation helpful?Give feedback.
All reactions
That looks completely correct to me. If flow reaches a static variable then surely that same static variable can be read from other unrelated calls totoUpperEnglish - storing into and reading from a static variable effectively throws away the call context. So even though this may be considered a false positive, it's working as intended. Data flow cannot distinguish between individual array indices here, so as soon as one entry in the cache is tainted, then they're all considered tainted.
Replies: 2 comments 11 replies
-
Hi 👋 Can you share the query that you used to track flow? I'm surprised that you see flow going from As for flow features, they are used to restrict which kind of flow is allowed at sources/sinks:
Flow features are rarely needed in practice, and if the underlying problem is the erroneous flow mentioned above, then it sounds more like a bug in the data flow library. |
BetaWas this translation helpful?Give feedback.
All reactions
-
Hi, yes so here it is: When is |
BetaWas this translation helpful?Give feedback.
All reactions
-
@hvitved I have extracted the relevant code from my codebase and created a query to show you. So here is my Java code, I have basically copy/paste the weird function which is packagecom.test;importjava.util.Locale;publicclassStringUtils {privatestaticfinalintTO_UPPER_CACHE_LENGTH =2048;privatestaticfinalintTO_UPPER_CACHE_MAX_ENTRY_LENGTH =64;privatestaticfinalString[][]TO_UPPER_CACHE =newString[TO_UPPER_CACHE_LENGTH][];privateStringUtils() { }publicstaticStringtoUpperEnglish(Stringvar0) {if (var0 ==null)returnnull;if (var0.length() >TO_UPPER_CACHE_MAX_ENTRY_LENGTH) {returnvar0.toUpperCase(Locale.ENGLISH); }else {intvar1 =var0.hashCode() & (TO_UPPER_CACHE_LENGTH -1);String[]var2 =TO_UPPER_CACHE[var1];if (var2 !=null &&var2[0].equals(var0)) {returnvar2[1]; }else {Stringvar3 =var0.toUpperCase(Locale.ENGLISH);var2 =newString[]{var0,var3};TO_UPPER_CACHE[var1] =var2;returnvar3; } } }}classSourceclass {publicObjectsourceMethod(Objectsource) {return (String)Sinkclass.sinkMethod(StringUtils.toUpperEnglish((String)source)); }}classOther {privatestaticStringrandomMethod(Objectvar1) {return (String)Sinkclass.sinkMethod(StringUtils.toUpperEnglish((String)var1)); }}classSinkclass {publicstaticObjectsinkMethod(Objectsource) {returnsource; }} Note the Here is my CodeQL query: /** * @id test * @description test * @name test * @kind path-problem * @problem.severity warning * @tags security */import javaimport semmle.code.java.dataflow.FlowSourcesimport semmle.code.java.security.CommandLineQueryclassSourceMethodextendsMethod{SourceMethod(){this.getName()="sourceMethod"}}module MyConfigimplements DataFlow::ConfigSig{predicateisSource(DataFlow::Nodesource){source.asParameter().getCallable()instanceofSourceMethod}predicateisSink(DataFlow::Nodesink){exists(Callc|c.getCallee().hasQualifiedName("com.test","Sinkclass","sinkMethod")andsink.asExpr()=c.getAnArgument())}}module MyFlow= TaintTracking::Global<MyConfig>;import MyFlow::PathGraphfrom MyFlow::PathNodesource, MyFlow::PathNodesinkwhere MyFlow::flowPath(source,sink)selectsink.getNode(),source,sink,"sink: $@",source.getNode(),sink.toStringWithContext() For this I get 2 paths, the one I'm looking for: ![]() but also this one: ![]() |
BetaWas this translation helpful?Give feedback.
All reactions
-
That looks completely correct to me. If flow reaches a static variable then surely that same static variable can be read from other unrelated calls to |
BetaWas this translation helpful?Give feedback.
All reactions
-
Alright so Is there a way to keep the call context from source to sink to prevent those "false positive" ? |
BetaWas this translation helpful?Give feedback.
All reactions
-
If we're analysing the code as-is then I'd argue that we do in fact want this path: The call from |
BetaWas this translation helpful?Give feedback.
All reactions
-
The problem is that I'm running it on multiple codebase and I don't know in advance which function will need such summaryModel |
BetaWas this translation helpful?Give feedback.
All reactions
Uh oh!
There was an error while loading.Please reload this page.
Uh oh!
There was an error while loading.Please reload this page.
-
I guess alternatively you can introduce a barrier for the |
BetaWas this translation helpful?Give feedback.

