Conversation
| } | ||
|
|
||
| private predicate containerSizeAccess(PropertyAccess pa) { | ||
| propertyOverrides(pa.getTarget(), "System.Collections.Generic.ICollection<>", "Count") or |
There was a problem hiding this comment.
This should probably go to a helper file, because we use it in ContainerSizeCmpZero.ql too. But I couldn't find a good location for it. (Also, the check is using a slightly different variant.
| exists(AssignOperation a | a.getLValue() = f.getAnAccess() | result = exprSign(a)) | ||
| or | ||
| f.fromSource() and not exists(f.getInitializer()) and result = TZero() | ||
| } |
There was a problem hiding this comment.
Java also handles access to the field through reflection.
csharp/ql/test/library-tests/dataflow/signanalysis/SignAnalysis.cs
Outdated
Show resolved
Hide resolved
381d906 to
1267db9
Compare
61bf4b5 to
45db4d4
Compare
d3270c8 to
eedaab7
Compare
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/RangeUtils.qll
Outdated
Show resolved
Hide resolved
6183a69 to
d195bb7
Compare
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/RangeUtils.qll
Outdated
Show resolved
Hide resolved
hvitved
left a comment
There was a problem hiding this comment.
Awesome work, @tamasvajk . Great to have this logic shared with Java.
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SsaReadPositionSpecific.qll
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SsaReadPositionSpecific.qll
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SsaReadPositionSpecific.qll
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SsaReadPositionCommon.qll
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SsaReadPositionCommon.qll
Outdated
Show resolved
Hide resolved
| not exists(SsaReadPositionBlock bb | getAnExpression(bb) = e) and | ||
| result = ssaDefSign(v) |
There was a problem hiding this comment.
I'm not quite sure why this disjunct is needed. If it is, could we have a test case?
There was a problem hiding this comment.
@aschackmull can you help me out with this question? This is the disjunct before the refactoring: https://github.com/github/codeql/blob/main/java/ql/src/semmle/code/java/dataflow/SignAnalysis.qll#L482.
There was a problem hiding this comment.
If you look at the disjuncts surrounding this case then you can see some hints. We split into cases based on Expr e and these first couple of cases all cover the VarAccess cases. The most precise case is if this is a well-behaved read of an SSA variable, then there are also cases for non-SSA variable accesses, which are split into fields and non-fields. However, the SSA case would like to assume the existence of a BasicBlock containing the VarAccess in order to look at the SsaReadPosition, and there are a few cases where the Java SSA library can produce an (SSA-variable, SSA-read) pair outside the CFG, for instance in compile-time evaluated expressions in case constants. So in order for all VarAccess expressions to receive a sign we have to include this case.
There was a problem hiding this comment.
OK, I thought it was weird to have both getARead(v) = e and not exists(SsaReadPositionBlock bb | getAnExpression(bb) = e) at the same time, i.e. an SSA read without a basic block. It sounds like a corner case in Java then.
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SignAnalysisCommon.qll
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SignAnalysisSpecific.qll
Show resolved
Hide resolved
4fbca9f to
a021769
Compare
|
@hvitved I pushed some extra commits to address your review. The unsigned integer improvements are not done yet. I think I covered the rest. |
hvitved
left a comment
There was a problem hiding this comment.
Some of the private imports are potentially breaking changes, so should be reverted.
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SignAnalysisSpecific.qll
Show resolved
Hide resolved
|
@hvitved I reverted the last commit, and only made the imports private in the new code, so that I don't introduce breaking changes. |
If you prefer, we could do that in a follow-up PR. |
67b76a2 to
92f9bdc
Compare
|
I don't mind handling the unsigned integers in this PR. I'll work on that tomorrow. Also, I pushed some extra test cases that are not covered. I'll fix them, and try to find more such cases. |
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SignAnalysisSpecific.qll
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SignAnalysisSpecific.qll
Outdated
Show resolved
Hide resolved
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SignAnalysisSpecific.qll
Outdated
Show resolved
Hide resolved
| exists(string charlit | charlit = e.(CharacterLiteral).getValue() | | ||
| if charlit = "\\0" or charlit = "\\u0000" then result = TZero() else result = TPos() | ||
| exists(string charlit | charlit = getCharValue(e) | | ||
| if charlit = "\\0" or charlit = "\\u0000" or charlit.regexpMatch("\\u0000") |
There was a problem hiding this comment.
Why do we need both disjuncts charlit = "\\u0000" or charlit.regexpMatch("\\u0000"?
There was a problem hiding this comment.
This is shared between Java and C#. And I don't know what gets extracted for Java. @aschackmull Can you tell me what gets extracted for (char)0 in Java? See this ticket and the slack discussion for context: https://github.com/github/codeql-coreql-team/issues/539
There was a problem hiding this comment.
(char) 0 is just a cast of a 32 bit literal to an unsigned 16 bit value as far as I know. In Java you can put NUL characters in the source code. You can write (where ^@ represents a real unescaped NUL symbol):
class X {
String normal = "\0";
String unicode = "\u0000";
String nullChar = "^@";
}
I had expected unicode and nullChar to be extracted similarly because as far as I know unicode escape sequences are replaced before lexical analysis. However, in the extractor output the normal and unicode lines are extracted similarly, while the last is different.
The generated TRAP output is
#10000=@"/home/arthur/Devel/code/ql/java/ql/test/query-tests/Nullness/X.java;sourcefile"
files(#10000,"/home/arthur/Devel/code/ql/java/ql/test/query-tests/Nullness/X.java","X","java",0)
#10001=@"/home/arthur/Devel/code/ql/java/ql/test/query-tests/Nullness;folder"
folders(#10001,"/home/arthur/Devel/code/ql/java/ql/test/query-tests/Nullness","Nullness")
#10002=@"/home/arthur/Devel/code/ql/java/ql/test/query-tests;folder"
folders(#10002,"/home/arthur/Devel/code/ql/java/ql/test/query-tests","query-tests")
#10003=@"/home/arthur/Devel/code/ql/java/ql/test;folder"
folders(#10003,"/home/arthur/Devel/code/ql/java/ql/test","test")
#10004=@"/home/arthur/Devel/code/ql/java/ql;folder"
folders(#10004,"/home/arthur/Devel/code/ql/java/ql","ql")
#10005=@"/home/arthur/Devel/code/ql/java;folder"
folders(#10005,"/home/arthur/Devel/code/ql/java","java")
#10006=@"/home/arthur/Devel/code/ql;folder"
folders(#10006,"/home/arthur/Devel/code/ql","ql")
#10007=@"/home/arthur/Devel/code;folder"
folders(#10007,"/home/arthur/Devel/code","code")
#10008=@"/home/arthur/Devel;folder"
folders(#10008,"/home/arthur/Devel","Devel")
#10009=@"/home/arthur;folder"
folders(#10009,"/home/arthur","arthur")
#10010=@"/home;folder"
folders(#10010,"/home","home")
#10011=@"/;folder"
folders(#10011,"/","")
containerparent(#10011,#10010)
containerparent(#10010,#10009)
containerparent(#10009,#10008)
containerparent(#10008,#10007)
containerparent(#10007,#10006)
containerparent(#10006,#10005)
containerparent(#10005,#10004)
containerparent(#10004,#10003)
containerparent(#10003,#10002)
containerparent(#10002,#10001)
containerparent(#10001,#10000)
#10012=@"loc,{#10000},0,0,0,0"
locations_default(#10012,#10000,0,0,0,0)
hasLocation(#10000,#10012)
numlines(#10000,6,5,0)
#10013=@"package;"
packages(#10013,"")
cupackage(#10000,#10013)
#10014=@"class;X"
#10015=@"field;{#10014};nullChar"
#10016=@"class;java.lang.String"
fields(#10015,"nullChar",#10016,#10014,#10015)
#10017=@"/home/arthur/Devel/code/ql/java/ql/test/query-tests/Nullness/X.class;sourcefile"
files(#10017,"/home/arthur/Devel/code/ql/java/ql/test/query-tests/Nullness/X.class","X","class",0)
containerparent(#10001,#10017)
packages(#10013,"")
cupackage(#10017,#10013)
#10018=@"loc,{#10017},0,0,0,0"
locations_default(#10018,#10017,0,0,0,0)
hasLocation(#10017,#10018)
hasLocation(#10015,#10018)
#10019=@"field;{#10014};unicode"
fields(#10019,"unicode",#10016,#10014,#10019)
hasLocation(#10019,#10018)
#10020=@"field;{#10014};normal"
fields(#10020,"normal",#10016,#10014,#10020)
hasLocation(#10020,#10018)
#10021=@"type;void"
primitives(#10021,"void")
#10022=@"unknown;sourcefile"
files(#10022,"","","",0)
#10023=@"loc,{#10022},0,0,0,0"
locations_default(#10023,#10022,0,0,0,0)
hasLocation(#10021,#10023)
#10024=@"callable;{#10014}.<init>(){#10021}"
constrs(#10024,"X","X()",#10021,#10014,#10024)
hasLocation(#10024,#10018)
isDefConstr(#10024)
#10025=@"loc,{#10000},1,7,1,7"
locations_default(#10025,#10000,1,7,1,7)
hasLocation(#10014,#10025)
numlines(#10014,5,4,0)
#10026=*
methods(#10026,"<obinit>","<obinit>()",#10021,#10014,#10026)
#10027=@"modifier;private"
modifiers(#10027,"private")
hasModifier(#10026,#10027)
locations_default(#10025,#10000,1,7,1,7)
hasLocation(#10026,#10025)
#10028=*
stmts(#10028,0,#10026,0,#10026)
locations_default(#10025,#10000,1,7,1,7)
hasLocation(#10028,#10025)
#10029=*
#10030=*
#10031=*
stmts(#10029,14,#10028,0,#10026)
#10032=*
locations_default(#10032,#10000,3,1,3,21)
hasLocation(#10029,#10032)
exprs(#10030,4,#10016,#10029,0)
callableEnclosingExpr(#10030,#10026)
statementEnclosingExpr(#10030,#10029)
#10033=*
locations_default(#10033,#10000,3,1,3,21)
hasLocation(#10030,#10033)
exprs(#10031,60,#10016,#10030,0)
callableEnclosingExpr(#10031,#10026)
statementEnclosingExpr(#10031,#10029)
#10034=*
locations_default(#10034,#10000,3,1,3,21)
hasLocation(#10031,#10034)
variableBinding(#10031,#10020)
#10035=*
exprs(#10035,22,#10016,#10030,1)
callableEnclosingExpr(#10035,#10026)
statementEnclosingExpr(#10035,#10029)
#10036=*
locations_default(#10036,#10000,3,17,3,20)
hasLocation(#10035,#10036)
numlines(#10035,1,1,0)
namestrings("""\0""","^@",#10035)
#10037=*
#10038=*
#10039=*
stmts(#10037,14,#10028,1,#10026)
#10040=*
locations_default(#10040,#10000,4,1,4,26)
hasLocation(#10037,#10040)
exprs(#10038,4,#10016,#10037,0)
callableEnclosingExpr(#10038,#10026)
statementEnclosingExpr(#10038,#10037)
#10041=*
locations_default(#10041,#10000,4,1,4,26)
hasLocation(#10038,#10041)
exprs(#10039,60,#10016,#10038,0)
callableEnclosingExpr(#10039,#10026)
statementEnclosingExpr(#10039,#10037)
#10042=*
locations_default(#10042,#10000,4,1,4,26)
hasLocation(#10039,#10042)
variableBinding(#10039,#10019)
#10043=*
exprs(#10043,22,#10016,#10038,1)
callableEnclosingExpr(#10043,#10026)
statementEnclosingExpr(#10043,#10037)
#10044=*
locations_default(#10044,#10000,4,18,4,25)
hasLocation(#10043,#10044)
numlines(#10043,1,1,0)
namestrings("""\u0000""","^@",#10043)
#10045=*
#10046=*
#10047=*
stmts(#10045,14,#10028,2,#10026)
#10048=*
locations_default(#10048,#10000,5,1,5,22)
hasLocation(#10045,#10048)
exprs(#10046,4,#10016,#10045,0)
callableEnclosingExpr(#10046,#10026)
statementEnclosingExpr(#10046,#10045)
#10049=*
locations_default(#10049,#10000,5,1,5,22)
hasLocation(#10046,#10049)
exprs(#10047,60,#10016,#10046,0)
callableEnclosingExpr(#10047,#10026)
statementEnclosingExpr(#10047,#10045)
#10050=*
locations_default(#10050,#10000,5,1,5,22)
hasLocation(#10047,#10050)
variableBinding(#10047,#10015)
#10051=*
exprs(#10051,22,#10016,#10046,1)
callableEnclosingExpr(#10051,#10026)
statementEnclosingExpr(#10051,#10045)
#10052=*
locations_default(#10052,#10000,5,19,5,21)
hasLocation(#10051,#10052)
numlines(#10051,1,1,0)
namestrings("""^@""","^@",#10051)
locations_default(#10025,#10000,1,7,1,7)
hasLocation(#10024,#10025)
numlines(#10024,1,1,0)
#10053=*
stmts(#10053,0,#10024,0,#10024)
#10054=*
locations_default(#10054,#10000,1,7,1,7)
hasLocation(#10053,#10054)
numlines(#10053,1,1,0)
#10055=*
stmts(#10055,20,#10053,0,#10024)
#10056=*
locations_default(#10056,#10000,1,7,1,7)
hasLocation(#10055,#10056)
numlines(#10055,1,1,0)
#10057=*
stmts(#10057,14,#10053,1,#10024)
#10058=*
locations_default(#10058,#10000,1,7,1,7)
hasLocation(#10057,#10058)
numlines(#10057,1,1,0)
#10059=*
exprs(#10059,61,#10021,#10057,0)
callableEnclosingExpr(#10059,#10024)
statementEnclosingExpr(#10059,#10057)
#10060=*
locations_default(#10060,#10000,1,7,1,7)
hasLocation(#10059,#10060)
numlines(#10059,1,1,0)
callableBinding(#10059,#10026)
#10061=@"class;java.lang.Object"
#10062=@"callable;{#10061}.<init>(){#10021}"
callableBinding(#10055,#10062)
#10063=*
fielddecls(#10063,#10014)
#10064=*
locations_default(#10064,#10000,3,1,3,21)
hasLocation(#10063,#10064)
#10065=*
exprs(#10065,62,#10016,#10063,0)
#10066=*
locations_default(#10066,#10000,3,1,3,6)
hasLocation(#10065,#10066)
numlines(#10065,1,1,0)
#10067=@"loc,{#10000},3,8,3,13"
locations_default(#10067,#10000,3,8,3,13)
hasLocation(#10020,#10067)
fieldDeclaredIn(#10020,#10063,0)
#10068=*
fielddecls(#10068,#10014)
#10069=*
locations_default(#10069,#10000,4,1,4,26)
hasLocation(#10068,#10069)
#10070=*
exprs(#10070,62,#10016,#10068,0)
#10071=*
locations_default(#10071,#10000,4,1,4,6)
hasLocation(#10070,#10071)
numlines(#10070,1,1,0)
#10072=@"loc,{#10000},4,8,4,14"
locations_default(#10072,#10000,4,8,4,14)
hasLocation(#10019,#10072)
fieldDeclaredIn(#10019,#10068,0)
#10073=*
fielddecls(#10073,#10014)
#10074=*
locations_default(#10074,#10000,5,1,5,22)
hasLocation(#10073,#10074)
#10075=*
exprs(#10075,62,#10016,#10073,0)
#10076=*
locations_default(#10076,#10000,5,1,5,6)
hasLocation(#10075,#10076)
numlines(#10075,1,1,0)
#10077=@"loc,{#10000},5,8,5,15"
locations_default(#10077,#10000,5,8,5,15)
hasLocation(#10015,#10077)
fieldDeclaredIn(#10015,#10073,0)
There was a problem hiding this comment.
had expected
unicodeandnullCharto be extracted similarly because as far as I know unicode escape sequences are replaced before lexical analysis. However, in the extractor output thenormalandunicodelines are extracted similarly, while the last is different.
Really? The TRAP you quote shows the three cases with identical values, right?
namestrings("""\0""","^@",#10035)
namestrings("""\u0000""","^@",#10043)
namestrings("""^@""","^@",#10051)
I think perhaps some of the Java QL code here predates an improvement that I made a long time ago to the handling of the value of string literals, which precisely gives this uniform representation.
There was a problem hiding this comment.
I believe the conclusion in slack was that we'd only need the charlit.regexpMatch("\\u0000") case and that the others currently don't work?
There was a problem hiding this comment.
@aschackmull Yes, you're right. The things get extracted the same way in all three cases. I misread things in my terminal before I replaced the NUL character with something more easy to read ;-)
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SignAnalysisSpecific.qll
Outdated
Show resolved
Hide resolved
This confirmed the performance issue caused by the guards logic change, as expected. I have started new jobs based on a rebased branch: |
csharp/ql/src/semmle/code/csharp/dataflow/internal/rangeanalysis/SignAnalysisSpecific.qll
Show resolved
Hide resolved
8a52f61 to
5bc8c46
Compare
688cd92 to
fdc9e01
Compare
Synced between Java and C# through `identical-files.json`.
fdc9e01 to
06dbec7
Compare
|
Differences job after rebase, squash: https://jenkins.internal.semmle.com/job/Changes/job/CSharp-Differences/446/ |
And here is one for Java as well: https://jenkins.internal.semmle.com/job/Changes/job/Java-Differences/929/ |
I restarted the C# one: https://jenkins.internal.semmle.com/job/Changes/job/CSharp-Differences/448/ |
hvitved
left a comment
There was a problem hiding this comment.
Let's get this merged once the CSharp-Differences job completes (without regressions).
Thanks for you patience @tamasvajk ; this is excellent work.
java/ql/src/semmle/code/java/dataflow/internal/rangeanalysis/SignAnalysisCommon.qll
Show resolved
Hide resolved
java/ql/src/semmle/code/java/dataflow/internal/rangeanalysis/SignAnalysisCommon.qll
Show resolved
Hide resolved
java/ql/src/semmle/code/java/dataflow/internal/rangeanalysis/SignAnalysisCommon.qll
Show resolved
Hide resolved
java/ql/src/semmle/code/java/dataflow/internal/rangeanalysis/SignAnalysisCommon.qll
Show resolved
Hide resolved
java/ql/src/semmle/code/java/dataflow/internal/rangeanalysis/SignAnalysisCommon.qll
Show resolved
Hide resolved
| result = exprSign(e.(CastExpr).getExpr()) | ||
| } | ||
|
|
||
| private Sign binaryOpLhsSign(Expr e) { |
There was a problem hiding this comment.
binaryOpLhsSign and binaryOpRhsSign have different implementation for C# and Java.
java/ql/src/semmle/code/java/dataflow/internal/rangeanalysis/SignAnalysisSpecific.qll
Show resolved
Hide resolved
| } | ||
|
|
||
| pragma[noinline] | ||
| private predicate binaryOpSigns(Expr e, Sign lhs, Sign rhs) { |
There was a problem hiding this comment.
This could be moved to the common bit, but the gain would be minimal. I think it's better to keep it together with binaryOpLhsSign and binaryOpRhsSign.
java/ql/src/semmle/code/java/dataflow/internal/rangeanalysis/SignAnalysisCommon.qll
Show resolved
Hide resolved
java/ql/src/semmle/code/java/dataflow/internal/rangeanalysis/SignAnalysisCommon.qll
Show resolved
Hide resolved
|
Thanks @aschackmull for your findings. I'll go through them during the week... |
Sign analysis ported from Java. It is going to be used by the ported range analysis. The public API contains four predicates:
positive(Expr e),negative(Expr e),strictlyPositive(Expr e),strictlyNegative(Expr e). Range analysis doesn't need anisZero(Expr e), so it's not added for the time being.The code is a 1:1 migration from the Java implementation. Java had no direct tests on it, I added a single one for it, because I was anyways debugging through the Java code. I added a lot more test for C#.