CWE-311: Download a binary file over plain-HTTP, and then execute it#364
CWE-311: Download a binary file over plain-HTTP, and then execute it#364gagliardetto wants to merge 8 commits intogithub:mainfrom
Conversation
|
The other languages, codeql supports, only check for insecure downloads of executable files. Checking only for insecure downloads is far easier than also looking for code that executes the downloaded file. |
In particular, this should be not be caught because it's OK (there is no execution): https://github.com/github/codeql-go/pull/364/files#diff-b7f104754b628bc22d73802262efd672R54-R61 This suggests me that there is some huge error(s) in how I wrote the query. |
Yes, it is far easier. That's why it's an interesting challenge to also find the more specific cases where that downloaded file is executed. But yeah, I'm not sure it's worth the effort. The possible more-focused results that it might yield would still be a subset of the insecure-download query. Is the parent set so big to justify a query for just this subset? |
|
Thanks, this sounds like a good query to have! I'll take a look and see if I can comment usefully on the failing test tomorrow. |
| fld.hasQualifiedName("net/http", "Response", "Body") and | ||
| read = fld.getARead() and | ||
| read.readsField(base, fld) and | ||
| base.getAPredecessor*() = httpClientCall.(DataFlow::CallNode).getResult(0).getASuccessor*() |
There was a problem hiding this comment.
getAPredecessor*() and getASuccessor*() are redundant (use localFlow instead)
| } | ||
|
|
||
| /** | ||
| * Holds if the provided `file` is the result of a `fileCreationCall`. |
There was a problem hiding this comment.
Reword-- there's nothing here to assure us that fileCreationCall is creating the file, only that it returns it.
Rename to callReturnsFile or similar (it hasn't a return value, so is not a getter)
| /** | ||
| * Holds if the provided `fileCreationCall` had as source the provided `filename`. | ||
| */ | ||
| predicate getFilename(DataFlow::CallNode fileCreationCall, ValueEntity filename) { |
There was a problem hiding this comment.
Similarly, this is a relation not a getter
|
The reason this doesn't work is because the query asks "is there a path from to an I'll follow-up with some more practical advice about how to do this shortly. |
This way the calling context is maintained throughout.
|
Here's a branch showing a sketch of my suggested approach: https://github.com/smowton/codeql-go/tree/smowton/draft/insecure-binary-download In short, we use additional steps and function models to make url-to-http-response-body and file-write-to-filename into steps, and so trace the whole flow in one configuration -- then use subsidiary configurations to make (not 100% effective) consistency checks, like at least one HTTP client request being fed from the same source. This way the whole end-to-end query maintains the needed callstack context and your good and bad examples are distinguished as we might hope. Feel free to take as much or as little of that draft as you like -- suggested things to check now:
|
|
I've measured the performance impact of increasing the number of PostUpdateNodes as is done in that branch. The cost was negligible. |
|
Thank you so much for your help, @smowton ! |
…-download' into curl-bash-over-http
That will be part of the next project I'm working on that will semi-automate the addition of concrete implementations of the codeql concepts (most likely starting with Do you think this query might be self-sufficient as-is? @smowton |
smowton
left a comment
There was a problem hiding this comment.
OK, sounds good. Have started an all-projects evaluation of this; while we're waiting here are some trivial cleanups, which I appreciate are to a reasonable degree cleaning up my own code :)
ql/src/experimental/CWE-311/InsecureBinaryDownloadExecute.qhelp
Outdated
Show resolved
Hide resolved
ql/src/experimental/CWE-311/InsecureBinaryDownloadExecute.qhelp
Outdated
Show resolved
Hide resolved
ql/src/experimental/CWE-311/InsecureBinaryDownloadExecute.qhelp
Outdated
Show resolved
Hide resolved
ql/src/experimental/CWE-311/InsecureBinaryDownloadExecute.qhelp
Outdated
Show resolved
Hide resolved
| predicate isSink(DataFlow::Node sink, SystemCommandExecution exec) { | ||
| sink = exec.getCommandName() | ||
| } |
There was a problem hiding this comment.
Inline this at its only user
| class FlowHttpUrlToClientRequest extends TaintTracking::Configuration { | ||
| FlowHttpUrlToClientRequest() { this = "FlowHttpUrlToClientRequest" } | ||
|
|
||
| predicate isSource(DataFlow::Node source, StringLit val) { isHttpUrlSource(source, val) } |
There was a problem hiding this comment.
Inline these trivial 2-arg getSource methods everywhere
| override predicate isSink(DataFlow::Node sink) { this.isSink(sink, _) } | ||
| } | ||
|
|
||
| predicate isUrlToRequestResponseStep(DataFlow::Node url, DataFlow::Node response) { |
|
Am just investigating the test situation, which relates to my introducing new PostUpdateNodes. If this is very time-consuming I can back that change out (causing some BAD cases here to not be detected for now) and follow up; I'll let you know which way I want to go by the end of the day. |
|
Result: there were 6 true-positive results, but none of them involved actually saving the response to a file and running it -- rather, they all involved directly executing (part of) the retrieved body. Suggest rephrasing the help and description to make clear the save-to-file step, which is currently universal, is not necessary. |
Co-authored-by: Chris Smowton <smowton@github.com>
Oh, you mean something like evaluating a script directly from the command line without passing through a file medium? I'm not sure I follow here; if results were found, does that mean that the query does not require a save-to-file step? I don't exactly understand the logic of the query anymore 😅 |
|
Save-to-file makes sense -- it's like the difference between vs. simply Both of those are a problem, it just so happens that the existing cases are all like the latter one. As to the logic of the query-- it's just ordinary existing taint-flow modelling, starting from an |
|
Two high-level comments; I haven't looked at the code:
|
|
Re: the last bit, I'd suggest we merge this as-is and then generalise it as a separate PR when moving this out of the |
|
@gagliardetto I checked out the test alterations, and it looks like there's trouble introduced by the expansion of |
CWE-311: Missing Encryption of Sensitive Data
This query looks for scenarios where:
The above Go scenario corresponds to the infamous
curl -s "http://example.com/hello.bin" | bashNOTE:
I'm having difficulties to make the query work as expected. The different flows don't align as I expected.