Create resolve-queries script #3664

Closed
jhutchings1 wants to merge 46 commits into github:main from jhutchings1:querylist

@jhutchings1
Contributor

@jhutchings1 jhutchings1 commented Jun 9, 2020

This PR creates a PowerShell script that can be used to report on the set of queries inside of a particular QL Suite. The output of this script is a CSV that you can use to quickly compare the set of queries from each of the query suites. I've attached an example of this (which is accurate as of last week, but will become stale over time).

query-lists.txt

Creates a PowerShell script that can be used to report on the set of queries inside of a particular QL Suite.
@jhutchings1
Contributor Author

@sj Do you mind having a look at this one and merging if it's all good from your end?

@sj sj self-requested a review July 9, 2020 08:30
@sj
Contributor

sj commented Jul 9, 2020

Tried to install PowerShell but it doesn't seem to support the most recent version of Ubuntu. Have a few more options, but might have to pass on the review of this to someone who can actually run the script (and understands PowerShell!)

@sj
Contributor

sj commented Jul 17, 2020

PowerShell support for Ubuntu 20.04 is under active development: PowerShell/PowerShell#12626

In the meantime, I've been able to install it by fetching libicu60 and libssl1.0.0 from Ubuntu Bionic. So far it seems to work... Giving this a quick review now.

@sj
Contributor

sj commented Jul 17, 2020

@jhutchings1: After trying to get resolve-queries.ps1 to run on my machine, I concluded that, as it stands, this script is not very easy to use for most people working on the CodeQL repository. A large majority use Linux (or Mac), on which PowerShell is not easily available.

In addition, the way the script writes CSV can cause formatting problems for fields that contain commas, quotes, and newlines.
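
To illustrate the concern, here is a minimal Python sketch comparing hand-joined fields with the standard library's `csv` module. It is not code from either script in this PR, and the row contents are made up for illustration.

```python
import csv
import io

# Hypothetical row: a field containing a comma, quotes, and a newline.
row = ["cs/example-query", 'Use of "foo", unchecked', "line one\nline two"]

# Naive approach: join fields with commas, with no quoting or escaping.
naive_line = ",".join(row)

# csv module approach: fields are quoted and embedded quotes are doubled.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
quoted_line = buffer.getvalue()

# Reading each result back shows which one preserves the original fields.
naive_rows = list(csv.reader(io.StringIO(naive_line)))
quoted_rows = list(csv.reader(io.StringIO(quoted_line)))

print("naive round trip intact: ", naive_rows == [row])
print("csv module round trip intact:", quoted_rows == [row])
```

The hand-joined line splits into extra fields and an extra row when read back, while the `csv.writer` output round-trips unchanged.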

I do completely agree that this sort of functionality would be super useful, and I think it'd be accessible to more people if we provide it in the form of a Python script. I didn't want to ask you to rewrite the script in Python (not sure whether you're comfortable in that language), so I had a quick stab at it myself. I took the liberty of adding a commit to your branch with the new file. I'll push that in a moment and leave a comment to describe it. I've left the resolve-queries.ps1 script as-is.

Let me know what you think!

@sj
Contributor

sj commented Jul 17, 2020

I've just pushed my Python script to your branch. Please take a look and tell me what you think, @jhutchings1!

Some notes:

  • The script is a fair bit longer than yours. Python is slightly wordier than PowerShell, but the script also includes more documentation, I tried to make the Git repo/NWO detection a bit sturdier, and the CSV writer is definitely more robust.
  • The script should work on Linux, Windows, and Mac as long as Python is available and git and codeql are on the path. Please check whether it works for you!
  • The script will automatically detect that it's being run from within a Git repo, and will add the right things to the search path to make things work.
  • It writes output to stdout, so creating the file would be: `python resolve-code-scanning-query-packs.py > output.csv`
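
The notes above could be sketched roughly as follows. This is not the script pushed to the branch: the helper names (`git_toplevel`, `nwo_from_remote_url`, `write_rows`) are hypothetical, and the sketch only assumes that `git` is on the path and that the remote URL points at github.com.

```python
import csv
import re
import subprocess
import sys


def git_toplevel(cwd="."):
    """Return the Git work tree root containing cwd, or None if unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=cwd, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git is missing, or cwd is not inside a repository
    return result.stdout.strip()


def nwo_from_remote_url(url):
    """Extract 'owner/repo' from a GitHub HTTPS or SSH remote URL, else None."""
    match = re.search(r"github\.com[:/]([^/]+)/([^/]+?)(?:\.git)?/?$", url.strip())
    return f"{match.group(1)}/{match.group(2)}" if match else None


def write_rows(rows, out=sys.stdout):
    """Write rows as CSV to a text stream (stdout by default)."""
    csv.writer(out, lineterminator="\n").writerows(rows)


if __name__ == "__main__":
    # Illustrative row only; a real run would list queries per suite.
    write_rows([["repository", "root"],
                [nwo_from_remote_url("https://github.com/github/codeql.git"),
                 git_toplevel() or ""]])
```

Writing to stdout keeps the script composable: the caller decides where the CSV goes by redirecting the output.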

@sj
Contributor

sj commented Jul 17, 2020

Here's an example of the output. Also imported into Google Sheets to test CSV sturdiness.

The script raised two warnings about missing tags metadata: one due to a typo in cs/inefficient-containskey, which was fixed in this PR, and one on go/mistyped-exponentiation, which I've pinged the team about.

@sj sj added the enhancement New feature or request label Jul 17, 2020
@adityasharad adityasharad changed the base branch from master to main August 14, 2020 18:34
@jhutchings1
Contributor Author

Good news everyone

@sj I managed to get the Actions integration working properly against the PowerShell version of the script. Happy to swap this out for the Python one, but at the moment, this seems like it would meet our needs. You can see the output at the link below. Let me know how you want to proceed.

https://github.com/jhutchings1/codeql/actions/runs/215593850

@sj
Contributor

sj commented Sep 1, 2020

Thanks for pushing this forward, @jhutchings1! As I mentioned before, I don't think manually generating CSV files in PowerShell is the right way forward here. PowerShell is not widely (if at all) used by the CodeQL team, so it won't be easily maintainable. Unfortunately, the manual approach means that the CSV is malformed: this is a Google Sheet import (note the oddity in column F due to anomalies in various rows, like 500).
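
One way to spot this kind of malformation without a spreadsheet is to look for rows whose field count differs from the header. The sketch below is an assumption about how such a check might look; the function name `ragged_rows` and the sample data are hypothetical and do not reproduce the actual file.

```python
import csv
import io


def ragged_rows(csv_text):
    """Return (line_number, field_count) for rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        return []  # empty input: nothing to compare
    return [
        (reader.line_num, len(row))
        for row in reader
        if len(row) != len(header)
    ]


# Made-up sample: the third line has an unquoted comma inside the name field.
sample = "id,name,tags\na,Plain name,x\nb,Name, with comma,y\n"
print(ragged_rows(sample))
```

A row with an unescaped comma gains an extra field, which is the sort of shift that shows up as stray values in a later column after import.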

I've opened a new PR that contains the Python version of your script (see #4177), which now runs successfully on Actions (see https://github.com/github/codeql/runs/1056018299?check_suite_focus=true). By using the Python standard library for writing CSV files, this also generates a well-formed CSV: code-scanning-query-lists.zip (which is importable in Google Sheets)

I've asked the CodeQL team to review #4177, while keeping in mind that it's very much an MVP: if it turns out that this sort of functionality is indeed really useful, we might decide to adopt it natively into the CodeQL CLI (which would also make it much faster). Let me know what you think and please do add your review comments to #4177!

@jhutchings1 jhutchings1 closed this Sep 2, 2020