Feature #22353
openIntroduce static analysis to CI pipeline
0%
Description
Bugs like #22351 could be avoided if we had some basic static analysis tools. I've set up PHPStan successfully on projects not too dissimilar to the Commons. It takes a good deal of configuration to get it set up, and getting our custom codebase to pass even the most permissive scan level will probably take some work. At the same time, the benefits are quite real: we avoid the type of simple regression in #22351, and likely also identify other latent bugs.
Ray, Jeremy, have you used PHPStan or another tool for this? I'm happy to have a first crack at it, unless one of you has other thoughts.
Updated by Jeremy Felt 11 months ago
I've finally got some work up in the feature/phpstan-initial branch. I'm going to brain dump here and then give Boone and Ray a chance to get caught up a bit / ask questions. :)
This branch adds a few related composer dependencies and setups up an initial phpstan configuration.
Categories¶
To make things a bit more manageable / contained / faster than molasses to run, I've split code up into various categories.
- mu-plugins: anything in the mu-plugins directory, excluding mercator and cavalcade.
- BP plugins: A managed list of plugins related to BuddyPress and bbPress.
- CAC plugins: A managed list of plugins explicitly prefixed as
cac-. - Plugins One: A list of remaining plugin directories.
- Plugins Two: A second list of remaining plugin directories.
- Themes: The
bp-nelotheme.
Configuration¶
A top-level phpstan configuration exists in developer/phpstan/config/phpstan.neon.dist. This file uses the phpstan default bleedingEdge.neon configuration, scans for errors at level 1, and has a variety of common scan files, ignore files, etc...
Each category has its own phpstan configuration (e.g. phpstan-plugins-cac.neon) that allows us to specify a subset of directories to scan and ignore specific errors if needed.
Baseline¶
Even at level 1, there are thousands of "errors". Rather than ignore these going forward, we can create a baseline so that only new errors are highlighted. These baseline files are large PHP arrays that track found errors and are located in developer/phpstan/baseline/.
Each category has its own baseline file (e.g. baseline-plugins-cac.php).
Stubs¶
PHPStan uses stubs to help describe the expected structure of methods, classes, and other code in cases where you don't want to include the entire codebase. We're able to use the composer packages szepeviktor/phpstan-wordpress and php-stubs/wp-cli-stubs so that we don't need to load the entirety of WordPress or WP-CLI into memory each time.
In addition to those composer packages, we're also able to define our own stubs. PHPStan seems to have trouble analyzing the origin of various constants, so we have a file in developer/phpstan/stubs/phpstan-constants.php that contains a long list which are likely fine to use, but that PHPStan could not verify.
Scripts¶
I've setup an assortment of composer commands that can be used to run scans or baseline creation per category or as a large batch.
composer phpstanruns a PHPStan scan on each category.composer phpstan:plugins:cacruns a PHPStan scan on CAC plugins.composer phpstan:baselinecreates a baseline scan for each category.composer phpstan:plugins:cac:baselinecreates a baseline scan for CAC plugins.
Tasks / Decisions:¶
- What code do we want to scan? * It seems obvious that mu-plugins, bp-nelo, and most cac- prefixed plugins fit here. Most of the bp- prefixed plugins also fit. We may be able to work out a configuration for other plugins and themes that only highlights the most severe errors that should not be shipped to production.
- When do we want it to scan? * This is easy enough to setup as a GitHub action, but scans take quite a bit, so the feedback loop is fairly slow. * I've thought through scanning only changed files a bit, but because a code change in one file can impact any file in the project, we probably need to scan a good chunk each time. There's some sort of balance of risk here—for example, mu-plugins vs the long list of 3rd party plugins.
- Are these categories correct? And how do we maintain them going forward? * I think we can script some directory list commands to highlight when the phpstan configuration is out of sync with the code base.
Updated by Boone Gorges 11 months ago
Awesome!! So much to digest here.
The current breakdown of content seems pretty good. A couple thoughts for (potentially) refining:
- The cac- prefix is a good but not perfect indicator of which plugins are "first-party". We also have a few other tools that are or were developed for sole use on CAC. social-paper comes to mind (though it's deprecated so I don't care about it). Another dumb one, but a good example, is bava-comments #5841. I guess we can discuss the extent to which it's really all that helpful to separate these out - I suppose the idea is that perhaps we would increase the scanner level for in-house-developed tools?
- Similarly, most of what's picked up in mu-plugins is homegrown, but some is not. See eg the cavalcade and mercator libraries.
Do you know how introducing PHPStan into our GitHub Actions workflow is going to affect billing? I assume that it uses machine-minutes, or whatever GH uses to charge us. Obviously this will quickly get out of hand, if we let it. Third-party plugins are generally updated only twice-monthly, which should help to keep these runs down, but it assumes that we are able to target based on files (or categories) that have changed. So, tentatively, I'd say that plugins-one and plugins-two would only need to run twice per month, when plugins are batch-updated. The first-party categories will potentially be updated more frequently, and ideally we could configure an Action to recognize that only certain categories need to be updated on a given change.
Regarding your section on stubs: In the past I've used PHPStan's symbol discovery as an alternative to stubs https://phpstan.org/user-guide/discovering-symbols Maybe this is what you mean about "loading the entirety into memory"?
Updated by Jeremy Felt 10 months ago
I guess we can discuss the extent to which it's really all that helpful to separate these out - I suppose the idea is that perhaps we would increase the scanner level for in-house-developed tools?
I think so, level and frequency. Even the existing categories can take many minutes each in my local environment—I just ran the entire suite in the background in 20 minutes. There's probably room to separate things further so that the scan most frequently run is on some gut-level categorization of actively maintained / interconnected / high impact.
In theory, any change could impact any piece of code. In reality, our code and plugins aren't often calling each others' functions or conflicting in any way that static analysis will identify.
Generating the baseline errors should help here because we can at least say: "let's not introduce any new level 4(?) errors in code that we actively maintain". 3rd party plugins and themes will likely continue to introduce level 1 errors, but hopefully at a rate where we can run the report every so often, generate new baselines as needed, and identify any types of level 1 errors that would cause us to hold an update.
Separately, we can parse the current baseline to identify those that we're okay ignoring in current code, okay ignoring in future code, or should fix in current code and not allow in future code.
Regarding your section on stubs: In the past I've used PHPStan's symbol discovery as an alternative to stubs
I tried a couple different approaches here and I think there's a combination of things being finicky, my learning curve, and trying to speed things up. I know I went down the symbol discovery approach, but I don't remember why it wasn't working for the constants. I'll poke around at that a bit more. :)
Do you know how introducing PHPStan into our GitHub Actions workflow is going to affect billing?
I haven't run an actual GitHub Action test, but I think 20 minutes may be a reasonable enough guess, which is ~$0.16 if over our included plan minutes. It looks like the current CAC usage is ~210 minutes for July so far, so there should be plenty of room in the 2 or 3000 minutes that we have, but don't hold me to that!
In addition to cost, I think that the time just makes using GitHub Actions painful. I know this impacts Boone's workflow more than anyone else, but I could see:
- Use GitHub actions on commits to code we actively manage (target ~2-3 minutes/run)
- And use a script locally when updating plugins and themes twice a month
- And have a scheduled GitHub action run every so often to capture all categories
And then another process to define is: what happens when an error pops up in those 3 cases? Triage/fix/etc...
Updated by Boone Gorges 10 months ago
Jeremy and I discussed this briefly the other day and decided that the system as he's currently built it is ready for a test run. He's going to do some work to integrate selective CI runs into the GitHub Action workflow - "selective" in the sense that a group will only be scanned if the files in that group are touched. Once it's running in GH, we can monitor for a while, for cost, for runtime, for utility, etc. Based on this, we can refine the way that, for example, the most mission-critical plugins on the primary site are tested.
Updated by Jeremy Felt 4 months ago
The perfect scenario: When any file changes, run static analysis on everything.
In theory, it's possible any change could impact any code across the ecosystem. In practice, code generally relies on a small subset of the system and most code in the repository isn't touched for years. It takes a long time to analyze the entire project and the likelihood of it mattering is low.
A good scenario: When a file changes, run static analysis on that file and its likely related files.
For example, we should be able to assume that a change made to one of our MU plugins does not have any impact on nor care about any change in a 3rd party theme that hasn't been updated in years.
But! The idea of "when a file changes" really only works if we're using pull requests to submit changes and waiting for workflows to finish before merging. If we're running this on pushes of commits to a branch, then it's hard for git and the workflow to assume a start commit when generating a diff.
The consolidation scenario: Run static analysis on a specific set of files (mu-plugins, bp-nelo, etc...) every time a change is pushed to any PHP file (or phpstan config).
This doesn't provide 100% coverage, but can catch changes that would likely have a large impact (i.e. network-wide and/or main site code). It also completes in ~8 minutes, which is faster than the current PHPCS and PHP lint workflows.
Here's an approach:
- Run a python script (developer/scripts/analyze-paths.py) to analyze the most frequently changed files in the repository over the last 2 years. This is meant to be run manually to help audit the configuration groups.
- Identify impactful configuration groups to help provide focus to any analysis and cut down on the amount of time it takes to analyze.
- Add a phpstan configuration for each of those groups, located in their respective directories. This allows us to manage different baselines, analysis levels, error ignoring, etc... for each group.
- Add a standard phpstan configuration at the top level of the project to provide all of the paths that should be included/excluded regardless of the run.
- And then, from time to time (???), maybe quarterly, maybe a couple weeks before major releases? We can plan on running a more complete suite for the entire codebase to see if anything has changed against our baseline.
The groups I've defined so far are:
- All mu-plugins (wp-content/mu-plugins)
- BP Nelo (wp-content/themes/bp-nelo)
- CAC CV Editor (wp-content/plugins/cac-cv-editor)
- CAC BP Custom Includes (wp-content/plugins/cac-bp-custom-includes)
I don't think that's a final list, but it covers our most frequently touched code. For example, it probably makes sense to only check something like the CV Editor when working with it directly or as part of a wider run. And there may be a less-touched plugin that is critical enough to include each time.
Each of these groups has its own baseline file in developer/phpstan/baseline. These are errors that we're ignoring for now, but don't want to appear in future reports. If we'd like, we can prioritize the errors in these and actually fix them across the codebase, or leave them for future triage.
The branch for all of this is task/phpstan-static-analysis and you can see the latest action in progress at https://github.com/cuny-academic-commons/cac/actions/runs/21380412502
Updated by Boone Gorges 4 months ago
Thanks, Jeremy. This is great progress.
Can we add wp-content/plugins/bp-custom.php to the list of key groups? This acts like an mu-plugin.
I fixed a couple of the little things in the last branch so that I could see the workflow go green. It worked :-D
The Action is pretty fast - a little less than 4 minutes. How hard would it to be across the entire codebase? I assume it'll take hours (with higher costs?). Is this something we'd do manually? Maybe we have a specific file in the repo that needs to be "touched" every time we want to trigger the run?
Might be worth asking an AI agent to take a look at the exceptions you've listed in your baseline files and decide whether there are ways of categorizing and prioritizing (some combination of ease/safety to fix and potential severity) and perhaps even patching.
Updated by Jeremy Felt 4 months ago
The Action is pretty fast - a little less than 4 minutes.
It's fast per job, but we multiply that by 4 jobs right now. You can see the per run metrics here: https://github.com/cuny-academic-commons/cac/actions/metrics/usage?tab=jobs&filters=workflow_file_name%3Aphpstan.yml
I assume it'll take hours (with higher costs?). Is this something we'd do manually? Maybe we have a specific file in the repo that needs to be "touched" every time we want to trigger the run?
Yes. We can setup another workflow that fires only manually and runs the full suite. We'd initiate it through the actions interface on GitHub. With parallel jobs, it may not take a ton of actual time to run, but the costs will be somewhat higher. I think running this before releases is probably worth it.
Might be worth asking an AI agent
I resisted the urge yesterday, but my imaginary friend Claude is now "systematically fixing these issues". We'll see what that looks like. :)
Updated by Jeremy Felt 3 months ago
This work is now merged in the 2.7.x branch.
Next steps:
- Add a config for wp-content/plugins/bp-custom.php
- Send an AI agent after the baseline errors to see how much cleanup we can confidently do
Updated by Jeremy Felt 3 months ago
Boone / Ray: I've opened an initial pull request with Claude's attempt at addressing initial baseline errors. I asked it to make one commit for each type of error, which it seems to have done fairly well. My guess is that it may be easiest to review that way: one commit at a time. A quick glance shows that there are some that are probably easy/safe and others that might be left untouched. https://github.com/cuny-academic-commons/cac/pull/13
I'll get a config for bp-custom.php going as well. Should we open a new ticket for the baseline issues?
Updated by Boone Gorges 3 months ago
Thanks so much, Jeremy. I've reviewed and merged all the suggested fixes. Most were trivial, but it found some real (if latent) bugs in cac-bp-custom-includes.
How did you prompt Claude to address this? I see there's still a bunch of stuff in baseline-mu-plugins.php. Perhaps this was based on phpstan level? Or did you do some other kind of massaging/"skill"?
I think it's fine to continue here in this ticket with baseline stuff for bp-custom.php.
Updated by Jeremy Felt 3 months ago
Thanks, Boone! It's good to know it was a useful effort of sorts. :)
Here's the prompt and resulting agent "decisions": https://gisthost.github.io/?82303aac8a888005355bab011df206c6/page-001.html#msg-2026-02-24T17-17-16-164Z
The key part is at the end. I haven't followed up with another prompt to address it yet, but I'll see where that takes us.
The remaining baseline entries are primarily third-party property access issues (BuddyPress, bbPress, CACSP_Paper dynamic properties), WordPress stub limitations (remove_filter with 4 params), and complex legacy code patterns that would require larger refactoring.
Updated by Jeremy Felt 24 days ago
I've added a phpstan config for plugins/bp-custom.php in the 2.8.x branch with its baseline errors logged.
I then had an agent walk through the errors it had a high confidence approach for and created this branch: https://github.com/cuny-academic-commons/cac/compare/2.8.x...task/fix-baseline-bp-custom-errors
Boone / Ray - All of that looks good to me, nothing seems dangerous. Your eyes will know it better though. :)
It's also worth taking a look at the baseline errors file, a couple highlights:
- Cac_Featured_Content_Widget::$image_width and Cac_Featured_Content_Widget::getPostContentImage() are called but seem to be invalid.
- There are a couple spots with unreachable code that could be removed (lines 217 and 2092)
- And a handful of logic errors that I'm happy to take a stab at, but you may have better awareness of
Updated by Boone Gorges 24 days ago
Fantastic, thanks! I reviewed the fixes and merged them. Then I went through the next layer of baseline issues and factored them out. There's a few left, but they're really due to quirks in BuddyPress itself, which is not something I want to work around at the time being just to satisfy PHPStan.
Updated by Raymond Hoh 24 days ago
Thanks for your work on this, Jeremy!
I took a look at the composer phpstan:plugins:cac-bp-custom-includes:baseline task and I decided to go through all the baseline errors and pushed up fixes for them in the task/fix-baseline-cac-bp-custom-includes-errors branch.
Some notable changes are:
- Removal of cac_groups_documents_display_content() function: https://github.com/cuny-academic-commons/cac/commit/e8c9ff12c8ba53332a5ea55ce551e09625f2d4eb. This function isn't being used in our current iteration of the Commons.
- Removal of older advanced-post-cache plugin workaround: https://github.com/cuny-academic-commons/cac/commit/ec070f8961472367173079cffc6c3718ee142f49. We no longer use the advanced-post-cache plugin on the Commons.
- empty.variable error in group site clone code: https://github.com/cuny-academic-commons/cac/commit/99ce8d0e925b9e140bf41091a5f48465affde302. This one might require a once-over.
The other fixes should be straight-forward. Boone, can you take a look when you have a chance?
Updated by Boone Gorges 23 days ago
Awesome, thank you so much for doing this, Ray! All looks good - I just had a question and potential improvement that I left as a comment at https://github.com/cuny-academic-commons/cac/commit/e8c9ff12c8ba53332a5ea55ce551e09625f2d4eb
Updated by Raymond Hoh 23 days ago
I just had a question and potential improvement that I left as a comment at https://github.com/cuny-academic-commons/cac/commit/e8c9ff12c8ba53332a5ea55ce551e09625f2d4eb
Good spot on the cac_testers() function. I forgot to remove it. This is done in https://github.com/cuny-academic-commons/cac/commit/7dac61bc85735df7fb5521c565bb70bbb0eeb07e.
Updated by Raymond Hoh 11 days ago
I've added some additional baseline fixes for bp-custom.php in https://github.com/cuny-academic-commons/cac/compare/2.7.x...task/fix-baseline-bp-custom-errors.
Some notable changes:
- Fix invite status during group cloning: https://github.com/cuny-academic-commons/cac/commit/8101e5e6bc85e8b6c6f9825b576ab2810cd4df3f. A good spot by PHPStan. The 'invite_status' property is not fetched when constructing a BP_Groups_Group object.
- Removal of cac_get_avatar_for_featured_blog() function: https://github.com/cuny-academic-commons/cac/commit/1bf6efbd976b07b1b3130005935965837e362925. The cac-featured-content plugin was rewritten by Dominic back in the day and the 'cac_featured_content_blog_avatar' filter no longer exists after the rewrite, so this function is no longer needed.
- Fixing a deadCode.unreachable error: https://github.com/cuny-academic-commons/cac/commit/337fe18b8512a6ec1dc67cb959e13b2939e84de3. I just removed the code underneath it and added a comment when this change was made.
There is only one remaining deadCode.unreachable error: https://github.com/cuny-academic-commons/cac/commit/f0c670b08cd34b9da519652b6368ef2a3310d83f. I've left that one as-is, as I'm not sure if we want to keep the cac_role_field_prompt() function in the codebase for historical reference.
Updated by Boone Gorges 10 days ago
Thanks, Ray! Looks like some of your work got crossed up with some commits I put in place last week - not sure if I hadn't pushed them up or what. In any case, I've manually merged it all together on the 2.7.x branch.
Updated by Raymond Hoh 10 days ago
Shoot, my bad, Boone. I was working off the wrong branch instead of basing my changes off of 2.7.x branch. I've merged a few things that were missed from the task/fix-baseline-bp-custom-errors branch including the removal of the cac_get_avatar_for_featured_blog() function: https://github.com/cuny-academic-commons/cac/commit/1bf6efbd976b07b1b3130005935965837e362925.
It also looks like the PHPStan fixes from the task/fix-baseline-cac-bp-custom-includes-errors branch have not been merged into 2.7.x. Can you take a quick look to see if anything needs to be resolved before merging?
Updated by Raymond Hoh 8 days ago
I've added some baseline fixes for our bp-nelo theme in https://github.com/cuny-academic-commons/cac/compare/2.7.x...task/fix-baseline-bp-nelo-errors.
Notable changes:
- Fix groupblog ID fetching during course deletion for group: https://github.com/cuny-academic-commons/cac/commit/0cacce72400f2e81cb37daaddefaa82bf4a4e822
- Update activation template to use current post form / BuddyPress activation key logic and remove our older workaround: https://github.com/cuny-academic-commons/cac/commit/cf21b676f77916fe2950dcbd248290e806e147ec
- Fix invalid group checks in group cover images and photos functions: https://github.com/cuny-academic-commons/cac/commit/24b36e62beef91c05ec6faf642d695f4711ff992
- Remove cac-advanced-profiles template code as we removed the plugin completely: https://github.com/cuny-academic-commons/cac/commit/e8542b5628131de358a266a8b3707a714bbf19d7
- Remove dirt template as tapor-client has superseded it: https://github.com/cuny-academic-commons/cac/commit/6e0ac6b89c731ccd85db4df8e83d2f66768f3742
- Fix PHPStan stub constants for those referencing file paths: https://github.com/cuny-academic-commons/cac/commit/108e690b5ac271479bbdc08a53fe6cf070ec4ad9
There's only one remaining error and that involves an undefined.variable for $total_count: https://github.com/cuny-academic-commons/cac/blob/ec088c4dc1dc784bbc486fb8265572fcece5d815/wp-content/themes/bp-nelo/functions.php#L754. Boone, can you take a look at the logic in the cac_groups_remove_member() function? The group count reduction appears to be occurring twice.
Updated by Boone Gorges 7 days ago
Thanks, Ray!
I'm not sure what I was thinking when I wrote that code many years ago, but I replaced the incrementor logic with true recounts. Might be redundant with something BP already does, but it shouldn't be harmful. https://github.com/cuny-academic-commons/cac/commit/ea6e4ea17abc7a2430c08da0f7158e02f66b0e59
Updated by Raymond Hoh 4 days ago
Might be redundant with something BP already does, but it shouldn't be harmful.
Thanks for doing that, Boone!
I've added some baseline fixes for the mu-plugins task in https://github.com/cuny-academic-commons/cac/compare/2.7.x...task/fix-baseline-mu-plugins-errors
Notable changes:
- PHPStan came across a property.notFound error in mpo-mods.php. This led to some auditing of the mpo-mods.php code and some fixes to some site privacy enforcement issues with the REST API and RSS feeds: https://github.com/cuny-academic-commons/cac/compare/410b6b7...74b22ad. Sorry for clumping this one in with this branch, but they were kind of tied together! View the commit messages for more details.
- Blank MONTH_IN_SECONDS constant being defined in LayerSlider plugin: https://github.com/cuny-academic-commons/cac/commit/deea438b7f802fac29541335e356ab616905497f. PHPStan got confused with this define. Probably a load order issue with phpstan-wordpress's MONTH_IN_SECONDS define not being loaded in time or in memory.
- Removal of sitewide_feed.php and sitewide-tag-suggestion.php mu-plugins: https://github.com/cuny-academic-commons/cac/compare/765a345...43aa2b3. The functionality of these mu-plugins were not being used, so I removed them.
There are a few, remaining errors, but the ones I wanted to ask about are those related to network-plugin-auditor.php. network-plugin-auditor only works with the ?stats=1 URL parameter. I think we can remove this mu-plugin as well since we added our own, custom wp_network_plugins and wp_network_themes DB tables to query about this information. The other errors are to do with the undefined _b() function and to samesite-cookie-manager. The samesite-cookie-manager comparison one looks like it should only take place for subdirectory installs and not for subdomain installs, but I can inquire about this upstream to the plugin author before making the change.
This should be the last of the PHPStan baseline fixes!
Updated by Boone Gorges 3 days ago
Thank you, Ray! I reviewed your changes and they look good.
I pushed some additional fixes to the same branch, including the removal of references to _b() and the removal of network-plugin-auditor.php. The one remaining item has to do with samesite-cookie-manager.php. I'll let you take care of that.
Feel free to merge into the 2.7.x branch when you're ready.
Updated by Raymond Hoh 2 days ago
Thanks Boone. I've taken care of the last samesite-cookie-manager error in https://github.com/cuny-academic-commons/cac/commit/5edca4dd7b25bfd8d1d83416ddfe3dffb28a225c by removing the COOKIEPATH / SITECOOKIEPATH conditional block. On a subdomain multisite install like ours, that conditional would never take place. I did send a PR to the plugin author to use ! is_subdomain_install(), but I retracted the PR because there might be sites using those constants in non-traditional set ups and I also did not take into account single-site subdirectory installs when I initially made the PR, so better to leave things be for those already using the plugin.
I've merged the task/fix-baseline-mu-plugins-errors branch into 2.7.x branch.