Bug #18235
openLots of failed Cavalcade tasks as of 2023-05-13
0%
Description
Was just checking the failed Cavalcade tasks log and noticed something peculiar. As of 2023-05-13 17:33:05 GMT and on, we've had a consistent stream of failed tasks.
Run the following MySQL query to view all results:
select * from wp_cavalcade_jobs where timestamp > '2023-05-13 17:32:00' and status = 'failed' order by id desc;
As the time of posting, the total number of failed jobs since about three days ago is greater than 15,000. Usually, the number is only a couple failed tasks a day.
I haven't done any further investigation yet, but just wanted to post this issue for now.
Related issues
Updated by Raymond Hoh over 1 year ago
Looks like the problem came from one site: sthomas21003.commons.gc.cuny.edu
.
For some reason, the template
option for the site was set to twentytwelve
, whereas the stylesheet
for the site was set to academica
.
This resulted in the following fatal error when viewing the site:
Fatal error: require(): Failed opening required '/PROD_WWW/html/commons/www/wp-content/themes/twentytwelve/inc/customizer/bootstrap.php' (include_path='.:/usr/share/p ear:/usr/share/php') in /PROD_WWW/html/commons/www/wp-content/themes/academica/functions.php on line 10
I switched the template back to academica
with the following WP-CLI line:
wp option update template academica --url=sthomas21003.commons.gc.cuny.edu --skip-themes=twentytwelve,academica
The --skip-themes
parameter was needed otherwise using the wp option update
command would also cause the fatal error.
The failed Cavalcade tasks are no longer growing.
The next step is figuring out what might have caused the template
DB option to change to twentytwelve
. In our wp-config.php
file, I see that we have set the WP_DEFAULT_THEME
constant to twentytwelve
. In WordPress, this constant is used only in validate_current_theme()
. This function is used in a couple of places, but most notably during the WP Customizer: https://github.com/cuny-academic-commons/cac/blob/491f7adadb505958a6e8c5c38e6a7d5b4ac119b3/wp-includes/class-wp-customize-manager.php#L654-L660. Still not sure exactly what caused the template
DB option to change though. Any ideas, Boone or Jeremy?
Updated by Boone Gorges over 1 year ago
Nice find, Ray! Good to know this is not a broader problem with Cavalcade.
Still not sure exactly what caused the template DB option to change though. Any ideas, Boone or Jeremy?
I don't have any clear thoughts. I wonder if perhaps there was a cache-related glitch, as we've sometimes experienced in the past. Like perhaps the 'template' option wasn't properly loaded from the 'alloptions' cache on some specific pageload, and WP fell back on the default, saving the default theme to the database. It does seem strange, though, because validate_current_theme()
uses switch_theme()
, which should change both the 'template' and 'stylesheet' options.
Given that we haven't heard of this problem before, I'd say that perhaps it's not worth sinking a lot of time into investigating.
Updated by Raymond Hoh over 1 year ago
- Status changed from New to Hold
- Assignee changed from Boone Gorges to Raymond Hoh
- Target version set to Not tracked
I wonder if perhaps there was a cache-related glitch, as we've sometimes experienced in the past. Like perhaps the 'template' option wasn't properly loaded from the 'alloptions' cache on some specific pageload, and WP fell back on the default, saving the default theme to the database.
I'm thinking this might have been the case. Just for my peace of mind, I'm going to log changes to the template
option when it switches to twentytwelve
to see if something else could be causing this.
Updated by Raymond Hoh over 1 year ago
Over the long weekend, we had around 1800 failed Cavalcade tasks from July 2nd until now due to a problem with Jetpack and invalid syntax type. Boone, you can run the following DB query to view the failed tasks:
select * from wp_cavalcade_jobs where nextrun >= '2023-07-02 23:34:22' and status = 'failed' and site = 23793 order by nextrun asc;
The issue is Jetpack was throwing the following fatal error during their scheduled tasks on one specific site -- apiza.commons.gc.cuny.edu
:
Fatal error: Uncaught Error: [] operator not supported for strings in /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-connection/src/class-urls.php:102 Stack trace: #0 /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-connection/src/class-urls.php(137): Automattic\Jetpack\Connection\Urls::get_protocol_normalized_url() #1 /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-connection/src/class-urls.php(146): Automattic\Jetpack\Connection\Urls::get_raw_or_filtered_url() #2 /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-identity-crisis/src/class-identity-crisis.php(201): Automattic\Jetpack\Connection\Urls::home_url() #3 /var/www/html/commons/www/wp-includes/class-wp-hook.php(308): Automattic\Jetpack\Identity_Crisis->add_idc_query_args_to_url() #4 /var/www/html/commons/www/wp-includes/plugin.php(205): WP_Hook->apply_filters() #5 /var/www/html/commons/www/wp-content/plugins/jetp in /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-connection/src/class-urls.php on line 102
The specific Jetpack call was looking for the "jetpack_sync_https_history_home_url"
option and this option was returning the string "100"
instead of an array, which caused the fatal error.
I fixed this by deleting the "jetpack_sync_https_history_home_url"
option for the apiza.commons.gc.cuny.edu
site. I'm guessing this might have occurred for the same reason as my original report, some cache-related glitch.
Updated by Raymond Hoh about 1 year ago
- Related to Support #19083: tarrelleng110.commons.gc.cuny.edu/ added
Updated by Raymond Hoh about 1 year ago
Came across the same Jetpack problem as my previous post. ~140 failed tasks for the preludenyc15.commons.gc.cuny.edu
site with Jetpack activated. Can view the failed Cavalcade tasks with the following MySQL query:
select * from wp_cavalcade_jobs where nextrun >= '2023-11-23 03:14:44' and status = 'failed' and site = 2310 order by nextrun asc;
Same fatal error, invalid option value. Fixed by deleting the jetpack_sync_https_history_home_url
and jetpack_sync_https_history_site_url
options.
Updated by Raymond Hoh 12 months ago
Was checking the Cavalcade failed log and there were roughly 2300 failed tasks between December 15th until now:
mysql> select site,hook,count(*) from wp_cavalcade_jobs where status = 'failed' and nextrun > '2023-12-15 23:13:10' group by site order by count(*) desc; +-------+----------------------------+----------+ | site | hook | count(*) | +-------+----------------------------+----------+ | 31905 | jetpack_clean_nonces | 863 | | 31906 | jetpack_clean_nonces | 856 | | 31907 | jetpack_clean_nonces | 566 | | 886 | akismet_scheduled_delete | 2 | | 18874 | jetpack_sync_full_cron | 1 | | 1534 | action_scheduler_run_queue | 1 | +-------+----------------------------+----------+ 6 rows in set (0.05 sec)
Affected sites were:
mysql> select * from wp_blogs where blog_id in (31905,31906,31907); +---------+---------+------------------------------------+------+---------------------+---------------------+--------+----------+--------+------+---------+---------+ | blog_id | site_id | domain | path | registered | last_updated | public | archived | mature | spam | deleted | lang_id | +---------+---------+------------------------------------+------+---------------------+---------------------+--------+----------+--------+------+---------+---------+ | 31905 | 1 | oliverkowalski.commons.gc.cuny.edu | / | 2023-12-15 23:11:43 | 2023-12-15 23:11:47 | 1 | 0 | 0 | 0 | 0 | 1 | | 31906 | 1 | oakengl11000.commons.gc.cuny.edu | / | 2023-12-15 23:17:04 | 2023-12-15 23:17:04 | 0 | 0 | 0 | 0 | 0 | 1 | | 31907 | 1 | oalengl11000k.commons.gc.cuny.edu | / | 2023-12-15 23:23:17 | 2023-12-15 23:23:23 | 0 | 0 | 0 | 0 | 0 | 1 | +---------+---------+------------------------------------+------+---------------------+---------------------+--------+----------+--------+------+---------+---------+ 3 rows in set (0.00 sec)
In all cases, the cause was the same template / stylesheet mismatch bug described in https://redmine.gc.cuny.edu/issues/18235#note-1 :
$ wp option get template --url=oakengl11000.commons.gc.cuny.edu --skip-themes=twentytwelve,sydney sydney $ wp option get stylesheet --url=oakengl11000.commons.gc.cuny.edu --skip-themes=twentytwelve,sydney twentytwelve
Addressed by switching the stylesheet back to sydney
for the three sites using the following WP-CLI line:
$ wp option update stylesheet sydney --url=oakengl11000.commons.gc.cuny.edu --skip-themes=twentytwelve,sydney Success: Updated 'stylesheet' option.
Luckily, my 'template'
DB option logger finally caught this and it looks like this bug might stem from the site cloning process:
[15-Dec-2023 23:11:45] twentytwelve switched to sydney for site ID 31905: require('wp-blog-header.php'), require_once('wp-includes/template-loader.php'), include('/themes/bp-nelo/buddypress.php'), the_content, apply_filters('the_content'), WP_Hook->apply_filters, bp_replace_the_content, apply_filters('bp_replace_the_content'), WP_Hook->apply_filters, BP_Blogs_Theme_Compat->create_content, bp_buffer_template_part, bp_get_template_part, bp_locate_template, load_template, require('/themes/bp-nelo/blogs/create.php'), cac_show_blog_signup_form, cac_blogs_validate_blog_signup, wpmu_create_blog, wp_insert_site, do_action_deprecated, do_action_ref_array('wpmu_new_blog'), WP_Hook->do_action, WP_Hook->apply_filters, CAC\SiteTemplates\Frontend::process_site_template, CAC\SiteTemplates\Template\Template->clone_to_site, CAC\SiteTemplates\Template\Cloner->go, CAC\SiteTemplates\Template\Cloner->migrate_site_settings, update_option, apply_filters('pre_update_option_template'), WP_Hook->apply_filters, {closure}
I'll try and look into this a bit more tomorrow.