Project

General

Profile

Actions

Bug #18235

open

Lots of failed Cavalcade tasks as of 2023-05-13

Added by Raymond Hoh 10 months ago. Updated 3 months ago.

Status:
Hold
Priority name:
Normal
Assignee:
Category name:
Cavalcade
Target version:
Start date:
2023-05-16
Due date:
% Done:

0%

Estimated time:
Deployment actions:

Description

Was just checking the failed Cavalcade tasks log and noticed something peculiar. As of 2023-05-13 17:33:05 GMT and on, we've had a consistent stream of failed tasks.

Run the following MySQL query to view all results:

select * from wp_cavalcade_jobs where timestamp > '2023-05-13 17:32:00' and status = 'failed' order by id desc;

As the time of posting, the total number of failed jobs since about three days ago is greater than 15,000. Usually, the number is only a couple failed tasks a day.

I haven't done any further investigation yet, but just wanted to post this issue for now.


Related issues

Related to CUNY Academic Commons - Support #19083: tarrelleng110.commons.gc.cuny.edu/DuplicateRaymond Hoh2023-10-23

Actions
Actions #1

Updated by Raymond Hoh 10 months ago

Looks like the problem came from one site: sthomas21003.commons.gc.cuny.edu.

For some reason, the template option for the site was set to twentytwelve, whereas the stylesheet for the site was set to academica.

This resulted in the following fatal error when viewing the site:

Fatal error: require(): Failed opening required '/PROD_WWW/html/commons/www/wp-content/themes/twentytwelve/inc/customizer/bootstrap.php' (include_path='.:/usr/share/p
ear:/usr/share/php') in /PROD_WWW/html/commons/www/wp-content/themes/academica/functions.php on line 10

I switched the template back to academica with the following WP-CLI line:

wp option update template academica --url=sthomas21003.commons.gc.cuny.edu --skip-themes=twentytwelve,academica

The --skip-themes parameter was needed otherwise using the wp option update command would also cause the fatal error.

The failed Cavalcade tasks are no longer growing.


The next step is figuring out what might have caused the template DB option to change to twentytwelve. In our wp-config.php file, I see that we have set the WP_DEFAULT_THEME constant to twentytwelve. In WordPress, this constant is used only in validate_current_theme(). This function is used in a couple of places, but most notably during the WP Customizer: https://github.com/cuny-academic-commons/cac/blob/491f7adadb505958a6e8c5c38e6a7d5b4ac119b3/wp-includes/class-wp-customize-manager.php#L654-L660. Still not sure exactly what caused the template DB option to change though. Any ideas, Boone or Jeremy?

Actions #2

Updated by Boone Gorges 10 months ago

Nice find, Ray! Good to know this is not a broader problem with Cavalcade.

Still not sure exactly what caused the template DB option to change though. Any ideas, Boone or Jeremy?

I don't have any clear thoughts. I wonder if perhaps there was a cache-related glitch, as we've sometimes experienced in the past. Like perhaps the 'template' option wasn't properly loaded from the 'alloptions' cache on some specific pageload, and WP fell back on the default, saving the default theme to the database. It does seem strange, though, because validate_current_theme() uses switch_theme(), which should change both the 'template' and 'stylesheet' options.

Given that we haven't heard of this problem before, I'd say that perhaps it's not worth sinking a lot of time into investigating.

Actions #3

Updated by Matt Gold 10 months ago

Great work, Ray! Thank you!

Actions #4

Updated by Raymond Hoh 10 months ago

  • Status changed from New to Hold
  • Assignee changed from Boone Gorges to Raymond Hoh
  • Target version set to Not tracked

I wonder if perhaps there was a cache-related glitch, as we've sometimes experienced in the past. Like perhaps the 'template' option wasn't properly loaded from the 'alloptions' cache on some specific pageload, and WP fell back on the default, saving the default theme to the database.

I'm thinking this might have been the case. Just for my peace of mind, I'm going to log changes to the template option when it switches to twentytwelve to see if something else could be causing this.

Actions #5

Updated by Raymond Hoh 9 months ago

Over the long weekend, we had around 1800 failed Cavalcade tasks from July 2nd until now due to a problem with Jetpack and invalid syntax type. Boone, you can run the following DB query to view the failed tasks:

select * from wp_cavalcade_jobs where nextrun >= '2023-07-02 23:34:22' and status = 'failed' and site = 23793 order by nextrun asc;

The issue is Jetpack was throwing the following fatal error during their scheduled tasks on one specific site -- apiza.commons.gc.cuny.edu:

Fatal error: Uncaught Error: [] operator not supported for strings in /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-connection/src/class-urls.php:102

Stack trace:
#0 /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-connection/src/class-urls.php(137): Automattic\Jetpack\Connection\Urls::get_protocol_normalized_url()
#1 /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-connection/src/class-urls.php(146): Automattic\Jetpack\Connection\Urls::get_raw_or_filtered_url()
#2 /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-identity-crisis/src/class-identity-crisis.php(201): Automattic\Jetpack\Connection\Urls::home_url()
#3 /var/www/html/commons/www/wp-includes/class-wp-hook.php(308): Automattic\Jetpack\Identity_Crisis->add_idc_query_args_to_url()
#4 /var/www/html/commons/www/wp-includes/plugin.php(205): WP_Hook->apply_filters()
#5 /var/www/html/commons/www/wp-content/plugins/jetp in /var/www/html/commons/www/wp-content/plugins/jetpack/jetpack_vendor/automattic/jetpack-connection/src/class-urls.php on line 102

The specific Jetpack call was looking for the "jetpack_sync_https_history_home_url" option and this option was returning the string "100" instead of an array, which caused the fatal error.

I fixed this by deleting the "jetpack_sync_https_history_home_url" option for the apiza.commons.gc.cuny.edu site. I'm guessing this might have occurred for the same reason as my original report, some cache-related glitch.

Actions #6

Updated by Boone Gorges 9 months ago

Thanks for catching this!

Actions #7

Updated by Raymond Hoh 5 months ago

Actions #8

Updated by Raymond Hoh 4 months ago

Came across the same Jetpack problem as my previous post. ~140 failed tasks for the preludenyc15.commons.gc.cuny.edu site with Jetpack activated. Can view the failed Cavalcade tasks with the following MySQL query:

select * from wp_cavalcade_jobs where nextrun >= '2023-11-23 03:14:44' and status = 'failed' and site = 2310 order by nextrun asc;

Same fatal error, invalid option value. Fixed by deleting the jetpack_sync_https_history_home_url and jetpack_sync_https_history_site_url options.

Actions #9

Updated by Raymond Hoh 3 months ago

Was checking the Cavalcade failed log and there were roughly 2300 failed tasks between December 15th until now:

mysql> select site,hook,count(*) from wp_cavalcade_jobs where status = 'failed' and nextrun > '2023-12-15 23:13:10' group by site order by count(*) desc;
+-------+----------------------------+----------+
| site  | hook                       | count(*) |
+-------+----------------------------+----------+
| 31905 | jetpack_clean_nonces       |      863 |
| 31906 | jetpack_clean_nonces       |      856 |
| 31907 | jetpack_clean_nonces       |      566 |
|   886 | akismet_scheduled_delete   |        2 |
| 18874 | jetpack_sync_full_cron     |        1 |
|  1534 | action_scheduler_run_queue |        1 |
+-------+----------------------------+----------+
6 rows in set (0.05 sec)

Affected sites were:

mysql> select * from wp_blogs where blog_id in (31905,31906,31907);
+---------+---------+------------------------------------+------+---------------------+---------------------+--------+----------+--------+------+---------+---------+
| blog_id | site_id | domain                             | path | registered          | last_updated        | public | archived | mature | spam | deleted | lang_id |
+---------+---------+------------------------------------+------+---------------------+---------------------+--------+----------+--------+------+---------+---------+
|   31905 |       1 | oliverkowalski.commons.gc.cuny.edu | /    | 2023-12-15 23:11:43 | 2023-12-15 23:11:47 |      1 |        0 |      0 |    0 |       0 |       1 |
|   31906 |       1 | oakengl11000.commons.gc.cuny.edu   | /    | 2023-12-15 23:17:04 | 2023-12-15 23:17:04 |      0 |        0 |      0 |    0 |       0 |       1 |
|   31907 |       1 | oalengl11000k.commons.gc.cuny.edu  | /    | 2023-12-15 23:23:17 | 2023-12-15 23:23:23 |      0 |        0 |      0 |    0 |       0 |       1 |
+---------+---------+------------------------------------+------+---------------------+---------------------+--------+----------+--------+------+---------+---------+
3 rows in set (0.00 sec)

In all cases, the cause was the same template / stylesheet mismatch bug described in https://redmine.gc.cuny.edu/issues/18235#note-1 :

$ wp option get template --url=oakengl11000.commons.gc.cuny.edu --skip-themes=twentytwelve,sydney
sydney
$ wp option get stylesheet --url=oakengl11000.commons.gc.cuny.edu --skip-themes=twentytwelve,sydney
twentytwelve

Addressed by switching the stylesheet back to sydney for the three sites using the following WP-CLI line:

$ wp option update stylesheet sydney --url=oakengl11000.commons.gc.cuny.edu --skip-themes=twentytwelve,sydney
Success: Updated 'stylesheet' option.

Luckily, my 'template' DB option logger finally caught this and it looks like this bug might stem from the site cloning process:

[15-Dec-2023 23:11:45] twentytwelve switched to sydney for site ID 31905:
require('wp-blog-header.php'),
require_once('wp-includes/template-loader.php'),
include('/themes/bp-nelo/buddypress.php'),
the_content,
apply_filters('the_content'),
WP_Hook->apply_filters,
bp_replace_the_content,
apply_filters('bp_replace_the_content'),
WP_Hook->apply_filters,
BP_Blogs_Theme_Compat->create_content,
bp_buffer_template_part,
bp_get_template_part,
bp_locate_template,
load_template,
require('/themes/bp-nelo/blogs/create.php'),
cac_show_blog_signup_form,
cac_blogs_validate_blog_signup,
wpmu_create_blog,
wp_insert_site,
do_action_deprecated,
do_action_ref_array('wpmu_new_blog'),
WP_Hook->do_action,
WP_Hook->apply_filters,
CAC\SiteTemplates\Frontend::process_site_template,
CAC\SiteTemplates\Template\Template->clone_to_site,
CAC\SiteTemplates\Template\Cloner->go,
CAC\SiteTemplates\Template\Cloner->migrate_site_settings,
update_option,
apply_filters('pre_update_option_template'),
WP_Hook->apply_filters,
{closure}

I'll try and look into this a bit more tomorrow.

Actions

Also available in: Atom PDF