Project

General

Profile

Bug #13457

Forum post not sending notifications

Added by Filipa Calado 11 months ago. Updated 4 days ago.

Status:
New
Priority name:
High
Assignee:
Category name:
Group Forums
Target version:
Start date:
2020-10-12
Due date:
% Done:

0%

Estimated time:

Description

Hello -- I posted in the CUNY DHI forum (http://cuny.is/cunydhi-group) and normally this action would send out emails to all members in the relevant groups tagged in the post. However, this email has not been sent, so the group members are not notified of the post.

Link to the post in question: https://commons.gc.cuny.edu/groups/cunydhi/forum/topic/sign-up-for-6th-annual-cuny-digital-humanities-initiative-lightning-talks-4/#post-99759

The relevant groups include: GC Digital Fellows, GC Digital Initiatives Fellows - Master Group, GC Digital Research Institute - January 2019, GC Digital Research Institute January 2018, GC Digital Research Institute – January 2020, GC Events and Workshops, GIS Working Group, ITP Core 2 Spring 2018, Digital Humanities Initiative.

In the past, when I make a post it automatically emails members of the above groups. Please let me know if there's something I'm doing wrong, or if this is some kind of bug with the commons.

Thank you!

Filipa


Related issues

Related to CUNY Academic Commons - Bug #13430: Delay in RBENew2020-10-06

Related to CUNY Academic Commons - Bug #13890: Not sure email notification went outResolved2021-01-26

History

#1 Updated by Matt Gold 11 months ago

  • Assignee set to Raymond Hoh

#2 Updated by Raymond Hoh 11 months ago

#3 Updated by Raymond Hoh 11 months ago

Hi Filipa,

We've been experiencing some delays with email sending and we have been troubleshooting the matter over the past week.

Your forum topic was posted on October 9th, 1:27pm ET.

The activity ID for the forum topic is 708039. I checked our internal group email log and here are the relevant entries:

[09-Oct-2020 17:27:16] Beginning batch of immediate notifications for 708039.
[09-Oct-2020 17:27:16] Finished sending immediate notifications for 708039. A total of 0 notifications were sent over all batches.

The time in the log is in GMT, so the email batch was sent out at 1:27pm ET.

However, the 0 notifications part stands out. Boone, any idea why the number would be zero here?

For Filipa, an email was sent to her email address at 2:36pm ET:

Oct  9 14:36:36 lw2a postfix/smtp[50405]: E0FA4204CFC9: to=<FILIPAs_EMAIL>, relay=relay.gc.cuny.edu[172.29.28.156]:25, delay=0.25, delays=0.01/0.02/0/0.22, dsn=2.6.0, status=sent (250 2.6.0 <Dh1J8dcbYsR0KjhisoauE1qir1P02NEVMOFknbvSUQ8@commons.gc.cuny.edu> [InternalId=148326695569068, Hostname=WD5A.gc.cuny.edu] Queued mail for delivery)

Not sure if that email was for another group or post. Filipa, can you confirm that you received an email on October 9th at 2:36pm ET and whether that was for your topic or something else?

#4 Updated by Matt Gold 11 months ago

FYI, I'm in this group and don't think I received a notification of Filipa's post

#5 Updated by Filipa Calado 11 months ago

Matt Gold wrote:

FYI, I'm in this group and don't think I received a notification of Filipa's post

Raymond Hoh wrote:

Hi Filipa,

We've been experiencing some delays with email sending and we have been troubleshooting the matter over the past week.

Your forum topic was posted on October 9th, 1:27pm ET.

The activity ID for the forum topic is 708039. I checked our internal group email log and here are the relevant entries:

[...]

The time in the log is in GMT, so the email batch was sent out at 1:27pm ET.

However, the 0 notifications part stands out. Boone, any idea why the number would be zero here?

For Filipa, an email was sent to her email address at 2:36pm ET:

[...]

Not sure if that email was for another group or post. Filipa, can you confirm that you received an email on October 9th at 2:36pm ET and whether that was for your topic or something else?

I did receive an email at 2:36pm ET, but it was from the RUG group. It wasn't the message that I posted to DHI.

I also checked with a 3 other group members and none of them received the email either. This is definitely something specific to the DHI group, because last night I posted to a different group and the email went out just fine.

#6 Updated by Boone Gorges 11 months ago

However, the 0 notifications part stands out. Boone, any idea why the number would be zero here?

Based on what's being reported here, I'd say it's because either

(a) the items were never queued in the database at the time of activity creation. This ought to take place here: https://github.com/boonebgorges/buddypress-group-email-subscription/blob/399879e6fdf407fdda63d6ef099b888ab8ad9471/bp-activity-subscription-functions.php#L98, and there are a few places where failure might have taken place in this function. See 108, 119, 122, 135, and the loop at https://github.com/boonebgorges/buddypress-group-email-subscription/blob/399879e6fdf407fdda63d6ef099b888ab8ad9471/bp-activity-subscription-functions.php#L146

(b) the queued items were determined to be invalid by the send queue. For example, https://github.com/boonebgorges/buddypress-group-email-subscription/blob/399879e6fdf407fdda63d6ef099b888ab8ad9471/classes/class-bpges-async-request-send-queue.php#L101 might have returned a null result (though it's hard to see how, unless the items didn't exist in the table; see (a) above). Yet, if $items was non-empty at this point, then the $total_for_batch variable would have been incremented; see line 121.

Looking at the queued item table, I see that other items have been queued for the DHI group (id 253) since this event. So I don't think it's a problem with the group, at least not in general. Matt, can you confirm whether you've received notifications from this group since Oct 9?

A hunch: The forum post in question had a fair number of links in it. Was it, by chance, held for moderation by Akismet? If there was somehow a delay in posting, perhaps it triggered some problem with BPGES queuing? This shouldn't happen in theory, since BPGES uses the BP activity item, and the BP activity item isn't created until after Akismet moderation is complete.

#7 Updated by Matt Gold 11 months ago

Hi Ray and Boone -- I'm in a meeting with the digital fellows right now. Since Filipa's post was posted to multiple groups, we're wondering whether the fix for https://redmine.gc.cuny.edu/issues/13429 might somehow have caused this problem, as another student mentioned having problems with not getting a notification when he posted to multiple groups, but getting a notification when he tried posting to a single group.

#8 Updated by Raymond Hoh 11 months ago

I tested your theory yesterday, Matt.

I tried posting one forum topic to two groups yesterday (albeit the groups only had just two members), and I was able to receive emails for both members. I'm not sure if the issue is due to how many groups and number of members in the group, but it is worth looking into the multiple forum post code to see if there might be anything that could cause this problem.

The fixes in the other thread are not related.

#9 Updated by Boone Gorges 11 months ago

Looking more closely, it appears likely that bp-multiple-forum-post is part of the problem.

I see that the original post #99759 does have a corresponding activity item 708039. And the original post has a number of duplicates:

mysql> select post_id from wp_1_postmeta where meta_key = '_duplicate_of' and meta_value = '99759';
+---------+
| post_id |
+---------+
|   99761 |
|   99763 |
|   99765 |
|   99767 |
|   99769 |
|   99771 |
|   99773 |
|   99775 |
|   99777 |
+---------+
9 rows in set (0.01 sec)

But none of these duplicate posts has a corresponding activity item:

mysql> select id from wp_bp_activity where secondary_item_id in ( select post_id from wp_1_postmeta where meta_key = '_duplicate_of' and meta_value = '99759' );
Empty set (0.01 sec)

Tracing this through the code, here's what I'm hypothesizing:

1. BPMFP prevents BPGES from sending/queuing immediate notifications for the activity item corresponding to the original forum post. See https://github.com/cuny-academic-commons/cac/blob/99c93476a1336fa6a55f8fcefde34961ffa8b67b/wp-content/plugins/bp-multiple-forum-post/includes/email-notifications.php#L68 (The process is delayed so that it can happen asynchronously, after all of the duplicates have been created)
2. Duplicate topic creation seems to have completed successfully for these items. See https://github.com/cuny-academic-commons/cac/blob/99c93476a1336fa6a55f8fcefde34961ffa8b67b/wp-content/plugins/bp-multiple-forum-post/bp-multiple-forum-post.php#L87
3. At the end of bpmfp_create_duplicate_topics(), after topics have been created, the plugin tries to create all additional activity items. See https://github.com/cuny-academic-commons/cac/blob/99c93476a1336fa6a55f8fcefde34961ffa8b67b/wp-content/plugins/bp-multiple-forum-post/bp-multiple-forum-post.php#L207.
4. So something must be going wrong in this latter function, bpmfp_create_duplicate_activities(). I observe that two things have gone wrong: (a) the new activity items haven't been created https://github.com/cuny-academic-commons/cac/blob/99c93476a1336fa6a55f8fcefde34961ffa8b67b/wp-content/plugins/bp-multiple-forum-post/bp-multiple-forum-post.php#L251 and (b) the GES notification wasn't sent for the original activity item https://github.com/cuny-academic-commons/cac/blob/99c93476a1336fa6a55f8fcefde34961ffa8b67b/wp-content/plugins/bp-multiple-forum-post/bp-multiple-forum-post.php#L247. So I surmise that the function is bailing here https://github.com/cuny-academic-commons/cac/blob/99c93476a1336fa6a55f8fcefde34961ffa8b67b/wp-content/plugins/bp-multiple-forum-post/bp-multiple-forum-post.php#L236

But when I run this code right now, I get results:

  3 $original_topic_id = 99759;
  4 
  5 $original_activity_query_results = BP_Activity_Activity::get( array(
  6         'show_hidden' => true,
  7         'filter_query' => array(
  8                 'relation'  => 'and',
  9                 array(
 10                     'column'    => 'type',
 11                     'value'     => 'bbp_topic_create'
 12                 ),
 13                 array(
 14                     'column'    => 'secondary_item_id',
 15                     'value'     => $original_topic_id
 16                 ),
 17         ),
 18     ) );
 19 
 20 
 21 print_r( $original_activity_query_results );       

Ray, why might this query have failed during the initial asynchronous request? Could there have been a problem with out-of-date caches?

#10 Updated by Matt Gold 11 months ago

As a point of information (I don't know whether it is relevant), I just posted to multiple groups via the web interface and I successfully received an email notification of the post: https://commons.gc.cuny.edu/groups/ma-in-digital-humanities-cuny-graduate-center/forum/topic/research-opportunity-computing-cultural-heritage-in-the-cloud/#post-100054

#11 Updated by Boone Gorges 11 months ago

Thanks, Matt. This suggests that the problem may have been a fluke having to do with a cache race condition or something like that.

#12 Updated by Raymond Hoh 11 months ago

Ray, why might this query have failed during the initial asynchronous request? Could there have been a problem with out-of-date caches?

Your analysis is sound regarding the BP_Activity_Activity::get() call returning an empty result, Boone.

I've been pulling my hair out trying to figure out what might be causing the issue and I believe I've found the potential cause of the problem.

I was looking through the source code for the WP_Async_Task library, which is used to offshore the duplicate forum topic sending process on the 'shutdown' hook, and I found an interesting note about edge cases where the async task might fire quicker than the DB insertion:
https://github.com/techcrunch/wp-async-task/blob/9bdbbf9df4ff5179711bb58b9a2451296f6753dc/wp-async-task.php#L114-L121

I'll quote what they mentioned below:

Launch the request on the WordPress shutdown hook

On VIP we got into data races due to the postback sometimes completing faster than the data could propogate to the database server cluster. This made WordPress get empty data sets from the database without failing.

On their advice, we're moving the actual firing of the async postback to the shutdown hook. Supposedly that will ensure that the data at least has time to get into the object cache.

I think this is exactly what is happening here.

In bbPress, the activity object is created on the 'bbp_new_topic' hook: https://github.com/bbpress/bbPress/blob/2c53828d6d3a48619d5c227240ae51fcea825345/src/includes/topics/functions.php#L361

And shortly afterwards, bbPress calls bbp_redirect().

And due to the redirect, our async task is then fired. Usually the 'shutdown' hook runs at the end of a page load, but the redirect causes the task to fire sooner than expected.

I think the easiest way to solve this is to add a sleep() call of a few seconds (between 2-5 seconds or more?) around here.

Or we could use wp_schedule_single_event() to process the duplicate activities a few minutes into the future.

What do you think, Boone?

#13 Updated by Boone Gorges 11 months ago

  • Target version set to 1.17.6

Wow, thanks for the digging, Ray! This sounds like a promising explanation - why this may have happened in one single case, but isn't easily (or ever?) replicable.

sleep sounds like the easiest solution to implement. Do you want to go ahead and give this a try?

#14 Updated by Raymond Hoh 11 months ago

I've just added a sleep(5) call here on production, but haven't committed the changes.

Last week, I also started logging calls to see when a multiple forum topic is created. Over the past three days, three multiple topics were made and looks like their activity items were created successfully.

I'll keep monitoring the log to see what happens.

#15 Updated by Boone Gorges 11 months ago

  • Target version changed from 1.17.6 to 1.17.7

#16 Updated by Boone Gorges 10 months ago

  • Target version changed from 1.17.7 to 1.17.8

#17 Updated by Boone Gorges 10 months ago

  • Target version changed from 1.17.8 to 1.18.1

#18 Updated by Boone Gorges 9 months ago

  • Target version changed from 1.18.1 to 1.18.2

#19 Updated by Boone Gorges 8 months ago

  • Target version changed from 1.18.2 to 1.18.3

#20 Updated by Boone Gorges 8 months ago

  • Target version changed from 1.18.3 to 1.18.4

#21 Updated by Raymond Hoh 8 months ago

  • Related to Bug #13890: Not sure email notification went out added

#22 Updated by Boone Gorges 7 months ago

  • Target version changed from 1.18.4 to 1.18.5

#23 Updated by Boone Gorges 7 months ago

  • Target version changed from 1.18.5 to 1.18.6

#24 Updated by Boone Gorges 6 months ago

  • Target version changed from 1.18.6 to 1.18.7

#25 Updated by Boone Gorges 6 months ago

  • Target version changed from 1.18.7 to 1.18.8

#26 Updated by Boone Gorges 5 months ago

  • Target version changed from 1.18.8 to 1.18.9

#27 Updated by Boone Gorges 5 months ago

  • Target version changed from 1.18.9 to 1.18.10

#28 Updated by Boone Gorges 4 months ago

  • Target version changed from 1.18.10 to 1.18.11

#29 Updated by Boone Gorges 4 months ago

  • Target version changed from 1.18.11 to 1.18.12

#30 Updated by Boone Gorges 3 months ago

  • Target version changed from 1.18.12 to 1.18.13

#31 Updated by Boone Gorges 3 months ago

  • Target version changed from 1.18.13 to 1.18.14

#32 Updated by Boone Gorges 2 months ago

  • Target version changed from 1.18.14 to 1.18.15

#33 Updated by Boone Gorges about 2 months ago

  • Target version changed from 1.18.15 to 1.18.16

#34 Updated by Boone Gorges about 1 month ago

  • Target version changed from 1.18.16 to 1.18.17

#35 Updated by Boone Gorges 24 days ago

  • Target version changed from 1.18.17 to 1.18.18

#36 Updated by Boone Gorges 4 days ago

  • Target version changed from 1.18.18 to 1.18.19

Also available in: Atom PDF