Project

General

Profile

Bug #13429

Multiple forum notifications

Added by Matt Gold about 1 year ago. Updated about 1 year ago.

Status:
Resolved
Priority name:
Normal
Assignee:
Category name:
Email Notifications
Target version:
Start date:
2020-10-05
Due date:
% Done:

0%

Estimated time:

Description

Hi Ray -- I received four separate email notifications for this post - https://commons.gc.cuny.edu/groups/ma-in-digital-humanities-cuny-graduate-center/forum/topic/2020-2021-ma-ms-student-survey/ . The poster, Micki Kaufman, says that she received three. can you please see? If I can provide other info, please let me know. I do know that this was posted to one group and cc'ed to another via the web interface


Related issues

Related to CUNY Academic Commons - Support #13441: ongoing user problems with daily digestsResolved2020-10-07

Related to CUNY Academic Commons - Feature #13466: Automated cleanup for duplicate Cavalcade tasksNew2020-10-13

Related to CUNY Academic Commons - Bug #13500: Received multiple notifications for a single postAssigned2020-10-25

History

#1 Updated by Matt Gold about 1 year ago

I just ran into this issue myself with this post -- https://commons.gc.cuny.edu/groups/ma-in-digital-humanities-cuny-graduate-center/forum/topic/cfp-ach-digital-humanities-mentorship-working-group/

I cross-posted to four groups and received four notifications. So I think that whatever bit of code was preventing people from receiving only a max of one notification for a cross-posted topic is not working.

#2 Updated by Raymond Hoh about 1 year ago

Hi Matt,

For the latest instance, I can see that four emails were sent to your address between 9:58:21am ET and 9:58:24am ET:

Oct  6 09:58:21 lw2a postfix/smtp[72886]: 541AB204CFC6: to=<MATTs_EMAIL_ADDRESS>, relay=relay.gc.cuny.edu[172.29.28.156]:25, delay=0.42, delays=0.01/0/0.01/0.4, dsn=2.6.0, status=sent (250 2.6.0 <bTlIkI4VBGodQ4QD9VYusBcEZXOQkt8RDPnxEO2kle0@commons.gc.cuny.edu> [InternalId=148743307397682, Hostname=wd5b.gc.cuny.edu] Queued mail for delivery)
Oct  6 09:58:21 lw2a postfix/smtp[72891]: 6B77A204CFC9: to=<MATTs_EMAIL_ADDRESS>, relay=relay.gc.cuny.edu[172.29.28.156]:25, delay=0.44, delays=0.01/0.01/0/0.42, dsn=2.6.0, status=sent (250 2.6.0 <OJlxpz8mvtEIDJYu6fF5PPOCmKW56FXg7y324YNF8@commons.gc.cuny.edu> [InternalId=148743307397684, Hostname=wd5b.gc.cuny.edu] Queued mail for delivery)
Oct  6 09:58:24 lw2a postfix/smtp[72891]: A7FF2204CFC9: to=<MATTs_EMAIL_ADDRESS>, relay=relay.gc.cuny.edu[172.29.28.157]:25, delay=0.36, delays=0.01/0/0.01/0.34, dsn=2.6.0, status=sent (250 2.6.0 <Gd1rTbpYBjMfcjYZ1jieXHcTWkIws5PcQLCIIoyc@commons.gc.cuny.edu> [InternalId=148743307397698, Hostname=wd5b.gc.cuny.edu] Queued mail for delivery)
Oct  6 09:58:24 lw2a postfix/smtp[72886]: A715F204CFC6: to=<MATTs_EMAIL_ADDRESS>, relay=relay.gc.cuny.edu[172.29.28.156]:25, delay=0.36, delays=0.01/0/0/0.35, dsn=2.6.0, status=sent (250 2.6.0 <LAKUCGYUDj8TSPC3weydsEuCoNpwx7fs1MqjUbe6Qg@commons.gc.cuny.edu> [InternalId=148635933215235, Hostname=wd5c.gc.cuny.edu] Queued mail for delivery)

Can you confirm that those are the problematic emails? Can you also forward or attach those emails here? I just want to make sure that each email was a duplicate and not unique in some form.


We have a cron job (bpges_health_check) to check if there are pending group emails that need to be send out.

I just checked the Cavalcade log, which handles processing our jobs and it looks like the bpges_health_check job was duplicated many times over:

SELECT * FROM wp_cavalcade_jobs WHERE site = 1 AND hook = "bpges_health_check"'
+--------+------+--------------------+--------+---------------------+---------------------+----------+--------------+---------+
| id     | site | hook               | args   | start               | nextrun             | interval | schedule     | status  |
+--------+------+--------------------+--------+---------------------+---------------------+----------+--------------+---------+
| 566138 |    1 | bpges_health_check | a:0:{} | 2020-07-24 13:58:12 | 2020-07-27 14:23:12 |      300 | five_minutes | failed  |
| 566199 |    1 | bpges_health_check | a:0:{} | 2020-07-24 17:18:12 | 2020-07-27 14:23:12 |      300 | five_minutes | failed  |
| 577409 |    1 | bpges_health_check | a:0:{} | 2020-07-30 23:48:12 | 2020-07-31 03:03:12 |      300 | five_minutes | failed  |
| 579086 |    1 | bpges_health_check | a:0:{} | 2020-07-31 03:43:12 | 2020-07-31 03:43:12 |      300 | five_minutes | failed  |
| 580845 |    1 | bpges_health_check | a:0:{} | 2020-07-31 10:18:12 | 2020-08-01 03:03:12 |      300 | five_minutes | failed  |
| 580946 |    1 | bpges_health_check | a:0:{} | 2020-07-31 16:58:12 | 2020-08-01 03:03:12 |      300 | five_minutes | failed  |
| 580958 |    1 | bpges_health_check | a:0:{} | 2020-07-31 17:43:12 | 2020-08-01 03:03:12 |      300 | five_minutes | failed  |
| 583830 |    1 | bpges_health_check | a:0:{} | 2020-08-01 05:13:12 | 2020-09-02 22:18:12 |      300 | five_minutes | failed  |
| 585088 |    1 | bpges_health_check | a:0:{} | 2020-08-07 17:03:12 | 2020-09-02 22:18:12 |      300 | five_minutes | failed  |
| 585795 |    1 | bpges_health_check | a:0:{} | 2020-08-14 17:13:12 | 2020-09-02 22:18:12 |      300 | five_minutes | failed  |
| 586790 |    1 | bpges_health_check | a:0:{} | 2020-08-21 17:03:12 | 2020-09-02 22:18:12 |      300 | five_minutes | failed  |
| 589144 |    1 | bpges_health_check | a:0:{} | 2020-08-28 16:58:12 | 2020-09-02 22:18:12 |      300 | five_minutes | failed  |
| 557349 |    1 | bpges_health_check | a:0:{} | 2020-05-30 05:28:12 | 2020-06-30 11:53:12 |      300 | five_minutes | running |
| 591645 |    1 | bpges_health_check | a:0:{} | 2020-09-04 13:48:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 591696 |    1 | bpges_health_check | a:0:{} | 2020-09-04 16:48:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 593156 |    1 | bpges_health_check | a:0:{} | 2020-09-11 13:53:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 593220 |    1 | bpges_health_check | a:0:{} | 2020-09-11 16:58:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 593227 |    1 | bpges_health_check | a:0:{} | 2020-09-11 17:43:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 593336 |    1 | bpges_health_check | a:0:{} | 2020-09-12 00:23:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 593405 |    1 | bpges_health_check | a:0:{} | 2020-09-12 05:18:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 593437 |    1 | bpges_health_check | a:0:{} | 2020-09-12 08:28:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 594677 |    1 | bpges_health_check | a:0:{} | 2020-09-18 13:48:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 594717 |    1 | bpges_health_check | a:0:{} | 2020-09-18 17:18:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 594724 |    1 | bpges_health_check | a:0:{} | 2020-09-18 17:38:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 594801 |    1 | bpges_health_check | a:0:{} | 2020-09-19 00:33:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 594830 |    1 | bpges_health_check | a:0:{} | 2020-09-19 04:23:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 594838 |    1 | bpges_health_check | a:0:{} | 2020-09-19 05:53:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 615006 |    1 | bpges_health_check | a:0:{} | 2020-09-25 17:23:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 615483 |    1 | bpges_health_check | a:0:{} | 2020-09-25 17:43:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
| 622078 |    1 | bpges_health_check | a:0:{} | 2020-10-02 16:58:12 | 2020-10-06 17:08:12 |      300 | five_minutes | waiting |
+--------+------+--------------------+--------+---------------------+---------------------+----------+--------------+---------+

I've removed all the duplicate jobs and that should hopefully restore things back to normal.

Matt, can you report back when you receive another group email and whether you received duplicates or not?

#3 Updated by Matt Gold about 1 year ago

Sure. Do you want originals of the emails I received earlier today?

#4 Updated by Raymond Hoh about 1 year ago

Thanks Matt. Originals would be great!

#5 Updated by Boone Gorges about 1 year ago

Ray, I got a report at #13441 that BPGES emails are not coming through correctly. I checked the database and found that there are many BPGES items from the last ~24 hours that should have been sent immediately but have not been. select * from wp_bpges_queued_items where type = 'immediate'; I'm currently trying to force them to go out by triggering the health check manually via wp cron event run but they're not going. I'm investigating right now from the BPGES end, but I wanted to raise the issue here in case you may remember doing something that could cause this kind of problem.

#6 Updated by Boone Gorges about 1 year ago

  • Related to Support #13441: ongoing user problems with daily digests added

#7 Updated by Raymond Hoh about 1 year ago

I'm currently trying to force them to go out by triggering the health check manually via wp cron event run but they're not going.

The health check only sends out emails from the last 30 minutes:
https://github.com/cuny-academic-commons/cac/commit/9ff7d591e97a1698d49d438b95b5aa56d0f9381d#diff-86561795f3a30df22c4d2bba1d5c837b

Anything older will not get sent out.

FWIW, I did receive a group email from Colin yesterday at 9:50pm ET. Not sure what is causing the backlog. The health check should have caught these instances. Perhaps there is too much mail being sent out within 30 minutes? See also #13430.

We might need to widen the timeframe in the health check to an hour or two.

#8 Updated by Raymond Hoh about 1 year ago

I'm currently trying to force them to go out by triggering the health check manually via wp cron event run but they're not going

Edit - I misread your sentence. I thought you were trying to trigger the immediate emails without using the health check. You can skip what I wrote below!

Correct me if I'm wrong, Boone, but I believe immediate emails are not queued up using cron.

It uses a wp_remote_post() call:

And repeats the wp_remote_post() call until the immediate queue is done:
https://github.com/boonebgorges/buddypress-group-email-subscription/blob/383131d15a26d19b3c65f5c89c566e1e97a4b0ab/classes/class-bpges-async-request-send-queue.php#L134-L137

#9 Updated by Boone Gorges about 1 year ago

The health check only sends out emails from the last 30 minutes:

Not quite. It will only send emails that were sent between 24 hours and 30 minutes ago. It will NOT send anything older than a day, or anything more recent than 30 minutes.

Correct me if I'm wrong, Boone, but I believe immediate emails are not queued up using cron.

This is true. But occasionally the queue that you've described fails - timeout, etc - which leaves items in the queue that are unsent. This is what the bpges_health_check is for: every 5 minutes, it checks to see if there are items that should've gone out, but haven't.

I discovered that requests from the health check were failing in at least some cases because of some mysterious problem that was causing the self-pings (admin-ajax.php) not to complete properly when called via cron. I can't figure out what's going on here - there's no record in Apache. But it looks like I'll be able to work my way around it by eliminating the remote request, and instead processing the unsent emails directly, in the health-check request. I've got this change in place on the production site, and I'll commit it if it looks like it's working over the next few days.

#10 Updated by Boone Gorges about 1 year ago

I've made some mods to the BPGES health check and moved it into its own plugin. I think it will now be more reliable, since it doesn't rely on HTTP self-requests. https://github.com/cuny-academic-commons/cac/commit/aaa04d13e93d2f31dca0de6f4a59e8b26cee7b41 This is deployed to the production site.

#11 Updated by Matt Gold about 1 year ago

Hi Ray and Boone -- I'm not sure whether this is related, but in this thread, it sounds like the original poster is saying that a post by email he sent a while ago was only just now posted -- https://commons.gc.cuny.edu/groups/gis-working-group/forum/topic/having-issues-opening-gis/ . Is this something that is/will be addressed by the fixes you've made, or is this potentially a new issue?

#12 Updated by Raymond Hoh about 1 year ago

Hi Matt, your gis-working-group report is related to #13430. I've posted my reply there.

The fixes in this ticket are not related to that. I'm thinking #13430 is an issue with how the GC mail server is configured to handle vast amounts of outgoing mail.

#13 Updated by Boone Gorges about 1 year ago

Ray, could you please have a look at this ticket and the related ones from the last week to see where things stand? With regard to this specific ticket, I think that the changes to the BPGES health check tool go some way toward resolving the issue, but it could be that I'm missing something :)

#14 Updated by Raymond Hoh about 1 year ago

I think the main issue in this ticket regarding multiple forum posts was resolved when I removed all the duplicate Cavalcade jobs.

I was looking at the Cavalcade issues list and wordpress.org uses a daily cron job to look for duplicate jobs and removes them if they exist: https://github.com/humanmade/Cavalcade/issues/89#issuecomment-630577767

Perhaps that is something we might want to implement as well.

#15 Updated by Boone Gorges about 1 year ago

  • Related to Feature #13466: Automated cleanup for duplicate Cavalcade tasks added

#16 Updated by Boone Gorges about 1 year ago

  • Status changed from Assigned to Resolved
  • Target version set to Not tracked

Thanks, Ray! I've opened #13466 to track that possibility.

#17 Updated by Raymond Hoh 12 months ago

  • Related to Bug #13500: Received multiple notifications for a single post added

Also available in: Atom PDF