Project

General

Profile

Actions

Bug #10720

closed

GES email queue triggers many failed user lookups

Added by Boone Gorges about 6 years ago. Updated about 6 years ago.

Status:
Resolved
Priority name:
Normal
Assignee:
-
Category name:
Email Notifications
Target version:
Start date:
2018-11-17
Due date:
% Done:

0%

Estimated time:
Deployment actions:

Description

When BP sends emails, it sets the From and Reply-To field using the BP_Email_Recipient class. We (in RBE and in bp-custom) then override the Reply-To address in various cases. BP_Email_Recipient, in turn, does a get_user_by() lookup on these email addresses. The problem is that the emails we pass into these functions generally do not belong to actual Commons users - , etc. The way that get_user_by() works means that failed lookups are not cached; see https://core.trac.wordpress.org/browser/tags/4.9.8/src/wp-includes/class-wp-user.php#L214. So we get lots of duplicate database queries for the same users, which look like this:

 [1739] => Array
        (
            [0] => SELECT * FROM wp_users WHERE user_email = 'commons@gc.cuny.edu'
            [1] => 0.001478910446167
            [2] => do_action('wp_ajax_nopriv_wp_bpges_send_queue'), WP_Hook->do_action, WP_Hook->apply_filters, WP_Async_Request->maybe_handle, BPGES_Async_Request_Send_Queue->handle, BPGES_Async_Request_Send_Queue->handle_digest_queue, bpges_process_digest_for_user, bpges_generate_digest, ass_send_email, bp_send_email, bp_get_email, BP_Email->__construct, BP_Email->set_from, BP_Email_Recipient->__construct, BP_Email_Recipient->get_user, get_user_by, WP_User::get_data_by
        )

    [1740] => Array
        (
            [0] => SELECT * FROM wp_users WHERE user_email = 'cunycommons@gmail.com'
            [1] => 0.0023388862609863
            [2] => do_action('wp_ajax_nopriv_wp_bpges_send_queue'), WP_Hook->do_action, WP_Hook->apply_filters, WP_Async_Request->maybe_handle, BPGES_Async_Request_Send_Queue->handle, BPGES_Async_Request_Send_Queue->handle_digest_queue, bpges_process_digest_for_user, bpges_generate_digest, ass_send_email, bp_send_email, bp_get_email, BP_Email->__construct, BP_Email->set_reply_to, BP_Email_Recipient->__construct, BP_Email_Recipient->get_user, get_user_by, WP_User::get_data_by
        )

    [1741] => Array
        (
            [0] => SELECT * FROM wp_users WHERE user_email = 'cunyacademiccommons+bounce@gmail.com'
            [1] => 0.0011720657348633
            [2] => do_action('wp_ajax_nopriv_wp_bpges_send_queue'), WP_Hook->do_action, WP_Hook->apply_filters, WP_Async_Request->maybe_handle, BPGES_Async_Request_Send_Queue->handle, BPGES_Async_Request_Send_Queue->handle_digest_queue, bpges_process_digest_for_user, bpges_generate_digest, ass_send_email, bp_send_email, bp_get_email, BP_Email->__construct, do_action('bp_email'), WP_Hook->do_action, WP_Hook->apply_filters, BuddyPress->{closure}, BP_Email->set_reply_to, BP_Email_Recipient->__construct, BP_Email_Recipient->get_user, get_user_by, WP_User::get_data_by
        )

Normally, this is a fairly minor issue. But when GES sends its batches, these queries constitute 30-40% of the database i/o.

There are various ways this could be fixed.

1. In a sense, much of this would go away if WP properly cached failed user lookups. That said, it's hard to get a fix into WP (in general), and in addition, I'm not sure what the fix would be. We can't send a dummy value to update_user_caches() because it checks for $user->exists(). I guess you could just shove the failures directly into the useremails etc caches?
2. BP could be less aggressive about these user lookups. If I understand correctly, the get_user_by() calls in BP_Email_Recipient are meant to provide us with a sender Name, as well as a canonical email address. In our use case, these lookups are unneeded. At the very least, we ought to be able to filter them to prevent the lookups, since we know what they're going to be. Or, knowing that they're not going to match a WP user, we should be able to tell BP_Email_Recipient (via a filter?) not to ping the database. Or, BP's set_reply_to() and set_from() shouldn't use BP_Email_Recipient at all, since these are not recipients, and there will be many BP installs where the reply-to/from emails won't match actual users. Or, BP should have its own cache wrapped around some of this.
3. We could do our own magic to fix this. The only way I can think of is to wp_cache_add( $bad_email, 0, 'useremails' ) (which would be enforced whenever we begin sending an email, in case of a cache flush), but I have no idea whether this will cause other bugs.

Ray, I was hoping to get your thoughts before potentially opening upstream tickets.

Actions

Also available in: Atom PDF