Bug #15757
closedMembers # do not match due to multiple last_activity entries
0%
Description
Hi All,
I am an admin in the Teaching Courses on the Commons group: https://commons.gc.cuny.edu/groups/teaching-courses-on-the-commons/members/
I believe this is the group faculty are invited to when they now create a "teaching" group/site.
In the group, it says there are 63 members, but when I click members, I only see that there are 18 (screenshot attached).
Is this a bug or is it that 18 people have accepted the invite to be a member of the group?
Do we think that this group should function more like the group for group admins where folks are auto added?
I ask because I was hoping to use the forum to reach faculty teaching on the Commons. I wanted to send out a message to this group inviting them to ask questions in the group since there has been some recent activity but if it is only going to 18 people instead of 63, that is not having the sort of faculty reach I thought this group might have.
Files
Related issues
Updated by Boone Gorges over 2 years ago
18 is the correct count.
For some reason, there's an incorrect count of 63 in the total_member_count of the groupmeta table. I'm not sure why. I can pretty easily write a script that will correct the count for this and other groups, but I'm wary to do so before understanding why there's a mismatch. Ray, I'm swamped today, but perhaps you could find a moment to take a look and see if anything jumps out at you.
Do we think that this group should function more like the group for group admins where folks are auto added?
I leave this up to the rest of the team. I think we decided not to auto-add because we wanted to respect individual control over group memberships.
Updated by Raymond Hoh over 2 years ago
The total group member count is calculated here: https://github.com/buddypress/buddypress/blob/master/src/bp-groups/classes/class-bp-groups-group.php#L1791
When I call the groups_get_group_members()
function manually for the Teaching Courses group, the returned $members['count']
is 63, but the number of members in the members array adds up to 18, so there is a bug occurring in this function. Going to do a bit more research to see what is wrong.
Updated by Raymond Hoh over 2 years ago
The issue is caused by having multiple last_activity
entries for certain users.
The group member count uses the following SQL query:
SELECT COUNT(u.user_id) FROM wp_bp_activity u WHERE 1=1 AND u.component = 'members' AND u.type = 'last_activity' AND u.user_id IN (USER IDS)
For the Teaching Courses group, users 1, 49, 4203 and 10971 had multiple last_activity
type entries, which caused the count to balloon from the actual group member count. See the DB query below:
mysql> SELECT u.user_id,date_recorded FROM wp_bp_activity u WHERE 1=1 AND u.component = 'members' AND u.type = 'last_activity' AND u.user_id IN (31672,37117,8864,21948,8297,12242,1510,13032,4262,7926,155,582,30020,1,34500,49,10971,4203); +---------+---------------------+ | user_id | date_recorded | +---------+---------------------+ | 1 | 2022-03-11 20:26:15 | | 1 | 2022-02-22 20:04:49 | | 1 | 2022-03-23 20:39:27 | | 1 | 2022-03-07 01:30:23 | | 1 | 2022-01-07 15:46:48 | | 1 | 2022-02-21 01:46:15 | | 1 | 2022-02-09 01:55:43 | | 1 | 2022-03-11 15:44:59 | | 1 | 2021-12-21 14:00:57 | | 1 | 2022-03-11 15:14:58 | | 1 | 2021-11-18 16:09:42 | | 1 | 2022-02-11 18:26:56 | | 1 | 2022-03-09 01:59:30 | | 1 | 2022-03-29 00:38:05 | | 1 | 2021-12-23 18:05:58 | | 1 | 2021-03-17 22:42:51 | | 1 | 2021-10-22 17:59:07 | | 1 | 2022-01-28 03:14:24 | | 1 | 2022-01-08 18:21:29 | | 1 | 2022-02-01 23:41:50 | | 1 | 2021-01-12 22:05:23 | | 1 | 2019-10-17 22:21:06 | | 1 | 2019-08-15 01:38:46 | | 1 | 2019-08-23 12:36:34 | | 1 | 2019-09-05 16:42:58 | | 1 | 2019-10-03 17:42:31 | | 1 | 2019-10-12 16:47:04 | | 1 | 2019-12-03 15:10:28 | | 1 | 2019-12-20 12:31:22 | | 1 | 2020-01-03 17:12:31 | | 1 | 2020-07-14 22:04:04 | | 1 | 2020-01-27 15:09:30 | | 1 | 2020-02-04 22:29:41 | | 1 | 2020-02-19 04:01:07 | | 1 | 2020-02-21 00:43:49 | | 1 | 2020-03-03 23:13:38 | | 1 | 2020-03-15 01:19:34 | | 1 | 2020-03-23 13:18:44 | | 1 | 2020-03-29 18:21:59 | | 1 | 2020-04-01 18:47:14 | | 1 | 2020-05-07 19:41:11 | | 1 | 2022-01-19 15:08:40 | | 49 | 2022-03-11 19:11:17 | | 49 | 2021-12-10 16:23:01 | | 49 | 2022-03-30 15:26:33 | | 155 | 2022-03-26 14:38:52 | | 582 | 2022-03-23 18:40:01 | | 1510 | 2022-03-30 18:33:41 | | 4203 | 2022-03-30 19:12:38 | | 4203 | 2022-03-30 15:36:05 | | 4262 | 2022-03-30 18:37:00 | | 7926 | 2022-03-29 20:08:26 | | 8297 | 2022-03-16 18:38:27 | | 8864 | 2022-03-30 16:31:42 | | 10971 | 2022-03-15 16:26:42 | | 10971 | 2022-03-29 16:17:29 | | 12242 | 2022-03-30 17:58:16 | | 13032 | 2022-03-30 17:32:56 | | 21948 | 2022-03-04 20:52:18 | | 30020 | 2022-02-09 06:50:54 | | 31672 | 2022-03-30 15:25:21 | | 34500 | 2021-11-01 13:46:34 | | 37117 | 2022-03-30 14:41:03 | +---------+---------------------+ 63 rows in set (0.05 sec)
I've just removed the duplicate last_activity
entries for those users and the member count for the Teaching Courses group is now correct.
To see how many other users might have duplicate 'last_activity'
entries, I ran the following DB query:
select user_id,count(user_id) as DuplicateLastActivity from wp_bp_activity where type = 'last_activity' group by user_id having count(user_id) > 1;
The query returned 128 other users, so we'll also have to run a deletion routine to remove the duplicate 'last_activity'
items for those users. Then we would have to recalculate the group member counts for groups that these users are a member of. sigh.
The easier workaround would be to change the count select query from COUNT( u.user_id )
to COUNT( DISTINCT u.user_id )
here:
https://github.com/buddypress/buddypress/blob/c9b0e32face9b2e028184d7dad0ffd14fee87919/src/bp-core/classes/class-bp-user-query.php#L556
But that would be hiding why (and how) the duplicate 'last_activity'
entries are created in the first place.
Updated by Boone Gorges over 2 years ago
Thanks for this, Ray. I have a vague recollection of trying to use DISTINCT
when writing BP_User_Query
but running into performance problems.
BuddyPress is supposed to enforce uniqueness for a user's last_activity items. See https://github.com/buddypress/buddypress/blob/c9b0e32face9b2e028184d7dad0ffd14fee87919/src/bp-core/classes/class-bp-core-user.php#L864=. There must be a bug in the routine for fetching the existing last_activity - https://github.com/buddypress/buddypress/blob/c9b0e32face9b2e028184d7dad0ffd14fee87919/src/bp-core/classes/class-bp-core-user.php#L803=
Maybe we could set up a routine for detecting these dupes? We could hook into bp_core_user_updated_last_activity and detect whether the just-updated user has multiple entries, and if so, record the stack trace. This might help to narrow down the issue.
Updated by Raymond Hoh about 2 years ago
- Related to Bug #16671: Group member count off added
Updated by Boone Gorges over 1 year ago
- Subject changed from Members # do not match to Members # do not match due to multiple last_activity entries
- Category name set to BuddyPress (misc)
- Assignee set to Boone Gorges
- Target version set to 2.1.5
Running this query again, I see that there are a few instances of duplicate last_activity entries:
mysql> select user_id,count(user_id) as DuplicateLastActivity from wp_bp_activity where type = 'last_activity' group by user_id having count(user_id) > 1; +---------+-----------------------+ | user_id | DuplicateLastActivity | +---------+-----------------------+ | 10943 | 2 | | 31208 | 3 | | 34652 | 2 | +---------+-----------------------+ 3 rows in set (0.17 sec)
They've all occurred since the beginning of 2023:
mysql> select user_id, date_recorded from wp_bp_activity where type = 'last_activity' and user_id in (10943, 31208, 34652); +---------+---------------------+ | user_id | date_recorded | +---------+---------------------+ | 10943 | 2023-03-26 00:50:24 | | 10943 | 2023-04-10 18:13:49 | | 31208 | 2023-02-13 06:16:35 | | 31208 | 2023-01-08 20:15:59 | | 31208 | 2023-01-20 19:08:49 | | 34652 | 2023-03-20 22:42:02 | | 34652 | 2023-01-08 20:16:14 | +---------+---------------------+ 7 rows in set (0.00 sec)
Obviously, this is not a very large number, so it's not a runaway problem that's urgent to deal with. But it would be nice to come to some sort of conclusion. I see a couple possibilities:
1. Do nothing. We might decide that this is not a serious issue and thus is not worth the effort. We might occasionally clean out the duplicates in a manual fashion.
2. Set up a routine that occasionally (say, monthly?) queries for dupes and cleans them up.
3. Set up a post-update routine, running perhaps here https://github.com/buddypress/buddypress/blob/c9b0e32face9b2e028184d7dad0ffd14fee87919/src/bp-core/classes/class-bp-core-user.php#L944, that detects duplicates for the specific user, and deletes the older one.
4. Put some tracers in place to try to detect how the duplicates are being created in the first place.
In an ideal world, we'd do 4. But that's hard and will take lots of iteration. Option 3 seems pretty straightforward, and it would likely cover nearly all instances of the problem. It's just a band-aid, but it should be a pretty good one. Ray, do you agree?
Updated by Raymond Hoh over 1 year ago
Ray, do you agree?
Yeah, let's do option 3 for now. Thanks Boone!
Updated by Boone Gorges over 1 year ago
- Status changed from New to Staged for Production Release
Sounds good. This is done in https://github.com/cuny-academic-commons/cac/commit/4254e57f6576ce1f080573ed5352e9976467e667
Updated by Boone Gorges over 1 year ago
- Status changed from Staged for Production Release to Resolved