Support #18982

open

Sites missing pages, menus, icons, banners

Added by Marilyn Weber 5 months ago. Updated 4 months ago.

Status: Reporter Feedback
Priority name: Normal
Assignee: -
Category name: -
Target version:
Start date: 2023-10-05
Due date:
% Done: 0%
Estimated time:
Deployment actions:

Description

Syelle Graves reports:

"Our websites are acting very strange across three browsers.
Our main site menu is sprawled out instead of dropping down, and the site homepage icon/banner is gone:
https://iletc.commons.gc.cuny.edu/we-authors/
These sites are missing most of the pages that we had in the menu, and the missing pages are gone from dashboard, as well, not even in a trash folder:
https://weauthors.commons.gc.cuny.edu/
https://8nsshl2021.commons.gc.cuny.edu/ (this site also has no more image at the top of the site).
Is there some kind of bug going on at the Commons? I just noticed all of these issues today. I don't see anything new on redmine so I can't imagine what's going on.
Thanks for your help!"


Files

screenshot-francaiscuny.commons.gc.cuny.edu-2023.10.14-14_53_01.png (72.4 KB), Syelle Graves, 2023-10-14 03:11 PM
export.sh (885 Bytes), Tool for exporting site database tables from restored mysql database, Boone Gorges, 2023-10-20 03:13 PM
back-up-production-site.sh (1.29 KB), Utility for backing up a production site, Boone Gorges, 2023-10-20 03:20 PM

Related issues

Related to CUNY Academic Commons - Design/UX #18995: Clarifying 'delete account' text (New, Boone Gorges, 2023-10-06)

Actions #1

Updated by Marilyn Weber 5 months ago

Juwon Jun added

"I'm sorry to add to the stress of this situation, but I am wondering if it has something to do with my Commons account being closed. I had two accounts (one with and ) associated with Commons, and because there was much confusion, I decided to close my gradcenter.cuny.edu account. I may have been an author on a number of posts on ILETC's website, would this have contributed to the deletion of posts? Would it be possible to recover this account, if so?

I'm very sorry about contributing to this problem. I'm glad to jump on a call to discuss.

Best,
Juwon"

Actions #2

Updated by Raymond Hoh 5 months ago

I'm sorry to add to the stress of this situation, but I am wondering if it has something to do with my Commons account being closed. I had two accounts (one with and ) associated with Commons, and because there was much confusion, I decided to close my gradcenter.cuny.edu account. I may have been an author on a number of posts on ILETC's website, would this have contributed to the deletion of posts?

Deleting your Commons account through the "Settings > Delete Account" screen would have deleted posts from sites that your user account had published. Here's where WordPress deletes posts when a user account is deleted - https://github.com/WordPress/WordPress/blob/8f66e7884bbe33b81bb80159b2061519a2a72686/wp-admin/includes/ms.php#L184-L186 .

However, if the Commons account was deleted less than 30 days ago, those posts might still be in the trash filter in the Posts admin dashboard. If the Commons account was deleted more than 30 days ago, it would be hard to recover the post content but it might be possible to retrieve those posts through our database backups. However, this process is arduous and I'm not sure how long we keep database backups for. Boone, do you remember the database retention period that IT keeps backups for?
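
One way to check whether an author's pages and posts are still in a site's trash is WP-CLI's `wp post list` with `--post_status=trash`. The following is a minimal sketch, assuming shell access to the install with WP-CLI available. The helper name `trash_check_cmd` is hypothetical, and it only prints the command so that it can be reviewed before anything is run.

```shell
# Hypothetical helper: print the WP-CLI command that lists trashed items
# of one post type on one site. Nothing is executed against WordPress here.
trash_check_cmd() {
    site_url=$1           # e.g. https://weauthors.commons.gc.cuny.edu
    post_type=${2:-page}  # "page" (default) or "post"
    printf 'wp post list --post_type=%s --post_status=trash --fields=ID,post_title,post_author,post_modified --url=%s\n' \
        "$post_type" "$site_url"
}

# Print the check for pages on the two sites reported as missing pages.
for url in https://weauthors.commons.gc.cuny.edu https://8nsshl2021.commons.gc.cuny.edu; do
    trash_check_cmd "$url" page
done
```

An empty result from the printed command would be consistent with the items having been removed outright rather than trashed.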


In the meantime, I restored the banner image for https://8nsshl2021.commons.gc.cuny.edu/ by finding an online backup of the site through web.archive.org -- https://web.archive.org/web/20230404002546/https://8nsshl2021.commons.gc.cuny.edu/.

Actions #3

Updated by Syelle Graves 5 months ago

Thanks, Ray. I thought the same thing, because I've seen deleted accounts on BMCC OL simply put those posts in the trash folder, but the issue is pages, not just posts, and in all sites, db > pages has no items in the trash--all pages on all sites have simply vanished. All posts she wrote have also vanished.

See
https://weauthors.commons.gc.cuny.edu/wp-admin/edit.php?post_type=page
https://8nsshl2021.commons.gc.cuny.edu/wp-admin/edit.php?post_type=page

Could Juwon's account be restored? It was definitely only a few days ago.

There is no way for us to even figure out all that has been deleted--Juwon created and worked on our sites for years, and we can't even see them in a trash folder to restore them, or recall exactly what was in each menu, in some cases. And this means we've lost content that lived in the sites, so we couldn't even create the pages again by hand.

Actions #4

Updated by Boone Gorges 5 months ago

I can contact IT to see what's involved in getting a backup restored. First, I need to know what backup to ask for. Syelle said "it was definitely only a few days ago" but I need to know a specific date, because it's arduous to go through this even once and I'd like to avoid doing it twice. Perhaps the team could let me know the most recent day they are sure is definitely before the account deletion?

Assuming we are able to get a restored backup with the information, the process of restoring the content in question is then going to be quite complicated. In advance of that event, it would be helpful if the team could make a list of affected sites. Then, for each of those sites, I'll need to know whether (a) it's possible to do a wholesale restore of that site based on what's in the backup (in other words, completely replacing the contents of the production site with the one from the backup), or (b) whether fine-grained restore of specific pieces of content will be necessary. You'll make that decision based on whether content has been created and/or modified in the time since the account deletion. Wholesale restoration is much easier and more foolproof from a technical point of view, but we can discuss this more once the team has catalogued the sites that need attention.

Actions #5

Updated by Syelle Graves 5 months ago

I can contact IT to see what's involved in getting a backup restored. First, I need to know what backup to ask for. Syelle said "it was definitely only a few days ago"

Thank you, Ray.

but I need to know a specific date, because it's arduous to go through this even once and I'd like to avoid doing it twice. Perhaps the team could let me know the most recent day they are sure is definitely before the account deletion?

-Juwon says "I deleted my account on Monday of this week, or at latest by Tuesday. I gave myself all admin access to my other account ()."
-The team couldn't know, because it's Juwon's personal account, and Juwon isn't fully sure because she didn't receive a confirmation email of the account deletion.
-I've also requested a redmine account for her so she can reply directly.

Assuming we are able to get a restored backup with the information, the process of restoring the content in question is then going to be quite complicated. In advance of that event, it would be helpful if the team could make a list of affected sites.

I've asked Juwon to work on this--she worked on all our sites with both Commons accounts, over many years, and many sites of her own. She'll add the site list here, or email it to me and I'll share here, ASAP.

Then, for each of those sites, I'll need to know whether (a) it's possible to do a wholesale restore of that site based on what's in the backup (in other words, completely replacing the contents of the production site with the one from the backup), or (b) whether fine-grained restore of specific pieces of content will be necessary.

(a) sounds much better; I don't believe it's possible to estimate all that Juwon created.

You'll make that decision based on whether content has been created and/or modified in the time since the account deletion.

I don't think changes have been made to our sites this week, or not enough to matter as much as the restore of years of content. I'll tell our team not to make any changes to any sites right away, and find out if anyone has done so this week, since Monday.

Wholesale restoration is much easier and more foolproof from a technical point of view, but we can discuss this more once the team has catalogued the sites that need attention.

All the better. List of sites coming as soon as we can.

Is it expected that no pages or posts show in any deleted folders on sites I've checked, even within the thirty days, and that it seems her media was deleted as well?

Actions #6

Updated by Boone Gorges 5 months ago

Thanks for this update. I'll start the conversation with our IT team. I'll be requesting the first available backup before Monday, October 2, which is probably late in the night on Sunday Oct 1.

Actions #8

Updated by Colin McDonald 5 months ago

Actions #9

Updated by Syelle Graves 5 months ago

Once the sites are restored, should we routinely try to reassign all page and post authors to a current team member, every time a team member leaves? Regarding media, and most recent edits to other authors' posts--which we found got deleted on https://weauthors.commons.gc.cuny.edu/ when Juwon deleted her account--how would we even reassign those elements?

The other thing I was considering recommending to our team is to use the co-authors plus plugin to create a neutral author that we can use, like “ILETC staff” with the aim of those materials remaining in perpetuity. But if we used a co-authors plus-created author, would the posts and pages still vanish if the original author of that post deletes their Commons account?

Actions #10

Updated by Boone Gorges 5 months ago

Thanks for the list, Syelle. I'm still waiting to hear from IT about how exactly they'll be able to provide the necessary backup for us. If I don't hear in the next day or so, I'll follow up with them to get an update.

You ask some good questions about the specifics of user deletion. As part of this ticket and the follow-up #18995, our team will be putting together a technical description of what happens throughout the Commons when a user account is deleted. (It's complicated because it's controlled by many plugins, etc.) As that comes together, we'll keep your team apprised regarding best practices.

Actions #11

Updated by Syelle Graves 5 months ago

Thank you so much for the update, Boone, and for keeping in touch with IT. We are writing a DOE activities report this week, so having access to our full website history/content is particularly on our minds at the moment, especially because we can't know/see what is gone from the sites.

And yes, I'm watching #18995; we appreciate those steps being taken.

One more thing occurs to me: to restore the websites to the back-ups from the day before Juwon's account got deleted, all her content will have to be assigned to a user, right? Must/might her deleted user account simply be "undeleted" from the database, in order to restore the websites? Or, would her content somehow be assigned to another site admin? Or, could her content be reassigned to Juwon's active Commons user account?

Actions #12

Updated by Boone Gorges 5 months ago

Hi all - IT has provided a restored database backup dating from Sunday 2023-10-01. I've generated some initial backups, which I'll summarize here before talking about next steps.

Identified data

The deleted account had username jjun, with ID 17832. When querying for sites associated with this user, I found the following that matched your list:

1185 https://iletc.commons.gc.cuny.edu
8732 https://heritagespn.commons.gc.cuny.edu
9470 https://teleplaza.commons.gc.cuny.edu
13045 https://francaiscuny.commons.gc.cuny.edu
14452 https://8nsshl2021.commons.gc.cuny.edu
17904 https://weauthors.commons.gc.cuny.edu
21293 https://psyched4stem.commons.gc.cuny.edu

I also found the following sites associated with the jjun user, which were not on Syelle's list:

8321 mals78500fa19.commons.gc.cuny.edu
11990 8sshl2021.commons.gc.cuny.edu
13226 orientaldeco.commons.gc.cuny.edu

The following site was on Syelle's list but the jjun user doesn't appear to have had an active role on the site:

23444 https://slweauthors.commons.gc.cuny.edu/

Note that my technique for identifying the sites associated with the jjun user is not foolproof. It only detects those sites where the jjun user had a role as of the snapshot. If jjun was a member of another site at some point in the past, and created content on that site, but later left the site, it would not be reflected in my research.
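
For reference, here is one way a role-based lookup like this could be done. It is a sketch, not necessarily the query used for the lists above. On WordPress multisite, a user's role on each site is stored in `wp_usermeta` under a key of the form `wp_<site_id>_capabilities` (the main site uses `wp_capabilities`, which this pattern deliberately skips). The `wp_` table prefix and the helper names are assumptions.

```shell
# Hypothetical helper: print SQL listing the per-site role entries for a user.
# Each matching meta_key names one site where the user held a role.
site_roles_sql() {
    user_id=$1
    cat <<EOF
SELECT meta_key, meta_value
FROM wp_usermeta
WHERE user_id = ${user_id}
  AND meta_key REGEXP '^wp_[0-9]+_capabilities\$';
EOF
}

# Hypothetical helper: extract the numeric site ID from a key such as
# wp_1185_capabilities.
blog_id_from_meta_key() {
    key=${1#wp_}
    printf '%s\n' "${key%_capabilities}"
}

# Example usage against the restored snapshot (database name is a placeholder):
#   site_roles_sql 17832 | mysql <restored_database>
```

As noted above, this only finds sites where the user still had a role when the snapshot was taken.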

Apart from site data, I was able to identify a small amount of BuddyPress data (group memberships, profile data) associated with the jjun account. I assume that this content does not need to be restored.

Preliminarily, I am fetching the data from all the sites in the first two lists above. Please confirm whether this is an exhaustive list of the sites where data has been lost.

Next steps

For each affected site, we have two potential ways forward.

Overwrite (destructive)

On this strategy, I replace the entire content of the production site in question with the corresponding data from the restored snapshot. I then manually change the authorship for all the jjun content to be owned by the jjun2 account (23855). This will completely overwrite the existing site, and as such it is destructive: Any divergence between the production site and the backup will be destroyed. (I will run backups beforehand in case we need to roll back the overwriting.) Advantages: There's less possibility of missed content; this technique will capture settings changes and other non-post content that might be difficult to track down manually; it's somewhat less work for me. Disadvantages: It's destructive and is all-or-nothing, as described previously.

Duplicate (non-destructive)

On this strategy, I create a blank Commons site for each site in question. I then overwrite the blank site with the data restored from the backup for that site. In this way, you'll have a perfect clone of (for example) iletc.commons.gc.cuny.edu. Since I'm not touching the existing site, this is non-destructive. Your team will be responsible for manually finding and copying over the missing data. Advantages: It's non-destructive, and leaves your team with the responsibility of moving content, which is good because you know your content best. Disadvantages: More work for your team, and the possibility that you might miss something.

We can do a different strategy for each individual site, if necessary. If you can be absolutely certain that there's no changes to preserve from the past ~10 days, 'Overwrite' is probably OK. Otherwise we should go with 'Duplicate', which is far safer and makes me less nervous. Please let me know how your team wants to proceed, for each site.
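
To help judge whether a site has changes worth preserving, one option is to list posts and pages modified on the production site since the deletion date. A minimal sketch, assuming the standard multisite table name `wp_<site_id>_posts`; the helper name is hypothetical. It only covers the posts table, so settings changes, menus stored elsewhere, and uploaded files would not show up.

```shell
# Hypothetical helper: print SQL listing rows in a site's posts table that
# were modified on or after a given date (UTC).
recent_changes_sql() {
    site_id=$1
    since=$2   # date in YYYY-MM-DD form, e.g. 2023-10-02
    cat <<EOF
SELECT ID, post_type, post_status, post_title, post_modified_gmt
FROM wp_${site_id}_posts
WHERE post_modified_gmt >= '${since} 00:00:00'
ORDER BY post_modified_gmt;
EOF
}

# Example usage against production (database name is a placeholder):
#   recent_changes_sql 13045 2023-10-02 | mysql <production_database>
```

An empty result would support 'Overwrite'; anything listed is content that 'Overwrite' would destroy.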

Actions #13

Updated by Syelle Graves 5 months ago

Confirming receipt of your update. This is a huge relief to hear!

I'm 90% sure we'll stick with both the overwrite option and with the original list of sites, despite the caveats you laid out, but I'm going to review with Juwon by phone first, to confirm for certain.

Will confirm here for sure asap.

Actions #14

Updated by Syelle Graves 5 months ago

I also found the following sites associated with the jjun user, which were not on Syelle's list:

Thank you for finding these three additional sites. Responses below each link:

8321 mals78500fa19.commons.gc.cuny.edu

Juwon will contact this site admin and if the site needs restoring, she will reply and let you know. Nothing to do for now.

11990 8sshl2021.commons.gc.cuny.edu

Nothing to do for this site; it was a mock-up. I'm an admin, and I may delete the site anyway.

13226 orientaldeco.commons.gc.cuny.edu

Juwon will contact this site admin and if the site needs restoring, she will reply and let you know. Nothing to do for now.

The following site was on Syelle's list but the jjun user doesn't appear to have had an active role on the site:

23444 https://slweauthors.commons.gc.cuny.edu/

We will take the Duplicate option for this site, so we can confirm nothing has gone missing and then restore any missing data by hand. Thank you!

Note that my technique for identifying the sites associated with the jjun user is not foolproof. It only detects those sites where the jjun user had a role as of the snapshot. If jjun was a member of another site at some point in the past, and created content on that site, but later left the site, it would not be reflected in my research.

We understand.

Apart from site data, I was able to identify a small amount of BuddyPress data (group memberships, profile data) associated with the jjun account. I assume that this content does not need to be restored.

Juwon confirms that this BP data is not needed, thank you.

Preliminarily, I am fetching the data from all the sites in the first two lists above. Please confirm whether this is an exhaustive list of the sites where data has been lost.

We've re-pasted the full list below.

We can do a different strategy for each individual site, if necessary. If you can be absolutely certain that there's no changes to preserve from the past ~10 days, 'Overwrite' is probably OK. Otherwise we should go with 'Duplicate', which is far safer and makes me less nervous. Please let me know how your team wants to proceed, for each site.

Final site list:

DUPLICATE https://slweauthors.commons.gc.cuny.edu/
PLEASE HOLD, WHILE JUWON CONTACTS SITE ADMIN https://psyched4stem.commons.gc.cuny.edu
OVERWRITE https://francaiscuny.commons.gc.cuny.edu
OVERWRITE https://teleplaza.commons.gc.cuny.edu
OVERWRITE https://heritagespn.commons.gc.cuny.edu
OVERWRITE https://8nsshl2021.commons.gc.cuny.edu
OVERWRITE https://iletc.commons.gc.cuny.edu
OVERWRITE https://weauthors.commons.gc.cuny.edu

We're sorry that we have to choose so many Overwrites, because it's a more nerve-wracking process for you. If you wanted, and it wasn't too much trouble, we could take one Duplicate of iletc just to look at it and be 100% that we need the Overwrite, but we doubt this is necessary. Would you recommend this intermediary step? The site iletc has the most lost data, so we doubt the Duplicate would work for restoring the site, but doing the test Duplicate before the Overwrite could help us confirm.

Is there any other info we can give to help? Thanks so much again.

Actions #15

Updated by Boone Gorges 5 months ago

Thanks for this, Syelle.

I've begun by creating a duplicate of https://slweauthors.commons.gc.cuny.edu/. It can be found at https://slweauthors-duplicate.commons.gc.cuny.edu/, visible only to members of the site. Users syellegraves and jjun2 have been added as Administrators of the site - you probably received an email.

we could take one Duplicate of iletc just to look at it and be 100% that we need the Overwrite, but we doubt this is necessary. Would you recommend this intermediary step?

The process of setting up a Duplicate is complicated. It took me about an hour to figure out the necessary steps to duplicate slweauthors above (though I took notes, so it'd be faster in the future). As such, if there's no real chance that the Duplicate would be helpful, let's skip it.

I don't have time to process the rest of these sites today. I will try to do at least some of them tomorrow morning, beginning with iletc, which it seems like may be the most important site in question.

Actions #16

Updated by Syelle Graves 5 months ago

Copy all. Sounds perfect.

Actions #17

Updated by Juwon Jun 5 months ago

Hi Boone, thank you so much for working on this. So glad to hear the restoration is possible. Confirming that I received an invitation for https://slweauthors-duplicate.commons.gc.cuny.edu/.

I have contacted the site admin for https://psyched4stem.commons.gc.cuny.edu. They approve of moving forward with the overwrite. Please proceed with overwriting the site at your convenience.

Updated final site list:

DUPLICATE https://slweauthors.commons.gc.cuny.edu/
OVERWRITE https://psyched4stem.commons.gc.cuny.edu
OVERWRITE https://francaiscuny.commons.gc.cuny.edu
OVERWRITE https://teleplaza.commons.gc.cuny.edu
OVERWRITE https://heritagespn.commons.gc.cuny.edu
OVERWRITE https://8nsshl2021.commons.gc.cuny.edu
OVERWRITE https://iletc.commons.gc.cuny.edu
OVERWRITE https://weauthors.commons.gc.cuny.edu

Actions #18

Updated by Boone Gorges 5 months ago

Thanks, Syelle. I'm going to begin overwriting sites now. I'll provide updates as I've got them.

Actions #19

Updated by Boone Gorges 5 months ago

Hi all - I've completed 4 of the 8 overwrites:

x 23444 DUPLICATE https://slweauthors.commons.gc.cuny.edu/
x 21293 OVERWRITE https://psyched4stem.commons.gc.cuny.edu
x 13045 OVERWRITE https://francaiscuny.commons.gc.cuny.edu
x 9470 OVERWRITE https://teleplaza.commons.gc.cuny.edu

In each case, I overwrote the existing site with the backup, then I reattributed all posts previously belonging to user 17832 (jjun) to 23855 (jjun2). Each one is looking good, as far as I can see.

I've got to pause for a while now. I hope to return and finish the remaining items by the end of the day today.

Actions #20

Updated by Boone Gorges 5 months ago

I'm sorry, but I've been unable to find the time to focus on the last four site restores. I don't want to rush it or do it on the weekend in case of problems, so I will plan to tackle it first thing next week.

Actions #21

Updated by Syelle Graves 5 months ago

Thanks, Boone:

Hi all - I've completed 4 of the 8 overwrites:

x 23444 DUPLICATE https://slweauthors.commons.gc.cuny.edu/
x 13045 OVERWRITE https://francaiscuny.commons.gc.cuny.edu
x 9470 OVERWRITE https://teleplaza.commons.gc.cuny.edu
In each case, I overwrote the existing site with the backup, then I reattributed all posts previously belonging to user 17832 (jjun) to 23855 (jjun2). Each one is looking good, as far as I can see.

-Teleplaza looks back to normal.
-Juwon is reviewing/comparing the two slweauthors sites, live and restored back-up.
-francaiscuny looks restored, but it does have a few restored pages with Commons Admin as the author, and a few with no author at all. I'll go through and assign all pages and posts to our project director (I'll be doing this on all our sites, post-back-up, anyway), but I just want to make sure this is expected behavior? Screenshot attached. May be related to the Co-authors Plus plugin on that site?

I'm sorry, but I've been unable to find the time to focus on the last four site restores. I don't want to rush it or do it on the weekend in case of problems, so I will plan to tackle it first thing next week.

I fully support not rushing or taking up the weekend!

Our faculty will need to use weauthors next week, but not until Wed., 10/18, in case that's helpful.

Actions #22

Updated by Boone Gorges 5 months ago

  • Status changed from New to Reporter Feedback
  • Target version set to Not tracked

Hi all,

I've completed the last four overwrites:

x 8732 OVERWRITE https://heritagespn.commons.gc.cuny.edu
x 14452 OVERWRITE https://8nsshl2021.commons.gc.cuny.edu
x 1185 OVERWRITE https://iletc.commons.gc.cuny.edu
x 17904 OVERWRITE https://weauthors.commons.gc.cuny.edu

The steps taken are the same as those I described in my last comment. Here too I have done visual verification of each site and it appears to be restored properly.

A few possible limitations to restoration that I noted during the process:
- Because of file permission issues, you may find it difficult to delete certain existing items in the Media Library. More specifically, it will be possible to delete the entry from the Media Library, but it might not be possible to delete the file itself. You're unlikely to notice this in practice.
- Certain files could not always be synced from the backup in the proper way, again due to file permissions. This generally applies only to files that were originally created by a plugin, such as the broken-link-checker log or certain cache files. Again, I don't anticipate that you would notice any such issues, but I'm filing it here for reference.

After you've verified the restored state of the sites, please keep track of any odd behavior you might see in the upcoming weeks of regular usage. If there are issues, they're likely to be related to file uploads.

-francaiscuny looks restored, but it does have a few restored pages with Commons Admin as the author, and a few with no author at all. I'll go through and assign all pages and posts to our project director (I'll be doing this on all our sites, post-back-up, anyway), but I just want to make sure this is expected behavior? Screenshot attached. May be related to the Co-authors Plus plugin on that site?

What you see is what was in the 2023-10-01 backup. My only explanation is that those pages must have been in the database at that time. Even in the case of Co-Authors Plus, it ought to be the case that the entire site is restored precisely to its state when that snapshot was taken. In any case, your manual reattribution should be fine. As with the rest of this ticket, let me know if you run into problems with this kind of data that you don't know how to work around.
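
If it's useful to find the pages that show no author, one possible check is to look for posts whose `post_author` no longer matches any row in the users table. This is a sketch under assumptions: table names follow the standard multisite layout (`wp_<site_id>_posts`, `wp_users`), the helper name is hypothetical, and a missing user row is only one possible reason for an empty author column.

```shell
# Hypothetical helper: print SQL listing posts and pages on a site whose
# author ID has no corresponding row in wp_users.
orphaned_author_sql() {
    site_id=$1
    cat <<EOF
SELECT p.ID, p.post_type, p.post_title, p.post_author
FROM wp_${site_id}_posts AS p
LEFT JOIN wp_users AS u ON u.ID = p.post_author
WHERE u.ID IS NULL
  AND p.post_type IN ('post', 'page');
EOF
}

# Example usage (database name is a placeholder):
#   orphaned_author_sql 13045 | mysql <production_database>
```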

I'll leave this ticket open as your team reviews the restored sites.

Actions #23

Updated by Syelle Graves 5 months ago

The four sites look fully restored, at a quick glance. I'm nearly in tears with relief! We'll investigate more thoroughly when we can.

Thank you!

Actions #24

Updated by Boone Gorges 4 months ago

So that our team has a record of what happened, I'm going to post a summary here, along with some small tools I created to help the process.

IT was able to restore a MySQL backup to a server that could be accessed via ldv2. To ease the process of generating an export of database tables for each of the affected sites, I assembled a list of site IDs and wrote the tool export.sh (attached) to query the restored backup for the relevant tables, then to use mysqldump to create .sql files in a specified location. Note that this script requires that you have a .my.cnf file in your home directory that points you toward the relevant database host (and provides credentials).
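
The general shape of such an export can be sketched as follows. This is not the attached export.sh, only an illustration under assumptions: per-site tables are named `wp_<site_id>_*`, the database name and output directory are placeholders, and the helper names are hypothetical. Both `mysql` and `mysqldump` read host and credentials from the `[client]` section of `~/.my.cnf`.

```shell
# Hypothetical helper: print SQL that lists all tables belonging to one site.
# The underscores are escaped so that LIKE treats them literally.
site_tables_sql() {
    site_id=$1
    cat <<EOF
SHOW TABLES LIKE 'wp\_${site_id}\_%';
EOF
}

# Hypothetical helper: print the mysqldump command for a site's tables.
# Arguments: database, site ID, output directory, then the table names.
dump_cmd() {
    database=$1
    site_id=$2
    out_dir=$3
    shift 3
    printf 'mysqldump %s %s > %s/site_%s_backup.sql\n' \
        "$database" "$*" "$out_dir" "$site_id"
}

# Example flow (placeholders in angle brackets):
#   tables=$(site_tables_sql 1185 | mysql -N <restored_database>)
#   dump_cmd <restored_database> 1185 <output_dir> $tables
```

Printing the dump command instead of running it leaves room to confirm that `.my.cnf` points at the restored snapshot and not at production.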

As for user-uploaded files, IT noted that the Commons upload directory is so large that it would be easier to export only specific directories from the snapshot. I provided a list of subdirectories of /blogs.dir/ corresponding to the site IDs mentioned above, and IT copied them to my home directory.

For import, I handled each site separately. Here's the checklist I used:

1. Verify backup .sql file (site_XXX_backup.sql) and corresponding uploads directory
2. Run backup command on production data: ~/data-restore-2023-10-11/back-up-production-site.sh {site_id} (See attached back-up-production-site.sh, which fetches and backs up production tables, and also backs up the production upload files. Note that your .my.cnf file must be changed so that it points to the production host!)
3. Verify this production backup data
4. Import data: wp db import ~/data-restore-2023-10-11/site_{site_id}_backup.sql
5. Reattribute 17832 posts to 23855 (17832 was the deleted author). UPDATE wp_xxxx_posts SET post_author = 23855 WHERE post_author = 17832
6. wp cache flush
7. rsync data: ~/data-restore-2023-10-11/100223/{site_id} to production site's uploads directory
8. Verify site in browser.
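
The checklist above can be sketched as a small dry-run helper that checks the backup files and then prints the commands for one site in order, without executing them. Several details are assumptions, not taken from the steps above: running the UPDATE through `wp db query`, the `rsync -a` flag and trailing slashes, the `wp_<site_id>_posts` table name, and the uploads destination, which must be supplied by the caller. The function names are hypothetical.

```shell
# Step 1: succeed only if the backup .sql file is non-empty and the matching
# uploads directory exists under the restore directory.
verify_backup() {
    site_id=$1
    base=${2:-$HOME/data-restore-2023-10-11}
    [ -s "$base/site_${site_id}_backup.sql" ] && [ -d "$base/100223/$site_id" ]
}

# Steps 2, 4, 5, 6 and 7: print the commands for one site, in checklist order.
# Steps 3 and 8 (verifying the production backup and checking the site in a
# browser) remain manual.
restore_plan() {
    site_id=$1
    uploads_dest=$2   # production uploads directory for this site (assumption)
    base=${3:-$HOME/data-restore-2023-10-11}
    old_author=17832  # deleted account (jjun)
    new_author=23855  # replacement account (jjun2)
    cat <<EOF
$base/back-up-production-site.sh $site_id
wp db import $base/site_${site_id}_backup.sql
wp db query "UPDATE wp_${site_id}_posts SET post_author = $new_author WHERE post_author = $old_author"
wp cache flush
rsync -a $base/100223/$site_id/ $uploads_dest/
EOF
}

# Example usage (the destination path is a placeholder):
#   verify_backup 1185 && restore_plan 1185 <production_uploads_dir>
```

Keeping the helper in dry-run form matches the one-site-at-a-time approach: each printed command can be run and checked before moving to the next.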

Actions #25

Updated by Boone Gorges 4 months ago

Syelle, when you get a moment, could you let me know whether your team has had the chance to verify more of the restored sites? I've had an inquiry from IT about the status of the export. At the moment, the restored database backup is still available, but they would like to be able to shut it down when it's no longer needed. But I'd like to be relatively confident that things are working as you'd expect before making that call. Thanks!

Actions #26

Updated by Syelle Graves 4 months ago

Thanks for the nudge! I've had the whole team take a look at all the sites all week, and no one can find any problems. Everything seems to be aces on our end.
