Support #21888
Trouble with site cloning
Status: open
Description
Shiraz Biggie wrote yesterday:
Hi Fabulous Commons Folks,
I’m having trouble with site cloning since the hosting shift to Reclaim, and I wanted to check whether it's something with my sites or a general problem since the shift. We’ve had issues with our site cloning over at Blogs@Baruch, but I haven’t had issues previously with Commons.
I cloned the course site I teach from a few days ago but didn’t check at the time that it had worked fully. It hadn’t. It cloned all the pages and posts, including non-Admin posts from my students. Media is there but no longer appearing on pages, and internal site links have not updated. I tried to clone the same site again tonight with the same result. I thought it might be a site editor issue, so I tried with two additional older sites. Both had similar issues.
I also experienced some timeout issues, though that may have been because I was asking too much at once. Hopefully this won’t be too hard a solve.
I asked for more info and here's the reply:
"I tried cloning with the following:
https://f24casd1717.commons.gc.cuny.edu/
to https://s25casd1717.commons.gc.cuny.edu/ and then another time to https://casd1717s25.commons.gc.cuny.edu/ - media seemed to have come over on the second one, but links didn’t update, and all brought over non-admin posts
I also tried https://kidlitperf.commons.gc.cuny.edu/ to https://kidlittest.commons.gc.cuny.edu/ which had similar results.
Then I tried https://fairytales.commons.gc.cuny.edu/ to https://fairytales2.commons.gc.cuny.edu/one-story-many-variations/modern-lrrh/ which had no additional users and that seems to have more or less cloned correctly"
Updated by Boone Gorges 11 months ago
Hi there - I can confirm the issue.
Essentially, it's a timeout. The source sites are very large - they have lots of content, and have had many plugins activated in the past, and thus have many database tables. The request is timing out partway through, and the whole thing fails.
We probably need a multi-pronged approach here. My first step will be to ask Reclaim if we can increase the timeout.
Updated by Marilyn Weber 10 months ago
An update from Shiraz:
I’m following up on the cloning issue.
I started remaking the site to have a base template that I can clone each semester instead of continually cloning the ever growing site of several years of teaching. (Also I can’t put my students off longer!)
As I was doing so – do we no longer have the ability to embed an iframe? I’ve always been able to directly embed books from archive.org with an iframe. I know there’s a plugin for video and audio, but do we have one that will do those flip through books?
Updated by Marilyn Weber 10 months ago
Boone and Matt -
I'm wondering if it makes sense for Shiraz to have a Redmine account so you can talk directly? LMK your thoughts.
Updated by Boone Gorges 10 months ago
We are working with Reclaim to identify ways around the issue, but essentially we will need to rewrite some of the cloning architecture in order to accommodate very large clones like Shiraz's. See #21895. Unfortunately, this will not happen in the very short term. I'll follow up in this ticket if we have more info about short-term workarounds.
"As I was doing so – do we no longer have the ability to embed an iframe? I’ve always been able to directly embed books from archive.org with an iframe. I know there’s a plugin for video and audio, but do we have one that will do those flip through books?"
We can't embed arbitrary iframes, for security reasons. If the user would like to share a specific page that she'd like embedded, or a specific bit of iframe code, or generally some more details on just how the embed flow ought to work, I can think about whether we can do a custom shortcode or the like.
"I'm wondering if it makes sense for Shiraz to have a Redmine account so you can talk directly? LMK your thoughts."
Only if Shiraz is interested in engaging directly with our dev team.
Updated by Marilyn Weber 10 months ago
Shiraz writes "Happy to have a Redmine account if that makes life easier for you.
FYI – I tried just now to clone the more barebones template I started. At first it didn’t work, but then I deleted a bunch of the media and it did. So, I have a workaround for now."
You can reach Shiraz at sbiggie@gradcenter.cuny.edu
Updated by Marilyn Weber 10 months ago
She is interested. Please let me know what I should tell her about how to join Redmine. Thanks!
Updated by Boone Gorges 10 months ago
I've created an account for the user. She should have received login info at the email address provided.
Updated by Shiraz Biggie 10 months ago
Hi Boone (et al.),
Just had a chance to log in here. I know you're handling a lot with the transition. I'm up and running for this semester, so the cloning issue won't be a pressing issue until next fall.
Regarding the iframe issue: in the past, I embedded a number of public domain picture books from archive.org on the site, using their instructions, so students could flip through them without being redirected to an external link. I know the Commons has the audio/video plugin, but do you know of anything that could be enabled to allow the books to be embedded, or another workaround? Archive.org gives a different embed code for wordpress.com, but I am not seeing anything for .org, nor is it listed as a source for any of the other embed plugins available.
Example books: https://archive.org/details/applepie00gree2
https://archive.org/details/peterpanalphabet00herf
And please let me know how I can help as you continue to work on these. I'm especially interested in what might come out of the cloning solutions, particularly if it is anything that might help Christopher and me in figuring out some better options for cloning on Blogs@Baruch.
Much thanks!
Updated by Boone Gorges 10 months ago
Shiraz, thanks for chiming in here. We'll use this ticket to continue to track cloning issues.
Regarding archive.org embeds, it looks like the Jetpack plugin will enable the same [archiveorg] and [archiveorg-book] shortcodes that you can use on wordpress.com. Try activating Jetpack and then using that syntax on your site.
Updated by Boone Gorges 10 months ago
I've identified a couple of problems in the cloning process:
1. There was previously a bug in the way that tables-to-be-cloned were identified, which I noticed and then "fixed" in https://github.com/cuny-academic-commons/cac/commit/8c849bd7561052fa398809e969ba4a3f0ad15857. See #21886. But my "fix" had its own bug, an extra trailing underscore that caused the query to match nothing. As a result, that part of the clone has been broken for the past couple of days. This is fixed in https://github.com/cuny-academic-commons/cac/commit/a9b8bf315f62e2f356e7ed7231ad6e0c384578b9
2. While building a tool for splitting off the various parts of cloning, I noticed that the copying of user uploads was taking a really long time. It turns out that my previous technique for copying these uploads, which involved a recursive crawl of the uploads directory for that site, was creating some weird loop logic when combined with the S3 stream wrapper. I've rewritten it to use the S3 API directly: https://github.com/cuny-academic-commons/cac/commit/ff9b53eb3537cc1163752bfd81c4ba9245a063ab, https://github.com/cuny-academic-commons/cac/commit/e4b6ae2274b557b5fb59fc88d0e9acb3cbfaa190
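To illustrate the first problem in the list above, here is a minimal Python sketch of how one extra trailing underscore in a table prefix can make a lookup match nothing. The actual code is PHP and lives in the linked commits; the function name, the table names, and the use of literal prefix matching are all assumptions made for illustration.

```python
def tables_for_site(all_tables, site_prefix):
    """Return the tables whose names start with the given per-site prefix."""
    return [name for name in all_tables if name.startswith(site_prefix)]


# Hypothetical table names; site 123 is the one being cloned.
tables = ["wp_123_posts", "wp_123_postmeta", "wp_123_options", "wp_124_posts"]

correct = tables_for_site(tables, "wp_123_")
broken = tables_for_site(tables, "wp_123__")  # extra trailing underscore

print(correct)  # the three wp_123_ tables
print(broken)   # empty list, so there is nothing to clone
```

Because an empty match is not an error in itself, a bug of this kind can go unnoticed until someone inspects the cloned site.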
Cloning is still going slowly and will break with sites that use lots of tables, have huge numbers of posts, or have huge numbers of uploads. I'll continue to work on a more general solution in #21895.
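As a rough sketch of the "use the S3 API directly" approach from the second item above: the Python function below lists objects under a source prefix and issues a server-side copy for each one, rather than crawling a directory tree through a stream wrapper. It is not the code from the linked commits, which are PHP. The bucket layout, the prefixes, and the function name are hypothetical, and `s3_client` is assumed to behave like a boto3 S3 client.

```python
def copy_site_uploads(s3_client, bucket, source_prefix, dest_prefix):
    """Copy every object under source_prefix to dest_prefix within one bucket.

    s3_client is assumed to expose get_paginator() and copy_object() the way
    a boto3 S3 client does. Returns the number of objects copied.
    """
    copied = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=source_prefix):
        # A page has no "Contents" key when nothing matches the prefix.
        for obj in page.get("Contents", []):
            source_key = obj["Key"]
            # Swap the source prefix for the destination prefix.
            dest_key = dest_prefix + source_key[len(source_prefix):]
            s3_client.copy_object(
                Bucket=bucket,
                Key=dest_key,
                CopySource={"Bucket": bucket, "Key": source_key},
            )
            copied += 1
    return copied
```

With a real client this might be called as `copy_site_uploads(boto3.client("s3"), "example-bucket", "sites/123/", "sites/456/")`, where the bucket name and prefixes are placeholders. A flat, paginated listing like this avoids recursion entirely, though very large sites would still make one copy request per file.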
Updated by Boone Gorges 9 months ago
- Has duplicate Bug #22399: Connected Group and Site cloning time out added
Updated by Boone Gorges 7 months ago
- Has duplicate Support #22818: critical error when trying to clone added
Updated by Shiraz Biggie 3 months ago
I mentioned this to Laurie in a different email about something else, but I'll add a note here for everyone's reference and as a request for help. When I cloned my site this semester (f25casd1717 cloned from childlit), it initially appeared that everything was transferred correctly. However, although there are items in the library for all the media files and their accompanying alt text, captions, etc., none of them actually display. They do, however, display in editor modes.
Currently, I am changing media file addresses to pull from the base site that I ended up setting up last semester (previously, I had cloned from each semester's site to the next, but this seemed an easier way from now on) but I am hoping there is a better fix.
Updated by Boone Gorges 3 months ago
"although there are items in the library for all the media files and their accompanying alt text, captions, etc, none of them actually display. They do, however, display in editor modes."
The fact that they display in editor modes but not on the front end suggests a different kind of problem than "they didn't copy over". If possible, perhaps you could share a link to the site in question and let me know where I could look to see the problem in action.
Updated by Shiraz Biggie 3 months ago
Sure - https://f25casd1717.commons.gc.cuny.edu/ is the cloned site. https://childlit.commons.gc.cuny.edu/ was the site that was cloned.
I have adjusted images on the main home page and through module one pages, but not beyond.
See, for example, https://f25casd1717.commons.gc.cuny.edu/module-3/the-call-of-the-sea/, where the alt text and captions (blue) appear. The items in the media library all appear blank, but show in page editors, and if you go to edit the media file itself. The direct URLs for the media files give the following error:
This XML file does not appear to have any style information associated with it. The document tree is shown below.
<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
<RequestId>KAQD80KHJR4AKRWA</RequestId>
<HostId>
GlwQGiCQmaMI9UiYgzxcdGsPSGAy5Eb5RoJ0tNqyMbOfVH0aRDGMRNZ8YygKncOqH2zj39/7DqLlgIQzmvu1sH+FVr+17c1mbunu6TDX7qE=
</HostId>
</Error>
Thanks!
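For anyone checking other media URLs for the same symptom, here is a small Python sketch that pulls the error code out of a response body shaped like the one quoted above. It only parses the XML; fetching the URL is left out, and the function name is hypothetical. An `AccessDenied` code is consistent with a permissions problem, but does not by itself prove one.

```python
import xml.etree.ElementTree as ET


def s3_error_code(body):
    """Return the <Code> text from an S3-style <Error> document, else None."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        # Not XML at all, e.g. real image data or an HTML page.
        return None
    if root.tag != "Error":
        return None
    return root.findtext("Code")


# Shortened copy of the error body quoted in this ticket.
sample = """<Error>
<Code>AccessDenied</Code>
<Message>Access Denied</Message>
</Error>"""

print(s3_error_code(sample))  # AccessDenied
```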
Updated by Boone Gorges 3 months ago
Thanks so much, Shiraz. The particular problem here had something to do with the permissions on the files, specifically as they interact with the overall privacy of your site (see 'Site visibility' at https://f25casd1717.commons.gc.cuny.edu/wp-admin/options-reading.php). I just toggled that radio button to a private setting, and then restored back to 'Discourage...'. This has the effect of jumpstarting the Commons's script for syncing these file permissions, and it appears to have reset yours properly: https://f25casd1717.commons.gc.cuny.edu/wp-admin/upload.php
To be clear, it's a bug that this doesn't happen properly in some cases of cloning. But at least this can be a workaround while the team investigates.
Updated by Shiraz Biggie 3 months ago
It may be a workaround, but I am thrilled to have a quick solution for now. Thank you!