Support #21888
Trouble with site cloning (status: open)
0%
Description
Shiraz Biggie wrote yesterday:
Hi Fabulous Commons Folks,
I’m having trouble with site cloning since the hosting shift to Reclaim, and I wanted to check whether it’s something with my sites or a general problem since the shift. We’ve had issues with our site cloning over at Blogs@Baruch, but I haven’t had issues previously with the Commons.
I cloned the course site I teach from a few days ago but didn’t check at the time that it had worked fully. It hadn’t. It cloned all the pages and posts, including non-Admin posts from my students. Media is there but no longer appearing on pages, and internal site links have not updated. I tried to clone the same site again tonight with the same result. I thought it might be a site-editor issue, so I tried with two additional older sites. Both had similar issues.
I also experienced some time out issues though that may have been because I was asking too much at once. Hopefully this won’t be too hard a solve.
I asked for more info and here's the reply:
"I tried cloning with the following:
https://f24casd1717.commons.gc.cuny.edu/
to https://s25casd1717.commons.gc.cuny.edu/ and then another time to https://casd1717s25.commons.gc.cuny.edu/ - media seemed to have come over on the second one, but links didn’t update, and all brought over non-admin posts
I also tried https://kidlitperf.commons.gc.cuny.edu/ to https://kidlittest.commons.gc.cuny.edu/ which had similar results.
Then I tried https://fairytales.commons.gc.cuny.edu/ to https://fairytales2.commons.gc.cuny.edu/one-story-many-variations/modern-lrrh/ which had no additional users and that seems to have more or less cloned correctly"
Updated by Boone Gorges about 2 months ago
- Related to Bug #21895: Site creation/cloning should be off-loaded and broken into batches added
Updated by Boone Gorges about 2 months ago
Hi there - I can confirm the issue.
Essentially, it's a timeout. The source sites are very large - they have lots of content, and have had many plugins activated in the past, and thus have many database tables. The request is timing out partway through, and the whole thing fails.
We probably need a multi-pronged approach here. My first step will be to ask Reclaim if we can increase the timeout.
Updated by Marilyn Weber about 2 months ago
An update from Shiraz:
I’m following up on the cloning issue.
I started remaking the site to have a base template that I can clone each semester instead of continually cloning the ever growing site of several years of teaching. (Also I can’t put my students off longer!)
As I was doing so – do we no longer have the ability to embed an iframe? I’ve always been able to directly embed books from archive.org with an iframe. I know there’s a plugin for video and audio, but do we have one that will do those flip through books?
Updated by Marilyn Weber about 2 months ago
Boone and Matt -
I'm wondering if it makes sense for Shiraz to have a Redmine account so you can talk directly? LMK your thoughts.
Updated by Boone Gorges about 2 months ago
We are working with Reclaim to identify ways around the issue, but essentially we will need to rewrite some of the cloning architecture in order to accommodate very large clones like Shiraz's. See #21895. Unfortunately, this will not happen in the very short term. I'll follow up in this ticket if we have more info about short-term workarounds.
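As a rough illustration of what "broken into batches" (see #21895) could mean in practice, here is a minimal sketch in Python. It is not the Commons implementation: `chunked`, `CloneJob`, `run_next_batch`, and the table names are all hypothetical. The idea is only that each request copies one small batch and records progress, so no single request has to finish the whole clone before the timeout.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List


def chunked(items: List[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


@dataclass
class CloneJob:
    """Hypothetical progress record that would be persisted between requests."""
    pending: List[List[str]]                        # batches still to copy
    done: List[str] = field(default_factory=list)   # items already copied

    @property
    def finished(self) -> bool:
        return not self.pending


def run_next_batch(job: CloneJob, copy_item: Callable[[str], None]) -> None:
    """Copy a single batch, then return so the request ends quickly."""
    if job.finished:
        return
    batch = job.pending[0]
    for item in batch:
        if item in job.done:      # already copied in an earlier, interrupted run
            continue
        copy_item(item)
        job.done.append(item)
    job.pending.pop(0)            # drop the batch only after every item succeeded


if __name__ == "__main__":
    tables = [f"wp_42_table{i}" for i in range(7)]   # made-up table names
    job = CloneJob(pending=list(chunked(tables, 3)))
    while not job.finished:       # each pass stands in for one background request
        run_next_batch(job, copy_item=lambda name: None)
    print(f"copied {len(job.done)} tables")
```

Because a batch is removed from `pending` only after all of its items succeed, a request that dies partway through can be retried without losing or duplicating work.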
"As I was doing so – do we no longer have the ability to embed an iframe? I’ve always been able to directly embed books from archive.org with an iframe. I know there’s a plugin for video and audio, but do we have one that will do those flip through books?"
We can't embed arbitrary iframes, for security reasons. If the user would like to share a specific page that she'd like embedded, or a specific bit of iframe code, or generally some more details on just how the embed flow ought to work, I can think about whether we can do a custom shortcode or the like.
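One common way to make a custom embed shortcode safe is to accept only a URL and check its host against an allowlist, rather than accepting raw iframe markup. The sketch below shows that check in Python purely for illustration (on a WordPress site it would be written in PHP). The allowlist contents and the function name are assumptions, not something decided in this ticket.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; which hosts the Commons would trust is not stated here.
ALLOWED_EMBED_HOSTS = {"archive.org", "www.archive.org"}


def is_allowed_embed(src: str) -> bool:
    """Return True only for https URLs whose exact host is on the allowlist."""
    parsed = urlparse(src)
    host = (parsed.hostname or "").lower()
    return parsed.scheme == "https" and host in ALLOWED_EMBED_HOSTS
```

Comparing the exact hostname, rather than testing whether the URL merely contains "archive.org", keeps look-alike hosts such as `archive.org.evil.example` from passing.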
"I'm wondering if it makes sense for Shiraz to have a Redmine account so you can talk directly? LMK your thoughts."
Only if Shiraz is interested in engaging directly with our dev team.
Updated by Marilyn Weber about 2 months ago
Shiraz writes "Happy to have a Redmine account if that makes life easier for you.
FYI – I tried just now to clone the more barebones template I started. At first it didn’t work, but then I deleted a bunch of the media and it did. So, I have a workaround for now."
You can reach Shiraz at sbiggie@gradcenter.cuny.edu
Updated by Marilyn Weber about 1 month ago
She is interested. Please let me know what I should tell her about how to join Redmine. Thanks!
Updated by Boone Gorges about 1 month ago
I've created an account for the user. She should have received login info at the email address provided.
Updated by Shiraz Biggie about 1 month ago
Hi Boone (et al.),
Just had a chance to log in here. I know you're handling a lot with the transition. I'm up and running for this semester, so the cloning issue won't be pressing until next fall.
Regarding the iframe issue: in the past, I embedded a number of public domain picture books from archive.org on the site, using their instructions, so students could flip through them without being redirected to an outside link. I know the Commons has the audio/video plugin, but do you know of anything that could be enabled to allow the books to be embedded, or another workaround? Archive.org gives a different embed code for wordpress.com, but I am not seeing anything for .org, nor is archive.org listed as a source for any of the other embed plugins available.
Example books: https://archive.org/details/applepie00gree2
https://archive.org/details/peterpanalphabet00herf
And please let me know how I can help as you continue to work on these. I'm especially interested in what might come out of the cloning solutions, particularly anything that might help Christopher and me figure out better options for cloning on Blogs@Baruch.
Much thanks!
Updated by Boone Gorges about 1 month ago
Shiraz, thanks for chiming in here. We'll use this ticket to continue to track cloning issues.
Regarding archive.org embeds, it looks like the Jetpack plugin will enable the same [archiveorg] and [archiveorg-book] shortcodes that you can use on wordpress.com. Try activating Jetpack and then using that syntax on your site.
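For anyone converting a list of archive.org links, here is a small Python sketch that turns a `/details/` URL like the examples above into shortcode text. The attribute form used (`[archiveorg-book <identifier> width=... height=...]`) is an assumption based on the wordpress.com embed code, and the default dimensions are arbitrary; check the result against Jetpack's documentation before relying on it.

```python
from urllib.parse import urlparse


def archive_identifier(details_url: str) -> str:
    """Extract the item identifier from an archive.org /details/<id> URL."""
    parts = [p for p in urlparse(details_url).path.split("/") if p]
    if len(parts) < 2 or parts[0] != "details":
        raise ValueError(f"not an archive.org details URL: {details_url}")
    return parts[1]


def book_shortcode(details_url: str, width: int = 480, height: int = 430) -> str:
    """Build shortcode text; the attribute syntax here is assumed, not verified."""
    ident = archive_identifier(details_url)
    return f"[archiveorg-book {ident} width={width} height={height}]"


if __name__ == "__main__":
    print(book_shortcode("https://archive.org/details/applepie00gree2"))
```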
Updated by Boone Gorges about 1 month ago
I've identified a couple of problems in the cloning process:
1. There was previously a bug in the way that tables-to-be-cloned were identified, which I noticed and then "fixed" in https://github.com/cuny-academic-commons/cac/commit/8c849bd7561052fa398809e969ba4a3f0ad15857. See #21886. But my "fix" had its own bug, an extra trailing underscore that caused the query to match nothing. As a result, that part of the clone has been broken for the past couple of days. This is fixed in https://github.com/cuny-academic-commons/cac/commit/a9b8bf315f62e2f356e7ed7231ad6e0c384578b9
2. While building a tool for splitting off the various parts of cloning, I noticed that the copying of user uploads was taking a really long time. It turns out that my previous technique for copying these uploads, which involved a recursive crawl of the uploads directory for that site, was creating some weird loop logic when combined with the S3 stream wrapper. I've rewritten it to use the S3 API directly: https://github.com/cuny-academic-commons/cac/commit/ff9b53eb3537cc1163752bfd81c4ba9245a063ab, https://github.com/cuny-academic-commons/cac/commit/e4b6ae2274b557b5fb59fc88d0e9acb3cbfaa190
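To illustrate the second point, here is a minimal sketch of copying uploads by listing keys through the S3 API instead of walking a directory tree through a stream wrapper. It is written in Python against boto3's `list_objects_v2` paginator and `copy_object` call, and is not the code from the commits above. The function name, bucket name, and key prefixes are hypothetical.

```python
from typing import Any


def copy_prefix(client: Any, bucket: str, src_prefix: str, dest_prefix: str) -> int:
    """Copy every object under src_prefix to dest_prefix; return how many were copied.

    `client` is expected to behave like a boto3 S3 client.
    """
    copied = 0
    paginator = client.get_paginator("list_objects_v2")
    # A prefix listing is flat: S3 returns matching keys page by page,
    # so no recursive directory traversal is involved.
    for page in paginator.paginate(Bucket=bucket, Prefix=src_prefix):
        for obj in page.get("Contents", []):
            src_key = obj["Key"]
            dest_key = dest_prefix + src_key[len(src_prefix):]
            client.copy_object(
                Bucket=bucket,
                Key=dest_key,
                CopySource={"Bucket": bucket, "Key": src_key},
            )
            copied += 1
    return copied


# Example call (requires boto3 and credentials; all names are made up):
#   import boto3
#   copy_prefix(boto3.client("s3"), "example-bucket", "sites/101/", "sites/202/")
```

The copy happens server-side, so file contents never pass through the web server. Very large objects may need a multipart copy instead of a single `copy_object` call.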
Cloning is still going slowly and will break with sites that use lots of tables, have huge numbers of posts, or have huge numbers of uploads. I'll continue to work on a more general solution in #21895.
Updated by Boone Gorges about 9 hours ago
- Has duplicate Bug #22399: Connected Group and Site cloning time out added