Support #21888: Trouble with site cloning - CUNY Academic Commons - CUNY Graduate Center - Project Tracking System

Support #21888

open

Trouble with site cloning

Added by Marilyn Weber 6 months ago. Updated 5 months ago.

Status: New
Priority name: Normal
Assignee:
Category name:
Target version:
Start date: 2025-01-27
Due date:
% Done:
Estimated time:
Deployment actions:

Description

Shiraz Biggie wrote yesterday:

Hi Fabulous Commons Folks,

I’m having trouble with site cloning since the hosting shift to Reclaim, and I wanted to check whether it’s something with my sites or a general problem since the shift. We’ve had issues with our site cloning over at Blogs@Baruch, but I haven’t had issues previously with Commons.

I cloned the course site I teach from initially a few days ago but didn’t check at the time that it had worked fully. It hadn’t. It cloned all the pages and posts, including non-Admin posts from my students. Media is there but no longer appearing on pages, and internal site links have not updated. I tried to clone the same site again tonight with the same result. I thought it might be a site editor issue, so I tried with two additional older sites. Both had similar issues.

I also experienced some timeout issues, though that may have been because I was asking too much at once. Hopefully this won’t be too hard to solve.

I asked for more info and here's the reply:

"I tried cloning with the following:

https://f24casd1717.commons.gc.cuny.edu/

to https://s25casd1717.commons.gc.cuny.edu/ and then another time to https://casd1717s25.commons.gc.cuny.edu/ - media seemed to have come over on the second one, but links didn’t update, and all brought over non-admin posts

I also tried https://kidlitperf.commons.gc.cuny.edu/ to https://kidlittest.commons.gc.cuny.edu/ which had similar results.

Then I tried https://fairytales.commons.gc.cuny.edu/ to https://fairytales2.commons.gc.cuny.edu/one-story-many-variations/modern-lrrh/ which had no additional users and that seems to have more or less cloned correctly"

Updated by Boone Gorges 6 months ago

Related to Bug #21895: Site creation/cloning should be off-loaded and broken into batches added

Updated by Boone Gorges 6 months ago

Hi there - I can confirm the issue.

Essentially, it's a timeout. The source sites are very large - they have lots of content, and have had many plugins activated in the past, and thus have many database tables. The request is timing out partway through, and the whole thing fails.

We probably need a multi-pronged approach here. My first step will be to ask Reclaim if we can increase the timeout.

Updated by Marilyn Weber 6 months ago

Thanks! I let Shiraz know.

Updated by Marilyn Weber 6 months ago

An update from Shiraz:

I’m following up on the cloning issue.

I started remaking the site to have a base template that I can clone each semester instead of continually cloning the ever-growing site of several years of teaching. (Also, I can’t put my students off any longer!)

As I was doing so – do we no longer have the ability to embed an iframe? I’ve always been able to directly embed books from archive.org with an iframe. I know there’s a plugin for video and audio, but do we have one that will do those flip-through books?

Updated by Marilyn Weber 6 months ago

Boone and Matt -

I'm wondering if it makes sense for Shiraz to have a Redmine account so you can talk directly? LMK your thoughts.

Updated by Boone Gorges 6 months ago

We are working with Reclaim to identify ways around the issue, but essentially we will need to rewrite some of the cloning architecture in order to accommodate very large clones like Shiraz's. See #21895. Unfortunately, this will not happen in the very short term. I'll follow up in this ticket if we have more info about short-term workarounds.

"As I was doing so – do we no longer have the ability to embed an iframe? I’ve always been able to directly embed books from archive.org with an iframe. I know there’s a plugin for video and audio, but do we have one that will do those flip-through books?"

We can't embed arbitrary iframes, for security reasons. If the user would like to share a specific page that she'd like embedded, or a specific bit of iframe code, or generally some more details on just how the embed flow ought to work, I can think about whether we can do a custom shortcode or the like.

"I'm wondering if it makes sense for Shiraz to have a Redmine account so you can talk directly? LMK your thoughts."

Only if Shiraz is interested in engaging directly with our dev team.

Updated by Marilyn Weber 6 months ago

Shiraz writes "Happy to have a Redmine account if that makes life easier for you.

FYI – I tried just now to clone the more bare-bones template I started. At first it didn’t work, but then I deleted a bunch of the media and it did. So, I have a workaround for now."

You can reach Shiraz at sbiggie@gradcenter.cuny.edu

Updated by Marilyn Weber 6 months ago

She is interested. Please let me know what I should tell her about how to join Redmine. Thanks!

Updated by Boone Gorges 6 months ago

I've created an account for the user. She should have received login info at the email address provided.

#10

Updated by Shiraz Biggie 6 months ago

Hi Boone (et al.),
Just had a chance to log in here. I know you're handling a lot with the transition. I'm up and running for this semester, so the cloning issue won't be pressing until next fall.

Regarding the iframe issue: in the past, I embedded a number of public domain picture books from archive.org on the site, using their instructions, so students could flip through them without being redirected to an outside link. I know the Commons has the audio/video plugin, but do you know of anything that could be enabled to allow the books to be embedded, or another workaround? Archive.org gives a different embed code for wordpress.com, but I am not seeing anything for .org, nor is archive.org listed as a source for any of the other embed plugins available.
Example books: https://archive.org/details/applepie00gree2
https://archive.org/details/peterpanalphabet00herf

And please let me know how I can help as you continue to work on these. I'm especially interested in what might come out of the cloning solutions, particularly if it is anything that might help Christopher and me figure out some better options for cloning on Blogs@Baruch.

Much thanks!

#11

Updated by Boone Gorges 5 months ago

Shiraz, thanks for chiming in here. We'll use this ticket to continue to track cloning issues.

Regarding archive.org embeds, it looks like the Jetpack plugin will enable the same [archiveorg] and [archiveorg-book] shortcodes that you can use on wordpress.com. Try activating Jetpack and then using that syntax on your site.
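For anyone converting several archive.org links at once, here is a minimal sketch of building the shortcode from a details URL. The helper name `archive_book_shortcode` is hypothetical, and the `width`/`height` attribute syntax is an assumption based on the wordpress.com embed format; please check Jetpack's own documentation before relying on it.

```python
from urllib.parse import urlparse


def archive_book_shortcode(details_url: str, width: int = 480, height: int = 430) -> str:
    """Build an [archiveorg-book] shortcode from an archive.org details URL.

    Hypothetical helper: the attribute syntax is assumed, not confirmed here.
    """
    parsed = urlparse(details_url)
    parts = [p for p in parsed.path.split("/") if p]
    # Expect a path of the form /details/<identifier>
    if parsed.netloc not in ("archive.org", "www.archive.org"):
        raise ValueError(f"not an archive.org URL: {details_url}")
    if len(parts) < 2 or parts[0] != "details":
        raise ValueError(f"not a details URL: {details_url}")
    return f"[archiveorg-book {parts[1]} width={width} height={height}]"


if __name__ == "__main__":
    for url in (
        "https://archive.org/details/applepie00gree2",
        "https://archive.org/details/peterpanalphabet00herf",
    ):
        print(archive_book_shortcode(url))
```

The printed shortcode can then be pasted into a post or page once Jetpack is active on the site.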

#12

Updated by Boone Gorges 5 months ago

I've identified a couple of problems in the cloning process:

1. There was previously a bug in the way that tables-to-be-cloned were identified, which I noticed and then "fixed" in https://github.com/cuny-academic-commons/cac/commit/8c849bd7561052fa398809e969ba4a3f0ad15857. See #21886. But my "fix" had its own bug, an extra trailing underscore that caused the query to match nothing. As a result, that part of the clone has been broken for the past couple of days. This is fixed in https://github.com/cuny-academic-commons/cac/commit/a9b8bf315f62e2f356e7ed7231ad6e0c384578b9

2. While building a tool for splitting off the various parts of cloning, I noticed that the copying of user uploads was taking a really long time. It turns out that my previous technique for copying these uploads, which involved a recursive crawl of the uploads directory for that site, was creating some weird loop logic when combined with the S3 stream wrapper. I've rewritten it to use the S3 API directly: https://github.com/cuny-academic-commons/cac/commit/ff9b53eb3537cc1163752bfd81c4ba9245a063ab, https://github.com/cuny-academic-commons/cac/commit/e4b6ae2274b557b5fb59fc88d0e9acb3cbfaa190
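To illustrate the kind of bug described in item 1, here is a small sketch of prefix-based table selection. It is not the actual Commons code (that lives in the linked commits and is presumably query-based); the table names, the `wp_` base prefix, and the `tables_for_site` helper are all assumptions made for the example.

```python
def tables_for_site(all_tables, base_prefix, site_id):
    """Return the tables whose names start with the site's table prefix."""
    # The single trailing underscore matters: it keeps site 42 from
    # also matching site 420's tables.
    site_prefix = f"{base_prefix}{site_id}_"
    return [t for t in all_tables if t.startswith(site_prefix)]


# Hypothetical table list for illustration only.
tables = ["wp_42_posts", "wp_42_options", "wp_420_posts", "wp_users"]

# Correct prefix "wp_42_" selects only site 42's tables.
print(tables_for_site(tables, "wp_", 42))

# An accidental extra trailing underscore ("wp_42__") selects nothing,
# so that part of a clone would be silently skipped.
buggy_prefix = "wp_42__"
print([t for t in tables if t.startswith(buggy_prefix)])
```

A check that the selected table list is non-empty before continuing would make this class of mistake fail loudly instead of producing a partial clone.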

Cloning is still going slowly and will break with sites that use lots of tables, have huge numbers of posts, or have huge numbers of uploads. I'll continue to work on a more general solution in #21895.
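As a rough idea of what breaking a clone into batches could look like, here is a minimal sketch. It is not the design planned for #21895; the function names, the `done` set used for resuming, and the `copy_one` callback are hypothetical, and a real version would run each batch as its own background job rather than in one loop.

```python
from typing import Callable, Iterator, List, Optional, Sequence, Set


def batches(items: Sequence[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most `size` items."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def clone_in_batches(
    items: Sequence[str],
    size: int,
    copy_one: Callable[[str], None],
    done: Optional[Set[str]] = None,
) -> Set[str]:
    """Copy items batch by batch, skipping anything already in `done`.

    Returning the `done` set lets a later run resume after a timeout
    instead of starting the whole clone over.
    """
    done = set() if done is None else done
    pending = [item for item in items if item not in done]
    for batch in batches(pending, size):
        # Assumption: in practice each batch would be a separate short request.
        for item in batch:
            copy_one(item)
            done.add(item)
    return done


if __name__ == "__main__":
    copied: List[str] = []
    finished = clone_in_batches(["posts", "options", "uploads"], 2, copied.append)
    print(copied, sorted(finished))
```

Because each batch is small and progress is recorded, a failure partway through affects only the current batch rather than the whole clone.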

#13

Updated by Boone Gorges 4 months ago

Has duplicate Bug #22399: Connected Group and Site cloning time out added

#14

Updated by Boone Gorges 2 months ago

Has duplicate Support #22818: critical error when trying to clone added
