Feature #5434

closed

Clean repository history

Added by Boone Gorges almost 8 years ago. Updated over 6 years ago.

Status: Resolved
Priority name: Normal
Assignee:
Category name: Meta
Target version:
Start date: 2016-04-11
Due date:
% Done: 0%
Estimated time:
Deployment actions:

Description

The CAC git repository is humongous: the .git directory is almost 1 GB in size. This makes it very difficult to work with. Cloning takes a long time, and git status takes forever the first time you run it during a session.

The root problem is that there are a number of large objects in the repository history. I think the most problematic ones are some fairly large MySQL zip files from 2009 or 2010, though there may be others. A side effect of the presence of these objects is that we are unable to make the CAC repo public, due to the non-public information in these zips.
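For anyone who wants to check which objects are actually responsible, here is a rough Python sketch that lists the largest blobs anywhere in the history. It assumes git is on the PATH and relies on the usual pairing of git rev-list --objects --all with git cat-file --batch-check. The function names (parse_batch_check, largest_blobs) are made up for illustration and are not part of any existing CAC tooling.

```python
import subprocess
from typing import List, Tuple


def parse_batch_check(output: str) -> List[Tuple[int, str, str]]:
    """Parse lines of the form '<type> <sha> <size> <path>'.

    Returns (size_in_bytes, sha, path) for blobs only, largest first.
    """
    blobs = []
    for line in output.splitlines():
        parts = line.split(" ", 3)
        if len(parts) < 3 or parts[0] != "blob":
            continue  # skip commits, trees, tags, and malformed lines
        path = parts[3] if len(parts) == 4 else ""
        blobs.append((int(parts[2]), parts[1], path))
    return sorted(blobs, reverse=True)


def largest_blobs(repo: str, limit: int = 20) -> List[Tuple[int, str, str]]:
    """Return the `limit` largest blobs reachable from any ref in `repo`."""
    # Every object reachable from any ref, one '<sha> <path>' per line.
    objects = subprocess.run(
        ["git", "-C", repo, "rev-list", "--objects", "--all"],
        check=True, capture_output=True, text=True, errors="replace",
    ).stdout
    # Ask git for the type and size of each object; %(rest) echoes the path.
    sizes = subprocess.run(
        ["git", "-C", repo, "cat-file",
         "--batch-check=%(objecttype) %(objectname) %(objectsize) %(rest)"],
        input=objects, check=True, capture_output=True,
        text=True, errors="replace",
    ).stdout
    return parse_batch_check(sizes)[:limit]
```

Calling largest_blobs(".") from inside a clone returns a list of (size, sha, path) tuples, which should make it clear whether the old MySQL zips really are the main offenders or whether other objects need to go too.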

I'd like to propose that I aggressively clean the history of the repository. This will mean that all existing clones will need to be recloned: all developer copies, cdev, and the production site. I can handle cdev and the production site, but members of the development team would obviously be responsible for recloning their own repos.

The change will also mean that most existing hashes will be rewritten, breaking old links to GitHub changesets. This is obviously not ideal, but I think it's probably worth it. We can create an archived (private) copy of the existing repo in case we ever need to cross-reference old hashes.

I'm in the process of running some tests and some statistics to make sure that it's a worthwhile move, but I wanted to open this ticket first so that anyone with concerns could raise them here.
