Project

General

Profile

Actions

Feature #17522

closed

Strategies for reducing database size

Added by Boone Gorges almost 2 years ago. Updated over 1 year ago.

Status:
Duplicate
Priority name:
Normal
Assignee:
-
Category name:
Server
Target version:
-
Start date:
2023-01-20
Due date:
% Done:

0%

Estimated time:
Deployment actions:

Description

Several weeks ago, we experienced a server outage due to a lack of storage space on the database server. And we have received some reports from IT that the DB server continues to be close to full.

Could we talk about ideas for reducing database size? A few initial ideas:

1. In my experience, it's often the case that a small number of sites are responsible for a disproportionate amount of DB storage space. We could develop a tool that analyzes existing sites and identifies these sites, and then work to discover what's causing the disproportionate use. We could then, on a case-by-case basis, take mitigating steps: decommissioning sites, culling excess or inappropriate data, fixing bugs that might cause inappropriate database use, moving blob assets out of the DB and into the filesystem, etc.

2. Explore a strategy for deleting/decommissioning/archiving empty sites. It's likely that a large number of sites have been created but never used, and can be safely removed.

3. Explore a strategy for archiving legacy sites. This might involve rendering them as HTML and saving them to the filesystem, then deleting them from the WP installation. This is the strategy most likely to have a long-term impact on storage space, but is also the strategy that will involve the most technical and outreach work.

Other ideas or thoughts?


Related issues

Related to CUNY Academic Commons - Bug #17542: Ninja Forms plugin can write many "_wp_session" entries into the options tableAbandonedRaymond Hoh2023-01-23

Actions
Actions #1

Updated by Raymond Hoh almost 2 years ago

  • Related to Bug #17542: Ninja Forms plugin can write many "_wp_session" entries into the options table added
Actions #2

Updated by Raymond Hoh almost 2 years ago

We could develop a tool that analyzes existing sites and identifies these sites, and then work to discover what's causing the disproportionate use.

This can be done in WP-CLI with:

wp db size --tables --human-readable --all-tables --orderby=size --order=desc

It does do queries against information_schema, so this might be an intensive operation (especially the --all-tables clause).

fixing bugs that might cause inappropriate database use

I came across one such instance today with the Ninja Forms plugin. See #17542. I suspect there are many other plugins doing similar things such as Jetpack.

2. Explore a strategy for deleting/decommissioning/archiving empty sites. It's likely that a large number of sites have been created but never used, and can be safely removed.

We could do a DB query against the wp_blogs DB table. Check the registered and last_updated columns to see if they are the same (or very close to the same) to determine empty sites. Probably add a 5-year threshold to be safe and check the corresponding posts table to verify post count.

3. Explore a strategy for archiving legacy sites.

What would the criteria be for a legacy site that would warrant converting over an existing site over to static HTML?

Actions #3

Updated by Jeremy Felt almost 2 years ago

In my experience, it's often the case that a small number of sites are responsible for a disproportionate amount of DB storage space

This seems the most impactful near-term. If we can pull a list of tables by size, I would expect to see a small handful of sites with a runaway options table or some other sort of generated content that we can find a root cause for. (Like Ray just mentioned w/ Jetpack & Ninja Forms)

I think options 2 and 3 may not save a bunch immediately, but could account for more predictable data usage over time. If we know what those sites look like (e.g. how often a site is started and immediately abandoned), we can plan ahead and say "over the next N months, this much more storage is likely going to be used and this is how we maintain it".

Actions #5

Updated by Boone Gorges over 1 year ago

  • Status changed from New to Duplicate

I'm going to close this in favor of the more active #17761.

Actions

Also available in: Atom PDF