Transition from single cac-files.php file server to site-specific .htaccess directives
Currently, all requests for uploaded files ([domain]/files/...) run through wp-content/cac-files.php, our custom router based on WP's old ms-files.php. We use cac-files.php primarily for access protection: we need to load the WordPress application to decide who should have access to what files. But this is totally unnecessary for sites that are publicly accessible. The result is a severe increase on server load and perceived performance, especially when requesting large numbers of files. (See for example #7985, where the lag is almost certainly due in part to the problem described here.)
I'd like to move to a system that looks like the following:
1. As in new WP Multisite installations, we don't rewrite uploaded file URLs by default, but allow Apache to serve them directly.
2. Non-public sites - which includes the main site, which needs to protect forum attachments and other files - will have an .htaccess file in their site-specific blogs.dir directory. This file will redirect all requests to a download handler similar to cac-files.php (maybe even cac-files.php itself, I guess)
I think that the change to 1 is very easy (the UPLOADS constant or something like that). For 2, I'll need to write some custom code to generate .htaccess files. This routine will need to run whenever a site's privacy settings are changed, to either create or delete .htaccess, as appropriate. There'll also need to be a one-time update routine to create .htaccess files for existing sites. I've done a lot of this work for BuddyPress Docs and other projects, so it should be a somewhat reasonable port.
Posting here for any discussion (especially thoughts from Ray, in case I'm missing anything) before working on it. Putting in 1.12 for now, though it could be moved to a minor release, maybe early in the summer.
#8 Updated by Boone Gorges 7 months ago
After getting a report similar to #7985 this morning (#8602), I spent some time trying to understand our options a bit better. A full migration will be more complicated than I thought (we'd be inverting the default behavior of WordPress in a number of ways, with several layers of legacy content to deal with), but a partial mitigation of the problem is actually quite a bit easier.
I've made the following change, effective immediately on the production site: https://github.com/cuny-academic-commons/cac/commit/46d9b0de1e65ad80667c5b6b3a62b8acc5fd1e0b Here's what it does:
1. On private sites, do nothing.
2. On non-private sites (fully public or "discourage search engines"), do nothing on the front end. In the admin, filter WP's upload directory so that it points to /wp-content/blogs.dir/[blog-id]/files rather than /files/. The primary effect of this change is that dynamically-generated attachment URLs point directly to the files instead of to a WordPress endpoint. This means two things:
a. Uploaded images displayed in the admin - especially in the Media Library - are served directly from Apache. This should basically solve the problem in #7985 for non-private sites.
b. Images inserted into posts will have URLs that point directly to the file, so that loading them on the front end will not require bootstrapping WordPress a second (third, fourth...) time.
As noted, private sites are unaffected. Admins will still experience some slowness when loading their Media Libraries.
This solution is actually fairly robust and might constitute a "good enough" solution in the long run. There are some smaller remaining issues, which might fall outside the scope of this ticket:
a. While we do /files/ rewriting using cac-files.php to protect uploaded files, we don't actually have server-level protections against accessing these files directly. This doesn't have a practical effect in most cases, since the server URLs are never exposed, but it's something we should fix.
b. Related: When switching a site from private to non-private - or, more importantly, from non-private to private - we should have a cleanup routine that creates the necessary .htaccess file.
I'll leave the ticket open to address these latter issues. In the meantime, please keep a careful eye out for media-related support requests in the upcoming days.
#10 Updated by Boone Gorges 6 months ago
Here's a first pass at a tool that generates the necessary .htaccess file for non-public sites: https://github.com/cuny-academic-commons/cac/commit/5ce7e07d19f8ad8559a83fbcd68979a327efca5c
I've also written a wp-cli-script that will backfill .htaccess for existing sites, which is attached to this ticket.
It won't be practical to do very wide testing (beyond what I've already done) until we're in production. I'm leaving the ticket open to account for post-release issues.