Ray, you're copied as an FYI, in case I go into a coma and you have to pick up where I left off.
I spent most of the day working on this. While I was in the middle of rolling out the fix, pcp.gc.cuny.edu published another newsletter, which crashed the site. So what I'd hoped would be a reasonably painless deploy ended up being fairly horrific.
Anyway, here's how the caching layer works:
- I added a custom action to MailPoet's email model, so that I can detect whenever an email is being created or updated
- Hooked to that action, I have an mu-plugin (wp-content/mu-plugins/assets/mailpoet.php) that uses wp_remote_get() to fetch a static copy of the HTML generated by MailPoet
- That HTML is saved to a file whose path has the following format: wp-content/blogs.dir/1/mailpoet-cache/[domain]/[email_id].html. E.g.: wp-content/blogs.dir/1/mailpoet-cache/pcp.gc.cuny.edu/399.html
- There's a rule in .htaccess that detects requests of the form ?wysija-page=1&email_id=399 that do not contain 'controller=email'. This is pretty broad, but it catches all cases we're interested in. ('controller=email' is a way to fetch the raw email content, which we need when generating the cache)
- These requests go to a small PHP script, wp-content/mailpoet-cache.php, that loads the static cache. This probably could have been skipped with a fancier rewrite rule, but I was having problems with multiple backreferences in separate RewriteCond lines, and the site was in the process of crashing, so I went with this
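To make the mapping concrete, here is a rough Python sketch of the two decisions described above: where a cached email lives on disk, and which requests get answered from the cache. It is not the production code (that lives in the mu-plugin, .htaccess, and wp-content/mailpoet-cache.php). The function names are hypothetical, and requiring wysija-page to be exactly 1 and email_id to be numeric are assumptions.

```python
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Cache root taken from the path format described above.
CACHE_ROOT = "wp-content/blogs.dir/1/mailpoet-cache"


def cache_path(domain: str, email_id: int) -> str:
    """Build the on-disk path of the static copy for one email."""
    return f"{CACHE_ROOT}/{domain}/{int(email_id)}.html"


def cached_file_for(url: str) -> Optional[str]:
    """Return the cache path a request should be served from, or None.

    Hypothetical helper: mirrors the described rule (wysija-page=1 plus
    an email_id, and no controller=email), not the real RewriteCond lines.
    """
    parts = urlsplit(url)
    query = parse_qs(parts.query)

    # controller=email fetches the raw email content, which is needed
    # when generating the cache, so it must bypass the cache.
    if "email" in query.get("controller", []):
        return None

    # Assumption: the page flag must be exactly "1".
    if query.get("wysija-page") != ["1"]:
        return None

    # Assumption: email IDs are plain ASCII digits.
    email_id = query.get("email_id", [""])[0]
    if not (email_id.isascii() and email_id.isdigit()):
        return None

    if parts.hostname is None:
        return None

    return cache_path(parts.hostname, int(email_id))
```

For example, `cached_file_for("https://pcp.gc.cuny.edu/?wysija-page=1&email_id=399")` gives the 399.html path from the example above, while the same URL with `controller=email` added gives `None` and falls through to MailPoet.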
I wrote a script to crawl each site on the Commons that runs MailPoet, in order to generate the static versions of all existing emails. This is in my user's ~/wp-cli-scripts directory. The first time I ran the script, it threatened to DoS the server, so I throttled it to generating one email every 30 seconds. When pcp.gc.cuny.edu sent out their newsletter this afternoon, I had to stop this process and skip straight to generating the cached version of the new email. Now I'm running the script again. It will take a few hours to complete. In the meantime, people clicking through to old newsletters may get 404s. Unfortunately, I can't speed this up without risking overloading the server again.
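The throttling idea, sketched in Python for illustration only. The real script is a WP-CLI script in ~/wp-cli-scripts and is not reproduced here. `generate_throttled`, the `generate` callback, and the injectable `sleep` parameter are all hypothetical names.

```python
import time
from typing import Callable, Iterable, Tuple


def generate_throttled(
    emails: Iterable[Tuple[str, int]],
    generate: Callable[[str, int], None],
    delay_seconds: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Generate static copies one at a time, pausing between emails.

    `emails` yields (domain, email_id) pairs; `generate` is whatever
    builds and stores the static copy for one email (hypothetical).
    Returns the number of emails processed.
    """
    done = 0
    for index, (domain, email_id) in enumerate(emails):
        # Pause before every email except the first, so the server
        # only ever handles one generation request per interval.
        if index > 0:
            sleep(delay_seconds)
        generate(domain, email_id)
        done += 1
    return done
```

At one email every 30 seconds, this works out to 120 emails per hour, which is why a full backfill takes hours rather than minutes.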
Relevant changesets:
https://github.com/cuny-academic-commons/cac/commit/84897045a003a1bb3c8f325761d61305c8320d3d
https://github.com/cuny-academic-commons/cac/commit/64823f95638327a9b4f8a5c670bd40ed010c01a3
https://github.com/cuny-academic-commons/cac/commit/ec020fe66263eecff3e4ec0b249922c6febcfc8b
I'll continue to monitor the import process as well as site traffic/performance, and will update this ticket when things have stabilized.