Plugins should not create upload directories until they're needed
I got a notification from Lihua that the Commons has a large number of empty directories. The vast majority appear to be linked to user uploads: the object in question (group, user, etc) has an upload directory created, without anything to put into it. I'm guessing that, at least in some of these cases, the directory is being created at the same time as the object (eg, group). I don't think this is necessary; we should just create the directory when it's first needed to hold an upload. We could also have a cleanup routine that empties a directory after a file deletion makes it empty.
Here are the directory types I see in Lihua's log file:
1 and 3 are both tied to buddypress-group-documents. We run a fork of this plugin, so let's fix the problem there. 4 and possibly 2 are linked to BuddyPress, so we should do a check to make sure we're not missing anything obvious there. 5 (which forms the vast majority of instances) is due to WP itself. I don't want us to bend over backward to fix this, but it would be nice to understand the logic that WP is using and to see if there's any obvious improvements to be made.
Dan, can you have a first pass at this? The first step will be to understand how and when the directories are being created. Then we can decide what, if anything, to do about it.
#1 Updated by Daniel Jones about 3 years ago
One quick update here - on my local install of the Commons, the upload directories for new sites don't get created until I upload a file to the site. It exists in the database but not in the file system. So I'm not totally sure what's happening there. I'll look into the others ASAP.
#2 Updated by Daniel Jones about 3 years ago
Okay I've gone through and looked at each of these on my local install of the Commons, and here's what I've found:
1. Not created until a file is uploaded to the group
2. Not created at all as far as I can tell, but "group-avatars" is, but not until one is actually uploaded for the group
3. Not created until a file is uploaded to the group, but stays empty because we don't use that directory for uploads in our version of the plugin. It was a change we made a couple years back (to plugins/cac-bp-custom-includes/group-documents.php) when we were adding the "Download All" button for group-documents. We changed the directory there, but as of now any call to "get_path" for bp-group-documents will create the directory, because it's set to always call "create_dir_or_htaccess". There is a "create_folders" parameter for the get_path function, which defaults to 0, but it isn't actually used in the function. If we made use of it, it would fix the issue, except for in the case of uploading a file, and also going to the admin screen, which runs a routine to move the files from an older legacy location, to the one that we made into a legacy location, at least for us, when we made our changes 2 years ago. This seems like a tricky issue, but maybe its fine since we're already on our own fork of the plugin and can just make whatever changes we want. The functions causing the problem are mostly in bp-group-documents/include/classes.php, but there's also one in bp-group-documents/include/admin.php
4. Not created at all as far as I can tell - forum post attachments are just in the top level of the uploads folder for the group, under blogs.dir/1/.
5. Not created until a file is uploaded to the new site.
Okay let me know what you think of all this: seems like the most obvious place to fix something would be 3., but maybe there's a difference in a setting somewhere for me with the others. Let me know!
#3 Updated by Boone Gorges about 3 years ago
Dan - This is some excellent detective work. Exactly what I had in mind. Thanks!
It's possible that the empty directories in 1 are part of the custom migration routine that I built during one of our moves. I guess we can ignore that one.
I believe that 4 is created by buddypress-docs, though I think that we've disable the attachment functionality for Docs on the Commons. It could be that the handful of directories in this case were created during some testing.
Very strange about 5. There are so many empty directories of this format. Something must be going on, but obviously it's not going to be easy to trace.
Could you go ahead and make the fix you've suggested in 3, in our fork of the plugin? The rest of the issues we can file away for later investigation.
#6 Updated by Daniel Jones almost 2 years ago
I'm going back through some old tickets - it looks like the current version of buddypress-group-documents has fixed this problem by making use of the $create_folders and making its default 0. Do you remember if you implemented this? I did make one further change though - there was one call to get_path, in UploadFile, that had create_folders set to 1. I set it to the default (0) instead. File upload still works and we don't get the zombie directory: https://github.com/cuny-academic-commons/cac/commit/44c5d4e02f0bbd3e6204b5f501554ced4569e64e
#7 Updated by Boone Gorges almost 2 years ago
Thanks, Dan. I didn't implement the create_folders thing myself, and the logs suggest it's always been there. Not sure why this would be happening, then. It seems odd that you'd be able to remove the create_folders param in UploadFile and still have it work; I don't think move_uploaded_file will create the necessary directory http://php.net/manual/en/function.move-uploaded-file.php#105026. You've confirmed that this works in cases where the group's directory doesn't already exist?
#8 Updated by Daniel Jones almost 2 years ago
I tested it with a brand new group and it seemed to work. I think it still works because we hook into the filter at the very end of the get_path() function to change the upload path to the new one we use. I can't find the ticket in here but a few years ago we moved the group files to a top level directory as a part of making the downloads more secure. You can see it in cac-bp-custom-includes/group-documents.php in the cac_filter_doc_path function. In that function, we automatically create the directory if it doesn't exist.
#9 Updated by Boone Gorges almost 2 years ago
Ah! So it's our custom function that is creating the directory :-D Note that cac_filter_doc_path() does not respect (or even look at) the create_folders param.
Any chance you could dig to see if it's possible to pass create_folders through to the filter so that we can respect it?
#10 Updated by Daniel Jones almost 2 years ago
Good call - tested changes are in this commit: https://github.com/cuny-academic-commons/cac/commit/f6c408d08322a6f5500ef0e02e9cbd32dc6744a9
#11 Updated by Boone Gorges almost 2 years ago
Excellent! Thanks, Dan.
Could you do me another favor and write a PHP script (wp cli eval-file) that will crawl through BP_GROUP_DOCUMENTS_SECURE_PATH subdirectories and delete empty ones?
Once that's done, I think we can close this ticket - bp-group-documents was by far the biggest offender, and probably the one that's most in our control.
#14 Updated by Daniel Jones almost 2 years ago
I was running into an issue with WP CLI - I couldn't get it to run without the --skip-wordpress option, and when I ran it with debug mode on it seemed to fail silently - so I wasn't able to dynamically load the directory. Instead you can pass it as an argument to the wp eval-file command. In my case it was /var/www/commons/cac/wp-content/blogs.dir/1/files/group-documents
It doesn't recurse, but I don't think it needs to: I think buddypress-group-documents only creates flat directories.
#15 Updated by Boone Gorges almost 2 years ago
- Target version changed from Future release to 1.12.1
Thanks, Dan! It looks like this should work. (Not sure about the wp-cli problem. The "silent" failure was probably a fatal PHP error, which you should be able to diagnose by tailing the Apache or WP error log, whatever you use.)
I'll try to take care of this for 1.12.1.