Bug #723
closed
Cannot upload .ppt, .doc, or .docx files on wiki
Added by Sarah Morgano over 13 years ago. Updated over 13 years ago.
Description
Forgive me if this ticket already exists, but a member recently informed me that he cannot upload .doc or .docx files on the wiki. I just tested this and got the same error message he did. I also tried to upload a PowerPoint file and got the same error message: "Upload warning
The file is corrupt or has an incorrect extension. Please check the file and upload again."
Best,
Sarah
Updated by Sarah Morgano over 13 years ago
I forgot to add that I tested this using Microsoft Windows XP Professional, Version 2002, Service Pack 3 / Firefox 3.5.3. I'm not sure what the member used, but I can reach out to him if you would like this information.
Best,
Sarah
Updated by Matt Gold over 13 years ago
- Status changed from New to Assigned
- Assignee set to Boone Gorges
- Priority name changed from Normal to High
- Target version set to 1.2
Updated by Raymond Hoh over 13 years ago
- Assignee changed from Boone Gorges to Raymond Hoh
Boone, I've set MediaWiki to allow .docx and .pptx uploads on the 1.2-branch:
https://github.com/castiron/cac/commit/4a4c83c4305144ce32251d0ad5d7c058d9d1bffe
One thing to note: after applying the fix, when I uploaded a .docx file and then tried to download it, the browser tried to download it as a .ZIP file. It could just be my server setup, but when we commit this change, we'll want to test this on cdev as well.
If we experience the same issue on cdev, we can apply the .htaccess fix listed in this forum post to force the mime type:
http://www.webdeveloper.com/forum/showpost.php?p=898935&postcount=2
Updated by Boone Gorges over 13 years ago
- Priority name changed from High to Normal
- Target version changed from 1.2 to 1.2.1
Thanks, Ray.
It looks like Ray's fix requires PHP extensions beyond those needed to check image file types. For that reason, the fix is not working on either my local installation or on cdev.
According to http://meta.wikimedia.org/wiki/MediaWiki_FAQ#When_I_try_to_upload_files_in_Mediawiki_1.5_I_always_get_a_.22The_file_is_corrupt_or_has_an_incorrect_extension._Please_check_the_file_and_upload_again..22_error., the PHP extension in question is fileinfo. This extension comes enabled by default in PHP 5.3+. The Commons server is running PHP 5.1.6.
This suggests one of two possible solutions. One is to upgrade PHP to 5.3+. The other is to install the fileinfo extension, which is available through pecl: http://pecl.php.net/package/Fileinfo
André and Ray, what do you think? Ray, do you have a lot of experience with BP/WP on new versions of PHP? The only issue I am aware of has to do with passing variables by reference to some filters in BP, which has been fixed in the dev version of BP but might cause errors on our current 1.2.x installation. André, are there any other implications involved in upgrading PHP, at least that you are aware of?
I should note that a third option is to tell MediaWiki not to double check file types. But this poses a security threat, and so is not ideal.
I'm going to punt this ticket to the next bugfix release, because in any case I do not want to mess with our production environment just before tomorrow's 1.2 release. (A PHP upgrade, if necessary, should be done apart from a WP/BP/MW upgrade, so that we can easily isolate problems.) If this is only coming up now (for the first time in the history of the Commons), then uploading non-images to the wiki must not be an extremely high priority for users of the site.
Updated by local admin over 13 years ago
Seems like this PHP module is available from a (semi) canonical repo, namely EPEL:
[root@commons ~]# yum info php-pecl-Fileinfo
Loaded plugins: dellsysid, rhnplugin, security
Available Packages
Name       : php-pecl-Fileinfo
Arch       : x86_64
Version    : 1.0.4
Release    : 2.el5
Size       : 9.2 k
Repo       : epel
Summary    : Fileinfo is a PHP extension that wraps the libmagic library
URL        : http://pecl.php.net/package/Fileinfo
License    : PHP
Description: This extension allows the retrieval of file type information for the vast
           : majority of files. This information may include such information as dimensions
           : or compression quality of images, duration of sound files, etc...
           :
           : Additionally, it can also be used to retrieve the MIME type for a particular
           : file, and for text files, the proper language encoding.
Thus, a PHP upgrade will probably not be needed.
-AP
Updated by Boone Gorges over 13 years ago
Thanks, André. Let's go ahead and install the extension, if you don't mind. We'll deal with the issue of PHP upgrades separately.
Updated by local admin over 13 years ago
Alright, done. Installed the extension and restarted the service. Let me know if it still doesn't work, ok?
[root@commons ~]# yum install php-pecl-Fileinfo
Loaded plugins: dellsysid, rhnplugin, security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package php-pecl-Fileinfo.x86_64 0:1.0.4-2.el5 set to be updated
--> Finished Dependency Resolution
Dependencies Resolved
================================================================================
 Package              Arch      Version        Repository      Size
================================================================================
Installing:
 php-pecl-Fileinfo    x86_64    1.0.4-2.el5    epel            9.2 k
Transaction Summary
================================================================================
Install       1 Package(s)
Upgrade       0 Package(s)
Total download size: 9.2 k
Is this ok [y/N]: y
Downloading Packages:
php-pecl-Fileinfo-1.0.4-2.el5.x86_64.rpm | 9.2 kB 00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing : php-pecl-Fileinfo 1/1
Installed:
  php-pecl-Fileinfo.x86_64 0:1.0.4-2.el5
Complete!
[root@commons ~]# service httpd configtest
Syntax OK
[root@commons ~]# service httpd restart
Stopping httpd: [ OK ]
Starting httpd: [ OK ]
Updated by Boone Gorges over 13 years ago
- Status changed from Assigned to Resolved
Thanks for doing this so fast, André.
The extension worked as far as it's supposed to (I can now successfully check filetypes). But I still run into problems where MediaWiki tries to check the detected filetype against the whitelist. The problem is that newer Office files (.docx, .xlsx, etc.) are gussied-up zip files. So fileinfo is (correctly) detecting that they are really zip files, and MediaWiki then sees a mismatch between the detected mime type and the list of extensions, defined in wiki/mime.types, that are identified as working with that mime type.
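To illustrate the point above, here is a minimal Python sketch (not the fileinfo or MediaWiki code) of why content-based detection reports a zip file. The helper names build_fake_docx and sniff_mime are hypothetical, and the archive built here is only laid out like a .docx; it is not a valid Word document.

```python
import io
import zipfile

# Signature found at the start of a typical ZIP archive (local file header).
ZIP_MAGIC = b"PK\x03\x04"


def build_fake_docx() -> bytes:
    """Build a tiny ZIP archive laid out like a .docx (illustrative only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("word/document.xml", "<document/>")
    return buf.getvalue()


def sniff_mime(data: bytes) -> str:
    """Very rough content sniffing that only knows the ZIP signature."""
    if data.startswith(ZIP_MAGIC):
        return "application/zip"
    return "unknown/unknown"


docx_bytes = build_fake_docx()
print(sniff_mime(docx_bytes))
```

A detector that only looks at the leading bytes cannot tell a .docx from any other zip archive, which matches the behaviour described in this comment.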
In https://github.com/castiron/cac/commit/5890811496604df682e92f71fd77f5c4017d677c I solve this by adding docx etc to that list. I also manually remove application/zip from MW's mime type blacklist. This last move seems less than 100% ideal, given the potential security ramifications. But given that items loaded to the wiki/images directory (where MW uploads are stored) are not executable by the apache user, I think this is a reasonably small risk. And it should be a temporary one. I found at least one suggestion online that the next version of MW will have proper mime detection for new Office files: http://www.gossamer-threads.com/lists/wiki/mediawiki/227645
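The effect of adding the Office extensions to the list can be sketched as follows. This is a simplified Python model under stated assumptions, not MediaWiki's actual check: the two sample tables are invented for illustration and are not copied from wiki/mime.types or from the commit, and parse_mime_types and extension_matches are hypothetical helpers.

```python
from typing import Dict, Set


def parse_mime_types(text: str) -> Dict[str, Set[str]]:
    """Parse 'mime/type ext1 ext2' lines into a mime -> extensions table."""
    table: Dict[str, Set[str]] = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        mime, *exts = line.split()
        table.setdefault(mime, set()).update(exts)
    return table


def extension_matches(table: Dict[str, Set[str]], detected_mime: str, filename: str) -> bool:
    """True if the file's extension is listed for the detected mime type."""
    ext = filename.rsplit(".", 1)[-1].lower() if "." in filename else ""
    return ext in table.get(detected_mime, set())


# Illustrative tables only (assumed contents).
BEFORE = "application/zip zip\napplication/msword doc\n"
AFTER = "application/zip zip docx xlsx pptx\napplication/msword doc\n"

print(extension_matches(parse_mime_types(BEFORE), "application/zip", "report.docx"))
print(extension_matches(parse_mime_types(AFTER), "application/zip", "report.docx"))
```

With the first table the .docx upload is rejected as a mismatch; with the second it passes, at the cost of treating any zip archive renamed to .docx as acceptable.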
Updated by Raymond Hoh over 13 years ago
My fix only overrides the default mime types lookup.
We didn't enable FileInfo on MediaWiki in LocalSettings.php:
http://www.mediawiki.org/wiki/Manual:$wgLoadFileinfoExtension
So MediaWiki will not use FileInfo unless we force it to.
Re: ZIP problem - See:
http://redmine.gc.cuny.edu/issues/show/723#note-3
Updated by Boone Gorges over 13 years ago
> Re: ZIP problem - See:
> http://redmine.gc.cuny.edu/issues/show/723#note-3
My problem was not with downloads, but with MIME type detection of uploaded files. I'm not sure how .htaccess would help.
Good call about activating FileInfo. I just did that on cdev, and rolled back my changes, and I still ran into the same problems - the file extensions still do not match the mime type whitelist. I'll have to do more debugging, but this probably means that Fileinfo is incorrectly identifying docx mimetypes as zip.
Updated by Raymond Hoh over 13 years ago
To me, mime type detection of uploaded files in MediaWiki is not a major issue; it's mostly cosmetic.
You can patch /wiki/includes/MimeMagic.php to fix this.
The download as ZIP issue for Office files is more important and can be solved with the .htaccess fix I posted a link to previously.
I've pushed the .htaccess fix to 1.2-branch:
https://github.com/castiron/cac/commit/cf616125145f9cd68035ca8bd7943c0b9ac7f69e
Give it a shot and see if it works on cdev.
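The idea behind forcing the mime type on download can be modelled in a few lines of Python. This is a sketch of the concept only, not the contents of the .htaccess commit: the function content_type_for is hypothetical, and the mapping simply uses the registered Office Open XML media types.

```python
import os

# Registered media types for Office Open XML documents.
OFFICE_MIME_BY_EXT = {
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def content_type_for(filename: str, sniffed: str = "application/zip") -> str:
    """Prefer an extension-based type for Office files, else keep the sniffed one."""
    ext = os.path.splitext(filename)[1].lower()
    return OFFICE_MIME_BY_EXT.get(ext, sniffed)


print(content_type_for("Minutes.DOCX"))
print(content_type_for("backup.zip"))
```

Serving the file with an Office-specific Content-Type, instead of application/zip, is what keeps a browser from offering it as a .zip download.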
Updated by Boone Gorges over 13 years ago
> To me, mime type detection of uploaded files in MediaWiki is not a major issue; it's mostly cosmetic.
Fair enough. It's true that it's not surefire. But using fileinfo does make it a bit more than merely cosmetic (more than just checking against the extension name, anyway).
I didn't have mime type download problems in Chrome (webkit) or FF, but you're right that it should be covered as it will probably be an issue in some browsers. Your .htaccess fix looks good. I'll pull it to cdev later and have a look see.
Thanks for the research and for your feedback, Ray!
Updated by Raymond Hoh over 13 years ago
The mime type download issue appears to affect Opera and IE only.
I also did some more research into FileInfo just now.
When I first did my testing, I didn't have FileInfo enabled and was able to upload with my initial fix; the mime type displayed then in MediaWiki was "unknown/unknown."
Now, I have FileInfo activated and I can duplicate the "application/zip" mime type in MediaWiki.
So it's definitely an issue with FileInfo:
http://stackoverflow.com/questions/4807036/php-5-3-5-fileinfo-mime-type-for-ms-office-2007-files-magic-mime-updates
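For reference, one way a detector could tell Office files apart from ordinary zip archives is to look inside the container. The Python sketch below is a heuristic written for illustration, not how FileInfo or MediaWiki works; make_zip and classify are hypothetical names, and it relies only on the conventional folder layout of Office Open XML packages (word/, xl/, ppt/ alongside [Content_Types].xml).

```python
import io
import zipfile

# Conventional top-level folders in Office Open XML packages.
OOXML_BY_FOLDER = {
    "word/": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "xl/": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    "ppt/": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}


def make_zip(member_names):
    """Build an in-memory ZIP with the given member names (test helper)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in member_names:
            zf.writestr(name, "<x/>")
    return buf.getvalue()


def classify(data: bytes) -> str:
    """Refine a ZIP detection by inspecting the archive's member names."""
    stream = io.BytesIO(data)
    if not zipfile.is_zipfile(stream):
        return "unknown/unknown"
    with zipfile.ZipFile(stream) as zf:
        names = zf.namelist()
    if "[Content_Types].xml" not in names:
        return "application/zip"
    for prefix, mime in OOXML_BY_FOLDER.items():
        if any(name.startswith(prefix) for name in names):
            return mime
    return "application/zip"


print(classify(make_zip(["[Content_Types].xml", "word/document.xml"])))
print(classify(make_zip(["readme.txt"])))
```

A check like this is only as trustworthy as the archive's contents, so it would complement the extension whitelist rather than replace it.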