Bug #723

Cannot upload .ppt, .doc, or .docx files on wiki

Added by Sarah Morgano about 11 years ago. Updated about 11 years ago.

Status: Resolved
Priority name: Normal
Assignee:
Category name: Wiki
Target version:
Start date: 2011-04-29
Due date:
% Done: 0%
Estimated time:

Description

Forgive me if this ticket already exists, but a member recently informed me that he cannot upload .doc or .docx files on the wiki. I just tested this and got the same error message he did. I also tried to upload a PowerPoint file and got the same error message: "Upload warning
The file is corrupt or has an incorrect extension. Please check the file and upload again."

Best,
Sarah

History

#1 Updated by Sarah Morgano about 11 years ago

I forgot to add that I tested this using Microsoft Windows XP Professional, Version 2002, Service Pack 3, with Firefox 3.5.3. I'm not sure what the member used, but I can reach out to him if you would like this information.

Best,
Sarah

#2 Updated by Matt Gold about 11 years ago

  • Status changed from New to Assigned
  • Assignee set to Boone Gorges
  • Priority name changed from Normal to High
  • Target version set to 1.2

#3 Updated by Raymond Hoh about 11 years ago

  • Assignee changed from Boone Gorges to Raymond Hoh

Boone, I've set MediaWiki to allow .docx and .pptx uploads on the 1.2-branch:
https://github.com/castiron/cac/commit/4a4c83c4305144ce32251d0ad5d7c058d9d1bffe

One thing to note: after applying the fix, when I tried to download a .docx file that I had uploaded, the browser tried to download it as a .zip file. It could just be my server setup, but when we commit this change, we'll want to test this on cdev as well.

If we experience the same issue on cdev, we can apply the .htaccess fix listed in this forum post to force the mime type:
http://www.webdeveloper.com/forum/showpost.php?p=898935&postcount=2

#4 Updated by Boone Gorges about 11 years ago

  • Priority name changed from High to Normal
  • Target version changed from 1.2 to 1.2.1

Thanks, Ray.

It looks like Ray's fix requires PHP extensions beyond those needed to check image file types. For that reason, the fix is not working on either my local installation or on cdev.

According to http://meta.wikimedia.org/wiki/MediaWiki_FAQ#When_I_try_to_upload_files_in_Mediawiki_1.5_I_always_get_a_.22The_file_is_corrupt_or_has_an_incorrect_extension._Please_check_the_file_and_upload_again..22_error., the PHP extension in question is fileinfo. This extension comes enabled by default in PHP 5.3+. The Commons server is running PHP 5.1.6.

This suggests two possible solutions. One is to upgrade PHP to 5.3+. The other is to install the fileinfo extension, which is available through PECL: http://pecl.php.net/package/Fileinfo

André and Ray, what do you think? Ray, do you have a lot of experience with BP/WP on new versions of PHP? The only issue I am aware of has to do with passing variables by reference to some filters in BP, which has been fixed in the dev version of BP but might cause errors on our current 1.2.x installation. André, are there any other implications involved in upgrading PHP, at least that you are aware of?

I should note that a third option is to tell MediaWiki not to double check file types. But this poses a security threat, and so is not ideal.

I'm going to punt this ticket to the next bugfix release, because in any case I do not want to mess with our production environment just before tomorrow's 1.2 release. (A PHP upgrade, if necessary, should be done apart from a WP/BP/MW upgrade, so that we can easily isolate problems.) If this is only coming up now (for the first time in the history of the Commons), then uploading non-images to the wiki must not be an extremely high priority for users of the site.

#5 Updated by local admin about 11 years ago

Seems like this PHP module is available from a (semi) canonical repo, namely EPEL:

[root@commons ~]# yum info php-pecl-Fileinfo
Loaded plugins: dellsysid, rhnplugin, security
Available Packages
Name       : php-pecl-Fileinfo
Arch       : x86_64
Version    : 1.0.4
Release    : 2.el5
Size       : 9.2 k
Repo       : epel
Summary    : Fileinfo is a PHP extension that wraps the libmagic library
URL        : http://pecl.php.net/package/Fileinfo
License    : PHP
Description: This extension allows the retrieval of file type information for the vast
           : majority of files.  This information may include such information as dimensions
           : or compression quality of images, duration of sound files, etc...
           : 
           : Additionally, it can also be used to retrieve the MIME type for a particular
           : file, and for text files, the proper language encoding.

Thus, a PHP upgrade will probably not be needed.

-AP

#6 Updated by Boone Gorges about 11 years ago

Thanks, André. Let's go ahead and install the extension, if you don't mind. We'll deal with the issue of PHP upgrades separately.

#7 Updated by local admin about 11 years ago

Alright, done. Installed the extension and restarted the service. Let me know if it still doesn't work, ok?

[root@commons ~]# yum install php-pecl-Fileinfo
Loaded plugins: dellsysid, rhnplugin, security
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package php-pecl-Fileinfo.x86_64 0:1.0.4-2.el5 set to be updated
--> Finished Dependency Resolution

Dependencies Resolved

================================================================================
 Package              Arch      Version        Repository   Size
================================================================================
Installing:
 php-pecl-Fileinfo    x86_64    1.0.4-2.el5    epel         9.2 k

Transaction Summary
================================================================================
Install       1 Package(s)
Upgrade       0 Package(s)

Total download size: 9.2 k
Is this ok [y/N]: y
Downloading Packages:
php-pecl-Fileinfo-1.0.4-2.el5.x86_64.rpm                 | 9.2 kB     00:00
Running rpm_check_debug
Running Transaction Test
Finished Transaction Test
Transaction Test Succeeded
Running Transaction
  Installing     : php-pecl-Fileinfo                                        1/1

Installed:
  php-pecl-Fileinfo.x86_64 0:1.0.4-2.el5

Complete!
[root@commons ~]# service httpd configtest
Syntax OK
[root@commons ~]# service httpd restart
Stopping httpd:                                            [  OK  ]
Starting httpd:                                            [  OK  ]

#8 Updated by Boone Gorges about 11 years ago

  • Status changed from Assigned to Resolved

Thanks for doing this so fast, André.

The extension worked as far as it's supposed to (I can now successfully check filetypes). But I still ran into problems where MediaWiki tries to check the detected filetype against the whitelist. The problem is that newer Office files (.docx, .xlsx, etc.) are gussied-up zip files. So fileinfo is (correctly) detecting that they are really zip files, and MediaWiki then sees a mismatch between the detected mime type and the list of extensions, defined in wiki/mime.types, that are associated with that mime type.
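To make the mismatch concrete, here is a minimal Python sketch (not MediaWiki's or Fileinfo's actual code). It builds a tiny zip archive laid out like a .docx, sniffs its type from the leading bytes only, and compares the result with what the file extension would lead us to expect. The names sniff_mime, extension_matches, and EXPECTED_MIME are hypothetical and exist only for illustration.

```python
import io
import zipfile

# Magic bytes of a zip local file header; a .docx starts with these too.
ZIP_SIGNATURE = b"PK\x03\x04"

# Hypothetical, heavily simplified stand-in for an extension-to-MIME table.
EXPECTED_MIME = {
    "docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "zip": "application/zip",
}


def make_fake_docx() -> bytes:
    """Build an in-memory zip with the member names a real .docx would have."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr("[Content_Types].xml", "<Types/>")
        zf.writestr("word/document.xml", "<document/>")
    return buf.getvalue()


def sniff_mime(data: bytes) -> str:
    """Content-only detection that knows about zip but not about Office formats."""
    if data.startswith(ZIP_SIGNATURE):
        return "application/zip"
    return "application/octet-stream"


def extension_matches(filename: str, data: bytes) -> bool:
    """True when the sniffed type equals the type expected for the extension."""
    ext = filename.rsplit(".", 1)[-1].lower()
    return EXPECTED_MIME.get(ext) == sniff_mime(data)


data = make_fake_docx()
print(sniff_mime(data))                        # detected from content alone
print(extension_matches("report.docx", data))  # mismatch: upload would be rejected
print(extension_matches("archive.zip", data))  # same bytes pass with a .zip name
```

The same bytes are accepted under a .zip name and rejected under a .docx name, which is the situation described above.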

In https://github.com/castiron/cac/commit/5890811496604df682e92f71fd77f5c4017d677c I solve this by adding docx etc to that list. I also manually remove application/zip from MW's mime type blacklist. This last move seems less than 100% ideal, given the potential security ramifications. But given that items loaded to the wiki/images directory (where MW uploads are stored) are not executable by the apache user, I think this is a reasonably small risk. And it should be a temporary one. I found at least one suggestion online that the next version of MW will have proper mime detection for new Office files: http://www.gossamer-threads.com/lists/wiki/mediawiki/227645

#9 Updated by Raymond Hoh about 11 years ago

My fix only overrides the default mime types lookup.

We didn't enable FileInfo on MediaWiki in LocalSettings.php:
http://www.mediawiki.org/wiki/Manual:$wgLoadFileinfoExtension

So MediaWiki will not use FileInfo unless we force it to.

Re: ZIP problem - See:
http://redmine.gc.cuny.edu/issues/show/723#note-3

#10 Updated by Boone Gorges about 11 years ago

> Re: ZIP problem - See:
> http://redmine.gc.cuny.edu/issues/show/723#note-3

My problem was not with downloads, but with MIME type detection of uploaded files. I'm not sure how .htaccess would help.

Good call about activating FileInfo. I just did that on cdev, and rolled back my changes, and I still ran into the same problems - the file extensions still do not match the mime type whitelist. I'll have to do more debugging, but this probably means that Fileinfo is incorrectly identifying docx mimetypes as zip.

#11 Updated by Raymond Hoh about 11 years ago

To me, mime type detection of uploaded files in MediaWiki is not a major issue; it's mostly cosmetic.
You can patch /wiki/includes/MimeMagic.php to fix this.

The download as ZIP issue for Office files is more important and can be solved with the .htaccess fix I posted a link to previously.

I've pushed the .htaccess fix to 1.2-branch:
https://github.com/castiron/cac/commit/cf616125145f9cd68035ca8bd7943c0b9ac7f69e

Give it a shot and see if it works on cdev.

#12 Updated by Boone Gorges about 11 years ago

> To me, mime type detection of uploaded files in MediaWiki is not a major issue; it's mostly cosmetic.

Fair enough. It's true that it's not surefire. But using fileinfo does make it a bit more than merely cosmetic (more than just checking against the extension name, anyway).

I didn't have mime type download problems in Chrome (webkit) or FF, but you're right that it should be covered as it will probably be an issue in some browsers. Your .htaccess fix looks good. I'll pull it to cdev later and have a look see.

Thanks for the research and for your feedback, Ray!

#13 Updated by Raymond Hoh about 11 years ago

The mime type download issue appears to affect Opera and IE only.

I also did some more research into FileInfo just now.

When I first did my testing, I didn't have FileInfo enabled and was able to upload with my initial fix; the mime type displayed then in MediaWiki was "unknown/unknown."

Now, I have FileInfo activated and I can duplicate the "application/zip" mime type in MediaWiki.

So it's definitely an issue with FileInfo:
http://stackoverflow.com/questions/4807036/php-5-3-5-fileinfo-mime-type-for-ms-office-2007-files-magic-mime-updates
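For illustration only, one way a detector could tell Office 2007 files apart from plain zip archives is to look at the member names inside the archive. The Python sketch below assumes the conventional Office layout ([Content_Types].xml at the root plus a word/, ppt/, or xl/ folder); it is not how Fileinfo or MediaWiki implement detection, and refine_zip_mime and build_archive are hypothetical names.

```python
import io
import zipfile

# Conventional top-level folder of each Office 2007 format and its MIME type.
OOXML_TYPES = {
    "word/": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "ppt/": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    "xl/": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def build_archive(member_names) -> bytes:
    """Helper for the demo: create an in-memory zip with empty members."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in member_names:
            zf.writestr(name, "")
    return buf.getvalue()


def refine_zip_mime(data: bytes) -> str:
    """Return a more specific MIME type when a zip looks like an Office file."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as zf:
            names = zf.namelist()
    except zipfile.BadZipFile:
        return "application/octet-stream"
    if "[Content_Types].xml" in names:
        for prefix, mime in OOXML_TYPES.items():
            if any(name.startswith(prefix) for name in names):
                return mime
    # Anything else stays a plain zip archive.
    return "application/zip"


docx_like = build_archive(["[Content_Types].xml", "word/document.xml"])
plain_zip = build_archive(["readme.txt"])
print(refine_zip_mime(docx_like))
print(refine_zip_mime(plain_zip))
```

A check like this only narrows the guess based on file names inside the archive; it does not validate the document, so it should not be treated as a security control.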
