File Types on Commons
I am writing to see if you may be able to help me. Part of my dissertation was the development of a corpus of text messages that I would like to make publicly available, but also password protected. In a perfect world, I would like this to live on a GC-affiliated website that is founded on open source principles (i.e., the Commons).
I know the Commons generally cannot host certain file types (e.g., .csv, .sql, .zip, .dmg, because I've been trying), but was wondering if it might be possible to make an exception or if you know of a way I could host these. I have a csv file of all of the text messages that is 3 MB, and an sql file that is 14 MB. Zipped together, they get down to about 3 MB.
I am currently hosting these on my GitHub (https://github.com/michellejm/BYTs) where they are in a password protected folder.
Thank you greatly for any help you can provide!
#2 Updated by Boone Gorges over 5 years ago
- Target version set to Not tracked
It's definitely possible to make an exception for hosting a specific file. If the file will remain static over time - that is, you won't need to modify it in the future - I will just upload it manually, and give you the URL.
The issue of password protection is more complex. Can you give more details about the kind of protection you need? What does it mean to be "publicly available" but behind a password? Will you be handing out the password to people? Or do you simply want to ensure that the people downloading the files are humans (rather than, say, search engine crawlers)? What do you imagine access looking like - will there be a page that links to the document, or will you be sending the URL to people privately? If you can provide a bit more information, I can help to work out something that'll meet your needs.
FYI - the .dmg file at https://github.com/michellejm/BYTs is currently publicly available (in the true sense!) and can be viewed/downloaded by anyone. If you don't want this to be the case, you may want to delete the repo (deleting only the file is not enough, because Git keeps track of the full file history).
#3 Updated by Michelle McSweeney over 5 years ago
Thank you so much!
Perfect - I shouldn't ever need to modify the file.
It's part of my IRB that it has to be "password-protected" and I have to verify that anyone downloading it is a "researcher" (clearly this is all very vague). The best I can come up with is to have people email me or the lab to get the password. I would like the page where the download link lives to be password protected.
The dmg file on GitHub is encrypted, so if you go in and try to view the files, it will prompt you for a password, though you can still download them. This is less than ideal because I would like it to be more obviously "password-protected" should anyone look into the IRB.
I'm setting up a site with a downloads page byts.commons.gc.cuny.edu/download that is "password-protected" (Bilingual365). This would be my ideal approach if it is possible.
Thank you again so much!!
#4 Updated by Boone Gorges about 5 years ago
Hi Michelle - Thanks for the additional info.
I've looked into a couple of different possibilities, and I think the easiest thing will be for me to upload the file manually and put some server-level restrictions in place to enforce password protection. I'm working with our IT department to iron out some wrinkles. In the meantime, do you have a means to send me a raw version of the file that you want to post? Sounds like it's small enough that you could probably email it to me directly as an attachment (boone at gorg dot es). If I've got the file in-hand when IT has helped me to get stuff set up, I'll be able to add it to the site right away and give you the URL.
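For anyone reading along and wondering what "server-level restrictions" could look like: the thread does not say which mechanism was actually used. One common option is HTTP Basic authentication, where the web server refuses to serve the file unless the request carries a valid username and password. The check itself can be sketched roughly as follows. This is only an illustration, not the Commons setup; the username, the password, and the function name `is_authorized` are all hypothetical.

```python
import base64
import hashlib
import hmac

# Hypothetical credentials, for illustration only. A real server would read
# a salted hash from a protected file rather than hard-coding anything here.
EXPECTED_USER = b"researcher"
EXPECTED_PASSWORD_SHA256 = hashlib.sha256(b"example-password").digest()


def is_authorized(authorization_header):
    """Return True if an HTTP Basic 'Authorization' header carries the expected credentials."""
    if not authorization_header:
        return False

    # A Basic header looks like: "Basic <base64 of 'user:password'>"
    scheme, _, encoded = authorization_header.partition(" ")
    if scheme.lower() != "basic":
        return False

    try:
        decoded = base64.b64decode(encoded.strip(), validate=True)
    except ValueError:
        # Malformed base64 (binascii.Error is a subclass of ValueError)
        return False

    user, sep, password = decoded.partition(b":")
    if not sep:
        return False

    # Compare in constant time so response timing does not leak
    # how much of the credentials matched.
    user_ok = hmac.compare_digest(user, EXPECTED_USER)
    password_ok = hmac.compare_digest(
        hashlib.sha256(password).digest(), EXPECTED_PASSWORD_SHA256
    )
    return user_ok and password_ok
```

A server using a check like this would answer unauthorized requests with a 401 status, which makes the browser prompt for a password before the download starts. Basic authentication only encodes the credentials and does not encrypt them, so it should be used over HTTPS.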
#5 Updated by Boone Gorges about 5 years ago
- Status changed from Assigned to Reporter Feedback
I've worked out our problems with IT, and the file is now available: https://commons.gc.cuny.edu/wp-content/protected/2717/Spanish_English_BYTs.zip
Let me know if this meets your needs. Thanks!