24 January 2011

It is well documented by now that Office documents downloaded from the Internet sometimes open as zip files, generally when the browser is IE. If you search Google for the problem, you mainly find people who have just run into it and/or just found out that Office documents are structured zip files. Unfortunately, setting the Content-Type header alone is not enough for IE to load Office documents from S3 links properly, at least in some cases.

The solution: add a Content-Disposition header to the content as well. Something along the lines of:

    Content-Disposition: attachment; filename="filename.docx"
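To make the header format concrete, here is a minimal sketch of a shell helper that builds the header line from an object key. The function name content_disposition is hypothetical (it is not part of s3cmd or of our API), and it assumes the filename contains no double quotes.

```shell
# Hypothetical helper: build a Content-Disposition header for a given key.
# Assumes the filename itself contains no double-quote characters.
content_disposition() {
  # Strip any folder prefix so only the bare filename is advertised.
  name=$(basename "$1")
  printf 'Content-Disposition: attachment; filename="%s"\n' "$name"
}

content_disposition "2011/abstract/award/paper.docx"
```

Only the last path component ends up in the header, so the browser offers the bare filename rather than the full S3 key.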

As with most problems, this one was discovered only after it had already happened. On top of adding the header to our API, we also needed to correct the documents that had already been uploaded. Fortunately, s3cmd makes this scriptable:

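    # Work in a scratch directory
    mkdir temp
    cd temp

    # Download every .docx under the prefix; ${f/*\//} strips the folder part of the key
    s3cmd list [bucket]:[folder-prefix] | egrep docx | while read f; do
      s3cmd get [bucket]:"$f" "${f/*\//}"
    done

    # Re-upload each file with both the Content-type and Content-Disposition headers
    ls *docx | while read f; do
      s3cmd put files.cpnp.org:2011/abstract/award/"$f" "$f" \
        Content-type:application/vnd.openxmlformats-officedocument.wordprocessingml.document \
        "Content-Disposition: attachment; filename=\"$f\""
    done

    echo "Cleanup"
    cd ..
    rm -rf temp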
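The script above handles only .docx files. If other Office formats had been affected, the Content-type would have to vary per file. A rough sketch of how that lookup could be done, assuming a hypothetical helper named office_content_type (not something s3cmd provides) and covering only the three most common Open XML extensions:

```shell
# Hypothetical helper: map an Office Open XML extension to its MIME type.
# Returns non-zero for anything it does not recognize.
office_content_type() {
  case "$1" in
    *.docx) echo "application/vnd.openxmlformats-officedocument.wordprocessingml.document" ;;
    *.xlsx) echo "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" ;;
    *.pptx) echo "application/vnd.openxmlformats-officedocument.presentationml.presentation" ;;
    *) return 1 ;;
  esac
}

office_content_type "report.xlsx"
```

In the upload loop, the hard-coded type could then be replaced with something like Content-type:$(office_content_type "$f"), skipping any file for which the helper fails.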
