Fixing Drupal's truncation of leading Chinese characters in uploaded file names
xtykc, January 29th, 2010
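Many people report that Drupal does not support Chinese file names when uploading files (the file name turns into garbled characters). In fact, newer versions have already addressed this, but the fix is not perfect.
The problem I ran into is this: if a file name begins with Chinese characters, the system strips the leading Chinese portion after upload. For example, the file 某某单位ABC系统测试报告.doc becomes ABC系统测试报告.doc. A name that begins with English letters is unaffected even if it contains Chinese, so ABCDE学习指南.doc uploads correctly. Names that begin with digits are also fine.
Many people recommend the transliteration module, which normalizes character encoding. I downloaded and tried it, and found that it converts Chinese into pinyin. That avoids the truncation, but it is not what I want. It looks like the problem has to be solved at the code level.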
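In Drupal, the code that handles file uploads lives in ../includes/file.inc. The truncation happens after this statement:
$file->filename = file_munge_filename(trim(basename($_FILES['files']['name'][$source]), '.'), $extensions);
The cause is that basename() sometimes strips certain UTF-8 characters, because its result depends on the locale setting in effect at the time: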
The results of the basename() function are dependent on your locale setting.
If basename() is returning blank results for strings with multibyte characters, you can try including the following in your script:
<?php
setlocale(LC_ALL, 'en_US.UTF8'); # or any other locale that can handle multibyte characters.
?>
However, the best solution to do this would be to change the locale setting on your system or webserver. For example, on Debian systems, this is done in /etc/init.d/apache
For Apache on CentOS, this situation does indeed occur:
The system locale in mod_php seems to be set to "C" instead of the locale of the system (which is "en_US.UTF8" in my case).
One approach is to call setlocale:
The workaround is to explicitely set the locale with "setlocale(LC_CTYPE, "en_US.UTF-8")".
The other approach is to change the settings used by the httpd startup script:
vi /etc/sysconfig/httpd
HTTPD_LANG=en_US.UTF8
In theory the value should be en_US.UTF8, but the person who filed the bug suggests en_US instead:
I tried it, but it didn't seem to work first. Now I got it to work. It works only with strings like "en_US" and not with "en_US.UTF8". The result is then the same as on CentOS 4 with PHP 4. However, on CentOS 4 the HTTPD_LANG variable is set to "C" as well.
The fix at the code level: in file.inc, immediately before the statement
$file->filename = file_munge_filename(trim(basename($_FILES['files']['name'][$source]), '.'), $extensions);
insert one line:
setlocale(LC_ALL, 'en_US.UTF8');
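If changing the locale is not an option (for example, when the required UTF-8 locale is not installed on the server), another possibility is to avoid basename() altogether. The following is only a minimal sketch under that assumption. The helper name locale_independent_basename is hypothetical and is not part of Drupal or PHP. It relies on the fact that the bytes for "/" and "\" never occur inside a UTF-8 multibyte sequence, so a plain byte search for the last separator is safe regardless of the locale.

```php
<?php
/**
 * Hypothetical helper (not part of Drupal or PHP): returns the last path
 * component without consulting the current locale.
 *
 * Backslashes are treated as separators too, since some browsers send a
 * full Windows path as the upload name.
 */
function locale_independent_basename($path) {
  // Normalize Windows separators and drop any trailing slashes.
  $path = rtrim(str_replace('\\', '/', $path), '/');
  // Find the last separator by byte position; UTF-8 multibyte
  // sequences never contain the "/" byte, so this cannot split a character.
  $pos = strrpos($path, '/');
  return $pos === FALSE ? $path : substr($path, $pos + 1);
}
```

Assuming the helper is defined before the upload code runs, it could stand in for basename() in the statement above, for example trim(locale_independent_basename($_FILES['files']['name'][$source]), '.'). This would be a local modification of file.inc, just like the setlocale line, and would need to be reapplied after a Drupal core upgrade.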