LinuxSir.cn: Linuxsir, traveling through time!


A kernel patch for Chinese filenames

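Posted on 2003-6-7 08:50:22
You may have noticed that Chinese filenames go wrong when mounting Windows partitions. The problem shows up in two main ways:
1. Reading a file with a Chinese filename on an NTFS partition fails with a "file does not exist" error;
2. After writing a file with a Chinese filename to a FAT partition under Linux, the name shows up garbled under Windows.

Many people know this is a kernel bug, and a careful search online will turn up a patch, but for some reason that patch never made it into the official kernel. I now intend to work on getting this patch into the official kernel, and the first step is to ask everyone to try it out.

I did not discover the fix for this bug (sorry, I don't remember the original author's name; if anyone knows, please tell me), but I wrote this patch entirely by myself. I made the patch against kernel 2.4.18, but it should apply to all versions after 2.4.13 (it does not apply to the 2.2 series; I have not looked at the 2.5 series yet, but it should apply there too). To add this patch to your source tree:
Go to the top directory of the Linux source tree and run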
$ patch -p1 < nls_cp936.patch
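
Since the attachment is gzipped, it has to be unpacked before the command above can read it. A minimal sketch of one way to do that, with a dry run added as an extra precaution that is not part of the original instructions; the helper name apply_gz_patch and the paths in the usage line are made up for illustration:

```shell
# apply_gz_patch SRC_DIR PATCH_GZ   (hypothetical helper, not part of the patch)
# Runs a dry run first, so a tree the patch does not fit is reported
# before any file is modified.
apply_gz_patch() {
    src_dir=$1
    patch_gz=$2
    # gzip -dc writes the unpacked patch to stdout and leaves the .gz file intact.
    gzip -dc "$patch_gz" | (cd "$src_dir" && patch -p1 --dry-run) || return 1
    # The dry run succeeded, so apply the patch for real.
    gzip -dc "$patch_gz" | (cd "$src_dir" && patch -p1)
}
```

Usage would look like `apply_gz_patch /usr/src/linux "$HOME/nls_cp936.patch.gz"`, where both the source tree location and the attachment file name are assumptions; use an absolute path for the patch file or a path relative to the directory you call it from.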

Please help test this patch. Comments and suggestions can be posted here, but it is best to Email them to: plateauwolf@sina.com. Thanks.

The attachment is the gzipped patch.

Posted on 2003-6-7 08:57:56
Thank you very much! Compiling the kernel now.
Posted on 2003-6-7 10:13:56

I haven't encountered such a problem.

Everything works correctly on my system.
I think the NTFS filesystem should be mounted with "-o utf8"; if this option is left out, Chinese filenames may have problems.
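Original poster | Posted on 2003-6-7 10:29:58
Then please try the following experiment: under Windows, create a file named “金” on an NTFS partition, then go to Linux (I assume your NTFS partition is mounted with the iocharset=cp936 option), change to the directory containing that file in a terminal (xterm, for example), and look at the output of ls.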
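
For background on why the mount charset and the terminal encoding have to agree in this experiment, here is a small sketch that prints the bytes of the name 金 in GB and in UTF-8. It does not reproduce the kernel bug itself. It assumes the script is saved as UTF-8 and that the local iconv knows the name GBK; name_bytes is a made-up helper.

```shell
# name_bytes NAME ENCODING   (hypothetical helper)
# Prints the bytes of NAME, converted from UTF-8 to ENCODING, as hex digits.
name_bytes() {
    printf '%s' "$1" | iconv -f UTF-8 -t "$2" | od -An -tx1 | tr -d ' \n'
    echo
}

name_bytes '金' GBK      # two bytes: bdf0
name_bytes '金' UTF-8    # three bytes: e98791
```

The same filename is a different byte sequence in each encoding, so a terminal running in one encoding cannot display names that the kernel hands over in the other.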

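If the file is displayed correctly, then your kernel really does not have this problem. As far as I know, Magic Linux has fixed this bug (not with my patch, but presumably with some other fix). If you are using a different distribution, could you tell me which version?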
Posted on 2003-6-7 10:55:18

Reply

My system is Windows 2000 + Slackware 9.0.
1-> As far as I know, Windows 2000 uses UTF-8 as its core charset, so I mount NTFS with the "-o utf8" option.
2-> I recompiled my Linux kernel. When you configure the kernel, under FileSystem->Language(?) you can find Chinese (simp), Chinese (tra), utf8... support, and you can build them into the kernel. (BTW, my kernel is the original 2.4.20 downloaded from kernel.org.)
Posted on 2003-6-7 10:57:34
If NTFS is mounted with "-o iocharset=cp936", this problem does occur. But as I said, NTFS is actually UTF-8, so mount it with "-o utf8" and everything will be OK.
Posted on 2003-6-7 11:01:34
Generally speaking, vfat and iso9660 should be mounted with "cp936", but NTFS should use "utf8"; ext3, reiser... need no mount option.
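
The rule of thumb in this post can be sketched as a small shell helper. This is only an illustration of the options named in the thread: charset_opt is a made-up name, /dev/hda1 and /mnt/win are placeholders, and the commands are printed rather than run because mounting needs root.

```shell
# charset_opt FSTYPE   (hypothetical helper)
# Prints the filename charset option suggested in this thread for FSTYPE.
charset_opt() {
    case "$1" in
        ntfs)         echo "utf8" ;;             # NTFS: names converted to UTF-8
        vfat|iso9660) echo "iocharset=cp936" ;;  # FAT and CD-ROM: GB (codepage 936)
        *)            echo "defaults" ;;         # ext3, reiserfs, ...: nothing special
    esac
}

# Print the mount commands instead of running them.
for fs in ntfs vfat ext3; do
    echo "mount -t $fs -o $(charset_opt "$fs") /dev/hda1 /mnt/win"
done
```

With "-o utf8" the terminal has to run in a UTF-8 locale to show the names, and with "iocharset=cp936" it has to run in a GB locale.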
Posted on 2003-6-7 11:08:33
There are many problems if NTFS is mounted with "cp936". I have run into these:
1-> ls -laR (some Chinese dirs can't be entered)
2-> du -abch (the size of some Chinese dirs can't be calculated)
3-> cp -av xxx yyy (not all files/dirs get copied, especially Chinese ones, yet "cp" does not fail)
Original poster | Posted on 2003-6-7 11:31:35
Sorry, I misread your first post. I thought you meant ``there would be a problem if you mount it with -o utf8'' instead of ``there would be a problem unless you mount it with -o utf8''.

Thanks for your detailed explanation.  I hadn't thought of using UTF-8 as the I/O charset of an NTFS partition, and it is definitely a good idea.  However, this solution has its limitations.

As I understand it, you have to specify your encoding when you want to display Chinese on the console, so I think you are using the UTF-8 encoding.  Some people need GB encoding: for example, they may have Chinese text files in GB encoding that they just want to cat or more (I am not sure whether grep works).

And FAT partitions need GB encoding anyway (since the filenames are stored in codepage 936 instead of Unicode).  People may have both NTFS and FAT partitions, and would be reluctant to change locale (encoding) every time they change directories.

Your UTF8 solution is a good idea and the Right Way to go, yet a working GB environment would be easier for newbies and Windows immigrants.  My point is, the CP936 encoding in the kernel should work; it is just some code table errors that are causing the problems you mentioned.  I am trying to fix this so that people can use either UTF8 or GB encoding as they wish.

So if you have time, would you mind giving this patch a try?  I am pretty sure that after applying it, you can mount NTFS partitions with the option iocharset=cp936 without any problem (of course you need to be in a GB encoding locale to see the characters), and all three problems you mentioned should be solved as well.
Posted on 2003-6-7 11:55:16
Yes, you are right indeed.
As I said, generally speaking, NTFS should be mounted with "utf8" and vfat and iso9660 with "cp936". I found this solution in the "mount" manpage, which explains everything well.
Still, I would be glad if your solution solved the "ntfs-cp936" problem. I will try your patch as soon as possible.