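Well, I couldn't be bothered to translate this all over again; I was afraid the translation would come out as mojibake ;-)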
I'm testing with a file called "柯有伦-零.mp3", which contains Chinese characters.
My locale: en_US.utf8
Downloaders I tested with: wget, aria2c
Target filesystems I tested with: ext4, ntfs
I find it strange that the same filename takes two different forms in two URLs:
- %BF%C2%D3%D0%C2%D7-%C1%E3.mp3
- %E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3
I don't know why. It must have something to do with character sets/encodings. Could somebody please explain this to me?
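For anyone who wants to reproduce the comparison, here is a minimal sketch that percent-decodes both forms into raw bytes and tries a few candidate charsets on them. The candidate list (GBK, UTF-8, Latin-1) is my own guess, not something either server reported, and `decode_all` is just a helper name made up for this post.

```python
from urllib.parse import unquote_to_bytes

# The two percent-encoded forms seen in the URLs.
FORMS = [
    "%BF%C2%D3%D0%C2%D7-%C1%E3.mp3",
    "%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3",
]

# Assumption: these charsets are only guesses at what the servers might use.
CANDIDATE_CHARSETS = ["gbk", "utf-8", "latin-1"]


def decode_all(form):
    """Percent-decode `form` to bytes, then try each candidate charset.

    Returns a dict mapping charset name to the decoded text,
    or to None when the bytes are not valid in that charset.
    """
    raw = unquote_to_bytes(form)
    results = {}
    for charset in CANDIDATE_CHARSETS:
        try:
            results[charset] = raw.decode(charset)
        except UnicodeDecodeError:
            results[charset] = None
    return results


if __name__ == "__main__":
    for form in FORMS:
        print(form)
        for charset, text in decode_all(form).items():
            print(f"  {charset:8} -> {text!r}")
```

If the guess is right, the first form should only give the Chinese name when read as GBK, and the second only when read as UTF-8, which would suggest the two sites simply percent-encode the same name from different charsets.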
---------------------------------------------experiment--with--wget--------------------------------------------------
wget "%BF%C2%D3%D0%C2%D7-%C1%E3.mp3" to ext4 partition:
- $ wget 'http://down.jsharer.com/user/userAction.do?method=download&urlpath=ftp://-552790109:693110381@58.215.91.170:2022/22487/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3'
- --2009-08-25 12:17:41-- http://down.jsharer.com/user/userAction.do?method=download&urlpath=ftp://-552790109:693110381@58.215.91.170:2022/22487/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3
- Resolving down.jsharer.com... 222.73.163.168
- Connecting to down.jsharer.com|222.73.163.168|:80... connected.
- HTTP request sent, awaiting response... 302 Moved Temporarily
- Location: ftp://-552790109:693110381@58.215.91.170:2022/22487/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3 [following]
- --2009-08-25 12:17:41-- ftp://-552790109:*password*@58.215.91.170:2022/22487/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3
- => `¿ÂÓÐÂ×-Áã.mp3'
- Connecting to 58.215.91.170:2022... connected.
- Logging in as -552790109 ... Logged in!
- ==> SYST ... done. ==> PWD ... done.
- ==> TYPE I ... done. ==> CWD /22487/200908 ... done.
- ==> SIZE \277\302\323\320\302\327-\301\343.mp3 ... 5211995
- ==> PASV ... done. ==> RETR \277\302\323\320\302\327-\301\343.mp3 ... done.
- Length: 5211995 (5.0M)
- 100%[=====================================================>] 5,211,995 89.1K/s in 56s
- 2009-08-25 12:18:37 (91.2 KB/s) - `¿ÂÓÐÂ×-Áã.mp3' saved [5211995]
wget "%BF%C2%D3%D0%C2%D7-%C1%E3.mp3" to ntfs partition:
- $ wget 'http://down.jsharer.com/user/userAction.do?method=download&urlpath=ftp://1368144520:-1398555672@58.215.91.170:2022/41161/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3'
- --2009-08-25 12:26:29-- http://down.jsharer.com/user/userAction.do?method=download&urlpath=ftp://1368144520:-1398555672@58.215.91.170:2022/41161/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3
- Resolving down.jsharer.com... 222.73.163.168
- Connecting to down.jsharer.com|222.73.163.168|:80... connected.
- HTTP request sent, awaiting response... 302 Moved Temporarily
- Location: ftp://1368144520:-1398555672@58.215.91.170:2022/41161/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3 [following]
- --2009-08-25 12:26:29-- ftp://1368144520:*password*@58.215.91.170:2022/41161/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3
- => `¿ÂÓÐÂ×-Áã.mp3'
- Connecting to 58.215.91.170:2022... connected.
- Logging in as 1368144520 ... Logged in!
- ==> SYST ... done. ==> PWD ... done.
- ==> TYPE I ... done. ==> CWD /41161/200908 ... done.
- ==> SIZE \277\302\323\320\302\327-\301\343.mp3 ... 5211995
- ==> PASV ... done. ==> RETR \277\302\323\320\302\327-\301\343.mp3 ... done.
- ¿ÂÓÐÂ×-Áã.mp3: Invalid or incomplete multibyte or wide character
wget "%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3" to ext4 partition:
- $ wget 'http://mp3.tktt.com/eec38e543bc6c0e4/15/%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3'
- --2009-08-25 12:29:59-- http://mp3.tktt.com/eec38e543bc6c0e4/15/%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3
- Resolving mp3.tktt.com... 58.215.81.44
- Connecting to mp3.tktt.com|58.215.81.44|:80... connected.
- HTTP request sent, awaiting response... 200 OK
- Length: 126229 (123K) [audio/x-ms-wma]
- Saving to: `æ%9F¯æ%9C%89伦-é%9B¶.mp3'
- 100%[=====================================================>] 126,229 129K/s in 1.0s
- 2009-08-25 12:30:03 (129 KB/s) - `æ%9F¯æ%9C%89伦-é%9B¶.mp3' saved [126229/126229]
wget "%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3" to ntfs partition:
- $ wget 'http://mp3.tktt.com/eec38e543bc6c0e4/15/%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3'
- --2009-08-25 12:37:30-- http://mp3.tktt.com/eec38e543bc6c0e4/15/%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3
- Resolving mp3.tktt.com... 58.215.81.44
- Connecting to mp3.tktt.com|58.215.81.44|:80... connected.
- HTTP request sent, awaiting response... 200 OK
- Length: 126229 (123K) [audio/x-ms-wma]
- æ%9F¯æ%9C%89伦-é%9B¶.mp3: Invalid or incomplete multibyte or wide character
- Cannot write to `æ%9F¯æ%9C%89伦-é%9B¶.mp3' (Invalid or incomplete multibyte or wide character).
---------------------------------------------let's--try--with--aria2c--------------------------------------------------
aria2c "%BF%C2%D3%D0%C2%D7-%C1%E3.mp3" to ext4 partition:
- $ aria2c 'http://down.jsharer.com/user/userAction.do?method=download&urlpath=ftp://1368144520:-1398555672@58.215.91.170:2022/41161/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3'
- 2009-08-25 12:47:52.079592 NOTICE - #1 - Download has already completed: /home/canti/Desktop/¿ÂÓÐÂ×-Áã.mp3
- 2009-08-25 12:47:52.080014 NOTICE - Download complete: /home/canti/Desktop/¿ÂÓÐÂ×-Áã.mp3
- Download Results:
- gid|stat|avg speed |path/URI
- ===+====+===========+===========================================================
- 1| OK| n/a|/home/canti/Desktop/¿ÂÓÐÂ×-Áã.mp3
- Status Legend:
- (OK):download completed.
aria2c "%BF%C2%D3%D0%C2%D7-%C1%E3.mp3" to ntfs partition:
- $ aria2c 'http://down.jsharer.com/user/userAction.do?method=download&urlpath=ftp://1368144520:-1398555672@58.215.91.170:2022/41161/200908/%BF%C2%D3%D0%C2%D7-%C1%E3.mp3'
- [#1 SIZE:0B/0B CN:1 SPD:0Bs]
- 2009-08-25 12:46:06.942979 ERROR - Exception caught
- Exception: [RequestGroup.cc:528] Download aborted.
- -> [AbstractDiskWriter.cc:115] Failed to open the file /media/20G/¿ÂÓÐÂ×-Áã.mp3, cause: Invalid or incomplete multibyte or wide character
- Download Results:
- gid|stat|avg speed |path/URI
- ===+====+===========+===========================================================
- 1| ERR| n/a|/media/20G/¿ÂÓÐÂ×-Áã.mp3
- Status Legend:
- (ERR):error occurred.
- aria2 will resume download if the transfer is restarted.
- If there are any errors, then see the log file. See '-l' option in help/man page for details.
aria2c "%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3" to ext4 partition:
- $ aria2c 'http://mp3.tktt.com/eec38e543bc6c0e4/15/%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3'
- [#1 SIZE:96.0KiB/123.2KiB(77%) CN:1 SPD:135.1KiBs]
- 2009-08-25 12:30:17.753443 NOTICE - Download complete: /home/canti/Desktop/柯有伦-零.mp3
- Download Results:
- gid|stat|avg speed |path/URI
- ===+====+===========+===========================================================
- 1| OK| 136.0KiB/s|/home/canti/Desktop/柯有伦-零.mp3
- Status Legend:
- (OK):download completed.
aria2c "%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3" to ntfs partition:
- $ aria2c 'http://mp3.tktt.com/eec38e543bc6c0e4/15/%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3'
- [#1 SIZE:0B/123.2KiB(0%) CN:1 SPD:0Bs]
- 2009-08-25 12:42:14.735797 NOTICE - Download complete: /media/20G/柯有伦-零.mp3
- Download Results:
- gid|stat|avg speed |path/URI
- ===+====+===========+===========================================================
- 1| OK| 20.0MiB/s|/media/20G/柯有伦-零.mp3
- Status Legend:
- (OK):download completed.
---------------------------------------------------------------------------------------------------------------------------
So I'm asking:
1. Why is '%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3' interpreted correctly by aria2c but not by wget?
2. Why is '%BF%C2%D3%D0%C2%D7-%C1%E3.mp3' interpreted as '¿ÂÓÐÂ×-Áã.mp3' by both aria2c and wget?
3. Can I tune aria2c/wget so that those characters are interpreted correctly? I have some experience with FTP clients such as FileZilla and gFTP; both have a charset option for correctly displaying filenames from servers that use an encoding other than UTF-8.
4. Although wget and aria2c both interpret '%BF%C2%D3%D0%C2%D7-%C1%E3.mp3' wrongly, they are able to write a file called '¿ÂÓÐÂ×-Áã.mp3' to ext4 partitions, but not to NTFS partitions. (Just to point it out: PCManFM can display Chinese/Japanese filenames on NTFS partitions correctly.) How can I tune ntfs-3g's options to fix this?
5. As I asked at the beginning: why does the same filename have two versions in two URLs?
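To help narrow down questions 1 and 4, here is a rough sketch under two assumptions of mine. First, that wget's default file-name handling follows the rule described in its manual for the "unix" mode of `--restrict-file-names` (escape '/' and the control bytes 0-31 and 128-159). Second, that the ntfs-3g mount only accepts names that are valid in the locale's encoding, which is UTF-8 here. The helper names are made up for this post, and the escaping function is a simplification, not wget's actual code.

```python
from urllib.parse import unquote_to_bytes


def escape_like_wget_unix_mode(raw: bytes) -> bytes:
    """Percent-escape '/' and bytes 0-31 and 128-159; keep all other bytes.

    Assumption: this mirrors the rule the wget manual gives for its
    default "unix" mode. It is a simplification for illustration only.
    """
    out = bytearray()
    for b in raw:
        if b == 0x2F or b <= 0x1F or 0x80 <= b <= 0x9F:
            out += b"%%%02X" % b
        else:
            out.append(b)
    return bytes(out)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if `raw` decodes cleanly as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    forms = [
        "%BF%C2%D3%D0%C2%D7-%C1%E3.mp3",
        "%E6%9F%AF%E6%9C%89%E4%BC%A6-%E9%9B%B6.mp3",
    ]
    for form in forms:
        raw = unquote_to_bytes(form)
        escaped = escape_like_wget_unix_mode(raw)
        print(form)
        print("  raw bytes valid UTF-8:    ", is_valid_utf8(raw))
        print("  after escaping:           ", escaped)
        print("  escaped name valid UTF-8: ", is_valid_utf8(escaped))
```

If these assumptions hold, the escaping would account for the stray %9F, %9C, %89 and %9B in the name wget saved, and every name that failed on the NTFS partition would be one that is not valid UTF-8, while ext4 stores the bytes as they are. In that case wget's documented `--restrict-file-names=nocontrol` option might be worth a try, though I haven't tested it.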