Problem description
UPDATE: SOLVED
I was calling FTPClient.setFileType() before I logged in, causing the FTP server to use the default mode (ASCII) no matter what I set it to. The client, on the other hand, was behaving as though the file type had been properly set. BINARY mode is now working exactly as desired, transporting the file byte-for-byte in all cases. All I had to do was a little traffic sniffing in Wireshark and then mimicking the FTP commands using netcat to see what was going on. Why didn't I think of that two days ago!? Thanks, everyone, for your help!
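For anyone wanting to repeat the netcat experiment, here is a rough sketch of the control-channel command order that worked for me: log in first, then request the binary type. USER, PASS, TYPE I, PASV and RETR are standard FTP commands; the class name, method name, credentials and path are made up for illustration, and reply handling and the data connection itself are left out.

```java
import java.util.ArrayList;
import java.util.List;

class FtpCommandOrder {
    /**
     * Builds the control-channel commands for a binary download, in the
     * order they should be sent. Reply parsing and the data connection
     * are deliberately not part of this sketch.
     */
    static List<String> binaryDownloadCommands(String user, String pass, String remotePath) {
        List<String> commands = new ArrayList<>();
        commands.add("USER " + user);       // log in first...
        commands.add("PASS " + pass);
        commands.add("TYPE I");             // ...then request image (binary) type
        commands.add("PASV");               // ask the server for a passive data port
        commands.add("RETR " + remotePath); // finally fetch the file
        return commands;
    }

    public static void main(String[] args) {
        // Hypothetical credentials and path, for illustration only.
        for (String command : binaryDownloadCommands("user", "pass", "test/test.xml")) {
            System.out.println(command);
        }
    }
}
```

Typing these lines into a netcat session by hand, one at a time, makes it easy to see how the server replies to TYPE I before and after login.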
I have an XML file, UTF-16 encoded, which I am downloading from an FTP site using the FTPClient from Apache's commons-net-2.0 Java library. It offers support for two transfer modes: ASCII_FILE_TYPE and BINARY_FILE_TYPE, the difference being that ASCII will replace line separators with the appropriate local line separator ('\r\n' or just '\n'; in hex, 0x0d0a or just 0x0a). My problem is this: I have a test file, UTF-16 encoded, that contains the following:
<?xml version='1.0' encoding='utf-16'?>
<data>
	<blah>blah</blah>
</data>
Here's the hex:
0000000: 003c 003f 0078 006d 006c 0020 0076 0065 .<.?.x.m.l. .v.e
0000010: 0072 0073 0069 006f 006e 003d 0027 0031 .r.s.i.o.n.=.'.1
0000020: 002e 0030 0027 0020 0065 006e 0063 006f ...0.'. .e.n.c.o
0000030: 0064 0069 006e 0067 003d 0027 0075 0074 .d.i.n.g.=.'.u.t
0000040: 0066 002d 0031 0036 0027 003f 003e 000a .f.-.1.6.'.?.>..
0000050: 003c 0064 0061 0074 0061 003e 000a 0009 .<.d.a.t.a.>....
0000060: 003c 0062 006c 0061 0068 003e 0062 006c .<.b.l.a.h.>.b.l
0000070: 0061 0068 003c 002f 0062 006c 0061 0068 .a.h.<./.b.l.a.h
0000080: 003e 000a 003c 002f 0064 0061 0074 0061 .>...<./.d.a.t.a
0000090: 003e 000a                                .>..
When I use ASCII mode for this file it transfers correctly, byte-for-byte; the result has the same md5sum. Great. When I use BINARY transfer mode, which is not supposed to do anything but shuffle bytes from an InputStream into an OutputStream, the result is that the newlines (0x0a) are converted to carriage return + newline pairs (0x0d0a). Here's the hex after binary transfer:
0000000: 003c 003f 0078 006d 006c 0020 0076 0065 .<.?.x.m.l. .v.e
0000010: 0072 0073 0069 006f 006e 003d 0027 0031 .r.s.i.o.n.=.'.1
0000020: 002e 0030 0027 0020 0065 006e 0063 006f ...0.'. .e.n.c.o
0000030: 0064 0069 006e 0067 003d 0027 0075 0074 .d.i.n.g.=.'.u.t
0000040: 0066 002d 0031 0036 0027 003f 003e 000d .f.-.1.6.'.?.>..
0000050: 0a00 3c00 6400 6100 7400 6100 3e00 0d0a ..<.d.a.t.a.>...
0000060: 0009 003c 0062 006c 0061 0068 003e 0062 ...<.b.l.a.h.>.b
0000070: 006c 0061 0068 003c 002f 0062 006c 0061 .l.a.h.<./.b.l.a
0000080: 0068 003e 000d 0a00 3c00 2f00 6400 6100 .h.>....<./.d.a.
0000090: 7400 6100 3e00 0d0a                      t.a.>...
Not only does it convert the newline characters (which it shouldn't), but it doesn't respect the UTF-16 encoding (not that I would expect it to know that it should, it's just a dumb FTP pipe). The result is unreadable without further processing to realign the bytes. I would just use ASCII mode, but my application will also be moving real binary data (mp3 files and jpeg images) across the same pipe. Using the BINARY transfer mode on these binary files also causes them to have random 0x0d bytes injected into their contents, which can't safely be removed since the binary data often contains legitimate 0x0d0a sequences. If I use ASCII mode on these files, then the "clever" FTPClient converts these 0x0d0a sequences into 0x0a, leaving the file inconsistent no matter what I do.
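To make the misalignment concrete, here is a small stand-alone sketch of what a byte-level LF to CRLF conversion does to UTF-16 text. It is only a simulation of the effect seen in the hex dumps, not code from commons-net or from any FTP server; the class and method names are made up for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

class CrlfInjectionDemo {
    /** Inserts 0x0d before every 0x0a byte, ignoring the text encoding. */
    static byte[] lfToCrlf(byte[] input) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte b : input) {
            if (b == 0x0a) {
                out.write(0x0d);
            }
            out.write(b);
        }
        return out.toByteArray();
    }

    /** Formats bytes as space-separated two-digit hex values. */
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] original = ">\n<".getBytes(StandardCharsets.UTF_16BE);
        byte[] mangled = lfToCrlf(original);
        System.out.println(hex(original)); // 00 3e 00 0a 00 3c
        System.out.println(hex(mangled));  // 00 3e 00 0d 0a 00 3c
    }
}
```

The single extra byte shifts every following two-byte code unit by one position, which is why the text after the first newline in the dump reads as 3c00 6400 instead of 003c 0064.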
I guess my question(s) is(are): does anyone know of any good FTP libraries for java that just move the damned bytes from there to here, or am I going to have to hack up apache commons-net-2.0 and maintain my own FTP client code just for this simple application? Has anyone else dealt with this bizarre behavior? Any suggestions would be appreciated.
I checked out the commons-net source code and it doesn't look like it's responsible for the weird behavior when BINARY mode is used. But the InputStream it's reading from in BINARY mode is just a java.io.BufferedInputStream wrapped around a socket InputStream. Do these lower level Java streams ever do any weird byte-manipulation? I would be shocked if they did, but I don't see what else could be going on here.
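One quick way to rule out the buffering layer is to push awkward bytes through a BufferedInputStream and compare what comes out. The sketch below uses an in-memory ByteArrayInputStream in place of the socket stream (an assumption, so that it runs without a server); the class and method names are made up for illustration.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;

class BufferedPassThroughCheck {
    /** Reads everything from the source through a BufferedInputStream. */
    static byte[] readThroughBuffer(InputStream source) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (InputStream in = new BufferedInputStream(source)) {
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Bytes that a text-aware layer might be tempted to rewrite.
        byte[] original = {0x00, 0x3e, 0x00, 0x0a, 0x0d, 0x0a, (byte) 0xff};
        byte[] copy = readThroughBuffer(new ByteArrayInputStream(original));
        System.out.println(Arrays.equals(original, copy)); // true
    }
}
```

BufferedInputStream only batches reads; it has no notion of line endings or character encodings, so any rewriting has to be happening somewhere else in the chain.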
EDIT 1:
Here's a minimal piece of code that mimics what I'm doing to download the file. To compile, just do
javac -classpath /path/to/commons-net-2.0.jar Main.java
To run, you'll need directories /tmp/ascii and /tmp/binary for the file to download to, as well as an ftp site set up with the file sitting in it. The code will also need to be configured with the appropriate ftp host, username and password. I put the file on my testing ftp site under the test/ folder and called the file test.xml. The test file should at least have more than one line, and be utf-16 encoded (this may not be necessary, but will help to recreate my exact situation). I used vim's :set fileencoding=utf-16 command after opening a new file and entered the xml text referenced above. Finally, to run, just do
java -cp .:/path/to/commons-net-2.0.jar Main
Code:
(NOTE: this code modified to use custom FTPClient object, linked below under "EDIT 2")
import java.io.*;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;
import java.util.zip.CRC32;
import org.apache.commons.net.ftp.*;

public class Main implements java.io.Serializable
{
    public static void main(String[] args) throws Exception
    {
        Main main = new Main();
        main.doTest();
    }

    private void doTest() throws Exception
    {
        String host = "ftp.host.com";
        String user = "user";
        String pass = "pass";
        String asciiDest = "/tmp/ascii";
        String binaryDest = "/tmp/binary";
        String remotePath = "test/";
        String remoteFilename = "test.xml";

        System.out.println("TEST.XML ASCII");
        MyFTPClient client = createFTPClient(host, user, pass, org.apache.commons.net.ftp.FTP.ASCII_FILE_TYPE);
        File path = new File("/tmp/ascii");
        downloadFTPFileToPath(client, "test/", "test.xml", path);
        System.out.println("");

        System.out.println("TEST.XML BINARY");
        client = createFTPClient(host, user, pass, org.apache.commons.net.ftp.FTP.BINARY_FILE_TYPE);
        path = new File("/tmp/binary");
        downloadFTPFileToPath(client, "test/", "test.xml", path);
        System.out.println("");

        System.out.println("TEST.MP3 ASCII");
        client = createFTPClient(host, user, pass, org.apache.commons.net.ftp.FTP.ASCII_FILE_TYPE);
        path = new File("/tmp/ascii");
        downloadFTPFileToPath(client, "test/", "test.mp3", path);
        System.out.println("");

        System.out.println("TEST.MP3 BINARY");
        client = createFTPClient(host, user, pass, org.apache.commons.net.ftp.FTP.BINARY_FILE_TYPE);
        path = new File("/tmp/binary");
        downloadFTPFileToPath(client, "test/", "test.mp3", path);
    }

    public static File downloadFTPFileToPath(MyFTPClient ftp, String remoteFileLocation, String remoteFileName, File path)
        throws Exception
    {
        // path to remote resource
        String remoteFilePath = remoteFileLocation + "/" + remoteFileName;

        // create local result file object
        File resultFile = new File(path, remoteFileName);

        // local file output stream
        CheckedOutputStream fout = new CheckedOutputStream(new FileOutputStream(resultFile), new CRC32());

        // try to read data from remote server
        if (ftp.retrieveFile(remoteFilePath, fout)) {
            System.out.println("FileOut: " + fout.getChecksum().getValue());
            return resultFile;
        } else {
            throw new Exception("Failed to download file completely: " + remoteFilePath);
        }
    }

    public static MyFTPClient createFTPClient(String url, String user, String pass, int type)
        throws Exception
    {
        MyFTPClient ftp = new MyFTPClient();
        ftp.connect(url);
        if (!ftp.setFileType( type )) {
            throw new Exception("Failed to set ftpClient object to BINARY_FILE_TYPE");
        }

        // check for successful connection
        int reply = ftp.getReplyCode();
        if (!FTPReply.isPositiveCompletion(reply)) {
            ftp.disconnect();
            throw new Exception("Failed to connect properly to FTP");
        }

        // attempt login
        if (!ftp.login(user, pass)) {
            String msg = "Failed to login to FTP";
            ftp.disconnect();
            throw new Exception(msg);
        }

        // success! return connected MyFTPClient.
        return ftp;
    }
}
EDIT 2:
Okay, I followed the CheckedXputStream advice and here are my results. I made a copy of apache's FTPClient called MyFTPClient, and I wrapped both the SocketInputStream and the BufferedInputStream in a CheckedInputStream using CRC32 checksums. Furthermore, I wrapped the FileOutputStream that I give to FTPClient to store the output in a CheckedOutputStream with a CRC32 checksum. The code for MyFTPClient is posted here, and I've modified the above test code to use this version of the FTPClient (I tried to post a gist URL to the modified code, but I need 10 reputation points to post more than one URL!). I ran it against test.xml and test.mp3, and the results were thus:
14:00:08,644 DEBUG [main,TestMain] TEST.XML ASCII
14:00:08,919 DEBUG [main,MyFTPClient] Socket CRC32: 2739864033
14:00:08,919 DEBUG [main,MyFTPClient] Buffer CRC32: 2739864033
14:00:08,954 DEBUG [main,FTPUtils] FileOut CRC32: 866869773
14:00:08,955 DEBUG [main,TestMain] TEST.XML BINARY
14:00:09,270 DEBUG [main,MyFTPClient] Socket CRC32: 2739864033
14:00:09,270 DEBUG [main,MyFTPClient] Buffer CRC32: 2739864033
14:00:09,310 DEBUG [main,FTPUtils] FileOut CRC32: 2739864033
14:00:09,310 DEBUG [main,TestMain] TEST.MP3 ASCII
14:00:10,635 DEBUG [main,MyFTPClient] Socket CRC32: 60615183
14:00:10,635 DEBUG [main,MyFTPClient] Buffer CRC32: 60615183
14:00:10,636 DEBUG [main,FTPUtils] FileOut CRC32: 2352009735
14:00:10,636 DEBUG [main,TestMain] TEST.MP3 BINARY
14:00:11,482 DEBUG [main,MyFTPClient] Socket CRC32: 60615183
14:00:11,482 DEBUG [main,MyFTPClient] Buffer CRC32: 60615183
14:00:11,483 DEBUG [main,FTPUtils] FileOut CRC32: 60615183
This makes basically zero sense, because here are the md5sums of the corresponding files:
bf89673ee7ca819961442062eaaf9c3f ascii/test.mp3
7bd0e8514f1b9ce5ebab91b8daa52c4b binary/test.mp3
ee172af5ed0204cf9546d176ae00a509 original/test.mp3
104e14b661f3e5dbde494a54334a6dd0 ascii/test.xml
36f482a709130b01d5cddab20a28a8e8 binary/test.xml
104e14b661f3e5dbde494a54334a6dd0 original/test.xml
I'm at a loss. I swear I haven't permuted the filenames/paths at any point in this process, and I've triple-checked every step. It must be something simple, but I haven't the foggiest idea where to look next. In the interest of practicality I'm going to proceed by calling out to the shell to do my FTP transfers, but I intend to pursue this until I understand what the hell is going on. I'll update this thread with my findings, and I'll continue to appreciate any contributions anyone may have. Hopefully this will be useful to someone at some point!
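For reference, here is a minimal stand-alone sketch of the checksum-wrapping idea, using in-memory streams instead of the FTP socket and the local file (an assumption, so that it runs without a server). It is not the MyFTPClient code itself; the class and method names are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.CRC32;
import java.util.zip.CheckedInputStream;
import java.util.zip.CheckedOutputStream;

class ChecksumCopy {
    /** Copies data through checked streams; returns {inputCrc, outputCrc}. */
    static long[] copyWithChecksums(byte[] data) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (CheckedInputStream in = new CheckedInputStream(new ByteArrayInputStream(data), new CRC32());
             CheckedOutputStream out = new CheckedOutputStream(sink, new CRC32())) {
            byte[] chunk = new byte[1024];
            int n;
            while ((n = in.read(chunk)) != -1) {
                out.write(chunk, 0, n); // checksum is updated as bytes pass through
            }
            out.flush();
            return new long[] {in.getChecksum().getValue(), out.getChecksum().getValue()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] data = {0x00, 0x3e, 0x00, 0x0d, 0x0a, (byte) 0xff};
        long[] crcs = copyWithChecksums(data);
        System.out.println("In  CRC32: " + crcs[0]);
        System.out.println("Out CRC32: " + crcs[1]);
        System.out.println("Match: " + (crcs[0] == crcs[1]));
    }
}
```

When nothing between the two wrappers rewrites the data, the two values match. A mismatch like the one in the ASCII runs above means the bytes were changed somewhere between the socket and the file.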
Recommended answer
Set the file type after logging in to the FTP server:
ftp.setFileType(FTP.BINARY_FILE_TYPE);
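The following line did not solve it (setFileTransferMode sets the transfer mode, such as stream or block, not the file type):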
//ftp.setFileTransferMode(org.apache.commons.net.ftp.FTP.BINARY_FILE_TYPE);