Monk_Novice has asked for the wisdom of the Perl Monks concerning the following question:

I wrote a small Perl script to upload and download files to and from Unix (AIX).
The download works perfectly: the \n on Unix is replaced with \r\n when downloaded, so I have no issues with download.

But when a text file is uploaded to Unix in ASCII mode, the size of the file on Unix stays the same.
This should not be the case: the size should be smaller than on Windows, because \r\n is converted to \n on Unix.
I confirmed this by uploading the same file with the default Windows ftp utility.

The code is shown below; the code relevant to error handling has been removed for clarity.

use Net::FTP;

$g_FtpCnxn = Net::FTP->new($l_Server, Debug => 0);
$g_FtpCnxn->login($l_Login, $l_Passwd);
$g_FtpCnxn->ascii;                                   # ASCII transfer mode
$l_Result = $g_FtpCnxn->put($l_SrcFile);             # returns the remote file name
$l_TgtFileName = "$l_TgtFile/$l_Result";
$l_ServerSize = $g_FtpCnxn->size($l_TgtFileName);    # Size on Unix
@s_temp = stat($l_SrcFile);
$l_LocalSize = $s_temp[7];                           # Size on Windows

A typical size mismatch after an upload is shown here

Unix Size : 89539 << Incorrect
Windows Size : 89539
Number of new lines in the file : 2839
Unix Size : 86700 << Correct / Expected Size
The expected size is achieved when the file is uploaded through the default ftp utility.
Thanks in advance for your suggestions.
nanda
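
The size arithmetic above (89539 - 2839 = 86700) can be checked locally before the server is involved. What follows is only a minimal sketch, not part of the original script: `count_crlf` and `expected_unix_size` are hypothetical helper names, the file is assumed to fit in memory, and the server is assumed to turn each CR LF pair into a single LF in ASCII mode.

```perl
use strict;
use warnings;

# Count CR LF pairs in a file. The handle is opened raw so that Windows
# text-mode translation cannot hide the carriage returns.
sub count_crlf {
    my ($path) = @_;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    local $/;                          # slurp the whole file
    my $data = <$fh>;
    close($fh);
    $data = '' unless defined $data;   # empty file
    my $pairs = () = $data =~ /\x0D\x0A/g;
    return $pairs;
}

# Size the file should have once every CR LF has become a single LF.
# (Assumption: one byte is dropped per CR LF pair, nothing else changes.)
sub expected_unix_size {
    my ($path) = @_;
    my $local_size = (stat($path))[7];
    return $local_size - count_crlf($path);
}
```

If `expected_unix_size($l_SrcFile)` differs from the value returned by `size` on the server, the carriage returns were not stripped during the upload.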

Replies are listed 'Best First'.
Re: Ascii upload ruins the file
by ikegami (Patriarch) on May 19, 2005 at 14:57 UTC

    How did you check if the file you're trying to upload uses CR LF as the line terminator? Alternatively, what's the output of
    od -t x1 uploaded_file | head -10
    when run on the unix machine.
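
On the Windows side, where od may not be available, a comparable dump can be produced from Perl. This is a rough sketch under assumptions: `hex_head` is a hypothetical name, and the output only imitates the layout of `od -t x1` (octal offset, then up to 16 two-digit hex bytes per line) without claiming to match it exactly.

```perl
use strict;
use warnings;

# Return the first $limit bytes of a file as od-style lines:
# a seven-digit octal offset followed by up to 16 hex byte values.
sub hex_head {
    my ($path, $limit) = @_;
    $limit = 160 unless defined $limit;
    open(my $fh, '<:raw', $path) or die "Cannot open $path: $!";
    my $buf = '';
    defined(read($fh, $buf, $limit)) or die "Cannot read $path: $!";
    close($fh);

    my @lines;
    for (my $off = 0; $off < length($buf); $off += 16) {
        my $chunk = substr($buf, $off, 16);
        my @hex   = map { sprintf('%02x', ord($_)) } split(//, $chunk);
        push @lines, sprintf('%07o %s', $off, join(' ', @hex));
    }
    return join("\n", @lines);
}
```

Running `print hex_head($l_SrcFile), "\n";` on the local file shows whether each line ends in `0d 0a` or only `0a` before anything is uploaded.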

      Thanks for your "express answers"


      Here are the results of my experiment.
      I created a file on Unix with cat:
      cat > sample
      this is line 1
      this is line 2
      this is line 3
      Now I dump the file as shown below:
      $ od -t x1 sample
      0000000 074 068 069 073 020 069 073 020 06c 069 06e 065 020 031 00a 074
      0000020 068 069 073 020 069 073 020 06c 069 06e 065 020 032 00a 074 068
      0000040 069 073 020 069 073 020 06c 069 06e 065 020 033 00a
      0000055
      As one can see, the line terminator is ASCII 10 (line feed).

      Now coming to Net::FTP: a small text file with just 2 lines is uploaded to Unix from Windows, then a dump is taken. The dump is shown below.

      0000000 02f 02f 02f 02f 02f 02f 02f 02f 02f 02f 02f 02f 02f 02f 02f 02f
      *
      0000100 02f 02f 02f 02f 02f 02f 02f 02f 02f 02f 00d 00a 02f 02f 020 020
      0000120 028 043 029 020 04d 065 072 065 070 06c 061 063 065 020 061 06c
      0000140 06c 020 074 068 065 061 073 065 020 077 069 074 068 020 063 072
      0000160 075 079 074 069 06e 067 020 06d 06d 06d 06d 06d 06e 00d 00a
      0000177

      The 00d and 00a are nothing but carriage return and line feed, so the CR was not stripped during the upload. This is the actual problem. Any clues?
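
One way to sidestep the problem, without explaining why the ASCII-mode upload keeps the carriage returns, is to strip them locally and send the result unchanged. The sketch below is an assumption-laden workaround rather than a fix for Net::FTP itself: `write_lf_copy` is a hypothetical helper, and it assumes a temporary LF-only copy of the file is acceptable.

```perl
use strict;
use warnings;

# Write an LF-only copy of $src to $dst and return the size of the copy.
# Both handles are raw so that no I/O layer adds or removes CR bytes.
sub write_lf_copy {
    my ($src, $dst) = @_;
    open(my $in,  '<:raw', $src) or die "Cannot read $src: $!";
    open(my $out, '>:raw', $dst) or die "Cannot write $dst: $!";
    local $/ = "\x0A";                    # split records on LF only
    while (my $line = <$in>) {
        $line =~ s/\x0D\x0A\z/\x0A/;      # CR LF at end of line becomes LF
        print {$out} $line;
    }
    close($in);
    close($out) or die "Cannot close $dst: $!";
    return (stat($dst))[7];
}
```

The copy could then be uploaded in binary mode, for example with `$g_FtpCnxn->binary;` followed by `$g_FtpCnxn->put($copy, $remote_name);`, where `$copy` and `$remote_name` are placeholder variables. In binary mode no line-ending translation is expected on either side, so the size on Unix should equal the size of the copy.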
Re: Ascii upload ruins the file
by osunderdog (Deacon) on May 19, 2005 at 14:00 UTC

    I don't have an easy way to duplicate the problem. The documentation for Net::FTP says it should handle this.

    I would recommend contacting the author.

    Soon to be unemployed!

Re: Ascii upload ruins the file
by 5mi11er (Deacon) on May 19, 2005 at 14:26 UTC
    Ah, no, you misunderstand. There are still two characters whether it is under DOS/WIN or Unix. They just happen to be backward from each other. What you are confused by is the fact that C and Perl and probably many other languages allow you to do the correct thing on whichever box you're on by using '\n' as a shortcut.

    I stand corrected. I've been mistaken on this for years...

    -Scott

      No, Unix does indeed use only \x0A as the line terminator.

      $ cat > test
      this is a test
      this is a test
      this is a test
      12345678901234
      $ od -t x1 test
      000000 74 68 69 73 20 69 73 20 61 20 74 65 73 74 0a 74
      000020 68 69 73 20 69 73 20 61 20 74 65 73 74 0a 74 68
      000040 69 73 20 69 73 20 61 20 74 65 73 74 0a 31 32 33
      000060 34 35 36 37 38 39 30 31 32 33 34 0a
      000074
      $ \ls -l test
      -rw------- 1 ikegami users 60 May 19 09:40 test

      Not only does the binary dump show only "0a" without any "0d", the file size also indicates that FreeBSD uses only one character for the newline: 60 bytes over 4 lines of 14 characters gives 60/4 - 14 = 1. From experience, the same applies to Linux, SunOS/Solaris and AIX. The Macs do something different, but I'm not sure what.
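
The size check above can be written out as a one-line calculation. This is only an illustrative sketch: `terminator_width` is a hypothetical name, and it assumes every line has the same number of visible characters, as in the test file.

```perl
use strict;
use warnings;

# Bytes used by each line terminator, given the total file size, the
# number of lines, and the visible characters per line (all lines equal).
sub terminator_width {
    my ($size, $lines, $chars_per_line) = @_;
    return $size / $lines - $chars_per_line;
}

print terminator_width(60, 4, 14), "\n";   # 60/4 - 14, prints 1
```

The same four lines stored with CR LF terminators would occupy 64 bytes, for which the function returns 2.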

        Here is the Mac (OS X) output:

        brucelowther:~ brucelowther$ cat > ~/tmp/test.txt
        this is a test
        this is a test
        this is a test
        12345678901234
        brucelowther:~ brucelowther$ hexdump -C ~/tmp/test.txt
        00000000 74 68 69 73 20 69 73 20 61 20 74 65 73 74 0a 74 |this is a test.t|
        00000010 68 69 73 20 69 73 20 61 20 74 65 73 74 0a 74 68 |his is a test.th|
        00000020 69 73 20 69 73 20 61 20 74 65 73 74 0a 31 32 33 |is is a test.123|
        00000030 34 35 36 37 38 39 30 31 32 33 34 0a |45678901234.|
        0000003c

        Unemployed!