in reply to File size discrepancy

How did you transfer the file to your local machine?

What do other local tools report for the file size?

dir 000000620109000009/0000006201-09-000009-index.htm

Most likely, the file has been saved with Windows newlines, that is \r\n, whereas the SEC posts the file size using "Unix newlines", that is, \n. In that case the local copy is one byte larger for every line in the file.
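
One way to test that guess is to count the \r\n pairs in the local copy and see whether removing one byte per pair gives the size the SEC reports. The following is only a minimal sketch: the sub names newline_report and slurp_raw are made up for this example, and it assumes the file fits comfortably in memory.

```perl
use strict;
use warnings;

# Given the raw contents of a file, return its size in bytes, the number
# of CRLF pairs, and the size it would have with Unix newlines.
sub newline_report {
    my ($data) = @_;
    my $size = length $data;
    my $crlf = () = $data =~ /\r\n/g;    # count matches in list context
    return ($size, $crlf, $size - $crlf);
}

# Read a whole file in raw mode so Perl does no newline translation
# (this matters on Windows, where text mode would hide the \r).
sub slurp_raw {
    my ($path) = @_;
    open my $fh, '<:raw', $path or die "Cannot open $path: $!";
    local $/;                            # slurp mode
    my $data = <$fh>;
    close $fh;
    return $data;
}

# Usage: perl crlf_check.pl somefile
if (@ARGV) {
    my ($size, $crlf, $unix_size) = newline_report(slurp_raw($ARGV[0]));
    print "local size:              $size bytes\n";
    print "CRLF pairs:              $crlf\n";
    print "size with Unix newlines: $unix_size bytes\n";
}
```

If the last number matches the size posted by the SEC, the discrepancy is down to newline conversion and nothing else.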

Re^2: File size discrepancy
by wrkrbeee (Scribe) on Nov 25, 2014 at 17:54 UTC
    Thanks for the insight! I used a simple FTP to download the file. The newlines idea makes sense. Any thoughts on downloading files with Unix newlines rather than Windows newlines? I am grateful for your help!! Rick
      FTP has two transfer modes: ASCII and binary. Binary mode transfers the file exactly as it is, while ASCII (text) mode converts end-of-line characters between Unix and DOS conventions. Having said that, if you transferred the file in ASCII mode and want to get it back to Unix format, you don't need to download it again. You can just use this command:
      perl -pi.bak -e 's/\r//g' file
      to convert it back to Unix format. Or your system may very well have a dos2unix utility that does just the same. On the AIX system where I do most of my work, dos2unix did not exist, so I created an alias for it using the Perl one-liner above. On most Linux systems, however, I would think that dos2unix should exist.
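
If you script the download in Perl, the Net::FTP module lets you ask for binary mode explicitly, which should give you the file byte for byte as the server stores it. This is only a rough sketch under some assumptions: the sub name fetch_binary is made up for the example, the server is assumed to accept anonymous logins, and the host and paths are whatever you pass on the command line.

```perl
use strict;
use warnings;
use Net::FTP;

# Download one file in binary mode so that no end-of-line conversion
# takes place. Returns the local file name on success, dies otherwise.
sub fetch_binary {
    my ($host, $remote, $local) = @_;

    my $ftp = Net::FTP->new($host, Timeout => 30)
        or die "Cannot connect to $host: $@";

    # Assumption: anonymous login is allowed on this server.
    $ftp->login('anonymous', '-anonymous@')
        or die "Login failed: " . $ftp->message;

    # Switch from ASCII to binary (image) transfer mode.
    $ftp->binary
        or die "Cannot set binary mode: " . $ftp->message;

    my $saved = $ftp->get($remote, $local)
        or die "Download failed: " . $ftp->message;

    $ftp->quit;
    return $saved;
}

# Usage: perl fetch.pl HOST REMOTE_PATH LOCAL_PATH
if (@ARGV == 3) {
    my $saved = fetch_binary(@ARGV);
    printf "saved %s (%d bytes)\n", $saved, -s $saved;
}
```

With a command-line ftp client, typing bin (or binary) before get has the same effect.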
        Thank you so much! That is very helpful! Really appreciate your insight!