Re^9: 80 characters long
by ww (Archbishop) on Sep 29, 2007 at 10:46 UTC
    Your latest question (--) seems to me to suggest that you ignored or didn't understand ikegami's earlier explanation that text-wrapping MEANS inserting a <CR> at the end of the (wrapped) line... and the observation that Text::Wrap breaks on word boundaries.

    Then, your comment,

    # strip out all Carriage Returns
    #
    # Process the file by replacing each tilde (~) with a carriage return / line feed and
    # save the file in the correct file format.
    matches neither what you say you want to do, nor what you are doing. There's no mechanism in your code to remove "carriage returns" ... and you can't very well replace each tilde with a "carriage return" and retain the tildes.

    What's more, the output I see appears to retain the tildes.

    So perhaps you should clarify your question, preferably after clearing up your understanding of what text-wrapping IS and what Text::Wrap DOES (for which perldoc Text::Wrap should help).
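
    To make the word-boundary point concrete, here is a minimal sketch, not anything from the original code. The width of 20 and the sample string are assumptions chosen so the effect is easy to see. As far as I know, Text::Wrap keeps every output line to at most $Text::Wrap::columns - 1 characters, breaks at whitespace where it can, and only splits a "word" when that word is too long for a line by itself (the default $Text::Wrap::huge = 'wrap' behaviour).

```perl
use strict;
use warnings;
use Text::Wrap qw(wrap);

# Assumption: a deliberately small width so the wrapping is visible.
$Text::Wrap::columns = 20;

# Hypothetical sample: a few short words, then one 30-character "word".
my $sample = "N1*PR*NATIONAL GOVERNMENT SERVICES " . ("X" x 30) . "\n";

# wrap() breaks at spaces where possible; the long run of X has no
# spaces, so it is the only place a "word" gets split.
my $wrapped = wrap('', '', $sample);
print $wrapped;

# Report the line lengths actually produced.
my @lengths = map { length } split /\n/, $wrapped;
print "line lengths: @lengths\n";
```

    Lines made of short words end wherever the last space fell, which is why they can come out well short of the limit.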

Re^9: 80 characters long
by furry_marmot (Pilgrim) on Sep 29, 2007 at 20:33 UTC
    Aixmike,

    I have been trying to figure out what you're doing and saying, and I am also stumped. Either you're not very good at telling us what you want, or you don't really know. Sorry if it's harsh, but I've been over this several times (slow day) and you're inconsistent at best.

    In your code snippet, you say:

    • read a line
    • strip out all CRs (I assume there's only one)
    • replace each tilde with a CR/LF
    • save each line in the correct format (not specified)

    Okay, but the text before this code says the tilde must remain, and that in Notepad there is a CR after each tilde. If you replace the tilde, it is by definition no longer there. So, since your description is contradictory and doesn't match the format you say you want, let's take a look at the sample text you provide and try to deduce what you want. (NOTE: perlmonks may wrap the lines for display.) You say this:

    ISA*00* *00* *28*9000000454 *ZZ*J00370 * +070926*0342*U*00401*000489661*0*P*:~ GS*HP*9000000454*J00370*20070926*03421826*489660*X*004010X091A1~ ST*835*489662~ BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*0000194917*1351840597*00454 + *01*021000021*DA*304677450*20070926~ TRN*1*EFT2545408*13
    should look like this:
    ISA*00* *00* *28*9000000454 *ZZ*J00370 * +070926*0 342*U*00401*000489661*0*P*:~ GS*HP*9000000454*J00370*20070926*03421826*489660* X*004010X091A1~ ST*835*489662~ BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*00001949 17*1351840597*00454 *01*021000021*DA*304677450*20070926~ TRN*1*EFT2545408*13

    First line (ISA*...) breaks cleanly at 79 characters, in the middle of a "word". The remainder of the line retains its tilde (as do all the lines). Fine. But the second line (GS*...) breaks after 49 characters — when the whole line was only 64 characters! Why? Was it because of the asterisk?

    GS*HP*9000000454*J00370*20070926*03421826*489660*X*004010X091A1~ ^

    The third line (ST*...) is unchanged, but the fourth line (BPR*...) breaks after 48 characters (total line length 109 chars). Again, in the middle of a "word". Um...why?

    BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*0000194917*1351840597*00454 + *01*021000021*DA*304677450*20070926~ ^

    So, not to put too fine a point on it, do you have any idea what you are talking about?

    -dd

      Let me take another stab at explaining this. By the way, I do not take your comments personally.

      I receive an electronic file generated by a third party. Below is the first line of that file. When pasting it here the carriage returns do not show up, but there is one following EACH tilde (~). The file contains a bit over 4300 lines similar to this one.

      The REQUIREMENTS are:

      1 - Remove all carriage returns (in VB this would be chr(10)).
      2 - Format the remaining text (including any special characters and tildes) into 80 characters per line. Breaks in the middle of words or other string combinations are permissible.
      3 - The process should be usable for other files specified on a command line, i.e. <perlscriptname> <inputfilename> <outputfilename>

      As you pointed out, the first line breaks at position 79 with $Text::Wrap::columns = 80, where the line length is X - 1, thus 79. The next line, as you again point out, breaks at position 49 when the entire thing is only 64 positions. (WHY?) THIS IS MY PROBLEM.

      I did not understand that the Text::Wrap module broke on whole words only. If this is the case and it uses spaces to determine a word break, why does the first line work as expected? It may be moot. If you see another approach to achieving my requirements, please let me know.

      Thanks

      The code immediately below this is where I am at:
      #!/usr/bin/perl -w
      use Text::Wrap qw(wrap);
      #
      $name = $ARGV[0];
      $name1 = $ARGV[1];
      #
      #
      open (INFILE, "<$name");
      open (OUTFILE, ">$name1");
      $Text::Wrap::columns = 80;
      while (my $row = <INFILE>) {
          print OUTFILE wrap('','',$row) ;
      }
      close (INFILE);
      close (OUTFILE);
      The Inputfile sample ->
      ISA*00* *00* *28*9000000454 *ZZ*J00370 * +070926*0342*U*00401*000489661*0*P*:~ GS*HP*9000000454*J00370*20070926*03421826*489660*X*004010X091A1~ ST*835*489662~ BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*0000194917*1351840597*00454 + *01*021000021*DA*304677450*20070926~ TRN*1*EFT2545408*1351840597*00454~ DTM*405*20070925~ N1*PR*NATIONAL GOVERNMENT SERVICES 00454~ N3*5151 CAMINO RUIZ~ N4*CAMARILLO*CA*930128645~ REF*2U*00454~ N1*PE*GOLDEN LIVINGCENTER FOLEY*FI*204120517~ N3*1000 FIANNA WAY~ N4*FORT SMITH*AR*72919~ REF*PQ*015032~ LX*210712~ TS3*015032*21*20071231*8*19760.22*9175.8*10584.42**8905.8**-2334****91 +75.8*2604*******8*8905.8~ TS2*11509.8**********2*41*41~ CLP*0073294416 070299289*19*856.96*769.14**MA*20725800001702*21*2~ CAS*CO*94*-284.18~ CAS*PR*2*372*0~ NM1*QC*1*BRADY*LEO*F***HN*140079398A~ NM1*82*2*GOLDEN LIVINGCENTER FOLEY*****XX*1538110192~ NM1*TT*2*WPS - TRICARE FOR LIFE*****PI*000060000~ MIA*0***1141.14*MA02**********3*****MA44~ REF*EA*0073294416~ DTM*050*20070913~ DTM*23

        First, you mean "Line Feed" (chr(10)), not "Carriage Return" (chr(13)).
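
        A small sketch of that distinction, assuming an ASCII platform and a made-up two-segment sample with CR/LF line endings (the sample string is hypothetical, not taken from the real file):

```perl
use strict;
use warnings;

# "\n" is Line Feed (LF, chr(10)); "\r" is Carriage Return (CR, chr(13)).
printf "LF = %d, CR = %d\n", ord("\n"), ord("\r");

# Hypothetical sample with Windows-style CR/LF after each tilde.
my $text = "ST*835*489662~\r\nDTM*405*20070925~\r\n";

# Delete both characters; deleting only "\n" would leave the CRs behind.
(my $joined = $text) =~ tr/\r\n//d;
print "$joined\n";
```

        Whether the CRs are actually present in the string depends on how the file was written and read, so deleting both is simply the cautious choice.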

        Second, providing the desired output along with a sample input is very useful, especially when you have problems explaining what you want.

        Finally, we covered on the first day, before you even started using Text::Wrap, that Text::Wrap breaks on word boundaries. Not only did I say "Text::Wrap will only break on a word boundary.", I provided code that broke on the 80th column unconditionally.

        Let's go back to that code and fix it for the then-unmentioned requirement to remove existing line breaks.

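        # Slurp all input (files named on the command line, or STDIN).
        my $text = do { local $/; <> };
        # Remove the existing line feeds.
        $text =~ tr/\n//d;
        # Break unconditionally after every 80th character.
        $text =~ s/(.{80})/$1\n/g;
        # Add a final line feed unless the text already ends with one.
        $text =~ s/(?<!\n)\z/\n/;
        print($text);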

        Usage:

        perl script.pl infile > outfile

        Or if you want to modify it in-place:

        perl -i.bak script.pl file

        Output for provided input:

        ISA*00* *00* *28*9000000454 *ZZ*J00370 * +070926*034 2*U*00401*000489661*0*P*:~GS*HP*9000000454*J00370*20070926*03421826*48 +9660*X*004 010X091A1~ST*835*489662~BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*000019 +4917*13518 40597*00454 *01*021000021*DA*304677450*20070926~TRN*1*EFT2545408*13 +51840597*0 0454~DTM*405*20070925~N1*PR*NATIONAL GOVERNMENT SERVICES 00454~N3*515 +1 CAMINO R UIZ~N4*CAMARILLO*CA*930128645~REF*2U*00454~N1*PE*GOLDEN LIVINGCENTER F +OLEY*FI*20 4120517~N3*1000 FIANNA WAY~N4*FORT SMITH*AR*72919~REF*PQ*015032~LX*210 +712~TS3*01 5032*21*20071231*8*19760.22*9175.8*10584.42**8905.8**-2334****9175.8*2 +604******* 8*8905.8~TS2*11509.8**********2*41*41~CLP*0073294416 070299289*19*856. +96*769.14* *MA*20725800001702*21*2~CAS*CO*94*-284.18~CAS*PR*2*372*0~NM1*QC*1*BRAD +Y*LEO*F*** HN*140079398A~NM1*82*2*GOLDEN LIVINGCENTER FOLEY*****XX*1538110192~NM1 +*TT*2*WPS - TRICARE FOR LIFE*****PI*000060000~MIA*0***1141.14*MA02**********3*** +**MA44~REF *EA*0073294416~DTM*050*20070913~DTM*23
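
        For the command-line form requested earlier in the thread (<perlscriptname> <inputfilename> <outputfilename>), here is a minimal sketch rather than a definitive implementation. It assumes the whole file fits in memory, and rewrap is a hypothetical helper name, not something from the thread. It strips CRs as well as LFs in case the file has CR/LF line endings, and it splits with unpack instead of the substitution used above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: drop all CR/LF characters, then re-break the
# text every $width characters, ending each piece with a line feed.
sub rewrap {
    my ($text, $width) = @_;
    $text = '' unless defined $text;
    $text =~ tr/\r\n//d;
    # "a" keeps embedded and trailing spaces ("A" would strip trailing ones).
    return join '', map { "$_\n" } unpack "(a$width)*", $text;
}

# Act as a script only when both file names are given.
if (@ARGV == 2) {
    my ($in_name, $out_name) = @ARGV;
    open my $in,  '<', $in_name  or die "Cannot read $in_name: $!";
    open my $out, '>', $out_name or die "Cannot write $out_name: $!";
    my $text = do { local $/; <$in> };   # slurp the whole input file
    print {$out} rewrap($text, 80);
    close $out or die "Cannot close $out_name: $!";
}
```

        Usage would be along the lines of: perl script.pl infile outfile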