
Aixmike,

I have been trying to figure out what you're doing and saying, and I am also stumped. Either you're not very good at telling us what you want, or you don't really know. Sorry if it's harsh, but I've been over this several times (slow day) and you're inconsistent at best.

In your code snippet, you say:

Okay, but the text before this code says the tilde must remain, and that in Notepad there is a CR after each tilde. If you replace the tilde, it is by definition no longer there. So, since your description is contradictory and doesn't match the format you say you want, let's take a look at the sample text you provide and try to deduce what you want. (NOTE: perlmonks may wrap the lines for display.) You say this:

ISA*00* *00* *28*9000000454 *ZZ*J00370 *070926*0342*U*00401*000489661*0*P*:~
GS*HP*9000000454*J00370*20070926*03421826*489660*X*004010X091A1~
ST*835*489662~
BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*0000194917*1351840597*00454 *01*021000021*DA*304677450*20070926~
TRN*1*EFT2545408*13

should look like this:

ISA*00* *00* *28*9000000454 *ZZ*J00370 *070926*0
342*U*00401*000489661*0*P*:~
GS*HP*9000000454*J00370*20070926*03421826*489660*
X*004010X091A1~
ST*835*489662~
BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*00001949
17*1351840597*00454 *01*021000021*DA*304677450*20070926~
TRN*1*EFT2545408*13

First line (ISA*...) breaks cleanly at 79 characters, in the middle of a "word". The remainder of the line retains its tilde (as do all the lines). Fine. But the second line (GS*...) breaks after 49 characters — when the whole line was only 64 characters! Why? Was it because of the asterisk?

GS*HP*9000000454*J00370*20070926*03421826*489660*X*004010X091A1~
                                                ^

The third line (ST*...) is unchanged, but the fourth line (BPR*...) breaks after 48 characters (total line length 109 chars). Again, in the middle of a "word". Um...why?

BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*0000194917*1351840597*00454 *01*021000021*DA*304677450*20070926~
                                               ^
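
The character counts quoted above can be double-checked with length and substr. This is only a quick sketch: the GS segment is copied from the sample, and the variable name is made up for illustration.

```perl
use strict;
use warnings;

# GS segment copied from the sample input (it contains no padding spaces).
my $gs = 'GS*HP*9000000454*J00370*20070926*03421826*489660*X*004010X091A1~';

print length($gs), "\n";          # length of the whole segment
print substr($gs, 0, 49), "\n";   # the text in front of the reported break
print substr($gs, 48, 1), "\n";   # the 49th character
```

The segment is 64 characters long and its 49th character is an asterisk, which matches the numbers in the description above.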

So, not to put too fine a point on it, do you have any idea what you are talking about?

-dd

Re^10: 80 characters long
by aixmike (Novice) on Oct 01, 2007 at 17:00 UTC
    Let me take another stab at explaining this. By the way, I do not take your comments personally.

    I receive an electronic file generated by a third party. Below is the first line of that file. When pasting it here the carriage returns do not show up, but there is one following EACH tilde (~). The file contains a bit over 4300 lines similar to this one.

    The REQUIREMENTS are:

    1 - Remove all carriage returns (in VB this would be chr(10)).
    2 - Format the remaining text (including any special characters and tildes) into 80 characters per line. Breaks in the middle of words or other string combinations are permissible.
    3 - This process should be usable for other files specified on a command line, i.e. <perlscriptname> <inputfilename> <outputfilename>

    As you pointed out, the first line breaks at position 79 with $Text::Wrap::columns = 80, where the column length is X - 1, thus 79. The next line, as you again point out, breaks at position 49 when the entire thing is only 64 positions (WHY?). THIS IS MY PROBLEM.

    I did not understand that the module Text::Wrap broke on whole words only. If this is the case and it uses spaces to determine a word break, why does the first line work as expected? It may be moot. If you see another approach to achieving my requirements, please let me know.

    --- Thanks

    The code immediately below this is where I am at:
    #!/usr/bin/perl -w
    use Text::Wrap qw(wrap);
    #
    $name = $ARGV[0];
    $name1 = $ARGV[1];
    #
    #
    open (INFILE, "<$name");
    open (OUTFILE, ">$name1");
    $Text::Wrap::columns = 80;
    while (my $row = <INFILE>) {
        print OUTFILE wrap('','',$row) ;
    }
    close (INFILE);
    close (OUTFILE);
    The Inputfile sample ->
    ISA*00* *00* *28*9000000454 *ZZ*J00370 *070926*0342*U*00401*000489661*0*P*:~
    GS*HP*9000000454*J00370*20070926*03421826*489660*X*004010X091A1~
    ST*835*489662~
    BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*0000194917*1351840597*00454 *01*021000021*DA*304677450*20070926~
    TRN*1*EFT2545408*1351840597*00454~
    DTM*405*20070925~
    N1*PR*NATIONAL GOVERNMENT SERVICES 00454~
    N3*5151 CAMINO RUIZ~
    N4*CAMARILLO*CA*930128645~
    REF*2U*00454~
    N1*PE*GOLDEN LIVINGCENTER FOLEY*FI*204120517~
    N3*1000 FIANNA WAY~
    N4*FORT SMITH*AR*72919~
    REF*PQ*015032~
    LX*210712~
    TS3*015032*21*20071231*8*19760.22*9175.8*10584.42**8905.8**-2334****9175.8*2604*******8*8905.8~
    TS2*11509.8**********2*41*41~
    CLP*0073294416 070299289*19*856.96*769.14**MA*20725800001702*21*2~
    CAS*CO*94*-284.18~
    CAS*PR*2*372*0~
    NM1*QC*1*BRADY*LEO*F***HN*140079398A~
    NM1*82*2*GOLDEN LIVINGCENTER FOLEY*****XX*1538110192~
    NM1*TT*2*WPS - TRICARE FOR LIFE*****PI*000060000~
    MIA*0***1141.14*MA02**********3*****MA44~
    REF*EA*0073294416~
    DTM*050*20070913~
    DTM*23

      First, you mean "Line Feed" (chr(10)), not "Carriage Return" (chr(13)).
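
      If the file has passed through a Windows editor, its lines may end in CR+LF rather than a bare LF. A minimal sketch of stripping both characters, assuming that is what is wanted; the sample string is made up from two segments of the posted input, and the variable names are illustrative only.

```perl
use strict;
use warnings;

# "\n" is Line Feed, chr(10); "\r" is Carriage Return, chr(13).
my $crlf_text = "ST*835*489662~\r\nLX*210712~\r\n";

# Delete both characters, so LF-only and CR+LF files give the same result.
(my $joined = $crlf_text) =~ tr/\r\n//d;

print "$joined\n";
```

      With only tr/\n//d, any carriage returns present would stay in the text and count toward the 80 characters.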

      Second, providing the desired output along with a sample input is very useful, especially when you have problems explaining what you want.

      Finally, we covered the fact that Text::Wrap breaks on word boundaries on the first day, before you even started using Text::Wrap. Not only did I say "Text::Wrap will only break on a word boundary.", I also provided code that broke on the 80th column unconditionally.

      Let's go back to that code and fix it for the then-unmentioned requirement to remove existing line breaks.

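      # Slurp all input (files named on the command line, or STDIN).
      my $text = do { local $/; <> };
      # Remove the existing line breaks (line feeds).
      $text =~ tr/\n//d;
      # Insert a line break after every 80 characters.
      $text =~ s/(.{80})/$1\n/g;
      # End with a newline unless one is already there.
      $text =~ s/(?<!\n)\z/\n/;
      print($text);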

      Usage:

      perl script.pl infile > outfile

      Or if you want to modify it in-place:

      perl -i.bak script.pl file

      Output for provided input:

      ISA*00* *00* *28*9000000454 *ZZ*J00370 *070926*034
      2*U*00401*000489661*0*P*:~GS*HP*9000000454*J00370*20070926*03421826*489660*X*004
      010X091A1~ST*835*489662~BPR*I*6519.36*C*ACH*CCP*01*075906003*DA*0000194917*13518
      40597*00454 *01*021000021*DA*304677450*20070926~TRN*1*EFT2545408*1351840597*0
      0454~DTM*405*20070925~N1*PR*NATIONAL GOVERNMENT SERVICES 00454~N3*5151 CAMINO R
      UIZ~N4*CAMARILLO*CA*930128645~REF*2U*00454~N1*PE*GOLDEN LIVINGCENTER FOLEY*FI*20
      4120517~N3*1000 FIANNA WAY~N4*FORT SMITH*AR*72919~REF*PQ*015032~LX*210712~TS3*01
      5032*21*20071231*8*19760.22*9175.8*10584.42**8905.8**-2334****9175.8*2604*******
      8*8905.8~TS2*11509.8**********2*41*41~CLP*0073294416 070299289*19*856.96*769.14*
      *MA*20725800001702*21*2~CAS*CO*94*-284.18~CAS*PR*2*372*0~NM1*QC*1*BRADY*LEO*F***
      HN*140079398A~NM1*82*2*GOLDEN LIVINGCENTER FOLEY*****XX*1538110192~NM1*TT*2*WPS 
      - TRICARE FOR LIFE*****PI*000060000~MIA*0***1141.14*MA02**********3*****MA44~REF
      *EA*0073294416~DTM*050*20070913~DTM*23
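
      If separate input and output file names on the command line are preferred, as in the stated requirement, the same idea could be wrapped up roughly like this. It is a sketch only: the helper name reflow and the variable names are made up for illustration, and it assumes the whole file fits in memory and that both CR and LF should be dropped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: remove CR/LF, then cut the text every $width characters.
sub reflow {
    my ($text, $width) = @_;
    $text =~ tr/\r\n//d;
    # Each match is one output line of at most $width characters.
    my @lines = $text =~ /(.{1,$width})/gs;
    return join('', map { "$_\n" } @lines);
}

# Run only when called as: perl script.pl infile outfile
if (@ARGV == 2) {
    my ($in_name, $out_name) = @ARGV;
    open(my $in,  '<', $in_name)  or die "Cannot read $in_name: $!";
    open(my $out, '>', $out_name) or die "Cannot write $out_name: $!";
    my $text = do { local $/; <$in> };   # slurp the whole file
    $text = '' unless defined $text;     # empty input file
    print {$out} reflow($text, 80);
    close($in);
    close($out) or die "Cannot close $out_name: $!";
}
```

      Unlike Text::Wrap, this never looks for word boundaries; every line except possibly the last is exactly 80 characters long.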