skyler has asked for the wisdom of the Perl Monks concerning the following question:

Hello, I'm working on a program that opens a file, reads its contents line by line, and replaces commas with pipes; it then grabs a string and formats it to put it into another file. I'm not sure whether I should be using an array to do the formatting of the $var7 field, or whether it can be done the way I have it now. Could you give me any suggestions on how to handle the string formatting?
#! perl -w
use strict;
#use File::Find;
#use File::Copy cp;

sub parse_file {
    my $string1;
    my $string2;
    my $string3;
    my $finstring;
    open(IN, '<c:/doclist.chr') or die "Couldn't open file, $!";
    open(OUT, '>c:/doclist.txt') or die "Couldn't open file, $!";
    ## print OUT join '|', split /,/ while <IN>;
    while(<IN>) {
        chomp; # Remove the newline
        my ($var1, $var2, $var3, $var4, $var5, $var6, $var7) = split /,/;
        $string1 = $var7;
        substr($string1, 24) = "H:\";
        $string2 = substr($string1,25,2);
        $string3 = substr($string1,-13);
        $finstring = $string1.$string2.$string3;
        print OUT "$var1,"|",$var2,"|",$var3,"|",$var4,"|",$var5,"|",$var6,"|",$finstring," \n";
    }
    exit;
}

Replies are listed 'Best First'.
Re: Open file & Strip comma delimiter
by graff (Chancellor) on Feb 16, 2003 at 23:41 UTC
    You didn't mention whether this program actually does what you want it to do, or does something else instead. Also, while it's clear you want to replace commas with vertical bars, it's not really clear what you're trying to do with the last field on each line of input.

    The replacement of commas with vertical bars can be done much more easily:

    while (<IN>) { tr/,/|/; print OUT; }
    (You can look up the tr/// operator in the perlop manpage.) Note that you are already assuming that all commas in each line are field delimiters (i.e. there are no commas that are included within quoted fields, or that are escaped in some way, and should not be interpreted as delimiters) -- also, you are taking for granted that every line will contain the same number of commas and fields. If you're really confident that your input data are consistent with these assumptions, then there's no problem. But be aware that many applications do not have this luxury.
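    To make that caveat concrete, here is a minimal sketch comparing the plain tr/// approach with a split that respects double-quoted fields. The sub names and the sample line are made up for illustration; parse_line comes from the core Text::ParseWords module.

```perl
use strict;
use warnings;
use Text::ParseWords qw(parse_line);   # core module

# Fast path: assumes every comma is a delimiter.
sub commas_to_pipes_simple {
    my ($line) = @_;
    (my $copy = $line) =~ tr/,/|/;     # work on a copy, leave the caller's string alone
    return $copy;
}

# Careful path: commas inside double-quoted fields are kept as data.
sub commas_to_pipes_quoted {
    my ($line) = @_;
    my @fields = parse_line(',', 0, $line);   # 0 = strip the quotes
    return join '|', map { defined $_ ? $_ : '' } @fields;
}

my $sample = 'a,"b,c",d';   # hypothetical input line
print commas_to_pipes_simple($sample), "\n";
print commas_to_pipes_quoted($sample), "\n";
```

    Note that parse_line also treats single quotes as quoting characters, so data containing apostrophes would need a different parser (Text::CSV from CPAN is one option).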

    As for this part of your code:

    $string1 = $var7;
    substr($string1, 24) = "H:\";
    $string2 = substr($string1,25,2);
    $string3 = substr($string1,-13);
    $finstring = $string1.$string2.$string3;
    This makes assumptions about the length of the last field on each input line. It also appears to always set $string2 to ":\" -- is that your intention? (If so, your code is just obfuscated.)
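    A small sketch of what those substr calls do, run against a made-up 40-character value (the real data was not posted, so the sample is purely an assumption). The string literal is written "H:\\" here so that the snippet compiles.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the last field: 24 filler chars plus a tail.
my $field = ('x' x 24) . 'ABCDEFGHIJKLMNOP';

my $string1 = $field;
substr($string1, 24) = "H:\\";           # everything from offset 24 on is replaced by H:\
my $string2 = substr($string1, 25, 2);   # two chars from offset 25 of the *modified* string
my $string3 = substr($string1, -13);     # last 13 chars of the *modified* string

print "string1: $string1\n";
print "string2: $string2\n";
print "string3: $string3\n";
```

    With a field of at least 24 characters, $string1 ends up 27 characters long, and both $string2 and $string3 are taken from that shortened string rather than from the original field.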

    No way for anyone here to know what $string3 would be, without seeing some of the data. There is almost certainly a better/easier way to accomplish what you're trying to do, but you'll need to post a reply with more info about the data.

Re: Open file & Strip comma delimiter
by tachyon (Chancellor) on Feb 16, 2003 at 23:33 UTC

    Oh, and here is a one-liner that will change commas to pipes, or in fact do any sort of search and replace (s/find/replace/g) on a whole file:

    perl -pi.bak -e 's/,/|/g' <file>

    This does an in-place edit of <file> and writes a copy of the original to <file>.bak. On Win32 you need to use " instead of ' around the code:

    perl -pi.bak -e "s/,/|/g" <file>
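    If you would rather do the same thing from inside a script, the following is a rough sketch of what -p and -i.bak amount to. The sub name is invented for this example; $^I and @ARGV are the standard variables behind the -i switch and the <> operator.

```perl
use strict;
use warnings;

# Replace every comma with a pipe in $file, keeping the original as "$file.bak".
sub commas_to_pipes_in_place {
    my ($file) = @_;
    local ($^I, @ARGV) = ('.bak', $file);   # same effect as -i.bak on the command line
    local $_;
    while (<>) {
        s/,/|/g;
        print;    # while $^I is set, print goes to the rewritten file
    }
    return;
}
```

    Calling commas_to_pipes_in_place('c:/doclist.chr') would rewrite that file and leave the original next to it with a .bak suffix.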

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

Re: Open file & Strip comma delimiter
by tachyon (Chancellor) on Feb 16, 2003 at 23:29 UTC

    You will need to show us some of the data you are parsing and what you hope to get out. Your use of substr makes very little sense. For a start, here is some code that compiles and uses an array for the fields, as well as the ever-faithful join. Note how we use $infile and $outfile so we can easily include them in our error messages.

    #!/usr/bin/perl -w
    use strict;

    my $infile  = 'c:/doclist.chr';
    my $outfile = 'c:/doclist.txt';
    open IN, "<$infile" or die "Couldn't open $infile, $!";
    open OUT,">$outfile" or die "Couldn't open $outfile, $!";
    while(<IN>) {
        chomp;
        my @fields = split /,/;
        my $string1 = $fields[6];
        do { warn "Empty field 7"; next } unless $string1;
        # this will keep the first 24 chars of string, throw out the rest and
        # then append H:\ to it.
        # need \\ here or it won't compile
        substr($string1, 24) = "H:\\";
        # this will return :\ as we get 2 chars LEN from pos 25
        my $string2 = substr($string1,25,2);
        # this will set $string3 to the last 13 chars of the string
        my $string3 = substr($string1,-13);
        my $finstring = $string1.$string2.$string3;
        my $out = join '|', @fields[0..5], $finstring, "\n";
        print OUT $out;
    }

    __DATA__
    substr EXPR,OFFSET,LEN,REPLACEMENT
    substr EXPR,OFFSET,LEN
    substr EXPR,OFFSET

    Extracts a substring out of EXPR and returns it. First character is at
    offset 0, or whatever you've set $[ to (but don't do that). If OFFSET is
    negative (or more precisely, less than $[), starts that far from the end
    of the string. If LEN is omitted, returns everything to the end of the
    string. If LEN is negative, leaves that many characters off the end of
    the string.

    If you specify a substring that is partly outside the string, the part
    within the string is returned. If the substring is totally outside the
    string a warning is produced.

    You can use the substr() function as an lvalue, in which case EXPR must
    itself be an lvalue. If you assign something shorter than LEN, the string
    will shrink, and if you assign something longer than LEN, the string will
    grow to accommodate it. To keep the string the same length you may need
    to pad or chop your value using sprintf().

    An alternative to using substr() as an lvalue is to specify the
    replacement string as the 4th argument. This allows you to replace parts
    of the EXPR and return what was there before in one operation, just as
    you can with splice().
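    As a side note, the same split-and-join loop can be written with lexical filehandles and three-argument open, which is the style usually recommended in more recent Perl. This is only a sketch: the sub name is invented, and in-memory handles stand in for the real files so the snippet runs on its own.

```perl
use strict;
use warnings;

# Read comma-separated lines from $in and write pipe-separated lines to $out.
sub convert_handle {
    my ($in, $out) = @_;
    while (my $line = <$in>) {
        chomp $line;
        my @fields = split /,/, $line, -1;   # -1 keeps trailing empty fields
        print {$out} join('|', @fields), "\n";
    }
    return;
}

# Hypothetical data; with real files you would open '<', $infile and '>', $outfile.
my $input  = "1,a,b\n2,c,\n";
my $output = '';
open my $in,  '<', \$input  or die "Couldn't read input: $!";
open my $out, '>', \$output or die "Couldn't write output: $!";
convert_handle($in, $out);
close $out;
print $output;
```

    Unlike the code above, this version puts the newline outside the join, so no extra '|' is added at the end of each line.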

    cheers

    tachyon


      Tachyon,
      463694|BROOKS JR|JAMES|Arnold Markoe|1/17/03|MD Wkly Note|H:\17\00004DA5.0057\\00004DA5.005|
      479309|WALDMAN|MELVI|Arnold Markoe|1/17/03|MD Wkly Note|H:\16\00004DE0.0056\\00004DE0.005|
      Why is it printing the extra stuff, for example "\\00004DA5.005|"? I only need $fields[0..5]|H:\<subdir>\<file.ext>. I'd appreciate your help. I modified the code that you sent me as follows:
      #! perl -w
      use strict;

      my $infile  = 'c:/doclist.chr';
      my $outfile = 'c:/doclist.txt';
      #use File::Find;
      #use File::Copy cp;
      open IN, "<$infile" or die "Couldn't open $infile, $!";
      open OUT,">$outfile" or die "Couldn't open $outfile, $!";
      ## print OUT join '|', split /,/ while <IN>;
      while(<IN>) {
          chomp;
          my @fields = split /,/;
          my $string1 = $fields[6];
          do { warn "Empty field 7"; next } unless $string1;
          # this will keep the first 24 chars of string, throw out the rest and
          # then append H:\ to it.
          # need \\ here or it won't compile
          my $string_header = $string1;
          substr($string_header,0,24) = 'H:';
          # this will return :\ as we get 2 chars LEN from pos 25
          my $string2 = substr($string1,26,2);
          # this will set $string3 to the last 13 chars of the string
          my $string3 = substr($string1,-13);
          my $finstring = $string_header.$string2.$string3;
          my $out = join '|', @fields[0..5], $finstring, "\n";
          print OUT $out;
      }
      exit;

        As I have tried to explain to you, your use of substr is not going to do what you want. However, we need to see:

        1 the exact input data (comma separated)

        2 the exact output data (pipe delim) that you want to generate from 1

        If you are just trying to get rid of part of the file path, you need to use an s/// regex, or split $fields[6] on the '\', select the parts you want, join these bits back together with '\' and then join the whole thing with '|'.

        while(<IN>) {
            chomp;
            my @fields = split /,/;
            my $path_str = $fields[6];
            do { warn "Empty field 7"; next } unless $path_str;
            # split on a literal backslash (the pattern must be written /\\/)
            my @path = split /\\/, $path_str;
            # remove null fields if you really have \\ in there
            @path = grep { $_ } @path;
            # assuming you want to remove the 17 and 16 dirs in the example
            my $fixed_path = join "\\", @path[0,2,3];
            my $out = join '|', @fields[0..5], $fixed_path, "\n";
            print OUT $out;
        }
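        For comparison, the s/// route mentioned above could look something like this. It is a sketch under the same assumption as the split version (drop the first directory after the drive, collapse doubled backslashes); fix_path is a made-up name, and the sample value is copied from the output posted earlier in the thread.

```perl
use strict;
use warnings;

# fix_path is a hypothetical helper, not something from the original code.
sub fix_path {
    my ($path) = @_;
    $path =~ s/\\{2,}/\\/g;                  # collapse runs of backslashes into one
    $path =~ s/^([A-Za-z]:\\)[^\\]+\\/$1/;   # drop the first directory after the drive
    return $path;
}

# Single-quoted, so each \\ below is one literal backslash.
my $sample = 'H:\\17\\00004DA5.0057\\\\00004DA5.005';
print fix_path($sample), "\n";
```

        Whether the first directory is really the part to drop depends on the data, which is why seeing the exact input and the exact desired output matters.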

        cheers

        tachyon


Re: Open file & Strip comma delimiter
by dws (Chancellor) on Feb 16, 2003 at 23:21 UTC
    print OUT "$var1,"|",$var2,"|",$var3,"|", ...
              ^
    Remove that first double-quote, or you'll end up bitwise OR'ing a bunch of strings together.
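    To see the effect, here is a minimal sketch with two made-up values. Without the bitwise feature enabled, Perl's | operator works byte by byte when both operands are strings, so the result looks nothing like the intended delimiter-separated output.

```perl
use strict;
use warnings;

my ($var1, $var2) = ('a', 'b');   # hypothetical field values

# What the stray quote produces: the strings "a," and ",b" OR'ed together.
my $ored = "$var1," | ",$var2";

# What was intended: the fields separated by a pipe character.
my $joined = join '|', $var1, $var2;

print "OR'ed:  $ored\n";
print "joined: $joined\n";
```

    Here the OR'ed result is "mn" (0x61|0x2C and 0x2C|0x62), while join gives "a|b".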