perl-diddler has asked for the wisdom of the Perl Monks concerning the following question:

I isolated the problem from my previous post to this behavior, which I think is a bit problematic -- i.e. split is doing more than it should, IMO. Offhand, not seeing the forest for the trees, I'm not sure how to get split to leave the strings alone and not treat them as numbers.

The following demonstrates the problem:

    > perl -E 'use P;
      my $fmt="prod\t006\t2.13\tx86_64\trpm";
      P $fmt;
      my $str=P $fmt;
      P "str=\"$str\"";
      my @flds=split /\t/,"$str";
      foreach (@flds) { Pe "%s \x83", "$_"; }
      P " ";
    '
    prod 006 2.13 x86_64 rpm
    str="prod 006 2.13 x86_64 rpm"
    prod 6 2.13 x86_64 rpm
If I put the params in an array, the 006 collapses right off the bat:
    perl -E 'use P;
      my $fmt=["%s\t%s\t%s\t%s\trpm","prod", "006", "2.13", "x86_64", "rpm"];
      P "%s", $fmt;   # already converted
    '
    ["%s %s %s %s rpm","prod",6,2.13,"x86_64","rpm"]
Using qw, the tabs don't expand (so the split doesn't work anyway), but the 006 still collapses to 6:
    > perl -E 'use P;
      my $fmt=[qw(%s\t%s\t%s\t%s\t%s prod 006 2.13 x86_64 rpm)];
      P "%s", $fmt;   # already converted before fmt
      my $str=P @$fmt;
      P "str=\"$str\"";
      my @flds=split "\t","$str";
      Pe "(%s) \x83", "$_" foreach @flds;
      P " ";'
    ["%s\t%s\t%s\t%s\t%s","prod",6,2.13,"x86_64","rpm"]
    str="prod\t6\t2.13\tx86_64\trpm"
    (prod\t6\t2.13\tx86_64\trpm)
Quirky combo to hack it:
    > perl -E 'use P;
      my $fmt=["%s\t%s\t%s\t%s\t%s", qw( prod "006" "2.13" x86_64 rpm)];
      P "%s", $fmt;   # already converted before fmt
      my $str=P @$fmt;
      P "str=\"$str\"";
      my @flds=split "\t","$str";
      Pe "(%s) \x83", "$_" foreach @flds;
      P " ";'
    ["%s %s %s %s %s","prod",""006"",""2.13"","x86_64","rpm"]
    str="prod "006" "2.13" x86_64 rpm"
    (prod) ("006") ("2.13") (x86_64) (rpm)
But running that in the prog I ended up with:
    Recycling 1 duplicates...(cannot stat, already deleted?) path=/Share/suse/distribution/12.1/repo/oss/suse/test2/smugbatch-"006"-"2.1.3".x86_64.rpm, dev=(undef)
!!!! Guess I'll keep poking at it... there's gotta be a way... but dang if this isn't harder than it should be...

p.s. maybe I'll just toss an eval on that final string and forget figuring out how to quote it...urg...

Replies are listed 'Best First'.
Re: is split suppose to drop 0's from strings?
by roboticus (Chancellor) on Mar 05, 2013 at 23:33 UTC

    perl-diddler:

    I don't see a problem with split. Perhaps P is the problem?

    $ cat t.pl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Data::Dumper;

    my $fmt="prod\t006\t2.13\tx86_64\trpm";
    my @flds = split /\t/,$fmt;
    print "<",join("|",@flds),">\n";

    $ perl t.pl
    <prod|006|2.13|x86_64|rpm>

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.
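To make roboticus's point concrete, here is a minimal sketch (plain Perl, no P involved) showing that the fields split returns are ordinary strings. Using one in numeric context yields a number but does not rewrite the stored text. Only a numeric format such as %d drops the leading zeros. The variable names are just for illustration.

```perl
use strict;
use warnings;

my $line = "prod\t006\t2.13\tx86_64\trpm";
my @flds = split /\t/, $line;

# split returns plain strings; "%s" and interpolation keep them as-is.
my $as_string = sprintf "%s", $flds[1];

# Numeric context yields a number, but the stored string is untouched.
my $as_number = $flds[1] + 0;
my $still_str = "$flds[1]";

# A numeric format like %d numifies its argument before printing.
my $as_d = sprintf "%d", $flds[1];

print "$as_string $as_number $still_str $as_d\n";   # 006 6 006 6
```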

      P isn't the prob in the actual program, though it might be doing a conversion on the single datum in my examples.

      The problem was the line was getting sent across a pipe in line oriented fashion and being reread in.

      Had an rpm query:

      my @rpmcmd = qw{rpm --queryformat=%{N}\t"%{V}"\t"%{R}"\t%7{ARCH}\t%7{SOURCERPM}\t%9{NOSOURCE}\n -qp };
      That was printed in a child and read in a parent with:
      while (<$pfrs>) { $package_p->split_n_cmp_vers($_);
      That split the fields to do compares. It was in splitting them into separate vars and trying to use them that they got converted from string->num before I could isolate it. If I put it back together immediately into a string and do nothing to the parts, like you have done, that obviously works, but I was trying to compare the parts separately. I guess P was bad for my example as it introduces its own example of the bug:
      perl -e '#!/usr/bin/perl
      use strict; use warnings; use P;
      use Data::Dumper;
      my $fmt="prod\t006\t2.13\tx86_64\trpm";
      my @flds = split /\t/,$fmt;
      printf "%s, %s", $flds[1],"<".join("|",@flds).">\n";
      P "%s, %s", $flds[1],"<".join("|",@flds).">\n";
      '
      006, <prod|006|2.13|x86_64|rpm>
      6, <prod|006|2.13|x86_64|rpm>
      But since the original code was trying to compare the parts, it ended up with a similar problem -- might be able to figure a better way around it than what I ended up with... since looking at the printf -- it doesn't force a convert -- not sure why it doesn't and P does, but if I figure that out, I might be able to stop it from happening in the compare code.
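For the compare step, a rough sketch of one way to compare parts numerically without losing the original text. The helper name cmp_part is hypothetical, and this is not rpm's real version-comparison algorithm, only an illustration that a numeric comparison leaves the operands' strings intact.

```perl
use strict;
use warnings;

# Hypothetical helper: compare two version components numerically when both
# are all digits, otherwise fall back to a plain string comparison.
sub cmp_part {
    my ($x, $y) = @_;
    return $x <=> $y if $x =~ /^\d+$/ && $y =~ /^\d+$/;
    return $x cmp $y;
}

my ($v1, $v2) = ("006", "6");
my $result = cmp_part($v1, $v2);

# The comparison treats them as equal numbers, yet $v1 still prints as 006.
print "cmp=$result v1=$v1 v2=$v2\n";   # cmp=0 v1=006 v2=6
```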

      P wasn't used in the main dataflow of the other prog, as it was performance sensitive -- it was used in error messages and debug messages.

      My ugly solution...I kept the quotes in until it converted it from individual parts back into a string:

      sub rec2rpm_name () {
          my $p = shift;
          my ($n, $v, $r, $arch) = ($p->N, $p->V, $p->R, $p->arch);
          chomp $arch;
          my $q = P "%s-%s-%s.%s.rpm", $n, $v, $r, $arch;
          $q=~s/-"(\S+)"-"(\S+)"/-$1-$2/;
          $q;
      }
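A possible alternative to carrying the quotes around, sketched under the assumption that the fields arrive as plain unquoted strings and that P's formatting was the only thing rewriting them: the core sprintf with %s leaves "006" alone. The function rpm_name is a hypothetical stand-in that takes the fields directly rather than through the package object.

```perl
use strict;
use warnings;

# Hypothetical stand-in for rec2rpm_name: takes the fields as arguments
# instead of reading them from the package object in the original program.
sub rpm_name {
    my ($n, $v, $r, $arch) = @_;
    chomp $arch;
    return sprintf "%s-%s-%s.%s.rpm", $n, $v, $r, $arch;
}

my $name = rpm_name("smugbatch", "006", "2.1.3", "x86_64\n");
print "$name\n";   # smugbatch-006-2.1.3.x86_64.rpm
```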
      It's a utility I rarely run...but I tried it yesterday, as I wanted to measure the timing difference between using 1 proc vs. multiple (default was 75%#CPU) and noticed the problem... previous 'testbase' had no version #'s with leading zeros ... yippee! Now it does...better yippee...

      (FWIW, perf diffs of 1 vs. 9 procs:
      hot-cache: the 9 procs saved 42% over the 1
      cold-cache: the 9 procs saved 55% over the single cpu case)...

      Now to go see if I can fix P w/strings w/o turning it inside out. ;-<
      (hey, it can print to a string from an array or to output w/o batting an eyelash, and no extra LF's, unlike printf/sprintf).

        The problem in 'P' was introduced a few days ago in 1.0.9 when I tried to pretty up some formatting. I was mainly trying to curtail the verbose output of floating point numbers in structures
        perl -e 'my @a=(1/3, 1/4, 1/5, 1/6, 1/7); print "[".join(", ",@a)."]\n"'
        [0.333333333333333, 0.25, 0.2, 0.166666666666667, 0.142857142857143]
        ----------- vs. --------------
        > perl -e 'use P;my @a=(1/3, 1/4, 1/5, 1/6, 1/7); P "%s", \@a;'
        [0.33, 0.25, 0.20, 0.17, 0.14]
        Not as numerically accurate, but when doing dev/debug, easier on the eyes. I hoped it wasn't due to undef detection, but it had nothing to do w/that:
        > perl -e 'use P;my @a=(1/3, 1/4, $x, 1/6, 1/7); P "%s", \@a;'
        [0.33, 0.25, (undef), 0.17, 0.14]
        As usual, thanks for pointing me in the right direction!
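One way such a prettifier could avoid touching strings like "006" is a round-trip check: shorten a value only if its text is exactly what Perl would print for the same number. The sketch below is an assumption about how this might be done and is not P's actual code. The name pretty is hypothetical, and looks_like_number comes from the core Scalar::Util module.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical formatter, not P's actual code: shorten a value to two
# decimals only when it is a non-integer whose text survives a numeric
# round trip. "006" fails the round trip and is returned unchanged.
sub pretty {
    my ($v) = @_;
    return '(undef)' unless defined $v;
    return $v unless looks_like_number($v);
    return $v unless $v eq 0 + $v;   # extra zeros etc.: keep the text
    return $v if $v == int $v;       # integers need no shortening
    return sprintf "%.2f", $v;
}

my @out = map { pretty($_) } (1/3, 0.25, "006", "2.13", "x86_64", undef);
print "[", join(", ", @out), "]\n";   # [0.33, 0.25, 006, 2.13, x86_64, (undef)]
```

The trade-off is that a string which happens to be in canonical numeric form, such as "0.333333333333333", is still shortened, since at this level it is indistinguishable from a computed float.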
Re: is split suppose to drop 0's from strings?
by kielstirling (Scribe) on Mar 05, 2013 at 23:00 UTC
    I suspect this line
    Pe "%s \x83", "$_";
    Is this some kind of formatting using printf conventions? If so, then the %s is the problem; try
    %03d
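For comparison, a small sketch of what %03d and %s do with the core sprintf (P may behave differently, as the thread shows). %03d pads to a fixed width of three digits, so it reproduces the original text only when that text was exactly three digits wide, while %s returns the string untouched.

```perl
use strict;
use warnings;

my @inputs = ("006", "6", "0006");

# %03d numifies, then zero-pads to width 3; %s leaves the string alone.
my @padded = map { sprintf "%03d", $_ } @inputs;
my @kept   = map { sprintf "%s",   $_ } @inputs;

print "padded: @padded\n";   # padded: 006 006 006
print "kept:   @kept\n";     # kept:   006 6 0006
```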
Re: is split suppose to drop 0's from strings?
by Anonymous Monk on Mar 06, 2013 at 07:42 UTC