Kudos.

I didn't think it was possible; the trick is to shell out, handing the opening and writing over to a shorter syntax.

This might be considered dirty in a real Perl script but should be acceptable in a one-liner. And interestingly it should also work on Windows.

The point is that Perl has no means to print_and_open_if_necessary().

So the next step is to ask myself if the semantics could be cleanly replicated in Perl...

IMHO a tied hash %FH would be most elegant

print { $FH{">>$name"} } $value;
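
A minimal sketch of how such a tied hash might look. The package name OpenOnDemand and the key format (">name" or ">>name") are assumptions made for illustration, not an existing CPAN module. Note that Perl needs the block form of print when the filehandle is a hash element.

```perl
use strict;
use warnings;

# Hypothetical package: a hash whose values are filehandles opened on first access.
package OpenOnDemand;

sub TIEHASH { my ($class) = @_; return bless { fh => {} }, $class }

# The key carries the open mode and the file name, e.g. ">out.txt" or ">>out.txt".
sub FETCH {
    my ($self, $key) = @_;
    return $self->{fh}{$key} //= do {
        my ($mode, $name) = $key =~ /\A(>>?)\s*(.+)\z/s
            or die "Bad key '$key': expected '>name' or '>>name'\n";
        open my $fh, $mode, $name or die "Cannot open '$name': $!\n";
        $fh;    # cached, so later prints reuse the same handle
    };
}

sub EXISTS { exists $_[0]{fh}{ $_[1] } }

# Deleting a key closes the handle behind it.
sub DELETE {
    my ($self, $key) = @_;
    my $fh = delete $self->{fh}{$key} or return;
    close $fh or die "Cannot close handle for '$key': $!\n";
    return;
}

package main;

use File::Temp qw(tempdir);

tie my %FH, 'OpenOnDemand';

my $path = tempdir(CLEANUP => 1) . "/name.txt";

print { $FH{">$path"} } "Nick\n";      # opens the file on first use
print { $FH{">$path"} } "George\n";    # reuses the cached handle
delete $FH{">$path"};                  # closes it again
```

The cache is keyed on the whole string, so ">x" and ">>x" would be two separate handles on the same file; a real implementation would probably want to normalise that.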

I haven't searched CPAN for similar solutions yet, because I'm not sure what to search for.

Comments welcome ...

Cheers Rolf
(addicted to the Perl Programming Language :)
Wikisyntax for the Monastery

Re^3: Split tab-separated file into separate files, based on column name (open on demand)
by jcb (Parson) on Aug 27, 2020 at 03:57 UTC
    Point is Perl has no means to print_and_open_if_necessary()

    Sometimes Perl is not the best tool for the job. Awk does have that feature and here is an Awk program that does what our questioner asks:

    #!/usr/bin/awk -f
    BEGIN { FS = "\t" }
    FNR == 1 {
        split("", Fields) # clear fields array
        for (i = 1; i <= NF; i++)
            Fields[i] = $i
        next
    }
    {
        for (i = 1; i <= NF; i++)
            print $i > Fields[i]
    }

    Save it in a file and mark it executable; tested with GNU Awk. Feed it input on stdin or list the files you want it to read on the command line.

    If you want to add prefixes or suffixes to the output file names, add them to the print statement, like so: print $i > ("out."Fields[i]".txt"); the parentheses ensure that the invisible concatenation operator will be parsed correctly.

      Since this is currently the top node of the past 24 hours, I'll comment.

      Sometimes Perl is not the best tool for the job. Awk ...

      I strongly disagree. Perl is a replacement for awk and sed and can do everything they can, and much, much more. tobyink pointed out IO::All - and while this module may not be in the core, note that CPAN is one of Perl's greatest strengths.

      If you're familiar enough with awk to whip up this script that's fine, and it's certainly interesting to see how it's done in other languages (though this isn't AwkMonks), but consider that the OP may already not be very familiar with Perl, and throwing yet another new language into the mix is unlikely to be the most efficient approach in the long run.

        Perl is a replacement for awk and sed and can do everything they can, and much, much more.

        Yes, but sometimes the older tools are better fits for the problem at hand. Some time ago I suggested to another questioner to either use sed in his shell script or rewrite the entire script in Perl because sed could do the work in less time than Perl needs for startup/shutdown overhead. Perl is more flexible and powerful, but that power does come at a cost and this question happens to fit Awk's domain almost exactly.

        Awk's greatest strength and greatest limitation is the implicit outer loop. On one hand, that feature allows Awk programs to be very efficient, but on the other hand, it limits Awk to processing input text streams.

        (though this isn't AwkMonks)

        I firmly believe that every Perl programmer should learn Awk because learning Awk will make you a better Perl programmer.

        I think his point was that Awk has an open_on_demand.

        And if you know Perl, well it's not very difficult to decipher this Awk script ...

        ( ... oh that's where Larry got these "ideas" from ;-)

        My concern is that it's neither easier nor shorter than Perl.

        For comparison, here is a script version of my one-liner, even without taking advantage of command-line switches.

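        $\="\n";
        while (<DATA>) {
            @F = split;
            unless (@FH) {
                # header line: open one output file per column name
                open $FH[@FH], ">", "$_.txt" for @F;
            } else {
                # data line: print each field to its column's file
                print $_ shift @F for @FH;
            }
        }
        __DATA__
        id name position
        1 Nick boss
        2 George CEO
        3 Christina CTO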

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

      Hi jcb,

      excellent post, thank you! I did write a Perl script after all, but I suspect that your way is much faster!
      Thanks to all that offered their advice, much appreciated :)
      > Sometimes Perl is not the best tool for the job

      Well, the OP asked for a one-liner, but you have now provided a script.

      I have trouble seeing why a Perl script would be worse than an Awk script. (?)

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

Re^3: Split tab-separated file into separate files, based on column name (open on demand)
by Eily (Monsignor) on Aug 26, 2020 at 15:26 UTC

    This might be considered dirty in a real Perl script but should be acceptable in a one-liner.
    100% agree with that sentence (which says a lot, since the sentence is "this might be").

    You could use operator overloading to replicate that feature. "Value" > file("path"); or "Value" >> file("path") where file returns an object that overloads > and >>
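
A rough sketch of that overloading idea, under stated assumptions: the class name StreamFile and the helper file() are made up for illustration. Perl warns about "useless use ... in void context" for such statements, so the demo switches that warning off locally.

```perl
use strict;
use warnings;

# Hypothetical class: an object that writes its other operand to a file.
package StreamFile;

use overload
    '>'  => sub { $_[0]->_write('>',  $_[1]) },    # truncate, then write
    '>>' => sub { $_[0]->_write('>>', $_[1]) };    # append

sub new { my ($class, $path) = @_; return bless { path => $path }, $class }

# The handler always receives the object first, even when it was the
# right-hand operand, so $_[1] above is the value to be written.
sub _write {
    my ($self, $mode, $value) = @_;
    open my $fh, $mode, $self->{path}
        or die "Cannot open '$self->{path}': $!\n";
    print {$fh} $value;
    close $fh or die "Cannot close '$self->{path}': $!\n";
    return $self;
}

package main;

use File::Temp qw(tempdir);

sub file { return StreamFile->new(@_) }

my $path = tempdir(CLEANUP => 1) . "/overload_demo.txt";

{
    no warnings 'void';            # the comparison result is deliberately discarded
    "first\n"  >  file($path);
    "second\n" >> file($path);
}
```

This sketch ignores the "swapped" flag, so file($path) > "x" would write as well. It also reopens the file on every write, which keeps the example short but would be wasteful in a loop.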

    Or you could do something closer to C++:

    fstream("path") << 120 << " in hexadecimal is " << ctrl::hex << 120;
    fstream("logs", "a") << ctrl::autoline << "I'm adding this line to the logs" << "and also this line";

        Great module! So with IO::All this could yield:

        $ perl -MIO::All -F'/\t/' -lnae '@files = @F, next if 1 .. 1; @f = @files; $_ >> io(shift @f) for @F' <<EOF
        id name position
        1 Nick boss
        2 George CEO
        3 Christina CTO
        EOF
        $ paste id name position
        1 Nick boss
        2 George CEO
        3 Christina CTO

        EDIT Removed do statement.

        Greetings,
        -jo

        $gryYup$d0ylprbpriprrYpkJl2xyl~rzg??P~5lp2hyl0p$
      > You could use operator overloading

      I don't think it's a good idea to overload two very different operators like > "greater-than" and >> "shift".

      That's begging for inconsistency problems (syntax, precedence, you name it ...).

      Cheers Rolf
      (addicted to the Perl Programming Language :)
      Wikisyntax for the Monastery

      So it got me curious, and I did a quick-and-dirty test implementation of scalar > file() and fstream() << scalar. But I get the "useless use of ... in void context" warnings.

      So my tangential question: Is there a way to "export" the no warnings 'void' from inside the streaming package, rather than requiring it in ::main? It would be best if it could just turn off the warnings for the streaming objects, but leave the warnings on for non-overloaded uses of comparison and bitshift. I tried putting the no-warnings inside the overloaded functions, to try to keep the scope limited, but that's not the right place to prevent the warning. (Yes, I understand this isn't necessarily good practice, or "nice" to the external user. This is just for my own curiosity, and not something I'd put in practical code.)

        That's another good example of why overloading is doomed to fail if the operator isn't semantically compatible.

        Regarding your question:

        Either you can try to manipulate the __WARN__ handler in %SIG,

        or you can try exporting warnings inside import(), as demonstrated in Modern::Perl.

        (And I agree about the production code part. ;)
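
A small sketch of the import() route, assuming a made-up package name Quiet::Void. Because import() runs while the caller's code is still being compiled, calling warnings->unimport there affects the caller's lexical scope. It silences every 'void' warning in that scope, though, not only those involving the overloaded objects.

```perl
use strict;
use warnings;

# Hypothetical package: switches off 'void' warnings in whatever scope imports it.
package Quiet::Void;

sub import { warnings->unimport('void') }

package main;

# Collect warnings instead of printing them, so the effect can be counted.
my @seen;
local $SIG{__WARN__} = sub { push @seen, @_ };

# Plain scope: the discarded comparison triggers a compile-time warning.
eval q{ my $x = 1; $x > 2; 1 } or die $@;
my $before = @seen;

# BEGIN { ...->import } stands in for "use Quiet::Void;", since the
# package lives in this same file rather than in its own .pm file.
eval q{ BEGIN { Quiet::Void->import } my $x = 1; $x > 2; 1 } or die $@;
my $after = @seen - $before;

print "void warnings without import: $before, with import: $after\n";
```

In a real module the same import() would sit next to the overloading code, so that a single "use" line in the caller both loads the streaming class and quiets the warning.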

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

        Updates

        Rephrased and linked