Hi, thank you very much for looking at this.

The spacing change was inadvertent; I'm not quite sure what happened there.

The last three lines are the start of another record and can be ignored or removed. I do have another version of the record file in which records are separated by a blank line (double-spaced), like this:

TF Unknown
TF Name Unknown
Gene ENSG00000113916
Motif ENSG00000113916___1|4x3
Family C2H2 ZF
Species Homo_sapiens
Pos A C G T
1 0.427379 0.0647991 0.288826 0.218996
2 0.201974 0.139791 0.35254 0.305695
3 0.11714 0.118042 0.143884 0.620934
4 0.637331 0.0996546 0.228428 0.0345867
5 0.0971289 0.591289 0.134781 0.176801
6 0.0715039 0.0237142 0.0432674 0.861514
7 0.73769 0.117011 0.059703 0.0855963
8 0.0728444 0.00877167 0.877166 0.0412175
9 0.959269 0.0131077 0.0159611 0.0116621
10 0.612865 0.057845 0.0583267 0.270963

TF Unknown
TF Name Unknown
Gene ENSG00000161940
Motif ENSG00000161940___1|1x3
Family C2H2 ZF
Species Homo_sapiens
Pos A C G T
1 0.614704 0.122914 0.125116 0.137266
2 0.0954267 0.010422 0.851317 0.0428343
3 0.959146 0.00959146 0.0112618 0.0200008
4 0.91149 0.0146678 0.0135794 0.0602625
5 0.67464 0.0388388 0.13716 0.149361
6 0.104655 0.0579394 0.804166 0.0332392
7 0.789171 0.102902 0.0490883 0.0588389
8 0.776513 0.0273768 0.144501 0.0516094
9 0.130657 0.06051 0.0793659 0.729467
10 0.626753 0.0648533 0.143976 0.164418

Thanks again!

Re^8: First foray into Perl
by AnomalousMonk (Archbishop) on Mar 26, 2014 at 00:38 UTC

    Here's a version for double-newline record separators; still no tabs, only one or more spaces separate fields. I included the new ENSG00000113916___1|4x3 motif in the test data. Most of the notes and caveats of Re: First foray into Perl still apply.
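
For readers following along, here is a minimal sketch of how records separated by a blank line can be read one at a time. It is not the script from this reply, only an illustration that assumes the layout shown in the question: one field per line, whitespace between fields, and a blank line between records. The sample data is shortened, and the names `$data`, `@records`, `motif` and `rows` are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical, shortened sample: two records separated by a blank line.
my $data = <<'END';
Gene ENSG00000113916
Motif ENSG00000113916___1|4x3
Pos A C G T
1 0.427379 0.0647991 0.288826 0.218996
2 0.201974 0.139791 0.35254 0.305695

Gene ENSG00000161940
Motif ENSG00000161940___1|1x3
Pos A C G T
1 0.614704 0.122914 0.125116 0.137266
2 0.0954267 0.010422 0.851317 0.0428343
END

# Read from an in-memory filehandle so the sketch needs no external file.
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";

my @records;
{
    local $/ = '';    # paragraph mode: one or more blank lines end a record
    while (my $record = <$fh>) {
        chomp $record;    # in paragraph mode this strips all trailing newlines

        # Pull the motif name from its own line.
        my ($motif) = $record =~ /^Motif\s+(\S+)/m;

        # Keep only the numeric rows; split ' ' splits on runs of whitespace.
        my @rows = map  { [ split ' ', $_ ] }
                   grep { /^\d+\s/ }
                   split /\n/, $record;

        push @records, { motif => $motif, rows => \@rows };
    }
}
close $fh;

printf "%s: %d positions\n", $_->{motif}, scalar @{ $_->{rows} } for @records;
```

Setting `$/` to the empty string treats any run of blank lines as a single separator, which is a little more forgiving than setting it to exactly `"\n\n"`.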

      Wow - Thanks so much for your efforts on this. Really appreciated! I'm still working on using it to extract sequences from my 15,000 record file so can't state success quite yet but I'll let you know.

      Cheers!

        A minor update: This version of the loop may be a bit nicer (tested):

        # works -- a while loop might be more efficient/elegant
        MAX_BASES:
        while (@base_values) {                          # @base_values is consumed
            $max_bases .= $_->[1] for                   # append base of each max
                reduce { $a->[0] > $b->[0] ? $a : $b }  # max in group
                map [ shift(@base_values), $_ ], @base_ord  # groups of n
                ;
            }
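
To try the loop in isolation, here is a minimal self-contained sketch. It assumes that `@base_ord` holds the column order from the `Pos A C G T` header, that `@base_values` is the flat list of probabilities with the position numbers already removed, and that `reduce` comes from the core List::Util module. The three rows are positions 1 to 3 of the ENSG00000113916___1|4x3 motif from the question.

```perl
use strict;
use warnings;
use List::Util qw(reduce);

# Column order, taken from the "Pos A C G T" header line (assumed).
my @base_ord = qw(A C G T);

# Probabilities for positions 1-3, flattened row by row (assumed layout).
my @base_values = (
    0.427379, 0.0647991, 0.288826, 0.218996,
    0.201974, 0.139791,  0.35254,  0.305695,
    0.11714,  0.118042,  0.143884, 0.620934,
);

my $max_bases = '';
while (@base_values) {    # @base_values is consumed, four values per pass
    # Pair each of the next four values with its base, then keep the largest.
    my $best = reduce { $a->[0] > $b->[0] ? $a : $b }
               map    { [ shift(@base_values), $_ ] } @base_ord;
    $max_bases .= $best->[1];    # append the base with the highest value
}

print "$max_bases\n";    # prints "AGT"
```

The sketch relies on the number of values being a multiple of four. If it is not, `shift` returns `undef` for the missing entries and the comparison will warn.
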
        Perl may seem foreign if you are new to it. Learning Perl (as an introduction) and Perl Cookbook are two good sources on Perl.

        Don't get discouraged. I am currently trying to learn a new GUI tool, Xojo, which uses a form of the Visual Basic language, and learning all the new ways of string matching, working with arrays, and converting numbers to printable strings is challenging. I will probably need to get a book that explains it all; even the beginners' tutorial is difficult.