in reply to Getting unique line counts between lines starting with '>'

Use a hash of the lines; the keys of a hash are always unique.

perl -lne 'sub out {print $h, "\n", scalar keys %c if %c } />/ and do { out(); %c =(); $h = $_ } or $c{$_}++; END { out() }' < input-file
($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,

Re^2: Getting unique line counts between lines starting with '>'
by james.v (Initiate) on Oct 20, 2017 at 00:17 UTC

    A quick question: is there a way to print the keys of the %c hash as a comma-delimited list?

    For example:

    >05143_African_trypanosomiasis
    3
    TRINITY_DN26760_c1_g1, 18169, 42987
    >05145_Toxoplasmosis
    5
    43736, 38319, 38320, TRINITY_DN24151_c3_g1, TRINITY_DN25493_c0_g1

    best,

    James

      Update: Corrected for comma-separated numbers.
      $ perl -lne 'sub prt{@c && print scalar @c,"\n",join ", ",@c;@c=();print} />/?prt:push @c,$_}{prt' TheFileName
      Output:
      >05143_African_trypanosomiasis
      4
      TRINITY_DN26760_c1_g1, 18169, 42987, 42987
      >05145_Toxoplasmosis
      7
      43736, 38319, 38320, 38320, TRINITY_DN24151_c3_g1, TRINITY_DN25493_c0_g1
      For info on "}{", see "Eskimo greeting" in perlsecret.

                      All power corrupts, but we need electricity.

        Thank you NetWallah, that helps a lot.

        best,

        James

Re^2: Getting unique line counts between lines starting with '>'
by james.v (Initiate) on Oct 19, 2017 at 21:51 UTC

    Thank you for the help Choroba, works like a charm!

    best,

    James