fan li has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone

I am new to Perl. I want to extract specific information from a text file with Perl. Below is a description of the text file and of what I want to get.

#####################################

C2H4 H2O3 C160H33O (1)

1 2 5 (2)

C2H4 H2O3 O4 (3)

2 1 3 (4)

C452H576O4 C2H4 (5)

1 1 (6)

######################################

The content of the text file, species.txt, is listed above. There are two kinds of lines: lines (1), (3) and (5) list chemical species, and lines (2), (4) and (6) give the number of each corresponding species. The species and the numbers are not vertically aligned, and the species differ from line to line. I want to extract the numbers for a chemical species and write them to an Excel spreadsheet. Taking C2H4 as an example, the output should look like this:

C2H4

1

2

1

How can I do that in Perl? Thanks for your help.

Fan Li

Replies are listed 'Best First'.
Re: Extract the number of the species for txt file
by choroba (Cardinal) on Apr 27, 2016 at 14:46 UTC
    Extract the counts to a hash:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };

    my %counts;
    while (<>) {
        my @species = split;
        my @numbers = split ' ', <>;
        warn "Different number of columns at line $.!" if @species != @numbers;
        for my $i (0 .. $#species) {
            push @{ $counts{ $species[$i] } }, $numbers[$i];
        }
    }

    for my $spec (keys %counts) {
        say $spec;
        for my $count (@{ $counts{$spec} }) {
            say $count;
        }
    }
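Since the question asks for spreadsheet output, here is a minimal sketch (not part of the original reply) of one way to get there: collect one row per species/number line pair and write it as comma-separated text, which Excel can open. The sub names `species_table` and `to_csv` and the output name `species.csv` are made up for illustration, and the sample data is copied from the question and read through an in-memory filehandle. It assumes species names never contain commas or quotes.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Read species/number line pairs; return the species in first-seen order
# and one hash (species => number) per pair of lines.
sub species_table {
    my ($in) = @_;
    my (@rows, %seen, @order);
    while (defined(my $species_line = <$in>)) {
        my $number_line = <$in>;
        $number_line = '' unless defined $number_line;   # tolerate a missing last line
        my @species = split ' ', $species_line;
        my @numbers = split ' ', $number_line;
        my %row;
        @row{@species} = @numbers;                       # hash slice pairs names with numbers
        push @rows, \%row;
        for my $s (@species) {
            push @order, $s unless $seen{$s}++;
        }
    }
    return (\@order, \@rows);
}

# Turn the table into CSV text: a header line, then one line per row,
# with 0 wherever a species does not occur in that row.
sub to_csv {
    my ($order, $rows) = @_;
    my @lines = (join ',', @$order);
    for my $row (@$rows) {
        push @lines, join ',', map { $row->{$_} // 0 } @$order;
    }
    return join "\n", @lines, '';
}

# Sample data from the question (assumption: no blank lines between records).
my $data = <<'END';
C2H4 H2O3 C160H33O
1 2 5
C2H4 H2O3 O4
2 1 3
C452H576O4 C2H4
1 1
END

open my $in, '<', \$data or die "Cannot open in-memory input: $!";
my ($order, $rows) = species_table($in);
close $in;

my $csv = to_csv($order, $rows);

my $out_name = 'species.csv';   # hypothetical output file name
open my $out, '>', $out_name or die "Cannot write $out_name: $!";
print {$out} $csv;
close $out or die "Cannot close $out_name: $!";
```

For the sample data this produces the header `C2H4,H2O3,C160H33O,O4,C452H576O4` followed by the rows `1,2,5,0,0`, `2,1,0,3,0` and `1,0,0,0,1`. Each column then holds the numbers for one species, with 0 where the species is missing from that record.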

    ($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
      Hi choroba

      Thanks for your help. Your code works fine and prints all species and their corresponding numbers to the screen. However, I want to extract just one specific species and its corresponding numbers at a time. The number should be 0 where that species is absent. It would also be better if the output were written to a new text file instead of the screen. Taking O4 as an example, there should be just one output, and it should look like this:

      O4

      0 # There is no “O4” species in (1), so the corresponding number is 0.

      3

      0 # There is no “O4” species in (5), so the number is 0 too.

      I would be grateful if you could help me.

      Fan Li

        If you know the species in advance, there's no need to remember the counts for the remaining ones.
        #!/usr/bin/perl
        use warnings;
        use strict;
        use feature qw{ say };

        say my $search_for_species = 'O4';

        while (<>) {
            my %counts;
            my @species = split;
            my @numbers = split ' ', <>;
            warn "Different number of columns at line $.!" if @species != @numbers;
            for my $i (0 .. $#species) {
                $counts{ $species[$i] } = $numbers[$i];
            }
            say $counts{$search_for_species} // 0;
        }

        To output to a file, use redirection:

        perl script_from_choroba.pl > new_file
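As an alternative to shell redirection, the script can open the output file itself. The following is only a sketch under stated assumptions, not the code from this reply: the sub name `counts_for` and the file name `O4_counts.txt` are hypothetical, and the sample data from the question is read through an in-memory filehandle so the example is self-contained.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Return one number per species/number line pair for the wanted species,
# or 0 where the species does not occur in that pair.
sub counts_for {
    my ($in, $wanted) = @_;
    my @result;
    while (defined(my $species_line = <$in>)) {
        my $number_line = <$in>;
        last unless defined $number_line;    # species line without numbers: stop
        my @species = split ' ', $species_line;
        my @numbers = split ' ', $number_line;
        my %count;
        @count{@species} = @numbers;         # hash slice pairs names with numbers
        push @result, $count{$wanted} // 0;  # exact name match, 0 if absent
    }
    return @result;
}

# Sample data from the question (assumption: no blank lines between records).
my $data = <<'END';
C2H4 H2O3 C160H33O
1 2 5
C2H4 H2O3 O4
2 1 3
C452H576O4 C2H4
1 1
END

my $wanted = 'O4';
open my $in, '<', \$data or die "Cannot open in-memory input: $!";
my @counts = counts_for($in, $wanted);
close $in;

# Write the species name followed by its numbers, one per line.
my $out_name = 'O4_counts.txt';   # hypothetical output file name
open my $out, '>', $out_name or die "Cannot write $out_name: $!";
print {$out} "$_\n" for $wanted, @counts;
close $out or die "Cannot close $out_name: $!";
```

For the sample data, `@counts` is `(0, 3, 0)`. The lookup is by exact species name, so `C452H576O4` in the last record does not count as `O4`. To read the real file instead, open `species.txt` with `open my $in, '<', 'species.txt'` in place of the in-memory handle.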
