pdahal has asked for the wisdom of the Perl Monks concerning the following question:

#Reference Francis
#use warnings;
use XML::Simple;
use LWP::UserAgent;
use HTTP::Request::Common;
use URI::Escape;
use Data::Dumper;
use Text::CSV;
use List::Util qw( min max );

my @protein_keywords = (
    "inhibitors", "inhibitor", "activity", "activitor", "activities",
    "activated", "proteins", "deficiency", "levels", "functions",
    "reductions", "protease", "proteases", "complex concentrate"
);
my $ua = LWP::UserAgent->new;
my $csv = Text::CSV->new({ sep_char => ',' });
my $ab_csv = Text::CSV->new({ sep_char => ',' });

#Open result CSV file.
open(my $fh, ">", "Result1.csv");
print $fh "Pubmed ID, Position of keywords, Valid proteins, Position of proteins, Minimum separation, Scoring\n";

#open abnormal condition csv file
my $i = 0;
my @protein_list;
open(my $abnorm, '<', "Protein.csv");
while (my $ab_line = <$abnorm>) {
    chomp $ab_line;
    if ($ab_csv->parse($ab_line)) {
        #skip first line
        next if ($. == 1);
        my @ab_fields = $ab_csv->fields();
        $protein_list[$i] = $ab_fields[0] . $ab_fields[1];
        $i++;
    }
}

#Open specified CSV file
open(my $data, '<', "publist.csv");
while (my $line = <$data>) {
    chomp $line;
    if ($csv->parse($line)) {
        #Skip first line
        next if ($. == 1);
        my @fields = $csv->fields();

        #Initialize http request
        my $args = "db=pubmed&id=$fields[0]&retmode=text&rettype=abstract";
        my $req = new HTTP::Request POST => 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';
        $req->content_type('application/x-www-form-urlencoded');
        $req->content($args);

        #Get response
        my $response = $ua->request($req);
        my $content = $response->content;
        $content = lc($content);
        my @abstract = split /[.]/, $content;
        my $keyword_position = "";
        foreach my $protein_keywords(@protein_keywords) {
            my $i = 0;
            foreach my $abstract (@abstract){
                if($abstract =~ /\b$protein_keywords\b/i) {
                    $keyword_position = $keyword_position . "+" . $i;
                    $i++;
                }
                else {
                    $i++;
                }
            }
        }
        foreach my $protein_list(@protein_list) {
            my @each_protein_list = split/[+]/, $protein_list;
            my $i = 0;
            my $protein_position = "";
            foreach my $abstract(@abstract){
                my @tempt = split /[,]/, $abstract;
                foreach my $each_protein(@each_protein_list) {
                    $each_protein = lc($each_protein);
                    foreach my $tempt(@tempt){ # this @tempt loop is meant to split on commas, but it has not solved the problem
                        if($tempt =~ /\b$each_protein\b/i) # this is the main place to modify; the failing match is what is causing trouble
                        {
                            print $tempt;
                            print "\n";
                            print $each_protein;
                            print "\n";
                            $protein_position = $protein_position . "+" . $i;
                        }
                    }
                }
                $i++;
            }
            if($protein_position ne "") {
                my $field2 = $keyword_position;
                my $field3 = $protein_position;
                my @keywords = split /[+]/, $keyword_position;
                splice (@keywords, 0, 1);
                my @proteins = split /[+]/, $protein_position;
                splice (@proteins, 0, 1);
                sub uniq { my %seen; grep !$seen{$_}++, @_; }
                @proteins = uniq(@proteins);
                my @temp;
                my $f = 0;
                foreach my $proteins(@proteins){
                    my $k = 0;
                    my @difference;
                    foreach my $keywords(@keywords) {
                        my $diff = ($proteins - $keywords);
                        $difference[$k] = abs $diff;
                        $k++;
                    }
                    $temp[$f] = min @difference;
                    $f++;
                }
                my $min = min @temp;
                if($min == 0) {
                    $scoring = 1;
                    my $valid_protein = $each_protein;
                }
                elsif($min == 1) {
                    $scoring = 0.5;
                }
                else {
                    $scoring = 0.2;
                }
                print $fh "$fields[0], $field2, $valid_protein, $field3, $min, $scoring\n";
            }
        }
    }
}
close($fh);

Hello monks! This is the code I am using to list the protein names found in an abstract when they are followed by one of the keywords listed in @protein_keywords above. Every field of the resulting CSV file gets data except the field "valid_protein", which is empty. Can anyone suggest what the reason behind this is?

Re: A field comes empty
by AnomalousMonk (Archbishop) on Jul 02, 2017 at 12:38 UTC
    if($min == 0) {
        $scoring = 1;
        my $valid_protein = $each_protein;
    }
    elsif($min == 1) {
        $scoring = 0.5;
    }
    else {
        $scoring = 0.2;
    }
    print $fh "$fields[0], $field2, $valid_protein, $field3, $min, $scoring\n";

    If 'the field "valid_protein"' is related to the  $valid_protein lexical scalar, be aware that this scalar does not exist outside of the first if-block quoted above. The scalar  $valid_protein in the statement
        print $fh "$fields[0], $field2, $valid_protein, $field3, $min, $scoring\n";
    has never been defined and has no value. warnings would have told you about this; strict would have prevented it.

    Update: A simple fix would be to define the lexical outside the if-block, then later use it. Note, however, that this scalar is only assigned a value within the if-block, so it would be wise to initialize the scalar with some default value when it is defined:

    my $valid_protein = 'no valid protein';
    if($min == 0) {
        $scoring = 1;
        $valid_protein = $each_protein;
    }
    elsif($min == 1) {
        $scoring = 0.5;
    }
    else {
        $scoring = 0.2;
    }
    print $fh "$fields[0], $field2, $valid_protein, $field3, $min, $scoring\n";
    And BTW:  $scoring is never defined as a lexical anywhere; is this what you really want? (Maybe take a look at the free Modern Perl download.)
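To make the point about declaring before use concrete, here is a minimal sketch that runs under strict and warnings. The helpers score_for and result_row are hypothetical names invented for illustration, and the ID and protein name are made-up sample values; only the 1 / 0.5 / 0.2 scoring and the default string come from the code above. One thing worth checking in the posted script: $each_protein is itself declared by an inner foreach that has already ended by the time the score is computed, so the matched name probably needs to be saved while that loop is still running. The sketch assumes that has been done and the name arrives as $matched_protein.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): maps the minimum
# sentence distance to the score used in the posted script.
sub score_for {
    my ($min) = @_;
    return $min == 0 ? 1
         : $min == 1 ? 0.5
         :             0.2;
}

# Hypothetical helper: builds one output row. $matched_protein stands in
# for a name that was saved while the matching loop was still running.
sub result_row {
    my ($id, $min, $matched_protein) = @_;

    # Both variables are declared before the branch, so they are still
    # visible when the row is assembled.
    my $scoring       = score_for($min);
    my $valid_protein = 'no valid protein';
    $valid_protein = $matched_protein
        if $min == 0 && defined $matched_protein;

    return join(', ', $id, $valid_protein, $min, $scoring) . "\n";
}

# Made-up sample values, for illustration only.
print result_row(12345, 0, 'antithrombin');
print result_row(12345, 2, 'antithrombin');
```

With every variable declared via my, strict has nothing to complain about, and the default value makes rows without a valid protein easy to spot in the CSV.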


    Give a man a fish:  <%-{-{-{-<

Re: A field comes empty
by LanX (Saint) on Jul 02, 2017 at 11:55 UTC
    Sorry tl;dr but a quick glance shows that strict is missing and warnings are deactivated.

    Please correct this and show us the resulting errors ... if any.

    Additionally, please try to create an SSCCE (a short, self-contained, correct example) and provide sample data.
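As a rough idea of what such a reduced example might look like for this question, the sketch below drops the CSV files and the network request and keeps only the branch that fills the field. build_row is a hypothetical name, and the ID and protein name are made-up sample data, not values from the original post. It deliberately omits strict and warnings so that it reproduces the symptom rather than fixing it.

```perl
# Intentionally without "use strict" and "use warnings",
# so that the symptom from the question is reproduced.

# Hypothetical stand-in for the part of the script that writes a CSV row.
sub build_row {
    my ($id, $min) = @_;
    if ($min == 0) {
        my $valid_protein = 'antithrombin';    # made-up sample name, visible only in this block
    }
    # This $valid_protein is a different, never-assigned package variable.
    return "$id, $valid_protein, $min\n";
}

print build_row(12345, 0);    # the middle field comes out empty
```

A dozen lines like these, with no external files, are much easier for others to run and reason about than the full script.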

    HTH :)

    Cheers Rolf
    (addicted to the Perl Programming Language and ☆☆☆☆ :)
    Je suis Charlie!

    update

    At second glance, my $valid_protein is scoped to the surrounding if-block, hence can't survive.

    Please note: strict would have caught this !!!

    Have a look at coping with scoping for the underlying mechanics.
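To illustrate what strict reports in this situation, here is a small sketch. The snippet is compiled from a string so the compile-time error can be captured and printed instead of aborting the script; the variable names mirror the question, and the values are placeholders.

```perl
use strict;
use warnings;

# The code under test lives in a string so that its compile-time
# error ends up in $@ rather than stopping this script.
my $snippet = <<'END_SNIPPET';
use strict;
my $min = 0;
if ($min == 0) {
    my $valid_protein = 'found';         # lexical to this block only
}
my $row = "protein: $valid_protein";     # a different, undeclared variable
END_SNIPPET

my $ok    = eval "$snippet; 1";
my $error = $ok ? '' : $@;
print "strict says: $error";
```

The captured message names $valid_protein as a global symbol that requires an explicit package name, which points straight at the line using the variable outside the block where it was declared.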