in thread Easiest way to filter a file based on user input

The latter. See "use" for examples and the FAQ "What's the difference between require and use?" for more detail.

Addendum: Also, almost every module (including Regexp::Common) has a SYNOPSIS section in its documentation which includes a code snippet showing how to use it.

Perl and (almost) all of its modules have excellent documentation. It will pay you dividends to learn where that documentation is, how to search it and how to understand what you find there.

Re^8: Easiest way to filter a file based on user input (updated)
by Peter Keystrokes (Beadle) on Jul 16, 2017 at 09:14 UTC
    Hi there, I've since downloaded and installed the Regexp::Common module. I've used it in my script as seen below.

    When I run the script below and enter -3, I expect it to remove every line that begins with 'None' or with a number greater than -3, leaving only lines whose number is less than or equal to -3.

    Here is an example of my data:
    >hsa_circ_0067224|chr3:128345575-128345675-|NM_002950|RPN1 FORWARD
    -4.4 6 .. 17 xxxxxxxxxxGTGAC CAGT ATGC ACT +G AAGATGAGGTTTGTG
    -0.9 5 .. 18 xxxxxxxxxxxGTGA CCAGT ATGC ACT +GA AGATGAGGTTTGTGG
    None 1 .. 20 xxxxxxxxxxxxxxx GTGACCAGTATGCACT +GAAG ATGAGGTTTGTGGAC
    None 2 .. 21 xxxxxxxxxxxxxxG TGACCAGTATGCACTG +AAGA TGAGGTTTGTGGACC
    None 6 .. 25 xxxxxxxxxxGTGAC CAGTATGCACTGAAGA +TGAG GTTTGTGGACCATGT
    -2.3 5 .. 26 xxxxxxxxxxxGTGA C CAGTATGCACTGAAGA +TGAG G TTTGTGGACCATGTG
    -3.2 4 .. 27 xxxxxxxxxxxxGTG AC CAGTATGCACTGAAGA +TGAG GT TTGTGGACCATGTGT
    -1.9 3 .. 28 xxxxxxxxxxxxxGT GAC CAGTATGCACTGAAGA +TGAG GTT TGTGGACCATGTGTT

    If I typed -3 I should be left with:

    >hsa_circ_0067224|chr3:128345575-128345675-|NM_002950|RPN1 FORWARD
    -4.4 6 .. 17 xxxxxxxxxxGTGAC CAGT ATGC ACT +G AAGATGAGGTTTGTG
    -3.2 4 .. 27 xxxxxxxxxxxxGTG AC CAGTATGCACTGAAGA +TGAG GT TTGTGGACCATGTGT

    So far it only filters out the 'None' lines. Shouldn't $RE{num}{real}{-places=>2} capture real numbers, decimals included?

    The script:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Regexp::Common qw /number/;

    print "Enter limit: ";
    chomp( my $limit = <STDIN> );
    $limit = abs($limit);

    open my $IN, '<', "xt_spacer_results.hairpin" or die $!;
    open my $SIFTED, '>', "new_xt_spacer_results.hairpin" or die $!;

    while (<$IN>){
        next if /^None/;
        next if /^($RE{num}{real}{-places=>2})/ && $1 > $limit;
        print $SIFTED $_;
    }

    close $IN;
    close $SIFTED;

      I added the line $limit = abs($limit); (see abs) because I wasn't sure of your original specification, as I asked in my post. Also, note that -places=N is documented as: "the number is assumed to have exactly N places after the radix point" and even goes on to show an example: "$RE{num}{real}{-places=>2} # matches 123.45 or -0.12", and your input isn't in that format. Take some time to look into the documentation and then try removing the line with abs, as well as "{-places=>2}" from the regex.
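The effect of -places described above can be checked with a short test script. This is only a minimal sketch: it assumes Regexp::Common is installed, and it anchors the patterns with \A and \z (which the script in this thread does not do) so that only whole-string matches count. The test values come from the sample data and from the documentation example quoted above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Regexp::Common qw(number);

# Compile both patterns once; \A and \z force the whole string to match.
my $any_real   = qr/\A$RE{num}{real}\z/;
my $two_places = qr/\A$RE{num}{real}{-places=>2}\z/;

# -4.4 is taken from the sample data; -0.12 and 123.45 are the
# values used in the documentation example for -places.
for my $value ('-4.4', '-0.12', '123.45') {
    printf "%-7s real: %-3s  real with -places=>2: %s\n",
        $value,
        ($value =~ $any_real   ? 'yes' : 'no'),
        ($value =~ $two_places ? 'yes' : 'no');
}
```

Running it against a few values from your own file is a quick way to see which pattern fits the data before changing the filter.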

        Oh okay, apologies for the buffoonery on my part.

        The script seems to be working fine now. I added another next statement, next if /^(\s\s-\d)/ && $1 > $limit;, because without it the script doesn't recognise whole numbers like -2, -5 etc.

        The script:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use Regexp::Common qw /number/;

        print "Enter limit: ";
        chomp( my $limit = <STDIN> );
        #$limit = abs($limit);

        open my $IN, '<', "xt_spacer_results.hairpin" or die $!;
        open my $SIFTED, '>', "new_xt_spacer_results.hairpin" or die $!;

        while (<$IN>){
            next if /^None/;
            next if /^($RE{num}{real})/ && $1 > $limit;
            next if /^(\s\s-\d)/ && $1 > $limit;
            print $SIFTED $_;
        }

        close $IN;
        close $SIFTED;

        Haukex, you are a legend, thanks.
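The extra /^(\s\s-\d)/ rule suggests that some score lines start with blanks, though that is an assumption, since the sample data does not show it. One possible alternative, sketched below, is to take the first whitespace-separated field of each line and compare it numerically. It swaps Regexp::Common for looks_like_number from the core Scalar::Util module. keep_line is a hypothetical helper name, and the "  -2" sample line is made up to illustrate the leading-blank case.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# keep_line() is a hypothetical helper, not part of the original script.
# It returns true when a line should be written to the filtered output.
sub keep_line {
    my ($line, $limit) = @_;

    # First whitespace-separated field, ignoring any leading blanks.
    my ($first) = $line =~ /^\s*(\S+)/;

    return 1 unless defined $first;             # blank line: keep it
    return 0 if $first eq 'None';               # no score: drop it
    return 1 unless looks_like_number($first);  # header or other text: keep it
    return $first <= $limit;                    # keep scores at or below the limit
}

# Shortened sample lines; the "  -2" line is invented for this sketch.
my @sample = (
    ">hsa_circ_0067224|chr3:128345575-128345675-|NM_002950|RPN1\n",
    "-4.4 6 .. 17\n",
    "  -2 5 .. 18\n",
    "None 1 .. 20\n",
    "-3.2 4 .. 27\n",
);

print grep { keep_line($_, -3) } @sample;
```

With a limit of -3 this keeps the header line and the -4.4 and -3.2 lines. In the real script the same helper could replace the three next statements, for example print $SIFTED $_ if keep_line($_, $limit); inside the while loop.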