PerlMonks  

PERL searching through a file

by ssimone (Initiate)
on Jan 20, 2017 at 13:33 UTC

ssimone has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have this Perl problem that I need to solve. I need to write a Perl script which receives the name of a file as a command-line argument ($ARGV[0]). Then, through the keyboard, it should receive parameters one by one (arrays of characters) that need to be searched for in the text of the file. The script should create new files (as many new files as there are parameters received from the keyboard) with the same name as the original file plus "_parameter", where "parameter" is replaced by the actual parameter. In each of those files, the script writes the sentences from the original file in which the search parameter appears. Sentences are separated from one another by "." and/or a newline (\n).

Replies are listed 'Best First'.
Re: Perl searching through a file
by toolic (Bishop) on Jan 20, 2017 at 13:54 UTC
    What a terrific opportunity for you to learn Perl!
    • Read perlintro
    • Write some code.
    • Post your detailed question back here if you need specific help.
Re: PERL searching through a file
by Marshall (Canon) on Jan 20, 2017 at 14:23 UTC
    Your post looks like you are asking us to write your homework. Many Monks here can do this assignment. New: I retract that statement, as I am not sure about the requirements after a second reading.

    Write some code yourself. If you have some questions about your code, in particular how Perl works, the Monks will help.

    Update: I will personally help you, but I don't know where to start without some code. My objective would be for you to learn how to do it.

      Hi, here is the code I have so far. I am very new to Perl, and this is a difficult problem (for me) to solve.

      #!/usr/bin/perl -w
      $file=$ARGV[0];
      @params=<STDIN>;
      %numberoftimes={};
      $nuoftimes=0;
      while(<FH>){
          @lines=split("\n",$file);
          for $line(@lines){
              for $p(@params){
                  if (index($line,$p)){
                      nuoftimes++;
                      #create a new file
                  }
                  $%numberoftimes{$p}=$nuoftimes;
                  #every time $p is found in $line,
                  #i have to write that $line in the new file
                  elsif($p eq "end")
                  #the input of the parameters should end
              }
          }
      }

        Hello ssimone, and welcome to the Monastery!

        The code you’ve shown has some major problems:

        1. First, it’s good to see you using warnings, but you should also use strict and declare each variable with my. This may look like a little extra work, but it will save you heaps of time and effort in the long run.

        2. @params=<STDIN>;

          Have a look at the output of this little script:

          15:58 >perl -MData::Dump -wE "my @params = <STDIN>; dd \@params;"
          1
          2
          3
          ^Z
          ["1\n", "2\n", "3\n"]

          15:59 >

          Besides the fact that the user has to guess what is required, and remember to terminate with Ctrl-Z (or its equivalent), you’ll notice that each parameter has a trailing newline character. You should investigate Perl’s inbuilt chomp function (and also put the parameter input code inside a loop).

        3. %numberoftimes={};

          %numberoftimes is a hash variable, but {} is a scalar (a reference to an empty anonymous hash). To initialise a hash, use parentheses: %numberoftimes = (); — although a simple declaration (my %numberoftimes;) is all that’s needed here.

        4. while(<FH>){

          You can’t just use a filehandle (FH) without first opening it to point to a file:

          open(FH, '<', $file) or die "Cannot open file '$file' for reading, stopped";

          See perlopentut.

        5. while(<FH>){ @lines=split("\n",$file); for $line(@lines){

          There are two problems here. First, you want to split on each line, so giving the split function the name of the file makes no sense. Second, the syntax <FH> reads a line of text from the filehandle, so splitting on newlines again makes no sense. You want something like this:

          while (my $line = <FH>)
          {
              chomp $line;
              for my $p (@params)
              {
                  ...    # use $line

        6. if (index($line,$p)){

          The builtin index function returns -1 on failure, and in Perl anything other than undef, 0, '0', or the empty string, is “true,” so the if clause won’t behave as you expect it to.
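
To illustrate point 6, here is a minimal sketch of a test that compares the return value of index against -1. The helper name contains_term is hypothetical and not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper: true when $term occurs anywhere in $line.
# index returns the 0-based position of the match, or -1 when there is none,
# so a bare "if (index(...))" is false for a match at position 0
# and true for "not found".
sub contains_term {
    my ($line, $term) = @_;
    return index($line, $term) != -1;
}

print contains_term('Perl is fun', 'Perl') ? "found\n" : "not found\n";   # found
print contains_term('Perl is fun', 'Ruby') ? "found\n" : "not found\n";   # not found
```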

        As toolic advised, you really need to read (or re-read!) perlintro and master the basics. The best way is to break your program down into its smallest parts, and work on each part until it does what you want and you understand how it works.
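
Points 2 to 5 above could be combined roughly as follows. This is only a sketch under some assumptions: the helper names read_terms and matching_lines are hypothetical, the word "end" is taken as the input terminator (as in the original code), and it matches whole lines rather than sentences.

```perl
use strict;
use warnings;

# Hypothetical helper: read search terms from a filehandle, one per line,
# until the user types "end" or the input runs out.
sub read_terms {
    my ($in) = @_;
    my @terms;
    while (defined(my $term = <$in>)) {
        chomp $term;                    # drop the trailing newline
        last if lc($term) eq 'end';     # assumed sentinel that stops the input
        next if $term eq '';            # ignore blank lines
        push @terms, $term;
    }
    return @terms;
}

# Hypothetical helper: return the lines of $file that contain $term.
sub matching_lines {
    my ($file, $term) = @_;
    open(my $fh, '<', $file)
        or die "Cannot open file '$file' for reading: $!";
    my @hits;
    while (my $line = <$fh>) {
        chomp $line;
        push @hits, $line if index($line, $term) != -1;
    }
    close $fh;
    return @hits;
}

# Runs only when a file name is given on the command line.
if (@ARGV) {
    my $file = $ARGV[0];
    print "Enter search terms, one per line ('end' to finish):\n";
    for my $term (read_terms(\*STDIN)) {
        print "$term: $_\n" for matching_lines($file, $term);
    }
}
```

Using a lexical filehandle ($fh) instead of the bareword FH keeps the handle scoped to the subroutine.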

        Hope that helps,

        Athanasius <°(((>< contra mundum
        Iustus alius egestas vitae, eros Piratica,

        Hi ssimone,

        I like the post from Athanasius so much here is a quote:

        The best way is to break your program down into its smallest parts, and work on each part until it does what you want and you understand how it works.
        That is great advice!

        However, before doing that, the very first step in writing software is to be very, very clear about what you intend for the program to do. Right off the bat, I see some issues. Your OP (Original Post) talks about sentences, but your code talks about "lines"... The OP talks about non-English sentences which can prematurely be terminated by a \n. Which is it? A complete English sentence or line?

        Example:

        Bob is tall. Mary is short. Fred is
        medium height.

        Is this 3 or 4 "sentences"? From the OP, I see 3 "sentences" in line 1 and an additional sentence in line 2. For a total of 4. A normal English "sentence" interpretation of these 2 lines would be that there are 3 instead of 4 sentences.
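
The two readings can be compared with a small sketch. The helper names split_op_rule and split_english are hypothetical; the first follows the rule as worded in the OP ("." and/or newline ends a sentence), the second treats a newline as ordinary whitespace.

```perl
use strict;
use warnings;

# Rule as worded in the OP: a sentence ends at "." and/or a newline.
sub split_op_rule {
    my ($text) = @_;
    my @parts = split /[.\n]+/, $text;
    s/^\s+|\s+$//g for @parts;        # trim surrounding whitespace
    return grep { length } @parts;    # discard empty pieces
}

# Ordinary English reading: a newline is just whitespace, only "." ends a sentence.
sub split_english {
    my ($text) = @_;
    (my $flat = $text) =~ s/\s*\n\s*/ /g;   # join wrapped lines
    return split_op_rule($flat);
}

my $text = "Bob is tall. Mary is short. Fred is\nmedium height.\n";
my @op   = split_op_rule($text);
my @eng  = split_english($text);
printf "OP rule: %d sentences\n",      scalar @op;    # OP rule: 4 sentences
printf "English rule: %d sentences\n", scalar @eng;   # English rule: 3 sentences
```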

        For implementation, I would break this down into some steps, perhaps:

        1. Write a program to get the input parameters. I give a massive hint below. A "standard" command line interface requires a number of steps. We want to do simple validation before we start doing a lot of "real work".
        2. Write a program to parse an input file into an array of "sentences".
        3. Write a program to print all lines in @sentences that contain one of the search tokens.
        4. Start integrating the steps together
        Here is some code for the UI (User Interface) part to get you started. I think this is the hardest part, the rest should be easier.

        #!/usr/bin/perl
        use warnings;
        use strict;

        my $in_filename = shift @ARGV;
        my @extra_parms = @ARGV;

        sub usage
        {
            print "Usage:\n Searches for selected 'sentences' in input file\n";
            print " A sentence ends with a period(.) or could be the\n";
            print " end of line '\\n'\n";
            # other text to explain what the "rules" are goes here....
            # give an example of the command.
            print " Example: mysearch infilename\n\n";
            print " The program will prompt user for the search terms.\n";
            exit(1);   #this is an error exit (non-zero return value)
        }

        # The very basic "sanity" checks to display the usage() message.
        if (!defined($in_filename) or @extra_parms>0 or
            $in_filename =~ /^(-)?\?/ or
            $in_filename =~ /^\s*-(-)?h(elp)?/i)
        {
            usage();
        }

        if (! -e $in_filename)
        {
            print "Error! input file name: $in_filename does not exist\n\n";
            usage();
        }

        print "Enter search parameter(s), one per line or end\n";
        my $input;
        my @search_tokens;
        # defined() also stops the loop at end of input (EOF), not only on END
        while ( (print "search for: "),
                defined($input=<STDIN>) && $input !~ /^\s*END\s*$/i)
        {
            next if $input =~ /^\s*$/;   # re-prompt on a blank input line
            $input =~ s/^\s*//;          # delete leading spaces
            $input =~ s/\s*$//;          # delete trailing spaces

            if ($input =~ /^\S+\s+\S+/)
            {
                print "Error! Only one search term per input line!\n";
                next;
            }
            # Note: here tr// counts the number of characters which are not
            # in the set, without modifying the $input variable.
            if ($input =~ tr/A-Za-z0-9_//c)   # must be legal characters
                                              # for a filename!
            {
                print "Error! Illegal character! only A-Za-z0-9_ allowed!\n";
                next;
            }
            if ($input =~ /^\d/)   # must be legal characters for a filename!
            {
                print "Error! Token cannot start with a number!\n";
                next;
            }
            push (@search_tokens, $input);
        }
        if (@search_tokens==0)
        {
            print "Error! No search tokens entered..exiting..\n";
            exit (2);   # another error exit with error code 2
        }

        # For debugging, dump the tokens back out
        print "\n";   #just a space line
        print "Search Terms are:\n";
        foreach my $token (@search_tokens)
        {
            print "token=\'$token\'\n";
        }
        __END__

        Get the above working and tested, then move on to the next section of the program. You can make a separate test program that just hard-codes the $filename and @search_tokens. Get that working, then move on to another step.
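
A separate test program for step 3 with hard-coded data might look like the sketch below. Everything in it is illustrative: the helper name sentences_by_token, the sample sentences, and the file name input.txt are assumptions, and appending "_token" to the full file name is only one reading of the naming rule in the OP. The sketch prints what it would write instead of creating the files.

```perl
use strict;
use warnings;

# Hard-coded test data standing in for the parsed file and the user's input.
my @sentences     = ('Bob is tall', 'Mary is short', 'Fred is medium height');
my @search_tokens = ('is', 'Mary', 'Zed');

# Hypothetical helper: map each token to the sentences that contain it.
sub sentences_by_token {
    my ($sentences, $tokens) = @_;
    my %found;
    for my $token (@$tokens) {
        $found{$token} = [ grep { index($_, $token) != -1 } @$sentences ];
    }
    return %found;
}

my %found = sentences_by_token(\@sentences, \@search_tokens);
for my $token (@search_tokens) {
    # Assumed naming: the token is appended to the full input file name.
    my $out_name = "input.txt_$token";
    printf "%s: %d sentence(s)\n", $out_name, scalar @{ $found{$token} };
}
```

Note that this matches substrings with index, so the token "is" would also match a sentence containing "this"; whether that is wanted depends on the requirements.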

Node Type: perlquestion [id://1180002]
Approved by Corion