in reply to Re: What's the best way to do a pattern search like this?
in thread What's the best way to do a pattern search like this?

I tried your method. Everything works great, except that the program returns something like:
ba. 1 ba 2 ........
Should I do a s/\./ / on file.txt before processing it through your function? What if there are other characters like ? ! : ; " ' ( ) etc.?

Re: Re: Re: What's the best way to do a pattern search like this?
by davorg (Chancellor) on Jul 20, 2001 at 13:57 UTC

    You just need to adjust the regex a little.

    my @array_codes = split /\s+/, <FILE>;

    assumes that you're interested in all non-whitespace characters. Changing it to:

    my @array_codes = split /\W+/, <FILE>;

    means that you split on runs of non-word characters, so you're only interested in word characters (where word chars are A-Z, a-z, 0-9 and '_').
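
    As a rough illustration of the difference between the two splits, here is a small sketch. The sample string and the variable names @by_space and @by_nonword are made up for this example and are not from the original file.txt:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample line; not taken from the original file.txt.
    my $line = "ba. ab, ba (aba)";

    # Splitting on whitespace keeps punctuation attached to each token.
    my @by_space = split /\s+/, $line;

    # Splitting on runs of non-word characters drops the punctuation.
    my @by_nonword = split /\W+/, $line;

    print join("|", @by_space), "\n";      # ba.|ab,|ba|(aba)
    print join("|", @by_nonword), "\n";    # ba|ab|ba|aba
    ```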

    --
    <http://www.dave.org.uk>

    Perl Training in the UK <http://www.iterative-software.com>

Re: Re: Re: What's the best way to do a pattern search like this?
by tachyon (Chancellor) on Jul 20, 2001 at 14:01 UTC

    Hi, you have two options. If you wish to retain ultimate control, split on whitespace and filter the elements in @array_codes using this (as above):

    $code_key =~ s/[.?!:;"'()]//g;

    This filters out all the stuff in the char class. Alternatively you can just grab alphanumerics in the first place like this:

    @array_codes = <DATA> =~ m/\w+/g;
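
    To compare the two options side by side, here is a minimal sketch. It works on a made-up string rather than the DATA filehandle, and the names @filtered and @matched are only for illustration:

    ```perl
    use strict;
    use warnings;

    # Hypothetical sample line; not taken from the original file.txt.
    my $line = qq{ba. "ab" ba? (aba)!};

    # Option 1: split on whitespace, then strip the listed punctuation
    # from each element in place.
    my @filtered = split /\s+/, $line;
    s/[.?!:;"'()]//g for @filtered;

    # Option 2: pull out runs of word characters directly.
    my @matched = $line =~ m/\w+/g;

    print "@filtered\n";    # ba ab ba aba
    print "@matched\n";     # ba ab ba aba
    ```

    Option 1 removes only the characters listed in the class, so anything not listed (a hyphen, say) stays in the token. Option 2 keeps nothing but letters, digits and underscores.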

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      I followed your instructions, and I used:
      open (FILE, "file.txt") || die "Can't open data file.";
      while (<FILE>) {
          @array_codes = split /\W+/;
          foreach $code_key (@array_codes) {
              $codes{$code_key}++;
          }
      }
      printf "$_\t$codes{$_}\n" for keys %codes;
      Everything is outstanding, BUT the function also counted the blank lines between the paragraphs. It output something like:
      12 ab 42 aba 25 .......
      I tried s/\n/ /g, but it still counted the blank lines. How do I get rid of the blank lines in a text document and replace them with a space? Or is there a way to not count the blank lines? Thank you very much.

        As noted by MeowChow the devil is in the details:

        #!/usr/bin/perl -w
        use strict;

        my %codes;
        open (FILE, "file.txt") || die "Can't open data file, perl says $!";
        while (<FILE>) {
            # Collect every run of letters and digits on the current line.
            my @array_codes = /[A-Za-z0-9]+/g;
            foreach my $code_key (@array_codes) {
                $codes{$code_key}++;
            }
        }
        # Print each code and its count, sorted by code.
        print "$_\t$codes{$_}\n" for sort keys %codes;

        This collects groups of chars that match A-Za-z0-9 only and then prints the hash sorted by key. I have added the $! var to the die (it contains Perl's error message) and sorted the keys. I have also added use strict, -w and the my declarations. Also note that printf was a typo. I meant print.
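
        A likely source of the stray count in the split version is that split keeps a leading empty field whenever a line starts with a separator (an indented line, say, or one that opens with a quote), while a match in list context returns only the matched text. A small sketch with made-up input; the names %by_split and %by_match are only for illustration:

        ```perl
        use strict;
        use warnings;

        # Hypothetical line that starts with non-word characters.
        my $line = qq{  "ab ba" ab\n};

        my (%by_split, %by_match);

        # split returns ('', 'ab', 'ba', 'ab'), so '' becomes a hash key.
        $by_split{$_}++ for split /\W+/, $line;

        # The match returns ('ab', 'ba', 'ab') and can never yield ''.
        $by_match{$_}++ for $line =~ /[A-Za-z0-9]+/g;

        my $split_has_empty = exists $by_split{''} ? 1 : 0;
        my $match_has_empty = exists $by_match{''} ? 1 : 0;

        print "split counted an empty key: $split_has_empty\n";    # 1
        print "match counted an empty key: $match_has_empty\n";    # 0
        ```

        If you would rather keep the split, dropping empty strings first with grep { length } before counting should have the same effect.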

        cheers

        tachyon
