in reply to Re: Help - Counting text - Associative Array? (I was Annon Monk)
in thread Help - Counting text - Associative Array?

Hi Sam. This additional information helps, but what output do you want from the above example? Say you're searching for the number of times 'catalog' and 'crime' appear in security.txt: do you want zero for the former and 2 for the latter (the number of times these words appear in the section of the file you have shown us)? That would be pretty easy, but you would have to decide whether to include 'Crime' (capital first letter) and 'crimes' (plural), and then how to exclude e.g. 'Crimean'.

Or do you perchance want to open the files in the 'Object Name' field and search these for the words in question, which is not so easy, but eminently feasible (and note that the 'crimes/Crime/Crimean' issue remains)?
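
The 'crimes/Crime/Crimean' issue can be sketched in a few lines. This is only an illustration, assuming whole-word, case-insensitive matching with an optional plural is what you want; the sample text is made up and not taken from security.txt.

```perl
use strict;
use warnings;

# Hypothetical sample text, not taken from security.txt.
my $text = 'Crime and crimes were reported; the Crimean file mentions crime once.';

# \b anchors on word boundaries, so 'Crimean' is not counted;
# /i ignores case, and s? also accepts the plural 'crimes'.
my $count = () = $text =~ /\bcrimes?\b/gi;

print "crime: $count\n";    # prints "crime: 3"
```

Dropping the /i would exclude 'Crime', and dropping the s? would exclude 'crimes', so each decision maps to one small change in the pattern.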

dave

Re: Re: Re: Help - Counting text - Associative Array? (I was Annon Monk)
by Mr_Lowry (Initiate) on Mar 19, 2004 at 20:48 UTC
    Hi Dave,

    You ask good questions. Disregard my earlier code because it was using the wrong variable to search with (I think that's correct?)

    What I am wanting is to find out how many times the long "exe" strings appear in a file. Then print each result with a name for the string rather than the string itself. Eg:

    C:\Program Files\Internet Explorer\IEXPLORE.EXE

    in:

    3/15/2004,2:29:31 PM,Security,Success Audit,Object Access ,560,SERVER\refterm,SERVER,"Object Open: Object Server: Security Object Type: File Object Name: C:\Program Files\Internet Explorer\IEXPLORE.EXE New Handle ID: 548 Operation ID: {0,178005108}

    Result:

    Name: Explorer Count: 1

    It seems I was walking down the wrong path and will have to use regex (which I have not yet read about, in all honesty) to search for the exe string within each line of text. The 'crime', 'crimean' issue is not a problem because there are no variants of the strings I'd be counting. There are MANY paragraphs like the one above in the file to be searched.

    If you have any ideas what this code might look like feel free to offer it :-) I will learn as much on my own in the meantime!

    Regards,

    Sam

      It seems I ... will have to use regex
      This is not correct. Regexes are for cases where you don't know exactly what to search for and only know what the pattern looks like, e.g. two letters followed by three or more digits and ending with a newline. In your case you are searching for a literal string, so you can use index (see perldoc -f index).
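
      As a minimal sketch of what index does with a literal string (the log line here is shortened from the example above, and the variable names are only for illustration):

      ```perl
      use strict;
      use warnings;

      # Log line shortened from the example; single quotes keep the backslashes literal.
      my $line   = 'Object Name: C:\Program Files\Internet Explorer\IEXPLORE.EXE New Handle ID: 548';
      my $needle = 'C:\Program Files\Internet Explorer\IEXPLORE.EXE';

      # index returns the zero-based position of the first match, or -1 if absent.
      my $pos = index($line, $needle);
      print $pos > -1 ? "found at offset $pos\n" : "not found\n";    # prints "found at offset 13"
      ```

      With a regex, the backslashes and dots in the filename would have to be escaped (for instance with \Q...\E), which index avoids entirely.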

      Assuming that the long filenames are not broken across lines and that the same filename does not appear multiple times on one line, the following should do what you want:

      use strict;

      # default to security.txt when no file is given on the command line
      @ARGV = ('security.txt') unless @ARGV;

      # long filename => short display name
      my %exe = (
          'C:\Program Files\Internet Explorer\IEXPLORE.EXE' => 'Catalog',
          'D:\crime\Reader\AcroRd32.exe'                    => 'Crime',
      );

      my %count;
      while (<>) {
          for my $filename (keys %exe) {
              # count the line once if it contains the literal filename
              $count{ $exe{$filename} }++ if index($_, $filename) > -1;
          }
      }

      print "Name: $_ Count: $count{$_}\n" for keys %count;

      The trick is to define the hash %exe the other way around, so that it gives the short name when accessed with the long filename.
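
      If the hash already exists the other way around (short name => long filename), it does not have to be retyped. A small sketch, assuming a hypothetical %by_name hash and that every filename is unique:

      ```perl
      use strict;
      use warnings;

      # Hypothetical hash keyed the "natural" way: short name => long filename.
      my %by_name = (
          Catalog => 'C:\Program Files\Internet Explorer\IEXPLORE.EXE',
          Crime   => 'D:\crime\Reader\AcroRd32.exe',
      );

      # reverse flattens the hash to a list and swaps keys with values;
      # this is only safe when every filename is unique.
      my %exe = reverse %by_name;

      print "$_ => $exe{$_}\n" for sort keys %exe;
      ```

      If two names shared the same filename, one of them would silently be lost in the reversed hash.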

      -- Hofmator