Searching for Certain Values

Dr.Avocado has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Searching for Certain Values by saintly (Scribe) on Jul 30, 2007 at 17:57 UTC
Well, you're in luck. This is the kind of thing that Perl excels at. I don't know what separates the data in the columns, but assuming it's some sort of spacing (or tabs), you could use something like:

    # Open the file containing data or abort with error message
    open(my $fh, "<", "some_file.txt") || die "Can't open file: $!";

    # Run through all lines of the file, one by one
    while(my $line = <$fh>) {
        # Break up the line on whitespace, assign columns to vars
        my( $score,$scorePoints, $time,$timePoints, $record,$recordPoints,
            $size,$sizePoints, $age,$agePoints, $diff,$diffPoints,
            $size2,$size2Points, $name ) = split(/\s+/,$line,13);

        # Check to see if name matches
        if($name =~ /(intrepid|triumph)/) {
            print "$name\n", "Time: $timePoints, Difficulty: $diffPoints\n\n";
        }
    }

That is, break up the line on spaces, assign each of the columns to variables, then print something if the data matches a test. Since your data is very regular, the code doesn't have to be complicated. You can make some modifications for simplicity:

    # Assign only the columns you're interested in
    my ($timePoints,$diffPoints,$name) = (split(/\s+/,$line,13))[3,11,14];

Or to eliminate possibly-bogus data:

    # Ensure the line consists of 12 integers + something
    my ( ... ) = ($line =~ /^\s(?:(\d+)\s){12}(.*)/);

Or for speed (don't bother splitting lines unless they have intrepid/triumph on them somewhere):

    next unless $line =~ /(intrepid|triumph)/;
    my ( ... ) = ...;
    print ....;
Re^2: Searching for Certain Values by ikegami (Patriarch) on Jul 30, 2007 at 19:02 UTC
`if($name =~ /(intrepid|triumph)/) {`

Captures are needlessly slow, and you're not actually checking for equality. Better:

`if($name =~ /^(?:intrepid|triumph)\z/) {`

And if you're so worried about speed, I think doing string comparisons would be even faster:

`if($name eq 'intrepid' || $name eq 'triumph') {`
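To make the difference between the three tests concrete, here is a minimal sketch. The helper names (`match_loose`, `match_strict`, `match_eq`) and the test strings are made up for illustration; only the three conditions come from the posts above.

```perl
use strict;

# Hypothetical helpers, one per condition discussed above.
# Each returns 1 if the name passes the test, 0 otherwise.
sub match_loose  { my ($n) = @_; return $n =~ /(intrepid|triumph)/ ? 1 : 0 }
sub match_strict { my ($n) = @_; return $n =~ /^(?:intrepid|triumph)\z/ ? 1 : 0 }
sub match_eq     { my ($n) = @_; return ($n eq 'intrepid' || $n eq 'triumph') ? 1 : 0 }

foreach my $name ('triumph', 'team_triumph', "triumph\n") {
    # Make the trailing newline visible in the report
    (my $shown = $name) =~ s/\n/\\n/;
    printf "%-14s loose=%d strict=%d eq=%d\n",
        $shown, match_loose($name), match_strict($name), match_eq($name);
}

# Prints:
# triumph        loose=1 strict=1 eq=1
# team_triumph   loose=1 strict=0 eq=0
# triumph\n      loose=1 strict=0 eq=0
```

The last row matters in practice: when `split` is called with a limit, the final field keeps the line's trailing newline, so the anchored regex and the `eq` test only succeed if the line was `chomp`ed first.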
Re^2: Searching for Certain Values by Anonymous Monk on Jul 30, 2007 at 18:14 UTC
Thanks for the help. I'll try it out and get back to you. It looks like it'll work. And the columns are separated in the form " Score | Points | Time | Points | etc." with both spaces and |. What do I have to change to factor in the |?
Re^3: Searching for Certain Values by Dr.Avocado (Novice) on Jul 30, 2007 at 18:16 UTC
Sorry, I wrote that last comment, but I forgot to log in at the time.
Re^4: Searching for Certain Values by ikegami (Patriarch) on Jul 30, 2007 at 19:00 UTC
Re: Searching for Certain Values by Dr.Avocado (Novice) on Jul 30, 2007 at 21:49 UTC
I seem to be running into a problem. Whenever I try to execute your script, I get a "Too many arguments for open at datasearch.pl line 3, near ""data.txt") " error. What am I doing wrong? My current code is pretty much what saintly gave me:

    #!/usr/local/bin/perl

    open(my $fh, "<", "data.txt") || die "Can't open file: $!";

    # Run through all lines of the file, one by one
    while(my $line = <$fh>) {
        # Break up the line on whitespace, assign columns to vars
        my( $score,$scorePoints, $time,$timePoints, $record,$recordPoints,
            $size,$sizePoints, $age,$agePoints, $diff,$diffPoints,
            $size2,$size2Points, $name ) = split(/\s+/,$line,13);

        # Check to see if name matches
        if($name =~ /(intrepid|triumph)/) {
            print "$name\n", "Time: $timePoints, Difficulty: $diffPoints\n\n";
        }
    }
Re^2: Searching for Certain Values by johngg (Canon) on Jul 30, 2007 at 22:14 UTC
You are probably running an elderly version of Perl. What do you get when you run `/usr/local/bin/perl -v` on the command line? The three-argument form of open was introduced in Perl 5.6, according to perl56delta, as were lexical filehandles (the `my $fh`). If you are running an earlier version then change

`open(my $fh, "<", "data.txt") || die "Can't open file: $!";`

to

`open (FH, '<data.txt') || die "Can't open file: $!";`

and

`while(my $line = <$fh>) {`

to

`while(my $line = <FH>) {`

You may want to consider upgrading your version of Perl as 5.005 is positively ancient.

Cheers, JohnGG
Re^3: Searching for Certain Values by Dr.Avocado (Novice) on Jul 30, 2007 at 22:47 UTC
That ought to do it, as I am running the ancient Perl v. 5.005. Thanks! I'll get an update ASAP.
Re^2: Searching for Certain Values by Dr.Avocado (Novice) on Jul 30, 2007 at 22:10 UTC
BTW, here is a sample of a file I would need to search:

    Score | Points | Time | Points | Record | Size | Points | Age | Points | Difficulty | Size | Points | Name
    4     | 15     | 356  | 17     | 45     | 14   | 45     | 24  | 12     | 3          | 1    | 34     | team A
    6     | 24     | 354  | 45     | 345    | 53   | 25     | 47  | 34     | 3          | 3    | 45     | team B
    3     | 18     | 303  | 34     | 234    | 32   | 48     | 67  | 32     | 23         | 4    | 22     | team C
    7     | 13     | 322  | 26     | 33     | 56   | 57     | 46  | 23     | 3          | 1    | 14     | team D
    5     | 10     | 353  | 24     | 58     | 82   | 35     | 33  | 12     | 5          | 2    | 35     | team E
    5     | 30     | 264  | 48     | 26     | 23   | 23     | 73  | 23     | 5          | 2    | 65     | team F
    6     | 18     | 363  | 58*    | 39     | 71   | 35     | 75  | 46     | 2          | 4    | 23*    | team_triumph
    -----------------------------------------------------------------------------------------------------------
    x     | x      | x    | x      | x      | x    | x      | x   | x      | x          | x    | x      | Total
    -----------------------------------------------------------------------------------------------------------
    Score | Points | Time | Points | Record | Size | Points | Age | Points | Difficulty | Size | Points | Name
    2     | 32     | 443  | 34     | 464    | 38   | 89     | 9   | 43     | 3          | 4    | 353    | Team C
    5     | 24     | 343  | 543    | 923    | 478  | 0      | 35  | 3      | 3          | 2    | 39     | Team B
    6     | 5      | 263  | 232    | 92     | 43   | 48     | 96  | 46     | 4          | 52   | 78     | team_victory
    -----------------------------------------------------------------------------------------------------------
    x     | x      | x    | x      | x      | x    | x      | x   | x      | x          | x    | x      | Total
    -----------------------------------------------------------------------------------------------------------
    Score | Points | Time | Points | Record | Size | Points | Age | Points | Difficulty | Size | Points | Name
    5     | 76     | 366  | 37     | 593    | 453  | 34     | 68  | 65     | 35         | 4    | 54     | Team D
    3     | 34     | 235  | 102    | 967    | 290  | 2      | 54  | 2      | 3          | 6    | 3      | Team C
    2     | 643    | 643  | 34     | 291    | 10   | 2      | 43  | 53     | 3          | 7    | 46     | Team F
    5     | 43     | 362  | 2      | 152    | 35   | 35     | 24  | 5      | 2          | 43   | 7      | Team G
    6     | 7      | 643  | 6*     | 45     | 0    | 97     | 75  | 883    | 1          | 2    | 344*   | team_intrepid
    -----------------------------------------------------------------------------------------------------------
    x     | x      | x    | x      | x      | x    | x      | x   | x      | x          | x    | x      | Total
    -----------------------------------------------------------------------------------------------------------
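For a file shaped like the sample above, one possible way to handle the `|` separators is to split on the pipe together with any surrounding whitespace, then skip anything that is not a data row. This is only a sketch under assumptions: the helper names `parse_row` and `find_teams` are hypothetical, it assumes every data row has exactly 13 fields and starts with a number, and it guesses that the starred values (the Points column after Time, and the last Points column) are the ones of interest.

```perl
use strict;

# Hypothetical helper: split one row into its 13 trimmed fields.
# Returns an empty list for header, separator, and "Total" rows.
sub parse_row {
    my ($line) = @_;
    $line =~ s/^\s+//;
    $line =~ s/\s+$//;                        # also removes the newline
    my @fields = split /\s*\|\s*/, $line;
    return () unless @fields == 13 && $fields[0] =~ /^\d+$/;
    return @fields;
}

# Hypothetical helper: for each data row whose name contains
# "intrepid" or "triumph", return [name, time points, last points].
sub find_teams {
    my @hits;
    foreach my $line (@_) {
        my @fields = parse_row($line);
        next unless @fields;
        # Assumption: columns 3 and 11 (0-based) are the wanted values
        my ($time_points, $last_points, $name) = @fields[3, 11, 12];
        push @hits, [$name, $time_points, $last_points]
            if $name =~ /intrepid|triumph/;
    }
    return @hits;
}

# A few rows from the sample above (whitespace condensed)
my @sample = split /\n/, <<'END';
Score | Points | Time | Points | Record | Size | Points | Age | Points | Difficulty | Size | Points | Name
5 |30 |264 |48 |26 |23 |23 |73 |23 |5 |2 |65 |team F
6 |18 |363 |58* |39 |71 |35 |75 |46 |2 |4 |23* |team_triumph
--------------------------------------------------
x |x |x |x |x |x |x |x |x |x |x |x |Total
6 |5 |263 |232 |92 |43 |48 |96 |46 |4 |52 |78 |team_victory
6 |7 |643 |6* |45 |0 |97 |75 |883 |1 |2 |344* |team_intrepid
END

foreach my $hit (find_teams(@sample)) {
    my ($name, $time_points, $last_points) = @$hit;
    print "$name\n", "Time points: $time_points, Points: $last_points\n\n";
}

# Prints:
# team_triumph
# Time points: 58*, Points: 23*
#
# team_intrepid
# Time points: 6*, Points: 344*
```

To run this against a real file, the lines could be passed straight from a filehandle, for example `find_teams(<FH>)` after the two-argument `open (FH, '<data.txt')` shown earlier in the thread. The asterisks are left in the values as they appear in the data; stripping them would be one more substitution.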