Re^3: Addional "year" matching functionality in word matching script

by Corion (Patriarch)
on Jun 27, 2016 at 14:42 UTC


in reply to Re^2: Addional "year" matching functionality in word matching script
in thread Addional "year" matching functionality in word matching script

Yes, I chose $year_left as the name for the variable holding the year number on the left side of the comparison, and $year_right for the right side of the comparison.

You can look at a string and guess whether it contains a four-digit year with code like the following, for example:

my $str = 'this is some text 1989 blah';
my $year;
$year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );

The regular expression looks at whether the string contains at least one number with four digits that starts with 19 or 20, and sets $year to the first such number. You could put that code into a subroutine as follows to allow for easy reuse:

sub find_year {
    my( $str ) = @_;
    my $year;
    $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
    return $year
}
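To see what the pattern accepts and rejects, here is a small sketch that runs the same regular expression over a few strings. The sample strings and the $year_re variable are made up for illustration and are not part of the original post.

```perl
use strict;
use warnings;

# Same pattern as above: a standalone four-digit number starting with 19 or 20.
my $year_re = qr/\b((?:19|20)\d\d)\b/;

# Hypothetical sample strings, chosen to exercise the word boundaries.
my @samples = (
    'this is some text 1989 blah',   # contains 1989
    'released 1999, reissued 2005',  # two years; only the first is captured
    'Summer of 69',                  # only two digits
    'part number 19891',             # five digits, so \b fails after "1989"
    'the year 1850',                 # does not start with 19 or 20
);

for my $str (@samples) {
    # In list context the match returns the captured group, or an empty list.
    my ($year) = $str =~ $year_re;
    printf "%-30s => %s\n", $str, defined $year ? $year : '(none)';
}
```

The \b anchors are what keep a longer number such as 19891 from being mistaken for a year.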

Re^4: Addional "year" matching functionality in word matching script
by bms9nmh (Novice) on Jun 27, 2016 at 16:13 UTC
    OK, I've been trying to break things down with the info you provided, initially with a simpler problem so I can work out what's going on. I'm trying to print the year from the title in a file called csv3 (which contains 1989 in the title) using the following code, but it isn't printing anything. What am I doing wrong?
    #!/bin/perl

    open CSV3, "<csv3" or die;
    while (<CSV3>) {
        chomp;
        my ($title) = $_ =~ /^.+?,\s*([^,]+?),/;    #/ match the title
        my %words;
        $words{$_}++ for split /\s+/, $title;       #/ get words
        ## Collect unique words
        # my @titlewords = keys(%words);
        my @titlewords = keys(%words);
        #print "$title"
    }

    sub find_year {
        my( $str ) = @_;
        my $year;
        $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
        return $year
    }

    &find_year ($title);
    I would have thought that &find_year ($title); would take the title and apply the subroutine to this string, picking out the 1989?

      You call find_year(), but you never print its output. For testing, you could start with:

      for my $title (
          "Let's party like it's 1999",
          "If 6 was 9",
          "If 6 was 9",
          "Summer of 69",
          "Disco 2000",
      ) {
          print $title , " => ", find_year($title), "\n";
      }
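      Note that find_year() returns undef for titles without a matching year, so if warnings are enabled the print in the test loop reports "Use of uninitialized value" for those titles. A minimal sketch of one way to handle that, assuming Perl 5.10 or later for the defined-or operator (//); the '(no year found)' placeholder text is an arbitrary choice:

```perl
use strict;
use warnings;

sub find_year {
    my( $str ) = @_;
    my $year;
    $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
    return $year
}

for my $title (
    "Let's party like it's 1999",
    "If 6 was 9",
    "Summer of 69",
    "Disco 2000",
) {
    # Fall back to a placeholder when find_year() returns undef.
    my $year = find_year($title) // '(no year found)';
    print "$title => $year\n";
}
```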
        I can see this prints each title, and the => part shows the year being extracted (not sure that's the right word) from the title.
        sub find_year {
            my( $str ) = @_;
            my $year;
            $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
            return $year
        }

        for my $title (
            "Let's party like it's 1999",
            "If 6 was 9",
            "If 6 was 9",
            "Summer of 69",
            "Disco 2000",
        ) {
            print $title , " => ", find_year($title), "\n";
        }
        One thing I'm not sure about: I've tried to integrate the code above so that I can use the $title from a CSV file rather than titles typed into the script (as in the example above). My attempt is below, but it's not working. I'd appreciate it if someone could tell me exactly what I'm doing wrong. I know I'm being prompted to learn, which I'm happy to do, but I think I would learn faster if someone told me what I have done wrong in the piece of code below.
        sub find_year {
            my( $str ) = @_;
            my $year;
            $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
            return $year
        }

        #get the title from csv3
        open CSV3, "<csv3" or die;
        while (<CSV3>) {
            chomp;
            my ($title) = $_ =~ /^.+?,\s*([^,]+?),/;    #/ match the title
        }

        for my $title {
            print $title , " => ", find_year($title), "\n";
        }
      The code as posted doesn't contain an (uncommented) print or other output statement - so, what should it print?
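      Two things stand out in the attempts in this thread. In the first, $title is declared with my inside the while loop, so that variable no longer exists when find_year($title) is called after the loop, and the return value is discarded anyway. In the last, for my $title { ... } has no list to iterate over, so it does not compile. A minimal sketch of one way to combine the pieces is to call find_year() inside the loop, while $title is still in scope. The sample rows, the $sample and @found variables, and the placeholder text are invented for illustration, since the real layout of csv3 is not shown in the thread. The rows are read from an in-memory filehandle so the sketch runs on its own.

```perl
use strict;
use warnings;

sub find_year {
    my( $str ) = @_;
    my $year;
    $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
    return $year
}

# Hypothetical sample data standing in for the csv3 file.
my $sample = <<'END';
1, Batman 1989 special edition, dvd
2, If 6 was 9, cd
END

# To read the real file instead:
#   open my $fh, '<', 'csv3' or die "Cannot open csv3: $!";
open my $fh, '<', \$sample or die "Cannot open sample data: $!";

my @found;    # collected [ title, year ] pairs
while ( my $line = <$fh> ) {
    chomp $line;
    # Same title pattern as in the thread: the second comma-separated field.
    my ($title) = $line =~ /^.+?,\s*([^,]+?),/;
    next unless defined $title;       # skip lines the pattern does not match

    my $year = find_year($title);     # $title is still in scope here
    push @found, [ $title, $year ];
    print "$title => ", ( defined $year ? $year : '(no year found)' ), "\n";
}
close $fh;
```

      Using a lexical filehandle and three-argument open is a stylistic choice here; the bareword CSV3 handle from the thread would work in the same loop.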
