http://qs1969.pair.com?node_id=1166663


in reply to Re^2: Addional "year" matching functionality in word matching script
in thread Addional "year" matching functionality in word matching script

Yes, I chose $year_left as the name for the variable holding the year number on the left side of the comparison, and $year_right for the right side of the comparison.

You can check whether a string contains a four-digit year with code like the following, for example:

    my $str = 'this is some text 1989 blah';
    my $year;
    $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );

The regular expression checks whether the string contains at least one standalone four-digit number that starts with 19 or 20 (the \b anchors keep it from matching inside a longer number or word), and sets $year to the first such number. If nothing matches, $year stays undefined. You could put that code into a subroutine as follows to allow for easy reuse:

    sub find_year {
        my( $str ) = @_;
        my $year;
        $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
        return $year
    }
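To get a feel for what the regular expression accepts and rejects, a small test loop can help. This is only a sketch: the test strings are made up for illustration, and the sub is repeated so the snippet runs on its own.

```perl
use strict;
use warnings;

# Same sub as above: returns the first 19xx/20xx number, or undef if none.
sub find_year {
    my( $str ) = @_;
    my $year;
    $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
    return $year
}

# Made-up test strings, chosen to exercise the \b word boundaries.
for my $str (
    'from 1999 to 2005',   # two candidates, the first one wins
    'built in 1850',       # four digits, but not 19xx or 20xx
    'part no. 120199',     # 2019 is buried inside a longer number
    'file_1999.txt',       # "_" is a word character, so no \b before the 1
) {
    my $year = find_year($str);
    printf "%-20s => %s\n", $str, defined $year ? $year : '(no year)';
}
```

Only the first string yields a year. The last case is worth remembering: because the underscore counts as a word character, a year glued to one is not found.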

Re^4: Addional "year" matching functionality in word matching script
by bms9nmh (Novice) on Jun 27, 2016 at 16:13 UTC
    OK, I've been trying to break things down with the info you have provided, initially with a simpler problem so I can work out what's going on. I'm trying to print the year from the title in a file called csv3 (which contains 1989 in the title) using the following code, but it isn't printing anything. What am I doing wrong?
        #!/bin/perl
        open CSV3, "<csv3" or die;
        while (<CSV3>) {
            chomp;
            my ($title) = $_ =~ /^.+?,\s*([^,]+?),/; #/ match the title
            my %words;
            $words{$_}++ for split /\s+/, $title; #/ get words
            ## Collect unique words
            # my @titlewords = keys(%words);
            my @titlewords = keys(%words);
            #print "$title"
        }
        sub find_year {
            my( $str ) = @_;
            my $year;
            $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
            return $year
        }
        &find_year ($title);
    I would have thought that &find_year ($title); would take the title and apply the subroutine to that string, picking out the 1989?

      You call find_year(), but you never print its output. For testing, you could start with:

          for my $title (
              "Let's party like it's 1999",
              "If 6 was 9",
              "If 6 was 9",
              "Summer of 69",
              "Disco 2000",
          ) {
              print $title , " => ", find_year($title), "\n";
          }
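One thing to watch for when printing the result: find_year() returns undef for titles without a year, and printing undef triggers an "uninitialized value" warning under "use warnings". A minimal sketch of one way to handle that, assuming Perl 5.10 or newer for the defined-or operator; the helper name describe_title is hypothetical and not part of the thread's code:

```perl
use strict;
use warnings;

sub find_year {
    my( $str ) = @_;
    my $year;
    $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
    return $year
}

# Hypothetical helper: builds the output line, substituting a
# placeholder when find_year() returns undef.
sub describe_title {
    my( $title ) = @_;
    my $year = find_year($title) // 'no year found';   # "//" needs Perl 5.10+
    return "$title => $year\n";
}

print describe_title($_)
    for "Let's party like it's 1999", "Summer of 69", "Disco 2000";
```

On older Perls, the same effect can be had with defined $year ? $year : 'no year found'.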
        I can see that this prints each title, and the part after => shows the year being extracted (not sure that's the right word) from the title.
            sub find_year {
                my( $str ) = @_;
                my $year;
                $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
                return $year
            }
            for my $title (
                "Let's party like it's 1999",
                "If 6 was 9",
                "If 6 was 9",
                "Summer of 69",
                "Disco 2000",
            ) {
                print $title , " => ", find_year($title), "\n";
            }
        One thing I'm not sure of: I've tried to integrate the info above so that I can use the $title from a CSV file rather than titles typed into the script (as in the above example). I have attempted this below, but it's not working. I'd appreciate it if someone could tell me exactly what I'm doing wrong. I know I'm being prompted to learn, which I'm happy to do, but I think I would learn faster if someone told me what I have done wrong in the piece of code below.
            sub find_year {
                my( $str ) = @_;
                my $year;
                $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
                return $year
            }
            #get the title from csv3
            open CSV3, "<csv3" or die;
            while (<CSV3>) {
                chomp;
                my ($title) = $_ =~ /^.+?,\s*([^,]+?),/; #/ match the title
            }
            for my $title {
                print $title , " => ", find_year($title), "\n";
            }
      The code as posted doesn't contain an (uncommented) print or other output statement - so, what should it print?
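A variable declared with my inside the while block only exists inside that block, so anything that uses $title, including the call to find_year() and the print, has to happen inside the loop. Here is a minimal sketch of that arrangement. The sample rows are made up and read from a string so the snippet runs on its own, and the sub name title_years is hypothetical; for the real file you would open csv3 instead.

```perl
use strict;
use warnings;

sub find_year {
    my( $str ) = @_;
    my $year;
    $year = $1 if( $str =~ /\b((?:19|20)\d\d)\b/ );
    return $year
}

# Hypothetical helper: reads lines from a filehandle and returns
# a list of [ title, year ] pairs (year is undef if none was found).
sub title_years {
    my( $fh ) = @_;
    my @pairs;
    while ( my $line = <$fh> ) {
        chomp $line;
        my ($title) = $line =~ /^.+?,\s*([^,]+?),/;   # same title regex as in the thread
        next unless defined $title;                   # skip lines the regex cannot parse
        push @pairs, [ $title, find_year($title) ];   # $title is still in scope here
    }
    return @pairs;
}

# Made-up sample rows standing in for the real csv3 file.
my $sample = "1, Summer of 1989, some author\n2, If 6 was 9, another author\n";

# For the real file: open my $fh, '<', 'csv3' or die "Cannot open csv3: $!";
open my $fh, '<', \$sample or die "Cannot open sample data: $!";

for my $pair ( title_years($fh) ) {
    my( $title, $year ) = @$pair;
    print $title, " => ", ( defined $year ? $year : 'no year found' ), "\n";
}
close $fh;
```

Note that the title regex expects at least three comma-separated fields, since it needs a comma after the title.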