Dear Perl Monks,

It seems like I've been a novice for some time. I use Perl in spurts and forget a lot of what I have just learned. I really want to get better at this, so I'm always asking for best practices; most of what I use Perl for is building small utilities to help me as a database administrator.

On that note, I need your help with my Perl script, especially its regular expression. I have an input file (FileList.txt) that is fairly static and contains only a few lines. Each entry is a file name with its full directory path. The first two files listed each contain another file name as their content, and their own names do not have a school name embedded, unlike the others.

Ex. (FileList.txt)

 
/home/test/abc/.date_run_dir
/home/test/def/.date_run_dir
/home/test/abc/.date_file_sent.email@wolverine.cole.edu
/home/test/abc/.date_file_sent.dp3.drew.net
/home/test/def/.date_file_sent.email@wolverine.cole.edu
/home/test/def/.date_file_sent.dp3.drew.net

For each file listed, I want to extract the type (abc or def) into a variable. I also want to extract the school name (cole or drew) into a variable. If there is no school name, as in the first two files, the value should default to null.

Also, for each file name listed, I want the contents of that file in a variable. Each file holds a single value. Nothing complex.
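The extraction rule described above can be sketched as follows. This is a minimal sketch, not a definitive answer: `parse_entry` is a hypothetical helper name, and it assumes that every path starts with /home/test/ and that the school is always the label just before the final domain suffix (the "cole" in ".cole.edu"). A name with no such label, like .date_run_dir, falls back to the string null.

```perl
use strict;
use warnings;

# Hypothetical helper: split one FileList.txt entry into (type, school).
# Returns an empty list when the path does not look like /home/test/<type>/<name>.
sub parse_entry {
    my ($path) = @_;

    # Type is the directory directly under /home/test/; $name is the file name.
    my ($type, $name) = $path =~ m{^/home/test/(\w+)/([^/]+)$}
        or return;

    # Assumption: the school is the label before the last dot-separated suffix,
    # preceded by either "." or "@". Names without it leave $school undefined.
    my ($school) = $name =~ m{[.@](\w+)\.\w+$};
    $school = 'null' unless defined $school;

    return ($type, $school);
}

for my $path (qw(
    /home/test/abc/.date_run_dir
    /home/test/abc/.date_file_sent.email@wolverine.cole.edu
    /home/test/def/.date_file_sent.dp3.drew.net
)) {
    my ($type, $school) = parse_entry($path) or next;
    print "Type:$type:School:$school\n";
}
```

Matching the school against the file name only, rather than the whole path, keeps a dot in a directory name from being mistaken for part of the domain.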

Script thus far:

use strict;
#use warnings;

my $file = '/home/test/FileList.txt';
open my $FILE, '<', $file or die "unable to open '$file' for reading: $!";
while (my $line = <$FILE>) {
    chomp($line);
    #if ($line =~ m#home/test/(\w{3}).*[.](\w+)[.].*#) {
    if ($line =~ m#home/test/(\w{3}).*[.](\w+)[.]?.*#) {
        #print "$line\n";  .last_file_sent*
        # Report the path actually being opened ($line), not the list file.
        open my $file2, '<', $line or die "unable to open '$line' for reading: $!";
        while (my $line2 = <$file2>) {
            print "Type:$1:School:$2:File:$line2";
            #print "$line2";
        }
        close $file2;
    }
} #end while
close $FILE;

Output:

(Note: the regex is capturing edu or net, which is not what I want. It is also capturing date_run_dir; when there is no school name in the file name, the value should default to null.)

Type:abc:School:date_run_dir:File:/product/classroom/subject/data/sysfeed_abc_2010120810.ext3
Type:def:School:date_run_dir:File:/product/classroom/subject/data/sysfeed_def_2010120806.ext3
Type:abc:School:edu:File:domain_abc.dat.2010120810.ext3
Type:abc:School:net:File:domain_abc.dat.2010120810.ext3
Type:def:School:edu:File:domain_def.dat.2010120805.ext3
Type:def:School:net:File:domain_def.dat.2010120804.ext3
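Since each listed file holds a single value, reading it does not need an inner loop. Here is a small sketch of that step, assuming the value sits on the first line; `read_single_value` is a hypothetical helper name, and the temporary file only stands in for one of the real files so the example runs on its own.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: return the first line of a file without its newline,
# or undef when the file is empty.
sub read_single_value {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "unable to open '$path' for reading: $!";
    my $value = <$fh>;
    close $fh;
    chomp $value if defined $value;
    return $value;
}

# Stand-in for one of the listed files, holding one value as in the output above.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
print {$tmp_fh} "domain_abc.dat.2010120810.ext3\n";
close $tmp_fh;

my $contents = read_single_value($tmp_name);
print "File:$contents\n";
```

Copying the regex captures into named lexical variables before calling a helper like this is safer than relying on $1 and $2, because any later successful match overwrites them.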

Thanks for your help.

In reply to Help constructing proper regex to extract values from filenames and concurrently opening those same files to access records by JaeDre619
