First, let's start with what you did right:
- Provided sample data.
- use strict; use warnings - very good.
Now for things to clean this up:
- Declare all your variables: @fields is declared, but nothing else is.
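- open - you might want to take a look at perlopentut if you haven't seen that already. You should tell Perl explicitly whether the file is for input or output, and check that the open succeeded, like this: open(INPUTFILE, '<', "some_filename") or die "Cannot open some_filename: $!"; open(OUTPUTFILE, '>', "some_other_filename") or die "Cannot open some_other_filename: $!";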
- Going by your comments and file names, it looks like you have mixed up your input and output sources: you are chomp'ing data from your output file and printing data from the input file.
- You don't actually need to read the file in line by line if you choose to use for (<DATAFILE>) - in list context Perl will slurp in the entire file and produce a list with one element for each line in your input file - so you can get rid of that first while loop for now. However, when you get more advanced, it is usually more memory-efficient and scalable to read files in line by line using a while loop, since only one line is held in memory at a time.
- Formatting at PM: You can preserve the indenting of your code sample by surrounding it in <code> tags. Also you don't need <body> tags for Perl Monks posts.
- Formatting in your source code (just in case your source code also has all lines flush left - if not, ignore this). Normally we indent a group of statements inside each curly bracket pair so we can see easily which statements are inside the loop and which are outside.
- Checking for undefined variables: in your sample data, some of your records are missing the call number field. Data files don't always have all the data you expect, so it is usually a good idea to check first to see if any values are in fact undefined before using them.
- HTML generation cleanup: some of your tags don't match, e.g. isbn/isbnr
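To make the open advice above concrete, here is a minimal sketch of opening one file for reading and another for writing, with the chomp on the input side and the print on the output side. The file names and the <line> tag are made up for illustration, and I've used lexical filehandles (my $in) rather than bareword ones like INPUTFILE; that is my own preference, and both styles work. The sketch creates its own small input file in a scratch directory so it can run by itself.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Hypothetical file names in a scratch directory, for illustration only.
my $dir      = tempdir(CLEANUP => 1);
my $in_name  = "$dir/books.txt";
my $out_name = "$dir/books.xml";

# Create a small tab-separated input file so the sketch runs on its own.
open(my $seed, '>', $in_name) or die "Cannot write $in_name: $!";
print {$seed} "0123456789\t12345\tSome Title\tSome Author\tQA76\n";
close($seed) or die "Cannot close $in_name: $!";

# '<' opens for reading; '>' opens for writing (truncating any old content).
open(my $in,  '<', $in_name)  or die "Cannot read $in_name: $!";
open(my $out, '>', $out_name) or die "Cannot write $out_name: $!";

# chomp what you read from the input; print to the output.
while (my $line = <$in>) {
    chomp $line;
    print {$out} "<line>$line</line>\n";
}

close($in);
close($out) or die "Cannot close $out_name: $!";
```

If either open fails, the die message includes $!, which tells you why (missing file, permissions, and so on).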
Here is an example of your for loop cleaned up a bit.
# you can use my to declare things even in the middle of a
# statement!
# the for loop slurps in all the lines in DATAFILE,
# provided you have opened it for reading.
foreach my $line (<DATAFILE>)
{
    # strip the trailing newline so it doesn't end up in the last field
    chomp($line);
    $line =~ s/&+/&/;

    # you can also use my (...) to declare many variables
    # at once
    my ($isbn, $ocln, $title, $author, $call_number)
        = split(/\t/, $line);

    # you'll likely have different defaults for cases where
    # fields are undefined
    $isbn        = '' unless defined($isbn);
    $ocln        = '' unless defined($ocln);
    $title       = '' unless defined($title);
    $author      = '' unless defined($author);
    $call_number = '' unless defined($call_number);

    print NEWFILE "<match>\n";
    print NEWFILE "<title>";
    print NEWFILE $title, "</title>\n";
    print NEWFILE "<isbn>";
    print NEWFILE $isbn, "</isbn>\n";   # was "</isbnr>": typo?
    print NEWFILE "<call_number>";
    print NEWFILE $call_number, "</call_number>\n";
    print NEWFILE "</match>\n";
}
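For when you want the line-by-line while loop mentioned above, here is a rough sketch of the same idea. The helper name record_to_xml and the sample records are my own inventions for illustration, and the data is read from an in-memory filehandle so the sketch runs without any files; in your script you would open your real input file instead.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one tab-separated record into a <match> element.
sub record_to_xml {
    my ($line) = @_;
    chomp $line;

    # A short record leaves the trailing variables undefined.
    my ($isbn, $ocln, $title, $author, $call_number) = split(/\t/, $line);

    # foreach aliases each variable, so this fills in the missing ones.
    for my $field ($isbn, $ocln, $title, $author, $call_number) {
        $field = '' unless defined $field;
    }

    return "<match>\n"
         . "<title>$title</title>\n"
         . "<isbn>$isbn</isbn>\n"
         . "<call_number>$call_number</call_number>\n"
         . "</match>\n";
}

# Made-up sample data; the second record has no call number.
my $data = "0123456789\t12345\tSome Title\tSome Author\tQA76\n"
         . "9876543210\t67890\tOther Title\tOther Author\n";

# Reading from a string reference stands in for a real file here.
open(my $in, '<', \$data) or die "Cannot open in-memory data: $!";

# Only one line is held in memory at a time.
my $xml = '';
while (my $line = <$in>) {
    $xml .= record_to_xml($line);
}
close($in);

print $xml;
```

Putting the per-record work in a sub keeps the loop short and lets you test the formatting on a single line without touching any files.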
Best, beth