in reply to data management script
Enough about the layout, let's get to the logic. I think that you would do much better to separate the reading of the LINK file from the reading and modification of the SQL file. Read the LINK file first, constructing a hash table with the titles as the keys and the corresponding URLs as the values then close that file. After that go on to read the SQL file in a separate loop, modifying it and writing a new SQL file (both modified and unmodified lines) as you go.
TIPS
Always do
use strict; use warnings;
at the top of your scripts to enforce some discipline and to catch typo slips.
Comment your code generously and help the comments to stand out by preceding them with a blank line.
Try using the three argument form of open to avoid problems like writing to a file that should only have been read. You correctly tested for the success of your opens but it would be useful to include the operating system error, $!, in the die message.
I would try to avoid all of the messing about transforming the title and URL from the LINK file by crafting a better regular expression to pull out just what you want. I would also leave off including the quote marks until the output stage. The following code illustrates the open and the regex, capturing the information in a hash reference
my $linkFile = q{linksource.dat}; open my $linkFH, q{<}, $linkFile or die qq{open: $linkFile: $!\n}; my $rhTitleToURL = {}; LINKLINE: while (<$linkFH>) { next LINKLINE unless m {(?x) title\s*=\s*" ([^"]+) .*? href\s*=\s*" ([^"]+) }; my ($title, $url) = ($1, $2); $rhTitleToURL->{$title} = $url; } close $linkFH or die qq{close: $linkFile: $!\n};
You are wasting effort when extracting your @fields from the SQL file. Firstly, split acts by default on $_ so you don't have to specify it, @fields = split /,/; would do. However you then have to waste time stripping leading and trailing spaces from each field. The first argument to split is a regular expression so if you include the spaces around the comma in the regex, they won't get into the fields in the first place. Like this
my @fields = split m{\s*,\s*};
The modification of the SQL file probably ought to go like this:-
I hope you find this useful and I hope you take the criticisms of your code layout in the constructive spirit they are intended.
Cheers,
JohnGG
Update: Corrected indentation in code snippet caused by inadvertant TABs.
|
|---|