in reply to data management script

I have to agree with the comments that chargrill made regarding the layout of your code. I have looked at your reply to his post and, although the new layout is somewhat improved, I still find it difficult to read. I don't know if problems are being introduced in the process of posting this code or whether it looks like this in reality. For instance, you have a foreach loop that is indented by two levels (four spaces each time, that's good) yet it has a closing curly brace that is indented with nine TAB characters; that's manic.

Enough about the layout, let's get to the logic. I think that you would do much better to separate the reading of the LINK file from the reading and modification of the SQL file. Read the LINK file first, constructing a hash table with the titles as the keys and the corresponding URLs as the values then close that file. After that go on to read the SQL file in a separate loop, modifying it and writing a new SQL file (both modified and unmodified lines) as you go.

TIPS

Always do

use strict; use warnings;

at the top of your scripts to enforce some discipline and to catch typo slips.

Comment your code generously and help the comments to stand out by preceding them with a blank line.

Try using the three argument form of open to avoid problems like writing to a file that should only have been read. You correctly tested for the success of your opens but it would be useful to include the operating system error, $!, in the die message.

I would try to avoid all of the messing about transforming the title and URL from the LINK file by crafting a better regular expression to pull out just what you want. I would also leave off including the quote marks until the output stage. The following code illustrates the open and the regex, capturing the information in a hash reference

my $linkFile = q{linksource.dat}; open my $linkFH, q{<}, $linkFile or die qq{open: $linkFile: $!\n}; my $rhTitleToURL = {}; LINKLINE: while (<$linkFH>) { next LINKLINE unless m {(?x) title\s*=\s*" ([^"]+) .*? href\s*=\s*" ([^"]+) }; my ($title, $url) = ($1, $2); $rhTitleToURL->{$title} = $url; } close $linkFH or die qq{close: $linkFile: $!\n};

You are wasting effort when extracting your @fields from the SQL file. Firstly, split acts by default on $_ so you don't have to specify it, @fields = split /,/; would do. However you then have to waste time stripping leading and trailing spaces from each field. The first argument to split is a regular expression so if you include the spaces around the comma in the regex, they won't get into the fields in the first place. Like this

my @fields = split m{\s*,\s*};

The modification of the SQL file probably ought to go like this:-

open old SQL file for reading open new SQL file for writing while loop reading old SQL line by line split line into fields if this is a line to be modified modify fields construct new line print to new SQL file else re-construct old line print to new SQL file close files

I hope you find this useful and I hope you take the criticisms of your code layout in the constructive spirit they are intended.

Cheers,

JohnGG

Update: Corrected indentation in code snippet caused by inadvertant TABs.