I have hundreds of files in subdirectories. I need to go into each file in turn, create a backup copy of each original file, change permissions on the original file, then pattern search each file for two specific lines. The first line always begins with <IMB SRC and ends with BR>, the second line always begins with Figure followed by a space and one or more numbers. There could be any kind of character in the lines in between these two liness, spaces, ", (), /\<>., etc. I want to throw out everything between the two lines I need. After I find the two lines I need, I want to put them in a hash, with the Figure line as the key and the <IMG SRC line as the value. Then I want to search the entire file for any other occurrances of the matching Figure line that exist without matching <IMG SRC lines. At that point I will hopefully create a link from the hash value to that found Figure line. After one file is completed, it should proceed to the next file and start again, running through the 600 or so files. When complete each "Figure digit" will be linked to the correct <IMG SRC line.
I have the first part working. I am drilling down to the correct directories, making the backup copies and changing the permissions. I get no errors when I run this; however, it doesn't find anything. At least, nothing prints. I'm a beginner at this, and have been through the mountain of books I bought to help me, I'm still stuck. Any and all help would be more than appreciated. Thank you.
This is an excerpt from one file, showing the lines I need to get with the pattern search:
<IMG SRC="/CSS/tpubs_graphics/L-/5/7/L-57174.00000001.gif">
1. Check Valve
(whitespace) 2. Check Valve (Altair)
(whitespace) #82-22-02
(whitespace) Fuel Control Water Signal Check Valve REMOVAL-01
(whitespace) Figure 301 Page 302
# ARGUMENTS: engine_figurelinks.pl xx_manual_vvv
# where xx = manual code, vvv = version
#
# MODIFICATIONS:
#
#---------------------------------------------------------------------
+-
use warnings;
use diagnostics;
use Env qw(SERVER_NAME);
use CGI qw(:standard :netscape);
use File::Copy; # Perl supplied module for making copies
new CGI;
#---------------------------------------------------------------------
+-
($manualdir_param) = @ARGV;
$working_dir = $manualdir_param;
$working_dir =~ s/manualdir=//i;
$data_area = "/tmp";
$html_dir = "$data_area/$working_dir";
#---------------------------------------------------------------------
+-
# Loop to locate HTML files, change permissions, and make working temp
+orary copies
opendir( HTMLSTORIES, "$html_dir") || die "HTML dirs do not exist: $1"
+;
@FigureArray = grep{/^(09)(\w{1,5})(00)$/} readdir ( HTMLSTORI
+ES );
foreach $FigFile (@FigureArray) {
opendir( HTMSTORY, "$html_dir/$FigFile" ) || die "File
+s do not exist: $1";
@FileArray = grep{/a.htm$/} readdir ( HTMSTORY
+ );
foreach $DirFile (@FileArray) {
copy ("$html_dir/$FigFile/$DirFile", "
+$html_dir/$FigFile/$DirFile.bak") or die "Can not make backup copy of
+ file: $1";
chmod 0600, "$html_dir/$FigFile/$DirFi
+le";
while (< "$html_dir/$FigFile/$DirFile" >) {
$Figures = "$html_dir/$FigFile/$DirFil
+e" =~ /(<IMG.*?BR>)...(Figure\d*)/i;
print $Figures;
}
}
}
closedir HTMSTORY;
closedir HTMLSTORIES;
Edited by planetscape - removed unnecessary br tags