First, please forgive the length and any odd formatting; I am a newbie and am just becoming familiar with the forum.
I was wondering if anyone might be able to help me.
I am trying to write code that ultimately will do some complex searching.
The problem I am stuck with is this:
I have two types of files: the first is a single file (example 1), while the second is a set of about 250 files (example 2).
Example 1 is about 1173 pages' worth of lines like the following:
abaci, U, ae 1 b ax 0 s ay 0, 100, 0
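To make the format concrete, here is a minimal sketch of how a line like the one above could be split. The sub name `parse_dictionary_line` is hypothetical, and the sketch assumes that fields are separated by a comma with optional whitespace and that the third field is the one to break up on digits. It is not the code from this post, which appears further down.

```perl
use strict;
use warnings;

# Hypothetical helper: split one dictionary line into its comma-separated
# fields, then split the third field on the digits it contains.
sub parse_dictionary_line {
    my ($line) = @_;
    chomp $line;

    # Assumption: a comma, with optional whitespace on either side,
    # separates the fields.
    my @fields = split /\s*,\s*/, $line;

    # The third field looks like "ae 1 b ax 0 s ay 0". Splitting on a digit
    # (plus any whitespace around it) leaves only the text between digits.
    my @syllables = split /\s*\d\s*/, $fields[2];

    return (\@fields, \@syllables);
}

my ($fields, $syllables) =
    parse_dictionary_line("abaci, U, ae 1 b ax 0 s ay 0, 100, 0\n");

# Prints: ae|b ax|s ay
print join('|', @$syllables), "\n";
```

Unlike a long alternation of separators, a single pattern with optional whitespace also trims the spaces around each piece.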
Each of the example 2 files is 20+ pages long and contains lines that look something like:
47.307796 122 <EXT-I've>; U; U
or
47.530873 122 lived; l ih v d; l ah v d
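As an illustration of this second format, the sketch below pulls the five pieces out of such a line: the time stamp, the number after it, the word, and the two transcriptions. The sub name `parse_words_line` and the hash keys are hypothetical. It assumes that data lines start with a digit once leading whitespace is removed (header lines do not) and that semicolons separate the word from the two transcriptions.

```perl
use strict;
use warnings;

# Hypothetical helper: turn one line of an example 2 file into a hash
# reference, or return nothing for header lines and malformed lines.
sub parse_words_line {
    my ($line) = @_;
    chomp $line;
    $line =~ s/^\s+//;                  # drop leading whitespace

    # Assumption: only data lines begin with a digit.
    return unless $line =~ /^\d/;

    # The first two whitespace-separated tokens are the stamp and the
    # number; everything after them stays together in $rest.
    my ($stamp, $extra, $rest) = split /\s+/, $line, 3;
    return unless defined $rest;

    # Assumption: "word; canonical form; spoken form".
    my ($orth, $canon, $spoke) = split /\s*;\s*/, $rest;

    # Remove the angle brackets around markers such as <EXT-I've>.
    $orth =~ s/^<|>$//g;

    return {
        stamp => $stamp,
        extra => $extra,
        orth  => $orth,
        canon => $canon,
        spoke => $spoke,
    };
}

my $rec = parse_words_line("47.530873 122 lived; l ih v d; l ah v d\n");
print "$rec->{orth}\t$rec->{canon}\t$rec->{spoke}\n";
```

Returning one hash reference per line keeps the fields of a line together, which may be easier to search later than several parallel arrays.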
Currently I am trying to get Perl to read all of the lines in the first file, split them, and print one column. I am also trying to do the same for the other 250+ files, using globbing. When I do either of these in isolation it works perfectly and generates exactly the results I need. However, when I combine the code (see below) I run into results such as:
Printing only as much of the first file as matches the length of the other example. For example, if I begin the code with example 1 and test with only 2 files from ex. 2 (i.e., about 40 pages of ex. 2), I get either the first 40 pages of ex. 1, or one line of the file (the first or the last) repeated over 40 pages. The reverse happens when I swap the order of the two blocks of code.
I have tried (not shown in my code below) things such as switching around loops, changing the foreach loop of the globbing to a while loop, moving around the close statements, moving "}" around, switching the order of which file should be worked with first, opening all files in the glob (including ex. 1) and then trying a sort of conditional split function, and altering the file handles and $lines to make them more distinct. Nothing I do results in better output.
So if anyone has thoughts on why this is misprocessing the information I need, I would sincerely appreciate it.
The code is:
#Open file matching ex.1;
open (C, "<dic.txt") || die "dictionary";
#open file to write to;
open (B, ">>all.txt") || die "output";
#Making a loop of all lines in example 1 file;
while ($line2 = <C>) {
    #Getting rid of the newline;
    chomp $line2;
    #Split all lines;
    @firstgrouping = split(/, |,\s|,\t|,|\s,|\t,| ,/, $line2);
    #splitting the lines in $firstgrouping[2] by the numbers so that text before and after number are different indexed scalars;
    @actualsyll = split(/\d |\d\s|\d\t|\d|\s\d| \d|\t\d|\t\d\t|\s\d\s| \d /, $firstgrouping[2]);
    #Printing the new version of @firstgrouping[2];
    print B "@actualsyll\n";
}
close C;
#Loop gets all files matching ex. 2 opens them;
foreach $file (<s*.words>) {
    open (A, "<$file") || die "files";
    #open (B, ">>awe.txt") || die "output";
    #Making a loop of all lines in each file;
    while ($line1 = <A>) {
        #There are headers with information I do not need so this essentially cuts them out;
        $line1 =~ s/^ |^ |^\s\s\s|\s{3,4}//;
        #Chomping of the newline;
        chomp;
        #Making a loop of all lines in all files from ex. 2 without their headers;
        foreach ($line1 =~ /^\d/g) {
            #Splitting the files into the numbers to the first space, the 122, the word minus extra markers, the chopped up word before the ";, the final chopped up word;
            if ($line1 =~ /\d\s\w|\d\s{1,2}\d|\d\s\s\d|\d \d|;\s\w/gi) {
                $line1 =~ s/\s| |\s\s| |\s{2}/\t/g;
                $line1 =~ s/\t\t|\t{2}/\t/;
                ($stamp,$extra,$orth,$a,$b,$c,$d,$e,$f,$g,$h,$i,$j,$k,$l,$m,$n,$o,$p,$q,$r,$s,$t,$u,$v,$w,$x,$y,$z) = split(/ <|>;|\t/, $line1);
                #splitting all of the information after the first ";" into 2 scalars;
                $split = "$a $b $c $d $e $f $g $h $i $j $k $l $m $n $o $p $q $r $s $t $u $v $w $x $y $z";
                ($canon,$spoke) = split(/; /, $split);
                #Getting rid of some additional extraneous material (i.e., unwanted spaces...);
                $orth =~ s/;//;
                $spoke =~ s/\s{1,}$|\t{1,}$//g;
                #Making an array that will bind everything together (mostly to aid in later coding not yet created);
                @general = ($file, $stamp,$extra,$orth,$canon, $spoke, $syll);
            }
            #combining all of the $orth's into a loop;
            foreach ($general[3]) {
                #Making each column into its own array;
                push(@array0, $general[0]);
                push(@array1, $general[1]);
                push(@array2, $general[2]);
                push(@array3, $general[3]);
                push(@array4, $general[4]);
                push(@array5, $general[5]);
            }
        }
    }
}
close A;
#Making a loop of each array created above;
foreach (@array0,@array1,@array2,@array3,@array4,@array5) {
    #Removing each element one at a time for later (not yet created) conditional searching of each array element;
    @shift0 = shift @array0;
    @shift1 = shift @array1;
    @shift2 = shift @array2;
    @shift3 = shift @array3;
    @shift4 = shift @array4;
    @shift5 = shift @array5;
    #Prints out the $orth word of each line on its own line (used mostly as a debugger right now);
    print B "@shift3\n";
}
Many thanks,
Napa
In reply to Misprocessed Read From Files? by Napa