Re: Misprocessed Read From Files?

#Open file matching ex.1;
open (C, "<dic.txt") || die "dictionary";
#open file to write to;
open (B, ">>all.txt") || die "output";

You should at least include the file name and the $! variable in the error message so you know why open failed. Your program should start with the two lines:

use warnings;
use strict;
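
For the open calls, here is a minimal sketch of what such an error message can look like. The path is made up and is expected not to exist, and the eval is only there so the example keeps running after die:

```perl
use warnings;
use strict;

# Hypothetical path, chosen so that open is expected to fail.
my $name = 'no-such-dir/dic.txt';

# eval traps the die so the message can be inspected afterwards.
my $opened = eval {
    open my $fh, '<', $name or die "Cannot open '$name': $!\n";
    close $fh;
    1;
};

# $@ now holds the file name plus the system's reason for the failure.
my $error = $opened ? '' : $@;
print $error;
```

The low-precedence "or" matters here: with "||" and no parentheses around the arguments to open, the die would attach to the file name instead of to the open call.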

@firstgrouping = split(/, |,\s|,\t|,|\s,|\t,| ,/, $line2);

The \s character class includes both the " " and the "\t" characters so that regular expression could be simplified to: /\s?,\s?/
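
A quick comparison of the two patterns on a made-up line (the sample string is my assumption, not data from your files):

```perl
use warnings;
use strict;

# Made-up sample line with inconsistent spacing around the commas.
my $line = "alpha ,beta,\tgamma , delta";

# Alternation as quoted above: on " , " it stops after " ," and
# leaves the following space attached to the next field.
my @long = split /, |,\s|,\t|,|\s,|\t,| ,/, $line;

# Simplified pattern: optional whitespace, a comma, optional whitespace.
my @short = split /\s?,\s?/, $line;

print join( '|', @long ),  "\n";    # alpha|beta|gamma| delta
print join( '|', @short ), "\n";    # alpha|beta|gamma|delta
```

So the short pattern is not just shorter: it also removes one whitespace character on each side of the comma in a single match.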

@actualsyll = split(/\d |\d\s|\d\t|\d|\s\d| \d|\t\d|\t\d\t|\s\d\s| \d /, $firstgrouping[2]);

And that regular expression could be simplified to: /\s?\d\s?/
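
The same kind of check for the digit pattern, again on a made-up string:

```perl
use warnings;
use strict;

# Made-up string: one digit surrounded by spaces, one digit without any.
my $text = "ab 1 cd2ef";

# Optional whitespace, a single digit, optional whitespace.
my @parts = split /\s?\d\s?/, $text;

print join( '|', @parts ), "\n";    # ab|cd|ef
```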

open (A, "<$file") || die "files";

Again, you should include the file name and the $! variable in the error message so you know why open failed.

$line1 =~ s/^ |^ |^\s\s\s|\s{3,4}//;

That regular expression could be simplified to /^ {4}|^\s{3}|\s{3,4}/

chomp;

With no argument, chomp works on the $_ variable, but nothing in your loop uses $_. The line you read is in $line1, so that is the variable that should be chomped.
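
A small sketch of the difference, using made-up values for both variables:

```perl
use warnings;
use strict;

# Made-up data: $_ and $line1 both end in a newline.
$_ = "unrelated\n";
my $line1 = "12 some text\n";

chomp;    # no argument: removes the newline from $_ only

my $before = $line1 =~ /\n\z/ ? 'still has newline' : 'chomped';

chomp $line1;    # names the variable that actually holds the line

my $after = $line1 =~ /\n\z/ ? 'still has newline' : 'chomped';

print "$before, then $after\n";
```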

foreach ($line1 =~ /^\d/g) {

$line1 =~ /^\d/g returns a list of the matches in $line1, and the loop stores each match in the $_ variable in turn. However, the pattern /^\d/ can match only once because it is anchored at the beginning of the line, so the loop body runs at most once. So perhaps that line should be:

if ( $line1 =~ /^\d/ ) {
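
To see why the loop never runs more than once, here is a short check on a made-up line with several digits in it:

```perl
use warnings;
use strict;

# Made-up line that contains several digits.
my $line1 = "123 abc 456";

# Anchored: even with /g, only the first character can match.
my @anchored = $line1 =~ /^\d/g;

# Unanchored, for comparison: every digit is returned.
my @all = $line1 =~ /\d/g;

print scalar(@anchored), " vs ", scalar(@all), "\n";    # 1 vs 6

# A plain boolean test expresses the same check without a loop.
my $starts_with_digit = $line1 =~ /^\d/ ? 1 : 0;
```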

if ($line1 =~ /\d\s\w|\d\s{1,2}\d|\d\s\s\d|\d \d|;\s\w/gi) {

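That regular expression could be simplified to /[\d;]\s\w|\d\s{2}\d/. The /g and /i options can go as well: the pattern contains no literal letters, and a single true/false test does not need /g.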
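
A sketch that runs the simplified pattern over a few made-up strings, to show which alternative catches what:

```perl
use warnings;
use strict;

my $re = qr/[\d;]\s\w|\d\s{2}\d/;

# Made-up test strings, in the order they are reported.
my @samples = ( "7 word", "; word", "7  8", "7 8", "word only" );

my %matched;
for my $sample (@samples) {
    $matched{$sample} = $sample =~ $re ? 1 : 0;
}

# "7 8" is caught by the first alternative because \w also matches digits;
# "7  8" (two spaces) needs the second alternative.
print "'$_' => $matched{$_}\n" for @samples;
```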

$line1 =~ s/\s| |\s\s| |\s{2}/\t/g;

That regular expression could be simplified to /\s/

$line1 =~ s/\t\t|\t{2}/\t/;

That regular expression could be simplified to /\t\t/
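
Here is how those two substitutions behave together on a made-up line. Note that the second one has no /g, so it collapses only the first pair of tabs; whether that is what you want depends on your data, which I am only guessing at:

```perl
use warnings;
use strict;

# Made-up line with two runs of double spaces.
my $line1 = "a  b  c";

$line1 =~ s/\s/\t/g;    # every whitespace character becomes a tab
my $after_first = $line1;     # "a\t\tb\t\tc"

$line1 =~ s/\t\t/\t/;   # no /g: only the first pair of tabs is collapsed
my $after_second = $line1;    # "a\tb\t\tc"

( my $shown = $after_second ) =~ s/\t/<TAB>/g;
print "$shown\n";
```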

$spoke =~ s/\s{1,}$|\t{1,}$//g;

The /g option is extraneous because the pattern is anchored at the end of the line and will only match once. That regular expression could be simplified to /\s+$/
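
A quick check that the two forms give the same result, using a made-up value with mixed trailing whitespace:

```perl
use warnings;
use strict;

# Made-up value ending in a space, a tab, a space and a newline.
my $spoke = "word \t \n";

# Original form, applied to a copy.
( my $long = $spoke ) =~ s/\s{1,}$|\t{1,}$//g;

# Simplified form, applied to another copy.
( my $short = $spoke ) =~ s/\s+$//;

print $long eq $short ? "same result\n" : "different result\n";
```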

}}}}
close A;

You are closing the A filehandle outside of the foreach loop, which is OK because perl will automatically close it every time it opens it again.

So, removing all the unneeded variables and adding indentation, your code can be simplified to:

#!/usr/bin/perl
use warnings;
use strict;

# Open file matching ex.1
open C, '<', 'dic.txt' or die "dic.txt: $!";
# open file to write to
open B, '>>', 'all.txt' or die "all.txt: $!";
# Making a loop of all lines in example 1 file
while ( my $line2 = <C> ) {
    # Getting rid of the newline
    chomp $line2;
    # Split all lines
    my $firstgrouping = ( split /\s?,\s?/, $line2 )[ 2 ];
    # splitting the lines in $firstgrouping[2] by the numbers so that
    # text before and after number are different indexed scalars
    my @actualsyll = split /\s?\d\s?/, $firstgrouping;
    # Printing the new version of @firstgrouping[2]
    print B "@actualsyll\n";
    }
close C;

# Loop gets all files matching ex. 2 opens them
my @array3;
for my $file ( <s*.words> ) {
    open A, '<', $file or die "$file: $!";
    # Making a loop of all lines in each file
    while ( my $line1 = <A> ) {
        # There are headers with information I do not need so this
        # essentially cuts them out
        $line1 =~ s/^ {4}|^\s{3}|\s{3,4}//;
        # Chomping of the newline
        chomp $line1;
        next unless $line1 =~ /^\d|[\d;]\s\w|\d\s{2}\d/;
        $line1 =~ s/\s/\t/g;
        $line1 =~ s/\t\t/\t/;
        my $orth = ( split / <|>;|\t/, $line1 )[ 2 ];
        # Getting rid of some additional extraneous material
        $orth  =~ s/;//;
        push @array3, $orth;
        }
    close A;
    }

# Making a loop of each array created above
for my $shift3 ( @array3 ) {
    # Prints out the $orth word of each line on its own line
    # (used mostly as a debugger right now)
    print B "$shift3\n";
    }
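
One thing to watch with the ( split ... )[ 2 ] slices: a line with fewer than three fields yields undef, and use warnings will then report an uninitialized value. A minimal sketch with made-up lines that simply skips such lines; whether skipping is right for your data is an assumption on my part:

```perl
use warnings;
use strict;

# Made-up input: the second line has only two comma-separated fields.
my @lines = ( "a, b, c1d", "too, short" );

my @kept;
for my $line2 (@lines) {
    my $third = ( split /\s?,\s?/, $line2 )[ 2 ];

    # Out-of-range slice index gives undef, so skip lines without a third field.
    next unless defined $third;

    push @kept, join ' ', split /\s?\d\s?/, $third;
}

print "$_\n" for @kept;
```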