denting24by7 has asked for the wisdom of the Perl Monks concerning the following question:

I am doing something wrong trying to build an array that should contain all jscript extension file names. I am very new to Perl. I have this working in bash. I know that the first array has the lines containing the jscript file names. I just can't figure out how to extract only the file names and put them in an array so I can sort the unique items.

#!/usr/bin/perl
use strict;
use warnings;

sub main {
    my @js = ();
    my @ujs =();
    my $file = 'access_log.txt';
    open(FH, $file) or die("File $file not found");
    while(my $String = <FH>) {
        if($String =~ '[^/]*\.js') {
            push(@js,($String));
            #print "$String \n";
            @ujs = grep {/[^/]*\.js/}, @js;
            # print @ujs;
        }
    }
    close(FH);
    foreach (@ujs) {
        print "$_\n";
    }
}
main();

Replies are listed 'Best First'.
Re: Building ARRAY with grep expression NOT working
by hippo (Archbishop) on Mar 31, 2020 at 21:37 UTC

    Here is your code modified in a few ways:

    • There's no need for a main() sub here so I've removed that. The script will run fine without it.
    • The @js array is no longer needed so has been removed.
    • open now uses the 3-argument form and a lexical filehandle. If it fails the failure message now says why (it could easily be some reason other than the file not being found).
    • The results are now stored in a hash since you mentioned you were after a unique list.
    • I've had to assume the format of your input data because you never showed that.
    • In lots of places you've used brackets where Perl doesn't require them, so those have been removed for clarity.
    • I've used postfix versions of unless and for to reduce the number of indented blocks.

    The rest of your variable names have been left the same. If I were writing this myself I would lowercase them all but that's purely a stylistic thing.

    #!/usr/bin/env perl
    use strict;
    use warnings;

    my %ujs;
    my $file = 'access_log.txt';
    open my $FH, '<', $file or die "File $file not opened: $!";
    while (my $String = <$FH>) {
        next unless $String =~ /([^\/]*\.js)/;
        $ujs{$1} = 1;
    }
    close $FH;
    print "$_\n" for sort keys %ujs;

      next unless $String =~ /([^\/]*\.js)/;

      You may also want to use an anchor so that ".js" only matches at the end of the string instead of anywhere in the string:

      next unless $String =~ /([^\/]*\.js)$/;
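
A small sketch of what the anchor changes. The sample strings are made up for illustration and are not taken from a real log, and the helper names `unanchored` and `anchored` are hypothetical:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical sample strings, not taken from a real log.
my @samples = (
    '/static/app.js',        # ends in .js
    '/static/app.json',      # contains ".js" but ends in .json
    '/static/app.js?v=3',    # .js followed by a query string
);

# Each helper returns the captured name, or undef when the pattern fails.
sub unanchored { my ($s) = @_; return $s =~ /([^\/]*\.js)/  ? $1 : undef }
sub anchored   { my ($s) = @_; return $s =~ /([^\/]*\.js)$/ ? $1 : undef }

for my $s (@samples) {
    printf "%-20s unanchored=%-8s anchored=%s\n",
        $s,
        unanchored($s) // '(none)',
        anchored($s)   // '(none)';
}
```

Only the first sample matches the anchored pattern, while all three match the unanchored one. Note that the anchor is relative to the string being tested: if a whole log line is matched and the requested path sits in the middle of that line, the anchored form will probably not match at all, so the path may need to be pulled out first.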

      • I've used postfix versions of unless and for to reduce the number of indented blocks.

      Which makes it harder for some (me) to read the code.

      The reason I answer is not because of that, but because of the wrong reason you state with it. It is easy to avoid indentation and still not use postfix unless and if, as you already showed on line 7. I would prefer

      #!/usr/bin/env perl
      use strict;
      use warnings;

      my %ujs;
      my $file = "access_log.txt";
      open my $fh, "<", $file or die "File $file not opened: $!\n";
      while (<$fh>) {
          m/([^\/]*\.js)$/ and $ujs{$1}++;
      }
      close $fh;
      print "$_\n" for sort keys %ujs;

      Enjoy, Have FUN! H.Merijn
        open ... or ...

        is Perl idiom and completely acceptable.

        ... and ...;

        in place of

        ... if ...;

        is a sneaky use of a logical operator and short circuit evaluation that is more appropriate to golf than production code.

        I realise that eyes adapt over time, but explaining to a neophyte how and provides conditional code needs a lot more prose than explaining as postfix if.
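
For a neophyte comparing the two spellings, here is a minimal sketch that runs both over the same made-up input lines (the data and the hash names `%with_and` and `%with_if` are assumptions for illustration). Both forms only count a name when the match succeeds:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Made-up input lines, each ending in a newline as if read from a file.
my @lines = ("lib/a.js\n", "img/logo.png\n", "lib/b.js\n", "lib/a.js\n");

my (%with_and, %with_if);

for (@lines) {
    # Short-circuit form: the right-hand side runs only if the match succeeds.
    m/([^\/]*\.js)$/ and $with_and{$1}++;
}

for (@lines) {
    # Postfix form: the condition is evaluated first, then the statement.
    $with_if{$1}++ if m/([^\/]*\.js)$/;
}

print join(",", map { "$_=$with_and{$_}" } sort keys %with_and), "\n";
print join(",", map { "$_=$with_if{$_}" }  sort keys %with_if),  "\n";
```

Both print lines should be identical, which is the point: the choice between the two is about readability, not behaviour.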

        Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
      Thanks hippo. I really appreciate it. The log file is just an Apache weblog. I was hoping to find all extensions of files requested. Any suggestions on a good starting point, as in books? I have a long way to go and I need to start parsing large logs and patch assessment logs, heavily. Thanks in advance to everyone for their help and for any more insight on learning material to start with.
Re: Building ARRAY with grep expression NOT working
by Corion (Patriarch) on Mar 31, 2020 at 20:16 UTC

    Perl is less like the shell. See grep - it doesn't take grep command line options.

    You likely want:

    @ujs = grep { m![^/]*\.js! } @js;

    Also, in Perl, it's customary to write regular expressions between // or as m!! (or any other character pair after m). Using strings as regular expressions works, but usually it's written as:

    if($String =~ /[^\/]*\.js/ )
    # or
    if($String =~ m![^/]*\.js! )
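
To make the difference from the original code concrete, here is a minimal sketch of the two forms of Perl's built-in grep. The input list is made up for illustration. The block form takes no comma after the closing brace, which is what tripped up the original code; the expression form does take a comma:

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Made-up request strings standing in for log lines.
my @js = ('GET /lib/a.js', 'GET /index.html', 'GET /lib/b.js');

# Block form: grep BLOCK LIST, with no comma between block and list.
my @block = grep { m![^/]*\.js! } @js;

# Expression form: grep EXPR, LIST, with a comma after the expression.
my @expr = grep m![^/]*\.js!, @js;

print "$_\n" for @block;
```

Both arrays end up holding the same two elements. Note that grep returns the whole matching elements, not just the captured file name.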
Re: Building ARRAY with grep expression NOT working
by bliako (Abbot) on Apr 01, 2020 at 08:45 UTC
    if($String =~ '[^/]*\.js')

    This is not wrong at all, but to clarify: using single quotes as the regex delimiter is special in that variables inside the regex are not interpolated prior to matching:

    my $ext = "js";
    my $String = "abc.js";
    print "ok" if($String =~ '[^/]*\.$ext');
    # or:
    print "ok2" if($String =~ m=[^/]*\.$ext=);
    # or:
    print "ok3" if($String =~ m"[^/]*\.$ext");
    # or:
    print "ok4" if($String =~ m![^/]*\.$ext!);
    # or:
    print "ok5" if($String =~ m{[^/]*\.$ext});

    by prepending the regex with m you can use almost any symbol for delimiting, like {} or # etc. (edit: added some example delimiters)
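
A minimal sketch that records which of these forms actually match, using the same `$ext` and `$String` values as above (the result variable names are made up for illustration). Without interpolation, the `$` in the pattern is read as an end-of-line anchor followed by the literal text "ext", so the single-quoted forms are not expected to match "abc.js":

```perl
#!/usr/bin/env perl
use strict;
use warnings;

my $ext    = "js";
my $String = "abc.js";

# Single-quoted string used as a pattern: $ext is not interpolated.
my $plain_string = $String =~ '[^/]*\.$ext' ? 1 : 0;

# m'...' delimiters: also no interpolation.
my $single_quote = $String =~ m'[^/]*\.$ext' ? 1 : 0;

# Any other delimiter after m: $ext is interpolated to "js".
my $braces = $String =~ m{[^/]*\.$ext} ? 1 : 0;
my $bang   = $String =~ m![^/]*\.$ext! ? 1 : 0;

print "plain string: $plain_string\n";
print "m'...':       $single_quote\n";
print "m{...}:       $braces\n";
print "m!...!:       $bang\n";
```

So single quotes are a handy choice when the pattern contains a literal `$` or `@` sequence that should not be treated as a variable, and a poor one when a variable is meant to be part of the pattern.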

Re: Building ARRAY with grep expression NOT working
by Tux (Canon) on Mar 31, 2020 at 20:10 UTC
    $ perldoc -f -o

    Enjoy, Have FUN! H.Merijn
Re: Building ARRAY with grep expression NOT working
by leszekdubiel (Scribe) on Apr 04, 2020 at 08:18 UTC

    In Perl you can think more like a human, not like a programmer... What do you need to do? Take lines from the logfile, look for names ending in "js" and list them... so you can process it like this:

    #!/usr/bin/perl -CSDA
    use utf8;
    use Modern::Perl;

    my %h =
        map { ($_, 1) }
        grep { $_ }
        map { /([^\/]*\.js)/ && $1 }
        `cat /var/log/apache2/access.log.1`;
    print map { "$_\n" } sort keys %h;

    this is less readable:

    #!/usr/bin/perl -CSDA
    use utf8;
    use Modern::Perl;

    print map { "$_\n" } sort keys %{{
        map { ($_, 1) }
        grep { $_ }
        map { /([^\/]*\.js)/ && $1 }
        `cat /var/log/apache2/access.log.1`
    }};

    and use Path::Tiny if failure on file open is important:

    #!/usr/bin/perl -CSDA
    use utf8;
    use Modern::Perl;
    use Path::Tiny;

    my %h =
        map { ($_, 1) }
        grep { $_ }
        map { /([^\/]*\.js)/ && $1 }
        path('/var/log/apache2/access.log.1')->lines_utf8();
    print map { "$_\n" } sort keys %h;