PerlMonks  

foreach type of deal?

by hoagies (Initiate)
on Mar 15, 2017 at 18:01 UTC ( [id://1184738]=perlquestion )

hoagies has asked for the wisdom of the Perl Monks concerning the following question:

I have a file with a bunch of repeating words going down the list (500k+). They all start with a g, and I need to get rid of that g. From what I've learned so far, that seems to be a 'foreach' job. However, after that, I don't know what to do. Do I chomp the g? I was thinking about passing the loop through to a system(), but I want to learn other ways. I need a basic syntax outline for constructing that script (specifically what to do after the <FH>, foreach part), or a link for help. (I've looked, but since I'm learning and this is a very specific deal, I can't seem to adjust other examples based on context.) Sorry for the overly basic question.

Replies are listed 'Best First'.
Re: foreach type of deal?
by toolic (Bishop) on Mar 15, 2017 at 18:07 UTC
    One way to get rid of the g is to use the substitution operator (s///). This deletes the g if it is the 1st character on a line:
    while (<>) { s/^g//; print; }

    • Show a few lines of your input (in "code" tags), if you need more help.
    • perlintro
      Since 500k+ is a big number and the spec says "They all start out with a g", I would rather use substr since there is no need for a test.
      while (<>) {
          my $word = substr($_,1); # Start with the second letter
      }
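To make the trade-off between the two suggestions concrete, here is a minimal sketch. The helper names `strip_with_regex` and `strip_with_substr` and the sample words are made up for illustration. It shows that s/^g// leaves a line alone when the g is missing, while substr drops the first character no matter what it is.

```perl
use strict;
use warnings;

# Remove a leading "g" only when it is actually there (substitution).
sub strip_with_regex {
    my ($word) = @_;
    $word =~ s/^g//;
    return $word;
}

# Drop the first character unconditionally (substr).
sub strip_with_substr {
    my ($word) = @_;
    return substr($word, 1);
}

# Hypothetical sample words; the second one has already lost its "g".
for my $word ('gOrd_3342.png', 'Ord_3343.png') {
    printf "%-14s regex: %-13s substr: %s\n",
        $word, strip_with_regex($word), strip_with_substr($word);
}
```

If the data really is guaranteed to start with g, both give the same result. If it is not, substr will silently eat the O of a line that is already clean.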

      As a note for the OP, while is a nicer option than Foreach Loops in this scenario. In general, foreach constructs the entire list before iterating over it, whereas while reads from your file handle once per loop -- normally, that means while has a much smaller memory footprint than foreach does for large sets. I don't recall if in the particular case of foreach(<$fh>) { there is an optimization in perl to avoid this pitfall.


      #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.
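On the foreach versus while point above, a small sketch that lets you check the behaviour yourself. It assumes an in-memory file handle with made-up sample data, and the two sub names are hypothetical. Each sub reports whether the handle is already at end-of-file during the first pass through the loop body.

```perl
use strict;
use warnings;

# Hypothetical sample data, held in memory so no real file is needed.
my $data = "gOrd_0001.png\ngOrd_0002.png\ngOrd_0003.png\n";

# foreach evaluates <$fh> in list context, so every line is read up front.
sub exhausted_in_foreach {
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
    foreach my $line (<$fh>) {
        return eof($fh) ? 1 : 0;    # checked during the first iteration
    }
    return 0;
}

# while evaluates <$fh> in scalar context, one line per iteration.
sub exhausted_in_while {
    open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
    while (my $line = <$fh>) {
        return eof($fh) ? 1 : 0;    # checked during the first iteration
    }
    return 0;
}

print "foreach already at EOF: ", exhausted_in_foreach(), "\n";
print "while already at EOF:   ", exhausted_in_while(), "\n";
```

On the perls I have used, the foreach version reports 1 (the whole file has been read before the first iteration) and the while version reports 0.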

        Do we ever have perfect data? I prefer to use a test to warn about invalid data. The following code (untested) makes use of runtime switches (ref: perldoc perlrun). Use -p for the I/O and -i for backup.
        #!perl -p -ibak
        if (!s/^g(Ord_\d{4}\.png)\s*$/$1\n/) {  # \s* eats the newline, so put it back
            warn "Invalid line: $_\n";
            $_ = '';                            # print nothing for the invalid line
        }
        Bill
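The same validate-and-warn idea can also be written as an ordinary script without the -p and -i switches. This is only a sketch: the helper name `clean_name` and the sample lines are made up, and the pattern assumes every good line looks like gOrd_ followed by four digits and .png, as in the OP's sample.

```perl
use strict;
use warnings;

# Return the cleaned name, or undef when the line does not match gOrd_NNNN.png.
sub clean_name {
    my ($line) = @_;
    return $line =~ /^g(Ord_\d{4}\.png)\s*$/ ? $1 : undef;
}

# Hypothetical sample input; the second line is deliberately invalid.
my @lines = ("gOrd_3342.png\n", "Ord_3343.png\n", "gOrd_3344.png\n");

for my $line (@lines) {
    my $name = clean_name($line);
    if (defined $name) {
        print "$name\n";
    }
    else {
        chomp(my $copy = $line);
        warn "Invalid line: $copy\n";    # goes to STDERR, not into the output
    }
}
```

Keeping the check in its own sub makes it easy to tighten or loosen the pattern later without touching the loop.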
Re: foreach type of deal?
by johngg (Canon) on Mar 15, 2017 at 23:29 UTC

    You can do this on the command line with in-place editing.

    johngg@shiraz:~/perl/Monks > cat > gtype.txt
    goat
    grate
    grip
    johngg@shiraz:~/perl/Monks > ls gtype*
    gtype.txt
    johngg@shiraz:~/perl/Monks > perl -pi.BAK -e 's{^g}{};' gtype.txt
    johngg@shiraz:~/perl/Monks > ls gtype*
    gtype.txt  gtype.txt.BAK
    johngg@shiraz:~/perl/Monks > cat gtype.txt
    oat
    rate
    rip
    johngg@shiraz:~/perl/Monks > cat gtype.txt.BAK
    goat
    grate
    grip

    I hope this is helpful.

    Update: See "Command Switches" in perlrun for explanations of -e, -i and -p.

    Cheers,

    JohnGG
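For anyone who would rather have the one-liner above as a script, here is a rough equivalent. It relies on the special variable $^I, which plays the role of the -i switch, together with a localized @ARGV for <> to read from. The sub name `strip_leading_g_in_place` is made up for this sketch.

```perl
use strict;
use warnings;

# Edit $file in place, keeping the original as "$file$suffix".
sub strip_leading_g_in_place {
    my ($file, $suffix) = @_;
    local $^I   = $suffix;    # same role as -i.BAK on the command line
    local @ARGV = ($file);    # the file list that <> will read
    while (<>) {
        s/^g//;
        print;                # written to the edited file, not the terminal
    }
    return;
}

# Process every file named on the command line; does nothing without arguments.
my @files = @ARGV;
strip_leading_g_in_place($_, '.BAK') for @files;
```

Run it as perl script.pl gtype.txt (the script name is a placeholder) and you should end up with the same gtype.txt and gtype.txt.BAK pair as in the session above.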

Re: foreach type of deal?
by LanX (Saint) on Mar 15, 2017 at 18:05 UTC
      #!/usr/bin/perl
      use strict;
      use warnings;

      main(@ARGV);

      sub main {
          open(FH, "gtype.txt");
          foreach( my $line = <FH> ) {
              s/^g//; #toolic's idea.
          }
      }
      close FH;

      #and kennethk's idea
      #!/usr/bin/perl
      use strict;
      use warnings;

      main(@ARGV);

      sub main {
          open(FH, "gtype.txt");
          while( my $line = <FH> ){
              substr($_,1); # Start with the second letter
          }
      }

      So, using kennethk's input, I kind of hacked this up...? I can kind of see how it works...

      I looked at the perldocs linked in the replies, and I'm still working through them trying to understand all the various differences.

      the file I'm working with looks something like this:

      gOrd_3342.png

      gOrd_3343.png

      500k of those going down a file. I need to leave all 500k of those as "Ord_xxxx.png". I have yet to try either of the mentioned options, but my idea here is to figure this out and learn at the same time, and not simply get an answer I can plug in and call it quits. Thanks.

        Using my $line = <FH> means $_ never gets set so your next line needs to now be $line=substr($line,1);
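Putting that fix together with the earlier advice, a complete version might look something like the sketch below. It is not the only way to do it. The sub name `strip_first_char` and the file names in the usage comment are placeholders, and it writes to a second file rather than editing in place, so the original stays untouched.

```perl
use strict;
use warnings;

# Copy $in_path to $out_path, dropping the first character of every line.
sub strip_first_char {
    my ($in_path, $out_path) = @_;
    open(my $in,  '<', $in_path)  or die "Cannot read $in_path: $!";
    open(my $out, '>', $out_path) or die "Cannot write $out_path: $!";
    while (my $line = <$in>) {
        $line = substr($line, 1);    # or: $line =~ s/^g//;
        print {$out} $line;
    }
    close($in);
    close($out) or die "Cannot close $out_path: $!";
    return;
}

# Usage: perl strip_g.pl gtype.txt gtype.out   (all three names are placeholders)
strip_first_char(@ARGV) if @ARGV == 2;
```

The lexical handles ($in, $out), the three-argument open, and the "or die" checks are additions the posted code did not have. They are there so that a missing file is reported instead of silently producing nothing.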
