regular expression help

airblaine has asked for the wisdom of the Perl Monks concerning the following question:

I have the following CNC machine code:

N38 * TOOL 11: 0.125 DIA. FINISH ENDMILL
N39 M6T11
N40 E1
N41 S15000M3M7
N42 G17G90G0G8X-5.58Y-18.
N43 Z0.05H11
N44 G82G98X-5.58Y-18.Z-0.2R+0.01P50F100.
N45 X-3.171
N46 X-1.025
N47 X0.887
N48 X3.371Y-18.688
N49 Y-17.312
N50 X5.46Y-18.

The N's are line numbers that need to be stripped out. The asterisks mark comments, and each one needs to be replaced with !. All letters need to be stripped, and only the numbers after X, Y and Z must be left, in a space-delimited string. So the above should come out like:

! TOOL 11: 0.125 DIA. FINISH ENDMILL 5.58 -18. 0.05 -5.58 -18. -0.2 1.025 0.887 3.371 -18.688 -17.312 5.46 -18.

I have hundreds of files with thousands of lines of this code that need to be parsed in this way. I need help with search and replace, inserting spaces, etc. I would be eternally grateful for any help getting started on some code that might accomplish this! Thank you in advance, Blaine


Replies are listed 'Best First'.
Re: regular expression help by amphiplex (Monk) on Jul 17, 2002 at 16:38 UTC
This should do the job:

    my @results;
    while (<>) {
        if (/^N\d+\s+\*(.*)/) {
            # comment line: keep the text after the '*', prefixed with '!'
            push @results, "!$1";
        }
        else {
            # every number that follows an X, Y or Z (sign optional)
            push @results, /[XYZ]([-+]?[\d\.]+)/g;
        }
    }
    print join ' ', @results;

The first regex handles comment lines; the second one greps all occurrences of an X, Y or Z followed optionally by a negative or positive sign, followed by a combination of digits and dots. If there is only one comment line at the top of each file, you could do this faster, but for a couple of thousand lines it should not matter.

Update: the code was wrong; I forgot the `?` after the sign match.

---- amphiplex
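To illustrate the update above, here is a minimal sketch of what the `?` after the sign match changes. The two helper names are hypothetical and exist only for this comparison; the sample lines are taken from the question.

```perl
use strict;
use warnings;

# Hypothetical helper: the sign is mandatory, as in the version
# of the regex without the '?'.
sub coords_sign_required {
    my ($line) = @_;
    return $line =~ /[XYZ]([-+][\d.]+)/g;
}

# Hypothetical helper: the sign is optional thanks to the '?'.
sub coords_sign_optional {
    my ($line) = @_;
    return $line =~ /[XYZ]([-+]?[\d.]+)/g;
}

for my $line ('N47 X0.887', 'N48 X3.371Y-18.688') {
    print "$line\n";
    print "  sign required: @{[ coords_sign_required($line) ]}\n";
    print "  sign optional: @{[ coords_sign_optional($line) ]}\n";
}
```

With the sign required, `N47 X0.887` yields nothing and `N48 X3.371Y-18.688` yields only `-18.688`; with the sign optional, the unsigned values `0.887` and `3.371` are kept as well.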
Re: regular expression help by broquaint (Abbot) on Jul 17, 2002 at 16:46 UTC
You could try this:

    my @info;
    while (<DATA>) {
        chomp;
        s< ^ N \d+ \s+ >()x;
        if (m< ^ \* >x) {
            push @info => '!' . substr($_, 1);
        }
        else {
            push @info => m< [XYZ] ( [\d.-] +) >gx;
        }
    }
    print "@info\n";
    __DATA__
    N38 * TOOL 11: 0.125 DIA. FINISH ENDMILL
    N39 M6T11
    N40 E1
    N41 S15000M3M7
    N42 G17G90G0G8X-5.58Y-18.
    N43 Z0.05H11
    N44 G82G98X-5.58Y-18.Z-0.2R+0.01P50F100.
    N45 X-3.171
    N46 X-1.025
    N47 X0.887
    N48 X3.371Y-18.688
    N49 Y-17.312
    N50 X5.46Y-18.

Which appears to do the job.

HTH

_________
broquaint
Re: regular expression help by BrowserUk (Patriarch) on Jul 17, 2002 at 16:55 UTC
This may get you started.

    #! perl -w

    my $output = '';

    while ( my $line = <DATA> ) {
        # get rid of line terminator
        chomp $line;
        # Get rid of line numbers (any number of digits)
        $line =~ s/^N[0-9]+\ //;
        # Replace * at the beginning of the line with !
        # (Assumption: No further processing of comments required.)
        $line =~ s/^\*(.*)/!$1/ and print $line and next if $line;
        # Keep any numbers prefixed with X, Y or Z.
        $line =~ s/(X|Y|Z)([0-9.-]+)/ $2/g if $line;
        # Throw the rest away.
        $line =~ s/[A-Z][0-9.+-]+//g if $line;
        print $line;
    }
    __DATA__
    N38 * TOOL 11: 0.125 DIA. FINISH ENDMILL
    N39 M6T11
    N40 E1
    N41 S15000M3M7
    N42 G17G90G0G8X-5.58Y-18.
    N43 Z0.05H11
    N44 G82G98X-5.58Y-18.Z-0.2R+0.01P50F100.
    N45 X-3.171
    N46 X-1.025
    N47 X0.887
    N48 X3.371Y-18.688
    N49 Y-17.312
    N50 X5.46Y-18.

Output (asked for) followed by actual. I think that you made a mistake in what you asked for. Correct me if I am wrong.

    ! TOOL 11: 0.125 DIA. FINISH ENDMILL 5.58 -18. 0.05 -5.58 -18. -0.2 1.025 0.887 3.371 -18.688 -17.312 5.46 -18.

(actual!)

    C:\test>182489
    ! TOOL 11: 0.125 DIA. FINISH ENDMILL -5.58 -18. 0.05 -5.58 -18. -0.2 -3.171 -1.025 0.887 3.371 -18.688 -17.312 5.46 -18.
Re: regular expression help by kvale (Monsignor) on Jul 17, 2002 at 16:45 UTC
The general structure of your code could look like this (untested):

    my $string = '';    # output string
    open FILE, "<file.txt" or die "Could not open file.txt: $!";
    while (<FILE>) {
        chomp;
        s/^N\d+ //;     # strip line numbers
        if (/^\*/) {    # comment
            s/^\*/!/;
            $string .= $_;
        }
        else {          # other
            while ($_ =~ /[x-z]([\d.+-]+)/ig) {
                $string .= " $1";
            }
        }
    }
    print $string;

-Mark
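Since the question mentions hundreds of files, the same structure could be wrapped in a loop over the files named on the command line. This is only a sketch under assumptions: the sub name `convert_lines` and the `.out` suffix for the result files are hypothetical choices, not something the original poster specified.

```perl
use strict;
use warnings;

# Convert the lines of one CNC program into a single
# space-delimited string (hypothetical helper name).
sub convert_lines {
    my @out;
    for my $line (@_) {
        (my $text = $line) =~ s/\s+\z//;    # drop trailing newline / CR
        $text =~ s/^N\d+\s*//;              # strip the line number
        if ($text =~ s/^\*/!/) {            # comment: '*' becomes '!'
            push @out, $text;
        }
        else {                              # keep numbers after X, Y or Z
            push @out, $text =~ /[XYZ]([-+]?[\d.]+)/g;
        }
    }
    return join ' ', @out;
}

# Assumed convention: each input file gets its result written
# next to it, with ".out" appended to the name.
for my $file (@ARGV) {
    open my $in, '<', $file or die "Could not open $file: $!";
    my @lines = <$in>;
    close $in;

    open my $out, '>', "$file.out" or die "Could not write $file.out: $!";
    print {$out} convert_lines(@lines), "\n";
    close $out or die "Could not close $file.out: $!";
}
```

Saved under a name of your choice, it would be run with the input files as arguments. Reading each file fully into memory should be fine for a few thousand lines per file.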