Re: Modify a txt file
by Eliya (Vicar) on Oct 20, 2011 at 01:04 UTC
#!/usr/bin/perl -w
use strict;

local $/ = ""; # paragraph mode

while (<DATA>) {
    my ($num) = /^(\d+)/;
    s/\w+/$num/g;
    print;
}

__DATA__
0 ASDF
ASEE
ASEE

13 DERG
DREG

28 QWER
QWER

42 WERT
WERT
WERT

55 QWEASD
QWEASD
QWEASD
QWEASD
(Note that $/ = "" is not the same as $/ = undef: the empty string selects paragraph mode, while undef slurps the whole file. See perlvar.)
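To make the difference between the settings of $/ concrete, here is a minimal sketch that reads the same in-memory string three ways. The helper read_records and the sample string are made up for this illustration; they are not part of the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: read $text as a file and return the records
# produced under the given input record separator.
sub read_records {
    my ($text, $separator) = @_;
    local $/ = $separator;    # restored automatically when the sub returns
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    my @records = <$fh>;      # list context: one element per record
    close $fh;
    return @records;
}

# Two groups separated by THREE empty lines instead of one.
my $text = "0 ASDF\nASEE\n\n\n\n13 DERG\nDREG\n";

my @paragraph = read_records($text, "");       # paragraph mode
my @double    = read_records($text, "\n\n");   # literal two-newline separator
my @slurp     = read_records($text, undef);    # slurp mode

printf "paragraph mode: %d record(s)\n", scalar @paragraph;
printf "two newlines:   %d record(s)\n", scalar @double;
printf "slurp mode:     %d record(s)\n", scalar @slurp;
```

Paragraph mode treats any run of empty lines as one separator, so it should report 2 records here. The literal "\n\n" separator should report 3, because the extra empty lines become a record of their own, and slurp mode should report 1.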
Re: Modify a txt file
by GrandFather (Saint) on Oct 20, 2011 at 00:54 UTC
You tell me what the real application for such a transformation is and show me the code you have tried and I'll help you get it right. I'm not about to write what looks like a homework answer for you however. And if it's not homework then you've probably simplified the problem to the point of meaninglessness.
Update: Oh, I see this probably relates to something you have been working on for a few days and most likely isn't Perl homework. Maybe it's time you filled us in on the bigger picture so we can help with a cohesive overall solution rather than drip feeding little parts of a solution as you run into trouble?
True laziness is hard work
Re: Modify a txt file
by davido (Cardinal) on Oct 20, 2011 at 05:08 UTC
If you look at each grouping as a record, and set the input record separator to "\n\n" (which is what appears to separate the records in your example), it gets really easy:
use strict;
use warnings;

$/ = "\n\n";

while ( <DATA> ) {
    if( my( $number ) = m/(\d+)/ ) {
        s/\b\p{Alpha}+\b/$number/g;
    } else {
        warn "Malformed record in input line $.\n$_\nContinuing.\n";
    }
    print;
}

__DATA__
0 ASDF
ASEE
ASEE

13 DERG
DREG

28 QWER
QWER

42 WERT
WERT
WERT

55 QWEASD
QWEASD
QWEASD
QWEASD
Here is the output from your test data:
0 0
0
0

13 13
13

28 28
28

42 42
42
42

55 55
55
55
55
#!/usr/bin/perl
use warnings;
use strict;

if($#ARGV<0){
    die "Usage: $0 <*.txt>\n";
}
open(IN,$ARGV[0]) ;

$/ = "\n\n";

while ( <IN> ) {
    if( my( $number ) = m/(\d+)/ ) {
        s/\b\p{Alpha}+\b/$number/g;
    } else {
        warn "Malformed record in input line $.\n$_\nContinuing.\n";
    }
    print;
}
Input:
0 ASDF
ASEE
ASEE

13 DERG
DREG

28 QWER
QWER

42 WERT
WERT
WERT

55 QWEASD
QWEASD
QWEASD
QWEASD
Getting this output:
0 0
0
0

13 0
0

28 0
0

42 0
0
0

55 0
0
0
0
Desired Output:
0 0
0
0

13 13
13

28 28
28

42 42
42
42

55 55
55
55
55
I see that the substitution is made in this line: s/\b\p{Alpha}+\b/$number/g; Is the code perhaps referring back to the original 0 in the top left-hand column?
use strict;
use warnings;

my $data = <<DATA;
0 ASDF
ASEE
ASEE

13 DERG
DREG

28 QWER
QWER

42 WERT
WERT
WERT

55 QWEASD
QWEASD
QWEASD
QWEASD
DATA

open my $inFile, '<', \$data;
$/ = "\n\n";

while (<$inFile>) {
    if (my ($number) = m/(\d+)/) {
        s/\b\p{Alpha}+\b/$number/g;
    } else {
        warn "Malformed record in input line $.\n$_\nContinuing.\n";
    }
    print;
}
Prints:
0 0
0
0

13 13
13

28 28
28

42 42
42
42

55 55
55
55
55
so there is a mismatch between what you are telling us and what you are actually doing. If you really want help you need to really tell us what you are doing and show us (as per the sample code above) how things are going wrong. We can't fix what ain't broke!
True laziness is hard work
Why do you suppose this is happening? Have you taken any steps besides posting to figure out why the solution isn't working for you?
When debugging it's often helpful to check the state of the program's logic at one or more points. An easy way to do this is with "print" statements that give you clues as to where you are within the program's control flow.
For example, if you added a print "Record: $.\n"; statement as the first line of the block of your while() loop you would see each time the loop iterates over a new record. And if you added print "New match: $number\n"; as the first line of the if() block, you would see each time a new number is matched and captured into $number. After running the script with those two debugging aids you would probably see that the file is being read in as one big record, rather than as multiple records.
That seems impossible if your input data matches the data you showed us, and if you're executing the code you say you are. Either you've got $/ = ''; in your code, or you have data that isn't separated by two newlines like it appears in your post. At least those are my best guesses without seeing exact cut&pastes of the first few records of your data, and of the script exactly as it's being run.
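A minimal sketch of those two debugging aids, assuming in-memory sample strings rather than your real file. The helper trace_records and the strings $good and $bad are names made up for this illustration; the sub collects the messages instead of printing them inside the loop, so the two runs can be compared side by side.

```perl
use strict;
use warnings;

# Hypothetical helper: run the substitution loop over $text and return the
# debugging messages it produces, one per list element.
sub trace_records {
    my ($text) = @_;
    my @trace;
    local $/ = "\n\n";
    open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
    while (<$fh>) {
        push @trace, "Record: $.";               # one per loop iteration
        if ( my ($number) = m/(\d+)/ ) {
            push @trace, "New match: $number";   # one per captured number
            s/\b\p{Alpha}+\b/$number/g;
        }
    }
    close $fh;
    return @trace;
}

my $good = "0 ASDF\nASEE\n\n13 DERG\nDREG\n";    # truly empty separator line
my $bad  = "0 ASDF\nASEE\n \n13 DERG\nDREG\n";   # separator line holds a space

print "Good data:\n";
print "  $_\n" for trace_records($good);
print "Bad data:\n";
print "  $_\n" for trace_records($bad);
```

For the good string the trace should show two records, each with its own match. For the bad string it should show a single record and a single match of 0, which is the symptom described in the follow-up.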
For what it's worth, I copied and pasted the exact data you posted here and used that as the sample run for my solution. I also copied and pasted the exact data that you posted in your followup, and it also produced the correct results.
Is it possible that you're re-typing the data rather than copy/pasting it, and that the blank line between records actually contains some space characters that we can't see, and that you didn't paste into your example data?
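One way to test that guess is to look for lines that appear blank but are not empty. The sketch below is only an illustration: find_hidden_whitespace and normalise_separators are hypothetical helpers, and treating a stray carriage return (as left by Windows line endings) as a possible culprit is my assumption, not something your post confirms.

```perl
use strict;
use warnings;

# Hypothetical helper: return the numbers of the lines that look blank but
# actually contain spaces, tabs, or a carriage return.
sub find_hidden_whitespace {
    my ($text) = @_;
    my @suspects;
    my $line_no = 0;
    for my $line ( split /\n/, $text, -1 ) {
        $line_no++;
        push @suspects, $line_no if $line =~ /^[ \t\r]+$/;
    }
    return @suspects;
}

# Hypothetical helper: strip carriage returns and empty out whitespace-only
# lines, so that "\n\n" separates the records again.
sub normalise_separators {
    my ($text) = @_;
    $text =~ s/\r$//mg;         # drop the CR of a CRLF line ending
    $text =~ s/^[ \t]+$//mg;    # turn whitespace-only lines into empty ones
    return $text;
}

my $sample = "0 ASDF\nASEE\n \t\n13 DERG\nDREG\n";
print "Suspicious line(s): @{[ find_hidden_whitespace($sample) ]}\n";

my $clean = normalise_separators($sample);
print "Contains a real blank line now\n" if $clean =~ /\n\n/;
```

If the first helper reports any line numbers for your real file, the "\n\n" separator never matches at those points, and the whole file arrives as one record.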
By the way: This isn't contributing to your problem, but it is a darn good idea anyway: Put use autodie; right after the use warnings; line at the top of your script. That will alert you if a file fails to open (among other things).
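As a rough sketch of what autodie buys you: the open below has no "or die", yet a failure still raises an exception. The helper try_open and the path are made up for this illustration, and the eval is there only so the example can report the error instead of stopping.

```perl
use strict;
use warnings;
use autodie;    # open, close and friends now throw an exception on failure

# Hypothetical helper: try to open a file for reading and report what
# happened rather than letting the program die.
sub try_open {
    my ($path) = @_;
    my $ok = eval {
        open my $fh, '<', $path;    # no "or die" needed under autodie
        close $fh;
        1;
    };
    return $ok ? "opened $path" : "open failed: $@";
}

# The path is assumed not to exist on your machine.
print try_open('/no/such/dir/input.txt'), "\n";
```

In a real script you would normally drop the eval and let the exception end the program with its message, which names the file and the reason the open failed.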
Re: Modify a txt file
by Marshall (Canon) on Oct 20, 2011 at 03:18 UTC
Given that the input file has a very large number of lines, I would try to just process the file line by line (no slurping the file into a single $file_content or @lines).
Here is one way:
#!/usr/bin/perl -w
use strict;

my $cur_num;
while (<DATA>)
{
    $cur_num = $1 if (/^(\d+)/);  # new $cur_num if line starts
                                  # with digits
    s/(\S+)$/$cur_num/;           # substitute the non-spaces at the end
                                  # of the line with the cur_num
    print;                        # a blank line is not modified
}
=Prints
0 0
0
0
13 13
13
28 28
28
42 42
42
42
55 55
55
55
55
=cut
__DATA__
0 ASDF
ASEE
ASEE
13 DERG
DREG
28 QWER
QWER
42 WERT
WERT
WERT
55 QWEASD
QWEASD
QWEASD
QWEASD
Re: Modify a txt file
by mrstlee (Beadle) on Oct 20, 2011 at 09:36 UTC
This one uses the new(ish) treat-strings-as-file-handles feature (See Effective Perl Programming)
my $data = q(0 ASDF
ASEE
ASEE
13 DERG
DREG
28 QWER
QWER
42 WERT
WERT
WERT
55 QWEASD
QWEASD
QWEASD
QWEASD
);
open my $hndl, "<", \(my $s = $data);
open my $out, ">", \(my $formatted = '');
my $substitute_field;
READ_FILE:
while (my $line = <$hndl>) {
    chomp $line;
    ## Ignore any lines that don't contain relevant text
    $line =~ /\S+/ or next READ_FILE;
    $line =~ /^\s*(\d+)(\s+)([A-Z]+)/ and do {
        $substitute_field = $1;
        print $out $substitute_field, $2, $substitute_field, "\n";
        next READ_FILE;
    };
    ## If we don't have a field to substitute move on
    defined $substitute_field or next READ_FILE;
    ## Must have valid line
    $line =~ s/^(\s*)([A-Z]+)/$1$substitute_field/;
    print $out $line, "\n";
}
print $formatted;
close $hndl;
close $out;
In:
0 ASDF
ASEE
ASEE
13 DERG
DREG
28 QWER
QWER
42 WERT
WERT
WERT
55 QWEASD
QWEASD
QWEASD
QWEASD
out:
0 0
0
0
13 13
13
28 28
28
42 42
42
42
55 55
55
55
55
knoppix@Microknoppix:~$ perl -e '
> open my $inFH, q{<}, \ <<EOD or die qq{open: << HEREDOC: > $!\n};
> line 1
> line 2
> line 3
> EOD
>
> my $out = qq{some rubbish here\n};
> open $outFH, q{>}, \ $out or die qq{open: > scalar: $!\n};
>
> while ( <$inFH> )
> {
> print $outFH uc;
> }
>
> print $out;'
LINE 1
LINE 2
LINE 3
knoppix@Microknoppix:~$
I hope this is of interest.