Re: new line every match
by johngg (Canon) on Jul 06, 2012 at 21:56 UTC
Rather than split'ing the line on spaces, use a look-ahead to split it at each point that is followed by the text "address".
knoppix@Microknoppix:~$ perl -E '
> $line = qq{address 6433 main st address 6434 main st address 6435 main st\n};
> chomp $line;
> say for split m{(?=address)}, $line;'
address 6433 main st
address 6434 main st
address 6435 main st
knoppix@Microknoppix:~$
I hope this is helpful.
Update: A more complete solution catering for the two header lines and split'ing at a space that is followed by the text "address".
knoppix@Microknoppix:~$ perl -Mstrict -Mwarnings -E '
> open my $inFH, q{<}, \ <<EOF or die $!;
> This is the list of address
> here are the addreses
> address 6433 main st address 6434 main st address 6435 main st
> EOF
>
> print scalar <$inFH> for 1 .. 2;
> my $line = <$inFH>;
> chomp $line;
> say for split m{\s+(?=address)}, $line;'
This is the list of address
here are the addreses
address 6433 main st
address 6434 main st
address 6435 main st
knoppix@Microknoppix:~$
Re: new line every match
by davido (Cardinal) on Jul 06, 2012 at 22:11 UTC
You have a record separator, "address"; it's just disguised as a record heading. Perl knows how to deal with alternate record separators (alternate to "\n", that is), as long as they can be represented as a literal string.
local $/ = q{address}; # Set record separator to 'address'.
open my $infile, '<', 'filename.txt' or die $!;
print "This is the list of addresses\n",
      "here are the addresses\n";
# Note: this assumes the file holds only the address records; any header
# lines in the input would be read (and printed) as records too.
while( <$infile> ) {
    chomp; # chomp removes the trailing record separator.
    s/^\s+|\s+$//g;
    next unless length;
    print "address $_\n";
}
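If the input file does contain the two header lines from the original post, they would be consumed as records by the loop above (the first header even contains the string "address"). A minimal sketch of one way around that, assuming exactly two header lines come before the address line: read the headers while $/ is still "\n", and only then switch the record separator. The sample data and the sub name reformat are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample input, held in memory so the sketch is self-contained.
my $data = "This is the list of address\n"
         . "here are the addreses\n"
         . "address 6433 main st address 6434 main st address 6435 main st\n";

# Hypothetical helper: returns the output lines (without trailing newlines).
sub reformat {
    my ($text) = @_;
    open my $in, '<', \$text or die $!;
    my @out;

    # Assumption: exactly two header lines; read them with the default "\n".
    for ( 1 .. 2 ) {
        my $header = <$in>;
        chomp $header;
        push @out, $header;
    }

    # From here on, every record ends at the literal string 'address'.
    local $/ = 'address';
    while ( my $rec = <$in> ) {
        chomp $rec;                 # removes a trailing 'address', if present
        $rec =~ s/^\s+|\s+$//g;     # trim spaces and the final newline
        next unless length $rec;    # the first record is empty
        push @out, "address $rec";
    }
    return @out;
}

print "$_\n" for reformat($data);
```

Because $/ is consulted on every read, changing it part-way through the same filehandle is fine.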
Re: new line every match
by aaron_baugher (Curate) on Jul 06, 2012 at 22:03 UTC
You might want to put your sample data into <code> tags, like your code, so we can tell for sure how it is formatted. But it appears that all your addresses are on a single line, and you want to break them into multiple lines. If that's the case, then <INFILE> will read in an entire line containing many addresses. That should match your regex (once), so the line should be split on spaces into words, and each word printed on its own line. What do you mean by "somehow it is not printing this part"? What is it printing?
Incidentally, there's no point in looping on $_ and then assigning it to a variable inside your loop, and it could (theoretically, at least) cause a bug. So don't do this:
while(<INFILE>){
    my $line = $_;
    # do stuff with $line
}
# but do one of these instead
while(<INFILE>){
    # do stuff with $_
}
# or
while(my $line = <INFILE>){
    # do stuff with $line
}
(Is there a popular Perl book or tutorial that shows that method? Seems like it shows up a lot here lately.)
Aaron B.
Available for small or large Perl jobs; see my home node.
Re: new line every match
by 2teez (Vicar) on Jul 07, 2012 at 04:22 UTC
Hi starface245,
from your code:
.....
if ($line =~ /(address............)/) #13 dots for 13characters
{
my @values = split(' ', $line);
foreach my $val (@values) {
print "$val\n";
}
......
your foreach loop will only print out the following:
address
6433
main
st
address
6434
main
st
address
6435
main
st
That is because split(' ', $line) breaks the line on whitespace, so @values holds one word per element and each word is printed on its own line.
At least, this is one of the many reasons why you can't insist on your "ways" when asking for help.
However, if you must keep the same code structure, then I believe the script below could help:
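#!/usr/bin/perl
use warnings;
use strict;
my @values;
while (<DATA>) {
    my ($line) = $_;
    chomp($line);
    if ( $line =~ m/^address/ ) {
        # Split after each "st"; \b and \s* also cover a final "st"
        # that has no trailing space, so it is not printed twice.
        push @values, split( /\s+st\b\s*/, $line );
    }
    else {
        print $line, $/;
    }
}
print $_, ' st', $/ foreach @values;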
__DATA__
This is the list of address
here are the addreses
address 6433 main st address 6434 main st address 6435 main st
OUTPUT:
This is the list of address
here are the addreses
address 6433 main st
address 6434 main st
address 6435 main st
Re: new line every match
by ww (Archbishop) on Jul 06, 2012 at 22:54 UTC
A number of other issues have already been addressed, but not the poor design of the regex. It's hard to imagine a regex much more fragile than your /(address............)/ #13 dots for 13 characters.
Will you never need to deal with address 6435 main ave or address 643 main st?
Update re reply: ... then the regex (still fragile, IMO because I don't believe in perfect data) might be more readably written as
$line =~ /(address.{13})/
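For data that is not fixed-width, here is a minimal sketch of a pattern that does not count characters at all: it lazily matches from "address" up to the whitespace before the next "address", or up to the end of the line. The sub name split_addresses is hypothetical, and the sketch assumes the word "address" never appears inside a street name.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each "address ..." record, whatever its length.
sub split_addresses {
    my ($line) = @_;
    # Lazy match from "address" up to (but not including) the whitespace
    # before the next "address", or trailing whitespace at end of line.
    return $line =~ /(address\b.*?)(?=\s+address\b|\s*$)/g;
}

print "$_\n" for split_addresses("address 6435 main ave address 643 main st");
```

In list context with /g, the match returns every captured record, so the two examples above (a different street type and a shorter number) come back as separate elements.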
No, the number of characters is always exact.
Re: new line every match
by Kenosis (Priest) on Jul 07, 2012 at 01:30 UTC
use Modern::Perl;
while (<DATA>) {
    if ( !/\d/ ) {
        print;
    }
    else {
        say $1 while /(address.{13})/g;
    }
}
__DATA__
This is the list of address
here are the addreses
address 6433 main st address 6434 main st address 6435 main st
Output:
This is the list of address
here are the addreses
address 6433 main st
address 6434 main st
address 6435 main st
Re: new line every match
by Marshall (Canon) on Jul 09, 2012 at 06:14 UTC
#!/usr/bin/perl -w
use strict;
my $str = "address 6433 main st address 6434 main st address 6435 main st ";
# grep gets rid of the "" at the beginning.
# performance is slow, but effective
# there is more than one way to do this
# no fancy regex is required
my (@addresses) = grep{$_}split /address\s*/, $str;
my $line=1;
foreach (@addresses)
{
    print "address ", $line++, " $_\n";
}
__END__
address 1 6433 main st
address 2 6434 main st
address 3 6435 main st
Re: new line every match
by starface245 (Novice) on Jul 09, 2012 at 15:18 UTC
I would like to thank everyone for their help. I finally got the code to work how I wanted...
Thanks again! Big thanks to Aaron; his code helped out a lot.
Re: new line every match
by starface245 (Novice) on Jul 06, 2012 at 22:34 UTC
It prints:
address 6433 main st
address 6434 main st
address 6435 main st
It does not print this part:
This is the list of address
here are the addreses
I want it to look exactly like this:
This is the list of address
here are the addreses
address 6433 main st
address 6434 main st
address 6435 main st
Then I would describe the process as:
- for each line
- if it starts with 'address', replace any space characters preceding the word 'address' with a newline
- if not, pass the line through unchanged
So that would give me this code, which you can wrap in whatever input/output you need to do once you understand it:
while(<>){
    if(/^address/){
        s/ address/\naddress/g;
        print;
    } else {
        print;
    }
}
Aaron B.
Available for small or large Perl jobs; see my home node.
Sorry Dave, but it did fail my needs. I needed someone to help fix my code, but not totally change it.
Re: new line every match
by starface245 (Novice) on Jul 06, 2012 at 22:14 UTC
This is exactly what I want to do..
"addresses are on a single line, and you want to break them into multiple lines."