Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi all

I need to extract some specific data from the middle of a line in a file, e.g.

var1|var2|DOMAIN|test.com|var9|var6

The lines have different numbers of values, so I need to extract the DOMAIN|test.com pair. 'DOMAIN' will always be static, but the actual domain name will change.

Is there any way in Perl to match this and extract the domain name?

Replies are listed 'Best First'.
Re: perl Regex
by Limbic~Region (Chancellor) on Feb 11, 2004 at 23:29 UTC
    Anonymous Monk,
    Here is a complete working script:
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Text::xSV;

    my $csv = Text::xSV->new(
        'filename' => 'domain.txt',
        'sep'      => '|',
    );
    while ( my $row = $csv->get_row() ) {
        for ( 0 .. $#$row ) {
            if ( $row->[$_] eq 'DOMAIN' ) {
                print "Domain: $row->[ $_ + 1 ]\n";
                last;
            }
        }
    }
    Cheers - L~R
Re: perl Regex
by Vautrin (Hermit) on Feb 11, 2004 at 23:07 UTC

    If this is the |-separated equivalent of a CSV file, you could do something like this:

    use strict;
    use warnings;

    open(my $fh, '<', './file') or die "Can't open the file: $!";
    while (my $line = <$fh>) {
        chomp $line;    # drop the newline so it doesn't end up in the last field
        my ($var1, $var2, $domain, $domain_name, $var9, $var6) = split /\|/, $line;
        # do whatever with each value here...
    }
    close $fh;

    Or you could use split to split on "DOMAIN\|" and then split on pipes, and only grab the domain by shifting out the resulting array. Btw, I escaped the pipe character because it's a special char in regular expressions.
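
    A minimal sketch of that two-step split, for illustration only. The helper name domain_from_line is hypothetical and not from the thread. It also shows why the pipe is escaped: an unescaped | is regex alternation and matches the empty string, so split '|' would break the line into single characters.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the field that follows DOMAIN|, or undef.
sub domain_from_line {
    my ($line) = @_;
    chomp $line;
    # First split on the literal marker; the limit of 2 keeps
    # everything after the first DOMAIN| in one piece.
    my (undef, $rest) = split /DOMAIN\|/, $line, 2;
    return unless defined $rest;
    # Then split on an escaped pipe and keep only the first field.
    my ($domain) = split /\|/, $rest;
    return $domain;
}

print domain_from_line("var1|var2|DOMAIN|test.com|var9|var6\n"), "\n";
```

    Because the first split does not care how many fields come before the marker, this works when the number of fields differs from line to line.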


    Want to support the EFF and FSF by buying cool stuff? Click here.
      Hi

      Thanks, but I didn't explain it very clearly. Yes, the pipe will be the delimiter, but the number of vars varies from line to line, so split won't work. I need to match the DOMAIN|test.com from the middle of a variable, $comments for example.

      Any help is appreciated.
        #!/usr/bin/perl
        use warnings;
        use strict;

        while ( defined ( $_ = <DATA> ) ) {
            if ( /\|DOMAIN\|([^|]*)/ ) {
                print "$1\n";
            }
        }
        __DATA__
        var1|var2|DOMAIN|test.com|var9|var6
        Boris
        perl -ne 'print $1, $/ if /\|DOMAIN\|([^|]*)/' your_data_file
        Boris
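
        Two edge cases are worth noting with the pattern /\|DOMAIN\|([^|]*)/: it requires a pipe before DOMAIN, so a line that starts with DOMAIN| is skipped, and when the domain is the last field the capture includes the trailing newline. The following is a hedged sketch of one way to cover both; the helper name extract_domain is hypothetical, and $comments is the variable name mentioned in the question.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the value after DOMAIN|, or undef if absent.
sub extract_domain {
    my ($comments) = @_;
    # (?:^|\|) accepts DOMAIN at the start of the string or after a pipe;
    # [^|\n]*  stops at the next pipe or at the end of the line.
    return $comments =~ /(?:^|\|)DOMAIN\|([^|\n]*)/ ? $1 : undef;
}

for my $comments ("var1|var2|DOMAIN|test.com|var9|var6\n", "DOMAIN|example.org\n") {
    my $domain = extract_domain($comments);
    print "$domain\n" if defined $domain;
}
```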

        Perhaps I should have been more clear in my original comment when I said you could split on "DOMAIN\|".

        use strict;
        use warnings;

        my @domain_names;
        open(my $fh, '<', './file.csv') or die "Can't open file: $!";
        while (my $line = <$fh>) {
            chomp $line;
            my @parts = split /DOMAIN\|/, $line;
            # throw away the stuff before DOMAIN|
            shift @parts;
            # skip lines that have no DOMAIN| field at all
            next unless @parts;
            # now get what's between DOMAIN| and the next | -- the domain
            my $part = shift @parts;
            my ($domain) = split /\|/, $part;
            push @domain_names, $domain;
        }
        close $fh;

        And, of course, if you only have one DOMAIN|test.com in every file you could add an if to check if there's one on the current line using a regular expression, and then a last to exit the loop once you found one.
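
        A rough sketch of that if-plus-last idea, assuming only the first DOMAIN field in the file matters. The helper name first_domain is hypothetical, and the in-memory filehandle only stands in for the real file.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the first domain found on the handle, or undef.
sub first_domain {
    my ($fh) = @_;
    my $domain;
    while (my $line = <$fh>) {
        if ($line =~ /\|DOMAIN\|([^|\n]*)/) {
            $domain = $1;
            last;    # stop reading once the first match is found
        }
    }
    return $domain;
}

# In-memory filehandle standing in for the real file.
my $data = "var1|var2\nvar1|DOMAIN|test.com|var9\nvar3|DOMAIN|other.org\n";
open(my $fh, '<', \$data) or die "Can't open in-memory file: $!";
my $found = first_domain($fh);
print defined $found ? "$found\n" : "no DOMAIN field found\n";
close $fh;
```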


Re: perl Regex
by Theo (Priest) on Feb 11, 2004 at 23:06 UTC
    Hi.
    Are the vars always numbers and always separated by the pipe symbol?

    -Theo-
    (so many nodes and so little time ... )