Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Can anyone suggest a regular expression to extract the 8-digit numbers from a submitted file, which may be in any of the following formats:
file format 1:
1,10000021
2,10000023
2,10000035
3,10000043
file format 2:
10000023
10000035
10000043
10000045
10000060
file format 3:
10000043,,zrous
10000045,,sjandw
10000060,,jnbro2
10000063,,eedwa
10000067,,empav
file format 4:
10000021/1
10000023/1
10000035/1
10000043/1
10000045/1
10000060/1
file format 5:
10000021/1,10000021
10000023/1,10000023
10000035/1,10000035
file format 6:
10000112,10002524,10000998,10000071,10000043,10000612,10000018,10000110,10000011,10000013
I'm currently using this code to iterate over each line in the file:
open(SDATA, "<$file") || die "Could not open data file: $!";   # "<" opens for reading; ">" would truncate the file
@filedata = <SDATA>;
close(SDATA);
foreach $line (@filedata) {
    chomp($line);
    #@sid = split m!/\d{8}/!, $line;
    #@sid = split m!,!, $line;
    foreach $nline (@sid) {
        do something ....
    }
}
TIA.

Replies are listed 'Best First'.
Re: regular expression to extract 8 digit number from files
by demerphq (Chancellor) on Aug 16, 2002 at 11:08 UTC
    my @extract;
    while (<DATA>) {
        push @extract, m/\d{8}/g;
    }
    print join "\n", @extract;
    __DATA__
    1,10000021
    2,10000023
    2,10000035
    3,10000043
    10000023
    10000035
    10000043
    10000045
    10000060
    10000043,,zrous
    10000045,,sjandw
    10000060,,jnbro2
    10000063,,eedwa
    10000067,,empav
    10000021/1
    10000023/1
    10000035/1
    10000043/1
    10000045/1
    10000060/1
    10000021/1,10000021
    10000023/1,10000023
    10000035/1,10000035
    10000112,10002524,10000998,10000071,10000043,10000612,10000018,10000110,10000011,10000013
    Outputs:
    10000021
    10000023
    10000035
    10000043
    10000023
    10000035
    10000043
    10000045
    10000060
    10000043
    10000045
    10000060
    10000063
    10000067
    10000021
    10000023
    10000035
    10000043
    10000045
    10000060
    10000021
    10000021
    10000023
    10000023
    10000035
    10000035
    10000112
    10002524
    10000998
    10000071
    10000043
    10000612
    10000018
    10000110
    10000011
    10000013
    Of course you could also write that as
    my @extracted=map { m/\d{8}/g } <DATA>;
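    One caveat worth sketching: a bare \d{8} also matches the first eight digits of a longer run. None of the sample formats above contains such a run, so this only matters if the real files might. The helper name exact_eight below is made up for illustration and is not part of the original answer; it uses lookarounds to require that no digit sits on either side of the match.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): return every run of exactly
# eight digits in a line, skipping digits that belong to a longer run.
sub exact_eight {
    my ($line) = @_;
    # (?<!\d) and (?!\d) assert that no digit precedes or follows the match.
    return $line =~ /(?<!\d)\d{8}(?!\d)/g;
}

# A made-up 12-digit run, to compare the two patterns.
my @plain   = '123456789012' =~ /\d{8}/g;        # unguarded pattern
my @guarded = exact_eight('123456789012');       # boundary-aware pattern

# A line in "file format 5" from the question.
my @ids = exact_eight('10000021/1,10000021');

print "plain:   @plain\n";
print "guarded: @guarded\n";
print "ids:     @ids\n";
```

    Here the unguarded pattern picks 12345678 out of the 12-digit run, the guarded one returns nothing for it, and both numbers on the format 5 line are still found.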
    HTH

    Yves / DeMerphq
    ---
    Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)

      Thanks very much - as usual I was over-complicating things. Col
      How would I pick out numbers that are 5 to 8 digits long? I've tried using m/\d{5,8}/g, but that doesn't work. TIA
Re: regular expression to extract 8 digit number from files
by Anonymous Monk on Aug 16, 2002 at 12:21 UTC
    OK - done it - requires /\d{5,8}/.
      Hmm, I don't see why /\d{5,8}/g shouldn't work... Are you sure?
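      For reference, a small sketch of how \d{5,8} behaves in list context with and without /g. The sample lines are partly taken from the question and partly invented, and the thread does not say how the pattern was actually being used, so this is not a diagnosis of the problem reported above.

```perl
use strict;
use warnings;

# Sample lines: three from the question's formats, one made up.
my @samples = ('1,10000021', '10000043,,zrous', '10000021/1', '12345,678,1234567');

# With /g, a match in list context returns every run of 5 to 8 digits.
my @found = map { /\d{5,8}/g } @samples;

# Without /g (and without capture groups), a successful match in list
# context returns the list (1), not the matched text.
my @flags = map { /\d{5,8}/ } @samples;

# The quantifier is greedy, so a run longer than 8 digits is split up:
# this made-up 13-digit run yields an 8-digit piece and a 5-digit piece.
my @split = '1234567890123' =~ /\d{5,8}/g;

print "found: @found\n";
print "flags: @flags\n";
print "split: @split\n";
```

      So /\d{5,8}/g does extract the numbers when the match is evaluated in list context, for example inside push or map as in the first reply.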

      Yves / DeMerphq
      ---
      Software Engineering is Programming when you can't. -- E. W. Dijkstra (RIP)