morgon has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

On OpenWrt I am trying to list the visible Wi-Fi networks by parsing the output of iw scan.

This output looks like this:

BSS <blah blah>
SSID: <ssid>
<blah blah>
BSS <and so on>

What I am interested in is the set of all SSIDs, each contained in a stanza that starts with "BSS" and ends where the next "BSS" begins.

So I parse it like this:

my $out = qx| sudo iw dev wlan0 scan |;
$out .= "\nBSS";
my @chunks = $out =~ /^(BSS.*?)(?=^BSS)/smg;
my @essids = map { /SSID: (.*?)$/ms; $1 } @chunks;

And that works, but it bugs me that I manually add a "synthetic" BSS to the output of iw so I can then use a lookahead in the regex that would also match on the last entry.

So I wonder: Is there a more elegant way to do this?

Many thanks!

Replies are listed 'Best First'.
Re: more elegant way to parse this?
by tybalt89 (Monsignor) on Feb 24, 2017 at 06:39 UTC
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $out = join '', <DATA>; # fake data :)
    my @essids = map /SSID: (.*)/, split /^BSS/m, $out;
    use Data::Dumper; print Dumper \@essids;

    __DATA__
    BSS <blah blah>
    SSID: <ssid>
    <blah blah>
    BSS <blah blah>
    SSID: <ssid2>
    <blah blah>
    BSS <blah blah>
    SSID: <ssid3>
    <blah blah>

    yes? no? maybe?
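
For readers wondering why the one-liner above needs no special handling of the last stanza: split /^BSS/m returns an empty leading chunk, and a match used as the map expression in list context returns its captures, or an empty list when it fails, so chunks without an SSID line simply drop out. A minimal sketch with made-up stanzas (the contents are placeholders, not real iw output):

```perl
use strict;
use warnings;

# Made-up sample; the second stanza deliberately has no SSID line.
my $out = join '',
    "BSS one\n",   "SSID: alpha\n", "rate: 1\n",
    "BSS two\n",   "rate: 2\n",
    "BSS three\n", "SSID: gamma\n";

# The text before the first BSS is empty, so split returns a leading ''.
my @chunks = split /^BSS/m, $out;

# In list context a failed match yields (), so '' and the SSID-less
# chunk contribute nothing to the result.
my @essids = map { /SSID: (.*)/ } @chunks;

print scalar(@chunks), " chunks, SSIDs: @essids\n";
```

The block form of map is used here only for readability; it should behave the same as the expression form in the reply.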

Re: more elegant way to parse this?
by johngg (Canon) on Feb 24, 2017 at 12:09 UTC

    How about an alternation in the look-ahead? I also made the capture non-greedy.

    johngg@shiraz:~/perl/Monks > perl -Mstrict -Mwarnings -E '
    open my $cmdFH, q{<}, \ <<EOD or die $!;
    BSS <blah blah>
    SSID: <ssid>
    <blah blah>
    BSS <blah blah>
    SSID: <ssid2>
    <blah blah>
    BSS <blah blah>
    SSID: <ssid3>
    <blah blah>
    EOD

    my $out = do { local $/; <$cmdFH>; };
    my @ssids = $out =~ m{SSID: (.*?)(?=(?:^BSS|\z))}smg;
    say qq{--<$_>--} for @ssids;'
    --<<ssid>
    <blah blah>
    >--
    --<<ssid2>
    <blah blah>
    >--
    --<<ssid3>
    <blah blah>
    >--

    I hope this helps.

    Cheers,

    JohnGG

Re: more elegant way to parse this?
by 1nickt (Canon) on Feb 24, 2017 at 12:55 UTC

    Hi morgon,

    Elegance is in the eye of the beholder, I reckon. Personally I sometimes find it more elegant to decompose things into neat little units than to combine everything into one operation.

    use strict; use warnings; use feature 'say';

    my $input = do { local $/; <DATA> };

    my @output = map { /(\S+)$/ } grep { /SSID/ } split /\n/, $input;

    say for @output;

    __DATA__
    BSS <blah blah>
    SSID: <ssid>
    <blah blah>
    BSS <blah blah>
    SSID: <ssid2>
    <blah blah>
    BSS <blah blah>
    SSID: <ssid3>
    <blah blah>

    Output:
    $ perl 1182702.pl
    <ssid>
    <ssid2>
    <ssid3>

    Hope this helps!


    The way forward always starts with a minimal test.
Re: more elegant way to parse this?
by Anonymous Monk on Feb 24, 2017 at 16:24 UTC

    If you really want to split into records with a regex:

    /^(BSS.*?)(?=^BSS|\z)/smg
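
A minimal sketch of that pattern applied to made-up sample data (the stanza contents are placeholders, not real iw output), showing that the \z alternative removes the need to append a synthetic "BSS":

```perl
use strict;
use warnings;

# Made-up sample standing in for the scan output.
my $out = join '',
    "BSS one\n", "SSID: alpha\n",
    "BSS two\n", "SSID: beta\n";

# \z lets the lookahead succeed at end of input, so the last stanza
# is captured without modifying $out first.
my @chunks = $out =~ /^(BSS.*?)(?=^BSS|\z)/smg;

# A failed match returns () in list context, so stanzas without an
# SSID line would be skipped rather than yielding a stale $1.
my @essids = map { /SSID: (.*)/ } @chunks;

print "$_\n" for @essids;
```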

    But I don't know why you want the records at all. This format seems easiest to parse line-by-line.

    my @essids;
    for (qx| sudo iw dev wlan0 scan |) {
        /^\s*SSID:\s*(.*)/ and push @essids, $1;
    }

    If you need other information besides the ssid, maybe you can just do something like this:

    my @records;
    for (qx| sudo iw dev wlan0 scan |) {
        if (/^BSS\s*(.*)/) {
            push @records, { BSS=>$1 }
        }
        elsif (/^\s*(\w+):\s*(.*)/) {
            $records[-1]{$1} = $2 if @records
        }
    }
    foreach my $record (@records) {
        print "$record->{SSID}\n";
    }
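
To try the record-building loop without a wireless interface, the same logic can be run over canned lines. This is only a sketch: the sample lines and key names below are made up, and the fallback for a missing SSID is an addition, not part of the reply above. It also makes one limitation visible: \w+ only matches single-word keys, so a line whose key contains a space is skipped.

```perl
use strict;
use warnings;

# Made-up lines standing in for: qx| sudo iw dev wlan0 scan |
my @lines = (
    "BSS one\n",
    "\tSSID: alpha\n",
    "\tsignal: -40\n",
    "\ttwo words: ignored\n",   # \w+ cannot match a key containing a space
    "BSS two\n",
    "\tsignal: -70\n",          # stanza without an SSID line
);

my @records;
for (@lines) {
    if (/^BSS\s*(.*)/) {
        # A BSS line starts a new record.
        push @records, { BSS => $1 };
    }
    elsif (/^\s*(\w+):\s*(.*)/) {
        # Any "key: value" line is attached to the most recent record.
        $records[-1]{$1} = $2 if @records;
    }
}

for my $record (@records) {
    # Fall back to a marker so a missing SSID does not trigger an
    # "uninitialized value" warning.
    my $ssid = $record->{SSID} // '(no SSID)';
    print "$record->{BSS}: $ssid\n";
}
```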