more elegant way to parse this?

morgon has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

On OpenWrt I am trying to list the visible Wi-Fi networks by parsing the output of iw scan.

This output looks like this:

BSS <blah blah>
   SSID: <ssid>
   <blah blah>
BSS <and so on>

What I am interested in is the set of all ssids, each being contained in a stanza that starts with "BSS" and ends when the next "BSS" is encountered.

So I parse it like this:

my $out = qx| sudo iw dev wlan0 scan |;
$out .= "\nBSS";

my @chunks = $out =~ /^(BSS.*?)(?=^BSS)/smg;

my @essids = map { /SSID: (.*?)$/ms; $1 } @chunks;

And that works, but it bugs me that I manually add a "synthetic" BSS to the output of iw so I can then use a lookahead in the regex that would also match on the last entry.

So I wonder: Is there a more elegant way to do this?

Many thanks!


Replies are listed 'Best First'.
Re: more elegant way to parse this? by tybalt89 (Monsignor) on Feb 24, 2017 at 06:39 UTC
#!/usr/bin/perl
use strict;
use warnings;

my $out = join '', <DATA>; # fake data :)

my @essids = map /SSID: (.*)/, split /^BSS/m, $out;

use Data::Dumper; print Dumper \@essids;

__DATA__
BSS <blah blah>
   SSID: <ssid>
   <blah blah>
BSS <blah blah>
   SSID: <ssid2>
   <blah blah>
BSS <blah blah>
   SSID: <ssid3>
   <blah blah>

yes? no? maybe?
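
A minimal sketch of why the split approach needs no synthetic trailing "BSS": the text after the last separator is simply the final field. The helper name `ssids_from_scan` and the sample text (including a stanza without an SSID line) are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: takes the whole scan output as one string.
sub ssids_from_scan {
    my ($out) = @_;

    # split /^BSS/m cuts at every "BSS" that starts a line (the separator
    # itself is dropped). Whatever follows the last "BSS" is just the final
    # field, so neither a lookahead nor an artificial trailer is needed.
    my @stanzas = split /^BSS/m, $out;

    # The first field is the (here empty) text before the first "BSS".
    # In list context a failed match returns an empty list, so fields
    # without an SSID line contribute nothing.
    return map { /SSID: (.*)/ } @stanzas;
}

my $sample = <<'EOD';
BSS <blah blah>
   SSID: <ssid>
   <blah blah>
BSS <blah blah>
   <blah blah>
BSS <blah blah>
   SSID: <ssid3>
EOD

print "$_\n" for ssids_from_scan($sample);
```

With the real command, the same helper could be fed the string returned by qx in scalar context.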
Re: more elegant way to parse this? by johngg (Canon) on Feb 24, 2017 at 12:09 UTC
How about an alternation in the look-ahead? I also made the capture non-greedy.

johngg@shiraz:~/perl/Monks > perl -Mstrict -Mwarnings -E '
open my $cmdFH, q{<}, \ <<EOD or die $!;
BSS <blah blah>
   SSID: <ssid>
   <blah blah>
BSS <blah blah>
   SSID: <ssid2>
   <blah blah>
BSS <blah blah>
   SSID: <ssid3>
   <blah blah>
EOD

my $out = do { local $/; <$cmdFH>; };
my @ssids = $out =~ m{SSID: (.*?)(?=(?:^BSS|\z))}smg;
say qq{--<$_>--} for @ssids;'
--<<ssid>
   <blah blah>
>--
--<<ssid2>
   <blah blah>
>--
--<<ssid3>
   <blah blah>
>--

I hope this helps.

Cheers,

JohnGG
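
Note that with /s the capture above runs to the end of each stanza, so the trailing lines come along with the ssid. One possible variation (an assumption about what is wanted, not part of the reply) captures only the rest of the SSID line and lets an uncaptured `.*?` consume the remainder up to the same look-ahead:

```perl
use strict;
use warnings;

my $out = <<'EOD';
BSS <blah blah>
   SSID: <ssid>
   <blah blah>
BSS <blah blah>
   SSID: <ssid2>
   <blah blah>
EOD

# [^\n]* captures only the rest of the SSID line. The uncaptured .*?
# then skips the remainder of the stanza, stopping before the next line
# that starts with "BSS" or, for the last stanza, at the end of the string.
my @ssids = $out =~ m{SSID: ([^\n]*).*?(?=^BSS|\z)}smg;

print "--<$_>--\n" for @ssids;
```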
Re: more elegant way to parse this? by 1nickt (Canon) on Feb 24, 2017 at 12:55 UTC
Hi morgon,

Elegance is in the eye of the beholder, I reckon. Personally I sometimes find it more elegant to decompose things into neat little units than to combine everything into one operation.

use strict; use warnings; use feature 'say';

my $input = do { local $/; <DATA> };

my @output = map { /(\S+)$/ } grep { /SSID/ } split /\n/, $input;

say for @output;

__DATA__
BSS <blah blah>
   SSID: <ssid>
   <blah blah>
BSS <blah blah>
   SSID: <ssid2>
   <blah blah>
BSS <blah blah>
   SSID: <ssid3>
   <blah blah>

Output:

$ perl 1182702.pl
<ssid>
<ssid2>
<ssid3>

Hope this helps!

The way forward always starts with a minimal test.
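
One thing to watch with `/(\S+)$/`: it keeps only the last whitespace-free token of the line, so a network name containing spaces would be cut down to its final word. A small sketch of a variant that keeps the whole value after "SSID:" follows; the helper name `ssids_from_lines` and the sample name "Coffee Shop Guest" are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper working on a list of (already split) lines.
sub ssids_from_lines {
    my @lines = @_;

    # Capture everything after "SSID:" so that names with spaces survive;
    # trailing whitespace is trimmed. Lines that do not match return an
    # empty list in list context and therefore drop out of the result.
    return map { /^\s*SSID:\s*(.*?)\s*$/ } @lines;
}

my @lines = split /\n/, <<'EOD';
BSS <blah blah>
   SSID: Coffee Shop Guest
   <blah blah>
BSS <blah blah>
   SSID: <ssid2>
EOD

print "$_\n" for ssids_from_lines(@lines);
```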
Re: more elegant way to parse this? by Anonymous Monk on Feb 24, 2017 at 16:24 UTC
If you really want to split into records with a regex:

/^(BSS.*?)(?=^BSS|\z)/smg

But I don't know why you want the records at all. This format seems easiest to parse line-by-line.

my @essids;
for (qx| sudo iw dev wlan0 scan |) {
    /^\s*SSID:\s*(.*)/ and push @essids, $1;
}

If you need other information besides the ssid, maybe you can just do something like this:

my @records;
for (qx| sudo iw dev wlan0 scan |) {
    if (/^BSS\s*(.*)/) {
        push @records, { BSS=>$1 }
    }
    elsif (/^\s*(\w+):\s*(.*)/) {
        $records[-1]{$1} = $2 if @records
    }
}
foreach my $record (@records) {
    print "$record->{SSID}\n";
}
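
For trying the record-building loop without a wifi device, here is a self-contained sketch on canned lines. It assumes the patterns are meant to read `\s*` and `(.*)`; the helper name `parse_scan`, the "signal" key and the "(no SSID)" placeholder are illustrative choices, not part of the reply.

```perl
use strict;
use warnings;

# Hypothetical helper: builds one hash per BSS stanza from a list of lines.
sub parse_scan {
    my @records;
    for (@_) {
        if (/^BSS\s*(.*)/) {
            push @records, { BSS => $1 };    # a new stanza starts here
        }
        elsif (/^\s*(\w+):\s*(.*)/) {
            # "key: value" line; skipped if it appears before any BSS line
            $records[-1]{$1} = $2 if @records;
        }
    }
    return @records;
}

# split /^/m keeps the newlines, much like qx in list context would.
my @records = parse_scan(split /^/m, <<'EOD');
BSS <blah blah>
   SSID: <ssid>
   signal: <blah>
BSS <blah blah>
   signal: <blah>
EOD

for my $record (@records) {
    # Guard against stanzas that carried no SSID line at all.
    my $ssid = defined $record->{SSID} ? $record->{SSID} : '(no SSID)';
    print "$record->{BSS}: $ssid\n";
}
```

The guard in the final loop avoids an "uninitialized value" warning for a stanza that has no SSID key.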