HJ has asked for the wisdom of the Perl Monks concerning the following question:

Hi All,

I have been trying to parse an HDL file that has the format below

a='0
b='001
c='110
d='1

Expected Output: a d

Code:
while<> {
chomp;
if(~/(.*?)=\'[0|1]/)
#if(~/(.*?)=\'[0-1]/)
#if(~/(.*?)=\'[01]/)
{ print $1; }
}

Actual Output: a b c d

I tried each of the commented-out 'if' statements individually and get the same output every time. Any suggestions would help. Thanks.

Replies are listed 'Best First'.
Re: Parse for a Single bit
by davido (Cardinal) on Jan 27, 2012 at 22:27 UTC

    How about this?

    use strict;
    use warnings;

    while( <DATA> ) {
        if( /^(\w)='[01]$/ ) {
            print "$1 ";
        }
    }
    print "\n";

    __DATA__
    a='0
    b='001
    c='110
    d='1

    Despite your subject line mentioning "single bit", the code you present looks more like you're trying to find those strings that have only a single character consisting of 0 or 1 after the =', so that's the tack I took.

    One more note: please, when posting code here, do us the courtesy of making it presentable from a formatting standpoint. The code tags (which you used) are one step. But they're the final step. An earlier step ought to be to follow some basic indentation strategy. And the step before that is to fix syntax errors such as in your while loop.


    Dave

Re: Parse for a Single bit
by mbethke (Hermit) on Jan 27, 2012 at 22:27 UTC

    The character class is not the problem. The problem is that every line contains a variable name, an equals sign, a prime, and then either 0 or 1. You want the line to end right there, but unless you make that explicit the regexp matches and the remainder of the line is ignored. So you have to use /(.*?)=\'[01]$/

    Edit: davido beat me to it. Yes, (\w+) makes much more sense than (.*), as empty identifiers (or those made of weird characters ... although I vaguely remember some HDLs can use pretty uncommon chars in wire names, right?) are certainly an error.
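
    A minimal sketch of the anchoring point, using the four sample lines from the question. The array and variable names are illustrative and do not come from the original posts.

```perl
use strict;
use warnings;

# Sample lines taken from the question.
my @lines = ("a='0", "b='001", "c='110", "d='1");

# Without an end anchor, the pattern only needs one 0 or 1 after the ='
# and ignores whatever follows, so every line matches.
my @unanchored = map { /^(\w+)='[01]/  ? $1 : () } @lines;

# With $ the match must end right after that single bit.
my @anchored   = map { /^(\w+)='[01]$/ ? $1 : () } @lines;

print "unanchored: @unanchored\n";   # unanchored: a b c d
print "anchored:   @anchored\n";     # anchored:   a d
```

    The only difference between the two patterns is the trailing $, which is what turns "starts with one bit" into "is exactly one bit".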

      Kudos for knowing what HDLs are. To me, HDL is High-Density Lipoprotein. :)

      The problem presented by the OP only showed single alpha-character identifiers, so I went with \w. If there's a larger set of permissible identifiers, he might use an explicit character class, and if the identifiers may be longer than a single character, he might specify a quantifier. Greediness shouldn't be an issue since I'm explicitly anchoring to the start of the line, and to ='.


      Dave
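
      If identifiers can be longer or use extra characters, one possible shape is an explicit character class with a quantifier, as sketched here. The input lines and the allowed character set (word characters plus square brackets) are assumptions for illustration only; the real set depends on the HDL in question.

```perl
use strict;
use warnings;

# Hypothetical input: multi-character names, one with a bracketed index.
my @lines = ("clk='1", "data[3]='0", "addr='0101", "rst_n='1");

# Assumed identifier alphabet: word characters plus [ and ].
# Adjust the class to whatever the HDL actually permits.
my $ident = qr/[\w\[\]]+/;

my @single_bit;
for my $line (@lines) {
    # ^ and =' pin both ends of the name, so the greedy + cannot overrun.
    push @single_bit, $1 if $line =~ /^($ident)='[01]$/;
}

print "@single_bit\n";   # clk data[3] rst_n
```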

Re: Parse for a Single bit
by JavaFan (Canon) on Jan 27, 2012 at 23:43 UTC
    Are you sure you want the ~ in front of the matches? It makes your test always succeed: if the match succeeds, the result of ~/pattern/ is 4294967294 or 18446744073709551614, depending on whether you have 32-bit or 64-bit integers. If the match fails, ~/pattern/ is 4294967295 or 18446744073709551615, again depending on the bit size of your integers. All of these values are true.
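
    A small sketch of that effect, comparing a plain match against the same match with ~ in front. It reports only truthiness, so it does not depend on integer width; the variable names are illustrative.

```perl
use strict;
use warnings;

my @report;
for ("a='0", "b='001") {
    # Plain match against $_: true only for a single bit after the ='.
    my $plain   = /^\w+='[01]$/  ? "true" : "false";

    # ~ applies bitwise NOT to the match result (1 or 0);
    # neither ~1 nor ~0 is zero, so the test is always true.
    my $negated = ~/^\w+='[01]$/ ? "true" : "false";

    push @report, "$_: plain=$plain negated=$negated";
}

print "$_\n" for @report;
# a='0: plain=true negated=true
# b='001: plain=false negated=true
```

    Dropping the ~ (or writing $var =~ /pattern/ when matching against a named variable) gives the intended test.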