Assuming you missed my comments in the Chatterbox yesterday (or maybe you chose to ignore them :), I will restate them here. Since Verilog employs a complex syntax, a complex parser is required. If you can control the format of the Verilog source code you are trying to parse, you may be able to roll your own simple parser. Since that is unlikely, I wholeheartedly agree with shoness' advice to try the CPAN modules.

Here are some of the common pitfalls you are facing (and there are many, many more):

What if there are ports or always blocks inside comments, both single-line (//) and multi-line (/**/)?
What if someone used `include or `define compiler directives to declare ports?
What if there are functions declared within a module, which have their own ports?
How do you reliably determine the end of an always block? There is no "always_end" construct.
How do you deal with the IEEE Std 1364-2001 (Verilog-2001) syntax for ports, where directions are declared inside the module's port list?
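To make the first and third pitfalls concrete, here is a tiny sketch that applies a plain `\binput\b` test to three lines. The three input lines are invented for illustration and are not taken from the original poster's file.

```perl
use strict;
use warnings;

# Hypothetical input: one real port, one commented-out port,
# and one input that belongs to a function rather than the module.
my @lines = (
    "input        clk;\n",
    "// input     old_clk;\n",
    "function f; input x; f = x; endfunction\n",
);

# A naive line-oriented test: any line containing the word "input".
my @ins = grep { /\binput\b/ } @lines;

printf "matched %d lines, but only 1 is a module port\n", scalar @ins;
```

All three lines match, so two of the three "ports" reported by such a test would be wrong.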

Here are some techniques I have used to overcome some of these issues. You could pre-process the Verilog file to remove all comments using the regex in perlfaq6:

How do I use a regular expression to strip C style comments from a file?

This will work in most (but not all) cases.
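As a rough illustration of that pre-processing step, here is a minimal sketch. It is not the perlfaq6 regex, just a simplified one in the same spirit, and `strip_verilog_comments` is a name made up for this example. It assumes the whole file fits in a string and that only double-quoted strings need protecting; it does not try to handle compiler directives.

```perl
use strict;
use warnings;

# Hypothetical helper: remove // and /* */ comments from Verilog source,
# leaving double-quoted string literals untouched.
sub strip_verilog_comments {
    my ($src) = @_;
    $src =~ s{
        ( " (?: [^"\\] | \\. )* " )   # group 1: a string literal, kept as-is
      | /\* .*? \*/                   # a block comment, possibly multi-line
      | // [^\n]*                     # a line comment, up to the newline
    }{ defined $1 ? $1 : ' ' }gsex;   # comments become a single space
    return $src;
}

my $verilog = <<'END';
input a;        // input fake_port;
/* output b;
   always @(x) */ output c;
initial $display("// not a comment");
END

print strip_verilog_comments($verilog);
```

Replacing each comment with a space instead of nothing keeps neighbouring tokens from running together. Multi-line comments collapse to one space, so line numbers in the stripped text will no longer match the original file.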

If you have access to the Cadence ncverilog simulation tools, you could first compile the Verilog files, then de-compile the files using ncdc. This tool allows for some control over the resulting de-compiled files so that parsing may be simpler. I am not sure if other simulation tools have the same capability.

If you can edit the Verilog source files, you could embed pragmas in comments such as // always_end -- but that will only help you for future development.
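If such pragmas were present, the block collection could key on them instead of guessing. The following is only a sketch under that assumption: `extract_always_blocks` is a hypothetical helper, and it assumes every always block is closed by a `// always_end` comment and that blocks are not nested.

```perl
use strict;
use warnings;

# Hypothetical helper: return each always block as one string, where a
# block runs from the word "always" to a "// always_end" pragma.
sub extract_always_blocks {
    my @blocks;
    my $current;                          # undef while outside a block
    for my $line (@_) {
        if (defined $current) {
            if ($line =~ m{^(.*?)//\s*always_end\b}) {
                my $before = $1;          # code sharing the pragma's line
                $current .= "$before\n" if $before =~ /\S/;
                push @blocks, $current;
                undef $current;
            }
            else {
                $current .= $line;        # blank lines no longer end a block
            }
        }
        elsif ($line =~ /\b(always\b.*)/s) {
            $current = $1;                # drop anything before "always"
        }
    }
    return @blocks;
}

my @blocks = extract_always_blocks(
    "wire x; always @ (posedge clk)\n",
    "  begin\n",
    "\n",
    "    q <= d;\n",
    "  end // always_end\n",
    "assign y = x;\n",
);
print for @blocks;
```

Note that `\balways\b` does not match inside `always_end`, because the underscore counts as a word character, so the pragma itself never starts a new block.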

That being said, here is some EXTREMELY BRITTLE code which works for the Verilog code you provided:

use strict;
use warnings;

my @ins;
my @outs;
my @blocks;
my $flag = 0;
while (<DATA>) {
    if (/\binput\b/)  { push @ins , $_ }
    if (/\boutput\b/) { push @outs, $_ }
    if (/\balways\b/) {
        s/^.*\b(always)\b/$1/;
        push @blocks, $_;
        $flag = 1;
        next;
    }
    if ($flag) {
        push @blocks, $_;
        $flag = 0 if (/^\s*$/); # blank line ends always block
    }
}

create_file('inputs.v' , @ins);
create_file('outputs.v', @outs);
create_file('always.v' , @blocks);

sub create_file {
    my $file = shift;
    open my $fileHandle, '>', $file or die "Unable to create $file: $!\n";
    print $fileHandle $_ for (@_);
    close $fileHandle or die "Unable to close file $file: $!\n";
}

__DATA__
   input [31:0]           ucast_mem_wdata;
   input [31:0]           ucast_mem_wren;
   input               ucast_mem_wr;
   input               ucast_mem_rd;
   output [31:0]           ucast_mem_rdata;

   wire [BMU_MEM_ADDR_BITS-1:0]   mem_addr; always @ (posedge sys_clk or negedge sys_reset_n)
     begin
    if (!sys_reset_n)
      ucast_int_mem_rd_r <= 0;
    else
      ucast_int_mem_rd_r <= ucast_int_mem_rd & !ucast_access;
     end


endmodule

Here is the output:

> ./701343.pl
> cat inputs.v
   input [31:0]                   ucast_mem_wdata;
   input [31:0]                   ucast_mem_wren;
   input                          ucast_mem_wr;
   input                          ucast_mem_rd;
> cat outputs.v
   output [31:0]                  ucast_mem_rdata;
> cat always.v
always @ (posedge sys_clk or negedge sys_reset_n)
     begin
        if (!sys_reset_n)
          ucast_int_mem_rd_r <= 0;
        else
          ucast_int_mem_rd_r <= ucast_int_mem_rd & !ucast_access;
     end

>

In reply to Re: Reading block: Verilog parse by toolic
in thread Reading block by ravi030
