
If that is your command above, you might want to consider a pure Perl implementation. It appears that you are selecting lines (grep) and then using a series of calls to awk to split the input into fields and subfields. All of this can be done quite easily in four or five lines of Perl (maybe fewer) using a regular expression and maybe split. A pure Perl implementation is likely to be much faster, as you will only need a single process rather than the four you are currently using in your pipe.

The following sample code illustrates how grep and awk can be mapped to Perl constructs. It is a lot more verbose than necessary because I've assigned things to variables to make it clearer exactly what is going on. The real production code could easily be condensed to no more than four lines inside the while loop, and possibly even to one line (print if regex matches) if the splits are replaced by a capturing regular expression:

while(my $line = <DATA>) {
  #grep 'DataDictionary'
  next unless $line =~ /DataDictionary/;

  #awk -F'<pciOFACViolation>' '{print $1}'
  my @aFields = split(/<pciOFACViolation>/, $line);
  my $sFieldICareAbout = $aFields[0];  #$1 in awk

  #awk '{print $3}'
  #split ' ' mimics awk's default: skip leading whitespace, split on runs
  @aFields = split(' ', $sFieldICareAbout);
  $sFieldICareAbout = $aFields[2];  #$3 in awk

  #awk -F'>' '{print $1}'
  @aFields = split(/>/, $sFieldICareAbout);
  $sFieldICareAbout = $aFields[0]; #$1 in awk
  print "$sFieldICareAbout\n";
}

__DATA__
*** *** G1>H>I<pciOFACViolation>DataDictionary
Whan that aprill with his shoures soote
The droghte of march hath perced to the roote,
And bathed every veyne in swich licour
Of which vertu engendred is the flour;
*** *** G2>H>I<pciOFACViolation>DataDictionary
Whan zephirus eek with his sweete breeth
Inspired hath in every holt and heeth
Tendre croppes, and the yonge sonne
Hath in the ram his halve cours yronne,
And smale foweles maken melodye,
That slepen al the nyght with open ye
(so priketh hem nature in hir corages);
*** *** G3>H>I<pciOFACViolation>DataDictionary
Thanne longen folk to goon on pilgrimages,
And palmeres for to seken straunge strondes,
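For comparison, the verbose loop might be condensed roughly as follows. This is only a sketch: the helper name extract_field and the three sample lines are made up for illustration, and it assumes Perl 5.10 or later for the // operator. Each statement corresponds to one stage of the original pipe.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post). Returns the extracted
# field, or undef when the line does not mention DataDictionary.
sub extract_field {
    my ($line) = @_;
    return unless $line =~ /DataDictionary/;               # grep 'DataDictionary'
    my ($head) = split /<pciOFACViolation>/, $line;        # awk -F'<pciOFACViolation>' '{print $1}'
    my $third  = (split ' ', $head // '')[2];              # awk '{print $3}'
    return (split />/, $third // '')[0] // '';             # awk -F'>' '{print $1}'
}

# Made-up sample lines in the same shape as the __DATA__ section above.
my @sample = (
    "*** *** G1>H>I<pciOFACViolation>DataDictionary\n",
    "Whan that aprill with his shoures soote\n",
    "*** *** G2>H>I<pciOFACViolation>DataDictionary\n",
);

for my $line (@sample) {
    my $field = extract_field($line);
    print "$field\n" if defined $field;    # prints G1, then G2
}
```

Like awk, this yields an empty string (rather than skipping the line) when a matching line has fewer than three whitespace-separated fields.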

The one-liner (print if regex) depends heavily on the exact format of each line, particularly the placement of "DataDictionary". To give you a feel for its succinctness, here is the one-line version for DataDictionary lines in the format shown above.

while(<DATA>) {
  print "$1\n"
    if /^\S+\s+\S+\s+([^>]+).*<pciOFACViolation>.*DataDictionary/;
}
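If "DataDictionary" may appear anywhere on the line, one possible variation is to test for it separately from the capture. The sketch below is an assumption about what the real data might need, not a tested replacement: the helper name capture_field and the second sample line are invented, and it assumes that <pciOFACViolation> never occurs before the third whitespace-separated field.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post). The DataDictionary test
# is a separate match, so the word may sit before or after the tag.
sub capture_field {
    my ($line) = @_;
    return unless $line =~ /DataDictionary/;
    # Skip two whitespace-separated fields, then capture the third field
    # up to (not including) its first '>'.
    return $line =~ /^\s*\S+\s+\S+\s+([^>\s]+)/ ? $1 : undef;
}

# Made-up sample lines; the second puts DataDictionary first on purpose.
my @sample = (
    "*** *** G1>H>I<pciOFACViolation>DataDictionary\n",
    "DataDictionary *** G2>H>I<pciOFACViolation>x\n",
    "Whan that aprill with his shoures soote\n",
);

for my $line (@sample) {
    my $field = capture_field($line);
    print "$field\n" if defined $field;    # prints G1, then G2
}
```

Unlike the awk pipe, this skips a matching line entirely when there is no third field or when that field begins with '>'.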

If you are interested in this approach, perhaps you could give us a few sample lines containing "DataDictionary"?

Best, beth

Update: Added code illustrating mapping of grep and awk to Perl constructs.

Update: Added more succinct example using one line (print if regex).

In reply to Re: commands with multiple pipes in perl by ELISHEVA
in thread commands with multiple pipes in perl by raghu_shekar
