bulrush has asked for the wisdom of the Perl Monks concerning the following question:

Perl 5.8.8. The input line is text, but it contains application-specific markers of two types: <macro> and {tag}. Text is interspersed with these macros/tags. I want to create an array where each entry/slot contains either text only or macros/tags only.

Input line is: <macro1><macro2>This is line one.{tag1}<macro3>This is more text.

Output array should be:

[0]: <macro1><macro2>
[1]: This is line one.
[2]: {tag1}<macro3>
[3]: This is more text.

Thank you! Try to be kind, this is something new to me. Normally I don't have to separate this stuff.


Replies are listed 'Best First'.
Re: How to split line of text into array containing macros and non-macro text
by hazylife (Monk) on Mar 25, 2014 at 19:22 UTC
    This will probably do the trick:
    #!/usr/bin/perl -nl
    use strict;
    my @fields = split /((?:<[^>]+>|\[[^]]+])+)/;
    shift @fields if $fields[0] eq '';
    print for @fields;
      Thanks. This seems to work, but I have to hit the ENTER key when I run my test program to see the output from the print statement. I'd like to make this a subroutine. Test program:
      #!/usr/bin/perl -nl
      use strict;
      my($i,$t);
      $t="<macro1>[tag1]This is plain text.<macro2>More text.[tag2]And more text.";
      my @fields=split /((?:<[^>]+>|\[[^]]+])+)/, $t;
      shift @fields if $fields[0] eq '';
      print for @fields;
      exit;
      Why do I have to hit ENTER to execute the print statement?
        I have to hit the ENTER key when I run my test program
        Because you called Perl with -n which makes it read from input. See perlrun - how to execute the Perl interpreter for details.
        لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
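To make the effect of -n concrete: perlrun documents that the switch wraps the whole script in a `while (<>) { ... }` loop, so nothing in the script body runs until a line arrives on standard input, hence the ENTER. Below is a minimal sketch of the same test with -n dropped. It assumes the rest of the test program stays as posted; setting `$\` is my stand-in for the output half of -l, and the `@fields &&` guard is an addition.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# With -n, perl effectively runs:  LINE: while (<>) { ...script... }
# so the body waits for a line on STDIN before doing anything.
# Without -n, the body runs once, immediately.

$\ = "\n";    # imitates -l on output: print appends a newline

my $t = "<macro1>[tag1]This is plain text.<macro2>More text.[tag2]And more text.";
my @fields = split /((?:<[^>]+>|\[[^]]+])+)/, $t;

# Guard added here: only inspect the first field if there is one.
shift @fields if @fields && $fields[0] eq '';
print for @fields;
```

Run this way, the fields are printed as soon as the script starts, with no keypress needed.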
      Oops. Unfortunately my original post had a typo. Tags should be {tag} with braces, not brackets. And I can't seem to decipher your code to replace the brackets with braces. I only see one escaped bracket, which doesn't make sense to me.
        sub parse {
            my @fields = split /((?:<[^>]+>|{[^}]+})+)/, shift;
            shift @fields if $fields[0] eq '';
            return @fields;
        }
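On the "one escaped bracket" question: in the bracket version, `\[` is escaped because an unescaped `[` would open a character class. Inside `[^]]`, a `]` placed first after the `^` is taken literally, and the final `]` has no open class to close, so it is literal as well. Here is a sketch of the brace version run against the sample line from the original post. Escaping the braces as `\{` and `\}` is my choice rather than something in the reply above; it should match the same text, and newer perls are stricter about unescaped `{` in some positions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Split a line into alternating runs of macros/tags and plain text.
sub parse {
    my ($line) = @_;

    # The capturing group makes split return the delimiters (runs of
    # <macro> or {tag}) along with the text between them.
    my @fields = split /((?:<[^>]+>|\{[^}]+\})+)/, $line;

    # A line that starts with a macro yields an empty leading field.
    shift @fields if @fields && $fields[0] eq '';
    return @fields;
}

my @parts = parse('<macro1><macro2>This is line one.{tag1}<macro3>This is more text.');
printf "[%d]: %s\n", $_, $parts[$_] for 0 .. $#parts;
```

For the sample line this gives the four slots asked for: `<macro1><macro2>`, `This is line one.`, `{tag1}<macro3>`, and `This is more text.`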