Team

I need help fixing the problem below; I have been unable to find a solution.

I am trying to write a program that extracts every <BIB>...</BIB> element from a text file.

The problem is this: When my find code is this

while ($data1 =~ m{(<BIB>.*</BIB>)}gx)

the output comes as

<BIB>Falco (2012)</BIB> today Louise is hardly isolated. More than 5 million babies have been born using the procedure, which has become almost routine. And at the age of 28, Louise became a mother herself, giving birth to a baby boy name Cameron—conceived, by the way, in the old-fashioned way (<BIB>Falco, 2012</BIB>; <BIB>ICMRT, 2012</BIB>
Total occurrences of <BIB> is 1

which is not what I want.

When my find code is changed to this

while ($data1 =~ m{(<BIB>)}gx)

I get something closer: at least the reported count matches the actual number of <BIB> elements in the text.
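The difference between the two results seems to come down to greediness: `.*` runs to the last `</BIB>` on the line, so the whole span counts as a single match. Below is a small sketch that counts matches for the two patterns from this post and for a non-greedy variant (`.*?`, which is not in the original code). The sample string is shortened from the input text, and the variable names are only for illustration.

```perl
use strict;
use warnings;

# Shortened stand-in for one line of the input text.
my $line = 'In fact, <BIB>Falco (2012)</BIB> today ... (<BIB>Falco, 2012</BIB>; <BIB>ICMRT, 2012</BIB>).';

# Assigning to an empty list forces list context, so each scalar
# receives the number of matches for that pattern.
my $greedy     = () = $line =~ m{(<BIB>.*</BIB>)}g;    # first <BIB> to last </BIB>
my $tag_only   = () = $line =~ m{(<BIB>)}g;            # opening tags only
my $non_greedy = () = $line =~ m{(<BIB>.*?</BIB>)}g;   # each <BIB> to its nearest </BIB>

print "greedy: $greedy, tag only: $tag_only, non-greedy: $non_greedy\n";
```

On this sample the greedy pattern matches once, while the other two match three times.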

What I want is this, each entry saved as an array value:

<BIB>Falco (2012)</BIB>

<BIB>Falco, 2012</BIB>

<BIB>ICMRT, 2012</BIB>

use strict;
use 5.14.2;

my $bib_count = 0;
my $INPUT_REF_FH;
my @text_found;

open $INPUT_REF_FH, "<:utf8", "ch01.txt";
binmode STDOUT, ':utf8';

while (<$INPUT_REF_FH>) {
    my $data1 = $_;
    while ($data1 =~ m{(<BIB>.*</BIB>)}gx) {
        $bib_count += 1;
        # print "$&\n";
        push @text_found, ${^MATCH};
    }
}

foreach (@text_found) {
    print "$_\n";
}
print "Total occurrences of <BIB> is $bib_count";
close $INPUT_REF_FH;

INPUT TEXT:

In fact, <BIB>Falco (2012)</BIB> today Louise is hardly isolated. More than 5 million babies have been born using the procedure, which has become almost routine. And at the age of 28, Louise became a mother herself, giving birth to a baby boy name Cameron—conceived, by the way, in the old-fashioned way (<BIB>Falco, 2012</BIB>; <BIB>ICMRT, 2012</BIB>).
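One possible way to get each entry as its own array value is sketched below. It assumes <BIB> elements are never nested and never span more than one line. The `@lines` array is a hypothetical stand-in for the lines read from ch01.txt, shortened from the input text above. The sketch pushes `$1` rather than `${^MATCH}`, because `${^MATCH}` is only documented to be set when the match uses the `/p` modifier.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the lines of ch01.txt.
my @lines = (
    "In fact, <BIB>Falco (2012)</BIB> today Louise is hardly isolated.\n",
    "... in the old-fashioned way (<BIB>Falco, 2012</BIB>; <BIB>ICMRT, 2012</BIB>).\n",
);

my @text_found;
for my $data1 (@lines) {
    # .*? is non-greedy, so each match ends at the nearest </BIB>.
    # $1 holds the text captured by the parentheses.
    while ($data1 =~ m{(<BIB>.*?</BIB>)}g) {
        push @text_found, $1;
    }
}

print "$_\n" for @text_found;
print "Total occurrences of <BIB> is ", scalar(@text_found), "\n";
```

With a real file, the `for` loop over `@lines` would be replaced by the existing `while (<$INPUT_REF_FH>)` loop; the count can be taken from the array size, so a separate counter is optional.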


In reply to Extract Data between Tags by ppremkumar
