renovatio has asked for the wisdom of the Perl Monks concerning the following question:

I have an XHTML file that uses MathML to present definite integrals. The file contains some dummy variables, and I want to use XML::Twig to replace them.
integrals.xhtml:
<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
  <mml:mstyle mathsize="1em" displaystyle="true" mathcolor="#0000ff" fontfamily="serif">
    <mml:mrow>
      <mml:msubsup>
        <mml:mstyle displaystyle="true">
          <mml:mo>&#x222b;</mml:mo>
        </mml:mstyle>
        <mml:mn>0</mml:mn>
        <mml:mn>4</mml:mn>
      </mml:msubsup>
    </mml:mrow>
    <mml:msqrt>
      <mml:mrow>
        <mml:mrow>
          <mml:msup>
            <mml:mi>s</mml:mi>
            <mml:mn>2</mml:mn>
          </mml:msup>
        </mml:mrow>
        <mml:mo>+</mml:mo>
        <mml:mi>s</mml:mi>
      </mml:mrow>
      <mml:mspace width="0em" height="1.2ex" />
    </mml:msqrt>
    <mml:mo maxsize="1">(</mml:mo>
    <mml:mn>2</mml:mn>
    <mml:mi>s</mml:mi>
    <mml:mo>+</mml:mo>
    <mml:mn>1</mml:mn>
    <mml:mo maxsize="1">)</mml:mo>
    <mml:mspace width="0.167em" />
    <mml:mi>d</mml:mi>
    <mml:mi>s</mml:mi>
  </mml:mstyle>
</mml:math>
First, I need to find the elements that hold the dummy variable, such as:
<mml:mi>s</mml:mi>
Does anyone know an easy way to do this?

Replies are listed 'Best First'.
Re: Replace MathML content using Twig
by GrandFather (Saint) on May 10, 2011 at 08:56 UTC

    What 'this'? Here's a way to run through the XML and replace the contents of mml:mi elements with '...'. Maybe that's enough to get you started?

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use XML::Twig;

    my $xml = <<XML;
    <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
      <mml:mstyle mathsize="1em">
        <mml:mi>s</mml:mi>
        <mml:mo>+</mml:mo>
        <mml:mi>s</mml:mi>
      </mml:mstyle>
    </mml:math>
    XML

    my $twig = XML::Twig->new (
        twig_roots               => {'mml:mi' => \&subst,}, # process the element
        twig_print_outside_roots => 1,                      # print the rest
        );

    $twig->parse ($xml);

    sub subst {
        my ($twig, $value) = @_;

        print '<mml:mi>...</mml:mi>';
    }

    Prints:

    <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
      <mml:mstyle mathsize="1em">
        <mml:mi>...</mml:mi>
        <mml:mo>+</mml:mo>
        <mml:mi>...</mml:mi>
      </mml:mstyle>
    </mml:math>

    Update: fix copy and paste madness.

    True laziness is hard work
      Thank you so much!!
      That is very helpful for me.
      BTW, can I add some conditions to the subst sub?
      The original xhtml file:
      .... <mml:mi>s</mml:mi> <mml:mi>sin</mml:mi> ..

      Only <mml:mi>s</mml:mi> should be replaced with <mml:mi>x</mml:mi>.
      The resulting new xhtml file:
      .... <mml:mi>x</mml:mi> <mml:mi>sin</mml:mi> ..

        Sure. Change the sub to:

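        sub subst {
            my ($twig, $value) = @_;

            # subs_text is documented as taking a regexp built with qr;
            # the anchors make sure only a lone 's' matches, not 'sin'
            $value->subs_text (qr/^s$/, 'x');
            $value->print ();
        }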

        I can thoroughly recommend the XML::Twig documentation by the way. It is long, but there are some really good examples.
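If more than one dummy variable needs renaming, the same condition can be driven by a lookup table. The following is only a sketch: the helper name rename_identifiers and the hash of replacements are made up for illustration, and it uses twig_handlers with sprint (rather than twig_roots) so that the result comes back as a string.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper (not part of XML::Twig): rename mml:mi elements whose
# whole text is a key of %$rename; anything else, such as 'sin', is untouched.
sub rename_identifiers {
    my ($xml, $rename) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            'mml:mi' => sub {
                my ($t, $mi) = @_;
                my $name = $mi->text;
                $mi->set_text($rename->{$name}) if exists $rename->{$name};
                return 1;
            },
        },
    );

    $twig->parse($xml);
    return $twig->sprint;    # the whole document as a string
}

# Small sample input, assumed for this sketch
my $xml = '<mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">'
        . '<mml:mi>s</mml:mi><mml:mi>sin</mml:mi><mml:mi>d</mml:mi>'
        . '</mml:math>';

print rename_identifiers($xml, { s => 'x' }), "\n";
```

Because the lookup is an exact hash match on the element text, it behaves like the anchored regexp: only identifiers that are exactly "s" are changed.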

        True laziness is hard work
Re: Replace MathML content using Twig
by cjb (Friar) on May 10, 2011 at 11:05 UTC

    This will replace <mml:mi>s</mml:mi> with <mml:mi>S</mml:mi>

    #!/opt/perl/bin/perl
    use Modern::Perl;
    use XML::Twig;

    my $xml = do { local $/; <DATA> };

    my $twig=XML::Twig->new(TwigHandlers => {'mml:mi' => \&mmlmi});
    $twig->parse($xml);
    $twig->print(pretty_print => 'indented');

    sub mmlmi {
        my ($fromtwig, $mmlmifrom) = @_;
        my $mmlmi = $mmlmifrom->text;
        if ($mmlmi eq 's') {
            $mmlmifrom->set_text('S');
        }
    }

    __DATA__
    <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
      <mml:mstyle mathsize="1em" displaystyle="true" mathcolor="#0000ff" fontfamily="serif">
        <mml:mrow>
          <mml:msubsup>
            <mml:mstyle displaystyle="true">
              <mml:mo>&#x222b;</mml:mo>
            </mml:mstyle>
            <mml:mn>0</mml:mn>
            <mml:mn>4</mml:mn>
          </mml:msubsup>
        </mml:mrow>
        <mml:msqrt>
          <mml:mrow>
            <mml:mrow>
              <mml:msup>
                <mml:mi>s</mml:mi>
                <mml:mn>2</mml:mn>
              </mml:msup>
            </mml:mrow>
            <mml:mo>+</mml:mo>
            <mml:mi>s</mml:mi>
          </mml:mrow>
          <mml:mspace width="0em" height="1.2ex" />
        </mml:msqrt>
        <mml:mo maxsize="1">(</mml:mo>
        <mml:mn>2</mml:mn>
        <mml:mi>s</mml:mi>
        <mml:mo>+</mml:mo>
        <mml:mn>1</mml:mn>
        <mml:mo maxsize="1">)</mml:mo>
        <mml:mspace width="0.167em" />
        <mml:mi>d</mml:mi>
        <mml:mi>s</mml:mi>
      </mml:mstyle>
    </mml:math>

    You could use parsefile instead of parse to read from the file system, and print_to_file instead of print to write the result back to the file system.
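A rough sketch of the file-based variant, assuming the input lives in integrals.xhtml as in the original question. The sub name rename_variable and the output file name are invented for this example. parsefile and print_to_file are the XML::Twig methods mentioned above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::Twig;

# Hypothetical helper: read $in_file, rename every mml:mi whose text is
# exactly $old to $new, and write the result to $out_file.
sub rename_variable {
    my ($in_file, $out_file, $old, $new) = @_;

    my $twig = XML::Twig->new(
        twig_handlers => {
            'mml:mi' => sub {
                my ($t, $mi) = @_;
                $mi->set_text($new) if $mi->text eq $old;
                return 1;
            },
        },
    );

    $twig->parsefile($in_file);         # read from the file system
    $twig->print_to_file($out_file);    # write the modified document back
    return;
}
```

It could then be called as rename_variable('integrals.xhtml', 'integrals_new.xhtml', 's', 'x'). Writing to a separate output file keeps the original intact while testing.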

    20110510 @ 11:32 UTC Added print_to_file & parsefile

    20110510 @ 12:52 UTC Added ! to first line. Doh!

      Thank you, GrandFather, cjb and choroba!

      These responses are very valuable and helpful for me.

Re: Replace MathML content using Twig
by choroba (Cardinal) on May 10, 2011 at 11:06 UTC
    Using XML::XSH2:
    open 903926.xml ;
    map :i {s/(.*)/[$1]/} //mml:mi[text()!="s"] ; # wrap non-s in square brackets
    map "..." //mml:mi[text()="s"] ;              # replace s with ...