http://qs1969.pair.com?node_id=606397


in reply to Help in manipulating values from two arrays

How many times do we have to tell the children? Do not hand roll code to parse HTML/XML. Life just ain't long enough for that to be worthwhile, even as an exercise.

Use HTML::TreeBuilder for HTML or XML::Twig for XML. In this case it looks like XML, so let's wrap a root element around the sample data provided and see what we can do:

use strict;
use warnings;
use XML::Twig;

my $xml = <<XML;
<root>
<act>Key</act><emp>3384</emp><job>78082</job><chap>6</chap><pg>20</pg><time>0.7</time><prod>114.285714285714</prod>
<act>Reconcile</act><emp>3017</emp><job>78062</job><chap>2-7</chap><pg>0</pg><time>1.4</time><prod>Insufficient Information</prod>
<act>Training</act><emp>3384</emp><job>77654</job><chap>-</chap><pg>0</pg><time>5.1</time><prod>Non-Billable</prod>
<act>Management</act><emp>3017</emp><job>77893</job><chap>-</chap><pg>0</pg><time>4.4</time><prod>Non-Billable</prod>
<act>Break</act><emp>3379</emp><job>33843</job><chap>-</chap><pg>0</pg><time>0.2</time><prod>Non-Billable</prod>
<act>Excess overload</act><emp>3379</emp><job>77570</job><chap>14</chap><pg>1</pg><time>0.5</time><prod>6.66666666666667</prod>
<act>Management</act><emp>3123</emp><job>88898</job><chap>-</chap><pg>0</pg><time>0.5</time><prod>Non-Billable</prod>
<act>Management</act><emp>3123</emp><job>22304</job><chap>-</chap><pg>0</pg><time>0.3</time><prod>Insufficient Information</prod>
<act>Management</act><emp>3123</emp><job>11121</job><chap>-</chap><pg>0</pg><time>1.4</time><prod>Non-Billable</prod>
<act>Adapt</act><emp>3123</emp><job>78143</job><chap>08-</chap><pg>0</pg><time>0.3</time><prod>Insufficient Information</prod>
<act>Import</act><emp>3417</emp><job>76584</job><chap>App K</chap><pg>4</pg><time>1.0</time><prod>11.4285714285714</prod>
<act>Break</act><emp>3123</emp><job>22732</job><chap>-</chap><pg>0</pg><time>0.4</time><prod>50.65687</prod>
<act>key</act><emp>3123</emp><job>78143</job><chap>08</chap><pg>0</pg><time>3.3</time><prod>45.5544</prod>
<act>Supervision</act><emp>3192</emp><job>54281</job><chap>-</chap><pg>0</pg><time>4.0</time><prod>Non-Billable</prod>
</root>
XML

# Only the emp and prod elements are of interest; each gets its own handler.
my $t= XML::Twig->new (twig_roots => {emp => \&emp, prod => \&prod});
my %emp;
my $currEmp;

$t->parse ($xml);

print "$_ - $emp{$_}\n" for sort keys %emp;

sub emp {
    my ($t, $data) = @_;

    # Remember the employee so the following prod element can be credited to it.
    $currEmp = $data->trimmed_text ();
    $emp{$currEmp} ||= '';
}

sub prod {
    my ($t, $data) = @_;
    my $text = $data->trimmed_text ();

    # Skip non-numeric values such as "Non-Billable".
    return if $text !~ /^\d+(\.\d*)?/;

    $emp{$currEmp} ||= 0;
    $emp{$currEmp} += $text;
}

Prints:

3017 - 
3123 - 96.21127
3192 - 
3379 - 6.66666666666667
3384 - 114.285714285714
3417 - 11.4285714285714

DWIM is Perl's answer to Gödel

Re^2: Help in manipulating values from two arrays
by zerogeek (Monk) on Mar 24, 2007 at 21:39 UTC
    I, for one, often like to find the solutions that don't involve modules. As a fairly new coder, I am not confident in their use. Additionally, I think that many of the solutions that can be solved without the use of a module are enhancing my understanding of how perl works.

    On the other hand, learning something about modules is helpful as well... and helping everyone learn is what this site is supposed to be about.

      All problems that can be solved using a module (written by someone else) can be solved without using the module. It's just that you may end up rewriting the module! Redoing hundreds or thousands of hours of work may be a good way of learning, but it doesn't get the task at hand achieved in a timely fashion.

      Of course if you really want to learn stuff try solving the same problems in assembly language or Ook! - you'll learn all sorts of stuff about frustration and low productivity, but those are probably not the things you want to learn.

      One of the important lessons to learn here is that there are a lot of very clever people writing modules for Perl and making them freely available. Using those modules can save you a lot of time. Peeking at the internals of those modules can teach you a lot about coding techniques. Using modules you can win both ways - learning and saving time.


      DWIM is Perl's answer to Gödel
        Grandfather-
        I didn't mean to be disrespectful in any way. I was only trying to point out how my brain is trying to work out the problems. I have trouble understanding how to use the modules.

        With that in mind, after reading your response, I learned a bit about them. I am going to take a look at some of the code that I have that uses modules and try to look into the module itself. That just might be the ticket!

        Thanks for your response :)

      I, for one, often like to find the solutions that don't involve modules. As a fairly new coder, I am not confident in their use.

      That attitude can be a little bit dangerous. If you have a chance, skim the XML 1.0 specification. I'm certainly not going to hand-roll code to parse XML in a couple of hours, and I'm a fairly experienced coder.

      You're a lot better off spending your learning time figuring out how to take advantage of work other people have already done. This particular case is awfully complex.

        WOAH!
        Great example, but that seems a bit out of the scope of the OP I think. What he was looking to do (and really, much of what those of us new to Perl are trying to do) didn't look to be too hard without the module.

        I understand that modules certainly have their place. No doubt in my mind, but isn't it also fair (when doing something like this that is fairly simple) to try and figure it out on one's own? I think there is just as much learning value in that and was only trying to make that point in my OP.

        Of course, I'm just sitting down to read through Learning Perl for the 2nd time and in no way consider myself a programmer. Anyhow, this is starting to get way OT from the OP and for that I apologize.

Re^2: Help in manipulating values from two arrays
by rsriram (Hermit) on Mar 26, 2007 at 05:46 UTC

    hi

    Thank you very much!! Your code is amazing. I just need one more bit of help. I have the contents <act>key</act>... in a separate XML file. I can't get it to work if I use open, or if I read the file into a variable and use that variable in the place where you put the content. Can you please tell me how I can read an external file here?

    Thanks for all your help with this.
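    One way to read the records from a separate file is sketched below. It is only a sketch under assumptions: the temporary file stands in for your real file, load_records is a made-up helper name, and the file is assumed to hold bare records with no enclosing root element (as in the sample originally posted), so the contents are wrapped in <root> before parsing.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use XML::Twig;

# Hypothetical stand-in for the separate file: two records, no root element.
my ($fh, $file) = tempfile (SUFFIX => '.xml', UNLINK => 1);
print $fh "<act>Key</act><emp>3384</emp><prod>114.285714285714</prod>\n";
print $fh "<act>Break</act><emp>3379</emp><prod>Non-Billable</prod>\n";
close $fh or die "Can't close $file: $!";

# Slurp the file and wrap it in a root element so it is well-formed XML.
# (load_records is an illustrative helper, not part of XML::Twig.)
sub load_records {
    my ($path) = @_;

    open my $in, '<', $path or die "Can't open $path: $!";
    my $body = do { local $/; <$in> };
    close $in;

    return "<root>\n$body</root>\n";
}

my @emps;
my $t = XML::Twig->new (
    twig_roots => {emp => sub { push @emps, $_[1]->trimmed_text () }},
);

$t->parse (load_records ($file));
print "$_\n" for @emps;
```

    If the file already has a single root element, $t->parsefile ($file) can be used in place of the slurp-and-wrap step.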

      It's not clear to me what you are trying to achieve with the separate file. Perhaps you could sketch in code what you are trying to do? Not a fully worked attempt to provide a solution to the problem, but an outline of the steps you think are required.


      DWIM is Perl's answer to Gödel

        Hi

        From the lines of text I mentioned previously, the output should display the average of the <prod> values for each employee, ignoring any <prod> whose content is "Non-Billable" or "Insufficient Information". To be more specific, if there are 4 lines which have <emp>3984</emp>, I need to display the average of all the <prod> values on those lines. Thanks for all your efforts.
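The averaging described above could be sketched along the following lines. This is an adaptation of the XML::Twig code earlier in the thread rather than a posted solution: the %sum and %count hashes are names introduced here, only a few of the sample rows are used, and it assumes the average should be taken over the numeric <prod> values only.

```perl
use strict;
use warnings;
use XML::Twig;

# A few rows from the sample data, wrapped in a root element.
my $xml = <<'XML';
<root>
<act>Management</act><emp>3123</emp><job>88898</job><chap>-</chap><pg>0</pg><time>0.5</time><prod>Non-Billable</prod>
<act>Break</act><emp>3123</emp><job>22732</job><chap>-</chap><pg>0</pg><time>0.4</time><prod>50.65687</prod>
<act>key</act><emp>3123</emp><job>78143</job><chap>08</chap><pg>0</pg><time>3.3</time><prod>45.5544</prod>
<act>Supervision</act><emp>3192</emp><job>54281</job><chap>-</chap><pg>0</pg><time>4.0</time><prod>Non-Billable</prod>
</root>
XML

# Running total and number of numeric prod values, keyed by employee.
my (%sum, %count, $currEmp);

my $t = XML::Twig->new (twig_roots => {emp => \&emp, prod => \&prod});
$t->parse ($xml);

for my $emp (sort keys %count) {
    if ($count{$emp}) {
        print "$emp - ", $sum{$emp} / $count{$emp}, "\n";
    } else {
        print "$emp - no numeric prod values\n";
    }
}

sub emp {
    my ($t, $data) = @_;

    # Remember the employee and make sure it is listed even with no numbers.
    $currEmp = $data->trimmed_text ();
    $count{$currEmp} ||= 0;
}

sub prod {
    my ($t, $data) = @_;
    my $text = $data->trimmed_text ();

    # Ignore "Non-Billable", "Insufficient Information" and anything else non-numeric.
    return if $text !~ /^\d+(?:\.\d*)?$/;

    $sum{$currEmp} += $text;
    $count{$currEmp}++;
}
```

Keeping a separate count per employee is what turns the earlier running total into an average; employees with no numeric <prod> at all are reported rather than divided by zero.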