in reply to Help:getting parts of the strings from a file into managable variables

(Update: Noticed that UserIDs could repeat; changed code to merge values for duplicate UserIDs.)

Here's one way of doing it that stores the data as a hash of hashes:

    #!/usr/bin/perl
    use warnings;
    use strict;

    my %user_data;
    my $current_user;

    while (<DATA>) {
        if (my ($elem, $content) = m|^ <([^>]+)> (.*) </\1> |x) {
            $current_user = $content if $elem eq "UserID";
            $user_data{$current_user}{$elem} = $content;
        }
        else {
            print "Bad line $.: $_";
        }
    }

    use Data::Dumper;
    print Dumper(\%user_data);

    # $VAR1 = {
    #     '98766' => {
    #         'var6' => 'some string',
    #         'var1' => 'some string',
    #         'dev' => 'Some Text',
    #         'UserID' => '98766',
    #         'var2' => 'some string',
    #         'var5' => 'some string',
    #         'start' => '2004-10-21TO09:57:25Z'
    #     },
    #     '57864' => {
    #         'var6' => 'some string',
    #         'var1' => 'some string',
    #         'dev' => 'Some Text',
    #         'var4' => 'some string',
    #         'UserID' => '57864',
    #         'start' => '2004-10-25TO09:57:25Z'
    #     },
    #     '46786' => {
    #         'var3' => 'some string',
    #         'var1' => 'some string',
    #         'dev' => 'Some Text',
    #         'var4' => 'some string',
    #         'UserID' => '46786',
    #         'var2' => 'some string',
    #         'start' => '2004-10-21TO09:57:25Z'
    #     }
    # };

    __DATA__
    <UserID>46786</UserID>
    <start>2004-10-21TO09:57:25Z</start>
    <dev>Some Text</dev>
    <var1>some string</var1>
    <var2>some string</var2>
    <UserID>57864</UserID>
    <start>2004-10-25TO09:57:25Z</start>
    <dev>Some Text</dev>
    <var1>some string</var1>
    <UserID>46786</UserID>
    <var3>some string</var3>
    <var4>some string</var4>
    <UserID>98766</UserID>
    <start>2004-10-21TO09:57:25Z</start>
    <dev>Some Text</dev>
    <var1>some string</var1>
    <var2>some string</var2>
    <var5>some string</var5>
    <var6>some string</var6>
    <UserID>57864</UserID>
    <var4>some string</var4>
    <var6>some string</var6>

When parsing files, it's a good idea to detect and report errors. A few of your sample lines, for example, had opening and closing tags that didn't match. I fixed them in my example data, but only after the error-reporting code caught them.

Cheers,
Tom

Re^2: Help:getting parts of the strings from a file into managable variables
by my_perl (Initiate) on Nov 11, 2004 at 22:41 UTC
    Hi, thank you so much for the prompt response :) I cut and pasted your code, and it does not work for me. It reports every line as bad and prints an empty $VAR1 = {}. Any idea why? Thanks once more.
      To make my code easier to read, I indent it by four spaces when I quote it. (That way, it doesn't get lost in the surrounding flow of text.) As a result, you'll need to unindent it (or at least the __DATA__ portion) before running it.

      Try processing the script through this one-liner to remove the leading four spaces:

          perl -i.bak -pe's/^    //' the-script.pl # unindent the-script.pl
      That ought to do it.

      Cheers,
      Tom

        Hi, thanks again :) What would I need to change if I had spaces before these strings? The number of spaces is not constant. Thanks, Aida
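
          One possible adjustment, offered as a sketch rather than as Tom's own answer: add \s* right after the ^ anchor so the pattern skips any amount of leading whitespace. Because the match uses the /x modifier, literal spaces inside the pattern are ignored, which is why the original pattern rejects indented lines. The parse_line helper and the sample lines below are hypothetical and exist only to illustrate the idea.

```perl
#!/usr/bin/perl
use warnings;
use strict;

# Hypothetical helper (not in the original post): parse one line and
# return (element, content), or an empty list if the line is malformed.
sub parse_line {
    my ($line) = @_;
    # Same pattern as in the script above, plus \s* after ^ so that any
    # number of leading spaces or tabs is skipped before the opening tag.
    if ($line =~ m|^ \s* <([^>]+)> (.*) </\1> |x) {
        return ($1, $2);
    }
    return;
}

# Made-up sample lines with varying indentation.
my @samples = (
    "<UserID>46786</UserID>\n",
    "    <start>2004-10-21TO09:57:25Z</start>\n",
    "\t  <dev>Some Text</dev>\n",
    "  <var1>some string</var2>\n",    # mismatched tags, should be rejected
);

for my $line (@samples) {
    if (my ($elem, $content) = parse_line($line)) {
        print "$elem => $content\n";
    }
    else {
        print "Bad line: $line";
    }
}
```

          In the original script, the same change would go into the m|...|x match inside the while loop; the rest of the code should not need to change.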