cibien has asked for the wisdom of the Perl Monks concerning the following question:

Hello everybody, I am a new member and I need your help to modify a Perl script named 'material mapping generator.pl'. The script converts an Excel table to a material mapping .xml file. In the field "pr_family" the user can put several family values in one cell, like '1,2'; with the 'foreach' and 'split' commands, every value generates a line in the XML file with the right family. My question: is possible to do the same for propriety? I think the part of the code to modify is near this comment: (#for every family print a line), and I have to do the same for (#set properties). I am a beginner with Perl, so thanks to anyone who can help. Here is the complete code:

#!/usr/bin/perl -w
#########################
#
# Materialmapping generator
my $version="0.0.2";
# 2012.04.12 v0.0.1 DM: Creation
# 2012.04.23 v0.0.2 DM: Added support for active column and comma separeted families
#Excel format rules:
#-a merged cell named "Configurator mapping" on the first line identifies the mapping data (internal+properties)
#-a merged cell named "Brand" on the first line identifies the Brand data (optional)
#-a merged cell named "Market" on the first line identifies the Market data (optional)
#-a merged cell named "Customer" on the first line identifies the Customer data (optional)
#-the first column of "Configurator mapping" should have an "internal" as cell(line number is not important), the "internal" define the title bar line; the cell must be on the same line where all the properties/markets/brand/customer names are
#-the "active" column must be exactly left of the "internal" column
#-the commercial code column must be exacly left of the "active" column
#-the "pr_family" column should be exactly right of the "internal" column
#-the tecnical description must be exactly right of the last "Configurator mapping" column
#-the colorzones must be exactly right of the last "tecnical description" column
#Notes:
#-in the field "pr_family" the user can concatenate the family value like 'TL1_1,TL1_4' , every value generate a line in the xml file with the right family
#-if the "active" column is not empty, the line is not evaluated
#-if "internal" or "pr_family" field is empty, the line is not evaluated
#
#########################
use Encode;
use utf8;
use XML::LibXML;
use File::Basename;
use File::Find;
use Spreadsheet::ParseExcel;
use Switch;
system("cls");
# Read command line arguments
#----------------------------------------------------------------------------------------
my $materialmapping_file = shift;
my $data_folder = shift;
print "OPTIONS\nmaterialmapping_file:$materialmapping_file\ndata_folder:$data_folder\n";
# subroutine
sub getSQLTimeStamp {
    my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst)=localtime(time);
    return sprintf "%4d-%02d-%02d %02d:%02d:%02d",$year+1900,$mon+1,$mday,$hour,$min,$sec;
}
# Main
#----------------------------------------------------------------------------------------
if(length($materialmapping_file) gt 0 and length($data_folder) gt 0) {
    opendir ( DIR, $data_folder ) || die "Error in opening dir $data_folder\n";
    my $materialmapping_table_xml = XML::LibXML->createDocument( "1.0", "UTF-8");
    my $materialmapping_table_xml_root = $materialmapping_table_xml->createElement("masterdata");
    $materialmapping_table_xml_root->setAttribute('version',getSQLTimeStamp());
    $materialmapping_table_xml->setDocumentElement($materialmapping_table_xml_root);
    print "Parsing files...\n";
    while( ($filename = readdir(DIR))){
        if($filename =~ /\.xls$/i) {
            my $parser = Spreadsheet::ParseExcel->new();
            my $oBook = $parser->parse($data_folder.$filename);
            if (defined $oBook ) {
                for(my $iSheet=0; $iSheet < $oBook->{SheetCount} ; $iSheet++) {
                    $oWkS = $oBook->{Worksheet}[$iSheet];
                    # find the needed columns
                    my $map_Cmin = -1; # the first is internal
                    my $map_Cmax = -1;
                    my $artno_col = -1;
                    my $active_col = -1;
                    my $colorzones_col = -1;
                    my $family_col = -1;
                    my $tec_desc_col = -1;
                    my $brand_Cmin = -1;
                    my $brand_Cmax = -1;
                    my $customer_Cmin = -1;
                    my $customer_Cmax = -1;
                    my $market_Cmin = -1;
                    my $market_Cmax = -1;
                    my $title_row = -1;
                    for (my $iC = $oWkS->{MinCol}; defined $oWkS->{MaxCol} && $iC <= $oWkS->{MaxCol} ; $iC++) {
                        $oWkC = $oWkS->{Cells}[$oWkS->{MinRow}][$iC];
                        if(defined $oWkC) {
                            if(decode('cp1252',$oWkC->{Val}) eq "Configurator mapping") {
                                for (my $iR = $oWkS->{MinRow} +1; defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++) {
                                    $oWkC_int = $oWkS->{Cells}[$iR][$iC];
                                    if(defined $oWkC_int and decode('cp1252',$oWkC_int->{Val}) eq "internal") {
                                        $title_row = $iR;
                                    }
                                }
                                if($title_row > 0) {
                                    foreach my $area ( @{ $oWkS->{MergedArea} } ) {
                                        if($area->[1] eq $iC and $area->[0] eq $oWkS->{MinRow}){
                                            $map_Cmax = $area->[3];
                                        }
                                    }
                                    $map_Cmin = $iC;
                                    $artno_col = $map_Cmin -2;
                                    $active_col = $map_Cmin -1;
                                    $family_col = $map_Cmin +1;
                                    $colorzones_col = $map_Cmax +2;
                                    $tec_desc_col = $map_Cmax +1;
                                }
                            }
                            elsif(decode('cp1252',$oWkC->{Val}) eq "Market"){
                                foreach my $area ( @{ $oWkS->{MergedArea} } ) {
                                    if($area->[1] eq $iC and $area->[0] eq $oWkS->{MinRow}){
                                        $market_Cmax = $area->[3];
                                    }
                                }
                                $market_Cmin = $iC;
                            }
                            elsif(decode('cp1252',$oWkC->{Val}) eq "Customer"){
                                foreach my $area ( @{ $oWkS->{MergedArea} } ) {
                                    if($area->[1] eq $iC and $area->[0] eq $oWkS->{MinRow}){
                                        $customer_Cmax = $area->[3];
                                    }
                                }
                                $customer_Cmin = $iC;
                            }
                            elsif(decode('cp1252',$oWkC->{Val}) eq "Brand"){
                                foreach my $area ( @{ $oWkS->{MergedArea} } ) {
                                    if($area->[1] eq $iC and $area->[0] eq $oWkS->{MinRow}){
                                        $brand_Cmax = $area->[3];
                                    }
                                }
                                $brand_Cmin = $iC;
                            }
                        }
                    }
                    if($map_Cmin >= 0 and $map_Cmin >= 0 and $title_row >= 0){
                        for(my $iR = $title_row +1; defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++){
                            $internal_cell = $oWkS->{Cells}[$iR][$map_Cmin];
                            $active_cell = $oWkS->{Cells}[$iR][$active_col];
                            $family_cell = $oWkS->{Cells}[$iR][$family_col];
                            if(defined $internal_cell and defined $active_cell and defined $family_cell and length(decode('cp1252',$internal_cell->{Val})) gt 0 and !(length(decode('cp1252',$active_cell->{Val})) gt 0) and length(decode('cp1252',$family_cell->{Val})) gt 0) {
                                #for every family print a line
                                my @families = split(/\,/, decode('cp1252',$family_cell->{Val}));
                                foreach $family(@families){
                                    my $materialmapping_item = $materialmapping_table_xml->createElement("item");
                                    #set internal
                                    $materialmapping_item->setAttribute("internal",decode('cp1252',$internal_cell->{Val}));
                                    #set family
                                    $materialmapping_item->setAttribute("pr_family",$family);
                                    #set item_number
                                    $item_number_cell = $oWkS->{Cells}[$iR][$artno_col];
                                    if(defined $item_number_cell and length(decode('cp1252',$item_number_cell->{Val})) gt 0) {
                                        $materialmapping_item->setAttribute("item_number",decode('cp1252',$item_number_cell->{Val}));
                                    }
                                    #set colorzones
                                    $colorzones_cell = $oWkS->{Cells}[$iR][$colorzones_col];
                                    if(defined $colorzones_cell and length(decode('cp1252',$colorzones_cell->{Val})) gt 0) {
                                        $materialmapping_item->setAttribute("colorzone",decode('cp1252',$colorzones_cell->{Val}));
                                    }
                                    #set properties
                                    for(my $prC = $map_Cmin+2; $prC <= $map_Cmax ; $prC++) {
                                        $pr_cell_name = $oWkS->{Cells}[$title_row][$prC];
                                        $pr_cell = $oWkS->{Cells}[$iR][$prC];
                                        if(defined $pr_cell and defined $pr_cell_name and length(decode('cp1252',$pr_cell->{Val})) gt 0 and length(decode('cp1252',$pr_cell_name->{Val})) gt 0) {
                                            $materialmapping_item->setAttribute(decode('cp1252',$pr_cell_name->{Val}),decode('cp1252',$pr_cell->{Val}));
                                        }
                                    }
                                    #set description
                                    $description_cell = $oWkS->{Cells}[$iR][$tec_desc_col];
                                    if(defined $description_cell and length(decode('cp1252',$description_cell->{Val})) gt 0) {
                                        $materialmapping_item->setAttribute("description",decode('cp1252',$description_cell->{Val}));
                                    }
                                    #set Brands
                                    if($brand_Cmin gt 0 and $brand_Cmax gt 0) {
                                        my $brand_string = "";
                                        my $ignored_brand_string = "";
                                        for(my $brandC = $brand_Cmin; $brandC <= $brand_Cmax ; $brandC++) {
                                            $brand_cell_name = $oWkS->{Cells}[$title_row][$brandC];
                                            if(defined $brand_cell_name and decode('cp1252',$brand_cell_name->{Val}) ne "Epta std") {
                                                $brand_cell = $oWkS->{Cells}[$iR][$brandC];
                                                if(defined $brand_cell and length(decode('cp1252',$brand_cell->{Val})) gt 0) {
                                                    if(decode('cp1252',$brand_cell->{Val}) eq '0') {
                                                        $ignored_brand_string = $ignored_brand_string . decode('cp1252',$brand_cell_name->{Val}) . ",";
                                                    }
                                                    else {
                                                        $brand_string = $brand_string . decode('cp1252',$brand_cell_name->{Val}) . ",";
                                                    }
                                                }
                                            }
                                        }
                                        if(length($brand_string) gt 0){
                                            $materialmapping_item->setAttribute('brand',$brand_string);
                                        }
                                        if(length($ignored_brand_string) gt 0){
                                            $materialmapping_item->setAttribute('ignore_brand',$ignored_brand_string);
                                        }
                                    }
                                    #set Customer
                                    if($customer_Cmin gt 0 and $customer_Cmax gt 0) {
                                        my $customer_string = "";
                                        my $ignored_customer_string = "";
                                        for(my $customerC = $customer_Cmin; $customerC <= $customer_Cmax ; $customerC++) {
                                            $customer_cell_name = $oWkS->{Cells}[$title_row][$customerC];
                                            $customer_cell = $oWkS->{Cells}[$iR][$customerC];
                                            if(defined $customer_cell_name and defined $customer_cell and length(decode('cp1252',$customer_cell->{Val})) gt 0 and length(decode('cp1252',$customer_cell_name->{Val})) gt 0) {
                                                if(decode('cp1252',$customer_cell->{Val}) eq '0') {
                                                    $ignored_customer_string = $ignored_customer_string . decode('cp1252',$customer_cell_name->{Val}) . ",";
                                                }
                                                else {
                                                    $customer_string = $customer_string . decode('cp1252',$customer_cell_name->{Val}) . ",";
                                                }
                                            }
                                        }
                                        if(length($customer_string) gt 0){
                                            $materialmapping_item->setAttribute('customer',$customer_string);
                                        }
                                        if(length($ignored_customer_string) gt 0){
                                            $materialmapping_item->setAttribute('ignore_customer',$ignored_customer_string);
                                        }
                                    }
                                    #set Market
                                    if($market_Cmin gt 0 and $market_Cmax gt 0) {
                                        my $market_string = "";
                                        my $ignored_market_string = "";
                                        for(my $marketC = $market_Cmin; $marketC <= $market_Cmax ; $marketC++) {
                                            $market_cell_name = $oWkS->{Cells}[$title_row][$marketC];
                                            $market_cell = $oWkS->{Cells}[$iR][$marketC];
                                            if(defined $market_cell_name and defined $market_cell and length(decode('cp1252',$market_cell->{Val})) gt 0 and length(decode('cp1252',$market_cell_name->{Val})) gt 0) {
                                                if(decode('cp1252',$market_cell->{Val}) eq '0') {
                                                    $ignored_market_string = $ignored_market_string . decode('cp1252',$market_cell_name->{Val}) . ",";
                                                }
                                                else {
                                                    $market_string = $market_string . decode('cp1252',$market_cell_name->{Val}) . ",";
                                                }
                                            }
                                        }
                                        if(length($market_string) gt 0){
                                            $materialmapping_item->setAttribute('market',$market_string);
                                        }
                                        if(length($ignored_market_string) gt 0){
                                            $materialmapping_item->setAttribute('ignore_market',$ignored_market_string);
                                        }
                                    }
                                    $materialmapping_table_xml_root->addChild($materialmapping_item);
                                };
                            }
                        }
                    }
                }
            }
            print "Parsed $filename \n";
        }
    }
    $materialmapping_table_xml->toFile($materialmapping_file,2);
}

Replies are listed 'Best First'.
Re: modify perl 'excel to xml'
by Athanasius (Archbishop) on Jul 26, 2012 at 04:39 UTC

    Hello cibien, and welcome to the Monastery!

    I can’t really help you with your problem, since I don’t understand what you are trying to achieve. You ask:

    is possible to do the same for propriety?

    but I have no idea what this means. Some examples — input and desired output — would be useful for explaining what you want to do.

    Also, you should supply some sample input for the existing code, together with the output it produces, so the monks can experiment on the code and verify that it still does what it should. See How do I post a question effectively?. Also specify the version of Perl you are using. (From the line system("cls"); it appears the script is running on Windows.)

    I gather you inherited the code you posted from someone else. You should be aware there are a few issues with this code as it stands:

    • Add use strict; to the start of the script. (This won’t make any difference in this particular case, but it’s always good practice, and is guaranteed to help you down the track.)
    • Numeric comparisons should be made using < <= == >= > != <=>. Their counterparts lt le eq ge gt ne cmp are for string comparisons only.
    • The modules File::Basename, File::Find, and Switch don’t appear to be used. The corresponding use directives at the head of the script should therefore be removed.
    • In any case, the Switch module dates from a time before Perl introduced the given/when construct. If possible, always prefer the latter.
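
    To illustrate the point about comparison operators, here is a minimal sketch (the variable names are only for illustration). String operators compare character by character, so they give a different answer from the numeric ones as soon as the values have different numbers of digits:

```perl
use strict;
use warnings;

# String comparison: "1" sorts before "9", so "10" is NOT greater than "9".
my $as_strings = (10 gt 9) ? 'true' : 'false';

# Numeric comparison: compares the values themselves.
my $as_numbers = (10 > 9) ? 'true' : 'false';

print "10 gt 9 : $as_strings\n";   # false
print "10 >  9 : $as_numbers\n";   # true

# The same difference shows up when sorting column numbers.
my @by_string = sort { $a cmp $b } (9, 10, 100);   # 10, 100, 9
my @by_number = sort { $a <=> $b } (9, 10, 100);   # 9, 10, 100
print "cmp: @by_string\n<=>: @by_number\n";
```

    Tests such as length(...) gt 0 in the posted script happen to give the expected answer, but only by accident; length(...) > 0 says what is meant.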

    But the real problem with the code posted is that it’s just — well — too long and complicated to wade through. In other words, it’s in desperate need of refactoring.
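
    As one small illustration of what such refactoring could look like (a sketch only, not tested against real spreadsheets): the script repeats the same merged-area lookup four times, for "Configurator mapping", "Market", "Customer" and "Brand". The helper name merged_area_last_col below is hypothetical, and the mock array stands in for $oWkS->{MergedArea}; the layout [start_row, start_col, end_row, end_col] is inferred from how the posted code indexes each area.

```perl
use strict;
use warnings;

# Hypothetical helper: return the last column of the merged area whose
# top-left corner is ($row, $col), or -1 when no such area exists.
# Each area is assumed to be [start_row, start_col, end_row, end_col].
sub merged_area_last_col {
    my ($merged_areas, $row, $col) = @_;
    for my $area (@{ $merged_areas || [] }) {
        return $area->[3] if $area->[0] == $row && $area->[1] == $col;
    }
    return -1;
}

# Mock data standing in for $oWkS->{MergedArea}.
my @merged = ([0, 1, 0, 4], [0, 5, 0, 7]);

my $map_Cmax   = merged_area_last_col(\@merged, 0, 1);
my $brand_Cmax = merged_area_last_col(\@merged, 0, 5);
my $none       = merged_area_last_col(\@merged, 0, 9);
print "map: $map_Cmax, brand: $brand_Cmax, none: $none\n";
```

    With a helper like this, each of the four branches in the column-finding loop would shrink to a single call.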

    Sorry I can’t provide the solution you wanted; but I hope the above at least gives you some useful pointers.

    Update: There are a couple of other issues with the original code:

    • The line:
      my $oBook = $parser->parse($data_folder.$filename);
      does not insert a path separator between the folder and the filename. So, unless the script is called like this:
      >perl script.pl output.xml Excel\
      (note the trailing backslash), the file will not be found and the script will fail silently.
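
      A minimal sketch of one way to avoid depending on the trailing separator, using the core module File::Spec. The helper name workbook_path and the file name mapping.xls are made up for the example:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not in the original script): join folder and file name
# with File::Spec so a missing trailing separator no longer matters.
sub workbook_path {
    my ($data_folder, $filename) = @_;
    return File::Spec->catfile($data_folder, $filename);
}

my $with_separator    = workbook_path('Excel/', 'mapping.xls');
my $without_separator = workbook_path('Excel',  'mapping.xls');

# Both calls produce the same path.
print "$with_separator\n$without_separator\n";
```

      In the script this would replace $data_folder.$filename in the call to parse. Adding an else branch with a warning to the if (defined $oBook) test would also make a failed parse visible instead of silent.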

    • The following code:
      if($map_Cmin >= 0 and $map_Cmin >= 0 and $title_row >= 0){
          for(my $iR = $title_row +1; defined $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++){
              $internal_cell = $oWkS->{Cells}[$iR][$map_Cmin];
              $active_cell = $oWkS->{Cells}[$iR][$active_col];
              $family_cell = $oWkS->{Cells}[$iR][$family_col];
              if(defined $internal_cell and defined $active_cell and defined $family_cell and length(decode('cp1252',$internal_cell->{Val})) gt 0 and !(length(decode('cp1252',$active_cell->{Val})) gt 0) and length(decode('cp1252',$family_cell->{Val})) gt 0)
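      contains a condition which is very unlikely to be fulfilled, as it requires $active_cell to be both defined and decodable to zero length (a cell that was never filled in is normally undefined rather than empty). Fix: remove the ! (negation) operator. Note that this reverses the rule in the script's header comments, which say that a line is not evaluated when the “active” column is not empty.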
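
      An alternative to removing the negation, assuming the rule from the script's header comments is the intended one (a non-empty "active" cell means the line is skipped), is to treat an undefined cell as empty. The sketch below uses plain hashes with a Val key in place of real Spreadsheet::ParseExcel cells, leaves out the decode('cp1252', ...) calls for brevity, and the helper names cell_text and row_is_evaluated are hypothetical:

```perl
use strict;
use warnings;

# Hypothetical helper: return a cell's text, or '' when the cell is undefined.
sub cell_text {
    my ($cell) = @_;
    return '' unless defined $cell && defined $cell->{Val};
    return $cell->{Val};
}

# A row is evaluated when "internal" and "pr_family" are filled in and
# "active" is empty, whether the active cell is defined or not.
sub row_is_evaluated {
    my ($internal_cell, $active_cell, $family_cell) = @_;
    return (   length(cell_text($internal_cell)) > 0
            && length(cell_text($active_cell)) == 0
            && length(cell_text($family_cell)) > 0 ) ? 1 : 0;
}

my $internal = { Val => 'item 1' };
my $family   = { Val => '1,2' };

print row_is_evaluated($internal, undef,            $family), "\n";  # 1
print row_is_evaluated($internal, { Val => '' },    $family), "\n";  # 1
print row_is_evaluated($internal, { Val => 'yes' }, $family), "\n";  # 0
```

      Which of the two behaviours is right depends on what the "active" column is supposed to mean, which only the OP can say.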

    The input also requires a special format: see not only the post by Anonymous Monk below, but also the comments at the head of the code in the original post by cibien. The following format works when saved as an “Excel 97-2003 Workbook (*.xls)”:

       |   A    |    B     |     C     |    D    |    E    |
    ===+========+==========+===========+=========+=========+
     1 |        |           Configurator mapping           |
    ---+--------+----------+-----------+---------+---------+
     2 | active | internal | pr_family |   pr1   |   pr2   |
    ---+--------+----------+-----------+---------+---------+
     3 | yes    | item 1   | 1,2       | value 1 | value 2 |
    ---+--------+----------+-----------+---------+---------+
     4 | yes    | item 2   | 1,2       | value 3 | value 3 |
    ---+--------+----------+-----------+---------+---------+
     5 | yes    | item 3   | 1,2       | value 5 | value 4 |
    ---+--------+----------+-----------+---------+---------+

    When this spreadsheet is processed by the script with the fixes applied, it produces the following output:

    <?xml version="1.0" encoding="UTF-8"?>
    <masterdata version="2012-07-27 01:51:57">
      <item internal="item 1 " pr_family="1" pr1="value 1" pr2="value 2"/>
      <item internal="item 1 " pr_family="2" pr1="value 1" pr2="value 2"/>
      <item internal="item 2 " pr_family="1" pr1="value 3" pr2="value 3"/>
      <item internal="item 2 " pr_family="2" pr1="value 3" pr2="value 3"/>
      <item internal="item 3 " pr_family="1" pr1="value 5" pr2="value 4"/>
      <item internal="item 3 " pr_family="2" pr1="value 5" pr2="value 4"/>
    </masterdata>

    Of course, none of this answers the OP’s question; but it may provide a base upon which an answer can be constructed.

    Athanasius <°(((><contra mundum

      Thank you very much for your help. I am a novice in Perl and I am trying to modify an existing script. The script is for Windows. In simple words:

      How the script works now. It reads from this example Excel table:

               | pr_family |   pr1   |   pr2   |
        -------+-----------+---------+---------+
        item 1 | 1,2       | value 1 | value 2 |
        item 2 | 1,2       | value 3 | value 3 |
        item 3 | 1,2       | value 5 | value 4 |

      and converts it to XML (output):

        item 1 - pr_family: 1 - pr_1: value 1 - pr_2: value 2
        item 1 - pr_family: 2 - pr_1: value 1 - pr_2: value 2
        item 2 - pr_family: 1 - pr_1: value 3 - pr_2: value 3
        item 2 - pr_family: 2 - pr_1: value 3 - pr_2: value 3
        item 3 - pr_family: 1 - pr_1: value 5 - pr_2: value 4
        item 3 - pr_family: 2 - pr_1: value 5 - pr_2: value 4

      That is OK, but I need to modify this script because I must be able to put more than one value in a pr column, as is already possible for the family, using split(',') on pr1 and pr2. Example of what the script has to do after the modification. Excel:

               | pr_family |       pr1       |       pr2       |
        -------+-----------+-----------------+-----------------+
        item 1 | 1,2       | value 1,value 5 | value 2         |
        item 2 | 1,2       | value 3         | value 3,value 4 |
        item 3 | 1,2       | value 5         | value 4         |

      converted to XML (output):

        item 1 - pr_family: 1 - pr_1: value 1 - pr_2: value 2
        item 1 - pr_family: 2 - pr_1: value 1 - pr_2: value 2
        item 1 - pr_family: 1 - pr_1: value 5 - pr_2: value 2
        item 1 - pr_family: 2 - pr_1: value 5 - pr_2: value 2
        item 2 - pr_family: 1 - pr_1: value 3 - pr_2: value 3
        item 2 - pr_family: 2 - pr_1: value 3 - pr_2: value 3
        item 2 - pr_family: 1 - pr_1: value 3 - pr_2: value 4
        item 2 - pr_family: 2 - pr_1: value 3 - pr_2: value 4
        item 3 - pr_family: 1 - pr_1: value 5 - pr_2: value 4
        item 3 - pr_family: 2 - pr_1: value 5 - pr_2: value 4
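
      One way the requested expansion could be sketched: treat every cell as a comma-separated list and build one output row per combination of values. The helper name expand_rows is hypothetical, the input is a plain list of [attribute name, cell text] pairs rather than real spreadsheet cells, and trimming spaces around the commas is an assumption (the posted script splits on the bare comma).

```perl
use strict;
use warnings;

# Hypothetical helper: expand comma-separated cells into one row per combination.
# $columns is an ordered list of [attribute_name, cell_text] pairs.
# Each returned row is a list of [attribute_name, single_value] pairs.
sub expand_rows {
    my ($columns) = @_;
    my @rows = ([]);                          # start with one empty row
    for my $column (@$columns) {
        my ($name, $text) = @$column;
        next unless defined $text && length $text;   # skip empty cells
        my @values = split /\s*,\s*/, $text;
        my @expanded;
        # Values in the outer loop, so earlier columns vary fastest,
        # which matches the order of the desired output above.
        for my $value (@values) {
            push @expanded, [ @$_, [ $name, $value ] ] for @rows;
        }
        @rows = @expanded;
    }
    return @rows;
}

my @rows = expand_rows([
    [ pr_family => '1,2' ],
    [ pr1       => 'value 1,value 5' ],
    [ pr2       => 'value 2' ],
]);

for my $row (@rows) {
    print join(' - ', 'item 1', map { "$_->[0]: $_->[1]" } @$row), "\n";
}
```

      For the "item 1" line of the second table this prints four rows, in the same order as the desired output. In the real script, each returned row would presumably become one createElement("item") call followed by a setAttribute call per pair, replacing the current foreach over @families; that integration is not shown or tested here.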