Remove line and modify another

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Remove line and modify another by hippo (Archbishop) on Sep 05, 2018 at 22:57 UTC
Here's an SSCCE for your point 1:

    use strict;
    use warnings;
    use Test::More tests => 1;

    my $in   = "foo\n.024. 3# \nbar";
    my $want = "foo\nbar";
    my $try  = $in;
    $try =~ s/\.024\. 3# \n//;
    is ($try, $want);

Now it's up to you to do the same for point 2. Can you?
Re: Remove line and modify another by Marshall (Canon) on Sep 05, 2018 at 20:47 UTC
Based upon your question, I don't know how to help you. Please do the work to post a simple code example that demonstrates your problem. I don't need your whole program, just the problematic parts.

Update: you say "My current script is very long and consists of general substitutes", but you apparently cannot do some relatively simple stuff. Show some code.
Re: Remove line and modify another by AnomalousMonk (Archbishop) on Sep 05, 2018 at 23:30 UTC
Based on several guesses, here's another approach to an SSCCE:

    c:\@Work\Perl\monks>perl -wMstrict -e "my @lines = (
       qq{.024. 3# keep this one \n},
       qq{.024. 3#  \n},
       qq{.024. 3# keep this two \n},
       qq{.024. 3#|a9780750247092|xisbn13\n},
       qq{.024. 3# keep this three \n},
       qq{.024. 3#|a0750247096|xisbn\n},
       qq{.024. 3# keep this four \n},
       );
     print for @lines;
     print qq{\n};
     ;;
     my $rx_first = qr{ [.] 024 [.] }xms;
     my $replace = '.020.';
     ;;
     LINE:
     for my $line (@lines) {
       next LINE if $line =~ m{ \A $rx_first \s+ 3[#] [ ]{2} \Z }xms;
       $line =~ s{ \A $rx_first (?= .* [|]xisbn (?: 13)? \Z) } {$replace}xms;
       print $line;
       }
     "
    .024. 3# keep this one
    .024. 3#
    .024. 3# keep this two
    .024. 3#|a9780750247092|xisbn13
    .024. 3# keep this three
    .024. 3#|a0750247096|xisbn
    .024. 3# keep this four

    .024. 3# keep this one
    .024. 3# keep this two
    .020. 3#|a9780750247092|xisbn13
    .024. 3# keep this three
    .020. 3#|a0750247096|xisbn
    .024. 3# keep this four

Update: But pay attention to the Test::More approach hippo has used here: it's something you should be using generally in your development.

Give a man a fish: `<%-{-{-{-<`
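To make the Test::More suggestion concrete, here is a minimal sketch of the same two steps wrapped in a sub and checked with `is_deeply`. The sub name `fix_lines` and the sample `.245.` line are assumptions made for illustration. The trailing-whitespace handling is also slightly looser than in the one-liner above, so treat it as a starting point rather than a drop-in replacement.

```perl
use strict;
use warnings;
use Test::More;

# Hypothetical helper (name is an assumption): drop empty 024 lines and
# retag ISBN-bearing 024 lines as 020. Other lines pass through unchanged.
sub fix_lines {
    my @out;
    for my $line (@_) {
        # Point 1: skip a 024 field that has indicators but no subfields.
        next if $line =~ m{ \A [.]024[.] \s+ 3[#] \s* \z }xms;

        # Point 2: retag when the line ends in |xisbn or |xisbn13.
        (my $copy = $line) =~
            s{ \A [.]024[.] (?= .* [|]xisbn (?:13)? \s* \z ) }{.020.}xms;

        push @out, $copy;
    }
    return @out;
}

my @in = (
    ".024. 3#  \n",
    ".024. 3#|a9780750247092|xisbn13\n",
    ".024. 3#|a0750247096|xisbn\n",
    ".245. 14|aThe solar system\n",
);
my @want = (
    ".020. 3#|a9780750247092|xisbn13\n",
    ".020. 3#|a0750247096|xisbn\n",
    ".245. 14|aThe solar system\n",
);

is_deeply [ fix_lines(@in) ], \@want, 'empty 024 dropped, ISBN lines retagged';
done_testing;
```

Keeping the logic in a sub means each new edge case from the real data can be added as one more input/expected pair.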
Re: Remove line and modify another by IceJ (Initiate) on Sep 06, 2018 at 05:48 UTC
Hi all. Thank you for your assistance. As requested by one of the replies, here is some of the code:

    s/ <datafield tag="/./i;                 #begin tag num with "."
    s/" ind1="/. /i;                         #end tag num with ". "
    s/" ind2="//i;                           #remove text between indic
    s/">\n//i;                               #too imprecise? No, it's OK.
    s/ <subfield code="/\|/i;                #remove text between fields
    s/">//i;                                 #too imprecise? No, it's OK.
    s/<\/subfield>\n//i;                     #remove text between fields
    s/<\/record>\n//i;                       #remove record end marker
    s/<\/datafield>//i;                      #remove end of field marker
    s/http:\/\/wc.slims.gov.za/WCPLIS/i;     #change SITA url to org
    s/<\/collection>\n//i;                   #remove end of document
    if (/\d-\d/) {s/-//g};                   #Remove hyphens in ISBN's

This part takes the XML and puts it into the flat format as required. Below is a sample of the XML:

    <datafield ind1="1" ind2="4" tag="245">
      <subfield code="a">The solar system</subfield>
      <subfield code="c">Chris Oxlade [Author]</subfield>
    </datafield>

Below is the output of one line in the flat file:

    .245. 14|aThe solar system|cChris Oxlade [Author]

This Perl code is used for all lines with various MARC tags (such as .024.). It might be possible that I am putting the part to focus on the ISBN correction in the wrong place in the code. I used the line below:

    if (/^.024. 3#\|a/) {s/.{10}(.............)........../.020. 3#\|a$1/};

I will try some of the suggestions and see if I can get it sorted. Thank you once again for assisting.
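The fixed-width pattern in that last line depends on exact character counts, so it silently fails when a line is one character longer or shorter than expected. A minimal sketch of a capture-based alternative follows. It assumes the flat line has the shape `.024. 3#|a<ISBN>|xisbn` or `...|xisbn13`, and that the qualifier subfield should be dropped, which is what the substitution above appears to intend. The sub name `retag_isbn` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): turn a flat 024 ISBN line
# into a 020 line. Assumes the ISBN sits in the 'a' subfield and the
# qualifier is '|xisbn' or '|xisbn13'; any other line is returned as-is.
sub retag_isbn {
    my ($line) = @_;
    if ($line =~ m{ \A [.]024[.] [ ] 3[#] [|]a ([\dXx-]+) [|]xisbn(?:13)? \s* \z }xms) {
        (my $isbn = $1) =~ tr/-//d;      # drop hyphens inside the ISBN
        return ".020. 3#|a$isbn\n";      # assumption: qualifier is not kept
    }
    return $line;
}

print retag_isbn(".024. 3#|a978-0-7502-4709-2|xisbn13\n");
print retag_isbn(".245. 14|aThe solar system\n");
```

Because the pattern is anchored with `\A` and `\z`, it can run at any point after the general substitutions without touching other tags.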
Re^2: Remove line and modify another by poj (Abbot) on Sep 06, 2018 at 07:17 UTC
"It might be possible that I am putting the part to focus on the ISBN correction in the wrong place in the code"

Consider using XML::Twig instead of regexes to process the file. It is easier to apply changes to a data element before creating the flat file than to a complete line afterwards.

    #!/usr/bin/perl
    use strict;
    use XML::Twig;

    my $xml = join '',<DATA>;
    my $twig = XML::Twig->new(
      twig_handlers => {'datafield' => \&datafield}
    );
    $twig->parse( $xml );

    # called once for each <datafield> element
    sub datafield {
      my( $t, $e ) = @_;

      my %subfield = ();
      for my $elem ($e->children('subfield')){
        $subfield{$elem->att('code')} = $elem->text;
      }

      my @f = ();
      $f[0] = $e->att('tag');
      $f[1] = $e->att('ind1').$e->att('ind2');

      my @tmp;
      for (sort keys %subfield){
        push @tmp,$_.$subfield{$_};
      }
      $f[2] = join '|',@tmp;

      # change
      if ($subfield{'x'} =~ /^(isbn13|isbn)$/){
        $f[0] =~ s/024/020/;
      }

      # flat format for output
      printf ".%s. %s|%s\n",@f if ($f[2]); # skip blanks
    }

    #.245. 14|aThe solar system|cChris Oxlade [Author]
    #.024. 3#|a9780750247092|xisbn13
    #.024. 3#|a0750247096|xisbn

    __DATA__
    <collection>
    <record>
    <datafield ind1="1" ind2="4" tag="245">
    <subfield code="a">The solar system</subfield>
    <subfield code="c">Chris Oxlade [Author]</subfield>
    </datafield>
    </record>
    <record>
    <datafield ind1="3" ind2="#" tag="024">
    <subfield code="a">a9780750247092</subfield>
    <subfield code="x">isbn13</subfield>
    </datafield>
    </record>
    <record>
    <datafield ind1="3" ind2="#" tag="024">
    <subfield code="a">a0750247096</subfield>
    <subfield code="x">isbn</subfield>
    </datafield>
    </record>
    <record>
    <datafield ind1="3" ind2="#" tag="024">
    </datafield>
    </record>
    </collection>

poj
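The idea of changing the data element before building the flat line can also be tried without any XML at all. Below is a minimal sketch of a pure formatting function that takes the values a handler has already extracted. The sub name `flat_field` and its argument layout are assumptions for illustration, not part of XML::Twig. The hyphen stripping mirrors the `s/-//g` step from the original script.

```perl
use strict;
use warnings;

# Hypothetical helper (not an XML::Twig API): build one flat line from
# a tag, an indicator string, and a hashref of subfield code => text.
sub flat_field {
    my ($tag, $ind, $sub) = @_;
    return '' unless %$sub;                    # skip datafields with no subfields

    my %s = %$sub;                             # copy so the caller's data is untouched
    if ($tag eq '024' && defined $s{x} && $s{x} =~ /\A(?:isbn13|isbn)\z/) {
        $tag = '020';                          # retag ISBN-bearing 024 fields
        $s{a} =~ tr/-//d if defined $s{a};     # strip hyphens from the ISBN
    }

    my $body = join '|', map { $_ . $s{$_} } sort keys %s;
    return sprintf ".%s. %s|%s\n", $tag, $ind, $body;
}

print flat_field('245', '14', { a => 'The solar system', c => 'Chris Oxlade [Author]' });
print flat_field('024', '3#', { a => '978-0-7502-4709-2', x => 'isbn13' });
print flat_field('024', '3#', {});             # prints nothing
```

A function like this could be called from the `datafield` handler with `$e->att('tag')`, the two indicators, and `\%subfield`, which keeps the formatting rules testable on their own.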