Remove line above matching criteria

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Re: Remove line above matching criteria by idsfa (Vicar) on Nov 20, 2006 at 17:28 UTC
You haven't shown us what you have tried so far, nor what your criteria for "best" are (fastest, least memory, simplest, etc.). We cannot easily help you if you don't do your homework first. From what you've given us, I'd `egrep -v` the data from the command line and skip Perl altogether ...
The intelligent reader will judge for himself. Without examining the facts fully and fairly, there is no way of knowing whether vox populi is really vox dei, or merely vox asinorum. -- Cyrus H. Gordon
Re^2: Remove line above matching criteria by kwaping (Priest) on Nov 20, 2006 at 17:48 UTC
If you're going to go that route, you can also (with GNU grep, at least) `grep -B 1 '<id:>'` to get the ID line together with the previous line.
--- It's all fine and dandy until someone has to look at the code.
Re: Remove line above matching criteria by jbert (Priest) on Nov 20, 2006 at 18:05 UTC
One way to do this (discard the N lines preceding a match) is to keep a buffer that holds the last N lines seen. Lines that overflow the buffer are printed as normal, and on a match you simply discard the buffer. Lastly, remember to print anything left in the buffer at the end. In code:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# We don't really need an array for one line, but it seems
# conceptually nicer (and generalises more easily)
my $max_buffer_size = 1;
my @buffer;

while (my $line = <ARGV>) {
    push @buffer, $line;

    # Replace the pattern match with your criterion if this
    # isn't right. Clearing the buffer drops the matching line
    # as well as the buffered line(s) above it.
    @buffer = () if $line =~ /^.id/;

    if (@buffer > $max_buffer_size) {
        print shift @buffer;
    }
}

# Flush whatever is still buffered at end of input
print @buffer;
```
Re: Remove line above matching criteria by swampyankee (Parson) on Nov 20, 2006 at 19:17 UTC
I can, off the top of my head, think of at least three ways, which I view as distinct: (1) use Tie::File, which lets you treat a file (more or less) as an array; (2) use File::ReadBackwards, which (duh!) reads a file backwards; (3) read the file a record at a time, keeping track of the contents of the current and previous records, and print them as needed.

If your data are as shown (a name, followed by lines starting with labels, such as <id:>, <city:>, etc.), something like this may work:

```perl
#!perl
use strict;
use warnings;

# Assumption: the input file name is the first command-line argument
my $infile = shift @ARGV or die "Usage: $0 infile\n";

open(my $fh, "<", $infile) or die "Could not open $infile because $!\n";
while (<$fh>) {
    # Skip <id:> lines and lines that start with a letter (the names)
    next if /^<id:>|^[A-Za-z]/;
    print;
}
close($fh);
```

Now, if my regex brain is turned on, this regex should skip lines which start with <id:> or start with letters. Incidentally, how is this anonymizing data if you're leaving addresses and phone numbers?

Update: Having noticed ww's comment in a message, I may have misread or confused the title ("Remove line above matching criteria") and "extract name and id". The regex I put in the sample above (unless I screwed it up) should skip the name and id; to skip everything else, one could change the "if" to "unless".

emc
If it's not foggy out, I need new glasses.
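As a rough illustration of the Tie::File option mentioned above, here is a minimal sketch. The sample data, the temporary file, and the `<id:>` pattern are all assumptions made for the example, since the original data was not shown. Note that Tie::File edits the file in place, so try it on a copy first.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# Made-up sample data written to a temporary file (assumed layout:
# a name line, then labelled lines such as <id:> and <city:>).
my ($fh, $path) = tempfile(UNLINK => 1);
print {$fh} "John Doe\n<id:>123\n<city:>Boston\n",
            "Jane Roe\n<id:>456\n<city:>Austin\n";
close $fh;

# Treat the file as an array of lines (newlines are stripped).
tie my @lines, 'Tie::File', $path or die "Cannot tie $path: $!";

# Walk backwards so that removing lines never shifts an index
# we have not visited yet.
for (my $i = $#lines; $i >= 0; $i--) {
    next unless $lines[$i] =~ /^<id:>/;
    my $start = $i > 0 ? $i - 1 : $i;          # line above, if there is one
    splice @lines, $start, $i - $start + 1;    # remove it and the <id:> line
    $i = $start;
}
untie @lines;

# Read the edited file back to show the result.
open my $in, '<', $path or die "Cannot open $path: $!";
my @result = <$in>;
close $in;
print @result;
```

With the sample data above, only the two `<city:>` lines remain in the file.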
Re: Remove line above matching criteria by madbombX (Hermit) on Nov 20, 2006 at 17:56 UTC
To reiterate what idsfa said, we need to know what you have tried, what the record separators are, and so on. Just generally more information. That being said, you could always load each record into a variable (hash or array), then pass that variable to a function that checks the data for inconsistencies, or whatever you are looking for. The function removes whatever needs to be removed (manipulating a hash or array is simple if the structure doesn't change) and returns the new variable to the main program. Obviously this can also be done via an OO method (and that is likely preferred, since this sounds like repetitive data).
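The record-at-a-time idea above could be sketched roughly as follows. The function name `scrub_record`, the sample records, and the `<id:>` label are hypothetical; how the real file is split into records depends on separators that were never shown.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): takes one record as an
# array ref of lines and returns a new array ref with each <id:> line
# and the line directly above it removed.
sub scrub_record {
    my ($record) = @_;
    my @kept;
    for my $line (@$record) {
        if ($line =~ /^<id:>/) {
            pop @kept;    # drop the line above the match (assumed to be the name)
            next;         # drop the <id:> line itself
        }
        push @kept, $line;
    }
    return \@kept;
}

# Made-up sample records; the real field layout is an assumption.
my @records = (
    [ "John Doe\n", "<id:>123\n", "<city:>Boston\n" ],
    [ "Jane Roe\n", "<id:>456\n", "<city:>Austin\n" ],
);

print @{ scrub_record($_) } for @records;
```

Returning a new array ref rather than modifying the record in place keeps the original data available to the caller, which may help when checking what was removed.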
Re: Remove line above matching criteria by swampyankee (Parson) on Nov 21, 2006 at 16:21 UTC
Your first paragraph ("extract user names and ID's…") and your final paragraph ("remove the ID and name for each entry") are contradictory. "Extract" usually means "copy from a file (or database, etc.) for use elsewhere". "Remove" usually means "erase". They are not synonyms in the English version of computerese, although they are in normal English (having "a tooth extracted" and "a tooth removed" both result in one fewer tooth in one's mouth).

emc
At that time [1909] the chief engineer was almost always the chief test pilot as well. That had the fortunate result of eliminating poor engineering early in aviation. -- Igor Sikorsky, reported in AOPA Pilot magazine, February 2003.