Re^2: Updating fields in a text file

Re^3: Updating fields in a text file by Limbic~Region (Chancellor) on Jul 11, 2006 at 15:45 UTC
wendy24, you are still omitting details that would allow us to provide a full working solution, such as the width of each field. Additionally, do the records have a newline separator, or is it just one long run-on line?

```perl
#!/usr/bin/perl
use strict;
use warnings;

my ($in, $out) = @ARGV;
die "Usage: $0 <input file> <output file>\n"
    if !defined $in || !defined $out;

open(my $in_fh,  '<', $in)  or die "Unable to open '$in' for reading: $!";
open(my $out_fh, '>', $out) or die "Unable to open '$out' for writing: $!";

# Field widths are guesses; adjust the template to match the real data.
my $template = 'A10A20A15A12A5';

while ( <$in_fh> ) {
    chomp;
    my ($name, $add, $city, $state, $zip) = unpack($template, $_);
    $zip = 15206 if uc($city) eq 'PITTSBURGH';
    # unpack's 'A' strips trailing spaces, so re-pad with pack on output.
    print $out_fh pack($template, $name, $add, $city, $state, $zip), "\n";
}
```

Of course, this assumes that name is only 10 characters long and zip is 5, but the widths can be adjusted accordingly. It also assumes the records are newline-separated and will not work otherwise. Don't worry about being new, but think about what information is needed to solve the problem even if you don't know how to solve it yourself.

Cheers - L~R
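To see what the unpack template does to a single record, here is a minimal sketch. The sample record, the field widths, and the helper name `update_zip` are all made up for illustration; the real widths have to come from the actual file. It relies on the fact that the `A` format strips trailing spaces on unpack and pads with spaces on pack, so a round trip through both keeps the record at its fixed length.

```perl
use strict;
use warnings;

# Hypothetical widths: name 10, address 20, city 15, state 12, zip 5.
my $TEMPLATE = 'A10A20A15A12A5';

# Return the record with the zip replaced when the city is Pittsburgh.
# Expects a line that has already been chomped.
sub update_zip {
    my ($line, $new_zip) = @_;
    my ($name, $add, $city, $state, $zip) = unpack($TEMPLATE, $line);
    $zip = $new_zip if uc($city) eq 'PITTSBURGH';
    # pack's 'A' pads each field with spaces back to its declared width.
    return pack($TEMPLATE, $name, $add, $city, $state, $zip);
}

# Made-up sample record, built with pack so the widths line up.
my $record = pack($TEMPLATE, 'John Doe', '12 Main St', 'Pittsburgh',
                  'Pennsylvania', '15201');
my $fixed = update_zip($record, '15206');

print length($record), " ", length($fixed), "\n";  # prints "62 62"
print substr($fixed, 57, 5), "\n";                 # prints "15206"
```

Keeping the template in one variable means the widths only need to be changed in one place once the real layout is known.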
Re^4: Updating fields in a text file by Anonymous Monk on Jul 11, 2006 at 16:28 UTC
The records will not be run-on lines. The fields can and probably will be of different lengths, but I can adjust for that. Thanks for your help. I will try your suggestions.
Re^3: Updating fields in a text file by davido (Cardinal) on Jul 12, 2006 at 06:41 UTC
Here is an untested one-liner: `perl -pi.bak -e "/^.{23}pittsburgh/ && s/\d{5}$/15206/;" filename.txt`

This works as follows:

- Check whether the line contains the word 'pittsburgh' starting at the 24th character position; the `^` ties the 23 skipped characters to the start of the line. You did mention that the lines are fixed length. You may have to tailor the {23} to meet your actual data field widths. The insistence on commencing the search for 'pittsburgh' at the 24th position is to eliminate false positives, such as the odd possibility of someone living on pittsburgh street or being named John Pittsburgh. The match is case-sensitive as written; add the /i modifier if the city may be capitalized in your data.
- If it does find 'pittsburgh' in the correct position, perform a substitution on the final five numeric digits found on the line, substituting in the new zip code. The trailing newline is ignored and preserved.

The -p switch wraps the code in a while loop over the input lines and prints each line after the code has run. The -i switch turns on 'in-place editing'; with the .bak suffix, a copy of the original file is kept as filename.txt.bak. See perlrun for a more thorough explanation of the command line switches, and perlre and perlretut for the rest. ;)

Update: Wait, I'm confused. In one followup node you said that the file is fixed length. I took this to mean that the fields are fixed length. In another followup, where you posted as Anonymous Monk, you (at least I think it's you) said that the fields may be variable length, but that you can adjust for that. If the latter is true, my solution breaks, and you've got one heck of a problem. Here's why:

If you have variable-width fields, delimited with whitespace, and the fields may also each contain whitespace (such as between house numbers and street names), your delimiters are not unique, and thus not special. How can you check the city as the third field if you don't have any sure method of delimiting fields? Your sample data implied fixed-length fields. It also implied that you're not 'escaping' whitespace that might be embedded within a field. It also implied that you're not wrapping your fields in anything like quotes.
So there is no way to predict whether a piece of whitespace represents a field delimiter or simply a space character within the text. For that reason, either your data is fundamentally flawed, or you're not showing us the whole picture. Which is it?

Your data needs one of the following characteristics:

- Fixed-width fields.
- Variable-width fields with unique delimiters.
- Variable-width fields with non-unique delimiters, but with some means of escaping embedded characters that might otherwise seem to be delimiters.
- Variable-width fields with quoted data to help distinguish between delimiters and plain text. Note that this opens another can of worms: escaping quotes. ;)
- Some other clear-cut, easily definable means of identifying where each field begins.

Dave
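As an illustration of the second option above (variable-width fields with a unique delimiter), here is a minimal sketch. It assumes a `|` delimiter that never appears inside a field, five fields per record, and city and zip in the third and fifth positions. None of that comes from the original poster's data, and the helper name `update_zip_delimited` is hypothetical.

```perl
use strict;
use warnings;

# Assumption: fields are separated by '|' and no field contains a '|'.
# Expects a line that has already been chomped.
sub update_zip_delimited {
    my ($line, $new_zip) = @_;
    # The -1 limit keeps trailing empty fields instead of dropping them.
    my @fields = split /\|/, $line, -1;
    # City is assumed to be the third field and zip the fifth.
    $fields[4] = $new_zip
        if @fields == 5 && lc($fields[2]) eq 'pittsburgh';
    return join '|', @fields;
}

# 'Pittsburgh' in the street name is ignored because only field 3 is checked.
print update_zip_delimited('John Doe|12 Pittsburgh St|Erie|PA|16501', '15206'), "\n";
print update_zip_delimited('Jane Roe|4 Elm Ave|Pittsburgh|PA|15201', '15206'), "\n";
```

With a unique delimiter, the false-positive problem goes away without counting character positions, because the city is compared as a whole field rather than searched for within the line.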