in reply to Converting GEDCOM files
I made two changes. The input record separator is now "\n0", because each section begins with a line starting with 0. (Although this will have each "\n0" read in as the end of the previous record, that doesn't make a difference for our purposes.) In the substitutions for NAME, I added a beginning of line anchor and removed the word 'data' (turns out that bit was just a placeholder for actual data). Also I fixed my typo of TITLE to be TITL (everything is four letters in this format).

    #!/usr/local/bin/perl -w
    use strict;

    $/  = "\n0";    # read in one record at a time
    $^I = '.bak';   # modify input files in place, save originals with .bak

    my %convert = (
        HEAL => 'Medical',
        HIST => 'Biography',
        EDUC => 'Educated',
        RESI => 'Resided',
        OCCU => 'Occupation',
    );

    my $convert_re = join '|', keys %convert;
    $convert_re = qr/\b($convert_re)\b/;

    while (<>) {
        s/$convert_re/NOTE $convert{$1}:/g;
        if (/^.*SOUR/) {
            s/^1 NAME/1 TITL/m;
            s/^1 NAME/1 TEXT/m;
            s/^1 NAME/2 CONT/mg;
        }
    }
    continue {
        print;
    }
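To see what the "\n0" separator does to the records, here is a small sketch that reads a made-up GEDCOM fragment from an in-memory filehandle. The sample data and the `read_records` helper are hypothetical and only there for illustration; the point is that the leading 0 of each section ends up at the tail of the previous record.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample data, not taken from a real GEDCOM file.
my $gedcom = "0 HEAD\n1 CHAR ANSI\n0 \@S1\@ SOUR\n1 NAME Parish register\n0 TRLR\n";

# Hypothetical helper: split a string into records using the "\n0" separator.
sub read_records {
    my ($text) = @_;
    local $/ = "\n0";    # same separator as in the script
    open my $fh, '<', \$text or die "cannot open in-memory file: $!";
    my @found = <$fh>;
    close $fh;
    return @found;
}

my @records = read_records($gedcom);

# Print each record with newlines made visible.
for my $i (0 .. $#records) {
    (my $shown = $records[$i]) =~ s/\n/\\n/g;
    print "record $i: $shown\n";
}
```

Because the records are printed back in order and unchanged in length, the displaced 0 lands exactly where it was in the input, which is why the off-by-one split is harmless here.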
Update: Make that four changes. The change of NAME to TITL is only supposed to occur for sections that start 0 @Snnn@ SOUR, not for all sections. And I stupidly forgot the /m modifier on those regexes when I added the anchors. Thanks for the notice of the problems, tachyon! (Notwithstanding the suggestion that I'm reading in one line at a time; I'm actually reading in a block of lines ending with a newline followed by a zero, as I intended.)
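For anyone wondering why the /m matters, here is a rough sketch on a made-up source record, shaped the way the script sees it once "\n0" is the separator (so it starts with a space, not with 0). The record contents are hypothetical. Without /m the ^ anchor only matches at the very start of the string, so the substitution silently does nothing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical source record as read with $/ = "\n0".
my $record = " \@S1\@ SOUR\n"
           . "1 NAME Parish register\n"
           . "1 NAME Baptisms 1801-1850\n"
           . "1 NAME Volume 2\n"
           . "1 NAME Page 14\n0";

# Without /m: ^ matches only at the start of the whole record, so no change.
(my $no_m = $record) =~ s/^1 NAME/1 TITL/;

# With /m: ^ also matches after each embedded newline.
my $with_m = $record;
if ($with_m =~ /^.*SOUR/) {              # only records whose first line has SOUR
    $with_m =~ s/^1 NAME/1 TITL/m;       # first NAME line
    $with_m =~ s/^1 NAME/1 TEXT/m;       # next remaining NAME line
    $with_m =~ s/^1 NAME/2 CONT/mg;      # every NAME line still left
}

print $no_m eq $record ? "without /m: unchanged\n" : "without /m: changed\n";
print "with /m:\n$with_m\n";
```

The three substitutions rely on running in order: each one without /g consumes only the first NAME line still present, and the last one with /g sweeps up the rest.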