JillB has asked for the wisdom of the Perl Monks concerning the following question:
Hello, this is my first ever go at Perl. For starters, I want to open a GEDCOM text file, count the number of lines in the file, and read a line. The script below does not work for me. It shows the count as 1, and no record is shown for the 10th record. Can anyone please assist?
    #!/usr/bin/perl
    open(MYFILE, "C:/Users/Jill/Documents/Genealogy/birdt.ged") || die;
    @MyGed=birdt.ged;
    $count=@MyGed;
    print "10th record : $MyGed[10]\n";
    print "No. of records : $count \n";
Replies are listed 'Best First'.
Re: How to read a GEDCOM file
by haukex (Archbishop) on Nov 19, 2017 at 12:37 UTC
I don't know anything about this format, but CPAN is your friend: there is a module, Gedcom, that sounds like it will be able to help you.

Note that your Perl source contains some invalid syntax*. Since you say you're just starting with Perl, I suggest starting with perlintro (which includes the tip to Use strict and warnings). It might be easiest to start with a slightly simpler task, like reading the 10th line of a simple text file.

<update> Once you get to the section "Files and I/O" of perlintro, you should be able to see what to change in your current script for it to read the lines from the file. </update>

Also, the Basic debugging checklist contains some good tips, such as using Data::Dumper or Data::Dump to look at data structures, which will probably be helpful later on when inspecting the data that the module returns.

We'll be happy to help with any questions you may have while learning. When asking questions, it's always best to include the code you are working on, with things irrelevant to the question removed (although the code should at least still compile, see SSCCE), short sample input data, the expected output for that input, and the actual output you're getting, including any error messages (with line numbers intact). See also How do I post a question effectively?

* Actually, you've since edited your post to fix some of that. Please mark your updates as such, to prevent replies from being confusing - see How do I change/delete my post?

Update: Typo fix: Data::Dumper, not "Date::Dumper", thanks 1nickt!
by JillB (Novice) on Nov 19, 2017 at 19:51 UTC
Thanks for this advice. It should help me make better use of your forum.
Re: How to read a GEDCOM file
by Laurent_R (Canon) on Nov 19, 2017 at 13:13 UTC
That should probably solve your problem and make your script work. I would suggest, however, that you rewrite the script in accordance with commonly accepted best practices. For example as follows:
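A minimal sketch of what such a rewrite might look like, offered as an illustration rather than as the code originally posted in this reply. It assumes the file name is passed on the command line, and `read_lines` is a hypothetical helper name. It uses `strict` and `warnings`, a lexical filehandle, the three-argument form of `open`, and an error message that includes `$!`. Note that arrays are zero-based, so the 10th line is `$lines[9]`; the `$MyGed[10]` in the question would be the 11th.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return all lines of a file, without line endings.
sub read_lines {
    my ($file) = @_;
    # Lexical filehandle and three-argument open; report why it failed.
    open my $fh, '<', $file or die "Cannot open '$file': $!";
    my @lines = <$fh>;          # list context reads every line
    close $fh;
    chomp @lines;
    return @lines;
}

if (@ARGV) {
    my @lines = read_lines($ARGV[0]);
    my $count = @lines;         # an array in scalar context gives its size
    print "No. of lines : $count\n";
    if ($count >= 10) {
        print "10th line : $lines[9]\n";   # index 9 is the 10th element
    }
    else {
        print "The file has fewer than 10 lines\n";
    }
}
else {
    warn "Usage: $0 file.ged\n";
}
```

It could be run as, for example, `perl count_lines.pl C:/Users/Jill/Documents/Genealogy/birdt.ged` (the script name is made up). Reading the whole file into an array is fine for small files; for very large files, reading line by line uses less memory.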
by JillB (Novice) on Nov 19, 2017 at 19:47 UTC
Thank you for this quick and very helpful reply. It now works perfectly with your code, and I have learnt lessons from your post. Regards, Jill
by AnomalousMonk (Archbishop) on Nov 19, 2017 at 20:39 UTC
It's clear from this node (if the janitors haven't already tidied it away) that you haven't yet quite gotten the hang of editing your posts. :) Please see How do I change/delete my post? for site etiquette and protocol regarding changing your posts. Bottom line: Don't Destroy Context!

Give a man a fish: <%-{-{-{-<
Re: How to read a GEDCOM file
by kcott (Archbishop) on Nov 20, 2017 at 22:09 UTC
G'day JillB,

"Hello, this is my first ever go at Perl."

As has already been pointed out, there are quite a few problems there, and you've received good advice on dealing with them.

I was putting together a short script to show how to do this without slurping entire files into arrays (which can often be problematic when large files chew up lots of memory). Additionally, I included code to create a temporary test file before processing and to delete it afterwards; there's also a routine to check for the existence of that file. I've ended up with "pm_1203774_file_io_basics.pl", which covers many of the basic aspects of I/O and file handling.
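A rough sketch of a script along those lines, as an illustration only and not the original "pm_1203774_file_io_basics.pl". The helper names `make_test_file` and `scan_file`, the 20-line test file and the message wording are all assumptions. It reads one line at a time instead of slurping, creates a temporary test file, checks that the file exists, and deletes it afterwards.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write a throwaway test file and return its name.
sub make_test_file {
    my ($n) = @_;
    my ($fh, $name) = tempfile('file_io_demo_XXXX', SUFFIX => '.txt', TMPDIR => 1);
    print {$fh} "record $_\n" for 1 .. $n;
    close $fh or die "Cannot close '$name': $!";
    return $name;
}

# Hypothetical helper: read the file one line at a time, keeping only the
# requested lines. Returns the total line count and a hash reference
# mapping line number to text.
sub scan_file {
    my ($name, @wanted) = @_;
    die "No such file: '$name'\n" unless -e $name;
    my %want = map { $_ => 1 } @wanted;
    my (%found, $count);
    $count = 0;
    open my $fh, '<', $name or die "Cannot open '$name': $!";
    while (my $line = <$fh>) {
        $count++;
        next unless $want{$count};
        chomp $line;
        $found{$count} = $line;
    }
    close $fh;
    return ($count, \%found);
}

my $file = make_test_file(20);

# Separate usable record numbers from everything else.
my (@numbers, @rejected);
for my $arg (@ARGV) {
    if ($arg =~ /\A[1-9][0-9]*\z/) { push @numbers,  $arg }
    else                           { push @rejected, $arg }
}

my ($count, $found) = scan_file($file, @numbers);
print "Number of records: $count\n";
for my $n (sort { $a <=> $b } @numbers) {
    if (exists $found->{$n}) { print "Record $n: $found->{$n}\n" }
    else                     { warn "Record $n is out of range (1..$count)\n" }
}
warn "Not a record number: '$_'\n" for @rejected;

unlink $file or warn "Could not delete '$file': $!";
```

Only the requested lines are ever held in memory, so the approach scales to large files. Normal output goes to STDOUT and complaints go to STDERR via `warn`, which keeps the two streams separable when redirecting.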
Here's some sample runs. Firstly, with no arguments, just the record count is reported:
Arguments specify the record numbers you want to print:
The order of arguments is unimportant:
Out-of-range record numbers and non-numeric arguments are not processed; they are, however, reported on STDERR:
I then, out of curiosity, took a look at this Wikipedia GEDCOM entry. You have a problem with your terminology which is highly likely to translate into problems in your code. You're using the terms "lines" and "records" interchangeably: in many cases that equivalency exists; however, the GEDCOM format uses multiline records (i.e. "lines" and "records" are not the same thing). To demonstrate a technique you could use to read GEDCOM records, I copied "sample.ged" (from that Wikipedia article) to "pm_1203774_sample.ged", and parsed it like so:
Which outputs:
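One way to read multiline GEDCOM records, sketched here as an illustration and not as the script from this post. It assumes that every GEDCOM line begins with a level number and that a new record starts at each level-0 line. The sample data below is invented for the example (it is not the Wikipedia "sample.ged"), and `read_records` is a hypothetical helper name.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: group the lines from a filehandle into records.
# A new record starts at every line whose level number is 0.
sub read_records {
    my ($fh) = @_;
    my @records;
    while (my $line = <$fh>) {
        $line =~ s/\r?\n\z//;               # strip LF or CRLF line endings
        next unless length $line;           # skip blank lines
        # Start a new record on a level-0 line (or on the very first line).
        push @records, [] if $line =~ /\A0(?:\s|\z)/ or !@records;
        push @{ $records[-1] }, $line;
    }
    return @records;
}

# Made-up sample data, read through an in-memory filehandle.
my $sample = <<'END_GED';
0 HEAD
1 SOUR demo
0 @I1@ INDI
1 NAME Jane /Doe/
1 SEX F
0 @I2@ INDI
1 NAME John /Doe/
0 TRLR
END_GED

open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
my @records = read_records($fh);
close $fh;

printf "%d records\n", scalar @records;
for my $i (0 .. $#records) {
    printf "Record %d (%d lines): %s\n",
        $i + 1, scalar @{ $records[$i] }, $records[$i][0];
}
```

The eight sample lines are grouped into four records, which shows why a line count and a record count differ for this format. For real genealogy work, the Gedcom module on CPAN mentioned earlier in the thread is likely a better starting point than hand-rolled parsing.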
Adapting that code, for use in my first script, is left as an exercise for your good self. Of course, if you really get stuck on something, come back and ask another question.

-- Ken
by afoken (Chancellor) on Nov 21, 2017 at 05:05 UTC
"if you really get stuck on something, come back and ask another question."

... preferably in this thread, so that we don't have to start at zero again.

Alexander
-- Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)