Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:
My company receives orders via e-mail from our customer's system. Each message is one order. I need to pull the files apart and drop the data into a database. The part I am having trouble with is the pulling apart and defining fields (I can do the database work, I just can't get the data into nice chunks). These messages have a very standard format (this is just a snippet to show some of the different format issues I have; if you can show me how to deal with these, I will be able to mix and match to do the whole message):
field1 name:
data1
data1
data1
field2 name:
data2
field3 name: data3
**a full line of asterisks**
unimportant text
***field4 name: data4
***field5 name: data5
**a full line of asterisks**
unimportant text
field6 name: data6
field7 name: data7
field8 name: data8
location 1
field9 name: data9
field10 name: data10
location 2
field11 name: data11
field12 name: data12
etc...
In case it does not display correctly here online, the field names are flush left and the data is aligned some distance out. The field names could be more than one word, but they all end with a colon. Each field is on a separate line, but not every line is a field (there are blank lines and lines of * used to make the message more easily human readable). Not all fields are filled. Some of the field names are duplicated: in the above example, field 9 would match field 11 (for example, state:) and field 10 would match field 12 (for example, city:). The order is not always the same, and the specific fields change (some messages will have some fields and others will have other fields). The amount of data (number of lines) in field1 is variable (it is a list of addresses, which I don't care about anyway). As you can see, field2's data is on the line below the field name (other than field1, it is the only one like this).
I would like to be able to reference the data by its field name (if the order of the fields changes, I can still refer to the same name; also, when new fields are added, I am set).
That's it. It seems like it should be relatively simple. Anything to the left of a colon is a field name and, on that same line but some distance to the right, is the data. The only exception is field #2, whose data is on the line below. (The good news here is that the name of this field is always the same, so if I see the word Subject: I can pull data from the line below it.)
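The rule described above could be sketched in Perl roughly as follows. This is only a minimal sketch under stated assumptions, not a tested solution for the real messages: the sub name `parse_order` and the `%data_below` lookup are made up for illustration, 'field1 name' is the placeholder from the snippet, and the code assumes that unimportant text never contains a colon.

```perl
use strict;
use warnings;

# Assumption: the only fields whose data sits on the line(s) BELOW the name.
# 'field1 name' is the placeholder from the snippet, not a real field name.
my %data_below = map { $_ => 1 } ('Subject', 'field1 name');

# Returns a list of [name, value] pairs in the order they appear in the message.
sub parse_order {
    my ($text) = @_;
    my (@pairs, $open);        # $open: the pair still collecting lines below it
    for my $line (split /\n/, $text) {
        $line =~ s/\r$//;      # tolerate CRLF line endings
        # Optional leading asterisks, a name up to the first colon, then data.
        if ($line =~ /^\**\s*([^:*\s][^:*]*?)\s*:\s*(.*?)\s*$/) {
            my ($name, $value) = ($1, $2);
            push @pairs, [$name, $value];
            $open = $data_below{$name} ? $pairs[-1] : undef;
        }
        elsif ($line =~ /^[\s*]*$/) {
            $open = undef;     # blank line or a row of asterisks ends a field
        }
        elsif ($open) {
            # Continuation line for a field whose data is below its name.
            $line =~ s/^\s+|\s+$//g;
            $open->[1] = length($open->[1]) ? "$open->[1]\n$line" : $line;
        }
        # Anything else is treated as unimportant text and skipped.
    }
    return @pairs;
}
```

Calling `parse_order($message_text)` would give back pairs such as `['Subject', 'data2']`. Returning pairs instead of a plain hash keeps repeated field names from overwriting each other.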
Questions:
How do I do this?
How can I handle the duplicated field names?
The Perl Nubie
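On the duplicated field names, one common approach (offered here as a sketch, not as the answer any of the replies gave) is a hash of arrays: each field name maps to a list of every value seen under that name, in message order. The sub name `group_by_name` and the sample values are made up for illustration; "state" and "city" are the example names from the question.

```perl
use strict;
use warnings;

# Collects ordered [name, value] pairs into a hash of arrays, so a name that
# appears more than once keeps all of its values in message order.
sub group_by_name {
    my (@pairs) = @_;
    my %order;
    for my $pair (@pairs) {
        my ($name, $value) = @$pair;
        push @{ $order{$name} }, $value;   # autovivifies the array on first use
    }
    return \%order;
}

my $order = group_by_name(
    ['state', 'NY'], ['city', 'Albany'],   # made-up values under "location 1"
    ['state', 'CA'], ['city', 'Fresno'],   # made-up values under "location 2"
);
print "first state: $order->{state}[0]\n";    # prints "first state: NY"
print "second state: $order->{state}[1]\n";   # prints "second state: CA"
```

A field that appears only once is still reachable the same way, as element 0 of its list, so the database code does not need a special case for duplicates.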
Replies are listed 'Best First'.

Re: Probably very simple (for those in the know)
by atcroft (Abbot) on Feb 03, 2002 at 13:39 UTC

Re: Probably very simple (for those in the know)
by jonjacobmoon (Pilgrim) on Feb 03, 2002 at 12:16 UTC

Re: Probably very simple (for those in the know)
by Anonymous Monk on Feb 03, 2002 at 10:08 UTC