Hi Kenosis, GrandFather, tangent, kcott, and ww!

Thank you so much for the warm welcome as well as your invaluable insights. My apologies for not describing the problem in more detail in my previous message. Reading through your posts made me realize that maybe I underestimated the "problem". :)

So... what it boils down to is the following. I have some weather data (air temperature) transmitted wirelessly from point A to point B. The "problem" is that sometimes a character (or several) goes missing from a line (perhaps due to interference?), so the received numbers differ from the real ones.

Hence, focusing only on field #4 for this example, the real data collected could look like Ex1 but might actually be received as Ex2 because of dropped characters:

Ex1:

A15 26.62 765 27.30 4.3

A11 26.63 763 27.28 4.2

A12 26.68 767 27.29 4.3

A16 26.64 768 27.30 4.2

A11 26.62 761 27.31 4.1

A15 26.62 765 27.30 4.3

A15 26.63 763 27.28 4.2

A16 26.68 767 27.29 4.3

A17 26.64 768 27.30 4.2

A18 26.62 761 27.31 4.1

Ex2:

A15 26.62 765 2.30 4.3

A11 26.63 763 27.8 4.2

A12 26.68 767 27.29 4.3

A16 26.64 768 27.30 4.2

A11 26.62 761 7.31 4.1

A15 26.62 765 27.30 4.3

A15 26.63 763 27.28 4.2

A16 26.68 767 2.29 4.3

A17 26.64 768 27.30 4.2

A18 26.62 761 27.31 4.1

There are several things to check, including that each line has the correct number of fields and that each field's value falls within its range of valid data (for temperature, this would be 0 through 45). Note that I've already written a small piece of code that takes care of this.
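To make those two checks concrete, here is a minimal sketch in Python, used purely for illustration (it is not the code mentioned above). The field count of 5 is an assumption taken from the example lines, and `parse_line` is a hypothetical name.

```python
EXPECTED_FIELDS = 5              # assumption: every example line has 5 fields
TEMP_MIN, TEMP_MAX = 0.0, 45.0   # valid temperature range stated above


def parse_line(line):
    """Return field #4 as a float, or None if the line fails a check."""
    fields = line.split()
    if len(fields) != EXPECTED_FIELDS:
        return None              # a whole field went missing (or an extra one appeared)
    try:
        temp = float(fields[3])  # field #4 is index 3
    except ValueError:
        return None              # field #4 is not a number at all
    if not (TEMP_MIN <= temp <= TEMP_MAX):
        return None              # outside the valid range
    return temp
```

Note that a corrupted value such as 2.30 still lies between 0 and 45, so it passes both checks. That is why the range test alone cannot catch these dropouts.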

As GrandFather and others very insightfully mentioned, if I were to use the "solution" approach that I proposed initially, one very real problem would be an anomalous value at the start of the transmitted data, with not enough preceding lines to compare it against.

One thing to consider is that in reality I have 30 lines of data, and typically fewer than 5 of them have character dropouts (if any at all); my example includes only 10 for simplicity's sake. Therefore, I wonder whether taking an average, a frequency test, or something like that would do the trick.
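One way this idea might look, sketched in Python for illustration only: compare every value against the median of the whole batch rather than against the preceding lines. This swaps the average for the median, which is my own substitution, because a few dropouts barely move the median while they drag the mean down (for Ex2 the mean of field #4 is roughly 20.3). The tolerance `MAX_DEVIATION` and the name `flag_outliers` are hypothetical and would need tuning against real data.

```python
from statistics import median

MAX_DEVIATION = 5.0  # assumption: allowed distance from the batch median


def flag_outliers(temps, max_deviation=MAX_DEVIATION):
    """Return the indices of values that lie too far from the batch median."""
    mid = median(temps)
    return [i for i, t in enumerate(temps) if abs(t - mid) > max_deviation]


# Field #4 of Ex2, in order
ex2 = [2.30, 27.8, 27.29, 27.30, 7.31, 27.30, 27.28, 2.29, 27.30, 27.31]
print(flag_outliers(ex2))  # [0, 4, 7]
```

Because the whole batch is used, an anomalous value on the first line is handled like any other. A dropout such as 27.8 stays close to the median, though, so this test does not flag it and a different check is needed for that case.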

What do you guys think?

Ahhh... as I was writing this, it occurred to me that one additional check would be to ensure that column #4 has exactly 2 decimal places (since the data is always transmitted with 2 decimal places)!
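A minimal sketch of that check, in Python for illustration only; `has_two_decimals` is a hypothetical name. It has to run on the raw text of the field, before any conversion to a number, because the conversion discards trailing zeros.

```python
import re

# One or more digits, a dot, then exactly two digits
TWO_DECIMALS = re.compile(r"\d+\.\d{2}")


def has_two_decimals(field):
    """True if the raw field text has exactly two digits after the decimal point."""
    return TWO_DECIMALS.fullmatch(field) is not None
```

This would reject 27.8, but a value such as 2.30 or 7.31, where the dropped character was before the decimal point, still passes. So it seems to work best alongside the other checks rather than instead of them.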

PS: kcott... hehe you are right. My apologies for my initial XY approach :)

In reply to Re^3: comparing numbers from previous lines in a file? by coding1227
in thread comparing numbers from previous lines in a file? by coding1227
