Re^5: Geo Package files

Re^6: Geo Package files by Bod (Parson) on Mar 05, 2022 at 14:17 UTC
> something like this will work?

So far so good. And the srs_id passes the sanity check, as it is the value I would expect.

The problem comes when we get to the envelope. We know it is a `double[]` from the spec and we know it is Little Endian from the flag byte. The trouble I'm having is understanding how to take that information and translate it into a template for unpack. From the list in the documentation, the closest seems to be 'V', but that is a `long`, not a `float`. unpack'ing the envelope with 'V' gives these values:

`minx - 3307124816`
`maxx - 1092842796`
`miny - 867583392`
`maxy - 1092852469`
`minz - 1374389536`
`maxz - 1091757350`

This makes no sense, as `min_x` has to be numerically less than `max_x` - it is only a co-ordinate system, albeit one in 3 dimensional irregular spherical space. Likewise with `min_z` and `max_z`: this is height above sea-level and the min has to be numerically less than the max.

So from the data it seems I am using the wrong unpack template of `'nCCV(V)6'`. I've tried the two float options - 'f' and 'd' - but they produce equally unrealistic values.

Do you have any advice on how I go about translating from the information I know about the data to the template necessary to unpack it?

> buzzwords like SRSID or OSGB mean nothing to me

Sorry!

SRS - Spatial Reference System is the co-ordinate system used to identify where a point is on the Earth's surface. Many SRSs exist and none are perfect. In order that the data can be used, we need to know which SRS we are using.

OSGB - Ordnance Survey of Great Britain is the SRS that we most commonly use here in the UK. Because the UK is a relatively small country, OSGB mostly ignores the curvature of the Earth. OSGB is what we have found as the SRS ID in this GeoPackage, so that's a sanity check that we are on the right path so far.
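As a rough illustration of the template question, here is a minimal sketch that assumes the header layout described in the post: a 2-byte magic, a version byte, a flags byte, a 32-bit SRS ID, then six little-endian doubles. Since 'V' reads only 4 bytes, `(V)6` covers just 24 of the envelope's 48 bytes; the `<` modifier on 'd' (Perl 5.10 or later) requests a little-endian double explicitly. The blob is synthetic and the variable names are hypothetical.

```perl
use strict;
use warnings;

# Synthetic 56-byte header so the example is self-contained; a real blob
# would come from the geometry column. The values are illustrative.
my @names  = qw(minx maxx miny maxy minz maxz);
my @values = (637590.385, 642426.601, 309577.58, 310361.391, 0, 0);
my $blob   = pack 'a2 C C V d<6', 'GP', 0, 5, 27700, @values;

# a2  = 2-byte magic, C = unsigned byte,
# V   = 32-bit little-endian unsigned integer,
# d<6 = six 64-bit little-endian doubles.
my ($magic, $version, $flags, $srs_id, @doubles) = unpack 'a2 C C V d<6', $blob;

my %envelope;
@envelope{@names} = @doubles;

printf "%s version=%d flags=%d srs_id=%d\n", $magic, $version, $flags, $srs_id;
printf "%-4s = %.3f\n", $_, $envelope{$_} for @names;
```

The same template should apply to a real blob as long as its flags byte indicates little-endian order and a six-value envelope.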
Re^7: Geo Package files by soonix (Canon) on Mar 05, 2022 at 18:07 UTC
The WKB spec says it's an "8-byte IEEE double". I assumed that would imply the "f" format specifier, but probably I was mistaken, and it should be "d", as this older post suggests. Data::IEEE754::Tools mentions "d>"…

The "Floating point numbers" section in perlpacktut says "There is no such thing as a network representation for reals", which might mean that the "little-endian" flag bit could be ineffective for these doubles.
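On the byte-order question, a small sketch may help. It assumes Perl 5.10 or later (where 'd' accepts the `<` and `>` modifiers) and a platform whose doubles are 8-byte IEEE 754 values. It packs one value both ways and compares the bytes; the value itself is arbitrary.

```perl
use strict;
use warnings;

my $value = 27700.5;

my $le = pack 'd<', $value;    # little-endian byte order
my $be = pack 'd>', $value;    # big-endian byte order

printf "d< : %s\n", join ' ', unpack '(H2)*', $le;
printf "d> : %s\n", join ' ', unpack '(H2)*', $be;

# The two encodings hold the same eight bytes in opposite order.
print "mirrored\n" if $le eq scalar reverse $be;
```

A plain 'd' uses the machine's native order, so on a little-endian processor it should behave like 'd<'.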
Re^8: Geo Package files by Bod (Parson) on Mar 06, 2022 at 13:59 UTC
> I assumed that would imply "f" format specifier, but probably I was mistaken, and it should be "d"

Unfortunately, it doesn't seem to be as simple as that. Either that or I am totally misunderstanding the results I am getting.

Using this code:

`use DBD::SQLite; use Data::Dumper; use strict; use warnings; my $dbh = DBI->connect("dbi:SQLite:uri=file:osopenusrn_202203.gpkg?mode=rwc"); my $tab = $dbh->prepare("SELECT * FROM openUSRN"); $tab->execute; my $data = $tab->fetchrow_hashref; my @test = unpack "(H2)2 CCVd6CVV", $data->{'geometry'}; foreach my $t(@test) { print "$t\n"; } print "\n-----\n\n"; printf ('%f ', $test[4]); printf ('%f', $test[6]);`

I get this output:

`47 - G`
`50 - P`
`0 - version`
`5 - flags (Little Endian & envelope type 2)`
`27700 - SRS (OSGB 1936)`
`637590.385 - min_x (envelope)`
`642426.601 - max_x (envelope)`
`309577.58 - min_y (envelope)`
`310361.391000001 - max_y (envelope)`
`0 - min_z (envelope)`
`0 - max_z (envelope)`
`1 - StandardGeoPackageBinary encoding (Little Endian)`
`1005 - StandardGeoPackageBinary type (MultiLineString Z)`
`34 - Number of LineStrings`
`-----`
`27700.000000 642426.601000`

The StandardGeoPackageBinary Spec (section 8.2.2) tells us that there are only three data types: a single byte, plus the following.

An Unsigned Integer is a 32-bit (4-byte) data type that encodes a nonnegative integer in the range (0, 4,294,967,295).

A Double is a 64-bit (8-byte) double precision datatype that encodes a double precision number using the IEEE 754 double precision format.

Applying a simple sanity check to the results I get shows that the `byte` 'C' makes sense - e.g. 'GP' and the 'flags'. The `unsigned int` 'V' also makes sense: 1005 is a perfectly sensible result for the StandardGeoPackageBinary type.

The problem comes with the IEEE 754 double. Data::IEEE754 says that:

> If you can require Perl 5.10 or greater then this module is pointless. Just use the d> and f> pack formats instead!

I am using Perl 5.32.1, so 'd' should give me an IEEE 754 double. It pulls the right number of bytes off the binary (otherwise the following values would be gobbledegook) but it returns values that don't make sense. The OSGB SRS doesn't use latitude and longitude. Instead, it uses easting and northing (x, y) coordinates. They have to be integers in the range 0-670000 for x and 0-140000 for y.

I'm beginning to think that I must be interpreting the data wrongly. Is there any other way that a Little Endian IEEE 754 double can be decoded? Is it somehow platform dependent? Although surely the purpose of IEEE 754 is to make data universal and remove the platform dependency (once the endianness is established).
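The flags byte is what ties the header to an unpack template, so a sketch of decoding it may be useful. This assumes the GeoPackage binary header layout as I understand it: bit 0 is the byte order, bits 1-3 are the envelope indicator, and bit 4 marks an empty geometry. `decode_flags` and the hash keys are hypothetical names, not part of any library.

```perl
use strict;
use warnings;

# Envelope indicator code => number of doubles in the envelope.
my %envelope_doubles = (0 => 0, 1 => 4, 2 => 6, 3 => 6, 4 => 8);

sub decode_flags {
    my ($flags) = @_;
    my $code = ($flags >> 1) & 0x07;              # bits 1-3
    die "invalid envelope code $code\n" unless exists $envelope_doubles{$code};
    return {
        little_endian  => $flags & 0x01,          # bit 0
        envelope_code  => $code,
        envelope_count => $envelope_doubles{$code},
        empty_geometry => ($flags >> 4) & 0x01,   # bit 4
    };
}

my $info   = decode_flags(5);
my $int    = $info->{little_endian} ? 'V' : 'N';
my $endian = $info->{little_endian} ? '<' : '>';

# Build the header template from the decoded flags.
my $template = "a2 C C $int";
$template .= " d$endian$info->{envelope_count}" if $info->{envelope_count};
print "template: $template\n";    # prints "template: a2 C C V d<6"
```

For the flags value 5 seen in the output above, this gives little-endian order and envelope code 2, which matches the annotation in the post.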
Re^9: Geo Package files by swl (Parson) on Mar 07, 2022 at 02:39 UTC
Solved! (was: Re^10: Geo Package files) by Bod (Parson) on Mar 11, 2022 at 22:39 UTC
Re^10: Geo Package files by Bod (Parson) on Mar 11, 2022 at 20:27 UTC
Re^7: Geo Package files by Marshall (Canon) on Mar 07, 2022 at 23:50 UTC
Dealing with floats is messy. Among those issues is: what the heck does a 64 bit float mean? We know that this format is in "little endian". Is your processor also "little endian", e.g. an Intel processor? If it is, then maybe we don't have to write completely platform independent code?

I think also relevant to this would be: is your Perl version 32 or 64 bit? I am running 64 bit Perl on a 64 bit Windows platform, and I have a 64 bit GCC compiler. On my 64 bit machine, in C, a simple float is 64 bits. I don't know for sure, but I would suspect that 64 bit Perl's representation of a simple float is also 64 bits? Added: I see trouble if you are using 32 bit Perl.

I don't know where the 6 comes from in 'nCCV(V)6'? Each float is 8 bytes, not 6 bytes. From what I understand so far:

index 8, length 8 bytes => minx
index 16, length 8 bytes => maxx
index 24, length 8 bytes => miny
index 32, length 8 bytes => maxy
index 40, length 8 bytes => minz
index 48, length 8 bytes => maxz

so, `my $minx = unpack ("d8", substr($geo,8,8));` may work?? I am not sure if f8 would work also?

Update: BTW, what is length ($geo)? Also, again, I request the binary data dump of $geo, NOT what you think that binary unpacks as. The dump should be 2 hex digits per byte, with a space between bytes. 0 should be 00 so that columns line up nicely. 16 bytes per line would be fine.
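A dump in the requested shape (two hex digits per byte, space separated, 16 bytes per line) could be produced along these lines. This is only a sketch: `hex_dump` is a hypothetical helper, and `$geo` here is a short stand-in built with pack rather than the real geometry blob.

```perl
use strict;
use warnings;

# Format a binary string as two hex digits per byte, 16 bytes per line.
sub hex_dump {
    my ($bytes) = @_;
    my @lines;
    for (my $offset = 0; $offset < length($bytes); $offset += 16) {
        my $chunk = substr($bytes, $offset, 16);
        push @lines, join ' ', unpack '(H2)*', $chunk;
    }
    return join "\n", @lines;
}

# Stand-in for the real blob: magic, version, flags, SRS ID.
my $geo = pack 'a2 C C V', 'GP', 0, 5, 27700;

print 'length: ', length($geo), "\n";
print hex_dump($geo), "\n";    # prints "47 50 00 05 34 6c 00 00"
```

With a real row, `$geo` would be the value of the geometry column, and the first eight bytes of the dump can be checked against the magic, version, flags and SRS ID before worrying about the doubles.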
Re^8: Geo Package files by Bod (Parson) on Mar 11, 2022 at 20:20 UTC
> I don't know where the 6 comes from in 'nCCV(V)6'? Each float is 8 bytes, not 6 bytes.

There are 6 values to extract: minx, maxx, miny, maxy, minz, maxz.

> so, `my $minx = unpack ("d8", substr($geo,8,8));` may work?? I am not sure if f8 would work also?

Both d8 and f8 give fractions that seem nonsensical in this situation (because of the SRSID we are using). f8 gives a negative fraction, which is doubly nonsensical. However, it could be that the decoded data is right and I am not understanding it correctly.

My plan is to ignore the envelope for the moment and press on to the geometry data. That way I can feed the USRN into a USRN finder (the link is for the first entry in the SQLite DB) and sanity check the values against a known entity instead of trying to work out the envelope first.
Re^8: Geo Package files by Bod (Parson) on Mar 11, 2022 at 22:43 UTC
Thank you for all your help Marshall - it is very much appreciated.

The problem was entirely with my interpretation of the data, as explained in Solved! (was: Re^10: Geo Package files)
Re^9: Geo Package files by Marshall (Canon) on Mar 12, 2022 at 20:43 UTC
Re^8: Geo Package files by Bod (Parson) on Mar 11, 2022 at 19:52 UTC
> Is your processor also "little endian"

Yes! We use entirely refurbished Dell PCs built on Intel processors. This one is: Intel(R) Core(TM) i5-4590 CPU @ 3.30GHz 3.30 GHz

> I would suspect that 64 bit Perl's representation of a simple float is also 64 bits? Added: I see trouble if you are using 32 bit Perl.

It hadn't occurred to me that the Perl build or the processor would affect this decoding, but that does make sense. I am running Strawberry Perl:

This is perl 5, version 32, subversion 1 (v5.32.1) built for MSWin32-x64-multi-thread

So, that looks to me like we should be OK there.
Re^9: Geo Package files by pryrt (Abbot) on Mar 11, 2022 at 20:16 UTC
Re^10: Geo Package files by Bod (Parson) on Mar 11, 2022 at 20:34 UTC

