Perl newbie here trying to make maps. I've written a script to extract the US census block information from the TIGER Line Shapefiles (ESRI format) into a plain tab-delimited spreadsheet. I'm on macOS Sierra. Specifically, I used Jason Kohles' Geo::ShapeFile module to read the shapefile, and copy each Block ID and its associated polygon (coordinates) into a two-column, tab-delimited text file.
Here is the code:
#!/usr/bin/perl use strict; use warnings; use Math::Round; use Geo::ShapeFile; use Geo::ShapeFile::Shape; use Geo::ShapeFile::Point; # Declare variables my $blockid; my $bcount; my $totalblocks; my $progress; my $shapefile; # used for loading the TIGER shapefile my %attr; # used to read the dbf file my $polygon; # used to read the shp file my @points; # used to read the shape into an array of points my $coor; # used to read the coordinates of each point my $x; my $y; my $pcount; # counter for the coordinates being written into output my $totalpoints; # total number of coordinates in each polygon # Variables for time keeping my $start = time; my $duration; # Prepare the TIGER spreadsheet. open (TIGER, ">tiger.txt") or die "Failed to open"; print TIGER "BLOCKID\tPolygon\n"; # make a header # Load the TIGER shapefile $shapefile = 0; $shapefile = Geo::ShapeFile->new ("tabblock2010_42_pophu"); # use the +TIGER census block special release (population & housing units only) $totalblocks = $shapefile->shapes(); for (1 .. $totalblocks) { %attr = $shapefile->get_dbf_record($_); $blockid = $attr{BLOCKID10}; $bcount++; print TIGER "$blockid\t\"geometry\":{\"type\":\"Polygon\",\"coordi +nates\":[["; # The polygon column is in the GEOJSON format $polygon = $shapefile->get_shp_record($_); @points = $polygon->points(); $pcount = 0; $totalpoints = scalar (@points); foreach $coor (@points) { $x = $coor->get_x(); $y = $coor->get_y(); print TIGER "[$x,$y]"; $pcount++; if ($pcount != $totalpoints) { # All but the last coordinate s +hould be followed by a comma print TIGER ","; } } print TIGER "]]}\n"; # finish the entry print "Done $bcount out of $totalblocks.\n"; %attr = (); $blockid = 0; $polygon = 0; @points = (); $totalpoints = 0; } close (TIGER); $shapefile = 0; select()->autoflush(1); # Show stopwatch $duration = time - $start; $duration = round ($duration / 60); print "\n\nThis script took $duration minutes\n\n"; exit;
This has been working well for smaller states like Connecticut. But when I run this for a large shapefile, the program accumulates a huge amount of memory, and cannot go to the end of the script. I have a piece at the end of the script that prints a stopwatch (execution time). For example, Pennsylvania has 421,545 census blocks, the script will take about 7 minutes to run, and it will display:
... Done 421543 out of 421545. Done 421544 out of 421545. Done 421545 out of 421545.
And it will get stuck there, never finishing and never displaying the stopwatch. And in my Activity Monitor will say perl is taking 7 GB of memory. But my output file is perfectly fine. Geo::ShapeFile doesn't have a close file command, so I suspect there's some problem with the ShapeFile remaining open and taking up all of the memory.
In reply to Geo::ShapeFile memory problem by rhzhang
| For: | Use: | ||
| & | & | ||
| < | < | ||
| > | > | ||
| [ | [ | ||
| ] | ] |