PostGIS Users Rejoice!

Over the past few days I have been working on a project for work when it suddenly dawned on me that PostGIS would do exactly what I want and be incredibly useful for some marketing-related information. So, I got to work on the PostGIS 1.5.2 ebuild.

So, there’s some customer information that I need to clean up, among which I need to verify the zip codes. Or, at least fill in the city and state based on the zip code. (As I get more focused on database work, I get more frustrated with the loose restrictions on data entry that our software has.)

Since this same data set I’m using is also used for other marketing reports, I figured it’d be kind of neat to determine how far away our customers are coming, or even to generate a report that reduces the number of potential mail outs to those who are within a certain radius of our location.

So, I got cracking on the ebuild because now that I needed it. (Okay, wanted, but it’s a fine line.) The ebuild has been entirely revamped, and I think you PostGIS users may be quite happy with it.

It properly works with slotted PostgreSQL with one caveat: You’ll need to reemerge PostGIS for each slot you want it to work with.

root@host # postgresql-config set 9.0
root@host # emerge -av =dev-db/postgis-1.5.2-r1
root@host # emacs /etc/postgis_dbs
root@host # emerge --config =dev-db/postgis-1.5.2-r1
root@host # postgresql-config set 8.4
root@host # emerge -av =dev-db/postgis-1.5.2-r1
root@host # emacs /etc/postgis_dbs
root@host # emerge --config =dev-db/postgis-1.5.2-r1

And now that that’s done, I get to have some fun. First, I needed to find a zip code database that wouldn’t cost me huge bucks to get it from USPS themselves, or indeed any bucks at all. (Seriously, USPS can’t just give me a list of zip codes with the city name and state?) There are other options out there that don’t cost much, but I need that money to pay for food.

Then I stumbled upon The Zip Code Database Project (zips.sourceforge.net). It has all the data I was looking for. (Side note: They need an alternative means to update the database as the U.S. Census Bureau won’t be providing that information anymore. So, if you can throw them a ) It has the zip codes, city names, states they’re in, and as a bonus the latitude and longitude for those which is precisely what I need to do distance calculations.

I had to tweak things a bit in the SQL script they provide to get it to be proper for PostgreSQL and adjust the insert lines a bit to be proper for PostGIS. Resulting in a rather large file.

Okay, okay, I’ll share it with you.

And once all that’s inserted, I can do some fun stuff.

db=> SELECT st_distance(gg1, gg2) * 0.0006214 AS miles
  FROM (SELECT
    (SELECT geoloc FROM geo_zip_codes WHERE zip = '35004') AS gg1,
    (SELECT geoloc FROM geo_zip_codes WHERE zip = '28713') AS gg2
  ) AS foo;
      miles       
------------------
 212.304842735422
(1 row)

That’s 212 miles as the crow flies, but good enough. Pretty neat I should say.

7 thoughts on “PostGIS Users Rejoice!”

  1. Hi-
    I’m unable to merge postgis-1.5.2-r1 (with doc USE flag), due to a sandbox violation when installing the built PostGIS package into the /var/tmp/portage/dev-db/postgis-1.5.2-r1/image. The Makefile attempts to copy html/postgis.html to /usr/share/doc/postgresql-8.4/postgis. The Portage sandbox then complains that there’s a violation. I’ll poke at the ebuild a little to see if I can resolve it, but I thought you might appreciate a heads up. Thank you for your work on the PostGIS ebuild! 🙂

    -Robin K.

  2. Here’s nicer SQL, and you can use the “geography” type to get correct distances without magic scaling factors (well, except the one to convert km to miles):

    SELECT ST_Distance(g1.geoloc::geography, g2.geoloc::geography) AS kilometers
    FROM geo_zip_codes AS g1, geo_zip_codes AS g2
    WHERE g1.zip = '35004' AND g2.zip = '28713';

Leave a Reply