Over the past few days I have been working on a project for work when it suddenly dawned on me that PostGIS would do exactly what I want and be incredibly useful for some marketing-related information. So, I got to work on the PostGIS 1.5.2 ebuild.
So, there’s some customer information that I need to clean up, among which I need to verify the zip codes. Or, at least fill in the city and state based on the zip code. (As I get more focused on database work, I get more frustrated with the loose restrictions on data entry that our software has.)
Since this same data set I’m using is also used for other marketing reports, I figured it’d be kind of neat to determine how far away our customers are coming, or even to generate a report that reduces the number of potential mail outs to those who are within a certain radius of our location.
So, I got cracking on the ebuild because now that I needed it. (Okay, wanted, but it’s a fine line.) The ebuild has been entirely revamped, and I think you PostGIS users may be quite happy with it.
It properly works with slotted PostgreSQL with one caveat: You’ll need to reemerge PostGIS for each slot you want it to work with.
root@host # postgresql-config set 9.0 root@host # emerge -av =dev-db/postgis-1.5.2-r1 root@host # emacs /etc/postgis_dbs root@host # emerge --config =dev-db/postgis-1.5.2-r1 root@host # postgresql-config set 8.4 root@host # emerge -av =dev-db/postgis-1.5.2-r1 root@host # emacs /etc/postgis_dbs root@host # emerge --config =dev-db/postgis-1.5.2-r1
And now that that’s done, I get to have some fun. First, I needed to find a zip code database that wouldn’t cost me huge bucks to get it from USPS themselves, or indeed any bucks at all. (Seriously, USPS can’t just give me a list of zip codes with the city name and state?) There are other options out there that don’t cost much, but I need that money to pay for food.
Then I stumbled upon The Zip Code Database Project (zips.sourceforge.net). It has all the data I was looking for. (Side note: They need an alternative means to update the database as the U.S. Census Bureau won’t be providing that information anymore. So, if you can throw them a ) It has the zip codes, city names, states they’re in, and as a bonus the latitude and longitude for those which is precisely what I need to do distance calculations.
I had to tweak things a bit in the SQL script they provide to get it to be proper for PostgreSQL and adjust the insert lines a bit to be proper for PostGIS. Resulting in a rather large file.
Okay, okay, I’ll share it with you.
And once all that’s inserted, I can do some fun stuff.
db=> SELECT st_distance(gg1, gg2) * 0.0006214 AS miles FROM (SELECT (SELECT geoloc FROM geo_zip_codes WHERE zip = '35004') AS gg1, (SELECT geoloc FROM geo_zip_codes WHERE zip = '28713') AS gg2 ) AS foo; miles ------------------ 212.304842735422 (1 row)
That’s 212 miles as the crow flies, but good enough. Pretty neat I should say.