Tuesday, May 24, 2011

Batch Import Shapefiles into PostgreSQL

Importing shapefiles into PostgreSQL is simple using shp2pgsql, unless you want batch import multiple shps. Bash to the rescue. Below is a simple script that imports all shapefiles within a directory.

To run:
  • Update the schema in the script
  • Update the SRID (make sure all shps in the folder have the same SRID)
  • Update the database name and user 
  • Navigate to the folder where your data is sitting
  • Decide whether or not you want to remove the shps afterwards (if not, remove the line above 'done') 
  • Run
#--------------------------------------------------------------
FILES=$(find *.shp);
for FILE in $FILES
    do
    basename=${FILE%.shp}
    shp2pgsql -D -s 4326 -I $FILE schema.$basename | psql -U postgres -d postgis
    rm $basename.*
done
#--------------------------------------------------------------

Wednesday, May 18, 2011

PostGIS Polygon to Point Grid

I've occasionally run into the need to convert my PostGIS polygon geometries to points. Usually this need arises as my datasets get large and larger and simple geometry helps speed up processing (treating the dataset as a raster). Hopefully, with the release of PostGIS 2.0 this function will lose its value (hopefully), since 2.0 will fully support rasters.

To the function

The below function converts polygons to point grids letting the user specify the resolution. It also takes on the projection of the polygonal dataset. Simple yet effective. Note, that I cannot take full credit for this function, I had seen the beginnings of it on some user forum. Unfortunately I simply cannot track it down...if I stumble into it again I will post the url. Note, that the resolution is in projected units i.e. if your projection is in meters, then inputting 10 = 10 meter resolution.

point grid inside polygon

-----------------------------------------------------------------------------------------
CREATE OR REPLACE FUNCTION st_polygrid(geometry, integer) RETURNS geometry AS
$$
SELECT
    ST_SetSRID(ST_Collect(ST_POINT(x,y)), ST_SRID($1))
FROM
    generate_series(floor(ST_XMin($1))::
numeric, ceiling(ST_xmax($1))::numeric, $2) as x,
    generate_series(floor(ST_ymin($1))::numeric, ceiling(ST_ymax($1))::numeric,$2) as y 
WHERE
    ST_Intersects($1,ST_SetSRID(ST_POINT(x,y), ST_SRID($1)))
$$
  LANGUAGE sql VOLATILE;

-----------------------------------------------------------------------------------------

--Example using the above function
CREATE TABLE some.other_table AS
SELECT
    a.gid,
    st_polygrid(a.the_geom, 1000) AS the_geom
FROM
    some.table a

Google Geocoder in PostGIS (with reverse geocode!)

Well, it's been a while since my last post. It seems there is a negative correlation between having twins and blog posts!

Alright, onto the reason for this post, PostGIS geocoding using Google! OK, I realize that there are other methods for geocoding in PostGIS, but I had a couple of drivers motivating me:
1. I wanted to create a Function in Postgres using PL-Python. A way of expanding my horizons we'll say.
2. Geopy (get this python module here) makes it extremely easy to geocode using Google, Yahoo, GeoNames, MediaWiki and more.
3. I wanted to be able to reverse geocode i.e. give a lat, long and return an address.

You'll need a couple things first before these functions will work (don't worry, nothing drastic!).
1. Make sure you have PL-Python installed. To check if you have it; if you have pgAdmin, go to Preferences, and navigate to the list of objects displayed in the tree and check Languages. If its not there, pgAdmin to the rescue. Right click languages, new language, plpythonu, done!
2. Make sure you have Geopy installed, see link above. If you operate in a virtualenv, no worries, I've added a simple workaround for that in the function. So virtualenv away!
3. Get a Google API key.
4. If you are planning on using the reverse geocoder, you'll need to download and install the experimental (but stable) branch of geopy here.

Now for the code!
-------------------------------------------------------------------------------
--Converts an address to lat, long
CREATE OR REPLACE FUNCTION google_latlng(IN address text) RETURNS text ARRAY[2] AS
$$
#If using virtualenv set path to appropriate environment
import site
site.addsitedir('/your/virtualenv/lib/pythonX.Y/site-packages')

from geopy import geocoders
gc = geocoders.Google('Your key here')
try:
    geo = list(gc.geocode(address, exactly_one=True))
    lat, lng = geo[1][0], geo[1][1]
except:
    lat, lng = 0, 0

return lng, lat
$$
LANGUAGE 'plpythonu' VOLATILE;

--Converts an address to PostGIS geometry
CREATE OR REPLACE FUNCTION google_geocode(IN address text) RETURNS geometry AS
$$
SELECT
ST_SetSRID(ST_GeomFromText('POINT(' || r.ll[1] || ' ' || r.ll[2] || ')'), 4326)
FROM (
SELECT
google_latlng($1) AS ll ) r
$$
LANGUAGE sql VOLATILE;

--Reverse Geocoding! Converts lat, long to an address
CREATE OR REPLACE FUNCTION google_addr(IN lat double precision, lng double precision) RETURNS text AS
$$
#If using virtualenv set path to appropriate environment
import site
site.addsitedir('/your/virtualenv/lib/pythonX.Y/site-packages')

from geopy import geocoders
gc = geocoders.Google('Your key here')
try:
    (address, point) = gc.reverse((lat,lng))
except:
    address = '????'

return address
$$
LANGUAGE 'plpythonu' VOLATILE;

------------------------------------------------------------------------------------


That's it! Create those functions and your on your way. Just remember to update your Geography Columns Table if you create geometry.

And here are some examples using the functions:
--Reverse Geocode a lat, long
SELECT google_addr(39.7439848, -105.0213316)

--Return the lat, long of an address as a text array
SELECT google_latlng('1701 Mile High Stadium # 300 Denver, CO 80204')

--Return a PostGIS geometry given an address, note this is dependent on google_latlng()
SELECT google_geocode('1701 Mile High Stadium # 300 Denver, CO 80204')

Enjoy!