- Raster: 90 meter resolution covering most of the contiguous United States. Represents available land for solar development.
- Vector: Native American lands from the Bureau of Indian Affairs. Roughly 300 distinct boundaries.
-ESRI: 113 MB
-PostGIS: Loaded in using 100x100 tiles. Table = 137 MB, spatial index = 19 MB
3. Tabulate Area - Time Trial
-ESRI: tabulate area tool finished in roughly 1 minute.
-PostGIS: Using the SQL below, finished in roughly 13 minutes. Confirmed spatial index was used.
CREATE TABLE tabulate_area AS
,SUM(ST_Count(ST_Clip(b.rast, 1, a.the_geom))* pow(90,2)) AS square_meters
FROM tribal_boundaries a
INNER JOIN available_land b ON ST_Intersects(a.the_geom, b.rast)
4. Compare Results
I found that PostGIS likes to automatically snap to the raster whereas ESRI defaults to not snapping. If not snapping in ESRI, results between ESRI and PostGIS are different (as would be expected). If snapping in ESRI, then results are exactly the same. Loaded in the ESRI results and compared the two with the following SQL:
,post.sqm AS post_sqm
,esri.sqm AS esri_sqm
,post.sqm - esri.sqm AS tabulate_diff
FROM tabulate_area AS post
LEFT OUTER JOIN esri ON post.tribe_name = esri.tribe_name;
5. Why is PostGIS so slow? It's not! Read on...
Well...12 minutes slower is simply not acceptable...so what is the reason? I decided to look at the tile size on the raster. My initial thought was, the tighter the tiles, the better the spatial index, so I set it at 100x100 or roughly 9 kilometers ((90cell sizex100tile size)/1000meters). Thinking on it, sure my ST_Intersects() was fast, but my clip was choking on all the individual tiles. So I loaded in the raster again, this time at 1000x1000tile size. Ran the query again and que bueno! New runtime for PostGIS is 1 1/2 minutes.
Lesson learned: think about tile size and the analysis you'll be performing before you load your raster.