How PostGIS can help SQL Server users?

I may be a gonzo or it’s just that today I didn’t have my notorious 4th coffee in my favourite Winnie The Pooh cup I got from Pantera on our 14th (or 15th?) anniversary we celebrated a month ago, so…

Apparently, there are situations in which PostGIS could be an affordable anti-GML vaccine jab. It seems there is a potential market for PostGIS to conquer. Perhaps it wouldn’t be estimated as profitable as the H1N1 but who knows what will happen if no one takes a brave stand and stop GML designers! Here I’d eagerly conclude with one of the famous Scottish sentences :-)

Back to the subject matter. Today, I spotted an interesting question on the StackOverflow archives: Is it possible to export spatial data from Sql Server 2008 in gml2 format?. Natively? No, there is no such solution. Presumably, Microsoft thinks forward and thinks GML 2 is a legacy standard. Fair enough, someone has to draw a line between prehistoric and modern, somewhere. Why Microsoft? Again?

Facing such a tremendous suffer Microsoft exposed SQL Server users to, I suggested to visit the “underworld” for a while and hire PostGIS to do the dirty job.

Paraphrasing Andrei Alexan­dres­cu‘s, hysterically famous recently, sentence: SQL Server should go!.

Make’ing PostGIS database

I’ve extended Sean’s idea of baking PostGIS-enabled databases using GNU Make a little bit and prepared new version of Makefile.postgis.

How to use it:

  • Install it to save yourself some typing:
    $ ln -s Makefile.postgis Makefile
  • Run make to get basic usage information:
    $ make
    ****** Makefile.postgis usage ******
    *** Create new PostGIS database:
    	DBNAME=mydb make -f Makefile.postgis create
    *** Drop PostGIS database:
    	DBNAME=mydb make -f Makefile.postgis drop
    *** Check if database exists and PostGIS if enabled with PostGIS:
    	DBNAME=mydb make -f Makefile.postgis check
    *** Check if database exists:
    	DBNAME=mydb make -f Makefile.postgis check-db
    *** Check if database is enabled with PostGIS:
    	DBNAME=mydb make -f Makefile.postgis check-postgis
  • Check if your database exists:
    $ DBNAME=mydb make check-db
    ****** Makefile.postgis ******
    ****** Database 'mydb' not found
  • Create your database with PostGIS extension installed:
    $ DBNAME=mydb make create
    ****** Makefile.postgis ******
    ****** Creating database 'mydb'...
    ****** Loading PostGIS into 'mydb'...
  • Check what has been created and installed:
    $ DBNAME=mydb make check
    ****** Makefile.postgis ******
    ****** Database 'mydb' found
    ****** Makefile.postgis ******
    ****** Database 'mydb' is enabled with PostGIS
    1.4 USE_GEOS=1 USE_PROJ=1 USE_STATS=1
  • When you don’t need your database anymore, just drop it:
    $ DBNAME=mydb make drop

WKT Raster Cruncher

I’ve been playing with some nice amount of pixels using WKT Raster engine recently. Today, I tiled the japan.tif (gdalinfo output is here) using 3 different sizes of tile block. All the work was done using gdal2wktraster.py loader available from WKT Raster repository. Below, I included more interesting numbers which may be helpful for others who will decide to juggle rasters in PostGIS database.

Tiling schemes applied on japan.tif file:

  1. Using natural block size reported by GDAL gives 700 tiles, 14000 x 20 pixels each:
    gdal2wktraster.py -r japan.tif -t japan -o japan.sql -k
  2. Using block size 100 x 100 pixels gives 140 x 140 = 19600 tiles:
    gdal2wktraster.py -r japan.tif -t japan_100 -o japan_100.sql -k -m 100x100
  3. Using block size 200 x 200 pixels gives 70 x 70 = 4900 tiles:
    gdal2wktraster.py -r japan.tif -t japan_100 -o japan_200.sql -k -m 200x200

Sizes of all .sql text files produced are close to 1.2 GB. The timing for gdal2wktarster falls to around 40 minutes on Ubuntu 8.10 (i386) running under VirtualBox 2.with assigned single CPU core of Intel Xeon 3.2 GHz and 1024 MB RAM (host machine: 4 x Intel Xeon 3.2 GHz with 16 GB RAM). psql loads single table in about 6 minutes:

psql -d japan -f japan_100.sql

Interestingly, memory usage during processing WKT Raster by gdal2wktraster.py (Python run-time) seems not high (10-20%) and was on constant level of ~85 MB. However, 100% of CPU power was being eaten. It’s got me interested in comparing these results vector-brother of gdal2wktraster, to the shp2pgsql from PostGIS.