Make’ing PostGIS database

I’ve extended Sean’s idea of baking PostGIS-enabled databases using GNU Make a little bit and prepared new version of Makefile.postgis.

How to use it:

  • Install it to save yourself some typing:
    $ ln -s Makefile.postgis Makefile
  • Run make to get basic usage information:
    $ make
    ****** Makefile.postgis usage ******
    *** Create new PostGIS database:
    	DBNAME=mydb make -f Makefile.postgis create
    *** Drop PostGIS database:
    	DBNAME=mydb make -f Makefile.postgis drop
    *** Check if database exists and PostGIS if enabled with PostGIS:
    	DBNAME=mydb make -f Makefile.postgis check
    *** Check if database exists:
    	DBNAME=mydb make -f Makefile.postgis check-db
    *** Check if database is enabled with PostGIS:
    	DBNAME=mydb make -f Makefile.postgis check-postgis
  • Check if your database exists:
    $ DBNAME=mydb make check-db
    ****** Makefile.postgis ******
    ****** Database 'mydb' not found
  • Create your database with PostGIS extension installed:
    $ DBNAME=mydb make create
    ****** Makefile.postgis ******
    ****** Creating database 'mydb'...
    ****** Loading PostGIS into 'mydb'...
  • Check what has been created and installed:
    $ DBNAME=mydb make check
    ****** Makefile.postgis ******
    ****** Database 'mydb' found
    ****** Makefile.postgis ******
    ****** Database 'mydb' is enabled with PostGIS
    1.4 USE_GEOS=1 USE_PROJ=1 USE_STATS=1
  • When you don’t need your database anymore, just drop it:
    $ DBNAME=mydb make drop

Run WKT Raster, run!

I have a not-so-small GeoTIFF raster dataset. Here is what GDAL has to say about it:

$ gdalinfo japan.tif
Driver: GTiff/GeoTIFF
Files: japan.tif
Size is 14000, 14000
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.2572235630016,
            AUTHORITY["EPSG","7030"]],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0],
    UNIT["degree",0.0174532925199433],
    AUTHORITY["EPSG","4326"]]
Origin = (137.386330630776030,38.325757833006122)
Pixel Size = (0.000256020461176,-0.000256020461176)
Metadata:
  AREA_OR_POINT=Area
  TIFFTAG_DOCUMENTNAME=ER Mapper 6.4
  TIFFTAG_IMAGEDESCRIPTION=ER Mapper GeoTiff raster translator V1.0: Band 1 = Red, Band 2 = Green, Band 3 = Blue
  TIFFTAG_SOFTWARE=ER Mapper 6.4
  TIFFTAG_XRESOLUTION=0
  TIFFTAG_YRESOLUTION=0
Image Structure Metadata:
  COMPRESSION=LZW
  INTERLEAVE=PIXEL
Corner Coordinates:
Upper Left  ( 137.3863306,  38.3257578) (137d23'10.79"E, 38d19'32.73"N)
Lower Left  ( 137.3863306,  34.7414714) (137d23'10.79"E, 34d44'29.30"N)
Upper Right ( 140.9706171,  38.3257578) (140d58'14.22"E, 38d19'32.73"N)
Lower Right ( 140.9706171,  34.7414714) (140d58'14.22"E, 34d44'29.30"N)
Center      ( 139.1784739,  36.5336146) (139d10'42.51"E, 36d32'1.01"N)
Band 1 Block=14000x20 Type=Byte, ColorInterp=Red
Band 2 Block=14000x20 Type=Byte, ColorInterp=Green
Band 3 Block=14000x20 Type=Byte, ColorInterp=Blue

I loaded it to PostGIS/WKT Raster table with regular blocking enabled in PostgreSQL 8.3 database running on Ubuntu 8.10 (32-bit) installed as a guest system under VirtualBox:

$ gdal2wktraster.py -r japan.tif -t japan_rb -o japan_rb.sql -k
$ psql -d test -f japan_rb.sql

The loader generated output file japan_rb.sql of size of 1.1 GB and it makes 700 records (raster tiles or blocks) in the database. The disk usage reported for the raster table is:

sistest=# SELECT relfilenode, relpages FROM pg_class WHERE relname = 'japan_rb';
 relfilenode | relpages
-------------+----------
       71700 |        5
(1 row)

I run simple test query:

$ psql -d test
test=# SET log_statement_stats TO 1;
SET
test=# SELECT rid FROM japan_rb WHERE RT_Width(rast) != 14000 OR RT_Height(rast) != 20;
 rid
-----
(0 rows)

In the PostgreSQL log I got dumped a bunch of interesting statistics:

2009-03-27 15:23:38 GMT LOG:  QUERY STATISTICS
2009-03-27 15:23:38 GMT DETAIL:  ! system usage stats:
 ! 5.084838 elapsed 2.544159 user 2.468154 system sec
 ! [4.960310 user 5.204325 sys total]
 ! 0/0 [16/28160] filesystem blocks in/out
 ! 0/320190 [0/647479] page faults/reclaims, 0 [0] swaps
 ! 0 [0] signals rcvd, 0/0 [0/0] messages rcvd/sent
 ! 0/155 [9/327] voluntary/involuntary context switches
 ! buffer usage stats:
 ! Shared blocks:      74192 read,          0 written, buffer hit rate = 51.71%
 ! Local  blocks:          0 read,          0 written, buffer hit rate = 0.00%
 ! Direct blocks:          0 read,          0 written

The virtual machine has assigned 1024 MB of RAM and one CPU Intel Xeon 3.2 GHz.

You are strongly welcome to share your comments?

Lambda, I love you!

I’m writing a small driver for readint WKT Raster data from RASTER column in PostGIS-enabled database and I want to report name of database I’m connected with. My reader eats connection string, and here I’ve fallen in love with Python lambda.

Given connstr stores connection string to PostgreSQL database in format well-known from libpq, single-line anonymous function can do the whole job:

filter(lambda db: db[:6] == 'dbname', connstr.split())[0].split('=')[1]

Complete example:

$ python
Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49)
>>> connstr = "dbname='rtest' host='localhost' user='mloskot'"
>>> filter(lambda db: db[:6] == 'dbname', connstr.split())[0].split('=')[1]
"'rtest'"
>>> connstr = "password='xxx' port=5432 dbname='rtest' host='localhost' user='mloskot'"
>>> filter(lambda db: db[:6] == 'dbname', connstr.split())[0].split('=')[1]
"'rtest'"
>>>

Have fun!