Update on SqlGeometry and POINT EMPTY in WKB

A long time ago I discussed how SqlGeometry handles POINT EMPTY in WKB format. SqlGeometry is the implementation of the OGC GEOMETRY type for Microsoft SQL Server. In short, the message was that SqlGeometry implicitly casts POINT EMPTY to a MULTIPOINT EMPTY geometry when generating WKB output. PostGIS casts as well but, in my opinion, does it in a consistent way, outputting GEOMETRYCOLLECTION EMPTY for every type.

Following those findings, I assumed the behaviour was not quite correct, or at least I didn’t like the inconsistency, so I reported it to Microsoft Connect as a bug: SqlGeometry reports invalid type of WKB of POINT EMPTY.

Recently, I received a couple of comments from Microsoft on my report. The comments are attached to the report linked above, but I paste them below for completeness and for the archive:

Our development team for the spatial data types tells me that it is not possible to use a single value for the WKB format of any spatial data type. For the POINT EMPTY, the WKB format does not allow empty points, so we are outputting a MULTIPOINT with zero elements.
In a MULTIPOINT EMPTY, we are stripping out empty points.

The reasoning is technically correct. It’s just that Microsoft does it differently. However, as the second comment suggests, the current behaviour may change in the future:

But we might consider changing it to get consistent behavior.

Interface Versioning in C++ Video

Friends from Skills Matter have published a video of the lecture on Interface Versioning in C++ given by Steve Love last Thursday. The lecture was organised by the London chapter of ACCU.

Generally, Steve addressed the problems of DLL Hell and ABI compatibility, proposing a not-so-simple but applicable and usable solution to a number of the most common problems. Along with the video, the slides are also available, so it should be easy to grasp the idea.

I’ve received a copy of Steve’s code and I’m preparing a few more tests, which I hope to describe in detail and post here soon.

SqlGeometry and POINT EMPTY in WKB

Inspired by a question Paul Ramsey asked this morning on IRC, I’ve inspected what kind of Well-Known-Binary output SqlGeometry gives for EMPTY geometries of all seven geometry types specified in OGC SFS. The SqlGeometry class is available from SQL Server System CLR Types for .NET Framework. Here we go.

I checked Well-Known-Binary output as returned by the SqlGeometry method STAsBinary(). Here is a small test program written in C#:

using System;
using System.Linq;
using Microsoft.SqlServer.Types;
namespace SqlGeometryEmpty
{
  class Test
  {
    static void Main(string[] args)
    {
      // Build "<TYPE> EMPTY" WKT for every geometry type in the enumeration
      foreach (string type in
         Enum.GetNames(typeof(OpenGisGeometryType)))
      {
        string wkt = type.ToUpper() + " EMPTY";
        SqlGeometry geom = SqlGeometry.Parse(wkt);
        // Serialize to WKB and format the bytes as an uppercase hex string
        byte[] wkb = geom.STAsBinary().Buffer;
        string wkbhex = string.Join("",
          wkb.Select(
            b => b.ToString("X2")).ToArray());

        Console.WriteLine("{0}\n{1} ({2} bytes)\n",
          wkt, wkbhex, wkb.Length);
      }
    }
  }
}

The first observation is that the WKB of an EMPTY geometry is returned as a slightly different binary for each type. All the binary forms are truncated to nine bytes. The first byte indicates endianness, as expected. The next four bytes indicate the geometry type, exactly as defined in the OGC specifications. The remaining four bytes are set to zero and seem to act as a size specifier: the number of points in a LINESTRING, the number of rings in a POLYGON, the number of points in a MULTIPOINT, and so on. This leads to another observation: the WKB for EMPTY is reported as a collection of primitive components with zero elements.
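The nine-byte layout described above can be decoded with a few lines of C++. This is only a minimal sketch: the names WkbHeader, hex_to_bytes, read_u32 and decode_header are made up for illustration and do not come from any library. It also assumes the input is one of the nine-byte EMPTY forms shown in this post. For a non-empty POINT, the bytes after the type code are coordinates, not a count.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper type: the three fields of the 9-byte EMPTY form.
struct WkbHeader
{
    std::uint8_t byte_order;  // 0 = big-endian (XDR), 1 = little-endian (NDR)
    std::uint32_t type;       // OGC geometry type code, 1..7
    std::uint32_t count;      // number of points, rings or sub-geometries
};

// Convert a hex string such as "010400000000000000" to raw bytes.
std::vector<std::uint8_t> hex_to_bytes(const std::string& hex)
{
    if (hex.size() % 2 != 0)
        throw std::invalid_argument("odd number of hex digits");

    std::vector<std::uint8_t> bytes;
    bytes.reserve(hex.size() / 2);
    for (std::size_t i = 0; i < hex.size(); i += 2)
    {
        unsigned long v = std::stoul(hex.substr(i, 2), nullptr, 16);
        bytes.push_back(static_cast<std::uint8_t>(v));
    }
    return bytes;
}

// Read a 32-bit unsigned integer stored at p in the given byte order.
std::uint32_t read_u32(const std::uint8_t* p, bool little_endian)
{
    std::uint32_t v = 0;
    for (int i = 0; i < 4; ++i)
    {
        int shift = little_endian ? 8 * i : 8 * (3 - i);
        v |= static_cast<std::uint32_t>(p[i]) << shift;
    }
    return v;
}

// Split the buffer into byte order, type code and element count.
WkbHeader decode_header(const std::vector<std::uint8_t>& wkb)
{
    if (wkb.size() < 9)
        throw std::invalid_argument("WKB shorter than 9 bytes");

    bool little_endian = (wkb[0] == 1);
    return WkbHeader{wkb[0],
                     read_u32(&wkb[1], little_endian),
                     read_u32(&wkb[5], little_endian)};
}
```

Calling decode_header(hex_to_bytes("010400000000000000")) yields byte order 1, type 4 (MULTIPOINT) and count 0.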

The difference between the binaries that I mentioned is that the actual type of the input geometry is preserved, so there seems to be no implicit translation to a geometry of some other type.

So far so good, but not for long. There is one exception: SqlGeometry implicitly casts POINT EMPTY to a MULTIPOINT EMPTY geometry, with WKB of the following form (in hex):

010400000000000000

Here is complete output of the test program above:

POINT EMPTY
010400000000000000 (9 bytes)

LINESTRING EMPTY
010200000000000000 (9 bytes)

POLYGON EMPTY
010300000000000000 (9 bytes)

MULTIPOINT EMPTY
010400000000000000 (9 bytes)

MULTILINESTRING EMPTY
010500000000000000 (9 bytes)

MULTIPOLYGON EMPTY
010600000000000000 (9 bytes)

GEOMETRYCOLLECTION EMPTY
010700000000000000 (9 bytes)

A word about how PostGIS behaves. PostGIS reports GEOMETRYCOLLECTION EMPTY regardless of the actual type of the input EMPTY geometry. In hex form it is:

010700000000000000

Generally, there are not many ways to report an EMPTY geometry clearly and usably, and a collection with size equal to zero seems to be the most appropriate choice. POINT EMPTY reported with the type set to POINT (010100000000000000) would be ambiguous, as it looks like a truncated or invalid form of POINT(0 0), especially in programming languages like C, where native dynamically allocated arrays do not carry information about their size. In other words, the geometry type alone is not enough information to process the binary form of POINT EMPTY properly.

Reporting EMPTY geometries as a collection is a useful convention that seems to work well. PostGIS handles it in a very consistent manner, reporting one type for all empties. SqlGeometry, and so SQL Server, forces programmers to write a few more lines of code to handle all the possible cases. Yet another original, exotic solution from Microsoft.
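The difference in client code is small but real. Here is a rough sketch of the two checks, assuming the type code and element count have already been read from the nine-byte form. The constant and function names are hypothetical, chosen only for this illustration, and the checks reflect nothing more than the outputs listed in this post.

```cpp
#include <cstdint>

// OGC geometry type codes as they appear in the WKB outputs listed above
// (the constant names are labels made up for this sketch).
enum : std::uint32_t
{
    wkbPoint = 1,
    wkbLineString = 2,
    wkbPolygon = 3,
    wkbMultiPoint = 4,
    wkbMultiLineString = 5,
    wkbMultiPolygon = 6,
    wkbGeometryCollection = 7
};

// One type for all empties: a single comparison is enough.
constexpr bool is_empty_single_type(std::uint32_t type, std::uint32_t count)
{
    return type == wkbGeometryCollection && count == 0;
}

// Type-preserving empties: any of the types 2..7 with a zero count.
// Type 1 (POINT) is excluded because the bytes after the type code of a
// POINT are coordinates, not a count.
constexpr bool is_empty_any_type(std::uint32_t type, std::uint32_t count)
{
    return type >= wkbLineString && type <= wkbGeometryCollection && count == 0;
}
```

Note that with the type-preserving convention a client still cannot tell POINT EMPTY from MULTIPOINT EMPTY, since both arrive with type code 4.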

A consistent API is a blessing!

Update: a consistent specification of the interface is even better.

Kitware Developer blog launched

A few minutes ago, Bill Hoffman from Kitware posted a short message to the CMake project mailing list with an interesting announcement:

Kitware launched its first developer blog today with contributions from Company technical and business leaders.

The CMake build system is one of the main categories of topics on the Kitware blog, so I presume it may be of interest to the OSGeo Community, as CMake is slowly winning over more and more folks here :-)

The first CMake-related post is about Deploying on Windows with DLL Manifest Issue.

Another interesting post on the blog is Will Schroeder’s answer to the question Why Open Source Will Rule Scientific Computing? It’s really worth reading.

illegal token on right side of ‘::’

One libLAS user reported that using #include <liblas/lasreader.hpp> in his application, compiled with Visual C++ 10.0 from Visual Studio 2010, causes this error:

utility.hpp(253): error C2589: '(' : illegal token on right side of '::'

The error is an incarnation of a very well-known problem in Visual C++ when using elements of the C++ Standard Library, especially the Standard Template Library, in Windows API-based programs. Since libLAS uses the C++ Standard Library, so does any user application that includes the libLAS headers.

The problem is caused by the min() and max() macros defined in the windef.h header, which conflict with the Standard Library functions of the same names. Macros in C++ are scope-less evil, especially when defined in public headers under such extremely unique names as min or max. The fact that Microsoft defined them way before C++ was born absolves them to a large extent, but for the Spirit’s sake, they should learn the lesson and disable them for good in C++ mode (but not in yet another MS-specific way!). No one who’s sane needs or wants to use them!

Pie in the sky. In the meantime, C++ programmers such as the libLAS user who reported this problem have to deal with it on their own. The easiest way is to check CodeProject or Q143208, or to search (not google) for a solution such as #define NOMINMAX for the Visual C++ compiler.
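For reference, the NOMINMAX approach looks roughly like this. The sketch guards the Windows include with _WIN32 so that it builds elsewhere too, and the function smaller is just a made-up example. The one thing that matters is that the macro is defined before windows.h is included for the first time.

```cpp
// NOMINMAX tells the Windows headers not to define the min/max macros.
// It has to be visible before the first inclusion of windows.h.
#ifdef _WIN32
#  ifndef NOMINMAX
#    define NOMINMAX
#  endif
#  include <windows.h>
#endif

#include <algorithm>

// With the macros out of the way, the plain call compiles as expected.
int smaller(int a, int b)
{
    return std::min(a, b);
}
```

The same effect can be had without touching the sources by defining the macro on the compiler command line (/DNOMINMAX for Visual C++).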

However, another option is to apply a simple trick to calls of *::min() or *::max() functions (e.g. std::min() or std::max()) which effectively prevents macro substitution, so the Visual C++ compiler (or any other compiler with a similar problem) does not complain about an illegal token. A function-like macro is expanded only when its name is immediately followed by an opening parenthesis, so the trick is to wrap the fully qualified function name in parentheses:

(std::min)(x, y);

In most uses of the C++ Standard Library as described above, the trick is needed for the following functions:

(std::min)(x, y);
(std::max)(x, y);
(std::numeric_limits<T>::min)();
(std::numeric_limits<T>::max)();
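To try this without the Windows headers, the macros can be simulated by hand. In the sketch below, the two #define lines are only stand-ins for what windef.h does, and clamp_to_range and largest_int are made-up example functions. The macros are defined after the standard headers and removed again at the end so they do not leak into whatever follows.

```cpp
#include <algorithm>
#include <limits>

// Stand-ins for the function-like macros from windef.h (an assumption of
// this sketch: defined by hand so it builds on any platform).
#define min(a, b) (((a) < (b)) ? (a) : (b))
#define max(a, b) (((a) > (b)) ? (a) : (b))

// Clamp value into the range [lo, hi] using the Standard Library functions.
// The parentheses around the names keep the macros from being expanded.
int clamp_to_range(int value, int lo, int hi)
{
    return (std::max)(lo, (std::min)(value, hi));
}

// The same trick applied to a static member function.
int largest_int()
{
    return (std::numeric_limits<int>::max)();
}

// Remove the stand-ins again.
#undef min
#undef max
```

Dropping the parentheses from any of the three calls makes the preprocessor expand the macro first and the build fails.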

In case a user-defined type has a member function with exactly the same name as a macro present in global scope (macros always live in global scope!), it may be necessary to apply the very same trick when a member function is called on an object:

#include <algorithm>

template <typename T, int Size>
struct Series
{
  // the declaration needs the parentheses too if the min macro is visible
  T (min)() { return *std::min_element(s, s + Size); }
  T& operator[](int index) { return s[index]; }
private:
  T s[Size];
};

Series<int, 3> s;
s[0] = 2;
s[1] = 3;
s[2] = 1;

int m = (s.min)(); // long way, but here is the trick

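There is one side effect which may be an inconvenience: the trick disables argument-dependent name lookup (ADL, aka Koenig lookup). This only matters for unqualified calls such as (min)(x, y), because qualified names like std::min are never looked up through ADL anyway.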
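The side effect is easy to see in a tiny experiment. Everything in the sketch below (the geo namespace, Point and the two describe functions) is made up for the illustration.

```cpp
namespace geo
{
    struct Point { int x; };

    // From outside the namespace this overload is found only through ADL.
    int describe(const Point&) { return 1; }
}

// Generic fallback in the global namespace, found by ordinary lookup.
template <typename T>
int describe(const T&) { return 0; }

// Unqualified call: ADL adds geo::describe to the candidates and the
// non-template overload wins.
int with_adl()
{
    geo::Point p{42};
    return describe(p);
}

// Parenthesised name: ADL is switched off, so only the global template
// is considered.
int without_adl()
{
    geo::Point p{42};
    return (describe)(p);
}
```

Here with_adl() returns 1 and without_adl() returns 0, so the parentheses silently select a different function.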

Mouse vs keyboard quotes of the day

My two favourite quotes from today’s thread about programming with a proportional font, and other stuff, that took place on the ACCU mailing list, as busy as always:

For me, when coding I think fast and I type just as fast, and every time I have to touch that stupid mouse I curse the idiot who failed to add or, worse, removed (which seems to happen as software “evolves”) the menus/shortcuts/tabbing-logic that would allow me never to lose my thread, or efficiency.

— Matthew

and

I’ve watched people using IDEs (mainly on Windows) and usually I wonder if I’m going to die of old age before they finish carrying out various simple editing tasks by searching through menu trees, navigate through dialogue boxes, clicking on this option and that option (…) I often wonder if I should only have to work 20 hour weeks as I can get my typing-in work done twice as quickly as some other people :)

— Stewart

Signed, with both hands.

Video lecture about CMake

Bill Hoffman from Kitware gives a presentation about CMake and a pack of related tools to the happiest, most easygoing working nation on Earth:

It’s worth watching if you are interested in CMake.