pygit-svn-mirror 0.1 released

I have been looking for an easy and quick way to mirror Subversion repositories in Git at GitHub. With a bit of reading and testing, I came up with a quite usable workflow. But, most likely due to my lack of Git fu, I wasn’t happy with it. In particular, I could not find out how to update Git mirrors from various locations and computers, or how to allow others to do that.

Lately, I found a tool written in Ruby by Eloy Durán. It is git-svn-mirror, a command-line tool that automates the task of creating a Git mirror for an SVN repository and keeping it up-to-date. I installed Eloy’s tool from Ruby gems and played with it for a while. I really liked it.

I skimmed the Ruby code of git-svn-mirror and found out it makes use of bare repositories in Git. A Git bare repository stores just the contents of the .git directory, without any files checked out around it. Long story short, this script does almost exactly what I need and if there is something it does not do, then I can add it.

I have never written a single line of code in Ruby and I don’t feel like I need to learn it now. So, I decided to port git-svn-mirror to Python. I have just pushed pygit-svn-mirror 0.1, based on git-svn-mirror 0.1, to the repository at GitHub. I have tried to follow the command-line interface and overall code structure of the original Ruby version. I have also preserved the original license and Eloy’s copyright.

A README.md file is included with a detailed guide on how to use pygit-svn-mirror. Basically, there are two commands: init and update. For each command, the --help option displays the required and supported arguments.

For example, creating a mirror of the Subversion repository of the PROJ.4 project at GitHub involves the following commands:

mkdir /path/to/proj4/mirror
cd /path/to/proj4/mirror
git-svn-mirror.py init \
  --from=https://svn.osgeo.org/metacrs/proj/ \
  --to=git@github.com:<USERNAME>/proj.4.git

and to update the mirror from its workbench directory:

cd /path/to/proj4/mirror
git-svn-mirror.py update

or from any directory, with the workbench location given explicitly:

git-svn-mirror.py update -w /path/to/proj4/mirror

Feedback, bug reports and patches are highly appreciated.

Finally, big thanks to Eloy Durán for the original git-svn-mirror written in Ruby.

Python sys.stdout redirection in C++

Lately, I have been embedding the Python interpreter and implementing plenty of Python extensions in C++ using the plain C API provided by Python 3. One of the common challenges at the C/C++ level is intercepting output sent to sys.stdout or sys.stderr by Python functions like print. The Python Embedding/Extending FAQ suggests a common solution based on Python code:

# catcher code
import sys
class StdoutCatcher:
   def __init__(self):
      self.data = ''
   def write(self, stuff):
      self.data = self.data + stuff
catcher = StdoutCatcher()
sys.stdout = catcher

This Python code can be executed by the embedded Python interpreter using PyRun_SimpleString, and the output can then be accessed by fetching attributes of the __main__ module:

PyObject* m = PyImport_AddModule("__main__");
char const* code = "... catcher code here...";
PyRun_SimpleString(code);
PyRun_SimpleString("print(3.14)");
PyObject* catcher = PyObject_GetAttrString(m, "catcher");
PyObject* output = PyObject_GetAttrString(catcher, "data");
// get textual data contained in output
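
To see what the catcher collects before wiring it into C, here is a minimal pure-Python sketch run in a regular interpreter rather than an embedded one. It is not taken from the FAQ verbatim: the flush method and the restoring of the original sys.stdout are my assumptions about what a careful version would want.

```python
import sys


class StdoutCatcher:
    """Collects everything written to it in a single string."""

    def __init__(self):
        self.data = ''

    def write(self, stuff):
        self.data += stuff

    def flush(self):
        # Nothing is buffered here, so there is nothing to flush.
        pass


catcher = StdoutCatcher()
original_stdout = sys.stdout
sys.stdout = catcher
try:
    print(3.14)
finally:
    # Always put the real stream back, even if the code above fails.
    sys.stdout = original_stdout

# The captured text includes the newline that print appends.
print(repr(catcher.data))
```

On the C side, the data attribute fetched with PyObject_GetAttrString would hold this same string.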

Such a mix of Python and C code is neither convenient to use nor a flexible solution. I simply don’t like this prosthesis, especially if I need to switch frequently between a number of output sinks.

So, I have come up with a better solution, which allows me to bind any callable C++ entity directly. The syntax I mean looks and feels like this:

int main()
{
    PyImport_AppendInittab("emb", emb::PyInit_emb);
    Py_Initialize();
    PyImport_ImportModule("emb");

    PyRun_SimpleString("print(\'hello to console\')");

    // here comes the ***magic***
    std::string buffer;
    {
        // switch sys.stdout to custom handler
        emb::stdout_write_type write =
            [&buffer] (std::string s) { buffer += s; };

        emb::set_stdout(write);
        PyRun_SimpleString("print(\'hello to buffer\')");
        PyRun_SimpleString("print(3.14)");
        PyRun_SimpleString("print(\'still talking to buffer\')");
        emb::reset_stdout();
    }

    PyRun_SimpleString("print(\'hello to console again\')");
    Py_Finalize();

    // output what was written to buffer object
    std::clog << buffer << std::endl;
}

This allows me to handle sys.stdout.write with a C++ free function, a class member function, a named function object, or even an anonymous function, as in the example above where I use a C++11 lambda.
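
For readers who want to play with the idea without building the module, the concept can be sketched in pure Python. This is only an analogue, not how emb is implemented: CallableWriter is a hypothetical name, and contextlib.redirect_stdout from the standard library stands in for emb::set_stdout and emb::reset_stdout.

```python
from contextlib import redirect_stdout


class CallableWriter:
    """File-like object that forwards every write() to a callable."""

    def __init__(self, write):
        self._write = write

    def write(self, text):
        self._write(text)
        return len(text)

    def flush(self):
        # Nothing is buffered here, so there is nothing to flush.
        pass


chunks = []

print('hello to console')
# Any callable taking a string will do: a function, a bound method, a lambda.
with redirect_stdout(CallableWriter(chunks.append)):
    print('hello to buffer')
    print(3.14)
print('hello to console again')

captured = ''.join(chunks)
```

After the with block, sys.stdout is back to whatever it was before, and captured holds only the two lines printed inside the block.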

Complete implementation of the emb module in C/C++ using plain Python C API is available from my Python workshop at GitHub:

git clone git://github.com/mloskot/workshop.git

The complete code is in the python/emb/emb.cpp file. Note that this is a minimal example presenting the essential concept. Production-ready code certainly needs more attention to reference counting of PyObject, getting rid of global state, and so on.

Lambda, I love you!

I’m writing a small driver for reading WKT Raster data from a RASTER column in a PostGIS-enabled database and I want to report the name of the database I’m connected to. My reader eats a connection string, and here I’ve fallen in love with Python lambda.

Given that connstr stores a connection string to a PostgreSQL database in the format well known from libpq, a single-line expression built around an anonymous function can do the whole job:

filter(lambda db: db[:6] == 'dbname', connstr.split())[0].split('=')[1]

Complete example:

$ python
Python 2.5.2 (r252:60911, Oct  5 2008, 19:24:49)
>>> connstr = "dbname='rtest' host='localhost' user='mloskot'"
>>> filter(lambda db: db[:6] == 'dbname', connstr.split())[0].split('=')[1]
"'rtest'"
>>> connstr = "password='xxx' port=5432 dbname='rtest' host='localhost' user='mloskot'"
>>> filter(lambda db: db[:6] == 'dbname', connstr.split())[0].split('=')[1]
"'rtest'"
>>>
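
The one-liner relies on filter returning a list, as it does in the Python 2.5 session above. In Python 3, filter returns an iterator that cannot be indexed, so here is a small sketch of the same lookup for Python 3. The function name dbname_from_connstr is hypothetical, stripping the quotes is my addition, and, like the original, it assumes that values contain no spaces.

```python
def dbname_from_connstr(connstr):
    """Return the dbname value from a libpq-style keyword/value string."""
    for token in connstr.split():
        key, sep, value = token.partition('=')
        if key == 'dbname' and sep:
            # Drop the single quotes that may surround the value.
            return value.strip("'")
    return None


connstr = "password='xxx' port=5432 dbname='rtest' host='localhost' user='mloskot'"
name = dbname_from_connstr(connstr)

# The lambda version still works once the first match is taken with next().
quoted = next(filter(lambda db: db[:6] == 'dbname', connstr.split())).split('=')[1]
```

Here name is the bare database name, while quoted keeps the surrounding single quotes just like the original expression.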

Have fun!

FDO goes for Python

A few hours ago, guys from the core development team of Feature Data Objects submitted new cool stuff to the FDO repository – Python scripting support for FDO API.

The Python bindings are generated with SWIG. As Greg Boone reported on fdo-internals, the Python bindings can currently be generated and used on Windows only. Linux support will be available soon. Also, they have only been tested with Python 2.4 so far, and the 2.5 line is not supported yet.

The Python scripting support for Feature Data Objects is still under development, but I’m sure it’s great news for the large community of Python users in the geospatial field. This is another step towards attracting potential FDO users.

There is more about pyFDO subject on Jason Birch’s blog: pyFDO is in the House – Yeah Baby!

Running PyLint from Komodo

Thanks to John’s Get Komodo for free post, I started to use Komodo on my laptop. It’s really well-done software, so I put Vim away (for a while or longer :-)) and started to develop my scripts using Komodo.

A few days ago I also read John’s post about the PyLint tool. I had been looking for something like this, as well as for a comprehensive style guide for Python. John points out both in his post.

Update: The command package has been updated: a PyLint output parser is now included. Now you can navigate to every line of a script reported by PyLint with a single click! I’d like to say BIG THANKS to Trent for this excellent solution.

Update: I’m still experimenting with Komodo customizations, so here is the Run PyLint command, available as a package ready to import into your Komodo.
