Archive for the 'Locality and Space' Category

Are you more generative than consumptive in your field?

Posted in Locality and Space, OpenLayers, Social, Software on May 26th, 2009 at 10:57:47

Anselm just posted what appears to be a random thought on twitter:

Are you more generative than consumptive in your particular field? … Create more than you consume?

In open source, I often rephrase this question as “Are you a source, or a sink?”

There are many people in the community who contribute more than they consume: organizations, individuals, and so on. There are also many sinks in the community (since entropy is ever increasing, this seems a foregone conclusion), and one of the key things that causes an open source project to succeed or fail is the number of sources versus sinks.

I personally try very hard to be a source in all that I do, rather than a sink. One way that I do this is that I try very hard to always follow up any question I ask (on a mailing list, on an IRC channel, or what have you) with at least two answers of my own. This means that, for example, when I hopped into #django to ask about best practices for packaging apps, I stuck around and helped out two more people: one who was asking a question about PIL installation, and one about setting up foreign keys to different models.

Now, in the end, my answers were simple — no one with even a basic knowledge of Django would have had problems answering them. But by sticking around and answering them, I was able to make up to some extent for the time/energy that I consumed from someone more familiar with the project, by saving them from needing to answer as well.

It is often the case that users trying to get help will claim that once they get help, they will ‘contribute back’ to the community by, for example, writing documentation. This never happens. Though there are exceptions to every rule, it is almost always the case that users who ask a question, prefacing it with “I will document this for other users”, never follow through on the latter half. The exceptions to this, or rather the alternate cases, are cases where a user has already invested significant research, and likely already started writing documentation. Unless the process is started before the problem is solved, it is almost universally true, in my experience, that the user will act as a sink, taking the information from the source and disappearing with it.

I work very hard on supporting a number of open source projects that I work on. Though my involvement lately has been more hands off — by doing things like writing documentation instead of answering questions, acting as a release manager instead of fixing bugs, and so on — I work very hard to keep the karmic balance of my work on the positive side. I believe that this pays off in the long run — I have somewhat of a reputation of being helpful, which is beneficial to me since it means I’m more likely to receive help when I need it. I also work to keep karmic balance high on the part of the organization I work for, since many of the other people in the organization are less able to keep karmic balance high.

These rules don’t apply solely to open source — I have the same karmic balance issues going on in my work inside of MetaCarta — but I maintain the same attitude there. Coming in with the idea that it is okay to be a sink can lead to a nasty precedent. In the end, I think that everyone loses. Sinks — both in open source and other karmic ventures — will eventually use up the karma they start with, and be left out to dry. It is the case for more than one person that they have extended their information seeking without contributing back beyond the point where I am willing to continue to support their information entropy.

I joke sometimes about giving out “crschmidt karma points”. Though I don’t have an actual system in this regard, I do quite clearly delineate between constant sinks, and regular sources, and grey areas in-between. I try to stay on the source side, and I encourage anyone else to do the same — even if it’s only by answering easy questions on the mailing list, or doing a bit more research on your own. Expecting other people to fix your problems, in open source or otherwise, is simply a false economy of help, since in the end, it simply doesn’t work.

Toronto Code Sprint: Day 2

Posted in Locality and Space, Mapserver, OSGeo, PostGIS, Toronto Code Sprint on March 8th, 2009 at 22:44:32

Day 2 of the code sprint seemed to be much more productive. With much of the planning done yesterday, today groups were able to sit down and get to work.

Today, I accomplished two significant tasks:

  • Setting up the new OSGeo Gallery, which is set to act as a repository for demos of OSGeo software users in the same way that the OpenLayers Gallery already does for OpenLayers. We’ve even added the first example.
  • TMS Minidriver support for the GDAL WMS Driver: Sitting down and hacking out a way to access OSM tiles as a GDAL datasource, Schuyler and I built something which is reasonably simple/small — an 18k patch including examples and docs — but allows for a significant change in the ability to read tiles from existing tileset datasources on the web.
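To give a rough idea of what reading from a web tileset involves, here is a minimal Python sketch of how tiles are addressed in the OSM-style scheme. This is not the GDAL patch itself; the function names and the example.org URL template are made up for illustration. It also shows the row flip between OSM-style numbering (rows counted from the top) and TMS numbering (rows counted from the bottom).

```python
import math


def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) index of the spherical-mercator tile containing
    a longitude/latitude, using OSM-style numbering (y counts from the top)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still map to a valid tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y


def tms_row(y, zoom):
    """Convert a top-origin row (OSM style) to a bottom-origin row (TMS style)."""
    return (2 ** zoom) - 1 - y


def tile_url(template, zoom, x, y):
    """Fill a URL template; the template itself is a hypothetical example."""
    return template.format(z=zoom, x=x, y=y)


# Hypothetical tile server, for illustration only.
EXAMPLE_TEMPLATE = "http://tiles.example.org/{z}/{x}/{y}.png"
x, y = lonlat_to_tile(0.0, 0.0, 1)
print(tile_url(EXAMPLE_TEMPLATE, 1, x, y))
```

A driver built on this idea only has to work out which tiles cover the requested window, fetch them, and paste them into the output raster.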

Other things happening at the sprint today were more WKT Raster discussions, liblas hacking, and single-pass MapServer discussions, as well as some profiling of MapServer performance with help from Paul and Shark. Thanks to the participation of the LizardTech folks, I think there will also be some performance testing done with MrSID rendering within MapServer, and there was, as always, more of the “proj strings are expensive to look up!” discussion.

Other than that, it was a quiet day; lots of work getting done, but not much excitement in the ranks.

We then had a great dinner at Baton Rouge, and made it home.

This evening, I’ve been doing a bit more hacking, opening a GDAL Trac ticket for an issue Schuyler bumped into with the sqlite driver, and pondering the plan for OpenLayers tomorrow.

As before, a special thanks to the conference sponsors for today: Coordinate Solutions via David Lowther, and the lovely folks at SJ Geophysics Ltd. Thanks for helping make this thing happen! I can guarantee that neither of those GDAL tickets would have happened without this time.

Toronto Code Sprint: Day 1

Posted in Mapserver, OSGeo, PostGIS, Toronto Code Sprint on March 8th, 2009 at 07:55:43

I’m here at the OSGeo Code Sprint in Toronto, where more than 20 OSGeo hackers have gathered to work on all things OSGeo — or at least MapServer, GDAL/OGR, and PostGIS.

For those who might not know, a code sprint is an event designed to gather a number of people working on the same software together with the intention of working together to get a large amount of development work done quickly. In this case, the sprint is a meeting of the “C tribe”: Developers working on the C-based stack in OSGeo.

After some discussion yesterday, there ended up being approximately 3 groups at the sprint:

  • People targeting MapServer development
  • PostGIS developers
  • liblas developers

(As usual, I’m a floater, but primarily concentrating on OpenLayers; Schuyler will be joining me in this pursuit, and I’ve got another hacker coming Monday and Tuesday to sprint with us.)

The MapServer group was the most lively discussion group (and is also the largest). It sounded like there were three significant development discussions that were taking place: XML Mapfiles, integration of pluggable rendering backends, and performance enhancements, as well as work on documentation.

After a long discussion on the benefits/merits of XML mapfiles, it came down to there being one main target use case for the XML mapfile: encouraging the creation and use of more editing clients. With a format that can be easily round-tripped between client and server, you might see more editors able to really speak the same language. In order to test this hypothesis, a standard XSLT transform will be created and documented, with a tool to do the conversion; this will allow MapServer to test out the development before integrating XML mapfile support into the library itself.

I didn’t listen as closely to the pluggable renderers discussion, but I am aware that there’s a desire to improve support and reduce code duplication of various sorts, and the primary author of the AGG rendering support is here and participating in the sprint. Recently, there has been a proposal to the list to add OpenGL based rendering support to MapServer, so this is a step in that direction.

The PostGIS group was excited to have so many people in the same place at the same time, and I think came close to skipping lunch in order to get more time working together. In the end, they did go, but it seemed to be a highly productive meeting. Among their discussions was a small amount of discussion on the WKTRaster project, which I believe is currently ongoing.

After our first day of coding, we headed to a Toronto Marlies hockey game. This was, for many of us, the first professional hockey we’d ever seen. (The Marlies are the equivalent of AAA baseball; one step below the major leagues.) The Canadians in the audience, especially Jeff McKenna, who played professional hockey for a time, helped keep the rest of us informed. The Marlies lost 6-1, sadly, but as a non-Canadian, I had to root a bit for the Hershey team. (Two fights did break out; pictures forthcoming.)

We finished up with a great dinner at East Side Mario’s.

A special thanks to our two sponsors for the day, Rich Greenwood of Greenwood Map and Steve Lehr from QPUBLIC! Our sprint was in a great place, very productive, and had great events, thanks to the support of these great people.

Looking forward to another great day.

Geodata Cost Recovery: Eaton County

Posted in Locality and Space on February 25th, 2009 at 08:37:47

I was pointed to Eaton County’s GIS Data Prices last night, and all I can say is how disappointed I am that people can still feel that this is an appropriate way to fleece their taxpayers. The data is already collected; reproduction costs for the data are probably in the realm of a couple hundred bucks, or less if you just distribute them online. (Clearly, you already have a website.) Yet you charge twelve *thousand* dollars for copies, and even after that, you’re still limited in what you can do.

This kind of thing is just a damn shame. Taxpayers should insist that this data be made available at reasonable reproduction costs; GIS department policies of making money off of these things are simply silly so long as the data is collected with taxpayer dollars.

(If the GIS department does not receive state funding, then I suppose this type of cost recovery makes sense, in the same way that Sanborn or any other commercial entity would charge for it. However, I suspect that the primary client of such data is the state itself, in which case it’s still taxpayer dollars covering the costs somewhere…)

Yahoo! Maps APIs, aka ‘grr, argh!’

Posted in Locality and Space, OpenStreetMap on February 16th, 2009 at 15:14:00

I have a love/hate relationship with Yahoo!’s mapping API. It’s lovely that Yahoo! believes, unlike Google and other mapping providers, that their satellite data is a suitable base layer to use for derivation of vectors. This openness really is good to see — they win big points from me in this regard. (Google, on the other hand, is happy to have you give them data against their satellite imagery, but letting you actually have it back is against the Terms of Service.)

However, the Yahoo! Maps AJAX API has never gotten much love. I think that a preference for flash has always existed in the Yahoo! world; iirc, their original API was Flash.

However, I realized today that this tendency to leave the AJAX API in the dust has resulted in something that seriously affects me: The Yahoo! maps AJAX API uses a different set of tiles, which has two fewer zoom levels available in it:

[Images: AJAX Maps, most zoomed in; Flash Maps, most zoomed in]

For the new OpenStreetMap editor I’m working on, this is a *serious* difference: although the information actually available in these tiles isn’t *that* much higher, it allows the user to extract more information by getting in a bit more, and to be more precise in placement of objects when using Yahoo! as a basemap.

Although it would be relatively easy to rip the tiles out, and create an OpenLayers Layer class that loaded them directly, this violates the Yahoo! Terms of Use. This is understandable, but unfortunate, because it means I can’t solve the problem with my own code.

What I would really love to see is more providers creating a more friendly way of accessing their tiles. I understand the need for counting of accesses, and the need for copyright notifications. If an API were published, that allowed you to:

  • Fetch a copyright notice for a given area, possibly also returning a temporary token to use
  • Following that, fetch tiles to fill that area
  • Require users to display the copyright notice in such a way as to make Yahoo! and their providers happy

This would allow for building a layer into OpenLayers which complied with this, without depending on Yahoo! to write a Javascript layer that did these things for me.

Now, it’s understandable that this doesn’t happen — having the client out of control of Yahoo! means that they can’t *enforce* that the copyright is displayed prominently, as they are able to (to some extent) with their API. However, I think that this type of API would allow more innovation, and possibly even a *more* prominent placement for Yahoo’s copyrights and notices. For example, in many mapping apps, the bottom inch of the map is not seen much by the users. If there was an API to get text to display, then an application could display the text in a more prominent location, rather than burying it under many markers or other pieces of text that might overlap it.

In the short term, all I really wish was that the AJAX API used the apparently-newer set of satellite tiles that the Flash API appears to have access to. I think the fact that this isn’t currently possible leads to an alternative access pattern for tiles, one which may make more sense in the long run, where tiles can be used by an application without necessarily running in the constrained Javascript API that these providers have the ability to write. And of course, if you want to provide your users with a ‘default’ API to use, you can always use OpenLayers, and extend it to include your own extensions…

polyshp2osm

Posted in Locality and Space on January 5th, 2009 at 00:47:40

For ages, people have been asking me to help them with shapefile to OSM conversion, because I wrote one of the scripts that got used a lot for different conversion projects. Since I get a fair amount of email on this, I figured it was worth blogging that I’ve actually put together a newer script from scratch that does something similar, though for Polygons instead of lines.

One of the benefits of this script is that it was written from the start with the intention of sharing it, which was never the case for the other script that I wrote. This means that it is slightly more readable; at least, the unreadable parts are better separated. (I will admit that there are several aspects of it that are terribly un-Pythonic.)

You can find the code in OSM’s shp2osm directory.

Some aspects of this code:

  • It is designed to help you create .osm files you can read/merge in JOSM, so it has the ability to do vertical striping across a dataset in order to create geographically ordered smaller datasets.
  • It has an option to limit the number of objects per .osm file; this defaults to 50,000, which in some cases was about JOSM’s limit. (In others, it seemed lower; it’s adjustable via a command line switch.)
  • Uses optparse, which means --help does help you (once you read the initial docstring and get yourself started).
  • Supports both direct tag mappings (shapefile-attr -> OSM-attr) as well as custom functions to add more tags based on multiple attributes of a feature. It was built for the MassGIS OpenSpace Layer upload (which is now in progress), so it needed more advanced tagging possibilities.
  • Supports saving the original shapefile data ‘automatically’, to create the possibility of recreating the original shapefile attributes. These attributes are namespaced so as to minimize collisions.
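The tagging and splitting ideas in the list above can be sketched roughly as follows. This is a simplified illustration, not the code from polyshp2osm: the function names, the attribute names (SITE_NAME, PRIM_PURP) and the "example" namespace are all invented for the example.

```python
def build_tags(attrs, direct_map, custom_funcs=(), namespace=None):
    """Build an OSM tag dict from one feature's shapefile attributes.

    direct_map:   {shapefile attribute name: OSM tag key}
    custom_funcs: callables taking the attribute dict and returning extra tags
    namespace:    if set, every original attribute is also saved as
                  '<namespace>:<attribute>' so it can be recovered later
    """
    tags = {}
    # Direct one-to-one mappings.
    for shp_key, osm_key in direct_map.items():
        value = attrs.get(shp_key)
        if value not in (None, ""):
            tags[osm_key] = str(value)
    # Custom functions can look at several attributes at once.
    for func in custom_funcs:
        tags.update(func(attrs))
    # Namespaced copies of the originals, to minimize collisions.
    if namespace:
        for key, value in attrs.items():
            if value not in (None, ""):
                tags["%s:%s" % (namespace, key)] = str(value)
    return tags


def chunk(features, limit=50000):
    """Yield successive lists of at most `limit` features, one per output file."""
    for start in range(0, len(features), limit):
        yield features[start:start + limit]


def recreation_tags(attrs):
    """Hypothetical custom function: the attribute and its code are made up."""
    if attrs.get("PRIM_PURP") == "R":
        return {"leisure": "park"}
    return {}


feature = {"SITE_NAME": "Town Common", "PRIM_PURP": "R"}
print(build_tags(feature, {"SITE_NAME": "name"}, [recreation_tags], "example"))
```

Keeping the mapping as plain data and the special cases as small functions is what makes it practical to reuse the same script for a different shapefile.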

The script was also used to convert the MassGIS buildings layer, which means that there are now building outlines for metro Boston slowly appearing on the map.

If you have polygon data, this tool may be helpful to you. If you don’t have polygon data, this tool also may be helpful to you, as a better demonstration of how to map shapefiles into OSM data without writing all the code yourself.

I’m not likely to be doing a lot of support for this, but I wanted to let people know, because I personally think the code is much much more readable than the last shapefile conversion tool I wrote. (Also, it’s not every day I get to collaborate on OSM with Tim Berners-Lee.)

OpenAerialMap Project Update

Posted in Locality and Space, OpenAerialMap on December 18th, 2008 at 20:24:06

For the past 6 months, the OpenAerialMap project has been in a state of … well, stagnation would be a nice way to put it. I’ve just sent an email to the mailing list outlining the status as I see it, and I would love to see feedback and opinions on the list.

The biggest problem with OAM is that it never developed a community around it. My hope is that with an increase in interest in the past $shortWhile, there is sufficient interest to build a community this time around, and with that, enable the project to succeed in a way that it couldn’t 6 months or a year ago.

Using Jython + GeoTools

Posted in Jython, Locality and Space on December 15th, 2008 at 12:34:32

So, after a weekend of working with Java and GeoTools, suddenly things got a lot simpler to work with in Jython. This is a case of trying to do the completely wrong thing because I don’t know anything about the project, packaging… or language.

Thankfully, I’m now much better off, and have put together a little HTTP server that runs and gives me back WKT for any EPSG code, using Jython + GeoTools.

The code is very simple: the geotools-epsg-server just takes in a code, and spits out WKT. It uses the GeoTools CRS package, and a BaseHTTPServer.
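For anyone curious about the shape of such a server, here is a minimal sketch. It is not the geotools-epsg-server code: it is written for plain Python 3 (http.server rather than the Python 2 BaseHTTPServer module), and a small placeholder dictionary stands in for the GeoTools lookup. The names parse_code, lookup_wkt, EPSGHandler and serve are invented for the example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Stand-in for the real lookup. The actual server asks GeoTools for the
# definition; the value below is a placeholder, not real WKT.
WKT_BY_CODE = {
    "4326": "PLACEHOLDER WKT for EPSG:4326",
}


def parse_code(path):
    """Extract a numeric EPSG code from a request path such as '/4326'."""
    code = path.strip("/").split("?")[0]
    return code if code.isdigit() else None


def lookup_wkt(code):
    """Return WKT for a code, or None if the code is unknown."""
    return WKT_BY_CODE.get(code)


class EPSGHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        code = parse_code(self.path)
        wkt = lookup_wkt(code) if code else None
        if wkt is None:
            self.send_error(404, "Unknown EPSG code")
            return
        body = wkt.encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Run the server until interrupted."""
    HTTPServer(("localhost", port), EPSGHandler).serve_forever()
```

Calling serve() and then requesting http://localhost:8000/4326 would return the stored text; swapping lookup_wkt for a real registry lookup is the only part that depends on the Java side.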

It’s not much, but it’s enough to get me in the right direction. Maybe I’ll even stop whining about Java so much, since I can use it more like Python…

Nah, that wouldn’t be any fun.

Jython + TileCache/FeatureServer: It Just Works

Posted in ESRI, FeatreServer, FeatureServer, TileCache, default, spatialreference.org on December 14th, 2008 at 10:37:04

Earlier today, I tried Jython for the first time, because I’m doing some work that may involve interactions with Java libraries in the near future. Jython, which I’ve always avoided in the past due to an irrational fear of Java, is “an implementation of the high-level, dynamic, object-oriented language Python written in 100% Pure Java, and seamlessly integrated with the Java platform.” (I love projects that have great one-liners that I can copy paste.)

My goal for Jython was to do some work with the GeoTools EPSG registry code related to SpatialReference.org. Sadly, I didn’t get that working, but in the process, I learned that Jython now has a beta version which is up to Python 2.5 — much newer than the 2.2 that had previously been available.

With that in hand, I decided to see if I could get some of my other Python projects running under Jython. I’m the maintainer for both TileCache and FeatureServer — two pure Python projects. Theoretically, these projects should both work trivially under Jython, but I’ve always had my doubts/fears about this being the case. However, it turns out that my fears here are entirely unfounded.

I downloaded the FeatureServer ‘full’ release from featureserver.org: this includes the supporting libraries needed to get a basic FeatureServer up and running. I then tried to run the FeatureServer local HTTP server… and it worked out of the box. I was able to load the layer, save data to it, query it, etc. with no problems whatsoever. Java has support for the DBM driver that FeatureServer uses by default, so out of the box, I was able to use FeatureServer with Jython without problems.

Next came TileCache. TileCache was originally built to support Python all the way back to 2.2, so I wasn’t expecting many problems. Getting it running turned out to be almost as easy: the only code modification that was needed was a minor change to the disk cache, because Jython doesn’t seem to support the ‘umask’ method. Once I changed that (now checked into SVN), Jython worked just as well with TileCache as it did with FeatureServer.
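A minimal sketch of that kind of guard, assuming the problem is simply that the os module lacks umask on some runtimes. This is not the actual change checked into SVN, and the helper name is made up.

```python
import os


def set_umask_if_supported(mask, os_module=os):
    """Set the process umask where the runtime provides os.umask.

    Returns the previous umask, or None when umask is unavailable
    (as described above for the Jython beta), in which case nothing is changed.
    """
    umask = getattr(os_module, "umask", None)
    if umask is None:
        return None
    return umask(mask)


# Typical use around creating cache directories: loosen the mask,
# do the work, then restore the old value if there was one.
previous = set_umask_if_supported(0o002)
try:
    pass  # create directories / write tiles here
finally:
    if previous is not None:
        os.umask(previous)
```

Checking for the attribute rather than the interpreter name keeps the code working if a later Jython release adds the call.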

Clearly, there are some things which are less trivial. The reason that these libraries were so easy to use is that they were designed to be low-dependency: TileCache and FeatureServer default paths are both entirely free of compiled code. Using something like, for example, GDAL Layers in TileCache, would be much more difficult (if it’s even possible).

However, this presents some interesting capabilities I had not previously thought of.

For FeatureServer, this means that it may be possible to write a DataSource which accesses SDE using the ArcSDE Java API, ESRI’s supported method for accessing their SDE databases. One of the purported “holy grails” of the GIS world is RESTful access to SDE databases via lightweight servers — Jython may provide a path to that, if someone is interested in it. (It may be that this has become a moot point with the release of the ESRI 9.3 REST API — I’m not really sure.) This may be a waste of time, but the fact that it *can* be done is interesting to me. Edit: Howard points out that ArcSDE read/write support exists in OGR 1.6, so this is a moot point; you can simply use OGR to do this without involving Jython/Java.

I think this might also speak to a possibility of having better answers available for people who want to use things like FeatureServer from Java platforms (though I don’t know enough about jython to be sure): the typical answer of “use GeoServer” is great, but to be able to provide something a bit more friendly would be interesting. Thankfully, the Java world is largely catching up to the advances made in TileCache/FeatureServer, so this is also less urgent than it has been in the past.

In the end, this was likely simply an interesting experiment. However, it’s nice to know that the capabilities to do things like this within Jython are improving, and that Jython continues to advance its Python support. The 2.2 release still being the ‘current’ one is disappointing, but seeing a 2.5 beta available is an exciting development.

As I said, the current version of FeatureServer works out of the box with Jython, and I’ll be doing a TileCache release shortly that will work with Jython out of the box as well. It’s neat to see more possibilities for using these libraries I’ve spent so much time on.

Open Source Project Documentation: OpenLayers

Posted in Locality and Space, OpenLayers, Social on December 11th, 2008 at 16:38:34

Earlier today, I read a post on the OpenGeo GeoSpiel blog calling for OpenLayers to get on the “Usable Documentation Bandwagon”.

Now, I’ll be honest: I followed the links that he offered, and found something that is not, to me, much more convincing than the OpenLayers documentation as it stands today. If I look at ESRI’s JSAPI, I see a couple things wrong right off the bat — like the fact that I can’t actually link to where I want to. In any case, if you click Geometry -> Point, you see a document which, to me, is a lot less interesting than the similar page provided by OpenLayers.

Perhaps there is some subset of documentation put together by ESRI in their JSAPI that makes sense, but for an API reference, it seems to me that OpenLayers does reasonably well on the portions of our API that someone has invested time to work on.

This doesn’t mean we’re “done” by any stretch of the imagination: One of the things that I’ve wanted to do for ages is actually sit down and do a thorough review of the OpenLayers API documentation and improve a lot of it, targeting cross-linking with other documentation and examples especially. These types of tasks, however, are the types of tasks that require a lot of time, and don’t have a lot of immediate benefit. Since all work on OpenLayers for the past several months has been my personal free time after work, there’s only so much I’m personally able to do. However, no one, including OpenGeo, has *ever* made a request to the OpenLayers team to be able to do this work that I’ve seen. It seems to me that whit is experienced and knowledgeable about OpenLayers, since I’ve seen him using it for development, but I’ve never seen him ask to participate in improving the OpenLayers API documentation on the mailing lists, nor have I seen a request of this nature from any other organization.

To me, this means that it’s likely that our API documentation does, to some extent, meet the needs of the organizations using OpenLayers. It’s not perfect — nothing is — but no one thinks it’s a major stumbling block that’s worth fixing, at least not enough to spend money on it.

This is exactly the type of reason that OpenLayers is now accepting Sponsorship from organizations looking to support the project. This type of improvement is exactly the kind that project sponsorship can help support.

OpenLayers is aware of how important documentation is to the success of the project. We have invested dozens of hours between a dozen different contributors to create and maintain a set of relatively complete API documentation. According to Ohloh, of the 55,000 lines in OpenLayers Javascript, over 30% are comments: over 25,000 lines of what is mostly API documentation. No one is ignoring the problem — and if the current state is insufficient (as everything in the project is, *especially* to new users and beginners), then we’re very open to help from any and all interested parties.

API documentation isn’t the only thing a project needs, of course. Documentation comes in all forms, and OpenLayers is seriously lacking in a lot of documentation targeted towards beginners of all kinds. We’ve been working to change that, with a new documentation site available, and other efforts targeting documentation of OpenLayers in English and other languages. I would say these efforts are much less complete than the API documentation, and starting on them is far more important, in my opinion, than improving our API documentation is at this time.

OpenLayers is a large library. It’s used by many organizations. We’re open to contributions of all types: never have I seen OpenLayers intentionally turn away someone who wanted to help with documentation. We have regularly worked with contributors to help them improve the documentation. To claim that we are ignoring the need for documentation seems to me to reflect a lack of knowledge of the tools that the project uses for documentation, not a lack in the goals of the project, which make documentation of functionality (via API docs and minimal examples demonstrating functionality) a requirement of almost all new code in the library.