Haiti Crisis Map Effort

Posted in default on January 29th, 2010 at 17:38:31

One of the most difficult things to do in a time of disaster is to quickly organize, marshal, and present resources. This applies across all aspects of disaster response — whether it be managing and distributing food, organizing volunteers, or setting up technical resources to assist with the relief effort.

The last is the field where I obviously have the most experience and ability to help, especially with regard to mapping. In past situations, I have put some of my map expertise to work in helping to create a resource for the disaster; the last significant case for me was in 2007, when I managed a ton of imagery made available as part of the response to the San Diego wildfires. (That map is still available, though it's a bit worse for wear at this point.)

When the Haiti crisis happened, I let it slide at first; I figured that someone else would step up to manage the data this time. After a while, though, I saw an increasing number of imagery sources and little coherent organization of the resources by a single party — coherent organization being one of the key things that made the 2007 fires map successful. That, combined with some data that was being published only through narrow channels, led me to set up a map of my own. The first day I did any significant work on this was over the weekend of the 15th.

At first, the map wasn't particularly great; it was primarily just a tool to view a bunch of satellite data that was being made available, serving as a quality control check for OSM users who needed access to the data to complete the map of Haiti. Over time, more data became available — and more importantly, the OpenStreetMap data became a primary map for the area and for rescue efforts. Suddenly, the Haiti Crisis Map — then just the “UAV map” — was being used more and more.

As more and more data became available, the old map, using a simple OpenLayers layer switcher, became unwieldy; the layer switcher was never a user-friendly control to begin with, and adding 20 layers with an unplanned mix of base and overlay layers leaves much to be desired.

By Wednesday, it was clear that the hodge-podge of available disk space attached to the hosting machine wasn't going to cut it; though we started with just over 4TB spread over 3 different drives, managing the data was becoming unwieldy at the same rate as the UI. Thankfully, by Wednesday the 20th, John Graham was able to get access to another Sun X4500 and set it up, giving us a clean 16TB drive for new and old imagery. (About 6 hours later, the NFS machine on which all of the current data was stored began to fail, most likely due to heavier than normal load; I spent most of that day moving data off the old drive and onto the new one.)

In addition to the data migration, Aaron Racicot stepped up at this time and offered his help in building a GeoExt-based UI for the map. His efforts turned my hack into a reasonable UI for browsing the map, and it is really only because of that work that I was able to keep going.

Over the weekend, at CrisisCamp, I was able to add additional features to support Ushahidi, and the code was moved to GitHub as haitibrowser. In the middle of this week, the code was integrated into APAN, the All Partners Access Network, to support the efforts of SOUTHCOM in maintaining a high quality Common Operating Picture of events in the area.

Over the past two weeks, data has continued to pour in, in the hundreds of gigabytes a day. This is thanks in part to the generosity of the commercial imagery providers, in addition to the data made available by organizations like NOAA, companies like Google, and more. The extremely high quality imagery produced by RIT/ImageCat/World Bank, for example, shows what is possible when people with great hardware and a great team put in the hard work.

Using my knowledge — gleaned from my efforts in the earlier days of OpenAerialMap — I have been able to process this data and make it available as tiles and WMS to all consumers, primarily targeted at OpenStreetMap editors. Over two dozen layers are available via what is now called the Haiti Crisis Map, each one offering a different view of the data. In addition, the map contains links to other files like KML collections from Ushahidi and Sahana, and as recently as yesterday, it gained the ability to create your own layers, which you can access in the map, provide as a link to someone else, and export as KML.
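
To give a sense of the kind of processing involved, here is a rough sketch of preparing a single incoming image so that MapServer and TileCache can serve it alongside the OSM base map. The file names and projection choice are illustrative only, not the actual pipeline:

```python
# Hedged sketch: warp incoming imagery to a common projection and build
# overviews. Paths and the target projection are placeholders.
import subprocess

src = "incoming/provider_scene.tif"      # hypothetical source image
dst = "mosaic/provider_scene_merc.tif"   # hypothetical warped output

# Reproject to spherical mercator so the layer lines up with the OSM tiles.
subprocess.check_call([
    "gdalwarp", "-t_srs", "EPSG:3857",
    "-co", "TILED=YES",
    src, dst,
])

# Build overview levels so MapServer can render small scales quickly.
subprocess.check_call(["gdaladdo", "-r", "average", dst, "2", "4", "8", "16"])
```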

As part of making the site more readily accessible, it is now available at haiticrisismap.org.

The most difficult part of this has been managing the large sources of data. Thankfully, the resources I have available have allowed me to be a bit lax in my conservation of disk space, CPU time, and so on. Many thanks to CalIT, SDSU/SDSC, and Telascience for organizing these resources. In addition, much of the hard work on the UI has been done by Aaron Racicot of Z-Pulley; I've done a lot of minor work, but the major UI layout is his.

Thankfully, I've had the support of a lot of good people in this effort, and a lot of good tools to use. GDAL + OSSIM in the background for image processing, MapServer + TileCache for mosaicking and serving, OpenLayers + GeoExt for the UI, and OSM for the base map data have all made this effort possible.

The Haiti Crisis Map will continue to see improvements. It shows what a small, dedicated group of people can do with an investment of resources when properly motivated; I can honestly say that because of the resources made available through these efforts, we have saved lives. Whether it is through maps produced from OSM data being loaded onto volunteers' GPS units, or Ushahidi volunteers using the data to determine an accurate location on a map, this tool has been an effective aid to the relief effort in Haiti, and it will continue to be for as long as possible in the coming days and weeks.

Are you more generative than consumptive in your field?

Posted in Locality and Space, OpenLayers, Social, Software on May 26th, 2009 at 10:57:47

Anselm just posted what appears to be a random thought on twitter:

Are you more generative than consumptive in your particular field? … Create more than you consume?

In open source, I often rephrase this question as “Are you a source, or a sink?”

There are many people in the community who contribute more than they consume: organizations, individuals, and so on. There are also many sinks in the community — since entropy is ever increasing, this seems a foregone conclusion — and one of the key things that determines whether an open source project succeeds or fails is its balance of sources and sinks.

I personally try very hard to be a source in all that I do, rather than a sink. One way I do this is by trying to always follow up any question I ask — for example, on a mailing list, on an IRC channel, or what have you — with at least two answers of my own. This means that, for example, when I hopped into #django to ask about best practices for packaging apps, I stuck around and helped out two more people — one who was asking a question about PIL installation, and one about setting up foreign keys to different models.
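
For the record, the foreign key question needed nothing more than a sketch like the following (model names are invented, and it's written against current Django rather than what was current at the time): a ForeignKey field simply names the model it points to, and a single model can point at several different ones.

```python
# Minimal Django models sketch; names are invented for illustration.
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=100)

class Publisher(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    title = models.CharField(max_length=200)
    # Each ForeignKey points at a different model.
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
    publisher = models.ForeignKey(Publisher, on_delete=models.CASCADE)
```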

Now, in the end, my answers were simple — no one with even a basic knowledge of Django would have had problems answering them. But by sticking around and answering them, I was able to make up to some extent for the time/energy that I consumed from someone more familiar with the project, by saving them from needing to answer as well.

It is often the case that users seeking help will claim that once they get it, they will ‘contribute back’ to the community by, for example, writing documentation. This never happens. Though there are exceptions to every rule, it is almost always the case that users who ask a question, prefacing it with “I will document this for other users”, never follow through on the latter half. The exceptions to this — or rather, the alternate cases — are those where a user has already invested significant research, and likely already started writing documentation. Unless the process is started before the problem is solved, it is almost universally true — in my experience — that the user will act as a sink, taking the information from the source and disappearing with it.

I work very hard at supporting a number of the open source projects I'm involved in. Though my involvement lately has been more hands-off — writing documentation instead of answering questions, acting as a release manager instead of fixing bugs, and so on — I work very hard to keep the karmic balance of my work on the positive side. I believe this pays off in the long run: I have something of a reputation for being helpful, which benefits me because it means I'm more likely to receive help when I need it. I also work to keep the karmic balance high on behalf of the organization I work for, since many of the other people in the organization are less able to do so.

These rules don't apply solely to open source — the same karmic balance issues come up in my work inside MetaCarta — and I maintain the same attitude there. Coming in with the idea that it is okay to be a sink sets a nasty precedent, and in the end, I think everyone loses. Sinks — both in open source and in other karmic ventures — will eventually use up the karma they start with and be left out to dry. More than one person has extended their information seeking, without contributing back, beyond the point where I am willing to continue supporting their information entropy.

I joke sometimes about giving out “crschmidt karma points”. Though I don't have an actual system in this regard, I do quite clearly delineate between constant sinks, regular sources, and the grey areas in between. I try to stay on the source side, and I encourage everyone else to do the same — even if it's only by answering easy questions on the mailing list, or doing a bit more research on your own. Expecting other people to fix your problems, in open source or otherwise, is a false economy of help: in the end, it simply doesn't work.

WSGI + Basic Auth

Posted in default on April 15th, 2009 at 10:17:05

I use the logged_in_or_basicauth snippet for a lot of my work, and I had been having problems with it since I started using mod_wsgi in place of mod_python. Thanks to this post, I now know why my basic auth under mod_wsgi wasn't working: the lack of WSGIPassAuthorization On in my Apache config.
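
For anyone hitting the same wall, here is a minimal sketch (plain WSGI middleware written for illustration, not the snippet itself) of where the header shows up. Under mod_wsgi, the HTTP_AUTHORIZATION key only appears in the environ once WSGIPassAuthorization On is set:

```python
import base64

def require_basic_auth(app, realm="restricted"):
    """Reject requests that lack a valid-looking Basic auth header.

    Under mod_wsgi, the HTTP_AUTHORIZATION key is only present in environ
    when the Apache config includes "WSGIPassAuthorization On".
    """
    def wrapped(environ, start_response):
        header = environ.get("HTTP_AUTHORIZATION", "")
        if header.startswith("Basic "):
            try:
                user, password = base64.b64decode(header[6:]).decode("utf-8").split(":", 1)
            except ValueError:
                user = password = None
            if user is not None:
                # A real application would check the credentials here.
                return app(environ, start_response)
        start_response("401 Unauthorized",
                       [("WWW-Authenticate", 'Basic realm="%s"' % realm)])
        return [b"Authorization required"]
    return wrapped
```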

Thanks to the author of that post! Also, thanks to Google, since without it, I’d never have found it.

PowerPoint, in a sentence

Posted in default on April 6th, 2009 at 09:13:30

PowerPoint is a way to make gibberish look important.

— my 12 year old daughter, Alicia

MrSID SDK Improvements

Posted in default on March 10th, 2009 at 12:37:48

For a long time, I avoided MrSID like the plague. After trying to do *anything* useful with it, I finally gave up; the requirement for old versions of gcc, the lack of 64-bit support, etc. really gave me a negative impression of the SDK for MrSID reading. This was especially painful when working with OpenAerialMap, since MrSID has a practical lock on the market for ortho imagery data sources. (There are exceptions to this, but they're usually JPEG2000 data, which was generally even worse to work with using the tools that I use.)

However, during a set of discussions yesterday, Frank said that building MrSID support into GDAL had gotten much easier. I didn't really believe him, but I had the DSDK handy for other reasons, and according to the build hints, it was supposed to be easy.

Thinking I was going to prove Frank wrong, I started building. I did ./configure --with-mrsid=~/Downloads/Geo_DSDK-7.0.0.2167; confirmed MrSID ‘yes’ in the output, then make.

3 minutes later, I had a gdalinfo and gdal_translate built on my Mac with MrSID support.
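
If you build the Python bindings as well, a quick sanity check (the file name below is hypothetical) confirms the driver is actually there:

```python
# Hedged sketch: verify the MrSID driver registered with GDAL, then open a file.
from osgeo import gdal

driver = gdal.GetDriverByName("MrSID")
print("MrSID driver:", "available" if driver else "missing")

ds = gdal.Open("example.sid")  # hypothetical MrSID file
if ds is not None:
    print("%d x %d pixels, %d band(s)" % (ds.RasterXSize, ds.RasterYSize, ds.RasterCount))
```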

My historical problems with MrSID are now completely irrelevant: the effort in the new SDK to support more platforms has clearly paid off, and I can say that building MrSID support, even on the Mac, is trivial. A big thumbs up to the LizardTech folks for their work in this regard — and to people like Frank and Michael for egging me on to give the DSDK another look in the first place.

Code Sprint: Day 3

Posted in default on March 10th, 2009 at 09:24:38

Yesterday, I got to sit down and do some real performance testing with the MapServer folks. After rebuilding a local copy of the Boston Freemap on my laptop, I was able to share it with Paul, who ran it through Shark to find out where the performance killers are. The main thing we found was that this 5-year-old MapServer ticket was hurting performance on maps with many labels: the labelling code in MapServer right now, if you're using outlines, draws each glyph 9 times in order to get a nice outline color. After determining this, we decided to work with the GD maintainers to add the support described in #1243 to GD, using Freetype's internal stroking code to get the same behavior. (At the time, in Freetype *2.0.09*, there was a bug in this code; but we're now on 2.3.8, so that bug has long since been fixed. :)) This change will likely give a 20% speedup when drawing maps with many outlined labels, such as the Boston Freemap.

After this, we sat down with MrSID and GDAL/MapServer to figure out if there were performance problems there. One thing we found was that MapServer's drawing one band at a time incurs a significant performance hit. In addition, some other performance enhancement techniques are being looked into at the GDAL level by Frank, thanks to the help of the LizardTech developers participating in the sprint. He's currently looking at improving the way that GDAL reads from MrSID, and was already able to achieve a 25% speed increase by simply changing the internal GDAL buffer size used when converting from MrSID to GeoTIFF. More documentation and experimentation is still in order, but there are some possible optimizations there for users of the library to investigate.
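
The buffer-size change itself is internal to GDAL, but the kind of before-and-after test involved is easy to reproduce. A rough sketch (file names hypothetical) of timing a MrSID-to-GeoTIFF conversion:

```python
# Hedged benchmark sketch: time a MrSID -> GeoTIFF conversion with
# gdal_translate. Run before and after a change to compare.
import subprocess
import time

start = time.time()
subprocess.check_call([
    "gdal_translate", "-of", "GTiff",
    "input.sid",    # hypothetical MrSID source
    "output.tif",   # GeoTIFF destination
])
print("MrSID -> GeoTIFF took %.1f seconds" % (time.time() - start))
```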

We then had a great dinner at Jack Astor’s.

Thanks to our sponsors for today: Bart van den Eijnden from OSGIS.nl and Michael Gerlek from LizardTech — performance improvements in MapServer label drawing and in GDAL's MrSID access are potentially big wins for many users of MapServer.

Toronto Code Sprint: Day 2

Posted in Locality and Space, Mapserver, OSGeo, PostGIS, Toronto Code Sprint on March 8th, 2009 at 22:44:32

Day 2 of the code sprint seemed to be much more productive. With much of the planning done yesterday, today groups were able to sit down and get to work.

Today, I accomplished two significant tasks:

  • Setting up the new OSGeo Gallery, which is set to act as a repository for demos from OSGeo software users, in the same way that the OpenLayers Gallery already does for OpenLayers. We've even added the first example.
  • TMS Minidriver support for the GDAL WMS Driver: Sitting down and hacking out a way to access OSM tiles as a GDAL datasource, Schuyler and I built something reasonably simple and small — an 18k patch including examples and docs — that allows for a significant change in the ability to read tiles from existing tileset datasources on the web. (A sketch of the kind of service description this enables follows below.)
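
To make the idea concrete, here is a hedged sketch of opening OSM tiles through the WMS driver's TMS minidriver from Python. The service description follows the pattern in the GDAL WMS driver documentation and may need tweaking for a given tile server:

```python
# Hedged sketch: treat a web tileset as a regular GDAL dataset via the
# GDAL_WMS driver's TMS minidriver.
from osgeo import gdal

osm_description = """<GDAL_WMS>
  <Service name="TMS">
    <ServerUrl>https://tile.openstreetmap.org/${z}/${x}/${y}.png</ServerUrl>
  </Service>
  <DataWindow>
    <UpperLeftX>-20037508.34</UpperLeftX>
    <UpperLeftY>20037508.34</UpperLeftY>
    <LowerRightX>20037508.34</LowerRightX>
    <LowerRightY>-20037508.34</LowerRightY>
    <TileLevel>18</TileLevel>
    <TileCountX>1</TileCountX>
    <TileCountY>1</TileCountY>
    <YOrigin>top</YOrigin>
  </DataWindow>
  <Projection>EPSG:3857</Projection>
  <BlockSizeX>256</BlockSizeX>
  <BlockSizeY>256</BlockSizeY>
  <BandsCount>3</BandsCount>
</GDAL_WMS>"""

# The WMS driver accepts the XML description directly as the dataset name.
ds = gdal.Open(osm_description)
if ds is not None:
    print(ds.RasterXSize, ds.RasterYSize, ds.RasterCount)
```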

Other things happening at the sprint today were more WKT Raster discussions, liblas hacking, and single-pass MapServer discussions, as well as some profiling of MapServer performance with help from Paul and Shark. Thanks to the participation of the LizardTech folks, I think there will also be some performance testing of MrSID rendering within MapServer, and there was — as always — more of the perennial “proj strings are expensive to look up!” discussion.

Other than that, it was a quiet day; lots of work getting done, but not much excitement in the ranks.

We then had a great dinner at Baton Rouge, and made it home.

This evening, I’ve been doing a bit more hacking, opening a GDAL Trac ticket for an issue Schuyler bumped into with the sqlite driver, and pondering the plan for OpenLayers tomorrow.

As before, a special thanks to the conference sponsors for today: Coordinate Solutions via David Lowther, and the lovely folks at SJ Geophysics Ltd. Thanks for helping make this thing happen! I can guarantee that neither of those GDAL tickets would have happened without this time.

Toronto Code Sprint: Day 1

Posted in Mapserver, OSGeo, PostGIS, Toronto Code Sprint on March 8th, 2009 at 07:55:43

I’m here at the OSGeo Code Sprint in Toronto, where more than 20 OSGeo hackers have gathered to work on all things OSGeo — or at least MapServer, GDAL/OGR, and PostGIS.

For those who might not know, a code sprint is an event that gathers a number of people working on the same software in one place, with the intention of getting a large amount of development work done quickly. In this case, the sprint is a meeting of the “C tribe”: developers working on the C-based stack in OSGeo.

After some discussion yesterday, there ended up being approximately 3 groups at the sprint:

  • People targeting MapServer development
  • PostGIS developers
  • liblas developers

(As usual, I’m a floater, but primarily concentrating on OpenLayers; Schuyler will be joining me in this pursuit, and I’ve got another hacker coming Monday and Tuesday to sprint with us.)

The MapServer group had the most lively discussion (and was also the largest). It sounded like there were three significant development discussions taking place — XML mapfiles, integration of pluggable rendering backends, and performance enhancements — as well as work on documentation.

After a long discussion of the benefits and merits of XML mapfiles, it came down to there being one main target use case for the XML mapfile: encouraging the creation and use of more editing clients. With a format that can be easily round-tripped between client and server, you might see more editors able to really speak the same language. In order to test this hypothesis, a standard XSLT transform will be created and documented, along with a tool to do the conversion; this will let MapServer test out the idea before integrating XML mapfile support into the library itself.
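
A conversion tool along those lines could be very small. Here is a hedged sketch using lxml; the stylesheet and file names are hypothetical, since the transform doesn't exist yet:

```python
# Hedged sketch of an XML-mapfile-to-classic-mapfile conversion tool.
# "mapfile_to_map.xsl" stands in for the yet-to-be-written XSLT transform.
from lxml import etree

transform = etree.XSLT(etree.parse("mapfile_to_map.xsl"))
xml_mapfile = etree.parse("example_mapfile.xml")

with open("example.map", "w") as out:
    out.write(str(transform(xml_mapfile)))
```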

I didn't listen as closely to the pluggable renderers discussion, but I am aware that there's a desire to improve support and reduce code duplication of various sorts, and the primary author of the AGG rendering support is here and participating in the sprint. There has recently been a proposal on the list to add OpenGL-based rendering support to MapServer, so this is a step in that direction.

The PostGIS group was excited to have so many people in the same place at the same time, and I think they came close to skipping lunch in order to get more time working together. In the end, they did go, but it seemed to be a highly productive meeting. Among other things, there was some discussion of the WKT Raster project, which is currently ongoing, I believe.

After our first day of coding, we headed to a Toronto Marlies hockey game. This was, for many of us, the first professional hockey we'd ever seen. (The Marlies are the equivalent of AAA baseball: one step below the major leagues.) The Canadians in the audience, especially Jeff McKenna, who played professional hockey for a time, helped keep the rest of us informed. The Marlies lost 6-1, sadly, but as a non-Canadian, I had to root a bit for the Hershey team. (Two fights did break out; pictures forthcoming.)

We finished up with a great dinner at East Side Mario’s.

A special thanks to our two sponsors for the day, Rich Greenwood of Greenwood Map and Steve Lehr from QPUBLIC! Thanks to their support, our sprint was held in a great space, was very productive, and came with some great events.

Looking forward to another great day.

Geodata Cost Recovery: Eaton County

Posted in Locality and Space on February 25th, 2009 at 08:37:47

I was pointed at Eaton County's GIS Data Prices last night, and all I can say is how disappointed I am that a county GIS department can still feel that this is an appropriate way to fleece its taxpayers. The data is already collected, and reproduction costs for it are probably in the realm of a couple hundred bucks — less, if you just distribute it online. (Clearly, you already have a website.) Yet you charge twelve *thousand* dollars for copies — and even after that, you're still limited in what you can do.

This kind of thing is just a damn shame. Taxpayers should insist that this data be made available at reasonable reproduction cost; the policies of GIS departments that try to make money off these datasets are simply silly so long as the data is collected with taxpayer dollars.

(If the GIS department does not receive state funding, then I suppose this type of cost recovery makes sense — in the same way that Sanborn or any other commercial entity would charge for it. However, I suspect that the primary client for such data is the county or state itself, in which case it's still taxpayer dollars covering the costs somewhere…)

Yahoo! Maps APIs, aka ‘grr, argh!’

Posted in Locality and Space, OpenStreetMap on February 16th, 2009 at 15:14:00

I have a love/hate relationship with Yahoo!’s mapping API. It’s lovely that Yahoo! believes, unlike Google and other mapping providers, that their satellite data is a suitable base layer to use for derivation of vectors. This openness really is good to see — they win big points from me in this regard. (Google, on the other hand, is happy to have you give them data against their satellite imagery, but letting you actually have it back is against the Terms of Service.)

However, the Yahoo! Maps AJAX API has never gotten much love. I think a preference for Flash has always existed in the Yahoo! world; if I recall correctly, their original API was Flash-based.

I realized today, though, that this tendency to leave the AJAX API in the dust has resulted in something that seriously affects me: the Yahoo! Maps AJAX API uses a different set of tiles, with two fewer zoom levels available:

[Screenshots: AJAX Maps at its most zoomed-in level vs. Flash Maps at its most zoomed-in level.]

For the new OpenStreetMap editor I'm working on, this is a *serious* difference: although the amount of information actually available in these tiles isn't *that* much greater, the extra zoom lets the user extract more detail by getting in a bit closer, and be more precise in the placement of objects when using Yahoo! as a basemap.

Although it would be relatively easy to rip the tiles out, and create an OpenLayers Layer class that loaded them directly, this violates the Yahoo! Terms of Use. This is understandable, but unfortunate, because it means I can’t solve the problem with my own code.

What I would really love to see is more providers creating a more friendly way of accessing their tiles. I understand the need for counting accesses, and the need for copyright notifications. If an API were published that allowed you to:

  • Fetch a copyright notice for a given area, possibly also returning a temporary token to use
  • Following that, fetch tiles to fill that area
  • Require users to display the copyright notice in such a way as to make Yahoo! and their providers happy

This would allow for building a layer into OpenLayers which complied with these requirements, without depending on Yahoo! to write a JavaScript layer that did these things for me.
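
Sketched in code, the flow I'm imagining would look something like this. Every endpoint, parameter, and field name here is invented purely for illustration, since no such API exists:

```python
# Purely hypothetical sketch of the proposed tile-access API.
import json
import urllib.request

API = "https://tiles.example.com"  # imaginary provider endpoint

def get_attribution(bbox, zoom):
    """Step 1: fetch the copyright notice for an area plus a temporary token."""
    url = "%s/attribution?bbox=%s&zoom=%d" % (API, ",".join(map(str, bbox)), zoom)
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)
    return data["notice"], data["token"]

def fetch_tile(token, z, x, y):
    """Step 2: fetch a tile for that area, counted against the token."""
    url = "%s/tile/%d/%d/%d.png?token=%s" % (API, z, x, y, token)
    with urllib.request.urlopen(url) as resp:
        return resp.read()

# Step 3: the client is then obliged to display `notice` prominently,
# wherever that best fits the application's layout.
```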

Now, it's understandable that this doesn't happen — having the client outside of Yahoo!'s control means that they can't *enforce* that the copyright is displayed prominently, as they can (to some extent) with their API. However, I think this type of API would allow more innovation, and possibly even a *more* prominent placement for Yahoo!'s copyrights and notices. For example, in many mapping apps, the bottom inch of the map is not seen much by users. If there were an API to get the text to display, an application could put that text in a more prominent location, rather than burying it under the many markers or other pieces of text that might overlap it.

In the short term, all I really wish is that the AJAX API used the apparently newer set of satellite tiles that the Flash API has access to. I think the fact that this isn't currently possible points toward an alternative access pattern for tiles — one which may make more sense in the long run — where tiles can be used by an application without necessarily running inside the constrained JavaScript API that these providers have the ability to write. And of course, if you want to provide your users with a ‘default’ API to use, you can always use OpenLayers and extend it to include your own extensions…