Archive for the 'default' Category

Flying United (or not)

Posted in default on July 3rd, 2011 at 19:14:14

Today, my 13-year-old daughter was flying from Manchester to Greenville/Spartanburg, transferring in Dulles. She was flying without the unaccompanied minor service. Her connection time in Dulles was short (only 1hr15m), but should have been sufficient.

At 4:50PM, her flight status was updated, making it clear that her flight would arrive about an hour late, at 4:54PM. (Admittedly, I should have been keeping closer track and making some phone calls earlier, but I doubt it would have helped.) She pocket-dialed me at 5:00, so I knew she was at least somewhere with cell reception; unfortunately, calling her back didn’t get any response.

At 5:05, I called United, and asked if they could confirm her status — whether she made her flight. While navigating their phone menus, I observed that the flight she was supposed to take had taken off, 5 minutes early. I gave United her name + flight information, and they then proceeded to take 5+ minutes to tell me that she had left from Manchester earlier that day. (Thanks for telling me information I gave you.) They also told me her flight was late (thanks again), and that she was rebooked for the next morning at 8:20AM.

Now, for me, cue a bit of panic; I made the decision not to purchase unaccompanied minor service for her trip based on the idea that she could make her flights, and even if she didn’t, there was another flight later in the day she could pick up (still the case here: there were two more flights out of IAD to her destination 30 minutes later). This may have been a mistake, but that was a decision. Having United rebook her on a flight the next day is an entirely different beast, obviously.

I asked the agent whether they could confirm she had *not* made her flight. I was told that the agent could confirm she had been rebooked for the next morning, which would not have happened if she had made her flight, but that she still could not tell me whether my daughter had actually boarded. (What?)

I called a couple of other numbers, including the airport traveller assistance line, where I was told that they could not do anything other than page in the baggage claim area; we punted on that for the time being and kept trying United, still trying to get confirmation that she had or had not made her flight.

In my next call, I got the same basic story: She’s rebooked for tomorrow. Since she is rebooked for tomorrow, United will be putting her up in a hotel tonight. No, I can’t tell you what hotel. No, I can’t tell you a number you can call for that information. No, I can’t tell you whether she made it on the flight. No, I can’t tell you hotels we use for putting passengers up. I can tell you that she was rebooked for tomorrow morning.

At this point, we had her paged in the airport — which ended up producing no response, though they were attempting to be very helpful. They did clarify that they can’t page in the gate area, and we should contact the airline for that.

Again, we called a different United number, only to be shunted back to the main United phone disaster. No, we can’t contact our gate agents because of 9/11. No, we can’t page in the gate area. No, we can’t tell you what hotels we use. No, we can’t give you a number to find out where this passenger has gone.

Kristan and I at this point started calling hotels — this is one hour after she has landed in Dulles and we have been told she will be staying overnight in DC and that we could get no information from anyone. We made it through about 10… at which point we got a call from Alicia. She had not answered her cell phone (which she did have on her, and charged)… because she was on her plane.

She was fine, and on her plane, the whole time. She had made her connection, but United rebooked her anyway. She’s not stuck in IAD, and had no problems… other than our complete inability to communicate with United at all.

The first response to this will naturally be: “You should have paid for unaccompanied minor service.” While this may be true, I’m not convinced that it would have helped: specifically, the problem was not that Alicia did not make her flight — she did. The problem is that *no one could tell us this* — despite the fact that, presumably, someone scanned her boarding pass. Since we did not have a phone number — and I looked for an ‘unaccompanied minor contact number’ on United’s site — and she made her flight on time, I doubt we would have been contacted by United. (In fact, unaccompanied minor service might well have meant the difference between her making her flight and not in this case!)

Overall, I’m not upset that United didn’t know where my daughter was: that’s my fault. I’m not really that upset about most of the questions they couldn’t answer in this case — though if she *was* stuck overnight, I’d be damn upset. Instead, I’m only upset about one thing: United told me she missed her flight, rebooked her on a flight the next morning, and was completely unable to tell us that *she actually made her flight*. That has to be seen as a failure somewhere in the system, and the complete inability to reach anyone other than reservations by phone on short notice is completely unreasonable.

I do want to thank the people who helped reassure me on Twitter and elsewhere, as well as the IAD Travelers Aid desk, who did everything within their limited power to help us out. The most important thing is that Alicia is at her destination, safe.

Suggestions for Better Geo In the (dot)Cloud

Posted in default on June 10th, 2011 at 00:47:03

One of the interesting things that I’ve noticed in exploring various cloud providers is the limited amount of support for strong geo integration. This isn’t particularly surprising — Geo on the web is still a niche market, even if it is a niche market I tend to care about a lot. Here are some things that I think might make people more eager to adopt cloud-y services for geo, specifically in the context of DotCloud, where I’ve been investing most of my time recently.

DotCloud has the idea of ‘services’ — independent units which work together to form an overall operating picture. The idea behind these services is that they are simple to set up, and perform one particular task well. So rather than having a single service which combines PostGIS, GeoDjango, and MapServer, you have three separate services — one for PostGIS, one for GeoDjango, and one for map rendering — and you connect them together. This way, if your application needs to scale at the frontend, you can simply deploy additional GeoDjango services; if you need more map rendering ‘oomph’, same deal. (Deploying additional database services won’t magically scale them, but you do have the ability to use the software’s internal scaling across your multiple instances.)
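To make the shape of that split concrete, here is a sketch of how such a deployment might be declared. The service names are mine, and I’m paraphrasing DotCloud’s build-file format from memory, so treat the syntax as illustrative rather than authoritative:

```yaml
# Illustrative only -- three cooperating services for a geo stack
db:
  type: postgresql   # with PostGIS enabled
www:
  type: python       # GeoDjango frontend; deploy more of these to scale the frontend
tiles:
  type: python       # map rendering service; scale independently of the frontend
```

The point is less the syntax than the separation: each box scales (or fails) on its own.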

So, again, using DotCloud, there are three types of ‘services’ that I think would be interesting to users who are working in Geo:

  • PostGIS — Supporting PostGIS on Debian and Debian-derived platforms is pretty simple; the tools are all easily installable via packages. (pgRouting is another step beyond — something which might actually have some benefit simply because it’s more difficult to get started with, so having it preconfigured could make a big difference in adoption.) Nothing really special is needed here other than apt-get install postgis and some wrapper tools to make setting up PostGIS databases easy. There probably isn’t any reason why this couldn’t be included in the default PostgreSQL service type in DotCloud.
  • GeoDjango — This one is even easier than PostGIS. GeoDjango already comes with the appropriate batteries included. The primary blocker on this is to get the GEOS, proj, and GDAL libraries installed on the Django service type. I don’t think there is any need for a separate service type for GeoDjango.
  • Map Rendering — This one is a big one for a lot of people, and I’m not entirely sure of the best way to work it within DotCloud. Map Rendering — taking a set of raster or vector data, and making it available as rendered tiles based on requests via WMS or other protocols — is one of the things that is not pursued as often by the community right now, and I think a lot of that is due to the difficulty of setup. As data grows large, coping with it all on the client side becomes more difficult; some applications simply never get built because the jump from ‘small’ to ‘big’ is too expensive.
    There are three different ‘big’ map rendering setups that I can think of that might be worth trying to support:

    • MapServer — MapServer is a tried and true map rendering tool. It primarily exists as a C program with a set of command line tools around it; it is usually run under CGI/FastCGI. Configuration exists as a set of text-based configuration files (mapfiles); rendering uses GDAL/OGR for most data interactions, and GD or AGG (plus more esoteric output formats) for output. MapServer is often combined with TileCache, for caching purposes; TileCache is based on Python.
    • GeoServer — GeoServer is a Java program, which runs inside a servlet container like Tomcat. Like MapServer, it supports a variety of input/output formats; configuration is typically via its internal web interface. Caching is built in (via geowebcache). I think GeoServer would probably run as is under the ‘java’ service type that exists on DotCloud, assuming the appropriate PostGIS support exists for the database side.
    • OSM data rendering — This one is a bit less solid. OpenStreetMap data rendering has a number of different rendering frontend environments, but the primary one that I think people would tend to set up these days is a stack of mod_tile (Apache-side frontend) talking to tirex (renderer backend) which calls out to/depends on Mapnik, the actual software which does tile rendering. Data comes from a PostGIS database — though in the case of OSM, even that requires some additional finagling, since getting a worldwide OSM dump is… pretty big. (It’s probably safe to set that point aside as a starting point, and concentrate instead on targeting localized OSM rendering deployments — solve the little problems first, scale up later.)
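Of the three service types above, the GeoDjango case really is just packaging: once GEOS, proj, and GDAL are present on the service, a project’s configuration is stock GeoDjango. A minimal settings fragment of the sort a deployed app would carry (database name, host, and user here are examples, not anything DotCloud-specific):

```python
# Stock GeoDjango settings fragment (Django 1.3-era); the database
# name, host, and user are illustrative placeholders.
DATABASES = {
    'default': {
        # the PostGIS-aware database backend that ships with GeoDjango
        'ENGINE': 'django.contrib.gis.db.backends.postgis',
        'NAME': 'geodb',
        'HOST': 'localhost',
        'USER': 'geodjango',
    }
}

INSTALLED_APPS = (
    'django.contrib.contenttypes',
    # enables geometry fields, spatial queries, and the geographic admin
    'django.contrib.gis',
)
```

Nothing here needs a separate service type; it is the same settings file you would write locally.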

One thing that all of these tools have in common is that they really like having fast access to lots of disk for quickly reading and writing small files. I’m not sure what the right way to do that within the DotCloud setup is — I don’t see an obvious service type which is designed for this — so that might be another component in the overall picture. (Things like the redis service try to solve this problem, I think, but since these tools primarily expect to write to disk directly, adapting them to store tile data persistently some other way would require modifying the upstream libraries.)

I think that there is room to significantly simplify deployment of some components of geographic applications by centralizing installation in cloud-based services; the above sketches out some of the components that it might make sense to consider as a first pass. These components would let someone create a relatively complex client + server side geographic data application; exploring and expanding on these — especially the OSM data rendering component — could make deploying to the cloud easier than deploying locally, with the net effect of increased adoption of cloud-based services… and more geodata in the world to consume, which is something I’m always in favor of. 🙂

Better GDAL in the Cloud (DotCloud, Round 2)

Posted in default on June 7th, 2011 at 06:59:40

My previous attempts to make GDAL work in the cloud were somewhat silly; compiling directly on the instance runs counter to the design of DotCloud, which lets you scale primarily by assuming that a ‘push’ to the server establishes all of the configuration you’ll need. (sshing into the machine is certainly possible, but it seems designed more for debugging than for active work.)

So, with a few suggestions from the folks in #dotcloud, I started exploring how to set up my dotcloud service in a bit more of a repeatable way — such that if I theoretically needed to scale, I could do it more easily. Rather than manually installing packages on the host, I moved all of that configuration into a ‘postinstall’ file — fenceposted so it runs only once.

After a bit of experimentation (and reporting a bug in GDAL), I was able to get a repeatable installation going; this means that I no longer have any manual steps in creating and setting up a new service doing OpenAerialMap thumbnailing; in fact, the entire process — right down to setting up redis as a distributed cache — can now be completed automatically.

The postinstall is pretty simple: download, configure, and install curl; then download, configure, and install GDAL. In both cases, the respective -config tool is used as a fencepost so we don’t install multiple times.
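Sketched out, the script has roughly this shape. The actual build steps (wget, ./configure --prefix=$HOME, make, make install for each package) are stubbed with echoes here; the fencepost is the point:

```shell
#!/bin/sh
# Sketch of a fenceposted postinstall. The real build steps are stubbed
# with echoes; what matters is that each package's -config tool only
# exists after a successful install, so redeploys skip the build.
set -e

install_once() {
    # $1 = fencepost tool to look for, $2 = human-readable package name
    if command -v "$1" >/dev/null 2>&1; then
        echo "$2 already installed, skipping"
    else
        echo "building $2 from source"
    fi
}

install_once curl-config curl
install_once gdal-config GDAL
```

Because the check is on the installed artifact rather than on a marker file, a half-finished build (where make install never ran) correctly retries on the next push.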

Once this is done, setting up a new deployment with the appropriate services is trivial: that’s outlined in setup.sh — configure a redis service, grab the config information, and deploy a wsgi service using that config and the rest of the code.

In fact, while I was writing this post, I followed that very set of instructions, running ‘./setup.sh openaerialmap’ — before I finished the post, I had a running www.openaerialmap.dotcloud.com instance, which I could then commit in place of the silly-named ‘pepperoni’ service that oam.osgeo.org was using before.

The big thing for me about these one-off deployments is that by making available better tools for doing iterative deployment, they allow me to be more disciplined in how I configure services. Normally, the act of installing GDAL or curl is a one-time event: without reformatting my system (or setting up a VM and reinstalling/rolling back every time), it’s non-trivial to test what happens on a fresh system. With DotCloud, deploying another service literally takes seconds — and when I do, it’s a green field, ready for me to test my deployment scripts all over again. No more mistakes in one-off apt-get install commands that I’ll never remember if my system goes down. Now, I’ve got the ability to script deployments of software.

This is of course possible with EC2 directly as well — building custom AMIs, redeploying them, etc. For me, though, DotCloud took the pain out of having to do such a thing. They did the hard part for me — and I got to do the more interesting and fun parts.

DotCloud: GDAL in Python in the Cloud

Posted in default on June 5th, 2011 at 15:14:40

One of the key components of the OpenAerialMap design that I have been working on since around FOSS4G of last year is to use distributed services rather than a single service — the goal being to identify a way to avoid a single point of failure, and instead allow the data and infrastructure to be shared among many different people and organizations. This is why OpenAerialMap does not provide hosting for imagery directly — instead, it expects that you will upload images to your own dedicated servers and provide the information about them to the OpenAerialMap repository.

One of the things that many people want from OpenAerialMap is (naturally) a way to see the images in the repository more easily. Since the images themselves are quite large — in some cases, hundreds of megabytes — it’s not as easy as just browsing a directory of files.

Since OAM does not host the imagery itself, there is actually nothing about this type of service that would be better served by hosting it centrally; the meta-information about the images is small, and easily fetched by HTTP, and the images themselves are already remote from the catalog.

As a result, when I saw people were interested in trying to get OpenAerialMap image thumbnails, I decided that I would write it as a separately hosted service. Thankfully, Mike Migurski had already made a branch with the ‘interesting’ parts of the problem solved: Python code using the GDAL library to fetch an overviewed version of the image and return it as a reasonably sized thumbnail. With that in mind, I thought that I would take this as an opportunity to explore setting this code up in the ‘cloud’.
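The ‘interesting’ part of that problem boils down to choosing which overview level of a large image to read so the result lands near thumbnail size, rather than decoding hundreds of megabytes. A rough, GDAL-free sketch of that selection logic — the function name and numbers are illustrative, not the actual code from Mike’s branch:

```python
def pick_overview(full_width, full_height, overview_factors, target=256):
    """Pick the decimation factor whose reduced size best covers the target.

    overview_factors: decimation factors present in the file, e.g. [2, 4, 8, 16].
    Returns the largest factor that still leaves the longest side >= target,
    so we read the least data that can still produce a sharp thumbnail.
    """
    longest = max(full_width, full_height)
    best = 1  # fall back to reading at full resolution
    for factor in sorted(overview_factors):
        if longest // factor >= target:
            best = factor
    return best
```

With GDAL, the chosen overview band is then read and handed to PIL for the final resize; the selection above is the piece that keeps the I/O small.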

A requirement for such a cloud deployment is that it needs to have support for GDAL: this severely limits the options available. In general, most of the ‘cloud hosting’ that is out there exists to host your applications using pre-installed libraries; very few of the sites, at least that I can find, would let you compile and run your own code. However, I remembered that during my vacation week, I happened to sign up for an account with ‘DotCloud’, a service which is designed to help you “Assemble your stack from pre-configured and heavily tested components.” Instead of tying you to some specific setup, you’re able to build your own.

At first, I wasn’t convinced that I could do what I needed, but after reading about another migration, I realized that I could get access to the instance I was running on directly via SSH, and investigated more deeply.

I followed the instructions to create a Python deployment; this got me a simple instance running a webserver under uwsgi behind nginx. I was able to quickly SSH in, and with a bit of poking about, was able to compile and run curl and GDAL via the SSH command line into my local home:

$ # fetch and unpack the curl and GDAL sources
$ wget http://curl.haxx.se/download/curl-7.21.6.tar.gz; tar -zvxf curl-7.21.6.tar.gz
$ wget http://download.osgeo.org/gdal/gdal-1.8.0.tar.gz; tar -zvxf gdal-1.8.0.tar.gz
$ # build curl into $HOME, with binaries in the service's env/bin
$ cd curl-7.21.6
$ ./configure --prefix=$HOME --bindir=$HOME/env/bin
$ make
$ make install
$ # build GDAL the same way, with curl support and Python bindings
$ cd ../gdal-1.8.0
$ ./configure --prefix=$HOME --bindir=$HOME/env/bin --with-curl --without-libtool --with-python
$ make
$ make install
$ cd ..
$ # point the runtime linker at the freshly installed libraries
$ LD_LIBRARY_PATH=lib/ python

So this got me a running python from which I could import the GDAL Python libraries; a great first step. Then I had to figure out how to make the webserver know about that LD_LIBRARY_PATH — a much more difficult step.

After some mucking about and poking in /etc/, I realized that the uwsgi process was actually being run under ‘supervisord’; basically, whenever I deployed, supervisord restarted my Python process, which was running my Python code inside it. So then all I had to do was figure out how to tell the supervisor on the DotCloud instance to run with my environment variables.

I found the ‘/etc/supervisor/conf.d/uwsgi.conf’ which was actually running my uwsgi process; after some research, I found that DotCloud will let you append to the supervisor configuration by simply creating a ‘supervisord.conf’ in your application install directory; it hooks this config in at the end of all your other configs. So, I copied the contents of uwsgi.conf into my own supervisord.conf, dropped it in my app directory, and added environment=LD_LIBRARY_PATH="/home/dotcloud/lib" to the end of it.
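For concreteness, the appended supervisord.conf ends up shaped roughly like this. The command and directory lines are placeholders for whatever the stock uwsgi.conf contains; only the environment line is the addition:

```ini
[program:uwsgi]
; copied from /etc/supervisor/conf.d/uwsgi.conf; the real command and
; directory lines are whatever DotCloud's stock file contains
command = /home/dotcloud/env/bin/uwsgi --ini /home/dotcloud/uwsgi.ini
directory = /home/dotcloud

; the one addition: let the runtime linker find the locally built libgdal
environment = LD_LIBRARY_PATH="/home/dotcloud/lib"
```

Since supervisord reads this file after the stock configs, the [program:uwsgi] section here wins, environment variable and all.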

I pushed my app to the server, and now, instead of “libgdal.so.1: cannot open shared object file: No such file or directory”, I was able to successfully import GDAL; I then easy_installed PIL, and with a little bit more work, I got the image thumbnailing code working on my DotCloud instance.

I’ve still got some more to do to clean this up — I’m not sure I’m actually using DotCloud ‘right’ in this sense — but I was able to get GDAL 1.8.0 in a Python server running on a DotCloud instance and use it to run a service, without having to install or screw with the running GDAL libraries on any of my production servers, so I’ll consider that a win.

Some things OpenLayers Shouldn’t Do

Posted in default on June 1st, 2011 at 04:16:17

One of the things that sets me apart from other OpenLayers users and contributors at times is the strong belief that it is not the responsibility of OpenLayers to be everything for everyone. I’ve often been faced with a situation where a user will say “Well, OpenLayers doesn’t have this functionality: doesn’t that mean it should?”

Just this morning, Eric made a similar argument: in response to my saying that OpenLayers is not for everyone, he responded: “if OpenLayers isn’t for everyone we gotta do something to change the situation, I think.”

I completely disagree with this statement, and I think that anyone who thought about it would also disagree. There are some things OpenLayers is not meant to do, which other mapping tools are in a better position to do.

  • OpenLayers is not a 3D spinning globe — we will probably never have support for rendering 3D globes inside of OpenLayers itself.
  • OpenLayers is not the ideal tool for browsing 3D panoramas: we are likely not the best tool for doing something like Google Streetview.
  • OpenLayers is not a full GIS application: We will likely never ship as part of OpenLayers itself a full UI for doing GIS editing against some server.
  • OpenLayers should not have as one of its primary goals creating user interface elements for complex functionality.

These choices might seem obvious, but I bring them up specifically because they are things that I have seen people state OpenLayers should do, and I think it’s important for any project to identify its goals and work toward them.

There are also other things which are less obvious that I think OpenLayers is unlikely to do, which other mapping APIs have chosen to do.

  • OpenLayers should not abandon support for Internet Explorer: One of the moves I’ve seen recently is to create applications which abandon support for Internet Explorer, since supporting IE takes more effort than supporting other platforms. Although that is reasonable for many projects, I think it would not be the right path for OpenLayers. Building a library which supports more platforms has side effects: OpenLayers has IE support baked into some of its core functionality, like event handling, so it will always have some minimal impact on things like download size, and there are some things we may choose not to do because doing so would make supporting IE more difficult. Internet Explorer may be close to becoming a non-majority browser, but it’s still going to be very important to OpenLayers users for years to come.
  • OpenLayers should not remove support for commercial layers, even if supporting these types of layers requires a more complex architecture: Supporting commercial layers is one of the key components of OpenLayers, and I think that it is important to continue to support using commercial layers, even though this does, at times, make the OpenLayers code more complex. This problem is thankfully getting somewhat better with newer APIs from Bing and Google for direct tile access, but there exist other APIs out there that aren’t as advanced, and continuing to support the types of APIs we need to make that happen is something I feel is central to OpenLayers.
  • OpenLayers should not remove support for direct WMS access. One of the things that some other mapping APIs have chosen to do is to limit their target audience, and as a result, they do not worry about adding WMS access or other similar functionality for accessing data through means other than laying out X/Y/Z tiles on a map. Supporting WMS and other OGC web standards is something that takes some non-trivial portion of OpenLayers developer time. If we were to abandon anything other than support for OSM or XYZ style tiled layers and vector layers, we could certainly concentrate on a smaller API — but I don’t think that is something OpenLayers should do, even if it would mean a better overall API.
  • OpenLayers should not remove support for fetching via various protocols, parsing via various formats, and choosing what to do with that data via various strategies: I have seen an argument that the more recent work with formats, strategies, and protocols is confusing to users, and should simply be removed in favor of letting that be handled at the application level. Most of the APIs OpenLayers is being compared to do not have this kind of support. This is another thing I strongly disagree with.
  • OpenLayers should not remove support for dealing with data in projections other than Spherical Mercator. Many other libraries simplify user experience by picking and sticking with a single projection; I think that is impractical for OpenLayers.

Now, I think it is completely true that it would be possible to rip out 80% of the functionality of OpenLayers, create a smaller, easier to maintain library, and still hit a use case that solves interesting problems. I will agree that OpenLayers is a large codebase — after all, we have more than 200 *thousand* lines of code, compared to just under 20,000 in Polymaps, 60,000 in Mapstraction, and 9,000 in Leaflet. This is the ‘big daddy’ of mapping frameworks — no one is denying that — and it has a lot of technical debt of the kind any open source project builds up over 6 years of development without a rewrite.

Some of that code is cruft, and should be removed. No one is denying that. In fact, there’s a fair amount of already-deprecated code; controls that have been deprecated or non-default for more than 4 years are still part of the main OpenLayers trunk release. However, I’d put the amount that is cruft at closer to 20% than 80%: the much bigger portion of the OpenLayers code is the broad support for the many different ways of interacting with remote data. In our formats alone — things which are generally designed to do only one thing, read and write from an external string to an OpenLayers resource — we have 69 files with 20,000 lines of code. That’s right, our format parsing — for everything from OSM to GeoJSON, GeoRSS to ArcXML to CQL to KML — is larger than several other entire mapping libraries.

Eric followed up by pointing out that of course he didn’t mean to suggest OpenLayers shouldn’t do everything, but that OpenLayers should have an easier API to hit simple use cases — something which is better documented, easier to use, and less confusing to users. I find it a bit amusing that this is an argument someone feels they need to make to *me*, since I feel like I’ve been pushing that argument for years now within the project. 🙂 However, since it may not have been clear, let me clarify:

The OpenLayers API is difficult to get started with for many simple problems. It can be hard to use, confusing to start, and difficult to understand when solving simple problems. It pushes details that very few users care about into the face of users who don’t know what to do with them. It is crucially important to the future of the project to make the easy things easy, while maintaining the ability to make the hard things possible.

Some areas where this is most apparent: difficulty in working with projected coordinates for the purposes of interacting with OSM or other spherical mercator layers. Difficulty in setting up a simple request to download a file from a server and render it — even *I* can’t configure that from memory, and I’ve done it dozens of times. These are real problems that users run into all the time and simplifying them would be a huge step forward in usability for solving the simple problems. (Note that I’m saying this in a blog post rather than in code, which also shows that I don’t think this is a trivial problem: the reason these things are not done is in part because coming up with a solid way to do these things in a helpful way is not trivial.)

I just want to clarify that there will always be some things OpenLayers shouldn’t do. Some of the decisions we’ve made have increased our overall technical debt: Maintaining support for IE, even for relatively advanced features like rotated client-side graphics in VML, was a cost that we could have saved if we chose a narrower supported platform range. However, I think that some of these decisions are important: a key component of OpenLayers is its broad support for loading data of any kind, in a wide variety of browsers. That was the core idea when OpenLayers was started, and I think that it is an extremely important part of our legacy to maintain.

Some of the competing mapping frameworks target a narrower use case. As a result, they can concentrate their developer time on better examples and documentation for a subset of functionality that OpenLayers has. This is great for the users of those tools, and will likely make those tools more attractive, short term, to some users. Competition is good: it encourages innovation, and pushes the limits, especially when the competition is open source and can collaborate across teams. If another tool is better for users than OpenLayers, it is in their best interest to use it. In the end, I hope that OpenLayers can continue to expand the set of users for which it is the best tool, and there is a lot of work that can be done there — and I look forward to continuing to see competition and collaboration between the various mapping projects out there to maximize user success.

More On Flaws in OpenLayers

Posted in default on May 30th, 2011 at 17:00:04

Tom MacWright has responded to my previous post about OpenLayers; in the spirit of continuing the conversation, I’ll respond in kind. (Note that this conversation feels disjointed, which is unfortunate, but I’m not sure of a better way to continue it; oh for the days of Usenet, where I feel like it could have been so much more effective.)

First, Tom points out that I compared to the wrong mapping libraries. That’s unfortunate, I guess, but I was going entirely from the list that was presented; I don’t have any beef in the argument. Since I’m an OpenLayers user, I don’t end up doing a lot of research on new hotness, so I’ll admit I had a flaw there; I apologize. (In fact, I couldn’t even *find* Leaflet until it was pointed out to me; clear evidence of my flawed knowledge in this regard.) It was suggested that any comparison should include Leaflet, OpenLayers, and Modest Maps.

I reject that statement as simply incorrect. Leaflet I’ll gladly accept into the ring — I’m especially happy to see that CloudMade has open-sourced their new efforts, which opens the possibility for collaboration and contribution across projects, a great step forward. However, Modest Maps is not a JavaScript web mapping library — it’s based on Flash and requires developing with the Flash toolchain. Modest Maps may be an interesting choice for a small percentage of the OpenLayers user base, and I’m content for users whose needs align with it to use that library, but it doesn’t offer a relevant comparison for the vast majority of OpenLayers usage. While OpenLayers can certainly learn from Modest Maps, it’s not an interesting comparison for users of the software as an API.

Some questions which are asked:

Where does control of the map come from? The baselayer? The map object? Both?

Ah, a softball! The layer configuration for resolutions, projections, and all other related information should come directly from the current baseLayer of the map. The information assigned to the map may occasionally offer a shortcut for that configuration, but it should never be necessary to have that configuration on the map. (Any case where that is untrue — and I’m sure there are some! — is likely a bug.) For other aspects of the overall configuration, the map may be a home, but in almost all cases, configuration options on the map are just a shortcut (and likely a confusing one, because it compounds the possible configuration issues). Projection information, in particular, comes from the layer; as a courtesy, we do our best to copy configuration from the map to the layers to make things easier (since most maps have the same properties on all layers).

OpenLayers has its own events system … which is different than the $.proxy and other apis in jQuery or anything else (or in pure Javascript) that people are used to.

Well, in actuality, the OpenLayers event system is based almost entirely on the Prototype system. As we have moved on we have incorporated shortcuts very similar to the ExtJS event registration system. But really, we’re talking about one or two functions here, which are consistent throughout the OpenLayers library: on anything with an ‘events’ object, you can call “foo.events.register(‘eventname’, null, function(evt) {});”. So while it may not be particularly familiar — since it’s based on a less commonly used syntax, perhaps — it’s not like this is the most complex part of the library. I’ll also point out that we’re talking about a library which is older than jQuery, and older than ExtJS — it’s hardly a sin to not follow conventions that were invented years after you started your project 🙂
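
To make the registration pattern concrete, here is a minimal stand-in for the style of events object described above. The real OpenLayers.Events class has considerably more machinery (scopes, event extensions, and so on); this sketch only shows the one or two functions in question:

```javascript
// Minimal stand-in for an OpenLayers-style 'events' object, showing
// the register/trigger pattern; not the actual OpenLayers.Events class.
function Events() {
  this.listeners = {};
}
Events.prototype.register = function (type, scope, callback) {
  (this.listeners[type] = this.listeners[type] || [])
    .push({scope: scope, callback: callback});
};
Events.prototype.triggerEvent = function (type, evt) {
  (this.listeners[type] || []).forEach(function (l) {
    l.callback.call(l.scope || evt, evt);
  });
};

// Anything with an 'events' object supports the same pattern:
var foo = {events: new Events()};
var moved = [];
foo.events.register("moveend", null, function (evt) {
  moved.push(evt.center);
});
foo.events.triggerEvent("moveend", {center: [-71.1, 42.4]});
console.log(moved.length); // 1
```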

OpenLayers has its own HTML/styling system, that’s just built-out enough to be terribly limiting to anyone who wants controls to look nice, and uses anywhere from 40% to 60% hardcoded styles.

A mistake (largely created by our early development) that we have been slowly paying back ever since we got some external developers. Tim Schaub has been pushing the use of CSS and stylesheets for new controls since 2.4. We added CSS-stylable zoom and pan panels in OpenLayers 2.7, almost 3 years ago. While there are still a limited number of controls that use non-CSS based styling — specifically, the Control.PanZoomBar is probably the big one that will bite people, along with the LayerSwitcher — I hardly think these flaws are the biggest problem with the library.

Now, the LayerSwitcher is a fair point: it’s *very* difficult to style, and it’s not a particularly attractive display… but the reason we haven’t invested in it is precisely because it’s not in the spirit of OpenLayers to solve this problem for users. As an API — not a UI development tool — OpenLayers isn’t seeking to solve all the problems of user interaction. Instead, we have left things like a cleaner layer-switching interface to third parties like GeoExt — with reasonable success. We’ve seen people create alternative layerswitchers for mobile applications, as well as other similar alternatives, ranging all the way down to a simple dropdown. For complex usage, these may be insufficient — but that’s where libraries like GeoExt pick up the slack. In fact, Tom makes my argument for me: “See Modest Maps’s controls: there aren’t any, because designers will make prettier ones.” So it’s okay to have *no* controls, but not okay to have *unattractive* controls… that argument seems a bit flawed.

In this vein, there is a followup later in the post:

The configuration of the theme requires a line of straight-imperative code that sets a global variable. What if you want a different imgpath for different maps? Shouldn’t this be a per-map setting?

For panels — that is, for controls developed in the past 3 years, rather than 4 or 5 years ago — styling is entirely via CSS. This includes the Editing toolbars, it includes the panpanel and zoompanel controls, etc. Since these controls are via CSS, you can use different configurations per map, simply by using CSS rules.

Comparing to ModestMaps — which Tom describes as having no controls at all — and Leaflet — which has only a Zoom control — OpenLayers doesn’t appear to be lacking in the ‘easily customizable controls’ department 🙂

Now, Tom describes this as part of a systemic problem: “an assumption that you already know the scope of the library and the parts that you can customize.” This is a criticism that I am totally willing to grant — and feel falls directly under the heading of poor documentation, especially prose documentation, which I feel is a significant lack in OpenLayers for most users.

The growth of OpenLayers is addressed by code compression, not by changing the code, ever. … the bloat of OpenLayers and its ever-more-complex API is the core problem that shows no signs of changing.

This is completely untrue. The OpenLayers project has made significant efforts in three different areas to reduce code size: compression (via better tools like the Closure compiler), better build tools (which simplified creating build profiles), and reducing interdependencies in code to limit build size. In fact, a significant portion of the most recent release was an attempt to minimize builds — simplifying default controls, pulling the portions that people use most out of the larger components so they could be built separately, and creating alternative build profiles to help users get started with building smaller OpenLayers builds for their applications.

In addition to making the OpenLayers build itself smaller, we have concentrated a large amount of our recent work on improving performance (simplifying operations while dragging to improve responsiveness) and limiting download size (using more sensible defaults, providing more example build profiles, and doing things like shrinking our default images to save space there). Overall, a simplified OSM map designed for a mobile interface, using our documented example profiles, will save over a megabyte of space, due to improved defaults and documentation.

In addition, the description of the OpenLayers API as ‘ever more complex’ feels confusing to me. An OpenLayers 2.0 application works just as well with OpenLayers 2.11 — so the core API can’t have changed much. There is certainly new functionality in the API, but looking at other APIs — like the leaflet quick start example — I just don’t see that much difference in APIs between OpenLayers and its competition.

While Polymaps and Modest Maps are great drawing frameworks, OpenLayers is an abysmal one. Try to get individual tiles from a drawn map? No way. Interact with anything on the map like a DOM element? Nope. “Unbind” the map from the HTML element it “owns”?

This would be what I would consider contrary to the design of OpenLayers, yes 🙂 I can’t think of *any* case where someone has come to the mailing list and asked a question which would have been more easily answered if these things were easier — and I read a lot of mailing list posts.

New-style Javascript isn’t there in OpenLayers.

A perfectly reasonable criticism — OpenLayers is a child of the era in which it was born, and if this is important to you, then I expect other tools will be more useful to you. That said, OpenLayers isn’t alone in this regard — Google, for example, follows much the same style of code, so far as I can tell, so at least we’re in good company. (Note that leaflet‘s quick start reads very much like an OpenLayers example, so I’m not sure I see where this comparison is coming from.)

Recommending GeoExt/MapQuery is a good idea for people building RIAs, but it’s the opposite of that for most others. What people want (people, as in, me, and people building things on mapping frameworks) is a lower-level, simpler API, rather than a higher one.

I think that this is simply untrue, based on the types of questions I regularly see. I think that OpenLayers provides a reasonable API for most tasks at the low level — many of the complaints are indeed only about the higher level tasks (like the layerswitcher styling). Given the rapid growth of usage for GeoExt, I see a lot of basis in the argument that OpenLayers was insufficiently providing for that niche — while I don’t see widely used reasonable competition to OpenLayers as a low level API.

The ability to make slim builds of OpenLayers (something we’ve also done, with openlayers_slim[5]) hasn’t been trumpeted nearly to the level that it should have been.

I don’t really know what more we can do. It’s the first response to every question we’ve ever had on speeding up your application, it’s in our deploying documentation, it’s been the first performance increasing technique mentioned every time we’ve talked about OpenLayers.

Short of not providing a pre-built OpenLayers at all, I don’t know what more we can do to help developers understand how important this is.

I’m positive that OpenLayers is not the best tool for all problems. But I think it’s actually a pretty good tool for a lot of problems. And one of the key things that it does — that nothing else does — is provide support for layers that you don’t *get* anywhere else. OpenLayers is designed to make it possible to use Bing, Google, and OSM in the same map — and as far as I can tell, nothing else does that. I still don’t see a library which offers support for parsing OSM, KML, GeoJSON, and GML in Javascript — and in my experience 50%+ of OpenLayers users end up using either the format parsing code or the commercial layer code.

For the few users who will never need to use anything other than OSM and some markers drawn in Javascript — great. But comparing OpenLayers to these other frameworks for anything other than dropping pre-drawn tiles on a map is a non sequitur for real usage, no matter what the flaws of OpenLayers are.

Perceived Flaws of OpenLayers

Posted in default on May 29th, 2011 at 12:28:21

So, apparently at WhereCampEU this weekend, there was some discussion of “Why people use web mapping libraries other than #openlayers”. (I can’t find any other references to this in Googling; someone who was there may be able to provide more context which might alter the content of this post.)

There are a number of good points about OpenLayers identified in the whiteboard snapshot; the ones listed there that I can read are:

  • Documentation
  • Default Look
  • Viability
  • Examples are not Good
  • No explanation of the general architecture

(Throughout this post, I will be comparing OpenLayers to other mapping tools listed on the whiteboard snapshot; with the exception of “Leaflet”, which I can’t seem to find by googling things like “Leaflet” and “Leaflet web mapping”. Help is gladly accepted.)

Some of these things I would be interested in understanding more deeply — and some I’d like to take on trying to address.

Documentation and Examples Are Not Good

I think a large difference of opinion exists between different people in the OpenLayers using and developing community about how documentation should be managed. In general, in OpenLayers, we have taken the approach of saying that the best form of documentation is via examples that clearly demonstrate how to use a given set of functionality. (Note that in this case I say ‘we’, but I won’t claim this is a considered decision on behalf of the project. Instead, it is one that I believe fell into place in large part because of how we started documenting the project, and has not drastically changed.)

In OpenLayers, the examples are the first line of documentation. Always check the examples first; not doing so is doing yourself a disservice. This is not to say the examples are perfect; they are far from it. Especially as the project has grown, the examples are not an ideal way to understand all of the functionality of OpenLayers; as the library has grown, the complexity has grown correspondingly. However, I have often been told “Well, I looked at the API docs first and they weren’t clear about how I was supposed to use a specific feature” — and I think that particular approach is one that is doomed to fail with the documentation currently available in OpenLayers. The best advice I can give to beginners starting with OpenLayers is: start with the examples. Search for the thing you’re looking for — and if you can’t find what you’re looking for with the search term you’re using, tell the OpenLayers users list.

Now, as I’ve said, the examples haven’t kept up. OpenLayers has grown tremendously, and although we are relatively diligent about writing API documentation, prose documentation has never been high on our list, and prose documentation outside of the examples probably never will be. I think that this is a problem that really needs a solid champion outside of the current developer pool if any progress is going to be made; no one on the current development team is likely to be strongly in favor of making sweeping changes, simply because the choices are to continue to add features — things like improved performance, better mobile support, improved functionality for new features like geolocation APIs — or to work on documentation, and the current developers will tend towards the former.

That said, I think it’s fair to compare OpenLayers to some of the other mapping APIs listed on this point.

  • Google Maps: Blows OpenLayers out of the water on API documentation. Google has done an excellent job of integrating reference documentation with prose documentation, and has literally hundreds of pages of API docs for their mapping APIs. Their examples are solid — though comparing this page to the OpenLayers equivalent seems to have a certain something lacking. There is, however, no doubt that Google has done great work on documenting their API, and that there are many simple problems for which Google is a better solution than OpenLayers for some users simply because it’s better documented.
  • Polymaps: Again, a relatively well done set of documentation. Their API docs are much more descriptive than the equivalent in OpenLayers, and their examples like this one provide a simple, easy to read overview, example, and code in one page, and the example overview is quite nice, with screenshots. I think that this is well in line with what you would expect from Stamen. I will say that their API docs do lack information about what properties you are actually supposed to pass into functions; the map documentation, for example, says that “map.center([x]) sets or gets the location of the map center. If the argument x is specified, it sets the new map center”… but doesn’t explain what x is supposed to be. So long as functions have clear examples, this probably isn’t a problem; it’s just something that I could see being a problem with more complex functions that don’t take obvious arguments.
  • Web Maps Lite (I’m assuming this is the Cloudmade offering): This API offers a pretty reasonable set of documentation; their Overview is slightly more comprehensive than the OpenLayers equivalent, which uses Naturaldocs. I would describe most of the brief API method descriptions as being similar in quality to those in OpenLayers; no obvious winner or loser there. The examples seem clear and well done for the few that exist; not particularly comprehensive or entirely obvious, but probably a win for most simple use cases over OpenLayers.
  • Mapstraction: Documentation was a bit hard to find; I didn’t even see it linked from the homepage, but Googling turned up javadoc-like content; limited, but since Mapstraction is a pretty simple library, probably not insufficient. The examples are… brief, but workable, looking at the mapstraction sandbox. I’m not a huge fan of the sandbox form of examples personally, but for people who are, this probably is a perfectly workable set of docs — but I don’t think there’s an obvious win over OpenLayers here.
  • Tile5: The Tile5 API documentation seems, to me, to be less informative and more confusing than OpenLayers. It seems hard to navigate, and even once you’re in it, it’s hard to understand exactly what you’re meant to do with things like T5.Marker objects. The ‘examples’/tutorials section has two sections, approximately equivalent to the OpenLayers introduction, but nothing further; it’s not clear that these examples offer any sort of view of a more comprehensive idea of how to start using the library. I may be missing something, but I feel like this is actually a pretty major lack in this particular project — and given that OpenLayers is coming from deep in the hole on this problem, that’s saying something 🙂

Overall, reviewing the various APIs shows one thing to me: I believe that the integrated API reference and prose documentation of Google Maps works best, and that the tools for generating Javascript documentation from code appear to have improved greatly since OpenLayers started. However, no project has an obvious win in all categories; the different approaches generally leave something to be desired. OpenLayers is probably one of the worst in this regard — with the exception that it does appear to have the broadest example support, providing at least limited documentation for a number of ready-built use cases. However, in prose documentation, most APIs do better, and in API docs, Naturaldocs seems like it may simply not be cutting it for many users; switching to something else may be a viable way to help increase developer engagement with the documentation.

General Architecture

Unfortunately, without being there, it’s hard for me to know what people want here; I’ve heard requests for a general architecture description before, but when I try to give them, people usually end up saying something like “No, that’s not what I meant, I wanted something… else.” I’d be interested to hear thoughts on what this means.

In general, the architecture of OpenLayers is: You pass an HTML element to OpenLayers, at which point it owns that element. You inform OpenLayers of a set of x/y coordinates you would like to render data into, and it takes layers of content and allows you to lay down multiple layers in whatever set of x/y coordinates you defined (via your base layer). In general, OpenLayers tries to provide comprehensive tools for interacting with the map — using browser gestures, clicks, etc. — with an API to implement more advanced functionality beyond that. Where possible, OpenLayers makes it as easy as practical to interface with a wide variety of data sources, and to interact with the map in ways that load more data.
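
The core coordinate bookkeeping that architecture implies can be sketched as a small worked example: given the resolution defined by the base layer, map coordinates translate to pixel positions inside the element OpenLayers owns. This is illustrative only; the function and field names here are mine, not OpenLayers internals:

```javascript
// Sketch of map-units-to-pixels conversion: 'resolution' is map units
// per pixel at the current zoom, as defined by the base layer.
// (Illustrative names, not the actual OpenLayers API.)
function toPixel(lonlat, viewCenter, resolution, size) {
  return {
    x: size.w / 2 + (lonlat.lon - viewCenter.lon) / resolution,
    y: size.h / 2 - (lonlat.lat - viewCenter.lat) / resolution
  };
}

var px = toPixel(
  {lon: 10, lat: 10},  // point to draw
  {lon: 0, lat: 0},    // current map center
  0.1,                 // 0.1 map units per pixel
  {w: 400, h: 300}     // size of the owned element
);
console.log(px); // { x: 300, y: 50 }
```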

Beyond that, OpenLayers isn’t supposed to be much of anything; it’s an API and a tool for rendering map data, and little else. (Sometimes the problem is simply overly high expectations for OpenLayers because people have used it to do a lot of complex things!)

In searching out the various map APIs described above, I didn’t see anything obviously filling this role, which probably means that I don’t know what I’m looking for — but I’d be happy to learn, and explore how to solve this problem, because it doesn’t really sound that complex to solve.

Default Look and Feel

This one is interesting. In general, OpenLayers doesn’t *have* much of a look and feel; in fact, the default look and feel is generally limited to 7 icons which are easily replaceable by CSS. (One of the things we even have docs for!) That said, I’m willing to go head to head against most of the other APIs here for that issue:

  • Google: Google has smaller controls than OpenLayers; these may be more attractive to some people. I’m willing to accept that some people prefer Google’s style to OpenLayers standard control styling.
  • Tile5: When I look at Tile5’s example at simple map marker, the only ‘default style’ I see is a checkered grey and black background (easily implemented via CSS in any webpage).
  • Polymaps: Seems to have a default +/- icon, but no others; the +/- look very similar to what we have in our mobile example, styled using entirely easily-modifiable CSS.
  • Mapstraction: This uses the native control style for everything, it looks like, and turns controls off by default; since OpenLayers can have its controls turned off with one map option, this feels like they are the same in this regard.
  • Cloudmade: basic example seems to have no default look at all; again, no difference.

Overall, it seems there is a preference for… no controls. Perhaps we should simply make that a better-documented OpenLayers mode of operation, and then people could complain about how their map didn’t have any default styling 🙂

Viability

Edit: After reading Volkler‘s post on the topic, I realized that this point was *Usability*, not Viability. I’ll expand on that after this section, but I think that the Viability section here is an important concern, if you ignore the off-base ranty nature of it: OpenLayers is an extremely broadly used and widely supported tool, so using any other open source solution is risking joining a smaller community, which may have its own dangers.

This is the one I actually have the strongest feelings about. I think that if you are going to talk about problems with OpenLayers, viability is the criticism with the least of a leg to stand on, at least compared to any API other than a commercially supported (non-free) one.

OpenLayers has:

  • 2000 members of its users list, with dozens of messages a day.
  • Code contributions from more than 40 users
  • New releases multiple times a year, closing hundreds of bug reports
  • A steadily growing list of features and enhancements

OpenLayers is the de facto web mapping library used by almost every open source geo related project. OpenLayers has a broad community, with broad support. There are two books introducing OpenLayers, there is an active community around OpenLayers, and earlier this year, more than a dozen organizations came together to sponsor an OpenLayers code sprint, funding 18 developers to stay in Switzerland for a week and work solely on supporting OpenLayers.

OpenLayers is a huge success story. The biggest evidence of that, in my mind, is simply the fact that no one talks about using OpenLayers anymore — they just do it. OpenLayers has become the de facto web client at FOSS4G; OpenLayers is used by tens of thousands of people every day around the world, from OpenStreetMap to the US Government, to the Portland public transit system, to consulting companies selling services in Australia.

I’m sure that there are many people who would claim that other APIs win on various aspects — and they’d be right. Documentation is a pain. Since OpenLayers caters to thousands of different use cases, it’s not as trivial to get started with. The library is complex, and there are many features that are confusing and complex to get started with — but overall, OpenLayers is a reasonably usable library for doing a relatively complex task, and it does that in a way that I don’t think any of the open source competitors come close to matching.


Someone questioning the viability of OpenLayers makes me wonder what they mean — because the project continues to make regular, large strides in new developments and features, and I can’t see how anyone could look carefully at the project in comparison to its competitors and think that it fails on that merit.

I’d be glad to hear about this point, or any others, from people who were at the WhereCampEU session — perhaps I’m misunderstanding the complaint, for example — but I would say to almost everyone looking to create a web map: OpenLayers is a strong starting point, if you’re a competent developer, upon which it is possible to build great applications.

Usability

It appears that “usability” here is a description of the API as it applies to developer usability. I think that this is an area where the main problem is “OpenLayers is a broadly used library supporting dozens, or hundreds, of different use cases, and is not a narrowly designed tool designed to solve a specific problem.” I’ve theorized for a long time that a lot of people (especially people who are likely to attend WhereCamp events) would be much happier with OpenLayers if I were to write a simple “OSM.Map” wrapper around it that picked sensible defaults for users of OSM maps.
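
That hypothetical “OSM.Map” wrapper could be as thin as a constructor that fills in sensible OSM defaults before handing off to OpenLayers. The sketch below stubs out OpenLayers itself and only shows the defaults-merging idea; the option names are real OpenLayers-era OSM conventions, but the wrapper is entirely hypothetical:

```javascript
// Hypothetical OSM.Map wrapper: merge sensible OSM defaults with
// caller options, letting the caller win on conflicts. OpenLayers is
// stubbed out; in reality this would end with new OpenLayers.Map(...).
var OSM = {};
OSM.Map = function (element, options) {
  var defaults = {
    projection: "EPSG:900913", // spherical mercator, as OSM tiles use
    numZoomLevels: 19,
    units: "m"
  };
  var merged = {};
  [defaults, options || {}].forEach(function (src) {
    Object.keys(src).forEach(function (k) { merged[k] = src[k]; });
  });
  // Real version: return new OpenLayers.Map(element, merged) with an
  // OSM base layer added automatically.
  return {element: element, options: merged};
};

var m = OSM.Map("map", {numZoomLevels: 17});
console.log(m.options.projection);    // "EPSG:900913" (default kept)
console.log(m.options.numZoomLevels); // 17 (caller override wins)
```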

The biggest thing that makes usability hard is targeting too many users. OpenLayers is trying to solve many hard problems — from integrating with OSM and parsing OSM data, to integrating with OGC catalog services, to interoperating with WFS. The most ‘sensible defaults’ for one application are not the best ones for another, and with weak documentation, usability suffers.

Some ways to fix this include better docs, writing wrappers for specific purposes to simplify defaults for some users, etc. But overall, because OpenLayers is a library which supports many different use cases, it’s somewhat harder to get started with for all use cases. This is unfortunate, but not unexpected; I think it’s a necessary evil of having an API which targets such broad use cases.

Overall, many of the complaints about OpenLayers are reasonable: our documentation is weak, our default styling apparently doesn’t appeal to some people (I still like it, but I’m apparently in the minority), and it’s harder to get started than with some other tools. However, we’ve seen continued transitions of people from other tools to OpenLayers as they encounter harder problems — because as your problems get harder, OpenLayers becomes a more attractive solution to prevent you from having to reinvent the wheel.

Working with Place APIs (aka “How I spent my Spring Vacation”)

Posted in default on May 28th, 2011 at 11:10:55

So, this week, I have learned a couple things:

  1. Nobody in this house does the dishes but me.
  2. My 10 year old daughter loves playing with the younger kids in the sandbox at the park… and the other parents seem to love her for it most of the time.
  3. There are at least 7 different major POI data sources, and the type and quality of data varies from “enh” to unusable — but once you get them working, there is actually some interesting data contained in them.

I’m going to do a bit of an exploration of the various place APIs, how difficult they were to get started with, and some thoughts on when you might use each of them. The order listed here is the order in which I implemented them for a new toy that I’m playing with. In each of these things, my goal is to provide place listings in response to search queries from users. The way I am doing this is to take each place API search, and wrap up the title, lat, lon, and URL into a GeoJSON object.
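
The normalization step described above can be sketched directly. Each provider’s result is reduced to a GeoJSON Feature carrying title, lat, lon, and URL; the input field names (`title`, `url`, etc.) are my own choice for this sketch, not part of any provider’s API:

```javascript
// Wrap one normalized place record into a GeoJSON Feature.
function placeToFeature(place) {
  return {
    type: "Feature",
    geometry: {
      type: "Point",
      coordinates: [place.lon, place.lat] // GeoJSON order is [lon, lat]
    },
    properties: {
      title: place.title,
      url: place.url
    }
  };
}

var feature = placeToFeature({
  title: "Diesel Cafe",            // hypothetical example venue
  lat: 42.3966,
  lon: -71.1224,
  url: "http://example.com/venue/123"
});
console.log(feature.geometry.coordinates); // [ -71.1224, 42.3966 ]
```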

FourSquare

The FourSquare Venue API is actually surprisingly simple to get started with. Although for user access, FourSquare depends on OAuth — a bit weird given that I’m required to type my username and password in directly on my phone for the official app — for venue search, you’re not required to go the full OAuth route, and can instead use the Venues Project, which makes searching for Venues easy. In addition, the documentation on the FourSquare venues project makes it clear that using the API is pretty open; you can do most of what you want to do (except copying Foursquare places wholesale into your own database), including copying and storing IDs, displaying information, etc.

The documentation and examples with the FourSquare API were easy to get started with. The process for getting a key was well linked from the docs, so getting there was easy. There was no need for complex OAuth loops, and the licensing is clear and obvious. Overall, the FourSquare Venues API made getting started with place search easy.
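
For flavor, a venue search request is little more than a query string against the v2 endpoint. The parameter set here (client_id/client_secret, ll, query) reflects my reading of the docs at the time and may differ from the current API; treat it as a sketch rather than a reference:

```javascript
// Build a Foursquare v2 venue-search URL. Parameter names are from my
// reading of the docs at the time; verify against current documentation.
function venueSearchUrl(clientId, clientSecret, lat, lon, query) {
  var params = {
    client_id: clientId,
    client_secret: clientSecret,
    ll: lat + "," + lon, // "lat,lon" pair
    query: query
  };
  var qs = Object.keys(params).map(function (k) {
    return k + "=" + encodeURIComponent(params[k]);
  }).join("&");
  return "https://api.foursquare.com/v2/venues/search?" + qs;
}

var url = venueSearchUrl("MY_ID", "MY_SECRET", 42.37, -71.11, "coffee");
console.log(url);
```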

FourSquare Data

The FourSquare data, at least for the areas that I’m currently looking at, are pretty great. I have been able to find results for things that don’t exist in any other data using FourSquare. The primary problem I have with FourSquare is that it is somewhat too generous in attempting to find results — appropriate for searching on a mobile phone, but less for a web interface. (A search for “Cambridge, MA” from Europe will give you many results for ‘ma’, but nothing actually relevant to the town.) Excepting this — which applies to almost all of the APIs in this post — FourSquare provides an easy to use API with a large quantity of interesting data in the areas where I’ve looked at it so far.

Facebook

Facebook is all noise. While I appreciate the social aspects of what Facebook place results provide — they have the benefit that you can have all of the aspects of Facebook linked into them — Facebook places coverage is limited, and what little coverage there is tends to be absolutely filled with duplicates. (A local hospital has 4 duplicates in 3 surrounding towns, even though it only has one location.) The API is relatively easy to use — just fetch a temporary token, and you can then make additional calls back to the API — and Facebook provides a familiar interface for Facebook users, but overall, the quantity and quality of data tends to be low in a way that makes it difficult to use for any serious purpose. Additionally, using the API was made slightly more difficult because getting a human-readable link required an additional call to fetch more details about a place, which almost all of the other APIs did not require. However, it was possible to bunch all these requests in one ?ids= request, so it wasn’t particularly painful. Total number of HTTP calls to get the data I needed: 3.

Google Places

Google Places is a weird entry. First of all, unlike every other API, Google does not give you a link to the place, nor does it have a way of getting details about a batch of places. In order to get a link, you must make a place details call for each place you are interested in. This makes Google one of the slower APIs to get the information I needed for my UI.

Google has a relatively high level of data quality. Although it uses a narrower definition of places than something like FourSquare, it is much broader than something like CitySearch. It also has a smaller problem with duplicates and other data issues than many other providers with user-submitted data tend to have.

Overall, Google has high quality data with low noise, for their definition of places. However, for maximum coverage — to support services like check-in — other options like FourSquare have Google beat, at least in the urban areas I’m looking at.

CitySearch

The CitySearch API is interesting; it is a deeper/richer set of data than many other place providers — because it appears to be centered around a strongly curated set of results. Many places have reviews, and additional details, that can be gathered through the APIs, and due to strong categorization efforts, it is a good choice to use to find things like cuisines, which the full text search also indexes.

From the point of view of how I’m using it, the results are reasonable, though it is sometimes not obvious exactly how things are being matched due to the depth of data being searched. Additionally, the coverage is centered around restaurants and the like; because adding places is not something that users are expected to do (the API is centered around having businesses submit, rather than users), the coverage for things like public transit stations, or other things that users would consider important, is lower than competing APIs.

Additionally, the terms of service make it clear that using the API is really designed around creating your own CitySearch clone; they describe how you have to create links to CitySearch, how many ads you have to display on your pages, and the like. From reading the terms, it’s not even clear if it’s allowed to display CitySearch content without also displaying at least 2 ads on the page — see section 3.5 of Usage Requirements. Overall, though there is a great depth of metadata — reviews, attributes, etc. — for places in the CitySearch API, from the perspective of someone looking for a comprehensive POI source, this is probably not the best place to start.

Yelp

Yelp falls under pretty much the same description as CitySearch, minus the weird/confusing usage requirements, and with user-submitted content added. Because the site is review-driven, the search also covers the full text of reviews, which has a somewhat odd effect on the results you get. (Searching for “Green Line” will get you Green Line stops… but also places that are close to the Green Line, because people mention it in reviews.)

Yelp uses OAuth for their API, but there are solid examples of how to make calls in Python, which made getting started easy. Signing up was also easy.
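For reference, the OAuth flavor in use at the time was OAuth 1.0a, where each request carries an HMAC-SHA1 signature over the method, URL, and sorted parameters. A minimal sketch of that signing step, using only the standard library, might look like this (the URL and parameter names are illustrative, not Yelp’s actual endpoint; a real client also needs the oauth_* protocol parameters from the spec):

```python
import base64
import hashlib
import hmac
import urllib.parse

def sign_request(url, params, consumer_secret, token_secret):
    """Build an OAuth 1.0a HMAC-SHA1 signature for a GET request (sketch)."""
    # 1. Sort the parameters and percent-encode them into a single query string.
    encoded = urllib.parse.urlencode(sorted(params.items()),
                                     quote_via=urllib.parse.quote)
    # 2. Build the signature base string: METHOD&URL&PARAMS, each part encoded.
    base = "&".join(urllib.parse.quote(p, safe="")
                    for p in ("GET", url, encoded))
    # 3. The signing key is consumer_secret&token_secret.
    key = f"{consumer_secret}&{token_secret}".encode()
    digest = hmac.new(key, base.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()
```

The signature then travels in the Authorization header or as an oauth_signature parameter; a library hides all of this, which is why working examples matter so much for getting started.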

The Yelp API has more user-generated content — which means, as usual, more coverage and more noise. It is also an API with a lot of depth — reviews, etc. — if that’s the type of thing you’re interested in.

GeoNames

GeoNames is a perfectly reasonable dataset, but its POI coverage is low. For navigation-style queries (addresses, towns, etc.), GeoNames is fine; for POIs, you’ll probably need to look elsewhere. The API is not the best documented, getting set up with an account took longer than I wish it had, and the API is not particularly full-featured: there is no way to pass a location into the query, so results are always worldwide, with the results you’d expect. While developing, I also found that the API was down for about an hour one evening. All of this is in line with what I’d expect of a non-commercial service like this, but it makes using it in any kind of application somewhat less tenable.

OSM/Nominatim

OSM’s coverage leans primarily toward address-like things, as with GeoNames, but there is an interesting amount of POI data in it. The search API is relatively full-featured, and any object you find this way comes with a full edit history and additional data/attributes, available via the OSM APIs. This also makes it easy to correct the data when there are mistakes — something that doesn’t exist in any of the other APIs. Overall, however, POI coverage is spotty, and this is not an appropriate source for someone looking to build on top of a comprehensive place database.
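The search API in question is Nominatim, and it really is simple to drive: a GET request with q and format=json parameters, returning results that each carry a lat/lon pair plus the osm_type/osm_id that links back to the editable OSM object. A small sketch of building the query and pulling those fields out of a response body:

```python
import json
import urllib.parse

NOMINATIM = "https://nominatim.openstreetmap.org/search"

def nominatim_url(query, limit=5):
    """Build a Nominatim search URL; q, format, and limit are real parameters."""
    qs = urllib.parse.urlencode({"q": query, "format": "json", "limit": limit})
    return f"{NOMINATIM}?{qs}"

def parse_results(body):
    """Extract name, coordinates, and the OSM object reference from a JSON body."""
    return [
        {
            "name": r.get("display_name"),
            "lat": float(r["lat"]),
            "lon": float(r["lon"]),
            # osm_type + osm_id identify the object whose edit history (and
            # correctability) the post describes.
            "osm": (r.get("osm_type"), r.get("osm_id")),
        }
        for r in json.loads(body)
    ]
```

Fetching nominatim_url(...) with any HTTP client and feeding the body to parse_results gives you results you can follow straight back into the OSM editing tools.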

SimpleGeo

I almost didn’t bother with SimpleGeo, despite its use of GeoJSON (my preferred format) as a transport format, for two reasons: first, it didn’t offer any particularly exciting data; second, it required OAuth, but in a weird way, without examples. In general, the documentation around SimpleGeo *tried* really hard, but failed in too many important ways; without a working way to browse example requests, it was just too much effort to integrate. However, I did eventually find the SimpleGeo Python client (after a lot of looking around), which solved that particular problem. (In general, I don’t like the idea of depending on a specific API client to do things like ‘fetch JSON’, but this was easier than the alternative.)

SimpleGeo did offer the benefit of using GeoJSON natively, and for that reason it is the only API for which I have full data available.

When I did implement it, I found it lacking in about the way I’d expect. It looks like someone glommed together a bunch of free data with some user-entered data, with about the result you’d imagine. (For example, “Harvard Square Eye Care” appears twice at its Porter Square location… and not at all at the one in Harvard Square.)

Factual

After reading a bit about Factual, I actually got far enough to try their API — setting up a token, etc. — and found that getting set up is relatively easy. However, because their local data is split by country, there is no practical way to treat Factual as a worldwide POI data source via their APIs, which makes it impractical without considerably more work than most of these APIs require.

But don’t take my word for it!

Realizing that everyone has different needs, and even different aspects of a project have different needs, I didn’t just explore these APIs: I built something with them. The result is Anymaps.

Anymaps will let you search through the aggregated result list of all of the place APIs described in this post. So whether you’re looking for Four Burgers in Cambridge, or Pune Central, or even the Taj Mahal, you can see the results from all the various place APIs that are out there — and decide for yourself which one has the right data for you.

And that’s how I spent my Spring Vacation.

Finding distance from a point in Spain to the Sea

Posted in default on May 15th, 2011 at 08:44:05

So, a friend asked me if there was a list of distances-to-the-sea for worldwide cities. I wasn’t aware of one, but thought it should be relatively quick to build one — then I thought a bit more, actually did it, and found that instead of my ‘a while’ estimate, it took closer to 10 minutes. So I thought I’d write up what I did and get a critique on it.

This approach only works for the area of interest, Spain.

So, first I grabbed the world_borders file from http://www.mappinghacks.com/data/world_borders.zip — we’re looking for a gross estimate, so this is a reasonable dataset to use. I unzipped the data, and loaded it into postgres:

$ createdb world
$ psql -f $POSTGIS/postgis.sql world
# Crap, forgot the plpgsql
$ createlang plpgsql world
$ psql -f $POSTGIS/postgis.sql world
$ ogr2ogr -f PostgreSQL PG:dbname=world world_borders.shp

Then I assembled a geometry of France, Spain, and Portugal, using ST_Union:

select ST_Union(wkb_geometry) 
   FROM world_borders 
   WHERE cntry_name IN ('Spain', 'France', 'Portugal');

Once I did that, I grabbed the point for Madrid, and calculated the distance, after converting that shape to a boundary:

select ST_Distance_Sphere(ST_GeomFromText('POINT(-3.683333 40.4)'), 
    Boundary(ST_Union(wkb_geometry))) 
    FROM world_borders 
    WHERE cntry_name IN ('Spain', 'France', 'Portugal');

Which gave me the ballpark answer I was expecting, of 303.2km.
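As a rough cross-check on that number, a plain haversine computation from Madrid to a point I picked by eye on the coast near Valencia (a plausible nearest stretch of Mediterranean) lands in the same ballpark, which is reassuring given how coarse the borders dataset is:

```python
from math import asin, cos, radians, sin, sqrt

def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    lon1, lat1, lon2, lat2 = map(radians, (lon1, lat1, lon2, lat2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371 * asin(sqrt(a))

# Madrid to an eyeballed point on the Valencian coast (hypothetical coordinates).
print(haversine_km(-3.683333, 40.4, -0.32, 39.45))
```

This isn’t a substitute for the PostGIS query — it measures distance to one hand-picked coastal point, not to the nearest point on the whole boundary — but it’s a quick sanity check that the ~300 km scale is right.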

Dependencies: PostgreSQL, PostGIS, and OGR installed on your machine (easy on a Mac — just grab the KyngChaos libraries — and also easy on Debian derivatives).

“Who needs more than one finger?” — Android

Posted in default on February 26th, 2011 at 07:07:11

After Tim Schaub got pinch-zoom into trunk yesterday, with some excellent work and debugging from him and Bruno Binet, I was excited to see people able to use OpenLayers for easy navigation of the map on browsers with touch events. Turns out my excitement was misplaced: though the functionality works (and is super-slick!) on devices running iOS 2.0 and later, on the Android platform, it doesn’t.

Turns out, the reason is simple: Android browsers don’t implement multitouch at the Javascript level.

You can zoom an Android page by using two fingers to pinch, but those events are never delivered to the DOM as multiple touch events; instead, we simply get one touch, no matter how many fingers are interacting with the screen.

Using the Multitouch Test OpenLayers example, I’ve gotten reports from far and wide, from 2.1 all the way up to 2.3 (Gingerbread): the Android browser only supports one touch event. For comparison? iOS supports up to 5.

The OS supports these events: in fact, a multitouch-on-Android development article suggests behavior very similar to the browser events. (There is a slight difference: in the browser DOM, you don’t get different events for DOWN vs. POINTER_DOWN; both are a touchstart.) You have a series of move events, and each move has a list of touches attached to it — in iOS/Safari, these equate to the evt.touches array of Touch objects attached to the TouchEvents.

This documentation has been around for almost 3 years — the functionality existed in iOS before the T-Mobile G1 device even shipped (in October of 2008). The touch events in Android for single touches map pretty much exactly to Apple’s spec, but no multitouch support exists at the DOM level, and there are no clearly documented plans to add it.

I just hope that the plan on the Android side isn’t to wait until the webevents W3C group ships a Recommendation in 2012…