Technical Ramblings

Archive for the 'Software' Category

TileCache: Map Tile Caching

Posted in TileCache on November 12th, 2006 at 18:03:45

Last week, Schuyler and I wrote TileCache, a WMS-C server implementation. OpenLayers is already a WMS-C client, so combining the two gives you a super-fast loading slippy map.

Yesterday, we wandered around town doing various thought experiments as to how to create a distributed peer to peer tile caching network. TileCache supports an extensible Cache plugin backend: any type of storage can be used. We wrote a simple Disk based cache, and also included support for a memory-based cache, based on memcached. The former is good when you have lots of disk, and a slow remote service. The latter is good when you need an LRU cache of the most important areas, but can fall back to the source data for everything else: simply install memcached, get it running, and the MemoryCache will do you fine with the defaults.
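To make the plugin idea concrete, here is a minimal sketch of what a pluggable cache backend can look like. The class and function names (Cache, DiskCache, LRUMemoryCache, fetch_tile) are hypothetical and are not TileCache's actual API; the in-memory class is only an in-process stand-in for what memcached does across a network.

```python
import os
from collections import OrderedDict


class Cache:
    """Hypothetical minimal interface: a backend only needs get() and set()."""

    def get(self, key):
        raise NotImplementedError

    def set(self, key, data):
        raise NotImplementedError


class DiskCache(Cache):
    """Stores each tile as a file under a base directory."""

    def __init__(self, base):
        self.base = base

    def _path(self, key):
        # key is a (layer, z, x, y) tuple; one directory level per component
        layer, z, x, y = key
        return os.path.join(self.base, layer, str(z), str(x), f"{y}.png")

    def get(self, key):
        try:
            with open(self._path(key), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None

    def set(self, key, data):
        path = self._path(key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)


class LRUMemoryCache(Cache):
    """In-process stand-in for a memcached-style cache with LRU eviction."""

    def __init__(self, max_items):
        self.max_items = max_items
        self.items = OrderedDict()

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def set(self, key, data):
        self.items[key] = data
        self.items.move_to_end(key)
        if len(self.items) > self.max_items:
            self.items.popitem(last=False)  # drop the least recently used


def fetch_tile(cache, key, render):
    """Return the cached tile, or render it from the source and cache it."""
    data = cache.get(key)
    if data is None:
        data = render(key)
        cache.set(key, data)
    return data
```

The point of the split is that fetch_tile never needs to know which backend it is talking to, so a peer-to-peer backend could be dropped in later without touching the serving code.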

However, for things like Landsat, most people don’t have enough disk to cache even the hot areas if the plan is to run a service that is actually usable. Landsat’s source data runs into the terabytes, and even a just-in-time cache is probably on the scale of hundreds of gigabytes: more disk than most people have. (I’m biased here: because I work for MetaCarta Labs, I get access to some pretty sweet hardware. However, I try to target the things that we release at the more casual user, and for that I look at my hosted webserver, which has an 80GB drive that I have full use of.)

So, we need a cache plugin which distributes caches to peers in a network. Ideally, because peers go down, you want to distribute the cache to multiple peers. You want the number of peers to be able to scale, and you want the changing of the peer list to not result in a complete load redistribution. (Apparently a term for this is ‘consistent hashing’, as described in this paper.)
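A rough sketch of consistent hashing as I understand it, not code from TileCache: each peer is placed at many points on a ring, and a tile key is stored on the first few distinct peers found walking clockwise from the key's own position. The HashRing class, the peer names, and the choice of MD5 and 100 points per peer are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value):
    # Stable across processes, unlike the built-in hash()
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, peers=(), points_per_peer=100):
        self.points_per_peer = points_per_peer
        self._ring = []  # sorted list of (point, peer)
        for peer in peers:
            self.add(peer)

    def add(self, peer):
        # Many points per peer spread the load more evenly around the ring
        for i in range(self.points_per_peer):
            bisect.insort(self._ring, (_hash(f"{peer}#{i}"), peer))

    def remove(self, peer):
        self._ring = [(p, name) for p, name in self._ring if name != peer]

    def peers_for(self, key, replicas=2):
        """Return up to `replicas` distinct peers, walking clockwise from the key."""
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (_hash(key), ""))
        found = []
        for offset in range(len(self._ring)):
            peer = self._ring[(start + offset) % len(self._ring)][1]
            if peer not in found:
                found.append(peer)
                if len(found) == replicas:
                    break
        return found
```

With this layout, adding a peer only moves the keys that now land on the new peer's points, and removing it hands them back, which is the "no complete load redistribution" property described above.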

While discussing that, Schuyler mentioned Chris Holmes’s post on S3 for secondary storage of tiles. Although in some cases this might make sense, I’m not sure that it does in the case of caching open data under the umbrella of OSGeo, which is the large part of why I’m thinking about doing this. For small data sources — the NYC Freemap, or the Boston Freemap — I can do local caches without running out of disk. For larger data sources, most of the important ones — like TIGER, or Landsat — could presumably be hosted under some form of OSGeo umbrella. If that’s the case, then falling back to S3 doesn’t make sense, since Telascience has offered very large amounts of disk space — larger than could reasonably be paid for on a monthly basis via S3. If you take the several terabytes available there currently, and do the math, you find that you’re looking at a cost of hundreds of dollars a month just for storage, and that’s before you even start counting the bandwidth costs.
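For anyone who wants to redo that math, a small sketch. The two prices are assumptions on my part (roughly S3's published rates at the time, not figures taken from this post), so swap in the current numbers before trusting the output.

```python
# Assumed prices, not taken from the post: roughly S3's 2006 rates.
STORAGE_USD_PER_GB_MONTH = 0.15
TRANSFER_USD_PER_GB = 0.20


def monthly_cost(stored_tb, transferred_gb=0):
    """Storage plus transfer cost for one month, in US dollars."""
    stored_gb = stored_tb * 1000
    return stored_gb * STORAGE_USD_PER_GB_MONTH + transferred_gb * TRANSFER_USD_PER_GB


for tb in (1, 3, 5):
    print(f"{tb} TB stored: ${monthly_cost(tb):,.0f}/month before bandwidth")
```

Under those assumed rates, three terabytes comes to about $450 a month for storage alone.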

I don’t know the exact particulars of the S3 service interface. It seems like it’s likely to be a key-based data storage system. Given the resources made available for exploring high-bandwidth, high-use applications for open geospatial work, I think that it’s much more likely that creating an S3-like service on top of the Telascience resources would be approved by the SAC than paying thousands of dollars per year for S3 storage and bandwidth.

I haven’t been able to think of a situation where S3 would be the best way to muster the resources I need for a particular goal. Perhaps if you don’t have machines that you can put extra disks in, the S3 route makes the most sense. But I think that by the time you’re hosting datasets large enough to need something like S3, you’ve gone beyond cost effectiveness, given the resources available to most of the people participating in the Open Source Geospatial Foundation, both personally and via other services made available to projects under that umbrella.

Of course, adamhill from the WorldWind project has reported that a just-in-time cache of 8 or so layers for the entire world tops out at about a terabyte for them. So perhaps that entire discussion is moot: most services are not going to have a serious problem with caching all the data they need. When you need to scale to the level that Google does, you do need to investigate more serious services, but at that point you had better be making some serious cash, or you’re going to run into trouble sooner or later no matter what.

SLRN Help?

Posted in default, Software on July 25th, 2006 at 22:06:42

I can’t seem to find any places that I can get to that have many slrn users — I’m looking to figure out if it’s possible, in the article listing, to ask slrn to display already-deleted messages.

Anyone have any advice? Or tips/tricks on slrn in general?

Freenode Attack?

Posted in Software on June 24th, 2006 at 23:16:25

Seems like the Freenode IRC network may be under attack. Around 20 minutes ago, I got kicked off, and haven’t been able to get back in — manually connecting to different servers, it seems that about half of them are ‘Connection Refused’, half are not responding at all, and the one I could connect to is telling me that it’s all-full-up. Looking back in my status window, I see:

23:44:18 [freenode] -ratbert(i=ratbert@freenode/staff/pdpc.levin)- [Global notice] I am a fat asshole, who loves abuse, die
23:44:29 [freenode] -ratbert(i=ratbert@freenode/staff/pdpc.levin)- DCC SEND YOUAREALLJUDENLOL

Seems likely that someone got through to something they shouldn’t have… hopefully it doesn’t last too long, as a fair number of projects that I either participate in or maintain have their primary IRC channels there.

Luckily, Freenode actually has someone whose full time job is to make it run. I have no doubt that it will come back at some point, even if it is not quite as soon as I’d typically like.

Mapserver Rendering Bug

Posted in Locality and Space, Mapserver, Software, WMS on May 6th, 2006 at 14:58:18

One of the problems I’m running into with mapserver right now is related to its rendering of LINE elements which are wider than one as they run into tile boundaries at acute angles. It seems that mapserver is drawing the centerlines for these elements up to the side of the image — but in cases where a line is approaching a boundary at an acute angle, this means that the ‘outer’ edge of the rendered line stops away from the edge.

In non-anti-aliased lines, this is less visible as a problem (especially if you’re not looking at the images as tiles) because the lines just stop, and visually it’s hard to tell whether that happens at the image edge. However, it becomes very obvious when anti-aliasing is on, because the left and right edges of the line are ‘tied’ together by a curving, anti-aliased line, resulting in a bubbled look at tile boundaries.

I’m not sure if this is a known bug, or something that other people have run into: I’m mostly recording it here so that I have a description of it.

Right now, it seems like the workarounds are:
* Use thinner lines (so it’s less visible)
* Don’t have roads near boundaries at acute angles — although there’s not much I can do about this one!
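A back-of-the-envelope sketch of why both workarounds help. It assumes the line is drawn with a flat end exactly where its centerline meets the tile edge, which is my reading of the behaviour and has not been checked against the Mapserver source. Under that assumption, one corner of the line end falls short of the edge by half the line width times the cosine of the angle between the line and the edge.

```python
import math


def edge_gap(line_width, angle_degrees):
    """Distance from the tile edge to the corner of a flat-ended line that falls short.

    angle_degrees is the angle between the line and the tile edge (90 = head on).
    """
    half_width = line_width / 2.0
    return half_width * math.cos(math.radians(angle_degrees))


for width in (1, 3, 6):
    for angle in (90, 45, 15):
        print(f"width {width}px at {angle} degrees: gap {edge_gap(width, angle):.2f}px")
```

The gap shrinks to nothing for a line that hits the edge head on, and scales directly with line width, which matches what shows up in the tiles.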

Mapserver Wishlist Items

Posted in Locality and Space, Mapserver on April 25th, 2006 at 17:09:16

Caching headers. Geoserver has an outstanding JIRA item about this, and a geoserver mailing list post describes the problem:

Putting squid in front of geoserver can help tremendously, but squid is very reluctant to cache anything without proper caching meta-information in the appropriate http headers.

This applies to mapserver as well.

Additionally, when running mapserver behind squid, libcurl sends “Pragma: no-cache” headers, so even if a remote mapserver instance supported caching (which it doesn’t), it wouldn’t be cached when behind a proxy. I think this is fixable by adding ‘Pragma:’ to the header setting code in maphttp.c, but I tried that and couldn’t get it to work.

The combinations of these would make tiling map apps much more realistic, since it would reduce load when requesting tiles from the map server.
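As a sketch of what "proper caching meta-information" means in practice, here is the kind of header pair a tile response would need to carry. The function name and the one-hour lifetime in the usage note are assumptions for illustration; this is not something mapserver or geoserver provides.

```python
from email.utils import formatdate


def caching_headers(max_age_seconds, now):
    """Build response headers that let a proxy such as squid cache a tile.

    `now` is a Unix timestamp; Expires is written in the HTTP date format.
    """
    return {
        "Cache-Control": f"public, max-age={max_age_seconds}",
        "Expires": formatdate(now + max_age_seconds, usegmt=True),
    }
```

Calling caching_headers(3600, time.time()) would mark a tile as cacheable for an hour. A request-side "Pragma: no-cache" still asks the proxy to skip its cache, which is why the libcurl behaviour above matters as well.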

OS X 10.4 Compile failures due to libtool

Posted in Redland RDF Application Framework, Subversion on January 11th, 2006 at 13:22:22

Many different projects (Redland, svn, and wxWindows included) have seen cases where users have attempted to compile, and seen the errors:

/usr/bin/libtool: for architecture: cputype (16777234) cpusubtype (0) file: -lSystem is not an object file (not allowed in a library)

or similar, posted to the mailing lists of these projects.

Only one place that I’ve been able to find so far (and not easily!) has the answer:

This is the typical error you get when you do an upgrade install of
Panther -> Tiger but you don’t install the Tiger Developer Tools
(Xcode 2.0). Don’t do that (Do upgrade your dev tools)

(From darwinports mailing list.)

Thank you ssen! Hopefully anyone else who runs into this will now be able to find this post.

Ran into this attempting to compile gpsd.

My First Spatial Database

Posted in PostGIS, Spatial Databases on January 11th, 2006 at 11:13:48

Thanks to Schuyler and Rich Gibson, I now have a spatially aware postgres database.

Later today, thanks to Schuyler and zool, I’ll have a copy of Mapping Hacks, and a bluetooth GPS.

Last night, I learned how to use centroid(), astext(), and distance_spheroid(), and calculated the distance from my house to zool’s house, and from there to Darwin’s, where I ate lunch and used the wireless yesterday. I loaded some data, learned the frustration of having data in different projections, and learned a little bit about the various types of geometry. I loaded data from an ESRI shapefile. I found that “” in Postgres is equivalent to `` in MySQL: that is, “GEOMETRY” means ‘the value from column Geometry’, not ‘GEOMETRY’, which is the literal. (If you ever get ‘Error: column “Foo” does not exist’, that might be a good thing to check.)
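A tiny demonstration of the quoting difference. It uses SQLite from Python's standard library purely as a stand-in so that it runs without a Postgres install; the table, column, and coordinate value are made up for illustration. Postgres differs in one important way: it folds unquoted identifiers to lower case and treats quoted ones as case-sensitive, which is usually where the ‘column does not exist’ error comes from.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE places (name TEXT, "Geometry" TEXT)')
conn.execute("INSERT INTO places VALUES ('home', 'POINT(-71.1 42.37)')")

# Double quotes: an identifier, so this reads the column.
as_identifier = conn.execute('SELECT "Geometry" FROM places').fetchone()[0]

# Single quotes: a string literal, so this returns the text itself.
as_literal = conn.execute("SELECT 'Geometry' FROM places").fetchone()[0]

print(as_identifier)  # POINT(-71.1 42.37)
print(as_literal)     # Geometry
```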

Last night, I made my first foray into spatial databases.

Last night, I took control of space on my machine.

Tomorrow, I take control of space in the world!

But today, I need to work on things that I’m actually paid for at the moment. 😉

OpenGuides Map Changes

Posted in Javascript, OpenGuides, WebKit on January 9th, 2006 at 05:37:18

So, tonight I took it upon myself to redo the javascript behind the map for the Open Guide to Boston. The reason for this is relatively simple: when I first wrote the Google Maps interface, the guide had about 50 nodes. With that in mind, drawing them all was acceptable. As the guide got bigger, it got a bit more time consuming, but was still much easier than paging. However, the guide has recently doubled in size as I’ve increased the rate at which I pull data from Zami.com (specifically for the purpose of trying to build a free database of all the churches in the area with user interaction) — up to 2400 nodes (making it the single largest Open Guide to date by pure node size, as far as I can tell).

With this change, the map was no longer just difficult to use: it was impossible. Attempting to open the page crashed my web browser.

So, I took it upon myself to learn enough JavaScript to do paging… but then decided that rather than paging, I should attempt a bit more friendly solution to start. So I started hacking, and got into the Google Maps getBoundsLatLng() function, which allows me to determine the area of the map.

There are two basic options from here, depending on the performance you expect in various situations and the scalability you want.

* Load all data. Create Javascript variables to store it all. When the map moves, iterate over the data, adding markers for only points which are not already visible, and are in the new span.
* Load only the original data from the database. All additional data to be loaded via XMLHttpRequest.

The first is obviously something that can be done entirely client side, and in fact turned out to be the (much easier than I expected) path I chose. The largest reason for this is simply ease of integration. OpenGuides code is rather convoluted (to me), and modifying it is not something that I enjoy spending a lot of time doing. I’ll do it when I need to, but I prefer to avoid it. The other option would be to query the database directly via Perl or some other language, without using the OG framework. This would probably be slightly faster, but with the size of data I’m using, iterating over it takes minimal time compared to drawing the GMaps Markers. As a result, I chose the slower, less scalable, but quicker-to-implement and easier-to-merge way of doing things. The biggest benefit here is that I can merge it back into the other quickly growing guides: Saint Paul is now growing by leaps and bounds alongside Boston, thanks to pulling from Zami.com’s database, and I plan to do similar things with other guides. So I went for the quickest thing that could possibly work.
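The logic of the first option (load everything, then add only the markers that fall inside the new bounds) is small enough to sketch. The real code is JavaScript against the Google Maps API; this is the same idea in Python so it can be run on its own, and the function names, the dictionary keys, and the sample coordinates in the usage note are all made up for illustration.

```python
def in_bounds(point, bounds):
    """bounds is (min_lat, min_lng, max_lat, max_lng); point has 'lat' and 'lng' keys."""
    min_lat, min_lng, max_lat, max_lng = bounds
    return min_lat <= point["lat"] <= max_lat and min_lng <= point["lng"] <= max_lng


def markers_to_add(points, bounds, shown_ids):
    """Return points inside the new bounds that are not drawn yet, and record them as shown.

    Markers that scroll out of view are left in place rather than removed.
    """
    new_points = [
        p for p in points
        if p["id"] not in shown_ids and in_bounds(p, bounds)
    ]
    shown_ids.update(p["id"] for p in new_points)
    return new_points
```

Each time the map moves, the caller passes the new bounds and the same shown_ids set, for example markers_to_add(points, (42.3, -71.2, 42.4, -71.0), shown), and draws whatever comes back. The sketch ignores bounds that cross the 180 degree meridian.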

There were a few gotchas that I had to end up avoiding:

* Removing Google Maps Markers takes *much* longer than adding them. Like, 3-4 times longer in extremely informal testing. With one or two, or even 20-30, this doesn’t matter, but with 100, this starts to be a significant barrier to user interaction.
* There are a number of words which are reserved in Javascript, but are not treated that way in Firefox. This leads to confusing error messages in Safari (although they are slightly improved in the latest webkit): If you see a ParseError, you should check the Firefox Reserved Words list — this is in one of their bugs. For example, the variable “long” is reserved, and can not be used to pass something like, say, longitude.

Discovering the ParseError, working out why it was there, and how to fix it, was greatly helped by the #webkit folks: I have never seen a more dedicated, hardworking, friendly, open and inviting group of Open Source Software developers in my history as a programmer, either in closed or open source products. Many thanks to them for helping me work through why the error was happening, and how to avoid it in the future.

Lastly (although this is not quite chronological, as I did this first), I had to modify the guide’s code to zoom in farther to start, so it wouldn’t try to load so many nodes. I think a next step is to allow users to set a default lat/long for themselves, a la Google Local, from which they can start browsing, rather than always dropping them into the main map. But that’s a problem for another late night hack session.

Building WebKit on Panther

Posted in Software, WebKit, XSLT on June 9th, 2005 at 00:26:48

I mentioned the other day the release of Apple’s WebKit, WebCore, and JavascriptCore (the latter two of which were already publicly available). Naturally, the first thing I wanted to do was download it and give it a try. This post will outline the steps I took to get as far as possible in the build process at this point. First, I would like to mention that this project has the cleanest build steps I have ever seen. It is well documented all the way through, and for Tiger users on Xcode 2.0, the build process went off without a hitch, the first time through. (Xcode 2.1 problems have since been fixed.) The members of the supporting IRC channel are helpful and intelligent, and the mailing list has already taken multiple patches from non-employees into the source tree. This is, quite simply, the best opening for an open source project that I have ever been aware of.

However, the build process currently favors those with Tiger, and the current CVS does not support those who are using Panther. Apple developers have expressed an interest in correcting this once the WWDC, being held this week in San Francisco, is over. So, I took it upon myself to report bugs in bugzilla where they are applicable, to help out developers when they get a chance to breathe.

First problem: Building returned a problem with “CarbonSound.h not available”. This was as a result of not yet installing the QuickTime 7 SDK. (It has been in software update, I just hadn’t touched it yet.) Updating fixed that.

Second Problem: 10.3.9 Build Failure: NSString may not respond to `+stringWithCString:encoding:’. This is a method which was not available in Panther. Maciej has said he is working on a patch to have this use CFString instead, where it is available. (I am tossing about some terms I don’t know here, so please excuse any incorrect terminology.) Workaround for the time being – copy the last two build commands before the failure (a cd line and a gcc-3.3 line) and paste them, altering the gcc-3.3 line slightly to remove the -Werror. This means that it may cause problems later on, but will compile for the time being.

Third Problem: isnan failure in kjs_window.cpp: This one boggles me a bit, especially since (as I mention in the bug) there seems to be explicit knowledge in the code of the problem. However, a workaround is now offered in the bug in comment 1: replace using std::isnan; with extern “C” int isnan(double); This fixed the problem for me.

Fourth Problem: XSLT Headers not installed – This one is more symptomatic of the way that Apple releases updates, and is something that dajobe has brought up with building Redland in the past: “Headers don’t match libraries”. This is true here as well, but I now (thanks to toby from #webkit) know that the reason for this is that Apple does *not* ship updated headers with libraries updated through Software Update. Since libxslt is new in 10.3.9, there are no development headers. Dave Hyatt, of the WebKit team, mentioned that the whole team, when building on Panther, had to install libxslt and libxml from the source. Once I did this, the problem went away.

Fifth Problem: libxml headers are wrong – this was before I installed libxml, which also fixed this problem. It is, again, related to the fact that Apple does not update headers with Software Update.

Once you get through these, you will have built both JavascriptCore and WebCore. Congratulations! You now have two completely useless frameworks which the new Webkit will depend on when you can build it! 🙂

WebKit is the previously unreleased Apple-specific Framework which is the “pretty” part of WebCore – it’s what ties everything together. It has a few more issues building on Panther, but most of them can be worked around by simply copy pasting build lines without the -Werror flag. (Note that this will produce possibly unstable results! These builds are not designed for production, and I do not advise doing this and filing bug reports on Safari crashing.)

npapi headers not available – for some reason, building on Panther does not find the appropriate headers from the in-process WebKit build. I really have no clue why this is, and neither did anyone else when I was building. My workaround was to copy the headers out of the framework and into ~/build/include (a directory I had to make), which was already on the path: cp ~/build/WebKit.Framework/Versions/A/Headers/* ~/build/include, then cp ~/build/WebKit.Framework/Versions/A/PrivateHeaders/* ~/build/include, then continue the build. I am not sure why this is necessary, but it does seem to work.

Missing 10.4 Method -setCompositingOperation for WebImageRenderer – Two parts of the code require - (void)setCompositingOperation:(NSCompositingOperation)operation; and - (NSCompositingOperation)compositingOperation; which were added in 10.4. This can be resolved by following the above -Werror removal steps. You will have to do this several times.

Missing 10.4 Method CFMakeCollectible – CFMakeCollectible is new in 10.4. Building with no -Werror allows the build to continue.

And, the current showstopper: Missing SecurityNssAsn1 headers — This comes from the libWebKitSystemInterface.a file, which is currently Tiger-specific. Once WWDC is over, a Panther binary file will be released. Until then, this is where the ride stops: you can build WebCore and JavascriptCore, but WebKit is out of your reach until you get your hands on Tiger.

Luckily for me, I’m going to be in Cupertino this weekend, so I’ll pick up a copy and get it installed soon 😉

WebKit Source

Posted in Social, Software, WebKit on June 7th, 2005 at 08:19:41

An announcement on the release of Webkit, the source of the rendering engine for the popular OS X browser, Safari. Includes mention of #webkit on irc.freenode.net, for discussion of webkit, and information on how to get anon CVS access.

Currently requires XCode to build, but I’m sure that someone out there will cook up some autotools goodness for it sometime soon.

Keep in mind that (as far as I know) this isn’t the actual shell that makes up Safari. It’s the source of the rendering engine inside it – basically, the bits that were taken from KHTML. I’m not sure, though, and I can’t read the code well enough to confirm it. However, one of the parts that is being released is WebKit: the interface that people have used in the past to make 10 line browsers in Xcode projects. This could mean we’ll see a lot more similar projects for other UNIXes – with the rendering taken care of and a simple binding, it becomes much simpler to write applications which display HTML.

Certainly an interesting development. Could this mean we’ll see a Safari-like browser base on other platforms in the near future? My bet is yes.
