Technical Ramblings

Archive for the 'TileCache' Category

Telling Stories Via Maps

Posted in FeatureServer, OpenLayers, TileCache on May 15th, 2007 at 01:49:25

screenshot of OpenLayers-based story Map Went out for a late night walk tonight: though my path was a bit unclear, I think I made a decently accurate map:

Walking around MIT

If you hover over the circles, you’ll see points of interest along the route. I recommend zooming into the main MIT building, to see the confused meandering I was forced to.

This demo uses OpenLayers, TileCache, and FeatureServer. It was created entirely via a web interface.

6 Comments »

From Data To Map

Posted in Locality and Space, OpenLayers, QGIS, TileCache on February 14th, 2007 at 00:47:25

Earlier this evening, Atrus pointed out that DC has a bunch of cool data in their GIS Data Catalog. I decided I would play with it a bit and see what I could come up with.

I grabbed the Street Centerlines, played with it in QGIS to do a bit of cartography, and then (eventually) got it exported to a MapServer .map file (which describes styling info). I was then able to set the file up in MapServer, serve it out to OpenLayers, and then to stick TileCache in the mix. The result isn’t the prettiest thing in the world, but it works.
After going through it once, I decided I’d go through it all again, to see how long it took.

12:15AM: Open Firefox to the DC Data Catalog to find some data to map.
12:16AM: Pick out Structures Polygons.
12:17AM: Download complete, open QGIS
12:18AM: Open file in QGIS
12:19AM: Save QGIS project file, save map file from project file
12:20AM: Copy both shapefile and mapfile to server
12:21AM: Tweak mapfile: adjust PNG output to not be interlaced (for TileCache usage), change background color
12:22AM: Test mapfile in mapserv CGI. Find out I misspelled something, fix it.
12:23AM: Edit TileCache config to add new layer information.
12:24AM: Copy an existing tile URL, ensure that it works in TileCache with the different layer.
12:25AM: Edit OpenLayers config to include additional layer
12:26AM: Edit OpenLayers config to include layerswitcher.
12:27AM: Marvel at the result

In less than 15 minutes I was able to turn a dataset into a browsable, lazily cached web viewable data set, using qgis, OpenLayers, and TileCache. Not bad at all.

1 Comment »

Free Maps for Free Guides

Posted in Locality and Space, Mapserver, OpenGuides, OpenLayers, TileCache on February 11th, 2007 at 08:46:05

A bit more than a year ago, when I was just learning how to use the Google Maps API, I put together a patch for the OpenGuides software, adding Google Maps support. It seemed the logical way to go: It wasn’t perfect, since Google Maps are obviously non-free, but it seemed like a better way to get the geographic output from OpenGuides out there than anything else at the time.

Since I did that, I’ve learned a lot. Remember that 18 months ago, I’d never installed MapServer, had no idea what PostGIS was, and didn’t realize that there were free alternatives to some of the things that Google had done. Also, 9 months ago, there was no OpenLayers, or any decent open alternative to the Google Maps API.

In the past 18 months, that’s all changed. I’ve done map cartography, I’ve done setting up of map servers, and I worked full time for several months on the OpenLayers project. Although my direction has changed slightly, I still work heavily with maps on a daily basis, and spend more of my time on things like TileCache, which lets you serve map tiles at hundreds of requests/second.

So, about a month ago, I went back to the Open Guide to Boston, and converted all the Google Maps API calls to OpenLayers API calls. The conversion took about an hour, as I replaced all the templates with the different code. (If I was writing it again, it would have taken less time, but this was my first large scale open source Javascript undertaking, long before I gained the knowledge I now have from working with OpenLayers.) In that hour, I was able to convert all the existing maps to use free data from MassGIS, rather than the copyrighted data from Google, and to have Google as a backup: a Map of Furniture Stores can show you the different. You’ll see that there are several layers — one of which is a roadmap provided by me, one from Google — and one from the USGS, topographic quad charts.

It’s possible that some of this could have been done using Google as the tool. There’s nothing really magical here. But now, the data in the guide is no longer displayed by default on top of closed source data that no one can have access to. Instead, it’s displayed on top of an open dataset provided by my state government.

This is how the world should work. The data that the government collects should be made available to the people for things exactly like this. It shouldn’t require a ‘grassroots remapping’: There are examples out there of how to do it right. I find it so depressing to talk to friends in the UK, who not only don’t have the 1:5000 scale quality road data that Massachusetts provides, but doesn’t even provide TIGER-level data that the geocoder on the Open Guide to Boston uses.

Free Guides, with Free Maps. That’s the way it should be. The fact that it isn’t everywhere is sad, but at least it’s good to know that the technology is there. Switching from Google to OpenLayers is an easy task — it’s what happens next that is a problem. You need the data from somewhere, and it’s unfortunate that that ‘somewhere’ needs to be Google for so many people. I’m thankful to MassGIS and to the US Government for providing the data I can use, and to all the people who helped me learn enough to realize that using Google for everything is heading the wrong way when you want to not be beholden to a specific set of restrictions placed on a corporate entity.

1 Comment »

TileCache Under Windows… It Just Works ™

Posted in Apache, Python, TileCache on February 7th, 2007 at 10:53:06

It’s cool when software works.

So, I needed to test the latest TileCache in Windows. I’ve got a relatively clean Windows laptop I keep around for this. None of the niceties that I would typically want on a machine I used full time — no Firefox, no cygwin, no vim, etc. Just a straight up Windows install, with things like Google Earth installed.

So, starting from that, I started by downloading Apache from the Apache Download Page. I downloaded “Win32 Binary (MSI Installer): apache_2.2.4-win32-x86-no_ssl.msi“, and ran the installer directly. This got http://localhost/ serving a page that says “It works!”.

Next, I downloaded the Python 2.5 Windows installer , and again selected all the defaults.

Next, I downloaded TileCache 1.4, as a .zip file, since I know how well Windows supports .tar.gz files. I extracted the zip file into C:\Program Files\Apache Software Foundation\Apache2.2\cgi-bin.

Lastly, I edited C:\Program Files\Apache Software Foundation\Apache2.2\cgi-bin\tilecache-1.4\tilecache.cgi so that the first line of the file read “#!C:/Python25/python.exe -u”, and visited http://localhost/cgi-bin/tilecache-1.4/tilecache.cgi/1.0.0/basic/0/0/0.png …

And it worked! I got an image. I never expected setting up Apache, Python, and TileCache to just… well… work.

2 Comments »

TileCache: Map Tile Caching

Posted in TileCache on November 12th, 2006 at 18:03:45

Last week, Schuyler and I wrote TileCache, a WMS-C server implementation. OpenLayers is already a WMS-C client, so combining the two gives you a super-fast loading slippy map.

Yesterday, we wandered around town doing various thought experiments as to how to create a distributed peer to peer tile caching network. TileCache supports an extensible Cache plugin backend: any type of storage can be used. We wrote a simple Disk based cache, and also included support for a memory-based cache, based on memcached. The former is good when you have lots of disk, and a slow remote service. The latter is good when you need an LRU cache of the most important areas, but can fall back to the source data for everything else: simply install memcached, get it running, and the MemoryCache will do you fine with the defaults.

However, for things like landsat, most people don’t have enough disk to cache even the hot areas, if you plan to actually create a service which is usable. All of landsat as source data ranges into the terabytes, and even a just-in-time cache is probably on the hundreds of gigabytes scale: bigger than most people have disk for. (I’m biased in this sense: because I work for MetaCarta Labs, I do get access to some pretty sweet hardware. However, I try to target the things that we release to the more casual user, and in that sense, I look at my hosted webserver, which has an 80GB drive that I have full use of.)

So, we need a cache plugin which distributes caches to peers in a network. Ideally, because peers go down, you want to distribute the cache to multiple peers. You want the number of peers to be able to scale, and you want the changing of the peer list to not result in a complete load redistribution. (Apparently a term for this is ‘consistent hashing’, as described in this paper.)

While discussing that, Schuyler mentioned Chris Holmes’s post on S3 for secondary storage of tiles. Although in some cases this might make sense, I’m not sure that it does in the case of caching open data under the umbrella of OSGeo, which is the large part of why I’m thinking about doing this. For small data sources — the NYC Freemap, or the Boston Freemap — I can do local caches without running out of disk. For larger data sources, most of the important ones — like TIGER, or Landsat — could presumably be hosted under some form of OSGeo umbrella. If that’s the case, then falling back to S3 doesn’t make sense, since Telascience has offered very large amounts of disk space — larger than could reasonably be paid for on a monthly basis via S3. If you take the several terabytes available there currently, and do the math, you find that you’re looking at a cost of hundreds of dollars a month just for storage, and that’s before you even start counting the bandwidth costs.

I don’t know the exact particulars of the S3 service interface. It seems like it’s likely to be a key-based data storage system. Given the resources made available for exploring high-bandwidth, high-use applications for open geospatial work, I think that it’s much more likely that creating an S3-like service on top of the Telascience resources would be approved by the SAC than paying thousands of dollars per year for S3 storage and bandwidth.

I haven’t been able to think of a situation where using S3 would help me to muster the resources I need to solve a particular goal as the best solution. Perhaps if you don’t have machines that you can put extra disks in, it makes the most sense to go the S3 route, but I think that by the time you’re hosting such large datasets that you need something like S3, you’ve gone beyond cost effectiveness, given the resources available to most of the people participating in Open GeoSpatial Foundation, both personally and via other services made available for projects under that guise.

Of course, adamhill from the WorldWind project has reported that a just-in-time cache of 8 or so layers for the entire world tops out at about a terabyte for them. So perhaps that entire discussion is moot: most services are not going to have a serious problem with caching all the data they need. But when you need to scale to the level that Google does, you do need to investigate more serious services: but at that point, you better be making some serious cash, or you’re going to run into trouble at some point or another no matter what.