Topology vs. Simple Features, pt deux

More followup from the list: It seems that there is a reason (in addition to routing) for topological behavior: editing.

If you have a junction described in “features”, what you get are two roads that somehow happen to meet in one place (two lines crossing, or meeting). Right? In terms of storage, both lines will usually have their own coordinates in the database; the information that they intersect, or meet, is not there explicitly, it just comes from the maths applied when drawing them, and if your database has spatial support you can ask the database for the point where they intersect.

However if someone wants to move that intersection, he will have to edit each feature separately – if you move one road away from the junction, then the junction is no more.

(From the mailing list.)
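To make the quoted point concrete, here is a minimal sketch in plain Python. It assumes each road is stored as its own list of coordinates (the road names and coordinates are made up for illustration), so the junction exists only as the result of a calculation. Moving one road makes that result disappear.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]


def segment_intersection(a1: Point, a2: Point, b1: Point, b2: Point) -> Optional[Point]:
    """Return the point where segments a1-a2 and b1-b2 cross, or None."""
    dax, day = a2[0] - a1[0], a2[1] - a1[1]
    dbx, dby = b2[0] - b1[0], b2[1] - b1[1]
    denom = dax * dby - day * dbx
    if denom == 0:
        # Parallel or collinear segments: no single crossing point.
        return None
    # Solve a1 + t * da == b1 + u * db for t and u.
    t = ((b1[0] - a1[0]) * dby - (b1[1] - a1[1]) * dbx) / denom
    u = ((b1[0] - a1[0]) * day - (b1[1] - a1[1]) * dax) / denom
    if 0 <= t <= 1 and 0 <= u <= 1:
        return (a1[0] + t * dax, a1[1] + t * day)
    return None


# Two hypothetical roads, each with its own coordinates.
main_street = [(0.0, 0.0), (2.0, 2.0)]
oak_avenue = [(0.0, 2.0), (2.0, 0.0)]
junction = segment_intersection(*main_street, *oak_avenue)

# Edit only one of the roads: nothing ties the other road to it.
oak_avenue_moved = [(0.0, 5.0), (2.0, 3.0)]
junction_after_edit = segment_intersection(*main_street, *oak_avenue_moved)
```

Nothing in the stored data records that the two roads were ever connected, which is exactly the editing problem the post describes.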

It’s a valid point, and not one that should be easily tossed aside. However, I’m not arguing against topology in all cases: I think that this is the exact kind of thing for which topology should be used — in the client.

If you have two nodes that are on top of each other, they should merge together. When you drag one, any feature that includes that node should be updated. When things don’t actually intersect, they should be separated. The canonical example is a bridge-over-river situation: If I want to draw a bridge over a river, and need two nodes in the exact same place, how do I do that without joining them?
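A rough sketch of what that client-side behaviour could look like, in Python. The `Editor` class and its method names are hypothetical and do not come from any existing editor. Note that this version merges every pair of coincident nodes unconditionally, so it would wrongly join the bridge and the river.

```python
from typing import Dict, List, Tuple

Coord = Tuple[float, float]


class Editor:
    """Toy client-side model: ways hold node ids, nodes hold coordinates."""

    def __init__(self) -> None:
        self.nodes: Dict[int, Coord] = {}
        self.ways: Dict[str, List[int]] = {}
        self._next_id = 1

    def add_node(self, x: float, y: float) -> int:
        node_id = self._next_id
        self._next_id += 1
        self.nodes[node_id] = (x, y)
        return node_id

    def add_way(self, name: str, coords: List[Coord]) -> None:
        self.ways[name] = [self.add_node(x, y) for x, y in coords]

    def merge_coincident(self) -> None:
        """Collapse nodes with identical coordinates into one shared node."""
        first_at: Dict[Coord, int] = {}
        replace: Dict[int, int] = {}
        for node_id in sorted(self.nodes):
            # Keep the lowest id seen at each coordinate.
            replace[node_id] = first_at.setdefault(self.nodes[node_id], node_id)
        for name in list(self.ways):
            self.ways[name] = [replace[i] for i in self.ways[name]]
        for node_id, keep in replace.items():
            if node_id != keep:
                del self.nodes[node_id]

    def move_node(self, node_id: int, x: float, y: float) -> List[str]:
        """Move one node; return the names of the ways that use it."""
        self.nodes[node_id] = (x, y)
        return sorted(name for name, ids in self.ways.items() if node_id in ids)

    def geometry(self, name: str) -> List[Coord]:
        return [self.nodes[i] for i in self.ways[name]]


editor = Editor()
editor.add_way("main", [(0, 0), (1, 1), (2, 2)])
editor.add_way("oak", [(0, 2), (1, 1), (2, 0)])
editor.merge_coincident()
shared = editor.ways["main"][1]
touched = editor.move_node(shared, 1.5, 1)
```

After the merge, dragging the shared node changes the geometry of both ways in one step, and `touched` lists the features that would need to be saved.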

The next piece needed beyond simple join-on-node is a set of logic which tells you when *not* to join data. OpenStreetMap already has the concept of a ‘layer’ — layers determine rendering order, but they also (essentially) determine the ordering on the earth. A bridge has a higher layer than a street, so they shouldn’t be merged topologically.

But what if they’re at the same layer, but not connected? (I can’t envision this, but suppose it happens.) The way I look at it, the answer is that this is a special case, and an additional ‘junction’ (or ‘not-junction’) node can be appropriate here.
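The join rule described in the last two paragraphs fits in a few lines. This is only a sketch: the `Node` type, its `layer` field and the `not_junction` flag are hypothetical names chosen for illustration, not OpenStreetMap's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    x: float
    y: float
    layer: int = 0              # assumed: stacking order, as with OSM's 'layer'
    not_junction: bool = False  # assumed: explicit override for the special case


def should_merge(a: Node, b: Node) -> bool:
    """Decide whether two nodes should be joined topologically."""
    if (a.x, a.y) != (b.x, b.y):
        return False  # not on top of each other
    if a.layer != b.layer:
        return False  # e.g. a bridge passing over a river
    # Same place, same layer: join unless either node opts out.
    return not (a.not_junction or b.not_junction)


bridge = Node(1.0, 1.0, layer=1)
river = Node(1.0, 1.0, layer=0)
street = Node(1.0, 1.0, layer=0)
```

With these values, `should_merge(bridge, river)` is false while `should_merge(river, street)` is true, so the second case is where an explicit `not_junction` marker would be needed.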

Maybe I’m wrong. I’ve been wrong plenty of times before 🙂 However, this is how I’ve worked so far, and it’s worked out for the data I’m working with. I know that Frederik and I disagree on it — and that’s fine. Brings vitality to the discussion. 🙂 It’s good to see others so interested in solving the succinctly stated problems brought to the fore by the paper he has put together with Jochen (available from remote.org).

There’s an additional concern, stated by Steve on the list:

And yes you’re right, topological is really useful since OSM is a wiki and we track changes in nodes. Otherwise moving an intersection of many roads would mean updating many linestrings not one node.

(From another mailing list post.)

This one is more interesting to me, because I feel like moving topology to the client actually turns editing into a topological operation, after which grouping the edits in the API should be simple. Updating multiple features at once is a ‘simple matter of programming’ that I can solve (given time/effort on my part). Certainly I’ve built a RESTful Feature API that supports this, but it’s Simple Feature-based, not topology-based.
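As a sketch of what "grouping the edits" might mean with simple features: the client works out every feature that contains the moved vertex and collects the changes into one batch. The function names and the dictionary layout are assumptions made for illustration. How the batch is sent to the server is left out, since the post does not describe the API's requests.

```python
from typing import Dict, List, Tuple

Coord = Tuple[float, float]
Features = Dict[str, List[Coord]]


def move_shared_vertex(features: Features, old: Coord, new: Coord) -> Features:
    """Return {feature_id: new_coords} for every feature that contains `old`."""
    changes: Features = {}
    for feature_id, coords in features.items():
        if old in coords:
            changes[feature_id] = [new if pt == old else pt for pt in coords]
    return changes


def apply_changeset(features: Features, changes: Features) -> Features:
    """Apply a batch of changes in one step, leaving other features alone."""
    updated = dict(features)
    updated.update(changes)
    return updated


features = {
    "main": [(0, 0), (1, 1), (2, 2)],
    "oak": [(0, 2), (1, 1), (2, 0)],
    "elm": [(5, 5), (6, 6)],
}
changes = move_shared_vertex(features, (1, 1), (1.5, 1))
features_after = apply_changeset(features, changes)
```

Here `changes` contains only "main" and "oak", so moving an intersection of many roads becomes one grouped update rather than many unrelated ones.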

Does anyone have experience of creating GIS data from aerial imagery? This is the thing that I have the most experience with, after mapping out a decent sized city using Yahoo! Maps as a backdrop. What do you use in editing? What do you use for export? How does it all tie together?

5 Responses to “Topology vs. Simple Features, pt deux”

  1. Paul Ramsey Says:

    Many of the professional photogrammetry shops in British Columbia use some form of CAD or CAD-like software for their data capture, such as Microstation. However, this is probably more a function of history (these packages always had the most advanced data capture tools Back in the Day) than any sort of present-day efficiency analysis.

  2. dylan beaudette Says:

    Hi Chris. I have been following this conversation for a while now, and thought I would add a couple of points.

    It seems that you and the OSM people are approaching the same problem from very different angles, as dictated by need and past experience:

    Chris – interactions with vector data in web-based clients, simple API structure, etc. You want to work with simple features because they are *simple*. There is only so much you can do in a web browser with JS, and creating support for topology is hard.

    Steve and the OSM team – you are *creating* and *editing* data; efficiency and accuracy are the top priorities. Working with simple features is wasteful for polygons, and error-prone when joining and connecting segments at nodes.

    This seems like a decent way to do things: a topological format for the creation and main storage of OSM, and simple features for the presentation of the data. Is that really so hard? GRASS is an excellent platform to accomplish this: the internal vector storage uses topology, but it is very simple to export to simple features with GDAL/OGR.

    From an outsider’s perspective this entire set of posts, mailing list traffic, and IRC conversations seems a bit silly. That being said, I am glad to see people talking about things instead of arguing and name-calling.

    I personally use GRASS for all of my GIS needs, PostGIS as a datastore for massive geometries, and Mapserver as a rendering engine. These applications represent a mix of topological and simple feature models, not one or the other, as they both have their uses. I would hope that OSM sticks with the topological format, at least on the back-end where all of the editing and updating are occurring.

    Cheers,

    Dylan

  3. crschmidt Says:

    Dylan:

    I don’t have enough experience with GRASS to understand, so I’d be interested in becoming further enlightened: how do you save topological data? I know GRASS uses this format internally, but when you save the features, what kind of file format is it in? Is it something that other tools will read?

    Imagine OSM isn’t in the conversation. I have to take a satellite image, and I need to draw all the streets from it. Or different — I need to take a raster map, and vectorize it with attributes, etc. What do people use? GRASS? My experience with GRASS suggested it was very much *not* point and click, which makes me wonder if there’s some aspect of it I don’t understand, or people are just better at guessing pixel locations than me 🙂

    What I would do? Draw out lines. At junctions, add nodes. Multiple layers of data? They belong in multiple layers. Is there a join to another dataset? Mention it explicitly with a notation in the data.

    It seems like you’re saying that it’s important to store this information on the backend — but if I’m really about making a map, and GRASS can export from topology to simple features, why would I want to export the (larger, more complex) topology? Am I misunderstanding? (I assume so.)

  4. Dylan Beaudette Says:

    Hi,

    Here is the general approach in GRASS:
    1. have something to digitize from: i.e. DOQQ
    2. run v.digit
    3. digitize your points, lines, etc.
    4. close map, and topology is built automatically
    5. now that you have data in topological format, we can use ‘cleaning’ tools on your first map.
    6. use v.clean to do fun things such as:
    – remove dangles
    – snap lines to nodes
    – break lines at crossings
    – simplify lines (if needed)
    – remove small areas and small angles (if needed: this is helpful after some sort of auto-vectorization)
    – etc.

    7. topology is used to do these operations (ever try to run CLEAN on a shapefile in ArcInfo?)

    8. the important thing is that the user is not responsible for enforcing the topology, or even inserting nodes at line junctions: this is done by software. Also, layers which represent different entities in nature should be kept separate: rivers and streets should be in different files, which eliminates the bridge-crossing-a-river issue altogether. An excellent example of this problem can be found in the US TIGER data (census), as all features are encoded in the same layer and must be extracted accordingly, which is a tiresome process.
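For a feel of what one of the cleaning steps listed above does, here is a small Python sketch of dangle removal. It is an illustration only and not how GRASS implements v.clean. It assumes lines have already been broken at crossings, so that every junction is an endpoint, and the `threshold` parameter is a made-up name for the maximum dangle length.

```python
import math
from collections import Counter
from typing import List, Tuple

Coord = Tuple[float, float]
Line = List[Coord]


def length(line: Line) -> float:
    """Total length of a polyline."""
    return sum(math.dist(p, q) for p, q in zip(line, line[1:]))


def remove_dangles(lines: List[Line], threshold: float) -> List[Line]:
    """Drop lines shorter than `threshold` that have a free (unshared) end."""
    # Count how many line ends meet at each coordinate.
    ends: Counter = Counter()
    for line in lines:
        ends[line[0]] += 1
        ends[line[-1]] += 1
    kept = []
    for line in lines:
        has_free_end = ends[line[0]] == 1 or ends[line[-1]] == 1
        if has_free_end and length(line) < threshold:
            continue  # short stub sticking out of a junction
        kept.append(line)
    return kept


lines = [
    [(0.0, 0.0), (10.0, 0.0)],
    [(10.0, 0.0), (20.0, 0.0)],
    [(10.0, 0.0), (10.0, 0.5)],  # digitizing overshoot
]
cleaned = remove_dangles(lines, threshold=1.0)
```

The short overshoot is dropped while the two long segments survive, even though each of them also has a free end. The check relies on knowing which ends are shared, which is the kind of information topology provides.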

    When making or editing data, these features are very helpful, and ensure data integrity. When doing network analysis, topology is a must. When making a map, they are not really needed. Therefore:

    Make and edit your map with topology = probably a good idea.
    Export raw geometry as simple features for map making = probably a good idea.
    There is no need to export the topology when making maps.

    For a longer discussion on how to do these things in GRASS, hop on the GRASS mailing list; there are several people there (including myself) who would be willing to help 🙂.

    Cheers,

    Dylan

  5. Matt Perry Says:

    Hi Chris. There’s a really good reason to stick with topology on the backend: It helps ensure data integrity. Junctions between related features remain intact, you don’t get slivers or overshoots, when someone moves an intersection you don’t have to update and snap every related feature, etc.

    As far as putting that functionality on the client-side, well, what if someone uses a different client? It is an open interface. These sorts of validation constraints really belong at the data tier. I have worked on both topological and non-topological multi-user datasets and, without fail, the non-topological datasets always get screwed up, requiring dozens of hours of tedious manual work to fix.

    Also, in my experience, it is much easier to convert from a topological layer to a non-topological layer. The topology ensures that all simple features in the output will be valid. But the “spaghetti” model of non-topological data requires tons of tedious cleaning and manual inspection to convert to valid topology.

    Given that road data has far more uses than just making cartographic products, it would be hard to justify converting the OSM model to simple features. Why not just write a RESTful web service to expose the topological data as simple features?
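The conversion suggested here is the easy direction, as a short Python sketch shows. It assumes a node table keyed by id and ways that list node ids plus optional tags. That layout and the function name are illustrative assumptions, and the output is shaped as a GeoJSON FeatureCollection. The web service around it is not shown.

```python
from typing import Any, Dict, Tuple

Coord = Tuple[float, float]


def to_simple_features(nodes: Dict[int, Coord], ways: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Resolve node references into standalone LineString features."""
    features = []
    for way_id, way in sorted(ways.items()):
        features.append({
            "type": "Feature",
            "id": way_id,
            "geometry": {
                "type": "LineString",
                # Each feature gets its own copy of the shared coordinates.
                "coordinates": [list(nodes[n]) for n in way["nodes"]],
            },
            "properties": dict(way.get("tags", {})),
        })
    return {"type": "FeatureCollection", "features": features}


nodes = {1: (0.0, 0.0), 2: (1.0, 1.0), 3: (2.0, 2.0), 4: (0.0, 2.0), 5: (2.0, 0.0)}
ways = {
    "main": {"nodes": [1, 2, 3], "tags": {"highway": "residential"}},
    "oak": {"nodes": [4, 2, 5]},
}
collection = to_simple_features(nodes, ways)
```

Node 2 is stored once but appears in both output geometries, so the junction is guaranteed to line up in the exported features.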