Wednesday, November 18, 2009

OpenStreetMap helps free Ordnance Survey data with suicide bombing mission

So as I talked about in my previous post, Ordnance Survey is going to make its small scale data freely available. I think that in many ways, OpenStreetMap has been a major influence in making this happen. The growth of OpenStreetMap has increased the awareness of the benefits of free geospatial data, and it was becoming apparent that there would no longer be a significant market for the Ordnance Survey to sell small scale map data, certainly not at the sort of high prices it has traditionally charged.

However, the fact that this is happening raises some major questions about the future of OpenStreetMap in the UK, and could even lead to its demise there. At the very least, it dramatically changes the nature of OpenStreetMap in the UK. People have different motivations for contributing to OpenStreetMap. Some do it just because they think it's fun, and they like mapping their local area. For many people there is motivation around the fact that they believe it's important to have freely available and open map data. Suddenly at a stroke, the second motivation is seriously diminished (in the UK), as this aim has been achieved if the Ordnance Survey makes a high quality and very complete dataset freely available. Now we don't know for sure yet what Ordnance Survey will release - it is possible that it could just make raster map data available (like Google does). But it seems likely to me that they will probably make the small scale vector data available too - there is certainly lots of demand for this.

We also don't know the licensing terms yet, but it seems likely that the Ordnance Survey data will be in the public domain. So ironically it will be more open than OpenStreetMap, whose current and new licenses are fairly "viral" - roughly speaking they say that if you enhance the data, you have to make those enhancements available on the same terms as the original data (i.e. the enhanced data has to be freely available). This more or less precludes having a commercial ecosystem of "value added" data providers on top of OpenStreetMap. And many commercial companies, like Google, have expressed concern about using OpenStreetMap because of licensing (even with the new license that should be rolled out soon). But potentially Google, Microsoft et al will be free to use the Ordnance Survey data with no constraints.

So where does this leave OpenStreetMap in the UK? It is interesting to compare the situation in the UK with the US. OpenStreetMap took off very quickly in the UK, driven in many ways by frustration with the Ordnance Survey and the lack of free map data. In the US it has taken off more slowly, and this is widely thought to be because there are more sources of free map data (albeit often poor quality ones, as I've discussed previously). There has also been a lot of spirited discussion recently on the OpenStreetMap mailing lists about the pros and cons of importing TIGER data as a starting point in the US. There is a strong contingent that argues that cleaning up existing data is less interesting and motivating than mapping something from scratch, and that this is why there is less interest in OpenStreetMap in the US than the UK. The counter-argument, which I support in general, is that we are much further along in the US with TIGER data than we would have been without it. But anyway, suddenly the UK finds itself in a similar situation to the US, but with a much higher quality free data source (assuming there are no licensing issues, which there won't be if the data is public domain, which is what I expect).

This raises a lot of practical issues in terms of data imports, which we have already faced (but not solved) with OpenStreetMap in the US. OpenStreetMap in the UK has a rich database already - according to Muki Haklay, it is about 65% complete in terms of geometry, and 25% complete if you consider attributes. Now you have a 100% complete high quality dataset that you could import, but how do you reconcile this with existing data? This is a complex problem to solve. And how about subsequent updates? Do you just do a one time import of OS data, and let the community do updates after that? Will people be motivated to do this, if the OS is updating its own dataset for free in parallel? Is there some way of using the OS data for items that they maintain, and having OpenStreetMap focus on more detailed items (benches, trash cans / bins, etc)?

The ideal world might be to have some sort of integration between OpenStreetMap and the Ordnance Survey. I have spoken often about the disruptive impact of crowdsourcing and how government agencies and commercial companies need to leverage the power of this approach to bring down the cost of creating and maintaining data. Now that Ordnance Survey will have reduced revenues and require increased subsidies from taxpayers, they will be under increasing pressure to cut costs. If there was a way to leverage the power of the thriving OpenStreetMap community in the UK, that could reduce costs quite significantly. There are challenges with doing this and it may just be wishful thinking ... but we can hope :).

So anyway, this move raises lots of questions about what OpenStreetMap will look like in the UK in future. If you regarded the mission of OpenStreetMap in the UK as being to create a free, open and high quality map of the UK, you can argue that the mission is completed (or will be in April), perhaps in a slightly unexpected and sudden fashion, like the falling of the Berlin Wall. Steve Coast quotes Gandhi on the OpenGeoData blog: "First they ignore you, then they laugh at you, then they fight you, then you win." The question is should we add "... and then you die"? (Or less drastically perhaps, retire, or have no reason to exist any more?)

There are some other aspects to OpenStreetMap of course, like I alluded to before - making more detailed maps of aspects of your neighborhood than the Ordnance Survey does for example. But working out how those other aspects can coexist alongside the new reality of free OS data is complex. And how many OpenStreetMappers will lose the incentive to participate in this new world, if there is an alternative source of good quality, free and open data? We live in interesting times in the geo world today - this is the second hugely disruptive announcement (following the Google earthquake) in a month or so!

I should just reiterate that of course all these specific questions apply to OpenStreetMap in the UK; they don't affect its aims and benefits in the rest of the world - except that a lot of energy for the global movement has come from the UK, so if that energy diminishes it could have some knock-on effect in the rest of the world. But I hope not!

This move by Ordnance Survey will also increase pressure on National Mapping Agencies in other countries to make more data freely available (where it isn't already).

@Osbornec said...

Rumours of our decease are greatly exaggerated!

Very interesting to see what actually gets released, I trust that 1:10,000 rasters can be immediately dismissed as worthless.

What vector products do they have at 1:10,000?

Possibly OS Vector Map Local, designed to be viewed between 1:3000 to 1:20,000. Or Meridian 2 which is meant to be viewed at 1:50,000 which would be of very little value to OSM, apart from street names/classification.

My gut feeling is that there would be no immediate import, the data would mostly be used for reference in areas that are poorly mapped.

Your point about motivations for mapping in OSM is interesting. I believe most involved are also passionate about map democracy - mapping the things important to you that won't be mapped by NMAs or commercial groups.

Perhaps the biggest win here is that if Derived Data claims are being dropped, then we can start to see Local Authorities and communities getting more involved with OSM in the UK.

ebwolf said...

Two quick points (then I need to turn off Twitter and get to work...):

1. I think there are social differences between the British and Americans that explain some of the differences in the approach to OSM. You Brits are infamous for your survey and mapping efforts. It's like you are uncomfortable with blank spots on a map. For Americans, there generally has to be some profit motive (like managing natural resources).

2. The data the OS is opening is much smaller scale than OSM. The real beauty of OSM shows at the largest scales. If anything, it'll start making the OSM people think about multi-scale mapping.

3. (Ok, I was only supposed to make 2 points...but...) When I first met Steve Coast, I told him that the real significance of OSM is that it allows for anything of interest to be mapped. It becomes a universal basemap.

OSM isn't going anywhere. All the shift at OS will do is help the OSM community focus on its strengths rather than wasting time recreating data that's available elsewhere.

Peter Batty said...

Well I was being deliberately provocative with the title of course, and as I discuss in the text it's hard to tell what will happen. I don't actually think OSM will go away in the UK, but I do think that this is likely to bring significant change.

Certainly what level of data the OS releases will have a significant effect on the impact, as will the complex and varying set of motivations that different OSMers have. Even if the best data that OS releases is Meridian, according to Muki's analysis (which is based on that) currently OSM is 67% complete in terms of geometry and 25% complete if you consider attributes, so I think there is a lot that could be gained from Meridian. But you get back into the same debate that's been going on in the US about whether it's a good idea to import TIGER or not.

And on both of your comments that OSM can map things that won't ever be collected by OS, that's true, but there's a lot to figure out in terms of how that coexists with a (potential) reasonable quality dataset from OS containing roads etc which is frequently updated. Do you continue to duplicate everything that OS does? While great progress has been made, there's still a good bit of work to get to the level of completeness of Meridian or whatever. And if you somehow try to integrate updates from the OS, that's a complex task and also changes the nature of the UK OSM community.

We all agree that there are some who map in order to do things that won't be done by OS, but I still think it's hard to tell what proportion that is. I am a reasonably active mapper in the US, but if we had a free and open dataset with the quality of Meridian available here, I personally would be much less likely to spend time working on an alternative map.

Another interesting question is whether the "I want to make my own personal map" needs can be satisfied for many people by simply adding layers on top of a less detailed basemap.

But we definitely live in interesting times - will be interesting to see what shakes out of this!

James said...

Peter, one of the interesting aspects of this is what implications this will have for local government, all of whom are signed up to the expensive mapping services agreement to enable them to receive Ordnance Survey data. The two main issues which prevent us going elsewhere for mapping data are the derived data issue and how to maintain a supply of other datasets like boundary line (administrative boundaries). If, from what I'm reading, both of these problems are solved there is nothing to prevent us moving to alternative large scale map products such as UK Map and utilising OSM for the smaller scales. At Surrey Heath, we've invested quite a lot of time and effort in OSM and have our own internal tile rendering system in place so we can use OSM data in conjunction with our in-house GIS systems. I happen to think that the OSM data that we have created is more useful (and looks better!) than the smaller scale OS mapping so we will continue to use it.

I would also like to know where this announcement leaves the MSA2 negotiations, which are currently under way. I should think that they will have to start from scratch!!

Richard Fairhurst said...

All interesting stuff and I agree with a lot of it... in fact I posted the same sort of thing a few hours earlier. ;)

"More open than OSM" and motivation are the two key dangers. Yes, OSM can do large-scale better than the free OS data will. But will it? Even as an OSM contributor of some antiquity, I've still not developed any interest in micro-mapping (building outlines? Sure, or I could find a nice lawn and watch it grow. Postboxes? Isn't that like trainspotting without the compensation of Monster Power?), and I'm far from alone in the UK. Even if I enjoyed that sort of thing, OSM doesn't even have aerial imagery for Birmingham, let alone deepest Monmouthshire.

On the licensing question, yes, CC-BY-SA precludes 'value-added'. ODbL doesn't: one of its main advances is clarity, and sensible dividing lines, on derived vs collective. As you say, PD is of course better still, but I wouldn't assume PD is a done deal: we still have some campaigning to do there.

Much does depend on the dataset. Meridian2 seems most likely on first glance; it tallies most closely with the (scanty) detail in the announcement. No foot/cycle data, though. This claims footpaths will be included, this suggests not. You can have some fun watching commenters on the latter tie themselves in knots about small and large scales, but actually the footpaths question is pretty crucial to OSM... because that's the one thing Google et al will find very hard to survey. (You can't identify rights of way from aerial imagery. Hell, sometimes it's hard to identify them when you're standing on them)

By the time any of this data becomes available, OSM will have much better import tools. I'm saying that with some knowledge of the situation. I don't think we're going to see TIGER repeated.

Chris's point about derived data is immensely important. I guess, for example, that we could readily get both Sustrans' and British Waterways' data into OSM. Both are currently unavailable to us due to the usual derived data hogwash. But neither is adequately represented in any OS dataset (the OS's National Cycle Network coverage is embarrassingly poor).

One final curveball: I reckon the two organisations that need to start talking urgently are OSM and Getmapping.

Learon Dalby said...

Your thoughts on the US situation are dead on (mess/not an example). I seem to recall the Ordnance Survey did a pretty detailed ROI on the data at one time.

No doubt there are data costs. It is not free, it is made available for no fee. However, there is intrinsic value (hard to put a number on) in providing data for no fee.

When data is made available really good things happen, many of which were never thought of when it was locked up. Some of these good things bring real $ value even if indirectly.

mentaer said...

For people who are interested, a (very technical) link on road matching/data conflation, which the Canadian OSM guys did to complete their (huge) street network with official data. However, I'm not sure what they did with attributes or if it was only for geometries:

povesham said...

Enjoyed reading your analysis on these two posts.

I would like to point to three issues that will justify the need for OSM in the UK. They are all linked to your statement that the UK will have 'high quality and very complete dataset freely available.'

1. The mid-scale OS products are generalised both in terms of positional accuracy and completeness. Meridian 2, for example, is not as accurate positionally as OSM and chops off a lot of small streets and details that are on OSM. I would agree that Meridian is good enough (that's why it is being sold) but OSM is better and can fill a need for those who want detailed maps. OSM will be a valid alternative to large scale products that will continue to be charged for, as its quality is better than the mid-scale products but lower than the large-scale ones. All medium and small scale OS products are either dumb (raster) or generalised - that means degraded (vector).

2. While I don't think that it will be a very interesting activity for OSMers, the OS datasets can be used to fill in the blanks - for example, using some of the raster datasets as background to allow adding attribute details which are missing. Where I do see the opportunity is a fusion of OS mid scale with OSM details in urban areas - this can be very compelling.

3. Not that surprising, but because mid-scale vector products went through batch processing, a close inspection reveals all the errors that are caused by it and they do need some editing to be used in a specific application. It is also not surprising to discover areas are not that up to date in mid-scale products (the OS commitment for 6-month updates is only for large scale products). So my guess is that after one upload OSM will always be better and there won't be a need to integrate updates from the OS.

I agree with your more important points about motivation and engagement of the community, but at this rate the UK might be complete enough by the time the OS opens its data...

Anonymous said...

It would be bittersweet if this could improve OSM because of the problems it could present. Two ways that would make my life easier:
- better technology for keeping up with and merging released data sets. A three way diff tool? A graphical merge of every change (every altered way, every POI)?
- divide the data based on licensing. CC and ODbL are all well and good for allowing commercial usage when we survey ourselves but sometimes all we can get at the moment is non-commercial use only (but modifiable and distributable - sometimes CC NC-SA) licences. The value it adds for hobbyist and non-profit users may be overwhelming!