Monday, April 14, 2014

Report on another great State of the Map conference

Summary

I spent the past weekend in Washington DC for State of the Map (SotM) US, the OpenStreetMap conference. It ended up selling out, with around 500 attendees, making it the largest OpenStreetMap event yet. As with previous SotM events I’ve attended (the last one being in Denver in 2011), I found it very enjoyable and interesting, and there was great energy about it as always, much more than at traditional GIS events. Many thanks to the organizing team for a great event!

State of the Map US 2014

Below I summarize a number of items that I thought were of interest. I should also say that I missed a number of interesting presentations - there were two tracks and much of the time I wished I could have been in both talks! Fortunately though all the talks were recorded so I will check out those I missed online, and recommend you do too! Actually most of the talks are already online - just click through on the online schedule. Well done to the SotM organizing team for getting this done so quickly and professionally!

I also got some nice pictures of cherry blossom, and saw my English football (soccer) team, Liverpool, win a hugely important and exciting game against Manchester City together with several other SotM attendees, at the Queen Vic pub, home of the DC Liverpool Supporters Club - thanks to Jeff Johnson of Boundless for being the ringleader on this outing!

Vector tiles

The most interesting and potentially significant announcement was that MapBox released the spec for their vector tiles format, which they have been talking about for a while and have been using internally to generate dynamically styled raster tiles at mapbox.com. Dane Springmeyer talked about the more technical aspects of it, and Nicki Dlugash from MapBox and Mamata Akella of the National Park Service talked about using TileMill 2, which can both consume and export vector tiles, from a user perspective.

One nice aspect of rendering vector tiles directly in the browser is that you can do very interactive zooms and nice transitions, as shown in this short demo from Nicki's presentation. Another advantage is that you can do very dynamic restyling, on the server or client. And thirdly there are attractions for offline use in terms of data volumes compared to raster tiles.

Vector tiles are not a panacea though - raster tiles continue to have their advantages too. A key one is being able to exactly match complex symbology from an existing GIS, which is important in a lot of the applications we work on at Ubisense - large utilities can have a hundred feature types, thousands of symbology rules and complex custom annotation, which is not easy to replicate. Also raster tiles give great interoperability between different systems, an advantage that many people underestimate. And they work well in older browsers, whereas most vector technologies require newer browsers, and a lot of large enterprises are still stuck on older versions of Internet Explorer :(. This is less of an issue for mobile apps on tablets or smart phones though.

As an aside, I find it a bit ironic that when I started doing GIS in the 1980s, most products then stored their graphics in map tiles, which caused various problems with editing in particular, and managing features that were split between multiple tiles. The big focus of everyone in the industry at that time was to move away from tiles to a "continuous database" where you didn't have to split things across tile boundaries. But now tiles are cool again :) !! It's interesting how many computing problems oscillate in a similar way, moves between thin and fat clients being another example - mainframes to PCs to web browsers to smart phones, etc.

But caveats notwithstanding, these vector tiles definitely have a lot of potential and I certainly plan to be playing around with them in the near future!

Licensing

The elephant in the room with OpenStreetMap, unfortunately, continues to be licensing. I hesitate to even get into this topic, but will try to briefly summarize. There has been a long and painful process over the past several years to migrate the license for OpenStreetMap from Creative Commons to a new license called ODbL. Many smart people, many of whom are friends of mine, have put a huge amount of effort into creating this new license and managing the difficult process of steering the community through the change, all with great intentions. A key part of both the old and new licenses is the notion of "share alike" - roughly speaking that you can take a copy of the OpenStreetMap data, but if you make improvements to the data then you need to also make those freely available under the same license.

However, there was a very strong message from a range of organizations presenting at the conference that they were prevented from doing much of what they would like to do with OpenStreetMap because of the license - from government organizations like USGS, New York City and the National Park Service (NPS) to private companies like MapBox and Foursquare. Alex Barth did a presentation just on the licensing topic. I am not a lawyer, and I have also been steering pretty well clear of the license discussions in the last year or two, but I think there are two broad areas of concern. One is that if you use OpenStreetMap data in conjunction with your own private data, and there is sufficient interaction between the datasets, then there is a risk that the license could oblige you to make your private data freely available also, which is obviously not acceptable in many situations. Some things are clearly fine, like just using OpenStreetMap tiles as a backdrop to your data. But when you get into editing the underlying data in some way then it's more complicated. A second category is that certain government organizations like USGS and NPS are mandated to release the data they create and maintain as public domain. So they are unable to incorporate OpenStreetMap data as they would like to, because then at least some aspects if not all of their data cannot be released as public domain - it has to have the ODbL license, which imposes additional conditions on the use of the data.

There seem to be two schools of thought on how to proceed, as articulated by Mikel Maron:

The first category of problems could arguably be mitigated by further clarifications on ODbL, for example detailing specific use cases that are or aren't acceptable. Part of the problem is that since ODbL is a new license, until there are some lawsuits involving it, there isn't a legal precedent for how certain things will be interpreted (again as I understand it, I am not a lawyer or overly close to this!). However, I personally don't see how the second category of problems can be resolved without dropping share alike - it's just fundamentally incompatible with mandates that these government agencies have.

From my non-legal and somewhat removed perspective, I would really like to see OpenStreetMap be public domain. I fully understand the rationale for why we started out with a share alike approach, but I think it's just run into a lot of problems now, as seen from the various user presentations I mentioned. But it's also difficult to see how to get from here to there - having just been through a very divisive and often vitriolic multi-year process, which drove many people (including me) to be less engaged with OpenStreetMap, I don't know if many people are going to have the appetite for another license change. But who knows, perhaps there are new people who have the energy for it! It just seems to me that if organizations have to pay lawyers huge sums of money to decide whether they can use the data as they would like to (and often get the answer "no" back), then we don't have the sort of free and open database that I think most of us want.

Anyway, enough of that, back to more fun stuff ...

Passive Crowdsourcing

I have thought for a while that one of the next big steps for OpenStreetMap, and other crowdsourced map initiatives, is to make more use of passive crowdsourcing to complement the active crowdsourcing that has been the focus so far. By this I mean things like gathering bulk GPS traces from phones or in-car navigation systems. From these you can deduce a lot, like whether there is a road that is missing from the map (because there are lots of cars going through empty space at 30mph), or whether a one way road is incorrectly tagged, etc. To some degree OpenStreetMap is at a disadvantage here compared to the likes of Google and Apple, who have more direct access to GPS tracking information from navigation applications running on many smart phones.
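
To make the "cars going through empty space" idea a bit more concrete, here is a minimal sketch in Python. It is my own illustration, not anything presented at the conference: the function name, the grid approach and all of the thresholds are assumptions, and it expects coordinates that have already been projected to metres.

```python
from collections import Counter


def missing_road_cells(fixes, road_cells, cell_size=25.0, min_speed=8.0, min_hits=50):
    """Return grid cells with many fast-moving GPS fixes but no mapped road.

    fixes: iterable of (x, y, speed), x and y in metres, speed in m/s.
    road_cells: set of (col, row) grid cells already covered by a mapped road.
    All thresholds are illustrative guesses, not tuned values.
    """
    hits = Counter()
    for x, y, speed in fixes:
        if speed < min_speed:
            continue  # skip pedestrians, parked cars and stationary GPS jitter
        cell = (int(x // cell_size), int(y // cell_size))
        if cell not in road_cells:
            hits[cell] += 1
    # Only report cells with enough independent evidence
    return {cell for cell, count in hits.items() if count >= min_hits}
```

A real pipeline would also need to handle GPS noise near existing roads, tunnels, car parks and so on, which this sketch ignores.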

However, Telenav has been doing a lot with OpenStreetMap and Kristen Kam talked about their work. One thing they do is to identify potentially incorrectly tagged one way streets, by comparing the direction of GPS tracks versus the underlying OSM data. Currently these changes are pushed into MapRoulette, which presents a sequence of small fixes to a user for them to validate and either fix or not - Martijn Van Exel talked about this. I tested this out, and personally I found that most of the time (admittedly from a small sample), I couldn't tell whether it made sense to change a street to one way from the aerial imagery. I think potentially it would make more sense to make this particular type of edit as a direct batch update job - but nevertheless MapRoulette does work very well for many types of fix.
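
As a rough illustration of the direction comparison, here is a sketch in Python. This is not Telenav's actual method, which I don't know the details of. The function names, the trace format (entry and exit fix of each trace already matched to the way) and the thresholds are all assumptions for the purposes of the example.

```python
import math


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing from point 1 to point 2, in degrees [0, 360)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlon)
    return math.degrees(math.atan2(y, x)) % 360.0


def angle_diff(a, b):
    """Smallest absolute difference between two bearings, in degrees [0, 180]."""
    d = abs(a - b) % 360.0
    return 360.0 - d if d > 180.0 else d


def count_directions(way_start, way_end, traces, tolerance_deg=45.0):
    """Count traces travelling with and against the digitized direction of a way.

    way_start, way_end: (lat, lon) of the way's first and last node.
    traces: list of ((lat, lon), (lat, lon)) entry/exit fixes per GPS trace.
    """
    way_bearing = bearing_deg(*way_start, *way_end)
    forward = backward = 0
    for start, end in traces:
        diff = angle_diff(bearing_deg(*start, *end), way_bearing)
        if diff <= tolerance_deg:
            forward += 1
        elif diff >= 180.0 - tolerance_deg:
            backward += 1
    return forward, backward


def suspect_oneway(tagged_oneway, forward, backward, min_traces=20, max_minority=0.05):
    """Flag a way whose oneway tagging disagrees with the observed traffic."""
    total = forward + backward
    if total < min_traces:
        return False  # not enough evidence either way
    if tagged_oneway:
        # Tagged one way, but a meaningful share of traffic goes against it
        return backward / total > max_minority
    # Not tagged one way, but almost all traffic flows in a single direction
    return min(forward, backward) / total <= max_minority
```

Anything flagged by a check like this is only a candidate, of course, which is where a validation step like MapRoulette comes in.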

Another cool example of leveraging passively crowdsourced data came from Strava, who have an application that tracks runners and cyclists, and they have a very large number of GPS traces along trails, comprising billions of GPS points. Digitizing winding trails manually is quite time consuming, and Paul Mach presented a very cool solution called Slide, which has some pretty elaborate mathematics behind it, but basically would snap a very roughly digitized path to the "center of gravity" of the traces running along that path. You can try an interactive demo here, which I think is pretty cool! They also have a version of the same functionality that is integrated into (a fork of) the OpenStreetMap iD editor.
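
To give a feel for the "center of gravity" idea, here is a deliberately naive sketch in Python. To be clear, this is not the Slide algorithm, which has much more elaborate mathematics behind it. It just nudges each vertex of a rough path toward the mean of the trace points near it, and the function name, parameters and projected metre coordinates are all my own assumptions.

```python
import math


def snap_to_traces(path, trace_points, radius=25.0, iterations=5, step=0.5):
    """Pull each vertex of a rough path toward the mean of nearby trace points.

    path: list of (x, y) vertices in a projected, metre-based coordinate system.
    trace_points: list of (x, y) GPS points in the same coordinate system.
    radius, iterations and step are illustrative values, not tuned ones.
    """
    current = list(path)
    for _ in range(iterations):
        updated = []
        for vx, vy in current:
            near = [(px, py) for px, py in trace_points
                    if math.hypot(px - vx, py - vy) <= radius]
            if not near:
                updated.append((vx, vy))  # no traces nearby: leave the vertex alone
                continue
            cx = sum(p[0] for p in near) / len(near)
            cy = sum(p[1] for p in near) / len(near)
            # Move part of the way toward the local centre of the traces
            updated.append((vx + step * (cx - vx), vy + step * (cy - vy)))
        current = updated
    return current
```

One obvious weakness is that vertices are free to drift along the trail as well as toward it (end points in particular get pulled inward), so a more serious version would need to constrain that somehow.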

In general, it seems as though OpenStreetMap's routing capabilities have improved quite significantly, and there are a lot more applications using it for this purpose. 

Geocoding

Apart from routing, another historical weakness with OpenStreetMap has been addressing / geocoding. There is also quite a lot of effort going into this area, but it would seem with not as much progress yet as we've seen with routing. Randy Meech gave an interesting presentation on Mapzen's work on an open source geocoder, including autocomplete capabilities. You can check out a live demo of their "Pelias" geocoder here. It has a knowledge of neighborhood names in many cases, but is more limited on individual house addresses (because of what's available in OpenStreetMap). If you drag the map around it also does reverse geocoding based on the location of the target, where possible.

David Blackman of Foursquare gave a very entertaining talk that touched on a range of topics, including the construction of neighborhood and other boundary data from OpenStreetMap plus some other sources. One of the outputs of this work is Quattroshapes, where you can download a variety of useful polygon data for free. This data is used by Pelias and some other geocoding initiatives, I believe. Tyler Bell also gave a good talk focused on reverse geocoding, and there were multiple talks on doing address imports from county GIS datasets.

And more ...

I'm running out of time and energy here, so just a quick bullet list of a few more miscellaneous things that I think are worth a mention:
  • An interesting presentation from Mike Skalnik on what GitHub is doing with GeoJSON - sort of cool for smaller simpler datasets, but with limitations for more complex applications
  • Jeff Johnson of Boundless talked about GeoGit, which is an ambitious initiative to manage distributed updates, and something I've been interested in exploring but haven't had a chance to get to (one thing we did a lot of work on at Smallworld back in the early 1990s was version management, which has a lot of parallels with this). Jeff talked about some interesting scenarios in how this could potentially be used to manage two way sync between OSM and a local government (or other) GIS (license constraints permitting!)
  • In the bar at the end of the conference I chatted to Morgan Herlocker, who is developing an open source library I hadn't come across called Turf, for geospatial processing in JavaScript (server or client side) - sort of like a GEOS or JTS, but independently developed. I haven't looked at it in detail, but it seemed like this could be useful for some things we are working on.
  • Mapillary is an interesting looking project that aims to create a crowdsourced Street View from stitched together photos taken by smart phones, something that I've been thinking for a while is a project that is needed. You can download their app for your smart phone and upload some pictures - I had a very quick play capturing some data near my home this afternoon and it came out fairly well, I will experiment more in due course.
  • John Firebaugh gave an excellent talk on "Implementing change in OpenStreetMap", with lots of great insights on managing community based projects. He had a lot of good one line pearls of wisdom, one that I liked in particular was that "perfect" is the enemy of "much better than we have today". 
  • Lyzi Diamond gave a great talk on "Maptime" meetups, which I found especially interesting as one of the organizers of the Geospatial Amateurs meetup in Denver. They take a very hands on approach to helping people learn new things, and have developed quite a few lessons and tutorials that are available on GitHub, and I certainly plan to look more at those.
  • OSM Tchoutchou is an entertainingly named site that maps French trains in near real time using OSM, which is quite cool to watch.
And there's much more too - sorry for those whose presentations I didn't get to (either physically or in this post), but I really am out of time for now, and want to get something posted. Thanks again to everyone involved for making it such a great and energizing event. And I encourage everyone to watch the recorded presentations.