A Japanese translation of this post is available on http://www.openstreetmap.org/user/MAPconcierge/diary/36106

Over the last few weeks, the data team at Mapbox has been investigating the unusually large number of unconnected highways in Japan, which otherwise looks comprehensively mapped.

Broken highways in Japan. Bigger circles indicate highways of higher classification.
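A check for broken highways of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the data team's actual tooling. It assumes ways are given as lists of node IDs and nodes as (lat, lon) pairs, and it flags any way endpoint that no other way uses but that lies within a few metres of a node on another way. The function names, the input layout and the 5 m threshold are assumptions made for the example. Comparing against nodes only, rather than the segments between them, is a simplification.

```python
import math
from collections import Counter

EARTH_RADIUS_M = 6371000.0


def haversine_m(a, b):
    """Great-circle distance in metres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def find_near_misses(ways, nodes, threshold_m=5.0):
    """Return (way_id, node_id, other_way_id) for every way endpoint that is
    not shared with any other way but sits within threshold_m of a node
    belonging to another way.

    ways  -- dict: way id -> ordered list of node ids (hypothetical layout)
    nodes -- dict: node id -> (lat, lon)
    """
    # How many times each node is referenced across all ways.
    usage = Counter(n for refs in ways.values() for n in refs)
    hits = []
    for way_id, refs in ways.items():
        # dict.fromkeys de-duplicates while keeping order (single-node ways).
        for end in dict.fromkeys((refs[0], refs[-1])):
            if usage[end] > 1:
                continue  # node is shared (or the way is closed): connected
            for other_id, other_refs in ways.items():
                if other_id == way_id:
                    continue
                if any(haversine_m(nodes[end], nodes[n]) <= threshold_m
                       for n in other_refs):
                    hits.append((way_id, end, other_id))
    return hits
```

The pairwise comparison is quadratic in the number of ways, so on a country-sized extract a spatial index would be needed. The sketch only shows the shape of the test.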

Looking into the data threw up quite a few interesting findings:

Most of the data in Japan comes from the 2011 Yahoo Japan import, which contains over 5 million road segments.

The road data has positional errors (5-30m) when compared to GPS data and includes incorrect topology that does not match satellite imagery.
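To put a number on this kind of positional error, one option is to measure how far each point of a GPS trace lies from the nearest segment of the mapped road. The sketch below assumes both the road and the trace are plain lists of (lat, lon) pairs and projects them onto a local flat plane, which is a reasonable approximation over a few hundred metres. The function names and input layout are hypothetical, and the road must have at least two points.

```python
import math

EARTH_RADIUS_M = 6371000.0


def to_local_xy(lat, lon, lat0, lon0):
    """Project (lat, lon) to metres on a flat plane centred on (lat0, lon0)."""
    x = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all in metres)."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return math.hypot(px - ax, py - ay)  # degenerate segment
    # Clamp the projection of p onto the segment to the range [0, 1].
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / seg_len2))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def offsets_from_trace(road, trace):
    """Distance in metres from each GPS point to the nearest road segment."""
    lat0, lon0 = road[0]
    road_xy = [to_local_xy(lat, lon, lat0, lon0) for lat, lon in road]
    segments = list(zip(road_xy, road_xy[1:]))
    return [
        min(point_segment_distance(to_local_xy(lat, lon, lat0, lon0), a, b)
            for a, b in segments)
        for lat, lon in trace
    ]
```

Offsets that sit consistently in the 5-30 m range along a trace would correspond to the errors described above. Isolated spikes are more likely to be GPS noise.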

Roads in metropolitan areas have been realigned to the correct position, but large parts of the country have remained untouched since 2011.

Many motorable roads are tagged as paths, and many paths are tagged as roads.

Roads are split into small segments at every junction.

The classification of minor roads seems to be based on a YH:width tag that has inconsistent road width values compared to imagery. The result is arbitrary segments of tertiary, unclassified and residential roads throughout the map.
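One way to check how closely the minor-road classification follows the width tag is to cross-tabulate the two. The sketch below assumes each road segment is available as a plain dict of tags and treats the width value as an opaque string, since the value format is not described here. The tag key is spelled as in this post and its casing in the actual data may differ. The function names are hypothetical.

```python
from collections import Counter, defaultdict


def width_by_class(segments, width_key="YH:width"):
    """Count width values per highway class.

    segments -- iterable of tag dicts, one per road segment (assumed layout)
    Returns {highway_class: {width_value: count}}.
    """
    table = defaultdict(Counter)
    for tags in segments:
        highway = tags.get("highway")
        width = tags.get(width_key)
        if highway is None or width is None:
            continue  # skip segments missing either tag
        table[highway][width] += 1
    return {highway: dict(counts) for highway, counts in table.items()}


def ambiguous_widths(table):
    """Return the width values that occur under more than one highway class."""
    classes_per_width = defaultdict(set)
    for highway, counts in table.items():
        for width in counts:
            classes_per_width[width].add(highway)
    return {width: sorted(classes)
            for width, classes in classes_per_width.items()
            if len(classes) > 1}
```

If the classification were derived purely from the width tag, `ambiguous_widths` would come back empty. A non-empty result points to segments that were reclassified later or classified by some other rule.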

The Bing imagery coverage for Japan is comprehensive but does not match the OSM data or Strava GPS data. It has both an offset and orthorectification errors that vary throughout Japan. New mappers using iD end up realigning the data to the incorrect Bing imagery, causing more inconsistencies.

Highly detailed maps from Japan GSI are available for tracing into OSM. On closer inspection, the major roads are accurate, but the minor roads are not reliable.

There is high-resolution orthorectified imagery for Japan from GSI, which perfectly matches Strava GPS data and is the best imagery source available.

The coverage of orthorectified imagery from GSI is limited to the major urban areas.

Fixing the map

The complexities of the data issues in Japan make fixing the data a challenging task. In the four years since the import, large parts of the data have remained untouched.

Members of the osm-ja community have expressed how these large-scale data inconsistencies make it hard for grassroots mapping to happen.

Our current OSM tools are not ready for a data cleanup of this scale. The task calls for smarter tools and a cleanup strategy that can empower the local mapping community to fix the map.

The data team was eager to take up this challenge and asked the Japanese community for input on how to approach the issue. You can follow the remapping trials and our findings in our /mapping repository. In a later post, I’d like to document the cleanup strategy using existing tools, along with lessons that could help make this a more comprehensive effort.