Recently at TaskRabbit we upgraded Elasticsearch from 1.5 to 1.7. One consequence of the upgrade is that the newer version seems to be stricter about consecutive duplicate points in the GeoJSON shapes we store.
One feature we have lets Taskers draw maps of the geographical areas they want to work in. This can easily lead to consecutive duplicate points, for example if the Tasker happens to double-click on the same spot in the map drawing tool. It was particularly problematic because some maps that were already stored in Elasticsearch under version 1.5 could no longer be updated: the shape was now considered invalid because of these duplicate points.
So we had to write a little script to go through the shapes and remove the duplicates. This was very straightforward to do, but I figured someone might run into a similar issue and find the actual code helpful:
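A minimal sketch of what such a class can look like, based on the description in this post. The class name DuplicatePointRemover is hypothetical, and the input is assumed to be GeoJSON polygon-style coordinates (an array of rings, each an array of [longitude, latitude] pairs):

```ruby
# Hypothetical class name, used here for illustration only.
class DuplicatePointRemover
  attr_reader :fixed_coordinates

  # coordinates: an array of rings, each ring an array of
  # [longitude, latitude] pairs (assumed GeoJSON polygon layout).
  def initialize(coordinates)
    @fixed_coordinates = coordinates.map do |ring|
      ring.each_with_object([]) do |point, kept|
        # Skip the point if it equals the one we kept last.
        kept << point unless kept.last == point
      end
    end
  end
end

remover = DuplicatePointRemover.new(
  [[[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]]
)
remover.fixed_coordinates
# => [[[0.0, 0.0], [1.0, 0.0], [1.0, 1.0], [0.0, 0.0]]]
```

Only consecutive duplicates are dropped, so the closing point of the ring, which repeats the first point, is left alone.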
As you can see, the class receives the coordinates as a param, for example:
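Here is an illustrative value. The points are made up, and the nesting assumes GeoJSON polygon-style coordinates with [longitude, latitude] pairs:

```ruby
# Hypothetical polygon: one outer ring of [longitude, latitude] pairs.
# The second point repeats the first, as a double click would produce.
coordinates = [
  [
    [-122.42, 37.77],
    [-122.42, 37.77], # consecutive duplicate
    [-122.40, 37.77],
    [-122.40, 37.79],
    [-122.42, 37.77]  # closes the ring; not a consecutive duplicate
  ]
]
```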
It then goes through the outer coordinates array and, within each entry, through each coordinate point; if the previous point is the same, we just don’t add it to our @fixed_coordinates array. Finally, we expose the result via an attr_reader.
We put the fixup in a before_save that only runs if the coordinates themselves change, and wrote a Rake task that goes through all the existing maps and fixes them up if needed.
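The "fix them up if needed" pass can be sketched in plain Ruby. This is only a rough illustration: the method names dedupe_rings and fix_maps are hypothetical, and each map is a plain Hash with a :coordinates key standing in for a persisted record, so the real Rake task and callback would look different.

```ruby
# Drops consecutive duplicate points from each ring.
def dedupe_rings(coordinates)
  coordinates.map { |ring| ring.chunk_while { |a, b| a == b }.map(&:first) }
end

# Hypothetical stand-in for the Rake task: each map is a plain Hash here,
# whereas a real task would load and save stored records.
# Returns the number of maps that needed fixing.
def fix_maps(maps)
  maps.count do |map|
    fixed = dedupe_rings(map[:coordinates])
    next false if fixed == map[:coordinates] # nothing to fix, skip the write

    map[:coordinates] = fixed
    true
  end
end
```

Comparing the fixed coordinates with the stored ones before writing keeps the pass from touching maps that were already valid.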