Google’s Cloud Platform offers a great set of tools for creating easily scalable applications in the cloud. In this post, I’ll explore some of the special challenges of working with geospatial data in a cloud environment, and how Google’s Cloud can help. I’ve found that there aren’t many options to do this, especially when dealing with complicated operations like geofencing with multiple complex polygons. You can find the complete code for my approach on GitHub.
Geofencing is the process of determining whether a location lies within a certain fence, e.g. neighborhood boundaries, school attendance zones or even the outline of a shop in a mall. It’s particularly useful in mobile applications that need to apply this extra context to someone’s exact location. The process isn’t as straightforward as you’d hope: depending on the complexity of your fences, it can involve some intense calculations, and if your app gets a lot of use, you need to make sure this doesn’t impact performance.
To simplify this problem, this blog post outlines the process of creating a scalable but affordable geofencing API on Google’s App Engine.
And the best part? It’s completely free to start playing around.
If you want to dive right in, you can download the finished geofencing API from my GitHub account.
To include the JTS library in your build path, simply add the JTS Maven dependency to the pom.xml file in your project’s root directory.
In addition, I’m using the GSON library to handle JSON within the Java backend. You can use basically any JSON library you want. If you want to use GSON, import this dependency.
https://gist.github.com/tschaeff/778b12fb116c36e3c08c
Now you need to create an endpoint called add. This endpoint expects a string for the group name, a boolean indicating whether to rebuild the spatial index, and a JSON object representing the fence’s object model. From this, App Engine creates a new fence and writes it to Cloud Datastore.
Cloud Datastore uses internal indexes to speed up queries. If you deploy the API directly to App Engine, you’re probably going to get an error message saying that the Datastore query you’re using needs an index. The App Engine development server can auto-generate the indexes, so I’d recommend testing all your endpoints on the development server before pushing them to App Engine.
https://gist.github.com/tschaeff/ae7279e8d0e5317e4862
You then build the index and write it to memcache for fast read access.
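To make the build-then-cache step a bit more concrete, here is a minimal sketch of the pattern in plain Java. It is not the code from my API: a `HashMap` of byte arrays stands in for memcache, the "index" is just a list of bounding boxes rather than a real JTS spatial index, and the names `IndexCacheSketch`, `getIndex` and the per-group cache key are assumptions made up for this illustration. The point it shows is that the index is stored in serialized form, rebuilt only when the cache is empty or a rebuild is requested, and read back from the cache otherwise.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class IndexCacheSketch {
    // Stand-in for memcache (assumption): cache key -> serialized index bytes.
    static final Map<String, byte[]> CACHE = new HashMap<>();
    // Counts how often the index was actually rebuilt.
    static int rebuilds = 0;

    // Builds a toy "index": one bounding box {minX, minY, maxX, maxY} per fence.
    // Each fence is given as an array of {x, y} vertices.
    static ArrayList<double[]> buildIndex(List<double[][]> fences) {
        rebuilds++;
        ArrayList<double[]> index = new ArrayList<>();
        for (double[][] ring : fences) {
            double minX = Double.POSITIVE_INFINITY, minY = Double.POSITIVE_INFINITY;
            double maxX = Double.NEGATIVE_INFINITY, maxY = Double.NEGATIVE_INFINITY;
            for (double[] v : ring) {
                minX = Math.min(minX, v[0]);
                minY = Math.min(minY, v[1]);
                maxX = Math.max(maxX, v[0]);
                maxY = Math.max(maxY, v[1]);
            }
            index.add(new double[] {minX, minY, maxX, maxY});
        }
        return index;
    }

    // A cache holds serialized values, so the index has to survive a round trip.
    static byte[] serialize(ArrayList<double[]> index) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(index);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException("index is not serializable", e);
        }
    }

    @SuppressWarnings("unchecked")
    static ArrayList<double[]> deserialize(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return (ArrayList<double[]>) in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("cached index is unreadable", e);
        }
    }

    // Returns the cached index for a group, rebuilding it on a miss or on request.
    static ArrayList<double[]> getIndex(String group, List<double[][]> fences, boolean rebuild) {
        byte[] cached = CACHE.get(group);
        if (cached == null || rebuild) {
            ArrayList<double[]> index = buildIndex(fences);
            CACHE.put(group, serialize(index));
            return index;
        }
        return deserialize(cached);
    }
}
```

In the real API the fences would come from Cloud Datastore and the cache would be App Engine's memcache service; the control flow around the rebuild flag is the part this sketch is meant to convey.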
For this, you first need to retrieve the index from memcache. Then query the index with the bounding box of the point, which returns a list of candidate polygons. Since the index only tests against the bounding boxes of the polygons, you need to iterate through the list and test whether the point actually lies within each polygon.
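The two-step idea (cheap bounding-box filter first, exact test second) can be sketched without any dependencies. Note that this is a simplified stand-in and not the JTS-based code of the API: it uses `java.awt.geom.Path2D` from the JDK for the point-in-polygon test and a linear scan where a real spatial index would be queried. The `Fence` class and the method names are hypothetical.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

// Hypothetical fence model for this sketch: a name plus a closed polygon outline.
class Fence {
    final String name;
    final Path2D.Double shape;
    final Rectangle2D bounds; // bounding box, the only thing an index looks at

    Fence(String name, double[][] vertices) {
        this.name = name;
        this.shape = new Path2D.Double();
        shape.moveTo(vertices[0][0], vertices[0][1]);
        for (int i = 1; i < vertices.length; i++) {
            shape.lineTo(vertices[i][0], vertices[i][1]);
        }
        shape.closePath();
        this.bounds = shape.getBounds2D();
    }
}

class PointQuerySketch {
    // Step 1: keep only fences whose bounding box contains the point.
    // A spatial index answers this quickly; here it is a plain loop.
    static List<Fence> candidates(List<Fence> fences, double x, double y) {
        List<Fence> hits = new ArrayList<>();
        for (Fence f : fences) {
            if (f.bounds.contains(x, y)) {
                hits.add(f);
            }
        }
        return hits;
    }

    // Step 2: run the exact point-in-polygon test on the candidates only.
    static List<String> fencesContaining(List<Fence> fences, double x, double y) {
        List<String> names = new ArrayList<>();
        for (Fence f : candidates(fences, x, y)) {
            if (f.shape.contains(x, y)) {
                names.add(f.name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        List<Fence> fences = new ArrayList<>();
        // An L-shaped fence: its bounding box covers far more than the polygon itself.
        fences.add(new Fence("L-shape",
                new double[][] {{0, 0}, {4, 0}, {4, 1}, {1, 1}, {1, 4}, {0, 4}}));
        System.out.println(fencesContaining(fences, 0.5, 0.5)); // inside the polygon
        System.out.println(fencesContaining(fences, 3, 3));     // inside the box only
    }
}
```

The L-shaped fence shows why the second step matters: the point (3, 3) passes the bounding-box filter but lies outside the actual polygon.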
https://gist.github.com/tschaeff/b23844335ec70688ea4c
When testing for a polygon, you want to get back all fences that are either completely or partly contained in the polygon. Therefore you test whether each returned fence is within the polygon or is not disjoint from it. For some use cases you only want to return fences that are completely contained within the polygon. In that case, remove the not-disjoint test from the if clause.
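To illustrate the difference between the two tests, here is a small dependency-free sketch. It uses `java.awt.geom.Area` from the JDK instead of JTS, so it only approximates the JTS predicates: in particular, two shapes that merely touch along an edge share no area and count as non-overlapping here, whereas JTS would not report them as disjoint. The class and method names are hypothetical.

```java
import java.awt.geom.Area;
import java.awt.geom.Path2D;

class PolygonQuerySketch {
    // Builds a closed polygon from a flat list of coordinates: x0, y0, x1, y1, ...
    static Path2D.Double polygon(double... xy) {
        Path2D.Double p = new Path2D.Double();
        p.moveTo(xy[0], xy[1]);
        for (int i = 2; i < xy.length; i += 2) {
            p.lineTo(xy[i], xy[i + 1]);
        }
        p.closePath();
        return p;
    }

    // "Not disjoint" (approximation): the fence and the query share some area.
    static boolean overlaps(Path2D fence, Path2D query) {
        Area shared = new Area(fence);
        shared.intersect(new Area(query));
        return !shared.isEmpty();
    }

    // "Within": the fence has area and no part of it lies outside the query.
    static boolean within(Path2D fence, Path2D query) {
        Area outside = new Area(fence);
        outside.subtract(new Area(query));
        return !new Area(fence).isEmpty() && outside.isEmpty();
    }

    // Partial matches are accepted unless the caller asks for full containment.
    static boolean matches(Path2D fence, Path2D query, boolean fullyContainedOnly) {
        return fullyContainedOnly ? within(fence, query) : overlaps(fence, query);
    }

    public static void main(String[] args) {
        Path2D query = polygon(0, 0, 10, 0, 10, 10, 0, 10);
        Path2D inside = polygon(2, 2, 4, 2, 4, 4, 2, 4);
        Path2D straddling = polygon(8, 8, 12, 8, 12, 12, 8, 12);
        System.out.println(matches(inside, query, true));
        System.out.println(matches(straddling, query, false));
        System.out.println(matches(straddling, query, true));
    }
}
```

A fence that lies within the query polygon also overlaps it, so the overlap test alone already covers the "completely or partly contained" case; the within test is what you keep when partial matches should be excluded.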
If you want to ensure the best performance and scalability, you should consider switching from the free, shared memcache to a dedicated memcache for your application. This guarantees enough capacity for your spatial index, even with a large number of complex fences.
That’s it - that’s all you need to create a fast and scalable geofencing API.
Preview: Processing Big Spatial Data in the Cloud with Dataflow
In my next post I will show you how I geofenced more than 340 million NYC Taxi locations in the cloud using Google’s new Big Data tool called Cloud Dataflow.