Tuesday, 30 July 2013

Notifying You

The GeoNet Project provides a range of ways that you can stay up to date with geohazards information for New Zealand.  This includes smart phone applications that provide near real time earthquake push notifications for both Android and iOS devices.  I've worked with the internet for a long time but some days it still amazes me - receiving an earthquake notification on a phone in my pocket within moments of an earthquake occurring is one of those times.  It's both disarmingly simple and complex at the same time.  Here's how we get the earthquake location to the phone in your hand.

Getting push notifications to your phone involves services in several countries.  The numbered steps are described below.

1. The Application on Your Phone

To receive near real time push notifications you have to have the GeoNet app for Android or iOS installed on your phone, it has to be connected to the internet, and you have to have your quake preferences set appropriately.

2. Pushing to the Phone

We don't connect to your phone directly to send the push notifications. Apple or Google do this for us via either the Apple Push Notification Service or Google Cloud Messaging. When it's on the internet (and you've given it permission) your phone maintains a connection to the Apple or Google servers and receives push notification messages over that connection. When your phone isn't connected those servers store messages to send to your phone the next time it connects to the internet. Using these services also allows the application on the phone to receive messages even when it's in the background.

We use Urban Airship to send push notifications to the Apple and Google messaging services.  Urban Airship give us a single API to send messages to and do all the hard work of maintaining the connections to the Apple and Google services as well as staying up to date with any changes to those services.  Urban Airship also provide some handy reporting tools for us.

3. Managing Your Preferences

We run a back end application to store your quake notification preferences.  When a quake message is received the preferences are used to decide which phones should be notified about this quake and a push message is sent to Urban Airship.  We run the back end application in AWS Elastic Beanstalk in the Sydney region.  This gives us very easy deployment, high availability, and easy scaling as demand increases (like it did after the recent earthquakes).
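To make the matching step a little more concrete, here's a minimal sketch of how stored preferences might be used to pick the phones to notify.  It is not our actual back end code: the `Quake` and `Preference` fields (a minimum magnitude and an optional set of regions) and the `devices_to_notify` function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Quake:
    public_id: str   # hypothetical identifier for the event
    magnitude: float
    region: str      # hypothetical region label, e.g. "wellington"


@dataclass(frozen=True)
class Preference:
    device_token: str                 # token the push service uses to address a phone
    min_magnitude: float = 0.0        # only notify at or above this magnitude
    regions: frozenset = frozenset()  # empty set means "anywhere"


def devices_to_notify(quake, preferences):
    """Return the device tokens whose stored preferences match this quake."""
    tokens = []
    for pref in preferences:
        if quake.magnitude < pref.min_magnitude:
            continue  # quake is too small for this user
        if pref.regions and quake.region not in pref.regions:
            continue  # user only wants quakes from other regions
        tokens.append(pref.device_token)
    return tokens
```

The resulting list of tokens is what would be handed on to the push provider in a single request, so the matching cost stays on our side and the fan out to phones stays on theirs.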

4. Locating the Quake

This is done by GeoNet Rapid, automatically in near real time and reviewed by the duty officer as required.  It's using all that great data coming from the remote seismic sensor network.

The Internet is Amazing!

When the notification arrives on my phone it has started from a message about an earthquake location in New Zealand, been sent to Sydney, forwarded on to America, and sent back to my phone in New Zealand.  This all happens within moments of the earthquake occurring.  That's a journey of over 25,000 km passing through several cloud services.  This might not be magic but every time it happens it seems pretty close to it to me.
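If you want to check the distance, here's a rough sketch that adds up great circle distances between three waypoints.  The coordinates are assumptions: Wellington stands in for "New Zealand" and Portland, Oregon is picked only as an example point on the US west coast, since the real server locations aren't stated here.  Real network paths are longer than great circles, so treat the result as a lower bound.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

# Assumed waypoints as (latitude, longitude) in degrees.
WELLINGTON = (-41.29, 174.78)
SYDNEY = (-33.87, 151.21)
US_WEST_COAST = (45.52, -122.68)  # Portland, Oregon, chosen only as an example


def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def journey_km(waypoints):
    """Sum the great circle distance over consecutive waypoints."""
    return sum(great_circle_km(a, b) for a, b in zip(waypoints, waypoints[1:]))


total = journey_km([WELLINGTON, SYDNEY, US_WEST_COAST, WELLINGTON])
print(f"Round trip: about {total:,.0f} km")
```

With these assumed waypoints the total comes out at a bit over 25,000 km, which is consistent with the figure above.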

Friday, 19 July 2013

Earthquake Stats: Magnitude 5.7, Friday, July 19 2013 at 9:06:39 am

This morning's earthquake shook a good bit of the country, me included. We have so far collected 6556 Felt Reports (updated from earlier counts of 6167 and 6219) - thanks to all who have submitted them.  We've been using a couple of new things to monitor the web site and this has been their first real workout. Here's what we saw.

Real User Monitoring with Pingdom

We're using Pingdom for Real User Monitoring (RUM).  This lets us find out how the web site performs from your point of view, in the browser.  As I've written before we spend a lot of time on web site optimization and RUM with Pingdom gives a window into the results of that (and nothing else, we're not tracking you here, just web site performance).  RUM is a big improvement over what we used to do (post processing server logs with awstats) because we get to see page loads and how long they take in the browser in real time instead of just requests per second.  Here's what it looked like for this morning:

The top graph shows page load time in the browser.  The darker blue line is today, overlaid on yesterday (the lighter grey line).  The median page load time over this time period is 0.67s.  Fast!  The graph below this shows the number of page views in 5 minute windows.  The traffic goes from the usual low background to 44,480 views in 5 minutes very very quickly.  Below the graphs is a map showing page load time by country.  What's really nice is when the traffic hits the page load time doesn't go up - in fact it goes down a little, probably due to improved caching.

We also use Pingdom for uptime monitoring on our web servers.

Application Server Monitoring for Felt Reports

The application that collects Felt Reports got overloaded for a little while.  Thanks for your patience in submitting them.  To monitor the application server that hosts Felt we use two tools.  The first is Jolokia, which provides a JMX-HTTP bridge.  If you've ever worked with JMX and the inevitable firewall issues you will know what I mean when I say that Jolokia is a breath of fresh air for JVM monitoring.  With Jolokia in place it is easy to write a script to query for JVM metrics.  We send the metrics to Librato Metrics to store and visualize them.  Librato Metrics provides a fantastic online tool for visualizing any time series data you care to send them.  What's more, for the 100 or so metrics we send at the moment it's costing about $7 per month - a total bargain.  Here's what we saw this morning:

In the graph Tomcat JK-8009 we see that the Felt app couldn't create more threads to serve additional requests for about 8 minutes.  Everything else was fine.  We'd like to never see this happen but it's a difficult situation to improve.  Eight minutes is too short a time to spin up enough additional capacity to make a big difference, and we can't justify the cost of having extra capacity sitting around doing nothing for most of the time.  We've got some ideas for rewriting this application but we're currently very busy working on improving data and data access for science.  I hope we will have time to sneak in some work on Felt later this year.
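For the curious, the kind of script described above, which queries Jolokia and forwards the numbers to Librato Metrics, can be sketched roughly like this.  It is not our production script.  The MBean name (it varies with Tomcat version and connector), the metric prefix, and the source label are assumptions for illustration.  The request and response shapes follow Jolokia's JSON POST protocol and Librato's gauges payload as I understand them, so check both against the current docs before relying on this.

```python
import base64
import json
import urllib.request

# Assumed for illustration: the real MBean name depends on the Tomcat setup.
THREAD_POOL_MBEAN = 'Catalina:type=ThreadPool,name="jk-8009"'
ATTRIBUTES = ("currentThreadsBusy", "maxThreads")


def jolokia_read_requests(mbean, attributes):
    """Build a Jolokia bulk 'read' request, one entry per attribute."""
    return [{"type": "read", "mbean": mbean, "attribute": a} for a in attributes]


def parse_jolokia_responses(responses):
    """Map attribute name to value, skipping any read that did not return 200."""
    values = {}
    for resp in responses:
        if resp.get("status") == 200:
            values[resp["request"]["attribute"]] = resp["value"]
    return values


def librato_gauges(values, source, prefix="felt.tomcat"):
    """Build a gauges payload, adding a derived thread pool utilisation gauge."""
    gauges = [{"name": f"{prefix}.{name}", "value": value, "source": source}
              for name, value in sorted(values.items())]
    busy, limit = values.get("currentThreadsBusy"), values.get("maxThreads")
    if busy is not None and limit:
        gauges.append({"name": f"{prefix}.utilisation",
                       "value": busy / limit, "source": source})
    return {"gauges": gauges}


def post_json(url, payload, user=None, token=None):
    """POST a JSON document, optionally with HTTP basic auth, and decode the reply."""
    headers = {"Content-Type": "application/json"}
    if user and token:
        cred = base64.b64encode(f"{user}:{token}".encode()).decode()
        headers["Authorization"] = f"Basic {cred}"
    req = urllib.request.Request(url, data=json.dumps(payload).encode(), headers=headers)
    with urllib.request.urlopen(req, timeout=10) as reply:
        body = reply.read()
    return json.loads(body) if body else None
```

Run from cron every minute or so, a script like this would call `post_json` once against the Jolokia endpoint with `jolokia_read_requests(...)`, then once against the Librato metrics endpoint with `librato_gauges(...)` and the account credentials.  A utilisation gauge sitting at 1.0 is the "couldn't create more threads" situation in the graph above.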

I've also been doing a little work recently to modernize the monitoring for our 500 or so remote field sites and that data is going to go to Librato Metrics.  If I get time I'll write about it in the future.

Tools like Pingdom, Librato Metrics, and Jolokia have been very useful for gaining real time insight into our systems.  The arrival of great services in the cloud is providing huge benefit for us: we can spend less time building our own monitoring systems and more time focusing on the business problems and, unfortunately, with all these earthquakes business is good.