NumerousApp Metrics

Sorry to report that Numerous had to shut down on May 1, 2016, and now this entire page has become irrelevant.

I’m leaving the page here in case you stumble upon it from other links somewhere; note that the preview links for the metrics being discussed are now all broken because the Numerous server itself is gone.

Original page follows.


 

  • Disclaimer – I’m an investor in Numerous.
  • More postings of mine can be found under the tag numerousapp.

I wrote a python class for accessing NumerousApp. General developer info can be found on the NumerousApp web site (also see more details all the way at the bottom of this note).

Using that class I’ve integrated a bunch of variables of my own into NumerousApp.

The ones I think are (somewhat) interesting:

  • Number of flights in the U.S. known to the Air Traffic Control system and currently in the air. Updated every 30 minutes.

 

Geek Info:

Flights Airborne

I am screen scraping this off of FlightAware, using a simple general-purpose screen scraper I built with python and Beautiful Soup.

The screen scraper code itself is available here: http://neilwebber.com/files/scraper.

In this particular case the data I am looking for is contained in a “strong” element and the number appears within the phrase “Tracking NNN” where NNN is an integer. Using the screen scraper program that becomes:

scraper -s strong -p 'Tracking {candidate_value}' --int http://www.flightaware.com/live

The scraper program is pretty simple-minded but works surprisingly well on a lot of web pages.
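
If you want the flavor of what’s going on under the hood, here is a minimal sketch of the same idea (not the actual scraper code linked above): fetch the page, walk the “strong” elements, and pull the integer out of a “Tracking NNN” phrase.

import re
import requests
from bs4 import BeautifulSoup

def flights_airborne(url="http://www.flightaware.com/live"):
    # fetch the page and look at each <strong> element for "Tracking NNN"
    soup = BeautifulSoup(requests.get(url, timeout=30).text, "html.parser")
    for elem in soup.find_all("strong"):
        m = re.search(r"Tracking\s+([\d,]+)", elem.get_text())
        if m:
            return int(m.group(1).replace(",", ""))
    return None  # phrase not found (page format may have changed)

print(flights_airborne())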

Ozone/Air Quality

The ozone and air quality variables are screen-scraped from www.airnow.gov using the same scraper program. The scraper runs every 30 minutes but the data itself is updated on the site much less frequently (and usually goes offline entirely overnight).

Moon

The moon phase uses static data derived from the US Naval Observatory web site. It does not screen scrape fresh each night; rather, I did a one-time screen scrape session to build a table good until 2021. This improves reliability (protects against screen scrape breakage if the USNO site changes format) at the expense of having to remember to refresh the table in 2021.
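
The lookup side of this is trivial; here is a sketch of the idea (the table format and the dates shown are illustrative examples, not the actual table I built from the USNO data):

import bisect
import datetime

# (event date, phase) pairs in date order; the real table extends to 2021.
# The dates below are approximate examples for illustration only.
PHASE_TABLE = [
    (datetime.date(2015, 1, 5), "Full Moon"),
    (datetime.date(2015, 1, 13), "Last Quarter"),
    (datetime.date(2015, 1, 20), "New Moon"),
    # ... and so on through 2021
]

def phase_on(day):
    # return the most recent phase event on or before the given date
    dates = [d for d, _ in PHASE_TABLE]
    i = bisect.bisect_right(dates, day) - 1
    return PHASE_TABLE[i][1] if i >= 0 else None

print(phase_on(datetime.date(2015, 1, 15)))  # -> "Last Quarter"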

Google Ping

The google ping times are derived using a script I wrote running on server boxes I have at my hilltop and my loft.
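
The script isn’t anything fancy; roughly speaking (this is a simplified sketch, not the actual script) it shells out to ping and parses the reported round-trip time:

import re
import subprocess

def ping_ms(host="google.com"):
    # run the system ping once and parse the "time=NN ms" value from its output
    out = subprocess.run(["ping", "-c", "1", host],
                         capture_output=True, text=True).stdout
    m = re.search(r"time=([\d.]+)\s*ms", out)
    return float(m.group(1)) if m else None

print(ping_ms())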

Hilltop Network Speed

The hilltop internet download speed uses the speedtest_cli python package to duplicate what speedtest.net reports. On my hilltop network I have a soekris box running FreeBSD that I use for these (and other) scripts. This box performs many status monitoring functions on my network. It has a gigabit ethernet connection and enough horsepower to drive the speedtest at a reasonable clip.
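
The measurement itself is only a few lines; something along these lines, written against the current speedtest module API (which may differ a bit from the older speedtest_cli script interface):

import speedtest

def download_mbps():
    st = speedtest.Speedtest()
    st.get_best_server()              # pick the closest/fastest test server
    return st.download() / 1_000_000  # download() returns bits per second

print("%.1f Mb/s" % download_mbps())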

On my loft network I am using a raspberry pi server to run a fing sentinel and my google ping script. The pi isn’t as fast and, most importantly, only has 100Mb ethernet. As a comparison test I tried running the same speedtest at my hilltop on both the soekris box and the pi; the pi results are about 30-40% of the numbers I get from the soekris box. For this reason (given that I only have a pi at the loft) I don’t bother reporting download performance metrics from the loft. Perhaps someday I will stick another soekris box on the loft network though that seems overkill just for this performance metric. Then again, everything about the architecture of all my networks is already overkill, so I’ll probably go ahead and do this at some point.

As an aside, now that internet speeds are (in some areas) exceeding 100Mb/s, it’s important that you start paying attention to your switching equipment and whether it is gigabit or only 10/100. You won’t get full download speeds if your network is throttled to 10/100 at any critical point between you and your internet connection. You also probably won’t get that peak performance off a WiFi connection (though this depends on a lot of details). My stubborn insistence on hard-wiring whenever possible, even if it requires extra work to retrofit wires to TV locations (e.g., for Netflix streaming boxes like a Roku), may finally pay off. 🙂

Power Fail

The hilltop power fail time is updated every 30 minutes, by reading the uptime from various devices on my network and taking the longest uptime (from devices not on a UPS) as the time of the last power failure. This seems to give pretty reliable results. It could be wrong if I had to reboot all the queried devices for some reason. This metric obviously isn’t quite real time; while the power is out it isn’t getting updated but will reflect the correct value once the power comes back on. My maintenance guy and I find it useful to have access to this number because sometimes knowing that the power failed overnight or “N hours ago” helps explain otherwise seemingly random problems.
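
The calculation itself is just “now minus the longest uptime”; collecting the uptimes from the various devices (ssh, SNMP, whatever each device supports) is the part glossed over in this sketch:

import datetime

def last_power_fail(uptimes_seconds):
    # uptimes come from devices NOT on a UPS; the longest one marks the
    # time of the last power failure (give or take boot time)
    longest = max(uptimes_seconds)
    return datetime.datetime.now() - datetime.timedelta(seconds=longest)

# e.g. uptimes (in seconds) read from three devices on the network
print(last_power_fail([271234, 271190, 271300]))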

Cable Modem Reboot

I have a Synaccess NP-02(B) “netBooter” device controlling the power to my cable modem. You program the box to ping something (I use google) and, if connectivity fails (you set a threshold for how many pings in a row have to fail), it power-cycles the attached device. This way my cable modem automatically gets rebooted whenever it wedges, as Time Warner cable modems seem, for whatever reason, to do periodically. It’s annoying enough to have to go into your garage or basement or wherever TWC has decided is the “right” place for your cable modem; in my case the cable modem is down the hill from my house and nearly a quarter mile away (my network link back up the hill is carried on fiber). So I’m really happy to have this automatic reboot device to unwedge the network when the modem loses its mind.

The netBooter keeps a log of reboot events and I extract that into a Numerous metric reporting the date of the most recent cable modem reboot (not caused by a power failure).

NumerousApp Server Response Time

This performs a simple “get user info” API call to the server, using my python class as a client. The underlying requests library provides a timing facility that comes very close to capturing the “on the wire” time (i.e., it excludes any local overhead in my python class and most of the overhead in the requests library itself).
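
Concretely, the number comes from the response object’s elapsed attribute; something along these lines (the endpoint URL and auth details below are approximations of the now-defunct Numerous API, shown only to illustrate the timing technique):

import requests

def api_response_time_ms(api_key):
    # endpoint and auth are approximate; the point is requests' elapsed timer,
    # which measures from sending the request until the headers are parsed
    r = requests.get("https://api.numerousapp.com/v2/users/me",
                     auth=(api_key, ""))
    r.raise_for_status()
    return r.elapsed.total_seconds() * 1000

print(api_response_time_ms("my-api-key"))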

The numbers are typically somewhere in the neighborhood of 300msec, which is consistent with my experience of typically getting slightly more than 3 API calls per second as a response rate from the server. This is for a single thread; if you fire multiple requests at the server simultaneously the overall API throughput rate is much higher, but any individual thread will see performance around 300msec per API call.

NumerousApp Atomicity Test

The “Atomicity Test” used to show a bug in the NumerousApp server. Their API includes a primitive to add a value to a metric, which of course you could also do by reading the metric, adding the value yourself, and then writing the result back. The advantage of using the built-in ADD operation is that it is supposed to be atomic: if two people simultaneously try to add something to a metric then the result is supposed to include both additions. If you read, add, and write back the variable yourself, obviously two people doing that at the same time might “lose” one of the updates (it could get clobbered by the other update). So the ADD operation is supposed to be the better way.

I first developed this metric to demonstrate a bug in the NumerousApp server.  The test sets the value to zero then fires 100 parallel “ADD 1” operations at the server. The result should be 100; if it is less than 100 we know that some of the updates got lost. Because we can read back the event stream from a metric (the stream of update requests) my test program can verify that the server did in fact receive all 100 requests; once that is verified we know for certain the variable should have ended up at 100.
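
Schematically the test looks something like this (the metric method names here are illustrative, not necessarily the exact interface of my numerous class):

from concurrent.futures import ThreadPoolExecutor

def atomicity_test(metric, n=100):
    metric.write(0)                                  # reset the metric to zero
    with ThreadPoolExecutor(max_workers=n) as pool:  # fire n parallel "ADD 1"s
        futures = [pool.submit(metric.write, 1, add=True) for _ in range(n)]
        for f in futures:
            f.result()                               # surface any errors
    return metric.read()                             # should be exactly n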

The bug has been fixed, so unless there is a regression this metric is always going to show the value 100. It is one of the most interesting “constant” values on Numerous!

Random

Finally, the “Random Number” is what it says it is: a random number, in the range 0 to 99 (inclusive). It is generated every 30 minutes using the random.org service. There’s no particularly good reason to have this on Numerous; obviously you could just surf the random.org web site yourself. I just did it because I can. The random.org web site has a quota policy for automated clients such as this but they give you 200000 random bits a day. 0 .. 99 takes (inefficiently) 7 bits and I’m updating it 48 times a day, so I’m only using 336 bits of quota a day (and I therefore did not even implement a quota check in the code).
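
The fetch itself is a single HTTP request to random.org’s plain-text integer interface:

import requests

def random_0_99():
    # random.org's documented HTTP interface for plain-text integers
    r = requests.get("https://www.random.org/integers/",
                     params={"num": 1, "min": 0, "max": 99,
                             "col": 1, "base": 10,
                             "format": "plain", "rnd": "new"})
    r.raise_for_status()
    return int(r.text.strip())

print(random_0_99())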

Python and Ruby

All of these variables are ultimately uploaded to NumerousApp using their API and a python interface I wrote, which is available as “numerous” on pypi (i.e., you can “sudo pip install numerous”) and whose repository is Nappy on github.

The documentation for the Python class is also published in a Wiki (on github).

If you’ve made it all the way down to here I’ll mention for completeness that I also wrote a Ruby gem for the Numerous API, available as numerousapp on rubygems.org (sudo gem install numerousapp) and its github repository is numeruby.
