About the significance of machine cognition



Big AI is watching you!

Last month, I wrote about a newly developed smart sensor that uses sound and AI to identify activities. Nest, a Google company, has now upped the ante.

Nest's newest home surveillance camera, the Nest Cam IQ, comes not only with a whopping 4K sensor but also with new "brains" to analyze the video and audio stream. These "brains" are not built into the camera, of course. It feels as though we are light-years away from having that sort of capability in a package this small. The AI doing the video analysis runs on the Nest Aware service, which is available as a subscription. You can bet your booty that Nest benefits from Google's new TPU 2.0 hardware, as analyzing video and audio content simultaneously and live takes a lot of processing power.

Anyone (like me) who runs a home surveillance system based on simple pixel change algorithms knows the benefit that intelligent analysis of video streams would bring. To give you an idea of the difference, allow me to explain how current systems (with some exceptions) work. The goal of surveillance recording is not to net the most hours of video material. The goal is to single out security issues, either so that it is easy to find the relevant footage after the fact or - ideally - to get a system alert while an issue occurs. Traditional surveillance systems analyze pixel changes in the video and flag situations in which a certain number of pixels (corresponding to an object of a certain size) change within a certain timeframe. Think of a dog jumping over a fence or a door being opened.
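To make the pixel-change idea concrete, here is a minimal sketch in Python. It is not how any particular product does it: the function names, the brightness threshold of 25 and the 5% "changed area" cutoff are all made up for illustration, and frames are just small grids of grayscale values.

```python
from typing import List

# A grayscale frame: rows of pixel brightness values from 0 to 255.
Frame = List[List[int]]


def changed_pixels(prev: Frame, curr: Frame, pixel_threshold: int = 25) -> int:
    """Count pixels whose brightness changed by more than pixel_threshold."""
    return sum(
        1
        for prev_row, curr_row in zip(prev, curr)
        for p, c in zip(prev_row, curr_row)
        if abs(p - c) > pixel_threshold
    )


def motion_detected(prev: Frame, curr: Frame,
                    pixel_threshold: int = 25,
                    min_changed_fraction: float = 0.05) -> bool:
    """Flag an event when a large enough share of the frame has changed."""
    total = len(curr) * len(curr[0])
    changed = changed_pixels(prev, curr, pixel_threshold)
    return changed / total >= min_changed_fraction


# Tiny 4x4 example: a bright 2x2 "object" appears in the second frame.
empty = [[10] * 4 for _ in range(4)]
with_object = [row[:] for row in empty]
for y in (1, 2):
    for x in (1, 2):
        with_object[y][x] = 200

print(motion_detected(empty, empty))        # nothing changed
print(motion_detected(empty, with_object))  # 4 of 16 pixels changed
```

Note that the check has no idea what moved. A swaying branch that changes enough pixels trips it just as readily as a person does.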

While it is pretty simple to trigger an event when a door that pretty much fills the camera frame is opened, the situation changes drastically if the camera is recording a yard. Wind will move bushes and trees, and this will trigger the recording mechanism, resulting in a lot of recorded material. If you want to find out if someone is trespassing in your yard during the day, you'll have to look at hours and hours of video of moving trees.

There have been systems available that use algorithms to try to discern the shape of a person, for example. One of these that I have used personally is from Sighthound, Inc., which has continued to develop the product I used several years ago ("Vitamin D"). Sighthound claims to have an SDK available that permits the training of a neural network used for facial recognition that runs locally on an iPad, using only that iPad's hardware.

While the training and recognition of individual faces may even work in a simple AI on an iPad, the video analysis that Nest offers goes several levels higher on the complexity scale. Think of it this way: instead of recognizing someone's face when they stand in front of a camera, a complete AI solution should be able to detect a lot more information, such as

"John Doe, wearing blue shorts and a Metallica T-Shirt. John just entered the dining room carrying a tablet computer and put that tablet computer on the table, leaving the room in the direction of the kitchen. John looks tired".

If that doesn't sound like big brother watching, I don't know what does!

Since the new Nest camera not only records video in ultra high resolution but also sound in excellent quality, the data is sufficient for an AI to discern a lot of activities. Want to know how your aging mother is doing in her apartment three blocks down? In a scenario where the Nest service is connected to your Alexa account (which is planned), you could just say: "Alexa, what is mom doing right now?" and Alexa would answer "Your mother is sitting at the table doing a crossword puzzle." And while your mom might not appreciate being watched 24/7 by an online camera system, both of you will likely appreciate an urgent message being automatically sent out that your mom has been lying motionless on the floor for two minutes.

I would have an immediate use for a system like that: one that tells me when our cat is sitting in front of the terrace sliding door, waiting to be let in. I don't think I'll be putting these devices into my kids' bedrooms anytime soon.

Google's TPU 2.0

Google made its AI framework TensorFlow open source in late 2015. Most AI frameworks use relatively inexpensive and widely available GPUs (Graphics Processing Units) to accelerate AI-related computation, as doing this with regular CPUs isn't very cost-effective (in terms of output per watt). But even GPUs - while much more efficient than CPUs for this type of activity - can be beaten by specialized hardware.

Research into making AI-related computing more efficient has shown that "bittedness" isn't what drives performance. While modern operating systems require 64-bit processors for efficient work, deep learning machines are quite happy to work with 8 bits, as long as there are lots and lots of nodes to avoid swapping.
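To illustrate why 8 bits can be enough, here is a small Python sketch. It is a generic textbook-style quantization with one shared scale factor per list, not the scheme Google's hardware actually uses, and the weights and inputs are invented numbers. The point is only that a dot product computed on small integers lands close to the floating-point result.

```python
from typing import List, Tuple


def quantize(values: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(v) for v in values) or 1.0  # avoid dividing by zero
    scale = max_abs / 127.0
    return [round(v / scale) for v in values], scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate floats from the 8-bit integers."""
    return [q * scale for q in quantized]


# Made-up weights and inputs for a single "neuron".
weights = [0.82, -0.41, 0.05, -1.27]
inputs = [1.0, 2.0, -3.0, 0.5]

q_weights, w_scale = quantize(weights)
q_inputs, i_scale = quantize(inputs)

# Multiply-accumulate on integers only, then rescale once at the end.
int_dot = sum(w * x for w, x in zip(q_weights, q_inputs))
approx = int_dot * w_scale * i_scale
exact = sum(w * x for w, x in zip(weights, inputs))

print("8-bit weights:", q_weights)
print("exact:", exact, "approximate:", approx)
```

The rounding error per value is small, and a neural network with lots of nodes tends to tolerate it. That is what lets specialized hardware spend its silicon on many simple integer units instead of a few wide ones.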

Google research developed a Tensor Processing Unit (TPU) to work alongside their TensorFlow framework. The device was designed to plug into a regular hard-drive slot, making roll-out of large clusters of TPUs quite simple: set up server racks with slot-in hard drive capacity and plug these (mostly) full of TPUs.

The advantage of the TPU over a GPU-based solution is a much higher number of operations per second per watt. Put simply: faster number crunching with lower power requirements. Google's TPU was never made available to the market, though I wouldn't be surprised to find something similar - albeit much scaled down - in a Google Android phone in a few years.

Things don't stay at a standstill in IT, especially not at Google, so it comes as no surprise that the successor to the TPU has been announced. This "TPU 2.0" (also dubbed "Cloud TPU") device doesn't offer the same form factor as version 1. Just looking at the towering heat sink gives you a feeling that there is quite a bit of neural capability waiting to be unleashed.

And indeed: while the original TPU could only be used to run pre-trained neural networks, this new version is designed to facilitate an efficient learning cycle as well. According to Google, the TPU 2.0 can train a neural net several times faster than comparable GPU farms.

The TPU 2.0 was designed specifically for a Cloud-based offering. In other words: anyone can put together an AI solution using Google's open-source TensorFlow framework and run this solution on the Google Cloud with access to TPU 2.0 farms. All at a price, of course. Will this be a success for Google? In my opinion, selling TPU time via Cloud-based AIaaS (AI as a Service) isn't the prime objective of all the R&D that has gone into this new device. Google itself has transformed into an AI company, with most of the services it offers, from Maps to Search to Pictures, using AI in some form. Not to forget Google Home - the associated service requires intensive computation for natural language processing (NLP) of voice input.

As the world moves to AI - and who wouldn't like to have some intelligence built into their "Smart"-Phone? - you can bet your booties that companies like AMD, Intel or Nvidia are hard at work designing industrial or even consumer-grade AI hardware. The next two years will likely show a plethora of TPU-like processing devices coming to a computer store near you!