[Introduction to TensorFlow] A new programming paradigm (Video & Exercise)

A primer in machine learning

Coding has been the bread and butter for developers since the dawn of computing. We're used to creating applications by breaking down requirements into composable problems that can then be coded against.

So, for example, if we have to write an application that figures out a stock analytic, maybe the price divided by earnings (the P/E ratio), we can usually write code to get the values from a data source, do the calculation, and then return the result.

Or if we're writing a game, we can usually figure out the rules. For example, if the ball hits the brick, then the brick should vanish and the ball should rebound. But if the ball falls off the bottom of the screen, then maybe the player loses a life.

We can represent that with this diagram. Rules and data go in, answers come out. Rules are expressed in a programming language, and data can come from a variety of sources, from local variables all the way up to databases.

Machine learning rearranges this diagram: we put answers and data in, and we get rules out. So instead of us as developers figuring out the rules (when should the brick be removed, when should the player's life end, or what's the desired analytic for any other concept), we get a bunch of examples of what we want to see and then have the computer figure out the rules. This is particularly valuable for problems that you can't solve by figuring the rules out for yourself.

So consider this example: activity recognition. If I'm building a device that detects if somebody is, say, walking, and I have data about their speed, I might write code that reports walking when the speed is below some threshold. If they're running, well, that's a faster speed, so I could adapt my code with a second, higher threshold. If they're biking, that's not too bad either; I can adapt my code again with one more branch. But then I have to do golf recognition too, and now my concept becomes broken. Not only that, doing it by speed alone is, of course, quite naive. We walk and run at different speeds uphill and downhill, and other people walk and run at different speeds to us.
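As a rough sketch of the rule-based approach described above, the speed checks could look like the following in Python. The threshold values and the function name are assumptions made up for illustration; they are not taken from the course material.

```python
# Hypothetical speed thresholds, chosen only for illustration.
WALKING_MAX_SPEED = 4.0
RUNNING_MAX_SPEED = 12.0


def classify_activity(speed):
    """Rule-based guess at the activity, using speed alone."""
    if speed < WALKING_MAX_SPEED:
        return "walking"
    elif speed < RUNNING_MAX_SPEED:
        return "running"
    else:
        return "biking"


for speed in (3.0, 9.0, 20.0):
    print(speed, classify_activity(speed))
```

Notice that there is no sensible speed range to add for golfing, which is exactly where hand-written rules of this kind stop working.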

So, let's go back to this diagram. Ultimately, machine learning is very similar, but we're just flipping the axes. Instead of me trying to express the problem as rules, when often that isn't even possible and I'd have to compromise, the new paradigm is that I get lots and lots of examples, put labels on those examples, and use the data to say: this is what walking looks like, this is what running looks like, this is what biking looks like and, yes, even this is what golfing looks like.

So, then it becomes answers and data in, with rules being inferred by the machine. A machine learning algorithm figures out the specific patterns in each set of data that determine the distinctiveness of each. That's what's so powerful and exciting about this programming paradigm. It's more than just a new way of doing the same old thing. It opens up new possibilities that were infeasible before. So in the next few minutes, I'm going to show you the basics of creating a neural network, which is the workhorse of this type of pattern recognition. A neural network is just a slightly more advanced implementation of machine learning, and we call that deep learning. Fortunately, it's actually very easy to code.

So, we're just going to jump straight into deep learning. We'll start with a simple one and then we'll move on to one that does computer vision in about 10 lines of code. But let's start with a very simple "Hello World" example. So you can see just how everything hangs together.

The 'Hello World' of neural networks

Earlier we mentioned that machine learning is all about a computer learning the patterns that distinguish things. Like for activity recognition, it was the pattern of walking, running and biking that can be learned from various sensors on a device. To show how that works, let's take a look at a set of numbers and see if you can determine the pattern between them.

Okay, here are the numbers: X = -1, 0, 1, 2, 3, 4 and Y = -3, -1, 1, 3, 5, 7. There's a formula that maps X to Y. Can you spot it? Take a moment.

Well, the answer is Y equals 2X minus 1. So whenever you see a Y, it's twice the corresponding X minus 1. If you figured it out for yourself, well done, but how did you do that? How would you think you could figure this out? Maybe you can see that the Y increases by 2 every time the X increases by 1. So it probably looks like Y equals 2X plus or minus something.

Then you saw that when X equals 0, Y equals minus 1, so you thought, hey, the something is a minus 1, and the answer might be Y equals 2X minus 1. You probably tried that out with a couple of other values and saw that it fits. Congratulations, you've just done the basics of machine learning in your head.
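The mental steps above can be written out as a small Python sketch. It uses the six X and Y values from this lesson; the variable names are just illustrative.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

# Step 1: how much does Y change each time X increases by 1?
steps = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
slope = steps[0]

# Step 2: the "something" is the value of Y where X equals 0.
intercept = ys[xs.index(0.0)]

# Step 3: check the candidate formula against every known pair.
fits = all(slope * x + intercept == y for x, y in zip(xs, ys))
print(slope, intercept, fits)  # 2.0 -1.0 True
```

This only works because a person already suspected a straight line; the point of the neural network is to arrive at the same two numbers from the data without being told how.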

So let's take a look at it in code now. Okay, here's our first line of code.

This is written using Python and TensorFlow, with an API in TensorFlow called Keras. Keras makes it really easy to define neural networks. A neural network is basically a set of functions which can learn patterns. Don't worry if there are a lot of new concepts here; they will become clear quite quickly as you work through them.

The simplest possible neural network is one that has only one neuron in it, and that's what this line of code does. In Keras, you use the word Dense to define a layer of connected neurons. There's only one Dense here, so there's only one layer, and there's only one unit in it, so it's a single neuron. Successive layers are defined in sequence, hence the word Sequential. But as I've said, there's only one, so you have a single neuron. You define the shape of what's input to the neural network in the first (and in this case the only) layer, and you can see that our input shape is super simple: it's just one value.
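To make the idea of a single neuron concrete: with one input value and no activation function (the Keras default for a Dense layer), a single neuron computes a weight times the input plus a bias. The plain-Python function below is only a sketch of that arithmetic, and the parameter values are hypothetical; in a real network they are learned during training.

```python
def single_neuron(x, weight, bias):
    """Output of one neuron with one input and no activation function."""
    return weight * x + bias


# Hypothetical parameter values, picked to match Y = 2X - 1.
print(single_neuron(10.0, weight=2.0, bias=-1.0))  # 19.0
```

Training, then, is just the search for the weight and bias that make this function match the known data.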

You've probably seen that for machine learning you need to know and use a lot of math: calculus, probability and the like. It's really good to understand that as you want to optimize your models, but the nice thing for now about TensorFlow and Keras is that a lot of that math is implemented for you in functions.

There are two functional roles that you should be aware of though and these are loss functions and optimizers.

This code defines them. I like to think about it this way. The neural network has no idea of the relationship between X and Y, so it makes a guess. Say it guesses Y equals 10X minus 10. It will then use the data that it knows about, that's the set of Xs and Ys that we've already seen to measure how good or how bad its guess was.

The loss function measures this and then gives the data to the optimizer, which figures out the next guess. So the optimizer looks at how good or how bad the guess was, using the data from the loss function. The logic is that each guess should be better than the one before. As the guesses get better and better, and accuracy approaches 100 percent, the term convergence is used. In this case, the loss is mean squared error and the optimizer is SGD, which stands for stochastic gradient descent. If you want to learn more about these particular functions, as well as the other options that might be better in other scenarios, check out the TensorFlow documentation. But for now, we're just going to use this.
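To see the two roles side by side, here is a minimal plain-Python sketch of one guess-measure-improve step. It is not what Keras runs internally: it uses ordinary gradient descent over all six points rather than a stochastic variant, and the function names and the learning rate of 0.01 are assumptions for illustration.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]


def mean_squared_error(w, b):
    """Loss: average squared gap between the guess w*x + b and the known y."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_step(w, b, learning_rate=0.01):
    """Optimizer: nudge w and b in the direction that lowers the loss."""
    n = len(xs)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


w, b = 10.0, -10.0  # the first guess from the text: Y = 10X - 10
loss_before = mean_squared_error(w, b)
w, b = gradient_step(w, b)
loss_after = mean_squared_error(w, b)
print(loss_before, loss_after)  # the second loss is lower than the first
```

The loss function only scores the guess; it is the optimizer that decides how to change the guess so the next score is better.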

Our next step is to represent the known data.

These are the Xs and the Ys that you saw earlier. The np.array call uses a Python library called NumPy that makes data representation, particularly in lists, much easier. So here you can see we have one list for the Xs and another one for the Ys.

The training takes place in the fit command.

Here we're asking the model to figure out how to fit the X values to the Y values. The epochs=500 value means that it will go through the training loop 500 times. This training loop is what we described earlier: make a guess, measure how good or how bad the guess is with the loss function, then use the optimizer and the data to make another guess, and repeat.
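The loop described above can be sketched in plain Python to show what 500 rounds of guessing look like. This is a simplified stand-in for the fit command, not its actual implementation: the starting guess of zero for both parameters and the step size of 0.01 are assumptions, and every epoch uses all six points.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]

w, b = 0.0, 0.0        # assumed starting guess
learning_rate = 0.01   # assumed step size
n = len(xs)

for epoch in range(500):
    # Make a guess and measure the errors against the known answers.
    errors = [w * x + b - y for x, y in zip(xs, ys)]
    loss = sum(e * e for e in errors) / n
    # Use the gradient of the loss to pick the next guess.
    grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2 * sum(errors) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

prediction = w * 10.0 + b
print(loss, prediction)
```

After 500 epochs the loss is tiny and the prediction for 10 lands just under 19, because the loop stops before the weight and bias have reached exactly 2 and -1.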

When the model has finished training, it will then give you back values using the predict method.

The model hasn't previously seen 10, so what do you think it will return when you pass it a 10? You might think it would return 19 because, after all, Y equals 2X minus 1. But when you try this in the workbook yourself, you'll see that it will return a value very close to 19 but not exactly 19.

Now, why do you think that would be?

Ultimately there are two main reasons.

The first is that you trained it using very little data. There are only six points. Those six points are linear, but there's no guarantee that for every X the relationship will be Y equals 2X minus 1. There's a very high probability that Y equals 19 for X equals 10, but the neural network isn't positive, so it will figure out a realistic value for Y.

That's the second main reason. When using neural networks, as they try to figure out the answers for everything, they deal in probability. You'll see that a lot, and you'll have to adjust how you handle answers to fit. Keep that in mind as you work through the code. Okay, enough theory. Now let's get hands-on and write the code that we just saw, and then we can run it.

Working through ‘Hello World’ in TensorFlow and Python

In the previous section, you saw some details behind the concept and paradigms of machine learning. You saw how it was a change from a rules-based expression using code to getting data, labeling that data, and then having a neural network figure out the patterns that bring about the rules. You looked through a very simple example that took some x and y values and figured out the relationship between them.

Okay, now you're going to get hands-on with writing this code for yourself. You don't need a development environment to do it; one option is to run it right in the browser with something called Google Colaboratory.

If you're familiar with Jupyter Notebooks in Python, you'll be right at home, and of course, you can use Jupyter Notebooks too. Otherwise, consider Colab to be an environment that runs in the browser that lets you run, edit, and inspect your Python code. It's really cool for learning with. If you want more details about it, check out this video on YouTube.

Here is the Colab environment that I'm using with the Notebook for this lesson loaded into it. I'll step through the lesson first, and then you can go and try it out for yourself. You can run the code by clicking the play button in a code block. Make sure you run each block in order or you might get some bugs.

In this block, I am importing and setting up TensorFlow, Keras, and NumPy. Next, I'll define the neural network as we discussed. It's one layer, with one neuron, and one input value. Now I'm going to compile the neural network using the loss function and the optimizer. Remember, these help the neural network guess the pattern, measure how well or how badly the guess performed, before trying again on the next epoch, and slowly getting more accurate.

Here's where we give it the data, where we have a known x and a known corresponding y, and we want to figure out the relationship between them. These are fed in using NumPy arrays. Here's where we do the training and the machine learns the patterns. We're asking the model to fit the x's to the y's, and it will try for 500 epochs. Each epoch is where it makes a guess, uses the loss function to figure out how well or how badly it did, and then uses the optimizer to make another guess. When it's run, keep an eye on the loss on the right-hand side.

Remember, it knows the correct answer for the values we've fed in, so it can compare its guesses against them. When I start, my loss is quite high, i.e., the guess wasn't that great. But epoch by epoch, you can see the loss getting lower and lower, so the neural network is getting closer to spotting the relationship. By the time 500 epochs have transpired, the loss is really, really small, meaning that for this data, it has come really close to figuring out the relationship between x and y. So let's see what happens if we give it an x that it hasn't previously seen. In this case, 10. What do you think the answer will be? As you can see, it ends up as 18.98, very close to 19 but not quite 19. Do you remember the reasons why? Check back to the lesson in the previous video to see the answer to this.

The Hello World of Deep Learning with Neural Networks

Like every first app, you should start with something super simple that shows the overall scaffolding for how your code works.

In the case of creating neural networks, the sample I like to use is one where it learns the relationship between two numbers. So, for example, if you were writing code for a function like this, you already know the 'rules' --

float hw_function(float x){
    float y = (2 * x) - 1;
    return y;
}

So how would you train a neural network to do the equivalent task? Using data! By feeding it with a set of Xs, and a set of Ys, it should be able to figure out the relationship between them.

This is obviously a very different paradigm than what you might be used to, so let's step through it piece by piece.

Imports

Let's start with our imports. Here we are importing TensorFlow and calling it tf for ease of use.

We then import a library called numpy, which helps us to represent our data as lists easily and quickly.

The framework for defining a neural network as a set of Sequential layers is called keras, so we import that too.

import tensorflow as tf
import numpy as np
from tensorflow import keras

Define and Compile the Neural Network

Next we will create the simplest possible neural network. It has 1 layer, and that layer has 1 neuron, and the input shape to it is just 1 value.

model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])

Now we compile our Neural Network. When we do so, we have to specify 2 functions, a loss, and an optimizer.

If you've seen lots of math for machine learning, here's where it's usually used, but in this case it's nicely encapsulated in functions for you. Let's explain what happens here.

We know that in our function, the relationship between the numbers is y=2x-1.

When the computer is trying to 'learn' that, it makes a guess...maybe y=10x+10. The LOSS function measures the guessed answers against the known correct answers and measures how well or how badly it did.

It then uses the OPTIMIZER function to make another guess. Based on how the loss function went, it will try to minimize the loss. At that point maybe it will come up with something like y=5x+5, which, while still pretty bad, is closer to the correct result (i.e. the loss is lower).
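To check the claim that y=5x+5 is a better guess than y=10x+10, the mean squared error of each guess can be worked out by hand against the six known points. The helper name below is an assumption for illustration.

```python
xs = [-1.0, 0.0, 1.0, 2.0, 3.0, 4.0]
ys = [-3.0, -1.0, 1.0, 3.0, 5.0, 7.0]


def loss_for_guess(slope, intercept):
    """Mean squared error of the guess y = slope*x + intercept."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


print(loss_for_guess(10, 10))  # about 715.67
print(loss_for_guess(5, 5))    # 136.5
print(loss_for_guess(2, -1))   # 0.0
```

The loss falls as the guess gets closer to y=2x-1 and reaches zero only for the exact relationship.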

It will repeat this for the number of EPOCHS which you will see shortly. But first, here's how we tell it to use 'MEAN SQUARED ERROR' for the loss and 'STOCHASTIC GRADIENT DESCENT' for the optimizer. You don't need to understand the math for these yet, but you can see that they work! :)

Over time you will learn the different and appropriate loss and optimizer functions for different scenarios.

model.compile(optimizer='sgd', loss='mean_squared_error')

Providing the Data

Next up we'll feed in some data. In this case, we are taking 6 xs and 6 ys. You can see that the relationship between these is that y=2x-1, so where x = -1, y = -3, and so on.

A Python library called NumPy provides lots of array-type data structures that are a de facto standard way of doing it. We declare that we want to use these by specifying the values as an np.array.

xs = np.array([-1.0,  0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)

ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)
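As a quick sanity check before training, the same arrays can be inspected and compared against the known rule. This sketch assumes only that NumPy is installed; it does not need TensorFlow.

```python
import numpy as np

xs = np.array([-1.0, 0.0, 1.0, 2.0, 3.0, 4.0], dtype=float)
ys = np.array([-3.0, -1.0, 1.0, 3.0, 5.0, 7.0], dtype=float)

# Each array is one-dimensional with six values.
print(xs.shape, ys.shape)  # (6,) (6,)

# Arithmetic on an array applies to every element at once,
# so the whole rule y = 2x - 1 can be checked in one line.
print(np.array_equal(ys, 2 * xs - 1))  # True
```

Element-wise arithmetic like this is a large part of why arrays are preferred over plain Python lists for this kind of data.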

Training the Neural Network

The process of training the neural network, where it 'learns' the relationship between the Xs and Ys is in the model.fit call. This is where it will go through the loop we spoke about above, making a guess, measuring how good or bad it is (aka the loss), using the optimizer to make another guess etc. It will do it for the number of epochs you specify. When you run this code, you'll see the loss on the right-hand side.

model.fit(xs, ys, epochs=500)

Ok, now you have a model that has been trained to learn the relationship between X and Y. You can use the model.predict method to have it figure out the Y for a previously unknown X. So, for example, if X = 10, what do you think Y will be? Take a guess before you run this code:

print(model.predict(np.array([10.0])))

You might have thought 19, right? But it ended up being a little under. Why do you think that is?

Remember that neural networks deal with probabilities, so given the data that we fed the NN with, it calculated that there is a very high probability that the relationship between X and Y is Y=2X-1, but with only 6 data points we can't know for sure. As a result, the result for 10 is very close to 19, but not necessarily 19.

As you work with neural networks, you'll see this pattern recurring. You will almost always deal with probabilities, not certainties, and will do a little bit of coding to figure out what the result is based on the probabilities, particularly when it comes to classification.
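For classification, that "little bit of coding" is often just picking the label with the highest probability. The sketch below uses made-up probabilities and the activity labels from earlier in this lesson; none of the numbers come from a real model.

```python
# Hypothetical class probabilities, as a classifier might output them.
labels = ["walking", "running", "biking", "golfing"]
probabilities = [0.05, 0.80, 0.10, 0.05]

# Pick the position of the largest probability, then look up its label.
best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
print(labels[best_index], probabilities[best_index])  # running 0.8
```

The model never says "this is running" outright; it reports how likely each option is, and the code turns that into an answer.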

Week 1 Quiz

Weekly Exercise - Your First Neural Network

Reading: Colaboratory

Watching: Get started with Google Colaboratory (Coding TensorFlow)

Playlist: TensorFlow in Google Colaboratory

Exercise 1 - House Prices - Answer

In this exercise you'll try to build a neural network that predicts the price of a house according to a simple formula.

So, imagine house pricing were as simple as this: a house costs 50k + 50k per bedroom, so that a 1 bedroom house costs 100k, a 2 bedroom house costs 150k, etc.

How would you create a neural network that learns this relationship, so that it would predict a 7 bedroom house as costing close to 400k?

Hint: Your network might work better if you scale the house price down. You don't have to give the answer 400; it might be better to create something that predicts the number 4, and then your answer is in the 'hundreds of thousands'.

import tensorflow as tf
import numpy as np
from tensorflow import keras
model = tf.keras.Sequential([keras.layers.Dense(units=1, input_shape=[1])])
model.compile(optimizer='sgd', loss='mean_squared_error')
xs = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], dtype=float)
ys = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 3.5], dtype=float)
model.fit(xs, ys, epochs=1000)
print(model.predict(np.array([7.0])))
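The scaled ys in the answer above follow directly from the pricing formula once prices are expressed in units of 100k. The short sketch below shows that derivation and how to convert a scaled value back to a price; the function name is an assumption for illustration.

```python
def scaled_price(bedrooms):
    """Price in units of 100k under the formula: 50k + 50k per bedroom."""
    return (50_000 + 50_000 * bedrooms) / 100_000


bedrooms = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
targets = [scaled_price(n) for n in bedrooms]
print(targets)  # [1.0, 1.5, 2.0, 2.5, 3.0, 3.5]

# A model trained on these targets predicts in units of 100k,
# so multiply by 100,000 to read a result as a price.
print(scaled_price(7.0) * 100_000)  # 400000.0
```

A trained model should therefore return a value close to 4 for a 7 bedroom house, which corresponds to roughly 400k.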