{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.colab import drive\n",
"\n",
"drive.mount('/content/drive', force_remount=True)\n",
"\n",
"# 输入daseCV所在的路径\n",
"# 'daseCV' 文件夹包括 '.py', 'classifiers' 和'datasets'文件夹\n",
"# 例如 'CV/assignments/assignment1/daseCV/'\n",
"FOLDERNAME = None\n",
"\n",
"assert FOLDERNAME is not None, \"[!] Enter the foldername.\"\n",
"\n",
"%cd drive/My\\ Drive\n",
"%cp -r $FOLDERNAME ../../\n",
"%cd ../../\n",
"%cd daseCV/datasets/\n",
"!bash get_datasets.sh\n",
"%cd ../../"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import tensorflow as tf\n",
"print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-title"
]
},
"source": [
"# What's this TensorFlow business?\n",
"\n",
"You've written a lot of code in this assignment to provide a whole host of neural network functionality. Dropout, Batch Norm, and 2D convolutions are some of the workhorses of deep learning in computer vision. You've also worked hard to make your code efficient and vectorized.\n",
"\n",
"For the last part of this assignment, though, we're going to leave behind your beautiful codebase and instead migrate to one of two popular deep learning frameworks: in this instance, TensorFlow (or PyTorch, if you choose to work with that notebook)."
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"#### What is it?\n",
"TensorFlow is a system for executing computational graphs over Tensor objects, with native support for performing backpropogation for its Variables. In it, we work with Tensors which are n-dimensional arrays analogous to the numpy ndarray.\n",
"\n",
"#### Why?\n",
"\n",
"* Our code will now run on GPUs! Much faster training. Writing your own modules to run on GPUs is beyond the scope of this class, unfortunately.\n",
"* We want you to be ready to use one of these frameworks for your project so you can experiment more efficiently than if you were writing every feature you want to use by hand. \n",
"* We want you to stand on the shoulders of giants! TensorFlow and PyTorch are both excellent frameworks that will make your lives a lot easier, and now that you understand their guts, you are free to use them :) \n",
"* We want you to be exposed to the sort of deep learning code you might run into in academia or industry. "
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"## How will I learn TensorFlow?\n",
"\n",
"TensorFlow has many excellent tutorials available, including those from [Google themselves](https://www.tensorflow.org/get_started/get_started).\n",
"\n",
"Otherwise, this notebook will walk you through much of what you need to do to train models in TensorFlow. See the end of the notebook for some links to helpful tutorials if you want to learn more or need further clarification on topics that aren't fully explained here.\n",
"\n",
"**NOTE: This notebook is meant to teach you the latest version of Tensorflow 2.0. Most examples on the web today are still in 1.x, so be careful not to confuse the two when looking up documentation**.\n",
"\n",
"## Install Tensorflow 2.0\n",
"Tensorflow 2.0 is still not in a fully 100% stable release, but it's still usable and more intuitive than TF 1.x. Please make sure you have it installed before moving on in this notebook! Here are some steps to get started:\n",
"\n",
"1. Have the latest version of Anaconda installed on your machine.\n",
"2. Create a new conda environment starting from Python 3.7. In this setup example, we'll call it `tf_20_env`.\n",
"3. Run the command: `source activate tf_20_env`\n",
"4. Then pip install TF 2.0 as described here: https://www.tensorflow.org/install/pip \n",
"\n",
"A guide on creating Anaconda enviornments: https://uoa-eresearch.github.io/eresearch-cookbook/recipe/2014/11/20/conda/\n",
"\n",
"This will give you an new enviornemnt to play in TF 2.0. Generally, if you plan to also use TensorFlow in your other projects, you might also want to keep a seperate Conda environment or virtualenv in Python 3.7 that has Tensorflow 1.9, so you can switch back and forth at will. "
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"# Table of Contents\n",
"\n",
"This notebook has 5 parts. We will walk through TensorFlow at **three different levels of abstraction**, which should help you better understand it and prepare you for working on your project.\n",
"\n",
"1. Part I, Preparation: load the CIFAR-10 dataset.\n",
"2. Part II, Barebone TensorFlow: **Abstraction Level 1**, we will work directly with low-level TensorFlow graphs. \n",
"3. Part III, Keras Model API: **Abstraction Level 2**, we will use `tf.keras.Model` to define arbitrary neural network architecture. \n",
"4. Part IV, Keras Sequential + Functional API: **Abstraction Level 3**, we will use `tf.keras.Sequential` to define a linear feed-forward network very conveniently, and then explore the functional libraries for building unique and uncommon models that require more flexibility.\n",
"5. Part V, CIFAR-10 open-ended challenge: please implement your own network to get as high accuracy as possible on CIFAR-10. You can experiment with any layer, optimizer, hyperparameters or other advanced features. \n",
"\n",
"We will discuss Keras in more detail later in the notebook.\n",
"\n",
"Here is a table of comparison:\n",
"\n",
"| API | Flexibility | Convenience |\n",
"|---------------|-------------|-------------|\n",
"| Barebone | High | Low |\n",
"| `tf.keras.Model` | High | Medium |\n",
"| `tf.keras.Sequential` | Low | High |"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Part I: Preparation\n",
"\n",
"首先导入CIFAR-10数据集,如果是首次使用,tf会帮你下载,但是出于网络原因,建议把我们提供的数据文件`cifar-10-batches-py.tar.gz`直接放在`/root/.keras/datasets`目录下,可以不用下载,直接使用\n",
"\n",
"First, we load the CIFAR-10 dataset. This might take a few minutes to download the first time you run it, but after that the files should be cached on disk and loading should be faster.\n",
"\n",
"In previous parts of the assignment we used daseCV-specific code to download and read the CIFAR-10 dataset; however the `tf.keras.datasets` package in TensorFlow provides prebuilt utility functions for loading many common datasets.\n",
"\n",
"For the purposes of this assignment we will still write our own code to preprocess the data and iterate through it in minibatches. The `tf.data` package in TensorFlow provides tools for automating this process, but working with this package adds extra complication and is beyond the scope of this notebook. However using `tf.data` can be much more efficient than the simple approach used in this notebook, so you should consider using it for your project."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"import os\n",
"import tensorflow as tf\n",
"import numpy as np\n",
"import math\n",
"import timeit\n",
"import matplotlib.pyplot as plt\n",
"\n",
"%matplotlib inline"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def load_cifar10(num_training=49000, num_validation=1000, num_test=10000):\n",
" \"\"\"\n",
" Fetch the CIFAR-10 dataset from the web and perform preprocessing to prepare\n",
" it for the two-layer neural net classifier. These are the same steps as\n",
" we used for the SVM, but condensed to a single function.\n",
" \"\"\"\n",
" # Load the raw CIFAR-10 dataset and use appropriate data types and shapes\n",
" cifar10 = tf.keras.datasets.cifar10.load_data()\n",
" (X_train, y_train), (X_test, y_test) = cifar10\n",
" X_train = np.asarray(X_train, dtype=np.float32)\n",
" y_train = np.asarray(y_train, dtype=np.int32).flatten()\n",
" X_test = np.asarray(X_test, dtype=np.float32)\n",
" y_test = np.asarray(y_test, dtype=np.int32).flatten()\n",
"\n",
" # Subsample the data\n",
" mask = range(num_training, num_training + num_validation)\n",
" X_val = X_train[mask]\n",
" y_val = y_train[mask]\n",
" mask = range(num_training)\n",
" X_train = X_train[mask]\n",
" y_train = y_train[mask]\n",
" mask = range(num_test)\n",
" X_test = X_test[mask]\n",
" y_test = y_test[mask]\n",
"\n",
" # Normalize the data: subtract the mean pixel and divide by std\n",
" mean_pixel = X_train.mean(axis=(0, 1, 2), keepdims=True)\n",
" std_pixel = X_train.std(axis=(0, 1, 2), keepdims=True)\n",
" X_train = (X_train - mean_pixel) / std_pixel\n",
" X_val = (X_val - mean_pixel) / std_pixel\n",
" X_test = (X_test - mean_pixel) / std_pixel\n",
"\n",
" return X_train, y_train, X_val, y_val, X_test, y_test\n",
"\n",
"# If there are errors with SSL downloading involving self-signed certificates,\n",
"# it may be that your Python version was recently installed on the current machine.\n",
"# See: https://github.com/tensorflow/tensorflow/issues/10779\n",
"# To fix, run the command: /Applications/Python\\ 3.7/Install\\ Certificates.command\n",
"# ...replacing paths as necessary.\n",
"\n",
"# Invoke the above function to get our data.\n",
"NHW = (0, 1, 2)\n",
"X_train, y_train, X_val, y_val, X_test, y_test = load_cifar10()\n",
"print('Train data shape: ', X_train.shape)\n",
"print('Train labels shape: ', y_train.shape, y_train.dtype)\n",
"print('Validation data shape: ', X_val.shape)\n",
"print('Validation labels shape: ', y_val.shape)\n",
"print('Test data shape: ', X_test.shape)\n",
"print('Test labels shape: ', y_test.shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"class Dataset(object):\n",
" def __init__(self, X, y, batch_size, shuffle=False):\n",
" \"\"\"\n",
" Construct a Dataset object to iterate over data X and labels y\n",
" \n",
" Inputs:\n",
" - X: Numpy array of data, of any shape\n",
" - y: Numpy array of labels, of any shape but with y.shape[0] == X.shape[0]\n",
" - batch_size: Integer giving number of elements per minibatch\n",
" - shuffle: (optional) Boolean, whether to shuffle the data on each epoch\n",
" \"\"\"\n",
" assert X.shape[0] == y.shape[0], 'Got different numbers of data and labels'\n",
" self.X, self.y = X, y\n",
" self.batch_size, self.shuffle = batch_size, shuffle\n",
"\n",
" def __iter__(self):\n",
" N, B = self.X.shape[0], self.batch_size\n",
" idxs = np.arange(N)\n",
" if self.shuffle:\n",
" np.random.shuffle(idxs)\n",
" return iter((self.X[i:i+B], self.y[i:i+B]) for i in range(0, N, B))\n",
"\n",
"\n",
"train_dset = Dataset(X_train, y_train, batch_size=64, shuffle=True)\n",
"val_dset = Dataset(X_val, y_val, batch_size=64, shuffle=False)\n",
"test_dset = Dataset(X_test, y_test, batch_size=64)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# We can iterate through a dataset like this:\n",
"for t, (x, y) in enumerate(train_dset):\n",
" print(t, x.shape, y.shape)\n",
" if t > 5: break"
]
},
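{
"cell_type": "markdown",
"metadata": {},
"source": [
"As an optional alternative to the hand-written `Dataset` class above, the `tf.data` package mentioned earlier can build an equivalent minibatch iterator in a few lines. This is just a sketch for reference; the rest of the notebook keeps using the `Dataset` class:\n",
"\n",
"```python\n",
"train_tfdata = (tf.data.Dataset.from_tensor_slices((X_train, y_train))\n",
"                .shuffle(buffer_size=10000)\n",
"                .batch(64))\n",
"for t, (x, y) in enumerate(train_tfdata):\n",
"    print(t, x.shape, y.shape)\n",
"    if t > 5: break\n",
"```"
]
},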
{
"cell_type": "markdown",
"metadata": {},
"source": [
"You can optionally **use GPU by setting the flag to True below**. It's not neccessary to use a GPU for this assignment; if you are working on Google Cloud then we recommend that you do not use a GPU, as it will be significantly more expensive."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"# Set up some global variables\n",
"USE_GPU = False\n",
"\n",
"if USE_GPU:\n",
" device = '/device:GPU:0'\n",
"else:\n",
" device = '/cpu:0'\n",
"\n",
"# Constant to control how often we print when training models\n",
"print_every = 100\n",
"\n",
"print('Using device: ', device)"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"# Part II: Barebones TensorFlow\n",
"\n",
"TensorFlow附带各种高级api,方便我们定义和训练神经网络;但是在本节中,我们将首先使用基本的TensorFlow低级api构建一个模型,以帮助你更好地理解在高级api的框架下发生了什么。我们将在本作业的第三部分和第四部分介绍高级api。\n",
"\n",
"TensorFlow ships with various high-level APIs which make it very convenient to define and train neural networks; we will cover some of these constructs in Part III and Part IV of this notebook. In this section we will start by building a model with basic TensorFlow constructs to help you better understand what's going on under the hood of the higher-level APIs.\n",
"\n",
"**\"Barebones Tensorflow\" is important to understanding the building blocks of TensorFlow, but much of it involves concepts from TensorFlow 1.x.** We will be working with legacy modules such as `tf.Variable`.\n",
"\n",
"Therefore, please read and understand the differences between legacy (1.x) TF and the new (2.0) TF.\n",
"\n",
"### Historical background on TensorFlow 1.x\n",
"\n",
"TensorFlow 1.x is primarily a framework for working with **static computational graphs**. Nodes in the computational graph are Tensors which will hold n-dimensional arrays when the graph is run; edges in the graph represent functions that will operate on Tensors when the graph is run to actually perform useful computation.\n",
"\n",
"Before Tensorflow 2.0, we had to configure the graph into two phases. There are plenty of tutorials online that explain this two-step process. The process generally looks like the following for TF 1.x:\n",
"1. **Build a computational graph that describes the computation that you want to perform**. This stage doesn't actually perform any computation; it just builds up a symbolic representation of your computation. This stage will typically define one or more `placeholder` objects that represent inputs to the computational graph.\n",
"2. **Run the computational graph many times.** Each time the graph is run (e.g. for one gradient descent step) you will specify which parts of the graph you want to compute, and pass a `feed_dict` dictionary that will give concrete values to any `placeholder`s in the graph.\n",
"\n",
"### The new paradigm in Tensorflow 2.0\n",
"Now, with Tensorflow 2.0, we can simply adopt a functional form that is more Pythonic and similar in spirit to PyTorch and direct Numpy operation. Instead of the 2-step paradigm with computation graphs, making it (among other things) easier to debug TF code. You can read more details at https://www.tensorflow.org/guide/eager.\n",
"\n",
"The main difference between the TF 1.x and 2.0 approach is that the 2.0 approach doesn't make use of `tf.Session`, `tf.run`, `placeholder`, `feed_dict`. To get more details of what's different between the two version and how to convert between the two, check out the official migration guide: https://www.tensorflow.org/alpha/guide/migration_guide\n",
"\n",
"Later, in the rest of this notebook we'll focus on this new, simpler approach."
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"### TensorFlow warmup: Flatten Function\n",
"\n",
"We can see this in action by defining a simple `flatten` function that will reshape image data for use in a fully-connected network.\n",
"\n",
"In TensorFlow, data for convolutional feature maps is typically stored in a Tensor of shape N x H x W x C where:\n",
"\n",
"- N is the number of datapoints (minibatch size)\n",
"- H is the height of the feature map\n",
"- W is the width of the feature map\n",
"- C is the number of channels in the feature map\n",
"\n",
"This is the right way to represent the data when we are doing something like a 2D convolution, that needs spatial understanding of where the intermediate features are relative to each other. When we use fully connected affine layers to process the image, however, we want each datapoint to be represented by a single vector -- it's no longer useful to segregate the different channels, rows, and columns of the data. So, we use a \"flatten\" operation to collapse the `H x W x C` values per representation into a single long vector. \n",
"\n",
"Notice the `tf.reshape` call has the target shape as `(N, -1)`, meaning it will reshape/keep the first dimension to be N, and then infer as necessary what the second dimension is in the output, so we can collapse the remaining dimensions from the input properly.\n",
"\n",
"**NOTE**: TensorFlow and PyTorch differ on the default Tensor layout; TensorFlow uses N x H x W x C but PyTorch uses N x C x H x W."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def flatten(x):\n",
" \"\"\" \n",
" Input:\n",
" - TensorFlow Tensor of shape (N, D1, ..., DM)\n",
" \n",
" Output:\n",
" - TensorFlow Tensor of shape (N, D1 * ... * DM)\n",
" \"\"\"\n",
" N = tf.shape(x)[0]\n",
" return tf.reshape(x, (N, -1))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"def test_flatten():\n",
" # Construct concrete values of the input data x using numpy\n",
" x_np = np.arange(24).reshape((2, 3, 4))\n",
" print('x_np:\\n', x_np, '\\n')\n",
" # Compute a concrete output value.\n",
" x_flat_np = flatten(x_np)\n",
" print('x_flat_np:\\n', x_flat_np, '\\n')\n",
"\n",
"test_flatten()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Barebones TensorFlow: Define a Two-Layer Network\n",
"We will now implement our first neural network with TensorFlow: a fully-connected ReLU network with two hidden layers and no biases on the CIFAR10 dataset. For now we will use only low-level TensorFlow operators to define the network; later we will see how to use the higher-level abstractions provided by `tf.keras` to simplify the process.\n",
"\n",
"We will define the forward pass of the network in the function `two_layer_fc`; this will accept TensorFlow Tensors for the inputs and weights of the network, and return a TensorFlow Tensor for the scores. \n",
"\n",
"After defining the network architecture in the `two_layer_fc` function, we will test the implementation by checking the shape of the output.\n",
"\n",
"**It's important that you read and understand this implementation.**"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def two_layer_fc(x, params):\n",
" \"\"\"\n",
" A fully-connected neural network; the architecture is:\n",
" fully-connected layer -> ReLU -> fully connected layer.\n",
" Note that we only need to define the forward pass here; TensorFlow will take\n",
" care of computing the gradients for us.\n",
" \n",
" The input to the network will be a minibatch of data, of shape\n",
" (N, d1, ..., dM) where d1 * ... * dM = D. The hidden layer will have H units,\n",
" and the output layer will produce scores for C classes.\n",
"\n",
" Inputs:\n",
" - x: A TensorFlow Tensor of shape (N, d1, ..., dM) giving a minibatch of\n",
" input data.\n",
" - params: A list [w1, w2] of TensorFlow Tensors giving weights for the\n",
" network, where w1 has shape (D, H) and w2 has shape (H, C).\n",
" \n",
" Returns:\n",
" - scores: A TensorFlow Tensor of shape (N, C) giving classification scores\n",
" for the input data x.\n",
" \"\"\"\n",
" w1, w2 = params # Unpack the parameters\n",
" x = flatten(x) # Flatten the input; now x has shape (N, D)\n",
" h = tf.nn.relu(tf.matmul(x, w1)) # Hidden layer: h has shape (N, H)\n",
" scores = tf.matmul(h, w2) # Compute scores of shape (N, C)\n",
" return scores"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"def two_layer_fc_test():\n",
" hidden_layer_size = 42\n",
"\n",
" # Scoping our TF operations under a tf.device context manager \n",
" # lets us tell TensorFlow where we want these Tensors to be\n",
" # multiplied and/or operated on, e.g. on a CPU or a GPU.\n",
" with tf.device(device): \n",
" x = tf.zeros((64, 32, 32, 3))\n",
" w1 = tf.zeros((32 * 32 * 3, hidden_layer_size))\n",
" w2 = tf.zeros((hidden_layer_size, 10))\n",
"\n",
" # Call our two_layer_fc function for the forward pass of the network.\n",
" scores = two_layer_fc(x, [w1, w2])\n",
"\n",
" print(scores.shape)\n",
"\n",
"two_layer_fc_test()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Barebones TensorFlow: Three-Layer ConvNet\n",
"Here you will complete the implementation of the function `three_layer_convnet` which will perform the forward pass of a three-layer convolutional network. The network should have the following architecture:\n",
"\n",
"1. A convolutional layer (with bias) with `channel_1` filters, each with shape `KW1 x KH1`, and zero-padding of two\n",
"2. ReLU nonlinearity\n",
"3. A convolutional layer (with bias) with `channel_2` filters, each with shape `KW2 x KH2`, and zero-padding of one\n",
"4. ReLU nonlinearity\n",
"5. Fully-connected layer with bias, producing scores for `C` classes.\n",
"\n",
"**HINT**: For convolutions: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/nn/conv2d; be careful with padding!\n",
"\n",
"**HINT**: For biases: https://www.tensorflow.org/performance/xla/broadcasting"
]
},
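{
"cell_type": "markdown",
"metadata": {},
"source": [
"As the padding hint above suggests, getting zero-padding of exactly two pixels takes a little care. One generic way to do it (a sketch of ours, not the assignment solution) is to pad the spatial dimensions explicitly with `tf.pad` and then convolve with `padding='VALID'`:\n",
"\n",
"```python\n",
"x = tf.zeros((1, 32, 32, 3))                          # N x H x W x C\n",
"w = tf.zeros((5, 5, 3, 6))                            # KH x KW x C_in x C_out\n",
"x_pad = tf.pad(x, [[0, 0], [2, 2], [2, 2], [0, 0]])   # pad H and W by 2 on each side\n",
"out = tf.nn.conv2d(x_pad, w, strides=[1, 1, 1, 1], padding='VALID')\n",
"print(out.shape)                                      # (1, 32, 32, 6)\n",
"```"
]
},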
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def three_layer_convnet(x, params):\n",
" \"\"\"\n",
" A three-layer convolutional network with the architecture described above.\n",
" \n",
" Inputs:\n",
" - x: A TensorFlow Tensor of shape (N, H, W, 3) giving a minibatch of images\n",
" - params: A list of TensorFlow Tensors giving the weights and biases for the\n",
" network; should contain the following:\n",
" - conv_w1: TensorFlow Tensor of shape (KH1, KW1, 3, channel_1) giving\n",
" weights for the first convolutional layer.\n",
" - conv_b1: TensorFlow Tensor of shape (channel_1,) giving biases for the\n",
" first convolutional layer.\n",
" - conv_w2: TensorFlow Tensor of shape (KH2, KW2, channel_1, channel_2)\n",
" giving weights for the second convolutional layer\n",
" - conv_b2: TensorFlow Tensor of shape (channel_2,) giving biases for the\n",
" second convolutional layer.\n",
" - fc_w: TensorFlow Tensor giving weights for the fully-connected layer.\n",
" Can you figure out what the shape should be?\n",
" - fc_b: TensorFlow Tensor giving biases for the fully-connected layer.\n",
" Can you figure out what the shape should be?\n",
" \"\"\"\n",
" conv_w1, conv_b1, conv_w2, conv_b2, fc_w, fc_b = params\n",
" scores = None\n",
" ############################################################################\n",
" # TODO: Implement the forward pass for the three-layer ConvNet. #\n",
" ############################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ############################################################################\n",
" # END OF YOUR CODE #\n",
" ############################################################################\n",
" return scores"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"After defing the forward pass of the three-layer ConvNet above, run the following cell to test your implementation. Like the two-layer network, we run the graph on a batch of zeros just to make sure the function doesn't crash, and produces outputs of the correct shape.\n",
"\n",
"When you run this function, `scores_np` should have shape `(64, 10)`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"def three_layer_convnet_test():\n",
" \n",
" with tf.device(device):\n",
" x = tf.zeros((64, 32, 32, 3))\n",
" conv_w1 = tf.zeros((5, 5, 3, 6))\n",
" conv_b1 = tf.zeros((6,))\n",
" conv_w2 = tf.zeros((3, 3, 6, 9))\n",
" conv_b2 = tf.zeros((9,))\n",
" fc_w = tf.zeros((32 * 32 * 9, 10))\n",
" fc_b = tf.zeros((10,))\n",
" params = [conv_w1, conv_b1, conv_w2, conv_b2, fc_w, fc_b]\n",
" scores = three_layer_convnet(x, params)\n",
"\n",
" # Inputs to convolutional layers are 4-dimensional arrays with shape\n",
" # [batch_size, height, width, channels]\n",
" print('scores_np has shape: ', scores.shape)\n",
"\n",
"three_layer_convnet_test()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Barebones TensorFlow: Training Step\n",
"\n",
"We now define the `training_step` function performs a single training step. This will take three basic steps:\n",
"\n",
"1. Compute the loss\n",
"2. Compute the gradient of the loss with respect to all network weights\n",
"3. Make a weight update step using (stochastic) gradient descent.\n",
"\n",
"\n",
"We need to use a few new TensorFlow functions to do all of this:\n",
"- For computing the cross-entropy loss we'll use `tf.nn.sparse_softmax_cross_entropy_with_logits`: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/nn/sparse_softmax_cross_entropy_with_logits\n",
"\n",
"- For averaging the loss across a minibatch of data we'll use `tf.reduce_mean`:\n",
"https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/reduce_mean\n",
"\n",
"- For computing gradients of the loss with respect to the weights we'll use `tf.GradientTape` (useful for Eager execution): https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape\n",
"\n",
"- We'll mutate the weight values stored in a TensorFlow Tensor using `tf.assign_sub` (\"sub\" is for subtraction): https://www.tensorflow.org/api_docs/python/tf/assign_sub \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def training_step(model_fn, x, y, params, learning_rate):\n",
" with tf.GradientTape() as tape:\n",
" scores = model_fn(x, params) # Forward pass of the model\n",
" loss = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=y, logits=scores)\n",
" total_loss = tf.reduce_mean(loss)\n",
" grad_params = tape.gradient(total_loss, params)\n",
"\n",
" # Make a vanilla gradient descent step on all of the model parameters\n",
" # Manually update the weights using assign_sub()\n",
" for w, grad_w in zip(params, grad_params):\n",
" w.assign_sub(learning_rate * grad_w)\n",
" \n",
" return total_loss"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def train_part2(model_fn, init_fn, learning_rate):\n",
" \"\"\"\n",
" Train a model on CIFAR-10.\n",
" \n",
" Inputs:\n",
" - model_fn: A Python function that performs the forward pass of the model\n",
" using TensorFlow; it should have the following signature:\n",
" scores = model_fn(x, params) where x is a TensorFlow Tensor giving a\n",
" minibatch of image data, params is a list of TensorFlow Tensors holding\n",
" the model weights, and scores is a TensorFlow Tensor of shape (N, C)\n",
" giving scores for all elements of x.\n",
" - init_fn: A Python function that initializes the parameters of the model.\n",
" It should have the signature params = init_fn() where params is a list\n",
" of TensorFlow Tensors holding the (randomly initialized) weights of the\n",
" model.\n",
" - learning_rate: Python float giving the learning rate to use for SGD.\n",
" \"\"\"\n",
" \n",
" \n",
" params = init_fn() # Initialize the model parameters \n",
" \n",
" for t, (x_np, y_np) in enumerate(train_dset):\n",
" # Run the graph on a batch of training data.\n",
" loss = training_step(model_fn, x_np, y_np, params, learning_rate)\n",
" \n",
" # Periodically print the loss and check accuracy on the val set.\n",
" if t % print_every == 0:\n",
" print('Iteration %d, loss = %.4f' % (t, loss))\n",
" check_accuracy(val_dset, x_np, model_fn, params)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def check_accuracy(dset, x, model_fn, params):\n",
" \"\"\"\n",
" Check accuracy on a classification model, e.g. for validation.\n",
" \n",
" Inputs:\n",
" - dset: A Dataset object against which to check accuracy\n",
" - x: A TensorFlow placeholder Tensor where input images should be fed\n",
" - model_fn: the Model we will be calling to make predictions on x\n",
" - params: parameters for the model_fn to work with\n",
" \n",
" Returns: Nothing, but prints the accuracy of the model\n",
" \"\"\"\n",
" num_correct, num_samples = 0, 0\n",
" for x_batch, y_batch in dset:\n",
" scores_np = model_fn(x_batch, params).numpy()\n",
" y_pred = scores_np.argmax(axis=1)\n",
" num_samples += x_batch.shape[0]\n",
" num_correct += (y_pred == y_batch).sum()\n",
" acc = float(num_correct) / num_samples\n",
" print('Got %d / %d correct (%.2f%%)' % (num_correct, num_samples, 100 * acc))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Barebones TensorFlow: Initialization\n",
"We'll use the following utility method to initialize the weight matrices for our models using Kaiming's normalization method.\n",
"\n",
"[1] He et al, *Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification\n",
"*, ICCV 2015, https://arxiv.org/abs/1502.01852"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def create_matrix_with_kaiming_normal(shape):\n",
" if len(shape) == 2:\n",
" fan_in, fan_out = shape[0], shape[1]\n",
" elif len(shape) == 4:\n",
" fan_in, fan_out = np.prod(shape[:3]), shape[3]\n",
" return tf.keras.backend.random_normal(shape) * np.sqrt(2.0 / fan_in)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Barebones TensorFlow: Train a Two-Layer Network\n",
"We are finally ready to use all of the pieces defined above to train a two-layer fully-connected network on CIFAR-10.\n",
"\n",
"We just need to define a function to initialize the weights of the model, and call `train_part2`.\n",
"\n",
"Defining the weights of the network introduces another important piece of TensorFlow API: `tf.Variable`. A TensorFlow Variable is a Tensor whose value is stored in the graph and persists across runs of the computational graph; however unlike constants defined with `tf.zeros` or `tf.random_normal`, the values of a Variable can be mutated as the graph runs; these mutations will persist across graph runs. Learnable parameters of the network are usually stored in Variables.\n",
"\n",
"You don't need to tune any hyperparameters, but you should achieve validation accuracies above 40% after one epoch of training."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def two_layer_fc_init():\n",
" \"\"\"\n",
" Initialize the weights of a two-layer network, for use with the\n",
" two_layer_network function defined above. \n",
" You can use the `create_matrix_with_kaiming_normal` helper!\n",
" \n",
" Inputs: None\n",
" \n",
" Returns: A list of:\n",
" - w1: TensorFlow tf.Variable giving the weights for the first layer\n",
" - w2: TensorFlow tf.Variable giving the weights for the second layer\n",
" \"\"\"\n",
" hidden_layer_size = 4000\n",
" w1 = tf.Variable(create_matrix_with_kaiming_normal((3 * 32 * 32, 4000)))\n",
" w2 = tf.Variable(create_matrix_with_kaiming_normal((4000, 10)))\n",
" return [w1, w2]\n",
"\n",
"learning_rate = 1e-2\n",
"train_part2(two_layer_fc, two_layer_fc_init, learning_rate)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Barebones TensorFlow: Train a three-layer ConvNet\n",
"We will now use TensorFlow to train a three-layer ConvNet on CIFAR-10.\n",
"\n",
"You need to implement the `three_layer_convnet_init` function. Recall that the architecture of the network is:\n",
"\n",
"1. Convolutional layer (with bias) with 32 5x5 filters, with zero-padding 2\n",
"2. ReLU\n",
"3. Convolutional layer (with bias) with 16 3x3 filters, with zero-padding 1\n",
"4. ReLU\n",
"5. Fully-connected layer (with bias) to compute scores for 10 classes\n",
"\n",
"You don't need to do any hyperparameter tuning, but you should see validation accuracies above 43% after one epoch of training."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def three_layer_convnet_init():\n",
" \"\"\"\n",
" Initialize the weights of a Three-Layer ConvNet, for use with the\n",
" three_layer_convnet function defined above.\n",
" You can use the `create_matrix_with_kaiming_normal` helper!\n",
" \n",
" Inputs: None\n",
" \n",
" Returns a list containing:\n",
" - conv_w1: TensorFlow tf.Variable giving weights for the first conv layer\n",
" - conv_b1: TensorFlow tf.Variable giving biases for the first conv layer\n",
" - conv_w2: TensorFlow tf.Variable giving weights for the second conv layer\n",
" - conv_b2: TensorFlow tf.Variable giving biases for the second conv layer\n",
" - fc_w: TensorFlow tf.Variable giving weights for the fully-connected layer\n",
" - fc_b: TensorFlow tf.Variable giving biases for the fully-connected layer\n",
" \"\"\"\n",
" params = None\n",
" ############################################################################\n",
" # TODO: Initialize the parameters of the three-layer network. #\n",
" ############################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ############################################################################\n",
" # END OF YOUR CODE #\n",
" ############################################################################\n",
" return params\n",
"\n",
"learning_rate = 3e-3\n",
"train_part2(three_layer_convnet, three_layer_convnet_init, learning_rate)"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"# Part III: Keras Model Subclassing API\n",
"\n",
"使用低级的TensorFlow API实现一个神经网络能够很好的理解TensorFlow,但是低级API不方便——我们必须手动创建并跟踪所有可学习参数的张量。这对于小型网络来说还可以,但是对于大型复杂的模型来说就变得不方便了。\n",
"\n",
"幸运的是,TensorFlow 2.0提供了更高级别的API,比如`tf.keras`。它很容易建立模块化的模型和面向对象的层。此外,TensorFlow 2.0使用立即计算(eager execution)操作,而不显式地构造任何计算图。这使得编写和调试模型变得很容易,并且减少了引用代码。\n",
"\n",
"Implementing a neural network using the low-level TensorFlow API is a good way to understand how TensorFlow works, but it's a little inconvenient - we had to manually keep track of all Tensors holding learnable parameters. This was fine for a small network, but could quickly become unweildy for a large complex model.\n",
"\n",
"Fortunately TensorFlow 2.0 provides higher-level APIs such as `tf.keras` which make it easy to build models out of modular, object-oriented layers. Further, TensorFlow 2.0 uses eager execution that evaluates operations immediately, without explicitly constructing any computational graphs. This makes it easy to write and debug models, and reduces the boilerplate code.\n",
"\n",
"In this part of the notebook we will define neural network models using the `tf.keras.Model` API. To implement your own model, you need to do the following:\n",
"\n",
"1. Define a new class which subclasses `tf.keras.Model`. Give your class an intuitive name that describes it, like `TwoLayerFC` or `ThreeLayerConvNet`.\n",
"2. In the initializer `__init__()` for your new class, define all the layers you need as class attributes. The `tf.keras.layers` package provides many common neural-network layers, like `tf.keras.layers.Dense` for fully-connected layers and `tf.keras.layers.Conv2D` for convolutional layers. Under the hood, these layers will construct `Variable` Tensors for any learnable parameters. **Warning**: Don't forget to call `super(YourModelName, self).__init__()` as the first line in your initializer!\n",
"3. Implement the `call()` method for your class; this implements the forward pass of your model, and defines the *connectivity* of your network. Layers defined in `__init__()` implement `__call__()` so they can be used as function objects that transform input Tensors into output Tensors. Don't define any new layers in `call()`; any layers you want to use in the forward pass should be defined in `__init__()`.\n",
"\n",
"After you define your `tf.keras.Model` subclass, you can instantiate it and use it like the model functions from Part II.\n",
"\n",
"### Keras Model Subclassing API: Two-Layer Network\n",
"\n",
"Here is a concrete example of using the `tf.keras.Model` API to define a two-layer network. There are a few new bits of API to be aware of here:\n",
"\n",
"We use an `Initializer` object to set up the initial values of the learnable parameters of the layers; in particular `tf.initializers.VarianceScaling` gives behavior similar to the Kaiming initialization method we used in Part II. You can read more about it here: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/initializers/VarianceScaling\n",
"\n",
"We construct `tf.keras.layers.Dense` objects to represent the two fully-connected layers of the model. In addition to multiplying their input by a weight matrix and adding a bias vector, these layer can also apply a nonlinearity for you. For the first layer we specify a ReLU activation function by passing `activation='relu'` to the constructor; the second layer uses softmax activation function. Finally, we use `tf.keras.layers.Flatten` to flatten the output from the previous fully-connected layer."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"class TwoLayerFC(tf.keras.Model):\n",
" def __init__(self, hidden_size, num_classes):\n",
" super(TwoLayerFC, self).__init__() \n",
" initializer = tf.initializers.VarianceScaling(scale=2.0)\n",
" self.fc1 = tf.keras.layers.Dense(hidden_size, activation='relu',\n",
" kernel_initializer=initializer)\n",
" self.fc2 = tf.keras.layers.Dense(num_classes, activation='softmax',\n",
" kernel_initializer=initializer)\n",
" self.flatten = tf.keras.layers.Flatten()\n",
" \n",
" def call(self, x, training=False):\n",
" x = self.flatten(x)\n",
" x = self.fc1(x)\n",
" x = self.fc2(x)\n",
" return x\n",
"\n",
"\n",
"def test_TwoLayerFC():\n",
" \"\"\" A small unit test to exercise the TwoLayerFC model above. \"\"\"\n",
" input_size, hidden_size, num_classes = 50, 42, 10\n",
" x = tf.zeros((64, input_size))\n",
" model = TwoLayerFC(hidden_size, num_classes)\n",
" with tf.device(device):\n",
" scores = model(x)\n",
" print(scores.shape)\n",
" \n",
"test_TwoLayerFC()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Keras Model Subclassing API: Three-Layer ConvNet\n",
"Now it's your turn to implement a three-layer ConvNet using the `tf.keras.Model` API. Your model should have the same architecture used in Part II:\n",
"\n",
"1. Convolutional layer with 5 x 5 kernels, with zero-padding of 2\n",
"2. ReLU nonlinearity\n",
"3. Convolutional layer with 3 x 3 kernels, with zero-padding of 1\n",
"4. ReLU nonlinearity\n",
"5. Fully-connected layer to give class scores\n",
"6. Softmax nonlinearity\n",
"\n",
"You should initialize the weights of your network using the same initialization method as was used in the two-layer network above.\n",
"\n",
"**Hint**: Refer to the documentation for `tf.keras.layers.Conv2D` and `tf.keras.layers.Dense`:\n",
"\n",
"https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/Conv2D\n",
"\n",
"https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/Dense"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class ThreeLayerConvNet(tf.keras.Model):\n",
" def __init__(self, channel_1, channel_2, num_classes):\n",
" super(ThreeLayerConvNet, self).__init__()\n",
" ########################################################################\n",
" # TODO: Implement the __init__ method for a three-layer ConvNet. You #\n",
" # should instantiate layer objects to be used in the forward pass. #\n",
" ########################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ########################################################################\n",
" # END OF YOUR CODE #\n",
" ########################################################################\n",
" \n",
" def call(self, x, training=False):\n",
" scores = None\n",
" ########################################################################\n",
" # TODO: Implement the forward pass for a three-layer ConvNet. You #\n",
" # should use the layer objects defined in the __init__ method. #\n",
" ########################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ########################################################################\n",
" # END OF YOUR CODE #\n",
" ######################################################################## \n",
" return scores"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Once you complete the implementation of the `ThreeLayerConvNet` above you can run the following to ensure that your implementation does not crash and produces outputs of the expected shape."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def test_ThreeLayerConvNet(): \n",
" channel_1, channel_2, num_classes = 12, 8, 10\n",
" model = ThreeLayerConvNet(channel_1, channel_2, num_classes)\n",
" with tf.device(device):\n",
" x = tf.zeros((64, 3, 32, 32))\n",
" scores = model(x)\n",
" print(scores.shape)\n",
"\n",
"test_ThreeLayerConvNet()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Keras Model Subclassing API: Eager Training\n",
"\n",
"While keras models have a builtin training loop (using the `model.fit`), sometimes you need more customization. Here's an example, of a training loop implemented with eager execution.\n",
"\n",
"In particular, notice `tf.GradientTape`. Automatic differentiation is used in the backend for implementing backpropagation in frameworks like TensorFlow. During eager execution, `tf.GradientTape` is used to trace operations for computing gradients later. A particular `tf.GradientTape` can only compute one gradient; subsequent calls to tape will throw a runtime error. \n",
"\n",
"TensorFlow 2.0 ships with easy-to-use built-in metrics under `tf.keras.metrics` module. Each metric is an object, and we can use `update_state()` to add observations and `reset_state()` to clear all observations. We can get the current result of a metric by calling `result()` on the metric object."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def train_part34(model_init_fn, optimizer_init_fn, num_epochs=1, is_training=False):\n",
" \"\"\"\n",
" Simple training loop for use with models defined using tf.keras. It trains\n",
" a model for one epoch on the CIFAR-10 training set and periodically checks\n",
" accuracy on the CIFAR-10 validation set.\n",
" \n",
" Inputs:\n",
" - model_init_fn: A function that takes no parameters; when called it\n",
" constructs the model we want to train: model = model_init_fn()\n",
" - optimizer_init_fn: A function which takes no parameters; when called it\n",
" constructs the Optimizer object we will use to optimize the model:\n",
" optimizer = optimizer_init_fn()\n",
" - num_epochs: The number of epochs to train for\n",
" \n",
" Returns: Nothing, but prints progress during trainingn\n",
" \"\"\" \n",
" with tf.device(device):\n",
"\n",
" # Compute the loss like we did in Part II\n",
" loss_fn = tf.keras.losses.SparseCategoricalCrossentropy()\n",
" \n",
" model = model_init_fn()\n",
" optimizer = optimizer_init_fn()\n",
" \n",
" train_loss = tf.keras.metrics.Mean(name='train_loss')\n",
" train_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='train_accuracy')\n",
" \n",
" val_loss = tf.keras.metrics.Mean(name='val_loss')\n",
" val_accuracy = tf.keras.metrics.SparseCategoricalAccuracy(name='val_accuracy')\n",
" \n",
" t = 0\n",
" for epoch in range(num_epochs):\n",
" \n",
" # Reset the metrics - https://www.tensorflow.org/alpha/guide/migration_guide#new-style_metrics\n",
" train_loss.reset_states()\n",
" train_accuracy.reset_states()\n",
" \n",
" for x_np, y_np in train_dset:\n",
" with tf.GradientTape() as tape:\n",
" \n",
" # Use the model function to build the forward pass.\n",
" scores = model(x_np, training=is_training)\n",
" loss = loss_fn(y_np, scores)\n",
" \n",
" gradients = tape.gradient(loss, model.trainable_variables)\n",
" optimizer.apply_gradients(zip(gradients, model.trainable_variables))\n",
" \n",
" # Update the metrics\n",
" train_loss.update_state(loss)\n",
" train_accuracy.update_state(y_np, scores)\n",
" \n",
" if t % print_every == 0:\n",
" val_loss.reset_states()\n",
" val_accuracy.reset_states()\n",
" for test_x, test_y in val_dset:\n",
" # During validation at end of epoch, training set to False\n",
" prediction = model(test_x, training=False)\n",
" t_loss = loss_fn(test_y, prediction)\n",
"\n",
" val_loss.update_state(t_loss)\n",
" val_accuracy.update_state(test_y, prediction)\n",
" \n",
" template = 'Iteration {}, Epoch {}, Loss: {}, Accuracy: {}, Val Loss: {}, Val Accuracy: {}'\n",
" print (template.format(t, epoch+1,\n",
" train_loss.result(),\n",
" train_accuracy.result()*100,\n",
" val_loss.result(),\n",
" val_accuracy.result()*100))\n",
" t += 1"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Keras Model Subclassing API: Train a Two-Layer Network\n",
"We can now use the tools defined above to train a two-layer network on CIFAR-10. We define the `model_init_fn` and `optimizer_init_fn` that construct the model and optimizer respectively when called. Here we want to train the model using stochastic gradient descent with no momentum, so we construct a `tf.keras.optimizers.SGD` function; you can [read about it here](https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/optimizers/SGD).\n",
"\n",
"You don't need to tune any hyperparameters here, but you should achieve validation accuracies above 40% after one epoch of training."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"hidden_size, num_classes = 4000, 10\n",
"learning_rate = 1e-2\n",
"\n",
"def model_init_fn():\n",
" return TwoLayerFC(hidden_size, num_classes)\n",
"\n",
"def optimizer_init_fn():\n",
" return tf.keras.optimizers.SGD(learning_rate=learning_rate)\n",
"\n",
"train_part34(model_init_fn, optimizer_init_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Keras Model Subclassing API: Train a Three-Layer ConvNet\n",
"Here you should use the tools we've defined above to train a three-layer ConvNet on CIFAR-10. Your ConvNet should use 32 filters in the first convolutional layer and 16 filters in the second layer.\n",
"\n",
"To train the model you should use gradient descent with Nesterov momentum 0.9. \n",
"\n",
"**HINT**: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/optimizers/SGD\n",
"\n",
"You don't need to perform any hyperparameter tuning, but you should achieve validation accuracies above 50% after training for one epoch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learning_rate = 3e-3\n",
"channel_1, channel_2, num_classes = 32, 16, 10\n",
"\n",
"def model_init_fn():\n",
" model = None\n",
" ############################################################################\n",
" # TODO: Complete the implementation of model_fn. #\n",
" ############################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ############################################################################\n",
" # END OF YOUR CODE #\n",
" ############################################################################\n",
" return model\n",
"\n",
"def optimizer_init_fn():\n",
" optimizer = None\n",
" ############################################################################\n",
" # TODO: Complete the implementation of model_fn. #\n",
" ############################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ############################################################################\n",
" # END OF YOUR CODE #\n",
" ############################################################################\n",
" return optimizer\n",
"\n",
"train_part34(model_init_fn, optimizer_init_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Part IV: Keras Sequential API\n",
"\n",
"在第三部分中,我们介绍了`tf.keras.Model`API,它允许你使用任意数量的层和任意的连接来定义模型。\n",
"\n",
"但是,对于许多模型,你其实不需要这样的灵活性——许多模型可以表示为层的连续堆砌,将每一层的输出作为输入提供给下一层。如果你的模型符合这种模式,那么还有一种更简单的方法来定义你的模型:使用`tf.keras.Sequential`。你不需要编写任何自定义类;你只要调用`tf.keras.Sequential`就行了, 他是一个列表,包含一系列的层。\n",
"\n",
"In Part III we introduced the `tf.keras.Model` API, which allows you to define models with any number of learnable layers and with arbitrary connectivity between layers.\n",
"\n",
"However for many models you don't need such flexibility - a lot of models can be expressed as a sequential stack of layers, with the output of each layer fed to the next layer as input. If your model fits this pattern, then there is an even easier way to define your model: using `tf.keras.Sequential`. You don't need to write any custom classes; you simply call the `tf.keras.Sequential` constructor with a list containing a sequence of layer objects.\n",
"\n",
"One complication with `tf.keras.Sequential` is that you must define the shape of the input to the model by passing a value to the `input_shape` of the first layer in your model.\n",
"\n",
"### Keras Sequential API: Two-Layer Network\n",
"In this subsection, we will rewrite the two-layer fully-connected network using `tf.keras.Sequential`, and train it using the training loop defined above.\n",
"\n",
"You don't need to perform any hyperparameter tuning here, but you should see validation accuracies above 40% after training for one epoch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"learning_rate = 1e-2\n",
"\n",
"def model_init_fn():\n",
" input_shape = (32, 32, 3)\n",
" hidden_layer_size, num_classes = 4000, 10\n",
" initializer = tf.initializers.VarianceScaling(scale=2.0)\n",
" layers = [\n",
" tf.keras.layers.Flatten(input_shape=input_shape),\n",
" tf.keras.layers.Dense(hidden_layer_size, activation='relu',\n",
" kernel_initializer=initializer),\n",
" tf.keras.layers.Dense(num_classes, activation='softmax', \n",
" kernel_initializer=initializer),\n",
" ]\n",
" model = tf.keras.Sequential(layers)\n",
" return model\n",
"\n",
"def optimizer_init_fn():\n",
" return tf.keras.optimizers.SGD(learning_rate=learning_rate) \n",
"\n",
"train_part34(model_init_fn, optimizer_init_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Abstracting Away the Training Loop\n",
"In the previous examples, we used a customised training loop to train models (e.g. `train_part34`). Writing your own training loop is only required if you need more flexibility and control during training your model. Alternately, you can also use built-in APIs like `tf.keras.Model.fit()` and `tf.keras.Model.evaluate` to train and evaluate a model. Also remember to configure your model for training by calling `tf.keras.Model.compile.\n",
"\n",
"You don't need to perform any hyperparameter tuning here, but you should see validation and test accuracies above 42% after training for one epoch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model = model_init_fn()\n",
"model.compile(optimizer=tf.keras.optimizers.SGD(learning_rate=learning_rate),\n",
" loss='sparse_categorical_crossentropy',\n",
" metrics=[tf.keras.metrics.sparse_categorical_accuracy])\n",
"model.fit(X_train, y_train, batch_size=64, epochs=1, validation_data=(X_val, y_val))\n",
"model.evaluate(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Keras Sequential API: Three-Layer ConvNet\n",
"Here you should use `tf.keras.Sequential` to reimplement the same three-layer ConvNet architecture used in Part II and Part III. As a reminder, your model should have the following architecture:\n",
"\n",
"1. Convolutional layer with 32 5x5 kernels, using zero padding of 2\n",
"2. ReLU nonlinearity\n",
"3. Convolutional layer with 16 3x3 kernels, using zero padding of 1\n",
"4. ReLU nonlinearity\n",
"5. Fully-connected layer giving class scores\n",
"6. Softmax nonlinearity\n",
"\n",
"You should initialize the weights of the model using a `tf.initializers.VarianceScaling` as above.\n",
"\n",
"You should train the model using Nesterov momentum 0.9.\n",
"\n",
"You don't need to perform any hyperparameter search, but you should achieve accuracy above 45% after training for one epoch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def model_init_fn():\n",
" model = None\n",
" ############################################################################\n",
" # TODO: Construct a three-layer ConvNet using tf.keras.Sequential. #\n",
" ############################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ############################################################################\n",
" # END OF YOUR CODE #\n",
" ############################################################################\n",
" return model\n",
"\n",
"learning_rate = 5e-4\n",
"def optimizer_init_fn():\n",
" optimizer = None\n",
" ############################################################################\n",
" # TODO: Complete the implementation of model_fn. #\n",
" ############################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ############################################################################\n",
" # END OF YOUR CODE #\n",
" ############################################################################\n",
" return optimizer\n",
"\n",
"train_part34(model_init_fn, optimizer_init_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We will also train this model with the built-in training loop APIs provided by TensorFlow."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"model = model_init_fn()\n",
"model.compile(optimizer='sgd',\n",
" loss='sparse_categorical_crossentropy',\n",
" metrics=[tf.keras.metrics.sparse_categorical_accuracy])\n",
"model.fit(X_train, y_train, batch_size=64, epochs=1, validation_data=(X_val, y_val))\n",
"model.evaluate(X_test, y_test)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Part IV: Functional API\n",
"### Demonstration with a Two-Layer Network \n",
"\n",
"In the previous section, we saw how we can use `tf.keras.Sequential` to stack layers to quickly build simple models. But this comes at the cost of losing flexibility.\n",
"\n",
"Often we will have to write complex models that have non-sequential data flows: a layer can have **multiple inputs and/or outputs**, such as stacking the output of 2 previous layers together to feed as input to a third! (Some examples are residual connections and dense blocks.)\n",
"\n",
"In such cases, we can use Keras functional API to write models with complex topologies such as:\n",
"\n",
" 1. Multi-input models\n",
" 2. Multi-output models\n",
" 3. Models with shared layers (the same layer called several times)\n",
" 4. Models with non-sequential data flows (e.g. residual connections)\n",
"\n",
"Writing a model with Functional API requires us to create a `tf.keras.Model` instance and explicitly write input tensors and output tensors for this model. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def two_layer_fc_functional(input_shape, hidden_size, num_classes): \n",
" initializer = tf.initializers.VarianceScaling(scale=2.0)\n",
" inputs = tf.keras.Input(shape=input_shape)\n",
" flattened_inputs = tf.keras.layers.Flatten()(inputs)\n",
" fc1_output = tf.keras.layers.Dense(hidden_size, activation='relu',\n",
" kernel_initializer=initializer)(flattened_inputs)\n",
" scores = tf.keras.layers.Dense(num_classes, activation='softmax',\n",
" kernel_initializer=initializer)(fc1_output)\n",
"\n",
" # Instantiate the model given inputs and outputs.\n",
" model = tf.keras.Model(inputs=inputs, outputs=scores)\n",
" return model\n",
"\n",
"def test_two_layer_fc_functional():\n",
" \"\"\" A small unit test to exercise the TwoLayerFC model above. \"\"\"\n",
" input_size, hidden_size, num_classes = 50, 42, 10\n",
" input_shape = (50,)\n",
" \n",
" x = tf.zeros((64, input_size))\n",
" model = two_layer_fc_functional(input_shape, hidden_size, num_classes)\n",
" \n",
" with tf.device(device):\n",
" scores = model(x)\n",
" print(scores.shape)\n",
" \n",
"test_two_layer_fc_functional()"
]
},
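  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To illustrate the non-sequential data flows mentioned above (item 4 in the list), here is a small hedged sketch of a toy residual connection written with the functional API. The layer sizes and the use of global average pooling are arbitrary illustrative choices, not part of the assignment."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def residual_block_functional(input_shape=(32, 32, 3), num_classes=10):\n",
    "    \"\"\" Toy example of a non-sequential data flow: a single residual connection. \"\"\"\n",
    "    inputs = tf.keras.Input(shape=input_shape)\n",
    "    x = tf.keras.layers.Conv2D(16, (3, 3), padding='same', activation='relu')(inputs)\n",
    "    y = tf.keras.layers.Conv2D(16, (3, 3), padding='same')(x)\n",
    "    # The skip connection: the block's input is added back onto its output,\n",
    "    # so the Add layer takes two inputs -- something Sequential cannot express.\n",
    "    x = tf.keras.layers.Add()([x, y])\n",
    "    x = tf.keras.layers.Activation('relu')(x)\n",
    "    x = tf.keras.layers.GlobalAveragePooling2D()(x)\n",
    "    scores = tf.keras.layers.Dense(num_classes, activation='softmax')(x)\n",
    "    return tf.keras.Model(inputs=inputs, outputs=scores)\n",
    "\n",
    "# residual_block_functional().summary()"
   ]
  },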
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Keras Functional API: Train a Two-Layer Network\n",
"You can now train this two-layer network constructed using the functional API.\n",
"\n",
"You don't need to perform any hyperparameter tuning here, but you should see validation accuracies above 40% after training for one epoch."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"input_shape = (32, 32, 3)\n",
"hidden_size, num_classes = 4000, 10\n",
"learning_rate = 1e-2\n",
"\n",
"def model_init_fn():\n",
" return two_layer_fc_functional(input_shape, hidden_size, num_classes)\n",
"\n",
"def optimizer_init_fn():\n",
" return tf.keras.optimizers.SGD(learning_rate=learning_rate)\n",
"\n",
"train_part34(model_init_fn, optimizer_init_fn)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Part V: CIFAR-10 open-ended challenge\n",
"\n",
"在本节中,你可以在CIFAR-10上实践任何ConvNet架构。\n",
"\n",
"你应该尝试在模型架构、超参数、损失函数、正则化或其他任何你能想到的地方调整来训练模型,从而在10个epoch内获得至少70%的验证集准确率。你可以使用上面的内置训练函数`train_part34`,或者实现自己的循环训练函数。\n",
"\n",
"**把你的工作描述在这个notebook的结尾!**\n",
"\n",
"In this section you can experiment with whatever ConvNet architecture you'd like on CIFAR-10.\n",
"\n",
"You should experiment with architectures, hyperparameters, loss functions, regularization, or anything else you can think of to train a model that achieves **at least 70%** accuracy on the **validation** set within 10 epochs. You can use the built-in train function, the `train_part34` function from above, or implement your own training loop.\n",
"\n",
"Describe what you did at the end of the notebook.\n",
"\n",
"### Some things you can try:\n",
"- **Filter size**: Above we used 5x5 and 3x3; is this optimal?\n",
"- **Number of filters**: Above we used 16 and 32 filters. Would more or fewer do better?\n",
"- **Pooling**: We didn't use any pooling above. Would this improve the model?\n",
"- **Normalization**: Would your model be improved with batch normalization, layer normalization, group normalization, or some other normalization strategy?\n",
"- **Network architecture**: The ConvNet above has only three layers of trainable parameters. Would a deeper model do better?\n",
"- **Global average pooling**: Instead of flattening after the final convolutional layer, would global average pooling do better? This strategy is used for example in Google's Inception network and in Residual Networks.\n",
"- **Regularization**: Would some kind of regularization improve performance? Maybe weight decay or dropout?\n",
"\n",
"### NOTE: Batch Normalization / Dropout\n",
"If you are using Batch Normalization and Dropout, remember to pass `is_training=True` if you use the `train_part34()` function. BatchNorm and Dropout layers have different behaviors at training and inference time. `training` is a specific keyword argument reserved for this purpose in any `tf.keras.Model`'s `call()` function. Read more about this here : https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/BatchNormalization#methods\n",
"https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/keras/layers/Dropout#methods\n",
"\n",
"### Tips for training\n",
"For each network architecture that you try, you should tune the learning rate and other hyperparameters. When doing this there are a couple important things to keep in mind: \n",
"\n",
"- If the parameters are working well, you should see improvement within a few hundred iterations\n",
"- Remember the coarse-to-fine approach for hyperparameter tuning: start by testing a large range of hyperparameters for just a few training iterations to find the combinations of parameters that are working at all.\n",
"- Once you have found some sets of parameters that seem to work, search more finely around these parameters. You may need to train for more epochs.\n",
"- You should use the validation set for hyperparameter search, and save your test set for evaluating your architecture on the best parameters as selected by the validation set.\n",
"\n",
"### Going above and beyond\n",
"If you are feeling adventurous there are many other features you can implement to try and improve your performance. You are **not required** to implement any of these, but don't miss the fun if you have time!\n",
"\n",
"- Alternative optimizers: you can try Adam, Adagrad, RMSprop, etc.\n",
"- Alternative activation functions such as leaky ReLU, parametric ReLU, ELU, or MaxOut.\n",
"- Model ensembles\n",
"- Data augmentation\n",
"- New Architectures\n",
" - [ResNets](https://arxiv.org/abs/1512.03385) where the input from the previous layer is added to the output.\n",
" - [DenseNets](https://arxiv.org/abs/1608.06993) where inputs into previous layers are concatenated together.\n",
" - [This blog has an in-depth overview](https://chatbotslife.com/resnets-highwaynets-and-densenets-oh-my-9bb15918ee32)\n",
" \n",
"### Have fun and happy training! "
]
},
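  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Before you start, here is a tiny hedged sketch (deliberately **not** a full CIFAR-10 model) showing how the `training` flag mentioned in the note above is forwarded to BatchNorm and Dropout inside a subclassed `tf.keras.Model`. The layer sizes and dropout rate are arbitrary illustrative choices."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "class TrainingFlagDemo(tf.keras.Model):\n",
    "    def __init__(self):\n",
    "        super(TrainingFlagDemo, self).__init__()\n",
    "        self.conv = tf.keras.layers.Conv2D(16, (3, 3), padding='same', activation='relu')\n",
    "        self.bn = tf.keras.layers.BatchNormalization()\n",
    "        self.dropout = tf.keras.layers.Dropout(0.25)\n",
    "        self.flatten = tf.keras.layers.Flatten()\n",
    "        self.fc = tf.keras.layers.Dense(10, activation='softmax')\n",
    "\n",
    "    def call(self, x, training=False):\n",
    "        x = self.conv(x)\n",
    "        # BatchNorm and Dropout behave differently at train vs. inference time,\n",
    "        # so the `training` flag must be forwarded to them explicitly.\n",
    "        x = self.bn(x, training=training)\n",
    "        x = self.dropout(x, training=training)\n",
    "        x = self.flatten(x)\n",
    "        return self.fc(x)\n",
    "\n",
    "# Quick check on a dummy batch:\n",
    "# print(TrainingFlagDemo()(tf.zeros((2, 32, 32, 3)), training=True).shape)"
   ]
  },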
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"class CustomConvNet(tf.keras.Model):\n",
" def __init__(self):\n",
" super(CustomConvNet, self).__init__()\n",
" ############################################################################\n",
" # TODO: Construct a model that performs well on CIFAR-10 #\n",
" ############################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ############################################################################\n",
" # END OF YOUR CODE #\n",
" ############################################################################\n",
" \n",
" def call(self, input_tensor, training=False):\n",
" ############################################################################\n",
" # TODO: Construct a model that performs well on CIFAR-10 #\n",
" ############################################################################\n",
" # *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
" pass\n",
"\n",
" # *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" ############################################################################\n",
" # END OF YOUR CODE #\n",
" ############################################################################\n",
" \n",
" return x\n",
"\n",
"# device = '/device:GPU:0' # Change this to a CPU/GPU as you wish!\n",
"device = '/cpu:0' # Change this to a CPU/GPU as you wish!\n",
"print_every = 700\n",
"num_epochs = 10\n",
"\n",
"model = CustomConvNet()\n",
"\n",
"def model_init_fn():\n",
" return CustomConvNet()\n",
"\n",
"def optimizer_init_fn():\n",
" learning_rate = 1e-3\n",
" return tf.keras.optimizers.Adam(learning_rate) \n",
"\n",
"train_part34(model_init_fn, optimizer_init_fn, num_epochs=num_epochs, is_training=True)"
]
},
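  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "If you revisit your model to try the data augmentation idea from the list above, the cell below is a hedged sketch of a simple `tf.data` pipeline that randomly flips and crops CIFAR-10 images. It assumes `X_train` / `y_train` are the NumPy arrays used earlier in this notebook; `augment` and `aug_dset` are illustrative names, and how you plug the augmented dataset into your training loop is up to you."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def augment(image, label):\n",
    "    # Random horizontal flip, then pad to 40x40 and take a random 32x32 crop.\n",
    "    image = tf.image.random_flip_left_right(image)\n",
    "    image = tf.image.pad_to_bounding_box(image, 4, 4, 40, 40)\n",
    "    image = tf.image.random_crop(image, size=[32, 32, 3])\n",
    "    return image, label\n",
    "\n",
    "# Hypothetical wiring -- adapt to however you feed data to your training loop:\n",
    "# aug_dset = (tf.data.Dataset.from_tensor_slices((X_train, y_train))\n",
    "#             .map(augment, num_parallel_calls=tf.data.experimental.AUTOTUNE)\n",
    "#             .shuffle(10000)\n",
    "#             .batch(64))"
   ]
  },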
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"## Describe what you did \n",
"\n",
"In the cell below you should write an explanation of what you did, any additional features that you implemented, and/or any graphs that you made in the process of training and evaluating your network."
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"TODO: Tell us what you did"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"# 重要\n",
"\n",
"这里是作业的结尾处,请执行以下步骤:\n",
"\n",
"1. 点击`File -> Save`或者用`control+s`组合键,确保你最新的的notebook的作业已经保存到谷歌云。\n",
"2. 执行以下代码确保 `.py` 文件保存回你的谷歌云。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"FOLDER_TO_SAVE = os.path.join('drive/My Drive/', FOLDERNAME)\n",
"FILES_TO_SAVE = ['daseCV/classifiers/cnn.py', 'daseCV/classifiers/fc_net.py']\n",
"\n",
"for files in FILES_TO_SAVE:\n",
" with open(os.path.join(FOLDER_TO_SAVE, '/'.join(files.split('/')[1:])), 'w') as f:\n",
" f.write(''.join(open(files).readlines()))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 4
}