
Upload files to 'assignment1'

master
孙秋实 3 years ago
parent
commit
ad9179a06f
6 changed files with 2636 additions and 0 deletions
  1. +409 -0 assignment1/features.ipynb
  2. +611 -0 assignment1/knn.ipynb
  3. +44 -0 assignment1/makepdf.py
  4. +394 -0 assignment1/softmax.ipynb
  5. +618 -0 assignment1/svm.ipynb
  6. +560 -0 assignment1/two_layer_net.ipynb

+ 409
- 0
assignment1/features.ipynb

@@ -0,0 +1,409 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.colab import drive\n",
"\n",
"drive.mount('/content/drive', force_remount=True)\n",
"\n",
"# 输入daseCV所在的路径\n",
"# 'daseCV' 文件夹包括 '.py', 'classifiers' 和'datasets'文件夹\n",
"# 例如 'CV/assignments/assignment1/daseCV/'\n",
"FOLDERNAME = None\n",
"\n",
"assert FOLDERNAME is not None, \"[!] Enter the foldername.\"\n",
"\n",
"%cd drive/My\\ Drive\n",
"%cp -r $FOLDERNAME ../../\n",
"%cd ../../\n",
"%cd daseCV/datasets/\n",
"!bash get_datasets.sh\n",
"%cd ../../"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-title"
]
},
"source": [
"# 图像特征练习\n",
"*补充并完成本练习。*\n",
"\n",
"我们已经看到,通过在输入图像的像素上训练线性分类器,从而在图像分类任务上达到一个合理的性能。在本练习中,我们将展示我们可以通过对线性分类器(不是在原始像素上,而是在根据原始像素计算出的特征上)进行训练来改善分类性能。\n",
"\n",
"你将在此notebook中完成本练习的所有工作。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"import random\n",
"import numpy as np\n",
"from daseCV.data_utils import load_CIFAR10\n",
"import matplotlib.pyplot as plt\n",
"\n",
"\n",
"%matplotlib inline\n",
"plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots\n",
"plt.rcParams['image.interpolation'] = 'nearest'\n",
"plt.rcParams['image.cmap'] = 'gray'\n",
"\n",
"# for auto-reloading extenrnal modules\n",
"# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython\n",
"%load_ext autoreload\n",
"%autoreload 2"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"## 数据加载\n",
"与之前的练习类似,我们将从磁盘加载CIFAR-10数据。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"from daseCV.features import color_histogram_hsv, hog_feature\n",
"\n",
"def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000):\n",
" # Load the raw CIFAR-10 data\n",
" cifar10_dir = 'daseCV/datasets/cifar-10-batches-py'\n",
"\n",
" # Cleaning up variables to prevent loading data multiple times (which may cause memory issue)\n",
" try:\n",
" del X_train, y_train\n",
" del X_test, y_test\n",
" print('Clear previously loaded data.')\n",
" except:\n",
" pass\n",
"\n",
" X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)\n",
" \n",
" # Subsample the data\n",
" mask = list(range(num_training, num_training + num_validation))\n",
" X_val = X_train[mask]\n",
" y_val = y_train[mask]\n",
" mask = list(range(num_training))\n",
" X_train = X_train[mask]\n",
" y_train = y_train[mask]\n",
" mask = list(range(num_test))\n",
" X_test = X_test[mask]\n",
" y_test = y_test[mask]\n",
" \n",
" return X_train, y_train, X_val, y_val, X_test, y_test\n",
"\n",
"X_train, y_train, X_val, y_val, X_test, y_test = get_CIFAR10_data()"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"## 特征提取\n",
"对于每一张图片我们将会计算它的方向梯度直方图(英語:Histogram of oriented gradient,简称HOG)以及在HSV颜色空间使用色相通道的颜色直方图。\n",
"\n",
"简单来讲,HOG能提取图片的纹理信息而忽略颜色信息,颜色直方图则提取出颜色信息而忽略纹理信息。\n",
"因此,我们希望将两者结合使用而不是单独使用任一个。去实现这个设想是一个十分有趣的事情。\n",
"\n",
"`hog_feature` 和 `color_histogram_hsv`两个函数都可以对单个图像进行运算并返回改图像的一个特征向量。\n",
"extract_features函数输入一个图像集合和一个特征函数列表然后对每张图片运行每个特征函数,\n",
"然后将结果存储在一个矩阵中,矩阵的每一列是单个图像的所有特征向量的串联。"
]
},
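{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a point of reference, the next cell is a minimal sketch of how an `extract_features`-style pipeline can be composed from per-image feature functions. It is an illustration only, not the `daseCV` implementation (which stores features per column, as described above; the sketch stacks rows for simplicity), and the toy feature functions `my_feature_fn_1` and `my_feature_fn_2` are hypothetical."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch of a feature-extraction pipeline (illustration only).\n",
"# Each feature function maps one image to a 1-D feature vector; the vectors\n",
"# for each image are concatenated and stacked (here as rows for simplicity).\n",
"import numpy as np\n",
"\n",
"def extract_features_sketch(imgs, feature_fns):\n",
"    feats = []\n",
"    for img in imgs:\n",
"        # Concatenate the outputs of every feature function for this image\n",
"        feats.append(np.concatenate([fn(img) for fn in feature_fns]))\n",
"    return np.stack(feats)  # shape: (num_images, total_feature_dim)\n",
"\n",
"# Hypothetical usage with two toy feature functions:\n",
"my_feature_fn_1 = lambda img: img.mean(axis=(0, 1))  # per-channel mean\n",
"my_feature_fn_2 = lambda img: np.array([img.std()])  # global standard deviation\n",
"demo = extract_features_sketch(np.random.rand(4, 8, 8, 3),\n",
"                               [my_feature_fn_1, my_feature_fn_2])\n",
"print(demo.shape)  # (4, 4)"
]
},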
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true,
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"from daseCV.features import *\n",
"\n",
"num_color_bins = 10 # Number of bins in the color histogram\n",
"feature_fns = [hog_feature, lambda img: color_histogram_hsv(img, nbin=num_color_bins)]\n",
"X_train_feats = extract_features(X_train, feature_fns, verbose=True)\n",
"X_val_feats = extract_features(X_val, feature_fns)\n",
"X_test_feats = extract_features(X_test, feature_fns)\n",
"\n",
"# Preprocessing: Subtract the mean feature\n",
"mean_feat = np.mean(X_train_feats, axis=0, keepdims=True)\n",
"X_train_feats -= mean_feat\n",
"X_val_feats -= mean_feat\n",
"X_test_feats -= mean_feat\n",
"\n",
"# Preprocessing: Divide by standard deviation. This ensures that each feature\n",
"# has roughly the same scale.\n",
"std_feat = np.std(X_train_feats, axis=0, keepdims=True)\n",
"X_train_feats /= std_feat\n",
"X_val_feats /= std_feat\n",
"X_test_feats /= std_feat\n",
"\n",
"# Preprocessing: Add a bias dimension\n",
"X_train_feats = np.hstack([X_train_feats, np.ones((X_train_feats.shape[0], 1))])\n",
"X_val_feats = np.hstack([X_val_feats, np.ones((X_val_feats.shape[0], 1))])\n",
"X_test_feats = np.hstack([X_test_feats, np.ones((X_test_feats.shape[0], 1))])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 使用特征训练SVM\n",
"使用之前作业完成的多分类SVM代码来训练上面提取的特征。这应该比原始数据直接在SVM上训练会去的更好的效果。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"code"
]
},
"outputs": [],
"source": [
"# 使用验证集调整学习率和正则化强度\n",
"\n",
"from daseCV.classifiers.linear_classifier import LinearSVM\n",
"\n",
"learning_rates = [1e-9, 1e-8, 1e-7]\n",
"regularization_strengths = [5e4, 5e5, 5e6]\n",
"\n",
"results = {}\n",
"best_val = -1\n",
"best_svm = None\n",
"\n",
"################################################################################\n",
"# 你需要做的: \n",
"# 使用验证集设置学习率和正则化强度。\n",
"# 这应该与你对SVM所做的验证相同;\n",
"# 将训练最好的的分类器保存在best_svm中。\n",
"# 你可能还想在颜色直方图中使用不同数量的bins。\n",
"# 如果你细心一点应该能够在验证集上获得接近0.44的准确性。 \n",
"################################################################################\n",
"# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"pass\n",
"\n",
"# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"# Print out results.\n",
"for lr, reg in sorted(results):\n",
" train_accuracy, val_accuracy = results[(lr, reg)]\n",
" print('lr %e reg %e train accuracy: %f val accuracy: %f' % (\n",
" lr, reg, train_accuracy, val_accuracy))\n",
" \n",
"print('best validation accuracy achieved during cross-validation: %f' % best_val)"
]
},
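{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the next cell is one possible shape for the grid search requested above. It is a sketch, not the reference solution, and it assumes the `LinearSVM` API used elsewhere in this assignment (`train(X, y, learning_rate, reg, num_iters)` and `predict(X)`)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of a validation-set grid search (illustration only).\n",
"for lr in learning_rates:\n",
"    for reg in regularization_strengths:\n",
"        svm = LinearSVM()\n",
"        svm.train(X_train_feats, y_train, learning_rate=lr, reg=reg, num_iters=1500)\n",
"        train_acc = np.mean(svm.predict(X_train_feats) == y_train)\n",
"        val_acc = np.mean(svm.predict(X_val_feats) == y_val)\n",
"        results[(lr, reg)] = (train_acc, val_acc)\n",
"        if val_acc > best_val:\n",
"            best_val, best_svm = val_acc, svm"
]
},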
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Evaluate your trained SVM on the test set\n",
"y_test_pred = best_svm.predict(X_test_feats)\n",
"test_accuracy = np.mean(y_test == y_test_pred)\n",
"print(test_accuracy)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 直观了解算法工作原理的一种重要方法是可视化它所犯的错误。\n",
"# 在此可视化中,我们显示了当前系统未正确分类的图像示例。\n",
"# 第一列显示的图像是我们的系统标记为“ plane”,但其真实标记不是“ plane”。\n",
"\n",
"examples_per_class = 8\n",
"classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']\n",
"for cls, cls_name in enumerate(classes):\n",
" idxs = np.where((y_test != cls) & (y_test_pred == cls))[0]\n",
" idxs = np.random.choice(idxs, examples_per_class, replace=False)\n",
" for i, idx in enumerate(idxs):\n",
" plt.subplot(examples_per_class, len(classes), i * len(classes) + cls + 1)\n",
" plt.imshow(X_test[idx].astype('uint8'))\n",
" plt.axis('off')\n",
" if i == 0:\n",
" plt.title(cls_name)\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题 1:**\n",
"\n",
"描述你看到的错误分类结果。你认为他们有道理吗?\n",
"\n",
"$\\color{blue}{\\textit 答:}$ **在这里写上你的回答**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## 图像特征神经网络\n",
"在之前的练习中,我们看到在原始像素上训练两层神经网络比线性分类器具有更好的分类精度。在这里,我们已经看到使用图像特征的线性分类器优于使用原始像素的线性分类器。\n",
"为了完整起见,我们还应该尝试在图像特征上训练神经网络。这种方法应优于以前所有的方法:你应该能够轻松地在测试集上达到55%以上的分类精度;我们最好的模型可达到约60%的精度。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# Preprocessing: Remove the bias dimension\n",
"# Make sure to run this cell only ONCE\n",
"print(X_train_feats.shape)\n",
"X_train_feats = X_train_feats[:, :-1]\n",
"X_val_feats = X_val_feats[:, :-1]\n",
"X_test_feats = X_test_feats[:, :-1]\n",
"\n",
"print(X_train_feats.shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"code"
]
},
"outputs": [],
"source": [
"from daseCV.classifiers.neural_net import TwoLayerNet\n",
"\n",
"input_dim = X_train_feats.shape[1]\n",
"hidden_dim = 500\n",
"num_classes = 10\n",
"best_acc = 0.0\n",
"\n",
"net = TwoLayerNet(input_dim, hidden_dim, num_classes)\n",
"best_net = None\n",
"\n",
"################################################################################\n",
"# TODO: 使用图像特征训练两层神经网络。\n",
"# 您可能希望像上一节中那样对各种参数进行交叉验证。\n",
"# 将最佳的模型存储在best_net变量中。 \n",
"################################################################################\n",
"# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"pass\n",
"\n",
"# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 在测试集上运行得到的最好的神经网络分类器,应该能够获得55%以上的准确性。\n",
"\n",
"test_acc = (best_net.predict(X_test_feats) == y_test).mean()\n",
"print(test_acc)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"# 重要\n",
"\n",
"这里是作业的结尾处,请执行以下步骤:\n",
"\n",
"1. 点击`File -> Save`或者用`control+s`组合键,确保你最新的的notebook的作业已经保存到谷歌云。\n",
"2. 执行以下代码确保 `.py` 文件保存回你的谷歌云。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"FOLDER_TO_SAVE = os.path.join('drive/My Drive/', FOLDERNAME)\n",
"FILES_TO_SAVE = []\n",
"\n",
"for files in FILES_TO_SAVE:\n",
" with open(os.path.join(FOLDER_TO_SAVE, '/'.join(files.split('/')[1:])), 'w') as f:\n",
" f.write(''.join(open(files).readlines()))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 1
}

+ 611
- 0
assignment1/knn.ipynb

@@ -0,0 +1,611 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.colab import drive\n",
"\n",
"drive.mount('/content/drive', force_remount=True)\n",
"\n",
"# 输入daseCV所在的路径\n",
"# 'daseCV' 文件夹包括 '.py', 'classifiers' 和'datasets'文件夹\n",
"# 例如 'CV/assignments/assignment1/daseCV/'\n",
"FOLDERNAME = None\n",
"\n",
"assert FOLDERNAME is not None, \"[!] Enter the foldername.\"\n",
"\n",
"%cd drive/My\\ Drive\n",
"%cp -r $FOLDERNAME ../../\n",
"%cd ../../\n",
"%cd daseCV/datasets/\n",
"!bash get_datasets.sh\n",
"%cd ../../"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-title"
]
},
"source": [
"# K-近邻算法 (kNN) 练习\n",
"\n",
"*补充并完成本练习。*\n",
"\n",
"kNN分类器包含两个阶段:\n",
"\n",
"- 训练阶段,分类器获取训练数据并简单地记住它。\n",
"- 测试阶段, kNN将测试图像与所有训练图像进行比较,并计算出前k个最相似的训练示例的标签来对每个测试图像进行分类。\n",
"- 对k值进行交叉验证\n",
"\n",
"在本练习中,您将实现这些步骤,并了解基本的图像分类、交叉验证和熟练编写高效矢量化代码的能力。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 运行notebook的一些初始化代码\n",
"\n",
"import random\n",
"import numpy as np\n",
"from daseCV.data_utils import load_CIFAR10\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# 使得matplotlib的图像在当前页显示而不是新的窗口。\n",
"%matplotlib inline\n",
"plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots\n",
"plt.rcParams['image.interpolation'] = 'nearest'\n",
"plt.rcParams['image.cmap'] = 'gray'\n",
"\n",
"# 一些更神奇的,使notebook重新加载外部的python模块;\n",
"# 参见 http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython\n",
"%load_ext autoreload\n",
"%autoreload 2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 加载未处理的 CIFAR-10 数据.\n",
"cifar10_dir = 'daseCV/datasets/cifar-10-batches-py'\n",
"\n",
"# 清理变量以防止多次加载数据(这可能会导致内存问题)\n",
"try:\n",
" del X_train, y_train\n",
" del X_test, y_test\n",
" print('Clear previously loaded data.')\n",
"except:\n",
" pass\n",
"\n",
"X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)\n",
"\n",
"# 作为健全性检查,我们打印出训练和测试数据的形状。\n",
"print('Training data shape: ', X_train.shape)\n",
"print('Training labels shape: ', y_train.shape)\n",
"print('Test data shape: ', X_test.shape)\n",
"print('Test labels shape: ', y_test.shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 可视化数据集中的一些示例。\n",
"# 我们展示了训练图像的所有类别的一些示例。\n",
"classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']\n",
"num_classes = len(classes)\n",
"samples_per_class = 7\n",
"for y, cls in enumerate(classes):\n",
" idxs = np.flatnonzero(y_train == y) # flatnonzero表示返回所给数列的非零项的索引值,这里表示返回所有属于y类的索引\n",
" idxs = np.random.choice(idxs, samples_per_class, replace=False) # replace表示抽取的样本是否能重复\n",
" for i, idx in enumerate(idxs):\n",
" plt_idx = i * num_classes + y + 1\n",
" plt.subplot(samples_per_class, num_classes, plt_idx)\n",
" plt.imshow(X_train[idx].astype('uint8'))\n",
" plt.axis('off')\n",
" if i == 0:\n",
" plt.title(cls)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 在练习中使用更小的子样本可以提高代码的效率\n",
"num_training = 5000\n",
"mask = list(range(num_training))\n",
"X_train = X_train[mask]\n",
"y_train = y_train[mask]\n",
"\n",
"num_test = 500\n",
"mask = list(range(num_test))\n",
"X_test = X_test[mask]\n",
"y_test = y_test[mask]\n",
"\n",
"# 将图像数据调整为行\n",
"X_train = np.reshape(X_train, (X_train.shape[0], -1))\n",
"X_test = np.reshape(X_test, (X_test.shape[0], -1))\n",
"print(X_train.shape, X_test.shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"from daseCV.classifiers import KNearestNeighbor\n",
"\n",
"# 创建一个kNN分类器实例。\n",
"# 请记住,kNN分类器的训练并不会做什么: \n",
"# 分类器仅记住数据并且不做进一步处理\n",
"classifier = KNearestNeighbor()\n",
"classifier.train(X_train, y_train)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"现在,我们要使用kNN分类器对测试数据进行分类。回想一下,我们可以将该过程分为两个步骤: \n",
"\n",
"1. 首先,我们必须计算所有测试样本与所有训练样本之间的距离。 \n",
"2. 给定这些距离,对于每个测试示例,我们找到k个最接近的示例,并让它们对标签进行投票\n",
"\n",
"让我们开始计算所有训练和测试示例之间的距离矩阵。 假设有 **Ntr** 的训练样本和 **Nte** 的测试样本, 该过程的结果存储在一个 **Nte x Ntr** 矩阵中,其中每个元素 (i,j) 表示的是第 i 个测试样本和第 j 个 训练样本的距离。\n",
"\n",
"**注意: 在完成此notebook中的三个距离的计算时请不要使用numpy提供的np.linalg.norm()函数。**\n",
"\n",
"首先打开 `daseCV/classifiers/k_nearest_neighbor.py` 并且补充完成函数 `compute_distances_two_loops` ,这个函数使用双重循环(效率十分低下)来计算距离矩阵。"
]
},
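{
"cell_type": "markdown",
"metadata": {},
"source": [
"For orientation, the next cell is a minimal sketch of the double-loop Euclidean distance computation described above, written against toy arrays rather than the `daseCV` class. It is an illustration, not the reference implementation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Minimal sketch of the double-loop L2 distance matrix (illustration only).\n",
"import numpy as np\n",
"\n",
"def compute_distances_two_loops_sketch(X_test, X_train):\n",
"    num_test, num_train = X_test.shape[0], X_train.shape[0]\n",
"    dists = np.zeros((num_test, num_train))\n",
"    for i in range(num_test):\n",
"        for j in range(num_train):\n",
"            # L2 distance between the i-th test row and the j-th training row\n",
"            dists[i, j] = np.sqrt(np.sum((X_test[i] - X_train[j]) ** 2))\n",
"    return dists\n",
"\n",
"print(compute_distances_two_loops_sketch(np.random.rand(3, 5), np.random.rand(4, 5)).shape)  # (3, 4)"
]
},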
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 打开 daseCV/classifiers/k_nearest_neighbor.py 并且补充完成\n",
"# compute_distances_two_loops.\n",
"\n",
"# 测试你的代码:\n",
"dists = classifier.compute_distances_two_loops(X_test)\n",
"print(dists.shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 我们可视化距离矩阵:每行代表一个测试样本与训练样本的距离\n",
"plt.imshow(dists, interpolation='none')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题 1** \n",
"\n",
"请注意距离矩阵中的结构化图案,其中某些行或列的可见亮度更高。(请注意,使用默认的配色方案,黑色表示低距离,而白色表示高距离。)\n",
"\n",
"- 数据中导致行亮度更高的原因是什么?\n",
"- 那列方向的是什么原因呢?\n",
"\n",
"$\\color{blue}{\\textit 答:}$ *在这里做出回答*\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 现在实现函数predict_labels并运行以下代码:\n",
"# 我们使用k = 1(这是最近的邻居)。\n",
"y_test_pred = classifier.predict_labels(dists, k=1)\n",
"\n",
"# 计算并打印出预测的精度\n",
"num_correct = np.sum(y_test_pred == y_test)\n",
"accuracy = float(num_correct) / num_test\n",
"print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"你预期的精度应该为 `27%` 左右。 现在让我们尝试更大的 `k`, 比如 `k = 5`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"y_test_pred = classifier.predict_labels(dists, k=5)\n",
"num_correct = np.sum(y_test_pred == y_test)\n",
"accuracy = float(num_correct) / num_test\n",
"print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"你应该能看到一个比 `k = 1` 稍微好一点的结果。"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题 2**\n",
"\n",
"我们还可以使用其他距离指标,例如L1距离。\n",
"\n",
"记图像 $I_k$ 的每个位置 $(i,j)$ 的像素值为 $p_{ij}^{(k)}$,\n",
"\n",
"所有图像上的所有像素的均值 $\\mu$ 为 \n",
"\n",
"$$\\mu=\\frac{1}{nhw}\\sum_{k=1}^n\\sum_{i=1}^{h}\\sum_{j=1}^{w}p_{ij}^{(k)}$$\n",
"\n",
"并且所有图像的每个像素的均值 $\\mu_{ij}$ 为\n",
"\n",
"$$\\mu_{ij}=\\frac{1}{n}\\sum_{k=1}^np_{ij}^{(k)}.$$\n",
"\n",
"标准差 $\\sigma$ 以及每个像素的标准差 $\\sigma_{ij}$ 的定义与之类似。\n",
"\n",
"以下哪个预处理步骤不会改变使用L1距离的最近邻分类器的效果?选择所有符合条件的答案。\n",
"1. 减去均值 $\\mu$ ($\\tilde{p}_{ij}^{(k)}=p_{ij}^{(k)}-\\mu$.)\n",
"2. 减去每个像素均值 $\\mu_{ij}$ ($\\tilde{p}_{ij}^{(k)}=p_{ij}^{(k)}-\\mu_{ij}$.)\n",
"3. 减去均值 $\\mu$ 然后除以标准偏差 $\\sigma$.\n",
"4. 减去每个像素均值 $\\mu_{ij}$ 并除以每个素标准差 $\\sigma_{ij}$.\n",
"5. 旋转数据的坐标轴。\n",
"\n",
"$\\color{blue}{\\textit 你的回答:}$\n",
"\n",
"\n",
"$\\color{blue}{\\textit 你的解释:}$\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"# 现在,通过部分矢量化并且使用单层循环的来加快距离矩阵的计算。\n",
"# 需要实现函数compute_distances_one_loop并运行以下代码:\n",
"\n",
"dists_one = classifier.compute_distances_one_loop(X_test)\n",
"\n",
"# 为了确保我们的矢量化实现正确,我们要保证它的结果与最原始的实现方式结果一致。\n",
"# 有很多方法可以确定两个矩阵是否相似。最简单的方法之一就是Frobenius范数。 \n",
"# 如果您以前从未了解过Frobenius范数,它其实是两个矩阵的所有元素之差的平方和的平方根;\n",
"# 换句话说,就是将矩阵重整为向量并计算它们之间的欧几里得距离。\n",
"\n",
"difference = np.linalg.norm(dists - dists_one, ord='fro')\n",
"print('One loop difference was: %f' % (difference, ))\n",
"if difference < 0.001:\n",
" print('Good! The distance matrices are the same')\n",
"else:\n",
" print('Uh-oh! The distance matrices are different')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"scrolled": true,
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"# 现在完成compute_distances_no_loops实现完全矢量化的版本并运行代码\n",
"dists_two = classifier.compute_distances_no_loops(X_test)\n",
"\n",
"# 检查距离矩阵是否与我们之前计算出的矩阵一致:\n",
"difference = np.linalg.norm(dists - dists_two, ord='fro')\n",
"print('No loop difference was: %f' % (difference, ))\n",
"if difference < 0.001:\n",
" print('Good! The distance matrices are the same')\n",
"else:\n",
" print('Uh-oh! The distance matrices are different')"
]
},
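{
"cell_type": "markdown",
"metadata": {},
"source": [
"One common fully vectorized formulation expands the squared distance as $\\|x-y\\|^2=\\|x\\|^2-2x\\cdot y+\\|y\\|^2$. The next cell sketches this trick on toy arrays; it is an illustration, not the reference implementation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of a fully vectorized L2 distance matrix (illustration only),\n",
"# using the expansion ||x - y||^2 = ||x||^2 - 2 x.y + ||y||^2.\n",
"import numpy as np\n",
"\n",
"def compute_distances_no_loops_sketch(X_test, X_train):\n",
"    test_sq = np.sum(X_test ** 2, axis=1, keepdims=True)  # shape (Nte, 1)\n",
"    train_sq = np.sum(X_train ** 2, axis=1)               # shape (Ntr,)\n",
"    cross = X_test @ X_train.T                            # shape (Nte, Ntr)\n",
"    # Clip tiny negative values introduced by floating-point error before sqrt\n",
"    return np.sqrt(np.maximum(test_sq - 2 * cross + train_sq, 0))\n",
"\n",
"print(compute_distances_no_loops_sketch(np.random.rand(3, 5), np.random.rand(4, 5)).shape)  # (3, 4)"
]
},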
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"# 让我们比较一下三种实现方式的速度\n",
"def time_function(f, *args):\n",
" \"\"\"\n",
" Call a function f with args and return the time (in seconds) that it took to execute.\n",
" \"\"\"\n",
" import time\n",
" tic = time.time()\n",
" f(*args)\n",
" toc = time.time()\n",
" return toc - tic\n",
"\n",
"two_loop_time = time_function(classifier.compute_distances_two_loops, X_test)\n",
"print('Two loop version took %f seconds' % two_loop_time)\n",
"\n",
"one_loop_time = time_function(classifier.compute_distances_one_loop, X_test)\n",
"print('One loop version took %f seconds' % one_loop_time)\n",
"\n",
"no_loop_time = time_function(classifier.compute_distances_no_loops, X_test)\n",
"print('No loop version took %f seconds' % no_loop_time)\n",
"\n",
"# 你应该会看到使用完全矢量化的实现会有明显更佳的性能!\n",
"\n",
"# 注意:在部分计算机上,当您从两层循环转到单层循环时,\n",
"# 您可能看不到速度的提升,甚至可能会看到速度变慢。"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 交叉验证\n",
"\n",
"我们已经实现了kNN分类器,并且可以设置k = 5。现在,将通过交叉验证来确定此超参数的最佳值。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"code"
]
},
"outputs": [],
"source": [
"num_folds = 5\n",
"k_choices = [1, 3, 5, 8, 10, 12, 15, 20, 50, 100]\n",
"\n",
"X_train_folds = []\n",
"y_train_folds = []\n",
"################################################################################\n",
"# 需要完成的事情: \n",
"# 将训练数据分成多个部分。拆分后,X_train_folds和y_train_folds均应为长度为num_folds的列表,\n",
"# 其中y_train_folds [i]是X_train_folds [i]中各点的标签向量。\n",
"# 提示:查阅numpy的array_split函数。 \n",
"################################################################################\n",
"# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"pass\n",
"\n",
"# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"# A dictionary holding the accuracies for different values of k that we find when running cross-validation.\n",
"# 一个字典,存储我们进行交叉验证时不同k的值的精度。\n",
"# 运行交叉验证后,k_to_accuracies[k]应该是长度为num_folds的列表,存储了k值下的精度值。\n",
"k_to_accuracies = {}\n",
"\n",
"\n",
"################################################################################\n",
"# 需要完成的事情: \n",
"# 执行k的交叉验证,以找到k的最佳值。\n",
"# 对于每个可能的k值,运行k-最近邻算法 num_folds 次,\n",
"# 在每次循环下,你都会用所有拆分的数据(除了其中一个需要作为验证集)作为训练数据。\n",
"# 然后存储所有的精度结果到k_to_accuracies[k]中。 \n",
"################################################################################\n",
"# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"# 交叉验证。有时候,训练集数量较小(因此验证集的数量更小),人们会使用一种被称为\n",
"# 交叉验证的方法,这种方法更加复杂些。还是用刚才的例子,如果是交叉验证集,我们就\n",
"# 不是取1000个图像,而是将训练集平均分成5份,其中4份用来训练,1份用来验证。然后\n",
"# 我们循环着取其中4份来训练,其中1份来验证,最后取所有5次验证结果的平均值作为算\n",
"# 法验证结果。\n",
"\n",
"for k in k_choices:\n",
" k_to_accuracies[k] = []\n",
" for i in range(num_folds):\n",
" # prepare training data for the current fold\n",
" X_train_fold = np.concatenate([ fold for j, fold in enumerate(X_train_folds) if i != j ])\n",
" y_train_fold = np.concatenate([ fold for j, fold in enumerate(y_train_folds) if i != j ])\n",
" \n",
" # use of k-nearest-neighbor algorithm\n",
" classifier.train(X_train_fold, y_train_fold)\n",
" y_pred_fold = classifier.predict(X_train_folds[i], k=k, num_loops=0)\n",
"\n",
" # Compute the fraction of correctly predicted examples\n",
" num_correct = np.sum(y_pred_fold == y_train_folds[i])\n",
" accuracy = float(num_correct) / X_train_folds[i].shape[0]\n",
" k_to_accuracies[k].append(accuracy)\n",
"\n",
"# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"# 打印出计算的精度\n",
"for k in sorted(k_to_accuracies):\n",
" for accuracy in k_to_accuracies[k]:\n",
" print('k = %d, accuracy = %f' % (k, accuracy))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"# 绘制原始观察结果\n",
"for k in k_choices:\n",
" accuracies = k_to_accuracies[k]\n",
" plt.scatter([k] * len(accuracies), accuracies)\n",
"\n",
"# 用与标准偏差相对应的误差线绘制趋势线\n",
"accuracies_mean = np.array([np.mean(v) for k,v in sorted(k_to_accuracies.items())])\n",
"accuracies_std = np.array([np.std(v) for k,v in sorted(k_to_accuracies.items())])\n",
"plt.errorbar(k_choices, accuracies_mean, yerr=accuracies_std)\n",
"plt.title('Cross-validation on k')\n",
"plt.xlabel('k')\n",
"plt.ylabel('Cross-validation accuracy')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 根据上述交叉验证结果,为k选择最佳值,使用所有训练数据重新训练分类器,\n",
"# 并在测试中对其进行测试数据。您应该能够在测试数据上获得28%以上的准确性。\n",
"\n",
"best_k = k_choices[accuracies_mean.argmax()]\n",
"\n",
"classifier = KNearestNeighbor()\n",
"classifier.train(X_train, y_train)\n",
"y_test_pred = classifier.predict(X_test, k=best_k)\n",
"\n",
"# Compute and display the accuracy\n",
"num_correct = np.sum(y_test_pred == y_test)\n",
"accuracy = float(num_correct) / num_test\n",
"print('Got %d / %d correct => accuracy: %f' % (num_correct, num_test, accuracy))"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题 3**\n",
"\n",
"下列关于$k$-NN的陈述中哪些是在分类器中正确的设置,并且对所有的$k$都有效?选择所有符合条件的选项。\n",
"\n",
"1. k-NN分类器的决策边界是线性的。\n",
"2. 1-NN的训练误差将始终低于5-NN。\n",
"3. 1-NN的测试误差将始终低于5-NN。\n",
"4. 使用k-NN分类器对测试示例进行分类所需的时间随训练集的大小而增加。\n",
"5. 以上都不是。\n",
"\n",
"$\\color{blue}{\\textit 你的回答:}$\n",
"\n",
"\n",
"$\\color{blue}{\\textit 你的解释:}$\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"# 重要\n",
"\n",
"这里是作业的结尾处,请执行以下步骤:\n",
"\n",
"1. 点击`File -> Save`或者用`control+s`组合键,确保你最新的的notebook的作业已经保存到谷歌云。\n",
"2. 执行以下代码确保 `.py` 文件保存回你的谷歌云。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"FOLDER_TO_SAVE = os.path.join('drive/My Drive/', FOLDERNAME)\n",
"FILES_TO_SAVE = ['daseCV/classifiers/k_nearest_neighbor.py']\n",
"\n",
"for files in FILES_TO_SAVE:\n",
" with open(os.path.join(FOLDER_TO_SAVE, '/'.join(files.split('/')[1:])), 'w') as f:\n",
" f.write(''.join(open(files).readlines()))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 1
}

+ 44
- 0
assignment1/makepdf.py

@@ -0,0 +1,44 @@
import argparse
import os
import subprocess

try:
    from PyPDF2 import PdfFileMerger
    MERGE = True
except ImportError:
    print("Could not find PyPDF2. Leaving pdf files unmerged.")
    MERGE = False


def main(files):
    os_args = [
        "jupyter",
        "nbconvert",
        "--log-level",
        "CRITICAL",
        "--to",
        "pdf",
    ]
    for f in files:
        os_args.append(f)
    subprocess.run(os_args)
    print("Created PDF {}.".format(f))
    if MERGE:
        pdfs = [f.split(".")[0] + ".pdf" for f in files]
        merger = PdfFileMerger(strict=True)
        for pdf in pdfs:
            merger.append(pdf)
        merger.write("assignment.pdf")
        merger.close()
        for pdf in pdfs:
            os.remove(pdf)


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    # we pass in explicit notebook arg so that we can provide
    # an ordered list and produce an ordered pdf
    parser.add_argument("--notebooks", type=str, nargs="+", required=True)
    args = parser.parse_args()
    main(args.notebooks)
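
# Example invocation (notebook names here are illustrative; pass the assignment
# notebooks in the order the merged assignment.pdf should follow):
#
#   python makepdf.py --notebooks knn.ipynb svm.ipynb softmax.ipynb two_layer_net.ipynb features.ipynb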

+ 394
- 0
assignment1/softmax.ipynb

@@ -0,0 +1,394 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.colab import drive\n",
"\n",
"drive.mount('/content/drive', force_remount=True)\n",
"\n",
"# 输入daseCV所在的路径\n",
"# 'daseCV' 文件夹包括 '.py', 'classifiers' 和'datasets'文件夹\n",
"# 例如 'CV/assignments/assignment1/daseCV/'\n",
"FOLDERNAME = None\n",
"\n",
"assert FOLDERNAME is not None, \"[!] Enter the foldername.\"\n",
"\n",
"%cd drive/My\\ Drive\n",
"%cp -r $FOLDERNAME ../../\n",
"%cd ../../\n",
"%cd daseCV/datasets/\n",
"!bash get_datasets.sh\n",
"%cd ../../"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-title"
]
},
"source": [
"# Softmax 练习\n",
"\n",
"*补充并完成本练习。*\n",
"\n",
"本练习类似于SVM练习,你要完成的事情包括:\n",
"\n",
"- 为Softmax分类器实现完全矢量化的**损失函数**\n",
"- 实现其**解析梯度(analytic gradient)**的完全矢量化表达式\n",
"- 用数值梯度**检查你的代码**\n",
"- 使用验证集**调整学习率和正则化强度**\n",
"- 使用**SGD优化**损失函数\n",
"- **可视化**最终学习的权重\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"import random\n",
"import numpy as np\n",
"from daseCV.data_utils import load_CIFAR10\n",
"import matplotlib.pyplot as plt\n",
"\n",
"%matplotlib inline\n",
"plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots\n",
"plt.rcParams['image.interpolation'] = 'nearest'\n",
"plt.rcParams['image.cmap'] = 'gray'\n",
"\n",
"# for auto-reloading extenrnal modules\n",
"# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython\n",
"%load_ext autoreload\n",
"%autoreload 2"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000, num_dev=500):\n",
" \"\"\"\n",
" Load the CIFAR-10 dataset from disk and perform preprocessing to prepare\n",
" it for the linear classifier. These are the same steps as we used for the\n",
" SVM, but condensed to a single function. \n",
" \"\"\"\n",
" # Load the raw CIFAR-10 data\n",
" cifar10_dir = 'daseCV/datasets/cifar-10-batches-py'\n",
" \n",
" # Cleaning up variables to prevent loading data multiple times (which may cause memory issue)\n",
" try:\n",
" del X_train, y_train\n",
" del X_test, y_test\n",
" print('Clear previously loaded data.')\n",
" except:\n",
" pass\n",
"\n",
" X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)\n",
" \n",
" # subsample the data\n",
" mask = list(range(num_training, num_training + num_validation))\n",
" X_val = X_train[mask]\n",
" y_val = y_train[mask]\n",
" mask = list(range(num_training))\n",
" X_train = X_train[mask]\n",
" y_train = y_train[mask]\n",
" mask = list(range(num_test))\n",
" X_test = X_test[mask]\n",
" y_test = y_test[mask]\n",
" mask = np.random.choice(num_training, num_dev, replace=False)\n",
" X_dev = X_train[mask]\n",
" y_dev = y_train[mask]\n",
" \n",
" # Preprocessing: reshape the image data into rows\n",
" X_train = np.reshape(X_train, (X_train.shape[0], -1))\n",
" X_val = np.reshape(X_val, (X_val.shape[0], -1))\n",
" X_test = np.reshape(X_test, (X_test.shape[0], -1))\n",
" X_dev = np.reshape(X_dev, (X_dev.shape[0], -1))\n",
" \n",
" # Normalize the data: subtract the mean image\n",
" mean_image = np.mean(X_train, axis = 0)\n",
" X_train -= mean_image\n",
" X_val -= mean_image\n",
" X_test -= mean_image\n",
" X_dev -= mean_image\n",
" \n",
" # add bias dimension and transform into columns\n",
" X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])\n",
" X_val = np.hstack([X_val, np.ones((X_val.shape[0], 1))])\n",
" X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])\n",
" X_dev = np.hstack([X_dev, np.ones((X_dev.shape[0], 1))])\n",
" \n",
" return X_train, y_train, X_val, y_val, X_test, y_test, X_dev, y_dev\n",
"\n",
"\n",
"# Invoke the above function to get our data.\n",
"X_train, y_train, X_val, y_val, X_test, y_test, X_dev, y_dev = get_CIFAR10_data()\n",
"print('Train data shape: ', X_train.shape)\n",
"print('Train labels shape: ', y_train.shape)\n",
"print('Validation data shape: ', X_val.shape)\n",
"print('Validation labels shape: ', y_val.shape)\n",
"print('Test data shape: ', X_test.shape)\n",
"print('Test labels shape: ', y_test.shape)\n",
"print('dev data shape: ', X_dev.shape)\n",
"print('dev labels shape: ', y_dev.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Softmax 分类器\n",
"\n",
"请在**daseCV/classifiers/softmax.py**中完成本节的代码。"
]
},
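{
"cell_type": "markdown",
"metadata": {},
"source": [
"As background for the implementation, the next cell sketches a standard numerically stable softmax cross-entropy loss on toy data. It is an illustration of the technique (shift the scores by their row-wise max before exponentiating), not the `daseCV` reference implementation, and it omits regularization and gradients."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of a numerically stable softmax cross-entropy loss (illustration only;\n",
"# no regularization, no gradient).\n",
"import numpy as np\n",
"\n",
"def softmax_loss_sketch(W, X, y):\n",
"    scores = X @ W                                # class scores, shape (N, C)\n",
"    scores -= scores.max(axis=1, keepdims=True)   # shift for numerical stability\n",
"    probs = np.exp(scores)\n",
"    probs /= probs.sum(axis=1, keepdims=True)     # row-wise softmax probabilities\n",
"    # Average negative log-likelihood of the correct classes\n",
"    return -np.mean(np.log(probs[np.arange(X.shape[0]), y]))\n",
"\n",
"W_toy = np.random.randn(5, 3) * 0.01\n",
"X_toy = np.random.randn(10, 5)\n",
"y_toy = np.random.randint(3, size=10)\n",
"print(softmax_loss_sketch(W_toy, X_toy, y_toy))  # roughly -log(1/3) for small random W"
]
},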
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 首先使用嵌套循环实现简单的softmax损失函数。\n",
"# 打开文件 daseCV/classifiers/softmax.py 并补充完成\n",
"# softmax_loss_naive 函数.\n",
"\n",
"from daseCV.classifiers.softmax import softmax_loss_naive\n",
"import time\n",
"\n",
"# 生成一个随机的softmax权重矩阵,并使用它来计算损失。\n",
"W = np.random.randn(3073, 10) * 0.0001\n",
"loss, grad = softmax_loss_naive(W, X_dev, y_dev, 0.0)\n",
"\n",
"# As a rough sanity check, our loss should be something close to -log(0.1).\n",
"print('loss: %f' % loss)\n",
"print('sanity check: %f' % (-np.log(0.1)))"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题 1**\n",
"\n",
"\n",
"为什么我们期望损失接近-log(0.1)?简要说明。\n",
"\n",
"$\\color{blue}{\\textit 答:}$ *在这里写上你的答案* \n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 完成softmax_loss_naive,并实现使用嵌套循环的梯度的版本(naive)。\n",
"loss, grad = softmax_loss_naive(W, X_dev, y_dev, 0.0)\n",
"\n",
"# 就像SVM那样,请使用数值梯度检查作为调试工具。\n",
"# 数值梯度应接近分析梯度。\n",
"from daseCV.gradient_check import grad_check_sparse\n",
"f = lambda w: softmax_loss_naive(w, X_dev, y_dev, 0.0)[0]\n",
"grad_numerical = grad_check_sparse(f, W, grad, 10)\n",
"\n",
"# 与SVM情况类似,使用正则化进行另一个梯度检查\n",
"loss, grad = softmax_loss_naive(W, X_dev, y_dev, 5e1)\n",
"f = lambda w: softmax_loss_naive(w, X_dev, y_dev, 5e1)[0]\n",
"grad_numerical = grad_check_sparse(f, W, grad, 10)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 现在,我们有了softmax损失函数及其梯度的简单实现,\n",
"# 接下来要在 softmax_loss_vectorized 中完成一个向量化版本.\n",
"# 这两个版本应计算出相同的结果,但矢量化版本应更快。\n",
"tic = time.time()\n",
"loss_naive, grad_naive = softmax_loss_naive(W, X_dev, y_dev, 0.000005)\n",
"toc = time.time()\n",
"print('naive loss: %e computed in %fs' % (loss_naive, toc - tic))\n",
"\n",
"from daseCV.classifiers.softmax import softmax_loss_vectorized\n",
"tic = time.time()\n",
"loss_vectorized, grad_vectorized = softmax_loss_vectorized(W, X_dev, y_dev, 0.000005)\n",
"toc = time.time()\n",
"print('vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic))\n",
"\n",
"# 正如前面在SVM练习中所做的一样,我们使用Frobenius范数比较两个版本梯度。\n",
"grad_difference = np.linalg.norm(grad_naive - grad_vectorized, ord='fro')\n",
"print('Loss difference: %f' % np.abs(loss_naive - loss_vectorized))\n",
"print('Gradient difference: %f' % grad_difference)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"code"
]
},
"outputs": [],
"source": [
"# 使用验证集调整超参数(正则化强度和学习率)。您应该尝试不同的学习率和正则化强度范围; \n",
"# 如果您小心的话,您应该能够在验证集上获得超过0.35的精度。\n",
"from daseCV.classifiers import Softmax\n",
"results = {}\n",
"best_val = -1\n",
"best_softmax = None\n",
"learning_rates = [1e-7, 5e-7]\n",
"regularization_strengths = [2.5e4, 5e4]\n",
"\n",
"################################################################################\n",
"# 需要完成的事: \n",
"# 对验证集设置学习率和正则化强度。\n",
"# 这与之前SVM中做的类似;\n",
"# 保存训练效果最好的softmax分类器到best_softmax中。\n",
"################################################################################\n",
"# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"pass\n",
"\n",
"# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" \n",
"# Print out results.\n",
"for lr, reg in sorted(results):\n",
" train_accuracy, val_accuracy = results[(lr, reg)]\n",
" print('lr %e reg %e train accuracy: %f val accuracy: %f' % (\n",
" lr, reg, train_accuracy, val_accuracy))\n",
" \n",
"print('best validation accuracy achieved during cross-validation: %f' % best_val)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 在测试集上评估\n",
"# 在测试集上评估最好的softmax\n",
"y_test_pred = best_softmax.predict(X_test)\n",
"test_accuracy = np.mean(y_test == y_test_pred)\n",
"print('softmax on raw pixels final test set accuracy: %f' % (test_accuracy, ))"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题 2** - *对或错*\n",
"\n",
"假设总训练损失定义为所有训练样本中每个数据点损失的总和。可能会有新的数据点添加到训练集中,同时SVM损失保持不变,但是对于Softmax分类器的损失而言,情况并非如此。\n",
"\n",
"$\\color{blue}{\\textit 你的回答:}$\n",
"\n",
"\n",
"$\\color{blue}{\\textit 你的解释:}$\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 可视化每个类别的学习到的权重\n",
"w = best_softmax.W[:-1,:] # strip out the bias\n",
"w = w.reshape(32, 32, 3, 10)\n",
"\n",
"w_min, w_max = np.min(w), np.max(w)\n",
"\n",
"classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']\n",
"for i in range(10):\n",
" plt.subplot(2, 5, i + 1)\n",
" \n",
" # Rescale the weights to be between 0 and 255\n",
" wimg = 255.0 * (w[:, :, :, i].squeeze() - w_min) / (w_max - w_min)\n",
" plt.imshow(wimg.astype('uint8'))\n",
" plt.axis('off')\n",
" plt.title(classes[i])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"# 重要\n",
"\n",
"这里是作业的结尾处,请执行以下步骤:\n",
"\n",
"1. 点击`File -> Save`或者用`control+s`组合键,确保你最新的的notebook的作业已经保存到谷歌云。\n",
"2. 执行以下代码确保 `.py` 文件保存回你的谷歌云。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"FOLDER_TO_SAVE = os.path.join('drive/My Drive/', FOLDERNAME)\n",
"FILES_TO_SAVE = ['daseCV/classifiers/softmax.py']\n",
"\n",
"for files in FILES_TO_SAVE:\n",
" with open(os.path.join(FOLDER_TO_SAVE, '/'.join(files.split('/')[1:])), 'w') as f:\n",
" f.write(''.join(open(files).readlines()))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 1
}

+ 618
- 0
assignment1/svm.ipynb

@@ -0,0 +1,618 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.colab import drive\n",
"\n",
"drive.mount('/content/drive', force_remount=True)\n",
"\n",
"# 输入daseCV所在的路径\n",
"# 'daseCV' 文件夹包括 '.py', 'classifiers' 和'datasets'文件夹\n",
"# 例如 'CV/assignments/assignment1/daseCV/'\n",
"FOLDERNAME = None\n",
"\n",
"assert FOLDERNAME is not None, \"[!] Enter the foldername.\"\n",
"\n",
"%cd drive/My\\ Drive\n",
"%cp -r $FOLDERNAME ../../\n",
"%cd ../../\n",
"%cd daseCV/datasets/\n",
"!bash get_datasets.sh\n",
"%cd ../../"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-title"
]
},
"source": [
"# 多分类支撑向量机练习\n",
"*完成此练习并且上交本ipynb(包含输出及代码).*\n",
"\n",
"在这个练习中,你将会:\n",
" \n",
"- 为SVM构建一个完全向量化的**损失函数**\n",
"- 实现**解析梯度**的向量化表达式\n",
"- 使用数值梯度检查你的代码是否正确\n",
"- 使用验证集**调整学习率和正则化项**\n",
"- 用**SGD(随机梯度下降)** **优化**损失函数\n",
"- **可视化** 最后学习到的权重\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 导入包\n",
"import random\n",
"import numpy as np\n",
"from daseCV.data_utils import load_CIFAR10\n",
"import matplotlib.pyplot as plt\n",
"\n",
"# 下面一行是notebook的magic命令,作用是让matplotlib在notebook内绘图(而不是新建一个窗口)\n",
"%matplotlib inline\n",
"plt.rcParams['figure.figsize'] = (10.0, 8.0) # 设置绘图的默认大小\n",
"plt.rcParams['image.interpolation'] = 'nearest'\n",
"plt.rcParams['image.cmap'] = 'gray'\n",
"\n",
"# 该magic命令可以重载外部的python模块\n",
"# 相关资料可以去看 http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython\n",
"%load_ext autoreload\n",
"%autoreload 2"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"## 准备和预处理CIFAR-10的数据"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 导入原始CIFAR-10数据\n",
"cifar10_dir = 'daseCV/datasets/cifar-10-batches-py'\n",
"\n",
"# 清空变量,防止多次定义变量(可能造成内存问题)\n",
"try:\n",
" del X_train, y_train\n",
" del X_test, y_test\n",
" print('Clear previously loaded data.')\n",
"except:\n",
" pass\n",
"\n",
"X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)\n",
"\n",
"# 完整性检查,打印出训练和测试数据的大小\n",
"print('Training data shape: ', X_train.shape)\n",
"print('Training labels shape: ', y_train.shape)\n",
"print('Test data shape: ', X_test.shape)\n",
"print('Test labels shape: ', y_test.shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 可视化部分数据\n",
"# 这里我们每个类别展示了7张图片\n",
"classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']\n",
"num_classes = len(classes)\n",
"samples_per_class = 7\n",
"for y, cls in enumerate(classes):\n",
" idxs = np.flatnonzero(y_train == y)\n",
" idxs = np.random.choice(idxs, samples_per_class, replace=False)\n",
" for i, idx in enumerate(idxs):\n",
" plt_idx = i * num_classes + y + 1\n",
" plt.subplot(samples_per_class, num_classes, plt_idx)\n",
" plt.imshow(X_train[idx].astype('uint8'))\n",
" plt.axis('off')\n",
" if i == 0:\n",
" plt.title(cls)\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 划分训练集,验证集和测试集,除此之外,\n",
"# 我们从训练集中抽取了一小部分作为代码开发的数据,\n",
"# 使用小批量的开发数据集能够快速开发代码\n",
"num_training = 49000\n",
"num_validation = 1000\n",
"num_test = 1000\n",
"num_dev = 500\n",
"\n",
"# 从原始训练集中抽取出num_validation个样本作为验证集\n",
"mask = range(num_training, num_training + num_validation)\n",
"X_val = X_train[mask]\n",
"y_val = y_train[mask]\n",
"\n",
"# 从原始训练集中抽取出num_training个样本作为训练集\n",
"mask = range(num_training)\n",
"X_train = X_train[mask]\n",
"y_train = y_train[mask]\n",
"\n",
"# 从训练集中抽取num_dev个样本作为开发数据集\n",
"mask = np.random.choice(num_training, num_dev, replace=False)\n",
"X_dev = X_train[mask]\n",
"y_dev = y_train[mask]\n",
"\n",
"# 从原始测试集中抽取num_test个样本作为测试集\n",
"mask = range(num_test)\n",
"X_test = X_test[mask]\n",
"y_test = y_test[mask]\n",
"\n",
"print('Train data shape: ', X_train.shape)\n",
"print('Train labels shape: ', y_train.shape)\n",
"print('Validation data shape: ', X_val.shape)\n",
"print('Validation labels shape: ', y_val.shape)\n",
"print('Test data shape: ', X_test.shape)\n",
"print('Test labels shape: ', y_test.shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 预处理:把图片数据rehspae成行向量\n",
"X_train = np.reshape(X_train, (X_train.shape[0], -1))\n",
"X_val = np.reshape(X_val, (X_val.shape[0], -1))\n",
"X_test = np.reshape(X_test, (X_test.shape[0], -1))\n",
"X_dev = np.reshape(X_dev, (X_dev.shape[0], -1))\n",
"\n",
"# 完整性检查,打印出数据的shape\n",
"print('Training data shape: ', X_train.shape)\n",
"print('Validation data shape: ', X_val.shape)\n",
"print('Test data shape: ', X_test.shape)\n",
"print('dev data shape: ', X_dev.shape)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"# 预处理:减去image的平均值(均值规整化)\n",
"# 第一步:计算训练集中的图像均值\n",
"mean_image = np.mean(X_train, axis=0)\n",
"print(mean_image[:10]) # print a few of the elements\n",
"plt.figure(figsize=(4,4))\n",
"plt.imshow(mean_image.reshape((32,32,3)).astype('uint8')) # visualize the mean image\n",
"plt.show()\n",
"\n",
"# 第二步:所有数据集减去均值\n",
"X_train -= mean_image\n",
"X_val -= mean_image\n",
"X_test -= mean_image\n",
"X_dev -= mean_image\n",
"\n",
"# 第三步:拼接一个bias维,其中所有值都是1(bias trick),\n",
"# SVM可以联合优化数据和bias,即只需要优化一个权值矩阵W\n",
"X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])\n",
"X_val = np.hstack([X_val, np.ones((X_val.shape[0], 1))])\n",
"X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])\n",
"X_dev = np.hstack([X_dev, np.ones((X_dev.shape[0], 1))])\n",
"\n",
"print(X_train.shape, X_val.shape, X_test.shape, X_dev.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## SVM分类器\n",
"\n",
"你需要在**daseCV/classifiers/linear_svm.py**里面完成编码\n",
"\n",
"我们已经预先定义了一个函数`compute_loss_naive`,该函数使用循环来计算多分类SVM损失函数"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 调用朴素版的损失计算函数\n",
"from daseCV.classifiers.linear_svm import svm_loss_naive\n",
"import time\n",
"\n",
"# 生成一个随机的SVM权值矩阵(矩阵值很小)\n",
"W = np.random.randn(3073, 10) * 0.0001 \n",
"\n",
"loss, grad = svm_loss_naive(W, X_dev, y_dev, 0.000005)\n",
"print('loss: %f' % (loss, ))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"从上面的函数返回的`grad`现在是零。请推导支持向量机损失函数的梯度,并在svm_loss_naive中编码实现。\n",
"\n",
"为了检查是否正确地实现了梯度,你可以用数值方法估计损失函数的梯度,并将数值估计与你计算出来的梯度进行比较。我们已经为你提供了检查的代码:"
]
},
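{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the check concrete, the next cell is a minimal sketch of the centered-difference numerical gradient that tools like `grad_check_sparse` are built on. The helper name `numeric_gradient_at` is ours for illustration and is not part of `daseCV`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of a centered-difference numerical gradient check (illustration only).\n",
"import numpy as np\n",
"\n",
"def numeric_gradient_at(f, W, i, j, h=1e-5):\n",
"    # Evaluate f at W[i, j] + h and W[i, j] - h, holding everything else fixed\n",
"    old = W[i, j]\n",
"    W[i, j] = old + h\n",
"    fplus = f(W)\n",
"    W[i, j] = old - h\n",
"    fminus = f(W)\n",
"    W[i, j] = old  # restore the original entry\n",
"    return (fplus - fminus) / (2 * h)\n",
"\n",
"# Toy check on f(W) = sum(W**2), whose analytic gradient is 2*W:\n",
"W_toy = np.random.randn(3, 4)\n",
"print(numeric_gradient_at(lambda W: np.sum(W ** 2), W_toy, 1, 2), 2 * W_toy[1, 2])"
]
},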
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 一旦你实现了梯度计算的功能,重新执行下面的代码检查梯度\n",
"\n",
"# 计算损失和W的梯度\n",
"loss, grad = svm_loss_naive(W, X_dev, y_dev, 0.0)\n",
"\n",
"# 数值估计梯度的方法沿着随机几个维度进行计算,并且和解析梯度进行比较,\n",
"# 这两个方法算出来的梯度应该在任何维度上完全一致(相对误差足够小)\n",
"from daseCV.gradient_check import grad_check_sparse\n",
"f = lambda w: svm_loss_naive(w, X_dev, y_dev, 0.0)[0]\n",
"grad_numerical = grad_check_sparse(f, W, grad)\n",
"\n",
"# 把正则化项打开后继续再检查一遍梯度\n",
"# 你没有忘记正则化项吧?(忘了的罚抄100遍(๑•́ ₃•̀๑) )\n",
"loss, grad = svm_loss_naive(W, X_dev, y_dev, 5e1)\n",
"f = lambda w: svm_loss_naive(w, X_dev, y_dev, 5e1)[0]\n",
"grad_numerical = grad_check_sparse(f, W, grad)"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题 1**\n",
"\n",
"有可能会出现某一个维度上的gradcheck没有完全匹配。这个问题是怎么引起的?有必要担心这个问题么?请举一个简单例子,能够导致梯度检查失败。如何改进这个问题?*提示:SVM的损失函数不是严格可微的*\n",
"\n",
"$\\color{blue}{ 你的回答:}$ *在这里填写* \n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 接下来实现svm_loss_vectorized函数,目前只计算损失\n",
"# 稍后再计算梯度\n",
"tic = time.time()\n",
"loss_naive, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)\n",
"toc = time.time()\n",
"print('Naive loss: %e computed in %fs' % (loss_naive, toc - tic))\n",
"\n",
"from daseCV.classifiers.linear_svm import svm_loss_vectorized\n",
"tic = time.time()\n",
"loss_vectorized, _ = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)\n",
"toc = time.time()\n",
"print('Vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic))\n",
"\n",
"# 两种方法算出来的损失应该是相同的,但是向量化实现的方法应该更快\n",
"print('difference: %f' % (loss_naive - loss_vectorized))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 完成svm_loss_vectorized函数,并用向量化方法计算梯度\n",
"\n",
"# 朴素方法和向量化实现的梯度应该相同,但是向量化方法也应该更快\n",
"tic = time.time()\n",
"_, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)\n",
"toc = time.time()\n",
"print('Naive loss and gradient: computed in %fs' % (toc - tic))\n",
"\n",
"tic = time.time()\n",
"_, grad_vectorized = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)\n",
"toc = time.time()\n",
"print('Vectorized loss and gradient: computed in %fs' % (toc - tic))\n",
"\n",
"# 损失是一个标量,因此很容易比较两种方法算出的值,\n",
"# 而梯度是一个矩阵,所以我们用Frobenius范数来比较梯度的值\n",
"difference = np.linalg.norm(grad_naive - grad_vectorized, ord='fro')\n",
"print('difference: %f' % difference)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### 随机梯度下降(Stochastic Gradient Descent)\n",
"\n",
"我们现在有了向量化的损失函数表达式和梯度表达式,同时我们计算的梯度和数值梯度是匹配的。\n",
"接下来我们要做SGD。"
]
},
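{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before implementing `LinearClassifier.train()`, it may help to see the core loop. The next cell is a minimal sketch of minibatch SGD under the loss-function API used in this notebook (`loss_and_grad(W, X_batch, y_batch)` returning a loss and a gradient); it is a sketch, not the reference implementation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of a minibatch SGD loop for a linear classifier (illustration only).\n",
"import numpy as np\n",
"\n",
"def sgd_sketch(loss_and_grad, W, X, y, learning_rate=1e-7, batch_size=200, num_iters=100):\n",
"    for it in range(num_iters):\n",
"        # Sample a minibatch (with replacement, which is faster and common here)\n",
"        idx = np.random.choice(X.shape[0], batch_size)\n",
"        loss, grad = loss_and_grad(W, X[idx], y[idx])\n",
"        W -= learning_rate * grad  # step in the direction of decreasing loss\n",
"    return W\n",
"\n",
"# Hypothetical usage with the svm_loss_vectorized function from this notebook:\n",
"# W = sgd_sketch(lambda W, X, y: svm_loss_vectorized(W, X, y, 2.5e4),\n",
"#                np.random.randn(3073, 10) * 0.0001, X_train, y_train)"
]
},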
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 在linear_classifier.py文件中,编码实现LinearClassifier.train()中的SGD功能,\n",
"# 运行下面的代码\n",
"from daseCV.classifiers import LinearSVM\n",
"svm = LinearSVM()\n",
"tic = time.time()\n",
"loss_hist = svm.train(X_train, y_train, learning_rate=1e-7, reg=2.5e4,\n",
" num_iters=1500, verbose=True)\n",
"toc = time.time()\n",
"print('That took %fs' % (toc - tic))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 一个有用的debugging技巧是把损失函数画出来\n",
"plt.plot(loss_hist)\n",
"plt.xlabel('Iteration number')\n",
"plt.ylabel('Loss value')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 完成LinearSVM.predict函数,并且在训练集和验证集上评估其准确性\n",
"y_train_pred = svm.predict(X_train)\n",
"print('training accuracy: %f' % (np.mean(y_train == y_train_pred), ))\n",
"y_val_pred = svm.predict(X_val)\n",
"print('validation accuracy: %f' % (np.mean(y_val == y_val_pred), ))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"code"
]
},
"outputs": [],
"source": [
"# 使用验证集来调整超参数(正则化强度和学习率)。\n",
"# 你可以尝试不同的学习速率和正则化项的值;\n",
"# 如果你细心的话,您应该可以在验证集上获得大约0.39的准确率。\n",
"\n",
"# 注意:在搜索超参数时,您可能会看到runtime/overflow的警告。\n",
"# 这是由极端超参值造成的,不是代码的bug。\n",
"\n",
"learning_rates = [1e-7, 5e-5]\n",
"regularization_strengths = [2.5e4, 5e4]\n",
"\n",
"# results是一个字典,把元组(learning_rate, regularization_strength)映射到元组(training_accuracy, validation_accuracy) \n",
"# accuracy是样本中正确分类的比例\n",
"results = {}\n",
"best_val = -1 # 我们迄今为止见过最好的验证集准确率\n",
"best_svm = None # 拥有最高验证集准确率的LinearSVM对象\n",
"\n",
"##############################################################################\n",
"# TODO:\n",
"# 编写代码,通过比较验证集的准确度来选择最佳超参数。\n",
"# 对于每个超参数组合,在训练集上训练一个线性SVM,在训练集和验证集上计算它的精度,\n",
"# 并将精度结果存储在results字典中。此外,在best_val中存储最高验证集准确度,\n",
"# 在best_svm中存储拥有此精度的SVM对象。\n",
"#\n",
"# 提示: \n",
"# 在开发代码时,应该使用一个比较小的num_iter值,这样SVM就不会花费太多时间训练; \n",
"# 一旦您确信您的代码开发完成,您就应该使用一个较大的num_iter值重新训练并验证。\n",
"##############################################################################\n",
"# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"pass\n",
" \n",
"# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
" \n",
"# 打印results\n",
"for lr, reg in sorted(results):\n",
" train_accuracy, val_accuracy = results[(lr, reg)]\n",
" print('lr %e reg %e train accuracy: %f val accuracy: %f' % (\n",
" lr, reg, train_accuracy, val_accuracy))\n",
" \n",
"print('best validation accuracy achieved during cross-validation: %f' % best_val)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"# 可是化交叉验证结果\n",
"import math\n",
"x_scatter = [math.log10(x[0]) for x in results]\n",
"y_scatter = [math.log10(x[1]) for x in results]\n",
"\n",
"# 画出训练集准确率\n",
"marker_size = 100\n",
"colors = [results[x][0] for x in results]\n",
"plt.subplot(2, 1, 1)\n",
"plt.scatter(x_scatter, y_scatter, marker_size, c=colors, cmap=plt.cm.coolwarm)\n",
"plt.colorbar()\n",
"plt.xlabel('log learning rate')\n",
"plt.ylabel('log regularization strength')\n",
"plt.title('CIFAR-10 training accuracy')\n",
"\n",
"# 画出验证集准确率\n",
"colors = [results[x][1] for x in results] # default size of markers is 20\n",
"plt.subplot(2, 1, 2)\n",
"plt.scatter(x_scatter, y_scatter, marker_size, c=colors, cmap=plt.cm.coolwarm)\n",
"plt.colorbar()\n",
"plt.xlabel('log learning rate')\n",
"plt.ylabel('log regularization strength')\n",
"plt.title('CIFAR-10 validation accuracy')\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# 在测试集上测试最好的SVM分类器\n",
"y_test_pred = best_svm.predict(X_test)\n",
"test_accuracy = np.mean(y_test == y_test_pred)\n",
"print('linear SVM on raw pixels final test set accuracy: %f' % test_accuracy)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore-input"
]
},
"outputs": [],
"source": [
"# 画出每一类的权重\n",
"# 基于您选择的学习速度和正则化强度,画出来的可能不好看\n",
"w = best_svm.W[:-1,:] # 去掉bias\n",
"w = w.reshape(32, 32, 3, 10)\n",
"w_min, w_max = np.min(w), np.max(w)\n",
"classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']\n",
"for i in range(10):\n",
" plt.subplot(2, 5, i + 1)\n",
" \n",
" # 将权重调整为0到255之间\n",
" wimg = 255.0 * (w[:, :, :, i].squeeze() - w_min) / (w_max - w_min)\n",
" plt.imshow(wimg.astype('uint8'))\n",
" plt.axis('off')\n",
" plt.title(classes[i])"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题2**\n",
"\n",
"描述你的可视化权值是什么样子的,并提供一个简短的解释为什么它们看起来是这样的。\n",
"\n",
"$\\color{blue}{ 你的回答: }$ *请在这里填写* \n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"# 重要\n",
"\n",
"这里是作业的结尾处,请执行以下步骤:\n",
"\n",
"1. 点击`File -> Save`或者用`control+s`组合键,确保你最新的的notebook的作业已经保存到谷歌云。\n",
"2. 执行以下代码确保 `.py` 文件保存回你的谷歌云。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"FOLDER_TO_SAVE = os.path.join('drive/My Drive/', FOLDERNAME)\n",
"FILES_TO_SAVE = ['daseCV/classifiers/linear_svm.py', 'daseCV/classifiers/linear_classifier.py']\n",
"\n",
"for files in FILES_TO_SAVE:\n",
" with open(os.path.join(FOLDER_TO_SAVE, '/'.join(files.split('/')[1:])), 'w') as f:\n",
" f.write(''.join(open(files).readlines()))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 1
}

+ 560
- 0
assignment1/two_layer_net.ipynb

@@ -0,0 +1,560 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from google.colab import drive\n",
"\n",
"drive.mount('/content/drive', force_remount=True)\n",
"\n",
"# 输入daseCV所在的路径\n",
"# 'daseCV' 文件夹包括 '.py', 'classifiers' 和'datasets'文件夹\n",
"# 例如 'CV/assignments/assignment1/daseCV/'\n",
"FOLDERNAME = None\n",
"\n",
"assert FOLDERNAME is not None, \"[!] Enter the foldername.\"\n",
"\n",
"%cd drive/My\\ Drive\n",
"%cp -r $FOLDERNAME ../../\n",
"%cd ../../\n",
"%cd daseCV/datasets/\n",
"!bash get_datasets.sh\n",
"%cd ../../"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-title"
]
},
"source": [
"# 实现一个神经网络\n",
"\n",
"在这个练习中,我们将开发一个具有全连接层的神经网络来进行分类任务,并在CIFAR-10数据集上进行测试。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 一些初始化设置\n",
"\n",
"import numpy as np\n",
"import matplotlib.pyplot as plt\n",
"\n",
"from daseCV.classifiers.neural_net import TwoLayerNet\n",
"\n",
"%matplotlib inline\n",
"plt.rcParams['figure.figsize'] = (10.0, 8.0) # 设置默认绘图大小\n",
"plt.rcParams['image.interpolation'] = 'nearest'\n",
"plt.rcParams['image.cmap'] = 'gray'\n",
"\n",
"# 自动重载外部模块的详细资料可以查看下面链接\n",
"# http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython\n",
"%load_ext autoreload\n",
"%autoreload 2\n",
"\n",
"def rel_error(x, y):\n",
" \"\"\" returns relative error \"\"\"\n",
" return np.max(np.abs(x - y) / (np.maximum(1e-8, np.abs(x) + np.abs(y))))"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-ignore"
]
},
"source": [
"在文件`daseCV/classifiers/neural_net`中使用一个类`TwoLayerNet`表示我们的网络实例。网络参数存储在实例变量`self.params`中, 其中键是参数名,值是numpy数组。\n",
"下面,我们初始化玩具数据和一个玩具模型,我们将使用它来开发具体代码。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"# 创建一个小网络和一些玩具数据\n",
"# 注意,我们设置了可重复实验的随机种子。\n",
"\n",
"input_size = 4\n",
"hidden_size = 10\n",
"num_classes = 3\n",
"num_inputs = 5\n",
"\n",
"def init_toy_model():\n",
" np.random.seed(0)\n",
" return TwoLayerNet(input_size, hidden_size, num_classes, std=1e-1)\n",
"\n",
"def init_toy_data():\n",
" np.random.seed(1)\n",
" X = 10 * np.random.randn(num_inputs, input_size)\n",
" y = np.array([0, 1, 2, 2, 1])\n",
" return X, y\n",
"\n",
"net = init_toy_model()\n",
"X, y = init_toy_data()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 前向传播:计算scores\n",
"\n",
"打开文件`daseCV/classifiers/neural_net`,查看`TwoLayerNet.loss`函数。这个函数与你之前在SVM和Softmax写的损失函数非常相似:输入数据和权重,计算类别的scores、loss和参数上的梯度。\n",
"\n",
"实现前向传播的第一部分:使用权重和偏差来计算所有输入的scores。"
]
},
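{
"cell_type": "markdown",
"metadata": {},
"source": [
"For intuition, the next cell sketches the affine - ReLU - affine forward pass on toy shapes matching the toy model above. It is an illustration only, not the `TwoLayerNet` implementation (which also handles the loss and gradients)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of a two-layer forward pass (affine - ReLU - affine), illustration only.\n",
"import numpy as np\n",
"\n",
"def forward_sketch(X, W1, b1, W2, b2):\n",
"    hidden = np.maximum(0, X @ W1 + b1)  # ReLU hidden layer, shape (N, H)\n",
"    scores = hidden @ W2 + b2            # class scores, shape (N, C)\n",
"    return scores, hidden\n",
"\n",
"N, D, H, C = 5, 4, 10, 3\n",
"rng = np.random.RandomState(0)\n",
"scores_demo, _ = forward_sketch(rng.randn(N, D), rng.randn(D, H), np.zeros(H),\n",
"                                rng.randn(H, C), np.zeros(C))\n",
"print(scores_demo.shape)  # (5, 3)"
]
},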
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"scores = net.loss(X)\n",
"print('Your scores:')\n",
"print(scores)\n",
"print()\n",
"print('correct scores:')\n",
"correct_scores = np.asarray([\n",
" [-0.81233741, -1.27654624, -0.70335995],\n",
" [-0.17129677, -1.18803311, -0.47310444],\n",
" [-0.51590475, -1.01354314, -0.8504215 ],\n",
" [-0.15419291, -0.48629638, -0.52901952],\n",
" [-0.00618733, -0.12435261, -0.15226949]])\n",
"print(correct_scores)\n",
"print()\n",
"\n",
"# The difference should be very small. We get < 1e-7\n",
"print('Difference between your scores and correct scores:')\n",
"print(np.sum(np.abs(scores - correct_scores)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 反向传播: 计算损失\n",
"\n",
"在同一个函数中,编码实现第二个部分,计算损失值。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"loss, _ = net.loss(X, y, reg=0.05) #reg为0.1\n",
"correct_loss = 1.30378789133\n",
"\n",
"# should be very small, we get < 1e-12\n",
"print('Difference between your loss and correct loss:')\n",
"print(np.sum(np.abs(loss - correct_loss)))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 反向传播\n",
"\n",
"实现函数的其余部分。计算关于变量`W1`, `b1`, `W2`, `b2`的梯度。当你正确实现了前向传播的代码后(hopefully!),你可以用数值梯度检查debug你的反向传播:"
]
},
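{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a reference for the chain rule involved, the next cell sketches the backward pass for an affine - ReLU - affine network with a softmax loss. Regularization gradients are omitted for brevity; this is an illustration, not the `TwoLayerNet` reference implementation."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch of the backward pass for the two-layer net with softmax loss\n",
"# (illustration only; regularization terms omitted for brevity).\n",
"import numpy as np\n",
"\n",
"def backward_sketch(X, y, W1, b1, W2, b2):\n",
"    N = X.shape[0]\n",
"    hidden = np.maximum(0, X @ W1 + b1)\n",
"    scores = hidden @ W2 + b2\n",
"    # Softmax probabilities (numerically stable)\n",
"    probs = np.exp(scores - scores.max(axis=1, keepdims=True))\n",
"    probs /= probs.sum(axis=1, keepdims=True)\n",
"    # Gradient of the mean cross-entropy loss with respect to the scores\n",
"    dscores = probs.copy()\n",
"    dscores[np.arange(N), y] -= 1\n",
"    dscores /= N\n",
"    dW2 = hidden.T @ dscores\n",
"    db2 = dscores.sum(axis=0)\n",
"    dhidden = dscores @ W2.T\n",
"    dhidden[hidden <= 0] = 0  # backprop through the ReLU\n",
"    dW1 = X.T @ dhidden\n",
"    db1 = dhidden.sum(axis=0)\n",
"    return dW1, db1, dW2, db2\n",
"\n",
"rng = np.random.RandomState(0)\n",
"grads_demo = backward_sketch(rng.randn(5, 4), np.array([0, 1, 2, 2, 1]),\n",
"                             rng.randn(4, 10), np.zeros(10), rng.randn(10, 3), np.zeros(3))\n",
"print([g.shape for g in grads_demo])  # [(4, 10), (10,), (10, 3), (3,)]"
]
},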
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from daseCV.gradient_check import eval_numerical_gradient\n",
"\n",
"# 使用数值梯度检查反向传播的代码。\n",
"# 如果你的代码是正确的,那么对于W1、W2、b1和b2,\n",
"# 数值梯度和解析梯度之间的差异应该小于1e-8。\n",
"\n",
"loss, grads = net.loss(X, y, reg=0.05)\n",
"\n",
"# these should all be less than 1e-8 or so\n",
"for param_name in grads:\n",
" f = lambda W: net.loss(X, y, reg=0.05)[0]\n",
" param_grad_num = eval_numerical_gradient(f, net.params[param_name], verbose=False)\n",
" print('%s max relative error: %e' % (param_name, rel_error(param_grad_num, grads[param_name])))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 训练网络\n",
"\n",
"我们使用随机梯度下降(SGD)训练网络,类似于SVM和Softmax。查看`TwoLayerNet.train`函数并填写训练代码中缺失的部分。这与SVM和Softmax分类器的训练过程非常相似。您还必须实现`TwoLayerNet.predict`,即在网络训练过程中周期性地进行预测,以持续追踪网络的准确率\n",
"\n",
"当你完成了这个函数吼,运行下面的代码,在玩具数据上训练一个两层网络。你的训练损失应该少于0.02。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"net = init_toy_model()\n",
"stats = net.train(X, y, X, y,\n",
" learning_rate=1e-1, reg=5e-6,\n",
" num_iters=100, verbose=False)\n",
"\n",
"print('Final training loss: ', stats['loss_history'][-1])\n",
"\n",
"# plot the loss history\n",
"plt.plot(stats['loss_history'])\n",
"plt.xlabel('iteration')\n",
"plt.ylabel('training loss')\n",
"plt.title('Training Loss history')\n",
"plt.show()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 加载数据\n",
"\n",
"现在你已经实现了一个两层的神经网络,通过了梯度检查,并且在玩具数据有效工作,现在可以加载我们喜欢的CIFAR-10数据了(我不喜欢(╯‵□′)╯︵┴─┴ ),这样就可以训练真实数据集上的分类器。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"pdf-ignore"
]
},
"outputs": [],
"source": [
"from daseCV.data_utils import load_CIFAR10\n",
"\n",
"def get_CIFAR10_data(num_training=49000, num_validation=1000, num_test=1000):\n",
" \"\"\"\n",
" Load the CIFAR-10 dataset from disk and perform preprocessing to prepare\n",
" it for the two-layer neural net classifier. These are the same steps as\n",
" we used for the SVM, but condensed to a single function. \n",
" \"\"\"\n",
" # Load the raw CIFAR-10 data\n",
" cifar10_dir = 'daseCV/datasets/cifar-10-batches-py'\n",
" \n",
" # 清除变量,防止多次加载数据(这可能会导致内存问题)\n",
" try:\n",
" del X_train, y_train\n",
" del X_test, y_test\n",
" print('Clear previously loaded data.')\n",
" except:\n",
" pass\n",
"\n",
" X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)\n",
" \n",
" # Subsample the data\n",
" mask = list(range(num_training, num_training + num_validation))\n",
" X_val = X_train[mask]\n",
" y_val = y_train[mask]\n",
" mask = list(range(num_training))\n",
" X_train = X_train[mask]\n",
" y_train = y_train[mask]\n",
" mask = list(range(num_test))\n",
" X_test = X_test[mask]\n",
" y_test = y_test[mask]\n",
"\n",
" # Normalize the data: subtract the mean image\n",
" mean_image = np.mean(X_train, axis=0)\n",
" X_train -= mean_image\n",
" X_val -= mean_image\n",
" X_test -= mean_image\n",
"\n",
" # Reshape data to rows\n",
" X_train = X_train.reshape(num_training, -1)\n",
" X_val = X_val.reshape(num_validation, -1)\n",
" X_test = X_test.reshape(num_test, -1)\n",
"\n",
" return X_train, y_train, X_val, y_val, X_test, y_test\n",
"\n",
"\n",
"# Invoke the above function to get our data.\n",
"X_train, y_train, X_val, y_val, X_test, y_test = get_CIFAR10_data()\n",
"print('Train data shape: ', X_train.shape)\n",
"print('Train labels shape: ', y_train.shape)\n",
"print('Validation data shape: ', X_val.shape)\n",
"print('Validation labels shape: ', y_val.shape)\n",
"print('Test data shape: ', X_test.shape)\n",
"print('Test labels shape: ', y_test.shape)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 训练网络\n",
"\n",
"我们使用SGD训练网络。此外,在训练过程中,我们采用指数学习率衰减计划,把学习率乘以衰减率来降低学习率。"
]
},
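{
"cell_type": "markdown",
"metadata": {},
"source": [
"The schedule below illustrates what `learning_rate_decay` is assumed to do: once per epoch the learning rate is multiplied by the decay factor, so it shrinks geometrically. This is a sketch; the exact point at which `TwoLayerNet.train` applies the decay is defined in `neural_net.py`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative exponential learning-rate schedule.\n",
"lr, decay = 1e-4, 0.95\n",
"for epoch in range(1, 6):\n",
"    print('epoch %d: learning rate %.3e' % (epoch, lr))\n",
"    lr *= decay"
]
},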
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"code"
]
},
"outputs": [],
"source": [
"input_size = 32 * 32 * 3\n",
"hidden_size = 50\n",
"num_classes = 10\n",
"net = TwoLayerNet(input_size, hidden_size, num_classes)\n",
"\n",
"# Train the network\n",
"stats = net.train(X_train, y_train, X_val, y_val,\n",
" num_iters=1000, batch_size=200,\n",
" learning_rate=1e-4, learning_rate_decay=0.95,\n",
" reg=0.25, verbose=True)\n",
"\n",
"# Predict on the validation set\n",
"val_acc = (net.predict(X_val) == y_val).mean()\n",
"print('Validation accuracy: ', val_acc)\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Debug 训练过程\n",
"\n",
"使用默认参数,验证集的验证精度应该在0.29左右。太差了\n",
"\n",
"解决这个问题的一种策略是在训练过程中绘制损失函数, 以及训练集和验证集的准确度。\n",
"\n",
"另一种策略是把网络的第一层权重可视化。在大多数以视觉数据为训练对象的神经网络中,第一层的权值在可视化时通常会显示有趣的结构。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Plot the loss function and train / validation accuracies\n",
"plt.subplot(2, 1, 1)\n",
"plt.plot(stats['loss_history'])\n",
"plt.title('Loss history')\n",
"plt.xlabel('Iteration')\n",
"plt.ylabel('Loss')\n",
"\n",
"plt.subplot(2, 1, 2)\n",
"plt.plot(stats['train_acc_history'], label='train')\n",
"plt.plot(stats['val_acc_history'], label='val')\n",
"plt.title('Classification accuracy history')\n",
"plt.xlabel('Epoch')\n",
"plt.ylabel('Classification accuracy')\n",
"plt.legend()\n",
"plt.show()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from daseCV.vis_utils import visualize_grid\n",
"\n",
"# Visualize the weights of the network\n",
"\n",
"def show_net_weights(net):\n",
" W1 = net.params['W1']\n",
" W1 = W1.reshape(32, 32, 3, -1).transpose(3, 0, 1, 2)\n",
" plt.imshow(visualize_grid(W1, padding=3).astype('uint8'))\n",
" plt.gca().axis('off')\n",
" plt.show()\n",
"\n",
"show_net_weights(net)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 调整超参数\n",
"\n",
"**What's wrong?**. 查看上面的可视化,我们可以看到损失或多或少是线性下降的,这似乎表明学习率可能太小了。此外,训练的准确度和验证的准确度之间没有差距,这说明我们使用的模型容量较小,我们应该增加模型的大小。另一方面,对于一个非常大的模型,我们期望看到更多的过拟合,这表现为训练和验证准确度之间有非常大的差距。\n",
"\n",
"**Tuning**. 调整超参数并了解它们如何影响最终的性能是使用神经网络的一个重要部分,因此我们希望你进行大量实践。下面,你应该试验各种超参数的不同值,包括隐层大小、学习率、训练周期数和正则化强度。你也可以考虑调整学习速率衰减,但是这个实验中默认值应该能够获得良好的性能。\n",
"\n",
"**Approximate results**. 你应该在验证集上获得超过48%的分类准确率。我们最好的模型在验证集上获得超过52%的准确率。\n",
"\n",
"**Experiment**: 在这个练习中,你的任务是使用一个全连接的神经网络,在CIFAR-10上获得尽可能好的结果(52%可以作为参考)。您可以自由地实现自己的技术(例如,使用PCA来降低维度,或添加dropout,或添加特征,等等)。"
]
},
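{
"cell_type": "markdown",
"metadata": {},
"source": [
"One common way to automate the sweep is a simple grid (or random) search over the hyperparameters listed above. Below is a hedged sketch of the loop structure; it assumes the `TwoLayerNet` constructor and `train` signature used earlier in this notebook, and the value ranges are placeholders, not recommendations:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import itertools\n",
"\n",
"# Illustrative grid-search skeleton (placeholder ranges, not tuned values).\n",
"hidden_sizes = [50, 100]\n",
"learning_rates = [1e-4, 1e-3]\n",
"regs = [0.25, 0.5]\n",
"\n",
"results = {}\n",
"for hs, lr, rg in itertools.product(hidden_sizes, learning_rates, regs):\n",
"    net = TwoLayerNet(32 * 32 * 3, hs, 10)\n",
"    stats = net.train(X_train, y_train, X_val, y_val,\n",
"                      num_iters=1000, batch_size=200,\n",
"                      learning_rate=lr, learning_rate_decay=0.95,\n",
"                      reg=rg, verbose=False)\n",
"    val_acc = (net.predict(X_val) == y_val).mean()\n",
"    results[(hs, lr, rg)] = val_acc\n",
"    print('hs %d, lr %.0e, reg %.2f -> val acc %.3f' % (hs, lr, rg, val_acc))"
]
},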
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**在下面说明你的超参数搜索过程**\n",
"\n",
"$\\color{blue}{你的回答: }$"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"tags": [
"code"
]
},
"outputs": [],
"source": [
"best_net = None # store the best model into this \n",
"\n",
"#################################################################################\n",
"# TODO:使用验证集调整超参数。 将您的最佳模型存储在best_net中。\n",
"# 使用上面用过的可视化手段可能能够帮助你调试网络。\n",
"# 可视化结果与上面比较差的网络有明显的差别。\n",
"# 手工调整超参数可能很有趣,但是你会发现编写代码自动扫描超参数的可能组合会很有用。 \n",
"#################################################################################\n",
"# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n",
"\n",
"pass\n",
"\n",
"# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# visualize the weights of the best network\n",
"show_net_weights(best_net)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# 在测试集上面测试\n",
"\n",
"当你完成实验时,你可以在测试集上评估你最终的模型;你应该得到48%以上的准确度。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"test_acc = (best_net.predict(X_test) == y_test).mean()\n",
"print('Test accuracy: ', test_acc)"
]
},
{
"cell_type": "markdown",
"metadata": {
"tags": [
"pdf-inline"
]
},
"source": [
"**问题 2**\n",
"\n",
"\n",
"现在您已经完成训练了一个神经网络分类器,您可能会发现您的测试精度远远低于训练精度。我们可以用什么方法来缩小这种差距?选出下列正确的选项\n",
"\n",
"1. 在更大的数据集上训练\n",
"2. 增加更多的隐藏单元\n",
"3. 增加正则化强度\n",
"4. 其他\n",
"\n",
"$\\color{blue}{\\textit Your Answer:}$\n",
"\n",
"$\\color{blue}{\\textit Your Explanation:}$\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"# 重要\n",
"\n",
"这里是作业的结尾处,请执行以下步骤:\n",
"\n",
"1. 点击`File -> Save`或者用`control+s`组合键,确保你最新的的notebook的作业已经保存到谷歌云。\n",
"2. 执行以下代码确保 `.py` 文件保存回你的谷歌云。"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"FOLDER_TO_SAVE = os.path.join('drive/My Drive/', FOLDERNAME)\n",
"FILES_TO_SAVE = ['daseCV/classifiers/neural_net.py']\n",
"\n",
"for files in FILES_TO_SAVE:\n",
" with open(os.path.join(FOLDER_TO_SAVE, '/'.join(files.split('/')[1:])), 'w') as f:\n",
" f.write(''.join(open(files).readlines()))"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.0"
}
},
"nbformat": 4,
"nbformat_minor": 1
}
