Complete this worksheet and hand it in (including its outputs and any supporting code outside of the worksheet) with your assignment submission. For more details, see the assignments page on the course website.

In this exercise you will:

  • implement a fully-vectorized loss function for the SVM (the multiclass hinge loss, restated below)
  • implement the fully-vectorized expression for its analytic gradient
  • check your implementation using numerical gradient
  • use a validation set to tune the learning rate and regularization strength
  • optimize the loss function with SGD
  • visualize the final learned weights
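
For reference, the objective implemented here is the standard course formulation of the multiclass SVM loss with margin $\Delta = 1$; writing $s = x_i W$ for the class scores of an example $x_i$ with label $y_i$, the per-example loss and the full regularized objective are:

$$L_i = \sum_{j \neq y_i} \max\big(0,\; s_j - s_{y_i} + \Delta\big), \qquad L = \frac{1}{N}\sum_{i=1}^{N} L_i + \lambda \sum_{k}\sum_{l} W_{k,l}^{2}$$
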
```python
# Run some setup code for this notebook.
import random
import numpy as np
from cs231n.data_utils import load_CIFAR10
import matplotlib.pyplot as plt

# This is a bit of magic to make matplotlib figures appear inline in the
# notebook rather than in a new window.
%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0)  # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

# Some more magic so that the notebook will reload external python modules;
# see http://stackoverflow.com/questions/1907993/autoreload-of-modules-in-ipython
%load_ext autoreload
%autoreload 2
```

```
The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload
```

CIFAR-10 Data Loading and Preprocessing

```python
# Load the raw CIFAR-10 data.
cifar10_dir = 'cs231n/datasets/cifar-10-batches-py'

# Cleaning up variables to prevent loading data multiple times (which may cause memory issues)
try:
    del X_train, y_train
    del X_test, y_test
    print('Clear previously loaded data.')
except:
    pass

X_train, y_train, X_test, y_test = load_CIFAR10(cifar10_dir)

# As a sanity check, we print out the size of the training and test data.
print('Training data shape: ', X_train.shape)
print('Training labels shape: ', y_train.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)
```

```
Clear previously loaded data.
Training data shape:  (50000, 32, 32, 3)
Training labels shape:  (50000,)
Test data shape:  (10000, 32, 32, 3)
Test labels shape:  (10000,)
```

```python
# Visualize some examples from the dataset.
# We show a few examples of training images from each class.
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
num_classes = len(classes)
samples_per_class = 7
for y, cls in enumerate(classes):
    idxs = np.flatnonzero(y_train == y)
    idxs = np.random.choice(idxs, samples_per_class, replace=False)
    for i, idx in enumerate(idxs):
        plt_idx = i * num_classes + y + 1
        plt.subplot(samples_per_class, num_classes, plt_idx)
        plt.imshow(X_train[idx].astype('uint8'))
        plt.axis('off')
        if i == 0:
            plt.title(cls)
plt.show()
```

![svm_4_0.png](svm_files/svm_4_0.png)

```python
# Split the data into train, val, and test sets. In addition we will
# create a small development set as a subset of the training data;
# we can use this for development so our code runs faster.
num_training = 49000
num_validation = 1000
num_test = 1000
num_dev = 500

# Our validation set will be num_validation points from the original
# training set.
mask = range(num_training, num_training + num_validation)
X_val = X_train[mask]
y_val = y_train[mask]

# Our training set will be the first num_training points from the original
# training set.
mask = range(num_training)
X_train = X_train[mask]
y_train = y_train[mask]

# We will also make a development set, which is a small subset of
# the training set.
mask = np.random.choice(num_training, num_dev, replace=False)
X_dev = X_train[mask]
y_dev = y_train[mask]

# We use the first num_test points of the original test set as our
# test set.
mask = range(num_test)
X_test = X_test[mask]
y_test = y_test[mask]

print('Train data shape: ', X_train.shape)
print('Train labels shape: ', y_train.shape)
print('Validation data shape: ', X_val.shape)
print('Validation labels shape: ', y_val.shape)
print('Test data shape: ', X_test.shape)
print('Test labels shape: ', y_test.shape)
```

```
Train data shape:  (49000, 32, 32, 3)
Train labels shape:  (49000,)
Validation data shape:  (1000, 32, 32, 3)
Validation labels shape:  (1000,)
Test data shape:  (1000, 32, 32, 3)
Test labels shape:  (1000,)
```

```python
# Preprocessing: reshape the image data into rows
X_train = np.reshape(X_train, (X_train.shape[0], -1))
X_val = np.reshape(X_val, (X_val.shape[0], -1))
X_test = np.reshape(X_test, (X_test.shape[0], -1))
X_dev = np.reshape(X_dev, (X_dev.shape[0], -1))

# As a sanity check, print out the shapes of the data
print('Training data shape: ', X_train.shape)
print('Validation data shape: ', X_val.shape)
print('Test data shape: ', X_test.shape)
print('dev data shape: ', X_dev.shape)
```

```
Training data shape:  (49000, 3072)
Validation data shape:  (1000, 3072)
Test data shape:  (1000, 3072)
dev data shape:  (500, 3072)
```

```python
# Preprocessing: subtract the mean image
# first: compute the image mean based on the training data
mean_image = np.mean(X_train, axis=0)
print(mean_image[:10])  # print a few of the elements
plt.figure(figsize=(4, 4))
plt.imshow(mean_image.reshape((32, 32, 3)).astype('uint8'))  # visualize the mean image
plt.show()

# second: subtract the mean image from train and test data
X_train -= mean_image
X_val -= mean_image
X_test -= mean_image
X_dev -= mean_image

# third: append the bias dimension of ones (i.e. bias trick) so that our SVM
# only has to worry about optimizing a single weight matrix W.
X_train = np.hstack([X_train, np.ones((X_train.shape[0], 1))])
X_val = np.hstack([X_val, np.ones((X_val.shape[0], 1))])
X_test = np.hstack([X_test, np.ones((X_test.shape[0], 1))])
X_dev = np.hstack([X_dev, np.ones((X_dev.shape[0], 1))])

print(X_train.shape, X_val.shape, X_test.shape, X_dev.shape)
```

```
[130.64189796 135.98173469 132.47391837 130.05569388 135.34804082
 131.75402041 130.96055102 136.14328571 132.47636735 131.48467347]
```

![svm_7_1.png](svm_files/svm_7_1.png)

```
(49000, 3073) (1000, 3073) (1000, 3073) (500, 3073)
```

SVM Classifier

Your code for this section will all be written inside cs231n/classifiers/linear_svm.py.

As you can see, we have prefilled the function svm_loss_naive, which uses for loops to evaluate the multiclass SVM loss function.
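
For orientation, here is a loss-only sketch of what such a looped implementation computes. This is a paraphrase under the usual conventions (margin $\Delta = 1$, L2 regularization), not the exact prefilled course code; the function name is illustrative.

```python
import numpy as np

def svm_loss_naive_sketch(W, X, y, reg):
    """Multiclass SVM loss with explicit loops (loss only; gradient omitted).

    W: (D, C) weights; X: (N, D) rows of data; y: (N,) labels; reg: L2 strength.
    """
    num_classes = W.shape[1]
    num_train = X.shape[0]
    loss = 0.0
    for i in range(num_train):
        scores = X[i].dot(W)                  # class scores for example i
        correct_class_score = scores[y[i]]
        for j in range(num_classes):
            if j == y[i]:
                continue                      # the correct class contributes no loss
            margin = scores[j] - correct_class_score + 1  # delta = 1
            if margin > 0:
                loss += margin
    loss /= num_train                         # average over the batch
    loss += reg * np.sum(W * W)               # L2 regularization penalty
    return loss
```
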

```python
# Evaluate the naive implementation of the loss we provided for you:
from cs231n.classifiers.linear_svm import svm_loss_naive
import time

# generate a random SVM weight matrix of small numbers
W = np.random.randn(3073, 10) * 0.0001

loss, grad = svm_loss_naive(W, X_dev, y_dev, 0.000005)
print('loss: %f' % (loss, ))
print('shape of grad: ', grad.shape)
```

```
loss: 8.702263
shape of grad:  (3073, 10)
```

The grad returned from the function above is currently all zero. Derive the gradient of the SVM cost function and implement it inline inside the function svm_loss_naive. You will find it helpful to interleave your new code with the existing code in that function.
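
For the hinge loss above (margin $\Delta = 1$), the per-example gradient with respect to each column $w_j$ of $W$ is the standard result:

$$\nabla_{w_j} L_i = \mathbb{1}\big(s_j - s_{y_i} + \Delta > 0\big)\, x_i \quad (j \neq y_i), \qquad \nabla_{w_{y_i}} L_i = -\Big(\sum_{j \neq y_i} \mathbb{1}\big(s_j - s_{y_i} + \Delta > 0\big)\Big)\, x_i$$

with an additional $2\lambda W$ term coming from the L2 regularization.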

To check that you have implemented the gradient correctly, you can numerically estimate the gradient of the loss function and compare the numeric estimate to the gradient that you computed. We have provided code that does this for you:
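
The idea behind the provided checker is a centered difference at randomly sampled coordinates. A minimal sketch of that idea (not the course's grad_check_sparse itself, whose sampling and error reporting may differ):

```python
import numpy as np

def numeric_gradient_at(f, W, ix, h=1e-5):
    """Centered-difference estimate of the partial derivative of f at index ix.

    f maps a weight matrix to a scalar loss; ix is a tuple index into W.
    """
    old = W[ix]
    W[ix] = old + h
    fxph = f(W)        # loss at W + h (in coordinate ix only)
    W[ix] = old - h
    fxmh = f(W)        # loss at W - h
    W[ix] = old        # restore the original value
    return (fxph - fxmh) / (2 * h)

# Hypothetical usage: compare against the analytic gradient at a random entry.
# ix = tuple(np.random.randint(d) for d in W.shape)
# num = numeric_gradient_at(f, W, ix)
# rel_err = abs(num - grad[ix]) / max(abs(num), abs(grad[ix]))
```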

```python
# Once you've implemented the gradient, recompute it with the code below
# and gradient check it with the function we provided for you

# Compute the loss and its gradient at W.
loss, grad = svm_loss_naive(W, X_dev, y_dev, 0.0)

# Numerically compute the gradient along several randomly chosen dimensions, and
# compare them with your analytically computed gradient. The numbers should match
# almost exactly along all dimensions.
from cs231n.gradient_check import grad_check_sparse
f = lambda w: svm_loss_naive(w, X_dev, y_dev, 0.0)[0]
grad_numerical = grad_check_sparse(f, W, grad)

# do the gradient check once again with regularization turned on
# you didn't forget the regularization gradient, did you?
loss, grad = svm_loss_naive(W, X_dev, y_dev, 5e1)
f = lambda w: svm_loss_naive(w, X_dev, y_dev, 5e1)[0]
grad_numerical = grad_check_sparse(f, W, grad)
```

```
numerical: -2.097039 analytic: -2.097039, relative error: 1.632850e-12
numerical: -13.150595 analytic: -13.150595, relative error: 8.383147e-12
numerical: 35.293383 analytic: 35.293383, relative error: 1.306768e-11
numerical: 11.074474 analytic: 11.074474, relative error: 2.002911e-11
numerical: 9.746951 analytic: 9.746951, relative error: 1.532373e-11
numerical: -0.897797 analytic: -0.897797, relative error: 1.196728e-11
numerical: -7.943663 analytic: -7.943663, relative error: 2.130405e-11
numerical: -16.034647 analytic: -16.034647, relative error: 1.573243e-11
numerical: 8.443374 analytic: 8.443374, relative error: 1.431563e-11
numerical: -3.365248 analytic: -3.365248, relative error: 1.020220e-10
numerical: -18.147710 analytic: -18.147710, relative error: 6.067783e-13
numerical: -11.426479 analytic: -11.426479, relative error: 6.281970e-12
numerical: -3.141528 analytic: -3.141528, relative error: 1.164299e-10
numerical: 0.282185 analytic: 0.282185, relative error: 7.727437e-10
numerical: -10.892570 analytic: -10.892570, relative error: 7.020010e-12
numerical: -3.616890 analytic: -3.616890, relative error: 2.809321e-11
numerical: -30.501798 analytic: -30.501798, relative error: 3.546797e-12
numerical: -9.108684 analytic: -9.108684, relative error: 3.362022e-11
numerical: 8.114566 analytic: 8.114566, relative error: 1.355445e-11
numerical: 12.924950 analytic: 12.924950, relative error: 2.482065e-11
```

Inline Question 1

It is possible that once in a while a dimension in the gradcheck will not match exactly. What could such a discrepancy be caused by? Is it a reason for concern? What is a simple example in one dimension where a gradient check could fail? How would changing the margin affect the frequency of this happening? Hint: the SVM loss function is not, strictly speaking, differentiable.


```python
# Next implement the function svm_loss_vectorized; for now only compute the loss;
# we will implement the gradient in a moment.
tic = time.time()
loss_naive, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Naive loss: %e computed in %fs' % (loss_naive, toc - tic))

from cs231n.classifiers.linear_svm import svm_loss_vectorized
tic = time.time()
loss_vectorized, _ = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Vectorized loss: %e computed in %fs' % (loss_vectorized, toc - tic))

# The losses should match but your vectorized implementation should be much faster.
print('difference: %f' % (loss_naive - loss_vectorized))
```

```
Naive loss: 8.702263e+00 computed in 0.078789s
Vectorized loss: 8.702263e+00 computed in 0.007735s
difference: -0.000000
```
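
One way to vectorize the loss (one of several equivalent formulations; the helper name here is illustrative, not the required course solution) is to compute all N×C margins at once:

```python
import numpy as np

def svm_loss_vectorized_sketch(W, X, y, reg):
    """Hinge loss with no explicit loops over examples or classes."""
    num_train = X.shape[0]
    scores = X.dot(W)                                       # (N, C) class scores
    correct = scores[np.arange(num_train), y]               # (N,) correct-class scores
    margins = np.maximum(0, scores - correct[:, None] + 1)  # delta = 1
    margins[np.arange(num_train), y] = 0                    # zero out the correct class
    return margins.sum() / num_train + reg * np.sum(W * W)
```
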
```python
# Complete the implementation of svm_loss_vectorized, and compute the gradient
# of the loss function in a vectorized way.

# The naive implementation and the vectorized implementation should match, but
# the vectorized version should still be much faster.
tic = time.time()
_, grad_naive = svm_loss_naive(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Naive loss and gradient: computed in %fs' % (toc - tic))

tic = time.time()
_, grad_vectorized = svm_loss_vectorized(W, X_dev, y_dev, 0.000005)
toc = time.time()
print('Vectorized loss and gradient: computed in %fs' % (toc - tic))

# The loss is a single number, so it is easy to compare the values computed
# by the two implementations. The gradient on the other hand is a matrix, so
# we use the Frobenius norm to compare them.
difference = np.linalg.norm(grad_naive - grad_vectorized, ord='fro')
print('difference: %f' % difference)
```

```
Naive loss and gradient: computed in 0.088396s
Vectorized loss and gradient: computed in 0.004990s
difference: 0.000000
```
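
The gradient vectorizes the same way via a 0/1 "margin violated" indicator matrix; a sketch under the same conventions as the loss sketch above:

```python
import numpy as np

def svm_grad_vectorized_sketch(W, X, y, reg):
    """Analytic hinge-loss gradient using an indicator matrix (no loops)."""
    num_train = X.shape[0]
    scores = X.dot(W)
    correct = scores[np.arange(num_train), y]
    margins = np.maximum(0, scores - correct[:, None] + 1)
    margins[np.arange(num_train), y] = 0

    mask = (margins > 0).astype(float)                  # 1 where a class violates the margin
    mask[np.arange(num_train), y] = -mask.sum(axis=1)   # correct class: minus the violation count
    return X.T.dot(mask) / num_train + 2 * reg * W      # (D, C) gradient
```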

Stochastic Gradient Descent

We now have vectorized, efficient expressions for the loss and the gradient, and our analytic gradient matches the numerical one. We are therefore ready to run SGD to minimize the loss.
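
A minimal sketch of what that train() loop might look like (vanilla minibatch SGD; the batch size, sampling with replacement, and helper name are assumptions, not the required solution):

```python
import numpy as np

def sgd_train_sketch(W, X, y, loss_fn, learning_rate=1e-7, reg=2.5e4,
                     num_iters=1500, batch_size=200):
    """Vanilla minibatch SGD, shaped like LinearClassifier.train().

    loss_fn(W, X_batch, y_batch, reg) -> (loss, grad), e.g. svm_loss_vectorized.
    W is updated in place and should be a float array.
    """
    num_train = X.shape[0]
    loss_history = []
    for it in range(num_iters):
        idx = np.random.choice(num_train, batch_size, replace=True)  # with replacement: faster
        loss, grad = loss_fn(W, X[idx], y[idx], reg)
        loss_history.append(loss)
        W -= learning_rate * grad   # step in the negative gradient direction
    return W, loss_history
```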

```python
# In the file linear_classifier.py, implement SGD in the function
# LinearClassifier.train() and then run it with the code below.
from cs231n.classifiers import LinearSVM
svm = LinearSVM()
tic = time.time()
loss_hist = svm.train(X_train, y_train, learning_rate=1e-7, reg=2.5e4,
                      num_iters=1500, verbose=True)
toc = time.time()
print('That took %fs' % (toc - tic))
```

```
iteration 0 / 1500: loss 780.845025
iteration 100 / 1500: loss 284.218950
iteration 200 / 1500: loss 107.290724
iteration 300 / 1500: loss 42.352389
iteration 400 / 1500: loss 18.537910
iteration 500 / 1500: loss 10.759981
iteration 600 / 1500: loss 6.869571
iteration 700 / 1500: loss 6.355614
iteration 800 / 1500: loss 5.903678
iteration 900 / 1500: loss 5.378417
iteration 1000 / 1500: loss 5.844359
iteration 1100 / 1500: loss 5.279668
iteration 1200 / 1500: loss 5.036227
iteration 1300 / 1500: loss 5.161794
iteration 1400 / 1500: loss 4.938430
That took 16.796489s
```

```python
# A useful debugging strategy is to plot the loss as a function of
# iteration number:
plt.plot(loss_hist)
plt.xlabel('Iteration number')
plt.ylabel('Loss value')
plt.show()
```

![svm_17_0.png](svm_files/svm_17_0.png)

```python
# Write the LinearSVM.predict function and evaluate the performance on both the
# training and validation set
y_train_pred = svm.predict(X_train)
print('training accuracy: %f' % (np.mean(y_train == y_train_pred), ))
y_val_pred = svm.predict(X_val)
print('validation accuracy: %f' % (np.mean(y_val == y_val_pred), ))
```

```
training accuracy: 0.367388
validation accuracy: 0.370000
```
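
predict itself only needs an argmax over the class scores; a one-line sketch (an illustrative helper, not the class method itself):

```python
def predict_sketch(W, X):
    """Predicted label per row: the index of the highest-scoring class."""
    return X.dot(W).argmax(axis=1)
```
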
```python
# Use the validation set to tune hyperparameters (regularization strength and
# learning rate). You should experiment with different ranges for the learning
# rates and regularization strengths; if you are careful you should be able to
# get a classification accuracy of about 0.39 on the validation set.
# Note: you may see runtime/overflow warnings during hyperparameter search.
# This may be caused by extreme values, and is not a bug.
from cs231n.classifiers import LinearSVM
learning_rates = np.linspace(1e-6, 1e-8, 5)
regularization_strengths = np.linspace(4.5e4, 5.5e4, 5)

# results is a dictionary mapping tuples of the form
# (learning_rate, regularization_strength) to tuples of the form
# (training_accuracy, validation_accuracy). The accuracy is simply the fraction
# of data points that are correctly classified.
results = {}
best_val = -1    # The highest validation accuracy that we have seen so far.
best_svm = None  # The LinearSVM object that achieved the highest validation rate.

################################################################################
# TODO:                                                                        #
# Write code that chooses the best hyperparameters by tuning on the validation #
# set. For each combination of hyperparameters, train a linear SVM on the      #
# training set, compute its accuracy on the training and validation sets, and  #
# store these numbers in the results dictionary. In addition, store the best   #
# validation accuracy in best_val and the LinearSVM object that achieves this  #
# accuracy in best_svm.                                                        #
#                                                                              #
# Hint: You should use a small value for num_iters as you develop your         #
# validation code so that the SVMs don't take much time to train; once you are #
# confident that your validation code works, you should rerun the validation   #
# code with a larger value for num_iters.                                      #
################################################################################
# *****START OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****
for lr in learning_rates:
    for reg in regularization_strengths:
        # Train a fresh model for each setting; reusing one LinearSVM object
        # would leave best_svm pointing at the last-trained model, not the best.
        svm = LinearSVM()
        svm.train(X_train, y_train, learning_rate=lr, reg=reg,
                  num_iters=150, verbose=True)
        train_acc = np.mean(y_train == svm.predict(X_train))
        val_acc = np.mean(y_val == svm.predict(X_val))
        results[(lr, reg)] = (train_acc, val_acc)
        if val_acc > best_val:
            best_val = val_acc
            best_svm = svm
# *****END OF YOUR CODE (DO NOT DELETE/MODIFY THIS LINE)*****

# Print out results.
for lr, reg in sorted(results):
    train_accuracy, val_accuracy = results[(lr, reg)]
    print('lr %e reg %e train accuracy: %.2f val accuracy: %.2f' % (
        lr, reg, train_accuracy, val_accuracy))
print('best validation accuracy achieved during cross-validation: %f' % best_val)
```

```
iteration 0 / 150: loss 1396.757584
iteration 100 / 150: loss 7.978646
iteration 0 / 150: loss 7.074112
iteration 100 / 150: loss 6.814321
iteration 0 / 150: loss 6.906133
iteration 100 / 150: loss 6.113737
iteration 0 / 150: loss 6.380821
iteration 100 / 150: loss 6.100835
iteration 0 / 150: loss 6.743149
iteration 100 / 150: loss 7.828594
iteration 0 / 150: loss 7.270202
iteration 100 / 150: loss 6.695549
iteration 0 / 150: loss 5.911166
iteration 100 / 150: loss 6.497495
iteration 0 / 150: loss 6.323286
iteration 100 / 150: loss 6.402935
iteration 0 / 150: loss 6.360113
iteration 100 / 150: loss 6.215709
iteration 0 / 150: loss 6.329384
iteration 100 / 150: loss 6.792598
iteration 0 / 150: loss 7.810835
iteration 100 / 150: loss 6.215014
iteration 0 / 150: loss 6.047842
iteration 100 / 150: loss 6.189011
iteration 0 / 150: loss 5.712657
iteration 100 / 150: loss 6.202888
iteration 0 / 150: loss 6.219806
iteration 100 / 150: loss 6.846721
iteration 0 / 150: loss 5.953943
iteration 100 / 150: loss 5.950440
iteration 0 / 150: loss 5.930696
iteration 100 / 150: loss 5.878044
iteration 0 / 150: loss 5.672647
iteration 100 / 150: loss 5.687492
iteration 0 / 150: loss 5.630673
iteration 100 / 150: loss 6.361494
iteration 0 / 150: loss 5.624344
iteration 100 / 150: loss 5.781926
iteration 0 / 150: loss 5.934599
iteration 100 / 150: loss 5.907119
iteration 0 / 150: loss 6.388585
iteration 100 / 150: loss 6.015466
iteration 0 / 150: loss 6.002802
iteration 100 / 150: loss 5.679754
iteration 0 / 150: loss 5.205385
iteration 100 / 150: loss 5.423794
iteration 0 / 150: loss 5.784023
iteration 100 / 150: loss 5.535814
iteration 0 / 150: loss 5.612699
iteration 100 / 150: loss 6.147560
lr 1.000000e-08 reg 4.500000e+04 train accuracy: 0.36 val accuracy: 0.37
lr 1.000000e-08 reg 4.750000e+04 train accuracy: 0.36 val accuracy: 0.38
lr 1.000000e-08 reg 5.000000e+04 train accuracy: 0.36 val accuracy: 0.37
lr 1.000000e-08 reg 5.250000e+04 train accuracy: 0.36 val accuracy: 0.37
lr 1.000000e-08 reg 5.500000e+04 train accuracy: 0.36 val accuracy: 0.37
lr 2.575000e-07 reg 4.500000e+04 train accuracy: 0.35 val accuracy: 0.34
lr 2.575000e-07 reg 4.750000e+04 train accuracy: 0.35 val accuracy: 0.37
lr 2.575000e-07 reg 5.000000e+04 train accuracy: 0.34 val accuracy: 0.36
lr 2.575000e-07 reg 5.250000e+04 train accuracy: 0.34 val accuracy: 0.35
lr 2.575000e-07 reg 5.500000e+04 train accuracy: 0.34 val accuracy: 0.37
lr 5.050000e-07 reg 4.500000e+04 train accuracy: 0.32 val accuracy: 0.33
lr 5.050000e-07 reg 4.750000e+04 train accuracy: 0.33 val accuracy: 0.32
lr 5.050000e-07 reg 5.000000e+04 train accuracy: 0.32 val accuracy: 0.34
lr 5.050000e-07 reg 5.250000e+04 train accuracy: 0.32 val accuracy: 0.31
lr 5.050000e-07 reg 5.500000e+04 train accuracy: 0.31 val accuracy: 0.33
lr 7.525000e-07 reg 4.500000e+04 train accuracy: 0.29 val accuracy: 0.31
lr 7.525000e-07 reg 4.750000e+04 train accuracy: 0.31 val accuracy: 0.34
lr 7.525000e-07 reg 5.000000e+04 train accuracy: 0.27 val accuracy: 0.28
lr 7.525000e-07 reg 5.250000e+04 train accuracy: 0.29 val accuracy: 0.31
lr 7.525000e-07 reg 5.500000e+04 train accuracy: 0.27 val accuracy: 0.28
lr 1.000000e-06 reg 4.500000e+04 train accuracy: 0.28 val accuracy: 0.29
lr 1.000000e-06 reg 4.750000e+04 train accuracy: 0.27 val accuracy: 0.28
lr 1.000000e-06 reg 5.000000e+04 train accuracy: 0.26 val accuracy: 0.25
lr 1.000000e-06 reg 5.250000e+04 train accuracy: 0.29 val accuracy: 0.28
lr 1.000000e-06 reg 5.500000e+04 train accuracy: 0.26 val accuracy: 0.27
best validation accuracy achieved during cross-validation: 0.377000
```

```python
# Visualize the cross-validation results
import math
x_scatter = [math.log10(x[0]) for x in results]
y_scatter = [math.log10(x[1]) for x in results]

# plot training accuracy
marker_size = 100  # default size of markers is 20
colors = [results[x][0] for x in results]
plt.subplot(2, 1, 1)
plt.scatter(x_scatter, y_scatter, marker_size, c=colors, cmap=plt.cm.coolwarm)
plt.colorbar()
plt.xlabel('log learning rate')
plt.ylabel('log regularization strength')
plt.title('CIFAR-10 training accuracy')

# plot validation accuracy
colors = [results[x][1] for x in results]
plt.subplot(2, 1, 2)
plt.scatter(x_scatter, y_scatter, marker_size, c=colors, cmap=plt.cm.coolwarm)
plt.colorbar()
plt.xlabel('log learning rate')
plt.ylabel('log regularization strength')
plt.title('CIFAR-10 validation accuracy')
plt.show()
```

![svm_20_0.png](svm_files/svm_20_0.png)

```python
# Evaluate the best svm on test set
y_test_pred = best_svm.predict(X_test)
test_accuracy = np.mean(y_test == y_test_pred)
print('linear SVM on raw pixels final test set accuracy: %f' % test_accuracy)
```

```
linear SVM on raw pixels final test set accuracy: 0.361000
```

```python
# Visualize the learned weights for each class.
# Depending on your choice of learning rate and regularization strength, these may
# or may not be nice to look at.
w = best_svm.W[:-1, :]  # strip out the bias
w = w.reshape(32, 32, 3, 10)
w_min, w_max = np.min(w), np.max(w)
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']
for i in range(10):
    plt.subplot(2, 5, i + 1)
    # Rescale the weights to be between 0 and 255
    wimg = 255.0 * (w[:, :, :, i].squeeze() - w_min) / (w_max - w_min)
    plt.imshow(wimg.astype('uint8'))
    plt.axis('off')
    plt.title(classes[i])
```

![svm_22_0.png](svm_files/svm_22_0.png)

Inline Question 2

Describe what your visualized SVM weights look like, and offer a brief explanation for why they look the way that they do.


```python
!jupyter nbconvert --to markdown svm.ipynb
```

```
[NbConvertApp] Converting notebook svm.ipynb to markdown
[NbConvertApp] Support files will be in svm_files\
[NbConvertApp] Making directory svm_files
[NbConvertApp] Making directory svm_files
[NbConvertApp] Making directory svm_files
[NbConvertApp] Making directory svm_files
[NbConvertApp] Making directory svm_files
[NbConvertApp] Writing 23693 bytes to svm.md
```