Effect of Optimizer Choice on MNIST Score
Git Repository

Results are in the folder training_results.

Problem Statement

To determine how the choice of optimizer affects LeNet-5's performance on the MNIST and Fashion MNIST benchmarks. The baseline is LeNet-5 trained with SGD at a learning rate of 0.001 and momentum of 0.9; without momentum, SGD barely learns at all.

Results

The table reports test-set accuracy from either the 9th or 10th epoch, whichever is higher. Rows beginning with "^ and" combine the ReLU and MaxPool variant with the listed optimizer and learning rate.

| Choices | MNIST accuracy | Fashion MNIST accuracy | Notes |
| --- | --- | --- | --- |
| LeNet-5, lr=0.001 | 11% | 11% | Didn't learn. |
| LeNet-5, lr=0.01 | 91.91% | 77.20% | Slow start; more epochs needed. |
| ReLU, lr=0.001 | 87.28% | 72.69% | _ |
| ReLU, lr=0.01 | 98.94% | 90.22% | Slow start. |
| MaxPool, lr=0.01 | 94.57% | 79.37% | Slow start. |
| ReLU and MaxPool, lr=0.001 | 98.11% | 87.21% | Trained steadily; would benefit from more epochs. |
| ReLU and MaxPool, lr=0.01 | 99.07% | 90.45% | Same as Adam. |
| ^ and ASGD, lr=0.01 | 98.52% | _ | _ |
| ^ and Rprop, lr=0.01 | 91.95% | _ | _ |
| ^ and RMSprop, lr=0.001 | 98.95% | 90.64% | Best Fashion MNIST score. |
| ^ and Adadelta, lr=0.001 | 80.37% | 65.66% | _ |
| ^ and Adafactor, lr=0.01 | 99.15% | 89.87% | Best MNIST score. |
| ^ and Adagrad, lr=0.01 | 98.93% | _ | _ |
| ^ and Adagrad, lr=0.001 | 96.07% | _ | Would likely match lr=0.01 with more epochs. |
| ^ and Adam, lr=0.01 | 98.31% | _ | Jumped around a lot; lr clearly too high. |
| ^ and Adam, lr=0.001 | 99.07% | 90.35% | 97.34% and 84.45% accuracy after the first epoch. |
| ^ and AdamW, lr=0.001 | 98.94% | 89.48% | _ |
| ^ and Adamax, lr=0.001 | 98.95% | 89.07% | _ |
| ^ and NAdam, lr=0.001 | 99.08% | _ | _ |
| ^ and NAdam, lr=0.002 | 99.01% | _ | _ |
| ^ and RAdam, lr=0.001 | 98.97% | _ | Jumps around a fair amount. |
| ^ and RAdam, lr=0.0001 | 98.23% | _ | Needs more epochs. |

...
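
For reference, the sketch below shows how one of these runs could be reproduced: a LeNet-5 with the ReLU and MaxPool substitutions, trained on MNIST or Fashion MNIST with a selectable optimizer from torch.optim. This is a minimal illustration assuming PyTorch and torchvision; the function names, batch sizes, and optimizer mapping are illustrative, not taken from the repository.

```python
# Minimal sketch (assumed PyTorch/torchvision): LeNet-5 with ReLU + MaxPool,
# trained on MNIST or Fashion MNIST with a swappable optimizer.
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms


class LeNet5(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 6, kernel_size=5, padding=2),  # 28x28 -> 28x28
            nn.ReLU(),                                   # tanh in the classic LeNet-5
            nn.MaxPool2d(2),                             # average pooling in the classic LeNet-5
            nn.Conv2d(6, 16, kernel_size=5),             # 14x14 -> 10x10
            nn.ReLU(),
            nn.MaxPool2d(2),                             # -> 5x5
        )
        self.classifier = nn.Sequential(
            nn.Flatten(),
            nn.Linear(16 * 5 * 5, 120), nn.ReLU(),
            nn.Linear(120, 84), nn.ReLU(),
            nn.Linear(84, 10),
        )

    def forward(self, x):
        return self.classifier(self.features(x))


def run(optimizer_name="adam", lr=0.001, epochs=10, fashion=False):
    tfm = transforms.ToTensor()
    ds_cls = datasets.FashionMNIST if fashion else datasets.MNIST
    train = torch.utils.data.DataLoader(
        ds_cls("data", train=True, download=True, transform=tfm),
        batch_size=64, shuffle=True)
    test = torch.utils.data.DataLoader(
        ds_cls("data", train=False, download=True, transform=tfm),
        batch_size=1000)

    model = LeNet5()
    optimizers = {
        "sgd": lambda: optim.SGD(model.parameters(), lr=lr, momentum=0.9),
        "adam": lambda: optim.Adam(model.parameters(), lr=lr),
        "rmsprop": lambda: optim.RMSprop(model.parameters(), lr=lr),
        "adagrad": lambda: optim.Adagrad(model.parameters(), lr=lr),
    }
    opt = optimizers[optimizer_name]()
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(epochs):
        model.train()
        for x, y in train:
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
        # Evaluate test-set accuracy after each epoch.
        model.eval()
        correct = 0
        with torch.no_grad():
            for x, y in test:
                correct += (model(x).argmax(dim=1) == y).sum().item()
        print(f"epoch {epoch + 1}: test accuracy {correct / len(test.dataset):.2%}")


if __name__ == "__main__":
    run("adam", lr=0.001)
```

Swapping the entry selected from the optimizer dictionary (or adding others such as RMSprop at lr=0.001) reproduces the kind of comparison summarized in the table above.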