Multi-Layer Neural Network Classification of Fashion MNIST Dataset

Examples from Zalando’s Fashion MNIST Dataset (Zalando 2017).

We implemented a multi-layer neural network to classify images of fashion products from Zalando’s Fashion MNIST dataset, with the goal of identifying strategies that optimize neural network performance. Our baseline model consisted of 2 hidden layers of 50 nodes each with tanh activation, followed by a softmax output layer. It used a learning rate of 1.2e-2, a momentum gamma of $0.9$, and no regularization, and was trained with mini-batch stochastic gradient descent (batch size $128$) for up to $100$ epochs with early stopping. The baseline achieved an accuracy of 78.36%. We tested several variations, including different activation functions (sigmoid, ReLU, leaky ReLU), different numbers of hidden nodes, and different numbers of hidden layers. The baseline outperformed these variations, which suggests that moderate hyperparameter values, neither too low nor too high for the task at hand, yield the best overall performance.
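To make the baseline configuration concrete, the following is a minimal sketch of its forward pass and of one common form of the momentum update, written in plain Python. It is not the original implementation: the weight initialization, the helper names (`init_layer`, `affine`, `forward`, `momentum_step`), and the exact form of the momentum rule are assumptions made for illustration. Only the layer sizes, the tanh and softmax activations, the learning rate, and the momentum value come from the description above; the input size of 784 and the 10 output classes follow from the 28x28 images and 10 categories of Fashion MNIST.

```python
import math
import random

# Hyperparameters taken from the baseline description.
LEARNING_RATE = 1.2e-2
MOMENTUM_GAMMA = 0.9
# 784 inputs (28x28 pixels), two hidden layers of 50 nodes, 10 classes.
LAYER_SIZES = [784, 50, 50, 10]


def init_layer(n_in, n_out, rng):
    """Create (weights, biases) for one layer.

    Assumption: Gaussian weights scaled by 1/sqrt(n_in); the original
    initialization scheme is not stated.
    """
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(weights, biases, x):
    """Compute W x + b for a single input vector."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(layers, x):
    """tanh on every hidden layer, softmax on the output layer."""
    for weights, biases in layers[:-1]:
        x = [math.tanh(v) for v in affine(weights, biases, x)]
    weights, biases = layers[-1]
    return softmax(affine(weights, biases, x))


def momentum_step(param, grad, velocity, lr=LEARNING_RATE, gamma=MOMENTUM_GAMMA):
    """One scalar SGD-with-momentum update (assumed form):
    v <- gamma * v + lr * grad, then param <- param - v.
    """
    velocity = gamma * velocity + lr * grad
    return param - velocity, velocity


rng = random.Random(0)
layers = [init_layer(n_in, n_out, rng)
          for n_in, n_out in zip(LAYER_SIZES, LAYER_SIZES[1:])]
# A random vector stands in for one flattened, normalized image.
probs = forward(layers, [rng.random() for _ in range(784)])
```

Here `probs` holds one probability per class for a single stand-in input. Swapping `math.tanh` for another activation, or editing `LAYER_SIZES`, gives the kind of variation compared against the baseline.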

Margot Wagner
Postdoctoral Researcher

Interested in the use of data science and AI in mental health and using neuroscience to inspire next generation AI tools.