Tutorial 1: Your first deepy experiment

In your first deepy experiment, you will build a simple multi-layer neural network with:

  • two hidden layers with 256 neurons each
  • ReLU activations
  • dropout with a probability of 20% after each ReLU
  • SGD with momentum for training

We use this network to classify MNIST digits, so the full architecture will be:

  • 784-dimensional input layer (28*28)
  • 256-dimensional fully-connected layer with ReLU activation
  • Dropout layer
  • 256-dimensional fully-connected layer with ReLU activation
  • Dropout layer
  • 10-dimensional fully-connected layer, no activation
  • Softmax layer

Let's start.

Check out deepy

git clone https://github.com/uaca/deepy

Set up your environment

Configure your environment properly, or the experiments will become a nightmare. In particular, make sure you are not running Theano on a single CPU core.

If you are using a GPU machine, just execute this command:

source bin/gpu_env.sh

If you are using a multi-core CPU machine, execute this command:

source bin/cpu_env.sh
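
These scripts presumably configure Theano through environment variables. If you prefer to do the same from Python, here is a minimal sketch (assuming Theano's standard THEANO_FLAGS mechanism; the exact values the scripts set may differ), which must run before anything imports theano:

import os

# Illustrative values only; bin/gpu_env.sh and bin/cpu_env.sh may set
# different flags. THEANO_FLAGS is read when theano is first imported.
os.environ.setdefault("THEANO_FLAGS", "device=gpu,floatX=float32")  # GPU machine
# os.environ.setdefault("THEANO_FLAGS", "device=cpu,openmp=True")   # multi-core CPU
# os.environ.setdefault("OMP_NUM_THREADS", "4")                     # CPU thread count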

From now on, don't change your directory.

Run the following to create a directory for your experiments and an empty file for your first script:

mkdir my_experiments
touch my_experiments/tutorial1.py

If you are familiar with vim, you can edit the file with:

vi my_experiments/tutorial1.py

Import classes you need

# Set the logging level so that you can see information about the training process.
import logging
logging.basicConfig(level=logging.INFO)

# Import classes
from deepy.dataset import MnistDataset, MiniBatches
from deepy.networks import NeuralClassifier
from deepy.layers import Dense, Softmax, Dropout
from deepy.trainers import MomentumTrainer, LearningRateAnnealer
from deepy.utils import shared_scalar

Define your model

# Create a classifier; this implies cross-entropy will be used as the cost.
model = NeuralClassifier(input_dim=28*28)
# Stack layers
model.stack(Dense(256, 'relu'),
            Dropout(0.2),
            Dense(256, 'relu'),
            Dropout(0.2),
            Dense(10, 'linear'),
            Softmax())

Load training data

deepy ships with pre-defined datasets such as MNIST, so you only need to load one:

mnist = MiniBatches(MnistDataset(), batch_size=20)

Define the training method

trainer = MomentumTrainer(model)
# Note: you can specify options for the trainer
# For example, if you want to add L2 regularization
# trainer = MomentumTrainer(model, {"weight_l2": 0.0001})

If you want to modify the learning rate on the fly, you need to define it as a shared variable. In deepy you can do that like this:

trainer = MomentumTrainer(model, {"learning_rate": shared_scalar(0.01)})

For the complete list of training options, see deepy/conf/trainer_config.py.
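
These options can be combined. For example, a shared learning rate together with the L2 regularization shown above:

trainer = MomentumTrainer(model, {
    "learning_rate": shared_scalar(0.01),
    "weight_l2": 0.0001,
})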

Run the trainer

# This will halve the learning rate if no lower validation cost is observed within 5 epochs
annealer = LearningRateAnnealer(trainer)

trainer.run(mnist, controllers=[annealer])

During training, you can press Ctrl + C to quit, but do not press it more than once.

Save your model

model.save_params("tutorial1.gz")

To fill the model with saved parameters:

model.load_params("tutorial1.gz")
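
Note that load_params fills an already constructed model, so in a separate script you must rebuild the same architecture before loading. A sketch under that assumption:

# Rebuild the exact architecture from above, then restore the saved weights.
model = NeuralClassifier(input_dim=28*28)
model.stack(Dense(256, 'relu'), Dropout(0.2),
            Dense(256, 'relu'), Dropout(0.2),
            Dense(10, 'linear'), Softmax())
model.load_params("tutorial1.gz")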

Finish editing your code and run it

Run your code with:

python my_experiments/tutorial1.py

After around 30 epochs, you should see output like this:

INFO:deepy.trainers.trainers:test    (iter=31) J=0.06 err=1.68
INFO:deepy.trainers.trainers:valid   (iter=31) J=0.07 err=1.69

If it does not go well, we have prepared a full code example for this tutorial.

Run it with:

python experiments/tutorials/tutorial1.py
