Thursday, August 11, 2016

Statefarm - experiment 3 - VGG16 finetune

VGG16: [conv (x2 or x3) -> max-pool] repeated a few times, then a classifier head (4096 -> 4096 -> 1000).
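For reference, a quick way to inspect this layout (a sketch, assuming the pretrained VGG16 from keras.applications is available; import paths vary between Keras versions):

    from keras.applications.vgg16 import VGG16

    # prints every conv block and the 4096->4096->1000 dense head
    VGG16(weights='imagenet', include_top=True).summary()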

Finetune VGG16

I saw one approach, with great results(!), where the whole model was loaded and only the last softmax layer was changed from the original (1000 classes) to the new target (10).
In that case finetuning was done on ALL of the model together, with a slow learning rate (SGD, lr=1e-4).
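A minimal sketch of that approach, assuming the VGG16 constructor from keras.applications (keyword names and import paths vary between Keras versions):

    from keras.applications.vgg16 import VGG16
    from keras.models import Model
    from keras.layers import Dense
    from keras.optimizers import SGD

    # load the full pretrained model, then swap only the 1000-way softmax for a 10-way one
    base = VGG16(weights='imagenet', include_top=True)
    penultimate = base.layers[-2].output               # output of the second 4096-unit FC layer
    predictions = Dense(10, activation='softmax')(penultimate)
    model = Model(base.input, predictions)

    # finetune ALL layers together with a slow learning rate
    model.compile(optimizer=SGD(lr=1e-4),
                  loss='categorical_crossentropy', metrics=['accuracy'])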

I will use another approach:
We will replace the whole classifier head (4096->4096->1000).
1. [optional, to save time later] Load the model without the last dense part, run it once on all the images, and save each intermediate output (512x7x7) to disk for the train/validation/test sets. For reference, 10K such files should take 1.9 GB of disk space. A sketch follows this step.
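A minimal sketch of this caching step, assuming the images are already loaded and preprocessed into an array x_train (a hypothetical name), and that the backend uses channels-first ordering so the output shape is (N, 512, 7, 7):

    from keras.applications.vgg16 import VGG16
    import numpy as np

    # convolutional part only; include_top=False drops the dense head
    base = VGG16(weights='imagenet', include_top=False)

    # run the conv stack once and cache the 512x7x7 features per image
    features = base.predict(x_train, batch_size=32)
    np.save('train_features.npy', features)   # hypothetical filename; repeat for validate/test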

2. Create an alternative classifier. I used a small one (256->10) because my machine is a bit weak.

    from keras.models import Sequential
    from keras.layers import Flatten, Dense, Dropout

    model = Sequential()
    model.add(Flatten(input_shape=(512, 7, 7)))   # the saved conv features, 512*7*7 = 25088 values
    model.add(Dense(256, activation='relu'))
    model.add(Dropout(0.5))
    model.add(Dense(10, activation='softmax'))    # 10 driver-state classes

Train it. I tried a few optimizers (a training sketch follows the list):
SGD(lr=1e-3, momentum=0.9, nesterov=True)

SGD(lr=1e-3, momentum=0.9, nesterov=False) - BEST. Saved the model to disk as vgg_head_only1_out_0_epoc20.

SGD(lr=1e-4, momentum=0.9, nesterov=True)

SGD(lr=1e-4, momentum=0.9, nesterov=False)

'adam'
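A sketch of the training run with the winning optimizer, assuming the cached features and labels are loaded as train_features/train_labels and val_features/val_labels (hypothetical names); 20 epochs follows the saved filename, and the batch size is a guess:

    from keras.optimizers import SGD

    model.compile(optimizer=SGD(lr=1e-3, momentum=0.9, nesterov=False),
                  loss='categorical_crossentropy', metrics=['accuracy'])
    model.fit(train_features, train_labels,
              validation_data=(val_features, val_labels),
              nb_epoch=20, batch_size=64)        # nb_epoch is the Keras 1.x argument name
    model.save_weights('vgg_head_only1_out_0_epoc20')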


