Thursday, August 11, 2016

statefarm - retrospect

Reviewing what the best teams in the competition did.



1. How to train each single model

1.1 Synthesize/augment to generate a huge amount of data
  • Synthesize 5M new images by combining the left and right (almost) halves of images from the same class. This worked so well that GoogLeNet V3 could be trained from scratch to 0.15. ref
  • Synthesize images by combining images from the test set (possible in this competition, as they all came from the same video)
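The half-combining idea can be sketched roughly as below. This is only a minimal NumPy illustration, not the team's actual code: the function names, the (N, H, W, C) array layout and the 0.4-0.6 jitter for "almost half" are all my assumptions.

```python
import numpy as np

def combine_halves(left_src, right_src, split_frac=0.5):
    """Join the left part of one image to the right part of another."""
    if left_src.shape != right_src.shape:
        raise ValueError("images must have the same shape")
    width = left_src.shape[1]
    split = int(round(width * split_frac))
    return np.concatenate([left_src[:, :split], right_src[:, split:]], axis=1)

def synthesize(images, labels, n_new, rng=None):
    """Create n_new images, each built from two images of the same class."""
    rng = np.random.default_rng(rng)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    new_images, new_labels = [], []
    for _ in range(n_new):
        c = rng.choice(classes)
        idx = np.flatnonzero(labels == c)
        i, j = rng.choice(idx, size=2, replace=True)
        # "almost half": the 0.4-0.6 range is an assumption
        frac = rng.uniform(0.4, 0.6)
        new_images.append(combine_halves(images[i], images[j], frac))
        new_labels.append(c)
    return np.stack(new_images), np.array(new_labels)
```

Because both halves come from the same class, the label of the synthetic image is simply that class.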


1.2 If that is not possible, use a pre-trained model. The stronger, the better.
ResNet-152 > VGG-19 > VGG-16 > GoogLeNet

1.3 Use semi-supervised learning
"Dark knowledge": let an ensemble predict on the test set and take the most confident predictions as extra training data. Use 6-12K images; don't use too many.
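Picking the most confident test predictions might look something like this. It is a sketch under my own assumptions: `probs` is the ensemble-averaged probability matrix for the test set, and `select_pseudo_labels`, `max_items` and `min_conf` are hypothetical names, not anything from the teams' write-ups.

```python
import numpy as np

def select_pseudo_labels(probs, max_items=10000, min_conf=0.0):
    """Return indices and predicted labels of the most confident test rows."""
    probs = np.asarray(probs, dtype=float)
    conf = probs.max(axis=1)        # confidence = highest class probability
    labels = probs.argmax(axis=1)   # predicted class becomes the pseudo-label
    order = np.argsort(-conf)       # most confident first
    order = order[conf[order] >= min_conf][:max_items]
    return order, labels[order]
```

The selected test images, with their pseudo-labels, would then be added to the training set before retraining.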

* Some numbers to compare, all from the same team and model (GoogLeNet V3):
0.31 pre-trained, augmented (flip/rotate)
0.26 pre-trained, augmented + "dark knowledge"/semi-supervised
0.15 from scratch, but with 5M synthesized images(!)

2. How to run a single model

If the test data can be clustered, use this fact (in this competition it could):
  • Hack the input (this got 3rd place): as the input was a sequence of images, use NN for better training and testing (ResNet 0.27 -> 0.18)
  • Another approach is to run all images through VGG, take a mid-layer output, cluster it (1000 clusters), and use the mean result of each cluster
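The "use the cluster mean result" step can be illustrated as follows. I assume the cluster ids already exist (for example from k-means on the mid-layer features) and only show the averaging; `smooth_by_cluster` is a name I made up.

```python
import numpy as np

def smooth_by_cluster(probs, cluster_ids):
    """Replace each row's prediction with the mean prediction of its cluster."""
    probs = np.asarray(probs, dtype=float)
    cluster_ids = np.asarray(cluster_ids)
    # Map arbitrary cluster ids to 0..n_clusters-1
    _, inverse = np.unique(cluster_ids, return_inverse=True)
    inverse = inverse.ravel()
    n_clusters = inverse.max() + 1
    sums = np.zeros((n_clusters, probs.shape[1]))
    np.add.at(sums, inverse, probs)                    # per-cluster sum
    counts = np.bincount(inverse, minlength=n_clusters)
    means = sums / counts[:, None]                     # per-cluster mean
    return means[inverse]                              # broadcast back to rows
```

The idea is that images in the same cluster are near-duplicates from the same video, so averaging cancels out some per-image noise.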

Whole image or part of it?
  • Most ran the image as a whole (with or without clustering)
  • R-CNN (tuned differently from object detection) helped: VGG 0.22 -> 0.175



3. How to choose models for the ensemble

  • Try to use different models, trained differently. For example, one VGG and one ResNet; one augmented, the other not.
  • K-fold cross-validation is common, but basic.



4. How to combine models

  • Use the scipy minimize function with a custom geometric-average function to find the model weights that minimize the log loss of the combined predictions.
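A rough sketch of such a weight search is below. The choice of SLSQP, the bounds and the sum-to-one constraint are my assumptions, and the helper names are hypothetical; the weights should presumably be fitted on held-out (validation) predictions rather than on training data.

```python
import numpy as np
from scipy.optimize import minimize

def log_loss(y_true, probs, eps=1e-15):
    """Multi-class log loss for integer labels."""
    y_true = np.asarray(y_true)
    probs = np.clip(probs, eps, 1.0)
    return -np.mean(np.log(probs[np.arange(len(y_true)), y_true]))

def geometric_blend(prob_list, weights, eps=1e-15):
    """Weighted geometric mean of model probabilities, renormalized per row."""
    log_p = sum(w * np.log(np.clip(p, eps, 1.0))
                for w, p in zip(weights, prob_list))
    blend = np.exp(log_p)
    return blend / blend.sum(axis=1, keepdims=True)

def fit_weights(prob_list, y_true):
    """Find blend weights that minimize log loss (assumed: SLSQP, sum to 1)."""
    prob_list = [np.asarray(p, dtype=float) for p in prob_list]
    n = len(prob_list)
    x0 = np.full(n, 1.0 / n)  # start from equal weights
    res = minimize(
        lambda w: log_loss(y_true, geometric_blend(prob_list, w)),
        x0,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0},
    )
    return res.x
```

Usage would be along the lines of `w = fit_weights([p_vgg, p_resnet], y_val)` followed by `geometric_blend([t_vgg, t_resnet], w)` on the test predictions, where those array names are placeholders.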


