
[I2S] Let's fine-tune using EfficientNet.

So, how about fine-tuning the model that was already trained on 10 million samples?

First, let's take a look at the EfficientNet architecture.

https://hoya012.github.io/blog/EfficientNet-review/

 

 

A review (in Korean) of "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks" (ICML 2019).

https://keras.io/examples/vision/image_classification_efficientnet_fine_tuning/

 

Keras documentation: "Image classification via fine-tuning with EfficientNet" (Yixing Fu, 2020), using ImageNet-pretrained EfficientNet for Stanford Dogs classification.

 

Tips for fine tuning EfficientNet

On unfreezing layers:

  • The BatchNormalization layers need to be kept frozen (more details). If they are also turned to trainable, the first epoch after unfreezing will significantly reduce accuracy.
  • In some cases it may be beneficial to open up only a portion of layers instead of unfreezing all. This will make fine tuning much faster when going to larger models like B7.
  • Each block needs to be all turned on or off. This is because the architecture includes a shortcut from the first layer to the last layer for each block. Not respecting blocks also significantly harms the final performance (see the name-prefix sketch right after this list).
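One way to respect block boundaries is to select layers by name prefix rather than by index. A minimal sketch, assuming the layer-name convention of keras.applications EfficientNet (block1a_*, ..., block7a_*, top_*); the backbone handle passed in is whatever sub-model holds the EfficientNet layers:

import tensorflow as tf

def unfreeze_blocks(backbone, block_prefixes=("block7", "top")):
    # Unfreeze only the layers whose names start with the given block prefixes,
    # keeping every BatchNormalization layer frozen as recommended above.
    for layer in backbone.layers:
        if layer.name.startswith(block_prefixes) and not isinstance(
            layer, tf.keras.layers.BatchNormalization
        ):
            layer.trainable = True

# e.g. unfreeze the last two blocks plus the top conv of an EfficientNetB0 backbone:
# unfreeze_blocks(model.get_layer("efficientnetb0"), ("block6", "block7", "top"))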

Some other tips for utilizing EfficientNet:

  • Larger variants of EfficientNet do not guarantee improved performance, especially for tasks with less data or fewer classes. In such a case, the larger the EfficientNet variant chosen, the harder it is to tune hyperparameters.
  • EMA (Exponential Moving Average) is very helpful in training EfficientNet from scratch, but not so much for transfer learning.
  • Do not use the RMSprop setup as in the original paper for transfer learning. The momentum and learning rate are too high for transfer learning. It will easily corrupt the pretrained weights and blow up the loss. A quick check is to see if loss (as categorical cross entropy) is getting significantly larger than log(NUM_CLASSES) after the same epoch (a small callback for this check is sketched right after this list). If so, the initial learning rate/momentum is too high.
  • A smaller batch size benefits validation accuracy, possibly because it effectively provides regularization.
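The log(NUM_CLASSES) check above is easy to automate with a callback. A minimal sketch (the 120-class figure is just the Stanford Dogs example from the Keras tutorial):

import math
import tensorflow as tf

class LossSanityCheck(tf.keras.callbacks.Callback):
    # Warn if the training loss stays above log(NUM_CLASSES), which suggests
    # the learning rate / momentum is too high for transfer learning.
    def __init__(self, num_classes):
        super().__init__()
        self.threshold = math.log(num_classes)

    def on_epoch_end(self, epoch, logs=None):
        loss = (logs or {}).get("loss")
        if loss is not None and loss > self.threshold:
            print(f"\n[warn] epoch {epoch}: loss {loss:.4f} > log(NUM_CLASSES) "
                  f"= {self.threshold:.4f}; consider lowering the learning rate.")

# usage: model.fit(..., callbacks=[LossSanityCheck(num_classes=120)])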

From which layer should fine-tuning start?

import tensorflow as tf
from tensorflow.keras import layers


def unfreeze_model(model):
    # We unfreeze the top 20 layers while leaving BatchNorm layers frozen
    for layer in model.layers[-20:]:
        if not isinstance(layer, layers.BatchNormalization):
            layer.trainable = True

    # Recompile with a small learning rate so fine-tuning does not corrupt the pretrained weights
    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
    model.compile(
        optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"]
    )


unfreeze_model(model)

After building the model, load the pretrained weights,

and then unfreeze the layers by calling the unfreeze function.

def unfreeze_encoder(encoder):
    # encoder.layers[0] is the EfficientNet-B0 backbone inside the encoder;
    # unfreeze its top 31 layers while keeping every BatchNormalization layer frozen.
    for layer in encoder.layers[0].layers[-31:]:
        if not isinstance(layer, tf.keras.layers.BatchNormalization):
            layer.trainable = True
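Putting the steps above in order (build the model, load the pretrained weights, unfreeze, recompile), a minimal sketch; build_model, the checkpoint path, and the "encoder" sub-model name are hypothetical placeholders for the actual I2S encoder-decoder:

import tensorflow as tf

model = build_model()                      # hypothetical builder for the I2S encoder-decoder
model.load_weights("pretrained_10m.h5")    # hypothetical path to the 10M-sample checkpoint

unfreeze_encoder(model.get_layer("encoder"))   # assumes the encoder sub-model is named "encoder"

# Recompile after changing trainable flags, with a small fine-tuning learning rate.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    loss="sparse_categorical_crossentropy",   # example loss; reuse whatever the original training used
)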

Before unfreezing:
After unfreezing:
Something doesn't quite seem to be working.

 

 

I'm not sure the fine-tuning will work well... it doesn't seem to be working. Why?

Ah, so this is how it has to be done.

ResNet has far too many parameters (60,192,808), so training would probably take too long.

EfficientNet-B0, on the other hand, has only about 7.2 million parameters.

 

Attempts at unfreezing the encoder

1. The model trained on the existing 10 million samples, val_loss: 0.1078
- a. Unfreeze all layers: training on a Kaggle TPU... train loss and val_loss are both steadily dropping.

- b. Unfreeze only the top 31 layers (the 7th block?): training on a Colab TPU... val_loss dropped to 0.1077, but that is essentially unchanged (see the layer-name check sketched below).

 

2. Training on 93 million samples, with every layer of the encoder's EfficientNet-B0 unfrozen from the very start, on a Kaggle TPU.

Training is going far better than expected.
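To see what the top-31 slice from 1-b actually covers, and whether it lines up with whole blocks as the Keras tips above recommend, the backbone's layer names can be printed. A minimal sketch, assuming the backbone is the stock keras.applications EfficientNetB0 (the model wrapped at encoder.layers[0]):

import tensorflow as tf

backbone = tf.keras.applications.EfficientNetB0(include_top=False, weights=None)

# Print the last 31 layer names to see which blocks they belong to
# (Keras names them block1a_*, ..., block7a_*, top_conv, top_bn, top_activation).
for layer in backbone.layers[-31:]:
    print(layer.name, type(layer).__name__)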

 

Training is going really well, so...

 

At val_loss 0.005, the public score was 0.97569.

val_loss has dropped to 0.0038; public score:

Currently training with lr=0.0001.

How can I reach 0.99?? Is it impossible with a single model? Do I need an ensemble...

Still training...

 

So which direction now: keep training, or ensemble?

 

How about also (or only) training on the SMILES that contain characters from the bottom half (roughly the 110M − 93M remaining samples) and ensembling that with the current model?
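If the ensemble route wins out, the simplest form is averaging per-step token probabilities from the two models at decode time. A minimal sketch under that assumption (model_93m_step, model_lowerhalf_step, and the per-step interface are hypothetical; a real autoregressive decoder would do this averaging inside its decoding loop):

import numpy as np

def ensemble_token_probs(prob_list):
    # Average per-token probability distributions from several models.
    # Each element is assumed to have shape [batch, vocab] for the current
    # decoding step (or [batch, seq_len, vocab] for a whole sequence).
    return np.mean(np.stack(prob_list, axis=0), axis=0)

# hypothetical usage inside a greedy decoding loop:
#   p_a = model_93m_step(images, prev_tokens)        # assumed per-step interface
#   p_b = model_lowerhalf_step(images, prev_tokens)
#   next_tokens = ensemble_token_probs([p_a, p_b]).argmax(axis=-1)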