07 Milestone Project 1: 🍔👁 Food Vision Big™¶
In the previous notebook (transfer learning part 3: scaling up) we built Food Vision mini: a transfer learning model which beat the original results of the Food101 paper with only 10% of the data.
But you might be wondering, what would happen if we used all the data?
Well, that's what we're going to find out in this notebook!
We're going to be building Food Vision Big™, using all of the data from the Food101 dataset.
Yep. All 75,750 training images and 25,250 testing images.
And guess what...
This time we've got the goal of beating DeepFood, a 2016 paper which used a Convolutional Neural Network trained for 2-3 days to achieve 77.4% top-1 accuracy.
🔑 Note: Top-1 accuracy means "accuracy for the top softmax activation value output by the model" (because softmax outputs a value for every class, but top-1 only evaluates the highest one). Top-5 accuracy means "accuracy for the top 5 softmax activation values output by the model", in other words, did the true label appear in the top 5 activation values? Top-5 accuracy scores are usually noticeably higher than top-1.
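To make the difference concrete, here's a minimal sketch using Keras' built-in SparseTopKCategoricalAccuracy metric (the prediction values below are made up purely for illustration):
# Minimal sketch: top-1 vs top-k accuracy (hypothetical prediction values)
import tensorflow as tf
y_true = [0, 1, 2] # label-encoded ground truth
y_pred = [[0.7, 0.2, 0.1], # class 0 predicted (correct)
          [0.6, 0.3, 0.1], # class 0 predicted (wrong, but the true class 1 is in the top 2)
          [0.1, 0.2, 0.7]] # class 2 predicted (correct)
top_1 = tf.keras.metrics.SparseTopKCategoricalAccuracy(k=1)
top_1.update_state(y_true, y_pred)
top_2 = tf.keras.metrics.SparseTopKCategoricalAccuracy(k=2) # using k=2 since this toy example only has 3 classes
top_2.update_state(y_true, y_pred)
print(top_1.result().numpy(), top_2.result().numpy()) # top-k accuracy is always >= top-1 accuracy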
| | 🍔👁 Food Vision Big™ | 🍔👁 Food Vision mini |
|---|---|---|
| Dataset source | TensorFlow Datasets | Preprocessed download from Kaggle |
| Train data | 75,750 images | 7,575 images |
| Test data | 25,250 images | 25,250 images |
| Mixed precision | Yes | No |
| Data loading | Performant tf.data API | TensorFlow pre-built function |
| Target results | 77.4% top-1 accuracy (beat DeepFood paper) | 50.76% top-1 accuracy (beat Food101 paper) |

Table comparing Food Vision Big (this notebook) to Food Vision mini (previous notebook).
Alongside attempting to beat the DeepFood paper, we're going to learn about two methods to significantly improve the speed of our model training:
- Prefetching
- Mixed precision training
But more on these later.
What we're going to cover¶
- Using TensorFlow Datasets to download and explore data
- Creating preprocessing function for our data
- Batching & preparing datasets for modelling (making our datasets run fast)
- Creating modelling callbacks
- Setting up mixed precision training
- Building a feature extraction model (see transfer learning part 1: feature extraction)
- Fine-tuning the feature extraction model (see transfer learning part 2: fine-tuning)
- Viewing training results on TensorBoard
How you should approach this notebook¶
You can read through the descriptions and the code (it should all run, except for the cells which error on purpose), but there's a better option.
Write all of the code yourself.
Yes. I'm serious. Create a new notebook, and rewrite each line by yourself. Investigate it, see if you can break it, and figure out why it breaks.
You don't have to write the text descriptions but writing the code yourself is a great way to get hands-on experience.
Don't worry if you make mistakes, we all do. The way to get better and make fewer mistakes is to write more code.
📖 Resource: See the full set of course materials on GitHub: https://github.com/mrdbourke/tensorflow-deep-learning
Check GPU¶
For this notebook, we're going to be doing something different.
We're going to be using mixed precision training.
Mixed precision training was introduced in TensorFlow 2.4.0 (a very new feature at the time of writing).
What does mixed precision training do?
Mixed precision training uses a combination of single precision (float32) and half-precision (float16) data types to speed up model training (up to 3x on modern GPUs).
We'll talk about this more later on but in the meantime you can read the TensorFlow documentation on mixed precision for more details.
For now, if we want to use mixed precision training, we need to make sure the GPU powering our Google Colab instance (if you're using Google Colab) is compatible.
For mixed precision training to work, you need access to a GPU with a compute capability score of 7.0+.
Google Colab offers P100, K80 and T4 GPUs, however, the P100 and K80 aren't compatible with mixed precision training.
Therefore before we proceed we need to make sure we have access to a Tesla T4 GPU in our Google Colab instance.
If you're not using Google Colab, you can find a list of various Nvidia GPU compute capabilities on Nvidia's developer website.
🔑 Note: If you run the cell below and see a P100 or K80, try going to Runtime -> Factory Reset Runtime (note: this will remove any saved variables and data from your Colab instance) and then retry to get a T4.
# If using Google Colab, this should output "Tesla T4" otherwise,
# you won't be able to use mixed precision training
!nvidia-smi -L
GPU 0: NVIDIA TITAN RTX (UUID: GPU-64b1678c-cec3-56bb-af0c-8ae69de44cbd)
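If you'd rather check the compute capability programmatically than eyeball the nvidia-smi output, TensorFlow 2.4+ can report it directly. A small sketch (assuming at least one GPU is visible to TensorFlow):
# Optional sketch: query the GPU's compute capability via TensorFlow (2.4+)
import tensorflow as tf
gpus = tf.config.list_physical_devices("GPU")
if gpus:
  details = tf.config.experimental.get_device_details(gpus[0])
  print(details.get("device_name"), details.get("compute_capability")) # e.g. ('Tesla T4', (7, 5))
else:
  print("No GPU detected")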
Since mixed precision training was introduced in TensorFlow 2.4.0, make sure you've got at least TensorFlow 2.4.0+.
# Check TensorFlow version (should be 2.4.0+)
import tensorflow as tf
print(tf.__version__)
2.6.2
Get helper functions¶
We've created a series of helper functions throughout the previous notebooks in the course. Instead of rewriting them (tedious), we'll import the helper_functions.py file from the GitHub repo.
# Get helper functions file
import os
if not os.path.exists("helper_functions.py"):
  !wget https://raw.githubusercontent.com/mrdbourke/tensorflow-deep-learning/main/extras/helper_functions.py
else:
  print("[INFO] 'helper_functions.py' already exists, skipping download.")
[INFO] 'helper_functions.py' already exists, skipping download.
# Import series of helper functions for the notebook (we've created/used these in previous notebooks)
from helper_functions import create_tensorboard_callback, plot_loss_curves, compare_historys
Use TensorFlow Datasets to Download Data¶
In previous notebooks, we've downloaded our food images (from the Food101 dataset) from Google Storage.
And this is a typical workflow you'd use if you're working on your own datasets.
However, there's another way to get datasets ready to use with TensorFlow.
For many of the most popular datasets in the machine learning world (often referred to and used as benchmarks), you can access them through TensorFlow Datasets (TFDS).
What is TensorFlow Datasets?
A place for prepared and ready-to-use machine learning datasets.
Why use TensorFlow Datasets?
- Load data already in Tensors
- Practice on well established datasets
- Experiment with differet data loading techniques (like we're going to use in this notebook)
- Experiment with new TensorFlow features quickly (such as mixed precision training)
Why not use TensorFlow Datasets?
- The datasets are static (they don't change, like your real-world datasets would)
- Might not be suited for your particular problem (but great for experimenting)
To begin using TensorFlow Datasets we can import it under the alias tfds.
# Get TensorFlow Datasets
import tensorflow_datasets as tfds
To find all of the available datasets in TensorFlow Datasets, you can use the list_builders() method.
After doing so, we can check to see if the one we're after ("food101") is present.
# List available datasets
datasets_list = tfds.list_builders() # get all available datasets in TFDS
print("food101" in datasets_list) # is the dataset we're after available?
True
Beautiful! It looks like the dataset we're after is available (note there are plenty more available but we're on Food101).
To get access to the Food101 dataset from TFDS, we can use the tfds.load() method.
In particular, we'll have to pass it a few parameters to let it know what we're after:
- name (str) : the target dataset (e.g. "food101")
- split (list, optional) : what splits of the dataset we're after (e.g. ["train", "validation"])
  - the split parameter is quite tricky. See the documentation for more.
- shuffle_files (bool) : whether or not to shuffle the files on download, defaults to False
- as_supervised (bool) : True to download data samples in tuple format ((data, label)) or False for dictionary format
- with_info (bool) : True to download dataset metadata (labels, number of samples, etc)

🔑 Note: Calling the tfds.load() method will start to download a target dataset to disk if the download=True parameter is set (default). This dataset could be 100GB+, so make sure you have space.
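If you're worried about disk space, a quick sketch like the following should report the dataset's size via the tfds.builder() API without downloading the images themselves:
# Optional sketch: check Food101's size before calling tfds.load()
food101_builder = tfds.builder("food101")
print(food101_builder.info.download_size) # approximate size of the raw download
print(food101_builder.info.dataset_size) # approximate size on disk once prepared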
# Load in the data (takes about 5-6 minutes in Google Colab)
(train_data, test_data), ds_info = tfds.load(name="food101", # target dataset to get from TFDS
                                             split=["train", "validation"], # what splits of data should we get? note: not all datasets have train, valid, test
                                             shuffle_files=True, # shuffle files on download?
                                             as_supervised=True, # download data in tuple format (sample, label), e.g. (image, label)
                                             with_info=True) # include dataset metadata? if so, tfds.load() returns tuple (data, ds_info)
Wonderful! After a few minutes of downloading, we've now got access to the entire Food101 dataset (in tensor format) ready for modelling.
Now let's get a little information from our dataset, starting with the class names.
Getting class names from a TensorFlow Datasets dataset requires downloading the "dataset_info" variable (by using the with_info=True parameter in the tfds.load() method; note: this only works for labelled datasets in TFDS).
We can access the class names of a particular dataset using the dataset_info.features attribute and accessing the names attribute of the "label" key.
# Features of Food101 TFDS
ds_info.features
FeaturesDict({ 'image': Image(shape=(None, None, 3), dtype=tf.uint8), 'label': ClassLabel(shape=(), dtype=tf.int64, num_classes=101), })
# Get class names
class_names = ds_info.features["label"].names
class_names[:10]
['apple_pie', 'baby_back_ribs', 'baklava', 'beef_carpaccio', 'beef_tartare', 'beet_salad', 'beignets', 'bibimbap', 'bread_pudding', 'breakfast_burrito']
Exploring the Food101 data from TensorFlow Datasets¶
Now we've downloaded the Food101 dataset from TensorFlow Datasets, how about we do what any good data explorer should?
In other words, "visualize, visualize, visualize".
Let's find out a few details about our dataset:
- The shape of our input data (image tensors)
- The datatype of our input data
- What the labels of our input data look like (e.g. one-hot encoded versus label-encoded)
- Do the labels match up with the class names?
To do so, let's take one sample off the training data (using the .take() method) and explore it.
# Take one sample off the training data
train_one_sample = train_data.take(1) # samples are in format (image_tensor, label)
Because we used the as_supervised=True parameter in our tfds.load() method above, data samples come in the tuple format structure (data, label), or in our case (image_tensor, label).
# What does one sample of our training data look like?
train_one_sample
<TakeDataset shapes: ((None, None, 3), ()), types: (tf.uint8, tf.int64)>
Let's loop through our single training sample and get some info from the image_tensor and label.
# Output info about our training sample
for image, label in train_one_sample:
  print(f"""
  Image shape: {image.shape}
  Image dtype: {image.dtype}
  Target class from Food101 (tensor form): {label}
  Class name (str form): {class_names[label.numpy()]}
  """)
Image shape: (512, 512, 3) Image dtype: <dtype: 'uint8'> Target class from Food101 (tensor form): 8 Class name (str form): bread_pudding
Because we set the shuffle_files=True parameter in our tfds.load() method above, running the cell above a few times will give a different result each time.
Checking these, you might notice some of the images have different shapes, for example (512, 342, 3) and (512, 512, 3) (height, width, color_channels).
Let's see what one of the image tensors from TFDS's Food101 dataset looks like.
# What does an image tensor from TFDS's Food101 look like?
image
<tf.Tensor: shape=(512, 512, 3), dtype=uint8, numpy= array([[[18, 6, 8], [18, 6, 8], [18, 6, 8], ..., [30, 15, 22], [29, 14, 21], [26, 11, 18]], [[22, 10, 12], [21, 9, 11], [20, 8, 10], ..., [35, 20, 27], [31, 16, 23], [26, 11, 18]], [[23, 13, 14], [21, 11, 12], [19, 9, 10], ..., [39, 26, 33], [36, 21, 28], [30, 15, 22]], ..., [[15, 4, 8], [15, 4, 8], [14, 5, 10], ..., [41, 9, 10], [39, 7, 8], [36, 4, 5]], [[16, 5, 9], [16, 5, 9], [16, 5, 11], ..., [42, 12, 12], [39, 9, 9], [35, 5, 5]], [[15, 4, 8], [15, 4, 8], [16, 5, 11], ..., [41, 11, 11], [39, 9, 9], [35, 5, 5]]], dtype=uint8)>
# What are the min and max values?
tf.reduce_min(image), tf.reduce_max(image)
(<tf.Tensor: shape=(), dtype=uint8, numpy=0>, <tf.Tensor: shape=(), dtype=uint8, numpy=255>)
Alright, looks like our image tensors have values between 0 & 255 (standard red, green, blue colour values) and the values are of data type uint8.
We might have to preprocess these before passing them to a neural network. But we'll handle this later.
In the meantime, let's see if we can plot an image sample.
Plot an image from TensorFlow Datasets¶
We've seen our image tensors in tensor format, now let's really adhere to our motto.
"Visualize, visualize, visualize!"
Let's plot one of the image samples using matplotlib.pyplot.imshow() and set the title to the target class name.
# Plot an image tensor
import matplotlib.pyplot as plt
plt.imshow(image)
plt.title(class_names[label.numpy()]) # add title to image by indexing on class_names list
plt.axis(False);
Delicious!
Okay, looks like the Food101 data we've got from TFDS is similar to the datasets we've been using in previous notebooks.
Now let's preprocess it and get it ready for use with a neural network.
Create preprocessing functions for our data¶
In previous notebooks, when our images were in folder format we used the tf.keras.utils.image_dataset_from_directory() method to load them in.
Doing this meant our data was loaded into a format ready to be used with our models.
However, since we've downloaded the data from TensorFlow Datasets, there are a couple of preprocessing steps we have to take before it's ready to model.
More specifically, our data is currently:
- In uint8 data type
- Comprised of all different sized tensors (different sized images)
- Not scaled (the pixel values are between 0 & 255)

Whereas, models like data to be:
- In float32 data type
- All of the same size tensors (batches require all tensors have the same shape, e.g. (224, 224, 3))
- Scaled (values between 0 & 1), also called normalized
To take care of these, we'll create a preprocess_img() function which:
- Resizes an input image tensor to a specified size using tf.image.resize()
- Converts an input image tensor's current datatype to tf.float32 using tf.cast()
🔑 Note: Pretrained EfficientNetBX models in tf.keras.applications.efficientnet (what we're going to be using) have rescaling built-in. But for many other model architectures you'll want to rescale your data (e.g. get its values between 0 & 1). This could be incorporated inside your preprocess_img() function (like the one below) or within your model as a tf.keras.layers.Rescaling layer.
# Make a function for preprocessing images
def preprocess_img(image, label, img_shape=224):
  """
  Converts image datatype from 'uint8' -> 'float32' and reshapes image to
  [img_shape, img_shape, color_channels].
  """
  image = tf.image.resize(image, [img_shape, img_shape]) # reshape to img_shape
  return tf.cast(image, tf.float32), label # return (float32_image, label) tuple
Our preprocess_img() function above takes image and label as input (even though it does nothing to the label) because our dataset is currently in the tuple structure (image, label).
Let's try our function out on a target image.
# Preprocess a single sample image and check the outputs
preprocessed_img = preprocess_img(image, label)[0]
print(f"Image before preprocessing:\n {image[:2]}...,\nShape: {image.shape},\nDatatype: {image.dtype}\n")
print(f"Image after preprocessing:\n {preprocessed_img[:2]}...,\nShape: {preprocessed_img.shape},\nDatatype: {preprocessed_img.dtype}")
Image before preprocessing: [[[18 6 8] [18 6 8] [18 6 8] ... [30 15 22] [29 14 21] [26 11 18]] [[22 10 12] [21 9 11] [20 8 10] ... [35 20 27] [31 16 23] [26 11 18]]]..., Shape: (512, 512, 3), Datatype: <dtype: 'uint8'> Image after preprocessing: [[[20.158163 8.158163 10.158163 ] [18.42347 7.6173472 9.020408 ] [15.010203 6.423469 9.285714 ] ... [26.285824 15.714351 23.07156 ] [31.091867 17.285728 24.285728 ] [28.754953 13.754952 20.754953 ]] [[18.92857 8.928571 9.928571 ] [16.214285 7.0765305 8.07653 ] [14.739796 8.571429 10.627552 ] ... [26.444029 15.872557 21.658293 ] [39.86226 26.862259 33.86226 ] [39.49479 24.494787 31.494787 ]]]..., Shape: (224, 224, 3), Datatype: <dtype: 'float32'>
Excellent! Looks like our preprocess_img() function is working as expected.
The input image gets converted from uint8 to float32 and gets reshaped from its current shape to (224, 224, 3).
How does it look?
# We can still plot our preprocessed image as long as we
# divide by 255 (for matplotlib compatibility)
plt.imshow(preprocessed_img/255.)
plt.title(class_names[label])
plt.axis(False);
All this food visualization is making me hungry. How about we start preparing to model it?
Batch & prepare datasets¶
Before we can model our data, we have to turn it into batches.
Why?
Because computing on batches is memory efficient.
We turn our data from 101,000 image tensors and labels (train and test combined) into batches of 32 image and label pairs, thus enabling it to fit into the memory of our GPU.
To do this in an effective way, we're going to be leveraging a number of methods from the tf.data API.
📖 Resource: For loading data in the most performant way possible, see the TensorFlow documentation on Better performance with the tf.data API.
Specifically, we're going to be using:
- map() - maps a predefined function to a target dataset (e.g. preprocess_img() to our image tensors)
- shuffle() - randomly shuffles the elements of a target dataset up to buffer_size (ideally, the buffer_size is equal to the size of the dataset, however, this may have implications on memory)
- batch() - turns elements of a target dataset into batches (size defined by the batch_size parameter)
- prefetch() - prepares subsequent batches of data whilst other batches of data are being computed on (improves data loading speed but costs memory)
- Extra: cache() - caches (saves them for later) elements in a target dataset, saving loading time (will only work if your dataset is small enough to fit in memory, standard Colab instances only have 12GB of memory)
Things to note:
- Can't batch tensors of different shapes (e.g. different image sizes, need to reshape images first, hence our preprocess_img() function)
- shuffle() keeps a buffer of the number of images you pass it shuffled; ideally this number would be all of the samples in your training set, however, if your training set is large, this buffer might not fit in memory (a fairly large number like 1000 or 10000 usually suffices for shuffling)
- For methods with the num_parallel_calls parameter available (such as map()), setting it to num_parallel_calls=tf.data.AUTOTUNE will parallelize preprocessing and significantly improve speed
- Can't use cache() unless your dataset can fit in memory (see the optional sketch right after this list)
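Here's the optional sketch mentioned above: a toy example of where cache() could slot in on a dataset small enough to fit in memory (we skip it for Food101 because it's far too large for a standard Colab instance):
# Optional sketch: typical ordering with cache() on a *small* dataset: map -> cache -> shuffle -> batch -> prefetch
toy_dataset = tf.data.Dataset.range(10) # hypothetical tiny dataset, just to show the ordering
toy_dataset = toy_dataset.map(lambda x: x * 2, num_parallel_calls=tf.data.AUTOTUNE) # stand-in preprocessing step
toy_dataset = toy_dataset.cache() # cache preprocessed elements in memory (only works if they fit)
toy_dataset = toy_dataset.shuffle(buffer_size=10).batch(2).prefetch(tf.data.AUTOTUNE)
print(list(toy_dataset.as_numpy_iterator()))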
Woah, the above is a lot. But once we've coded it up below, it'll start to make sense.
We're going to run through things in the following order:
Original dataset (e.g. train_data) -> map() -> shuffle() -> batch() -> prefetch() -> PrefetchDataset
This is like saying,
"Hey, map this preprocessing function across our training dataset, then shuffle a number of elements before batching them together and make sure you prepare new batches (prefetch) whilst the model is looking through the current batch".
What happens when you use prefetching (faster) versus what happens when you don't use prefetching (slower). Source: Page 422 of Hands-On Machine Learning with Scikit-Learn, Keras & TensorFlow Book by Aurélien Géron.
# Map preprocessing function to training data (and paralellize)
train_data = train_data.map(map_func=preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)
# Shuffle train_data and turn it into batches and prefetch it (load it faster)
train_data = train_data.shuffle(buffer_size=1000).batch(batch_size=32).prefetch(buffer_size=tf.data.AUTOTUNE)
# Map preprocessing function to test data
test_data = test_data.map(preprocess_img, num_parallel_calls=tf.data.AUTOTUNE)
# Turn test data into batches (don't need to shuffle)
test_data = test_data.batch(32).prefetch(tf.data.AUTOTUNE)
And now let's check out what our prepared datasets look like.
train_data, test_data
(<PrefetchDataset shapes: ((None, 224, 224, 3), (None,)), types: (tf.float32, tf.int64)>, <PrefetchDataset shapes: ((None, 224, 224, 3), (None,)), types: (tf.float32, tf.int64)>)
Excellent! Looks like our data is now in tuples of (image, label) with datatypes of (tf.float32, tf.int64), just what our model is after.
🔑 Note: You can get away without calling the prefetch() method on the end of your datasets, however, you'd probably see significantly slower data loading speeds when building a model. So most of your dataset input pipelines should end with a call to prefetch().
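If you'd like to see what prefetching buys us before we train, here's a rough sketch that simulates a slow data producer and a slow consumer (the sleeps are stand-ins for real preprocessing and training work, and exact timings will vary by machine):
# Rough sketch: overlapping data loading and "training" with prefetch() (timings are approximate)
import time
def slow_generator():
  for i in range(50):
    time.sleep(0.01) # pretend loading/preprocessing each sample takes ~10ms
    yield i
slow_ds = tf.data.Dataset.from_generator(slow_generator, output_signature=tf.TensorSpec(shape=(), dtype=tf.int64))
def time_iteration(dataset):
  start = time.time()
  for _ in dataset:
    time.sleep(0.01) # pretend each "training step" also takes ~10ms
  return time.time() - start
print(f"Without prefetch: {time_iteration(slow_ds):.2f}s") # roughly sequential
print(f"With prefetch: {time_iteration(slow_ds.prefetch(tf.data.AUTOTUNE)):.2f}s") # producer and consumer overlap, so noticeably faster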
Onward.
Create modelling callbacks¶
Since we're going to be training on a large amount of data and training could take a long time, it's a good idea to set up some modelling callbacks so we can be sure of things like our model's training logs being tracked and our model being checkpointed (saved) after various training milestones.
To do each of these we'll use the following callbacks:
- tf.keras.callbacks.TensorBoard() - allows us to keep track of our model's training history so we can inspect it later (note: we've created this callback before and imported it from helper_functions.py as create_tensorboard_callback())
- tf.keras.callbacks.ModelCheckpoint() - saves our model's progress at various intervals so we can load it and reuse it later without having to retrain it
  - Checkpointing is also helpful so we can start fine-tuning our model at a particular epoch and revert back to a previous state if fine-tuning offers no benefits
# Create TensorBoard callback (already have "create_tensorboard_callback()" from a previous notebook)
from helper_functions import create_tensorboard_callback
# Create ModelCheckpoint callback to save model's progress
checkpoint_path = "model_checkpoints/cp.ckpt" # saving weights requires ".ckpt" extension
model_checkpoint = tf.keras.callbacks.ModelCheckpoint(checkpoint_path,
                                                      monitor="val_accuracy", # save the model weights with best validation accuracy
                                                      save_best_only=True, # only save the best weights
                                                      save_weights_only=True, # only save model weights (not whole model)
                                                      verbose=0) # don't print out whether or not model is being saved
Setup mixed precision training¶
We touched on mixed precision training above.
However, we didn't quite explain it.
Normally, tensors in TensorFlow default to the float32 datatype (unless otherwise specified).
In computer science, float32 is also known as single-precision floating-point format. The 32 means it usually occupies 32 bits in computer memory.
Your GPU has limited memory, therefore it can only handle a certain number of float32 tensors at the same time.
This is where mixed precision training comes in.
Mixed precision training involves using a mix of float16 and float32 tensors to make better use of your GPU's memory.
Can you guess what float16 means?
Well, if float32 means single-precision floating-point, you might've guessed that float16 means half-precision floating-point format. And if you did, you're right! If not, no worries, now you know.
For tensors in float16 format, each element occupies 16 bits in computer memory.
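As a quick sanity check on those sizes, we can inspect the per-element byte size of each TensorFlow dtype (a tiny sketch):
# Quick sketch: bytes per element for float32 vs float16
big_tensor_float32 = tf.ones([1000], dtype=tf.float32)
big_tensor_float16 = tf.ones([1000], dtype=tf.float16)
print(big_tensor_float32.dtype, big_tensor_float32.dtype.size, "bytes per element") # 4 bytes (32 bits)
print(big_tensor_float16.dtype, big_tensor_float16.dtype.size, "bytes per element") # 2 bytes (16 bits)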
So, where does this leave us?
As mentioned before, when using mixed precision training, your model will make use of float32 and float16 data types to use less memory where possible and in turn run faster (using less memory per tensor means more tensors can be computed on simultaneously).
As a result, using mixed precision training can improve your performance on modern GPUs (those with a compute capability score of 7.0+) by up to 3x.
For a more detailed explanation, I encourage you to read through the TensorFlow mixed precision guide (I'd highly recommend at least checking out the summary).
Because mixed precision training uses a combination of float32 and float16 data types, you may see up to a 3x speedup on modern GPUs.
🔑 Note: If your GPU doesn't have a score of over 7.0+ (e.g. P100 in Colab), mixed precision won't work (see: "Supported Hardware" in the mixed precision guide for more).
📖 Resource: If you'd like to learn more about precision in computer science (the detail to which a numerical quantity is expressed by a computer), see the Wikipedia page (and accompanying resources).
Okay, enough talk, let's see how we can turn on mixed precision training in TensorFlow.
The beautiful thing is, the tensorflow.keras.mixed_precision API has made it very easy for us to get started.
First, we'll import the API and then use the set_global_policy() method to set the dtype policy to "mixed_float16".
# Turn on mixed precision training
from tensorflow.keras import mixed_precision
mixed_precision.set_global_policy(policy="mixed_float16") # set global policy to mixed precision
Nice! As long as the GPU you're using has a compute capability of 7.0+ the cell above should run without error.
Now we can check the global dtype policy (the policy which will be used by layers in our model) using the mixed_precision.global_policy() method.
mixed_precision.global_policy() # should output "mixed_float16" (if your GPU is compatible with mixed precision)
<Policy "mixed_float16">
Great, since the global dtype policy is now "mixed_float16", our model will automatically take advantage of float16 variables where possible and in turn speed up training.
Build feature extraction model¶
Callbacks: ready to roll.
Mixed precision: turned on.
Let's build a model.
Because our dataset is quite large, we're going to move towards fine-tuning an existing pretrained model (EfficientNetB0).
But before we get into fine-tuning, let's set up a feature-extraction model.
Recall, the typical order for using transfer learning is:
- Build a feature extraction model (replace the top few layers of a pretrained model)
- Train for a few epochs with lower layers frozen
- Fine-tune if necessary with multiple layers unfrozen
Before fine-tuning, it's best practice to train a feature extraction model with custom top layers.
To build the feature extraction model (covered in Transfer Learning in TensorFlow Part 1: Feature extraction), we'll:
- Use EfficientNetB0 from tf.keras.applications, pre-trained on ImageNet, as our base model
  - We'll download this without the top layers using the include_top=False parameter so we can create our own output layers
- Freeze the base model layers so we can use the pre-learned patterns the base model has found on ImageNet
- Put together the input, base model, pooling and output layers in a Functional model
- Compile the Functional model using the Adam optimizer and sparse categorical crossentropy as the loss function (since our labels aren't one-hot encoded)
- Fit the model for 3 epochs using the TensorBoard and ModelCheckpoint callbacks
🔑 Note: Since we're using mixed precision training, our model needs a separate output layer with a hard-coded dtype=float32, for example, layers.Activation("softmax", dtype=tf.float32). This ensures the outputs of our model are returned back to the float32 data type which is more numerically stable than the float16 datatype (important for loss calculations). See the "Building the model" section in the TensorFlow mixed precision guide for more.
Turning mixed precision on in TensorFlow with 3 lines of code.
from tensorflow.keras import layers
# Create base model
input_shape = (224, 224, 3)
base_model = tf.keras.applications.EfficientNetB0(include_top=False)
base_model.trainable = False # freeze base model layers
# Create Functional model
inputs = layers.Input(shape=input_shape, name="input_layer")
# Note: EfficientNetBX models have rescaling built-in but if your model didn't you could have a layer like below
# x = layers.Rescaling(1./255)(x)
x = base_model(inputs, training=False) # set base_model to inference mode only
x = layers.GlobalAveragePooling2D(name="pooling_layer")(x)
x = layers.Dense(len(class_names))(x) # want one output neuron per class
# Separate activation of output layer so we can output float32 activations
outputs = layers.Activation("softmax", dtype=tf.float32, name="softmax_float32")(x)
model = tf.keras.Model(inputs, outputs)
# Compile the model
model.compile(loss="sparse_categorical_crossentropy", # Use sparse_categorical_crossentropy when labels are *not* one-hot
optimizer=tf.keras.optimizers.Adam(),
metrics=["accuracy"])
# Check out our model
model.summary()
Model: "model_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_layer (InputLayer) [(None, 224, 224, 3)] 0 _________________________________________________________________ efficientnetb0 (Functional) (None, None, None, 1280) 4049571 _________________________________________________________________ pooling_layer (GlobalAverage (None, 1280) 0 _________________________________________________________________ dense_1 (Dense) (None, 101) 129381 _________________________________________________________________ softmax_float32 (Activation) (None, 101) 0 ================================================================= Total params: 4,178,952 Trainable params: 129,381 Non-trainable params: 4,049,571 _________________________________________________________________
Checking layer dtype policies (are we using mixed precision?)¶
Model ready to go!
Earlier we said the mixed precision API will automatically change our layers' dtype policies to whatever the global dtype policy is (in our case, "mixed_float16").
We can check this by iterating through our model's layers and printing layer attributes such as dtype and dtype_policy.
# Check the dtype_policy attributes of layers in our model
for layer in model.layers:
  print(layer.name, layer.trainable, layer.dtype, layer.dtype_policy) # Check the dtype policy of layers
input_layer True float32 <Policy "float32"> efficientnetb0 False float32 <Policy "mixed_float16"> pooling_layer True float32 <Policy "mixed_float16"> dense_1 True float32 <Policy "mixed_float16"> softmax_float32 True float32 <Policy "float32">
Going through the above we see:
- layer.name (str) : a layer's human-readable name, can be defined by the name parameter on construction
- layer.trainable (bool) : whether or not a layer is trainable (all of our layers are trainable except the efficientnetb0 layer since we set its trainable attribute to False)
- layer.dtype : the data type a layer stores its variables in
- layer.dtype_policy : the data type a layer computes in
🔑 Note: A layer can have a dtype of float32 and a dtype policy of "mixed_float16" because it stores its variables (weights & biases) in float32 (more numerically stable), however it computes in float16 (faster).
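To see this storage versus compute split directly, we can create a throwaway layer under the current global policy and inspect its variable_dtype and compute_dtype attributes (a small sketch, the layer below isn't part of our model):
# Sketch: a fresh layer created under the "mixed_float16" global policy
example_layer = tf.keras.layers.Dense(10) # throwaway layer, not added to our model
print(example_layer.dtype_policy) # <Policy "mixed_float16">
print(example_layer.variable_dtype) # float32 (what the weights get stored in)
print(example_layer.compute_dtype) # float16 (what the layer computes in)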
We can also check the same details for our model's base model.
# Check the layers in the base model and see what dtype policy they're using
for layer in model.layers[1].layers[:20]: # only check the first 20 layers to save output space
  print(layer.name, layer.trainable, layer.dtype, layer.dtype_policy)
input_2 False float32 <Policy "float32"> rescaling_1 False float32 <Policy "mixed_float16"> normalization_1 False float32 <Policy "mixed_float16"> stem_conv_pad False float32 <Policy "mixed_float16"> stem_conv False float32 <Policy "mixed_float16"> stem_bn False float32 <Policy "mixed_float16"> stem_activation False float32 <Policy "mixed_float16"> block1a_dwconv False float32 <Policy "mixed_float16"> block1a_bn False float32 <Policy "mixed_float16"> block1a_activation False float32 <Policy "mixed_float16"> block1a_se_squeeze False float32 <Policy "mixed_float16"> block1a_se_reshape False float32 <Policy "mixed_float16"> block1a_se_reduce False float32 <Policy "mixed_float16"> block1a_se_expand False float32 <Policy "mixed_float16"> block1a_se_excite False float32 <Policy "mixed_float16"> block1a_project_conv False float32 <Policy "mixed_float16"> block1a_project_bn False float32 <Policy "mixed_float16"> block2a_expand_conv False float32 <Policy "mixed_float16"> block2a_expand_bn False float32 <Policy "mixed_float16"> block2a_expand_activation False float32 <Policy "mixed_float16">
🔑 Note: The mixed precision API automatically causes layers which can benefit from using the "mixed_float16" dtype policy to use it. It also prevents layers which shouldn't use it from using it (e.g. the normalization layer at the start of the base model).
Fit the feature extraction model¶
Now that's one good looking model. Let's fit it to our data, shall we?
Three epochs should be enough for our top layers to adjust their weights enough to our food image data.
To save time per epoch, we'll also only validate on 15% of the test data.
# Turn off all warnings except for errors
tf.get_logger().setLevel('ERROR')
# Fit the model with callbacks
history_101_food_classes_feature_extract = model.fit(train_data,
                                                      epochs=3,
                                                      steps_per_epoch=len(train_data),
                                                      validation_data=test_data,
                                                      validation_steps=int(0.15 * len(test_data)),
                                                      callbacks=[create_tensorboard_callback("training_logs",
                                                                                             "efficientnetb0_101_classes_all_data_feature_extract"),
                                                                 model_checkpoint])
Saving TensorBoard log files to: training_logs/efficientnetb0_101_classes_all_data_feature_extract/20220920-100615
Epoch 1/3
2368/2368 [==============================] - 52s 22ms/step - loss: 1.6907 - accuracy: 0.5805 - val_loss: 1.2160 - val_accuracy: 0.6748 Epoch 2/3 2368/2368 [==============================] - 51s 21ms/step - loss: 1.2817 - accuracy: 0.6685 - val_loss: 1.1228 - val_accuracy: 0.6994 Epoch 3/3 2368/2368 [==============================] - 51s 21ms/step - loss: 1.1366 - accuracy: 0.7028 - val_loss: 1.0876 - val_accuracy: 0.7068
Nice, looks like our feature extraction model is performing pretty well. How about we evaluate it on the whole test dataset?
# Evaluate model (unsaved version) on whole test dataset
results_feature_extract_model = model.evaluate(test_data)
results_feature_extract_model
790/790 [==============================] - 14s 18ms/step - loss: 1.0888 - accuracy: 0.7048
[1.0887573957443237, 0.704752504825592]
And since we used the ModelCheckpoint callback, we've got a saved version of our model in the model_checkpoints directory.
Let's load it in and make sure it performs just as well.
Load and evaluate checkpoint weights¶
We can load in and evaluate our model's checkpoints by:
- Cloning our model using tf.keras.models.clone_model() to make a copy of our feature extraction model with reset weights.
- Calling the load_weights() method on our cloned model, passing it the path to where our checkpointed weights are stored.
- Calling evaluate() on the cloned model with loaded weights.
A reminder, checkpoints are helpful for when you perform an experiment such as fine-tuning your model. In the case you fine-tune your feature extraction model and find it doesn't offer any improvements, you can always revert back to the checkpointed version of your model.
# Clone the model we created (this resets all weights)
cloned_model = tf.keras.models.clone_model(model)
cloned_model.summary()
Model: "model_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= input_layer (InputLayer) [(None, 224, 224, 3)] 0 _________________________________________________________________ efficientnetb0 (Functional) (None, None, None, 1280) 4049571 _________________________________________________________________ pooling_layer (GlobalAverage (None, 1280) 0 _________________________________________________________________ dense_1 (Dense) (None, 101) 129381 _________________________________________________________________ softmax_float32 (Activation) (None, 101) 0 ================================================================= Total params: 4,178,952 Trainable params: 129,381 Non-trainable params: 4,049,571 _________________________________________________________________
# Where are our checkpoints stored?
checkpoint_path
'model_checkpoints/cp.ckpt'
# Load checkpointed weights into cloned_model
cloned_model.load_weights(checkpoint_path)
<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7fa40c318430>
Each time you make a change to your model (including loading weights), you have to recompile.
# Compile cloned_model (with same parameters as original model)
cloned_model.compile(loss="sparse_categorical_crossentropy",
optimizer=tf.keras.optimizers.Adam(),
metrics=["accuracy"])
# Evalaute cloned model with loaded weights (should be same score as trained model)
results_cloned_model_with_loaded_weights = cloned_model.evaluate(test_data)
790/790 [==============================] - 15s 17ms/step - loss: 1.7259 - accuracy: 0.5503
Our cloned model with loaded weights' results should be very close to the feature extraction model's results (if the cell below errors, something went wrong).
# Loaded checkpoint weights should return very similar results to checkpoint weights prior to saving
import numpy as np
assert np.isclose(results_feature_extract_model, results_cloned_model_with_loaded_weights).all(), "Loaded weights results are not close to original model." # check if all elements in array are close
--------------------------------------------------------------------------- AssertionError Traceback (most recent call last) /tmp/ipykernel_1467108/1538537382.py in <module> 1 # Loaded checkpoint weights should return very similar results to checkpoint weights prior to saving 2 import numpy as np ----> 3 assert np.isclose(results_feature_extract_model, results_cloned_model_with_loaded_weights).all(), "Loaded weights results are not close to original model." # check if all elements in array are close AssertionError: Loaded weights results are not close to original model.
Cloning the model preserves the dtype_policy of each layer (but doesn't preserve weights), so if we wanted to continue fine-tuning with the cloned model, we could, and it would still use the mixed precision dtype policy.
# Check the layers in the base model and see what dtype policy they're using
for layer in cloned_model.layers[1].layers[:20]: # check only the first 20 layers to save printing space
  print(layer.name, layer.trainable, layer.dtype, layer.dtype_policy)
input_2 True float32 <Policy "float32"> rescaling_1 False float32 <Policy "mixed_float16"> normalization_1 False float32 <Policy "mixed_float16"> stem_conv_pad False float32 <Policy "mixed_float16"> stem_conv False float32 <Policy "mixed_float16"> stem_bn False float32 <Policy "mixed_float16"> stem_activation False float32 <Policy "mixed_float16"> block1a_dwconv False float32 <Policy "mixed_float16"> block1a_bn False float32 <Policy "mixed_float16"> block1a_activation False float32 <Policy "mixed_float16"> block1a_se_squeeze False float32 <Policy "mixed_float16"> block1a_se_reshape False float32 <Policy "mixed_float16"> block1a_se_reduce False float32 <Policy "mixed_float16"> block1a_se_expand False float32 <Policy "mixed_float16"> block1a_se_excite False float32 <Policy "mixed_float16"> block1a_project_conv False float32 <Policy "mixed_float16"> block1a_project_bn False float32 <Policy "mixed_float16"> block2a_expand_conv False float32 <Policy "mixed_float16"> block2a_expand_bn False float32 <Policy "mixed_float16"> block2a_expand_activation False float32 <Policy "mixed_float16">
Save the whole model to file¶
We can also save the whole model using the save() method.
Since our model is quite large, you might want to save it to Google Drive (if you're using Google Colab) so you can load it in for use later.
🔑 Note: Saving to Google Drive requires mounting Google Drive (go to Files -> Mount Drive).
# ## Saving model to Google Drive (optional)
# # Create save path to drive
# save_dir = "drive/MyDrive/tensorflow_course/food_vision/07_efficientnetb0_feature_extract_model_mixed_precision/"
# # os.makedirs(save_dir) # Make directory if it doesn't exist
# # Save model
# model.save(save_dir)
We can also save it directly to our Google Colab instance.
🔑 Note: Google Colab storage is ephemeral and your model will delete itself (along with any other saved files) when the Colab session expires.
# Save model locally (note: if you're using Google Colab, your saved model will be deleted when the Colab instance terminates)
save_dir = "07_efficientnetb0_feature_extract_model_mixed_precision"
model.save(save_dir)
2022-09-20 10:10:36.539525: W tensorflow/python/util/util.cc:348] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them. /home/daniel/code/tensorflow/env/lib/python3.9/site-packages/keras/utils/generic_utils.py:494: CustomMaskWarning: Custom mask layers require a config and must override get_config. When loading, the custom mask layer must be passed to the custom_objects argument. warnings.warn('Custom mask layers require a config and must override '
And again, we can check whether or not our model saved correctly by loading it in and evaluating it.
# Load model previously saved above
loaded_saved_model = tf.keras.models.load_model(save_dir)
WARNING:absl:Importing a function (__inference_block2a_expand_activation_layer_call_and_return_conditional_losses_65022) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
(the same warning repeats for many other EfficientNetB0 layers)
WARNING:absl:Importing a function (__inference_block6c_se_reduce_layer_call_and_return_conditional_losses_67046) with ops with unsaved custom gradients. Will likely fail if a gradient is requested. WARNING:absl:Importing a function (__inference_efficientnetb0_layer_call_and_return_conditional_losses_93019) with ops with unsaved custom gradients. Will likely fail if a gradient is requested. WARNING:absl:Importing a function (__inference_block4a_expand_activation_layer_call_and_return_conditional_losses_101274) with ops with unsaved custom gradients. Will likely fail if a gradient is requested. WARNING:absl:Importing a function (__inference_block5c_activation_layer_call_and_return_conditional_losses_66510) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
Loading a SavedModel also retains all of the underlying layers' dtype_policy (we want them to be "mixed_float16").
# Check the layers in the base model and see what dtype policy they're using
for layer in loaded_saved_model.layers[1].layers[:20]: # check only the first 20 layers to save output space
print(layer.name, layer.trainable, layer.dtype, layer.dtype_policy)
input_2 True float32 <Policy "float32">
rescaling_1 False float32 <Policy "mixed_float16">
normalization_1 False float32 <Policy "mixed_float16">
stem_conv_pad False float32 <Policy "mixed_float16">
stem_conv False float32 <Policy "mixed_float16">
stem_bn False float32 <Policy "mixed_float16">
stem_activation False float32 <Policy "mixed_float16">
block1a_dwconv False float32 <Policy "mixed_float16">
block1a_bn False float32 <Policy "mixed_float16">
block1a_activation False float32 <Policy "mixed_float16">
block1a_se_squeeze False float32 <Policy "mixed_float16">
block1a_se_reshape False float32 <Policy "mixed_float16">
block1a_se_reduce False float32 <Policy "mixed_float16">
block1a_se_expand False float32 <Policy "mixed_float16">
block1a_se_excite False float32 <Policy "mixed_float16">
block1a_project_conv False float32 <Policy "mixed_float16">
block1a_project_bn False float32 <Policy "mixed_float16">
block2a_expand_conv False float32 <Policy "mixed_float16">
block2a_expand_bn False float32 <Policy "mixed_float16">
block2a_expand_activation False float32 <Policy "mixed_float16">
# Check loaded model performance (this should be the same as results_feature_extract_model)
results_loaded_saved_model = loaded_saved_model.evaluate(test_data)
results_loaded_saved_model
790/790 [==============================] - 15s 18ms/step - loss: 1.0888 - accuracy: 0.7048
[1.0887584686279297, 0.704752504825592]
# The loaded model's results should equal (or at least be very close to) the model's results prior to saving
# Note: this will only work if you've instantiated the results variables above
import numpy as np
assert np.isclose(results_feature_extract_model, results_loaded_saved_model).all()
That's what we want! Our loaded model is performing just as it should.
🔑 Note: We spent a fair bit of time making sure our model saved correctly because training on a lot of data can be time-consuming, so we want to make sure we don't have to continually train from scratch.
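As a refresher, one way to avoid losing progress during a long training run is to save checkpoints as you go. Here's a minimal sketch of a ModelCheckpoint callback you could pass to fit() (the checkpoint path and monitored metric below are assumptions for illustration, not necessarily the exact settings used earlier):
import tensorflow as tf

# Hypothetical checkpoint path (choose any directory you like)
checkpoint_path = "model_checkpoints/cp.ckpt"

# Save the best-performing weights seen so far during training
model_checkpoint = tf.keras.callbacks.ModelCheckpoint(
    checkpoint_path,
    monitor="val_accuracy",  # track validation accuracy
    save_best_only=True,     # only overwrite the checkpoint if the model improves
    save_weights_only=True,  # weights only = smaller files and faster saving
    verbose=0)

# Pass it to fit() alongside any other callbacks, for example:
# history = model.fit(train_data,
#                     epochs=3,
#                     validation_data=test_data,
#                     callbacks=[model_checkpoint])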
Preparing our model's layers for fine-tuning¶
Our feature extraction model is showing some great promise after three epochs. But since we've got so much data, it's probably worthwhile seeing what results we can get with fine-tuning (fine-tuning usually works best when you've got quite a large amount of data).
Remember our goal of beating the DeepFood paper?
They were able to achieve 77.4% top-1 accuracy on Food101 over 2-3 days of training.
Do you think fine-tuning will get us there?
Let's find out.
To start, let's load in our saved model.
🔑 Note: It's worth remembering that a traditional workflow for fine-tuning is to freeze a pre-trained base model and then train only the output layers for a few iterations so their weights can be updated in line with your custom data (feature extraction). Then unfreeze some (or all) of the layers in the base model and continue training until the model stops improving.
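To make that workflow concrete, here's a minimal sketch of the freeze/unfreeze pattern (the layer index, number of unfrozen layers, loss and learning rate below are illustrative assumptions, not the exact setup we'll use):
# Assumes `model` is a Keras model whose second layer (index 1) is the EfficientNetB0 base model
base_model = model.layers[1]

# Stage 1: feature extraction - freeze the whole base model, train only the output layers
base_model.trainable = False

# Stage 2: fine-tuning - unfreeze some (or all) of the base model's layers...
base_model.trainable = True
for layer in base_model.layers[:-10]:  # e.g. keep everything except the last 10 layers frozen
    layer.trainable = False

# ...recompile with a lower learning rate so the pretrained weights aren't updated too aggressively...
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # 10x lower than the Adam default
              metrics=["accuracy"])

# ...then continue training (with fit()) until the model stops improving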
Like all good cooking shows, I've saved a model I prepared earlier (the feature extraction model from above) to Google Storage.
We can download it to make sure we're using the same model going forward.
# Download the saved model from Google Storage
!wget https://storage.googleapis.com/ztm_tf_course/food_vision/07_efficientnetb0_feature_extract_model_mixed_precision.zip
--2022-09-20 10:11:48--  https://storage.googleapis.com/ztm_tf_course/food_vision/07_efficientnetb0_feature_extract_model_mixed_precision.zip
Resolving storage.googleapis.com (storage.googleapis.com)... 142.250.76.112, 142.250.204.16, 172.217.167.80, ...
Connecting to storage.googleapis.com (storage.googleapis.com)|142.250.76.112|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 16976857 (16M) [application/zip]
Saving to: '07_efficientnetb0_feature_extract_model_mixed_precision.zip.1'

07_efficientnetb0_f 100%[===================>]  16.19M  11.7MB/s    in 1.4s

2022-09-20 10:11:51 (11.7 MB/s) - '07_efficientnetb0_feature_extract_model_mixed_precision.zip.1' saved [16976857/16976857]
# Unzip the SavedModel downloaded from Google Storage
!mkdir downloaded_gs_model # create new dir to store downloaded feature extraction model
!unzip 07_efficientnetb0_feature_extract_model_mixed_precision.zip -d downloaded_gs_model
mkdir: cannot create directory 'downloaded_gs_model': File exists
Archive:  07_efficientnetb0_feature_extract_model_mixed_precision.zip
replace downloaded_gs_model/07_efficientnetb0_feature_extract_model_mixed_precision/variables/variables.data-00000-of-00001? [y]es, [n]o, [A]ll, [N]one, [r]ename: ^C
# Load and evaluate downloaded GS model
loaded_gs_model = tf.keras.models.load_model("downloaded_gs_model/07_efficientnetb0_feature_extract_model_mixed_precision")
WARNING:absl:Importing a function (__inference_block1a_activation_layer_call_and_return_conditional_losses_158253) with ops with unsaved custom gradients. Will likely fail if a gradient is requested.
...(the same "unsaved custom gradients" warning repeats for each of the remaining imported EfficientNetB0 functions in the SavedModel)...