Immutable (tf.constant) vs. mutable (tf.Variable); note the different capitalization:
tf.math.reduce_max(): finds the max along the specified dimension(s).
# immutable vs variable
import tensorflow as tf

var = tf.Variable([[1, 2], [3, 4]])
con = tf.constant([[1, 2], [3, 4]])

# Calculate the max along certain dimension(s). Doc says:
# If axis is None, all dimensions are reduced, and a tensor with a single element is returned.
# By default, keepdims is False, so this will collapse the matrix
tf.math.reduce_max(var, axis=0)  # see <tf.Tensor: shape=(2,), dtype=int32, numpy=array([3, 4], dtype=int32)>
tf.math.reduce_max(var)          # see <tf.Tensor: shape=(), dtype=int32, numpy=4>
tf.math.argmax: returns the index with the largest value along an axis of a tensor. By default, it uses axis=0.
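A minimal argmax sketch (this example tensor is my own, not from the original notes):
t = tf.constant([[1, 2, 3],
                 [4, 5, 6]])
tf.math.argmax(t)          # along axis=0 (down the columns): [1, 1, 1]
tf.math.argmax(t, axis=1)  # along axis=1 (across each row): [2, 2]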
# 2x3
t = tf.constant([[1, 2, 3], [4, 5, 6]])
# reshaping into 3x2
reshaped_tensor = tf.reshape(t, [3, 2])

batch_size, H, W = 3, 2, 3
tensor = tf.random.normal([batch_size, H, W])
tf.reshape(tensor, [-1, H * W])
-1 makes TensorFlow infer that dimension's size from the sizes of the remaining dimensions.
Squeeze: tf.squeeze(tensor, axis) removes dimensions of size 1.
t = tf.constant([[[1, 2, 3], [4, 5, 6]]])  # shape (1, 2, 3)
# squeezing removes the size-1 leading dim, giving shape (2, 3)
reshaped_tensor = tf.squeeze(t)
Element-wise Operations
Broadcasting: this works like NumPy broadcasting.
tf.constant([0.9, 0.3, 0.4, 0.5, 0.1]) < 0.4  # see [False, True, False, False, True]
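Another small broadcasting sketch (shapes and values are my own illustration):
a = tf.constant([[1.0], [2.0], [3.0]])     # shape (3, 1)
b = tf.constant([10.0, 20.0, 30.0, 40.0])  # shape (4,)
a + b  # broadcast to shape (3, 4): each row of `a` is added to all of `b`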
Masking: this is similar to NumPy as well; its NumPy equivalent is tensor[mask]. Note that from the doc:
0 < dim(mask) = K <= dim(tensor), and mask's shape must match the first K dimensions of tensor's shape. We then have: boolean_mask(tensor, mask)[i, j1, ..., jd] = tensor[i1, ..., iK, j1, ..., jd], where (i1, ..., iK) is the ith True entry of mask (row-major order). The axis argument can be used with mask to indicate the axis to mask from.
So masking collapses the first K (matched) dimensions into one.
import numpy as np

# 3x2
tensor = [[1, 2], [3, 4], [5, 6]]
# 3
mask = np.array([True, False, True])
# the mask's single dimension matches the tensor's 1st dimension, so that
# matching dimension is collapsed into one while the other dims remain
tf.boolean_mask(tensor, mask)  # [[1, 2], [5, 6]]
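The axis argument mentioned above can be used the same way; a minimal sketch (values are my own):
# mask along axis=1 (the columns) instead of axis=0
col_mask = np.array([True, False])
tf.boolean_mask(tensor, col_mask, axis=1)  # [[1], [3], [5]]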
Element-wise multiplications: * and tf.multiply()
tensor1 = tf.constant([0, 1, 2, 3])  # 1-D example
tensor2 = tf.constant([1, 2, 3, 4])  # 1-D example
tensor1 * tensor2
tf.multiply(tensor1, tensor2)  # see <tf.Tensor: shape=(4,), dtype=int32, numpy=array([ 0,  2,  6, 12], dtype=int32)>
# tensor1 .* tensor2           # Invalid: Python has no .* operator
Element-wise exponentiation:
tensor**2
Sum of all elements (tf.reduce_sum is a reduction rather than an element-wise op):
tf.reduce_sum(tensor)
Gradient
tf.GradientTape(): records operations on tensors so gradients can be computed easily afterwards. GradientTape is a context manager; operations executed inside it are recorded onto the tape.
import tensorflow as tf

# Example variables
x = tf.constant(3.0)
w = tf.Variable(2.0)

# Gradient computation using GradientTape
with tf.GradientTape() as g:
    # `w` is watched automatically since it's a tf.Variable
    y = w * x + 2  # Compute some operation

# Compute the gradient of `y` with respect to `w`
grad = g.gradient(y, w)
print(f"Gradient of y with respect to w: {grad}")
Applying gradients to an image: the learning rate $\alpha$ should already be set when the optimizer is initialized.
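A minimal sketch of this idea, assuming the image is stored as a tf.Variable and `compute_loss` is a hypothetical loss function:
image = tf.Variable(tf.random.uniform([1, 160, 160, 3]))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.01)  # alpha is set here

with tf.GradientTape() as tape:
    loss = compute_loss(image)  # hypothetical loss computed on the image

grad = tape.gradient(loss, image)
optimizer.apply_gradients([(grad, image)])  # updates the image in place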
- `prefetch()` prepares upcoming batches in a background thread and buffers them in memory, so that when `next(iter(train_dataset))` is called the next batch is already (or nearly) ready while the current one is being consumed. This overlaps data preparation with training while keeping memory usage bounded by the buffer size. `AUTOTUNE` lets tf.data pick an appropriate buffer size dynamically; see the one-liner below.
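A one-line usage sketch (the dataset name is assumed):
train_dataset = train_dataset.prefetch(buffer_size=tf.data.AUTOTUNE)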
Data Augmentation
from tensorflow.keras.layers.experimental.preprocessing import RandomFlip, RandomRotation
from tensorflow.keras import Sequential

def data_augmenter():
    data_augmentation = Sequential()
    # RandomFlip, RandomRotation
    data_augmentation.add(RandomFlip('horizontal'))
    data_augmentation.add(RandomRotation(0.2))  # 20% of a full circle
    return data_augmentation
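A quick usage sketch (the random image batch is my own placeholder):
augmenter = data_augmenter()
batch = tf.random.uniform([1, 160, 160, 3])  # placeholder image batch
augmented = augmenter(batch, training=True)  # same shape, randomly flipped/rotated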
def alpaca_model(image_shape=IMG_SIZE, data_augmentation=data_augmenter()):
    ''' Define a tf.keras model for binary classification out of the MobileNetV2 model

    Arguments:
        image_shape -- Image width and height
        data_augmentation -- data augmentation function
    Returns:
        tf.keras.model
    '''
    input_shape = image_shape + (3,)

    ### START CODE HERE
    base_model_path = "imagenet_base_model/without_top_mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_160_no_top.h5"
    base_model = tf.keras.applications.MobileNetV2(input_shape=input_shape,
                                                   include_top=False,  # <== Important! we are not using the output classifier
                                                   weights=base_model_path)

    # freeze the base model by making it non-trainable
    base_model.trainable = False

    # create the input layer (same as the MobileNetV2 input size)
    # This is a symbolic placeholder that will get train_dataset when model.fit() is called
    inputs = tf.keras.Input(shape=input_shape)

    # apply data augmentation to the inputs
    x = data_augmentation(inputs)

    # data preprocessing using the same weights the model was trained on
    x = preprocess_input(x)

    # set training to False to avoid updating the batch norm layer statistics
    x = base_model(x, training=False)

    # add the new binary classification layers:
    # use global avg pooling to summarize the info in each channel
    x = tf.keras.layers.GlobalAveragePooling2D()(x)
    # include dropout with probability of 0.2 to avoid overfitting
    x = tf.keras.layers.Dropout(0.2)(x)
    # use a prediction layer with one neuron (a binary classifier only needs one)
    outputs = tf.keras.layers.Dense(1, activation=None)(x)
    ### END CODE HERE

    model = tf.keras.Model(inputs, outputs)
    return model
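For context, the fine-tuning snippet in the next section references `base_learning_rate`, `initial_epoch`, and `history`; a minimal sketch of the initial training pass that could produce them (the values here are assumptions, not from the original notes):
base_learning_rate = 0.001  # assumed value
initial_epoch = 5           # assumed value

model2 = alpaca_model(IMG_SIZE, data_augmenter())
model2.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
               loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
               metrics=['accuracy'])
history = model2.fit(train_dataset, validation_data=validation_dataset, epochs=initial_epoch)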
Model Refining
The idea of refining (fine-tuning) a pretrained model is to train with a small step size, and mainly on the later layers. Early layers capture low-level features such as edges, while later layers capture high-level features such as hair or ears, so it is the high-level features that we want to adapt to the new data. To do this, unfreeze the whole base model, pick a layer to fine-tune from, and re-freeze every layer before it, as in the snippet below.
base_model = model2.layers[4]  # the MobileNetV2 base model inside model2 (here at index 4)
base_model.trainable = True
print("number of layers: ", len(base_model.layers))

# fine-tune from this layer onwards; freeze everything before it
TRAINABLE_FROM = 120
for l in base_model.layers[:TRAINABLE_FROM]:
    l.trainable = False

loss_function = tf.keras.losses.BinaryCrossentropy(from_logits=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=0.1 * base_learning_rate)  # small step size
metrics = ['accuracy']

model2.compile(loss=loss_function, optimizer=optimizer, metrics=metrics)

fine_tune_epochs = 5
total_epoch = initial_epoch + fine_tune_epochs
history_fine = model2.fit(train_dataset,
                          epochs=total_epoch,
                          initial_epoch=history.epoch[-1],
                          validation_data=validation_dataset)
Prediction
Both predict() and predict_on_batch(x_train) output an np.ndarray.
from_logits=True is more numerically stable. The logits live on the whole real line, which floating-point numbers can only represent with finite precision; squashing them through a sigmoid/softmax first and then taking the log can underflow or overflow, whereas the loss can combine the two steps safely when it receives the raw logits.
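A minimal sketch of the two equivalent setups (the model names are my own):
# Option 1: model outputs raw logits, loss applies the sigmoid internally (more stable)
logits_model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation=None)])
loss_a = tf.keras.losses.BinaryCrossentropy(from_logits=True)

# Option 2: model outputs probabilities, loss takes them as-is
prob_model = tf.keras.Sequential([tf.keras.layers.Dense(1, activation='sigmoid')])
loss_b = tf.keras.losses.BinaryCrossentropy(from_logits=False)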
Misc
HDF5 is “Hierarchical Data Format 5”, a data format designed for compressing, chunking, and storing complex data hierarchies. It’s similar to a filesystem: you can create “groups” within an HDF5 file much like creating folders, and datasets are similar to files. HDF5 allows access from multiple processes and is supported by multiple languages, such as C, C++, and Python.
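A small sketch using the h5py library (the file and dataset names are my own):
import h5py
import numpy as np

with h5py.File('example.h5', 'w') as f:
    grp = f.create_group('images')                        # like creating a folder
    grp.create_dataset('train_x', data=np.zeros((10, 28, 28)),
                       compression='gzip', chunks=True)   # compressed, chunked storage

with h5py.File('example.h5', 'r') as f:
    x = f['images/train_x'][:]                            # read the dataset back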