1. Keras Cheat Sheet
- 1. Keras Cheat Sheet
- 1.1 Getting Started
- 1.2 Model Building
- 1.3 Layers
- 1.4 Activation Functions
- 1.5 Loss Functions
- 1.6 Optimizers
- 1.7 Metrics
- 1.8 Model Compilation
- 1.9 Training
- 1.10 Evaluation
- 1.11 Prediction
- 1.12 Saving and Loading Models
- 1.13 Regularization
- 1.14 Transfer Learning
- 1.15 Callbacks
- 1.16 Custom Training Loops
- 1.17 Distributed Training
- 1.18 Hyperparameter Tuning
- 1.19 TensorFlow Datasets
- 1.20 TensorFlow Hub
- 1.21 TensorFlow Lite
- 1.22 Tips and Best Practices
This cheat sheet provides a comprehensive overview of the Keras deep learning library, covering essential concepts, code snippets, and best practices for building, training, and evaluating models efficiently. It aims to be a one-stop reference for common tasks.
1.1 Getting Started
1.1.1 Installation
pip install tensorflow # Installs TensorFlow with Keras
# or
pip install keras # Installs standalone Keras (Keras 3 is multi-backend: TensorFlow, JAX, or PyTorch)
1.1.2 Importing Keras
import tensorflow as tf # If using TensorFlow backend
from tensorflow import keras
# or
import keras # If using standalone Keras
1.2 Model Building
1.2.1 Sequential Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential([
Dense(128, activation='relu', input_shape=(784,)),
Dense(10, activation='softmax')
])
1.2.2 Functional API
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
inputs = Input(shape=(784,))
x = Dense(128, activation='relu')(inputs)
outputs = Dense(10, activation='softmax')(x)
model = Model(inputs=inputs, outputs=outputs)
1.2.3 Model Subclassing
import tensorflow as tf
class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.dense1 = tf.keras.layers.Dense(128, activation='relu')
        self.dense2 = tf.keras.layers.Dense(10, activation='softmax')

    def call(self, inputs):
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()
1.3 Layers
1.3.1 Core Layers
- Dense: Fully connected layer.
- Activation: Applies an activation function.
- Dropout: Applies dropout regularization.
- Flatten: Flattens the input.
- Input: Creates an input tensor.
- Reshape: Reshapes the input.
- Embedding: Turns positive integers (indexes) into dense vectors of fixed size.
1.3.2 Convolutional Layers
- Conv1D: 1D convolution layer.
- Conv2D: 2D convolution layer.
- Conv3D: 3D convolution layer.
- SeparableConv2D: Depthwise separable 2D convolution layer.
- DepthwiseConv2D: Depthwise 2D convolution layer.
- Conv2DTranspose: Transposed convolution layer (deconvolution).
1.3.3 Pooling Layers
- MaxPooling1D, MaxPooling2D, MaxPooling3D: Max pooling layers.
- AveragePooling1D, AveragePooling2D, AveragePooling3D: Average pooling layers.
- GlobalMaxPooling1D, GlobalMaxPooling2D, GlobalMaxPooling3D: Global max pooling layers.
- GlobalAveragePooling1D, GlobalAveragePooling2D, GlobalAveragePooling3D: Global average pooling layers.
1.3.4 Recurrent Layers
- LSTM: Long Short-Term Memory layer.
- GRU: Gated Recurrent Unit layer.
- SimpleRNN: Fully-connected RNN where the output is fed back to the input.
- Bidirectional: Wraps another recurrent layer to run it in both directions.
- ConvLSTM2D: Convolutional LSTM layer (2D convolutions inside the recurrence).
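A minimal recurrent sketch for text classification, assuming a vocabulary of 10,000 tokens (the sizes are illustrative):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, Bidirectional, LSTM, Dense

model = Sequential([
    Embedding(input_dim=10000, output_dim=64),  # map token indices to dense vectors
    Bidirectional(LSTM(32)),                    # process the sequence in both directions
    Dense(1, activation='sigmoid')              # binary classification head
])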
1.3.5 Normalization Layers
- BatchNormalization: Applies batch normalization.
- LayerNormalization: Applies layer normalization.
1.3.6 Advanced Activation Layers
- LeakyReLU: Leaky version of a Rectified Linear Unit.
- PReLU: Parametric Rectified Linear Unit.
- ELU: Exponential Linear Unit.
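These activations are applied as their own layers after a linear layer; a short illustrative sketch:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, LeakyReLU

model = Sequential([
    Dense(64, input_shape=(784,)),  # no activation argument here
    LeakyReLU(),                    # advanced activation applied as a separate layer
    Dense(10, activation='softmax')
])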
1.3.7 Embedding Layers
- Embedding: Turns positive integers (indexes) into dense vectors of fixed size.
1.3.8 Merge Layers
- Add: Adds inputs.
- Multiply: Multiplies inputs.
- Average: Averages inputs.
- Maximum: Takes the maximum of inputs.
- Concatenate: Concatenates inputs.
- Dot: Performs a dot product between inputs.
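Merge layers are most natural with the Functional API; a sketch of a two-branch model joined with Concatenate (the input shapes are illustrative):
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model

input_a = Input(shape=(32,))
input_b = Input(shape=(64,))
merged = Concatenate()([Dense(16, activation='relu')(input_a),
                        Dense(16, activation='relu')(input_b)])  # join the two branches
outputs = Dense(1, activation='sigmoid')(merged)
model = Model(inputs=[input_a, input_b], outputs=outputs)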
1.3.9 Writing Custom Layers
import tensorflow as tf
class MyCustomLayer(tf.keras.layers.Layer):
    def __init__(self, units=32):
        super(MyCustomLayer, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='zeros',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b
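Once defined, the layer can be called like any built-in layer; the input shape below is only an illustration:
layer = MyCustomLayer(units=64)
output = layer(tf.ones((2, 16)))  # weights are built on first call; output shape is (2, 64)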
1.4 Activation Functions
- relu: Rectified Linear Unit.
- sigmoid: Sigmoid function.
- tanh: Hyperbolic tangent function.
- softmax: Softmax function (for multi-class classification).
- elu: Exponential Linear Unit.
- selu: Scaled Exponential Linear Unit.
- linear: Linear (identity) activation.
- LeakyReLU: Leaky Rectified Linear Unit.
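Activations are usually passed to layers as strings (e.g. activation='relu'), but they can also be applied directly through tf.keras.activations; a quick sketch with illustrative values:
import tensorflow as tf

x = tf.constant([-2.0, 0.0, 3.0])
tf.keras.activations.relu(x)                          # -> [0., 0., 3.]
tf.keras.activations.sigmoid(x)                       # element-wise sigmoid
tf.keras.activations.softmax(tf.reshape(x, (1, 3)))   # softmax expects a batched (2D) input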
1.5 Loss Functions
1.5.1 Regression Losses
- MeanSquaredError: Mean squared error.
- MeanAbsoluteError: Mean absolute error.
- MeanAbsolutePercentageError: Mean absolute percentage error.
- MeanSquaredLogarithmicError: Mean squared logarithmic error.
- Huber: Huber loss.
1.5.2 Classification Losses
- BinaryCrossentropy: Binary cross-entropy (for binary classification).
- CategoricalCrossentropy: Categorical cross-entropy (for multi-class classification with one-hot encoded labels).
- SparseCategoricalCrossentropy: Sparse categorical cross-entropy (for multi-class classification with integer labels).
- Hinge: Hinge loss (for "maximum-margin" classification).
- KLDivergence: Kullback-Leibler divergence loss.
- Poisson: Poisson loss.
1.5.3 Custom Loss Functions
import tensorflow as tf
def my_custom_loss(y_true, y_pred):
    squared_difference = tf.square(y_true - y_pred)
    return tf.reduce_mean(squared_difference, axis=-1)  # Note the `axis=-1`
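The custom loss function can then be passed directly to compile(); this sketch assumes a regression model named model:
model.compile(optimizer='adam', loss=my_custom_loss, metrics=['mae'])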
1.6 Optimizers
- SGD: Stochastic Gradient Descent.
- Adam: Adaptive Moment Estimation.
- RMSprop: Root Mean Square Propagation.
- Adagrad: Adaptive Gradient Algorithm.
- Adadelta: Adaptive Delta.
- Adamax: Variant of Adam based on the infinity norm.
- Nadam: Nesterov Adam optimizer.
- Ftrl: Follow The Regularized Leader optimizer.
1.6.1 Optimizer Configuration
from tensorflow.keras.optimizers import Adam
optimizer = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-07)
1.7 Metrics
- Accuracy: How often predictions match labels.
- BinaryAccuracy: Binary accuracy.
- CategoricalAccuracy: Categorical accuracy.
- SparseCategoricalAccuracy: Sparse categorical accuracy.
- TopKCategoricalAccuracy: Computes how often targets are in the top K predictions.
- MeanAbsoluteError: Mean absolute error.
- MeanSquaredError: Mean squared error.
- Precision: Precision (fraction of positive predictions that are correct).
- Recall: Recall (fraction of actual positives that are detected).
- AUC: Area Under the Curve.
- F1Score: F1 score (harmonic mean of precision and recall).
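Metrics can be passed to compile() as strings or as metric objects; a sketch assuming a binary classifier named model:
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy', tf.keras.metrics.Precision(), tf.keras.metrics.AUC()])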
1.7.1 Custom Metrics
import tensorflow as tf
class MyCustomMetric(tf.keras.metrics.Metric):
    def __init__(self, name='my_custom_metric', **kwargs):
        super(MyCustomMetric, self).__init__(name=name, **kwargs)
        self.sum = self.add_weight(name='sum', initializer='zeros')
        self.count = self.add_weight(name='count', initializer='zeros')

    def update_state(self, y_true, y_pred, sample_weight=None):
        values = tf.abs(y_true - y_pred)
        if sample_weight is not None:
            sample_weight = tf.cast(sample_weight, self.dtype)
            values = tf.multiply(values, sample_weight)
        self.sum.assign_add(tf.reduce_sum(values))
        self.count.assign_add(tf.cast(tf.size(y_true), self.dtype))

    def result(self):
        return self.sum / self.count

    def reset_state(self):
        self.sum.assign(0.0)
        self.count.assign(0.0)
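The custom metric is then passed to compile() like any built-in metric (assuming a regression model named model):
model.compile(optimizer='adam', loss='mse', metrics=[MyCustomMetric()])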
1.8 Model Compilation
model.compile(optimizer='adam',
loss='categorical_crossentropy',
metrics=['accuracy'])
1.9 Training
1.9.1 Training with NumPy Arrays
import numpy as np
data = np.random.random((1000, 784))
labels = np.random.randint(10, size=(1000,))
one_hot_labels = tf.keras.utils.to_categorical(labels, num_classes=10)
model.fit(data, one_hot_labels, epochs=10, batch_size=32)
1.9.2 Training with tf.data.Dataset
import tensorflow as tf
dataset = tf.data.Dataset.from_tensor_slices((data, one_hot_labels))
dataset = dataset.batch(32)
model.fit(dataset, epochs=10)
1.9.3 Validation
val_data = np.random.random((100, 784))
val_labels = np.random.randint(10, size=(100,))
one_hot_val_labels = tf.keras.utils.to_categorical(val_labels, num_classes=10)
model.fit(data, one_hot_labels, epochs=10, batch_size=32,
validation_data=(val_data, one_hot_val_labels))
1.9.4 Callbacks
- ModelCheckpoint: Saves the model at certain intervals.
- EarlyStopping: Stops training when a monitored metric has stopped improving.
- TensorBoard: Enables visualization of metrics and more.
- ReduceLROnPlateau: Reduces the learning rate when a metric has stopped improving.
- CSVLogger: Streams epoch results to a CSV file.
from tensorflow.keras.callbacks import ModelCheckpoint, EarlyStopping, TensorBoard
checkpoint_callback = ModelCheckpoint(filepath='./checkpoints/model.{epoch:02d}-{val_loss:.2f}.h5',
save_best_only=True,
monitor='val_loss',
verbose=1)
early_stopping_callback = EarlyStopping(monitor='val_loss', patience=3)
tensorboard_callback = TensorBoard(log_dir='./logs', histogram_freq=1)
model.fit(data, one_hot_labels, epochs=10, batch_size=32,
validation_data=(val_data, one_hot_val_labels),
callbacks=[checkpoint_callback, early_stopping_callback, tensorboard_callback])
1.10 Evaluation
loss, accuracy = model.evaluate(val_data, one_hot_val_labels)
print('Loss:', loss)
print('Accuracy:', accuracy)
1.11 Prediction
predictions = model.predict(val_data)
predicted_classes = np.argmax(predictions, axis=1)
1.12 Saving and Loading Models
1.12.1 Save the Entire Model
model.save('my_model.h5') # Saves the model architecture, weights, and optimizer state
1.12.2 Load the Entire Model
from tensorflow.keras.models import load_model
loaded_model = load_model('my_model.h5')
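Recent Keras releases (roughly TensorFlow 2.13+ / Keras 3) recommend the native .keras format over the legacy HDF5 format shown above:
model.save('my_model.keras')                   # native Keras format
loaded_model = load_model('my_model.keras')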
1.12.3 Save Model Architecture as JSON
json_string = model.to_json()
# Save the JSON string to a file
with open('model_architecture.json', 'w') as f:
    f.write(json_string)
1.12.4 Load Model Architecture from JSON
from tensorflow.keras.models import model_from_json
# Load the JSON string from a file
with open('model_architecture.json', 'r') as f:
    json_string = f.read()
model = model_from_json(json_string)
1.12.5 Save Model Weights
model.save_weights('model_weights.h5')
1.12.6 Load Model Weights
model.load_weights('model_weights.h5')
1.13 Regularization
1.13.1 L1 and L2 Regularization
from tensorflow.keras import regularizers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential([
Dense(128, activation='relu', input_shape=(784,),
kernel_regularizer=regularizers.l1(0.01), # L1 regularization
bias_regularizer=regularizers.l2(0.01)), # L2 regularization
Dense(10, activation='softmax')
])
1.13.2 Dropout
from tensorflow.keras.layers import Dropout
model = Sequential([
Dense(128, activation='relu', input_shape=(784,)),
Dropout(0.5), # Dropout layer with 50% dropout rate
Dense(10, activation='softmax')
])
1.13.3 Batch Normalization
from tensorflow.keras.layers import BatchNormalization
model = Sequential([
Dense(128, activation='relu', input_shape=(784,)),
BatchNormalization(), # Batch normalization layer
Dense(10, activation='softmax')
])
1.14 Transfer Learning
1.14.1 Feature Extraction
from tensorflow.keras.applications import VGG16
# Load pre-trained VGG16 model without the top (classification) layer
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# Freeze the weights of the base model
base_model.trainable = False
# Add custom classification layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense
model = Sequential([
base_model,
Flatten(),
Dense(256, activation='relu'),
Dense(1, activation='sigmoid') # Binary classification
])
1.14.2 Fine-Tuning
# Unfreeze some of the layers in the base model
base_model.trainable = True
for layer in base_model.layers[:-4]:  # Freeze all layers except the last 4
    layer.trainable = False
# Recompile the model
model.compile(optimizer=tf.keras.optimizers.RMSprop(learning_rate=1e-5),
loss='binary_crossentropy',
metrics=['accuracy'])
# Continue training
model.fit(train_data, train_labels, epochs=10, validation_data=(val_data, val_labels))
1.15 Callbacks
1.15.1 ModelCheckpoint
from tensorflow.keras.callbacks import ModelCheckpoint
checkpoint_callback = ModelCheckpoint(
filepath='best_model.h5',
monitor='val_loss',
save_best_only=True,
verbose=1
)
1.15.2 EarlyStopping
from tensorflow.keras.callbacks import EarlyStopping
early_stopping_callback = EarlyStopping(
monitor='val_loss',
patience=5,
restore_best_weights=True,
verbose=1
)
1.15.3 ReduceLROnPlateau
from tensorflow.keras.callbacks import ReduceLROnPlateau
reduce_lr_callback = ReduceLROnPlateau(
monitor='val_loss',
factor=0.1,
patience=3,
verbose=1
)
1.15.4 TensorBoard
from tensorflow.keras.callbacks import TensorBoard
tensorboard_callback = TensorBoard(
log_dir='./logs',
histogram_freq=1,
write_graph=True,
write_images=True
)
1.16 Custom Training Loops
import tensorflow as tf
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
loss_fn = tf.keras.losses.CategoricalCrossentropy()
metric_fn = tf.keras.metrics.CategoricalAccuracy()
@tf.function
def train_step(images, labels):
    with tf.GradientTape() as tape:
        predictions = model(images, training=True)
        loss = loss_fn(labels, predictions)
    gradients = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    metric_fn.update_state(labels, predictions)
    return loss

epochs = 10
for epoch in range(epochs):
    for images, labels in dataset:
        loss = train_step(images, labels)
    print(f"Epoch {epoch+1}, Loss: {loss.numpy():.4f}, Accuracy: {metric_fn.result().numpy():.4f}")
    metric_fn.reset_state()
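A matching evaluation step follows the same pattern without the gradient tape; val_dataset is assumed to be a batched tf.data.Dataset:
val_metric = tf.keras.metrics.CategoricalAccuracy()

@tf.function
def test_step(images, labels):
    predictions = model(images, training=False)  # inference mode
    val_metric.update_state(labels, predictions)

for images, labels in val_dataset:
    test_step(images, labels)
print(f"Validation accuracy: {val_metric.result().numpy():.4f}")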
1.17 Distributed Training
1.17.1 MirroredStrategy
import tensorflow as tf
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    model = Sequential([
        Dense(128, activation='relu', input_shape=(784,)),
        Dense(10, activation='softmax')
    ])
    model.compile(optimizer='adam',
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

model.fit(data, one_hot_labels, epochs=10, batch_size=32)
1.18 Hyperparameter Tuning
1.18.1 Using Keras Tuner
Installation:
pip install keras-tuner
Define a Hypermodel:
from tensorflow import keras
from keras_tuner import RandomSearch

def build_model(hp):
    model = keras.Sequential()
    model.add(keras.layers.Flatten(input_shape=(28, 28)))
    model.add(keras.layers.Dense(
        hp.Choice('units', [32, 64, 128]),
        activation='relu'))
    model.add(keras.layers.Dense(10, activation='softmax'))
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model
Run the Tuner:
tuner = RandomSearch(
build_model,
objective='val_accuracy',
max_trials=5,
executions_per_trial=3,
directory='my_dir',
project_name='my_project')
tuner.search_space_summary()
tuner.search(x_train, y_train,
epochs=10,
validation_data=(x_val, y_val))
best_model = tuner.get_best_models(num_models=1)[0]
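Besides the best model, the tuner can also return the winning hyperparameter values:
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
print(best_hps.get('units'))  # best value found for the 'units' choice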
1.19 TensorFlow Datasets
1.19.1 Installation
pip install tensorflow-datasets
1.19.2 Usage
import tensorflow as tf
import tensorflow_datasets as tfds
(ds_train, ds_test), ds_info = tfds.load(
'mnist',
split=['train', 'test'],
shuffle_files=True,
as_supervised=True,
with_info=True,
)
def normalize_img(image, label):
    """Normalizes images: `uint8` -> `float32`."""
    return tf.cast(image, tf.float32) / 255., label
ds_train = ds_train.map(normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
ds_train = ds_train.cache()
ds_train = ds_train.shuffle(ds_info.splits['train'].num_examples)
ds_train = ds_train.batch(128)
ds_train = ds_train.prefetch(tf.data.AUTOTUNE)
ds_test = ds_test.map(normalize_img, num_parallel_calls=tf.data.AUTOTUNE)
ds_test = ds_test.batch(128)
ds_test = ds_test.cache()
ds_test = ds_test.prefetch(tf.data.AUTOTUNE)
model.fit(ds_train, epochs=12, validation_data=ds_test)
1.20 TensorFlow Hub
1.20.1 Installation
pip install tensorflow-hub
1.20.2 Usage
import tensorflow_hub as hub
embedding = "https://tfhub.dev/google/nnlm-en-dim128/2"
hub_layer = hub.KerasLayer(embedding, input_shape=[], dtype=tf.string, trainable=True)
model = tf.keras.Sequential()
model.add(hub_layer)
model.add(tf.keras.layers.Dense(16, activation='relu'))
model.add(tf.keras.layers.Dense(1))
model.compile(optimizer='adam',
loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),
metrics=['accuracy'])
1.21 TensorFlow Lite
1.21.1 Convert to TensorFlow Lite
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
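The converted model can be exercised in Python with the TFLite interpreter; the input shape and dtype depend on your model, so a zero-filled dummy input is used here:
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path='model.tflite')
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input matching the model's expected shape and dtype
input_data = np.zeros(input_details[0]['shape'], dtype=input_details[0]['dtype'])
interpreter.set_tensor(input_details[0]['index'], input_data)
interpreter.invoke()
output = interpreter.get_tensor(output_details[0]['index'])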
1.22 Tips and Best Practices
- Use virtual environments to isolate project dependencies.
- Use meaningful names for layers, models, and variables.
- Follow the DRY (Don't Repeat Yourself) principle.
- Write unit tests to ensure code quality.
- Use a consistent coding style.
- Document your code.
- Use a version control system (e.g., Git).
- Use a GPU for training if possible.
- Monitor your training progress with TensorBoard.
- Use callbacks to save the best model and stop training early.
- Use regularization techniques to prevent overfitting.
- Experiment with different optimizers and learning rates.
- Use data augmentation to improve model performance.
- Use transfer learning to leverage pre-trained models.
- Use a TPU for faster training.
- Use a distributed training strategy for large datasets.
- Use a profiler to identify performance bottlenecks.
- Use model compression techniques such as quantization, pruning, and distillation to reduce model size and complexity.
- Use a model deployment tool to deploy your model to production.
- Use a model monitoring tool to monitor your model's performance in production.