Cat Classification with Logistic Regression

Introduction

In this post I implement a logistic regression model that classifies images as cat or non-cat using a small cat dataset. The goal is a simple end-to-end classifier: preprocess the images, train the parameters with gradient descent, and measure how accurately the model predicts whether an image contains a cat.

Importing Libraries

First, we need to import the necessary Python libraries:

# coding=utf-8
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image  # Pillow, used later to load and resize custom test images
from lr_utils import load_dataset  # Custom utility to load the cat dataset

Loading and Preprocessing Data

The dataset consists of training and test images of cats and non-cats. We need to preprocess the data to make it suitable for our model:

# Load the dataset
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

# Get dimensions of the dataset
m_train = train_set_x_orig.shape[0]  # Number of training examples
m_test = test_set_x_orig.shape[0]    # Number of test examples
num_px = train_set_x_orig.shape[1]   # Height/width of each image (64x64)

# Flatten the images: each (num_px, num_px, 3) image becomes one column of a (num_px*num_px*3, m) array
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T

# Normalize the data by dividing by 255 (pixel values range from 0 to 255)
train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.
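
A quick shape check confirms the reshape did what we expect: each image of shape (num_px, num_px, 3) becomes one column of length num_px*num_px*3 (12288 for 64x64 RGB images).

# Sanity check: one image per column after flattening
print("train_set_x_flatten shape:", train_set_x_flatten.shape)  # (num_px*num_px*3, m_train)
print("test_set_x_flatten shape: ", test_set_x_flatten.shape)   # (num_px*num_px*3, m_test)
print("train_set_y shape:        ", train_set_y.shape)          # (1, m_train)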

General Architecture of the Learning Algorithm

The logistic regression model follows these steps:
1. Define the model structure (number of input features)
2. Initialize model parameters
3. Loop (the corresponding equations are written out after this list):
   - Compute the current cost (forward propagation)
   - Compute the current gradients (backward propagation)
   - Update the parameters (gradient descent)
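
For a training set of m examples stacked as columns of X, these steps correspond to the following equations, which the propagate function below implements:

\[z = w^T X + b, \qquad A = \sigma(z) = \frac{1}{1 + e^{-z}}\]

\[J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log a^{(i)} + \left(1 - y^{(i)}\right) \log\left(1 - a^{(i)}\right) \right]\]

\[\frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T, \qquad \frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left(a^{(i)} - y^{(i)}\right)\]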

1. Define the Sigmoid Function

The sigmoid function maps any real number to a value between 0 and 1:

\[\mathrm{sigmoid}(x) = \frac{1}{1 + e^{-x}}\]

def sigmoid(x):
    """
    Compute the sigmoid of x
    :param x: A scalar or numpy array of any size
    :return: s -- sigmoid(x)
    """
    s = 1 / (1 + np.exp(-x))
    return s
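
A quick check on a couple of values (sigmoid(0) is exactly 0.5):

print(sigmoid(np.array([0, 2])))  # [0.5        0.88079708]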

2. Initialize Parameters

We initialize the weight vector to zeros and the bias to zero. For logistic regression (unlike for multi-layer neural networks), zero initialization works fine:

def initialize_with_zeros(dim):
    """
    Initialize w as a zero vector of shape (dim, 1) and b as 0
    :param dim: The size of the w vector
    :return: w -- initialized vector of shape (dim, 1)
             b -- initialized scalar (0)
    """
    w = np.zeros((dim, 1))
    b = 0
    return w, b
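
For example, initializing with dim=2 gives a 2x1 zero vector and a zero bias:

w, b = initialize_with_zeros(2)
print(w)  # [[0.]
          #  [0.]]
print(b)  # 0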

3. Propagate (Forward and Backward Propagation)

This function computes the cost function and gradients:

def propagate(w, b, X, Y):
    """
    Implement forward and backward propagation for the logistic regression cost function
    :param w: Weights, shape (num_px*num_px*3, 1)
    :param b: Bias, scalar
    :param X: Data, shape (num_px*num_px*3, number of examples)
    :param Y: True labels, shape (1, number of examples)
    :return: grads -- dictionary containing dw and db
             cost -- negative log-likelihood cost
    """
    m = X.shape[1]  # Number of examples

    # Forward propagation
    A = sigmoid(np.dot(w.T, X) + b)  # Activation
    cost = -(np.dot(Y, np.log(A).T) + np.dot(1 - Y, np.log(1 - A).T)) / m  # Cost (a (1, 1) array)
    cost = float(np.squeeze(cost))  # Reduce to a plain scalar so it can be formatted and plotted

    # Backward propagation
    dw = np.dot(X, (A - Y).T) / m
    db = np.sum(A - Y) / m

    grads = {"dw": dw, "db": db}

    return grads, cost
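
As a sanity check on a made-up toy example (these values are arbitrary and not from the dataset), the gradient dw should have the same shape as w, and the cost should come back as a plain float:

# Toy inputs: 2 features, 3 examples
w_t = np.array([[1.], [2.]])
b_t = 2.0
X_t = np.array([[1., 2., -1.], [3., 4., -3.2]])
Y_t = np.array([[1, 0, 1]])
grads_t, cost_t = propagate(w_t, b_t, X_t, Y_t)
print(grads_t["dw"].shape)  # (2, 1), same shape as w_t
print(type(cost_t))         # <class 'float'>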

4. Gradient Descent Optimization

We optimize the parameters using gradient descent:

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    """
    Optimize w and b by running a gradient descent algorithm
    :param w: Initial weights
    :param b: Initial bias
    :param X: Data
    :param Y: True labels
    :param num_iterations: Number of iterations of optimization loop
    :param learning_rate: Learning rate of the gradient descent update rule
    :param print_cost: If True, print the cost every 100 iterations
    :return: params -- dictionary containing weights w and bias b
             grads -- dictionary containing gradients dw and db
             costs -- list of costs recorded during optimization
    """
    costs = []

    for i in range(num_iterations):
        # Run forward/backward propagation
        grads, cost = propagate(w, b, X, Y)

        # Retrieve gradients
        dw = grads["dw"]
        db = grads["db"]

        # Update parameters
        w = w - learning_rate * dw
        b = b - learning_rate * db

        # Record (and optionally print) the cost every 100 iterations
        if i % 100 == 0:
            costs.append(cost)
            if print_cost:
                print(f"Cost after iteration {i}: {cost:.6f}")

    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}

    return params, grads, costs
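
Reusing the toy example from the propagate check above, a short run should show the recorded cost going down (the exact numbers depend on the inputs, so only the trend matters):

params_t, grads_t, costs_t = optimize(w_t, b_t, X_t, Y_t, num_iterations=200, learning_rate=0.01)
print(costs_t)  # the cost recorded at iteration 100 should be lower than at iteration 0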

5. Prediction Function

Once we have optimized parameters, we use them to make predictions:

def predict(w, b, X):
    """
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)
    :param w: Weights
    :param b: Bias
    :param X: Data to predict
    :return: Y_prediction -- a numpy array (vector) containing predictions (0/1)
    """
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    A = sigmoid(np.dot(w.T, X) + b)

    for i in range(A.shape[1]):
        Y_prediction[0, i] = 1 if A[0, i] > 0.5 else 0

    return Y_prediction
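
Continuing with the same toy example, predictions come back as a (1, m) array of 0/1 labels:

print(predict(params_t["w"], params_t["b"], X_t))  # a (1, 3) array containing only 0.0 and 1.0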

6. Combine All Functions into a Model

This function integrates all the previous components:

def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=False):
    """
    Build the logistic regression model by calling the helper functions
    :param X_train: Training data
    :param Y_train: Training labels
    :param X_test: Test data
    :param Y_test: Test labels
    :param num_iterations: Number of iterations for gradient descent
    :param learning_rate: Learning rate
    :param print_cost: If True, print cost every 100 iterations
    :return: d -- dictionary containing information about the model
    """
    # Initialize parameters
    w, b = initialize_with_zeros(X_train.shape[0])

    # Optimize parameters
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)

    # Retrieve optimized parameters
    w = parameters["w"]
    b = parameters["b"]

    # Predict on test and training sets
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)

    # Calculate accuracy
    train_accuracy = 100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100
    test_accuracy = 100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100

    print(f"train accuracy: {train_accuracy:.2f} %")
    print(f"test accuracy: {test_accuracy:.2f} %")

    d = {
        "costs": costs,
        "Y_prediction_test": Y_prediction_test,
        "Y_prediction_train": Y_prediction_train,
        "w": w,
        "b": b,
        "learning_rate": learning_rate,
        "num_iterations": num_iterations
    }

    return d
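
The accuracy lines work because both the labels and the predictions are 0/1 row vectors, so np.abs(prediction - label) equals 1 exactly on the misclassified examples. A small hypothetical example (not actual model output):

# 1 mistake out of 4 predictions -> 75% accuracy
y_true = np.array([[1, 0, 1, 1]])
y_pred = np.array([[1., 0., 0., 1.]])
print(100 - np.mean(np.abs(y_pred - y_true)) * 100)  # 75.0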

Testing Different Learning Rates

The choice of learning rate significantly affects convergence:

def test_learning_rates():
    learning_rates = [0.01, 0.001, 0.0001]
    models = {}

    for lr in learning_rates:
        print(f"Learning rate: {lr}")
        models[str(lr)] = model(
            train_set_x, train_set_y, test_set_x, test_set_y,
            num_iterations=1500, learning_rate=lr, print_cost=False
        )
        print("\n" + "="*50 + "\n")

    # Plot cost vs iterations for different learning rates
    for lr in learning_rates:
        plt.plot(
            np.squeeze(models[str(lr)]["costs"]), 
            label=f"LR={lr}"
        )

    plt.ylabel("Cost")
    plt.xlabel("Iterations (hundreds)")
    plt.legend(loc="upper right")
    plt.show()

Predicting Custom Images

To test with your own images:

def predict_custom_image(image_path, model_info):
    """
    Predict if a custom image contains a cat
    :param image_path: Path to the image
    :param model_info: Dictionary containing model parameters
    """
    # Load the image with Pillow and resize it to the model's input size
    # (scipy.ndimage.imread and scipy.misc.imresize were removed from recent SciPy releases)
    image = np.array(Image.open(image_path).convert("RGB").resize((num_px, num_px)))

    # Flatten to a column vector and scale to [0, 1], matching the training preprocessing
    x = image.reshape((1, num_px * num_px * 3)).T / 255.
    prediction = predict(model_info["w"], model_info["b"], x)

    plt.imshow(image)  # show the resized image, not the flattened vector
    plt.show()
    class_name = classes[int(np.squeeze(prediction))].decode("utf-8")
    print(f"Prediction: {class_name}")

Main Execution

if __name__ == "__main__":
    # Train the model
    model_info = model(
        train_set_x, train_set_y, test_set_x, test_set_y,
        num_iterations=1000, learning_rate=0.005, print_cost=True
    )

    # Test with custom image (uncomment to use)
    # predict_custom_image("images/cat.jpg", model_info)

    # Test different learning rates (uncomment to use)
    # test_learning_rates()

Expected Output

When running the main function, you should see:
- Cost values decreasing over iterations
- Training and test accuracy metrics
- Visualization of cost vs iterations for different learning rates (if tested)

References

  • Deep Learning Specialization by Andrew Ng
  • Custom lr_utils library for dataset loading

Notes

This implementation is a basic logistic regression approach for binary classification. For better performance, you might consider using neural networks or optimizing the learning rate further. The code provided assumes the lr_utils library is available to load the cat dataset.


Explanation of Key Concepts

  1. Sigmoid Function: Maps any value to a probability between 0 and 1
  2. Gradient Descent: Minimizes the cost function by updating parameters
  3. Cost Function: Measures the error between predicted and actual values
  4. Propagation: Combines forward (prediction) and backward (gradient calculation) steps
  5. Hyperparameters: Key parameters like learning rate and number of iterations

This implementation provides a solid foundation for understanding logistic regression and binary classification.

Xiaoye