Foreword
In Chapter 2, we learned how to use PaddlePaddle for addition calculations, and that small example showed us the basic usage of PaddlePaddle. In this chapter, we use PaddlePaddle for a very common introductory deep learning example: linear regression. We will train a linear regression model twice, first on a custom dataset and then through the dataset interface provided by PaddlePaddle.
Using Custom Data
In this section, we will introduce the entire process of linear regression, from defining the network to training with custom data, and finally verifying the network’s prediction ability.
First, import the PaddlePaddle library and some utility libraries.
import paddle.fluid as fluid
import paddle
import numpy as np
Define a simple linear network. The network structure is: Input Layer -> Hidden Layer -> Output Layer, with a total of 2 layers (the input layer is not counted as a network layer). Specifically, it consists of a fully connected layer with 100 neurons (activation function: ReLU) followed by a fully connected layer with 1 neuron (no activation function). The input layer is defined using fluid.layers.data() with a shape of [13], which matches the 13 attributes of the Boston Housing dataset; the custom dataset in this section uses the same dimension for consistency.
# Define a simple linear network
x = fluid.layers.data(name='x', shape=[13], dtype='float32')
hidden = fluid.layers.fc(input=x, size=100, act='relu')
net = fluid.layers.fc(input=hidden, size=1, act=None)
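To make the layer arithmetic concrete, here is a minimal pure-Python sketch of what the two fully connected layers compute for a single sample. The helper names `relu` and `fc` and the tiny weight values are illustrative assumptions, not PaddlePaddle APIs; the real network has 13 inputs and 100 hidden units, and its weights are initialized by the framework.

```python
def relu(values):
    """Element-wise ReLU: negative entries become 0."""
    return [max(0.0, v) for v in values]


def fc(inputs, weights, biases):
    """Fully connected layer: out[j] = sum_i inputs[i] * weights[i][j] + biases[j]."""
    return [
        sum(x * row[j] for x, row in zip(inputs, weights)) + biases[j]
        for j in range(len(biases))
    ]


# Hypothetical tiny network: 2 inputs -> 3 hidden units (ReLU) -> 1 output.
w_hidden = [[1.0, -1.0, 0.5],
            [0.0, 1.0, -0.5]]
b_hidden = [0.0, 0.0, 0.0]
w_out = [[1.0], [1.0], [1.0]]
b_out = [0.5]

sample = [2.0, 1.0]
hidden = relu(fc(sample, w_hidden, b_hidden))  # second unit is clipped to 0 by ReLU
prediction = fc(hidden, w_out, b_out)          # no activation on the output layer
print(hidden, prediction)
```

The second hidden unit has a negative pre-activation, so ReLU sets it to zero before the output layer sums the hidden values.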
Before defining the loss function, we clone the main program (fluid.default_main_program()) as a prediction program for later use. Cloning at this point with for_test=True gives a program that contains only the forward network, without the loss function or the optimizer's update operations, which keeps inference separate from training. The main program holds everything we define: the network and, once they are added, the loss, the backward computation, and the parameter updates. PaddlePaddle builds and runs this program for us, so we only need to focus on network construction and training.
# Clone the main program for later prediction
test_program = fluid.default_main_program().clone(for_test=True)
Next, define the loss function. We use fluid.layers.data() again, this time for the target value that corresponds to each input. Since this is a regression task, we use the mean squared error (MSE) loss. fluid.layers.square_error_cost() returns the squared error of every sample in the batch, so we average it with fluid.layers.mean() to obtain a single loss value.
# Define the loss function
y = fluid.layers.data(name='y', shape=[1], dtype='float32')
cost = fluid.layers.square_error_cost(input=net, label=y)
avg_cost = fluid.layers.mean(cost)
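The two-step loss (per-sample squared error, then the batch mean) can be checked by hand. The sketch below is plain Python, not PaddlePaddle code; the function names and the prediction values are made up for illustration.

```python
def square_error(predictions, labels):
    """Per-sample squared error, one value per sample."""
    return [(p - t) ** 2 for p, t in zip(predictions, labels)]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


predictions = [2.5, 5.0, 8.0, 9.5]  # hypothetical network outputs
labels = [3.0, 5.0, 7.0, 9.0]       # targets from y = 2 * x + 1 for x = 1..4

per_sample = square_error(predictions, labels)
avg = mean(per_sample)
print(per_sample, avg)
```

Here the per-sample errors are 0.25, 0.0, 1.0 and 0.25, and their mean, 0.375, plays the role of avg_cost.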
Then, define the optimization method. We use stochastic gradient descent (SGD) with a learning rate of 0.01. PaddlePaddle provides various optimization algorithms (e.g., Momentum, Adagrad), and we can choose based on the task requirements.
# Define the optimization method
optimizer = fluid.optimizer.SGDOptimizer(learning_rate=0.01)
opts = optimizer.minimize(avg_cost)
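As a rough illustration of what one gradient descent update does, the following pure-Python sketch applies a single step to a one-feature model y_hat = w * x + b under the MSE loss. It is a simplified stand-in, not the optimizer's implementation: the function names are hypothetical, the model has two parameters instead of a full network, and the starting values of zero are an assumption. Only the learning rate of 0.01 comes from the text.

```python
def mse(w, b, xs, ys):
    """Mean squared error of y_hat = w * x + b over the batch."""
    return sum(((w * x + b) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def sgd_step(w, b, xs, ys, learning_rate):
    """One gradient descent step on the MSE of y_hat = w * x + b."""
    n = len(xs)
    errors = [(w * x + b) - y for x, y in zip(xs, ys)]
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / n
    grad_b = 2.0 * sum(errors) / n
    return w - learning_rate * grad_w, b - learning_rate * grad_b


xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # y = 2 * x + 1

w, b = 0.0, 0.0                  # assumed starting point
loss_before = mse(w, b, xs, ys)
w, b = sgd_step(w, b, xs, ys, learning_rate=0.01)
loss_after = mse(w, b, xs, ys)
print(loss_before, loss_after, w, b)
```

Each parameter moves against its gradient, scaled by the learning rate, so the loss after the step is lower than before.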
Create an executor that runs on the CPU, then use it to run the startup program, which initializes the parameters.
# Create a CPU executor
place = fluid.CPUPlace()
exe = fluid.Executor(place)
# Initialize parameters
exe.run(fluid.default_startup_program())
Define custom training data. Each sample has 13 features, but only the first feature x carries information: the target follows the rule y = 2 * x + 1, and the remaining 12 features are zeros, kept only so the shape matches the input layer. We also define a test sample to verify the prediction result.
# Define training and test data
x_data = np.array([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[3.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[4.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
[5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]).astype('float32')
y_data = np.array([[3.0], [5.0], [7.0], [9.0], [11.0]]).astype('float32')
test_data = np.array([[6.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]]).astype('float32')
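The same rows can also be generated from the rule instead of being typed out. This is a small sketch in plain Python; `make_sample` and `NUM_FEATURES` are hypothetical names introduced here, and the lists would still need to be converted with np.array(...).astype('float32') before being fed to the network.

```python
NUM_FEATURES = 13  # matches the shape of the input layer


def make_sample(x_value):
    """Return (features, target): x in the first slot, zeros elsewhere, target = 2 * x + 1."""
    features = [float(x_value)] + [0.0] * (NUM_FEATURES - 1)
    target = [2.0 * x_value + 1.0]
    return features, target


samples = [make_sample(v) for v in range(1, 6)]
x_rows = [features for features, _ in samples]
y_rows = [target for _, target in samples]
print(y_rows)
```

Generating the data this way makes it easy to add more samples or change the rule without editing each row.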
Train the model for 10 epochs. The loss value should decrease as training progresses, indicating convergence.
# Start training for 10 epochs
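for pass_id in range(10):
    # Each pass feeds all five samples as a single batch
    train_cost = exe.run(program=fluid.default_main_program(),
                         feed={'x': x_data, 'y': y_data},
                         fetch_list=[avg_cost])
    # train_cost[0] is a one-element array, so index it to get the scalar
    print("Pass:%d, Cost:%0.5f" % (pass_id, train_cost[0][0]))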
Output:
Pass:0, Cost:65.61024
Pass:1, Cost:26.62285
Pass:2, Cost:7.78299
Pass:3, Cost:0.59838
Pass:4, Cost:0.02781
Pass:5, Cost:0.02600
Pass:6, Cost:0.02548
Pass:7, Cost:0.02496
Pass:8, Cost:0.02446
Pass:9, Cost:0.02396
After training, use the test program to predict the test data. The input test_data should yield a result close to 13 (since y = 2 * 6 + 1 = 13).
# Start prediction
result = exe.run(program=test_program,
                 feed={'x': test_data},
                 fetch_list=[net])
print("When x is 6.0, y is: %0.5f" % result[0][0][0])
Output:
When x is 6.0, y is: 13.23651
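To put a number on "close to 13", the following short sketch compares the printed prediction with the value the rule gives. It is plain arithmetic on the output shown above; the variable names are illustrative.

```python
predicted = 13.23651      # value printed by the run above
expected = 2 * 6.0 + 1    # the rule y = 2 * x + 1 at x = 6

abs_error = abs(predicted - expected)
rel_error = abs_error / expected
print("abs error: %.5f, relative error: %.2f%%" % (abs_error, 100 * rel_error))
```

The prediction is off by roughly 0.24, which is under 2% of the target value.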
Training with the Boston Housing Dataset
In this section, we train the network defined above on the Boston Housing dataset, which PaddlePaddle provides.
PaddlePaddle provides convenient dataset APIs, and the uci_housing dataset is commonly used for linear regression tasks. uci_housing.train() and uci_housing.test() return readers for the training and test splits, and paddle.batch groups their samples into batches of 128.
import paddle.dataset.uci_housing as uci_housing
# Load Boston Housing dataset using PaddlePaddle API
train_reader = paddle.batch(reader=uci_housing.train(), batch_size=128)
test_reader = paddle.batch(reader=uci_housing.test(), batch_size=128)
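The reader pattern used here can be sketched in a few lines of plain Python: a reader is a function that returns an iterator over samples, and batching wraps it in another function that yields lists of samples. The code below is a simplified stand-in written for illustration, not the library's implementation; `batch` and `toy_reader` are hypothetical names, and the `drop_last` flag is included on the assumption that it mirrors the option of the same name in paddle.batch.

```python
def batch(reader, batch_size, drop_last=False):
    """Wrap a sample reader so that it yields lists of up to batch_size samples."""
    def batch_reader():
        current = []
        for sample in reader():
            current.append(sample)
            if len(current) == batch_size:
                yield current
                current = []
        # Emit the final, smaller batch unless the caller asked to drop it
        if current and not drop_last:
            yield current
    return batch_reader


def toy_reader():
    """Yield ten (features, label) samples following y = 2 * x + 1."""
    for i in range(10):
        yield ([float(i)], [2.0 * i + 1.0])


batches = list(batch(toy_reader, batch_size=4)())
sizes = [len(b) for b in batches]
print(sizes)
```

With ten samples and a batch size of 4, the last batch holds only two samples, which is why the final batch of an epoch is often smaller than batch_size.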
Define a data feeder that maps each sample's features and label to the network's input layers x and y.
# Define data feeder
feeder = fluid.DataFeeder(place=place, feed_list=[x, y])
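Conceptually, the feeder turns a batch, which is a list of (features, label) tuples, into a dictionary keyed by the input layer names. The sketch below shows that reshaping with plain Python lists; `feed_batch` is a hypothetical helper, and the real DataFeeder additionally converts the data into tensors on the given place.

```python
def feed_batch(batch, names):
    """Transpose a list of per-sample tuples into {name: values for every sample}."""
    columns = zip(*batch)
    return {
        name: [list(field) for field in column]
        for name, column in zip(names, columns)
    }


# Hypothetical batch of two samples with two features each
toy_batch = [([0.1, 0.2], [1.0]),
             ([0.3, 0.4], [2.0])]
feed = feed_batch(toy_batch, names=['x', 'y'])
print(feed)
```

The order of the names must match the order of the fields in each sample, just as feed_list=[x, y] matches the (features, label) order of the dataset.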
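Train the model on the dataset one batch at a time and evaluate on the test set after each epoch. Evaluation runs test_program, which contains only the forward network, so the test batches never update the parameters; the test loss is then computed from the predictions with NumPy.
# Train and test the model
for pass_id in range(10):
    # Training phase: each run performs one SGD update on a batch
    train_cost = 0
    for batch_id, data in enumerate(train_reader()):
        train_cost = exe.run(program=fluid.default_main_program(),
                             feed=feeder.feed(data),
                             fetch_list=[avg_cost])
    # Report the cost of the last training batch in this pass
    print("Pass:%d, Training Cost:%0.5f" % (pass_id, train_cost[0][0]))
    # Testing phase: forward pass only, so no parameters are updated
    test_cost = 0
    for batch_id, data in enumerate(test_reader()):
        test_x = np.array([sample[0] for sample in data]).astype('float32')
        test_y = np.array([sample[1] for sample in data]).astype('float32').reshape(-1, 1)
        pred = exe.run(program=test_program,
                       feed={'x': test_x},
                       fetch_list=[net])[0]
        # Mean squared error of the predictions on this test batch
        test_cost = np.mean(np.square(pred - test_y))
    print("Pass:%d, Testing Cost:%0.5f" % (pass_id, test_cost))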
Output:
Pass:0, Training Cost:35.61119
Pass:0, Testing Cost:92.18690
Pass:1, Training Cost:121.56089
Pass:1, Testing Cost:51.94175
...
Pass:9, Training Cost:21.51291
Pass:9, Testing Cost:21.71775
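The loop above reports the cost of the last batch in each pass, which can jump around when batches differ. One common alternative is to average the per-batch costs over the whole pass, weighting each batch by its size. The sketch below shows the arithmetic in plain Python; the function name and the numbers are made up for illustration and are not results from this model.

```python
def weighted_mean_cost(batch_costs, batch_sizes):
    """Average per-batch mean costs, weighting each batch by its sample count."""
    total = sum(cost * size for cost, size in zip(batch_costs, batch_sizes))
    return total / sum(batch_sizes)


# Hypothetical per-batch mean costs and batch sizes (made-up numbers)
costs = [20.0, 30.0, 10.0]
sizes = [128, 128, 64]
epoch_cost = weighted_mean_cost(costs, sizes)
print(epoch_cost)
```

Weighting by batch size matters because the final batch of a pass is usually smaller than the others, so a plain average of the batch costs would give its samples too much influence.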
Conclusion
This chapter completes the introduction to linear regression using PaddlePaddle. We have learned to define networks, load data, train models, and evaluate performance. With this foundation, you are ready to explore more advanced deep learning tasks.