In data analysis and numerical computing, NumPy is one of Python's core tools. It provides an efficient multi-dimensional array object and a large collection of functions for array operations and statistical analysis. Today, we'll focus on three of the most commonly used statistical functions in NumPy: mean (arithmetic average), sum (total), and max (maximum), to help you get started with basic statistical analysis.
1. Quickly Import NumPy and Create Arrays
First, we need to import the NumPy library and create some sample arrays to demonstrate function usage. We typically use np as an alias for NumPy:
import numpy as np
Next, create simple arrays:
- 1D array: [1, 2, 3, 4, 5]
- 2D array (2 rows, 3 columns): [[1, 2, 3], [4, 5, 6]]
# Example of a 1D array
arr_1d = np.array([1, 2, 3, 4, 5])
# Example of a 2D array
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])
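Before computing any statistics, it can help to confirm how NumPy sees each array. The following is a minimal sketch using the standard shape and ndim attributes; the arrays are redefined so the snippet runs on its own.

```python
import numpy as np

arr_1d = np.array([1, 2, 3, 4, 5])
arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

# shape lists the length of each axis; ndim is the number of axes
print(arr_1d.shape, arr_1d.ndim)  # (5,) 1
print(arr_2d.shape, arr_2d.ndim)  # (2, 3) 2
```

For the 2D array, axis 0 has length 2 (the rows) and axis 1 has length 3 (the columns), which is the numbering the axis parameter refers to in the sections that follow.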
2. mean Function: Calculate Arithmetic Mean
Purpose: The mean function calculates the arithmetic mean of all elements in an array.
Basic Usage
- For 1D arrays, it directly computes the overall mean;
- For multi-dimensional arrays, use the axis parameter to specify the calculation direction (row or column).
Key Parameter axis
- axis=0: Calculate column-wise (vertical direction, i.e., the mean of elements in each column);
- axis=1: Calculate row-wise (horizontal direction, i.e., the mean of elements in each row);
- If axis is not specified, the mean of all elements in the entire array is calculated.
Examples
# 1. Mean of 1D array
print("1D array:", arr_1d)
print("Overall mean:", np.mean(arr_1d)) # Result: 3.0
# 2. Mean of 2D array
print("\n2D array:\n", arr_2d)
print("Column-wise mean (axis=0):", np.mean(arr_2d, axis=0)) # [2.5, 3.5, 4.5]
print("Row-wise mean (axis=1):", np.mean(arr_2d, axis=1)) # [2.0, 5.0]
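To see what the axis argument does for mean, the same results can be rebuilt by hand as a sum divided by a count. This is only an illustrative sketch: it uses np.sum (covered in the next section), and the variable names are made up for this example.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

n_rows, n_cols = arr_2d.shape  # (2, 3)

# Column-wise mean: each column's sum divided by the number of rows
col_mean_by_hand = np.sum(arr_2d, axis=0) / n_rows
# Row-wise mean: each row's sum divided by the number of columns
row_mean_by_hand = np.sum(arr_2d, axis=1) / n_cols

print(col_mean_by_hand)  # [2.5 3.5 4.5]
print(row_mean_by_hand)  # [2. 5.]

# Both agree with np.mean along the same axis
print(np.allclose(col_mean_by_hand, np.mean(arr_2d, axis=0)))  # True
print(np.allclose(row_mean_by_hand, np.mean(arr_2d, axis=1)))  # True
```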
3. sum Function: Calculate Element Sum
Purpose: The sum function calculates the sum of all elements in an array.
Basic Usage
Similar to mean, use the axis parameter to specify row-wise or column-wise summation.
Key Parameter axis
- axis=0: Sum of elements in each column;
- axis=1: Sum of elements in each row.
Examples
# 1. Sum of 1D array
print("1D array:", arr_1d)
print("Total sum:", np.sum(arr_1d)) # Result: 15
# 2. Sum of 2D array
print("\n2D array:\n", arr_2d)
print("Column-wise sum (axis=0):", np.sum(arr_2d, axis=0)) # [5, 7, 9]
print("Row-wise sum (axis=1):", np.sum(arr_2d, axis=1)) # [6, 15]
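A quick way to sanity-check axis-wise sums is to confirm that they add up to the overall total. The sketch below does this for the 2D example array; the variable names are illustrative only.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

total = np.sum(arr_2d)              # no axis: every element is added
col_sums = np.sum(arr_2d, axis=0)   # one sum per column: [5 7 9]
row_sums = np.sum(arr_2d, axis=1)   # one sum per row: [6 15]

print(total)             # 21
print(np.sum(col_sums))  # 21
print(np.sum(row_sums))  # 21
```

Whichever axis is summed first, the grand total is the same, because every element is counted exactly once.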
4. max Function: Find Maximum Value
Purpose: The max function finds the maximum value among elements in an array.
Basic Usage
Similar to the previous two functions, use the axis parameter to specify row-wise or column-wise maximum.
Key Parameter axis
- axis=0: Maximum value of each column;
- axis=1: Maximum value of each row.
Examples
# 1. Maximum of 1D array
print("1D array:", arr_1d)
print("Maximum value:", np.max(arr_1d)) # Result: 5
# 2. Maximum of 2D array
print("\n2D array:\n", arr_2d)
print("Column-wise max (axis=0):", np.max(arr_2d, axis=0)) # [4, 5, 6]
print("Row-wise max (axis=1):", np.max(arr_2d, axis=1)) # [3, 6]
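The examples above always pass axis for the 2D array. As a small supplementary sketch, omitting axis returns the single largest element, and the same calculation is also available as an array method (ndarray.max), which some readers may find more compact.

```python
import numpy as np

arr_2d = np.array([[1, 2, 3],
                   [4, 5, 6]])

# No axis: the maximum over every element of the array
print(np.max(arr_2d))      # 6

# Method form, equivalent to np.max(arr_2d, axis=...)
print(arr_2d.max(axis=0))  # [4 5 6]
print(arr_2d.max(axis=1))  # [3 6]
```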
5. Practical Example: Student Score Statistics
Suppose we have a class’s score data (3 students, 3 courses), represented as an array:
# Student scores array: each row represents a student, each column represents a course
scores = np.array([
    [85, 90, 78],  # Student 1: Chinese 85, Math 90, English 78
    [92, 88, 95],  # Student 2: Chinese 92, Math 88, English 95
    [76, 80, 82]   # Student 3: Chinese 76, Math 80, English 82
])
Now, let’s analyze the scores using mean, sum, and max:
# 1. Average score per course (column mean)
print("Average score per course (column mean):", np.mean(scores, axis=0))
# Result: [84.333..., 86.0, 85.0]
# 2. Total score per student (row sum)
print("Total score per student (row sum):", np.sum(scores, axis=1))
# Result: [253, 275, 238]
# 3. Highest score per student (row max)
print("Highest score per student (row max):", np.max(scores, axis=1))
# Result: [90, 95, 82]
# 4. Highest score per course (column max)
print("Highest score per course (column max):", np.max(scores, axis=0))
# Result: [92, 90, 95]
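The same three functions answer a few more questions about this data set. The sketch below adds a per-student average and some class-wide figures; rounding to two decimals is a display choice made for this example, not part of the original analysis.

```python
import numpy as np

scores = np.array([
    [85, 90, 78],  # Student 1
    [92, 88, 95],  # Student 2
    [76, 80, 82],  # Student 3
])

# Average score per student (row mean)
student_avg = np.mean(scores, axis=1)
print(np.round(student_avg, 2))  # [84.33 91.67 79.33]

# Class-wide figures: omit axis to use every score
print(np.sum(scores))                    # 766
print(np.max(scores))                    # 95
print(round(float(np.mean(scores)), 2))  # 85.11
```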
6. Summary
NumPy's mean, sum, and max functions are fundamental tools for statistical analysis. The key to all three is the axis parameter, which controls the direction of the calculation. Remember:
- axis=0: The calculation runs down the rows (the row index varies), giving one result per column;
- axis=1: The calculation runs across the columns (the column index varies), giving one result per row.
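One way to remember the axis rule is to look at result shapes: the axis you name is the one that disappears. The sketch below illustrates this with np.sum. The keepdims argument is a standard NumPy option that is not covered elsewhere in this article; it is shown only to make the collapsed axis visible.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])  # shape (2, 3): 2 rows, 3 columns

# axis=0 collapses the rows, leaving one value per column
print(np.sum(data, axis=0).shape)  # (3,)

# axis=1 collapses the columns, leaving one value per row
print(np.sum(data, axis=1).shape)  # (2,)

# keepdims=True keeps the collapsed axis with length 1
print(np.sum(data, axis=1, keepdims=True).shape)  # (2, 1)
```

The same shape behavior applies to np.mean and np.max.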
By practicing array operations with different dimensions, you can quickly master the usage of these functions and lay a solid foundation for more complex data analysis!