In the day-to-day operation of Linux servers, system monitoring is crucial to keeping services stable. Just as doctors recommend regular check-ups, servers need periodic “health examinations”: monitoring tools show how CPU, memory, and disk are being used, so you can catch resource shortages before they cause slowdowns or outages. This article introduces several essential Linux performance monitoring tools for beginners so that you can quickly learn how to check a server's “health.”

1. top: Real-time System Performance and Process Monitoring

Purpose: top is the most commonly used real-time monitoring tool, dynamically displaying CPU, memory, and process usage to quickly identify resource-hungry processes.
Basic Command: Run top directly in the terminal and press q to exit.
Key Information Interpretation:
- First Line: System time, uptime, number of logged-in users, and load average over the last 1, 5, and 15 minutes (values that stay above the number of CPU cores suggest the system is overloaded).
- Second Line (Tasks): Total processes, running processes, sleeping processes, stopped processes, and zombie processes.
- Third Line (%Cpu(s)): CPU usage metrics:
  - us: User-space process CPU usage (as a rule of thumb, below 70% is normal; sustained high values mean an application is CPU-bound or a runaway or unwanted process is active, so check the process list).
  - sy: Kernel-space CPU usage (persistently high values may indicate kernel issues or frequent system calls).
  - id: Idle CPU time (higher is better; ~100% indicates low CPU load).
  - wa: IO wait time (if wa stays above roughly 20%, disk IO is likely slow; investigate storage).
- Fourth Line (KiB Mem; labeled MiB Mem in some versions): Memory usage: total, used, free, and buff/cache (buffers and page cache, most of which the kernel can reclaim when applications need memory).
- Fifth Line (KiB Swap): Swap space (used when memory is insufficient; high swap usage severely impacts performance).
- Process List: Press P to sort by CPU usage and M to sort by memory usage, quickly identifying resource-heavy processes.
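
To make the "load average versus CPU cores" rule concrete, here is a minimal Python sketch. It assumes the load averages are read in the format of /proc/loadavg (the same three numbers top shows on its first line); the function names and the sample text are illustrative, not part of top itself.

```python
import os

def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg-style text."""
    one, five, fifteen = (float(x) for x in text.split()[:3])
    return one, five, fifteen

def load_per_core(load, cores):
    """Load average divided by core count; values above 1.0 suggest queued work."""
    return load / cores

if __name__ == "__main__":
    # Hypothetical sample in the /proc/loadavg format; on a real Linux host
    # you could read the file itself instead.
    sample = "3.20 1.10 0.75 2/345 12345"
    cores = os.cpu_count() or 1
    for label, load in zip(("1m", "5m", "15m"), parse_loadavg(sample)):
        print(f"{label}: {load:.2f} ({load_per_core(load, cores):.2f} per core)")
```

A per-core value that stays above 1.0 across the 5- and 15-minute figures is a stronger signal than a brief spike in the 1-minute figure.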

2. vmstat: Overall System Performance Analysis

Purpose: vmstat (Virtual Memory Statistics) comprehensively reflects system performance, including process scheduling, memory swapping, and IO.
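Basic Command: vmstat 1 (refreshes every 1 second; press Ctrl+C to exit). The first line of output shows averages since boot, so judge the current state from the lines that follow.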
Key Columns:
- r: Number of runnable processes, i.e., running or waiting for CPU time (if r stays above the number of CPU cores, the CPU is overloaded; optimize the workload or add cores).
- b: Number of uninterruptible sleep processes (if b > 0, processes may be waiting for IO, e.g., disk reads/writes).
- swpd: Used swap space (persistent increase indicates potential memory shortage).
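- free: Idle memory (buffers and cache are not included; they appear in the separate buff and cache columns).
- si/so: Swap-in/swap-out rates (persistently non-zero values indicate a memory shortage).
- bi/bo: Block device IO rates (bi: blocks read from disk into memory; bo: blocks written from memory to disk; sustained high values mean heavy IO, which may be a bottleneck if wa is also high).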
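
The column checks above can be sketched in Python as follows. This is only an illustration: the sample text is hypothetical vmstat output, the function names are made up for this example, and the rules (r above the core count, b above zero, any swap traffic) are the rough heuristics from the list, not hard limits.

```python
# Hypothetical vmstat output used as sample input.
VMSTAT_SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 6  1  20480 102400  51200 409600   12   30   150   900  500 1200 70 10 10 10  0
"""

def parse_vmstat(output):
    """Map vmstat column names to the integer values of the last sample line."""
    lines = [ln.split() for ln in output.strip().splitlines()]
    # The column header is the line whose first field is "r".
    header = next(cols for cols in lines if cols and cols[0] == "r")
    values = lines[-1]
    return dict(zip(header, (int(v) for v in values)))

def vmstat_warnings(sample, cores):
    """Apply the rule-of-thumb checks to one parsed sample."""
    warnings = []
    if sample["r"] > cores:
        warnings.append("run queue exceeds core count")
    if sample["b"] > 0:
        warnings.append("processes blocked on IO")
    if sample["si"] > 0 or sample["so"] > 0:
        warnings.append("swap activity detected")
    return warnings

if __name__ == "__main__":
    for warning in vmstat_warnings(parse_vmstat(VMSTAT_SAMPLE), cores=4):
        print(warning)
```

A single sample can mislead, so in practice you would look at several consecutive lines before drawing conclusions.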

3. iostat: Disk IO Performance Monitoring

Purpose: iostat (Input/Output Statistics) monitors disk read/write speeds and IO request frequencies to identify disk bottlenecks.
Basic Command: iostat -x 1 (-x for detailed metrics; refreshes every 1 second). iostat is part of the sysstat package, which may need to be installed first.
Key Columns:
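- tps: IO requests per second, shown in the basic report (with -x, look at r/s and w/s instead; higher values mean heavier disk load).
- kB_read/s/kB_wrtn/s: Disk read/write throughput in KB/s (shown as rkB/s and wkB/s with -x).
- %util: Disk device utilization (near 100% indicates saturation on a single disk; devices that serve requests in parallel, such as SSDs and RAID arrays, may still have headroom at high %util).
- await: Average IO response time in milliseconds (newer sysstat versions split this into r_await and w_await; higher values indicate slow disk reads/writes, e.g., >20ms for HDDs).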
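
Because the exact column set of iostat -x differs between sysstat versions, a script that reads its output is safer looking columns up by header name than by position. The Python sketch below illustrates that idea; the sample text is hypothetical and abbreviated, and the function names and the 90% threshold are assumptions for this example.

```python
# Hypothetical, abbreviated iostat -x output used as sample input.
IOSTAT_SAMPLE = """\
Device            r/s     w/s     rkB/s     wkB/s  r_await  w_await  %util
sda              5.00  120.00     80.00  15360.00     2.10    35.40  97.50
nvme0n1         50.00   10.00   2048.00    512.00     0.20     0.30   4.00
"""

def parse_iostat(output):
    """Return {device: {column: value}} using the 'Device' header line."""
    rows = {}
    header = None
    for line in output.strip().splitlines():
        cols = line.split()
        if not cols:
            header = None  # a blank line ends the current device table
            continue
        if cols[0].rstrip(":") == "Device":
            header = cols[1:]
        elif header and len(cols) == len(header) + 1:
            rows[cols[0]] = dict(zip(header, map(float, cols[1:])))
    return rows

def busy_devices(rows, util_threshold=90.0):
    """List devices whose %util is at or above the threshold."""
    return [dev for dev, m in rows.items() if m.get("%util", 0.0) >= util_threshold]

if __name__ == "__main__":
    print(busy_devices(parse_iostat(IOSTAT_SAMPLE)))
```

For a device flagged this way, the await columns help tell a busy but healthy disk from one that is actually making requests wait.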

4. free: Quick Memory Usage Check

Purpose: free rapidly displays total memory, used space, free space, and cache, ideal for quick memory sufficiency checks.
Basic Command: free -h (-h auto-converts units to KB/MB/GB for readability).
Key Metrics:
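- total: Total physical memory (swap is reported separately on the Swap row).
- used: Memory in use by processes and the kernel, not counting reclaimable buffers/cache (the exact formula varies between procps versions).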
- free: Unused physical memory.
- available: System-available memory (free + part of buff/cache; assignable to new processes).
Note: Persistently low available indicates memory stress; check for memory leaks or high-memory processes.
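
free gathers its numbers from /proc/meminfo, so the "available" check can also be scripted. The following Python sketch parses text in that format and reports available memory as a percentage of the total. The sample values and function names are hypothetical; what percentage counts as "low" is a judgment call for your workload.

```python
# Hypothetical excerpt in the /proc/meminfo format (values in kB).
MEMINFO_SAMPLE = """\
MemTotal:        8000000 kB
MemFree:          400000 kB
MemAvailable:    2000000 kB
Buffers:          100000 kB
Cached:          1600000 kB
"""

def parse_meminfo(text):
    """Return a dict mapping /proc/meminfo field names to integer values."""
    info = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            info[key.strip()] = int(parts[0])
    return info

def available_percent(info):
    """Available memory as a percentage of total physical memory."""
    return 100.0 * info["MemAvailable"] / info["MemTotal"]

if __name__ == "__main__":
    info = parse_meminfo(MEMINFO_SAMPLE)
    print(f"available: {available_percent(info):.1f}% of total")
```

Note that in the sample, free memory is only 5% of the total while available memory is 25%, which is why available is the better figure to watch.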

5. df and du: Disk Space Monitoring

Purpose:
- df: Checks total, used, and free disk partition space (prevent disk fullness).
- du: Checks directory/file sizes (locate large files/directories for space cleanup).

df Command: df -h (shows partitions with auto-converted units).
- Filesystem: Partition device name (e.g., /dev/sda1).
- Size: Total partition size.
- Used: Used space.
- Avail: Available space.
- Use%: Usage rate (>85% requires cleanup, e.g., logs, temp files).
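
The Use% check can be automated with Python's standard shutil.disk_usage. This is a minimal sketch: the function names are illustrative, the 85% threshold is the rule of thumb from above, and the computed percentage may differ from df's by a point because of rounding.

```python
import shutil

def use_percent(used, free):
    """Usage rate computed as used / (used + free), in percent."""
    if used + free == 0:
        return 0.0
    return 100.0 * used / (used + free)

def needs_cleanup(path="/", threshold=85.0):
    """True if the filesystem holding `path` is above the usage threshold."""
    usage = shutil.disk_usage(path)
    return use_percent(usage.used, usage.free) > threshold

if __name__ == "__main__":
    usage = shutil.disk_usage("/")
    print(f"/ is {use_percent(usage.used, usage.free):.0f}% full")
    print("cleanup recommended" if needs_cleanup("/") else "space is fine")
```

A check like this could run from a scheduled job so that a filling partition is noticed before it reaches 100%.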

du Command: du -sh [directory] (e.g., du -sh /var/log for log directory size).
- -s: Show only total size.
- -h: Human-readable units (KB/MB/GB).
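- --max-depth=1: Show sizes of first-level subdirectories; use it without -s, e.g., du -h --max-depth=1 /var (GNU du rejects -s combined with a non-zero --max-depth).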
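
To show what du is doing under the hood, here is a rough Python equivalent that walks a directory tree and ranks first-level subdirectories by size. It is a sketch under stated assumptions: it sums apparent file sizes rather than allocated disk blocks, so its totals can differ from du's, and the function names are made up for this example.

```python
import os

def dir_size(path):
    """Sum the apparent sizes of files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total

def largest_subdirs(path, top=5):
    """Return (size, name) pairs for the largest first-level subdirectories."""
    sizes = []
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((dir_size(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]

if __name__ == "__main__":
    for size, name in largest_subdirs("."):
        print(f"{size / 1024:10.1f} KiB  {name}")
```

Repeating the call on the largest subdirectory mirrors the usual manual workflow of drilling down one level at a time until the large files are found.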

6. How to Choose Tools?

- Quick System Overview: Use top (real-time, intuitive).
- Memory Stress Check: Use free -h (simple available memory check).
- Disk IO Bottleneck: Use iostat -x 1 (check %util and await).
- Disk Space: Use df -h (locate full partitions).
- Large File Identification: Use du -sh [directory] (narrow down to specific files).

Conclusion

These tools form the foundation of Linux server monitoring. Beginners need not memorize all parameters—focus on core commands and metrics, using tools as needed. For example:
- If services lag, use top to check CPU/memory.
- If memory is low, use free to confirm, then use top (sorted by memory) to find leaking or memory-heavy processes.
- If disk reads/writes are slow, use iostat to check IO saturation.

With practice, you can quickly assess server health and ensure stable service operation.

Xiaoye