Learn how to accurately read top, free, and swap space in Linux to monitor CPU, memory, and performance effectively.
As a Linux administrator, understanding system resource utilization is crucial for ensuring optimal performance, diagnosing issues, and planning for future upgrades. Many administrators struggle with interpreting top, free, swap space, CPU load, and disk I/O metrics correctly. Misinterpretations can lead to unnecessary panic, premature hardware upgrades, or overlooking actual performance bottlenecks.
This guide will help you correctly read and analyze these metrics, avoid common mistakes, and determine when to take action to optimize your Linux system.
Proper monitoring and understanding of system resources help in:
- Ensuring application stability and performance.
- Preventing unnecessary hardware upgrades and cost overruns.
- Diagnosing performance bottlenecks and optimizing configurations.
- Understanding when to scale your infrastructure.
By the end of this article, you will learn how to:
- Analyze CPU, memory, swap, and I/O usage.
- Interpret load average correctly.
- Use advanced monitoring tools to track system performance over time.
- Avoid common misconceptions that can lead to poor decision-making.
Understanding Server Resource Utilization
A server’s key performance metrics include:
- CPU Usage: The percentage of CPU time used by system and user processes.
- Memory Usage: The amount of RAM currently in use, including cache and buffers.
- Swap Space: Disk space the kernel uses to hold memory pages moved out of RAM, mainly when physical memory is under pressure.
- I/O Wait: The percentage of time the CPU sits idle while at least one disk I/O request is still outstanding.
- Load Average: The average number of processes running, waiting for a CPU, or blocked in uninterruptible I/O, over the last 1, 5, and 15 minutes.
How to Read System Utilization Correctly
Using the free Command for Memory Analysis
The free command provides an overview of total, used, free, and available memory.
free -h
Example Output:
        total    used    free    shared    buff/cache    available
Mem:    16Gi     6Gi     2Gi     512Mi     8Gi           9Gi
Interpretation:
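- Used memory is what applications and the kernel hold, not counting buff/cache. In the example, 6Gi used + 2Gi free + 8Gi buff/cache adds up to the 16Gi total.
- Available memory is an estimate of how much can be handed to new applications without swapping: free memory plus the part of the cache that can be reclaimed.
- Buff/cache includes file system caching, which speeds up disk operations and is released when applications need the memory.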
Normal vs. High Values:
- Normal: Available memory is at least 20-30% of total RAM.
- Medium: Available memory drops below 15%.
- High Concern: Available memory is below 5%, indicating possible memory pressure.
Best Practices:
- Always check the available column rather than free to understand real memory availability.
- Use vmstat -s for a detailed breakdown of memory statistics.
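The thresholds above can be turned into a quick check. The sketch below is one possible way to do it, not a standard tool: mem_status is a hypothetical helper name, the "watch" label for the 15-20% range is an assumption (the thresholds above leave that range undefined), and the live reading relies on the MemAvailable field in /proc/meminfo, which older kernels may not provide.

```shell
# Hypothetical helper: classify memory pressure from total and available
# memory (both in kB), using this article's rule-of-thumb thresholds.
mem_status() {
    awk -v total="$1" -v avail="$2" 'BEGIN {
        pct = avail * 100 / total
        if (pct < 5)       status = "high"
        else if (pct < 15) status = "medium"
        else if (pct < 20) status = "watch"    # assumption: range not defined above
        else               status = "normal"
        printf "%.1f%% available: %s\n", pct, status
    }'
}

# Live reading on a Linux host; /proc/meminfo reports these values in kB.
if [ -r /proc/meminfo ]; then
    total=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)
    avail=$(awk '/^MemAvailable:/ {print $2}' /proc/meminfo)
    if [ -n "$total" ] && [ -n "$avail" ]; then
        mem_status "$total" "$avail"
    fi
fi
```

A single reading only shows the current state; running the check at intervals gives a better picture of whether available memory is trending down.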
Using the top Command for CPU, Memory, and Load Analysis
The top command provides real-time system performance metrics.
top
Key Metrics:
- %Cpu(s): Displays CPU usage breakdown.
- MiB Mem: Shows RAM usage details.
- Load average: Represents system load over different time intervals.
Example Output:
top - 12:34:56 up 10 days,  4:23,  3 users,  load average: 1.24, 0.98, 0.76
%Cpu(s): 20.0 us,  5.0 sy,  2.0 ni, 70.0 id,  3.0 wa,  0.0 hi,  0.0 si,  0.0 st
MiB Mem :  16000.0 total,   2000.0 free,   6000.0 used,   8000.0 buff/cache
Interpretation:
- CPU Usage:
  - Normal: Below 50% utilization.
  - Medium: 50-80% sustained usage.
  - High Concern: Above 90%, risk of CPU saturation.
- Load Average:
  - Normal: Close to or below the number of CPU cores.
  - High Concern: Load average consistently exceeding CPU core count.
Best Practices:
- Monitor the wa value (I/O wait) in the %Cpu(s) line to detect disk bottlenecks.
- Use Shift + M in top to sort processes by memory usage.
- Use Shift + P in top to sort processes by CPU usage.
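For scripted checks, the %Cpu(s) line can be parsed instead of read by eye. The following is a minimal sketch under a few assumptions: cpu_summary is a hypothetical helper name, "busy" is taken here as 100 minus idle minus I/O wait (since time spent in wa is idle CPU time), and the input is expected in the procps top format shown in the example output.

```shell
# Hypothetical helper: reads top output on stdin and prints CPU busy time
# and I/O wait taken from the first %Cpu(s) line.
cpu_summary() {
    awk '/^%Cpu/ {
        gsub(/,/, " ")                  # split cases like "0.0 ni,100.0 id"
        for (i = 2; i <= NF; i++) {
            if ($i == "id") idle = $(i - 1)
            if ($i == "wa") iow = $(i - 1)
        }
        printf "busy=%.1f iowait=%.1f\n", 100 - idle - iow, iow
        exit
    }'
}

# Typical use on a host with procps top (batch mode, one iteration):
#   LC_ALL=C top -bn1 | cpu_summary
```

Setting LC_ALL=C keeps the decimal separator as a dot, which the parsing relies on. The first iteration of top in batch mode may reflect a longer averaging window than later ones, so treat a single reading as approximate.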
Monitoring Disk I/O with iostat
The iostat command provides detailed I/O statistics.
iostat -x 1 5
Example Output:
Device    r/s    w/s    await    %util
sda       120    50     3.2      25.6
Interpretation:
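- await: Average time in milliseconds for I/O requests to be served, including time spent waiting in the queue. Recent sysstat versions report r_await and w_await separately.
  - Normal: Below 5ms.
  - Medium: 5-10ms.
  - High Concern: Above 10ms may indicate storage issues.
- %util: Percentage of time the device was busy handling requests.
  - Normal: Below 50%.
  - High Concern: Above 70%.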
Best Practices:
- Investigate high %util and await values.
- Optimize disk performance by checking workload patterns.
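The two disk thresholds above can be applied to iostat output with a small filter. This is a sketch rather than a definitive tool: flag_busy_disks is a hypothetical helper name, the default limits (70 for %util, 10 ms for await) are the High Concern values from this section, and columns are located by header name because their positions and names differ between sysstat versions.

```shell
# Hypothetical helper: reads `iostat -x` output on stdin and prints devices
# whose %util or await-style columns exceed the given limits.
# Usage: iostat -x 1 5 | flag_busy_disks [max_util] [max_await_ms]
flag_busy_disks() {
    awk -v max_util="${1:-70}" -v max_await="${2:-10}" '
        NF == 0 { util_col = 0; next }      # blank line ends a device table
        $1 ~ /^Device/ {
            # Find columns by header name; positions vary across versions.
            split("", await_cols)
            for (i = 2; i <= NF; i++) {
                if ($i == "%util") util_col = i
                if ($i ~ /^([a-z]_)?await$/) await_cols[i] = $i
            }
            next
        }
        util_col && NF >= util_col {
            reason = ""
            if ($util_col + 0 > max_util) reason = reason " util=" $util_col
            for (i in await_cols)
                if ($i + 0 > max_await) reason = reason " " await_cols[i] "=" $i
            if (reason != "") print $1 ":" reason
        }
    '
}
```

With the example output above, nothing is printed at the default limits because sda stays under both. Suitable await limits depend on the storage type, so the defaults are worth adjusting per system.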
Checking Swap Usage
swapon -s
Example Output (sizes in KiB; swapon --show prints human-readable sizes):
Filename     Type         Size       Used       Priority
/dev/sda2    partition    8388604    2097152    -2
Best Practices:
- Persistent high swap usage indicates memory pressure.
- Use vmstat 1 5 to check real-time swap activity; the si and so columns show how much is being swapped in and out per second.
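To tell occupied swap apart from active swapping, the si (swap-in) and so (swap-out) columns of vmstat can be summed over a short sampling window. The sketch below assumes the default vmstat column layout with a header row starting with "r b"; swap_activity is a hypothetical helper name, and the first data row is skipped because it reports averages since boot rather than current activity.

```shell
# Hypothetical helper: reads `vmstat <interval> <count>` output on stdin and
# reports whether any swap-in/swap-out happened during the sampled window.
# Usage: vmstat 1 5 | swap_activity
swap_activity() {
    awk '
        $1 == "r" && $2 == "b" {
            # Header row: locate the si and so columns by name.
            for (i = 1; i <= NF; i++) {
                if ($i == "si") si = i
                if ($i == "so") so = i
            }
            next
        }
        si && so && $1 ~ /^[0-9]+$/ {
            if (++rows == 1) next           # first row averages since boot
            in_sum += $si; out_sum += $so; n++
        }
        END {
            state = (in_sum + out_sum > 0) ? "swapping" : "quiet"
            printf "si=%d so=%d over %d samples: %s\n", in_sum, out_sum, n, state
        }
    '
}
```

A "quiet" result alongside non-zero used swap usually means pages were swapped out earlier and have not been needed since, which is generally less worrying than continuous si/so traffic.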
Understanding CPU Load Average
uptime
Example Output:
12:34:56 up 10 days, 4:23, 3 users, load average: 1.24, 0.98, 0.76
Interpretation:
- The three values are the load averages over the last 1, 5, and 15 minutes.
  - Normal: Load is below total CPU core count.
  - High Concern: Load consistently exceeds core count.
Best Practices:
- Compare load average against the number of CPU cores.
- Monitor with mpstat -P ALL 1 to analyze per-core CPU usage.
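Comparing load against core count is simple arithmetic: divide the load average by the number of cores and check whether the result exceeds 1. The sketch below illustrates this with a hypothetical load_status helper; using the 5-minute average from /proc/loadavg as a stand-in for "consistent" load is an assumption, and nproc is assumed to be available for the live reading.

```shell
# Hypothetical helper: $1 = load average, $2 = number of CPU cores.
load_status() {
    awk -v load="$1" -v cores="$2" 'BEGIN {
        ratio = load / cores
        printf "%.2f per core: %s\n", ratio, (ratio > 1 ? "high" : "normal")
    }'
}

# Live reading on a Linux host: field 2 of /proc/loadavg is the
# 5-minute load average, and nproc prints the available core count.
if [ -r /proc/loadavg ] && command -v nproc >/dev/null 2>&1; then
    load_status "$(cut -d' ' -f2 /proc/loadavg)" "$(nproc)"
fi
```

For the example uptime output above, a 1-minute load of 1.24 on a 4-core machine works out to 0.31 per core, well within the normal range; the same load on a single-core machine would be above 1.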
Common Mistakes and Misinterpretations
- Confusing Load Average with CPU Usage
  - Mistake: Assuming high load means high CPU usage.
  - Fix: Check the %Cpu(s) line in top to verify actual CPU utilization.
- Ignoring I/O Wait in Performance Analysis
  - Mistake: Blaming the CPU for high system load when CPU utilization is low; the cause is often a disk bottleneck.
  - Fix: Use iostat -x to check disk performance.
- Misinterpreting Memory Usage
  - Mistake: Believing low free memory means RAM exhaustion.
  - Fix: Look at available memory instead of free memory.
- Overlooking Swap Activity
  - Mistake: Assuming all swap usage is bad.
  - Fix: Occasional swap use is fine; sustained high swap usage is a warning sign.
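The first two mistakes come down to reading load, CPU utilization, and I/O wait together rather than in isolation. The decision logic can be sketched as below. This is an illustration only: triage is a hypothetical helper, and the cut-offs of 80% CPU busy and 10% I/O wait are assumptions chosen for the example, not values given in this article or by any tool.

```shell
# Hypothetical helper: $1 = load average, $2 = CPU cores,
# $3 = CPU busy percentage, $4 = I/O wait percentage.
triage() {
    awk -v load="$1" -v cores="$2" -v busy="$3" -v iowait="$4" 'BEGIN {
        if (load / cores <= 1)  print "ok"          # load within core count
        else if (busy >= 80)    print "cpu-bound"   # assumed cut-off
        else if (iowait >= 10)  print "io-bound"    # assumed cut-off
        else                    print "unclear"     # needs a closer look
    }'
}
```

For instance, triage 9 4 20 35 prints io-bound: the load is more than double the core count, yet the CPU is mostly idle and a third of its time is spent waiting on I/O, which points toward iostat -x as the next step.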
Advanced Monitoring Techniques
Using htop for a Detailed View
htop
Checking Process-Specific CPU and Memory Usage
ps aux --sort=-%cpu | head -10
ps aux --sort=-%mem | head -10
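Using sar for Historical Data
sar -u 5 10
sar -r 5 10
With an interval and count, sar samples live (here, 10 samples taken 5 seconds apart). Run sar -u or sar -r without them to read the data already collected today, provided sysstat data collection is enabled on the system.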
Conclusion
Mastering Linux performance monitoring is crucial for maintaining stable, efficient servers. By correctly interpreting CPU, memory, swap, I/O wait, and load average, administrators can make informed decisions, optimize resources, and prevent downtime. Avoiding common misinterpretations and using the right commands ensures that system performance is analyzed accurately.
The key takeaways are:
- Use top, free, iostat, and uptime correctly.
- Always check the available memory instead of free memory.
- Monitor CPU, I/O, and swap activity together to avoid false positives.
- Consider historical data with sar to identify trends.
By implementing these best practices, you will improve troubleshooting efficiency and optimize your Linux server performance proactively.