Monitoring CPU, RAM and disk on your VPS.
How to see what's using resources right now, what to look at in long-term metrics, and free tools to set up alerting.
You ship a VPS to production, traffic grows, suddenly the site is slow. First question: what's the bottleneck? CPU? RAM? Disk I/O? Network?
Real-time view
SSH in and run:
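top
Top of screen: load average over the last 1, 5 and 15 minutes. A load that stays above the number of CPU cores means work is queuing up. On Linux the figure also counts processes blocked on disk I/O, so confirm with the CPU line: high us/sy points to CPU, high wa points to disk.
Process list: sorted by %CPU descending by default. Press M (Shift+m) to sort by memory.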
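To put a number on the "load vs. cores" rule, here is a minimal sketch that divides the 1-minute load by the core count. The function name load_per_core is made up for this example, and it assumes a Linux box with /proc/loadavg and nproc available.

```shell
# Divide a load average by the core count.
# A result that stays above roughly 1.00 suggests the box is saturated.
load_per_core() {
    # $1 = load average, $2 = number of cores
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# The 1-minute load is the first field of /proc/loadavg (Linux only).
if [ -r /proc/loadavg ]; then
    read -r load1 _rest < /proc/loadavg
    echo "1-min load per core: $(load_per_core "$load1" "$(nproc)")"
fi
```

A single spike is rarely a problem; look at whether the 5- and 15-minute figures tell the same story.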
For a friendlier interface, install htop:
sudo apt install htop # Ubuntu/Debian
sudo dnf install htop # RHEL/Alma
htop
Use arrow keys to navigate, F9 to kill processes.
Memory
free -h
Shows total / used / free / shared / buff/cache / available.
Cached memory is not "in use". Linux uses spare RAM as filesystem cache and frees it when apps need it. Look at the available column for the real "free" figure.
Out of memory = the OOM killer kicks in and kills the process with the highest badness score, usually the biggest memory consumer. Check dmesg | grep -i kill (may need sudo) to see what got murdered.
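If you want the "available" figure as a single percentage (handy for a quick check or a cron job), it can be read from /proc/meminfo. A rough sketch, assuming a kernel that exposes MemAvailable; the function name mem_available_pct is invented for this example.

```shell
# Print available memory as a whole-number percentage of total RAM.
mem_available_pct() {
    # $1 = path to a meminfo-style file (defaults to /proc/meminfo)
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

if [ -r /proc/meminfo ]; then
    echo "Available memory: $(mem_available_pct)%"
fi
```

What counts as "too low" depends on the workload, so treat any threshold you pick as a starting point.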
Disk space
df -h # disk space per filesystem
du -sh /* # space per top-level dir
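du -sh /var/* # drill into /var
For finding big files:
find / -xdev -type f -size +500M -exec ls -lh {} \; 2>/dev/null # -xdev stays on the root filesystem, so /proc and other mounts are skipped
Common space hogs: /var/log (rotate logs), /var/lib/docker (clean up with docker system prune -a, which deletes all unused images and stopped containers, so check what is still needed first), /tmp, MySQL data dirs.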
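To spot nearly full filesystems without reading the whole df table, the output can be filtered by usage. A small sketch: full_filesystems is a made-up helper name, the 90% threshold is only an example, and mount points containing spaces are not handled.

```shell
# List mount points whose usage is at or above a threshold.
full_filesystems() {
    # $1 = usage threshold in percent; reads `df -P` output on stdin
    awk -v limit="$1" 'NR > 1 {
        use = $5
        sub(/%/, "", use)              # strip the trailing percent sign
        if (use + 0 >= limit) print $6, $5
    }'
}

# -P forces the POSIX one-line-per-filesystem layout, which is safe to parse.
df -P | full_filesystems 90
```

No output means nothing is above the threshold.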
Disk I/O (the silent killer)
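iostat -x 5 # if installed (apt install sysstat)
Watch the %util column. Sustained values above 70% usually mean the disk is the bottleneck. The first report shows averages since boot, so read from the second report onward.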
For per-process I/O:
sudo iotop
(install: sudo apt install iotop)
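The %util check can be automated by filtering iostat output. Because the column order differs between sysstat versions, this sketch looks up the %util column from the header line instead of hard-coding a position. The name busy_devices is invented for this example.

```shell
# Print devices whose %util is at or above a threshold.
busy_devices() {
    # $1 = %util threshold; reads `iostat -x` output on stdin
    awk -v limit="$1" '
        # Header row: remember which column holds %util
        /^Device/ { col = 0; for (i = 1; i <= NF; i++) if ($i == "%util") col = i; next }
        # Blank lines and the CPU section end the device table
        NF == 0 || /^avg-cpu/ { col = 0; next }
        col && $col + 0 >= limit { print $1, $col }
    '
}
```

One way to use it: LC_ALL=C iostat -x 5 3 | busy_devices 70. The LC_ALL=C part keeps decimal points as dots, and the first of the three reports still reflects averages since boot.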
Network
sudo iftop # real-time bandwidth per connection
sudo nethogs # bandwidth per process
Or for total traffic:
ifstat 1
Long-term metrics
Real-time tools tell you what's happening NOW. For trends, install something persistent:
Netdata (free, lightweight, gorgeous web UI):
bash <(curl -SsL https://my-netdata.io/kickstart.sh)
Visit http://your-vps:19999. Auto-discovers everything: CPU, RAM, disk, network, MySQL, Redis, Nginx, Docker, etc. The free tier is sufficient for most setups.
Prometheus + Grafana (full stack, more setup): for serious monitoring with alerting and long retention. Standard for production.
Glances (alternative to top, persistent):
sudo apt install glances
glances --webserver --port 61208
Alerting
Free options that work:
- UptimeRobot — free for 50 monitors, pings your services every 5 minutes, alerts on downtime
- Healthchecks.io — free for 20 cron checks, alerts when scheduled jobs fail to ping
- Netdata Cloud — free tier alerting on the metrics it collects
- Grafana Cloud — free 10k metrics + alerting
Cron job emails work too but are rate-limited and easy to ignore once they accumulate.
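As an illustration of the Healthchecks.io approach, a cron job can be wrapped so that it pings the check on success and hits the /fail endpoint otherwise. This is a sketch, not a drop-in script: the UUID in HC_URL is a placeholder, and ping_url and run_and_report are names made up for this example.

```shell
# Placeholder check URL; replace with the ping URL from your own Healthchecks.io project.
HC_URL="https://hc-ping.com/your-uuid-here"

# Map a job's exit status to the URL to ping.
ping_url() {
    # $1 = exit status; non-zero goes to the /fail endpoint
    if [ "$1" -eq 0 ]; then
        echo "$HC_URL"
    else
        echo "$HC_URL/fail"
    fi
}

# Run a command, report the outcome, and pass the exit status through.
run_and_report() {
    status=0
    "$@" || status=$?
    # -f: fail on HTTP errors, -m 10: 10 s timeout, --retry 3: retry transient failures
    curl -fsS -m 10 --retry 3 -o /dev/null "$(ping_url "$status")"
    return "$status"
}
```

From a script called by cron you would then run something like run_and_report /usr/local/bin/backup.sh, where the backup path is hypothetical. If the job never runs at all, the missing ping is what triggers the alert.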
Diagnose a slow site fast
Run these in sequence:
uptime # load avg — CPU-bound?
free -h # memory pressure?
df -h # disk full?
iostat -x 5 5 # disk I/O bottleneck?
top # what processes are eating CPU?
In ~60 seconds you'll know which resource is the issue, and which process is responsible.