Monitoring CPU, RAM and disk on your VPS.

How to see what's using resources right now, what to look at in long-term metrics, and free tools to set up alerting.

You ship a VPS to production, traffic grows, suddenly the site is slow. First question: what's the bottleneck? CPU? RAM? Disk I/O? Network?

Real-time view

SSH in and run:

top

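Top of screen: load average (1 min / 5 min / 15 min). A load that stays above the number of CPU cores means work is queuing up. On Linux the figure also counts processes blocked on disk I/O, so check the %Cpu(s) line next: high us/sy points to CPU, high wa points to disk.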
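To make the load-versus-cores comparison concrete, the load can be divided by the core count. This is a minimal sketch, assuming a Linux box with /proc/loadavg and nproc; the function name load_per_core is made up for this example.

```shell
# load_per_core LOAD CORES: print LOAD divided by CORES with two decimals.
# (Hypothetical helper name, not a standard tool.)
load_per_core() {
    awk -v load="$1" -v cores="$2" 'BEGIN { printf "%.2f\n", load / cores }'
}

# Usage on a live system: 1-minute load from /proc/loadavg, cores from nproc.
# load_per_core "$(cut -d' ' -f1 /proc/loadavg)" "$(nproc)"
```

A result that stays above 1.00 suggests more pending work than the cores can absorb.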

Process list: sorted by %CPU descending by default. Press Shift+M to sort by memory, Shift+P to return to CPU order.

For a friendlier interface, install htop:

sudo apt install htop   # Ubuntu/Debian
sudo dnf install htop   # RHEL/Alma
htop

Use arrow keys to navigate, F9 to kill processes.

Memory

free -h

Shows total / used / free / shared / buff/cache / available.

Cached memory is not really "in use". Linux uses spare RAM as filesystem cache and releases it when apps need it. Look at the available column for the real "free" figure.

When memory runs out, the OOM killer kicks in and kills the process with the highest badness score, which is usually the biggest memory consumer. Run sudo dmesg | grep -i kill to see what got murdered.
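The "available" figure can also be read straight from the kernel, which is handy in scripts. The sketch below assumes a kernel that exposes MemAvailable in /proc/meminfo (3.14 or newer); the function name mem_available_pct is hypothetical.

```shell
# mem_available_pct [FILE]: print the percentage of RAM still available to
# applications, from the MemTotal and MemAvailable lines of a meminfo-style
# file. FILE defaults to /proc/meminfo. (Hypothetical helper name.)
mem_available_pct() {
    awk '/^MemTotal:/     { total = $2 }
         /^MemAvailable:/ { avail = $2 }
         END { if (total > 0) printf "%d\n", avail * 100 / total }' "${1:-/proc/meminfo}"
}

# Usage: mem_available_pct
```

A value that stays in the low single digits is a hint that the OOM killer may not be far away.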

Disk space

df -h                        # disk space per filesystem
sudo du -sh /* 2>/dev/null   # space per top-level dir (permission and /proc errors hidden)
sudo du -sh /var/*           # drill into /var

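To find big files (over 500 MB) on the root filesystem:

sudo find / -xdev -type f -size +500M -exec ls -lh {} + 2>/dev/null

The -xdev flag keeps find off /proc and other mounts; drop it if /var or /home live on separate filesystems.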

Common space hogs: /var/log (rotate logs), /var/lib/docker (clean up with docker system prune -a, which removes all unused images plus stopped containers, unused networks and build cache), /tmp, MySQL data dirs.
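For a quick "is anything nearly full?" check, the df output can be filtered by percentage. This is a rough sketch: the function name filesystems_over and the 90% limit are illustrative choices, and it assumes mount points without spaces.

```shell
# filesystems_over LIMIT: read `df -P` output on stdin and print the mount
# point and use% of every filesystem whose usage is above LIMIT percent.
# (Hypothetical helper name; breaks on mount points containing spaces.)
filesystems_over() {
    awk -v limit="$1" 'NR > 1 {
        pct = $5
        sub(/%/, "", pct)              # "95%" -> "95"
        if (pct + 0 > limit + 0) print $6, $5
    }'
}

# Usage: df -P | filesystems_over 90
```

The -P flag asks df for the POSIX output format, which keeps each filesystem on a single line so the column positions stay predictable.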

Disk I/O (the silent killer)

iostat -x 5    # if installed (apt install sysstat)

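Watch the %util column. Sustained values above 70% suggest the disk is the bottleneck. Ignore the first report, which shows averages since boot. On SSDs and RAID arrays, which serve requests in parallel, %util can overstate saturation, so check the await columns as well.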
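Picking busy devices out of the iostat output can be scripted. The sketch below locates the %util column from the header row, because its position differs between sysstat versions; the function name busy_disks is made up for this example.

```shell
# busy_disks LIMIT: read `iostat -x` output on stdin and print the device
# name and %util of every device above LIMIT. (Hypothetical helper name.)
busy_disks() {
    awk -v limit="$1" '
        /^avg-cpu/ { col = 0; next }        # CPU block: stop matching rows
        $1 ~ /^Device/ {                    # header row: find the %util column
            col = 0
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        col && NF >= col && $col + 0 > limit + 0 { print $1, $col }
    '
}

# Usage: iostat -x 5 3 | busy_disks 70
```

Because the first report covers the time since boot, a device that appears only once in the output may just reflect old history.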

For per-process I/O:

sudo iotop

(install: sudo apt install iotop)

Network

sudo iftop     # real-time bandwidth per connection
sudo nethogs   # bandwidth per process

Or for total traffic:

ifstat 1

Long-term metrics

Real-time tools tell you what's happening NOW. For trends, install something persistent:

Netdata (free, lightweight, gorgeous web UI):

bash <(curl -SsL https://my-netdata.io/kickstart.sh)

Visit http://your-vps:19999. It auto-discovers everything: CPU, RAM, disk, network, MySQL, Redis, Nginx, Docker, etc. The free tier is sufficient for most setups.

Prometheus + Grafana (full stack, more setup): for serious monitoring with alerting and long retention. Standard for production.

Glances (alternative to top, with an optional web UI):

sudo apt install glances
glances --webserver --port 61208

Alerting

Free options that work:

  1. UptimeRobot — free for 50 monitors, pings your services every 5 minutes, alerts on downtime
  2. Healthchecks.io — free for 20 cron checks, alerts when scheduled jobs fail to ping
  3. Netdata Cloud — free tier alerting on the metrics it collects
  4. Grafana Cloud — free 10k metrics + alerting

Cron job emails work too but are rate-limited and easy to ignore once they accumulate.
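As an illustration of the Healthchecks.io approach, a cron job can report its own result after it runs. This is a sketch under assumptions: HC_URL is a placeholder for the ping URL of your own check, run_and_report and ping_target are made-up names, and it relies on the service treating a request to the ping URL with /fail appended as a failure signal.

```shell
# Placeholder; the real ping URL comes from your Healthchecks.io check page.
HC_URL="${HC_URL:-https://hc-ping.com/your-check-uuid}"

# ping_target EXIT_CODE: print the URL to ping for a given exit status.
ping_target() {
    if [ "$1" -eq 0 ]; then
        printf '%s\n' "$HC_URL"
    else
        printf '%s\n' "$HC_URL/fail"
    fi
}

# run_and_report COMMAND [ARGS...]: run a job, then report success or failure.
run_and_report() {
    "$@"
    status=$?
    curl -fsS -m 10 --retry 3 -o /dev/null "$(ping_target "$status")"
    return "$status"
}

# Usage from a cron script (backup.sh is a hypothetical job):
# run_and_report /usr/local/bin/backup.sh
```

If the job never runs at all, no ping arrives, and the service can alert on the silence as well.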

Diagnose a slow site fast

Run these in sequence:

uptime              # load avg: work queuing on CPU or disk?
free -h             # memory pressure?
df -h               # disk full?
iostat -x 5 5       # disk I/O bottleneck?
top                 # what processes are eating CPU?

In ~60 seconds you'll know which resource is the issue, and which process is responsible.
