# Multi-GPU Monitoring
gpulse is built for systems with multiple GPUs. This guide covers the views and workflows for monitoring two or more devices simultaneously.
## Grid View Overview
Grid view is the default and the best starting point for any multi-GPU system. Press g to switch to it from any other view.
Each GPU tile shows:
- GPU index and model name
- Memory bar: used / total with percentage
- Utilization bar: current SM occupancy
- Temperature and power draw
- A colour-coded health indicator
### Colour-Coded Health
| Colour | Meaning |
|---|---|
| Green | All metrics within normal ranges |
| Yellow | At least one metric in the warning range (e.g., temperature 70-85 °C, memory > 80%) |
| Red | At least one metric critical (e.g., temperature > 85 °C, memory > 95%, uncorrected ECC error) |
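The mapping from raw metrics to a tile colour can be sketched as a small predicate. This is an illustrative sketch using the example thresholds from the table above; gpulse's actual rules and thresholds may differ:

```python
def health_colour(temp_c, mem_used_pct, ecc_uncorrected=0):
    """Classify a GPU tile as green/yellow/red from its current metrics.

    Thresholds follow the example values in the table; they are
    assumptions for illustration, not gpulse's exact configuration.
    """
    # Critical conditions always win, regardless of other metrics.
    if temp_c > 85 or mem_used_pct > 95 or ecc_uncorrected > 0:
        return "red"
    # Warning range: hot but not critical, or memory pressure building.
    if temp_c >= 70 or mem_used_pct > 80:
        return "yellow"
    return "green"

print(health_colour(62, 45))        # green: everything nominal
print(health_colour(78, 45))        # yellow: temperature in warning range
print(health_colour(62, 97))        # red: memory critical
```

Because red conditions are checked first, a GPU with an uncorrected ECC error shows red even if its temperature and memory are fine.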
Scanning the grid top-to-bottom lets you spot an outlier GPU at a glance without reading every number.
## Sorting
In Grid and List views, press o to cycle through sort orders:
| Sort Order | Description |
|---|---|
| Index | GPU 0, 1, 2... (default) |
| Memory Used | Highest memory consumer first |
| Utilization | Highest compute load first |
| Temperature | Hottest GPU first |
| Name | Alphabetical by model name |
## Detail Deep-Dive
To investigate a single GPU:
- In Grid or List view, use Up / Down to highlight the GPU
- Press Enter to select it, then d for Detail view
Detail view divides the screen into four quadrants:
| Quadrant | Contents |
|---|---|
| Top-left | Memory utilization timeline (last N seconds of history) |
| Top-right | GPU compute utilization timeline |
| Bottom-left | Temperature and power readings with sparklines |
| Bottom-right | Live process table: PID, name, memory, and user |
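The timelines in the top quadrants only need a fixed window of recent samples, which a bounded deque captures naturally. A minimal sketch (the window length and rendering are assumptions, not gpulse internals):

```python
from collections import deque

WINDOW_SECONDS = 60          # assumed history length ("last N seconds")

class MetricTimeline:
    """Fixed-window history for one metric, e.g. memory or SM utilization."""

    def __init__(self, window=WINDOW_SECONDS):
        # Once full, each append silently drops the oldest sample.
        self.samples = deque(maxlen=window)

    def push(self, value):
        self.samples.append(value)

    def sparkline(self):
        """Render the window with eight block characters, scaled min..max."""
        blocks = "▁▂▃▄▅▆▇█"
        lo, hi = min(self.samples), max(self.samples)
        span = (hi - lo) or 1    # avoid division by zero on a flat line
        return "".join(blocks[int((v - lo) / span * 7)] for v in self.samples)

tl = MetricTimeline(window=8)
for v in [10, 20, 40, 80, 60, 30, 20, 10]:
    tl.push(v)
print(tl.sparkline())            # one glyph per sample, peak in the middle
```

Because `deque(maxlen=...)` evicts from the left automatically, the update path stays O(1) per sample per metric, which matters when history is kept for every GPU at once.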
Press g or v to return to the multi-GPU overview.
## Compare View
Compare view places two or more GPUs side-by-side with matching metric rows so you can spot imbalances in a distributed training job. Press c to open it.
Typical use cases:
- Verifying all GPUs in a data-parallel training run consume similar memory and compute
- Identifying a "slow GPU" causing others to block at synchronisation barriers
- Checking tensor-parallel model layer splits across devices
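The "slow GPU" case above reduces to comparing per-device timings against the fastest peer. A minimal sketch with invented step times and an assumed 25% tolerance:

```python
# Hypothetical per-GPU step times (seconds) from a data-parallel run;
# in a balanced job these should be nearly identical.
step_times = {0: 0.41, 1: 0.42, 2: 0.40, 3: 0.71}

fastest = min(step_times.values())
stragglers = {
    gpu: t for gpu, t in step_times.items()
    if t > fastest * 1.25     # assumed tolerance before flagging a straggler
}
print(stragglers)             # GPU 3 blocks the others at every sync barrier
```

In a data-parallel run the whole job moves at the pace of the slowest device, so even one straggler caps throughput for all GPUs in the group.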
Use Left / Right to change the comparison target.
## Topology View
Press t to open Topology view. It renders a diagram of the physical interconnect between GPUs, including:
- PCIe links: bandwidth class (x8, x16) and CPU socket attachment
- NVLink bridges: direct GPU-to-GPU links and negotiated bandwidth
Two GPUs connected via NVLink can exchange tensors at 600 GB/s (NVLink 4.0), while GPUs on opposite NUMA nodes over PCIe may see 10-20x lower effective bandwidth. If distributed training is unexpectedly slow, check Topology view for the interconnect path.
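The bandwidth gap is easy to feel with back-of-envelope arithmetic. The 600 GB/s figure matches the NVLink 4.0 number above; the PCIe figure is an assumed effective rate for a cross-NUMA path, roughly in the 10-20x-slower range the text describes:

```python
# Estimated transfer time for a 4 GiB gradient bucket over each link.
tensor_bytes = 4 * 2**30
nvlink_bps = 600e9            # NVLink 4.0, per the text
pcie_effective_bps = 40e9     # assumption: ~15x slower cross-NUMA PCIe path

nvlink_ms = tensor_bytes / nvlink_bps * 1000
pcie_ms = tensor_bytes / pcie_effective_bps * 1000
print(f"NVLink: {nvlink_ms:.1f} ms, cross-NUMA PCIe: {pcie_ms:.1f} ms")
```

A ~7 ms exchange versus a ~107 ms one, repeated every training step, is the kind of gap that shows up as mysteriously low GPU utilization rather than an obvious error.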
## 16+ GPU Systems (Pagination)
On systems with more than 8 GPUs, Grid view paginates automatically.
| Key | Action |
|---|---|
| PgDn | Next page of GPUs |
| PgUp | Previous page of GPUs |
The status bar shows the current page (e.g., GPUs 9-16 of 64). All metrics continue updating for off-screen GPUs. For 16+ GPU systems, consider List view (v) as a denser alternative.
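The page arithmetic behind the status-bar range can be sketched directly; the 8-tile page size comes from the text above, and the helper name is invented for illustration:

```python
import math

PAGE_SIZE = 8                 # Grid view paginates past 8 GPUs, per the text

def page_bounds(page, total_gpus, page_size=PAGE_SIZE):
    """1-based GPU range for a 0-based page, as in 'GPUs 9-16 of 64'."""
    first = page * page_size + 1
    last = min((page + 1) * page_size, total_gpus)
    return first, last

total = 64
pages = math.ceil(total / PAGE_SIZE)      # 8 pages for a 64-GPU system
print(page_bounds(1, total))              # second page shows GPUs 9-16
```

The `min` on the last page handles GPU counts that are not a multiple of the page size, so a 60-GPU system ends its final page at GPU 60.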