Memory Management
PyTorch provides comprehensive GPU memory management through CUDA, allowing developers to control memory allocation, transfer data between CPU and GPU, and monitor memory usage. The system includes automatic memory management features while also offering manual control when needed for optimization. These capabilities are essential for training large models and handling substantial datasets efficiently.
Syntax
Memory Release Methods
.empty_cache()
: Releases all unoccupied cached GPU memory held by PyTorch's allocator back to the driver.
torch.cuda.empty_cache()
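For instance, deleting a tensor lowers the allocated count, but the freed block stays in PyTorch's cache until the cache is cleared. A minimal sketch (the tensor size is illustrative):

import torch

# Allocate a tensor, then delete it; the freed block stays in PyTorch's cache
t = torch.randn(4096, 4096, device='cuda')
del t
print(f"Reserved before: {torch.cuda.memory_reserved() / 1024**2:.2f} MB")

# empty_cache() returns the unoccupied cached blocks to the GPU driver
torch.cuda.empty_cache()
print(f"Reserved after: {torch.cuda.memory_reserved() / 1024**2:.2f} MB")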
Memory Monitoring Methods
.memory_allocated()
: Returns current memory used by tensors (in bytes).
torch.cuda.memory_allocated()
.memory_reserved()
: Returns the total GPU memory reserved by PyTorch's caching allocator (allocated plus cached), in bytes.
torch.cuda.memory_reserved()
.max_memory_allocated()
: Returns the peak GPU memory usage since the start of the program or last reset.
torch.cuda.max_memory_allocated()
.reset_peak_memory_stats()
: Resets peak memory tracking statistics to current values.
torch.cuda.reset_peak_memory_stats()
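These monitoring calls can be combined to profile individual phases of a program; resetting the peak counter between phases measures each one separately. A sketch with illustrative tensor sizes:

import torch

# Phase 1: a large temporary tensor drives up the peak
a = torch.randn(8192, 8192, device='cuda')
del a
print(f"Phase 1 peak: {torch.cuda.max_memory_allocated() / 1024**2:.2f} MB")

# Reset the peak counter so phase 2 is measured independently
torch.cuda.reset_peak_memory_stats()

# Phase 2: a smaller tensor now records its own, lower peak
b = torch.randn(2048, 2048, device='cuda')
print(f"Phase 2 peak: {torch.cuda.max_memory_allocated() / 1024**2:.2f} MB")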
Memory Limiting Methods
.set_per_process_memory_fraction()
: Limits PyTorch to using only the specified fraction of total GPU memory.
torch.cuda.set_per_process_memory_fraction(0.7)
.total_memory
: Returns the total memory of the specified GPU device, in bytes.
torch.cuda.get_device_properties(device).total_memory
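The two can be combined: read total_memory to see the card's capacity, then cap the process at a fraction of it. A sketch, assuming device 0:

import torch

device = 0
total = torch.cuda.get_device_properties(device).total_memory
print(f"Total GPU memory: {total / 1024**2:.2f} MB")

# Cap this process at roughly 70% of the card; allocations past the
# cap raise an out-of-memory error instead of consuming the whole GPU
torch.cuda.set_per_process_memory_fraction(0.7, device=device)
print(f"Approximate budget: {total * 0.7 / 1024**2:.2f} MB")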
Memory Optimization Methods
.pin_memory()
: Returns a copy of the tensor in pinned (page-locked) CPU memory for faster CPU-to-GPU transfers.
tensor.pin_memory()
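A pinned source tensor allows the host-to-device copy to run asynchronously with respect to the CPU. A minimal sketch (sizes are illustrative):

import torch

# Copy a CPU tensor into pinned (page-locked) memory
cpu_batch = torch.randn(1024, 1024).pin_memory()

# A pinned source permits an asynchronous host-to-device copy
gpu_batch = cpu_batch.to('cuda', non_blocking=True)

In data-loading code this is more commonly enabled with DataLoader(pin_memory=True) than by pinning tensors by hand.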
.zero_grad()
: Clears gradients efficiently by setting them to None instead of zeroing them.
model.zero_grad(set_to_none=True)
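A minimal training-step sketch (the model, optimizer, and input below are illustrative) showing the call in context:

import torch

model = torch.nn.Linear(128, 10).cuda()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
inputs = torch.randn(32, 128, device='cuda')

loss = model(inputs).sum()
loss.backward()
optimizer.step()

# set_to_none=True frees the .grad tensors entirely rather than
# writing zeros into buffers that would stay allocated
model.zero_grad(set_to_none=True)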
Example
The following example demonstrates PyTorch's GPU memory management: it creates large tensors on the GPU, monitors memory usage, frees unused memory, and sets a memory limit, printing each statistic in megabytes:
import torch

# Create some tensors on GPU
x = torch.randn(10000, 10000, device='cuda')
y = torch.randn(10000, 10000, device='cuda')

# Monitor memory usage
print(f"Current memory allocated: {torch.cuda.memory_allocated() / 1024**2:.2f} MB")
print(f"Max memory allocated: {torch.cuda.max_memory_allocated() / 1024**2:.2f} MB")
print(f"Reserved memory: {torch.cuda.memory_reserved() / 1024**2:.2f} MB")

# Clear unused memory
del x
torch.cuda.empty_cache()

# Set memory limit
torch.cuda.set_per_process_memory_fraction(0.8)  # Limit to 80% GPU memory

print(f"Memory after cleanup: {torch.cuda.memory_allocated() / 1024**2:.2f} MB")
The output of the above code will be:
Current memory allocated: 764.00 MB
Max memory allocated: 764.00 MB
Reserved memory: 764.00 MB
Memory after cleanup: 382.00 MB
Note: The memory values in the output will vary depending on GPU, system configuration, and other running processes. Each run might show different memory statistics even on the same system.