AMD Releases ROCm v7.2: Major AI and HPC Upgrades
Quick Report
AMD has introduced ROCm v7.2, a release focused on faster, more scalable AI and HPC workloads on Instinct GPUs. The update delivers performance boosts, new low-precision data types, and improved multi-GPU communication, making AMD's platform more competitive for large-scale AI deployments.
Key highlights include hipBLASLt and GEMM optimizations, FP8/FP4 support, topology-aware GPU communication, ThinLTO compiler enhancements, and Node Power Management for efficient multi-GPU operation. Security and reliability have also been improved for enterprise and cloud use. ROCm 7.2 is tuned for models such as Llama 3 and GLM-4.6, and supports the latest Instinct MI300X, MI350, and MI355X GPUs.
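To give a sense of what the new low-precision data types trade away, here is a minimal NumPy sketch that rounds values to the FP8 E4M3 grid (4 exponent bits, 3 mantissa bits, bias 7), one of the 8-bit formats commonly used for AI inference. This is a conceptual simulation only, not ROCm's actual conversion API, and it omits saturation to the format's maximum and NaN handling.

```python
import numpy as np

def quantize_e4m3(x):
    """Round values to the nearest FP8 E4M3-representable number.

    Simplified illustration: no saturation at the format maximum,
    no NaN handling. Real kernels use hardware conversions.
    """
    x = np.asarray(x, dtype=np.float64)
    sign = np.sign(x)
    mag = np.abs(x)
    normal = mag >= 2.0 ** -6  # smallest normal exponent in E4M3 is -6
    # Normal numbers: keep the exponent, round the mantissa to 3 bits.
    e = np.floor(np.log2(np.where(normal, mag, 1.0)))
    scaled = np.where(normal, mag / 2.0 ** e, 0.0)
    out = np.where(normal, np.round(scaled * 8) / 8 * 2.0 ** e, 0.0)
    # Subnormals: uniform grid with step 2**-9.
    out = np.where(~normal, np.round(mag / 2.0 ** -9) * 2.0 ** -9, out)
    return sign * out

print(quantize_e4m3(0.3))  # 0.3125 — the nearest E4M3 value
```

With only 3 mantissa bits, 0.3 lands on 0.3125: roughly 4% relative error per element, which is why FP8 GEMMs typically rely on per-tensor or per-block scaling to stay accurate.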
Written using GitHub Copilot (GPT-4.1) in agentic mode, instructed to follow the current codebase style and conventions for writing articles.
Source(s)
- TPU
- AMD ROCm 7.2 Blog
- ROCm 7.2 Release Notes