Platform Computing bolsters its Platform HPC, Platform LSF, and Platform Symphony products to manage graphics processing unit (GPU) clusters and GPU-aware applications. The company reports that administrators can better manage and plan GPU cluster capacity through monitoring, reporting, and analysis tools, including new monitoring functions within the Platform HPC and Platform Symphony web-based interfaces and Platform RTM for large Platform LSF data centers.
GPU computing involves harnessing the parallel capabilities of graphics processing units to run the parallel portions of applications many times faster than standard CPUs can. GPU acceleration has gained traction in HPC data centers because computational problems in biology, physics, seismic processing, finance, and other disciplines run faster on GPUs.
The software provides cluster and workload management for resources with GPUs. You can use Platform HPC management software to deploy compute unified device architecture (CUDA) software to compute nodes and allocate jobs only to resources that have GPUs.
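The idea of steering jobs only to GPU-equipped nodes can be sketched as a simple host filter. This is an illustrative sketch, not Platform HPC's actual API; the host records and the `ngpus` field are assumptions for the example.

```python
# Illustrative sketch (not Platform's actual scheduler API): filter the
# candidate host list so GPU jobs are placed only on GPU-equipped nodes.

def gpu_capable_hosts(hosts):
    """Return hosts reporting at least one GPU (hypothetical 'ngpus' field)."""
    return [h for h in hosts if h.get("ngpus", 0) > 0]

# Hypothetical cluster inventory for the example.
cluster = [
    {"name": "node01", "ngpus": 2},
    {"name": "node02", "ngpus": 0},   # CPU-only node: excluded
    {"name": "node03", "ngpus": 4},
]

print([h["name"] for h in gpu_capable_hosts(cluster)])  # → ['node01', 'node03']
```

A real workload manager applies the same kind of predicate during scheduling, matching a job's resource requirement string against per-host metrics.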
The products expose GPU-specific scheduling metrics for NVIDIA Tesla GPUs in Platform HPC, Platform LSF, and Platform Symphony. They also include administrator monitoring support that covers:
• GPU-related management metrics such as the number of available GPUs, GPU temperature, GPU operating mode, and ECC error counts.
• For NVIDIA Tesla 20-series GPUs, Platform supports chassis-level metrics such as fan speeds, PSU voltage, electrical current, and LED states.
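Metrics of this kind are typically collected per GPU and checked against thresholds. The sketch below parses `nvidia-smi`-style CSV output into health records; the sample data, field names, and thresholds are assumptions for illustration, not Platform's actual monitoring schema.

```python
# Illustrative sketch: turn nvidia-smi-style CSV output into per-GPU
# health records (temperature, ECC error counts) and flag outliers.

import csv
import io

# Hypothetical sample output; field names mirror nvidia-smi query fields.
SAMPLE = """\
index, temperature.gpu, ecc.errors.corrected.total
0, 61, 0
1, 74, 3
"""

def parse_gpu_metrics(text):
    """Parse CSV metric rows into a list of per-GPU dictionaries."""
    reader = csv.DictReader(io.StringIO(text), skipinitialspace=True)
    return [
        {
            "index": int(row["index"]),
            "temp_c": int(row["temperature.gpu"]),
            "ecc_corrected": int(row["ecc.errors.corrected.total"]),
        }
        for row in reader
    ]

# Flag GPUs that exceed example thresholds (values chosen arbitrarily).
for gpu in parse_gpu_metrics(SAMPLE):
    if gpu["temp_c"] > 70 or gpu["ecc_corrected"] > 0:
        print(f"GPU {gpu['index']}: check temperature/ECC status")
```

A monitoring interface like Platform RTM aggregates comparable per-device readings across the cluster for capacity planning and alerting.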
You can manage GPU devices in a manner similar to traditional computational devices.
Platform products currently support NVIDIA Tesla GPUs, with plans to support future GPU accelerators as they become available.
Filed Under: Software