This example measures the memory bandwidth capacity of GPU devices. It performs memory copies from host to GPU device, from GPU device to host, and within a single GPU.
- User command-line arguments are parsed and the test parameters are initialized. If no command-line arguments are given, the test parameters are initialized with default values.
- Bandwidth tests are launched.
- If the memory type for the test is set to `-memory pageable`, the host-side data is instantiated in a `std::vector<unsigned char>`. If the memory type for the test is set to `-memory pinned`, the host-side data is instantiated in an `unsigned char*` and allocated using `hipHostMalloc`.
- Device-side storage is allocated using `hipMalloc` in an `unsigned char*`.
- The memory transfer is performed `trail` times using `hipMemcpy` for pageable memory or using `hipMemcpyAsync` for host-allocated pinned memory.
- The time taken by the memory transfer operations is measured and then used to calculate the bandwidth.
- All device memory is freed using `hipFree` and all host-allocated pinned memory is freed using `hipHostFree`.
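The last measurement step, turning an elapsed time into a bandwidth figure, can be sketched in plain C++ as follows. This is a minimal illustration and not the example's actual code: the helper name `bandwidth_gb_per_s`, its parameters, and the choice of 1 GB = 10^9 bytes are assumptions made here.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the example's source): average bandwidth
// in GB/s, assuming 1 GB = 1e9 bytes, for `trials` transfers of
// `bytes_per_transfer` bytes each that took `elapsed_ms` milliseconds in total.
double bandwidth_gb_per_s(std::size_t bytes_per_transfer,
                          unsigned int trials,
                          double       elapsed_ms)
{
    // A non-positive elapsed time would make the result meaningless.
    assert(elapsed_ms > 0.0);

    const double total_bytes = static_cast<double>(bytes_per_transfer) * trials;
    const double seconds     = elapsed_ms / 1000.0;
    return total_bytes / seconds / 1e9;
}
```

For instance, 10 transfers of 32 MiB each completing in 25 ms in total give `bandwidth_gb_per_s(32u * 1024u * 1024u, 10, 25.0)`, which is roughly 13.42 GB/s. Dividing the total byte count by the total time averages out per-transfer jitter, which is one reason to repeat the copy several times.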
The program uses HIP pageable and pinned memory. It is important to note that pinned memory is allocated using `hipHostMalloc` and freed using `hipHostFree`. The HIP memory transfer routine `hipMemcpyAsync` will behave synchronously if the host memory is not pinned. Therefore, it is important to allocate pinned host memory using `hipHostMalloc` for `hipMemcpyAsync` to behave asynchronously.
- `hipMalloc`
- `hipMemcpy`
- `hipMemcpyAsync`
- `hipGetDeviceCount`
- `hipGetDeviceProperties`
- `hipFree`
- `hipHostFree`
- `hipHostMalloc`
- `hipSetDevice`