High Performance Plasticity  0.5.0
spectralUtilsCUDA.h File Reference

Header file for spectral utilities CUDA implementations.

#include <hpp/config.h>
#include <hpp/crystal.h>
#include <hpp/cudaUtils.h>

Classes

struct  hpp::SpectralDataCUDA< T, N >
 
class  hpp::SpectralCoordCUDA< N >
 
struct  hpp::SpectralCoeffCUDA< T >
 
class  hpp::SpectralDatasetCUDA< T, N >
 
class  hpp::SpectralDatabaseCUDA< T, N >
 
class  hpp::SpectralDatabaseUnifiedCUDA< T, N, P >
 

Namespaces

 hpp
 

Functions

template<typename T , unsigned int N>
__global__ void hpp::GET_IDFT_REAL (SpectralDatabaseCUDA< T, N > *db, unsigned int dsetIdx, unsigned int *spatialCoord, T *val)
 
template<typename T , unsigned int N, unsigned int P>
struct hpp::ALIGN (16) SpectralDataUnifiedCUDA
 
template<typename T , unsigned int N>
__device__ void hpp::getExpVal (unsigned int *spatialCoord, SpectralCoordCUDA< N > &coord, unsigned int gridDim, T expArgFactor, T *expValRe, T *expValIm)
 

Detailed Description

Header file for spectral utilities CUDA implementations.

A spectral database for DFTs of real data.

Author
Michael Malahe
Template Parameters
T  the scalar datatype
N  the dimension of the database

Instances of the class itself may live on the device or host. However, all of the dynamically-allocated memory is on the device. The approach to dynamic memory has two parts. First, there are raw pointers to device memory. These are needed for use in the device member functions. Second, each of these raw pointers is additionally wrapped in a shared_ptr. This ensures that the memory is freed upon destruction of the final holder of the memory, and also that the database can be passed around and copied freely.
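The two-part ownership scheme described above can be sketched on the host as follows. This is a minimal illustration, not the library's code: `BufferHandle`, `releaseBuffer`, and `g_freeCount` are hypothetical names, and `std::malloc`/`std::free` stand in for `cudaMalloc`/`cudaFree` so that the sketch compiles without a CUDA toolchain.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <memory>

// Hypothetical counter so the example can observe when the buffer is freed.
static int g_freeCount = 0;

// Stand-in deleter; a device build would call cudaFree(p) here instead.
inline void releaseBuffer(double* p) {
    std::free(p);
    ++g_freeCount;
}

// Hypothetical miniature of the scheme: a raw pointer for use in device
// member functions, plus a shared_ptr that owns the same allocation.
struct BufferHandle {
    double* raw = nullptr;          // what device code would dereference
    std::shared_ptr<double> owner;  // frees the allocation with the last copy
    std::size_t count = 0;

    explicit BufferHandle(std::size_t n) : count(n) {
        // Stand-in for cudaMalloc.
        raw = static_cast<double*>(std::malloc(n * sizeof(double)));
        owner = std::shared_ptr<double>(raw, releaseBuffer);
    }
};
```

Copying a `BufferHandle` copies both the raw pointer and the `shared_ptr`, so every copy refers to the same allocation and the deleter runs once, when the last copy is destroyed.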

The class is designed around making its getIDFTReal members as fast as possible. To this end, there are a number of awkward but worthwhile optimisations.

  1. The sign on the imaginary terms of the Fourier coefficients is pre-negated in the constructor. This allows the exp*coeff multiplication to be done in exactly two FMA operations.
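One reading of the pre-negation item is that only the real part of each exp*coeff product is accumulated, since the inverse transform returns real data. The real part is expRe*coeffRe - expIm*coeffIm, so storing -coeffIm turns the update into two fused multiply-adds. The sketch below shows that arithmetic on the host; `PreNegatedCoeff`, `makePreNegated`, and `accumulateRealPart` are hypothetical names and are not taken from the header.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical coefficient type: imNeg holds the imaginary part with its
// sign already flipped, as the constructor is described as doing.
struct PreNegatedCoeff {
    double re;
    double imNeg;
};

// Flip the sign once, up front.
inline PreNegatedCoeff makePreNegated(double re, double im) {
    return PreNegatedCoeff{re, -im};
}

// acc += Re(exp * coeff)
//      = expRe * re - expIm * im
//      = expRe * re + expIm * imNeg   (two FMAs, no separate negation)
inline double accumulateRealPart(double acc, double expRe, double expIm,
                                 const PreNegatedCoeff& c) {
    acc = std::fma(expRe, c.re, acc);
    acc = std::fma(expIm, c.imNeg, acc);
    return acc;
}
```

In device code the same two updates would presumably use the CUDA `fma` intrinsic or rely on the compiler contracting the multiply-adds.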
Template Parameters
T  the scalar type
N  the dimension of the data
P  the number of coefficients tied to each coordinate

This is specifically for "unified" coefficient datasets, which here means that there is a single ordering of the dominant components that all of the coefficients follow. See hpp::SpectralDatabaseCUDA

Template Parameters
T  the scalar datatype
N  the dimension of the database
P  the number of coefficients tied to each coordinate
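To make the "unified" idea concrete, the sketch below shows one possible layout in which each spectral coordinate is stored once together with its P coefficients, so that a single pass over the coordinates serves all P datasets. The layout and every name in it (`CoordSketch`, `CoeffSketch`, `UnifiedEntrySketch`, `sumRealParts`) are assumptions for illustration; the actual members of the library's unified structures may differ.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical N-dimensional spectral coordinate.
template <unsigned int N>
struct CoordSketch {
    unsigned int idx[N];
};

// Hypothetical complex coefficient.
template <typename T>
struct CoeffSketch {
    T re;
    T im;
};

// One coordinate shared by P coefficients, with 16-byte alignment
// (mirroring the ALIGN(16) in the declaration above).
template <typename T, unsigned int N, unsigned int P>
struct alignas(16) UnifiedEntrySketch {
    CoordSketch<N> coord;
    CoeffSketch<T> coeffs[P];
};

// Sum the real parts of coefficient slot p over all entries. Every slot
// walks the same coordinate ordering, which is the point of the layout.
template <typename T, unsigned int N, unsigned int P>
T sumRealParts(const UnifiedEntrySketch<T, N, P>* entries,
               std::size_t nEntries, unsigned int p) {
    T total = T(0);
    for (std::size_t i = 0; i < nEntries; ++i) {
        total += entries[i].coeffs[p].re;
    }
    return total;
}
```

With separate per-dataset orderings, each dataset would need its own coordinate list; tying P coefficients to one coordinate avoids repeating the coordinate-dependent work.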