GPUMLib  0.2.2
GPU Machine Learning Library
Common framework

Classes

class  CudaStream
 Represents a CUDA stream.
 

Macros

#define USE_SINGLE_PRECISION_VARIABLES
 
#define USE_SINGLE_PRECISION_FUNCTIONS
 
#define KERNEL   __global__ void
 Defines the type of a kernel function.
 
#define MAX_THREADS_PER_BLOCK   (512)
 Defines the maximum threads per block.
 
#define SIZE_SMALL_CUDA_VECTOR   (3 * MAX_THREADS_PER_BLOCK)
 
#define OPTIMAL_BLOCK_SIZE_REDUCTION   (128)
 
#define CUDA_VALUE(X)   (X##f)
 
#define MAX_CUDAFLOAT   (FLT_MAX)
 
#define MIN_POSITIVE_CUDAFLOAT   (FLT_MIN)
 
#define MIN_CUDAFLOAT   (-FLT_MAX)
 
#define CUDA_EXP   expf
 
#define CUDA_SQRT   sqrtf
 
#define CUDA_TANH   tanhf
 
#define CUDA_COSH   coshf
 
#define CUDA_POW   powf
 
#define CUDA_SIGMOID(X)   (CUDA_VALUE(1.0) / (CUDA_VALUE(1.0) + CUDA_EXP(-(X))))
 
#define CUDA_SIGMOID_DERIVATE(OUTPUT)   ((OUTPUT) * (CUDA_VALUE(1.0) - (OUTPUT)))
 
#define SAME_DIRECTION(X, Y)   (((X) > CUDA_VALUE(0.0) && (Y) > CUDA_VALUE(0.0)) || ((X) < CUDA_VALUE(0.0) && (Y) < CUDA_VALUE(0.0)))
 Verifies whether X and Y have the same sign.
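
 A minimal host-side sketch of how SAME_DIRECTION behaves, using the single-precision definitions listed on this page. The function name sameDirectionExamples is hypothetical and only serves as an illustration. Note that the macro uses strict comparisons, so a zero argument never counts as "same direction".

```cpp
#include <cassert>

// Definitions copied from this page (single-precision configuration).
#define CUDA_VALUE(X) (X##f)
#define SAME_DIRECTION(X, Y) (((X) > CUDA_VALUE(0.0) && (Y) > CUDA_VALUE(0.0)) || ((X) < CUDA_VALUE(0.0) && (Y) < CUDA_VALUE(0.0)))

// Hypothetical illustration, not part of GPUMLib.
inline void sameDirectionExamples() {
    assert(SAME_DIRECTION(0.5f, 2.0f));    // both positive
    assert(SAME_DIRECTION(-0.5f, -2.0f));  // both negative
    assert(!SAME_DIRECTION(0.5f, -2.0f));  // opposite signs
    assert(!SAME_DIRECTION(0.0f, 2.0f));   // zero is neither positive nor negative
}
```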
 

Typedefs

typedef float cudafloat
 

Functions

int NumberThreadsPerBlockThatBestFit (int threads, int maxThreadsPerBlock=MAX_THREADS_PER_BLOCK)
 
int NumberBlocks (int threads, int blockSize)
 


Macro Definition Documentation

#define CUDA_COSH   coshf

Defines the hyperbolic cosine function to be used by CUDA. Depends on USE_SINGLE_PRECISION_FUNCTIONS being defined.

See also
USE_SINGLE_PRECISION_FUNCTIONS

Definition at line 131 of file CudaDefinitions.h.

#define CUDA_EXP   expf

Defines the exponential function to be used by CUDA. Depends on USE_SINGLE_PRECISION_FUNCTIONS being defined.

See also
USE_SINGLE_PRECISION_FUNCTIONS

Definition at line 119 of file CudaDefinitions.h.

#define CUDA_POW   powf

Defines the power function to be used by CUDA. Depends on USE_SINGLE_PRECISION_FUNCTIONS being defined.

See also
USE_SINGLE_PRECISION_FUNCTIONS

Definition at line 135 of file CudaDefinitions.h.

#define CUDA_SIGMOID(X)   (CUDA_VALUE(1.0) / (CUDA_VALUE(1.0) + CUDA_EXP(-(X))))

Defines the logistic function to be used by CUDA.

See also
CUDA_EXP

Definition at line 162 of file CudaDefinitions.h.
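
As an illustration, the macro can be exercised on the host with the single-precision definitions from this page. The wrapper name sigmoid is hypothetical; the sketch only shows that the expansion is 1 / (1 + e^(-x)) evaluated with expf.

```cpp
#include <cassert>
#include <math.h>  // expf in the global namespace

// Definitions copied from this page (single-precision configuration).
#define CUDA_VALUE(X) (X##f)
#define CUDA_EXP expf
#define CUDA_SIGMOID(X) (CUDA_VALUE(1.0) / (CUDA_VALUE(1.0) + CUDA_EXP(-(X))))

// Hypothetical host-side wrapper, not part of GPUMLib.
inline float sigmoid(float x) {
    return CUDA_SIGMOID(x);  // expands to (1.0f / (1.0f + expf(-(x))))
}

inline void sigmoidExamples() {
    assert(sigmoid(0.0f) == 0.5f);   // midpoint of the logistic curve
    assert(sigmoid(10.0f) > 0.99f);  // saturates towards 1
    assert(sigmoid(-10.0f) < 0.01f); // saturates towards 0
}
```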

#define CUDA_SIGMOID_DERIVATE(OUTPUT)   ((OUTPUT) * (CUDA_VALUE(1.0) - (OUTPUT)))

Defines the derivative of the logistic function to be used by CUDA. The argument is the output of the logistic function, not its input.

See also
CUDA_SIGMOID

Definition at line 166 of file CudaDefinitions.h.
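
The macro relies on the identity s'(x) = s(x) * (1 - s(x)) for the logistic function s, which is why it takes the already computed output rather than x. A minimal host-side sketch, assuming the single-precision definitions from this page (the helper name sigmoidSlopeAt is hypothetical):

```cpp
#include <cassert>
#include <math.h>  // expf in the global namespace

// Definitions copied from this page (single-precision configuration).
#define CUDA_VALUE(X) (X##f)
#define CUDA_EXP expf
#define CUDA_SIGMOID(X) (CUDA_VALUE(1.0) / (CUDA_VALUE(1.0) + CUDA_EXP(-(X))))
#define CUDA_SIGMOID_DERIVATE(OUTPUT) ((OUTPUT) * (CUDA_VALUE(1.0) - (OUTPUT)))

// Hypothetical helper, not part of GPUMLib: slope of the logistic function at x.
inline float sigmoidSlopeAt(float x) {
    const float y = CUDA_SIGMOID(x);  // forward pass: y = s(x)
    return CUDA_SIGMOID_DERIVATE(y);  // derivative computed from the output y
}

inline void sigmoidDerivativeExample() {
    assert(sigmoidSlopeAt(0.0f) == 0.25f);                // 0.5 * (1 - 0.5)
    assert(sigmoidSlopeAt(5.0f) < sigmoidSlopeAt(0.0f));  // slope is largest at x = 0
}
```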

#define CUDA_SQRT   sqrtf

Defines the square root function to be used by CUDA. Depends on USE_SINGLE_PRECISION_FUNCTIONS being defined.

See also
USE_SINGLE_PRECISION_FUNCTIONS

Definition at line 123 of file CudaDefinitions.h.

#define CUDA_TANH   tanhf

Defines the hyperbolic tangent function to be used by CUDA. Depends on USE_SINGLE_PRECISION_FUNCTIONS being defined.

See also
USE_SINGLE_PRECISION_FUNCTIONS

Definition at line 127 of file CudaDefinitions.h.

#define CUDA_VALUE(X)   (X##f)

Represents a floating point number, according to the type used by CUDA to represent floating point values.

See also
cudafloat
USE_SINGLE_PRECISION_VARIABLES
Examples:
ATS.cpp, BP.cpp, and MBP.cpp.

Definition at line 70 of file CudaDefinitions.h.
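
The single-precision definition works by token pasting: the preprocessor appends the suffix f to the literal. The sketch below illustrates this; the function names are hypothetical. Because of the pasting, the argument should be a literal with a decimal point or exponent: CUDA_VALUE(1) would expand to the invalid token 1f, and a variable name x would become the unrelated identifier xf.

```cpp
#include <cassert>
#include <type_traits>

// Definitions copied from this page (single-precision configuration).
typedef float cudafloat;
#define CUDA_VALUE(X) (X##f)

// CUDA_VALUE(1.0) expands to (1.0f), so the literal has type float, not double.
static_assert(std::is_same<decltype(CUDA_VALUE(1.0)), float>::value,
              "CUDA_VALUE yields a float literal");

// Hypothetical helper, not part of GPUMLib.
inline cudafloat halve(cudafloat x) {
    return x * CUDA_VALUE(0.5);  // arithmetic stays in single precision
}

inline void cudaValueExample() {
    assert(halve(CUDA_VALUE(3.0)) == 1.5f);
}
```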

#define MAX_CUDAFLOAT   (FLT_MAX)

Maximum value that a cudafloat can hold.

See also
cudafloat
USE_SINGLE_PRECISION_VARIABLES

Definition at line 75 of file CudaDefinitions.h.

#define MIN_CUDAFLOAT   (-FLT_MAX)

Lowest (most negative) finite value that a cudafloat can hold.

See also
MIN_POSITIVE_CUDAFLOAT
cudafloat
USE_SINGLE_PRECISION_VARIABLES
Attention
Check also MIN_POSITIVE_CUDAFLOAT to see which one is appropriate for your purposes.

Definition at line 88 of file CudaDefinitions.h.
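
The difference between the two constants matters, for example, when seeding a running maximum. The sketch below is a host-side illustration using the single-precision definitions from this page; the helper maxFrom is hypothetical.

```cpp
#include <cassert>
#include <cfloat>
#include <cstddef>

// Definitions copied from this page (single-precision configuration).
typedef float cudafloat;
#define MIN_POSITIVE_CUDAFLOAT (FLT_MIN)
#define MIN_CUDAFLOAT (-FLT_MAX)

// Hypothetical helper, not part of GPUMLib: running maximum starting from a seed.
inline cudafloat maxFrom(cudafloat seed, const cudafloat* v, std::size_t n) {
    cudafloat m = seed;
    for (std::size_t i = 0; i < n; ++i) {
        if (v[i] > m) m = v[i];
    }
    return m;
}

inline void minConstantsExample() {
    const cudafloat v[] = {-3.0f, -1.5f, -7.0f};
    // Seeding with the most negative value finds the true maximum.
    assert(maxFrom(MIN_CUDAFLOAT, v, 3) == -1.5f);
    // Seeding with the smallest positive value hides every negative element.
    assert(maxFrom(MIN_POSITIVE_CUDAFLOAT, v, 3) == FLT_MIN);
}
```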

#define MIN_POSITIVE_CUDAFLOAT   (FLT_MIN)

Smallest positive normalized value that a cudafloat can hold.

See also
MIN_CUDAFLOAT
cudafloat
USE_SINGLE_PRECISION_VARIABLES

Definition at line 81 of file CudaDefinitions.h.

#define OPTIMAL_BLOCK_SIZE_REDUCTION   (128)

Defines the optimal block size for reduction operations. This value is optimized for a GTX 280.

Attention
If you optimize the value for other boards, please send me the information (noel@ipg.pt) so that I can include it here (thank you).

Definition at line 60 of file CudaDefinitions.h.

#define SIZE_SMALL_CUDA_VECTOR   (3 * MAX_THREADS_PER_BLOCK)

Defines the size of a small vector (for reduction purposes). This value defines the size of a vector for which computing a reduction (sum, average, max, min, ...) using a single block is faster than using several blocks. This value is optimized for a GTX 280.

Attention
If you optimize the value for other boards, please send me the information (noel@ipg.pt) so that I can include it here (thank you).

Definition at line 56 of file CudaDefinitions.h.
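
A sketch of how such a threshold might be consulted when choosing a reduction strategy. The helper fitsSingleBlockReduction is hypothetical, and treating the threshold as inclusive is an assumption; this page does not say how the library performs the comparison.

```cpp
#include <cassert>

// Definitions copied from this page.
#define MAX_THREADS_PER_BLOCK (512)
#define SIZE_SMALL_CUDA_VECTOR (3 * MAX_THREADS_PER_BLOCK)

// Hypothetical helper, not part of GPUMLib.
// Assumption: vectors up to and including the threshold count as "small".
inline bool fitsSingleBlockReduction(int vectorSize) {
    assert(vectorSize >= 0);
    return vectorSize <= SIZE_SMALL_CUDA_VECTOR;
}
```

With the values above, SIZE_SMALL_CUDA_VECTOR evaluates to 1536 elements.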

#define USE_SINGLE_PRECISION_FUNCTIONS

Tells CUDA to use single-precision functions. To use double-precision functions (on supported devices), comment out this define.

Definition at line 40 of file CudaDefinitions.h.

#define USE_SINGLE_PRECISION_VARIABLES

Tells CUDA to use single-precision variables. To use double-precision variables and values, comment out this define. You also need to compile with -arch sm_13.

Definition at line 36 of file CudaDefinitions.h.

Typedef Documentation

typedef float cudafloat

Type used by CUDA to represent floating point numbers. If USE_SINGLE_PRECISION_VARIABLES is defined, cudafloat represents a float. Otherwise, it represents a double.

See also
USE_SINGLE_PRECISION_VARIABLES
Examples:
BP.cpp, and MBP.cpp.

Definition at line 65 of file CudaDefinitions.h.
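
The selection described above could be organized roughly as follows. This is a sketch, not a copy of CudaDefinitions.h: the single-precision branch repeats the definitions shown on this page, while the double-precision branch (plain literals and DBL_MAX) is an assumption. The helper twice is hypothetical.

```cpp
#include <cassert>
#include <cfloat>

#define USE_SINGLE_PRECISION_VARIABLES  // comment out to select double precision

#ifdef USE_SINGLE_PRECISION_VARIABLES
typedef float cudafloat;
#define CUDA_VALUE(X) (X##f)
#define MAX_CUDAFLOAT (FLT_MAX)
#else
// Assumed double-precision counterparts; not shown on this page.
typedef double cudafloat;
#define CUDA_VALUE(X) (X)
#define MAX_CUDAFLOAT (DBL_MAX)
#endif

// Hypothetical helper: code written against cudafloat and CUDA_VALUE
// compiles unchanged in either configuration.
inline cudafloat twice(cudafloat x) {
    return x * CUDA_VALUE(2.0);
}

inline void cudafloatExample() {
    assert(twice(CUDA_VALUE(1.5)) == CUDA_VALUE(3.0));
}
```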

Function Documentation

inline int GPUMLib::NumberBlocks (int threads, int blockSize)

Finds the number of blocks needed to execute the number of threads specified, given a block size.

Parameters
threads: Number of threads.
blockSize: Block size.
Returns
The number of blocks needed to execute the number of threads specified.
See also
NumberThreadsPerBlockThatBestFit, MAX_THREADS_PER_BLOCK

Definition at line 49 of file Utilities.h.
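
One way to obtain this result is a rounded-up integer division. The sketch below is an assumption about the behavior, not the code from Utilities.h, and uses the hypothetical name numberBlocksSketch to make that clear.

```cpp
#include <cassert>

// Hypothetical stand-in for GPUMLib::NumberBlocks.
// Assumption: the result is ceil(threads / blockSize).
inline int numberBlocksSketch(int threads, int blockSize) {
    assert(threads >= 0 && blockSize > 0);
    return (threads + blockSize - 1) / blockSize;  // integer ceiling division
}
```

For instance, 1000 threads with a block size of 256 need 4 blocks under this assumption; the last block is only partially used, so a kernel would still have to check its global index against the number of threads.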

inline int GPUMLib::NumberThreadsPerBlockThatBestFit (int threads, int maxThreadsPerBlock = MAX_THREADS_PER_BLOCK)

Finds the number of threads (multiple of 2) per block that is either greater than the number of threads needed or equal to the maximum number of threads per block.

Parameters
threads: Number of threads.
maxThreadsPerBlock: Maximum number of threads per block.
Returns
The number of threads (multiple of 2) per block that is either greater than the number of threads needed or equal to the maximum number of threads per block.
See also
MAX_THREADS_PER_BLOCK, NumberBlocks

Definition at line 37 of file Utilities.h.
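
A sketch of one strategy that matches this description: start at 1 and double the block size until it covers the requested threads or reaches the maximum. The doubling approach is an assumption, not the code from Utilities.h, and both function names are hypothetical. It also assumes maxThreadsPerBlock is itself a power of two, otherwise the result could exceed it.

```cpp
#include <cassert>

#define MAX_THREADS_PER_BLOCK (512)  // value listed on this page

// Hypothetical stand-in for GPUMLib::NumberThreadsPerBlockThatBestFit.
// Assumption: the block size is found by repeated doubling.
inline int bestFitSketch(int threads, int maxThreadsPerBlock = MAX_THREADS_PER_BLOCK) {
    int blockSize = 1;
    while (blockSize < threads && blockSize < maxThreadsPerBlock) {
        blockSize <<= 1;  // double until large enough or capped
    }
    return blockSize;
}

// Hypothetical stand-in for GPUMLib::NumberBlocks (ceiling division assumed).
inline int numberBlocksSketch(int threads, int blockSize) {
    assert(threads >= 0 && blockSize > 0);
    return (threads + blockSize - 1) / blockSize;
}

inline void launchConfigurationExample() {
    const int threads = 1000;
    const int blockSize = bestFitSketch(threads);               // capped at 512
    const int blocks = numberBlocksSketch(threads, blockSize);  // 2 blocks cover 1000 threads
    assert(blockSize == 512 && blocks == 2);
}
```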