GPUMLib  0.2.2
GPU Machine Learning Library
Reduction framework

Classes

class  Reduction
 Provides reduction functions (Sum, Average, Max, Min, ...).
 

Functions

void KernelSum (cudaStream_t stream, int blocks, int blockSize, cudafloat *inputs, cudafloat *outputs, int numInputs)
 
void KernelSumSmallArray (cudaStream_t stream, int blockSize, cudafloat *inputs, cudafloat *output, int numInputs, cudafloat multiplyFactor)
 
void KernelMin (cudaStream_t stream, int blocks, int blockSize, cudafloat *inputs, cudafloat *output, int numInputs)
 
void KernelMinIndexes (cudaStream_t stream, int blocks, int blockSize, cudafloat *inputs, cudafloat *output, int *minIndexes, int numInputs, int *indexes)
 
void KernelMax (cudaStream_t stream, int blocks, int blockSize, cudafloat *inputs, cudafloat *output, int numInputs)
 
void KernelMaxIndexes (cudaStream_t stream, int blocks, int blockSize, cudafloat *inputs, cudafloat *output, int *maxIndexes, int numInputs, int *indexes)
 

Detailed Description

Function Documentation

void KernelMax ( cudaStream_t  stream,
int  blocks,
int  blockSize,
cudafloat *  inputs,
cudafloat *  output,
int  numInputs 
)

Kernel to compute the maximum of an array.

Parameters
[in]   stream     CUDA stream
[in]   blocks     Number of thread blocks
[in]   blockSize  Block size (number of threads per block)
[in]   inputs     Input array
[out]  output     Pointer to the location that will contain the maximum
[in]   numInputs  Number of inputs

Definition at line 351 of file MaxKernel.cu.

void KernelMaxIndexes ( cudaStream_t  stream,
int  blocks,
int  blockSize,
cudafloat *  inputs,
cudafloat *  output,
int *  maxIndexes,
int  numInputs,
int *  indexes 
)

Kernel to compute the maximum of an array and its index within the array.

Parameters
[in]   stream      CUDA stream
[in]   blocks      Number of thread blocks
[in]   blockSize   Block size (number of threads per block)
[in]   inputs      Input array
[out]  output      Pointer to the location that will contain the maximum
[out]  maxIndexes  Pointer to the location that will contain the index of one of the maximums
[in]   numInputs   Number of inputs
[in]   indexes     Buffer used to temporarily store the indexes. Must have the same size as the inputs array.

Definition at line 431 of file MaxKernel.cu.
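As a way to check the results of KernelMaxIndexes on the host, the following is a minimal CPU sketch of the value-and-index semantics described above. It is not GPUMLib code: the names MaxResult and MaxWithIndexReference are hypothetical, cudafloat is assumed to be float, and the input is assumed to be non-empty. Because the documentation only promises the index of one of the maximums, a GPU result may legitimately report a different index than this sketch when the maximum occurs more than once; comparing inputs[index] against the maximum value is the safer check.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper type (not part of GPUMLib): a maximum and where it was found.
struct MaxResult {
    float value;  // largest element of the array
    int index;    // position of one element equal to `value`
};

// CPU reference sketch. Assumes `inputs` is non-empty and that cudafloat is float.
// On ties this returns the first occurrence; the GPU kernel may return another one.
MaxResult MaxWithIndexReference(const std::vector<float>& inputs) {
    MaxResult result{inputs[0], 0};
    for (std::size_t i = 1; i < inputs.size(); ++i) {
        if (inputs[i] > result.value) {
            result.value = inputs[i];
            result.index = static_cast<int>(i);
        }
    }
    return result;
}
```

The same sketch with the comparison reversed (`<` instead of `>`) gives a reference for KernelMinIndexes.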

void KernelMin ( cudaStream_t  stream,
int  blocks,
int  blockSize,
cudafloat *  inputs,
cudafloat *  output,
int  numInputs 
)

Kernel to compute the minimum of an array.

Parameters
[in]   stream     CUDA stream
[in]   blocks     Number of thread blocks
[in]   blockSize  Block size (number of threads per block)
[in]   inputs     Input array
[out]  output     Pointer to the location that will contain the minimum
[in]   numInputs  Number of inputs

Definition at line 352 of file MinKernel.cu.

void KernelMinIndexes ( cudaStream_t  stream,
int  blocks,
int  blockSize,
cudafloat *  inputs,
cudafloat *  output,
int *  minIndexes,
int  numInputs,
int *  indexes 
)

Kernel to compute the minimum of an array and its index within the array.

Parameters
[in]   stream      CUDA stream
[in]   blocks      Number of thread blocks
[in]   blockSize   Block size (number of threads per block)
[in]   inputs      Input array
[out]  output      Pointer to the location that will contain the minimum
[out]  minIndexes  Pointer to the location that will contain the index of one of the minimums
[in]   numInputs   Number of inputs
[in]   indexes     Buffer used to temporarily store the indexes. Must have the same size as the inputs array.

Definition at line 432 of file MinKernel.cu.

void KernelSum ( cudaStream_t  stream,
int  blocks,
int  blockSize,
cudafloat *  inputs,
cudafloat *  outputs,
int  numInputs 
)

Kernel to sum an array. For small arrays use KernelSumSmallArray instead.

Parameters
[in]   stream     CUDA stream
[in]   blocks     Number of thread blocks
[in]   blockSize  Block size (number of threads per block)
[in]   inputs     Values to be summed
[out]  outputs    Array that will contain the partial sums of each block
[in]   numInputs  Number of inputs
See also
KernelSumSmallArray, SIZE_SMALL_CUDA_VECTOR

Definition at line 45 of file SumKernel.cu.
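Since KernelSum writes one partial sum per block rather than a single total, a second step is needed to combine them. The sketch below emulates that two-step pattern on the CPU. It is not GPUMLib code: PartialSumsReference and FinishSum are hypothetical names, cudafloat is assumed to be float, and the grid-strided assignment of elements to blocks is an assumption made for illustration; the documentation above does not state how the kernel distributes the inputs.

```cpp
#include <cstddef>
#include <vector>

// CPU emulation of a launch with `blocks` thread blocks of `blockSize` threads.
// Assumption: thread t of block b starts at element b * blockSize + t and
// advances by blocks * blockSize. Each block yields one partial sum.
std::vector<float> PartialSumsReference(const std::vector<float>& inputs,
                                        int blocks, int blockSize) {
    std::vector<float> partial(static_cast<std::size_t>(blocks), 0.0f);
    const std::size_t stride = static_cast<std::size_t>(blocks) * blockSize;
    for (int b = 0; b < blocks; ++b) {
        for (int t = 0; t < blockSize; ++t) {
            std::size_t first = static_cast<std::size_t>(b) * blockSize + t;
            for (std::size_t i = first; i < inputs.size(); i += stride) {
                partial[b] += inputs[i];
            }
        }
    }
    return partial;
}

// Second step: combine the per-block partial sums into the final total.
float FinishSum(const std::vector<float>& partial) {
    float total = 0.0f;
    for (float p : partial) total += p;
    return total;
}
```

Whatever the real element-to-block mapping is, the invariant worth testing is that the partial sums add up to the sum of all inputs, and that the outputs array has room for one value per block.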

void KernelSumSmallArray ( cudaStream_t  stream,
int  blockSize,
cudafloat *  inputs,
cudafloat *  output,
int  numInputs,
cudafloat  multiplyFactor 
)

Kernel to sum a small array, multiply the result by a given factor and place the result in the output.

Parameters
[in]   stream          CUDA stream
[in]   blockSize       Block size (number of threads per block)
[in]   inputs          Values to be summed
[out]  output          Pointer to the location that will contain the result (the sum multiplied by multiplyFactor)
[in]   numInputs       Number of inputs
[in]   multiplyFactor  Factor by which the sum is multiplied (optional, defaults to 1.0)
See also
KernelSum, SIZE_SMALL_CUDA_VECTOR

Definition at line 102 of file SumKernel.cu.
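The multiplyFactor parameter makes it possible to scale the sum in the same pass; for example, a factor of 1/numInputs yields the average. The following CPU sketch shows that arithmetic only. It is not GPUMLib code: SumSmallArrayReference is a hypothetical name and cudafloat is assumed to be float.

```cpp
#include <vector>

// CPU reference sketch: sum the inputs, then scale the total by multiplyFactor.
// The default of 1.0 mirrors the default documented for the kernel.
float SumSmallArrayReference(const std::vector<float>& inputs,
                             float multiplyFactor = 1.0f) {
    float sum = 0.0f;
    for (float v : inputs) sum += v;
    return sum * multiplyFactor;
}
```

Calling it with {2, 4, 6, 8} and the default factor gives the plain sum, 20; passing 1.0f / 4 as the factor gives the average, 5.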