cuda-lowpass.cu File Reference

#include "CUDA/cuda-lowpass.h"
#include <cuda.h>
#include "CUDA/cutil.h"
#include "Envision/env_types.h"

Defines

#define INT_IS_32_BITS
#define IMUL(a, b)   __mul24(a, b)
#define ROW_TILE_W   128
#define COLUMN_TILE_W   16
#define COLUMN_TILE_H   16

Functions

__global__ void cudalowpass5xdecx (const int *src, const unsigned int w, const unsigned int h, int *dst)
__global__ void cudalowpass5ydecy (const int *src, const unsigned int w, const unsigned int h, int *dst, int sms, int gms)
int iDivUp (int a, int b)
void cuda_lowpass_5_x_dec_x_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)
 Convolve and decimate in X direction with 5-tap lowpass filter.
void cuda_lowpass_5_y_dec_y_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)
 Convolve and decimate in Y direction with 5-tap lowpass filter.
__global__ void cudalowpass9x (const int *src, const unsigned int w, const unsigned int h, int *dst)
__global__ void cudalowpass9y (const int *src, const unsigned int w, const unsigned int h, int *dst, int sms, int gms)
void cuda_lowpass_9_x_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)
void cuda_lowpass_9_y_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)

Detailed Description

CUDA/GPU-optimized lowpass filtering code.

Definition in file cuda-lowpass.cu.
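
The tile-width macros and the `iDivUp` helper listed above suggest the usual CUDA launch-geometry pattern: divide the image extent by the tile size and round up, so that a partial tile at the edge still gets a block. The body of `iDivUp` is not shown in this reference, so the following is only a minimal sketch of that convention. `iDivUpSketch` and `tilesPerRow` are hypothetical names introduced for illustration; only the value of `ROW_TILE_W` is taken from the Defines list.

```cpp
// Hypothetical sketch of ceiling division for grid sizing.
// The real iDivUp in cuda-lowpass.cu may be implemented differently.
#define ROW_TILE_W 128   // value taken from the Defines list

// Returns a / b rounded up, for a >= 0 and b > 0.
int iDivUpSketch(int a, int b)
{
    return (a % b != 0) ? (a / b + 1) : (a / b);
}

// Illustrative helper (not part of the file): number of ROW_TILE_W-wide
// tiles needed to cover one image row of w pixels.
int tilesPerRow(int w)
{
    return iDivUpSketch(w, ROW_TILE_W);
}
```

Under these assumptions, a 640-pixel row needs 5 tiles and a 641-pixel row needs 6, the last of which is only partly filled.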


Function Documentation

void cuda_lowpass_5_x_dec_x_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)

Convolve and decimate in X direction with 5-tap lowpass filter.

Note that the caller must already have allocated both src and dst in DEVICE memory and copied the source data into src. If no further GPU processing is needed, the caller is responsible for copying the result back to host memory.

Definition at line 176 of file cuda-lowpass.cu.
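
To make "convolve and decimate in X" concrete, here is a minimal CPU reference sketch that can be used to sanity-check GPU output. It is not the kernel from cuda-lowpass.cu. The filter taps (binomial [1 4 6 4 1] / 16), the border handling (edge pixels replicated), the output width of w / 2, and the truncating integer division are all assumptions, and `lowpass5_x_dec_x_ref` is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical CPU reference for a 5-tap lowpass followed by 2x decimation in X.
// Assumptions (not taken from cuda-lowpass.cu): taps [1 4 6 4 1] / 16,
// replicated edge pixels, output width w / 2, truncating integer division.
std::vector<int> lowpass5_x_dec_x_ref(const std::vector<int>& src,
                                      unsigned int w, unsigned int h)
{
    static const int taps[5] = {1, 4, 6, 4, 1};   // weights sum to 16
    const unsigned int w2 = w / 2;                // decimated width
    std::vector<int> dst(static_cast<std::size_t>(w2) * h);

    for (unsigned int y = 0; y < h; ++y) {
        for (unsigned int x2 = 0; x2 < w2; ++x2) {
            const int center = static_cast<int>(2 * x2);   // keep every other column
            int acc = 0;
            for (int k = -2; k <= 2; ++k) {
                int x = center + k;
                if (x < 0) x = 0;                                   // clamp at left border
                if (x >= static_cast<int>(w)) x = static_cast<int>(w) - 1;  // clamp at right border
                acc += taps[k + 2] * src[static_cast<std::size_t>(y) * w + x];
            }
            dst[static_cast<std::size_t>(y) * w2 + x2] = acc / 16;  // normalize by tap sum
        }
    }
    return dst;
}
```

With these assumptions a constant image stays constant after filtering, which is a quick property to check against the GPU result once it has been copied back to host memory. The Y-direction variant would follow the same pattern with the roles of rows and columns swapped.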

void cuda_lowpass_5_y_dec_y_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)

Convolve and decimate in Y direction with 5-tap lowpass filter.

Note that the caller must already have allocated both src and dst in DEVICE memory and copied the source data into src. If no further GPU processing is needed, the caller is responsible for copying the result back to host memory.

Definition at line 185 of file cuda-lowpass.cu.

Generated on Sun May 8 08:11:07 2011 for iLab Neuromorphic Vision Toolkit by doxygen 1.6.3