cuda-lowpass.cu File Reference

#include "CUDA/cuda-lowpass.h"
#include <cuda.h>
#include "CUDA/cutil.h"
#include "Envision/env_types.h"

Defines
#define	INT_IS_32_BITS
#define	IMUL(a, b) __mul24(a, b)
#define	ROW_TILE_W 128
#define	COLUMN_TILE_W 16
#define	COLUMN_TILE_H 16
Functions
__global__ void	cudalowpass5xdecx (const int *src, const unsigned int w, const unsigned int h, int *dst)
__global__ void	cudalowpass5ydecy (const int *src, const unsigned int w, const unsigned int h, int *dst, int sms, int gms)
int	iDivUp (int a, int b)
void	cuda_lowpass_5_x_dec_x_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)
	Convolve and decimate in X direction with 5-tap lowpass filter.
void	cuda_lowpass_5_y_dec_y_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)
	Convolve and decimate in Y direction with 5-tap lowpass filter.
__global__ void	cudalowpass9x (const int *src, const unsigned int w, const unsigned int h, int *dst)
__global__ void	cudalowpass9y (const int *src, const unsigned int w, const unsigned int h, int *dst, int sms, int gms)
void	cuda_lowpass_9_x_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)
void	cuda_lowpass_9_y_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)

Detailed Description

CUDA/GPU optimized lowpass code

Definition in file cuda-lowpass.cu.

Function Documentation

void cuda_lowpass_5_x_dec_x_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)

Convolve and decimate in X direction with 5-tap lowpass filter.

Both src and dst must already have been allocated by the caller in device (GPU) memory, and the source data must already have been copied into src. If no further GPU processing follows, the caller is responsible for copying the result from dst back to host memory.

Definition at line 176 of file cuda-lowpass.cu.
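
The memory contract described above can be sketched from the caller's side as follows. This is a minimal illustration, not code from cuda-lowpass.cu: the helper run_lowpass_on_host_image and the lowpass_fn typedef are hypothetical names introduced here, and only standard CUDA runtime calls are used. Because the size of the decimated output is not stated in this reference, the sketch allocates dst at the full w * h size as a conservative upper bound and copies the whole buffer back.

```cuda
#include <cuda_runtime.h>
#include <cstddef>
#include <vector>

// Parameter list shared by the host wrappers documented in this file.
typedef void (*lowpass_fn)(const int *src, const unsigned int w,
                           const unsigned int h, int *dst);

// Hypothetical helper (not part of cuda-lowpass.cu): stages a host image in
// device memory, runs one filter wrapper on it, and copies the result back.
cudaError_t run_lowpass_on_host_image(lowpass_fn filter,
                                      const std::vector<int> &host_src,
                                      unsigned int w, unsigned int h,
                                      std::vector<int> &host_dst)
{
    const std::size_t count = static_cast<std::size_t>(w) * h;
    if (count == 0 || host_src.size() != count)
        return cudaErrorInvalidValue;
    const std::size_t bytes = count * sizeof(int);

    int *d_src = nullptr;
    int *d_dst = nullptr;

    // 1. The caller allocates both buffers in DEVICE memory.
    cudaError_t err = cudaMalloc(reinterpret_cast<void **>(&d_src), bytes);
    if (err != cudaSuccess)
        return err;
    err = cudaMalloc(reinterpret_cast<void **>(&d_dst), bytes);
    if (err != cudaSuccess) {
        cudaFree(d_src);
        return err;
    }

    // Assumption: dst is zeroed so any unused tail of the oversized
    // buffer reads back as 0 rather than uninitialized memory.
    err = cudaMemset(d_dst, 0, bytes);

    // 2. The caller copies the source data into src before the call.
    if (err == cudaSuccess)
        err = cudaMemcpy(d_src, host_src.data(), bytes, cudaMemcpyHostToDevice);

    // 3. Run the filter; both pointers refer to device memory.
    if (err == cudaSuccess) {
        filter(d_src, w, h, d_dst);
        err = cudaDeviceSynchronize();
    }

    // 4. Copy back to the host only because no further GPU work follows.
    if (err == cudaSuccess) {
        host_dst.resize(count);
        err = cudaMemcpy(host_dst.data(), d_dst, bytes, cudaMemcpyDeviceToHost);
    }

    cudaFree(d_src);
    cudaFree(d_dst);
    return err;
}
```

When linked against the compiled cuda-lowpass.cu, a call such as run_lowpass_on_host_image(cuda_lowpass_5_x_dec_x_fewbits_optim, image, w, h, result) would exercise this sequence. In a multi-stage GPU pipeline, steps 1, 2 and 4 would normally be done once around the whole chain rather than around each filter.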

void cuda_lowpass_5_y_dec_y_fewbits_optim (const int *src, const unsigned int w, const unsigned int h, int *dst)

Convolve and decimate in Y direction with 5-tap lowpass filter.

Both src and dst must already have been allocated by the caller in device (GPU) memory, and the source data must already have been copied into src. If no further GPU processing follows, the caller is responsible for copying the result from dst back to host memory.

Definition at line 185 of file cuda-lowpass.cu.
