AudRecordLib
detail::SSE Namespace Reference

Functions

FORCEINLINE void StoreConvertedSamples (short *pDstSamples, const __m128 &samples)
FORCEINLINE void StoreConvertedSamples (long *pDstSamples, const __m128 &samples)
FORCEINLINE __m128i LongShortToShorts (__m128 &samples)
FORCEINLINE void StoreConvertedSamples (short *&pDstSamples, __m128 samples[4])
FORCEINLINE void StoreConvertedSamples (long *&pDstSamples, __m128 samples[4])
FORCEINLINE void Get16ConvertedSamples (const float *&pSrcSamples, __m128 samplePacks[4], const __m128 &scaleFactors)
template<class DestSampleType >
void Convert16SamplesLoop (const float *&pSrcSamples, DestSampleType *&pDstSamples, DWORD numPacks, const __m128 &scaleFactors)

Detailed Description

SSE2-specific functions used in the sample conversion process.


Function Documentation

template<class DestSampleType >
void detail::SSE::Convert16SamplesLoop ( const float *&  pSrcSamples,
DestSampleType *&  pDstSamples,
DWORD  numPacks,
const __m128 &  scaleFactors 
)

Workhorse for converting float samples 16 at a time using SSE.

Loops numPacks times. Each iteration reads 16 samples from pSrcSamples, converts them to DestSampleType, and stores them at pDstSamples.

Template Parameters:
DestSampleType  The integer type to convert the float samples to
Parameters:
pSrcSamples  Pointer to the 16 * numPacks float samples; the pointer is updated by each call
pDstSamples  Pointer to a buffer that receives the 16 * numPacks converted samples of type DestSampleType; the pointer is updated by each call
numPacks  Number of 16-float 'packs' to convert
scaleFactors  Prefilled SSE value containing 4 copies of the scale factor
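
The loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: ConvertLoopSketch and StorePack are hypothetical names, std::int16_t and std::int32_t stand in for short and long (which are 16 and 32 bits wide on Windows), std::uint32_t stands in for DWORD, and unaligned loads and stores are used because the alignment requirements of the original are not documented here.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics (also pulls in the SSE ones)

// Hypothetical helper: convert one pack of 4 scaled floats to 32-bit ints.
inline void StorePack(std::int32_t* pDst, __m128 pack)
{
    _mm_storeu_si128(reinterpret_cast<__m128i*>(pDst), _mm_cvtps_epi32(pack));
}

// Hypothetical helper: convert one pack of 4 scaled floats to 16-bit ints.
// _mm_packs_epi32 saturates to the short range; only the low 64 bits
// (4 shorts) are written.
inline void StorePack(std::int16_t* pDst, __m128 pack)
{
    const __m128i v = _mm_cvtps_epi32(pack);
    _mm_storel_epi64(reinterpret_cast<__m128i*>(pDst), _mm_packs_epi32(v, v));
}

// Sketch of the documented loop: numPacks iterations of 16 samples each,
// with both pointers advanced as the samples are consumed and produced.
template <class DestSampleType>
void ConvertLoopSketch(const float*& pSrc, DestSampleType*& pDst,
                       std::uint32_t numPacks, const __m128& scale)
{
    assert(pSrc != nullptr && pDst != nullptr);
    for (std::uint32_t pack = 0; pack < numPacks; ++pack)
    {
        for (int i = 0; i < 4; ++i)
        {
            const __m128 scaled = _mm_mul_ps(_mm_loadu_ps(pSrc + 4 * i), scale);
            StorePack(pDst + 4 * i, scaled);
        }
        pSrc += 16;
        pDst += 16;
    }
}
```

A caller would typically fill the scale register once with _mm_set1_ps and handle any samples left over after the last full pack of 16 separately.
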
void detail::SSE::Get16ConvertedSamples ( const float *&  pSrcSamples,
__m128  samplePacks[4],
const __m128 &  scaleFactors 
)

Retrieves and converts 16 samples

Loads 16 floats from a memory location into SSE variables and multiplies them all by a scale factor

Parameters:
[in,out] pSrcSamples  The memory location containing the raw samples
[out] samplePacks  Four SSE variables to store the processed samples in
scaleFactors  Contains four copies of the required scale factor
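
As a rough sketch of the load-and-scale step, assuming unaligned input and that the source pointer is advanced by 16 floats (as the [in,out] annotation suggests), something like the following would do. LoadAndScale16 is a hypothetical name, not the library's function.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_mul_ps

// Hypothetical stand-in for the documented function, not the library's code.
// Loads 16 floats as four packs of 4, multiplies each pack by the scale
// factors, and advances the source pointer past the samples it consumed.
inline void LoadAndScale16(const float*& pSrc, __m128 packs[4], const __m128& scale)
{
    assert(pSrc != nullptr);
    for (int i = 0; i < 4; ++i)
    {
        // Unaligned load is an assumption; the original may require alignment.
        packs[i] = _mm_mul_ps(_mm_loadu_ps(pSrc + 4 * i), scale);
    }
    pSrc += 16;
}
```
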
__m128i detail::SSE::LongShortToShorts ( __m128 &  samples)

Converts samples from packed floats to packed shorts

First, the samples are converted from float to long format. The packed longs (whose high words are all zero) are treated as eight shorts with contents {x, 0, x, 0, x, 0, x, 0}. The shorts are then shuffled so that they all occupy the first half of the register (i.e. the format is {x, x, x, x, 0, 0, 0, 0}) and returned to the caller.

Parameters:
samples  The premultiplied samples in packed float format
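
One way the shuffle described above might look with SSE2 intrinsics is sketched below. FloatsToLowShorts is a hypothetical name and the particular shuffle sequence is an assumption; the original may use different instructions. Note that for negative samples the discarded high words are 0xFFFF rather than zero, so the upper half of the result is not guaranteed to be zero in this sketch; only the low four shorts are meaningful.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical sketch, not the library's code.
// Converts 4 floats to 32-bit ints, then gathers the low 16-bit word of each
// int into the low 64 bits of the register. Values outside the short range
// are truncated, not saturated.
inline __m128i FloatsToLowShorts(const __m128& samples)
{
    // Words: {a_lo, a_hi, b_lo, b_hi, c_lo, c_hi, d_lo, d_hi}
    __m128i v = _mm_cvtps_epi32(samples);
    // Low half becomes {a_lo, b_lo, a_hi, b_hi}
    v = _mm_shufflelo_epi16(v, _MM_SHUFFLE(3, 1, 2, 0));
    // High half becomes {c_lo, d_lo, c_hi, d_hi}
    v = _mm_shufflehi_epi16(v, _MM_SHUFFLE(3, 1, 2, 0));
    // Swap the middle dwords: {a_lo, b_lo, c_lo, d_lo, a_hi, b_hi, c_hi, d_hi}
    return _mm_shuffle_epi32(v, _MM_SHUFFLE(3, 1, 2, 0));
}

// Hypothetical helper showing how a caller could write the 4 shorts out.
inline void StoreLow4Shorts(std::int16_t* pDst, const __m128& samples)
{
    assert(pDst != nullptr);
    _mm_storel_epi64(reinterpret_cast<__m128i*>(pDst), FloatsToLowShorts(samples));
}
```
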
void detail::SSE::StoreConvertedSamples ( short *  pDstSamples,
const __m128 &  samples 
)

Stores premultiplied samples to a memory location

Converts the premultiplied samples from packed float format to packed short format, then stores the packed shorts into memory. This entire function could be simplified with the _mm_cvtps_pi16 intrinsic, but that is not available when compiling for x64.

Parameters:
pDstSamples  Memory location to store the 4 samples
samples  The premultiplied samples in packed float format
void detail::SSE::StoreConvertedSamples ( long *  pDstSamples,
const __m128 &  samples 
)

Stores premultiplied samples to a memory location

Converts the premultiplied samples from packed float format to packed long format before storing the packed longs into memory.

Parameters:
pDstSamples  Memory location to store the 4 samples
samples  The premultiplied samples in packed float format
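
A minimal sketch of the float-to-long store, assuming 32-bit destination samples (long is 32 bits on Windows; std::int32_t is used here for portability) and an unaligned destination. StoreAs4Int32 is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical sketch, not the library's code.
// _mm_cvtps_epi32 rounds according to the current MXCSR rounding mode
// (round-to-nearest-even by default); _mm_cvttps_epi32 would truncate instead.
inline void StoreAs4Int32(std::int32_t* pDst, const __m128& samples)
{
    assert(pDst != nullptr);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(pDst), _mm_cvtps_epi32(samples));
}
```
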
void detail::SSE::StoreConvertedSamples ( short *&  pDstSamples,
__m128  samples[4] 
)

Stores four sets of premultiplied samples to a memory location

First, four sets of samples are converted from float to short format. These four sets are then shuffled from four variables containing 4 short values each to two variables containing 8 shorts each (i.e. from four {x, 0, x, 0, x, 0, x, 0} to two {x, x, x, x, x, x, x, x}). The shuffled variables are then written to memory in one go. There are more straightforward ways of implementing this but, despite its length, this method minimizes loads and stores from memory.

Parameters:
[in,out] pDstSamples  The memory location to store the 16 samples; it is updated by the call
samples  The four sets of premultiplied samples in packed float format
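
For comparison, here is a sketch of one of the more straightforward alternatives: it uses the saturating pack instruction _mm_packs_epi32 rather than the manual shuffle described above, so it is a different technique from the documented one. Store16AsShorts is a hypothetical name, std::int16_t stands in for short, and unaligned stores are an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical sketch, not the library's code.
// Converts four packs of 4 floats to 32-bit ints, packs each pair of
// registers into 8 saturated shorts, writes 16 shorts with two stores,
// and advances the destination pointer.
inline void Store16AsShorts(std::int16_t*& pDst, const __m128 samples[4])
{
    assert(pDst != nullptr);
    const __m128i lo = _mm_packs_epi32(_mm_cvtps_epi32(samples[0]),
                                       _mm_cvtps_epi32(samples[1]));
    const __m128i hi = _mm_packs_epi32(_mm_cvtps_epi32(samples[2]),
                                       _mm_cvtps_epi32(samples[3]));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(pDst), lo);
    _mm_storeu_si128(reinterpret_cast<__m128i*>(pDst + 8), hi);
    pDst += 16;
}
```

Unlike a plain shuffle of the low words, the pack instruction clamps out-of-range values to the limits of the short range instead of wrapping them.
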
void detail::SSE::StoreConvertedSamples ( long *&  pDstSamples,
__m128  samples[4] 
)

Stores four sets of premultiplied samples to a memory location

Four sets of samples are converted from packed float to packed long format and then copied to memory.

Parameters:
[in,out] pDstSamples  The memory location to store the 16 samples; it is updated by the call
samples  The four sets of premultiplied samples in packed float format