Welcome to OStack Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

c++ - alias of a function template

I have created a CPU dispatcher which compiles the same functions with different compile options into different object files. In order for my code to access the same functions in different object files I need to give the functions in each object file a different name.

In C (or C++) I would do something like this in the header file for the declarations of the function.

typedef float MyFuncType(float a);

MyFuncType myfunc_SSE2, myfunc_SSE41, myfunc_AVX, myfunc_AVX2, myfunc_AVX512;
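For reference, here is a minimal sketch of how that non-template version fits together. Only two variants are shown, the function bodies are placeholders, and `select_variant` and `example_usage` are hypothetical names that are not part of my real code.

```cpp
#include <cassert>

// A function *type* (not a pointer type): float(float).
typedef float MyFuncType(float a);

// A single declaration introduces several functions of that type.
MyFuncType myfunc_SSE2, myfunc_AVX;

// A function cannot be defined through the typedef, so each definition
// spells out the full signature. The bodies are placeholders.
float myfunc_SSE2(float a) { return a + 1.0f; }
float myfunc_AVX(float a)  { return a + 2.0f; }

// Hypothetical helper: returns a pointer to the chosen variant.
MyFuncType *select_variant(bool has_avx) {
    if (has_avx) return &myfunc_AVX;
    return &myfunc_SSE2;
}

// Hypothetical usage: call through the pointer that was selected.
void example_usage() {
    MyFuncType *fp = select_variant(false);
    assert(fp(1.0f) == 2.0f);
    fp = select_variant(true);
    assert(fp(1.0f) == 3.0f);
}
```

The typedef only helps with declarations, which is exactly the part I would like to reproduce for templates.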

But now I want function templates for the declarations. My real code currently looks more like this

//kernel.h
template <typename TYPE, unsigned N, typename VALUES>
void foo_SSE2(int32_t *buffer, VALUES & v);

template <typename TYPE, unsigned N, typename VALUES>
void foo_SSE41(int32_t *buffer, VALUES & v);
...
template <typename TYPE, unsigned N, typename VALUES>
void foo_AVX512(int32_t *buffer, VALUES & v);

#if   INSTRSET == 2                    // SSE2
#define FUNCNAME foo_SSE2
#elif INSTRSET == 5                    // SSE4.1
#define FUNCNAME foo_SSE41
...
#elif INSTRSET == 9                    // AVX512
#define FUNCNAME foo_AVX512
#endif

These are only declarations in a header file. The function definitions are in a separate source file which is compiled to a different object file for each function name. The definitions look something like this

//kernel.cpp
#include "kernel.h"
template<typename TYPE, unsigned N, typename VALUES>
void FUNCNAME(int32_t *buffer, VALUES & v) {
    //code
}
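One detail this layout depends on: because the template body lives only in kernel.cpp, each object file has to contain explicit instantiations for every combination of template arguments that main.cpp uses, otherwise the linker has nothing to resolve against. The sketch below collapses the header and source file into one file to show the idea. The `Values` struct, the loop body, and `example_usage` are assumptions made up for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the name that kernel.h would pick based on INSTRSET.
#define FUNCNAME foo_SSE2

// Hypothetical VALUES type; the real one is not shown in the question.
struct Values { float scale; };

// Declaration, as it would appear in kernel.h.
template <typename TYPE, unsigned N, typename VALUES>
void FUNCNAME(int32_t *buffer, VALUES &v);

// Definition, as it would appear in kernel.cpp (placeholder body).
template <typename TYPE, unsigned N, typename VALUES>
void FUNCNAME(int32_t *buffer, VALUES &v) {
    for (unsigned i = 0; i < N; ++i) {
        buffer[i] = static_cast<int32_t>(v.scale * static_cast<TYPE>(i + 1));
    }
}

// Explicit instantiation definition: makes the compiler emit code for
// foo_SSE2<float, 1, Values> in this translation unit, so that another
// translation unit which only sees the declaration can link against it.
template void FUNCNAME<float, 1, Values>(int32_t *buffer, Values &v);

// Hypothetical usage of the instantiated specialization.
void example_usage() {
    int32_t buffer[1] = {0};
    Values v{3.0f};
    FUNCNAME<float, 1>(buffer, v);  // VALUES is deduced as Values
    assert(buffer[0] == 3);
}
```

In the real build the explicit instantiation lines would sit at the end of kernel.cpp, so every compile of that file with a different `INSTRSET` emits the same set of specializations under a different name.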

Then I compile like this

gcc -c -O3 -msse2 kernel.cpp -o kernel_sse2.o
gcc -c -O3 -msse4.1 kernel.cpp -o kernel_sse41.o
...
gcc -c -O3 -mavx512f kernel.cpp -o kernel_avx512.o
gcc -O3 main.cpp kernel_sse2.o kernel_sse41.o ... kernel_avx512.o

The file main.cpp is another source file which only needs to know the function declarations so that the linker can link them to the definitions in the other object files. It looks like this

void dispatch(void) {
    int iset = instrset_detect();
    if (iset >= 9) {
        fp_float1  = &foo_AVX512<float,1>;  
    }
    else if (iset >= 8) {
        fp_float1  = &foo_AVX2<float,1>;
    }
    ...
    else if (iset >= 2) {
        fp_float1  = &foo_SSE2<float,1>;
    }
}
int main(void) {
    dispatch();
    fp_float1(buffer, values);
}
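A self-contained sketch of that dispatch step, reduced to two variants. Here the templates are defined inline so the example compiles on its own, the instruction set level is passed in as a parameter instead of coming from `instrset_detect()`, and `Values`, `KernelFn`, and `example_usage` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical VALUES type used only for this sketch.
struct Values { int32_t fill; };

// Placeholder kernels; in the real project these are only declared here
// and defined in separately compiled object files.
template <typename TYPE, unsigned N, typename VALUES>
void foo_SSE2(int32_t *buffer, VALUES &v) {
    for (unsigned i = 0; i < N; ++i) buffer[i] = v.fill;
}

template <typename TYPE, unsigned N, typename VALUES>
void foo_AVX2(int32_t *buffer, VALUES &v) {
    for (unsigned i = 0; i < N; ++i) buffer[i] = v.fill * 2;
}

// Pointer type shared by all variants of one specialization.
using KernelFn = void (*)(int32_t *, Values &);
KernelFn fp_float1 = nullptr;

// TYPE and N are given explicitly; VALUES is deduced from the type of
// the pointer being assigned to.
void dispatch(int iset) {
    if (iset >= 8) {
        fp_float1 = &foo_AVX2<float, 1>;
    } else {
        fp_float1 = &foo_SSE2<float, 1>;  // baseline fallback
    }
}

// Hypothetical usage: select once, then call through the pointer.
void example_usage() {
    int32_t buffer[1] = {0};
    Values values{7};
    dispatch(8);
    fp_float1(buffer, values);
    assert(buffer[0] == 14);
    dispatch(2);
    fp_float1(buffer, values);
    assert(buffer[0] == 7);
}
```

The unconditional `else` branch is a choice made for this sketch so that the pointer is never left null.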

But in my file kernel.h it's annoying (and error-prone) to repeat the same declaration for every function name. I want something like the following (which I know does not work).

template <typename TYPE, unsigned N, typename VALUES>
typedef void foo(int32_t *buffer, VALUES & v);

foo foo_SSE2, foo_SSE41, foo_AVX, foo_AVX2, foo_AVX512;

Is there an ideal way to do this which separates the declarations from the definitions and lets me simply rename otherwise identical function template declarations?


1 Answer


This seems like an application for the preprocessor:

#define EMIT_FUNCTION_PROTOTYPE(func_name, func_suffix) \
    template<typename TYPE, unsigned N, typename VALUES> \
    void func_name ## func_suffix (int32_t *buffer, VALUES & v)

#define EMIT_FUNCTION_PROTOTYPES(func_name) \
    EMIT_FUNCTION_PROTOTYPE(func_name, _SSE2); \
    EMIT_FUNCTION_PROTOTYPE(func_name, _SSE41); \
    EMIT_FUNCTION_PROTOTYPE(func_name, _AVX); \
    EMIT_FUNCTION_PROTOTYPE(func_name, _AVX2); \
    EMIT_FUNCTION_PROTOTYPE(func_name, _AVX512)

Then it's just a one-liner to generate all of the prototypes in your header file:

EMIT_FUNCTION_PROTOTYPES(foo);
// expands to:
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_SSE2(int32_t *buffer, VALUES & v);
//
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_SSE41(int32_t *buffer, VALUES & v);
//
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_AVX(int32_t *buffer, VALUES & v);
//
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_AVX2(int32_t *buffer, VALUES & v);
//
//     template <typename TYPE, unsigned N, typename VALUES>
//     void foo_AVX512(int32_t *buffer, VALUES & v);

I don't think this is a huge benefit, but it should give you what you want.
