Extensions
OpenBLAS mostly provides implementations of the reference (Netlib) BLAS, CBLAS, LAPACK and LAPACKE interfaces. It also provides a few OpenBLAS-specific functions, most of which can be regarded as "BLAS extensions". This page documents those non-standard APIs.
BLAS-like extensions
| Routine | Data Types | Description |
|---|---|---|
| ?axpby | s,d,c,z | like axpy with a multiplier for y: y = α*x+β*y |
| ?gemm3m | c,z | gemm using the 3M algorithm (three real matrix multiplications instead of four) |
| ?imatcopy | s,d,c,z | in-place transposition/copying |
| ?omatcopy | s,d,c,z | out-of-place transposition/copying |
| ?geadd | s,d,c,z | ATLAS-like matrix add B = α*A+β*B |
| ?gemmt | s,d,c,z | gemm but only a triangular part updated |
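
To make the table concrete, the following is a minimal plain-C sketch of what the single-precision `?axpby` and `?geadd` operations compute. It is a reference for the semantics only, not OpenBLAS's implementation: the helper names `ref_saxpby` and `ref_sgeadd` are hypothetical, only positive strides are handled, and the matrices are assumed to be stored column-major.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference helper: y := alpha*x + beta*y over n elements.
   Assumes positive strides incx and incy. */
void ref_saxpby(int n, float alpha, const float *x, int incx,
                float beta, float *y, int incy)
{
    assert(incx > 0 && incy > 0);
    for (int i = 0; i < n; i++) {
        size_t ix = (size_t)i * (size_t)incx;
        size_t iy = (size_t)i * (size_t)incy;
        y[iy] = alpha * x[ix] + beta * y[iy];
    }
}

/* Hypothetical reference helper: B := alpha*A + beta*B for a rows x cols
   matrix. Assumes column-major storage with leading dimensions lda, ldb. */
void ref_sgeadd(int rows, int cols, float alpha, const float *a, int lda,
                float beta, float *b, int ldb)
{
    assert(lda >= rows && ldb >= rows);
    for (int j = 0; j < cols; j++) {
        for (int i = 0; i < rows; i++) {
            size_t ia = (size_t)j * (size_t)lda + (size_t)i;
            size_t ib = (size_t)j * (size_t)ldb + (size_t)i;
            b[ib] = alpha * a[ia] + beta * b[ib];
        }
    }
}
```

With OpenBLAS linked, the first helper corresponds to a call such as `cblas_saxpby(n, alpha, x, incx, beta, y, incy)`; check the `cblas.h` shipped with your build for the exact prototypes.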
bfloat16 functionality
BLAS-like and conversion functions for bfloat16 (available when OpenBLAS was compiled with BUILD_BFLOAT16=1):
- `void cblas_sbstobf16` converts a float array to an array of bfloat16 values by rounding
- `void cblas_sbdtobf16` converts a double array to an array of bfloat16 values by rounding
- `void cblas_sbf16tos` converts a bfloat16 array to an array of floats
- `void cblas_dbf16tod` converts a bfloat16 array to an array of doubles
- `float cblas_sbdot` computes the dot product of two bfloat16 arrays
- `void cblas_sbgemv` performs the matrix-vector operations of GEMV with the input matrix and X vector as bfloat16
- `void cblas_sbgemm` performs the matrix-matrix operations of GEMM with both input arrays containing bfloat16
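
The conversions above can be illustrated with a small plain-C sketch. It assumes that a bfloat16 value is the upper 16 bits of an IEEE 754 single-precision float, stored in a `uint16_t`, and that "rounding" means round-to-nearest-even; the text above does not specify the rounding mode, so treat that as an assumption. The helper names are hypothetical and this is not OpenBLAS's own code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: float -> bfloat16 bits, round-to-nearest-even (assumed). */
uint16_t float_to_bf16(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);        /* reinterpret without aliasing issues */
    if ((bits & 0x7FFFFFFFu) > 0x7F800000u) {  /* NaN: keep it a quiet NaN */
        return (uint16_t)((bits >> 16) | 0x0040u);
    }
    bits += 0x7FFFu + ((bits >> 16) & 1u);     /* ties go to the even result */
    return (uint16_t)(bits >> 16);
}

/* Hypothetical helper: bfloat16 bits -> float (exact, no rounding needed). */
float bf16_to_float(uint16_t value)
{
    uint32_t bits = (uint32_t)value << 16;
    float result;
    memcpy(&result, &bits, sizeof result);
    return result;
}

/* Hypothetical helper: dot product of two bfloat16 arrays, accumulated in
   float, mirroring the float return type of cblas_sbdot. */
float ref_sbdot(int n, const uint16_t *x, const uint16_t *y)
{
    assert(n >= 0);
    float sum = 0.0f;
    for (int i = 0; i < n; i++) {
        sum += bf16_to_float(x[i]) * bf16_to_float(y[i]);
    }
    return sum;
}
```

Because bfloat16 keeps only 8 significant bits, converting to bfloat16 and back generally loses precision, whereas converting from bfloat16 to float is exact.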
Utility functions
- `openblas_get_num_threads`
- `openblas_set_num_threads`
- `int openblas_get_num_procs(void)` returns the number of processors available on the system (may include "hyperthreading cores")
- `int openblas_get_parallel(void)` returns 0 for sequential use, 1 for platform-based threading and 2 for OpenMP-based threading
- `char * openblas_get_config()` returns the options OpenBLAS was built with, something like `NO_LAPACKE DYNAMIC_ARCH NO_AFFINITY Haswell`
- `int openblas_set_affinity(int thread_index, size_t cpusetsize, cpu_set_t *cpuset)` sets the CPU affinity mask of the given thread to the provided cpuset. Only available on Linux, with semantics identical to `pthread_setaffinity_np`.
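
As a sketch of how the return values of `openblas_get_parallel` and `openblas_get_config` might be interpreted, the plain-C helpers below map the threading code to a label and test a configuration string for a given option. The helper names are hypothetical, and the code assumes the configuration string is a space-separated list of tokens like the example above; the actual string from a given build may contain further fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: label for the value returned by openblas_get_parallel(). */
const char *parallel_mode_name(int mode)
{
    switch (mode) {
    case 0:  return "sequential";
    case 1:  return "platform threads";
    case 2:  return "OpenMP";
    default: return "unknown";
    }
}

/* Hypothetical helper: returns 1 if `option` appears as a whole
   space-separated token in `config`, 0 otherwise. */
int config_has_option(const char *config, const char *option)
{
    assert(config != NULL && option != NULL);
    size_t len = strlen(option);
    const char *p = config;
    if (len == 0) {
        return 0;
    }
    while (*p != '\0') {
        while (*p == ' ') {
            p++;                       /* skip separators */
        }
        const char *start = p;
        while (*p != '\0' && *p != ' ') {
            p++;                       /* advance to the end of the token */
        }
        if ((size_t)(p - start) == len && strncmp(start, option, len) == 0) {
            return 1;
        }
    }
    return 0;
}
```

In a program linked against OpenBLAS, these would be called as `parallel_mode_name(openblas_get_parallel())` and `config_has_option(openblas_get_config(), "DYNAMIC_ARCH")`. Matching whole tokens avoids false positives such as finding `LAPACKE` inside `NO_LAPACKE`.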