spla
|
Classes | |
class | Array |
One-dimensional dense tightly packed array of typed values. More... | |
class | Descriptor |
Descriptor object used to parametrize execution of particular scheduled tasks. More... | |
class | MtxLoader |
Loader for matrix data stored in matrix-market (.mtx) format. More... | |
class | Library |
Library global state automatically instantiated on lib init. More... | |
class | Matrix |
Generalized M x N dimensional matrix object. More... | |
class | MemView |
View to some memory resource without life-time control. More... | |
class | Object |
Base class for any library primitive. More... | |
class | Op |
A callable operation to parametrize execution of math computations. More... | |
class | OpUnary |
Unary operation with 1-arity. More... | |
class | OpBinary |
Binary operation with 2-arity. More... | |
class | OpSelect |
Select operation with 1-arity and bool return type. More... | |
class | RefCnt |
Base class for object with built-in reference counting mechanism. More... | |
class | ref_ptr |
Automates reference counting and behaves as shared smart pointer. More... | |
class | Scalar |
Box for a single typed scalar value. More... | |
class | ScheduleTask |
Represents a single smallest evaluation task which can be scheduled. More... | |
class | Schedule |
Object with sequence of steps with tasks forming schedule for execution. More... | |
class | Timer |
Simple timer to measure intervals of time on CPU-side. More... | |
class | Type |
Type representation for parametrisation of values stored in containers. More... | |
class | Vector |
Generalized N dimensional vector object. More... | |
class | Accelerator |
Interface for a computation acceleration backend. More... | |
class | DispatchContext |
Execution context of a single task. More... | |
class | Dispatcher |
Class responsible for dispatching of execution of a single task. More... | |
class | Logger |
Library logger. More... | |
class | RegistryAlgo |
Algorithm suitable to process schedule task based on task string key. More... | |
class | Registry |
Registry with key-algo mapping of stored algo implementations. More... | |
class | TArray |
Array interface implementation with type information bound. More... | |
class | TDecoration |
Base class for typed decoration for storage object. More... | |
class | TDecorationStorage |
Storage for decorators with data of a particular vector or matrix object. More... | |
class | TMatrix |
Matrix interface implementation with type information bound. More... | |
class | TOpUnary |
class | TOpBinary |
class | TOpSelect |
class | TScalar |
class | TType |
Type interface implementation with actual type info bound. More... | |
class | TVector |
Vector interface implementation with type information bound. More... | |
class | Algo_callback_cpu |
class | CpuDokVec |
class | CpuDenseVec |
CPU one-dim array for dense vector representation. More... | |
class | CpuCooVec |
CPU list-of-coordinates sparse vector representation. More... | |
class | CpuLil |
CPU list-of-list matrix format for fast incremental build. More... | |
class | CpuDok |
Dictionary of keys sparse matrix format. More... | |
class | CpuCoo |
CPU list of coordinates matrix format. More... | |
class | CpuCsr |
CPU compressed sparse row matrix format. More... | |
class | Algo_kron_cpu |
class | Algo_m_eadd_cpu |
class | Algo_m_emult_cpu |
class | Algo_m_extract_column_cpu |
class | Algo_m_extract_row_cpu |
class | Algo_m_reduce_cpu |
class | Algo_m_reduce_by_column_cpu |
class | Algo_m_reduce_by_row_cpu |
class | Algo_m_transpose_cpu |
class | Algo_mxm_cpu |
class | Algo_mxmT_masked_cpu |
class | Algo_mxv_masked_cpu |
class | Algo_v_assign_masked_cpu |
class | Algo_v_count_mf_cpu |
class | Algo_v_eadd_cpu |
class | Algo_v_eadd_fdb_cpu |
class | Algo_v_emult_cpu |
class | Algo_v_map_cpu |
class | Algo_v_reduce_cpu |
class | Algo_vxm_masked_cpu |
class | CLAccelerator |
Single-device OpenCL acceleration implementation. More... | |
class | CLAlloc |
Base class for any device-local opencl buffer allocator. More... | |
class | CLAllocGeneral |
Wrapper for default OpenCL buffer allocation. More... | |
class | CLAllocLinear |
Linear allocator for temporary device local buffer allocations. More... | |
class | CLCounter |
Unsigned integer reusable counter for operations. More... | |
class | CLCounterWrapper |
class | CLCounterPool |
Global pool with pre-allocated counters. More... | |
class | CLDenseVec |
OpenCL one-dim array for dense vector representation. More... | |
class | CLCooVec |
OpenCL list-of-coordinates sparse vector representation. More... | |
class | CLCsr |
OpenCL compressed sparse row matrix representation. More... | |
class | Algo_m_reduce_cl |
class | Algo_mxmT_masked_cl |
class | Algo_mxv_masked_cl |
class | CLProgram |
Compiled opencl program from library sources. More... | |
class | CLProgramBuilder |
Runtime opencl program builder. More... | |
class | CLProgramCache |
Runtime cache for compiled opencl programs. More... | |
class | Algo_v_assign_masked_cl |
class | Algo_v_count_mf_cl |
class | Algo_v_eadd_cl |
class | Algo_v_eadd_fdb_cl |
class | Algo_v_map_cl |
class | Algo_v_reduce_cl |
class | Algo_vxm_masked_cl |
struct | TimeProfilerLabel |
struct | TimeProfilerScope |
class | TimeProfiler |
Scope-based time profiler to measure perf of schedule tasks execution. More... | |
class | ScheduleSingleThread |
Single-thread dispatch sequential execution schedule. More... | |
class | ScheduleTaskBase |
Base schedule task class with common public properties. More... | |
class | ScheduleTask_callback |
Callback task. More... | |
class | ScheduleTask_mxm |
Sparse matrix sparse matrix product. More... | |
class | ScheduleTask_mxmT_masked |
Masked matrix matrix-transposed product. More... | |
class | ScheduleTask_kron |
Sparse matrix kronecker product. More... | |
class | ScheduleTask_mxv_masked |
Masked matrix-vector product. More... | |
class | ScheduleTask_vxm_masked |
Masked vector-matrix product. More... | |
class | ScheduleTask_m_eadd |
Matrix ewise add. More... | |
class | ScheduleTask_m_emult |
Matrix ewise mult. More... | |
class | ScheduleTask_m_reduce_by_row |
Matrix by row reduction. More... | |
class | ScheduleTask_m_reduce_by_column |
Matrix by col reduction. More... | |
class | ScheduleTask_m_reduce |
Matrix reduction to scalar. More... | |
class | ScheduleTask_m_transpose |
Matrix transpose. More... | |
class | ScheduleTask_m_extract_row |
Matrix extract vector. More... | |
class | ScheduleTask_m_extract_column |
Matrix extract vector. More... | |
class | ScheduleTask_v_eadd |
Vector ewise add. More... | |
class | ScheduleTask_v_emult |
Vector ewise mult. More... | |
class | ScheduleTask_v_eadd_fdb |
Vector ewise with feedback. More... | |
class | ScheduleTask_v_assign_masked |
Masked vector assignment. More... | |
class | ScheduleTask_v_map |
Vector map to vector. More... | |
class | ScheduleTask_v_reduce |
Vector reduction to scalar. More... | |
class | ScheduleTask_v_count_mf |
Vector count meaningful elements. More... | |
class | StorageManager |
General format converter for vector or matrix decoration storage. More... | |
struct | pair_hash |
Typedefs | |
using | uint = std::uint32_t |
Library index and size type. More... | |
using | MessageCallback = std::function< void(Status status, const std::string &msg, const std::string &file, const std::string &function, int line)> |
using | ScheduleCallback = std::function< void()> |
using | T_BOOL = bool |
using | T_INT = std::int32_t |
using | T_UINT = std::uint32_t |
using | T_FLOAT = float |
template<typename T > | |
using | StorageManagerMatrix = StorageManager< T, FormatMatrix, static_cast< int >(FormatMatrix::Count)> |
template<typename T > | |
using | StorageManagerVector = StorageManager< T, FormatVector, static_cast< int >(FormatVector::Count)> |
Enumerations | |
enum class | Status : uint { Ok = 0 , Error = 1 , NoAcceleration = 2 , PlatformNotFound = 3 , DeviceNotFound = 4 , InvalidState = 5 , InvalidArgument = 6 , NoValue = 7 , CompilationError = 8 , NotImplemented = 1024 } |
enum class | AcceleratorType : uint { None = 0 , OpenCL = 1 } |
enum class | FormatMatrix : uint { CpuLil = 0 , CpuDok = 1 , CpuCoo = 2 , CpuCsr = 3 , CpuCsc = 4 , AccCoo = 5 , AccCsr = 6 , AccCsc = 7 , Count = 8 } |
enum class | FormatVector : uint { CpuDok = 0 , CpuDense = 1 , CpuCoo = 2 , AccDense = 3 , AccCoo = 4 , Count = 5 } |
Functions | |
Status | bfs (const ref_ptr< Vector > &v, const ref_ptr< Matrix > &A, uint s, const ref_ptr< Descriptor > &descriptor=spla::Descriptor::make()) |
Breadth-first search algorithm. More... | |
Status | bfs_naive (std::vector< int > &v, std::vector< std::vector< spla::uint >> &A, uint s, const ref_ptr< Descriptor > &descriptor=spla::Descriptor::make()) |
Naive breadth-first search algorithm (reference cpu implementation) More... | |
Status | sssp (const ref_ptr< Vector > &v, const ref_ptr< Matrix > &A, uint s, const ref_ptr< Descriptor > &descriptor=ref_ptr< Descriptor >()) |
Single-source shortest path algorithm. More... | |
Status | sssp_naive (std::vector< float > &v, std::vector< std::vector< uint >> &Ai, std::vector< std::vector< float >> &Ax, uint s, const ref_ptr< Descriptor > &descriptor=spla::Descriptor::make()) |
Naive single-source shortest path algorithm (reference cpu implementation) More... | |
Status | pr (ref_ptr< Vector > &p, const ref_ptr< Matrix > &A, float alpha=0.85, float eps=1e-6, const ref_ptr< Descriptor > &descriptor=spla::Descriptor::make()) |
PageRank algorithm. More... | |
Status | pr_naive (std::vector< float > &p, std::vector< std::vector< uint >> &Ai, std::vector< std::vector< float >> &Ax, float alpha=0.85, float eps=1e-6, const ref_ptr< Descriptor > &descriptor=spla::Descriptor::make()) |
Naive PageRank algorithm (reference cpu implementation) More... | |
Status | tc (int &ntrins, const ref_ptr< Matrix > &A, const ref_ptr< Matrix > &B, const ref_ptr< Descriptor > &descriptor=spla::Descriptor::make()) |
Triangles counting algorithm. More... | |
Status | tc_naive (int &ntrins, std::vector< std::vector< spla::uint >> &Ai, const ref_ptr< Descriptor > &descriptor=spla::Descriptor::make()) |
Naive triangles counting algorithm (reference cpu implementation) More... | |
Status | exec_callback (ScheduleCallback callback, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) callback function. More... | |
Status | exec_mxm (ref_ptr< Matrix > R, ref_ptr< Matrix > A, ref_ptr< Matrix > B, ref_ptr< OpBinary > op_multiply, ref_ptr< OpBinary > op_add, ref_ptr< Scalar > init, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) sparse-matrix sparse-matrix product. More... | |
Status | exec_mxmT_masked (ref_ptr< Matrix > R, ref_ptr< Matrix > mask, ref_ptr< Matrix > A, ref_ptr< Matrix > B, ref_ptr< OpBinary > op_multiply, ref_ptr< OpBinary > op_add, ref_ptr< OpSelect > op_select, ref_ptr< Scalar > init, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) sparse masked matrix matrix-transposed product. More... | |
Status | exec_kron (ref_ptr< Matrix > R, ref_ptr< Matrix > A, ref_ptr< Matrix > B, ref_ptr< OpBinary > op_multiply, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) sparse matrix Kronecker product. More... | |
Status | exec_mxv_masked (ref_ptr< Vector > r, ref_ptr< Vector > mask, ref_ptr< Matrix > M, ref_ptr< Vector > v, ref_ptr< OpBinary > op_multiply, ref_ptr< OpBinary > op_add, ref_ptr< OpSelect > op_select, ref_ptr< Scalar > init, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) dense-masked sparse matrix by dense vector product. More... | |
Status | exec_vxm_masked (ref_ptr< Vector > r, ref_ptr< Vector > mask, ref_ptr< Vector > v, ref_ptr< Matrix > M, ref_ptr< OpBinary > op_multiply, ref_ptr< OpBinary > op_add, ref_ptr< OpSelect > op_select, ref_ptr< Scalar > init, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) dense-masked sparse vector by sparse matrix product. More... | |
Status | exec_m_eadd (ref_ptr< Matrix > R, ref_ptr< Matrix > A, ref_ptr< Matrix > B, ref_ptr< OpBinary > op, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) element-wise addition by structure of two matrices. More... | |
Status | exec_m_emult (ref_ptr< Matrix > R, ref_ptr< Matrix > A, ref_ptr< Matrix > B, ref_ptr< OpBinary > op, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) element-wise multiplication by structure of two matrices. More... | |
Status | exec_m_reduce_by_row (ref_ptr< Vector > r, ref_ptr< Matrix > M, ref_ptr< OpBinary > op_reduce, ref_ptr< Scalar > init, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) matrix by row reduction to single vector column. More... | |
Status | exec_m_reduce_by_column (ref_ptr< Vector > r, ref_ptr< Matrix > M, ref_ptr< OpBinary > op_reduce, ref_ptr< Scalar > init, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) matrix by column reduction to single vector column. More... | |
Status | exec_m_reduce (ref_ptr< Scalar > r, ref_ptr< Scalar > s, ref_ptr< Matrix > M, ref_ptr< OpBinary > op_reduce, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) matrix by structure reduction to a single scalar value. More... | |
Status | exec_m_transpose (ref_ptr< Matrix > R, ref_ptr< Matrix > M, ref_ptr< OpUnary > op_apply, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) matrix transpose operation. More... | |
Status | exec_m_extract_row (ref_ptr< Vector > r, ref_ptr< Matrix > M, uint index, ref_ptr< OpUnary > op_apply, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) matrix row extract. More... | |
Status | exec_m_extract_column (ref_ptr< Vector > r, ref_ptr< Matrix > M, uint index, ref_ptr< OpUnary > op_apply, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) matrix column extract. More... | |
Status | exec_v_eadd (ref_ptr< Vector > r, ref_ptr< Vector > u, ref_ptr< Vector > v, ref_ptr< OpBinary > op, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) element-wise addition by structure of two vectors. More... | |
Status | exec_v_emult (ref_ptr< Vector > r, ref_ptr< Vector > u, ref_ptr< Vector > v, ref_ptr< OpBinary > op, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) element-wise multiplication by structure of two vectors. More... | |
Status | exec_v_eadd_fdb (ref_ptr< Vector > r, ref_ptr< Vector > v, ref_ptr< Vector > fdb, ref_ptr< OpBinary > op, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) element-wise addition by structure of two vectors with feedback. More... | |
Status | exec_v_assign_masked (ref_ptr< Vector > r, ref_ptr< Vector > mask, ref_ptr< Scalar > value, ref_ptr< OpBinary > op_assign, ref_ptr< OpSelect > op_select, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) masked scalar assignment to a vector. More... | |
Status | exec_v_map (ref_ptr< Vector > r, ref_ptr< Vector > v, ref_ptr< OpUnary > op, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) by structure map of one vector to another using unary operation. More... | |
Status | exec_v_reduce (ref_ptr< Scalar > r, ref_ptr< Scalar > s, ref_ptr< Vector > v, ref_ptr< OpBinary > op_reduce, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) vector by structure reduction to a single scalar value. More... | |
Status | exec_v_count_mf (ref_ptr< Scalar > r, ref_ptr< Vector > v, ref_ptr< Descriptor > desc=ref_ptr< Descriptor >(), ref_ptr< ScheduleTask > *task_hnd=nullptr) |
Execute (schedule) count number of meaningful values by vector structure. More... | |
template<typename T , typename... TArgs> | |
ref_ptr< T > | make_ref (TArgs &&... args) |
ref_ptr< Schedule > | make_schedule () |
Creates a new schedule object used to build an execution schedule. More... | |
void | register_ops () |
Register all ops on library initialization. More... | |
template<> | |
ref_ptr< TType< T_BOOL > > | get_ttype () |
void | register_algo_cpu (class Registry *g_registry) |
Register all cpu-specific algorithms. More... | |
template<typename T > | |
void | cpu_coo_resize (const uint n_values, CpuCoo< T > &storage) |
template<typename T > | |
void | cpu_coo_clear (CpuCoo< T > &in) |
template<typename T > | |
void | cpu_coo_to_lil (uint n_rows, const CpuCoo< T > &in, CpuLil< T > &out) |
template<typename T > | |
void | cpu_coo_to_dok (const CpuCoo< T > &in, CpuDok< T > &out) |
template<typename T > | |
void | cpu_coo_to_csr (uint n_rows, const CpuCoo< T > &in, CpuCsr< T > &out) |
template<typename T > | |
void | cpu_coo_vec_sort (CpuCooVec< T > &vec) |
template<typename T > | |
void | cpu_coo_vec_resize (const uint n_values, CpuCooVec< T > &vec) |
template<typename T > | |
void | cpu_coo_vec_clear (CpuCooVec< T > &vec) |
template<typename T > | |
void | cpu_coo_vec_to_dok (const CpuCooVec< T > &in, CpuDokVec< T > &out) |
template<typename T > | |
void | cpu_csr_resize (const uint n_rows, const uint n_values, CpuCsr< T > &storage) |
template<typename T > | |
void | cpu_csr_to_dok (uint n_rows, const CpuCsr< T > &in, CpuDok< T > &out) |
template<typename T > | |
void | cpu_csr_to_coo (uint n_rows, const CpuCsr< T > &in, CpuCoo< T > &out) |
template<typename T > | |
void | cpu_dense_vec_resize (const uint n_rows, CpuDenseVec< T > &vec) |
template<typename T > | |
void | cpu_dense_vec_fill (const T fill_value, CpuDenseVec< T > &vec) |
template<typename T > | |
void | cpu_dense_vec_to_dok (const uint n_rows, const T fill_value, const CpuDenseVec< T > &in, CpuDokVec< T > &out) |
template<typename T > | |
void | cpu_dok_clear (CpuDok< T > &storage) |
template<typename T > | |
void | cpu_dok_vec_to_coo (const CpuDokVec< T > &in, CpuCooVec< T > &out) |
template<typename T > | |
void | cpu_dok_vec_to_dense (const uint n_rows, const CpuDokVec< T > &in, CpuDenseVec< T > &out) |
template<typename T > | |
void | cpu_dok_vec_add_element (uint row_id, T element, CpuDokVec< T > &vec) |
template<typename T > | |
void | cpu_dok_vec_clear (CpuDokVec< T > &vec) |
template<typename T > | |
void | cpu_lil_resize (uint n_rows, CpuLil< T > &lil) |
template<typename T > | |
void | cpu_lil_clear (CpuLil< T > &lil) |
template<typename T > | |
void | cpu_lil_add_element (uint row_id, uint col_id, T element, CpuLil< T > &lil) |
template<typename T > | |
void | cpu_lil_to_dok (uint n_rows, const CpuLil< T > &in, CpuDok< T > &out) |
template<typename T > | |
void | cpu_lil_to_coo (uint n_rows, const CpuLil< T > &in, CpuCoo< T > &out) |
template<typename T > | |
void | cpu_lil_to_csr (uint n_rows, const CpuLil< T > &in, CpuCsr< T > &out) |
template<typename T > | |
T | min (T a, T b) |
template<typename T > | |
T | max (T a, T b) |
void | register_algo_cl (class Registry *g_registry) |
Register all opencl-specific algorithms. More... | |
template<typename T > | |
void | cl_fill_zero (cl::CommandQueue &queue, cl::Buffer &values, uint n) |
template<typename T > | |
void | cl_fill_value (cl::CommandQueue &queue, const cl::Buffer &values, uint n, T value) |
template<typename T > | |
void | cl_coo_vec_init (const std::size_t n_values, const uint *Ai, const T *Ax, CLCooVec< T > &storage) |
template<typename T > | |
void | cl_coo_vec_resize (const std::size_t n_values, CLCooVec< T > &storage) |
template<typename T > | |
void | cl_coo_vec_clear (CLCooVec< T > &storage) |
template<typename T > | |
void | cl_coo_vec_read (const std::size_t n_values, uint *Ai, T *Ax, const CLCooVec< T > &storage, cl::CommandQueue &queue, cl_mem_flags staging_flags=CL_MEM_READ_ONLY|CL_MEM_HOST_READ_ONLY|CL_MEM_ALLOC_HOST_PTR, bool blocking=true) |
template<typename T > | |
void | cl_coo_vec_to_dense (const std::size_t n_rows, const T fill_value, const CLCooVec< T > &in, CLDenseVec< T > &out, cl::CommandQueue &queue) |
template<typename T > | |
void | cl_csr_init (std::size_t n_rows, std::size_t n_values, const uint *Ap, const uint *Aj, const T *Ax, CLCsr< T > &storage) |
template<typename T > | |
void | cl_csr_resize (std::size_t n_rows, std::size_t n_values, CLCsr< T > &storage) |
template<typename T > | |
void | cl_csr_read (std::size_t n_rows, std::size_t n_values, uint *Ap, uint *Aj, T *Ax, CLCsr< T > &storage, cl::CommandQueue &queue, cl_mem_flags staging_flags=CL_MEM_READ_ONLY|CL_MEM_HOST_READ_ONLY|CL_MEM_ALLOC_HOST_PTR, bool blocking=true) |
template<typename T > | |
void | cl_dense_vec_resize (const std::size_t n_rows, CLDenseVec< T > &storage) |
template<typename T > | |
void | cl_dense_vec_fill_value (const std::size_t n_rows, const T value, CLDenseVec< T > &storage) |
template<typename T > | |
void | cl_dense_vec_init (const std::size_t n_rows, const T *values, CLDenseVec< T > &storage) |
template<typename T > | |
void | cl_dense_vec_read (const std::size_t n_rows, T *values, CLDenseVec< T > &storage, cl::CommandQueue &queue, cl_mem_flags staging_flags=CL_MEM_READ_ONLY|CL_MEM_HOST_READ_ONLY|CL_MEM_ALLOC_HOST_PTR, bool blocking=true) |
template<typename T > | |
void | cl_dense_vec_to_coo (const std::size_t n_rows, const T fill_value, const CLDenseVec< T > &in, CLCooVec< T > &out, cl::CommandQueue &queue) |
template<typename T > | |
void | cl_map (cl::CommandQueue &queue, const cl::Buffer &source, cl::Buffer &dest, uint n, const ref_ptr< TOpUnary< T, T >> &op) |
template<typename T > | |
void | cl_exclusive_scan (cl::CommandQueue &queue, cl::Buffer &values, uint n, const ref_ptr< TOpBinary< T, T, T >> &op, CLAlloc *tmp_alloc) |
template<typename T > | |
void | cl_reduce (cl::CommandQueue &queue, const cl::Buffer &values, uint n, T init, const ref_ptr< TOpBinary< T, T, T >> &op_reduce, T &result) |
template<typename T > | |
void | cl_reduce_by_key (cl::CommandQueue &queue, const cl::Buffer &keys, const cl::Buffer &values, const uint size, cl::Buffer &unique_keys, cl::Buffer &reduce_values, uint &reduced_size, const ref_ptr< TOpBinary< T, T, T >> &reduce_op, CLAlloc *tmp_alloc) |
template<typename T > | |
void | cl_sort_by_key_bitonic (cl::CommandQueue &queue, cl::Buffer &keys, cl::Buffer &values, uint size) |
template<typename T > | |
void | cl_sort_by_key_radix (cl::CommandQueue &queue, cl::Buffer &keys, cl::Buffer &values, uint n, CLAlloc *tmp_alloc, uint max_key=0xffffffff) |
template<typename T > | |
void | cl_sort_by_key (cl::CommandQueue &queue, cl::Buffer &keys, cl::Buffer &values, uint n, CLAlloc *tmp_alloc, uint max_key=0xffffffff) |
template<typename T > | |
void | register_formats_matrix (StorageManagerMatrix< T > &manager) |
template<typename T > | |
void | register_formats_vector (StorageManagerVector< T > &manager) |
template<class T > | |
void | hash_combine (std::size_t &seed, const T &v) |
using spla::StorageManagerMatrix = StorageManager<T, FormatMatrix, static_cast<int>(FormatMatrix::Count)> |
using spla::StorageManagerVector = StorageManager<T, FormatVector, static_cast<int>(FormatVector::Count)> |
void spla::cl_exclusive_scan | ( | cl::CommandQueue & | queue, |
cl::Buffer & | values, | ||
uint | n, | ||
const ref_ptr< TOpBinary< T, T, T >> & | op, | ||
CLAlloc * | tmp_alloc | ||
) |
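The signature above gives no description, so the following is only a CPU-side sketch of what an exclusive scan with a binary operator usually means. The name `exclusive_scan_ref` and the explicit `init` parameter are assumptions made for illustration; the library function takes no such parameter and runs on OpenCL buffers.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical CPU reference: values[i] becomes the combination of all
// elements strictly before position i, starting from `init`.
template<typename T>
void exclusive_scan_ref(std::vector<T>& values, T init,
                        const std::function<T(T, T)>& op) {
    assert(op && "op must be callable");
    T running = init;
    for (std::size_t i = 0; i < values.size(); ++i) {
        T current = values[i];
        values[i] = running;             // prefix excludes the current element
        running = op(running, current);  // extend the prefix for the next slot
    }
}
```

With addition as `op` and `0` as `init`, the input `{3, 1, 4, 1, 5}` becomes `{0, 3, 4, 8, 9}`.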
void spla::cl_fill_value | ( | cl::CommandQueue & | queue, |
const cl::Buffer & | values, | ||
uint | n, | ||
T | value | ||
) |
void spla::cl_fill_zero | ( | cl::CommandQueue & | queue, |
cl::Buffer & | values, | ||
uint | n | ||
) |
void spla::cl_map | ( | cl::CommandQueue & | queue, |
const cl::Buffer & | source, | ||
cl::Buffer & | dest, | ||
uint | n, | ||
const ref_ptr< TOpUnary< T, T >> & | op | ||
) |
void spla::cl_reduce | ( | cl::CommandQueue & | queue, |
const cl::Buffer & | values, | ||
uint | n, | ||
T | init, | ||
const ref_ptr< TOpBinary< T, T, T >> & | op_reduce, | ||
T & | result | ||
) |
void spla::cl_reduce_by_key | ( | cl::CommandQueue & | queue, |
const cl::Buffer & | keys, | ||
const cl::Buffer & | values, | ||
const uint | size, | ||
cl::Buffer & | unique_keys, | ||
cl::Buffer & | reduce_values, | ||
uint & | reduced_size, | ||
const ref_ptr< TOpBinary< T, T, T >> & | reduce_op, | ||
CLAlloc * | tmp_alloc | ||
) |
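As a reading aid for the parameter names above (`keys`, `values`, `unique_keys`, `reduce_values`, `reduced_size`), here is a small CPU sketch of a reduce-by-key. It assumes that equal keys are adjacent in the input, which is the usual contract for this primitive but is not stated on this page. The name `reduce_by_key_ref` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical CPU reference: collapses each run of equal adjacent keys into
// one output key whose value is the reduction of the run's values.
// Returns the number of unique keys written (the "reduced size").
template<typename T>
std::size_t reduce_by_key_ref(const std::vector<std::uint32_t>& keys,
                              const std::vector<T>& values,
                              std::vector<std::uint32_t>& unique_keys,
                              std::vector<T>& reduce_values,
                              const std::function<T(T, T)>& reduce_op) {
    assert(keys.size() == values.size());
    unique_keys.clear();
    reduce_values.clear();
    for (std::size_t i = 0; i < keys.size(); ++i) {
        if (!unique_keys.empty() && unique_keys.back() == keys[i]) {
            // Same key as the previous element: fold into the current run.
            reduce_values.back() = reduce_op(reduce_values.back(), values[i]);
        } else {
            // New key: start a new run.
            unique_keys.push_back(keys[i]);
            reduce_values.push_back(values[i]);
        }
    }
    return unique_keys.size();
}
```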
void spla::cl_sort_by_key | ( | cl::CommandQueue & | queue, |
cl::Buffer & | keys, | ||
cl::Buffer & | values, | ||
uint | n, | ||
CLAlloc * | tmp_alloc, | ||
uint | max_key = 0xffffffff |
||
) |
void spla::cl_sort_by_key_bitonic | ( | cl::CommandQueue & | queue, |
cl::Buffer & | keys, | ||
cl::Buffer & | values, | ||
uint | size | ||
) |
void spla::cl_sort_by_key_radix | ( | cl::CommandQueue & | queue, |
cl::Buffer & | keys, | ||
cl::Buffer & | values, | ||
uint | n, | ||
CLAlloc * | tmp_alloc, | ||
uint | max_key = 0xffffffff |
||
) |
Status spla::exec_callback | ( | ScheduleCallback | callback, |
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) callback function.
Pass a non-null task_hnd to store the callback as a task, rather than execute it immediately.
callback | User-defined function to call as scheduled task |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
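The immediate-versus-stored behaviour described for `task_hnd` can be modelled with a toy stand-in. This is a sketch of the calling convention only: `TaskSketch` and `exec_callback_sketch` are hypothetical names, `std::shared_ptr` stands in for `ref_ptr`, and none of it reflects how the library implements scheduling.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for a stored task; not the library's ScheduleTask.
struct TaskSketch {
    std::function<void()> body;
    void run() const { body(); }
};

// Runs `callback` right away when `task_hnd` is null; otherwise stores it in
// *task_hnd so the caller can run it later (for example, from a schedule).
inline void exec_callback_sketch(std::function<void()> callback,
                                 std::shared_ptr<TaskSketch>* task_hnd = nullptr) {
    assert(callback && "callback must be callable");
    if (task_hnd == nullptr) {
        callback();  // immediate execution
        return;
    }
    *task_hnd = std::make_shared<TaskSketch>(TaskSketch{std::move(callback)});
}
```

In this model, a call without a handle has its side effects before it returns, while a call with a handle has none until the stored task is run.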
Status spla::exec_kron | ( | ref_ptr< Matrix > | R, |
ref_ptr< Matrix > | A, | ||
ref_ptr< Matrix > | B, | ||
ref_ptr< OpBinary > | op_multiply, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) sparse matrix Kronecker product.
R = A<x>B
Pass a non-null task_hnd to store the operation as a task, rather than execute it immediately.
R | Matrix to store result of the operation |
A | Left matrix for product |
B | Right matrix for product |
op_multiply | Element-wise binary operator for matrices elements product |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
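To make the shape of the result concrete, here is a CPU sketch of a Kronecker product over stored entries only, with a user-supplied `op_multiply`. The coordinate-map storage, the `kron_ref` name, and the explicit `b_rows` and `b_cols` arguments are assumptions for illustration; the library works on its own matrix objects.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

using Index = std::uint32_t;
template<typename T>
using SparseMap = std::map<std::pair<Index, Index>, T>;  // (row, col) -> value

// Hypothetical CPU reference: every pair of stored entries a(i, j), b(k, l)
// yields r(i * b_rows + k, j * b_cols + l) = op_multiply(a(i, j), b(k, l)).
template<typename T>
SparseMap<T> kron_ref(const SparseMap<T>& a, const SparseMap<T>& b,
                      Index b_rows, Index b_cols,
                      const std::function<T(T, T)>& op_multiply) {
    assert(op_multiply && "op_multiply must be callable");
    SparseMap<T> r;
    for (const auto& [a_pos, a_val] : a) {
        for (const auto& [b_pos, b_val] : b) {
            Index row = a_pos.first * b_rows + b_pos.first;
            Index col = a_pos.second * b_cols + b_pos.second;
            r[{row, col}] = op_multiply(a_val, b_val);
        }
    }
    return r;
}
```

The result has as many stored entries as the product of the two inputs' entry counts, since each pair of entries maps to a distinct position.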
Status spla::exec_m_eadd | ( | ref_ptr< Matrix > | R, |
ref_ptr< Matrix > | A, | ||
ref_ptr< Matrix > | B, | ||
ref_ptr< OpBinary > | op, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) element-wise addition by structure of two matrices.
R | Matrix to store result of operation |
A | Matrix input to sum |
B | Matrix input to sum |
op | Element-wise binary operator to sum elements of matrices |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
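"Addition by structure" is commonly read as a union of the two sparsity patterns: positions present in both inputs are combined with `op`, and positions present in only one input are copied through. The sketch below follows that reading, which is an assumption here rather than something this page states. `eadd_ref` and the coordinate-map storage are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

using Index = std::uint32_t;
template<typename T>
using SparseMap = std::map<std::pair<Index, Index>, T>;  // (row, col) -> value

// Hypothetical CPU reference: union of structures, `op` applied on overlap.
template<typename T>
SparseMap<T> eadd_ref(const SparseMap<T>& a, const SparseMap<T>& b,
                      const std::function<T(T, T)>& op) {
    assert(op && "op must be callable");
    SparseMap<T> r = a;  // start from every entry of A
    for (const auto& [pos, b_val] : b) {
        auto it = r.find(pos);
        if (it == r.end()) {
            r.emplace(pos, b_val);            // only in B: copy through
        } else {
            it->second = op(it->second, b_val);  // in both: combine
        }
    }
    return r;
}
```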
Status spla::exec_m_emult | ( | ref_ptr< Matrix > | R, |
ref_ptr< Matrix > | A, | ||
ref_ptr< Matrix > | B, | ||
ref_ptr< OpBinary > | op, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) element-wise multiplication by structure of two matrices.
R | Matrix to store result of operation |
A | Matrix input to multiply |
B | Matrix input to multiply |
op | Element-wise binary operator to multiply elements of matrices |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
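"Multiplication by structure" is commonly read as an intersection of the two sparsity patterns: only positions stored in both inputs produce an output entry. The sketch below follows that reading as an assumption; `emult_ref` and the coordinate-map storage are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

using Index = std::uint32_t;
template<typename T>
using SparseMap = std::map<std::pair<Index, Index>, T>;  // (row, col) -> value

// Hypothetical CPU reference: intersection of structures, `op` on each match.
template<typename T>
SparseMap<T> emult_ref(const SparseMap<T>& a, const SparseMap<T>& b,
                       const std::function<T(T, T)>& op) {
    assert(op && "op must be callable");
    SparseMap<T> r;
    for (const auto& [pos, a_val] : a) {
        auto it = b.find(pos);
        if (it != b.end()) {
            r.emplace(pos, op(a_val, it->second));  // stored in both inputs
        }
    }
    return r;
}
```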
Status spla::exec_m_extract_column | ( | ref_ptr< Vector > | r, |
ref_ptr< Matrix > | M, | ||
uint | index, | ||
ref_ptr< OpUnary > | op_apply, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) matrix column extract.
Pass a non-null task_hnd to store the operation as a task, rather than execute it immediately.
r | Result vector |
M | Source matrix |
index | Index of column |
op_apply | Unary op to transform value |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
Status spla::exec_m_extract_row | ( | ref_ptr< Vector > | r, |
ref_ptr< Matrix > | M, | ||
uint | index, | ||
ref_ptr< OpUnary > | op_apply, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) matrix row extract.
Pass a non-null task_hnd to store the operation as a task, rather than execute it immediately.
r | Result vector |
M | Source matrix |
index | Index of row |
op_apply | Unary op to transform value |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
Status spla::exec_m_reduce | ( | ref_ptr< Scalar > | r, |
ref_ptr< Scalar > | s, | ||
ref_ptr< Matrix > | M, | ||
ref_ptr< OpBinary > | op_reduce, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) matrix by structure reduction to a single scalar value.
Pass a non-null task_hnd to store the operation as a task, rather than execute it immediately.
r | Scalar to store reduction result |
s | Scalar neutral init value for reduction |
M | Matrix to reduce |
op_reduce | Binary op to reduce values |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
Status spla::exec_m_reduce_by_column | ( | ref_ptr< Vector > | r, |
ref_ptr< Matrix > | M, | ||
ref_ptr< OpBinary > | op_reduce, | ||
ref_ptr< Scalar > | init, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) matrix by column reduction to single vector column.
Pass a non-null task_hnd to store the operation as a task, rather than execute it immediately.
r | Vector to store reduction of columns |
M | Matrix to reduce columns |
op_reduce | Binary op to sum elements of single column |
init | Scalar identity element for reduction |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
Status spla::exec_m_reduce_by_row | ( | ref_ptr< Vector > | r, |
ref_ptr< Matrix > | M, | ||
ref_ptr< OpBinary > | op_reduce, | ||
ref_ptr< Scalar > | init, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) matrix by row reduction to single vector column.
Pass a non-null task_hnd to store the operation as a task, rather than execute it immediately.
r | Vector to store reduction of rows |
M | Matrix to reduce rows |
op_reduce | Binary op to sum elements of single row |
init | Scalar identity element for reduction |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
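The roles of `op_reduce` and `init` can be illustrated with a CPU sketch over a compressed sparse row layout. `CsrSketch` and `reduce_by_row_ref` are hypothetical names, and the choice to return a dense vector in which empty rows keep `init` is an assumption; the library may store the result differently.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical minimal CSR container; not the library's CpuCsr.
template<typename T>
struct CsrSketch {
    std::vector<std::uint32_t> row_ptr;  // size n_rows + 1, offsets into values
    std::vector<std::uint32_t> col_idx;  // column of each stored value
    std::vector<T> values;               // stored values, row by row
};

// Folds the stored values of each row with `op_reduce`, starting from `init`.
template<typename T>
std::vector<T> reduce_by_row_ref(const CsrSketch<T>& m,
                                 const std::function<T(T, T)>& op_reduce,
                                 T init) {
    assert(!m.row_ptr.empty() && "row_ptr needs n_rows + 1 entries");
    const std::size_t n_rows = m.row_ptr.size() - 1;
    std::vector<T> r(n_rows, init);
    for (std::size_t row = 0; row < n_rows; ++row) {
        for (std::uint32_t k = m.row_ptr[row]; k < m.row_ptr[row + 1]; ++k) {
            r[row] = op_reduce(r[row], m.values[k]);
        }
    }
    return r;
}
```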
Status spla::exec_m_transpose | ( | ref_ptr< Matrix > | R, |
ref_ptr< Matrix > | M, | ||
ref_ptr< OpUnary > | op_apply, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) matrix transpose operation.
Pass a non-null task_hnd to store the operation as a task, rather than execute it immediately.
R | Matrix to store result |
M | Matrix to transpose |
op_apply | Unary op to transform value |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
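One way to read the combination of a transpose with `op_apply` is that each stored value is transformed while its coordinates are swapped. The sketch below shows that reading on a coordinate map; `transpose_ref` and the storage type are hypothetical, and the order in which the library applies the two steps is not stated on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>

using Index = std::uint32_t;
template<typename T>
using SparseMap = std::map<std::pair<Index, Index>, T>;  // (row, col) -> value

// Hypothetical CPU reference: r(j, i) = op_apply(m(i, j)) for stored entries.
template<typename T>
SparseMap<T> transpose_ref(const SparseMap<T>& m,
                           const std::function<T(T)>& op_apply) {
    assert(op_apply && "op_apply must be callable");
    SparseMap<T> r;
    for (const auto& [pos, val] : m) {
        r.emplace(std::make_pair(pos.second, pos.first), op_apply(val));
    }
    return r;
}
```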
Status spla::exec_mxm | ( | ref_ptr< Matrix > | R, |
ref_ptr< Matrix > | A, | ||
ref_ptr< Matrix > | B, | ||
ref_ptr< OpBinary > | op_multiply, | ||
ref_ptr< OpBinary > | op_add, | ||
ref_ptr< Scalar > | init, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) sparse-matrix sparse-matrix product.
R = AB
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
R | Matrix to store result of the operation |
A | Left matrix for product |
B | Right matrix for product |
op_multiply | Element-wise binary operator to multiply matrix elements |
op_add | Element-wise binary operator to sum the products of matrix elements |
init | Initial value for the product of a matrix row and column |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
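The roles of op_multiply, op_add and init in R = AB can be sketched with a dense model. The code below is an illustration in plain standard C++ under the assumption of dense `int` matrices with matching dimensions; it is not the spla implementation, and `mxm` here is a hypothetical free function.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Mat = std::vector<std::vector<int>>;
using BinOp = std::function<int(int, int)>;

// Dense model (not spla code):
// R[i][j] = init (+) A[i][0] (*) B[0][j] (+) A[i][1] (*) B[1][j] ...
// where (*) is op_multiply and (+) is op_add.
Mat mxm(const Mat& A, const Mat& B, const BinOp& op_multiply,
        const BinOp& op_add, int init) {
    const std::size_t n = A.size();
    const std::size_t k = B.size();
    const std::size_t m = k ? B[0].size() : 0;
    Mat R(n, std::vector<int>(m, init));
    for (std::size_t i = 0; i < n; ++i)
        for (std::size_t j = 0; j < m; ++j)
            for (std::size_t t = 0; t < k; ++t)
                R[i][j] = op_add(R[i][j], op_multiply(A[i][t], B[t][j]));
    return R;
}
```

With `std::multiplies<int>()`, `std::plus<int>()` and init 0 this is the ordinary matrix product: {{1, 2}, {3, 4}} times {{5, 6}, {7, 8}} gives {{19, 22}, {43, 50}}. Swapping in other operator pairs changes the algebra without changing the loop.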
Status spla::exec_mxmT_masked | ( | ref_ptr< Matrix > | R, |
ref_ptr< Matrix > | mask, | ||
ref_ptr< Matrix > | A, | ||
ref_ptr< Matrix > | B, | ||
ref_ptr< OpBinary > | op_multiply, | ||
ref_ptr< OpBinary > | op_add, | ||
ref_ptr< OpSelect > | op_select, | ||
ref_ptr< Scalar > | init, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) sparse masked matrix matrix-transposed product.
R = AB^t .mask
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
R | Matrix to store result of the operation |
mask | Mask to filter product result |
A | Left matrix for product |
B | Right matrix for product |
op_multiply | Element-wise binary operator to multiply matrix elements |
op_add | Element-wise binary operator to sum the products of matrix elements |
op_select | Selection op to filter mask |
init | Initial value for the product of a matrix row and column |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
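A dense sketch of R = AB^t .mask follows. It assumes that an entry of R is computed only where op_select accepts the corresponding mask value, and that unselected entries keep a fill value (the `fill` parameter is a hypothetical addition for this dense model). This is plain standard C++, not spla code.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Mat = std::vector<std::vector<int>>;
using BinOp = std::function<int(int, int)>;
using Select = std::function<bool(int)>;

// Dense model (not spla code): for selected (i, j),
// R[i][j] = init (+) sum over t of A[i][t] (*) B[j][t].
// B is indexed by row j because the product uses B transposed.
Mat mxmT_masked(const Mat& mask, const Mat& A, const Mat& B,
                const BinOp& op_multiply, const BinOp& op_add,
                const Select& op_select, int init, int fill = 0) {
    const std::size_t n = A.size();
    const std::size_t m = B.size();
    Mat R(n, std::vector<int>(m, fill));
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < m; ++j) {
            if (!op_select(mask[i][j])) continue;  // masked out: skip work
            int acc = init;
            for (std::size_t t = 0; t < A[i].size(); ++t)
                acc = op_add(acc, op_multiply(A[i][t], B[j][t]));
            R[i][j] = acc;
        }
    }
    return R;
}
```

With A = {{1, 2}, {3, 4}}, B = {{5, 6}, {7, 8}}, a diagonal mask {{1, 0}, {0, 1}} and a "non-zero" selector, only the diagonal is computed: R = {{17, 0}, {0, 53}}.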
Status spla::exec_mxv_masked | ( | ref_ptr< Vector > | r, |
ref_ptr< Vector > | mask, | ||
ref_ptr< Matrix > | M, | ||
ref_ptr< Vector > | v, | ||
ref_ptr< OpBinary > | op_multiply, | ||
ref_ptr< OpBinary > | op_add, | ||
ref_ptr< OpSelect > | op_select, | ||
ref_ptr< Scalar > | init, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) dense-masked sparse matrix by dense vector product.
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
r | Vector to store operation result |
mask | Vector to select for which values to compute product |
M | Matrix for product |
v | Vector for product |
op_multiply | Element-wise binary operator to multiply matrix and vector elements |
op_add | Element-wise binary operator to sum matrix-vector element products |
op_select | Selection op to filter mask |
init | Initial value for the product of a matrix row and the vector |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
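The masked matrix-vector product can be modelled as one row-by-vector fold per selected mask entry. The sketch below assumes dense `int` storage and a hypothetical `fill` value for unselected positions; it is an illustration in plain standard C++ rather than the spla implementation.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<int>;
using Mat = std::vector<Vec>;
using BinOp = std::function<int(int, int)>;
using Select = std::function<bool(int)>;

// Dense model (not spla code): for each i with op_select(mask[i]) true,
// r[i] = init (+) sum over j of M[i][j] (*) v[j].
Vec mxv_masked(const Vec& mask, const Mat& M, const Vec& v,
               const BinOp& op_multiply, const BinOp& op_add,
               const Select& op_select, int init, int fill = 0) {
    Vec r(M.size(), fill);
    for (std::size_t i = 0; i < M.size(); ++i) {
        if (!op_select(mask[i])) continue;  // row i is masked out
        int acc = init;
        for (std::size_t j = 0; j < v.size(); ++j)
            acc = op_add(acc, op_multiply(M[i][j], v[j]));
        r[i] = acc;
    }
    return r;
}
```

For M = {{1, 2}, {3, 4}}, v = {10, 100} and mask = {1, 0} with a "non-zero" selector, only the first row is evaluated and r = {210, 0}.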
Status spla::exec_v_assign_masked | ( | ref_ptr< Vector > | r, |
ref_ptr< Vector > | mask, | ||
ref_ptr< Scalar > | value, | ||
ref_ptr< OpBinary > | op_assign, | ||
ref_ptr< OpSelect > | op_select, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) masked scalar assignment to a vector.
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
r | Vector result to store assigned values |
mask | Vector mask to choose where to assign |
value | Scalar value to assign |
op_assign | Binary op to assign values |
op_select | Select op to choose values for assignment |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
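A minimal model of masked assignment, assuming that op_assign combines the current value of r with the scalar wherever op_select accepts the mask value. The code is plain standard C++ on dense vectors, not spla code; `v_assign_masked` is a hypothetical name.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using BinOp = std::function<int(int, int)>;
using Select = std::function<bool(int)>;

// Dense model (not spla code):
// r[i] = op_assign(r[i], value) wherever op_select(mask[i]) is true.
void v_assign_masked(std::vector<int>& r, const std::vector<int>& mask,
                     int value, const BinOp& op_assign,
                     const Select& op_select) {
    for (std::size_t i = 0; i < r.size() && i < mask.size(); ++i) {
        if (op_select(mask[i])) r[i] = op_assign(r[i], value);
    }
}
```

With an op_assign that returns its second argument, the selected positions are overwritten: r = {0, 0, 0}, mask = {1, 0, 2} and value 5 give r = {5, 0, 5}. With `std::plus<int>()` the value is accumulated instead.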
Status spla::exec_v_count_mf | ( | ref_ptr< Scalar > | r, |
ref_ptr< Vector > | v, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) count number of meaningful values by vector structure.
Count the number of entries in the provided vector container that are not equal to the fill value. Use this function to obtain the actual number of meaningful values in a container. Since a container can use a sparse or dense storage schema, the actual number of meaningful elements must be explicitly evaluated. This function is useful for algorithms that exploit sparsity of the input.
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
r | Scalar (int) to store count of meaningful entries |
v | Vector to count number of meaningful entries |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
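For a densely stored vector, counting meaningful values amounts to counting entries that differ from the fill value. A short standalone sketch in standard C++ (not spla code; `count_mf` here is a hypothetical free function and `fill_value` is passed explicitly):

```cpp
#include <algorithm>
#include <vector>

// Dense model (not spla code): number of entries not equal to fill_value.
int count_mf(const std::vector<int>& v, int fill_value) {
    return static_cast<int>(std::count_if(
        v.begin(), v.end(),
        [fill_value](int x) { return x != fill_value; }));
}
```

For v = {0, 3, 0, 7} and fill value 0 the count is 2.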
Status spla::exec_v_eadd | ( | ref_ptr< Vector > | r, |
ref_ptr< Vector > | u, | ||
ref_ptr< Vector > | v, | ||
ref_ptr< OpBinary > | op, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) element-wise addition by structure of two vectors.
r | Vector to store result of operation |
u | Vector input to sum |
v | Vector input to sum |
op | Element-wise binary operator to sum elements of vectors |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
Status spla::exec_v_eadd_fdb | ( | ref_ptr< Vector > | r, |
ref_ptr< Vector > | v, | ||
ref_ptr< Vector > | fdb, | ||
ref_ptr< OpBinary > | op, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) element-wise addition by structure of two vectors with feedback.
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
r | Vector to store operation result |
v | Vector to add to r element-wise |
fdb | Feedback vector storing affected r values |
op | Element-wise binary operator to sum elements of vectors |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
Status spla::exec_v_emult | ( | ref_ptr< Vector > | r, |
ref_ptr< Vector > | u, | ||
ref_ptr< Vector > | v, | ||
ref_ptr< OpBinary > | op, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) element-wise multiplication by structure of two vectors.
r | Vector to store result of operation |
u | Vector input to multiply |
v | Vector input to multiply |
op | Element-wise binary operator to multiply elements of vectors |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
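The phrase "by structure" for exec_v_eadd and exec_v_emult can be read in the GraphBLAS sense: addition works on the union of stored entries, multiplication on their intersection. The sketch below models that reading with `std::map` as a sparse vector. Whether spla follows exactly these rules is an assumption here, and `SparseVec`, `eadd` and `emult` are hypothetical names.

```cpp
#include <functional>
#include <map>

using SparseVec = std::map<int, int>;  // index -> stored value
using BinOp = std::function<int(int, int)>;

// Union of structures: entries present in only one input are copied,
// entries present in both are combined with op.
SparseVec eadd(const SparseVec& u, const SparseVec& v, const BinOp& op) {
    SparseVec r = u;
    for (const auto& kv : v) {
        auto it = r.find(kv.first);
        if (it == r.end()) {
            r.emplace(kv.first, kv.second);
        } else {
            it->second = op(it->second, kv.second);
        }
    }
    return r;
}

// Intersection of structures: only indices stored in both inputs survive.
SparseVec emult(const SparseVec& u, const SparseVec& v, const BinOp& op) {
    SparseVec r;
    for (const auto& kv : u) {
        auto it = v.find(kv.first);
        if (it != v.end()) r.emplace(kv.first, op(kv.second, it->second));
    }
    return r;
}
```

For u = {0: 1, 2: 3} and v = {2: 4, 5: 6}, `eadd` with `std::plus<int>()` yields {0: 1, 2: 7, 5: 6}, while `emult` with `std::multiplies<int>()` yields {2: 12}.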
Status spla::exec_v_map | ( | ref_ptr< Vector > | r, |
ref_ptr< Vector > | v, | ||
ref_ptr< OpUnary > | op, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) by structure map of one vector to another using unary operation.
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
r | Vector result to store mapped values |
v | Vector source to map |
op | Unary op to transform one value to another |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
Status spla::exec_v_reduce | ( | ref_ptr< Scalar > | r, |
ref_ptr< Scalar > | s, | ||
ref_ptr< Vector > | v, | ||
ref_ptr< OpBinary > | op_reduce, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) vector by structure reduction to a single scalar value.
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
r | Scalar to store reduction result |
s | Scalar neutral init value for reduction |
v | Vector to reduce |
op_reduce | Binary op to reduce two values |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
Status spla::exec_vxm_masked | ( | ref_ptr< Vector > | r, |
ref_ptr< Vector > | mask, | ||
ref_ptr< Vector > | v, | ||
ref_ptr< Matrix > | M, | ||
ref_ptr< OpBinary > | op_multiply, | ||
ref_ptr< OpBinary > | op_add, | ||
ref_ptr< OpSelect > | op_select, | ||
ref_ptr< Scalar > | init, | ||
ref_ptr< Descriptor > | desc = ref_ptr<Descriptor>() , |
||
ref_ptr< ScheduleTask > * | task_hnd = nullptr |
||
) |
Execute (schedule) dense-masked sparse vector by sparse matrix product.
Pass a valid task_hnd to store the operation as a task, rather than execute it immediately.
r | Vector to store operation result |
mask | Vector to select for which values to compute product |
v | Vector for product |
M | Matrix for product |
op_multiply | Element-wise binary operator to multiply vector and matrix elements |
op_add | Element-wise binary operator to sum vector-matrix element products |
op_select | Selection op to filter mask |
init | Init of matrix row and vector product |
desc | Scheduled task descriptor; default is null |
task_hnd | Optional task hnd; pass not-null pointer to store task |
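A typical use of a masked vector-matrix product is a single breadth-first-search step: the frontier vector is multiplied by the adjacency matrix, and the mask restricts results to unvisited vertices. The dense sketch below shows this with bitwise AND as op_multiply and bitwise OR as op_add. It is plain standard C++, not spla code; the `fill` parameter and the name `vxm_masked` as a free function over `std::vector` are assumptions of this model.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

using Vec = std::vector<int>;
using Mat = std::vector<Vec>;
using BinOp = std::function<int(int, int)>;
using Select = std::function<bool(int)>;

// Dense model (not spla code): for each j with op_select(mask[j]) true,
// r[j] = init (+) sum over i of v[i] (*) M[i][j].
Vec vxm_masked(const Vec& mask, const Vec& v, const Mat& M,
               const BinOp& op_multiply, const BinOp& op_add,
               const Select& op_select, int init, int fill = 0) {
    Vec r(mask.size(), fill);
    for (std::size_t j = 0; j < mask.size(); ++j) {
        if (!op_select(mask[j])) continue;  // e.g. vertex already visited
        int acc = init;
        for (std::size_t i = 0; i < v.size(); ++i)
            acc = op_add(acc, op_multiply(v[i], M[i][j]));
        r[j] = acc;
    }
    return r;
}
```

For a graph with edges 0->1, 0->2 and 1->2, frontier v = {1, 0, 0} and visited mask {1, 0, 0} with an "equals zero" selector, the next frontier is {0, 1, 1}.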
void spla::register_formats_matrix | ( | StorageManagerMatrix< T > & | manager | ) | inline |
void spla::register_formats_vector | ( | StorageManagerVector< T > & | manager | ) | inline |