User Manual

Introduction

rocSPARSE is a library that contains basic linear algebra subroutines for sparse matrices and vectors, written in HIP for GPU devices. It is designed to be used from C and C++ code. The functionality of rocSPARSE is organized into several categories, including the auxiliary, level 1, level 3 and extra functions described later in this manual.

The code is open and hosted here: https://github.com/ROCmSoftwarePlatform/rocSPARSE

Building and Installing

Prerequisites

rocSPARSE requires a ROCm-enabled platform.

Installing pre-built packages

rocSPARSE can be installed from the AMD ROCm repository. For detailed instructions on how to set up ROCm on different platforms, see the AMD ROCm Platform Installation Guide for Linux.

On Ubuntu, for example, rocSPARSE can be installed using:

$ sudo apt-get update
$ sudo apt-get install rocsparse

Once installed, rocSPARSE can be used just like any other library with a C API. The header file will need to be included in the user code in order to make calls into rocSPARSE, and the rocSPARSE shared library will become link-time and run-time dependent for the user application.

Building rocSPARSE from source

Building from source is not necessary, as rocSPARSE can be used after installing the pre-built packages as described above. If desired, the following instructions can be used to build rocSPARSE from source. Furthermore, the following compile-time dependencies must be met

Download rocSPARSE

The rocSPARSE source code is available at the rocSPARSE GitHub page. Download the master branch using:

$ git clone -b master https://github.com/ROCmSoftwarePlatform/rocSPARSE.git
$ cd rocSPARSE

Below are steps to build different packages of the library, including dependencies and clients. It is recommended to install rocSPARSE using the install.sh script.

Using install.sh to build rocSPARSE with dependencies

The following table lists common uses of install.sh to build the dependencies and the library.

Command            Description
./install.sh -h    Print help information.
./install.sh -d    Build dependencies and library in your local directory. The -d flag only needs to be used once. For subsequent invocations of install.sh it is not necessary to rebuild the dependencies.
./install.sh       Build library in your local directory. It is assumed dependencies are available.
./install.sh -i    Build library, then build and install rocSPARSE package in /opt/rocm/rocsparse. You will be prompted for sudo access. This will install for all users.

Using install.sh to build rocSPARSE with dependencies and clients

The client contains example code, unit tests and benchmarks. Common uses of install.sh to build them are listed in the table below.

Command              Description
./install.sh -h      Print help information.
./install.sh -dc     Build dependencies, library and client in your local directory. The -d flag only needs to be used once. For subsequent invocations of install.sh it is not necessary to rebuild the dependencies.
./install.sh -c      Build library and client in your local directory. It is assumed dependencies are available.
./install.sh -idc    Build library, dependencies and client, then build and install rocSPARSE package in /opt/rocm/rocsparse. You will be prompted for sudo access. This will install for all users.
./install.sh -ic     Build library and client, then build and install rocSPARSE package in /opt/rocm/rocsparse. You will be prompted for sudo access. This will install for all users.

Using individual commands to build rocSPARSE

CMake 3.5 or later is required in order to build rocSPARSE. The rocSPARSE library contains both host and device code; therefore, the HIP compiler must be specified during the CMake configuration process.

rocSPARSE can be built using the following commands:

# Create and change to build directory
$ mkdir -p build/release ; cd build/release

# Default install path is /opt/rocm, use -DCMAKE_INSTALL_PREFIX=<path> to adjust it
$ CXX=/opt/rocm/bin/hipcc cmake ../..

# Compile rocSPARSE library
$ make -j$(nproc)

# Install rocSPARSE to /opt/rocm
$ make install

Boost and GoogleTest are required in order to build the rocSPARSE clients.

rocSPARSE with dependencies and clients can be built using the following commands:

# Install boost on e.g. Ubuntu
$ apt install libboost-program-options-dev

# Install googletest
$ mkdir -p build/release/deps ; cd build/release/deps
$ cmake ../../../deps
$ make -j$(nproc) install

# Change to build directory
$ cd ..

# Default install path is /opt/rocm, use -DCMAKE_INSTALL_PREFIX=<path> to adjust it
$ CXX=/opt/rocm/bin/hipcc cmake ../.. -DBUILD_CLIENTS_TESTS=ON \
                                      -DBUILD_CLIENTS_BENCHMARKS=ON \
                                      -DBUILD_CLIENTS_SAMPLES=ON

# Compile rocSPARSE library
$ make -j$(nproc)

# Install rocSPARSE to /opt/rocm
$ make install

Common build problems

  1. Issue: Could not find a package configuration file provided by “ROCM” with any of the following names: ROCMConfig.cmake, rocm-config.cmake

    Solution: Install ROCm cmake modules

Simple Test

You can test the installation by running one of the rocSPARSE examples, after successfully compiling the library with clients.

# Navigate to clients binary directory
$ cd rocSPARSE/build/release/clients/staging

# Execute rocSPARSE example
$ ./example_csrmv 1000

Supported Targets

Currently, rocSPARSE is supported under the following operating systems

To compile and run rocSPARSE, AMD ROCm Platform is required.

The following HIP capable devices are currently supported

  • gfx803 (e.g. Fiji)

  • gfx900 (e.g. Vega10, MI25)

  • gfx906 (e.g. Vega20, MI50, MI60)

  • gfx908

Device and Stream Management

hipSetDevice() and hipGetDevice() are HIP device management APIs. They are NOT part of the rocSPARSE API.

Asynchronous Execution

All rocSPARSE library functions, unless otherwise stated, are non-blocking and executed asynchronously with respect to the host. They may return before the actual computation has finished. To force synchronization, hipDeviceSynchronize() or hipStreamSynchronize() can be used. This ensures that all previously executed rocSPARSE functions on the device or on the particular stream, respectively, have completed.

HIP Device Management

Before a HIP kernel invocation, users need to call hipSetDevice() to set a device, e.g. device 1. If users do not explicitly call it, the system by default sets it as device 0. Unless users explicitly call hipSetDevice() to set to another device, their HIP kernels are always launched on device 0.

The above is a HIP (and CUDA) device management approach and has nothing to do with rocSPARSE. rocSPARSE honors the approach above and assumes users have already set the device before a rocSPARSE routine call.

Once users set the device, they create a handle with rocsparse_create_handle().

Subsequent rocSPARSE routines take this handle as an input parameter. rocSPARSE ONLY queries (by hipGetDevice()) the user’s device; rocSPARSE does NOT set the device for users. If rocSPARSE does not see a valid device, it returns an error message. It is the users’ responsibility to provide a valid device to rocSPARSE and ensure the device safety.

Users CANNOT switch devices between rocsparse_create_handle() and rocsparse_destroy_handle(). If users want to change device, they must destroy the current handle and create another rocSPARSE handle.

HIP Stream Management

HIP kernels are always launched in a queue (also known as stream).

If users do not explicitly specify a stream, the system provides a default stream, maintained by the system. Users cannot create or destroy the default stream. However, users can freely create new streams (with hipStreamCreate()) and bind them to the rocSPARSE handle using rocsparse_set_stream(). HIP kernels are invoked in rocSPARSE routines. The rocSPARSE handle is always associated with a stream, and rocSPARSE passes its stream to the kernels inside the routine. One rocSPARSE routine only takes one stream in a single invocation. If users create a stream, they are responsible for destroying it.

Multiple Streams and Multiple Devices

If the system under test has multiple HIP devices, users can run multiple rocSPARSE handles concurrently, but CANNOT run a single rocSPARSE handle on different discrete devices. Each handle is associated with a single device, and a new handle should be created for each additional device.

Storage Formats

COO storage format

The Coordinate (COO) storage format represents a \(m \times n\) matrix by

m

number of rows (integer).

n

number of columns (integer).

nnz

number of non-zero elements (integer).

coo_val

array of nnz elements containing the data (floating point).

coo_row_ind

array of nnz elements containing the row indices (integer).

coo_col_ind

array of nnz elements containing the column indices (integer).

The COO matrix is expected to be sorted by row indices and column indices per row. Furthermore, each pair of indices should appear only once. Consider the following \(3 \times 5\) matrix and the corresponding COO structures, with \(m = 3, n = 5\) and \(\text{nnz} = 8\) using zero based indexing:

\[\begin{split}A = \begin{pmatrix} 1.0 & 2.0 & 0.0 & 3.0 & 0.0 \\ 0.0 & 4.0 & 5.0 & 0.0 & 0.0 \\ 6.0 & 0.0 & 0.0 & 7.0 & 8.0 \\ \end{pmatrix}\end{split}\]

where

\[\begin{split}\begin{array}{ll} \text{coo_val}[8] & = \{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0\} \\ \text{coo_row_ind}[8] & = \{0, 0, 0, 1, 1, 2, 2, 2\} \\ \text{coo_col_ind}[8] & = \{0, 1, 3, 1, 2, 0, 3, 4\} \end{array}\end{split}\]
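The arrays above can be reproduced with a small host-side sketch in plain C++. It is not part of the rocSPARSE API: the CooMatrix struct and the dense_to_coo function are hypothetical names introduced only for illustration. Scanning a dense row-major array row by row, left to right, produces entries that are already sorted by row index and by column index within each row, as the format expects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side container mirroring the COO arrays described above.
struct CooMatrix
{
    int                 m = 0;   // number of rows
    int                 n = 0;   // number of columns
    std::vector<double> val;     // coo_val
    std::vector<int>    row_ind; // coo_row_ind
    std::vector<int>    col_ind; // coo_col_ind
};

// Build COO arrays from a dense row-major m x n array.
// 'base' is the index base (0 or 1) added to every stored index.
CooMatrix dense_to_coo(const std::vector<double>& dense, int m, int n, int base)
{
    assert(m >= 0 && n >= 0);
    assert(dense.size() == static_cast<std::size_t>(m) * n);
    assert(base == 0 || base == 1);

    CooMatrix A;
    A.m = m;
    A.n = n;

    // Row-by-row, left-to-right scan keeps the entries sorted
    for(int i = 0; i < m; ++i)
    {
        for(int j = 0; j < n; ++j)
        {
            double v = dense[static_cast<std::size_t>(i) * n + j];
            if(v != 0.0)
            {
                A.val.push_back(v);
                A.row_ind.push_back(i + base);
                A.col_ind.push_back(j + base);
            }
        }
    }
    return A;
}
```

Calling dense_to_coo with the matrix above, m = 3, n = 5 and base = 0 yields the three arrays listed, with nnz given by the length of val.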

CSR storage format

The Compressed Sparse Row (CSR) storage format represents a \(m \times n\) matrix by

m

number of rows (integer).

n

number of columns (integer).

nnz

number of non-zero elements (integer).

csr_val

array of nnz elements containing the data (floating point).

csr_row_ptr

array of m+1 elements that point to the start of every row (integer).

csr_col_ind

array of nnz elements containing the column indices (integer).

The CSR matrix is expected to be sorted by column indices within each row. Furthermore, each pair of indices should appear only once. Consider the following \(3 \times 5\) matrix and the corresponding CSR structures, with \(m = 3, n = 5\) and \(\text{nnz} = 8\) using one based indexing:

\[\begin{split}A = \begin{pmatrix} 1.0 & 2.0 & 0.0 & 3.0 & 0.0 \\ 0.0 & 4.0 & 5.0 & 0.0 & 0.0 \\ 6.0 & 0.0 & 0.0 & 7.0 & 8.0 \\ \end{pmatrix}\end{split}\]

where

\[\begin{split}\begin{array}{ll} \text{csr_val}[8] & = \{1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0\} \\ \text{csr_row_ptr}[4] & = \{1, 4, 6, 9\} \\ \text{csr_col_ind}[8] & = \{1, 2, 4, 2, 3, 1, 4, 5\} \end{array}\end{split}\]
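To make the meaning of csr_row_ptr concrete, the following host-side sketch compresses sorted COO row indices into a row pointer array by counting the entries per row and taking a prefix sum. It is plain C++ for illustration, not a rocSPARSE routine; the function name coo_to_csr_row_ptr is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Compress sorted COO row indices into a CSR row pointer array of m + 1 entries.
// 'base' is the index base (0 or 1) used by both the input and the output.
std::vector<int> coo_to_csr_row_ptr(const std::vector<int>& coo_row_ind, int m, int base)
{
    assert(m >= 0);
    assert(base == 0 || base == 1);

    // Count the number of entries in each row (stored one slot to the right)
    std::vector<int> row_ptr(m + 1, 0);
    for(int row : coo_row_ind)
    {
        int r = row - base;
        assert(r >= 0 && r < m);
        ++row_ptr[r + 1];
    }

    // The prefix sum turns the counts into zero-based row start offsets
    for(int i = 0; i < m; ++i)
    {
        row_ptr[i + 1] += row_ptr[i];
    }

    // Shift by the index base
    for(int& p : row_ptr)
    {
        p += base;
    }
    return row_ptr;
}
```

With the one-based row indices {1, 1, 1, 2, 2, 3, 3, 3} of the matrix above and m = 3, the result matches the csr_row_ptr array shown: the last entry equals nnz plus the index base.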

BSR storage format

The Block Compressed Sparse Row (BSR) storage format represents a \((mb \cdot \text{bsr_dim}) \times (nb \cdot \text{bsr_dim})\) matrix by

mb

number of block rows (integer)

nb

number of block columns (integer)

nnzb

number of non-zero blocks (integer)

bsr_val

array of nnzb * bsr_dim * bsr_dim elements containing the data (floating point). Blocks can be stored column-major or row-major.

bsr_row_ptr

array of mb+1 elements that point to the start of every block row (integer).

bsr_col_ind

array of nnzb elements containing the block column indices (integer).

bsr_dim

dimension of each block (integer).

The BSR matrix is expected to be sorted by column indices within each row. If \(m\) or \(n\) are not evenly divisible by the block dimension, then zeros are padded to the matrix, such that \(mb = (m + \text{bsr_dim} - 1) / \text{bsr_dim}\) and \(nb = (n + \text{bsr_dim} - 1) / \text{bsr_dim}\). Consider the following \(4 \times 3\) matrix and the corresponding BSR structures, with \(\text{bsr_dim} = 2, mb = 2, nb = 2\) and \(\text{nnzb} = 4\) using zero based indexing and column-major storage:

\[\begin{split}A = \begin{pmatrix} 1.0 & 0.0 & 2.0 \\ 3.0 & 0.0 & 4.0 \\ 5.0 & 6.0 & 0.0 \\ 7.0 & 0.0 & 8.0 \\ \end{pmatrix}\end{split}\]

with the blocks \(A_{ij}\)

\[\begin{split}A_{00} = \begin{pmatrix} 1.0 & 0.0 \\ 3.0 & 0.0 \\ \end{pmatrix}, A_{01} = \begin{pmatrix} 2.0 & 0.0 \\ 4.0 & 0.0 \\ \end{pmatrix}, A_{10} = \begin{pmatrix} 5.0 & 6.0 \\ 7.0 & 0.0 \\ \end{pmatrix}, A_{11} = \begin{pmatrix} 0.0 & 0.0 \\ 8.0 & 0.0 \\ \end{pmatrix}\end{split}\]

such that

\[\begin{split}A = \begin{pmatrix} A_{00} & A_{01} \\ A_{10} & A_{11} \\ \end{pmatrix}\end{split}\]

with the array representation

\[\begin{split}\begin{array}{ll} \text{bsr_val}[16] & = \{1.0, 3.0, 0.0, 0.0, 2.0, 4.0, 0.0, 0.0, 5.0, 7.0, 6.0, 0.0, 0.0, 8.0, 0.0, 0.0\} \\ \text{bsr_row_ptr}[3] & = \{0, 2, 4\} \\ \text{bsr_col_ind}[4] & = \{0, 1, 0, 1\} \end{array}\end{split}\]
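The padding rule and the column-major block layout can be checked with a host-side sketch. This is plain C++ for illustration only and does not use the rocSPARSE API; BsrMatrix and dense_to_bsr are hypothetical names, and the sketch assumes zero based indexing and column-major blocks as in the example above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side container mirroring the BSR arrays described above.
struct BsrMatrix
{
    int                 mb      = 0; // number of block rows
    int                 nb      = 0; // number of block columns
    int                 bsr_dim = 0; // dimension of each block
    std::vector<double> val;         // bsr_val, column-major within each block
    std::vector<int>    row_ptr;     // bsr_row_ptr (zero based)
    std::vector<int>    col_ind;     // bsr_col_ind (zero based)
};

// Build BSR arrays from a dense row-major m x n array.
BsrMatrix dense_to_bsr(const std::vector<double>& dense, int m, int n, int bsr_dim)
{
    assert(bsr_dim > 0);
    assert(m >= 0 && n >= 0);
    assert(dense.size() == static_cast<std::size_t>(m) * n);

    BsrMatrix A;
    A.bsr_dim = bsr_dim;
    A.mb      = (m + bsr_dim - 1) / bsr_dim;
    A.nb      = (n + bsr_dim - 1) / bsr_dim;
    A.row_ptr.push_back(0);

    // Read an entry of the zero-padded matrix
    auto at = [&](int i, int j) {
        return (i < m && j < n) ? dense[static_cast<std::size_t>(i) * n + j] : 0.0;
    };

    for(int bi = 0; bi < A.mb; ++bi)
    {
        for(int bj = 0; bj < A.nb; ++bj)
        {
            // A block is stored only if it contains at least one non-zero
            bool nonzero = false;
            for(int r = 0; r < bsr_dim && !nonzero; ++r)
                for(int c = 0; c < bsr_dim && !nonzero; ++c)
                    nonzero = at(bi * bsr_dim + r, bj * bsr_dim + c) != 0.0;
            if(!nonzero)
                continue;

            A.col_ind.push_back(bj);

            // Column-major storage: walk columns first, then rows
            for(int c = 0; c < bsr_dim; ++c)
                for(int r = 0; r < bsr_dim; ++r)
                    A.val.push_back(at(bi * bsr_dim + r, bj * bsr_dim + c));
        }
        A.row_ptr.push_back(static_cast<int>(A.col_ind.size()));
    }
    return A;
}
```

Applied to the 4 x 3 matrix above with bsr_dim = 2, the sketch pads one zero column and produces mb = 2, nb = 2 and the arrays listed.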

ELL storage format

The Ellpack-Itpack (ELL) storage format represents a \(m \times n\) matrix by

m

number of rows (integer).

n

number of columns (integer).

ell_width

maximum number of non-zero elements per row (integer)

ell_val

array of m times ell_width elements containing the data (floating point).

ell_col_ind

array of m times ell_width elements containing the column indices (integer).

The ELL matrix is assumed to be stored in column-major format. Rows with fewer than ell_width non-zero elements are padded with zeros (ell_val) and \(-1\) (ell_col_ind). Consider the following \(3 \times 5\) matrix and the corresponding ELL structures, with \(m = 3, n = 5\) and \(\text{ell_width} = 3\) using zero based indexing:

\[\begin{split}A = \begin{pmatrix} 1.0 & 2.0 & 0.0 & 3.0 & 0.0 \\ 0.0 & 4.0 & 5.0 & 0.0 & 0.0 \\ 6.0 & 0.0 & 0.0 & 7.0 & 8.0 \\ \end{pmatrix}\end{split}\]

where

\[\begin{split}\begin{array}{ll} \text{ell_val}[9] & = \{1.0, 4.0, 6.0, 2.0, 5.0, 7.0, 3.0, 0.0, 8.0\} \\ \text{ell_col_ind}[9] & = \{0, 1, 0, 1, 2, 3, 3, -1, 4\} \end{array}\end{split}\]
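The column-major layout and the padding convention can be illustrated by converting a zero-based CSR matrix to ELL on the host. The sketch below is plain C++ and not a rocSPARSE routine; EllMatrix and csr_to_ell are hypothetical names. Entry p of row i is placed at position p * m + i, and unused slots keep the padding values 0.0 and -1.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side container mirroring the ELL arrays described above.
struct EllMatrix
{
    int                 m         = 0; // number of rows
    int                 ell_width = 0; // maximum number of non-zeros per row
    std::vector<double> val;           // ell_val, column-major
    std::vector<int>    col_ind;       // ell_col_ind, column-major
};

// Convert a zero-based CSR matrix to ELL.
EllMatrix csr_to_ell(const std::vector<int>&    csr_row_ptr,
                     const std::vector<int>&    csr_col_ind,
                     const std::vector<double>& csr_val)
{
    assert(!csr_row_ptr.empty());
    assert(csr_col_ind.size() == csr_val.size());

    EllMatrix A;
    A.m = static_cast<int>(csr_row_ptr.size()) - 1;

    // ell_width is the length of the longest row
    for(int i = 0; i < A.m; ++i)
    {
        A.ell_width = std::max(A.ell_width, csr_row_ptr[i + 1] - csr_row_ptr[i]);
    }

    // Initialize every slot with the padding values
    std::size_t size = static_cast<std::size_t>(A.m) * A.ell_width;
    A.val.assign(size, 0.0);
    A.col_ind.assign(size, -1);

    for(int i = 0; i < A.m; ++i)
    {
        int p = 0;
        for(int k = csr_row_ptr[i]; k < csr_row_ptr[i + 1]; ++k, ++p)
        {
            // Column-major: the p-th entries of all rows are contiguous
            std::size_t idx = static_cast<std::size_t>(p) * A.m + i;
            A.val[idx]      = csr_val[k];
            A.col_ind[idx]  = csr_col_ind[k];
        }
    }
    return A;
}
```

For the matrix above, the second row has only two non-zero elements, so its third slot (position 7) holds the padding pair 0.0 and -1.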

HYB storage format

The Hybrid (HYB) storage format represents a \(m \times n\) matrix by

m

number of rows (integer).

n

number of columns (integer).

nnz

number of non-zero elements of the COO part (integer)

ell_width

maximum number of non-zero elements per row of the ELL part (integer)

ell_val

array of m times ell_width elements containing the ELL part data (floating point).

ell_col_ind

array of m times ell_width elements containing the ELL part column indices (integer).

coo_val

array of nnz elements containing the COO part data (floating point).

coo_row_ind

array of nnz elements containing the COO part row indices (integer).

coo_col_ind

array of nnz elements containing the COO part column indices (integer).

The HYB format is a combination of the ELL and COO sparse matrix formats. Typically, the regular part of the matrix is stored in ELL storage format, and the irregular part of the matrix is stored in COO storage format. Three different partitioning schemes can be applied when converting a CSR matrix to a matrix in HYB storage format. For further details on the partitioning schemes, see rocsparse_hyb_partition.
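As a rough illustration of how the two parts fit together, the following host-side sketch splits a zero-based CSR matrix using a user-given ELL width: the first ell_width entries of each row go to the ELL part and any remaining entries go to the COO part. This is an assumption about one plausible partitioning, written in plain C++; it is not the library's implementation, and HybMatrix and csr_to_hyb are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host-side container mirroring the HYB arrays described above.
struct HybMatrix
{
    int                 m         = 0;
    int                 ell_width = 0;
    std::vector<double> ell_val;     // column-major, padded with 0.0
    std::vector<int>    ell_col_ind; // column-major, padded with -1
    std::vector<double> coo_val;
    std::vector<int>    coo_row_ind;
    std::vector<int>    coo_col_ind;
};

// Split a zero-based CSR matrix into an ELL part of the given width
// and a COO part holding the overflow of each row (assumed scheme).
HybMatrix csr_to_hyb(const std::vector<int>&    csr_row_ptr,
                     const std::vector<int>&    csr_col_ind,
                     const std::vector<double>& csr_val,
                     int                        ell_width)
{
    assert(!csr_row_ptr.empty());
    assert(csr_col_ind.size() == csr_val.size());
    assert(ell_width >= 0);

    HybMatrix A;
    A.m         = static_cast<int>(csr_row_ptr.size()) - 1;
    A.ell_width = ell_width;

    std::size_t size = static_cast<std::size_t>(A.m) * ell_width;
    A.ell_val.assign(size, 0.0);
    A.ell_col_ind.assign(size, -1);

    for(int i = 0; i < A.m; ++i)
    {
        int p = 0;
        for(int k = csr_row_ptr[i]; k < csr_row_ptr[i + 1]; ++k)
        {
            if(p < ell_width)
            {
                // Regular part: fits into the ELL structure
                std::size_t idx    = static_cast<std::size_t>(p) * A.m + i;
                A.ell_val[idx]     = csr_val[k];
                A.ell_col_ind[idx] = csr_col_ind[k];
                ++p;
            }
            else
            {
                // Irregular part: overflow entries are kept in COO
                A.coo_val.push_back(csr_val[k]);
                A.coo_row_ind.push_back(i);
                A.coo_col_ind.push_back(csr_col_ind[k]);
            }
        }
    }
    return A;
}
```

In this sketch, choosing ell_width equal to the longest row leaves the COO part empty, which corresponds to the idea behind the "max" partitioning described under rocsparse_hyb_partition.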

Types

rocsparse_handle

typedef struct _rocsparse_handle *rocsparse_handle

Handle to the rocSPARSE library context queue.

The rocSPARSE handle is a structure holding the rocSPARSE library context. It must be initialized using rocsparse_create_handle() and the returned handle must be passed to all subsequent library function calls. It should be destroyed at the end using rocsparse_destroy_handle().

rocsparse_mat_descr

typedef struct _rocsparse_mat_descr *rocsparse_mat_descr

Descriptor of the matrix.

The rocSPARSE matrix descriptor is a structure holding all properties of a matrix. It must be initialized using rocsparse_create_mat_descr() and the returned descriptor must be passed to all subsequent library calls that involve the matrix. It should be destroyed at the end using rocsparse_destroy_mat_descr().

rocsparse_mat_info

typedef struct _rocsparse_mat_info *rocsparse_mat_info

Info structure to hold all matrix meta data.

The rocSPARSE matrix info is a structure holding all matrix information that is gathered during analysis routines. It must be initialized using rocsparse_create_mat_info() and the returned info structure must be passed to all subsequent library calls that require additional matrix information. It should be destroyed at the end using rocsparse_destroy_mat_info().

rocsparse_hyb_mat

typedef struct _rocsparse_hyb_mat *rocsparse_hyb_mat

HYB matrix storage format.

The rocSPARSE HYB matrix structure holds the HYB matrix. It must be initialized using rocsparse_create_hyb_mat() and the returned HYB matrix must be passed to all subsequent library calls that involve the matrix. It should be destroyed at the end using rocsparse_destroy_hyb_mat().

For more details on the HYB format, see HYB storage format.

rocsparse_action

enum rocsparse_action

Specify where the operation is performed.

The rocsparse_action indicates whether the operation is performed on the full matrix, or only on the sparsity pattern of the matrix.

Values:

enumerator rocsparse_action_symbolic

Operate only on indices.

enumerator rocsparse_action_numeric

Operate on data and indices.

rocsparse_hyb_partition

enum rocsparse_hyb_partition

HYB matrix partitioning type.

The rocsparse_hyb_partition type indicates how the hybrid format partitioning between COO and ELL storage formats is performed.

Values:

enumerator rocsparse_hyb_partition_auto

automatically decide on ELL nnz per row.

enumerator rocsparse_hyb_partition_user

user given ELL nnz per row.

enumerator rocsparse_hyb_partition_max

max ELL nnz per row, no COO part.

rocsparse_index_base

enum rocsparse_index_base

Specify the matrix index base.

The rocsparse_index_base indicates the index base of the indices. For a given rocsparse_mat_descr, the rocsparse_index_base can be set using rocsparse_set_mat_index_base(). The current rocsparse_index_base of a matrix can be obtained by rocsparse_get_mat_index_base().

Values:

enumerator rocsparse_index_base_zero

zero based indexing.

enumerator rocsparse_index_base_one

one based indexing.

rocsparse_matrix_type

enum rocsparse_matrix_type

Specify the matrix type.

The rocsparse_matrix_type indicates the type of a matrix. For a given rocsparse_mat_descr, the rocsparse_matrix_type can be set using rocsparse_set_mat_type(). The current rocsparse_matrix_type of a matrix can be obtained by rocsparse_get_mat_type().

Values:

enumerator rocsparse_matrix_type_general

general matrix type.

enumerator rocsparse_matrix_type_symmetric

symmetric matrix type.

enumerator rocsparse_matrix_type_hermitian

hermitian matrix type.

enumerator rocsparse_matrix_type_triangular

triangular matrix type.

rocsparse_fill_mode

enum rocsparse_fill_mode

Specify the matrix fill mode.

The rocsparse_fill_mode indicates whether the lower or the upper part is stored in a sparse triangular matrix. For a given rocsparse_mat_descr, the rocsparse_fill_mode can be set using rocsparse_set_mat_fill_mode(). The current rocsparse_fill_mode of a matrix can be obtained by rocsparse_get_mat_fill_mode().

Values:

enumerator rocsparse_fill_mode_lower

lower triangular part is stored.

enumerator rocsparse_fill_mode_upper

upper triangular part is stored.

rocsparse_diag_type

enum rocsparse_diag_type

Indicates if the diagonal entries are unity.

The rocsparse_diag_type indicates whether the diagonal entries of a matrix are unity or not. If rocsparse_diag_type_unit is specified, all present diagonal values will be ignored. For a given rocsparse_mat_descr, the rocsparse_diag_type can be set using rocsparse_set_mat_diag_type(). The current rocsparse_diag_type of a matrix can be obtained by rocsparse_get_mat_diag_type().

Values:

enumerator rocsparse_diag_type_non_unit

diagonal entries are non-unity.

enumerator rocsparse_diag_type_unit

diagonal entries are unity

rocsparse_operation

enum rocsparse_operation

Specify whether the matrix is to be transposed or not.

The rocsparse_operation indicates the operation performed with the given matrix.

Values:

enumerator rocsparse_operation_none

Operate with matrix.

enumerator rocsparse_operation_transpose

Operate with transpose.

enumerator rocsparse_operation_conjugate_transpose

Operate with conj. transpose.

rocsparse_pointer_mode

enum rocsparse_pointer_mode

Indicates if the pointer is device pointer or host pointer.

The rocsparse_pointer_mode indicates whether scalar values are passed by reference on the host or device. The rocsparse_pointer_mode can be changed by rocsparse_set_pointer_mode(). The currently used pointer mode can be obtained by rocsparse_get_pointer_mode().

Values:

enumerator rocsparse_pointer_mode_host

scalar pointers are in host memory.

enumerator rocsparse_pointer_mode_device

scalar pointers are in device memory.

rocsparse_analysis_policy

enum rocsparse_analysis_policy

Specify policy in analysis functions.

The rocsparse_analysis_policy specifies whether gathered analysis data should be re-used or not. If meta data from a previous call to an analysis function, e.g. rocsparse_csrilu0_analysis(), is available, it can be re-used for subsequent calls to e.g. rocsparse_csrsv_analysis() and greatly improve the performance of the analysis function.

Values:

enumerator rocsparse_analysis_policy_reuse

try to re-use meta data.

enumerator rocsparse_analysis_policy_force

force to re-build meta data.

rocsparse_solve_policy

enum rocsparse_solve_policy

Specify policy in triangular solvers and factorizations.

This is a placeholder.

Values:

enumerator rocsparse_solve_policy_auto

automatically decide on level information.

rocsparse_layer_mode

enum rocsparse_layer_mode

Indicates if layer is active with bitmask.

The rocsparse_layer_mode bit mask indicates the logging characteristics.

Values:

enumerator rocsparse_layer_mode_none

layer is not active.

enumerator rocsparse_layer_mode_log_trace

layer is in logging mode.

enumerator rocsparse_layer_mode_log_bench

layer is in benchmarking mode.

For more details on logging, see Logging.

rocsparse_status

enum rocsparse_status

List of rocsparse status codes definition.

This is a list of the rocsparse_status types that are used by the rocSPARSE library.

Values:

enumerator rocsparse_status_success

success.

enumerator rocsparse_status_invalid_handle

handle not initialized, invalid or null.

enumerator rocsparse_status_not_implemented

function is not implemented.

enumerator rocsparse_status_invalid_pointer

invalid pointer parameter.

enumerator rocsparse_status_invalid_size

invalid size parameter.

enumerator rocsparse_status_memory_error

failed memory allocation, copy, dealloc.

enumerator rocsparse_status_internal_error

other internal library failure.

enumerator rocsparse_status_invalid_value

invalid value parameter.

enumerator rocsparse_status_arch_mismatch

device arch is not supported.

enumerator rocsparse_status_zero_pivot

encountered zero pivot.

Logging

Three different environment variables can be set to enable logging in rocSPARSE: ROCSPARSE_LAYER, ROCSPARSE_LOG_TRACE_PATH and ROCSPARSE_LOG_BENCH_PATH.

ROCSPARSE_LAYER is a bit mask, where several logging modes (rocsparse_layer_mode) can be combined as follows:

ROCSPARSE_LAYER unset

logging is disabled.

ROCSPARSE_LAYER set to 1

trace logging is enabled.

ROCSPARSE_LAYER set to 2

bench logging is enabled.

ROCSPARSE_LAYER set to 3

trace logging and bench logging are enabled.

When logging is enabled, each rocSPARSE function call will write the function name as well as function arguments to the logging stream. The default logging stream is stderr.

If the user sets the environment variable ROCSPARSE_LOG_TRACE_PATH to the full path name for a file, the file is opened and trace logging is streamed to that file. If the user sets the environment variable ROCSPARSE_LOG_BENCH_PATH to the full path name for a file, the file is opened and bench logging is streamed to that file. If the file cannot be opened, logging output is streamed to stderr.

Note that performance will degrade when logging is enabled. By default, the environment variable ROCSPARSE_LAYER is unset and logging is disabled.
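The bit mask semantics of ROCSPARSE_LAYER can be sketched in plain C++ as follows. This only mirrors the value table above (bit 1 for trace logging, bit 2 for bench logging) and does not reproduce how the library itself parses the variable; LayerMode, decode_layer_mode and layer_mode_from_env are hypothetical names.

```cpp
#include <cassert>
#include <cstdlib>

// Hypothetical decoded view of the ROCSPARSE_LAYER bit mask.
struct LayerMode
{
    bool trace; // trace logging enabled
    bool bench; // bench logging enabled
};

// Decode a mask in the documented range 0..3.
LayerMode decode_layer_mode(int mask)
{
    assert(mask >= 0 && mask <= 3);

    LayerMode mode;
    mode.trace = (mask & 1) != 0;
    mode.bench = (mask & 2) != 0;
    return mode;
}

// Read ROCSPARSE_LAYER from the environment; an unset variable means
// logging is disabled. Bits outside the documented range are ignored here.
LayerMode layer_mode_from_env()
{
    const char* value = std::getenv("ROCSPARSE_LAYER");
    int         mask  = (value != nullptr) ? (std::atoi(value) & 3) : 0;
    return decode_layer_mode(mask);
}
```

For example, a value of 3 sets both bits, so trace logging and bench logging are enabled together.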

Exported Sparse Functions

Sparse Level 1 Functions

Function name          single    double    single complex    double complex
rocsparse_Xaxpyi()     x         x         x                 x
rocsparse_Xdoti()      x         x         x                 x
rocsparse_Xdotci()                         x                 x
rocsparse_Xgthr()      x         x         x                 x
rocsparse_Xgthrz()     x         x         x                 x
rocsparse_Xroti()      x         x
rocsparse_Xsctr()      x         x         x                 x

Sparse Level 3 Functions

Function name                     single    double    single complex    double complex
rocsparse_Xbsrmm()                x         x         x                 x
rocsparse_Xcsrmm()                x         x         x                 x
rocsparse_Xcsrsm_buffer_size()    x         x         x                 x
rocsparse_Xcsrsm_analysis()       x         x         x                 x
rocsparse_csrsm_zero_pivot()
rocsparse_csrsm_clear()
rocsparse_Xcsrsm_solve()          x         x         x                 x
rocsparse_Xgemmi()                x         x         x                 x

Sparse Extra Functions

Function name                       single    double    single complex    double complex
rocsparse_csrgeam_nnz()
rocsparse_Xcsrgeam()                x         x         x                 x
rocsparse_Xcsrgemm_buffer_size()    x         x         x                 x
rocsparse_csrgemm_nnz()
rocsparse_Xcsrgemm()                x         x         x                 x

Storage schemes and indexing base

rocSPARSE supports 0 and 1 based indexing. The index base is selected by the rocsparse_index_base type which is either passed as standalone parameter or as part of the rocsparse_mat_descr type.

Furthermore, dense vectors are represented with a 1D array, stored linearly in memory. Sparse vectors are represented by a 1D data array stored linearly in memory that holds all non-zero elements and a 1D indexing array stored linearly in memory that holds the positions of the corresponding non-zero elements.
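The sparse vector representation can be illustrated with a host-side dot product between a sparse vector and a dense vector. The sketch below is plain C++ and only demonstrates the data layout and the role of the index base; it is not the rocSPARSE implementation, and sparse_dot is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dot product of a sparse vector x, given by its non-zero values (x_val)
// and their positions (x_ind), with a dense vector y.
// 'base' is the index base (0 or 1) of x_ind.
double sparse_dot(const std::vector<double>& x_val,
                  const std::vector<int>&    x_ind,
                  const std::vector<double>& y,
                  int                        base)
{
    assert(x_val.size() == x_ind.size());
    assert(base == 0 || base == 1);

    double result = 0.0;
    for(std::size_t i = 0; i < x_val.size(); ++i)
    {
        // Translate the stored index to a zero-based position in y
        int pos = x_ind[i] - base;
        assert(pos >= 0 && static_cast<std::size_t>(pos) < y.size());
        result += x_val[i] * y[pos];
    }
    return result;
}
```

The same sparse vector can be described with zero based indices {0, 3, 5} or one based indices {1, 4, 6}; both give the same result as long as the matching index base is passed.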

Pointer mode

The auxiliary functions rocsparse_set_pointer_mode() and rocsparse_get_pointer_mode() are used to set and get the value of the state variable rocsparse_pointer_mode. If rocsparse_pointer_mode is equal to rocsparse_pointer_mode_host, then scalar parameters must be allocated on the host. If rocsparse_pointer_mode is equal to rocsparse_pointer_mode_device, then scalar parameters must be allocated on the device.

There are two types of scalar parameter:

  1. Scaling parameters, such as alpha and beta used in e.g. rocsparse_scsrmv(), rocsparse_scoomv(), …

  2. Scalar results from functions such as rocsparse_sdoti(), rocsparse_cdotci(), …

For scalar parameters such as alpha and beta, memory can be allocated on the host heap or stack, when rocsparse_pointer_mode is equal to rocsparse_pointer_mode_host. The kernel launch is asynchronous, and if the scalar parameter is on the heap, it can be freed after the return from the kernel launch. When rocsparse_pointer_mode is equal to rocsparse_pointer_mode_device, the scalar parameter must not be changed till the kernel completes.

For scalar results, when rocsparse_pointer_mode is equal to rocsparse_pointer_mode_host, the function blocks the CPU till the GPU has copied the result back to the host. Using rocsparse_pointer_mode equal to rocsparse_pointer_mode_device, the function will return after the asynchronous launch. Similarly to vector and matrix results, the scalar result is only available when the kernel has completed execution.

Asynchronous API

Except for functions that allocate memory internally, which prevents asynchronicity, all rocSPARSE functions are configured to operate in a non-blocking fashion with respect to the CPU, meaning these library functions return immediately.

hipSPARSE

hipSPARSE is a SPARSE marshalling library, with multiple supported backends. It sits between the application and a worker SPARSE library, marshalling inputs into the backend library and marshalling results back to the application. hipSPARSE exports an interface that does not require the client to change, regardless of the chosen backend. Currently, hipSPARSE supports rocSPARSE and cuSPARSE as backends. hipSPARSE focuses on convenience and portability. If performance outweighs these factors, then using rocSPARSE itself is highly recommended. hipSPARSE can be found on GitHub.

Sparse Auxiliary Functions

This module holds all sparse auxiliary functions.

The functions that are contained in the auxiliary module describe all available helper functions that are required for subsequent library calls.

rocsparse_create_handle()

rocsparse_status rocsparse_create_handle(rocsparse_handle *handle)

Create a rocsparse handle.

rocsparse_create_handle creates the rocSPARSE library context. It must be initialized before any other rocSPARSE API function is invoked and must be passed to all subsequent library function calls. The handle should be destroyed at the end using rocsparse_destroy_handle().

Parameters
  • [out] handle: the pointer to the handle to the rocSPARSE library context.

Return Value
  • rocsparse_status_success: the initialization succeeded.

  • rocsparse_status_invalid_handle: handle pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

rocsparse_destroy_handle()

rocsparse_status rocsparse_destroy_handle(rocsparse_handle handle)

Destroy a rocsparse handle.

rocsparse_destroy_handle destroys the rocSPARSE library context and releases all resources used by the rocSPARSE library.

Parameters
  • [in] handle: the handle to the rocSPARSE library context.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: handle is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

rocsparse_set_stream()

rocsparse_status rocsparse_set_stream(rocsparse_handle handle, hipStream_t stream)

Specify user defined HIP stream.

rocsparse_set_stream specifies the stream to be used by the rocSPARSE library context and all subsequent function calls.

Example

This example illustrates how a user-defined stream can be used in rocSPARSE.

// Create rocSPARSE handle
rocsparse_handle handle;
rocsparse_create_handle(&handle);

// Create stream
hipStream_t stream;
hipStreamCreate(&stream);

// Set stream to rocSPARSE handle
rocsparse_set_stream(handle, stream);

// Do some work
// ...

// Clean up
rocsparse_destroy_handle(handle);
hipStreamDestroy(stream);

Parameters
  • [inout] handle: the handle to the rocSPARSE library context.

  • [in] stream: the stream to be used by the rocSPARSE library context.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: handle is invalid.

rocsparse_get_stream()

rocsparse_status rocsparse_get_stream(rocsparse_handle handle, hipStream_t *stream)

Get current stream from library context.

rocsparse_get_stream gets the rocSPARSE library context stream which is currently used for all subsequent function calls.

Parameters
  • [in] handle: the handle to the rocSPARSE library context.

  • [out] stream: the stream currently used by the rocSPARSE library context.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: handle is invalid.

rocsparse_set_pointer_mode()

rocsparse_status rocsparse_set_pointer_mode(rocsparse_handle handle, rocsparse_pointer_mode pointer_mode)

Specify pointer mode.

rocsparse_set_pointer_mode specifies the pointer mode to be used by the rocSPARSE library context and all subsequent function calls. By default, all values are passed by reference on the host. Valid pointer modes are rocsparse_pointer_mode_host or rocsparse_pointer_mode_device.

Parameters
  • [in] handle: the handle to the rocSPARSE library context.

  • [in] pointer_mode: the pointer mode to be used by the rocSPARSE library context.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: handle is invalid.

rocsparse_get_pointer_mode()

rocsparse_status rocsparse_get_pointer_mode(rocsparse_handle handle, rocsparse_pointer_mode *pointer_mode)

Get current pointer mode from library context.

rocsparse_get_pointer_mode gets the rocSPARSE library context pointer mode which is currently used for all subsequent function calls.

Parameters
  • [in] handle: the handle to the rocSPARSE library context.

  • [out] pointer_mode: the pointer mode that is currently used by the rocSPARSE library context.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: handle is invalid.

rocsparse_get_version()

rocsparse_status rocsparse_get_version(rocsparse_handle handle, int *version)

Get rocSPARSE version.

rocsparse_get_version gets the rocSPARSE library version number.

  • patch = version % 100

  • minor = version / 100 % 1000

  • major = version / 100000

Parameters
  • [in] handle: the handle to the rocSPARSE library context.

  • [out] version: the version number of the rocSPARSE library.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: handle is invalid.
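
The decoding rules listed for rocsparse_get_version can be sketched as a small host-side helper. The names version_parts and decode_version are hypothetical and not part of rocSPARSE; the argument stands for the integer that rocsparse_get_version writes to version.

```c
#include <assert.h>

/* Hypothetical helper type, not part of the rocSPARSE API. */
typedef struct
{
    int major;
    int minor;
    int patch;
} version_parts;

/* Split the packed version integer using the documented formulas. */
version_parts decode_version(int version)
{
    version_parts v;

    assert(version >= 0);

    v.patch = version % 100;
    v.minor = version / 100 % 1000;
    v.major = version / 100000;

    return v;
}
```

Under these formulas, a packed value of 201703 decodes to major 2, minor 17 and patch 3.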

rocsparse_get_git_rev()

rocsparse_status rocsparse_get_git_rev(rocsparse_handle handle, char *rev)

Get rocSPARSE git revision.

rocsparse_get_git_rev gets the rocSPARSE library git commit revision (SHA-1).

Parameters
  • [in] handle: the handle to the rocSPARSE library context.

  • [out] rev: the git commit revision (SHA-1).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: handle is invalid.

rocsparse_create_mat_descr()

rocsparse_status rocsparse_create_mat_descr(rocsparse_mat_descr *descr)

Create a matrix descriptor.

rocsparse_create_mat_descr creates a matrix descriptor. It initializes rocsparse_matrix_type to rocsparse_matrix_type_general and rocsparse_index_base to rocsparse_index_base_zero. It should be destroyed at the end using rocsparse_destroy_mat_descr().

Parameters
  • [out] descr: the pointer to the matrix descriptor.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: descr pointer is invalid.

rocsparse_destroy_mat_descr()

rocsparse_status rocsparse_destroy_mat_descr(rocsparse_mat_descr descr)

Destroy a matrix descriptor.

rocsparse_destroy_mat_descr destroys a matrix descriptor and releases all resources used by the descriptor.

Parameters
  • [in] descr: the matrix descriptor.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: descr is invalid.

rocsparse_copy_mat_descr()

rocsparse_status rocsparse_copy_mat_descr(rocsparse_mat_descr dest, const rocsparse_mat_descr src)

Copy a matrix descriptor.

rocsparse_copy_mat_descr copies a matrix descriptor. Both the source and the destination matrix descriptors must be initialized prior to calling rocsparse_copy_mat_descr.

Parameters
  • [out] dest: the pointer to the destination matrix descriptor.

  • [in] src: the pointer to the source matrix descriptor.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: src or dest pointer is invalid.

rocsparse_set_mat_index_base()

rocsparse_status rocsparse_set_mat_index_base(rocsparse_mat_descr descr, rocsparse_index_base base)

Specify the index base of a matrix descriptor.

rocsparse_set_mat_index_base sets the index base of a matrix descriptor. Valid options are rocsparse_index_base_zero or rocsparse_index_base_one.

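Parameters
  • [inout] descr: the matrix descriptor.

  • [in] base: rocsparse_index_base_zero or rocsparse_index_base_one.

Return Value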
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: descr pointer is invalid.

  • rocsparse_status_invalid_value: base is invalid.

rocsparse_get_mat_index_base()

rocsparse_index_base rocsparse_get_mat_index_base(const rocsparse_mat_descr descr)

Get the index base of a matrix descriptor.

rocsparse_get_mat_index_base returns the index base of a matrix descriptor.

Return

rocsparse_index_base_zero or rocsparse_index_base_one.

Parameters
  • [in] descr: the matrix descriptor.

rocsparse_set_mat_type()

rocsparse_status rocsparse_set_mat_type(rocsparse_mat_descr descr, rocsparse_matrix_type type)

Specify the matrix type of a matrix descriptor.

rocsparse_set_mat_type sets the matrix type of a matrix descriptor. Valid matrix types are rocsparse_matrix_type_general, rocsparse_matrix_type_symmetric, rocsparse_matrix_type_hermitian or rocsparse_matrix_type_triangular.

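Parameters
  • [inout] descr: the matrix descriptor.

  • [in] type: rocsparse_matrix_type_general, rocsparse_matrix_type_symmetric, rocsparse_matrix_type_hermitian or rocsparse_matrix_type_triangular.

Return Value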
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: descr pointer is invalid.

  • rocsparse_status_invalid_value: type is invalid.

rocsparse_get_mat_type()

rocsparse_matrix_type rocsparse_get_mat_type(const rocsparse_mat_descr descr)

Get the matrix type of a matrix descriptor.

rocsparse_get_mat_type returns the matrix type of a matrix descriptor.

Return

rocsparse_matrix_type_general, rocsparse_matrix_type_symmetric, rocsparse_matrix_type_hermitian or rocsparse_matrix_type_triangular.

Parameters
  • [in] descr: the matrix descriptor.

rocsparse_set_mat_fill_mode()

rocsparse_status rocsparse_set_mat_fill_mode(rocsparse_mat_descr descr, rocsparse_fill_mode fill_mode)

Specify the matrix fill mode of a matrix descriptor.

rocsparse_set_mat_fill_mode sets the matrix fill mode of a matrix descriptor. Valid fill modes are rocsparse_fill_mode_lower or rocsparse_fill_mode_upper.

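Parameters
  • [inout] descr: the matrix descriptor.

  • [in] fill_mode: rocsparse_fill_mode_lower or rocsparse_fill_mode_upper.

Return Value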
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: descr pointer is invalid.

  • rocsparse_status_invalid_value: fill_mode is invalid.

rocsparse_get_mat_fill_mode()

rocsparse_fill_mode rocsparse_get_mat_fill_mode(const rocsparse_mat_descr descr)

Get the matrix fill mode of a matrix descriptor.

rocsparse_get_mat_fill_mode returns the matrix fill mode of a matrix descriptor.

Return

rocsparse_fill_mode_lower or rocsparse_fill_mode_upper.

Parameters
  • [in] descr: the matrix descriptor.

rocsparse_set_mat_diag_type()

rocsparse_status rocsparse_set_mat_diag_type(rocsparse_mat_descr descr, rocsparse_diag_type diag_type)

Specify the matrix diagonal type of a matrix descriptor.

rocsparse_set_mat_diag_type sets the matrix diagonal type of a matrix descriptor. Valid diagonal types are rocsparse_diag_type_unit or rocsparse_diag_type_non_unit.

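Parameters
  • [inout] descr: the matrix descriptor.

  • [in] diag_type: rocsparse_diag_type_unit or rocsparse_diag_type_non_unit.

Return Value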
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: descr pointer is invalid.

  • rocsparse_status_invalid_value: diag_type is invalid.

rocsparse_get_mat_diag_type()

rocsparse_diag_type rocsparse_get_mat_diag_type(const rocsparse_mat_descr descr)

Get the matrix diagonal type of a matrix descriptor.

rocsparse_get_mat_diag_type returns the matrix diagonal type of a matrix descriptor.

Return

rocsparse_diag_type_unit or rocsparse_diag_type_non_unit.

Parameters
  • [in] descr: the matrix descriptor.

rocsparse_create_hyb_mat()

rocsparse_status rocsparse_create_hyb_mat(rocsparse_hyb_mat *hyb)

Create a HYB matrix structure.

rocsparse_create_hyb_mat creates a structure that holds the matrix in HYB storage format. It should be destroyed at the end using rocsparse_destroy_hyb_mat().

Parameters
  • [inout] hyb: the pointer to the hybrid matrix.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: hyb pointer is invalid.

rocsparse_destroy_hyb_mat()

rocsparse_status rocsparse_destroy_hyb_mat(rocsparse_hyb_mat hyb)

Destroy a HYB matrix structure.

rocsparse_destroy_hyb_mat destroys a HYB structure.

Parameters
  • [in] hyb: the hybrid matrix structure.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: hyb pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

rocsparse_create_mat_info()

rocsparse_status rocsparse_create_mat_info(rocsparse_mat_info *info)

Create a matrix info structure.

rocsparse_create_mat_info creates a structure that holds the matrix info data gathered by the available analysis routines. It should be destroyed at the end using rocsparse_destroy_mat_info().

Parameters
  • [inout] info: the pointer to the info structure.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: info pointer is invalid.

rocsparse_destroy_mat_info()

rocsparse_status rocsparse_destroy_mat_info(rocsparse_mat_info info)

Destroy a matrix info structure.

rocsparse_destroy_mat_info destroys a matrix info structure.

Parameters
  • [in] info: the info structure.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_pointer: info pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

Sparse Level 1 Functions

The sparse level 1 routines describe operations between a vector in sparse format and a vector in dense format. This section describes all rocSPARSE level 1 sparse linear algebra functions.

rocsparse_axpyi()

rocsparse_status rocsparse_saxpyi(rocsparse_handle handle, rocsparse_int nnz, const float *alpha, const float *x_val, const rocsparse_int *x_ind, float *y, rocsparse_index_base idx_base)
rocsparse_status rocsparse_daxpyi(rocsparse_handle handle, rocsparse_int nnz, const double *alpha, const double *x_val, const rocsparse_int *x_ind, double *y, rocsparse_index_base idx_base)
rocsparse_status rocsparse_caxpyi(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_float_complex *x_val, const rocsparse_int *x_ind, rocsparse_float_complex *y, rocsparse_index_base idx_base)
rocsparse_status rocsparse_zaxpyi(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_double_complex *x_val, const rocsparse_int *x_ind, rocsparse_double_complex *y, rocsparse_index_base idx_base)

Scale a sparse vector and add it to a dense vector.

rocsparse_axpyi multiplies the sparse vector \(x\) with scalar \(\alpha\) and adds the result to the dense vector \(y\), such that

\[ y := y + \alpha \cdot x \]

for(i = 0; i < nnz; ++i)
{
    y[x_ind[i]] = y[x_ind[i]] + alpha * x_val[i];
}

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] nnz: number of non-zero entries of vector \(x\).

  • [in] alpha: scalar \(\alpha\).

  • [in] x_val: array of nnz elements containing the values of \(x\).

  • [in] x_ind: array of nnz elements containing the indices of the non-zero values of \(x\).

  • [inout] y: array of values in dense format.

  • [in] idx_base: rocsparse_index_base_zero or rocsparse_index_base_one.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_value: idx_base is invalid.

  • rocsparse_status_invalid_size: nnz is invalid.

  • rocsparse_status_invalid_pointer: alpha, x_val, x_ind or y pointer is invalid.
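
The loop above can be written out as a host-side reference, which may be useful for checking device results. This is a minimal sketch, not the library implementation: axpyi_ref is a hypothetical name, and the integer base (0 or 1) stands in for idx_base under the assumption that it is subtracted from every stored index.

```c
#include <assert.h>

/* CPU reference for y := y + alpha * x with a sparse x.
 * base is 0 or 1 and is subtracted from each stored index (assumption). */
void axpyi_ref(int nnz, float alpha, const float *x_val, const int *x_ind,
               float *y, int base)
{
    assert(nnz >= 0);
    assert(base == 0 || base == 1);

    for(int i = 0; i < nnz; ++i)
    {
        y[x_ind[i] - base] += alpha * x_val[i];
    }
}
```

Entries of y whose indices do not appear in x_ind are left untouched.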

rocsparse_doti()

rocsparse_status rocsparse_sdoti(rocsparse_handle handle, rocsparse_int nnz, const float *x_val, const rocsparse_int *x_ind, const float *y, float *result, rocsparse_index_base idx_base)
rocsparse_status rocsparse_ddoti(rocsparse_handle handle, rocsparse_int nnz, const double *x_val, const rocsparse_int *x_ind, const double *y, double *result, rocsparse_index_base idx_base)
rocsparse_status rocsparse_cdoti(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_float_complex *x_val, const rocsparse_int *x_ind, const rocsparse_float_complex *y, rocsparse_float_complex *result, rocsparse_index_base idx_base)
rocsparse_status rocsparse_zdoti(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_double_complex *x_val, const rocsparse_int *x_ind, const rocsparse_double_complex *y, rocsparse_double_complex *result, rocsparse_index_base idx_base)

Compute the dot product of a sparse vector with a dense vector.

rocsparse_doti computes the dot product of the sparse vector \(x\) with the dense vector \(y\), such that

\[ \text{result} := y^T x \]

for(i = 0; i < nnz; ++i)
{
    result += x_val[i] * y[x_ind[i]];
}

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] nnz: number of non-zero entries of vector \(x\).

  • [in] x_val: array of nnz values.

  • [in] x_ind: array of nnz elements containing the indices of the non-zero values of \(x\).

  • [in] y: array of values in dense format.

  • [out] result: pointer to the result, can be in host or device memory.

  • [in] idx_base: rocsparse_index_base_zero or rocsparse_index_base_one.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_value: idx_base is invalid.

  • rocsparse_status_invalid_size: nnz is invalid.

  • rocsparse_status_invalid_pointer: x_val, x_ind, y or result pointer is invalid.

  • rocsparse_status_memory_error: the buffer for the dot product reduction could not be allocated.

  • rocsparse_status_internal_error: an internal error occurred.
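
A host-side reference of the dot product makes the accumulation explicit, including the initial value of zero that the loop above leaves implicit. This is a sketch under assumptions: doti_ref is a hypothetical name, and base (0 or 1) stands in for idx_base.

```c
#include <assert.h>

/* CPU reference for result := y^T x with a sparse x.
 * base is 0 or 1 and is subtracted from each stored index (assumption). */
double doti_ref(int nnz, const double *x_val, const int *x_ind,
                const double *y, int base)
{
    double result = 0.0; /* accumulator starts at zero */

    assert(nnz >= 0);
    assert(base == 0 || base == 1);

    for(int i = 0; i < nnz; ++i)
    {
        result += x_val[i] * y[x_ind[i] - base];
    }

    return result;
}
```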

rocsparse_dotci()

rocsparse_status rocsparse_cdotci(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_float_complex *x_val, const rocsparse_int *x_ind, const rocsparse_float_complex *y, rocsparse_float_complex *result, rocsparse_index_base idx_base)
rocsparse_status rocsparse_zdotci(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_double_complex *x_val, const rocsparse_int *x_ind, const rocsparse_double_complex *y, rocsparse_double_complex *result, rocsparse_index_base idx_base)

Compute the dot product of a complex conjugate sparse vector with a dense vector.

rocsparse_dotci computes the dot product of the complex conjugate sparse vector \(x\) with the dense vector \(y\), such that

\[ \text{result} := x^H y \]

for(i = 0; i < nnz; ++i)
{
    result += conj(x_val[i]) * y[x_ind[i]];
}

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] nnz: number of non-zero entries of vector \(x\).

  • [in] x_val: array of nnz values.

  • [in] x_ind: array of nnz elements containing the indices of the non-zero values of \(x\).

  • [in] y: array of values in dense format.

  • [out] result: pointer to the result, can be in host or device memory.

  • [in] idx_base: rocsparse_index_base_zero or rocsparse_index_base_one.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_value: idx_base is invalid.

  • rocsparse_status_invalid_size: nnz is invalid.

  • rocsparse_status_invalid_pointer: x_val, x_ind, y or result pointer is invalid.

  • rocsparse_status_memory_error: the buffer for the dot product reduction could not be allocated.

  • rocsparse_status_internal_error: an internal error occurred.
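
The conjugated variant can be sketched on the host with C99 complex arithmetic. The name dotci_ref is hypothetical, double complex is used in place of rocsparse_double_complex for illustration only, and base (0 or 1) stands in for idx_base.

```c
#include <assert.h>
#include <complex.h>

/* CPU reference: sum over i of conj(x_val[i]) * y[x_ind[i]].
 * base is 0 or 1 and is subtracted from each stored index (assumption). */
double complex dotci_ref(int nnz, const double complex *x_val,
                         const int *x_ind, const double complex *y, int base)
{
    double complex result = 0.0;

    assert(nnz >= 0);
    assert(base == 0 || base == 1);

    for(int i = 0; i < nnz; ++i)
    {
        result += conj(x_val[i]) * y[x_ind[i] - base];
    }

    return result;
}
```

Only the sparse vector is conjugated; the dense vector enters unchanged.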

rocsparse_gthr()

rocsparse_status rocsparse_sgthr(rocsparse_handle handle, rocsparse_int nnz, const float *y, float *x_val, const rocsparse_int *x_ind, rocsparse_index_base idx_base)
rocsparse_status rocsparse_dgthr(rocsparse_handle handle, rocsparse_int nnz, const double *y, double *x_val, const rocsparse_int *x_ind, rocsparse_index_base idx_base)
rocsparse_status rocsparse_cgthr(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_float_complex *y, rocsparse_float_complex *x_val, const rocsparse_int *x_ind, rocsparse_index_base idx_base)
rocsparse_status rocsparse_zgthr(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_double_complex *y, rocsparse_double_complex *x_val, const rocsparse_int *x_ind, rocsparse_index_base idx_base)

Gather elements from a dense vector and store them into a sparse vector.

rocsparse_gthr gathers the elements that are listed in x_ind from the dense vector \(y\) and stores them in the sparse vector \(x\).

for(i = 0; i < nnz; ++i)
{
    x_val[i] = y[x_ind[i]];
}

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] nnz: number of non-zero entries of \(x\).

  • [in] y: array of values in dense format.

  • [out] x_val: array of nnz elements containing the values of \(x\).

  • [in] x_ind: array of nnz elements containing the indices of the non-zero values of \(x\).

  • [in] idx_base: rocsparse_index_base_zero or rocsparse_index_base_one.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_value: idx_base is invalid.

  • rocsparse_status_invalid_size: nnz is invalid.

  • rocsparse_status_invalid_pointer: y, x_val or x_ind pointer is invalid.

rocsparse_gthrz()

rocsparse_status rocsparse_sgthrz(rocsparse_handle handle, rocsparse_int nnz, float *y, float *x_val, const rocsparse_int *x_ind, rocsparse_index_base idx_base)
rocsparse_status rocsparse_dgthrz(rocsparse_handle handle, rocsparse_int nnz, double *y, double *x_val, const rocsparse_int *x_ind, rocsparse_index_base idx_base)
rocsparse_status rocsparse_cgthrz(rocsparse_handle handle, rocsparse_int nnz, rocsparse_float_complex *y, rocsparse_float_complex *x_val, const rocsparse_int *x_ind, rocsparse_index_base idx_base)
rocsparse_status rocsparse_zgthrz(rocsparse_handle handle, rocsparse_int nnz, rocsparse_double_complex *y, rocsparse_double_complex *x_val, const rocsparse_int *x_ind, rocsparse_index_base idx_base)

Gather and zero out elements from a dense vector and store them into a sparse vector.

rocsparse_gthrz gathers the elements that are listed in x_ind from the dense vector \(y\) and stores them in the sparse vector \(x\). The gathered elements in \(y\) are replaced by zero.

for(i = 0; i < nnz; ++i)
{
    x_val[i]    = y[x_ind[i]];
    y[x_ind[i]] = 0;
}

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] nnz: number of non-zero entries of \(x\).

  • [inout] y: array of values in dense format.

  • [out] x_val: array of nnz elements containing the non-zero values of \(x\).

  • [in] x_ind: array of nnz elements containing the indices of the non-zero values of \(x\).

  • [in] idx_base: rocsparse_index_base_zero or rocsparse_index_base_one.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_value: idx_base is invalid.

  • rocsparse_status_invalid_size: nnz is invalid.

  • rocsparse_status_invalid_pointer: y, x_val or x_ind pointer is invalid.
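
The gather-and-zero step can be sketched as a host-side reference. The name gthrz_ref is hypothetical, and base (0 or 1) stands in for idx_base; dropping the zeroing store yields the behavior described for rocsparse_gthr.

```c
#include <assert.h>

/* CPU reference: copy y[x_ind[i]] into x_val[i], then zero that entry of y.
 * base is 0 or 1 and is subtracted from each stored index (assumption). */
void gthrz_ref(int nnz, float *y, float *x_val, const int *x_ind, int base)
{
    assert(nnz >= 0);
    assert(base == 0 || base == 1);

    for(int i = 0; i < nnz; ++i)
    {
        int idx = x_ind[i] - base;

        x_val[i] = y[idx];
        y[idx]   = 0.0f;
    }
}
```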

rocsparse_roti()

rocsparse_status rocsparse_sroti(rocsparse_handle handle, rocsparse_int nnz, float *x_val, const rocsparse_int *x_ind, float *y, const float *c, const float *s, rocsparse_index_base idx_base)
rocsparse_status rocsparse_droti(rocsparse_handle handle, rocsparse_int nnz, double *x_val, const rocsparse_int *x_ind, double *y, const double *c, const double *s, rocsparse_index_base idx_base)

Apply Givens rotation to a dense and a sparse vector.

rocsparse_roti applies the Givens rotation matrix \(G\) to the sparse vector \(x\) and the dense vector \(y\), where

\[\begin{split} G = \begin{pmatrix} c & s \\ -s & c \end{pmatrix} \end{split}\]

for(i = 0; i < nnz; ++i)
{
    x_tmp = x_val[i];
    y_tmp = y[x_ind[i]];

    x_val[i]    = c * x_tmp + s * y_tmp;
    y[x_ind[i]] = c * y_tmp - s * x_tmp;
}

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] nnz: number of non-zero entries of \(x\).

  • [inout] x_val: array of nnz elements containing the non-zero values of \(x\).

  • [in] x_ind: array of nnz elements containing the indices of the non-zero values of \(x\).

  • [inout] y: array of values in dense format.

  • [in] c: pointer to the cosine element of \(G\), can be on host or device.

  • [in] s: pointer to the sine element of \(G\), can be on host or device.

  • [in] idx_base: rocsparse_index_base_zero or rocsparse_index_base_one.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_value: idx_base is invalid.

  • rocsparse_status_invalid_size: nnz is invalid.

  • rocsparse_status_invalid_pointer: c, s, x_val, x_ind or y pointer is invalid.
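
The rotation can be sketched as a host-side reference that follows the loop above. The name roti_ref is hypothetical, c and s are passed by value here for brevity (the library takes pointers), and base (0 or 1) stands in for idx_base.

```c
#include <assert.h>

/* CPU reference of the Givens rotation applied to (x, y) pairs.
 * base is 0 or 1 and is subtracted from each stored index (assumption). */
void roti_ref(int nnz, double *x_val, const int *x_ind, double *y,
              double c, double s, int base)
{
    assert(nnz >= 0);
    assert(base == 0 || base == 1);

    for(int i = 0; i < nnz; ++i)
    {
        int    idx   = x_ind[i] - base;
        double x_tmp = x_val[i];
        double y_tmp = y[idx];

        /* Both outputs are computed from the saved inputs. */
        x_val[i] = c * x_tmp + s * y_tmp;
        y[idx]   = c * y_tmp - s * x_tmp;
    }
}
```

With c = 0 and s = 1, each pair (x, y) becomes (y, -x), which is a quick sanity check for the sign convention of \(G\).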

rocsparse_sctr()

rocsparse_status rocsparse_ssctr(rocsparse_handle handle, rocsparse_int nnz, const float *x_val, const rocsparse_int *x_ind, float *y, rocsparse_index_base idx_base)
rocsparse_status rocsparse_dsctr(rocsparse_handle handle, rocsparse_int nnz, const double *x_val, const rocsparse_int *x_ind, double *y, rocsparse_index_base idx_base)
rocsparse_status rocsparse_csctr(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_float_complex *x_val, const rocsparse_int *x_ind, rocsparse_float_complex *y, rocsparse_index_base idx_base)
rocsparse_status rocsparse_zsctr(rocsparse_handle handle, rocsparse_int nnz, const rocsparse_double_complex *x_val, const rocsparse_int *x_ind, rocsparse_double_complex *y, rocsparse_index_base idx_base)

Scatter elements from a sparse vector into a dense vector.

rocsparse_sctr scatters the elements that are listed in x_ind from the sparse vector \(x\) into the dense vector \(y\). Indices of \(y\) that are not listed in x_ind remain unchanged.

for(i = 0; i < nnz; ++i)
{
    y[x_ind[i]] = x_val[i];
}

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] nnz: number of non-zero entries of \(x\).

  • [in] x_val: array of nnz elements containing the non-zero values of \(x\).

  • [in] x_ind: array of nnz elements containing the indices of the non-zero values of \(x\).

  • [inout] y: array of values in dense format.

  • [in] idx_base: rocsparse_index_base_zero or rocsparse_index_base_one.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_value: idx_base is invalid.

  • rocsparse_status_invalid_size: nnz is invalid.

  • rocsparse_status_invalid_pointer: x_val, x_ind or y pointer is invalid.
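
The scatter step can be sketched as a host-side reference. The name sctr_ref is hypothetical, and base (0 or 1) stands in for idx_base.

```c
#include <assert.h>

/* CPU reference: write x_val[i] into y at position x_ind[i].
 * base is 0 or 1 and is subtracted from each stored index (assumption). */
void sctr_ref(int nnz, const float *x_val, const int *x_ind, float *y, int base)
{
    assert(nnz >= 0);
    assert(base == 0 || base == 1);

    for(int i = 0; i < nnz; ++i)
    {
        y[x_ind[i] - base] = x_val[i];
    }
}
```

Positions of y that are not listed in x_ind keep their previous values, so the dense vector should be initialized before the call.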

Sparse Level 2 Functions

This module holds all sparse level 2 routines.

The sparse level 2 routines describe operations between a matrix in sparse format and a vector in dense format.

rocsparse_bsrmv()

rocsparse_status rocsparse_sbsrmv(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nb, rocsparse_int nnzb, const float *alpha, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, const float *x, const float *beta, float *y)
rocsparse_status rocsparse_dbsrmv(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nb, rocsparse_int nnzb, const double *alpha, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, const double *x, const double *beta, double *y)
rocsparse_status rocsparse_cbsrmv(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nb, rocsparse_int nnzb, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, const rocsparse_float_complex *x, const rocsparse_float_complex *beta, rocsparse_float_complex *y)
rocsparse_status rocsparse_zbsrmv(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nb, rocsparse_int nnzb, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, const rocsparse_double_complex *x, const rocsparse_double_complex *beta, rocsparse_double_complex *y)

Sparse matrix vector multiplication using BSR storage format.

rocsparse_bsrmv multiplies the scalar \(\alpha\) with a sparse \((mb \cdot \text{bsr_dim}) \times (nb \cdot \text{bsr_dim})\) matrix, defined in BSR storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y := \alpha \cdot op(A) \cdot x + \beta \cdot y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans == rocsparse_operation_none is supported.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] dir: matrix storage of BSR blocks.

  • [in] trans: matrix operation type.

  • [in] mb: number of block rows of the sparse BSR matrix.

  • [in] nb: number of block columns of the sparse BSR matrix.

  • [in] nnzb: number of non-zero blocks of the sparse BSR matrix.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse BSR matrix. Currently, only rocsparse_matrix_type_general is supported.

  • [in] bsr_val: array of nnzb blocks of the sparse BSR matrix.

  • [in] bsr_row_ptr: array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • [in] bsr_col_ind: array of nnzb elements containing the block column indices of the sparse BSR matrix.

  • [in] bsr_dim: block dimension of the sparse BSR matrix.

  • [in] x: array of nb*bsr_dim elements ( \(op(A) = A\)) or mb*bsr_dim elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

  • [in] beta: scalar \(\beta\).

  • [inout] y: array of mb*bsr_dim elements ( \(op(A) = A\)) or nb*bsr_dim elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: mb, nb, nnzb or bsr_dim is invalid.

  • rocsparse_status_invalid_pointer: descr, alpha, bsr_val, bsr_row_ptr, bsr_col_ind, x, beta or y pointer is invalid.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_not_implemented: trans != rocsparse_operation_none or rocsparse_matrix_type != rocsparse_matrix_type_general.
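
For the supported case \(op(A) = A\), the BSR product can be sketched as a host-side reference. This is an illustration under stated assumptions, not the library implementation: bsrmv_ref is a hypothetical name, each block is assumed to occupy bsr_dim * bsr_dim consecutive entries of bsr_val, the flag row_major stands in for dir, and base (0 or 1) stands in for the index base of the descriptor.

```c
#include <assert.h>
#include <stddef.h>

/* CPU reference for y := alpha * A * x + beta * y with A in BSR format.
 * row_major selects the layout inside each block (assumption for dir).
 * base is 0 or 1 and is subtracted from row pointers and column indices. */
void bsrmv_ref(int row_major, int mb, float alpha, const float *bsr_val,
               const int *bsr_row_ptr, const int *bsr_col_ind, int bsr_dim,
               const float *x, float beta, float *y, int base)
{
    assert(mb >= 0 && bsr_dim > 0);
    assert(base == 0 || base == 1);

    for(int bi = 0; bi < mb; ++bi)
    {
        for(int r = 0; r < bsr_dim; ++r)
        {
            float sum = 0.0f;

            /* Walk all blocks stored in block row bi. */
            for(int k = bsr_row_ptr[bi] - base; k < bsr_row_ptr[bi + 1] - base; ++k)
            {
                const float *block = bsr_val + (size_t)k * bsr_dim * bsr_dim;
                int          bj    = bsr_col_ind[k] - base;

                for(int c = 0; c < bsr_dim; ++c)
                {
                    float a = row_major ? block[r * bsr_dim + c]
                                        : block[c * bsr_dim + r];
                    sum += a * x[bj * bsr_dim + c];
                }
            }

            y[bi * bsr_dim + r] = alpha * sum + beta * y[bi * bsr_dim + r];
        }
    }
}
```

The dense vector x is indexed by block column and offset within the block, which is why it needs nb * bsr_dim elements, while y needs mb * bsr_dim elements.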

rocsparse_bsrsv_zero_pivot()

rocsparse_status rocsparse_bsrsv_zero_pivot(rocsparse_handle handle, rocsparse_mat_info info, rocsparse_int *position)

Sparse triangular solve using BSR storage format.

rocsparse_bsrsv_zero_pivot returns rocsparse_status_zero_pivot if either a structural or a numerical zero has been found during the rocsparse_sbsrsv_solve(), rocsparse_dbsrsv_solve(), rocsparse_cbsrsv_solve() or rocsparse_zbsrsv_solve() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the BSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and rocsparse_status_success is returned instead.

Note

rocsparse_bsrsv_zero_pivot is a blocking function. It might negatively impact performance.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] info: structure that holds the information collected during the analysis step.

  • [inout] position: pointer to zero pivot \(j\), can be in host or device memory.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info or position pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_zero_pivot: zero pivot has been found.

rocsparse_bsrsv_buffer_size()

rocsparse_status rocsparse_sbsrsv_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_dbsrsv_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_cbsrsv_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_zbsrsv_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, size_t *buffer_size)

Sparse triangular solve using BSR storage format.

rocsparse_bsrsv_buffer_size returns the size of the temporary storage buffer that is required by rocsparse_sbsrsv_analysis(), rocsparse_dbsrsv_analysis(), rocsparse_cbsrsv_analysis(), rocsparse_zbsrsv_analysis(), rocsparse_sbsrsv_solve(), rocsparse_dbsrsv_solve(), rocsparse_cbsrsv_solve() and rocsparse_zbsrsv_solve(). The temporary storage buffer must be allocated by the user.

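Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] dir: matrix storage of BSR blocks.

  • [in] trans: matrix operation type.

  • [in] mb: number of block rows of the sparse BSR matrix.

  • [in] nnzb: number of non-zero blocks of the sparse BSR matrix.

  • [in] descr: descriptor of the sparse BSR matrix.

  • [in] bsr_val: array of nnzb blocks of the sparse BSR matrix.

  • [in] bsr_row_ptr: array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • [in] bsr_col_ind: array of nnzb elements containing the block column indices of the sparse BSR matrix.

  • [in] bsr_dim: block dimension of the sparse BSR matrix.

  • [out] info: structure that holds the information collected during the analysis step.

  • [out] buffer_size: number of bytes of the temporary storage buffer required by the analysis and solve routines listed above.

Return Value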
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: mb, nnzb or bsr_dim is invalid.

  • rocsparse_status_invalid_pointer: descr, bsr_val, bsr_row_ptr, bsr_col_ind, info or buffer_size pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_bsrsv_analysis()

rocsparse_status rocsparse_sbsrsv_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_dbsrsv_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_cbsrsv_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_zbsrsv_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)

Sparse triangular solve using BSR storage format.

rocsparse_bsrsv_analysis performs the analysis step for rocsparse_sbsrsv_solve(), rocsparse_dbsrsv_solve(), rocsparse_cbsrsv_solve() and rocsparse_zbsrsv_solve(). It is expected that this function will be executed only once for a given matrix and particular operation type. The analysis meta data can be cleared by rocsparse_bsrsv_clear().

rocsparse_bsrsv_analysis can share its meta data with rocsparse_sbsrsm_analysis(), rocsparse_dbsrsm_analysis(), rocsparse_cbsrsm_analysis(), rocsparse_zbsrsm_analysis(), rocsparse_sbsrilu0_analysis(), rocsparse_dbsrilu0_analysis(), rocsparse_cbsrilu0_analysis(), rocsparse_zbsrilu0_analysis(), rocsparse_sbsric0_analysis(), rocsparse_dbsric0_analysis(), rocsparse_cbsric0_analysis() and rocsparse_zbsric0_analysis(). Selecting the rocsparse_analysis_policy_reuse policy can greatly reduce the cost of computing the meta data, because existing analysis data is reused. However, the user needs to make sure that the sparsity pattern remains unchanged. If this cannot be assured, rocsparse_analysis_policy_force has to be used.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] dir: matrix storage of BSR blocks.

  • [in] trans: matrix operation type.

  • [in] mb: number of block rows of the sparse BSR matrix.

  • [in] nnzb: number of non-zero blocks of the sparse BSR matrix.

  • [in] descr: descriptor of the sparse BSR matrix.

  • [in] bsr_val: array of nnzb blocks of the sparse BSR matrix.

  • [in] bsr_row_ptr: array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • [in] bsr_col_ind: array of nnzb elements containing the block column indices of the sparse BSR matrix.

  • [in] bsr_dim: block dimension of the sparse BSR matrix.

  • [out] info: structure that holds the information collected during the analysis step.

  • [in] analysis: rocsparse_analysis_policy_reuse or rocsparse_analysis_policy_force.

  • [in] solve: rocsparse_solve_policy_auto.

  • [in] temp_buffer: temporary storage buffer allocated by the user.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: mb, nnzb or bsr_dim is invalid.

  • rocsparse_status_invalid_pointer: descr, bsr_row_ptr, bsr_col_ind, info or temp_buffer pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_bsrsv_solve()

rocsparse_status rocsparse_sbsrsv_solve(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const float *alpha, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, const float *x, float *y, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_dbsrsv_solve(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const double *alpha, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, const double *x, double *y, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_cbsrsv_solve(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, const rocsparse_float_complex *x, rocsparse_float_complex *y, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_zbsrsv_solve(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int bsr_dim, rocsparse_mat_info info, const rocsparse_double_complex *x, rocsparse_double_complex *y, rocsparse_solve_policy policy, void *temp_buffer)

Sparse triangular solve using BSR storage format.

rocsparse_bsrsv_solve solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in BSR storage format, a dense solution vector \(y\) and the right-hand side \(x\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot y = \alpha \cdot x, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

rocsparse_bsrsv_solve requires a user-allocated temporary buffer. Its size is returned by rocsparse_sbsrsv_buffer_size(), rocsparse_dbsrsv_buffer_size(), rocsparse_cbsrsv_buffer_size() or rocsparse_zbsrsv_buffer_size(). Furthermore, analysis meta data is required. It can be obtained by rocsparse_sbsrsv_analysis(), rocsparse_dbsrsv_analysis(), rocsparse_cbsrsv_analysis() or rocsparse_zbsrsv_analysis(). rocsparse_bsrsv_solve reports the first zero pivot (either numerical or structural zero). The zero pivot status can be checked by calling rocsparse_bsrsv_zero_pivot(). If rocsparse_diag_type == rocsparse_diag_type_unit, no zero pivot will be reported, even if \(A_{j,j} = 0\) for some \(j\).
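The forward substitution behind this solve can be illustrated with a small host-side reference. This is a minimal sketch and not the library's GPU implementation. The names bsr_at and bsr_lower_solve_host are hypothetical, and the sketch assumes a lower triangular matrix, zero-based indices, real double-precision values and \(op(A) = A\). As a simplification, it returns the scalar row index of the first zero pivot directly instead of storing it for a later query.

```c
#include <assert.h>
#include <stddef.h>

/* Element (r, c) of block k. row_major != 0 mirrors rocsparse_direction_row,
 * row_major == 0 mirrors rocsparse_direction_column. (Hypothetical helper.) */
static double bsr_at(const double *bsr_val, int k, int bd, int r, int c, int row_major)
{
    const double *blk = bsr_val + (size_t)k * bd * bd;
    return row_major ? blk[r * bd + c] : blk[c * bd + r];
}

/* Host reference (hypothetical, not a rocSPARSE function):
 * solves L * y = alpha * x for a lower triangular BSR matrix L.
 * Returns -1 on success, or the scalar row index of the first zero pivot. */
int bsr_lower_solve_host(int mb, int bd, int row_major, int unit_diag, double alpha,
                         const double *bsr_val, const int *bsr_row_ptr,
                         const int *bsr_col_ind, const double *x, double *y)
{
    assert(mb >= 0 && bd > 0);

    for (int bi = 0; bi < mb; ++bi) {
        for (int r = 0; r < bd; ++r) {
            int row = bi * bd + r;        /* scalar row index */
            double sum = alpha * x[row];  /* scaled right-hand side */
            double diag = 0.0;            /* stays 0 if the diagonal is missing */

            for (int k = bsr_row_ptr[bi]; k < bsr_row_ptr[bi + 1]; ++k) {
                int bj = bsr_col_ind[k];
                if (bj > bi)
                    continue;             /* blocks above the diagonal are ignored */

                for (int c = 0; c < bd; ++c) {
                    int col = bj * bd + c;
                    double a = bsr_at(bsr_val, k, bd, r, c, row_major);
                    if (col < row)
                        sum -= a * y[col]; /* y[col] is already known */
                    else if (col == row)
                        diag = a;
                }
            }

            if (unit_diag)
                diag = 1.0;               /* stored diagonal is not used */
            if (diag == 0.0)
                return row;               /* structural or numerical zero pivot */

            y[row] = sum / diag;
        }
    }
    return -1;
}
```

With unit_diag set, the stored diagonal is never read, which is why no zero pivot can be reported in that case.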

Note

The sparse BSR matrix has to be sorted.

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans == rocsparse_operation_none and trans == rocsparse_operation_transpose are supported.

Example

Consider the lower triangular \(m \times m\) matrix \(L\), stored in BSR storage format with unit diagonal. The following example solves \(L \cdot y = x\).

// Create rocSPARSE handle
rocsparse_handle handle;
rocsparse_create_handle(&handle);

// Create matrix descriptor
rocsparse_mat_descr descr;
rocsparse_create_mat_descr(&descr);
rocsparse_set_mat_fill_mode(descr, rocsparse_fill_mode_lower);
rocsparse_set_mat_diag_type(descr, rocsparse_diag_type_unit);

// Create matrix info structure
rocsparse_mat_info info;
rocsparse_create_mat_info(&info);

// Obtain required buffer size
size_t buffer_size;
rocsparse_dbsrsv_buffer_size(handle,
                             rocsparse_direction_column,
                             rocsparse_operation_none,
                             mb,
                             nnzb,
                             descr,
                             bsr_val,
                             bsr_row_ptr,
                             bsr_col_ind,
                             bsr_dim,
                             info,
                             &buffer_size);

// Allocate temporary buffer
void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Perform analysis step
rocsparse_dbsrsv_analysis(handle,
                          rocsparse_direction_column,
                          rocsparse_operation_none,
                          mb,
                          nnzb,
                          descr,
                          bsr_val,
                          bsr_row_ptr,
                          bsr_col_ind,
                          bsr_dim,
                          info,
                          rocsparse_analysis_policy_reuse,
                          rocsparse_solve_policy_auto,
                          temp_buffer);

// Solve Ly = x (alpha = 1 leaves the right-hand side unscaled)
const double alpha = 1.0;
rocsparse_dbsrsv_solve(handle,
                       rocsparse_direction_column,
                       rocsparse_operation_none,
                       mb,
                       nnzb,
                       &alpha,
                       descr,
                       bsr_val,
                       bsr_row_ptr,
                       bsr_col_ind,
                       bsr_dim,
                       info,
                       x,
                       y,
                       rocsparse_solve_policy_auto,
                       temp_buffer);

// No zero pivot should be found, with L having unit diagonal

// Clean up
hipFree(temp_buffer);
rocsparse_destroy_mat_info(info);
rocsparse_destroy_mat_descr(descr);
rocsparse_destroy_handle(handle);

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] dir: matrix storage of BSR blocks.

  • [in] trans: matrix operation type.

  • [in] mb: number of block rows of the sparse BSR matrix.

  • [in] nnzb: number of non-zero blocks of the sparse BSR matrix.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse BSR matrix.

  • [in] bsr_val: array of nnzb blocks of the sparse BSR matrix.

  • [in] bsr_row_ptr: array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • [in] bsr_col_ind: array of nnzb elements containing the block column indices of the sparse BSR matrix.

  • [in] bsr_dim: block dimension of the sparse BSR matrix.

  • [in] info: structure that holds the information collected during the analysis step.

  • [in] x: array of m elements (m = mb * bsr_dim), holding the right-hand side.

  • [out] y: array of m elements (m = mb * bsr_dim), holding the solution.

  • [in] policy: rocsparse_solve_policy_auto.

  • [in] temp_buffer: temporary storage buffer allocated by the user.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: mb, nnzb or bsr_dim is invalid.

  • rocsparse_status_invalid_pointer: descr, alpha, bsr_val, bsr_row_ptr, bsr_col_ind, x or y pointer is invalid.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_bsrsv_clear()

rocsparse_status rocsparse_bsrsv_clear(rocsparse_handle handle, rocsparse_mat_info info)

Sparse triangular solve using BSR storage format.

rocsparse_bsrsv_clear deallocates all memory that was allocated by rocsparse_sbsrsv_analysis(), rocsparse_dbsrsv_analysis(), rocsparse_cbsrsv_analysis() or rocsparse_zbsrsv_analysis(). This is especially useful if memory is an issue and the analysis data is not required for further computation, e.g. when switching to another sparse matrix format. Calling rocsparse_bsrsv_clear is optional. All allocated resources will be cleared when the opaque rocsparse_mat_info struct is destroyed using rocsparse_destroy_mat_info().

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [inout] info: structure that holds the information collected during the analysis step.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info pointer is invalid.

  • rocsparse_status_memory_error: the buffer holding the meta data could not be deallocated.

  • rocsparse_status_internal_error: an internal error occurred.

rocsparse_coomv()

rocsparse_status rocsparse_scoomv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *coo_val, const rocsparse_int *coo_row_ind, const rocsparse_int *coo_col_ind, const float *x, const float *beta, float *y)
rocsparse_status rocsparse_dcoomv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *coo_val, const rocsparse_int *coo_row_ind, const rocsparse_int *coo_col_ind, const double *x, const double *beta, double *y)
rocsparse_status rocsparse_ccoomv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *coo_val, const rocsparse_int *coo_row_ind, const rocsparse_int *coo_col_ind, const rocsparse_float_complex *x, const rocsparse_float_complex *beta, rocsparse_float_complex *y)
rocsparse_status rocsparse_zcoomv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *coo_val, const rocsparse_int *coo_row_ind, const rocsparse_int *coo_col_ind, const rocsparse_double_complex *x, const rocsparse_double_complex *beta, rocsparse_double_complex *y)

Sparse matrix vector multiplication using COO storage format.

rocsparse_coomv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in COO storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y := \alpha \cdot op(A) \cdot x + \beta \cdot y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

The COO matrix has to be sorted by row indices. This can be achieved by using rocsparse_coosort_by_row().

for(i = 0; i < m; ++i)
{
    y[i] = beta * y[i];
}

for(i = 0; i < nnz; ++i)
{
    y[coo_row_ind[i]] += alpha * coo_val[i] * x[coo_col_ind[i]];
}
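The loops above can be turned into a small host-side reference, which is handy for checking device results on tiny matrices. This is a minimal sketch under stated assumptions: the names coomv_host and coo_is_sorted_by_row are hypothetical and not part of rocSPARSE, indices are zero-based, values are real double precision, and \(op(A) = A\).

```c
#include <assert.h>

/* Host reference (hypothetical): y := alpha * A * x + beta * y, A in COO format. */
void coomv_host(int m, int nnz, double alpha, const double *coo_val,
                const int *coo_row_ind, const int *coo_col_ind,
                const double *x, double beta, double *y)
{
    assert(m >= 0 && nnz >= 0);

    /* Scale the existing entries of y by beta. */
    for (int i = 0; i < m; ++i)
        y[i] = beta * y[i];

    /* Accumulate the contribution of every non-zero entry. */
    for (int i = 0; i < nnz; ++i)
        y[coo_row_ind[i]] += alpha * coo_val[i] * x[coo_col_ind[i]];
}

/* Returns 1 if the row indices are non-decreasing, 0 otherwise. (Hypothetical helper.) */
int coo_is_sorted_by_row(int nnz, const int *coo_row_ind)
{
    for (int i = 1; i < nnz; ++i)
        if (coo_row_ind[i - 1] > coo_row_ind[i])
            return 0;
    return 1;
}
```

For the 2 x 3 matrix with rows (1, 0, 2) and (0, 3, 0), the COO arrays are coo_val = {1, 2, 3}, coo_row_ind = {0, 0, 1} and coo_col_ind = {0, 2, 1}. The sequential host loop does not depend on the ordering, but the sorted-by-row check mirrors the precondition stated for the device routine.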

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans == rocsparse_operation_none is supported.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans: matrix operation type.

  • [in] m: number of rows of the sparse COO matrix.

  • [in] n: number of columns of the sparse COO matrix.

  • [in] nnz: number of non-zero entries of the sparse COO matrix.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse COO matrix. Currently, only rocsparse_matrix_type_general is supported.

  • [in] coo_val: array of nnz elements of the sparse COO matrix.

  • [in] coo_row_ind: array of nnz elements containing the row indices of the sparse COO matrix.

  • [in] coo_col_ind: array of nnz elements containing the column indices of the sparse COO matrix.

  • [in] x: array of n elements ( \(op(A) = A\)) or m elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

  • [in] beta: scalar \(\beta\).

  • [inout] y: array of m elements ( \(op(A) = A\)) or n elements ( \(op(A) = A^T\) or \(op(A) = A^H\)).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n or nnz is invalid.

  • rocsparse_status_invalid_pointer: descr, alpha, coo_val, coo_row_ind, coo_col_ind, x, beta or y pointer is invalid.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_not_implemented: trans != rocsparse_operation_none or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrmv_analysis()

rocsparse_status rocsparse_scsrmv_analysis(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info)
rocsparse_status rocsparse_dcsrmv_analysis(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info)
rocsparse_status rocsparse_ccsrmv_analysis(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info)
rocsparse_status rocsparse_zcsrmv_analysis(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info)

Sparse matrix vector multiplication using CSR storage format.

rocsparse_csrmv_analysis performs the analysis step for rocsparse_scsrmv(), rocsparse_dcsrmv(), rocsparse_ccsrmv() and rocsparse_zcsrmv(). It is expected that this function will be executed only once for a given matrix and particular operation type. The gathered analysis meta data can be cleared by rocsparse_csrmv_clear().

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans: matrix operation type.

  • [in] m: number of rows of the sparse CSR matrix.

  • [in] n: number of columns of the sparse CSR matrix.

  • [in] nnz: number of non-zero entries of the sparse CSR matrix.

  • [in] descr: descriptor of the sparse CSR matrix.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix.

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix.

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix.

  • [out] info: structure that holds the information collected during the analysis step.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n or nnz is invalid.

  • rocsparse_status_invalid_pointer: descr, csr_val, csr_row_ptr, csr_col_ind or info pointer is invalid.

  • rocsparse_status_memory_error: the buffer for the gathered information could not be allocated.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans != rocsparse_operation_none or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrmv()

rocsparse_status rocsparse_scsrmv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, const float *x, const float *beta, float *y)
rocsparse_status rocsparse_dcsrmv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, const double *x, const double *beta, double *y)
rocsparse_status rocsparse_ccsrmv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, const rocsparse_float_complex *x, const rocsparse_float_complex *beta, rocsparse_float_complex *y)
rocsparse_status rocsparse_zcsrmv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, const rocsparse_double_complex *x, const rocsparse_double_complex *beta, rocsparse_double_complex *y)

Sparse matrix vector multiplication using CSR storage format.

rocsparse_csrmv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in CSR storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y := \alpha \cdot op(A) \cdot x + \beta \cdot y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

The info parameter is optional and contains information collected by rocsparse_scsrmv_analysis(), rocsparse_dcsrmv_analysis(), rocsparse_ccsrmv_analysis() or rocsparse_zcsrmv_analysis(). If present, the information will be used to speed up the csrmv computation. If info == NULL, the general csrmv routine will be used instead.

for(i = 0; i < m; ++i)
{
    y[i] = beta * y[i];

    for(j = csr_row_ptr[i]; j < csr_row_ptr[i + 1]; ++j)
    {
        y[i] = y[i] + alpha * csr_val[j] * x[csr_col_ind[j]];
    }
}
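As a worked instance of the loop above, the following host-side reference computes the same result on the CPU. It is a minimal sketch, assuming zero-based indices, real double-precision values and \(op(A) = A\). The name csrmv_host is hypothetical and not part of rocSPARSE.

```c
#include <assert.h>

/* Host reference (hypothetical): y := alpha * A * x + beta * y, A in CSR format. */
void csrmv_host(int m, double alpha, const double *csr_val,
                const int *csr_row_ptr, const int *csr_col_ind,
                const double *x, double beta, double *y)
{
    assert(m >= 0);

    for (int i = 0; i < m; ++i) {
        /* Dot product of row i with x; row i owns the entries
         * csr_row_ptr[i] .. csr_row_ptr[i + 1] - 1. */
        double sum = 0.0;
        for (int j = csr_row_ptr[i]; j < csr_row_ptr[i + 1]; ++j)
            sum += csr_val[j] * x[csr_col_ind[j]];

        y[i] = alpha * sum + beta * y[i];
    }
}
```

For the 2 x 3 matrix with rows (1, 0, 2) and (0, 3, 0), the CSR arrays are csr_val = {1, 2, 3}, csr_row_ptr = {0, 2, 3} and csr_col_ind = {0, 2, 1}. With x = (1, 2, 3), y = (1, -1), alpha = 2 and beta = 3, the row sums are 7 and 6, so y becomes (2 * 7 + 3 * 1, 2 * 6 + 3 * (-1)) = (17, 9).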

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans == rocsparse_operation_none is supported.

Example

This example performs a sparse matrix vector multiplication in CSR format using additional meta data to improve performance.

// Create matrix info structure
rocsparse_mat_info info;
rocsparse_create_mat_info(&info);

// Perform analysis step to obtain meta data
rocsparse_scsrmv_analysis(handle,
                          rocsparse_operation_none,
                          m,
                          n,
                          nnz,
                          descr,
                          csr_val,
                          csr_row_ptr,
                          csr_col_ind,
                          info);

// Compute y = Ax
rocsparse_scsrmv(handle,
                 rocsparse_operation_none,
                 m,
                 n,
                 nnz,
                 &alpha,
                 descr,
                 csr_val,
                 csr_row_ptr,
                 csr_col_ind,
                 info,
                 x,
                 &beta,
                 y);

// Do more work
// ...

// Clean up
rocsparse_destroy_mat_info(info);

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans: matrix operation type.

  • [in] m: number of rows of the sparse CSR matrix.

  • [in] n: number of columns of the sparse CSR matrix.

  • [in] nnz: number of non-zero entries of the sparse CSR matrix.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse CSR matrix. Currently, only rocsparse_matrix_type_general is supported.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix.

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix.

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix.

  • [in] info: information collected by rocsparse_scsrmv_analysis(), rocsparse_dcsrmv_analysis(), rocsparse_ccsrmv_analysis() or rocsparse_zcsrmv_analysis(); can be NULL if no information is available.

  • [in] x: array of n elements ( \(op(A) == A\)) or m elements ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • [in] beta: scalar \(\beta\).

  • [inout] y: array of m elements ( \(op(A) == A\)) or n elements ( \(op(A) == A^T\) or \(op(A) == A^H\)).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n or nnz is invalid.

  • rocsparse_status_invalid_pointer: descr, alpha, csr_val, csr_row_ptr, csr_col_ind, x, beta or y pointer is invalid.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_not_implemented: trans != rocsparse_operation_none or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrmv_clear()

rocsparse_status rocsparse_csrmv_clear(rocsparse_handle handle, rocsparse_mat_info info)

Sparse matrix vector multiplication using CSR storage format.

rocsparse_csrmv_clear deallocates all memory that was allocated by rocsparse_scsrmv_analysis(), rocsparse_dcsrmv_analysis(), rocsparse_ccsrmv_analysis() or rocsparse_zcsrmv_analysis(). This is especially useful if memory is an issue and the analysis data is no longer required for further computation, e.g. when switching to another sparse matrix format.

Note

Calling rocsparse_csrmv_clear is optional. All allocated resources will be cleared when the opaque rocsparse_mat_info struct is destroyed using rocsparse_destroy_mat_info().

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [inout] info: structure that holds the information collected during the analysis step.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info pointer is invalid.

  • rocsparse_status_memory_error: the buffer for the gathered information could not be deallocated.

  • rocsparse_status_internal_error: an internal error occurred.

rocsparse_csrsv_zero_pivot()

rocsparse_status rocsparse_csrsv_zero_pivot(rocsparse_handle handle, const rocsparse_mat_descr descr, rocsparse_mat_info info, rocsparse_int *position)

Sparse triangular solve using CSR storage format.

rocsparse_csrsv_zero_pivot returns rocsparse_status_zero_pivot if either a structural or numerical zero has been found during the rocsparse_scsrsv_solve(), rocsparse_dcsrsv_solve(), rocsparse_ccsrsv_solve() or rocsparse_zcsrsv_solve() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the CSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and rocsparse_status_success is returned instead.
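The difference between a structural zero (no diagonal entry is stored) and a numerical zero (the stored diagonal entry equals 0) can be made concrete with a host-side scan of the diagonal. This is only an illustrative sketch: the library detects zero pivots during the solve itself, and the name csr_first_zero_pivot_host is hypothetical. The sketch assumes real double-precision values and an index base of 0 or 1, and it follows the convention described above of returning -1 when no zero pivot exists.

```c
#include <assert.h>

/* Host sketch (hypothetical): returns the first row j, expressed in the
 * matrix's index base, whose diagonal entry A(j,j) is missing or zero.
 * Returns -1 if every row has a non-zero diagonal entry. */
int csr_first_zero_pivot_host(int m, int idx_base, const double *csr_val,
                              const int *csr_row_ptr, const int *csr_col_ind)
{
    assert(m >= 0 && (idx_base == 0 || idx_base == 1));

    for (int i = 0; i < m; ++i) {
        int nonzero_diag = 0;

        /* Search row i for its diagonal entry. */
        for (int j = csr_row_ptr[i] - idx_base; j < csr_row_ptr[i + 1] - idx_base; ++j) {
            if (csr_col_ind[j] - idx_base == i) {
                nonzero_diag = (csr_val[j] != 0.0); /* numerical zero check */
                break;
            }
        }

        if (!nonzero_diag)
            return i + idx_base; /* structural or numerical zero pivot */
    }
    return -1;
}
```

For a matrix with unit diagonal type such a scan would be skipped, since the stored diagonal is not used and no zero pivot is reported.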

Note

rocsparse_csrsv_zero_pivot is a blocking function. It might influence performance negatively.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] descr: descriptor of the sparse CSR matrix.

  • [in] info: structure that holds the information collected during the analysis step.

  • [inout] position: pointer to zero pivot \(j\), can be in host or device memory.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info or position pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_zero_pivot: zero pivot has been found.

rocsparse_csrsv_buffer_size()

rocsparse_status rocsparse_scsrsv_buffer_size(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_dcsrsv_buffer_size(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_ccsrsv_buffer_size(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_zcsrsv_buffer_size(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, size_t *buffer_size)

Sparse triangular solve using CSR storage format.

rocsparse_csrsv_buffer_size returns the size of the temporary storage buffer that is required by rocsparse_scsrsv_analysis(), rocsparse_dcsrsv_analysis(), rocsparse_ccsrsv_analysis(), rocsparse_zcsrsv_analysis(), rocsparse_scsrsv_solve(), rocsparse_dcsrsv_solve(), rocsparse_ccsrsv_solve() and rocsparse_zcsrsv_solve(). The temporary storage buffer must be allocated by the user. The size of the temporary storage buffer is identical to the size returned by rocsparse_scsrilu0_buffer_size(), rocsparse_dcsrilu0_buffer_size(), rocsparse_ccsrilu0_buffer_size() and rocsparse_zcsrilu0_buffer_size() if the matrix sparsity pattern is identical. The user allocated buffer can thus be shared between subsequent calls to those functions.

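Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans: matrix operation type.

  • [in] m: number of rows of the sparse CSR matrix.

  • [in] nnz: number of non-zero entries of the sparse CSR matrix.

  • [in] descr: descriptor of the sparse CSR matrix.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix.

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix.

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix.

  • [out] info: structure that holds the information collected during the analysis step.

  • [out] buffer_size: number of bytes of the temporary storage buffer required by the csrsv analysis and solve functions.
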
Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m or nnz is invalid.

  • rocsparse_status_invalid_pointer: descr, csr_val, csr_row_ptr, csr_col_ind, info or buffer_size pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrsv_analysis()

rocsparse_status rocsparse_scsrsv_analysis(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_dcsrsv_analysis(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_ccsrsv_analysis(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_zcsrsv_analysis(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)

Sparse triangular solve using CSR storage format.

rocsparse_csrsv_analysis performs the analysis step for rocsparse_scsrsv_solve(), rocsparse_dcsrsv_solve(), rocsparse_ccsrsv_solve() and rocsparse_zcsrsv_solve(). It is expected that this function will be executed only once for a given matrix and particular operation type. The analysis meta data can be cleared by rocsparse_csrsv_clear().

rocsparse_csrsv_analysis can share its meta data with rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis(), rocsparse_zcsrsm_analysis(), rocsparse_scsrilu0_analysis(), rocsparse_dcsrilu0_analysis(), rocsparse_ccsrilu0_analysis(), rocsparse_zcsrilu0_analysis(), rocsparse_scsric0_analysis(), rocsparse_dcsric0_analysis(), rocsparse_ccsric0_analysis() and rocsparse_zcsric0_analysis(). Selecting the rocsparse_analysis_policy_reuse policy can greatly reduce the cost of computing the meta data, because meta data gathered by an earlier analysis call is reused. However, the user needs to make sure that the sparsity pattern remains unchanged. If this cannot be assured, rocsparse_analysis_policy_force has to be used.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans: matrix operation type.

  • [in] m: number of rows of the sparse CSR matrix.

  • [in] nnz: number of non-zero entries of the sparse CSR matrix.

  • [in] descr: descriptor of the sparse CSR matrix.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix.

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix.

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix.

  • [out] info: structure that holds the information collected during the analysis step.

  • [in] analysis: rocsparse_analysis_policy_reuse or rocsparse_analysis_policy_force.

  • [in] solve: rocsparse_solve_policy_auto.

  • [in] temp_buffer: temporary storage buffer allocated by the user.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m or nnz is invalid.

  • rocsparse_status_invalid_pointer: descr, csr_row_ptr, csr_col_ind, info or temp_buffer pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrsv_solve()

rocsparse_status rocsparse_scsrsv_solve(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, const float *x, float *y, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_dcsrsv_solve(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, const double *x, double *y, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_ccsrsv_solve(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, const rocsparse_float_complex *x, rocsparse_float_complex *y, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_zcsrsv_solve(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_mat_info info, const rocsparse_double_complex *x, rocsparse_double_complex *y, rocsparse_solve_policy policy, void *temp_buffer)

Sparse triangular solve using CSR storage format.

rocsparse_csrsv_solve solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in CSR storage format, a dense solution vector \(y\) and the right-hand side \(x\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot y = \alpha \cdot x, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

rocsparse_csrsv_solve requires a user allocated temporary buffer. Its size is returned by rocsparse_scsrsv_buffer_size(), rocsparse_dcsrsv_buffer_size(), rocsparse_ccsrsv_buffer_size() or rocsparse_zcsrsv_buffer_size(). Furthermore, analysis meta data is required. It can be obtained by rocsparse_scsrsv_analysis(), rocsparse_dcsrsv_analysis(), rocsparse_ccsrsv_analysis() or rocsparse_zcsrsv_analysis(). rocsparse_csrsv_solve reports the first zero pivot (either numerical or structural zero). The zero pivot status can be checked by calling rocsparse_csrsv_zero_pivot(). If rocsparse_diag_type == rocsparse_diag_type_unit, no zero pivot will be reported, even if \(A_{j,j} = 0\) for some \(j\).
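
The arithmetic described above can be illustrated with a small host-side reference for the case \(op(A) = A\) and a lower triangular matrix. This is only a sketch: csr_lower_solve_ref is a hypothetical helper and not part of rocSPARSE, it assumes zero-based indices, and stopping at the first zero pivot is a choice of this sketch rather than documented library behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for op(A) = A with a lower triangular CSR matrix.
// Hypothetical helper, not part of rocSPARSE. Assumes zero-based indices.
// Returns the first row with a missing or zero diagonal, or -1 if none.
int csr_lower_solve_ref(int                        m,
                        double                     alpha,
                        const std::vector<double>& csr_val,
                        const std::vector<int>&    csr_row_ptr,
                        const std::vector<int>&    csr_col_ind,
                        bool                       unit_diag,
                        const std::vector<double>& x,
                        std::vector<double>&       y)
{
    assert(csr_row_ptr.size() == static_cast<std::size_t>(m) + 1);
    assert(x.size() >= static_cast<std::size_t>(m));
    y.assign(m, 0.0);

    for(int i = 0; i < m; ++i)
    {
        double sum  = alpha * x[i];
        double diag = unit_diag ? 1.0 : 0.0; // 0.0 also marks "no diagonal stored"

        for(int p = csr_row_ptr[i]; p < csr_row_ptr[i + 1]; ++p)
        {
            int j = csr_col_ind[p];
            if(j < i)
            {
                sum -= csr_val[p] * y[j]; // subtract already solved unknowns
            }
            else if(j == i && !unit_diag)
            {
                diag = csr_val[p]; // stored pivot A_{i,i}
            }
        }

        if(diag == 0.0)
        {
            return i; // structural or numerical zero pivot; the sketch stops here
        }
        y[i] = sum / diag;
    }
    return -1;
}
```

With unit_diag set to true the stored diagonal is ignored, which mirrors the statement that no zero pivot is reported for rocsparse_diag_type_unit.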

Note

The sparse CSR matrix has to be sorted. This can be achieved by calling rocsparse_csrsort().

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans == rocsparse_operation_none and trans == rocsparse_operation_transpose are supported.

Example

Consider the lower triangular \(m \times m\) matrix \(L\), stored in CSR storage format with unit diagonal. The following example solves \(L \cdot y = x\).

// Create rocSPARSE handle
rocsparse_handle handle;
rocsparse_create_handle(&handle);

// Create matrix descriptor
rocsparse_mat_descr descr;
rocsparse_create_mat_descr(&descr);
rocsparse_set_mat_fill_mode(descr, rocsparse_fill_mode_lower);
rocsparse_set_mat_diag_type(descr, rocsparse_diag_type_unit);

// Create matrix info structure
rocsparse_mat_info info;
rocsparse_create_mat_info(&info);

// Obtain required buffer size
size_t buffer_size;
rocsparse_dcsrsv_buffer_size(handle,
                             rocsparse_operation_none,
                             m,
                             nnz,
                             descr,
                             csr_val,
                             csr_row_ptr,
                             csr_col_ind,
                             info,
                             &buffer_size);

// Allocate temporary buffer
void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Perform analysis step
rocsparse_dcsrsv_analysis(handle,
                          rocsparse_operation_none,
                          m,
                          nnz,
                          descr,
                          csr_val,
                          csr_row_ptr,
                          csr_col_ind,
                          info,
                          rocsparse_analysis_policy_reuse,
                          rocsparse_solve_policy_auto,
                          temp_buffer);

// Scalar alpha; alpha = 1 solves L * y = x
double alpha = 1.0;

// Solve Ly = x
rocsparse_dcsrsv_solve(handle,
                       rocsparse_operation_none,
                       m,
                       nnz,
                       &alpha,
                       descr,
                       csr_val,
                       csr_row_ptr,
                       csr_col_ind,
                       info,
                       x,
                       y,
                       rocsparse_solve_policy_auto,
                       temp_buffer);

// No zero pivot should be found, with L having unit diagonal

// Clean up
hipFree(temp_buffer);
rocsparse_destroy_mat_info(info);
rocsparse_destroy_mat_descr(descr);
rocsparse_destroy_handle(handle);

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans: matrix operation type.

  • [in] m: number of rows of the sparse CSR matrix.

  • [in] nnz: number of non-zero entries of the sparse CSR matrix.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse CSR matrix.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix.

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix.

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix.

  • [in] info: structure that holds the information collected during the analysis step.

  • [in] x: array of m elements, holding the right-hand side.

  • [out] y: array of m elements, holding the solution.

  • [in] policy: rocsparse_solve_policy_auto.

  • [in] temp_buffer: temporary storage buffer allocated by the user.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m or nnz is invalid.

  • rocsparse_status_invalid_pointer: descr, alpha, csr_val, csr_row_ptr, csr_col_ind, x or y pointer is invalid.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrsv_clear()

rocsparse_status rocsparse_csrsv_clear(rocsparse_handle handle, const rocsparse_mat_descr descr, rocsparse_mat_info info)

Sparse triangular solve using CSR storage format.

rocsparse_csrsv_clear deallocates all memory that was allocated by rocsparse_scsrsv_analysis(), rocsparse_dcsrsv_analysis(), rocsparse_ccsrsv_analysis() or rocsparse_zcsrsv_analysis(). This is especially useful if memory is an issue and the analysis data is not required for further computation, e.g. when switching to another sparse matrix format. Calling rocsparse_csrsv_clear is optional. All allocated resources will be cleared when the opaque rocsparse_mat_info struct is destroyed using rocsparse_destroy_mat_info().

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] descr: descriptor of the sparse CSR matrix.

  • [inout] info: structure that holds the information collected during the analysis step.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info pointer is invalid.

  • rocsparse_status_memory_error: the buffer holding the meta data could not be deallocated.

  • rocsparse_status_internal_error: an internal error occurred.

rocsparse_ellmv()

rocsparse_status rocsparse_sellmv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, const float *alpha, const rocsparse_mat_descr descr, const float *ell_val, const rocsparse_int *ell_col_ind, rocsparse_int ell_width, const float *x, const float *beta, float *y)
rocsparse_status rocsparse_dellmv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, const double *alpha, const rocsparse_mat_descr descr, const double *ell_val, const rocsparse_int *ell_col_ind, rocsparse_int ell_width, const double *x, const double *beta, double *y)
rocsparse_status rocsparse_cellmv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *ell_val, const rocsparse_int *ell_col_ind, rocsparse_int ell_width, const rocsparse_float_complex *x, const rocsparse_float_complex *beta, rocsparse_float_complex *y)
rocsparse_status rocsparse_zellmv(rocsparse_handle handle, rocsparse_operation trans, rocsparse_int m, rocsparse_int n, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *ell_val, const rocsparse_int *ell_col_ind, rocsparse_int ell_width, const rocsparse_double_complex *x, const rocsparse_double_complex *beta, rocsparse_double_complex *y)

Sparse matrix vector multiplication using ELL storage format.

rocsparse_ellmv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in ELL storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y := \alpha \cdot op(A) \cdot x + \beta \cdot y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

for(i = 0; i < m; ++i)
{
    y[i] = beta * y[i];

    for(p = 0; p < ell_width; ++p)
    {
        idx = p * m + i;

        if((ell_col_ind[idx] >= 0) && (ell_col_ind[idx] < n))
        {
            y[i] = y[i] + alpha * ell_val[idx] * x[ell_col_ind[idx]];
        }
    }
}
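
As a rough illustration of the ELL layout that the loop above assumes (entries stored column by column, so that idx = p * m + i, with padded column indices set to -1), the following host-side sketch runs the same loop on a small matrix. ellmv_ref is a hypothetical helper and not part of rocSPARSE; zero-based indices are assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference of the loop above (hypothetical helper, not a rocSPARSE API).
// Example layout for m = 3, n = 5, ell_width = 3:
//
//     1 2 0 3 0        ell_col_ind = {0, 1, 0,   1, 2, 3,   3, -1, 4}
// A = 0 4 5 0 0        ell_val     = {1, 4, 6,   2, 5, 7,   3,  0, 8}
//     6 0 0 7 8
//
// Row 1 holds only two entries, so its third slot is padded (-1 / 0).
std::vector<double> ellmv_ref(int                        m,
                              int                        n,
                              double                     alpha,
                              const std::vector<double>& ell_val,
                              const std::vector<int>&    ell_col_ind,
                              int                        ell_width,
                              const std::vector<double>& x,
                              double                     beta,
                              std::vector<double>        y)
{
    const std::size_t ell_size = static_cast<std::size_t>(m) * ell_width;
    assert(ell_val.size() == ell_size && ell_col_ind.size() == ell_size);
    assert(x.size() >= static_cast<std::size_t>(n));
    assert(y.size() >= static_cast<std::size_t>(m));

    for(int i = 0; i < m; ++i)
    {
        y[i] = beta * y[i];

        for(int p = 0; p < ell_width; ++p)
        {
            int idx = p * m + i; // column-major ELL storage
            int col = ell_col_ind[idx];

            if(col >= 0 && col < n) // skips padded entries (col == -1)
            {
                y[i] += alpha * ell_val[idx] * x[col];
            }
        }
    }
    return y;
}
```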

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans == rocsparse_operation_none is supported.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans: matrix operation type.

  • [in] m: number of rows of the sparse ELL matrix.

  • [in] n: number of columns of the sparse ELL matrix.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse ELL matrix. Currently, only rocsparse_matrix_type_general is supported.

  • [in] ell_val: array that contains the elements of the sparse ELL matrix. Padded elements should be zero.

  • [in] ell_col_ind: array that contains the column indices of the sparse ELL matrix. Padded column indices should be -1.

  • [in] ell_width: number of non-zero elements per row of the sparse ELL matrix.

  • [in] x: array of n elements ( \(op(A) == A\)) or m elements ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • [in] beta: scalar \(\beta\).

  • [inout] y: array of m elements ( \(op(A) == A\)) or n elements ( \(op(A) == A^T\) or \(op(A) == A^H\)).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n or ell_width is invalid.

  • rocsparse_status_invalid_pointer: descr, alpha, ell_val, ell_col_ind, x, beta or y pointer is invalid.

  • rocsparse_status_not_implemented: trans != rocsparse_operation_none or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_hybmv()

rocsparse_status rocsparse_shybmv(rocsparse_handle handle, rocsparse_operation trans, const float *alpha, const rocsparse_mat_descr descr, const rocsparse_hyb_mat hyb, const float *x, const float *beta, float *y)
rocsparse_status rocsparse_dhybmv(rocsparse_handle handle, rocsparse_operation trans, const double *alpha, const rocsparse_mat_descr descr, const rocsparse_hyb_mat hyb, const double *x, const double *beta, double *y)
rocsparse_status rocsparse_chybmv(rocsparse_handle handle, rocsparse_operation trans, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_hyb_mat hyb, const rocsparse_float_complex *x, const rocsparse_float_complex *beta, rocsparse_float_complex *y)
rocsparse_status rocsparse_zhybmv(rocsparse_handle handle, rocsparse_operation trans, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_hyb_mat hyb, const rocsparse_double_complex *x, const rocsparse_double_complex *beta, rocsparse_double_complex *y)

Sparse matrix vector multiplication using HYB storage format.

rocsparse_hybmv multiplies the scalar \(\alpha\) with a sparse \(m \times n\) matrix, defined in HYB storage format, and the dense vector \(x\) and adds the result to the dense vector \(y\) that is multiplied by the scalar \(\beta\), such that

\[ y := \alpha \cdot op(A) \cdot x + \beta \cdot y, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans == rocsparse_operation_none} \\ A^T, & \text{if trans == rocsparse_operation_transpose} \\ A^H, & \text{if trans == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans == rocsparse_operation_none is supported.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans: matrix operation type.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse HYB matrix. Currently, only rocsparse_matrix_type_general is supported.

  • [in] hyb: matrix in HYB storage format.

  • [in] x: array of n elements ( \(op(A) == A\)) or m elements ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • [in] beta: scalar \(\beta\).

  • [inout] y: array of m elements ( \(op(A) == A\)) or n elements ( \(op(A) == A^T\) or \(op(A) == A^H\)).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: hyb structure was not initialized with valid matrix sizes.

  • rocsparse_status_invalid_pointer: descr, alpha, hyb, x, beta or y pointer is invalid.

  • rocsparse_status_invalid_value: hyb structure was not initialized with a valid partitioning type.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_memory_error: the buffer could not be allocated.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans != rocsparse_operation_none or rocsparse_matrix_type != rocsparse_matrix_type_general.

Sparse Level 3 Functions

This module holds all sparse level 3 routines.

The sparse level 3 routines describe operations between a matrix in sparse format and multiple vectors in dense format that can also be seen as a dense matrix.

rocsparse_bsrmm()

rocsparse_status rocsparse_sbsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const float *alpha, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, const float *B, rocsparse_int ldb, const float *beta, float *C, rocsparse_int ldc)
rocsparse_status rocsparse_dbsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const double *alpha, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, const double *B, rocsparse_int ldb, const double *beta, double *C, rocsparse_int ldc)
rocsparse_status rocsparse_cbsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, const rocsparse_float_complex *B, rocsparse_int ldb, const rocsparse_float_complex *beta, rocsparse_float_complex *C, rocsparse_int ldc)
rocsparse_status rocsparse_zbsrmm(rocsparse_handle handle, rocsparse_direction dir, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int mb, rocsparse_int n, rocsparse_int kb, rocsparse_int nnzb, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, const rocsparse_double_complex *B, rocsparse_int ldb, const rocsparse_double_complex *beta, rocsparse_double_complex *C, rocsparse_int ldc)

Sparse matrix dense matrix multiplication using BSR storage format.

rocsparse_bsrmm multiplies the scalar \(\alpha\) with a sparse \(mb \times kb\) matrix \(A\), defined in BSR storage format, and the dense \(k \times n\) matrix \(B\) (where \(k = block\_dim \times kb\)) and adds the result to the dense \(m \times n\) matrix \(C\) (where \(m = block\_dim \times mb\)) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ \end{array} \right. \end{split}\]

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans_A == rocsparse_operation_none is supported.

Example

This example multiplies a BSR matrix with a dense matrix.

//     1 2 0 3 0 0
// A = 0 4 5 0 0 0
//     0 0 0 7 8 0
//     0 0 1 2 4 1

rocsparse_int block_dim = 2;
rocsparse_int mb   = 2;
rocsparse_int kb   = 3;
rocsparse_int nnzb = 4;
rocsparse_direction dir = rocsparse_direction_row;

bsr_row_ptr[mb+1]                 = {0, 2, 4};                                        // device memory
bsr_col_ind[nnzb]                 = {0, 1, 1, 2};                                     // device memory
bsr_val[nnzb*block_dim*block_dim] = {1, 2, 0, 4, 0, 3, 5, 0, 0, 7, 1, 2, 8, 0, 4, 1}; // device memory

// Set dimension n of B
rocsparse_int n = 64;
rocsparse_int m = mb * block_dim;
rocsparse_int k = kb * block_dim;

// Allocate and generate dense matrix B
std::vector<float> hB(k * n);
for(rocsparse_int i = 0; i < k * n; ++i)
{
    hB[i] = static_cast<float>(rand()) / RAND_MAX;
}

// Copy B to the device
float* B;
hipMalloc((void**)&B, sizeof(float) * k * n);
hipMemcpy(B, hB.data(), sizeof(float) * k * n, hipMemcpyHostToDevice);

// alpha and beta
float alpha = 1.0f;
float beta  = 0.0f;

// Allocate memory for the resulting matrix C
float* C;
hipMalloc((void**)&C, sizeof(float) * m * n);

// Perform the matrix multiplication
rocsparse_sbsrmm(handle,
                 dir,
                 rocsparse_operation_none,
                 rocsparse_operation_none,
                 mb,
                 n,
                 kb,
                 nnzb,
                 &alpha,
                 descr,
                 bsr_val,
                 bsr_row_ptr,
                 bsr_col_ind,
                 block_dim,
                 B,
                 k,
                 &beta,
                 C,
                 m);
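
To sanity-check a device result, a host-side reference along the following lines may help. It is only a sketch under stated assumptions: bsrmm_ref is a hypothetical helper (not a rocSPARSE API), indices are zero-based, both operations are rocsparse_operation_none, the blocks are stored row by row as with rocsparse_direction_row in the example above, and the dense matrices are column-major with leading dimensions ldb and ldc.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for C := alpha * A * B + beta * C with A in BSR format.
// Hypothetical helper, not part of rocSPARSE. Assumes zero-based indices,
// row-major storage inside each block, and column-major dense B and C.
void bsrmm_ref(int                       mb,
               int                       n,
               int                       kb,
               int                       block_dim,
               float                     alpha,
               const std::vector<float>& bsr_val,
               const std::vector<int>&   bsr_row_ptr,
               const std::vector<int>&   bsr_col_ind,
               const std::vector<float>& B,
               int                       ldb,
               float                     beta,
               std::vector<float>&       C,
               int                       ldc)
{
    assert(ldb >= kb * block_dim && ldc >= mb * block_dim);
    assert(B.size() >= static_cast<std::size_t>(ldb) * n);
    assert(C.size() >= static_cast<std::size_t>(ldc) * n);

    for(int bi = 0; bi < mb; ++bi) // block row
    {
        for(int r = 0; r < block_dim; ++r) // row inside the block
        {
            int row = bi * block_dim + r;

            for(int j = 0; j < n; ++j)
            {
                float sum = 0.0f;

                for(int p = bsr_row_ptr[bi]; p < bsr_row_ptr[bi + 1]; ++p)
                {
                    int bj = bsr_col_ind[p]; // block column

                    for(int c = 0; c < block_dim; ++c)
                    {
                        float a = bsr_val[(p * block_dim + r) * block_dim + c];
                        sum += a * B[(bj * block_dim + c) + j * ldb];
                    }
                }
                C[row + j * ldc] = alpha * sum + beta * C[row + j * ldc];
            }
        }
    }
}
```

Copying C back to the host and comparing it entry by entry against this reference, with a small tolerance for floating point rounding, is one simple way to validate the call.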

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] dir: the storage format of the blocks. Can be rocsparse_direction_row or rocsparse_direction_column.

  • [in] trans_A: matrix \(A\) operation type. Currently, only rocsparse_operation_none is supported.

  • [in] trans_B: matrix \(B\) operation type. Currently, only rocsparse_operation_none and rocsparse_operation_transpose are supported.

  • [in] mb: number of block rows of the sparse BSR matrix \(A\).

  • [in] n: number of columns of the dense matrix \(op(B)\) and \(C\).

  • [in] kb: number of block columns of the sparse BSR matrix \(A\).

  • [in] nnzb: number of non-zero blocks of the sparse BSR matrix \(A\).

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse BSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] bsr_val: array of nnzb*block_dim*block_dim elements of the sparse BSR matrix \(A\).

  • [in] bsr_row_ptr: array of mb+1 elements that point to the start of every block row of the sparse BSR matrix \(A\).

  • [in] bsr_col_ind: array of nnzb elements containing the block column indices of the sparse BSR matrix \(A\).

  • [in] block_dim: size of the blocks in the sparse BSR matrix.

  • [in] B: array of dimension \(ldb \times n\) ( \(op(B) == B\)) or \(ldb \times k\) ( \(op(B) == B^T\)).

  • [in] ldb: leading dimension of \(B\), must be at least \(\max{(1, k)}\) where \(k = block\_dim \times kb\).

  • [in] beta: scalar \(\beta\).

  • [inout] C: array of dimension \(ldc \times n\).

  • [in] ldc: leading dimension of \(C\), must be at least \(\max{(1, m)}\) where \(m = block\_dim \times mb\).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: mb, n, kb, nnzb, ldb or ldc is invalid.

  • rocsparse_status_invalid_pointer: descr, alpha, bsr_val, bsr_row_ptr, bsr_col_ind, B, beta or C pointer is invalid.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_not_implemented: trans_A != rocsparse_operation_none or trans_B == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrmm()

rocsparse_status rocsparse_scsrmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const float *B, rocsparse_int ldb, const float *beta, float *C, rocsparse_int ldc)
rocsparse_status rocsparse_dcsrmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const double *B, rocsparse_int ldb, const double *beta, double *C, rocsparse_int ldc)
rocsparse_status rocsparse_ccsrmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_float_complex *B, rocsparse_int ldb, const rocsparse_float_complex *beta, rocsparse_float_complex *C, rocsparse_int ldc)
rocsparse_status rocsparse_zcsrmm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_double_complex *B, rocsparse_int ldb, const rocsparse_double_complex *beta, rocsparse_double_complex *C, rocsparse_int ldc)

Sparse matrix dense matrix multiplication using CSR storage format.

rocsparse_csrmm multiplies the scalar \(\alpha\) with a sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and the dense \(k \times n\) matrix \(B\) and adds the result to the dense \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

for(i = 0; i < m; ++i)
{
    for(j = 0; j < n; ++j)
    {
        C[i][j] = beta * C[i][j];

        for(p = csr_row_ptr[i]; p < csr_row_ptr[i + 1]; ++p)
        {
            C[i][j] += alpha * csr_val[p] * B[csr_col_ind[p]][j];
        }
    }
}

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans_A == rocsparse_operation_none is supported.

Example

This example multiplies a CSR matrix with a dense matrix.

//     1 2 0 3 0
// A = 0 4 5 0 0
//     6 0 0 7 8

rocsparse_int m   = 3;
rocsparse_int k   = 5;
rocsparse_int nnz = 8;

csr_row_ptr[m+1] = {0, 3, 5, 8};             // device memory
csr_col_ind[nnz] = {0, 1, 3, 1, 2, 0, 3, 4}; // device memory
csr_val[nnz]     = {1, 2, 3, 4, 5, 6, 7, 8}; // device memory

// Set dimension n of B
rocsparse_int n = 64;

// Allocate and generate dense matrix B
std::vector<float> hB(k * n);
for(rocsparse_int i = 0; i < k * n; ++i)
{
    hB[i] = static_cast<float>(rand()) / RAND_MAX;
}

// Copy B to the device
float* B;
hipMalloc((void**)&B, sizeof(float) * k * n);
hipMemcpy(B, hB.data(), sizeof(float) * k * n, hipMemcpyHostToDevice);

// alpha and beta
float alpha = 1.0f;
float beta  = 0.0f;

// Allocate memory for the resulting matrix C
float* C;
hipMalloc((void**)&C, sizeof(float) * m * n);

// Perform the matrix multiplication
rocsparse_scsrmm(handle,
                 rocsparse_operation_none,
                 rocsparse_operation_none,
                 m,
                 n,
                 k,
                 nnz,
                 &alpha,
                 descr,
                 csr_val,
                 csr_row_ptr,
                 csr_col_ind,
                 B,
                 k,
                 &beta,
                 C,
                 m);
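
The role of the leading dimensions ldb and ldc can be made concrete with a host-side sketch. It assumes column-major dense storage, so that entry (i, j) of C lives at C[i + j * ldc], zero-based indices, and both operations set to rocsparse_operation_none. csrmm_ref is a hypothetical helper and not part of rocSPARSE.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Host-side reference for C := alpha * A * B + beta * C with A in CSR format.
// Hypothetical helper, not part of rocSPARSE. Assumes zero-based indices and
// column-major dense B (ldb x n) and C (ldc x n).
void csrmm_ref(int                       m,
               int                       n,
               int                       k,
               float                     alpha,
               const std::vector<float>& csr_val,
               const std::vector<int>&   csr_row_ptr,
               const std::vector<int>&   csr_col_ind,
               const std::vector<float>& B,
               int                       ldb,
               float                     beta,
               std::vector<float>&       C,
               int                       ldc)
{
    assert(ldb >= k && ldc >= m);
    assert(csr_row_ptr.size() == static_cast<std::size_t>(m) + 1);
    assert(B.size() >= static_cast<std::size_t>(ldb) * n);
    assert(C.size() >= static_cast<std::size_t>(ldc) * n);

    for(int i = 0; i < m; ++i)
    {
        for(int j = 0; j < n; ++j)
        {
            float sum = 0.0f;

            for(int p = csr_row_ptr[i]; p < csr_row_ptr[i + 1]; ++p)
            {
                sum += csr_val[p] * B[csr_col_ind[p] + j * ldb];
            }
            // Rows m .. ldc-1 of each column are padding and stay untouched
            C[i + j * ldc] = alpha * sum + beta * C[i + j * ldc];
        }
    }
}
```

In the example above ldb equals k and ldc equals m, so there is no padding; larger leading dimensions only change the stride between consecutive columns.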

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans_A: matrix \(A\) operation type.

  • [in] trans_B: matrix \(B\) operation type.

  • [in] m: number of rows of the sparse CSR matrix \(A\).

  • [in] n: number of columns of the dense matrix \(op(B)\) and \(C\).

  • [in] k: number of columns of the sparse CSR matrix \(A\).

  • [in] nnz: number of non-zero entries of the sparse CSR matrix \(A\).

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse CSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix \(A\).

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix \(A\).

  • [in] B: array of dimension \(ldb \times n\) ( \(op(B) == B\)) or \(ldb \times k\) ( \(op(B) == B^T\) or \(op(B) == B^H\)).

  • [in] ldb: leading dimension of \(B\), must be at least \(\max{(1, k)}\) ( \(op(A) == A\)) or \(\max{(1, m)}\) ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • [in] beta: scalar \(\beta\).

  • [inout] C: array of dimension \(ldc \times n\).

  • [in] ldc: leading dimension of \(C\), must be at least \(\max{(1, m)}\) ( \(op(A) == A\)) or \(\max{(1, k)}\) ( \(op(A) == A^T\) or \(op(A) == A^H\)).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n, k, nnz, ldb or ldc is invalid.

  • rocsparse_status_invalid_pointer: descr, alpha, csr_val, csr_row_ptr, csr_col_ind, B, beta or C pointer is invalid.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_not_implemented: trans_A != rocsparse_operation_none or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrsm_zero_pivot()

rocsparse_status rocsparse_csrsm_zero_pivot(rocsparse_handle handle, rocsparse_mat_info info, rocsparse_int *position)

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_zero_pivot returns rocsparse_status_zero_pivot if either a structural or numerical zero has been found during the rocsparse_scsrsm_solve(), rocsparse_dcsrsm_solve(), rocsparse_ccsrsm_solve() or rocsparse_zcsrsm_solve() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the CSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and rocsparse_status_success is returned instead.

Note

rocsparse_csrsm_zero_pivot is a blocking function. It might influence performance negatively.
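
As a host-side illustration of what position refers to, the sketch below scans a CSR matrix for the first row whose diagonal entry is either missing (structural zero) or stored as zero (numerical zero). It is a simplified model under stated assumptions, not the library's implementation: the name first_zero_pivot is hypothetical, plain int replaces rocsparse_int, and the index base is passed explicitly as 0 or 1.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host model: returns the first j (expressed in the given index
// base) for which A(j,j) is absent or equal to zero, or -1 if there is none.
int first_zero_pivot(int m,
                     const std::vector<int>&    csr_row_ptr,
                     const std::vector<int>&    csr_col_ind,
                     const std::vector<double>& csr_val,
                     int base)
{
    assert(base == 0 || base == 1);

    for(int row = 0; row < m; ++row)
    {
        bool nonzero_diag = false;

        // Shift by the index base to obtain zero-based array positions
        for(int p = csr_row_ptr[row] - base; p < csr_row_ptr[row + 1] - base; ++p)
        {
            if(csr_col_ind[p] - base == row && csr_val[p] != 0.0)
            {
                nonzero_diag = true;
            }
        }

        if(!nonzero_diag)
        {
            return row + base;
        }
    }

    return -1;
}
```

For a 3 x 3 matrix whose second row has no diagonal entry, this model reports 1 with a zero-based matrix and 2 with a one-based matrix.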

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] info: structure that holds the information collected during the analysis step.

  • [inout] position: pointer to zero pivot \(j\), can be in host or device memory.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info or position pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_zero_pivot: zero pivot has been found.

rocsparse_csrsm_buffer_size()

rocsparse_status rocsparse_scsrsm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const float *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, size_t *buffer_size)
rocsparse_status rocsparse_dcsrsm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const double *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, size_t *buffer_size)
rocsparse_status rocsparse_ccsrsm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_float_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, size_t *buffer_size)
rocsparse_status rocsparse_zcsrsm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_double_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, size_t *buffer_size)

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_buffer_size returns the size of the temporary storage buffer that is required by rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis(), rocsparse_zcsrsm_analysis(), rocsparse_scsrsm_solve(), rocsparse_dcsrsm_solve(), rocsparse_ccsrsm_solve() and rocsparse_zcsrsm_solve(). The temporary storage buffer must be allocated by the user.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans_A: matrix A operation type.

  • [in] trans_B: matrix B operation type.

  • [in] m: number of rows of the sparse CSR matrix A.

  • [in] nrhs: number of columns of the dense matrix op(B).

  • [in] nnz: number of non-zero entries of the sparse CSR matrix A.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse CSR matrix A.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix A.

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix A.

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix A.

  • [in] B: array of m \(\times\) nrhs elements of the rhs matrix B.

  • [in] ldb: leading dimension of rhs matrix B.

  • [in] info: structure that holds the information collected during the analysis step.

  • [in] policy: rocsparse_solve_policy_auto.

  • [out] buffer_size: number of bytes of the temporary storage buffer required by rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis(), rocsparse_zcsrsm_analysis(), rocsparse_scsrsm_solve(), rocsparse_dcsrsm_solve(), rocsparse_ccsrsm_solve() and rocsparse_zcsrsm_solve().

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, nrhs or nnz is invalid.

  • rocsparse_status_invalid_pointer: alpha, descr, csr_val, csr_row_ptr, csr_col_ind, B, info or buffer_size pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans_A != rocsparse_operation_none, trans_B == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrsm_analysis()

rocsparse_status rocsparse_scsrsm_analysis(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const float *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_dcsrsm_analysis(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const double *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_ccsrsm_analysis(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_float_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_zcsrsm_analysis(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_double_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_analysis performs the analysis step for rocsparse_scsrsm_solve(), rocsparse_dcsrsm_solve(), rocsparse_ccsrsm_solve() and rocsparse_zcsrsm_solve(). It is expected that this function will be executed only once for a given matrix and particular operation type. The analysis meta data can be cleared by rocsparse_csrsm_clear().

rocsparse_csrsm_analysis can share its meta data with rocsparse_scsrilu0_analysis(), rocsparse_dcsrilu0_analysis(), rocsparse_ccsrilu0_analysis(), rocsparse_zcsrilu0_analysis(), rocsparse_scsric0_analysis(), rocsparse_dcsric0_analysis(), rocsparse_ccsric0_analysis(), rocsparse_zcsric0_analysis(), rocsparse_scsrsv_analysis(), rocsparse_dcsrsv_analysis(), rocsparse_ccsrsv_analysis() and rocsparse_zcsrsv_analysis(). Selecting the rocsparse_analysis_policy_reuse policy can greatly improve the computation performance of the meta data. However, the user needs to make sure that the sparsity pattern remains unchanged. If this cannot be assured, rocsparse_analysis_policy_force has to be used.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.
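
Whether previously gathered analysis data is still usable depends only on the sparsity pattern, not on the values. As a rough host-side sketch (the helper same_sparsity_pattern is hypothetical and not part of rocSPARSE), two CSR matrices with the same index base and sorted column indices share a pattern when their row pointer and column index arrays are identical:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: compares the structure of two CSR matrices.
// Assumes both use the same index base and sorted column indices.
bool same_sparsity_pattern(const std::vector<int>& row_ptr_a,
                           const std::vector<int>& col_ind_a,
                           const std::vector<int>& row_ptr_b,
                           const std::vector<int>& col_ind_b)
{
    // A valid CSR row pointer array holds at least one entry (m + 1 >= 1)
    assert(!row_ptr_a.empty() && !row_ptr_b.empty());

    return row_ptr_a == row_ptr_b && col_ind_a == col_ind_b;
}
```

If such a check fails, the analysis has to be repeated with rocsparse_analysis_policy_force, as described above.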

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans_A: matrix A operation type.

  • [in] trans_B: matrix B operation type.

  • [in] m: number of rows of the sparse CSR matrix A.

  • [in] nrhs: number of columns of the dense matrix op(B).

  • [in] nnz: number of non-zero entries of the sparse CSR matrix A.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse CSR matrix A.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix A.

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix A.

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix A.

  • [in] B: array of m \(\times\) nrhs elements of the rhs matrix B.

  • [in] ldb: leading dimension of rhs matrix B.

  • [out] info: structure that holds the information collected during the analysis step.

  • [in] analysis: rocsparse_analysis_policy_reuse or rocsparse_analysis_policy_force.

  • [in] solve: rocsparse_solve_policy_auto.

  • [in] temp_buffer: temporary storage buffer allocated by the user.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, nrhs or nnz is invalid.

  • rocsparse_status_invalid_pointer: alpha, descr, csr_val, csr_row_ptr, csr_col_ind, B, info or temp_buffer pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans_A != rocsparse_operation_none, trans_B == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrsm_solve()

rocsparse_status rocsparse_scsrsm_solve(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const float *alpha, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, float *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_dcsrsm_solve(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const double *alpha, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, double *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_ccsrsm_solve(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_float_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_zcsrsm_solve(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int nrhs, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, rocsparse_double_complex *B, rocsparse_int ldb, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_solve solves a sparse triangular linear system of a sparse \(m \times m\) matrix, defined in CSR storage format, a dense solution matrix \(X\) and the right-hand side matrix \(B\) that is multiplied by \(\alpha\), such that

\[ op(A) \cdot op(X) = \alpha \cdot op(B), \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
,
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(X) = \left\{ \begin{array}{ll} X, & \text{if trans_B == rocsparse_operation_none} \\ X^T, & \text{if trans_B == rocsparse_operation_transpose} \\ X^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

rocsparse_csrsm_solve requires a user allocated temporary buffer. Its size is returned by rocsparse_scsrsm_buffer_size(), rocsparse_dcsrsm_buffer_size(), rocsparse_ccsrsm_buffer_size() or rocsparse_zcsrsm_buffer_size(). Furthermore, analysis meta data is required. It can be obtained by rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis() or rocsparse_zcsrsm_analysis(). rocsparse_csrsm_solve reports the first zero pivot (either numerical or structural zero). The zero pivot status can be checked by calling rocsparse_csrsm_zero_pivot(). If rocsparse_diag_type == rocsparse_diag_type_unit, no zero pivot will be reported, even if \(A_{j,j} = 0\) for some \(j\).

Note

The sparse CSR matrix has to be sorted. This can be achieved by calling rocsparse_csrsort().
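
Before calling the solver, the ordering can be verified on the host. The helper below is a hypothetical sketch (csr_is_sorted is not a rocSPARSE function) that assumes zero-based indexing and treats a row as sorted when its column indices are in ascending order:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host check: true if the column indices of every row are in
// ascending order. Assumes zero-based indexing.
bool csr_is_sorted(int m,
                   const std::vector<int>& csr_row_ptr,
                   const std::vector<int>& csr_col_ind)
{
    assert(csr_row_ptr.size() == static_cast<std::size_t>(m) + 1);

    for(int i = 0; i < m; ++i)
    {
        // Compare each entry of row i with its predecessor in the same row
        for(int p = csr_row_ptr[i] + 1; p < csr_row_ptr[i + 1]; ++p)
        {
            if(csr_col_ind[p - 1] > csr_col_ind[p])
            {
                return false;
            }
        }
    }

    return true;
}
```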

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only trans_A == rocsparse_operation_none and trans_B != rocsparse_operation_conjugate_transpose are supported.

Example

Consider the lower triangular \(m \times m\) matrix \(L\), stored in CSR storage format with unit diagonal. The following example solves \(L \cdot X = B\).

// Create rocSPARSE handle
rocsparse_handle handle;
rocsparse_create_handle(&handle);

// Create matrix descriptor
rocsparse_mat_descr descr;
rocsparse_create_mat_descr(&descr);
rocsparse_set_mat_fill_mode(descr, rocsparse_fill_mode_lower);
rocsparse_set_mat_diag_type(descr, rocsparse_diag_type_unit);

// Create matrix info structure
rocsparse_mat_info info;
rocsparse_create_mat_info(&info);

// Obtain required buffer size
size_t buffer_size;
rocsparse_dcsrsm_buffer_size(handle,
                             rocsparse_operation_none,
                             rocsparse_operation_none,
                             m,
                             nrhs,
                             nnz,
                             &alpha,
                             descr,
                             csr_val,
                             csr_row_ptr,
                             csr_col_ind,
                             B,
                             ldb,
                             info,
                             rocsparse_solve_policy_auto,
                             &buffer_size);

// Allocate temporary buffer
void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Perform analysis step
rocsparse_dcsrsm_analysis(handle,
                          rocsparse_operation_none,
                          rocsparse_operation_none,
                          m,
                          nrhs,
                          nnz,
                          &alpha,
                          descr,
                          csr_val,
                          csr_row_ptr,
                          csr_col_ind,
                          B,
                          ldb,
                          info,
                          rocsparse_analysis_policy_reuse,
                          rocsparse_solve_policy_auto,
                          temp_buffer);

// Solve LX = B
rocsparse_dcsrsm_solve(handle,
                       rocsparse_operation_none,
                       rocsparse_operation_none,
                       m,
                       nrhs,
                       nnz,
                       &alpha,
                       descr,
                       csr_val,
                       csr_row_ptr,
                       csr_col_ind,
                       B,
                       ldb,
                       info,
                       rocsparse_solve_policy_auto,
                       temp_buffer);

// No zero pivot should be found, with L having unit diagonal

// Clean up
hipFree(temp_buffer);
rocsparse_destroy_mat_info(info);
rocsparse_destroy_mat_descr(descr);
rocsparse_destroy_handle(handle);
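
To check the solution, the system can also be solved by forward substitution on the host. The following is a minimal sketch under stated assumptions, not the algorithm used by rocSPARSE: host_csrsm_lower is a hypothetical name, plain int replaces rocsparse_int, and the code assumes zero-based CSR, lower fill mode, no transposition and a column-major B that is overwritten with X.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host reference for L * X = alpha * B, with X overwriting B.
// Entries above the diagonal are ignored. Returns the row of the first zero
// pivot (in which case B is left partially updated), or -1 on success.
int host_csrsm_lower(int m, int nrhs, double alpha,
                     const std::vector<double>& csr_val,
                     const std::vector<int>&    csr_row_ptr,
                     const std::vector<int>&    csr_col_ind,
                     std::vector<double>& B, int ldb,
                     bool unit_diag)
{
    assert(ldb >= m);

    for(int j = 0; j < nrhs; ++j)
    {
        for(int i = 0; i < m; ++i)
        {
            double sum  = alpha * B[i + j * ldb];
            double diag = unit_diag ? 1.0 : 0.0;

            for(int p = csr_row_ptr[i]; p < csr_row_ptr[i + 1]; ++p)
            {
                int col = csr_col_ind[p];
                if(col < i)
                {
                    // Rows above i of this column already hold the solution
                    sum -= csr_val[p] * B[col + j * ldb];
                }
                else if(col == i && !unit_diag)
                {
                    diag = csr_val[p];
                }
            }

            if(diag == 0.0)
            {
                return i; // structural or numerical zero on the diagonal
            }

            B[i + j * ldb] = sum / diag;
        }
    }

    return -1;
}
```

With a unit diagonal the stored diagonal values are never read, which mirrors the statement that no zero pivot is reported in that case.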

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans_A: matrix A operation type.

  • [in] trans_B: matrix B operation type.

  • [in] m: number of rows of the sparse CSR matrix A.

  • [in] nrhs: number of columns of the dense matrix op(B).

  • [in] nnz: number of non-zero entries of the sparse CSR matrix A.

  • [in] alpha: scalar \(\alpha\).

  • [in] descr: descriptor of the sparse CSR matrix A.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix A.

  • [in] csr_row_ptr: array of m+1 elements that point to the start of every row of the sparse CSR matrix A.

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix A.

  • [inout] B: array of m \(\times\) nrhs elements of the rhs matrix B.

  • [in] ldb: leading dimension of rhs matrix B.

  • [in] info: structure that holds the information collected during the analysis step.

  • [in] policy: rocsparse_solve_policy_auto.

  • [in] temp_buffer: temporary storage buffer allocated by the user.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, nrhs or nnz is invalid.

  • rocsparse_status_invalid_pointer: alpha, descr, csr_val, csr_row_ptr, csr_col_ind, B, info or temp_buffer pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: trans_A != rocsparse_operation_none, trans_B == rocsparse_operation_conjugate_transpose or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrsm_clear()

rocsparse_status rocsparse_csrsm_clear(rocsparse_handle handle, rocsparse_mat_info info)

Sparse triangular system solve using CSR storage format.

rocsparse_csrsm_clear deallocates all memory that was allocated by rocsparse_scsrsm_analysis(), rocsparse_dcsrsm_analysis(), rocsparse_ccsrsm_analysis() or rocsparse_zcsrsm_analysis(). This is especially useful if memory is an issue and the analysis data is not required for further computation, e.g. when switching to another sparse matrix format. Calling rocsparse_csrsm_clear is optional. All allocated resources will be cleared when the opaque rocsparse_mat_info struct is destroyed using rocsparse_destroy_mat_info().

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [inout] info: structure that holds the information collected during the analysis step.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info pointer is invalid.

  • rocsparse_status_memory_error: the buffer holding the meta data could not be deallocated.

  • rocsparse_status_internal_error: an internal error occurred.

rocsparse_gemmi()

rocsparse_status rocsparse_sgemmi(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const float *alpha, const float *A, rocsparse_int lda, const rocsparse_mat_descr descr, const float *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const float *beta, float *C, rocsparse_int ldc)
rocsparse_status rocsparse_dgemmi(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const double *alpha, const double *A, rocsparse_int lda, const rocsparse_mat_descr descr, const double *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const double *beta, double *C, rocsparse_int ldc)
rocsparse_status rocsparse_cgemmi(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const rocsparse_float_complex *alpha, const rocsparse_float_complex *A, rocsparse_int lda, const rocsparse_mat_descr descr, const rocsparse_float_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_float_complex *beta, rocsparse_float_complex *C, rocsparse_int ldc)
rocsparse_status rocsparse_zgemmi(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, rocsparse_int nnz, const rocsparse_double_complex *alpha, const rocsparse_double_complex *A, rocsparse_int lda, const rocsparse_mat_descr descr, const rocsparse_double_complex *csr_val, const rocsparse_int *csr_row_ptr, const rocsparse_int *csr_col_ind, const rocsparse_double_complex *beta, rocsparse_double_complex *C, rocsparse_int ldc)

Dense matrix sparse matrix multiplication using CSR storage format.

rocsparse_gemmi multiplies the scalar \(\alpha\) with a dense \(m \times k\) matrix \(A\) and the sparse \(k \times n\) matrix \(B\), defined in CSR storage format and adds the result to the dense \(m \times n\) matrix \(C\) that is multiplied by the scalar \(\beta\), such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot C \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Example

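This example multiplies a dense matrix with a sparse matrix stored in CSC format. The CSC arrays of \(B\) are identical to the CSR arrays of \(B^T\), so they are passed as the CSR arguments and trans_B is set to rocsparse_operation_transpose.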

rocsparse_int m   = 2;
rocsparse_int n   = 5;
rocsparse_int k   = 3;
rocsparse_int nnz = 8;
rocsparse_int lda = m;
rocsparse_int ldc = m;

// Matrix A (m x k)
// (  9.0  10.0  11.0 )
// ( 12.0  13.0  14.0 )

// Matrix B (k x n)
// ( 1.0  2.0  0.0  3.0  0.0 )
// ( 0.0  4.0  5.0  0.0  0.0 )
// ( 6.0  0.0  0.0  7.0  8.0 )

// Matrix C (m x n)
// ( 15.0  16.0  17.0  18.0  19.0 )
// ( 20.0  21.0  22.0  23.0  24.0 )

A[lda * k]           = {9.0, 12.0, 10.0, 13.0, 11.0, 14.0};      // device memory
csc_col_ptr_B[n + 1] = {0, 2, 4, 5, 7, 8};                       // device memory
csc_row_ind_B[nnz]   = {0, 2, 0, 1, 1, 0, 2, 2};                 // device memory
csc_val_B[nnz]       = {1.0, 6.0, 2.0, 4.0, 5.0, 3.0, 7.0, 8.0}; // device memory
C[ldc * n]           = {15.0, 20.0, 16.0, 21.0, 17.0, 22.0,      // device memory
                        18.0, 23.0, 19.0, 24.0};

// alpha and beta
float alpha = 1.0f;
float beta  = 0.0f;

// Perform the matrix multiplication
rocsparse_sgemmi(handle,
                 rocsparse_operation_none,
                 rocsparse_operation_transpose,
                 m,
                 n,
                 k,
                 nnz,
                 &alpha,
                 A,
                 lda,
                 descr_B,
                 csc_val_B,
                 csc_col_ptr_B,
                 csc_row_ind_B,
                 &beta,
                 C,
                 ldc);
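
The expected content of C can be worked out on the host. The sketch below is an illustration only: host_gemmi_csc and demo_gemmi are hypothetical names, plain int replaces rocsparse_int, and the sparse operand is read directly as zero-based CSC with row indices taken from the matrix B printed above. The dense matrices are column-major, as in the example.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host reference for C := alpha * A * B + beta * C, where
// A (lda x k) and C (ldc x n) are dense column-major and B (k x n) is given
// by zero-based CSC arrays.
void host_gemmi_csc(int m, int n, float alpha,
                    const std::vector<float>& A, int lda,
                    const std::vector<float>& csc_val,
                    const std::vector<int>&   csc_col_ptr,
                    const std::vector<int>&   csc_row_ind,
                    float beta, std::vector<float>& C, int ldc)
{
    assert(lda >= m && ldc >= m);

    for(int j = 0; j < n; ++j)
    {
        for(int i = 0; i < m; ++i)
        {
            // Row i of A times sparse column j of B
            float sum = 0.0f;
            for(int p = csc_col_ptr[j]; p < csc_col_ptr[j + 1]; ++p)
            {
                sum += A[i + csc_row_ind[p] * lda] * csc_val[p];
            }

            float old = (beta == 0.0f) ? 0.0f : beta * C[i + j * ldc];
            C[i + j * ldc] = alpha * sum + old;
        }
    }
}

// Reproduces the data of the example (m = 2, n = 5, k = 3, nnz = 8)
std::vector<float> demo_gemmi()
{
    std::vector<float> A       = {9.0f, 12.0f, 10.0f, 13.0f, 11.0f, 14.0f};
    std::vector<int>   col_ptr = {0, 2, 4, 5, 7, 8};
    std::vector<int>   row_ind = {0, 2, 0, 1, 1, 0, 2, 2};
    std::vector<float> val     = {1.0f, 6.0f, 2.0f, 4.0f, 5.0f, 3.0f, 7.0f, 8.0f};
    std::vector<float> C       = {15.0f, 20.0f, 16.0f, 21.0f, 17.0f,
                                  22.0f, 18.0f, 23.0f, 19.0f, 24.0f};

    host_gemmi_csc(2, 5, 1.0f, A, 2, val, col_ptr, row_ind, 0.0f, C, 2);
    return C;
}
```

Because beta is zero, the initial values of C do not contribute, and the column-major result is {75, 96, 58, 76, 50, 65, 104, 134, 88, 112}.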

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans_A: matrix \(A\) operation type.

  • [in] trans_B: matrix \(B\) operation type.

  • [in] m: number of rows of the dense matrix \(A\).

  • [in] n: number of columns of the sparse CSR matrix \(op(B)\) and \(C\).

  • [in] k: number of columns of the dense matrix \(A\).

  • [in] nnz: number of non-zero entries of the sparse CSR matrix \(B\).

  • [in] alpha: scalar \(\alpha\).

  • [in] A: array of dimension \(lda \times k\) ( \(op(A) == A\)) or \(lda \times m\) ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • [in] lda: leading dimension of \(A\), must be at least \(m\) ( \(op(A) == A\)) or \(k\) ( \(op(A) == A^T\) or \(op(A) == A^H\)).

  • [in] descr: descriptor of the sparse CSR matrix \(B\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] csr_val: array of nnz elements of the sparse CSR matrix \(B\).

  • [in] csr_row_ptr: array of k+1 elements ( \(op(B) == B\)) or n+1 elements ( \(op(B) == B^T\) or \(op(B) == B^H\)) that point to the start of every row of the sparse CSR matrix \(B\).

  • [in] csr_col_ind: array of nnz elements containing the column indices of the sparse CSR matrix \(B\).

  • [in] beta: scalar \(\beta\).

  • [inout] C: array of dimension \(ldc \times n\) that holds the values of \(C\).

  • [in] ldc: leading dimension of \(C\), must be at least \(m\).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n, k, nnz, lda or ldc is invalid.

  • rocsparse_status_invalid_pointer: alpha, A, csr_val, csr_row_ptr, csr_col_ind, beta or C pointer is invalid.

Sparse Extra Functions

This module holds all sparse extra routines.

The sparse extra routines describe operations that manipulate sparse matrices.

rocsparse_csrgeam_nnz()

rocsparse_status rocsparse_csrgeam_nnz(rocsparse_handle handle, rocsparse_int m, rocsparse_int n, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_mat_descr descr_C, rocsparse_int *csr_row_ptr_C, rocsparse_int *nnz_C)

Sparse matrix sparse matrix addition using CSR storage format.

rocsparse_csrgeam_nnz computes the total number of non-zero elements of the resulting sparse CSR matrix C and its CSR row offsets, which point to the start of every row of C. It is assumed that csr_row_ptr_C has been allocated with size m + 1.

Note

This function is non blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

Currently, only rocsparse_matrix_type_general is supported.
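
The result of this step can be pictured as a row-wise union of the two sparsity patterns. The following host sketch is a hypothetical model, not the library's implementation: host_csrgeam_nnz is an invented name, plain int replaces rocsparse_int, and it assumes zero-based CSR with sorted, duplicate-free column indices in each row.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical host model of the symbolic phase of C = alpha * A + beta * B.
// Fills row_ptr_C (m + 1 entries) and returns the number of non-zeros of C.
int host_csrgeam_nnz(int m,
                     const std::vector<int>& row_ptr_A,
                     const std::vector<int>& col_ind_A,
                     const std::vector<int>& row_ptr_B,
                     const std::vector<int>& col_ind_B,
                     std::vector<int>& row_ptr_C)
{
    assert(row_ptr_A.size() == static_cast<std::size_t>(m) + 1);
    assert(row_ptr_B.size() == static_cast<std::size_t>(m) + 1);

    row_ptr_C.assign(m + 1, 0);

    for(int i = 0; i < m; ++i)
    {
        int a = row_ptr_A[i], a_end = row_ptr_A[i + 1];
        int b = row_ptr_B[i], b_end = row_ptr_B[i + 1];
        int count = 0;

        // Merge the two sorted rows, counting shared columns only once
        while(a < a_end && b < b_end)
        {
            int ca = col_ind_A[a];
            int cb = col_ind_B[b];
            if(ca <= cb) ++a;
            if(cb <= ca) ++b;
            ++count;
        }

        // Whatever remains in either row has no counterpart in the other
        count += (a_end - a) + (b_end - b);

        row_ptr_C[i + 1] = row_ptr_C[i] + count;
    }

    return row_ptr_C[m];
}
```

The returned value is the size needed for the column index and value arrays of C.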

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] m: number of rows of the sparse CSR matrix \(A\), \(B\) and \(C\).

  • [in] n: number of columns of the sparse CSR matrix \(A\), \(B\) and \(C\).

  • [in] descr_A: descriptor of the sparse CSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_A: number of non-zero entries of the sparse CSR matrix \(A\).

  • [in] csr_row_ptr_A: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • [in] csr_col_ind_A: array of nnz_A elements containing the column indices of the sparse CSR matrix \(A\).

  • [in] descr_B: descriptor of the sparse CSR matrix \(B\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_B: number of non-zero entries of the sparse CSR matrix \(B\).

  • [in] csr_row_ptr_B: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(B\).

  • [in] csr_col_ind_B: array of nnz_B elements containing the column indices of the sparse CSR matrix \(B\).

  • [in] descr_C: descriptor of the sparse CSR matrix \(C\). Currently, only rocsparse_matrix_type_general is supported.

  • [out] csr_row_ptr_C: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(C\).

  • [out] nnz_C: pointer to the number of non-zero entries of the sparse CSR matrix \(C\). nnz_C can be a host or device pointer.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n, nnz_A or nnz_B is invalid.

  • rocsparse_status_invalid_pointer: descr_A, csr_row_ptr_A, csr_col_ind_A, descr_B, csr_row_ptr_B, csr_col_ind_B, descr_C, csr_row_ptr_C or nnz_C is invalid.

  • rocsparse_status_not_implemented: rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrgeam()

rocsparse_status rocsparse_scsrgeam(rocsparse_handle handle, rocsparse_int m, rocsparse_int n, const float *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const float *csr_val_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const float *beta, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const float *csr_val_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_mat_descr descr_C, float *csr_val_C, const rocsparse_int *csr_row_ptr_C, rocsparse_int *csr_col_ind_C)
rocsparse_status rocsparse_dcsrgeam(rocsparse_handle handle, rocsparse_int m, rocsparse_int n, const double *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const double *csr_val_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const double *beta, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const double *csr_val_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_mat_descr descr_C, double *csr_val_C, const rocsparse_int *csr_row_ptr_C, rocsparse_int *csr_col_ind_C)
rocsparse_status rocsparse_ccsrgeam(rocsparse_handle handle, rocsparse_int m, rocsparse_int n, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_float_complex *csr_val_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_float_complex *beta, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_float_complex *csr_val_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_mat_descr descr_C, rocsparse_float_complex *csr_val_C, const rocsparse_int *csr_row_ptr_C, rocsparse_int *csr_col_ind_C)
rocsparse_status rocsparse_zcsrgeam(rocsparse_handle handle, rocsparse_int m, rocsparse_int n, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_double_complex *csr_val_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_double_complex *beta, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_double_complex *csr_val_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_mat_descr descr_C, rocsparse_double_complex *csr_val_C, const rocsparse_int *csr_row_ptr_C, rocsparse_int *csr_col_ind_C)

Sparse matrix-sparse matrix addition using CSR storage format.

rocsparse_csrgeam multiplies the scalar \(\alpha\) with the sparse \(m \times n\) matrix \(A\), defined in CSR storage format, multiplies the scalar \(\beta\) with the sparse \(m \times n\) matrix \(B\), defined in CSR storage format, and adds both resulting matrices to obtain the sparse \(m \times n\) matrix \(C\), defined in CSR storage format, such that

\[ C := \alpha \cdot A + \beta \cdot B. \]

It is assumed that csr_row_ptr_C has already been filled and that csr_val_C and csr_col_ind_C have been allocated by the user. The contents of csr_row_ptr_C and the allocation size of csr_col_ind_C and csr_val_C are determined by the number of non-zero elements of the sparse CSR matrix C. Both can be obtained from rocsparse_csrgeam_nnz().

Note

Both scalars \(\alpha\) and \(\beta\) have to be valid.

Note

Currently, only rocsparse_matrix_type_general is supported.

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Example

This example adds two CSR matrices.

// Initialize scalar multipliers
float alpha = 1.0f;
float beta  = 1.0f;

// Create matrix descriptors
rocsparse_mat_descr descr_A;
rocsparse_mat_descr descr_B;
rocsparse_mat_descr descr_C;

rocsparse_create_mat_descr(&descr_A);
rocsparse_create_mat_descr(&descr_B);
rocsparse_create_mat_descr(&descr_C);

// Set pointer mode
rocsparse_set_pointer_mode(handle, rocsparse_pointer_mode_host);

// Obtain the total number of non-zero entries in C and the row pointers of C
rocsparse_int  nnz_C;
rocsparse_int* csr_row_ptr_C;
hipMalloc((void**)&csr_row_ptr_C, sizeof(rocsparse_int) * (m + 1));

rocsparse_csrgeam_nnz(handle,
                      m,
                      n,
                      descr_A,
                      nnz_A,
                      csr_row_ptr_A,
                      csr_col_ind_A,
                      descr_B,
                      nnz_B,
                      csr_row_ptr_B,
                      csr_col_ind_B,
                      descr_C,
                      csr_row_ptr_C,
                      &nnz_C);

// Allocate and compute column indices and values of C
rocsparse_int* csr_col_ind_C;
float*         csr_val_C;
hipMalloc((void**)&csr_col_ind_C, sizeof(rocsparse_int) * nnz_C);
hipMalloc((void**)&csr_val_C, sizeof(float) * nnz_C);

rocsparse_scsrgeam(handle,
                   m,
                   n,
                   &alpha,
                   descr_A,
                   nnz_A,
                   csr_val_A,
                   csr_row_ptr_A,
                   csr_col_ind_A,
                   &beta,
                   descr_B,
                   nnz_B,
                   csr_val_B,
                   csr_row_ptr_B,
                   csr_col_ind_B,
                   descr_C,
                   csr_val_C,
                   csr_row_ptr_C,
                   csr_col_ind_C);
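
For validating results on small inputs, the same operation can be sketched on the host. The following is a minimal CPU reference, not part of rocSPARSE: the HostCsr struct and the host_csrgeam function are hypothetical names introduced for illustration. The sketch assumes zero-based indexing and column indices sorted within each row, and it keeps the union of both sparsity patterns, which is an assumption about how the pattern of C is formed.

```cpp
#include <cassert>
#include <vector>

// Host-side CSR container used only for this sketch (not a rocSPARSE type).
struct HostCsr
{
    int                m = 0;   // number of rows
    int                n = 0;   // number of columns
    std::vector<int>   row_ptr; // m + 1 entries, zero-based
    std::vector<int>   col_ind; // nnz entries, sorted within each row
    std::vector<float> val;     // nnz entries
};

// Host reference for C := alpha * A + beta * B.
HostCsr host_csrgeam(float alpha, const HostCsr& A, float beta, const HostCsr& B)
{
    assert(A.m == B.m && A.n == B.n);

    HostCsr C;
    C.m = A.m;
    C.n = A.n;
    C.row_ptr.assign(A.m + 1, 0);

    for(int i = 0; i < A.m; ++i)
    {
        int a     = A.row_ptr[i];
        int a_end = A.row_ptr[i + 1];
        int b     = B.row_ptr[i];
        int b_end = B.row_ptr[i + 1];

        // Merge the two sorted rows; equal columns are combined into one entry.
        while(a < a_end || b < b_end)
        {
            bool take_a = (b == b_end) || (a < a_end && A.col_ind[a] <= B.col_ind[b]);
            bool take_b = (a == a_end) || (b < b_end && B.col_ind[b] <= A.col_ind[a]);

            int   col = take_a ? A.col_ind[a] : B.col_ind[b];
            float v   = 0.0f;
            if(take_a)
                v += alpha * A.val[a++];
            if(take_b)
                v += beta * B.val[b++];

            C.col_ind.push_back(col);
            C.val.push_back(v);
        }

        // Row offsets of C; the last entry equals the number of non-zeros of C.
        C.row_ptr[i + 1] = static_cast<int>(C.col_ind.size());
    }
    return C;
}
```

In this sketch C.row_ptr and C.col_ind.size() play the roles of csr_row_ptr_C and nnz_C. On the host a growing vector lets the counting stage and the fill stage collapse into a single pass, whereas the device API separates them so that the user can allocate csr_col_ind_C and csr_val_C in between.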

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] m: number of rows of the sparse CSR matrices \(A\), \(B\) and \(C\).

  • [in] n: number of columns of the sparse CSR matrices \(A\), \(B\) and \(C\).

  • [in] alpha: scalar \(\alpha\).

  • [in] descr_A: descriptor of the sparse CSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_A: number of non-zero entries of the sparse CSR matrix \(A\).

  • [in] csr_val_A: array of nnz_A elements of the sparse CSR matrix \(A\).

  • [in] csr_row_ptr_A: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(A\).

  • [in] csr_col_ind_A: array of nnz_A elements containing the column indices of the sparse CSR matrix \(A\).

  • [in] beta: scalar \(\beta\).

  • [in] descr_B: descriptor of the sparse CSR matrix \(B\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_B: number of non-zero entries of the sparse CSR matrix \(B\).

  • [in] csr_val_B: array of nnz_B elements of the sparse CSR matrix \(B\).

  • [in] csr_row_ptr_B: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(B\).

  • [in] csr_col_ind_B: array of nnz_B elements containing the column indices of the sparse CSR matrix \(B\).

  • [in] descr_C: descriptor of the sparse CSR matrix \(C\). Currently, only rocsparse_matrix_type_general is supported.

  • [out] csr_val_C: array of nnz_C elements of the sparse CSR matrix \(C\).

  • [in] csr_row_ptr_C: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(C\).

  • [out] csr_col_ind_C: array of nnz_C elements containing the column indices of the sparse CSR matrix \(C\).

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n, nnz_A or nnz_B is invalid.

  • rocsparse_status_invalid_pointer: alpha, descr_A, csr_val_A, csr_row_ptr_A, csr_col_ind_A, beta, descr_B, csr_val_B, csr_row_ptr_B, csr_col_ind_B, descr_C, csr_val_C, csr_row_ptr_C or csr_col_ind_C is invalid.

  • rocsparse_status_not_implemented: rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrgemm_buffer_size()

rocsparse_status rocsparse_scsrgemm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const float *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const float *beta, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, rocsparse_mat_info info_C, size_t *buffer_size)
rocsparse_status rocsparse_dcsrgemm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const double *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const double *beta, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, rocsparse_mat_info info_C, size_t *buffer_size)
rocsparse_status rocsparse_ccsrgemm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_float_complex *beta, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, rocsparse_mat_info info_C, size_t *buffer_size)
rocsparse_status rocsparse_zcsrgemm_buffer_size(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_double_complex *beta, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, rocsparse_mat_info info_C, size_t *buffer_size)

Sparse matrix-sparse matrix multiplication using CSR storage format.

rocsparse_csrgemm_buffer_size returns the size of the temporary storage buffer that is required by rocsparse_csrgemm_nnz(), rocsparse_scsrgemm(), rocsparse_dcsrgemm(), rocsparse_ccsrgemm() and rocsparse_zcsrgemm(). The temporary storage buffer must be allocated by the user.

Note

For matrix products with more than 4096 non-zero entries per row, the algorithm allocates an additional temporary storage buffer.

Note

For matrix products with more than 8192 intermediate products per row, the algorithm allocates an additional temporary storage buffer.

Note

Currently, only trans_A == trans_B == rocsparse_operation_none is supported.

Note

Currently, only rocsparse_matrix_type_general is supported.
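
To judge in advance whether a product is likely to exceed the per-row limit on intermediate products, the count can be estimated on the host from the sparsity patterns alone. The sketch below assumes that an intermediate product is one scalar multiplication \(a_{ij} \cdot b_{jl}\) contributing to row \(i\) of \(A \cdot B\). This interpretation, the function name max_intermediate_products and the zero-based host arrays are assumptions made for illustration, not part of the rocSPARSE API.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Largest number of intermediate products in any row of A * B.
// For row i this is the sum, over all entries (i, j) of A, of the number
// of entries stored in row j of B. Zero-based host arrays are assumed.
long long max_intermediate_products(int                     m,
                                    const std::vector<int>& csr_row_ptr_A,
                                    const std::vector<int>& csr_col_ind_A,
                                    const std::vector<int>& csr_row_ptr_B)
{
    assert(static_cast<int>(csr_row_ptr_A.size()) == m + 1);

    long long max_products = 0;
    for(int i = 0; i < m; ++i)
    {
        long long products = 0;
        for(int p = csr_row_ptr_A[i]; p < csr_row_ptr_A[i + 1]; ++p)
        {
            int j = csr_col_ind_A[p];
            products += csr_row_ptr_B[j + 1] - csr_row_ptr_B[j];
        }
        max_products = std::max(max_products, products);
    }
    return max_products;
}
```

A caller could compare the returned value against 8192 to anticipate whether the additional allocation mentioned above may occur.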

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans_A: matrix \(A\) operation type.

  • [in] trans_B: matrix \(B\) operation type.

  • [in] m: number of rows of the sparse CSR matrix \(op(A)\) and \(C\).

  • [in] n: number of columns of the sparse CSR matrix \(op(B)\) and \(C\).

  • [in] k: number of columns of the sparse CSR matrix \(op(A)\) and number of rows of the sparse CSR matrix \(op(B)\).

  • [in] alpha: scalar \(\alpha\).

  • [in] descr_A: descriptor of the sparse CSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_A: number of non-zero entries of the sparse CSR matrix \(A\).

  • [in] csr_row_ptr_A: array of m+1 elements (if \(op(A) == A\), k+1 otherwise) that point to the start of every row of the sparse CSR matrix \(op(A)\).

  • [in] csr_col_ind_A: array of nnz_A elements containing the column indices of the sparse CSR matrix \(A\).

  • [in] descr_B: descriptor of the sparse CSR matrix \(B\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_B: number of non-zero entries of the sparse CSR matrix \(B\).

  • [in] csr_row_ptr_B: array of k+1 elements (if \(op(B) == B\), n+1 otherwise) that point to the start of every row of the sparse CSR matrix \(op(B)\).

  • [in] csr_col_ind_B: array of nnz_B elements containing the column indices of the sparse CSR matrix \(B\).

  • [in] beta: scalar \(\beta\).

  • [in] descr_D: descriptor of the sparse CSR matrix \(D\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_D: number of non-zero entries of the sparse CSR matrix \(D\).

  • [in] csr_row_ptr_D: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(D\).

  • [in] csr_col_ind_D: array of nnz_D elements containing the column indices of the sparse CSR matrix \(D\).

  • [inout] info_C: structure that holds meta data for the sparse CSR matrix \(C\).

  • [out] buffer_size: number of bytes of the temporary storage buffer required by rocsparse_csrgemm_nnz(), rocsparse_scsrgemm(), rocsparse_dcsrgemm(), rocsparse_ccsrgemm() and rocsparse_zcsrgemm().

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n, k, nnz_A, nnz_B or nnz_D is invalid.

  • rocsparse_status_invalid_pointer: alpha and beta are both invalid; descr_A, csr_row_ptr_A, csr_col_ind_A, descr_B, csr_row_ptr_B or csr_col_ind_B is invalid while alpha is valid; descr_D, csr_row_ptr_D or csr_col_ind_D is invalid while beta is valid; or info_C or buffer_size is invalid.

  • rocsparse_status_not_implemented: trans_A != rocsparse_operation_none, trans_B != rocsparse_operation_none, or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrgemm_nnz()

rocsparse_status rocsparse_csrgemm_nnz(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, const rocsparse_mat_descr descr_C, rocsparse_int *csr_row_ptr_C, rocsparse_int *nnz_C, const rocsparse_mat_info info_C, void *temp_buffer)

Sparse matrix-sparse matrix multiplication using CSR storage format.

rocsparse_csrgemm_nnz computes the total number of non-zero elements and the CSR row offsets, which point to the start of every row, of the resulting sparse CSR matrix C. It is assumed that csr_row_ptr_C has been allocated with size m + 1. The required buffer size can be obtained from rocsparse_scsrgemm_buffer_size(), rocsparse_dcsrgemm_buffer_size(), rocsparse_ccsrgemm_buffer_size() and rocsparse_zcsrgemm_buffer_size(), respectively.

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

For matrix products with more than 8192 intermediate products per row, the algorithm allocates an additional temporary storage buffer.

Note

Currently, only trans_A == trans_B == rocsparse_operation_none is supported.

Note

Currently, only rocsparse_matrix_type_general is supported.
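
What this symbolic step produces can be illustrated with a small host reference. The sketch below is not the rocSPARSE implementation: host_csrgemm_nnz is a hypothetical name, and the code assumes zero-based indexing and trans_A == trans_B == rocsparse_operation_none. It counts, for every row, the distinct column indices that appear in the pattern of \(A \cdot B\) or of \(D\).

```cpp
#include <cassert>
#include <vector>

// Host reference for the pattern of C = A * B + D.
// Fills csr_row_ptr_C (m + 1 entries) and returns nnz_C.
int host_csrgemm_nnz(int                     m,
                     int                     n,
                     const std::vector<int>& csr_row_ptr_A,
                     const std::vector<int>& csr_col_ind_A,
                     const std::vector<int>& csr_row_ptr_B,
                     const std::vector<int>& csr_col_ind_B,
                     const std::vector<int>& csr_row_ptr_D,
                     const std::vector<int>& csr_col_ind_D,
                     std::vector<int>&       csr_row_ptr_C)
{
    assert(static_cast<int>(csr_row_ptr_A.size()) == m + 1);
    assert(static_cast<int>(csr_row_ptr_D.size()) == m + 1);

    csr_row_ptr_C.assign(m + 1, 0);

    // marker[c] == i means column c has already been counted for row i.
    std::vector<int> marker(n, -1);

    for(int i = 0; i < m; ++i)
    {
        int count = 0;

        // Columns contributed by A(i, :) * B.
        for(int p = csr_row_ptr_A[i]; p < csr_row_ptr_A[i + 1]; ++p)
        {
            int j = csr_col_ind_A[p];
            for(int q = csr_row_ptr_B[j]; q < csr_row_ptr_B[j + 1]; ++q)
            {
                int c = csr_col_ind_B[q];
                if(marker[c] != i)
                {
                    marker[c] = i;
                    ++count;
                }
            }
        }

        // Columns contributed by D(i, :).
        for(int p = csr_row_ptr_D[i]; p < csr_row_ptr_D[i + 1]; ++p)
        {
            int c = csr_col_ind_D[p];
            if(marker[c] != i)
            {
                marker[c] = i;
                ++count;
            }
        }

        csr_row_ptr_C[i + 1] = csr_row_ptr_C[i] + count;
    }
    return csr_row_ptr_C[m];
}
```

The returned value corresponds to nnz_C and determines how many elements csr_col_ind_C and csr_val_C need before the numeric step is called.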

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans_A: matrix \(A\) operation type.

  • [in] trans_B: matrix \(B\) operation type.

  • [in] m: number of rows of the sparse CSR matrix \(op(A)\) and \(C\).

  • [in] n: number of columns of the sparse CSR matrix \(op(B)\) and \(C\).

  • [in] k: number of columns of the sparse CSR matrix \(op(A)\) and number of rows of the sparse CSR matrix \(op(B)\).

  • [in] descr_A: descriptor of the sparse CSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_A: number of non-zero entries of the sparse CSR matrix \(A\).

  • [in] csr_row_ptr_A: array of m+1 elements (if \(op(A) == A\), k+1 otherwise) that point to the start of every row of the sparse CSR matrix \(op(A)\).

  • [in] csr_col_ind_A: array of nnz_A elements containing the column indices of the sparse CSR matrix \(A\).

  • [in] descr_B: descriptor of the sparse CSR matrix \(B\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_B: number of non-zero entries of the sparse CSR matrix \(B\).

  • [in] csr_row_ptr_B: array of k+1 elements (if \(op(B) == B\), n+1 otherwise) that point to the start of every row of the sparse CSR matrix \(op(B)\).

  • [in] csr_col_ind_B: array of nnz_B elements containing the column indices of the sparse CSR matrix \(B\).

  • [in] descr_D: descriptor of the sparse CSR matrix \(D\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_D: number of non-zero entries of the sparse CSR matrix \(D\).

  • [in] csr_row_ptr_D: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(D\).

  • [in] csr_col_ind_D: array of nnz_D elements containing the column indices of the sparse CSR matrix \(D\).

  • [in] descr_C: descriptor of the sparse CSR matrix \(C\). Currently, only rocsparse_matrix_type_general is supported.

  • [out] csr_row_ptr_C: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(C\).

  • [out] nnz_C: pointer to the number of non-zero entries of the sparse CSR matrix \(C\).

  • [in] info_C: structure that holds meta data for the sparse CSR matrix \(C\).

  • [in] temp_buffer: temporary storage buffer allocated by the user, size is returned by rocsparse_scsrgemm_buffer_size(), rocsparse_dcsrgemm_buffer_size(), rocsparse_ccsrgemm_buffer_size() or rocsparse_zcsrgemm_buffer_size().

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n, k, nnz_A, nnz_B or nnz_D is invalid.

  • rocsparse_status_invalid_pointer: descr_A, csr_row_ptr_A, csr_col_ind_A, descr_B, csr_row_ptr_B, csr_col_ind_B, descr_D, csr_row_ptr_D, csr_col_ind_D, descr_C, csr_row_ptr_C, nnz_C, info_C or temp_buffer is invalid.

  • rocsparse_status_memory_error: additional buffer for long rows could not be allocated.

  • rocsparse_status_not_implemented: trans_A != rocsparse_operation_none, trans_B != rocsparse_operation_none, or rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_csrgemm()

rocsparse_status rocsparse_scsrgemm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const float *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const float *csr_val_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const float *csr_val_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const float *beta, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const float *csr_val_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, const rocsparse_mat_descr descr_C, float *csr_val_C, const rocsparse_int *csr_row_ptr_C, rocsparse_int *csr_col_ind_C, const rocsparse_mat_info info_C, void *temp_buffer)
rocsparse_status rocsparse_dcsrgemm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const double *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const double *csr_val_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const double *csr_val_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const double *beta, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const double *csr_val_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, const rocsparse_mat_descr descr_C, double *csr_val_C, const rocsparse_int *csr_row_ptr_C, rocsparse_int *csr_col_ind_C, const rocsparse_mat_info info_C, void *temp_buffer)
rocsparse_status rocsparse_ccsrgemm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const rocsparse_float_complex *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_float_complex *csr_val_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_float_complex *csr_val_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_float_complex *beta, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const rocsparse_float_complex *csr_val_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, const rocsparse_mat_descr descr_C, rocsparse_float_complex *csr_val_C, const rocsparse_int *csr_row_ptr_C, rocsparse_int *csr_col_ind_C, const rocsparse_mat_info info_C, void *temp_buffer)
rocsparse_status rocsparse_zcsrgemm(rocsparse_handle handle, rocsparse_operation trans_A, rocsparse_operation trans_B, rocsparse_int m, rocsparse_int n, rocsparse_int k, const rocsparse_double_complex *alpha, const rocsparse_mat_descr descr_A, rocsparse_int nnz_A, const rocsparse_double_complex *csr_val_A, const rocsparse_int *csr_row_ptr_A, const rocsparse_int *csr_col_ind_A, const rocsparse_mat_descr descr_B, rocsparse_int nnz_B, const rocsparse_double_complex *csr_val_B, const rocsparse_int *csr_row_ptr_B, const rocsparse_int *csr_col_ind_B, const rocsparse_double_complex *beta, const rocsparse_mat_descr descr_D, rocsparse_int nnz_D, const rocsparse_double_complex *csr_val_D, const rocsparse_int *csr_row_ptr_D, const rocsparse_int *csr_col_ind_D, const rocsparse_mat_descr descr_C, rocsparse_double_complex *csr_val_C, const rocsparse_int *csr_row_ptr_C, rocsparse_int *csr_col_ind_C, const rocsparse_mat_info info_C, void *temp_buffer)

Sparse matrix-sparse matrix multiplication using CSR storage format.

rocsparse_csrgemm multiplies the scalar \(\alpha\) with the sparse \(m \times k\) matrix \(A\), defined in CSR storage format, and the sparse \(k \times n\) matrix \(B\), defined in CSR storage format, and adds the result to the sparse \(m \times n\) matrix \(D\) that is multiplied by \(\beta\). The final result is stored in the sparse \(m \times n\) matrix \(C\), defined in CSR storage format, such that

\[ C := \alpha \cdot op(A) \cdot op(B) + \beta \cdot D, \]
with
\[\begin{split} op(A) = \left\{ \begin{array}{ll} A, & \text{if trans_A == rocsparse_operation_none} \\ A^T, & \text{if trans_A == rocsparse_operation_transpose} \\ A^H, & \text{if trans_A == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]
and
\[\begin{split} op(B) = \left\{ \begin{array}{ll} B, & \text{if trans_B == rocsparse_operation_none} \\ B^T, & \text{if trans_B == rocsparse_operation_transpose} \\ B^H, & \text{if trans_B == rocsparse_operation_conjugate_transpose} \end{array} \right. \end{split}\]

It is assumed that csr_row_ptr_C has already been filled and that csr_val_C and csr_col_ind_C have been allocated by the user. The contents of csr_row_ptr_C and the allocation size of csr_col_ind_C and csr_val_C are determined by the number of non-zero elements of the sparse CSR matrix C. Both can be obtained from rocsparse_csrgemm_nnz(). The required buffer size for the computation can be obtained from rocsparse_scsrgemm_buffer_size(), rocsparse_dcsrgemm_buffer_size(), rocsparse_ccsrgemm_buffer_size() and rocsparse_zcsrgemm_buffer_size(), respectively.

Note

If \(\alpha == 0\), then \(C = \beta \cdot D\) will be computed.

Note

If \(\beta == 0\), then \(C = \alpha \cdot op(A) \cdot op(B)\) will be computed.

Note

\(\alpha == \beta == 0\) is invalid.

Note

Currently, only trans_A == rocsparse_operation_none is supported.

Note

Currently, only trans_B == rocsparse_operation_none is supported.

Note

Currently, only rocsparse_matrix_type_general is supported.

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Note

For matrix products with more than 4096 non-zero entries per row, the algorithm allocates an additional temporary storage buffer.

Example

This example multiplies two CSR matrices with a scalar alpha and adds the result to another CSR matrix.

// Initialize scalar multipliers
float alpha = 2.0f;
float beta  = 1.0f;

// Create matrix descriptors
rocsparse_mat_descr descr_A;
rocsparse_mat_descr descr_B;
rocsparse_mat_descr descr_C;
rocsparse_mat_descr descr_D;

rocsparse_create_mat_descr(&descr_A);
rocsparse_create_mat_descr(&descr_B);
rocsparse_create_mat_descr(&descr_C);
rocsparse_create_mat_descr(&descr_D);

// Create matrix info structure
rocsparse_mat_info info_C;
rocsparse_create_mat_info(&info_C);

// Set pointer mode
rocsparse_set_pointer_mode(handle, rocsparse_pointer_mode_host);

// Query rocsparse for the required buffer size
size_t buffer_size;

rocsparse_scsrgemm_buffer_size(handle,
                               rocsparse_operation_none,
                               rocsparse_operation_none,
                               m,
                               n,
                               k,
                               &alpha,
                               descr_A,
                               nnz_A,
                               csr_row_ptr_A,
                               csr_col_ind_A,
                               descr_B,
                               nnz_B,
                               csr_row_ptr_B,
                               csr_col_ind_B,
                               &beta,
                               descr_D,
                               nnz_D,
                               csr_row_ptr_D,
                               csr_col_ind_D,
                               info_C,
                               &buffer_size);

// Allocate buffer
void* buffer;
hipMalloc(&buffer, buffer_size);

// Obtain the total number of non-zero entries in C and the row pointers of C
rocsparse_int  nnz_C;
rocsparse_int* csr_row_ptr_C;
hipMalloc((void**)&csr_row_ptr_C, sizeof(rocsparse_int) * (m + 1));

rocsparse_csrgemm_nnz(handle,
                      rocsparse_operation_none,
                      rocsparse_operation_none,
                      m,
                      n,
                      k,
                      descr_A,
                      nnz_A,
                      csr_row_ptr_A,
                      csr_col_ind_A,
                      descr_B,
                      nnz_B,
                      csr_row_ptr_B,
                      csr_col_ind_B,
                      descr_D,
                      nnz_D,
                      csr_row_ptr_D,
                      csr_col_ind_D,
                      descr_C,
                      csr_row_ptr_C,
                      &nnz_C,
                      info_C,
                      buffer);

// Allocate and compute column indices and values of C
rocsparse_int* csr_col_ind_C;
float*         csr_val_C;
hipMalloc((void**)&csr_col_ind_C, sizeof(rocsparse_int) * nnz_C);
hipMalloc((void**)&csr_val_C, sizeof(float) * nnz_C);

rocsparse_scsrgemm(handle,
                   rocsparse_operation_none,
                   rocsparse_operation_none,
                   m,
                   n,
                   k,
                   &alpha,
                   descr_A,
                   nnz_A,
                   csr_val_A,
                   csr_row_ptr_A,
                   csr_col_ind_A,
                   descr_B,
                   nnz_B,
                   csr_val_B,
                   csr_row_ptr_B,
                   csr_col_ind_B,
                   &beta,
                   descr_D,
                   nnz_D,
                   csr_val_D,
                   csr_row_ptr_D,
                   csr_col_ind_D,
                   descr_C,
                   csr_val_C,
                   csr_row_ptr_C,
                   csr_col_ind_C,
                   info_C,
                   buffer);
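
To check the device result on small inputs, \(C := \alpha \cdot A \cdot B + \beta \cdot D\) can also be evaluated on the host. The following is a minimal sketch under stated assumptions, not the algorithm used by rocSPARSE: HostCsr and host_csrgemm are hypothetical names, indexing is zero-based, both operations are rocsparse_operation_none, and a dense accumulator per row is used for simplicity.

```cpp
#include <cassert>
#include <vector>

// Host-side CSR container used only for this sketch (not a rocSPARSE type).
struct HostCsr
{
    int                m = 0;   // number of rows
    int                n = 0;   // number of columns
    std::vector<int>   row_ptr; // m + 1 entries, zero-based
    std::vector<int>   col_ind; // nnz entries
    std::vector<float> val;     // nnz entries
};

// Host reference for C := alpha * A * B + beta * D.
HostCsr host_csrgemm(float alpha, const HostCsr& A, const HostCsr& B, float beta, const HostCsr& D)
{
    assert(A.n == B.m);
    assert(D.m == A.m && D.n == B.n);

    HostCsr C;
    C.m = A.m;
    C.n = B.n;
    C.row_ptr.assign(C.m + 1, 0);

    std::vector<float> acc(C.n, 0.0f); // dense accumulator for one row of C
    std::vector<char>  used(C.n, 0);   // marks columns present in the current row

    for(int i = 0; i < C.m; ++i)
    {
        // Accumulate alpha * A(i, :) * B.
        for(int p = A.row_ptr[i]; p < A.row_ptr[i + 1]; ++p)
        {
            int   j = A.col_ind[p];
            float a = alpha * A.val[p];
            for(int q = B.row_ptr[j]; q < B.row_ptr[j + 1]; ++q)
            {
                int c = B.col_ind[q];
                acc[c] += a * B.val[q];
                used[c] = 1;
            }
        }

        // Add beta * D(i, :).
        for(int p = D.row_ptr[i]; p < D.row_ptr[i + 1]; ++p)
        {
            int c = D.col_ind[p];
            acc[c] += beta * D.val[p];
            used[c] = 1;
        }

        // Gather the row in ascending column order and reset the accumulator.
        for(int c = 0; c < C.n; ++c)
        {
            if(used[c])
            {
                C.col_ind.push_back(c);
                C.val.push_back(acc[c]);
                acc[c]  = 0.0f;
                used[c] = 0;
            }
        }
        C.row_ptr[i + 1] = static_cast<int>(C.col_ind.size());
    }
    return C;
}
```

The sketch does not special-case \(\alpha == 0\) or \(\beta == 0\); it always uses the union of the patterns of \(A \cdot B\) and \(D\). Column order within a row of the device result may differ, so a comparison should sort each row or compare per column index.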

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] trans_A: matrix \(A\) operation type.

  • [in] trans_B: matrix \(B\) operation type.

  • [in] m: number of rows of the sparse CSR matrix \(op(A)\) and \(C\).

  • [in] n: number of columns of the sparse CSR matrix \(op(B)\) and \(C\).

  • [in] k: number of columns of the sparse CSR matrix \(op(A)\) and number of rows of the sparse CSR matrix \(op(B)\).

  • [in] alpha: scalar \(\alpha\).

  • [in] descr_A: descriptor of the sparse CSR matrix \(A\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_A: number of non-zero entries of the sparse CSR matrix \(A\).

  • [in] csr_val_A: array of nnz_A elements of the sparse CSR matrix \(A\).

  • [in] csr_row_ptr_A: array of m+1 elements (if \(op(A) == A\), k+1 otherwise) that point to the start of every row of the sparse CSR matrix \(op(A)\).

  • [in] csr_col_ind_A: array of nnz_A elements containing the column indices of the sparse CSR matrix \(A\).

  • [in] descr_B: descriptor of the sparse CSR matrix \(B\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_B: number of non-zero entries of the sparse CSR matrix \(B\).

  • [in] csr_val_B: array of nnz_B elements of the sparse CSR matrix \(B\).

  • [in] csr_row_ptr_B: array of k+1 elements (if \(op(B) == B\), n+1 otherwise) that point to the start of every row of the sparse CSR matrix \(op(B)\).

  • [in] csr_col_ind_B: array of nnz_B elements containing the column indices of the sparse CSR matrix \(B\).

  • [in] beta: scalar \(\beta\).

  • [in] descr_D: descriptor of the sparse CSR matrix \(D\). Currently, only rocsparse_matrix_type_general is supported.

  • [in] nnz_D: number of non-zero entries of the sparse CSR matrix \(D\).

  • [in] csr_val_D: array of nnz_D elements of the sparse CSR matrix \(D\).

  • [in] csr_row_ptr_D: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(D\).

  • [in] csr_col_ind_D: array of nnz_D elements containing the column indices of the sparse CSR matrix \(D\).

  • [in] descr_C: descriptor of the sparse CSR matrix \(C\). Currently, only rocsparse_matrix_type_general is supported.

  • [out] csr_val_C: array of nnz_C elements of the sparse CSR matrix \(C\).

  • [in] csr_row_ptr_C: array of m+1 elements that point to the start of every row of the sparse CSR matrix \(C\).

  • [out] csr_col_ind_C: array of nnz_C elements containing the column indices of the sparse CSR matrix \(C\).

  • [in] info_C: structure that holds meta data for the sparse CSR matrix \(C\).

  • [in] temp_buffer: temporary storage buffer allocated by the user, size is returned by rocsparse_scsrgemm_buffer_size(), rocsparse_dcsrgemm_buffer_size(), rocsparse_ccsrgemm_buffer_size() or rocsparse_zcsrgemm_buffer_size().

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: m, n, k, nnz_A, nnz_B or nnz_D is invalid.

  • rocsparse_status_invalid_pointer: alpha and beta are both invalid; descr_A, csr_val_A, csr_row_ptr_A, csr_col_ind_A, descr_B, csr_val_B, csr_row_ptr_B or csr_col_ind_B is invalid while alpha is valid; descr_D, csr_val_D, csr_row_ptr_D or csr_col_ind_D is invalid while beta is valid; or csr_val_C, csr_row_ptr_C, csr_col_ind_C, info_C or temp_buffer is invalid.

  • rocsparse_status_memory_error: additional buffer for long rows could not be allocated.

  • rocsparse_status_not_implemented: trans_A != rocsparse_operation_none, trans_B != rocsparse_operation_none, or rocsparse_matrix_type != rocsparse_matrix_type_general.

Preconditioner Functions

This module holds all sparse preconditioners.

The sparse preconditioners describe manipulations on a matrix in sparse format to obtain a sparse preconditioner matrix.

rocsparse_bsric0_zero_pivot()

rocsparse_status rocsparse_bsric0_zero_pivot(rocsparse_handle handle, rocsparse_mat_info info, rocsparse_int *position)

Incomplete Cholesky factorization with 0 fill-ins and no pivoting using BSR storage format.

rocsparse_bsric0_zero_pivot returns rocsparse_status_zero_pivot if either a structural or numerical zero has been found during the rocsparse_sbsric0(), rocsparse_dbsric0(), rocsparse_cbsric0() or rocsparse_zbsric0() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the BSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and rocsparse_status_success is returned instead.

Note

If a zero pivot is found, position=j means that either the diagonal block A(j,j) is missing (structural zero) or the diagonal block A(j,j) is not positive definite (numerical zero).

Note

rocsparse_bsric0_zero_pivot is a blocking function and may therefore reduce performance.
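
The structural half of this check, and the role of the index base in the reported position, can be illustrated on the host. The sketch below is not how rocSPARSE detects zero pivots and does not cover numerical zeros: first_structural_zero_pivot is a hypothetical helper that scans host copies of the BSR pattern for the first block row without a stored diagonal block.

```cpp
#include <cassert>
#include <vector>

// Returns the first block row j whose diagonal block A(j,j) is absent from
// the BSR pattern, reported in the same index base as the matrix, or -1 if
// every diagonal block is present. index_base must be 0 or 1.
int first_structural_zero_pivot(int                     mb,
                                const std::vector<int>& bsr_row_ptr,
                                const std::vector<int>& bsr_col_ind,
                                int                     index_base)
{
    assert(index_base == 0 || index_base == 1);
    assert(static_cast<int>(bsr_row_ptr.size()) == mb + 1);

    for(int j = 0; j < mb; ++j)
    {
        bool has_diag = false;
        for(int p = bsr_row_ptr[j] - index_base; p < bsr_row_ptr[j + 1] - index_base; ++p)
        {
            if(bsr_col_ind[p] - index_base == j)
            {
                has_diag = true;
                break;
            }
        }
        if(!has_diag)
        {
            return j + index_base; // same index base as the input matrix
        }
    }
    return -1; // no structural zero pivot found
}
```

With a one-based matrix, a missing diagonal block in the second block row is reported as 2, and the same pattern stored zero-based is reported as 1.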

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] info: structure that holds the information collected during the analysis step.

  • [inout] position: pointer to zero pivot \(j\), can be in host or device memory.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info or position pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_zero_pivot: zero pivot has been found.
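The structural-zero case described above can be illustrated on the host. The following is a minimal sketch, not part of rocSPARSE: the helper name first_structural_zero and the use of std::vector are assumptions made for illustration. It scans the BSR pattern for the first block row whose diagonal block is missing and reports it in the same index base as the matrix, returning -1 when every diagonal block is present.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side helper (not a rocSPARSE function).
// Returns the index of the first block row without a diagonal block,
// expressed in the index base of the matrix, or -1 if none is missing.
int first_structural_zero(int                     mb,
                          const std::vector<int>& bsr_row_ptr,
                          const std::vector<int>& bsr_col_ind,
                          int                     base)
{
    for(int i = 0; i < mb; ++i)
    {
        bool has_diag = false;

        // bsr_row_ptr holds mb+1 entries, each offset by the index base
        for(int k = bsr_row_ptr[i] - base; k < bsr_row_ptr[i + 1] - base; ++k)
        {
            if(bsr_col_ind[k] - base == i)
            {
                has_diag = true;
                break;
            }
        }

        if(!has_diag)
        {
            return i + base;
        }
    }

    return -1;
}
```

A numerical zero cannot be detected from the pattern alone, since it depends on the values encountered during the factorization.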

rocsparse_bsric0_buffer_size()

rocsparse_status rocsparse_sbsric0_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_dbsric0_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_cbsric0_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_zbsric0_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)

Incomplete Cholesky factorization with 0 fill-ins and no pivoting using BSR storage format.

rocsparse_bsric0_buffer_size returns the size of the temporary storage buffer that is required by rocsparse_sbsric0_analysis(), rocsparse_dbsric0_analysis(), rocsparse_cbsric0_analysis(), rocsparse_zbsric0_analysis(), rocsparse_sbsric0(), rocsparse_dbsric0(), rocsparse_cbsric0() and rocsparse_zbsric0(). The temporary storage buffer must be allocated by the user. The size of the temporary storage buffer is identical to the size returned by rocsparse_sbsrsv_buffer_size(), rocsparse_dbsrsv_buffer_size(), rocsparse_cbsrsv_buffer_size(), rocsparse_zbsrsv_buffer_size(), rocsparse_sbsrilu0_buffer_size(), rocsparse_dbsrilu0_buffer_size(), rocsparse_cbsrilu0_buffer_size() and rocsparse_zbsrilu0_buffer_size() if the matrix sparsity pattern is identical. The user-allocated buffer can thus be shared between subsequent calls to those functions.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] dir: direction that specifies whether to count nonzero elements by rocsparse_direction_row or by rocsparse_direction_column.

  • [in] mb: number of block rows in the sparse BSR matrix.

  • [in] nnzb: number of non-zero block entries of the sparse BSR matrix.

  • [in] descr: descriptor of the sparse BSR matrix.

  • [in] bsr_val: array of length nnzb*block_dim*block_dim containing the values of the sparse BSR matrix.

  • [in] bsr_row_ptr: array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • [in] bsr_col_ind: array of nnzb elements containing the block column indices of the sparse BSR matrix.

  • [in] block_dim: the block dimension of the BSR matrix. Between 1 and m where m=mb*block_dim.

  • [out] info: structure that holds the information collected during the analysis step.

  • [out] buffer_size: number of bytes of the temporary storage buffer required by rocsparse_sbsric0_analysis(), rocsparse_dbsric0_analysis(), rocsparse_cbsric0_analysis(), rocsparse_zbsric0_analysis(), rocsparse_sbsric0(), rocsparse_dbsric0(), rocsparse_cbsric0() and rocsparse_zbsric0().

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: mb, nnzb, or block_dim is invalid.

  • rocsparse_status_invalid_pointer: descr, bsr_val, bsr_row_ptr, bsr_col_ind, info or buffer_size pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: rocsparse_matrix_type != rocsparse_matrix_type_general.
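The array lengths listed in the parameters can be made concrete with a short host-side sketch. It assumes that dir selects whether the entries inside each dense block are laid out row-major or column-major, which is how BSR blocks are commonly stored; the enum block_order and both helper names are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for rocsparse_direction, used only in this sketch.
enum class block_order
{
    row,
    column
};

// Number of values bsr_val must hold: one dense block_dim x block_dim
// block for each of the nnzb stored blocks.
std::size_t bsr_val_length(std::size_t nnzb, std::size_t block_dim)
{
    return nnzb * block_dim * block_dim;
}

// Offset into bsr_val of entry (r, c) inside the k-th stored block.
// All indices are zero-based. The in-block layout is an assumption.
std::size_t bsr_val_offset(
    std::size_t k, std::size_t r, std::size_t c, std::size_t block_dim, block_order dir)
{
    std::size_t in_block
        = (dir == block_order::row) ? r * block_dim + c : c * block_dim + r;

    return k * block_dim * block_dim + in_block;
}
```

For example, with block_dim = 2, entry (0,1) of the second stored block sits at offset 5 under the row layout and at offset 6 under the column layout.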

rocsparse_bsric0_analysis()

rocsparse_status rocsparse_sbsric0_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_dbsric0_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_cbsric0_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)
rocsparse_status rocsparse_zbsric0_analysis(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_analysis_policy analysis, rocsparse_solve_policy solve, void *temp_buffer)

Incomplete Cholesky factorization with 0 fill-ins and no pivoting using BSR storage format.

rocsparse_bsric0_analysis performs the analysis step for rocsparse_sbsric0(), rocsparse_dbsric0(), rocsparse_cbsric0() and rocsparse_zbsric0(). It is expected that this function will be executed only once for a given matrix and particular operation type. The analysis meta data can be cleared by rocsparse_bsric0_clear().

rocsparse_bsric0_analysis can share its meta data with rocsparse_sbsrilu0_analysis(), rocsparse_dbsrilu0_analysis(), rocsparse_cbsrilu0_analysis(), rocsparse_zbsrilu0_analysis(), rocsparse_sbsrsv_analysis(), rocsparse_dbsrsv_analysis(), rocsparse_cbsrsv_analysis(), rocsparse_zbsrsv_analysis(), rocsparse_sbsrsm_analysis(), rocsparse_dbsrsm_analysis(), rocsparse_cbsrsm_analysis() and rocsparse_zbsrsm_analysis(). Selecting the rocsparse_analysis_policy_reuse policy can greatly reduce the cost of computing the meta data. However, the user needs to make sure that the sparsity pattern remains unchanged. If this cannot be assured, rocsparse_analysis_policy_force has to be used.

Note

If the matrix sparsity pattern changes, the gathered information will become invalid.

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] dir: direction that specifies whether to count nonzero elements by rocsparse_direction_row or by rocsparse_direction_column.

  • [in] mb: number of block rows in the sparse BSR matrix.

  • [in] nnzb: number of non-zero block entries of the sparse BSR matrix.

  • [in] descr: descriptor of the sparse BSR matrix.

  • [in] bsr_val: array of length nnzb*block_dim*block_dim containing the values of the sparse BSR matrix.

  • [in] bsr_row_ptr: array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • [in] bsr_col_ind: array of nnzb elements containing the block column indices of the sparse BSR matrix.

  • [in] block_dim: the block dimension of the BSR matrix. Between 1 and m where m=mb*block_dim.

  • [out] info: structure that holds the information collected during the analysis step.

  • [in] analysis: rocsparse_analysis_policy_reuse or rocsparse_analysis_policy_force.

  • [in] solve: rocsparse_solve_policy_auto.

  • [in] temp_buffer: temporary storage buffer allocated by the user.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: mb, nnzb, or block_dim is invalid.

  • rocsparse_status_invalid_pointer: descr, bsr_val, bsr_row_ptr, bsr_col_ind, info or temp_buffer pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: rocsparse_matrix_type != rocsparse_matrix_type_general.
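Since rocsparse_analysis_policy_reuse is only valid while the sparsity pattern is unchanged, an application that modifies its matrix may want to verify this on the host before reusing meta data. The following is a minimal sketch of such a check; the struct bsr_pattern and the function same_sparsity_pattern are hypothetical names and are not provided by rocSPARSE.

```cpp
#include <cassert>
#include <vector>

// Hypothetical host-side description of a BSR sparsity pattern.
struct bsr_pattern
{
    int              mb;        // number of block rows
    int              block_dim; // dimension of each square block
    std::vector<int> row_ptr;   // mb+1 block row offsets
    std::vector<int> col_ind;   // nnzb block column indices
};

// Two patterns match when dimensions, block row offsets and block column
// indices are all identical. The numerical values play no role.
bool same_sparsity_pattern(const bsr_pattern& a, const bsr_pattern& b)
{
    return a.mb == b.mb && a.block_dim == b.block_dim && a.row_ptr == b.row_ptr
           && a.col_ind == b.col_ind;
}
```

If the check fails, the analysis would have to be repeated with rocsparse_analysis_policy_force, as stated above.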

rocsparse_bsric0()

rocsparse_status rocsparse_sbsric0(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_dbsric0(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_cbsric0(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)
rocsparse_status rocsparse_zbsric0(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, rocsparse_solve_policy policy, void *temp_buffer)

Incomplete Cholesky factorization with 0 fill-ins and no pivoting using BSR storage format.

rocsparse_bsric0 computes the incomplete Cholesky factorization with 0 fill-ins and no pivoting of a sparse \(mb \times mb\) BSR matrix \(A\), such that

\[ A \approx LL^T \]

rocsparse_bsric0 requires a user allocated temporary buffer. Its size is returned by rocsparse_sbsric0_buffer_size(), rocsparse_dbsric0_buffer_size(), rocsparse_cbsric0_buffer_size() or rocsparse_zbsric0_buffer_size(). Furthermore, analysis meta data is required. It can be obtained by rocsparse_sbsric0_analysis(), rocsparse_dbsric0_analysis(), rocsparse_cbsric0_analysis() or rocsparse_zbsric0_analysis(). rocsparse_bsric0 reports the first zero pivot (either numerical or structural zero). The zero pivot status can be obtained by calling rocsparse_bsric0_zero_pivot().
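To make the factorization and its zero pivot concrete, the following host-side sketch computes an incomplete Cholesky factorization with 0 fill-ins on a small dense array with scalar entries (the block_dim = 1 case). It is a reference illustration under stated assumptions, not the algorithm rocSPARSE runs on the device: the function name ic0_dense_reference is hypothetical, entries that are exactly zero in the input are treated as lying outside the sparsity pattern, and the returned index is zero-based.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference IC(0) on a row-major dense n x n array (illustration only).
// On return the lower triangle of a holds L; the strict upper triangle is
// left unchanged. Returns -1 on success, or the zero-based index of the
// first pivot that is not positive (missing or numerically invalid).
int ic0_dense_reference(int n, std::vector<double>& a)
{
    // Record the sparsity pattern before any value is modified
    std::vector<char> in_pattern(a.size());
    for(std::size_t idx = 0; idx < a.size(); ++idx)
    {
        in_pattern[idx] = (a[idx] != 0.0);
    }

    for(int k = 0; k < n; ++k)
    {
        if(!(a[k * n + k] > 0.0))
        {
            return k; // zero pivot
        }
        a[k * n + k] = std::sqrt(a[k * n + k]);

        // Scale column k below the diagonal
        for(int i = k + 1; i < n; ++i)
        {
            if(in_pattern[i * n + k])
            {
                a[i * n + k] /= a[k * n + k];
            }
        }

        // Update the trailing lower triangle, but only at positions that
        // already belong to the pattern (this is the "0 fill-ins" rule)
        for(int j = k + 1; j < n; ++j)
        {
            for(int i = j; i < n; ++i)
            {
                if(in_pattern[i * n + j] && in_pattern[i * n + k] && in_pattern[j * n + k])
                {
                    a[i * n + j] -= a[i * n + k] * a[j * n + k];
                }
            }
        }
    }

    return -1;
}
```

For a tridiagonal matrix no fill-in occurs, so the result of this sketch coincides with the exact Cholesky factor.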

Note

This function is non-blocking and executed asynchronously with respect to the host. It may return before the actual computation has finished.

Example

Consider the sparse \(m \times m\) matrix \(A\), stored in BSR storage format. The following example computes the incomplete Cholesky factorization \(M \approx LL^T\) and solves the preconditioned system \(My = x\).

// Create rocSPARSE handle
rocsparse_handle handle;
rocsparse_create_handle(&handle);

// Create matrix descriptor for M
rocsparse_mat_descr descr_M;
rocsparse_create_mat_descr(&descr_M);

// Create matrix descriptor for L
// (the Cholesky factor L stores its own diagonal, so it is non-unit)
rocsparse_mat_descr descr_L;
rocsparse_create_mat_descr(&descr_L);
rocsparse_set_mat_fill_mode(descr_L, rocsparse_fill_mode_lower);
rocsparse_set_mat_diag_type(descr_L, rocsparse_diag_type_non_unit);

// Create matrix descriptor for L'
rocsparse_mat_descr descr_Lt;
rocsparse_create_mat_descr(&descr_Lt);
rocsparse_set_mat_fill_mode(descr_Lt, rocsparse_fill_mode_upper);
rocsparse_set_mat_diag_type(descr_Lt, rocsparse_diag_type_non_unit);

// Create matrix info structure
rocsparse_mat_info info;
rocsparse_create_mat_info(&info);

// Obtain required buffer size
size_t buffer_size_M;
size_t buffer_size_L;
size_t buffer_size_Lt;
rocsparse_dbsric0_buffer_size(handle,
                               rocsparse_direction_row,
                               mb,
                               nnzb,
                               descr_M,
                               bsr_val,
                               bsr_row_ptr,
                               bsr_col_ind,
                               block_dim,
                               info,
                               &buffer_size_M);
rocsparse_dbsrsv_buffer_size(handle,
                             rocsparse_direction_row,
                             rocsparse_operation_none,
                             mb,
                             nnzb,
                             descr_L,
                             bsr_val,
                             bsr_row_ptr,
                             bsr_col_ind,
                             block_dim,
                             info,
                             &buffer_size_L);
rocsparse_dbsrsv_buffer_size(handle,
                             rocsparse_direction_row,
                             rocsparse_operation_transpose,
                             mb,
                             nnzb,
                             descr_Lt,
                             bsr_val,
                             bsr_row_ptr,
                             bsr_col_ind,
                             block_dim,
                             info,
                             &buffer_size_Lt);

size_t buffer_size = max(buffer_size_M, max(buffer_size_L, buffer_size_Lt));

// Allocate temporary buffer
void* temp_buffer;
hipMalloc(&temp_buffer, buffer_size);

// Perform analysis steps, using rocsparse_analysis_policy_reuse to improve
// computation performance
rocsparse_dbsric0_analysis(handle,
                            rocsparse_direction_row,
                            mb,
                            nnzb,
                            descr_M,
                            bsr_val,
                            bsr_row_ptr,
                            bsr_col_ind,
                            block_dim,
                            info,
                            rocsparse_analysis_policy_reuse,
                            rocsparse_solve_policy_auto,
                            temp_buffer);
rocsparse_dbsrsv_analysis(handle,
                          rocsparse_direction_row,
                          rocsparse_operation_none,
                          mb,
                          nnzb,
                          descr_L,
                          bsr_val,
                          bsr_row_ptr,
                          bsr_col_ind,
                          block_dim,
                          info,
                          rocsparse_analysis_policy_reuse,
                          rocsparse_solve_policy_auto,
                          temp_buffer);
rocsparse_dbsrsv_analysis(handle,
                          rocsparse_direction_row,
                          rocsparse_operation_transpose,
                          mb,
                          nnzb,
                          descr_Lt,
                          bsr_val,
                          bsr_row_ptr,
                          bsr_col_ind,
                          block_dim,
                          info,
                          rocsparse_analysis_policy_reuse,
                          rocsparse_solve_policy_auto,
                          temp_buffer);

// Check for zero pivot
rocsparse_int position;
if(rocsparse_status_zero_pivot == rocsparse_bsric0_zero_pivot(handle,
                                                              info,
                                                              &position))
{
    printf("A has structural zero at A(%d,%d)\n", position, position);
}

// Compute incomplete Cholesky factorization M = LL'
rocsparse_dbsric0(handle,
                   rocsparse_direction_row,
                   mb,
                   nnzb,
                   descr_M,
                   bsr_val,
                   bsr_row_ptr,
                   bsr_col_ind,
                   block_dim,
                   info,
                   rocsparse_solve_policy_auto,
                   temp_buffer);

// Check for zero pivot
if(rocsparse_status_zero_pivot == rocsparse_bsric0_zero_pivot(handle,
                                                               info,
                                                               &position))
{
    printf("L has structural and/or numerical zero at L(%d,%d)\n",
           position,
           position);
}

// Solve Lz = x
rocsparse_dbsrsv_solve(handle,
                       rocsparse_direction_row,
                       rocsparse_operation_none,
                       mb,
                       nnzb,
                       &alpha,
                       descr_L,
                       bsr_val,
                       bsr_row_ptr,
                       bsr_col_ind,
                       block_dim,
                       info,
                       x,
                       z,
                       rocsparse_solve_policy_auto,
                       temp_buffer);

// Solve L'y = z
rocsparse_dbsrsv_solve(handle,
                       rocsparse_direction_row,
                       rocsparse_operation_transpose,
                       mb,
                       nnzb,
                       &alpha,
                       descr_Lt,
                       bsr_val,
                       bsr_row_ptr,
                       bsr_col_ind,
                       block_dim,
                       info,
                       z,
                       y,
                       rocsparse_solve_policy_auto,
                       temp_buffer);

// Clean up
hipFree(temp_buffer);
rocsparse_destroy_mat_info(info);
rocsparse_destroy_mat_descr(descr_M);
rocsparse_destroy_mat_descr(descr_L);
rocsparse_destroy_mat_descr(descr_Lt);
rocsparse_destroy_handle(handle);
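The two triangular solves in the example apply the preconditioner by solving \(Lz = x\) followed by \(L^T y = z\). As a plain host-side illustration of that sequence, the sketch below performs forward and backward substitution on a dense, row-major lower-triangular factor with a non-unit diagonal. The function names are hypothetical, the scaling factor alpha of the library calls is taken to be 1, and no sparsity is exploited.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Forward substitution: solve L z = x for a dense lower-triangular L.
std::vector<double>
    solve_lower(int n, const std::vector<double>& L, const std::vector<double>& x)
{
    std::vector<double> z(n);
    for(int i = 0; i < n; ++i)
    {
        double sum = x[i];
        for(int j = 0; j < i; ++j)
        {
            sum -= L[i * n + j] * z[j];
        }
        z[i] = sum / L[i * n + i];
    }
    return z;
}

// Backward substitution: solve L^T y = z, reading L^T(i,j) as L(j,i)
// so that the transpose never has to be stored explicitly.
std::vector<double>
    solve_lower_transposed(int n, const std::vector<double>& L, const std::vector<double>& z)
{
    std::vector<double> y(n);
    for(int i = n - 1; i >= 0; --i)
    {
        double sum = z[i];
        for(int j = i + 1; j < n; ++j)
        {
            sum -= L[j * n + i] * y[j];
        }
        y[i] = sum / L[i * n + i];
    }
    return y;
}
```

Chaining the two calls, y = solve_lower_transposed(n, L, solve_lower(n, L, x)), yields the solution of \(LL^T y = x\).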

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] dir: direction that specifies whether to count nonzero elements by rocsparse_direction_row or by rocsparse_direction_column.

  • [in] mb: number of block rows in the sparse BSR matrix.

  • [in] nnzb: number of non-zero block entries of the sparse BSR matrix.

  • [in] descr: descriptor of the sparse BSR matrix.

  • [inout] bsr_val: array of length nnzb*block_dim*block_dim containing the values of the sparse BSR matrix.

  • [in] bsr_row_ptr: array of mb+1 elements that point to the start of every block row of the sparse BSR matrix.

  • [in] bsr_col_ind: array of nnzb elements containing the block column indices of the sparse BSR matrix.

  • [in] block_dim: the block dimension of the BSR matrix. Between 1 and m where m=mb*block_dim.

  • [in] info: structure that holds the information collected during the analysis step.

  • [in] policy: rocsparse_solve_policy_auto.

  • [in] temp_buffer: temporary storage buffer allocated by the user.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_size: mb, nnzb, or block_dim is invalid.

  • rocsparse_status_invalid_pointer: descr, bsr_val, bsr_row_ptr or bsr_col_ind pointer is invalid.

  • rocsparse_status_arch_mismatch: the device is not supported.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_not_implemented: rocsparse_matrix_type != rocsparse_matrix_type_general.

rocsparse_bsric0_clear()

rocsparse_status rocsparse_bsric0_clear(rocsparse_handle handle, rocsparse_mat_info info)

Incomplete Cholesky factorization with 0 fill-ins and no pivoting using BSR storage format.

rocsparse_bsric0_clear deallocates all memory that was allocated by rocsparse_sbsric0_analysis(), rocsparse_dbsric0_analysis(), rocsparse_cbsric0_analysis() or rocsparse_zbsric0_analysis(). This is especially useful if memory is an issue and the analysis data is not required for further computation.

Note

Calling rocsparse_bsric0_clear is optional. All allocated resources will be cleared when the opaque rocsparse_mat_info struct is destroyed using rocsparse_destroy_mat_info().

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [inout] info: structure that holds the information collected during the analysis step.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info pointer is invalid.

  • rocsparse_status_memory_error: the buffer holding the meta data could not be deallocated.

  • rocsparse_status_internal_error: an internal error occurred.

rocsparse_bsrilu0_zero_pivot()

rocsparse_status rocsparse_bsrilu0_zero_pivot(rocsparse_handle handle, rocsparse_mat_info info, rocsparse_int *position)

Incomplete LU factorization with 0 fill-ins and no pivoting using BSR storage format.

rocsparse_bsrilu0_zero_pivot returns rocsparse_status_zero_pivot if either a structural or numerical zero has been found during rocsparse_sbsrilu0(), rocsparse_dbsrilu0(), rocsparse_cbsrilu0() or rocsparse_zbsrilu0() computation. The first zero pivot \(j\) at \(A_{j,j}\) is stored in position, using the same index base as the BSR matrix.

position can be in host or device memory. If no zero pivot has been found, position is set to -1 and rocsparse_status_success is returned instead.

Note

If a zero pivot is found, position \(=j\) means that either the diagonal block \(A_{j,j}\) is missing (structural zero) or the diagonal block \(A_{j,j}\) is not invertible (numerical zero).

Note

rocsparse_bsrilu0_zero_pivot is a blocking function. It might influence performance negatively.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] info: structure that holds the information collected during the analysis step.

  • [inout] position: pointer to zero pivot \(j\), can be in host or device memory.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info or position pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.

  • rocsparse_status_zero_pivot: zero pivot has been found.

rocsparse_bsrilu0_numeric_boost()

rocsparse_status rocsparse_sbsrilu0_numeric_boost(rocsparse_handle handle, rocsparse_mat_info info, int enable_boost, const float *boost_tol, const float *boost_val)
rocsparse_status rocsparse_dbsrilu0_numeric_boost(rocsparse_handle handle, rocsparse_mat_info info, int enable_boost, const double *boost_tol, const double *boost_val)
rocsparse_status rocsparse_cbsrilu0_numeric_boost(rocsparse_handle handle, rocsparse_mat_info info, int enable_boost, const float *boost_tol, const rocsparse_float_complex *boost_val)
rocsparse_status rocsparse_zbsrilu0_numeric_boost(rocsparse_handle handle, rocsparse_mat_info info, int enable_boost, const double *boost_tol, const rocsparse_double_complex *boost_val)

Incomplete LU factorization with 0 fill-ins and no pivoting using BSR storage format.

rocsparse_bsrilu0_numeric_boost enables the user to replace a numerical value in an incomplete LU factorization. boost_tol is used to determine whether a numerical value is replaced by boost_val, such that \(A_{j,j} = \text{boost_val}\) if \(\text{boost_tol} \ge \left|A_{j,j}\right|\).

Note

The boost value is enabled by setting enable_boost to 1 or disabled by setting enable_boost to 0.

Note

boost_tol and boost_val can be in host or device memory.

Parameters
  • [in] handle: handle to the rocsparse library context queue.

  • [in] info: structure that holds the information collected during the analysis step.

  • [in] enable_boost: enable/disable numeric boost.

  • [in] boost_tol: tolerance to determine whether a numerical value is replaced or not.

  • [in] boost_val: boost value to replace a numerical value.

Return Value
  • rocsparse_status_success: the operation completed successfully.

  • rocsparse_status_invalid_handle: the library context was not initialized.

  • rocsparse_status_invalid_pointer: info, boost_tol or boost_val pointer is invalid.

  • rocsparse_status_internal_error: an internal error occurred.
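The replacement rule stated above, \(A_{j,j} = \text{boost_val}\) if \(\text{boost_tol} \ge \left|A_{j,j}\right|\), can be written out for a single real-valued entry. This is a host-side sketch only: the function apply_numeric_boost is a hypothetical name, and how the library applies the rule inside a diagonal block on the device is not shown here.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical scalar illustration of the numeric boost rule.
// Returns boost_val when boosting is enabled and the magnitude of the
// pivot does not exceed boost_tol; otherwise returns the pivot unchanged.
double apply_numeric_boost(double pivot, bool enable_boost, double boost_tol, double boost_val)
{
    if(enable_boost && boost_tol >= std::fabs(pivot))
    {
        return boost_val;
    }
    return pivot;
}
```

Note that the comparison is inclusive, so a pivot whose magnitude equals boost_tol is replaced as well.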

rocsparse_bsrilu0_buffer_size()

rocsparse_status rocsparse_sbsrilu0_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const float *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_dbsrilu0_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const double *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_cbsrilu0_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_float_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int *bsr_col_ind, rocsparse_int block_dim, rocsparse_mat_info info, size_t *buffer_size)
rocsparse_status rocsparse_zbsrilu0_buffer_size(rocsparse_handle handle, rocsparse_direction dir, rocsparse_int mb, rocsparse_int nnzb, const rocsparse_mat_descr descr, const rocsparse_double_complex *bsr_val, const rocsparse_int *bsr_row_ptr, const rocsparse_int