Week 3 - Boolean Operations, != and == for CSR, CSC, and BSR

This week focused on implementing support for the == and != boolean comparisons for the BSR, CSC, and CSR sparse matrix types. The culmination of my work this week is this pull request.

!=

For sparse matrices, the != (not equal to) comparison generally has an output that can be efficiently represented by a sparse matrix. The exception is comparing the whole matrix with a nonzero scalar, where nearly every entry of the result is True.
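To make the density argument concrete, here is a minimal sketch in plain Python. It does not use SciPy; the dict-of-keys representation and the names ne_sparse and ne_scalar are hypothetical, chosen only for illustration. It counts how many True entries each kind of != comparison produces.

```python
# Hypothetical toy representation: a sparse matrix is a dict mapping
# (row, col) -> nonzero value; anything not stored is an implicit zero.

def ne_sparse(a, b):
    """Elementwise a != b for two toy sparse matrices; stores only True entries."""
    result = {}
    for key in set(a) | set(b):
        if a.get(key, 0) != b.get(key, 0):
            result[key] = True
    return result


def ne_scalar(a, scalar, shape):
    """Elementwise a != scalar; stores only True entries."""
    rows, cols = shape
    result = {}
    for i in range(rows):
        for j in range(cols):
            if a.get((i, j), 0) != scalar:
                result[(i, j)] = True
    return result


shape = (100, 100)
a = {(0, 0): 1.0, (5, 7): 2.0, (42, 3): 3.0}
b = {(0, 0): 1.0, (5, 7): 9.0, (10, 10): 4.0}

sparse_result = ne_sparse(a, b)             # True only where entries differ
zero_result = ne_scalar(a, 0, shape)        # True only at a's stored entries
nonzero_result = ne_scalar(a, 1.0, shape)   # True almost everywhere

print(len(sparse_result), len(zero_result), len(nonzero_result))
```

In this toy setup the first two results have a handful of True entries, while the comparison with a nonzero scalar fills almost the whole 100 x 100 matrix.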

==

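The boolean equal to comparison in general has an output that is not efficiently represented by a sparse matrix, since all the co-occurring zero entries return True. The exception is comparison with a nonzero scalar, whose result is True only at the stored entries equal to that scalar; comparison with zero, by contrast, is True at every implicit zero.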

Adding a C++ routine for this operation using the existing code would be problematic. The binop routines only apply the given binop to element pairs in which at least one element is nonzero, so the co-occurring zero elements would not return True like they should. Because of this, and because == is inefficient in general, I did not bother to implement the == operation in sparsetools. Instead, the == operation is computed by applying the != operation and inverting the result.
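The problem with co-occurring zeros, and the workaround of inverting !=, can be sketched in a few lines of plain Python. This is an illustration under assumed names (binop_on_stored, dense_bool, invert), not the sparsetools code; it only mimics the described behaviour of visiting positions where at least one operand is stored.

```python
import operator


def binop_on_stored(a, b, op):
    """Apply op only where at least one operand has a stored (nonzero) entry."""
    out = {}
    for key in set(a) | set(b):
        if op(a.get(key, 0), b.get(key, 0)):
            out[key] = True
    return out


def dense_bool(stored, shape):
    """Expand the stored True entries into a dense list-of-lists of bools."""
    rows, cols = shape
    return [[(i, j) in stored for j in range(cols)] for i in range(rows)]


def invert(grid):
    """Elementwise logical not of a dense boolean grid."""
    return [[not v for v in row] for row in grid]


shape = (2, 2)
a = {(0, 0): 1, (0, 1): 2}
b = {(0, 0): 1, (1, 0): 3}

# Applying == through the binop misses position (1, 1), where both are zero.
naive_eq = dense_bool(binop_on_stored(a, b, operator.eq), shape)

# Computing != first and inverting gives the correct answer everywhere.
eq_via_ne = invert(dense_bool(binop_on_stored(a, b, operator.ne), shape))

print(naive_eq)
print(eq_via_ne)
```

The naive result reports False at (1, 1) even though both matrices hold a zero there; the inverted != result reports True.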

Implementation

C++ & SWIG level

Implementing the routines for != and == in sparsetools/ was pretty easy using the existing code. There are handy csr_binop_csr and bsr_binop_bsr functions which take a C++ functor as an argument for the binop. The only problem is that the desired output type (bool) is not the same as the input type, so the output is not bool by default.

Python level

In Python, the != and == operations can be overridden by defining the methods __ne__ and __eq__. I did this to handle the cases where the other object is a scalar, a dense matrix, or a sparse matrix.

I defined these methods in compressed.py, which defines the base class for BSR, CSC, and CSR.

I also defined these methods in base.py; there they just convert the sparse matrix to CSR and perform the desired operation.
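The layering described above might look roughly like the following sketch. Every class and method name other than __ne__, __eq__ and tocsr is hypothetical, and the comparison returns a dense list of lists rather than a sparse boolean matrix to keep the example short; it is not the code from the pull request.

```python
from numbers import Number


class SparseBase:
    """Stand-in for the generic base class: convert to CSR and delegate."""

    def tocsr(self):
        raise NotImplementedError

    def __ne__(self, other):
        return self.tocsr().__ne__(other)

    def __eq__(self, other):
        return self.tocsr().__eq__(other)


class ToyCompressed(SparseBase):
    """Stand-in for the compressed formats, where the real work happens."""

    def __init__(self, shape, entries):
        self.shape = shape
        self.entries = {k: v for k, v in entries.items() if v != 0}

    def tocsr(self):
        return self

    def _lookup(self, other):
        # Dispatch on the type of the other operand: sparse, scalar, or dense.
        if isinstance(other, SparseBase):
            entries = other.tocsr().entries
            return lambda i, j: entries.get((i, j), 0)
        if isinstance(other, Number):
            return lambda i, j: other
        if isinstance(other, list):
            return lambda i, j: other[i][j]
        return None

    def __ne__(self, other):
        lookup = self._lookup(other)
        if lookup is None:
            return NotImplemented
        rows, cols = self.shape
        return [[self.entries.get((i, j), 0) != lookup(i, j)
                 for j in range(cols)] for i in range(rows)]

    def __eq__(self, other):
        # == is computed as the inverse of !=.
        ne = self.__ne__(other)
        if ne is NotImplemented:
            return NotImplemented
        return [[not v for v in row] for row in ne]


class ToyCOO(SparseBase):
    """A non-compressed format that relies on the base-class fallback."""

    def __init__(self, shape, triples):
        self.shape = shape
        self.triples = list(triples)

    def tocsr(self):
        return ToyCompressed(self.shape, {(i, j): v for i, j, v in self.triples})


m = ToyCompressed((2, 2), {(0, 0): 1, (1, 1): 2})
coo = ToyCOO((2, 2), [(0, 0, 1), (0, 1, 5)])

print(m != coo)               # sparse vs sparse
print(coo != m)               # handled by the base-class fallback
print(m == 0)                 # scalar
print(m != [[1, 0], [0, 3]])  # dense
```

The design point is that only the compressed class needs to know how to compare; every other format gets the comparison for free through tocsr().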

__bool__

There was, however, some complication with how these worked when a dense ndarray is the first argument. Ideally, the ndarray would ask the sparse matrix what to do. However, this did not happen whenever the sparse matrix's __bool__ method (__nonzero__ in Python 2.x) returned True or False; I still don't know why. It did work when __bool__ raised an error or returned something other than True or False. Since NumPy's ndarrays raise an error when __bool__ is called, I thought this was a reasonable change to make to sparse matrices' __bool__. Ideally, though, I'd like to better understand what is going on so I don't have to make an incompatible change to SciPy's API.
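As a rough illustration of the change described above, a __bool__ that raises could look like the sketch below. The ToyMatrix class, the helper truthiness_error and the error message are all hypothetical; this is not the SciPy code, and it does not explain why the dense-first comparison behaved differently.

```python
class ToyMatrix:
    """Toy container whose truth value is deliberately undefined."""

    def __init__(self, values):
        self.values = list(values)

    def __bool__(self):
        # Refuse to collapse a whole matrix into a single True/False.
        raise ValueError("The truth value of a matrix with more than one "
                         "element is ambiguous.")

    __nonzero__ = __bool__  # Python 2 spelling of the same hook


def truthiness_error(obj):
    """Return the error message raised by bool(obj), or None if it succeeds."""
    try:
        bool(obj)
    except ValueError as exc:
        return str(exc)
    return None


print(truthiness_error(ToyMatrix([1, 2])))
print(truthiness_error([1, 2]))
```

With this hook in place, code such as `if matrix:` fails loudly instead of silently picking one interpretation, which is also why it is an incompatible change for callers that relied on the old truth value.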

This change broke one test in scipy/io/matlab/tests/test_mio.py. There could be more serious issues relating to this change; they are currently being discussed on the pull request.
