Week 3 - Boolean Operations, != and == for CSR, CSC, and BSR
This week focused on implementing support for the == and != boolean comparisons for the BSR, CSC, CSR sparse matrix types. The culmination of my work this week is this pull request.
!=
For sparse matrices, the != (not equal to) comparison generally has an output that can be efficiently represented by a sparse matrix. The exception is comparing the whole matrix with a nonzero scalar, where every implicit zero compares as True.
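To make the sparsity argument concrete, here is a minimal sketch that assumes a toy dict-of-keys model (a dict mapping (row, col) to a value) rather than SciPy's actual storage. The helper names ne_sparse and ne_scalar are hypothetical and exist only for this illustration.

```python
# Toy model: a sparse matrix is a dict {(row, col): value} plus a shape.
# This is an illustration only, not SciPy's storage format.

def ne_sparse(a, b):
    """Return the positions where two dict-of-keys matrices differ."""
    out = set()
    for key in a.keys() | b.keys():
        # Positions stored in neither operand are 0 != 0, i.e. False.
        if a.get(key, 0) != b.get(key, 0):
            out.add(key)
    return out


def ne_scalar(a, shape, scalar):
    """Return the positions where the matrix differs from a scalar."""
    rows, cols = shape
    return {(i, j) for i in range(rows) for j in range(cols)
            if a.get((i, j), 0) != scalar}


shape = (100, 100)
a = {(0, 0): 1.0, (5, 7): 2.0}
b = {(0, 0): 1.0, (9, 9): 3.0}

sparse_result = ne_sparse(a, b)          # at most len(a) + len(b) entries
zero_result = ne_scalar(a, shape, 0)     # same pattern as the stored entries
dense_result = ne_scalar(a, shape, 4.0)  # every implicit zero is True

print(len(sparse_result), len(zero_result), len(dense_result))
```

In this toy case the first two results hold two entries each, while the comparison with a nonzero scalar marks all 10,000 positions.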
==
The boolean equal to comparison in general has an output that is not efficiently represented by sparse matrices, since all the co-occurring zero entries return True. The exception is comparison with a nonzero scalar, where only stored entries can match.
Adding a C++ routine for this operation using the existing code would be problematic. The binop routines only apply the given binop to element pairs in which at least one of the elements is nonzero, so all the co-occurring zero elements would not return True like they should. Because of this, and because == is in general inefficient, I did not bother to implement the == operation in sparsetools. Instead, the == operation is computed by applying the != operation and inverting the result.
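A minimal sketch of that idea, again assuming a toy dict-of-keys model and hypothetical helper names (ne_stored, eq_via_ne) rather than the real sparsetools code: != only needs to visit stored positions, and == is then the complement over the full shape.

```python
def ne_stored(a, b):
    """Apply != only to positions stored in at least one operand.

    Positions where both operands are implicitly zero are never visited,
    which is harmless for != because 0 != 0 is False anyway.
    """
    return {key for key in a.keys() | b.keys()
            if a.get(key, 0) != b.get(key, 0)}


def eq_via_ne(a, b, shape):
    """Compute == as the complement of != over the full shape."""
    differ = ne_stored(a, b)
    rows, cols = shape
    return {(i, j) for i in range(rows) for j in range(cols)
            if (i, j) not in differ}


shape = (3, 3)
a = {(0, 0): 1, (1, 2): 5}
b = {(0, 0): 1, (2, 1): 7}

differ = ne_stored(a, b)        # positions (1, 2) and (2, 1)
equal = eq_via_ne(a, b, shape)  # the remaining 7 of 9 positions
```

The complement touches every position, which is why the == result is dense in general even though the != step stays cheap.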
Implementation
C++ & SWIG level
Implementing the routines for != and == in sparsetools/ was pretty easy using the existing code. There are handy csr_binop_csr and bsr_binop_bsr functions which take a C++ function object as an argument for the binop. The only problem is that the output type is not the same as the input type, so the output is not bool by default.
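For readers who have not seen this kind of routine, the following is a rough Python analogue of a row-merging CSR binop. It is not the sparsetools code: the function name csr_binop_csr_sketch, its argument layout, and the assumption that column indices are sorted with no duplicates are all mine. It does show how the operator is passed in as an argument and why the output values can have a different type from the inputs.

```python
import operator


def csr_binop_csr_sketch(n_row, Ap, Aj, Ax, Bp, Bj, Bx, op):
    """Apply op to two CSR matrices, keeping only nonzero results.

    Assumes sorted column indices and no duplicate entries per row.
    Pairs where both operands are implicit zeros are never visited.
    """
    Cp, Cj, Cx = [0], [], []
    for i in range(n_row):
        a_pos, a_end = Ap[i], Ap[i + 1]
        b_pos, b_end = Bp[i], Bp[i + 1]
        while a_pos < a_end or b_pos < b_end:
            a_col = Aj[a_pos] if a_pos < a_end else None
            b_col = Bj[b_pos] if b_pos < b_end else None
            if b_col is None or (a_col is not None and a_col < b_col):
                # Entry stored only in A: the B value is an implicit zero.
                col, result = a_col, op(Ax[a_pos], 0)
                a_pos += 1
            elif a_col is None or b_col < a_col:
                # Entry stored only in B: the A value is an implicit zero.
                col, result = b_col, op(0, Bx[b_pos])
                b_pos += 1
            else:
                # Entry stored in both operands.
                col, result = a_col, op(Ax[a_pos], Bx[b_pos])
                a_pos += 1
                b_pos += 1
            if result:
                Cj.append(col)
                Cx.append(result)
        Cp.append(len(Cj))
    return Cp, Cj, Cx


# A = [[1, 0, 2], [0, 0, 3]] and B = [[1, 0, 0], [0, 4, 3]] in CSR form.
Ap, Aj, Ax = [0, 2, 3], [0, 2, 2], [1, 2, 3]
Bp, Bj, Bx = [0, 1, 3], [0, 1, 2], [1, 4, 3]

ne_result = csr_binop_csr_sketch(2, Ap, Aj, Ax, Bp, Bj, Bx, operator.ne)
add_result = csr_binop_csr_sketch(2, Ap, Aj, Ax, Bp, Bj, Bx, operator.add)
```

With operator.add the output values are integers like the inputs, while with operator.ne they are booleans. In a statically typed C++ routine that difference has to be handled explicitly.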
Python level
In Python, the != and == operations can be overridden by defining the methods __ne__ and __eq__. I did this to handle the cases where the other object is a scalar, a dense matrix, or a sparse matrix.
I defined these methods in compressed.py, which defines the base class for BSR, CSC, and CSR.
I also defined these methods in base.py; there they just convert the sparse matrix to CSR and perform the desired operation.
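The dispatch on the type of the other operand can be sketched as follows. ToySparse is a hypothetical dict-of-keys class invented for this example, with a nested list standing in for a dense matrix; it is not how compressed.py or base.py is actually written.

```python
import numbers


class ToySparse:
    """Hypothetical dict-of-keys matrix; not SciPy's class hierarchy."""

    def __init__(self, shape, data=None):
        self.shape = shape
        # Store only nonzero entries.
        self.data = {k: v for k, v in (data or {}).items() if v != 0}

    def _positions(self):
        rows, cols = self.shape
        return [(i, j) for i in range(rows) for j in range(cols)]

    def _other_value(self, other, key):
        if isinstance(other, numbers.Number):
            return other                    # scalar: same value everywhere
        if isinstance(other, ToySparse):
            return other.data.get(key, 0)   # sparse: implicit zeros
        return other[key[0]][key[1]]        # dense nested list

    def __ne__(self, other):
        if not isinstance(other, (numbers.Number, ToySparse, list)):
            return NotImplemented
        differ = {key: True for key in self._positions()
                  if self.data.get(key, 0) != self._other_value(other, key)}
        return ToySparse(self.shape, differ)

    def __eq__(self, other):
        # == is computed by inverting the result of !=.
        differ = self.__ne__(other)
        if differ is NotImplemented:
            return NotImplemented
        same = {key: True for key in self._positions()
                if key not in differ.data}
        return ToySparse(self.shape, same)


a = ToySparse((2, 2), {(0, 0): 1, (1, 1): 2})
b = ToySparse((2, 2), {(0, 0): 1})

ne_sparse_result = (a != b).data            # sparse operand
ne_scalar_result = (a != 0).data            # scalar operand
eq_dense_result = (a == [[1, 0], [0, 3]]).data  # dense operand
```

Returning NotImplemented for unknown types lets Python try the other operand's reflected method instead of failing outright.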
__bool__
There was, however, some complication with how these worked when a dense ndarray is the first argument. Ideally it would ask the sparse matrix what to do, but this was not happening whenever the sparse matrix's __bool__ (__nonzero__ in Python 2.x) method returned True or False; I still don't know why. It did, however, work when __bool__ raised an error or returned something other than True or False. Since NumPy's ndarrays raise an error when __bool__ is called on an array with more than one element, I thought this was a reasonable change to make to the sparse matrices' __bool__.
Although ideally, I'd like to better understand what is going on so I
don't have to make an incompatible change to SciPy's API.
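As a sketch of the kind of change described, here is a toy class whose __bool__ raises for anything larger than a single element, mirroring what NumPy does for multi-element arrays. The class name AmbiguousTruthMatrix, the error message, and the handling of the 1x1 case are assumptions for illustration, not the code from the pull request.

```python
class AmbiguousTruthMatrix:
    """Hypothetical matrix whose truth value is undefined unless it is 1x1."""

    def __init__(self, shape, data=None):
        self.shape = shape
        self.data = dict(data or {})

    def __bool__(self):
        rows, cols = self.shape
        if rows * cols == 1:
            # A single element has an unambiguous truth value.
            return bool(self.data.get((0, 0), 0))
        raise ValueError("The truth value of a matrix with more than one "
                         "element is ambiguous.")

    __nonzero__ = __bool__  # Python 2.x spelling of the same hook


m = AmbiguousTruthMatrix((2, 2), {(0, 0): 1})
try:
    bool(m)
    raised = False
except ValueError:
    raised = True

single_true = bool(AmbiguousTruthMatrix((1, 1), {(0, 0): 3}))
single_false = bool(AmbiguousTruthMatrix((1, 1)))
```

The incompatibility mentioned above comes from code such as `if matrix:` which previously received True or False and would now raise.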
This change broke one test in scipy/io/matlab/tests/test_mio.py. There could be more serious issues relating to this change; they are currently being discussed on the pull request.