week 4 - Sparse Matrix Inequalities
This week I worked on adding support for inequalities to sparse matrices. The pull request is here (still pending). I also modified, the c++ routines used to produce a boolean output, pull request here. This is so these inequalities don't have to be cast to bool in python. Once both of these are accepted the inequalities will be using the c++ routines that produce boolean output.
These are separate pull requests but they are related and one will need to be rebased once the other is accepted. Or I may combine them.
Each of the four basic inequalities has its own particular quirks which
make it more or less efficient with sparse matrices. Here we consider
A
to be a sparse matrix. The cases where B
is a scalar, or sparse
are considered. (By scalar we effectively mean a matrix the same size as
A
with every element as the scalar value B
.
A < B
For scalar B, this operation is only efficient if B < 0
otherwise the
resulting matrix will be dense.
For sparse A
and B
this is also efficient.
A > B
For scalar B
the efficiency is the opposite of the less than operator.
That is B > 0
has efficient output. Similarly to <
this operation is
efficient with other sparse matrices.
A <= B
This operation is pretty much useless. The only case which make this
efficient is when B < 0
. But this case is already covered by <
, and
<
can be used efficiently with sparse matrices. So why use <=
? I'll
discuss the pros and cons of having these non-strict inequalities in a
moment.
Another disadvantage of non strict inequalities like these is that they
raies NotImplementedError
when comparing with 0. This is because of
the nature of the c++ routines. When comparing every pair of elements
they don't consider cases where both are zero.
A >= B
Like <=
this operation is only efficient for one case, when B > 0
.
And also like <=
, it is not very useful.
Why have non strict inequalities?
I can't really think of uses for >=
and <=
but they might exist, I
added them because they were easy to implement once I had done >
and
<
. In practice, they don't slow the usage down. But removing non -
strict inequalities removes 24,990 lines of code.
Implementation
The inequality operations are implemented as c++ routines, these are wrapped to provide a python interface using SWIG, then the various inequalities are added appropriately.
C++
The inequalities routines were implemented by reusing existing routines
like csr_binop_csr
. Overrides for inequalities had to be added to
complex_wrapper.h
too.
The binop routines had to altered, another class was added to their
templates, T2
, for the type of data out. In theory this could be
something other than bool.
SWIG
To handle boolean output data a new swig
macro
was added to create typemaps that have npy_bool_wrapper
as this output
class. I then use these to instantiate the boolean operations.
Python
Here the associated python special methods corresponding to the
inequalities were added in
compressed.py
.
These operations can be used with a scalar, dense, or sparse matrix.
Numpy style broadcasting is not implemented.
To produce a bool output by default for boolean comparisons, I altered
the
_binopt
function to pass the c++ routines a matrix to use for output with a bool
dtype.
Problems
When testing the inequalities with dense matrices, I could not get the proper
behavior with a dense matrix on the left hand side. Previously, with !=
and
==
I modified the __bool__
method but that did not work in this case. This
is because of the same problem I will be dealing with in the next stage of my
proposal, interactions with numpy ufuncs.
Comments
Comments powered by Disqus