week 5 - Various bool stuff
I spent this week tying up loose ends relating to the sparse package.
DeprecationWarning
A recent change to NumPy caused a DeprecationWarning to be raised whenever there was (potentially) implicit casting between dtypes.
DeprecationWarning: Implicitly casting between incompatible kinds.
In a future numpy release, this will raise an error. Use
casting="unsafe" if this is intentional.
This was happening a lot in the sparse test suite, where in-place division and multiplication are tested. Since this behavior is being deprecated, I removed the tests for the affected cases, but not without some trouble first.
Recall that in Python 3 / is always true division, so there is always the potential to change the type from int to float. An in-place operation, however, writes its result back into the original operand, so the data can still be int afterwards. In cases like this, where there is no difference between the input and result types, NumPy will still raise the warning. So simply checking for a type difference between input and result was not enough to programmatically remove the deprecated tests (which I noticed when my first patch did not work). Eventually I just added special cases to the tests for different data.
Output Type for SWIG Routines
The compressed sparse matrix types (BSR, CSC, CSR) each have a set of routines for performing binary operations (binop) between two sparse matrices. These routines, however, only returned data of the same type as the input data, so the results of boolean operations had to be cast to bool at the Python level. I modified the sparsetools routines to output bool data where appropriate.
Implementation
This involved changes at every level of the codebase (C++, SWIG, Python), so it is a good demonstration of how these levels work together.
C++
The binop routines are implemented as function templates in C++, where the template arguments are (including the new one I added):
- I: Always int, for storing nnz (number of non-zeros), vector lengths, etc.
- T: The input data type.
- T2: The output data type, which I added.
Adding this new argument required minimal changes to the existing code. Basically I just changed T to T2 in a few places. Since the output type we used here was bool, I also had to add a conversion from the complex wrapper class to bool.
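To make the role of the three template arguments concrete, here is a minimal sketch. It is not the sparsetools code: dense_binop_sketch and not_equal_example are hypothetical names, and the routine works on dense arrays rather than compressed sparse structures so that it stays short. It only illustrates how an output type T2 can differ from the input type T.

```cpp
#include <cassert>
#include <functional>

// Hypothetical, simplified stand-in for a binop routine (not the real
// sparsetools code). It works on dense arrays to keep the sketch short.
//   I  : index type (array lengths, counts)
//   T  : input data type
//   T2 : output data type
template <class I, class T, class T2, class BinaryOp>
void dense_binop_sketch(const I n, const T a[], const T b[], T2 out[],
                        const BinaryOp& op)
{
    assert(n >= 0);  // lengths are never negative
    for (I i = 0; i < n; i++) {
        // The result of op is stored as T2, which may differ from T.
        out[i] = op(a[i], b[i]);
    }
}

// Example instantiation: int input data, bool output data.
inline void not_equal_example(bool out[3])
{
    const int a[] = {1, 0, 3};
    const int b[] = {1, 2, 0};
    dense_binop_sketch<int, int, bool>(3, a, b, out, std::not_equal_to<int>());
}
```

With T2 fixed to bool at instantiation time, the caller gets boolean data straight from the routine and no cast is needed afterwards.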
SWIG
At the SWIG level, typemaps define the types of the function templates' inputs and outputs. I had to add new typemaps to instantiate the functions with boolean output.
Python
Since the new binop routines now output the correct dtype, I removed all the casting to bool. However, in the cases where boolean data must be returned, Python has to pass an empty matrix with the correct dtype to the binop routines. So _binopt now checks what the operation is to decide what kind of empty matrix it should create.
Sparse Bool Wrapper
Pull request. Previously the C++ routines in sparsetools defined the bool dtype with:
typedef npy_int8 npy_bool_wrapper;
This was so that a bool value would occupy one byte. But since the value is stored as an int8, only the low 8 bits are kept, so it rolls over to zero when the underlying integer reaches 256. Like this:
In [2]: a = sp.csr_matrix([True, False])
In [3]: for _ in range(8):
   ...:     a = a + a
   ...:     print(a.todense())
   ...:
[[ True False]]
[[ True False]]
[[ True False]]
[[ True False]]
[[ True False]]
[[ True False]]
[[ True False]]
[[False False]] # <----- !
So I rewrote npy_bool_wrapper as a class with one char data member (recall that in C++ char is 1 byte), and added the required arithmetic operator overloads for boolean algebra, e.g. 1 + 1 = 1.
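The difference between the two approaches can be sketched as follows. This is an illustration only: bool_wrapper_sketch and double_n_times are hypothetical names, not the actual SciPy class, and the 8-bit comparison uses std::uint8_t rather than a signed int8 so that the wraparound is well defined in C++ (the low byte ends up the same).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch of a one-byte boolean wrapper (not the SciPy class).
class bool_wrapper_sketch {
public:
    char value;  // single data member, so the object is one byte in practice

    bool_wrapper_sketch(bool x = false) : value(x ? 1 : 0) {}

    explicit operator bool() const { return value != 0; }

    // Boolean addition is logical OR: 1 + 1 = 1.
    bool_wrapper_sketch operator+(const bool_wrapper_sketch& other) const {
        return bool_wrapper_sketch(value != 0 || other.value != 0);
    }

    // Boolean multiplication is logical AND.
    bool_wrapper_sketch operator*(const bool_wrapper_sketch& other) const {
        return bool_wrapper_sketch(value != 0 && other.value != 0);
    }
};

// Repeats x = x + x, like the loop in the session above.
template <class T>
T double_n_times(T x, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++) {
        x = static_cast<T>(x + x);
    }
    return x;
}
```

Doubling a plain 8-bit integer that starts at 1 eight times leaves 0 in the byte, while the wrapper stays true no matter how many times it is added to itself, because its addition never produces anything other than 0 or 1.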
Build Failures
After I modified complex_ops.h in my pull request for sparse matrix inequalities, people were getting build failures with the clang and Intel compilers. This was related to ill-defined overloads of the boolean comparison operators. As an example, previously we had:
bool operator !=(const c_type& B) const{
    return npy_type::real != B || npy_type::imag != c_type(0);
}
Here c_type is a template argument used as the type of the underlying real and imag members, so comparisons with anything that was not a c_type were ambiguous. I therefore redefined the boolean comparisons as template functions, like:
template<class T>
bool operator !=(const T& B) const{
    return npy_type::real != B || npy_type::imag != T(0);
}
This should handle all comparisons, within reason.
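For context, here is a minimal, self-contained sketch of the pattern. It assumes a wrapper class that inherits real and imag members from a plain struct; complex_struct_sketch, complex_wrapper_sketch and cdouble_sketch are hypothetical names, and the real complex_ops.h differs in detail.

```cpp
#include <cassert>

// Hypothetical stand-in for a plain C complex struct.
struct complex_struct_sketch {
    double real;
    double imag;
};

// Hypothetical, simplified version of the wrapper pattern described above.
template <class c_type, class npy_type>
class complex_wrapper_sketch : public npy_type {
public:
    explicit complex_wrapper_sketch(const c_type r = c_type(0),
                                    const c_type i = c_type(0)) {
        npy_type::real = r;
        npy_type::imag = i;
    }

    // Comparison with another wrapper of the same type.
    bool operator!=(const complex_wrapper_sketch& B) const {
        return npy_type::real != B.real || npy_type::imag != B.imag;
    }

    // Comparison with any scalar type T; the scalar is treated as purely
    // real, so the deduced T is an exact match for int, float, double, etc.
    template <class T>
    bool operator!=(const T& B) const {
        return npy_type::real != B || npy_type::imag != T(0);
    }
};

typedef complex_wrapper_sketch<double, complex_struct_sketch> cdouble_sketch;
```

Because T is deduced from the argument, comparing a cdouble_sketch with an int or a double picks the template directly, without the compiler having to choose between competing implicit conversions.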