week 5 - Various bool stuff

I spent this week tying up loose ends relating to the sparse package.

DeprecationWarning

A recent change to NumPy caused a DeprecationWarning to be raised whenever there was (potentially) implicit casting between dtypes:

 DeprecationWarning: Implicitly casting between incompatible kinds. 
In a future numpy release, this will raise an error. Use
casting="unsafe" if this is intentional.

This was happening a lot in the sparse test suite, where in-place division and multiplication are tested. Since this behavior is being deprecated, I removed the tests for the affected cases, but not without some trouble first.

Recall that in Python 3, / is always true division, so dividing integers can produce floats. With an in-place operation, however, the result is written back into the original array, so the input and result dtypes are identical even though an implicit cast happened along the way, and NumPy still emits the warning. Simply checking for a type difference between input and result was therefore not enough to programmatically remove the deprecated tests (which I noticed when my first patch did not work). Eventually I just added special cases to the tests for different data.

Output Type for SWIG Routines

The compressed types of sparse matrices (BSR, CSC, CSR) each have a set of routines for performing binary operations (binop) between two sparse matrices. These routines, however, only returned data of the same type as the input data, so the results of boolean operations had to be cast to bool at the Python level. I modified the sparsetools routines to output bool data where appropriate.

Implementation

This involved changes at every level of the codebase (C++, SWIG, Python), so it is a good demonstration of how these levels work together.

C++

The binop routines are implemented as function templates in C++, where the template arguments are (including the new one I added):

  • I: Always int for storing nnz (number of non-zeros), vector length etc.

  • T: The input data type.

  • T2: The output data type, which I added.

Adding this new argument required minimal changes to the existing code. Basically I just changed T to T2 in a few places.

Since the output type we used here was bool I also had to add a conversion from the complex wrapper class to bool.
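To make the idea of a separate output type concrete, here is a minimal sketch. It is not the actual sparsetools code: binop_dense is a hypothetical, simplified routine that works on plain dense arrays rather than CSR/CSC/BSR structures, but it uses the same I, T and T2 template arguments described above.

```cpp
#include <cassert>
#include <functional>

// Hypothetical, simplified stand-in for a sparsetools binop routine.
// I  : index type (int in sparsetools)
// T  : input data type
// T2 : output data type, which may differ from T (e.g. bool)
template <class I, class T, class T2, class BinaryOp>
void binop_dense(const I n, const T Ax[], const T Bx[], T2 Cx[],
                 const BinaryOp& op)
{
    assert(n >= 0);
    for (I i = 0; i < n; i++) {
        // The result of op is converted to T2 on assignment, so a
        // comparison on double inputs can be stored directly as bool.
        Cx[i] = op(Ax[i], Bx[i]);
    }
}
```

Called as binop_dense<int, double, bool>(3, a, b, c, std::not_equal_to<double>()), this fills a bool array c from two double arrays a and b, with no cast needed afterwards.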

SWIG

At the SWIG level, typemaps define the types of the function templates' inputs and outputs. I had to add new typemaps to instantiate functions with boolean output.

Python

Since the new binop routines now output the correct dtype, I removed all the casting to bool. However, in the cases where boolean data must be returned, Python has to pass an empty matrix with the correct dtype to the binop routines. So _binopt now checks what the operation is to decide what kind of empty matrix it should create.

Sparse Bool Wrapper

Pull request. Previously the C++ routines in sparsetools defined the bool dtype with:

typedef npy_int8 npy_bool_wrapper;

This was so that a bool value would be one byte. But since it was stored as an int8, the value would wrap around to zero (False) once the underlying integer reached 256, like this:

In [1]: import scipy.sparse as sp

In [2]: a = sp.csr_matrix([True, False])

In [3]: for _ in range(8):
   ...:     a = a + a
   ...:     print(a.todense())
   ...:     
[[ True False]]
[[ True False]]
[[ True False]]
[[ True False]]
[[ True False]]
[[ True False]]
[[ True False]]
[[False False]]        # <----- !

So I rewrote npy_bool_wrapper as a class with one char data member (recall that in C++ a char is 1 byte) and added the arithmetic operator overloads required for boolean algebra, e.g. 1 + 1 = 1.
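As a rough illustration of the approach, here is a minimal sketch of such a class. The name bool_wrapper, the as_bool method and the add_to_self helper are all hypothetical; the real npy_bool_wrapper in sparsetools may be laid out differently and supports more operators.

```cpp
#include <cassert>

// Hypothetical sketch of a one-byte boolean wrapper class.
class bool_wrapper {
public:
    char value;  // always 0 or 1, so it can never roll over

    bool_wrapper() : value(0) {}
    explicit bool_wrapper(bool x) : value(x ? 1 : 0) {}

    bool as_bool() const { return value != 0; }

    // Boolean algebra: + acts as logical OR, so 1 + 1 = 1.
    bool_wrapper operator+(const bool_wrapper& other) const {
        return bool_wrapper(value != 0 || other.value != 0);
    }
    // Boolean algebra: * acts as logical AND.
    bool_wrapper operator*(const bool_wrapper& other) const {
        return bool_wrapper(value != 0 && other.value != 0);
    }
};

// Mirrors the loop from the IPython session above: add a value to
// itself `times` times and return the result.
inline bool_wrapper add_to_self(bool_wrapper a, int times) {
    assert(times >= 0);
    for (int i = 0; i < times; i++) {
        a = a + a;
    }
    return a;
}
```

With this kind of wrapper, adding True to itself eight times still gives True, because the stored value saturates at 1 instead of growing towards 256.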

Build Failures

After I modified complex_ops.h in my pull request for sparse matrix inequalities, people were getting build failures with the Clang and Intel compilers. This was related to ill-defined overloads of the boolean comparison operators. As an example, previously we had:

bool operator !=(const c_type& B) const{
    return npy_type::real != B || npy_type::imag != c_type(0);
}

Here c_type is a template argument, used as the type of the underlying real and imag numbers. Comparisons with anything that was not a c_type were therefore ambiguous, so I redefined the boolean comparisons as template functions like:

template<class T>
bool operator !=(const T& B) const{
    return npy_type::real != B || npy_type::imag != T(0);
}

Which should handle all comparisons, within reason.
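For context, here is a self-contained sketch of how a templated comparison like this fits into a wrapper class. It is an assumption-laden simplification: raw_complex is a hypothetical stand-in for the NumPy complex struct, the complex_wrapper shown here has only the members needed for the example, and count_nonzero is a made-up helper.

```cpp
#include <cassert>

// Hypothetical stand-in for a NumPy complex struct such as npy_cdouble.
struct raw_complex {
    double real;
    double imag;
};

template <class c_type, class npy_type>
class complex_wrapper : public npy_type {
public:
    complex_wrapper(const c_type r = c_type(0), const c_type i = c_type(0)) {
        npy_type::real = r;
        npy_type::imag = i;
    }

    // Comparison with another complex value.
    bool operator!=(const complex_wrapper& B) const {
        return npy_type::real != B.real || npy_type::imag != B.imag;
    }

    // Comparison with any scalar type T (int, float, double, ...),
    // treating the scalar as a complex number with zero imaginary part.
    template <class T>
    bool operator!=(const T& B) const {
        return npy_type::real != B || npy_type::imag != T(0);
    }
};

// Made-up helper: count the entries of an array that differ from zero.
inline int count_nonzero(const complex_wrapper<double, raw_complex> x[], int n) {
    assert(n >= 0);
    int count = 0;
    for (int i = 0; i < n; i++) {
        if (x[i] != 0) {  // int literal: resolved by the template overload
            count++;
        }
    }
    return count;
}
```

Because T is deduced from the argument, a comparison against an int, float or double no longer needs an implicit conversion to c_type, which is where the ambiguity came from.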
