Week 1 - bool dtype support

I got accepted to Google Summer of Code! Thank you to everyone who has helped me get this far, and to Amber for making me a congratulatory Google cake!

[Image: google cake]

On my proposal timeline I gave myself 2 weeks to add support to sparse matrices for dtype=bool. So far, dtype=bool works. The implementation was not too hard, and most of my time was spent figuring out how SWIG works. However, there is still a lot of work to do in testing bool support. This will be more time consuming than it should be because the sparse test suite is messy.

bool dtype implementation

sparsetools

The sparsetools/ directory contains several *.cxx, *.h, *.i, and *.py files. The header files (*.h) contain C++ routines, which SWIG wraps according to the instructions in the *.i interface files. SWIG uses the header and interface files to generate Python *.py files and C++ code in *.cxx files. Once these *.cxx files are compiled and linked, the functions they define can be called from the generated *.py files.

Where do I add bool support?

Contained in sparsetools/ is a file called complex_ops.h, which translates numpy's complex types to C++ classes. Well, that is what I want to do, except for numpy's bool type instead! My new file bool_ops.h is much simpler though. It basically does this:

typedef npy_int8 npy_bool_wrapper;

Yep, that's it (for now anyway). This file basically says to treat the npy_bool_wrapper type like npy_int8, because boolean algebra is almost integer algebra.
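To show what "almost" means here, this is a small sketch in plain Python. It is only an illustration, not code from sparsetools: ctypes.c_int8 stands in for npy_int8, and the helper names int8_add and bool_add are made up for this example.

```python
import ctypes


def int8_add(a, b):
    """Add two values as a signed 8-bit integer would, wrapping on overflow."""
    return ctypes.c_int8(a + b).value


def bool_add(a, b):
    """Boolean 'addition' is logical OR, so the result is always 0 or 1."""
    return int(bool(a) or bool(b))


# The raw values differ: integer algebra gives 2, boolean algebra gives 1.
int8_sum = int8_add(1, 1)
bool_sum = bool_add(1, 1)

# Both results are still "true", so the two algebras agree in this case.
agree = bool(int8_sum) == bool(bool_sum)

# Accumulating 256 ones wraps an int8 back around to 0, which reads as False,
# while a true boolean OR of 256 ones would stay True.
total = 0
for _ in range(256):
    total = int8_add(total, 1)
```

So a plain typedef is enough as long as values stay 0 or 1 and sums stay small, but a sum that overflows can turn a True result into False. That is presumably part of why this is only the "for now" version.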

npy_bool_wrapper is used in numpy.i to define a typemap from numpy's npy_bool type to our npy_bool_wrapper.

In sparsetools.i we have to declare the new data type and add it to the INSTANTIATE_ALL macro, which is used in all the *.i files to call the functions we want from the *.h files.

What else?

Adding bool to supported_dtypes in sputils.py prevents bool data from being upcast to a larger type. Then tests need to be added, but this isn't too easy: the test suite does not have a way to take some general data or dtype, so I am going through and adding tests one by one where it seems appropriate. I'm also trying to add tests for other dtypes, because there don't seem to be any tests for anything other than float64 and int64.
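Here is a toy model of why adding bool to the list matters. It assumes upcasting returns the first entry in the supported list that the input can be safely cast to; the real logic in sputils.py may differ. The SAFE_CASTS table and the upcast function below are hand-written stand-ins for illustration, not scipy's code.

```python
# Toy model only: a small hand-written table of safe casts between a few dtypes.
SAFE_CASTS = {
    "bool": ["bool", "int8", "int16", "float64"],
    "int8": ["int8", "int16", "float64"],
    "int16": ["int16", "float64"],
    "float64": ["float64"],
}


def upcast(dtype, supported):
    """Return the first supported dtype that `dtype` can be safely cast to."""
    for candidate in supported:
        if candidate in SAFE_CASTS[dtype]:
            return candidate
    raise TypeError(f"no supported type for {dtype}")


without_bool = ["int8", "int16", "float64"]
with_bool = ["bool"] + without_bool

before = upcast("bool", without_bool)  # bool is not supported, so it becomes int8
after = upcast("bool", with_bool)      # bool is supported, so it stays bool
```

In this model, bool data silently turns into int8 until "bool" is put at the front of the supported list, after which it is left alone.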

While I was doing this I discovered a few bugs relating to how sparse handles uint dtypes.

Once I'm done adding tests and fixing whatever does not pass, I'll move on to the next stage in my proposal: adding support for boolean operations.
