Model Code

Probabilistic Linear Discriminant Analysis (PLDA) [Link]

My Python implementation of the model in Ioffe (2006). I wrote it so that you can both (1) extract the features you would ordinarily want from plain linear discriminant analysis and (2) classify new data using the underlying probabilistic model. I also wrote unit, integration, and inference tests that you can run if you have concerns about the model breaking on your machine.

Bayesian Teaching for Probabilistic Linear Discriminant Analysis (BTPLDA) [Link]

This is the teaching model I wrote to perform Bayesian Teaching for Probabilistic Linear Discriminant Analysis models. It explains the “categories” that a PLDA model has learned by generating optimal “teaching sets” (i.e. subsets) of the training data that convey the latent categories. I still need to clean up this repository, so if you would like to use this code, I suggest getting in touch with me first.
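To give a rough feel for what selecting a teaching set means, here is a toy sketch. It is not the BTPLDA code: it assumes two one-dimensional Gaussian categories with known means, a shared standard deviation, and a uniform prior, and it simply picks the subset that maximizes the learner's posterior on the target category (Bayesian teaching proper defines a distribution over subsets proportional to that posterior). All function names here are hypothetical.

```python
import itertools
import math


def gaussian_logpdf(x, mean, std):
    """Log density of a 1-D Gaussian."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)


def learner_posterior(subset, target, means, std=1.0):
    """Posterior the learner assigns to category `target` after seeing `subset`.

    Assumes a uniform prior over the categories listed in `means`.
    """
    log_liks = [sum(gaussian_logpdf(x, m, std) for x in subset) for m in means]
    top = max(log_liks)  # subtract the max for numerical stability
    weights = [math.exp(ll - top) for ll in log_liks]
    return weights[target] / sum(weights)


def best_teaching_set(data, target, means, size=2, std=1.0):
    """Exhaustively search subsets of `data` for the one that best teaches `target`."""
    candidates = itertools.combinations(data, size)
    return max(candidates, key=lambda s: learner_posterior(s, target, means, std))


if __name__ == "__main__":
    data = [-0.2, 0.1, 1.9, 2.4, 3.0]
    means = [0.0, 2.0]  # assumed category means, for illustration only
    print(best_teaching_set(data, target=1, means=means, size=2))
```

The exhaustive search only works for tiny datasets; it is here to make the objective concrete, not to suggest how the repository searches over subsets.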

Explaining Image Classifier Predictions [Link]

One of the procedures I created to explain model predictions at the feature level. For images, this amounts to visualizing image components that play the largest role in the target model’s classification of the image. Scott Cheng-Hsin Yang and I extended this approach so that the image components are optimally selected via Bayesian teaching.

Because this approach is not restricted to probabilistic models, I am also extending it to explain predictions made by deep neural network models (e.g. ResNet).

Some of the code is available in the above BTPLDA repository, but it is still a work in progress. Get in touch with me if you are curious or have questions!

Other Code

Google Colaboratory with Unlimited Google Drive Storage [Link]

If you work at an academic institution in the United States, you probably have unlimited Google Drive storage. I wrote a couple of helper functions so that you can read data from and write data to your Google Drive when you are using Google Colaboratory, without having to keep track of so many directories. pull_from_gdrive() and push_to_drive() are intended to be conceptually analogous to the pull and push commands in git.
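The idea can be sketched roughly as follows. This is not the code from the repository: the function names pull_sketch and push_sketch, their parameters, and the DRIVE_ROOT path are all assumptions made for illustration, and the sketch assumes Drive has already been mounted in the Colab session.

```python
import shutil
from pathlib import Path

# Assumption: Drive was mounted beforehand in Colab, for example with
#   from google.colab import drive
#   drive.mount("/content/drive")
# The exact folder name under the mount point may differ on your account.
DRIVE_ROOT = Path("/content/drive/MyDrive")


def pull_sketch(relative_path, drive_root=DRIVE_ROOT, workdir=Path(".")):
    """Copy a file from Drive into the working directory and return its new path."""
    src = Path(drive_root) / relative_path
    dst = Path(workdir) / Path(relative_path).name
    shutil.copy2(src, dst)
    return dst


def push_sketch(local_path, drive_folder="", drive_root=DRIVE_ROOT):
    """Copy a local file into a Drive folder (created if missing) and return its path."""
    dst_dir = Path(drive_root) / drive_folder
    dst_dir.mkdir(parents=True, exist_ok=True)
    dst = dst_dir / Path(local_path).name
    shutil.copy2(local_path, dst)
    return dst
```

With helpers like these, a notebook only has to name the file relative to Drive, for instance pull_sketch("datasets/train.csv"), and can then work with the local copy.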

Downloading ImageNet [Link]

I fixed some deprecated code in one of Google’s TensorFlow repositories and wrote some documentation that is useful if you are working with this dataset for the first time.