data augmentation

Extending Keras ImageDataGenerator to handle multi-label classification tasks

I stumbled upon this problem recently while working on a Kaggle competition that featured a multi-label, highly imbalanced satellite image dataset.

Let’s talk for a moment about a neat Keras feature: keras.preprocessing.image.ImageDataGenerator. As you can see from the documentation, its main purpose is to augment your dataset by generating new images from the ones you already have. This is a common tactic for coping with small datasets and overfitting.
By default, when it reads images from disk (via its flow_from_directory method), ImageDataGenerator expects the data to be structured in a very specific way: each class has its own directory, and every image inside a directory belongs to the class named by that directory.
This is very limiting. An image can sit in only one directory, so it can carry only one label, which means that using this API directly will not work for multi-label problems.
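
To make the limitation concrete, here is a minimal sketch of the label format a multi-label task needs, which a one-directory-per-class layout cannot express. The file names, tags, and the multi_hot_encode helper are hypothetical and only for illustration; this is not necessarily the approach taken in the rest of the post.

```python
# Hypothetical file names and tags, for illustration only.
image_tags = {
    "img_0.jpg": ["forest", "river"],
    "img_1.jpg": ["cloudy"],
    "img_2.jpg": ["forest", "road", "cloudy"],
}


def multi_hot_encode(image_tags):
    """Return (classes, filenames, labels) with one 0/1 column per class."""
    # Collect every distinct tag and give each one a fixed column index.
    classes = sorted({tag for tags in image_tags.values() for tag in tags})
    class_index = {name: i for i, name in enumerate(classes)}

    filenames = sorted(image_tags)
    labels = []
    for name in filenames:
        row = [0] * len(classes)
        for tag in image_tags[name]:
            row[class_index[tag]] = 1  # an image may switch on several columns
        labels.append(row)
    return classes, filenames, labels


classes, filenames, labels = multi_hot_encode(image_tags)
for name, row in zip(filenames, labels):
    print(name, row)
```

Each row can have several ones, whereas a directory name can only ever stand for a single class. One possible route, assuming the images fit in memory, is to load them into an array and pass it together with a label matrix like this to ImageDataGenerator.flow(x, y) instead of relying on the directory layout.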


Posted by jakub.cieslik, 0 comments