July 2021
I recently moved into a new apartment, and I started using Pinterest more and more for fresh ideas on how I could decorate the interior of my new place. As I browsed from Pin to Pin, I started wondering how an image recommender like Pinterest's would work. How could I build an image recommender system? Of course, I wouldn't have access to the complex tools and technologies used at Pinterest, but I could definitely experiment with tools I am already familiar with, such as Keras and Scikit-Learn, to understand the underpinnings of a basic image recommender system.
The goals of this little project were simple. I wanted to understand how an image recommender would work, end-to-end. I wanted to know how a dataset of raw images as JPGs or PNGs could be used by some sort of image classification model to find other similar images in the dataset. Through some of my previous projects, I knew that cosine similarity (or any similar distance metric) needed to be used as the meat of the recommender algorithm to find close/related datapoints. All that was left to do was figure out how to build a Keras model to recognize images.
I knew I had to start by building a good image classification model in order to compare images for similarities. However, the deeper I dove into image model architectures, the more I realized that Keras ships with numerous pre-trained models, all of which would already perform much better than anything I could cook up on my own. Therefore, I decided to leverage these models instead and see how they perform on a given set of JPG images.
The main models I experimented with were the Keras VGG16, VGG19, and InceptionV3 models, among others. The dataset I wanted to use was a set of raw JPG images, and since I wanted to make a mock-Pinterest, I decided on a clothing dataset that can be found here on Kaggle.
With this project, as with other machine learning and data analytics projects I've worked on, I was reminded how critical it is to properly pre-process the data you are given. In this case, each image in the dataset first needed to be loaded using the Python Imaging Library (Pillow) and resized to match the input dimensions of the model that was going to be used. For the VGG16 model, each image needed to be resized to (224, 224, 3), corresponding to (height, width, color_channel), as it was loaded in. Each image then needed to be converted to a Numpy array to be fed into the model. But each model expects an input of shape (batch_size, height, width, channel), so an extra dimension was added using np.expand_dims() to coerce the image array to the necessary input shape for the model.
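The pre-processing steps above can be sketched roughly as follows. This is a minimal illustration rather than the code from the project: the helper name image_to_batch and the in-memory demo image are assumptions made for the example, and the 224 x 224 target matches VGG16 only.

```python
import numpy as np
from PIL import Image

TARGET_SIZE = (224, 224)  # (height, width) expected by VGG16; other models differ


def image_to_batch(img, target_size=TARGET_SIZE):
    """Resize a PIL image and return an array shaped (1, height, width, 3).

    Hypothetical helper for illustration only.
    """
    height, width = target_size
    # PIL's resize() takes (width, height), the reverse of the array layout
    img = img.convert("RGB").resize((width, height))
    arr = np.asarray(img, dtype="float32")  # (height, width, 3)
    return np.expand_dims(arr, axis=0)      # (1, height, width, 3)


# Demo with an in-memory image; for a file on disk, use Image.open(path) instead
demo = Image.new("RGB", (60, 80), color=(255, 0, 0))
batch = image_to_batch(demo)
print(batch.shape)  # (1, 224, 224, 3)
```

Keras also ships a preprocess_input function alongside each pre-trained model (for example in keras.applications.vgg16) that scales pixel values the way the model was trained; applying it to the batch before prediction is usually a good idea. Note that InceptionV3 defaults to 299 x 299 inputs, so the target size would need to change for that model.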
Aside from pre-processing the JPG images, I also needed to pre-process the model itself, which was something I hadn't done before. Since the pre-trained Keras models are set up for image classification, the final layer in each of the models is a prediction layer that outputs a predicted object class for the image. The image recommender instead needed to calculate similarity on image features, not on the label itself, so I needed to remove the last layer of the model before feeding data through.
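One way to drop the prediction layer, sketched here for VGG16, is to build a new Keras Model that shares the original input but stops at the second-to-last layer. The function name build_feature_extractor is an assumption for illustration, and the project's own code may cut the model differently.

```python
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.models import Model


def build_feature_extractor(weights="imagenet"):
    """Return VGG16 truncated just before its final classification layer.

    Hypothetical helper for illustration only.
    """
    base = VGG16(weights=weights, include_top=True)
    # base.layers[-1] is the 1000-way softmax layer named "predictions";
    # base.layers[-2] is the 4096-unit fully connected layer named "fc2".
    return Model(inputs=base.input, outputs=base.layers[-2].output)


# Example call (downloads the ImageNet weights on first use):
# extractor = build_feature_extractor()
```

With this cut, each image comes out as a 4096-dimensional feature vector instead of 1000 class probabilities. Which layer gives the most useful features is a judgment call worth experimenting with.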
After all of the images were pre-processed as Numpy arrays matching the model's input shape of (None, 224, 224, 3), where None stands for the batch size, and the prediction layer had been dropped from the Keras model, all that was left to do was to feed the stack of image arrays to the model.predict() function and harvest the model outputs.
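As a rough sketch of this step, the per-image arrays can be stacked along the batch axis and passed to predict() in one call. The helper name extract_features is an assumption; model can be any object with a Keras-style predict() method, such as a truncated VGG16.

```python
import numpy as np


def extract_features(model, batches):
    """Stack per-image arrays of shape (1, H, W, C) and run them through `model`.

    Hypothetical helper for illustration only.
    """
    stacked = np.vstack(batches)       # (n_images, H, W, C)
    features = model.predict(stacked)  # one row of features per image
    return np.asarray(features)
```

For a large dataset it may be preferable to predict in chunks rather than holding every image in memory at once; Keras's predict() also accepts a batch_size argument that controls how many images are processed per step.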
This is where cosine similarity came in. A cosine similarity matrix needed to be constructed, comparing the model output for each image with every other image's model output. Then, all that was left to do was use this matrix as a lookup table: given an image from the user, we could find its row in the matrix, pick out the highest cosine similarity scores (ignoring the image's match with itself, which is always 1), and return the corresponding images as the "related images".
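The lookup can be sketched with plain Numpy as below. The function names and the toy feature vectors are assumptions made for illustration; Scikit-Learn's cosine_similarity from sklearn.metrics.pairwise computes the same kind of matrix and could be used in place of the first function.

```python
import numpy as np


def cosine_similarity_matrix(features):
    """Pairwise cosine similarity between the rows of `features`."""
    features = np.asarray(features, dtype="float64")
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return unit @ unit.T


def most_similar(sim_matrix, index, top_n=5):
    """Indices of the `top_n` rows most similar to row `index`, excluding itself."""
    scores = sim_matrix[index].copy()
    scores[index] = -np.inf  # an image always matches itself best
    order = np.argsort(scores)[::-1]
    return order[:top_n].tolist()


# Toy 2-D vectors standing in for real model outputs
toy = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
sims = cosine_similarity_matrix(toy)
print(most_similar(sims, 0, top_n=2))  # [1, 2]
```

The full matrix grows with the square of the number of images, which is fine for a small dataset but would need a different approach (such as an approximate nearest-neighbor index) at larger scale.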
Now we can see some cool things happening! Let's say we use the Keras InceptionV3 model: it performs really well, giving us quite accurate recommendations for similar images, which in this case are shoes.
If you would like to see the full working code, please visit my GitHub repo here. Special thanks to Alexandre Wrg for his intuitive tutorial on building and understanding this recommender; please visit his Medium page and give him a like!
1. Alexandre Wrg. Oct 2, 2019. Image recommender engine -- leveraging transfer learning. Accessed from: https://towardsdatascience.com/image-recommendation-engine-leverage-transfert-learning-ec9af32f5239.
2. Param Aggarwal. 2019. Fashion Product Images (small). Accessed from: https://www.kaggle.com/paramaggarwal/fashion-product-images-small.
3. Python Imaging Library (Pillow). 2021. Accessed from: https://pillow.readthedocs.io/en/stable/handbook/tutorial.html#using-the-image-class.
4. Rohit Thakur. Aug 6, 2019. Step by step VGG16 implementation in Keras for beginners. Accessed from: https://towardsdatascience.com/step-by-step-vgg16-implementation-in-keras-for-beginners-a833c686ae6c.