Methods to Implement Deep Learning in Recommendation Systems

Hyprbay
7 min read · Apr 22, 2021

As we progress towards building an algorithm from scratch, it is very important to choose the algorithms correctly, and that requires an understanding of the various deep learning algorithms available.

The terminology in this field can be confusing, because many terms are used interchangeably. Everything here falls under the umbrella of Artificial Intelligence, which usually means making computers intelligent enough to work on their own. (The related term automation likewise means making a system run smartly by itself.) Machine learning is a part of artificial intelligence whose aim is to make machines learn, which is done by training on a dataset; machine learning typically needs a great deal of data to learn from.

Many techniques have been developed within machine learning over time. Linear regression, for example, predicts a value by fitting a straight line: the idea is that things will behave as they behaved in the past. We minimise an error (or loss) function by fitting the line to the values, and that minimisation is the "learning". Other machine learning algorithms include logistic regression, naïve Bayes, decision trees and random forests, all used for classification, and k-means, used for clustering. Algorithms can be supervised (the data is labelled), unsupervised (the data is unlabelled) or reinforcement-based, where the computer mostly learns on its own by trial and error, acting on a reward/punishment basis.
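As a concrete sketch of the linear-regression idea just described (fitting a straight line by minimizing squared error), here is a minimal example in plain Python; the data points are made up for illustration:

```python
# Minimal linear regression via the closed-form least-squares solution.
# Fits y = w*x + b by minimizing the mean squared error on toy data.

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 4.0, 6.2, 7.9, 10.1]  # roughly y = 2x

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Slope and intercept that minimize the squared error.
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x

print(round(w, 2), round(b, 2))  # 1.99 0.09
```

A real project would use a library such as scikit-learn or NumPy for this, but the closed-form solution above is all that the "learning" amounts to here: picking the w and b that minimize the loss.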

Deep learning, in turn, is a subset of machine learning that uses neural networks, so people sometimes use "deep learning" and "neural networks" interchangeably. Neither field is fixed: this is still an active research area, and new neural network architectures are created and tested every day. That is why artificial intelligence remains such a fast-moving field of study.

Within deep learning we looked at a few major algorithms that are used in the real world. Each comes with many configuration options that affect how it performs, and there are countless ways to create hybrid algorithms by tweaking those configurations. Looking at how they work, we can point out a few common building blocks that all of these algorithms share:

1. Embedding function — an embedding condenses the data, mapping sparse, high-dimensional input into a dense, lower-dimensional vector space, and lets us encode our prior knowledge of the patterns we expect. Its configuration includes values such as the number of inputs and the size (dimension) of the embedding vectors.

2. Layers — an embedding function can itself be a layer; for example, a dense embedding layer can be produced from a sparse input layer. Why? Because sparse data, with its many empty values, makes computation inefficient: each calculation may have to touch millions or billions of unnecessary fields. Here are some of the kinds of layers one can use in a neural network / deep learning model:

a. Dense Layers — fully connected layers, where every node connects to every node in the next layer

b. Convolutional Layers — layers that work on small fragments (patches) of a larger input, usually used in image detection or machine vision algorithms

c. Pooling Layers — layers that combine (downsample) the outputs of those fragments into a smaller summary

d. Recurrent Layers — layers used where we need to memorize earlier calculations, as in a personal AI assistant, e.g. Google Assistant

e. Normalization Layers — layers that normalize the data, rescaling values so they are comparable to each other.

3. Nodes — nodes are the basic neurons; more nodes give a network more capacity, though too many can cause overfitting. The whole idea is taken from the human brain, which has billions of neurons, and the function of a neural network is likewise to pass signals from one node to another.

4. Activation Function — activation functions are the trigger signals, just as the human brain has billions of neurons with trigger thresholds. The job of an activation function is to decide what information passes to the next layer; during backpropagation the weights feeding it are adjusted so the network learns. For example, when you touch something you feel its texture because certain neurons in your brain were triggered and produced a particular response. There are various activation functions to choose from; here are some:

a. Sigmoid Function — a popular activation function that squashes all inputs into the range 0 to 1: large negative values saturate near 0 and large positive values saturate near 1, so the graph looks like an 'S'. We can use it to classify whether something happens or not, e.g. treating outputs greater than 0.5 as existing or true and the rest as non-existing or false.

b. Tanh or Hyperbolic Tangent Activation Function — like the sigmoid but zero-centred: inputs from -infinity to +infinity are squashed into the range -1 to +1, with inputs near 0 staying close to 0.

c. ReLU or Rectified Linear Unit — one of the most widely used and popular activation functions. It outputs the original input x if x is greater than 0, and 0 for all values of 0 or below: f(x) = max(0, x).

d. SoftMax Function — a function that turns the outputs into a probability distribution, so that all the outputs sum to 1.

5. Weights — the learned multipliers applied to the inputs between layers, deciding how much each connection is prioritised; a weight determines the strength (the "thickness") of a connection between nodes.

6. Biases — constant values added to a node's output. Suppose your calculation gives a value of 0 and nothing would pass to the next layer; a constant bias of, say, 1 shifts the activation so the node can still fire.

7. Gradient Descent — the procedure that drives the learning: it moves the weights step by step towards a (local) minimum of the loss, where the errors are smallest.

8. Loss Function — a function such as mean squared error or log loss (cross-entropy) that measures the algorithm's mistakes; its value tells us how to adjust the weights positively or negatively so the predictions move towards the actual results.

9. Learning Rate — the size of the increments we make in every iteration of gradient descent as we climb down the valley towards the minimum. A smaller learning rate takes more time but usually gives higher accuracy; a larger learning rate is faster but can overshoot the minimum and hurt accuracy.

10. Epoch — one full pass through the training data; the number of epochs is how many such passes we allow the algorithm to run. More epochs generally mean more accuracy, up to the point where the model starts to overfit.
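The activation functions above, and the gradient-descent / learning-rate / epoch ideas, can be sketched in a few lines of plain Python (the toy loss function and numbers are made up for illustration):

```python
import math

def sigmoid(x):
    # Squashes any real value into (0, 1); sigmoid(0) == 0.5.
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    # Zero-centred squashing into (-1, 1).
    return math.tanh(x)

def relu(x):
    # Passes positive inputs through unchanged, zeroes out the rest.
    return max(0.0, x)

def softmax(xs):
    # Turns a list of scores into a probability distribution summing to 1.
    exps = [math.exp(x - max(xs)) for x in xs]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

print(sigmoid(0.0))           # 0.5
print(relu(-3.0), relu(2.0))  # 0.0 2.0
print(sum(softmax([1.0, 2.0, 3.0])))  # 1.0 (up to rounding)

# Gradient descent on a toy 1-D loss f(w) = (w - 3)^2: each epoch moves w
# downhill by learning_rate * gradient, converging towards the minimum at w = 3.
w = 0.0
learning_rate = 0.1
for epoch in range(100):
    grad = 2 * (w - 3)        # derivative of (w - 3)^2
    w -= learning_rate * grad
print(round(w, 3))            # 3.0
```

A smaller learning rate here would need more epochs to reach the minimum; a much larger one would overshoot and bounce around it, which is exactly the trade-off described in point 9.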

Combining and varying all of the above gives us various types of neural networks, some of which are listed below:

1. Simple ANN or MLP — an Artificial Neural Network with a simple feed-forward structure, also called a Multi-Layer Perceptron, which should have at least one hidden layer. These machines take an input, multiply it by the weights, pass the result through an activation function, compare the output with the actual dataset, calculate the loss (error), adjust the weights accordingly, and thereby learn.

2. CNN — Convolutional Neural Network: a network whose convolutional layers look at fragments of a larger input. CNNs are used in image identification and machine vision, and also in text recognition and natural language processing.

3. RNN — Recurrent Neural Network: a network with a time dimension attached, performing time-based activation. It has a memory segment which is used to learn and to activate neurons based on the data held in memory.

4. DNN — Deep Neural Network: simply an ANN with many hidden layers; we might say an ANN that has dug deep.
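To make the ANN/MLP training loop described in point 1 concrete, here is a minimal NumPy sketch of a one-hidden-layer network trained on the classic XOR problem; the layer sizes, seed and hyperparameters are arbitrary choices for illustration:

```python
import numpy as np

# A minimal one-hidden-layer MLP trained on XOR, showing the full loop:
# forward pass -> loss -> backpropagation -> weight update, each epoch.
rng = np.random.default_rng(42)

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(0.0, 1.0, (2, 8)); b1 = np.zeros((1, 8))  # input -> hidden
W2 = rng.normal(0.0, 1.0, (8, 1)); b2 = np.zeros((1, 1))  # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X):
    h = sigmoid(X @ W1 + b1)           # hidden activations
    return h, sigmoid(h @ W2 + b2)     # network output

_, out0 = forward(X)
loss_before = float(np.mean((out0 - y) ** 2))

lr = 1.0
for epoch in range(5000):
    h, out = forward(X)
    # Backpropagate the mean-squared-error loss through both layers.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0, keepdims=True)

_, out1 = forward(X)
loss_after = float(np.mean((out1 - y) ** 2))
print(loss_before, "->", loss_after)
```

XOR is the standard demonstration that a hidden layer matters: a network with no hidden layer cannot separate these four points, while the MLP above drives the loss down by repeatedly adjusting its weights.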

Based on the above, the question now arises: what can we use to make our recommendation system accurate? The two best-known families of recommendation systems can be classified as below:

1. Collaborative Filtering — CF identifies users and their interests by collecting information from many users; the idea is that people similar in nature will like similar products. Common approaches include:

a. User-user based collaborative filtering, where we use one user's taste to explain (predict) another similar user's taste

b. Item-item based collaborative filtering, where we find similarity between the items a user picks or rates

c. Matrix factorization, which combines the user and item sides by narrowing the data down to latent features of the products, identifying the relation between user and item entities

2. Content-Based Filtering — CBF: here we identify the different features of items and recommend items to a user based on their history and their leaning towards specific features.
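A tiny sketch of item-item collaborative filtering (point b above), using cosine similarity between item rating columns; the users, items and ratings are entirely made up:

```python
import math

# Item-item collaborative filtering sketch: items whose rating columns point
# in similar directions (high cosine similarity) are recommended together.
ratings = {  # user -> {item: rating}
    "alice": {"book": 5, "film": 3, "game": 4},
    "bob":   {"book": 4, "film": 2, "game": 4},
    "carol": {"book": 1, "film": 5, "game": 2},
}

def item_vector(item):
    # Column of ratings for this item across all users (0 if unrated).
    return [ratings[u].get(item, 0) for u in sorted(ratings)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

sim_book_game = cosine(item_vector("book"), item_vector("game"))
sim_book_film = cosine(item_vector("book"), item_vector("film"))
print(round(sim_book_game, 3), round(sim_book_film, 3))  # 0.977 0.701
```

Here "book" and "game" are rated alike across users, so a user who rates "book" highly would be shown "game" before "film". User-user filtering is the same computation over rows instead of columns.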

How can we use a neural network?

To improve upon the traditional recommendation system, we arrive at Neural Collaborative Filtering. This uses a neural network, a Multi-Layer Perceptron (ANN) or Deep Neural Network (DNN), to automate the process.

A very popular illustration of this is the standard Neural Collaborative Filtering architecture diagram, in which user and item latent vectors are fed into a stack of MLP layers.

What is happening here is that we take the matrices defined through the features as latent vectors, feed them into the various layers, and calculate the errors against the actual dataset. The goal is to find the right value of r. What is r?

r can be understood through matrix factorization: if we factor a user-item matrix A of size m × n as

A ≈ B × C,

we choose the factor matrices so that B is m × r and C is r × n, where r is the number of latent features.

So our aim is, through repeated iterations, to find the values of B and C that make our losses decrease.
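That factorization can be sketched directly: the code below (with a made-up rating matrix and arbitrary hyperparameters) learns B (m × r) and C (r × n) by gradient descent so that their product approximates the observed entries of A:

```python
import numpy as np

# Matrix factorization sketch: approximate a user-item rating matrix A (m x n)
# as B (m x r) @ C (r x n), learning both factors by gradient descent on the
# squared reconstruction error over the observed (non-zero) ratings.
rng = np.random.default_rng(0)

A = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
observed = A > 0                       # treat zeros as "not rated"
m, n, r = A.shape[0], A.shape[1], 2    # r = number of latent features

B = rng.normal(0, 0.1, (m, r))         # user latent vectors
C = rng.normal(0, 0.1, (r, n))         # item latent vectors

lr = 0.01
for epoch in range(2000):
    pred = B @ C
    err = np.where(observed, A - pred, 0.0)  # error on observed ratings only
    B_grad = err @ C.T
    C_grad = B.T @ err
    B += lr * B_grad                   # step both factors downhill
    C += lr * C_grad

err = np.where(observed, A - B @ C, 0.0)
final_err = float(np.sqrt(np.mean(err[observed] ** 2)))
print(round(final_err, 3))
```

Neural Collaborative Filtering replaces the plain inner product B @ C with an MLP applied to the concatenated user and item latent vectors, but the training idea is the same: iterate until the reconstruction loss on the observed ratings stops decreasing. A production system would add regularization and use a framework rather than this hand-rolled loop.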

