Recommendations are not a new concept for businesses or their customers. Traditionally, sales personnel acted as recommendation agents, suggesting the most relevant products to customers based on an understanding of their preferences built through conversations and past behavior. Modern recommendation engines replicate just that, nothing less, nothing more.
What has changed with the new recommendation engines is that the human biases of sales personnel have been removed, the customer interaction or sales journey has become digital, and users’ attention spans have shrunk.
Netflix, Spotify and others use recommendation engines to present their users with relevant content, ensuring user retention and repeat usage. eCommerce giants like Amazon and Flipkart, on the other hand, use recommendation engines to influence user transactions through their product recommendations.
Yubi, as a unified lending marketplace, connects institutional borrowers and lenders on its platform. Among the recommendation use cases the data science team at Yubi solves are product recommendations to borrowers and borrower recommendations to lenders. Though there are many similarities with recommendation engines built for the consumer world, a few nuances make Yubi’s recommendation problem different.
- Cost of a wrong recommendation is very high
- Operating in the institutional space, we have far less data than is available in the consumer world
- Low internal customer engagement data makes it important to rely on available external data
- The discovery process is offline; hence, the instrumentation process and the digitization of data are still evolving
- Recommendations must be served to users at very low latency, since all our recommendation use cases are real time
In short, scale is not a big problem for us; achieving very high accuracy and speed in recommendations is the bigger challenge.
Recommendation Engine Design at Yubi
Unlike many other machine learning problems, building a recommendation engine requires a detailed, well-thought-out design. Architecture also plays an important role: a good design keeps latency low and ensures recommendations happen in real time.
Below is the architecture we use at Yubi. This is the core of recommendation engines, hence the core of this writeup as well. We need to get this right!
On the surface, the recommendation framework consists of three layers:
- Firstly, retrieving relevant items for a user, driven majorly by their explicit preferences and some business logic
- Secondly, ranking the retrieved items, driven by recommendation models like collaborative filtering and other ranking models
- Finally, a list of the top “N” items based on the first two steps is served to the user; this goes through an A/B framework to measure the lift of the newly deployed engine against the existing one (or a baseline/random one, if this is the first model being deployed)
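At a very high level, the three layers can be sketched as the pipeline below. This is only an illustrative skeleton: the item fields, the preference structure and the scoring function are hypothetical placeholders, not Yubi's actual schema.

```python
# Minimal sketch of the three-layer flow: retrieve -> rank -> top-N.
# All field names and the score function are illustrative assumptions.

def retrieve(items, user_prefs):
    """Layer 1: keep only items that pass the user's explicit filters."""
    return [it for it in items
            if it["sector"] in user_prefs["sectors"]
            and it["city"] in user_prefs["cities"]]

def rank(items, score_fn):
    """Layer 2: order retrieved items by a model score, best first."""
    return sorted(items, key=score_fn, reverse=True)

def top_n(items, n):
    """Layer 3: serve the top-N list (which then enters the A/B framework)."""
    return items[:n]

items = [
    {"id": 1, "sector": "MSME", "city": "Chennai", "rate": 11.5},
    {"id": 2, "sector": "Large", "city": "Mumbai", "rate": 9.0},
    {"id": 3, "sector": "MSME", "city": "Chennai", "rate": 12.3},
]
prefs = {"sectors": {"MSME"}, "cities": {"Chennai"}}

shortlist = retrieve(items, prefs)
ranked = rank(shortlist, score_fn=lambda it: it["rate"])
print([it["id"] for it in top_n(ranked, 2)])  # [3, 1]
```

In the real engine the `score_fn` is where the collaborative-filtering and classifier models described later plug in.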
Data that Powers the Engine
The key data sources that are used for building the recommendation engines are –
- Lender (user) & item (debt instrument) characteristics. Examples of user characteristics would be PSU/ Pvt Banks/ NBFCs, location of the lender, etc. and debt instrument characteristics would comprise tenor, interest rate and so on
- Explicit user preferences that we collect on the platform for instance, an XYZ Bank is interested in companies with revenue > ₹10 Crores only
- User engagement data on the platform – though engagement is currently low, we define it through three events: views, clicks and Expressions of Interest
- External transactions data of the user – This data is given higher weightage in our rating values at this point since the platform adoption is still evolving. We get the external secured borrowing data from the database maintained by the Ministry of Corporate Affairs.
- Business rules – explicit filters known to the business team but not recorded by the users in their preferences.
How Is Item Retrieval Done?
The first step in identifying the relevant items for a user is ‘retrieval’. The idea is to remove items that we are 100% sure are not relevant to the user. These are identified through explicit user preferences and business logic based on input from internal teams.
So, how exactly does this filtering happen? Let’s assume a lender says they will lend only to companies in the MSME sector operating in Chennai – these become two straight filters, which remove all non-MSME and non-Chennai deals at the retrieval stage.
This step would reduce items from thousands to hundreds in our case.
Defining the Ranking Layer
The idea of this layer is to rank the retrieved items in order, so that the top “n” most relevant items can be nudged to the user. This is the layer where all the simple and complex machine learning models contribute.
So, how does the ranking layer work? Continuing the previous example, we would infer the lender’s implicit preferences based on
- the external borrowing data available on Ministry of Corporate Affairs (MCA) and Fixed Income Money Market And Derivatives Association Of India (FIMMDA)
- characteristics (revenue, EBITDA and shareholding pattern, to name a few) of the borrowers to whom the lender has historically given debt
- internal user engagement data of the lender – which products the lender views/clicks, the kinds of borrowers the lender expresses interest in, and borrowers similar to them
Now, each live deal on the platform is scored to identify which deal is the closest match to the lender’s implicit and explicit preferences and finally ranked based on the scores.
1. Collaborative Filtering based Ranking
One of the techniques we use is ‘collaborative filtering (CF)’. A key advantage of this technique is that you only need the historical transactions data: the rating values derived from engagement and from historical internal and external settled transactions.
The first step in CF technique is to prepare the matrix of users and items with the values being the rating values.
In our case, the users are the lenders and the items are the borrowers. The core of CF is how you define the rating values that fill up the matrix.
We define rating values based on the features below, and through grid-search iterations we identify the best combination of weightages for each feature to arrive at the final rating values –
- Volume of external transactions between a borrower and a lender
- Volume of internal transactions between them
- Engagement data – views, clicks and expression of interest (EoI), with most weightage for the EoI.
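As a toy sketch of blending these signals into a single rating value: the weights, event scores and normalization cap below are hypothetical placeholders (in practice the weightage combination is found via grid search, as described above).

```python
# Hypothetical weights for each signal (in practice tuned via grid search).
W_EXTERNAL, W_INTERNAL, W_ENGAGEMENT = 0.5, 0.3, 0.2

# Engagement events, with Expression of Interest (EoI) weighted highest.
EVENT_SCORE = {"view": 1.0, "click": 2.0, "eoi": 5.0}

def rating(external_vol, internal_vol, events, max_vol=1e9):
    """Combine normalized transaction volumes and engagement into one rating.

    external_vol / internal_vol are transaction volumes between a
    borrower and a lender; events is the list of engagement events.
    """
    engagement = sum(EVENT_SCORE[e] for e in events) / (5.0 * max(len(events), 1))
    return (W_EXTERNAL * min(external_vol / max_vol, 1.0)
            + W_INTERNAL * min(internal_vol / max_vol, 1.0)
            + W_ENGAGEMENT * engagement)

r = rating(external_vol=2e8, internal_vol=5e7, events=["view", "click", "eoi"])
print(round(r, 3))  # 0.222
```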
Given the large number of borrowers available, we usually have a very sparse matrix, with an initial fill rate of only 3%. Notably, the sparsity is caused by the number of borrowers, not lenders, since there are only a few hundred lenders in the country.
We handled this sparsity by clustering borrowers; the clusters were created based on the following features –
- Revenue segment- Small/Mid/Large corporates/ NEG (New Economy Group)
- Current ratio
- Vintage of the borrower (how many years they have been in business)
- Sector (Industry they operate in)
Then we created the matrix with borrower cohorts x lenders and rating values derived from the above.
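A minimal sketch of assembling such a matrix, where the lender names, cohort labels and rating values are all made-up placeholders:

```python
import numpy as np

# Assemble the lender x borrower-cohort rating matrix from per-pair ratings.
# All names and values below are illustrative, not real platform data.
lenders = ["bank_a", "bank_b", "nbfc_c"]
cohorts = ["small_mfg", "mid_tech", "large_infra"]

# (lender, cohort) -> rating value derived from transactions + engagement
ratings = {
    ("bank_a", "small_mfg"): 4.2,
    ("bank_b", "large_infra"): 3.1,
    ("nbfc_c", "mid_tech"): 4.8,
}

R = np.zeros((len(lenders), len(cohorts)))
for (lender, cohort), r in ratings.items():
    R[lenders.index(lender), cohorts.index(cohort)] = r

fill_rate = (R > 0).mean()
print(f"fill rate: {fill_rate:.0%}")  # 3 of 9 cells observed -> 33%
```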
Once the matrix was ready, we used matrix factorization to fill it up. Factorization can be done through techniques like SVD, SVD++ and ALS, and its accuracy is generally measured through the RMSE of predicted versus actual rating values.
Finally, we arrived at the full matrix which now provides us with the rating values for every lender – borrower combination possible, basis which we could rank the borrowers for a lender.
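To make the factorization step concrete, here is a toy rank-2 truncated SVD on a tiny matrix. The naive mean imputation and numpy-based SVD are illustrative only; a production pipeline would use a dedicated library (e.g. Surprise for SVD/SVD++ or implicit for ALS) that handles missing entries properly.

```python
import numpy as np

# Tiny lender x borrower-cohort rating matrix; zeros mark unobserved cells.
R = np.array([
    [5.0, 3.0, 0.0, 1.0],
    [4.0, 0.0, 0.0, 1.0],
    [1.0, 1.0, 0.0, 5.0],
    [0.0, 1.0, 5.0, 4.0],
])

# Naive imputation: fill unobserved cells with each lender's mean rating.
filled = R.copy()
for i in range(R.shape[0]):
    observed = R[i][R[i] > 0]
    filled[i][R[i] == 0] = observed.mean()

# Rank-2 truncated SVD gives a low-rank approximation of the full matrix.
U, s, Vt = np.linalg.svd(filled, full_matrices=False)
k = 2
R_hat = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# Predicted rating for lender 1, cohort 2 (previously unobserved).
print(round(R_hat[1, 2], 2))

# Accuracy is measured as RMSE of predicted vs. actual on observed cells.
mask = R > 0
rmse = np.sqrt(((R_hat - R)[mask] ** 2).mean())
print(f"RMSE on observed entries: {rmse:.2f}")
```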
2. Recommendation as a standard ML problem
Another simple but very efficient technique we use is ‘recommendation as a classifier/regressor’. Here, the idea is to convert the recommendation problem into a standard ML problem – either classification or regression – with the rating values as the target variable.
This conversion of the User x Item matrix allows us to use features explicitly instead of just relying on matrix factorisation to identify the similarities.
So, to execute this technique at Yubi as an extension of CF: let’s say, hypothetically, we have 50 lenders and 50 borrower cohorts as explained in the CF section; the training set then becomes 50×50 = 2,500 rows. Here, instead of taking the borrowers and lenders directly, we use their characteristics as features.
The features used generally are user/item characteristics, user preferences/internal platform behavior of the user and external user behavior.
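A toy sketch of this formulation is below. The features, labels and the from-scratch logistic-regression trainer are all synthetic stand-ins: each row pairs lender characteristics with borrower-cohort characteristics, and the (made-up) label is whether the pair had a positive interaction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature columns: [lender_is_bank, lender_ticket_size,
#                                borrower_revenue, borrower_vintage]
X = rng.random((200, 4))
# Synthetic labels: "banks tend to lend to high-revenue borrowers".
y = ((X[:, 0] > 0.5) & (X[:, 2] > 0.5)).astype(float)

def train_logreg(X, y, lr=0.5, epochs=500):
    """Plain gradient-descent logistic regression on the pair features."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid probabilities
        grad_w = X.T @ (p - y) / len(y)          # log-loss gradient
        grad_b = (p - y).mean()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

w, b = train_logreg(X, y)
scores = 1.0 / (1.0 + np.exp(-(X @ w + b)))
acc = ((scores > 0.5) == y).mean()
print(f"train accuracy: {acc:.2f}")
```

At serving time the model’s predicted score per lender–item pair is what the ranking layer sorts on.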
So, when should collaborative filtering and recommendation as a classifier be used?
The simple rule of thumb is: whenever we have a good volume of interactions data, we can rely on collaborative filtering (CF); where interactions data is sparse, we should go for recommendation as a classifier. The reasoning is that CF identifies implicit preferences, which requires a good volume of interactions data (a less sparse matrix); if that is missing, we need explicit variables for the model to learn preferences, as in the classifier approach.
The key intuitive difference between the two is that collaborative filtering uses historical transactions to implicitly understand similarities between lenders and recommend accordingly, whereas in the standard ML formulation the features are used explicitly and the model is trained on previous interactions.
3. Other commonly used ranking techniques
A few other techniques used for scoring are distance-based methods, extensions of the standard-ML formulation into deep learning models, and LTR (Learning to Rank).
Distance-based methods are similar to content-based filtering: you compute the distances between the explicit preferences given by the users and the item characteristics. LTR is a family of techniques that formulate recommendation as a ranking problem and optimize directly for relevance. RankNet, LambdaRank and LambdaMART are a few such techniques used to minimize ranking error.
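As a minimal sketch of distance-based scoring: represent the lender’s explicit preferences and each deal as vectors over the same features, then rank deals by cosine similarity. The feature encoding and values below are hypothetical.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two feature vectors (1.0 = identical direction)."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical features: [interest_rate_norm, tenor_norm, revenue_norm]
lender_pref = np.array([0.8, 0.3, 0.9])
deals = {
    "deal_a": np.array([0.7, 0.2, 0.9]),
    "deal_b": np.array([0.1, 0.9, 0.2]),
}

ranked = sorted(deals, key=lambda d: cosine(lender_pref, deals[d]), reverse=True)
print(ranked)  # deal_a sits closest to the preference vector
```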
Metrics Through A/B Testing Are More Reliable
Though we perform dev, validation and OOT splits to check model performance and stability before deploying, the real performance of a model has to be tracked on live data, which makes an A/B testing framework necessary.
The new model is measured against the existing one (or a baseline/random model for a first-time deployment) based on lift, and further model tuning and deployments are driven by these results.
Since we have two different layers, we measure the efficiency of each layer separately. The filtering metrics ensure that all the relevant items (and only the relevant items) pass through the ‘retrieval’ layer, while the ranking metrics track the efficacy of the ranking. The metrics used to evaluate filtering are ‘precision’ and ‘recall’ over all relevant items, whereas the metrics used to evaluate ranking are ‘recall@k’, ‘precision@k’ and ‘MRR’.
- Precision@k measures the proportion of the top-k recommended deals that are relevant. Mathematically:
Precision@k = No. of relevant deals in the top k recommendations for a user / k
- Recall@k measures the proportion of all relevant items that appear in the top-k recommended list:
Recall@k = No. of relevant deals in the top k recommendations / Total relevant deals
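The two definitions translate directly into code; the recommended and relevant deal IDs below are illustrative.

```python
def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommendations that are relevant."""
    top_k = recommended[:k]
    return sum(1 for d in top_k if d in relevant) / k

def recall_at_k(recommended, relevant, k):
    """Share of all relevant deals that appear in the top-k recommendations."""
    top_k = recommended[:k]
    return sum(1 for d in top_k if d in relevant) / len(relevant)

recommended = ["d1", "d2", "d3", "d4", "d5"]  # ranked engine output
relevant = {"d2", "d4", "d9"}                 # deals the lender acted on

print(round(precision_at_k(recommended, relevant, 3), 2))  # 0.33
print(round(recall_at_k(recommended, relevant, 3), 2))     # 0.33
```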
I hope by now you can appreciate the recommendation problem we are solving at Yubi – what makes it different, the internal and external data we use to solve it, the layers in Yubi’s recommendation engine, how we solve the retrieval and ranking layers, and how their accuracy is measured.
In our subsequent posts, we will get into the details of non-personalized recommendation techniques and more importantly, detailing the complexities of the recommendation ranking models.