Summary

Catalant Technologies uses LightFM, a hybrid recommendation system, to match business experts with relevant projects by leveraging both collaborative filtering and content-based filtering techniques.

Abstract

Catalant Technologies faces the challenge of recommending suitable projects to a diverse pool of over 30,000 experts on their platform. Traditional collaborative filtering methods are less effective due to the short lifespan and limited interactions of Catalant projects. Content-based filtering, while useful, requires extensive manual metadata processing and lacks the ability to learn from user behavior across different users. To address these issues, Catalant employs LightFM, a hybrid recommendation model developed by Lyst, which combines the strengths of collaborative and content-based filtering. LightFM utilizes expert and project metadata, along with user interaction data, to learn embeddings that inform its recommendations. This approach allows Catalant to provide personalized project suggestions that consider both the content of the projects and the preferences of similar experts.

Opinions

Collaborative filtering is deemed less effective for Catalant due to the brief project lifespans and limited expert interactions.
Content-based filtering is critiqued for its reliance on thorough metadata and its inability to leverage transfer learning from user behavior.
LightFM is praised as an effective solution for Catalant's recommendation challenges, offering a balance between collaborative and content-based methods.
The use of metadata matrices for experts and projects, along with interaction data, is highlighted as a key component of LightFM's success in generating accurate recommendations.
The hybrid approach of LightFM is seen as advantageous for capturing the nuances of both user preferences and project characteristics, leading to more relevant recommendations.

Using LightFM to Recommend Projects to Consultants

We have a challenge at Catalant Technologies: more than 30,000 business experts and boutique firms use our platform, all looking for the right projects to tackle. These experts are alumni of traditional consulting firms, subject-matter experts (SMEs) in niche technical fields, veterans of the world’s largest enterprises, and everything in between. The projects on our platform are just as diverse as the experts working on them, and we need to find the right projects for every expert. This is the challenge that my colleague Andy Luther and I have been working on.

When an expert signs up for Catalant, they can provide a description of themselves in their ‘About Me’ section and tag themselves with skills and industries that represent their expertise. Similarly, when a business posts a project to our site, they write a project description and can tag their project with the skills they need. Experts are able to search for projects on their own, but we want to make it easy by recommending the most relevant projects for them.

One way we could generate these recommendations is by tracking which projects experts like and dislike (‘bookmark’ or ‘hide’ on Catalant), comparing these interactions to other experts’ likes and dislikes, and recommending projects that similar experts have liked. This is called ‘collaborative filtering,’ and it covers many algorithms with their own strengths and weaknesses. Collaborative filters are widely used on shopping websites like Amazon, news services, and video content providers like YouTube.
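To make the idea concrete, here is a minimal sketch of one simple collaborative filter (user-based, with cosine similarity). The expert and project IDs and the +1/-1 encoding of bookmarks and hides are made up for illustration; this is one of many possible algorithms, not a description of any production system.

```python
from math import sqrt

# Hypothetical interactions: +1 = bookmark, -1 = hide, missing = not seen.
interactions = {
    "expert_a": {"proj_1": 1, "proj_2": -1, "proj_3": 1},
    "expert_b": {"proj_1": 1, "proj_2": -1, "proj_4": 1},
    "expert_c": {"proj_1": -1, "proj_2": 1, "proj_5": 1},
}


def cosine(u, v):
    """Cosine similarity between two sparse {project: rating} dicts."""
    shared = set(u) & set(v)
    dot = sum(u[p] * v[p] for p in shared)
    norm = sqrt(sum(x * x for x in u.values())) * sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0


def recommend(target, interactions, k=3):
    """Rank projects the target has not seen by similarity-weighted votes."""
    seen = interactions[target]
    scores = {}
    for other, ratings in interactions.items():
        if other == target:
            continue
        sim = cosine(seen, ratings)
        for project, rating in ratings.items():
            if project not in seen:
                scores[project] = scores.get(project, 0.0) + sim * rating
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    # Keep only projects with a positive overall vote.
    return [project for project, s in ranked[:k] if s > 0]


print(recommend("expert_a", interactions))
```

Here expert_a and expert_b agree on two projects, so expert_b's bookmark of proj_4 counts in its favor, while expert_c's opposite tastes count against proj_5.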

Collaborative Filtering and Obesity: a silent killer.

There is an important difference between how websites like Amazon incorporate collaborative filters and how Catalant must recommend projects: Catalant project lifetimes are very short. Products stay in Amazon’s catalog for months, and in that time they will be purchased and rated by potentially thousands of users. This allows Amazon to develop a very rich understanding of which kinds of items are purchased by the same customers and to make recommendations based on these user clusters. A Catalant project will only accept bids for a few days, may be interacted with by fewer than thirty experts, and needs to be recommended to the right experts the moment it is posted. This short lifetime and small number of interactions would hobble many collaborative filtering strategies.

An alternative to collaborative filtering is content-based filtering. A content-based recommender is one that matches a user with items that have metadata (tags, genres, etc.) similar to items that user has already liked. On Netflix, that means breaking down movies by their genre, actors, subject matter or other criteria.

For Catalant, a content-based approach means recommending projects to a user that have similar industry and skill tags to projects the user has already bookmarked. There are two major drawbacks to content-based recommendation systems: they require a lot of manual processing to ensure that the project metadata is thorough and rich, and they miss out on the potential to learn from comparing different users’ behavior (called ‘transfer learning’).
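A content-based recommender of this kind can be sketched in a few lines. The example below assumes tags are plain strings and uses Jaccard overlap as the similarity measure; both the tag values and the choice of Jaccard are illustrative assumptions, not something the approach requires.

```python
def jaccard(a, b):
    """Share of tags two tag sets have in common (0.0 to 1.0)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical tag sets for projects an expert has already bookmarked.
bookmarked = [
    {"pricing", "retail", "market research"},
    {"pricing", "consumer goods"},
]

# Hypothetical projects currently out for bid.
open_projects = {
    "proj_10": {"pricing", "retail"},
    "proj_11": {"supply chain", "manufacturing"},
    "proj_12": {"market research", "healthcare"},
}


def content_scores(bookmarked, open_projects):
    """Score each open project by its best overlap with any bookmarked project."""
    return {
        pid: max(jaccard(tags, liked) for liked in bookmarked)
        for pid, tags in open_projects.items()
    }


scores = content_scores(bookmarked, open_projects)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Note that the score depends only on this one expert's history and on how well the projects are tagged, which is exactly where the two drawbacks above come from.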

The appropriate solution for Catalant recommendations is somewhere in between collaborative filters and content-based recommenders. That is where LightFM comes in.

LightFM is a hybrid model that incorporates both content-based recommendations and the transfer learning of collaborative filtering methods; it gives us the best of both worlds. Developed by Lyst, a fashion e-commerce site based in London, LightFM figures out what users like by learning relationships that map users and user metadata to the projects and project metadata that they like. These relationships are called ‘embeddings.’ To build these embeddings, LightFM uses three sets of information: the expert metadata, the project metadata, and the interactions between them.

For expert metadata, we build a matrix with one row per expert, containing each expert’s industry and skill tags as well as selected words from their profile tagline and ‘About Me’ section.
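As an illustration, a metadata matrix of this shape can be built as a binary indicator matrix: one row per expert, one column per feature, and a 1 wherever the expert has that feature. The expert IDs, the feature names, and the `industry:` / `skill:` / `word:` prefixes below are hypothetical, and a real pipeline would most likely store the result as a sparse matrix rather than nested lists.

```python
def build_feature_matrix(entities):
    """Return (row_ids, feature_names, matrix) with one binary column per feature."""
    feature_names = sorted({f for feats in entities.values() for f in feats})
    column = {name: j for j, name in enumerate(feature_names)}
    row_ids = sorted(entities)
    matrix = [[0] * len(feature_names) for _ in row_ids]
    for i, entity_id in enumerate(row_ids):
        for feature in entities[entity_id]:
            matrix[i][column[feature]] = 1
    return row_ids, feature_names, matrix


# Hypothetical experts with tags and a word taken from their profile text.
experts = {
    "expert_a": ["industry:retail", "skill:pricing", "word:strategy"],
    "expert_b": ["industry:healthcare", "skill:pricing"],
}

rows, features, matrix = build_feature_matrix(experts)
print(features)
for row_id, row in zip(rows, matrix):
    print(row_id, row)
```

The same helper would work unchanged for projects, with budget ranges encoded as one more kind of feature.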

Similarly, for project metadata, we build a matrix containing each project’s industry tags, skill tags, selected words from the project name and description, and budget range.

The interactions are also put into a matrix, where every positive value represents a like and every negative value represents a dislike.
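A small sketch of how such an interactions matrix might be assembled from an event log follows. The event tuples, the IDs, and the choice of +1 and -1 as the stored values are assumptions made for the example; 0 stands for "no interaction recorded."

```python
# Hypothetical event log: (expert, project, action).
events = [
    ("expert_a", "proj_1", "bookmark"),
    ("expert_a", "proj_2", "hide"),
    ("expert_b", "proj_2", "bookmark"),
]

# Assumed encoding: a bookmark is a like, a hide is a dislike.
VALUE = {"bookmark": 1, "hide": -1}


def build_interactions(events, expert_ids, project_ids):
    """Return a matrix with one row per expert and one column per project."""
    row = {expert: i for i, expert in enumerate(expert_ids)}
    col = {project: j for j, project in enumerate(project_ids)}
    matrix = [[0] * len(project_ids) for _ in expert_ids]
    for expert, project, action in events:
        matrix[row[expert]][col[project]] = VALUE[action]
    return matrix


interaction_matrix = build_interactions(
    events, ["expert_a", "expert_b"], ["proj_1", "proj_2"]
)
print(interaction_matrix)
```

With only a few days of bidding per project, most entries in a matrix like this stay at 0, which is why the metadata matrices matter so much.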

LightFM then takes these three matrices and solves for the embeddings that will allow it to most accurately predict the values in the interactions matrix. You can read more about how LightFM learns (using a technique called ‘stochastic gradient descent’) in this paper by Maciej Kula from Lyst.
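The learning step can be sketched in plain Python. This is a simplified toy version of the idea, not LightFM's implementation: each expert and each project is represented as the sum of its feature embeddings, the score is their dot product, and plain stochastic gradient descent on a logistic loss nudges the embeddings toward the observed likes and dislikes. Bias terms, regularization, and adaptive learning rates are left out, and all IDs, features, and hyperparameters are made up.

```python
import math
import random


def represent(features, emb, dim):
    """An expert or project is the sum of its feature embeddings."""
    return [sum(emb[f][d] for f in features) for d in range(dim)]


def init_embeddings(feats, dim, rng):
    """One small random vector per distinct feature name."""
    return {f: [rng.gauss(0, 0.1) for _ in range(dim)]
            for names in feats.values() for f in names}


def score(expert, project, model):
    """Dot product of the expert and project representations."""
    q = represent(model["expert_feats"][expert], model["expert_emb"], model["dim"])
    p = represent(model["project_feats"][project], model["project_emb"], model["dim"])
    return sum(a * b for a, b in zip(q, p))


def train(interactions, expert_feats, project_feats, dim=4, lr=0.1, epochs=300, seed=0):
    rng = random.Random(seed)
    model = {
        "dim": dim,
        "expert_feats": expert_feats,
        "project_feats": project_feats,
        "expert_emb": init_embeddings(expert_feats, dim, rng),
        "project_emb": init_embeddings(project_feats, dim, rng),
    }
    data = list(interactions)
    for _ in range(epochs):
        rng.shuffle(data)  # "stochastic": visit interactions in random order
        for expert, project, label in data:
            q = represent(expert_feats[expert], model["expert_emb"], dim)
            p = represent(project_feats[project], model["project_emb"], dim)
            target = 1.0 if label > 0 else 0.0
            predicted = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(q, p))))
            err = predicted - target  # gradient of the logistic loss w.r.t. the score
            for f in expert_feats[expert]:
                for d in range(dim):
                    model["expert_emb"][f][d] -= lr * err * p[d]
            for f in project_feats[project]:
                for d in range(dim):
                    model["project_emb"][f][d] -= lr * err * q[d]
    return model


# Hypothetical metadata and interactions (+1 = like, -1 = dislike).
expert_feats = {
    "expert_a": ["skill:pricing", "industry:retail"],
    "expert_b": ["skill:supply chain", "industry:manufacturing"],
    "expert_c": ["skill:pricing"],  # new expert with no interactions yet
}
project_feats = {
    "proj_1": ["skill:pricing", "budget:10k-25k"],
    "proj_2": ["skill:supply chain"],
    "proj_3": ["skill:pricing", "industry:retail"],  # new project, no interactions
}
interactions = [
    ("expert_a", "proj_1", 1), ("expert_a", "proj_2", -1),
    ("expert_b", "proj_2", 1), ("expert_b", "proj_1", -1),
]

model = train(interactions, expert_feats, project_feats)
print(score("expert_c", "proj_3", model), score("expert_c", "proj_2", model))
```

Because the embeddings belong to features rather than to individual experts or projects, expert_c and proj_3 get meaningful scores despite having no interaction history. With the actual library, the equivalent step is a call to `LightFM(loss='logistic').fit(...)` with the interactions and the two feature matrices.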

To create recommendations for an expert, we give LightFM that expert’s metadata and the metadata of all out-for-bid projects. LightFM then uses the embeddings that it has learned to score each project on how much the expert will like it. The top-scoring projects are the user’s recommendations.
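The scoring-and-ranking step might look like the sketch below. The two-dimensional vectors stand in for representations that a trained model would produce from metadata; their values, the project IDs, and the helper name `top_projects` are invented for the example.

```python
import heapq


def top_projects(expert_vec, project_vecs, n=2):
    """Score every open project against the expert and keep the n best."""
    scored = (
        (sum(a * b for a, b in zip(expert_vec, vec)), project_id)
        for project_id, vec in project_vecs.items()
    )
    return [project_id for _, project_id in heapq.nlargest(n, scored)]


# Hypothetical learned representations (expert, then out-for-bid projects).
expert_vec = [1.0, -0.5]
open_project_vecs = {
    "proj_1": [0.9, 0.1],
    "proj_2": [-0.8, 0.4],
    "proj_3": [0.5, -0.6],
}

recommendations = top_projects(expert_vec, open_project_vecs, n=2)
print(recommendations)
```

Using `heapq.nlargest` avoids sorting the full list when only a handful of recommendations are needed out of many open projects.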

This system is something between a collaborative filter and a content-based system. It learns how groups of experts interact with various projects, like a collaborative filter, but it is also learning relationships between expert and project metadata, like a content-based system. By working in both worlds, LightFM gives us the strengths of both, and helps us match our diverse projects to our equally diverse experts.

In future posts, we will dive further into how we measure success, how we break down expert/project metadata, how we optimize these models, and plans for maximizing sick wheelie hang-time.

If these problems are interesting to you, we have plenty more to solve at Catalant Technologies. We are looking for Data Science Engineers and Software Engineers. Learn more here.