Improving Recommender System with Tree-based Deep Model

Alibaba Tech
Aug 8, 2018

This article is part of the Academic Alibaba series and is taken from the paper entitled “Learning Tree-based Deep Model for Recommender Systems” by Han Zhu, Xiang Li, Pengye Zhang, Guozheng Li, Jie He, Han Li, and Kun Gai, accepted by KDD 2018. The full paper can be read here.

These days, whether we are streaming videos, using online dating apps, or shopping on e-commerce platforms, the number of options to choose from is practically limitless. This multitude would probably prove overwhelming, were it not for the presence of recommender systems — a type of information filtering system — working behind the scenes to search through the vast amounts of information on our behalf, filter and prioritize it, and provide us with optimal, personalized results.

System architecture of the recommender system used for Taobao’s display advertising

On Alibaba’s e-commerce platform Taobao, the system receives a page view request from a user, and then uses user, context, and item features to generate a set of candidate items (typically numbering in the hundreds) from a much larger corpus (hundreds of millions). The real-time prediction server then uses more expressive models to predict indicators such as click-through rate and conversion rate. Finally, after ranking the set, items are displayed to the user.
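The two stages described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `recommend`, `cheap_score`, and `expensive_score` are hypothetical names, and the two scoring functions stand in for whatever candidate-generation and ranking models a real system would use.

```python
import heapq
from typing import Callable, List, Sequence


def recommend(
    corpus: Sequence[int],
    cheap_score: Callable[[int], float],      # hypothetical fast candidate-generation model
    expensive_score: Callable[[int], float],  # hypothetical expressive ranking model (e.g. predicted CTR)
    num_candidates: int,
    num_display: int,
) -> List[int]:
    """Two-stage recommendation: narrow the corpus cheaply, then rank precisely."""
    # Stage 1: keep only the best few candidates according to the cheap model.
    candidates = heapq.nlargest(num_candidates, corpus, key=cheap_score)
    # Stage 2: apply the expensive model to the small candidate set only.
    ranked = sorted(candidates, key=expensive_score, reverse=True)
    return ranked[:num_display]


# Toy usage with made-up scoring functions over item ids 0..99.
shown = recommend(
    corpus=list(range(100)),
    cheap_score=lambda item: -abs(item - 50),
    expensive_score=lambda item: float(item),
    num_candidates=5,
    num_display=2,
)
```

The point of the split is that the expensive model is called `num_candidates` times instead of once per item in the corpus.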

Existing Models: Information Overload

Model-based recommender systems have been the focus of much research in recent years, but they face practical hurdles. E-commerce systems typically have enormous corpora, so the computational cost of predicting a preference score for every user/item pair is huge, which makes retrieval over the full corpus impractical. At the other end of the spectrum, some algorithms cannot predict over the entire corpus at all.

Some models, such as matrix factorization, express the user/item preference as an inner product of vectors, which allows efficient k-nearest neighbor search. However, incorporating a more expressive interaction between user and item features remains a challenge, largely because of the computational cost of scoring with deep neural networks.
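To make the nearest-neighbor idea concrete: matrix factorization scores a user/item pair as the inner product of two learned vectors, so retrieval reduces to finding the items whose vectors have the largest inner product with the user vector. The sketch below does this by brute force; the function name and the toy vectors are assumptions for illustration, and a production system would typically replace the exhaustive loop with an approximate nearest-neighbor index.

```python
import heapq
from typing import Dict, List, Sequence


def top_k_by_inner_product(
    user_vec: Sequence[float],
    item_vecs: Dict[str, Sequence[float]],
    k: int,
) -> List[str]:
    """Return the k item ids whose vectors have the largest inner product with user_vec."""
    scores = {
        item: sum(u * v for u, v in zip(user_vec, vec))
        for item, vec in item_vecs.items()
    }
    return heapq.nlargest(k, scores, key=scores.get)


# Made-up two-dimensional embeddings, purely for illustration.
items = {"a": [0.9, 0.1], "b": [0.1, 0.9], "c": [0.5, 0.5]}
nearest = top_k_by_inner_product([1.0, 0.0], items, k=2)
```

The restriction to an inner product is what makes the search fast, and it is also what limits how expressive the user/item interaction can be.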

In addition to being accurate, recommended items also have to be novel. Results that simply replicate a user’s historical behavior are undesirable. In this respect, memory-based methods such as item-based collaborative filtering fall short, because they recommend items similar to those a user has already interacted with.

TDM: Making Information Manageable

Frustrated with the shortcomings of existing models, the Alibaba tech team decided to develop a novel tree-based deep recommendation model (TDM). This model leverages a hierarchy of information (for example, taking ‘iPhone’ as a fine-grained item and ‘smartphone’ as a broader, coarse-grained concept) and turns recommendation problems into a series of hierarchical classification problems. The division of problems into smaller ones allows TDM to solve each of them successively from easiest to hardest, enabling it to make accurate and efficient predictions from a large corpus.
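One way to picture the coarse-to-fine process is a level-by-level search that keeps only the k most promising nodes at each level of the tree. The sketch below is a minimal illustration, not the team's implementation: the `Node` class, the `tree_retrieve` function, and the hard-coded scores are all assumptions, and in TDM the score for a node would come from a learned deep model rather than a lookup table.

```python
import heapq
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    name: str
    children: List["Node"] = field(default_factory=list)

    @property
    def is_leaf(self) -> bool:
        return not self.children


def tree_retrieve(root: Node, score: Callable[[Node], float], k: int) -> List[Node]:
    """Walk the tree top-down, keeping the k best-scoring nodes per level."""
    beam = [root]
    found: List[Node] = []
    while beam:
        # Only children of the surviving nodes are scored at the next level.
        children = [child for node in beam for child in node.children]
        kept = heapq.nlargest(k, children, key=score)
        found.extend(node for node in kept if node.is_leaf)
        beam = [node for node in kept if not node.is_leaf]
    return heapq.nlargest(k, found, key=score)


# Hypothetical hierarchy and preference scores for a single user.
root = Node("root", [
    Node("smartphone", [Node("iPhone"), Node("Pixel")]),
    Node("laptop", [Node("MacBook"), Node("ThinkPad")]),
])
toy_scores = {"smartphone": 0.9, "laptop": 0.2, "iPhone": 0.8,
              "Pixel": 0.6, "MacBook": 0.7, "ThinkPad": 0.1}
top_1 = [n.name for n in tree_retrieve(root, lambda n: toy_scores[n.name], k=1)]
top_2 = [n.name for n in tree_retrieve(root, lambda n: toy_scores[n.name], k=2)]
```

With k=1 the laptop branch is pruned at the first level, so its items are never scored. In a balanced tree, this kind of pruning makes the number of model evaluations grow with the depth of the tree rather than with the size of the corpus.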

While TDM explores the entire corpus for more precise and effective recommendations, it also incorporates deep models that help it find potential interests.

The TDM architecture leverages the hierarchy of information to make recommendation more manageable

Testing TDM in Theory and Practice

To test the model, the Alibaba tech team derived several variants of TDM: TDM product-DNN, TDM DNN, and TDM attention-DNN-HS. These variants were tested using two datasets, MovieLens-20M and User Behavior. Results show that introducing advanced models in TDM can significantly improve the recommendation performance.

Comparison results of different methods against two different datasets

The proposed TDM method was also tested online with real traffic on the Taobao display advertising platform, where both click-through rate and RPM (revenue per mille, i.e. revenue per thousand impressions) increased. This indicates that TDM both retrieves more accurate results for users and generates more revenue for the Taobao advertising platform.
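For readers unfamiliar with the two online metrics, they are conventionally computed as shown below. The function names and the numbers in the usage lines are invented for illustration and are not results from the experiment.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Fraction of impressions that resulted in a click."""
    return clicks / impressions


def revenue_per_mille(revenue: float, impressions: int) -> float:
    """Revenue earned per thousand impressions."""
    return revenue * 1000 / impressions


# Made-up traffic numbers, purely to show the arithmetic.
ctr = click_through_rate(clicks=30, impressions=1000)
rpm = revenue_per_mille(revenue=12.5, impressions=5000)
```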

Looking Ahead

The main challenge that model-based methods face is the amount of computation required to generate recommendations from a large corpus. TDM, a tree-based approach, can employ arbitrary advanced models in the recommendation process and infers a user’s interests from coarse to fine along the tree. In extensive testing, it improved both recommendation accuracy and novelty. Looking ahead, a possible direction for future research is more elaborate tree learning approaches.

The User Behavior data set can be found here.


Alibaba Tech

First hand and in-depth information about Alibaba’s latest technology → Facebook: “Alibaba Tech”. Twitter: “AlibabaTech”.
