Product Recommendations for a Food Delivery Marketplace
Data Analysis
Problem
An online delivery marketplace aimed to boost its revenue by integrating data-driven improvements into its user experience. We collaboratively identified four potential data science applications for its marketplace: product recommendations, delivery time predictions, dynamic fare pricing, and delivery route optimization. After evaluating the data availability and practicality of each option, we decided to initiate a proof-of-concept initiative focusing on product recommendations within the food segment.
In 2023 alone, the food segment accounted for over 205,000 orders, generating approximately $6 million in revenue for the marketplace. The company’s databases were brimming with underutilized data that had not been leveraged for analytical purposes nor for data-driven management decisions. Additionally, the existing user interface lacked any form of personalized recommendations. The client’s goal was to explore the potential of data science to enhance the user experience by offering personalized food recommendations.
Solution
We identified the data requirements for building a product recommendation pipeline and collected the necessary data. Drawing inspiration from Uber Eats’ advanced recommendation techniques, we designed a proof-of-concept product recommendation pipeline combining content-based and collaborative filtering methods to deliver personalized food suggestions. The development process included several key steps:
- Data Collection: We began by extracting and manipulating relevant user and product data from MongoDB, converting this information into pandas DataFrames for in-depth analysis.
- Exploratory Data Analysis (EDA): We conducted a comprehensive exploratory data analysis to identify patterns and relationships within the data, which are crucial for refining our recommendation strategies based on actual user behaviors and product attributes.
- Content-Based Filtering: We created user and food taste profiles using features extracted from product descriptions and user preferences. These profiles were converted into numerical vectors using sklearn’s DictVectorizer, enabling the calculation of similarity scores between user tastes and product profiles.
- Collaborative Filtering: This technique leveraged users’ purchase histories to identify products favored by similar users, effectively capturing evolving user preferences.
- Hybrid Approach: By integrating content-based and collaborative filtering, our engine was able to recommend food items that not only matched previous preferences but also included popular items among similar users, thus improving both accuracy and discovery.
Tools Used
- MongoDB: Primary database for storing user and product data.
- Python: The primary programming language for developing the recommendation algorithms and performing data analysis.
- Pandas: Utilized for data manipulation and analysis to convert raw data into insights.
- Matplotlib: Primary library used to visualize data trends and algorithmic outputs.
- Sklearn (DictVectorizer): Used for encoding categorical data into numerical vectors, critical for our machine learning models.
Outcome
The proof-of-concept recommendation pipeline demonstrated the potential to implement complex AI-driven solutions in real-world scenarios. Its success marks a promising step towards extending recommendation services to other segments of the marketplace, potentially transforming the shopping experience for customers across various categories. Additionally, this initiative highlighted the importance of well-structured, ordered, and maintained data for making data-driven decisions. It also revealed gaps in existing data pipelines that, once addressed, could further refine the product’s user experience.
In the coming weeks, this proof-of-concept will be implemented in a sandbox environment within the marketplace. This will allow us to closely simulate real-world usage and make the necessary adjustments before proceeding with a full-scale deployment.