The applications of Machine Learning (ML) and AI are vast. Their ability to automate workflows and make intelligent decisions makes them a priority for fast-growing teams that want to save money and time. Top tech firms, however, apply ML and AI deliberately to make their systems efficient and tailored to the needs of their users. For example, Facebook uses an algorithm to automatically sort the posts in a user’s news feed so that posts by close friends or paid advertisements appear first. This makes the feed more relevant to the person scrolling through it and improves their experience.
ML/AI lets us process vast amounts of information and consolidate it into results and metrics that help the engineers, designers, and users of a platform. For example, Facebook collects data on the kinds of posts and advertisements that a user likes. Based on this data, the algorithm recommends similar posts and advertisements to that user. This analysis is carried out using Natural Language Processing (NLP), which computes a ‘similarity score’ between two posts so that the posts with the highest scores can be recommended. The same data also allows the Product Managers at Facebook to understand the general preferences of geographical groups and cities. Thus, the applications of machine learning can be focused on either the users or the designers of the platform.
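To make the idea of a ‘similarity score’ concrete, here is a minimal sketch in Python. It assumes the score is the cosine similarity between simple word-count vectors; this is one common choice for illustration, not a description of what Facebook actually runs. The function names `tokenize`, `similarity_score`, and `recommend` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def similarity_score(post_a: str, post_b: str) -> float:
    """Cosine similarity between the word-count vectors of two posts (0.0 to 1.0)."""
    counts_a, counts_b = Counter(tokenize(post_a)), Counter(tokenize(post_b))
    shared_words = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared_words)
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty post is not similar to anything
    return dot / (norm_a * norm_b)


def recommend(liked_post: str, candidates: list[str], top_n: int = 2) -> list[str]:
    """Return the candidate posts most similar to a post the user liked."""
    ranked = sorted(
        candidates,
        key=lambda candidate: similarity_score(liked_post, candidate),
        reverse=True,
    )
    return ranked[:top_n]


liked = "best hiking trails near boston"
feed = ["new phone review", "hiking trails near the lake"]
print(recommend(liked, feed, top_n=1))
```

Production systems typically replace raw word counts with weighted or learned text representations, but the ranking step works the same way: score every candidate against what the user liked and show the highest scores first.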
Here’s an outline of the process used by top tech firms to apply Machine Learning and AI in their workplace to improve their systems:
- Identifying gaps in their systems and the potential for capitalizing on them
The first step is to identify a feature where automation through ML/AI can improve the system and allow the company to either capitalize on that feature (make a profit) or improve its user experience. One of the biggest challenges for Facebook is finding content that contains hate speech and violence. The company has been deploying algorithms that screen each post, analyzing its words along with its pictures and links, to classify the content as violence or hate speech. The text analysis in such algorithms is a form of Natural Language Processing, which makes it possible to process long pieces of text and extract the relevant information from them.
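As a rough illustration of text classification, the sketch below trains a tiny Naive Bayes classifier on a handful of made-up posts. Naive Bayes is a stand-in chosen for brevity; the article does not say which models Facebook uses, and a real moderation system would also look at images and links. The class name, the labels `needs_review` and `ok`, and the training sentences are all hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Tiny multinomial Naive Bayes classifier with add-one smoothing."""

    def __init__(self) -> None:
        self.word_counts: dict[str, Counter] = defaultdict(Counter)
        self.label_counts: Counter = Counter()
        self.vocabulary: set[str] = set()

    def train(self, examples: list[tuple[str, str]]) -> None:
        """Learn word frequencies from (text, label) pairs."""
        for text, label in examples:
            words = text.lower().split()
            self.label_counts[label] += 1
            self.word_counts[label].update(words)
            self.vocabulary.update(words)

    def predict(self, text: str) -> str:
        """Return the label with the highest log-probability for the text."""
        total_examples = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = "", -math.inf
        for label, count in self.label_counts.items():
            score = math.log(count / total_examples)  # prior for this label
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Add-one smoothing so unseen words do not zero out the score.
                word_count = self.word_counts[label][word]
                score += math.log((word_count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Made-up training data, for illustration only.
classifier = NaiveBayesTextClassifier()
classifier.train([
    ("i will hurt you", "needs_review"),
    ("you people are worthless", "needs_review"),
    ("lovely day at the beach", "ok"),
    ("happy birthday to you", "ok"),
])
print(classifier.predict("happy birthday at the beach"))
```

With only four training sentences this is a toy, but it shows the shape of the problem: labeled examples go in, and the model assigns the most likely label to each new post.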
- Funneling the best data sources for the problem
The next step for the firm is to identify the best sources of data for the analysis. It is important to note that the quality of an ML/AI model’s output depends on the quality and features of the data it is given. This step becomes even trickier for newer users, where the firm doesn’t have much data on their preferences and tastes. Thus, most top firms deploy a combination of techniques to provide the best user experience.
When little is known about a user, most firms employ unsupervised machine learning techniques to understand their preferences and make recommendations. This kind of machine learning works on unlabelled data that has no predetermined outcome. Instead, the algorithm is responsible for finding commonalities between different data points (the users) and grouping them into clusters. These clusters are also called cliques (groups with shared interests). Cliques are usually built on the similarity of attributes such as age group, geography, ethnicity and origin, and work details (if known).
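The grouping step can be sketched with k-means, a widely used clustering algorithm. This is an illustrative choice; the article does not name the algorithm any particular firm uses. The two features here (age and weekly hours of animation watched) are hypothetical, and real features would normally be scaled to comparable ranges first.

```python
import math

Point = tuple[float, ...]


def nearest(point: Point, centroids: list[Point]) -> int:
    """Index of the centroid closest to the point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda i: math.dist(point, centroids[i]))


def mean(points: list[Point]) -> Point:
    """Coordinate-wise average of a non-empty list of points."""
    return tuple(sum(axis) / len(points) for axis in zip(*points))


def k_means(points: list[Point], k: int, iterations: int = 20) -> list[int]:
    """Group points into k clusters; returns one cluster index per point."""
    centroids = list(points[:k])  # naive deterministic start: the first k points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Step 1: attach every user to the closest cluster centre.
        assignments = [nearest(p, centroids) for p in points]
        # Step 2: move each centre to the average of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centre if a cluster ends up empty
                centroids[c] = mean(members)
    return assignments


# Hypothetical users: (age, weekly hours of animation watched)
users = [(19, 2.0), (45, 9.0), (20, 2.5), (47, 8.0), (21, 3.0), (44, 10.0)]
print(k_means(users, k=2))
```

Users that land in the same cluster form a clique in the sense used above, so a new user can inherit recommendations from the cluster they fall into.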
Most firms also collect metadata, a kind of surface-level data about each user, such as their basic preferences and choices. When a user starts a Netflix account, they are asked about their preferences, such as the kinds of movies and TV shows they watch. The algorithms use a combination of insights from the user’s clique and their metadata to make recommendations. For example, a student from the Boston region signs up and says they prefer animated and adventure TV shows. They are logging in from the Purdue University area, where a lot of students have been watching the show ‘Avatar: The Last Airbender’, tagged ‘Animation’, ‘Adventure’ and ‘Comedy’. This would automatically make the show the first recommendation on Netflix for that user, since two of its three tags match their metadata (roughly a 66% match) and it is a 100% match with their clique.
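The arithmetic in this example can be sketched as follows. The definitions are assumptions made for illustration: the metadata match is the fraction of a show’s tags that the user said they like, the clique match is 1.0 when the show is popular in the user’s clique, and the two are blended with a hypothetical `clique_weight`. ‘Example Documentary’ is a made-up catalog entry.

```python
def metadata_match(user_tags: set[str], show_tags: set[str]) -> float:
    """Fraction of the show's tags that the user said they like."""
    if not show_tags:
        return 0.0
    return len(user_tags & show_tags) / len(show_tags)


def clique_match(show: str, clique_popular_shows: set[str]) -> float:
    """1.0 if the show is popular in the user's clique, else 0.0 (assumed definition)."""
    return 1.0 if show in clique_popular_shows else 0.0


def rank_shows(
    user_tags: set[str],
    catalog: dict[str, set[str]],
    clique_popular_shows: set[str],
    clique_weight: float = 0.5,
) -> list[tuple[str, float]]:
    """Score every show as a weighted blend of both signals, best first."""
    scored = []
    for show, tags in catalog.items():
        from_metadata = metadata_match(user_tags, tags)
        from_clique = clique_match(show, clique_popular_shows)
        score = (1 - clique_weight) * from_metadata + clique_weight * from_clique
        scored.append((show, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


user_tags = {"Animation", "Adventure"}
catalog = {
    "Example Documentary": {"Documentary"},  # hypothetical title
    "Avatar: The Last Airbender": {"Animation", "Adventure", "Comedy"},
}
popular_in_clique = {"Avatar: The Last Airbender"}

# Two of the show's three tags match, so the metadata match is 2/3.
print(metadata_match(user_tags, catalog["Avatar: The Last Airbender"]))
print(rank_shows(user_tags, catalog, popular_in_clique)[0][0])
```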
- Enabling real-time improvement of the predictions
Every recommendation and prediction of an ML/AI model rests on the assumption that a user is similar to their clique and watches the shows that most people in their age group watch. But this assumption isn’t always true. ML/AI models need to evolve constantly and collect feedback from the user to improve their suggestions. When the model recommends ‘Avatar: The Last Airbender’ to a user but they don’t open the show for 3-4 days, or stop watching it partway through, the algorithm automatically lowers the priority of such shows and starts recommending other animated shows such as ‘South Park’. If the user loves South Park and watches it actively, the model learns that the user prefers adult animated comedies to animated adventure series.
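One simple way to picture this feedback loop is a priority score per show that is nudged toward a feedback signal after each interaction. The sketch below uses an exponential moving average; the event names, signal values, starting priorities, and learning rate are all hypothetical, and real recommenders use far richer models.

```python
# Assumed mapping from a user action to a feedback signal between 0 and 1.
FEEDBACK_SIGNAL = {"watched": 1.0, "abandoned": 0.2, "ignored": 0.0}


def update_priority(priority: float, event: str, learning_rate: float = 0.3) -> float:
    """Move the priority part of the way toward the feedback signal."""
    return (1 - learning_rate) * priority + learning_rate * FEEDBACK_SIGNAL[event]


def top_recommendation(priorities: dict[str, float]) -> str:
    """Show with the highest current priority."""
    return max(priorities, key=priorities.get)


priorities = {"Avatar: The Last Airbender": 0.9, "South Park": 0.6}
print(top_recommendation(priorities))  # before any feedback

# The user ignores Avatar twice, then watches South Park.
for show, event in [
    ("Avatar: The Last Airbender", "ignored"),
    ("Avatar: The Last Airbender", "ignored"),
    ("South Park", "watched"),
]:
    priorities[show] = update_priority(priorities[show], event)

print(top_recommendation(priorities))  # after feedback
```

The learning rate controls how quickly the system reacts: a high value responds fast but is easily swayed by one stray click, while a low value is steadier but slower to adapt.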
By A/B testing recommendations and learning from the user’s choices, recommendation systems improve with every choice. These feedback mechanisms should be built into the product and should be easy for the user to interact with. Facebook uses a survey-based feedback system and allows users to give a video recommendation a ‘thumbs-up’ or ‘thumbs-down’. Netflix’s algorithms learn about user choices by tracking what users click, the keywords they search for, and the shows they watch for an extended period of time.
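A bare-bones A/B test of two recommendation strategies might look like the sketch below. It assumes users are split into variants by hashing their ID, so that each user always sees the same variant, and that success is measured by click-through rate. The function names and sample events are hypothetical, and a real test would also check whether the difference is statistically significant.

```python
import hashlib
from collections import defaultdict


def assign_variant(user_id: str, variants: tuple[str, ...] = ("A", "B")) -> str:
    """Deterministically map a user to a variant by hashing their ID."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def click_through_rates(events: list[tuple[str, bool]]) -> dict[str, float]:
    """Fraction of shown recommendations that were clicked, per variant.

    Each event is (variant, clicked).
    """
    shown: dict[str, int] = defaultdict(int)
    clicked: dict[str, int] = defaultdict(int)
    for variant, was_clicked in events:
        shown[variant] += 1
        clicked[variant] += int(was_clicked)
    return {variant: clicked[variant] / shown[variant] for variant in shown}


# Made-up log of which variant was shown and whether the user clicked.
events = [("A", True), ("A", False), ("B", True), ("B", True)]
print(click_through_rates(events))
print(assign_variant("user-123") == assign_variant("user-123"))  # stable assignment
```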
- Maintaining data integrity
The integrity of the data collected for each user is important for providing relevant recommendations. If the data is manipulated, or if it isn’t specific to one user, the recommendations will not be relevant. Netflix allows users to create separate profiles so that the algorithm doesn’t blend the preferences of two different users into a single history, which would produce recommendations that suit neither of them. Making sure that the data is correct and specific to each user is therefore essential if the recommendations are to be well received.
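The effect of separate profiles can be sketched by keying viewing history on the pair (account, profile) rather than on the account alone. The class and its methods are hypothetical and only illustrate the idea of keeping each person’s signal separate.

```python
from collections import Counter


class ViewingHistory:
    """Stores watched-tag counts separately for each (account, profile) pair."""

    def __init__(self) -> None:
        self._tag_counts: dict[tuple[str, str], Counter] = {}

    def record(self, account_id: str, profile: str, tags: list[str]) -> None:
        """Add the tags of a watched show to one profile's history."""
        key = (account_id, profile)
        self._tag_counts.setdefault(key, Counter()).update(tags)

    def top_tags(self, account_id: str, profile: str, n: int = 2) -> list[str]:
        """Most-watched tags for one profile; empty if nothing was recorded."""
        counts = self._tag_counts.get((account_id, profile), Counter())
        return [tag for tag, _ in counts.most_common(n)]


history = ViewingHistory()
history.record("family-account", "kid", ["Animation", "Adventure"])
history.record("family-account", "kid", ["Animation"])
history.record("family-account", "parent", ["Documentary"])

# Each profile keeps its own preferences even though the account is shared.
print(history.top_tags("family-account", "kid", n=1))
print(history.top_tags("family-account", "parent", n=1))
```

Had both viewers shared one history, the top tags would mix animation and documentaries, and recommendations built on them would fit neither person.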
Building the best systems using ML/AI takes considerable time and a significant investment of talent and computational resources. Companies must therefore plan carefully how they deploy their ML models and how efficiently those models run in production.