Publications
Publications by category in reverse chronological order.
2024
- Modular Debiasing of Latent User Representations in Prototype-based Recommender Systems. Alessandro B. Melchiorre, Shahed Masoudian, Deepak Kumar, and 1 more author. In Proceedings of 2024 Joint European Conference on Machine Learning and Knowledge Discovery in Databases (ECML PKDD), 2024
Recommender Systems (RSs) may inadvertently perpetuate biases based on protected attributes like gender, religion, or ethnicity. Left unaddressed, these biases can lead to unfair system behavior and privacy concerns. Interpretable RS models provide a promising avenue for understanding and mitigating such biases. In this work, we propose a novel approach to debias interpretable RS models by introducing user-specific scaling weights to the interpretable user representations of prototype-based RSs. This reduces the influence of the protected attributes on the RS’s prediction while preserving recommendation utility. By decoupling the scaling weights from the original representations, users can control the degree of invariance of recommendations to their protected characteristics. Moreover, by defining distinct sets of weights for each attribute, the user can further specify which attributes the recommendations should be agnostic to. We apply our method to ProtoMF, a state-of-the-art prototype-based RS model that models users by their similarities to prototypes. We employ two debiasing strategies to learn the scaling weights and conduct experiments on ML-1M and LFM2B-DB datasets aiming at making the user representations agnostic to age and gender. The results show that our approach effectively reduces the influence of the protected attributes on the representations on both datasets, showcasing flexibility in bias mitigation, while only marginally affecting recommendation quality. Finally, we assess the effects of the debiasing weights and provide qualitative evidence, particularly focusing on movie recommendations, of genre patterns identified by ProtoMF that correlate with specific genders.
2023
- [Journal] Emotion-aware Music Tower Blocks (EmoMTB): An Intelligent Audiovisual Interface for Music Discovery and Recommendation. Alessandro B. Melchiorre, David Penz, Christian Ganhör, and 6 more authors. International Journal of Multimedia Information Retrieval (IJMIR), 2023
Music listening has experienced a sharp increase during the last decade thanks to music streaming and recommendation services. While they offer text-based search functionality and provide recommendation lists of remarkable utility, their typical mode of interaction is unidimensional, i.e., they provide lists of consecutive tracks, which are commonly inspected in sequential order by the user. The user experience with such systems is heavily affected by cognitive biases (e.g., position bias, the human tendency to pay more attention to the first positions of ordered lists) as well as algorithmic biases (e.g., popularity bias, the tendency of recommender systems to overrepresent popular items). This may cause dissatisfaction among the users by preventing them from finding novel music to enjoy. In light of such systems and biases, we propose an intelligent audiovisual music exploration system named EmoMTB. It allows the user to browse the entirety of a given collection in a free non-linear fashion. The navigation is assisted by a set of personalized emotion-aware recommendations which serve as starting points for the exploration experience. EmoMTB adopts the metaphor of a city, in which each track (visualized as a colored cube) represents one floor of a building. Highly similar tracks are located in the same building; moderately similar ones form neighborhoods that mostly correspond to genres. Tracks situated between distinct neighborhoods create a gradual transition between genres. Users can navigate this music city using their smartphones as control devices. They can explore districts of well-known music or decide to leave their comfort zone. In addition, EmoMTB integrates an emotion-aware music recommendation system that re-ranks the list of suggested starting points for exploration according to the user’s self-identified emotion or the collective emotion expressed in EmoMTB’s Twitter channel.
Evaluation of EmoMTB has been carried out in a three-fold way: by quantifying the homogeneity of the clustering underlying the construction of the city, by measuring the accuracy of the emotion predictor, and by carrying out a web-based survey composed of open questions to obtain qualitative feedback from users.
2022
- ProtoMF: Prototype-based Matrix Factorization for Effective and Explainable Recommendations. Alessandro B. Melchiorre, Navid Rekabsaz, Christian Ganhör, and 1 more author. In Proceedings of the 16th ACM Conference on Recommender Systems (RecSys), Seattle, WA, USA, 2022
Recent studies show the benefits of reformulating common machine learning models through the concept of prototypes – representatives of the underlying data, used to calculate the prediction score as a linear combination of similarities of a data point to prototypes. Such prototype-based formulation of a model, in addition to preserving (sometimes enhancing) the performance, enables explainability of the model’s decisions, as the prediction can be linearly broken down into the contributions of distinct definable prototypes. Following this direction, we extend the idea of prototypes to the recommender system domain by introducing ProtoMF, a novel collaborative filtering algorithm. ProtoMF learns sets of user/item prototypes that represent the general consumption characteristics of users/items in the underlying dataset. Using these prototypes, ProtoMF then represents users and items as vectors of similarities to the corresponding prototypes. These user/item representations are ultimately leveraged to make recommendations that are both effective in terms of accuracy metrics, and explainable through the interpretation of prototypes’ contributions to the affinity scores. We conduct experiments on three datasets to assess both the effectiveness and the explainability of ProtoMF. Addressing the former, we show that ProtoMF exhibits higher Hit Ratio and NDCG compared to other relevant collaborative filtering approaches. As for the latter, we qualitatively show how ProtoMF can provide explainable recommendations and how its explanation capabilities can expose the existence of statistical biases in the learned representations, which we exemplify for the case of gender bias.
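The prototype idea described in the abstract above can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not the authors' implementation: cosine similarity is assumed as the similarity function, only user prototypes are modeled, and the names `prototype_representation`, `score_with_contributions`, and all numeric values are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def prototype_representation(embedding, prototypes):
    """Represent an entity as its vector of similarities to each prototype."""
    return [cosine(embedding, p) for p in prototypes]


def score_with_contributions(user_sims, item_weights):
    """Affinity score as a linear combination of prototype similarities.

    Returns the score and the per-prototype contributions that sum to it,
    which is what makes the prediction decomposable.
    """
    contributions = [s * w for s, w in zip(user_sims, item_weights)]
    return sum(contributions), contributions


# Hypothetical toy values (assumptions, not taken from the paper).
user_embedding = [1.0, 0.0]
user_prototypes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
item_weights = [0.5, 2.0, 1.0]  # one weight per user prototype

user_sims = prototype_representation(user_embedding, user_prototypes)
score, contributions = score_with_contributions(user_sims, item_weights)
print(score, contributions)
```

Because the score is a plain sum, each entry of `contributions` can be read as how much one prototype pushed the recommendation up or down.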
- Exploring Cross-group Discrepancies in Calibrated Popularity for Accuracy/Fairness Trade-off Optimization. Oleg Lesota, Stefan Brandl, Matthias Wenzel, and 4 more authors. In Proceedings of the 2nd Workshop on Multi-Objective Recommender Systems co-located with 16th ACM Conference on Recommender Systems (RecSys 2022), Seattle, WA, USA, 18th-23rd September 2022
Popularity bias is an important issue in recommender systems, as it affects end-users, content creators, and content provider platforms alike. It can cause users to miss out on less popular items that would fit their preference, prevent new content creators from finding their audience, and force providers to pay higher royalties for serving expensive popular content. Over the past years, various approaches to mitigate popularity bias in recommender systems have been proposed. Among them, post-processing methods are widely accepted due to their versatility and ease of implementation. While previous studies have investigated the effects of different post-processing techniques on accuracy and fairness of recommendations, the influence of different algorithms on different user groups has not received much attention in this context. Addressing this research gap, we study the effect of a recent mitigation strategy, Calibrated Popularity, in conjunction with a selection of state-of-the-art recommender algorithms including BPR, ItemKNN, LightGCN, MultiVAE, and NeuMF. We show that these algorithms demonstrate different characteristics in terms of the trade-off between accuracy and fairness, both within and between various user groups defined by gender and inclination towards consumption of mainstream items. Finally, we demonstrate how these discrepancies can be exploited to achieve a more effective trade-off between utility and fairness of recommender systems.
- EmoMTB: Emotion-aware Music Tower Blocks. Alessandro B. Melchiorre, David Penz, Christian Ganhör, and 6 more authors. In Proceedings of the 2022 International Conference on Multimedia Retrieval (ICMR), 2022
We introduce Emotion-aware Music Tower Blocks (EmoMTB), an audiovisual interface to explore large music collections. It creates a musical landscape by adopting the metaphor of a city, where similar songs are grouped into the same building and nearby buildings form neighborhoods of particular genres. To personalize the user experience, an underlying classifier monitors textual user-generated content, predicting the user’s emotional state and adapting the audiovisual elements of the interface accordingly. EmoMTB enables users to explore different musical styles either within their comfort zone or outside of it. Besides tailoring the results of the recommender engine to match the affective state of the user, EmoMTB offers a unique way to discover and enjoy music. EmoMTB supports exploring a collection of circa half a million streamed songs using a regular smartphone as a control interface to navigate in the landscape.
- [Article with Deezer] Explainability in Music Recommender Systems. Darius* Afchar, Alessandro B.* Melchiorre, Markus Schedl, and 3 more authors. AI Magazine, 2022
The most common way to listen to recorded music nowadays is via streaming platforms which provide access to tens of millions of tracks. To assist users in effectively browsing these large catalogs, the integration of Music Recommender Systems (MRSs) has become essential. Current real-world MRSs are often quite complex and optimized for recommendation accuracy. They combine several building blocks based on collaborative filtering and content-based recommendation. This complexity can hinder the ability to explain recommendations to end users, which is particularly important for recommendations perceived as unexpected or inappropriate. While pure recommendation performance often correlates with user satisfaction, explainability has a positive impact on other factors such as trust and forgiveness, which are ultimately essential to maintain user loyalty. In this article, we discuss how explainability can be addressed in the context of MRSs. We provide perspectives on how explainability could improve music recommendation algorithms and enhance user experience. First, we review common dimensions and goals of recommenders’ explainability and in general of eXplainable Artificial Intelligence (XAI), and elaborate on the extent to which these apply – or need to be adapted – to the specific characteristics of music consumption and recommendation. Then, we show how explainability components can be integrated within a MRS and in what form explanations can be provided. Since the evaluation of explanation quality is decoupled from pure accuracy-based evaluation criteria, we also discuss requirements and strategies for evaluating explanations of music recommendations. Finally, we describe the current challenges for introducing explainability within a large-scale industrial music recommender system and provide research perspectives.
2021
- Analyzing Item Popularity Bias of Music Recommender Systems: Are Different Genders Equally Affected? Oleg Lesota, Alessandro B. Melchiorre, Navid Rekabsaz, and 4 more authors. In Fifteenth ACM Conference on Recommender Systems (RecSys), Amsterdam, Netherlands, 2021
Several studies have identified discrepancies between the popularity of items in user profiles and the corresponding recommendation lists. Such behavior, which concerns a variety of recommendation algorithms, is referred to as popularity bias. Existing work predominantly adopts simple statistical measures, such as the difference of mean or median popularity, to quantify popularity bias. Moreover, it does so irrespective of user characteristics other than the inclination to popular content. In this work, in contrast, we propose to investigate popularity differences (between the user profile and recommendation list) in terms of median, a variety of statistical moments, as well as similarity measures that consider the entire popularity distributions (Kullback-Leibler divergence and Kendall’s τ rank-order correlation). This results in a more detailed picture of the characteristics of popularity bias. Furthermore, we investigate whether such algorithmic popularity bias affects users of different genders in the same way. We focus on music recommendation and conduct experiments on the recently released standardized LFM-2b dataset, containing listening profiles of Last.fm users. We investigate the algorithmic popularity bias of seven common recommendation algorithms (five collaborative filtering and two baselines). Our experiments show that (1) the studied metrics provide novel insights into popularity bias in comparison with only using average differences, (2) algorithms less inclined towards popularity bias amplification do not necessarily perform worse in terms of utility (NDCG), and (3) the majority of the investigated recommenders intensify the popularity bias of the female users.
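The kinds of measures named in the abstract above (median difference, Kullback-Leibler divergence, Kendall's τ) can be sketched as follows. This is a minimal illustration under assumptions, not the paper's exact procedure: the relative form of the median difference, the three-bin popularity histograms, the tau-a variant of Kendall's τ, and all sample values are hypothetical choices made for this example.

```python
import math
from statistics import median


def median_delta(profile_pop, rec_pop):
    """Relative change of median item popularity from profile to recommendations."""
    m_profile = median(profile_pop)
    return (median(rec_pop) - m_profile) / m_profile


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins.

    Assumes q is strictly positive wherever p is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs.

    Tied pairs count as neither concordant nor discordant.
    """
    n = len(x)
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            total += (s > 0) - (s < 0)
    return total / (n * (n - 1) / 2)


# Hypothetical popularity values of items in a user's profile and recommendations.
profile_pop = [10, 20, 30, 40, 50]
rec_pop = [30, 40, 50, 60, 70]

# Hypothetical share of low / mid / high popularity items in each list.
profile_hist = [0.5, 0.3, 0.2]
rec_hist = [0.2, 0.3, 0.5]

delta = median_delta(profile_pop, rec_pop)
kl = kl_divergence(profile_hist, rec_hist)
tau = kendall_tau(profile_hist, rec_hist)
print(delta, kl, tau)
```

A positive `delta` indicates that the recommendations lean towards more popular items than the profile, while `kl` and `tau` compare the whole distributions rather than a single summary statistic.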
- [Journal] Investigating Gender Fairness of Recommendation Algorithms in the Music Domain. Alessandro B. Melchiorre, Navid Rekabsaz, Emilia Parada-Cabaleiro, and 3 more authors. Information Processing & Management (IPM), 2021
Although recommender systems (RSs) play a crucial role in our society, previous studies have revealed that the performance of RSs may considerably differ between groups of individuals with different characteristics or from different demographics. In this case, an RS is considered to be unfair when it does not perform equally well for different groups of users. Considering the importance of RSs in the distribution and consumption of musical content worldwide, a careful evaluation of fairness in the context of music RSs is crucial. To this end, we first introduce LFM-2b, a novel large-scale real-world dataset of music listening records, comprising a subset to investigate bias of RSs regarding users’ demographics. We then define a notion of fairness based on the performance gap of an RS between the users with different demographics, and evaluate a variety of collaborative filtering algorithms in terms of accuracy and beyond-accuracy metrics to explore the fairness in the RS results toward a specific gender group. We observe the existence of significant discrepancies (unfairness) between the performance of algorithms across male and female user groups. Based on these discrepancies, we explore to what extent recommender algorithms lead to intensifying the underlying population bias in the final results. We also study the effect of a resampling strategy, commonly used as a debiasing method, which yields slight improvements in the fairness measures of various algorithms while maintaining their accuracy and beyond-accuracy performance.
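A performance-gap notion of fairness, as mentioned in the abstract above, could look roughly like the sketch below. The exact definition used in the paper may differ; here the gap is assumed to be the difference between the mean per-user metric of two groups, and the function names, user IDs, group labels, and metric values are all hypothetical.

```python
from statistics import mean


def group_means(metric_per_user, group_per_user):
    """Average a per-user metric (e.g. NDCG) within each user group."""
    groups = {}
    for user, value in metric_per_user.items():
        groups.setdefault(group_per_user[user], []).append(value)
    return {group: mean(values) for group, values in groups.items()}


def performance_gap(metric_per_user, group_per_user, group_a, group_b):
    """Difference of mean performance between two groups (assumed gap definition)."""
    means = group_means(metric_per_user, group_per_user)
    return means[group_a] - means[group_b]


# Hypothetical per-user NDCG values and self-reported gender labels.
ndcg = {"u1": 0.4, "u2": 0.6, "u3": 0.3, "u4": 0.5}
gender = {"u1": "m", "u2": "m", "u3": "f", "u4": "f"}

gap = performance_gap(ndcg, gender, "m", "f")
print(gap)
```

A gap close to zero would suggest similar average performance for both groups under this assumed definition; a statistical test would still be needed before calling any difference significant.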
- LEMONS: Listenable Explanations for Music recOmmeNder Systems. Alessandro B. Melchiorre, Verena Haunschmid, Markus Schedl, and 1 more author. In European Conference on Information Retrieval (ECIR), 2021
Although current music recommender systems suggest new tracks to their users, they do not provide listenable explanations of why a user should listen to them. LEMONS (Demonstration video: https://youtu.be/giSPrPnZ7mc) is a new system that addresses this gap by (1) adopting a deep learning approach to generate audio content-based recommendations from the audio tracks and (2) providing listenable explanations based on the time-source segmentation of the recommended tracks using the recently proposed audioLIME.
2020
- Pandemics, Music, and Collective Sentiment: Evidence from the Outbreak of COVID-19. Meijun Liu, Eva Zangerle, Xiao Hu, and 2 more authors. In Proceedings of the 21st International Society for Music Information Retrieval Conference (ISMIR), 2020
The COVID-19 pandemic causes a massive global health crisis and produces substantial economic and social distress, which in turn may cause stress and anxiety among people. Real-world events play a key role in shaping collective sentiment in a society. As people listen to music daily everywhere in the world, the sentiment of music being listened to can reflect the mood of the listeners and serve as a measure of collective sentiment. However, the exact relationship between real-world events and the sentiment of music being listened to is not clear. Driven by this research gap, we use the unexpected outbreak of COVID-19 as a natural experiment to explore how users’ sentiment of music being listened to evolves before and during the outbreak of the pandemic. We employ causal inference approaches on an extended version of the LFM-1b dataset of listening events shared on Last.fm, to examine the impact of the pandemic on the sentiment of music listened to by users in different countries. We find that, after the first COVID-19 case in a country was confirmed, the sentiment of artists users listened to becomes more negative. This negative effect is pronounced for males while females’ music emotion is less influenced by the outbreak of the COVID-19 pandemic. We further find a negative association between the number of new weekly COVID-19 cases and users’ music sentiment. Our results provide empirical evidence that public sentiment can be monitored based on collective music listening behaviors, which can contribute to research in related disciplines.
- Personality Bias of Music Recommendation Algorithms. Alessandro B. Melchiorre, Eva Zangerle, and Markus Schedl. In Fourteenth ACM Conference on Recommender Systems (RecSys), 2020
Recommender systems, like other tools that make use of machine learning, are known to create or increase certain biases. Earlier work has already unveiled different performance of recommender systems for different user groups, depending on gender, age, country, and consumption behavior. In this work, we study user bias in terms of another aspect, i.e., users’ personality. We investigate to what extent state-of-the-art recommendation algorithms yield different accuracy scores depending on the users’ personality traits. We focus on the music domain and create a dataset of Twitter users’ music consumption behavior and personality traits, measuring the latter in terms of the OCEAN model. Investigating recall@K and NDCG@K of the recommendation algorithms SLIM, embarrassingly shallow autoencoders for sparse data (EASE), and variational autoencoders for collaborative filtering (Mult-VAE) on this dataset, we find several significant differences in performance between user groups scoring high vs. groups scoring low on several personality traits.
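For readers unfamiliar with the two metrics named in the abstract above, here is a minimal sketch of recall@K and NDCG@K with binary relevance. It follows common textbook definitions and is not necessarily the exact variant used in the paper (some works, for instance, normalize recall by min(K, number of relevant items)); the item IDs are hypothetical and the set of relevant items is assumed to be non-empty.

```python
import math


def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant items that appear in the top-k of the ranking."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k: DCG of the ranking divided by the ideal DCG."""
    dcg = sum(
        1.0 / math.log2(rank + 2)
        for rank, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(k, len(relevant))
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg


# Hypothetical recommendation list and held-out relevant items for one user.
ranked = ["a", "b", "c", "d"]
relevant = {"a", "c", "e"}

recall = recall_at_k(ranked, relevant, k=3)
ndcg = ndcg_at_k(ranked, relevant, k=3)
print(recall, ndcg)
```

Averaging such per-user values within groups (for example, users scoring high vs. low on a trait) is one straightforward way to compare performance across groups.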
- Personality Correlates of Music Audio Preferences for Modelling Music Listeners. Alessandro B. Melchiorre and Markus Schedl. In Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization (UMAP), 2020
Past studies have shown that personality has a significant association with user behaviour and preferences, not least towards music. This makes personality information a promising aspect for user modelling in personalised recommender systems and similar domains. In contrast to existing studies, which investigate personality correlates of music preferences via genres or styles, we study such correlates by modelling music preferences at a finer-grained content level, using audio features of the music users listen to. Leveraging listening and personality information of more than 1,300 Last.fm users, we identify several significant medium and weak correlations between music audio features and personality traits, the latter defined by the five-factor model. Our results provide useful insights into the relationship between personality and music preference, which can be valuable for music recommender systems in terms of more personalised recommendations.