Maximizing Fair Content Spread via Edge Suggestion in Social Networks

Paper in VLDB 2022


Abstract:
Content spread inequity is a potential unfairness issue in online social networks, disparately impacting minority groups. In this paper, we view friendship suggestion, a common feature in social network platforms, as an opportunity to achieve an equitable spread of content. In particular, we propose to suggest a subset of potential edges (currently not existing in the network but likely to be accepted) that maximizes content spread while achieving fairness. Instead of re-engineering the existing systems, our proposal builds a fairness wrapper on top of the existing friendship suggestion components. We prove the problem is NP-hard and inapproximable in polynomial time unless P = NP. Therefore, allowing relaxation of the fairness constraint, we propose an algorithm based on LP-relaxation and randomized rounding with fixed approximation ratios on fairness and content spread. We provide multiple optimizations, further improving the performance of our algorithm in practice. In addition, we propose a scalable algorithm that dynamically adds subsets of nodes, chosen via iterative sampling, and solves smaller problems corresponding to these nodes. Beyond the theoretical analysis, we conduct comprehensive experiments on real and synthetic data sets. Across different settings, our algorithms found solutions with near-zero unfairness while significantly increasing the content spread. Our scalable algorithm could process a graph with half a million nodes on a single machine, reducing the unfairness to around 0.0004 while lifting content spread by 43%.
Cite as:
Ian P. Swift, Sana Ebrahimi, Azade Nova, Abolfazl Asudeh. Maximizing Fair Content Spread via Edge Suggestion in Social Networks. VLDB, Vol. 15(11), pages 2692-2705, 2022, VLDB Endowment.
DOI: https://doi.org/10.14778/3551793.3551824
Link to PDF:
VLDB Version: Maximizing Fair Content Spread via Edge Suggestion in Social Networks
Technical Report
Code Repository:
GitHub

Description

Online social networks have become an inseparable aspect of modern human life and a prominent medium for human interaction at scale. These networks play an invaluable role in removing physical communication barriers and enabling the spread of information across the world in real time. However, like other recent technologies, the benefits of social networks have not come for free, as they introduce new social challenges and complications that did not exist before. From privacy issues to misinformation spread and even social media addiction, we keep hearing about concerns and challenges specific to the nature of social networks.

Machine bias has recently become a focal concern in the social aspects of computer science and (big) data technologies, triggering extensive efforts to address it, including data pre-processing techniques, algorithm or model modifications, and output post-processing techniques. Social networks are no exception when it comes to unfairness issues.

Inequity in information spread across social networks is one such potential unfairness challenge, disparately impacting minority groups. While similar issues have been reported in cases such as biased advertising over social networks, fairness in content spread becomes critical when the content relates to healthcare, news and other valuable information, job opportunities, etc.

Despite its importance, fairness in content spread over social networks has been relatively understudied. Existing works take two directions to address the problem: (i) fairness-aware influence maximization (IM), which aims to select the content-spread seed nodes so that a combination of fairness and influence spread is maximized, and (ii) approaches that consider fairness in terms of topological properties of the network, e.g., promoting inter-group connections in order to alleviate unfairness caused by strong alignment within groups.

In this paper, we consider friendship suggestion, a popular feature across social network platforms, in which a subset of potential edges that do not currently exist in the network is suggested to users. We view this as an opportunity to achieve an equitable spread of content in social networks. That is, instead of selecting a subset of potential edges that only maximizes content spread, we aim to account for fairness when suggesting the edges. Our proposal differs from existing work in two ways. First, unlike work on IM, which focuses on selecting fair influential seed nodes on a fixed graph, we focus on fair edge selection with fixed content source nodes. Second, unlike works that aim to directly make the network topology diverse and independent of demographic information, we suggest edges to achieve a new objective: fairness in content spread.

Existing social networks often use sophisticated algorithms for friendship suggestion. Re-engineering these existing algorithms is costly and perhaps impractical. Instead, we propose a fair content spread component that works with any friendship suggestion method, taking its output as the set of potential edges and selecting a subset to suggest so as to achieve fairness.
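The following is a minimal, schematic sketch of this layering, with hypothetical function names (not the paper's code): the existing friendship-suggestion component is treated as a black box that produces candidate edges, and a separate selector (such as the LP-based algorithm described below) decides which of them are actually suggested.

```python
# Schematic sketch of the proposed "fairness wrapper" layering (hypothetical names).
from typing import Callable, List, Tuple

Edge = Tuple[int, int]  # an undirected candidate edge (u, v)

def fair_content_spread_wrapper(
    graph,
    suggest_candidates: Callable[[object], List[Edge]],              # any existing friendship-suggestion method
    select_fair_subset: Callable[[object, List[Edge]], List[Edge]],  # e.g., an LP-based fair selector
) -> List[Edge]:
    """Take candidate edges from the existing system and return the subset to suggest."""
    candidates = suggest_candidates(graph)   # existing component used as-is, not re-engineered
    return select_fair_subset(graph, candidates)
```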

We introduce the Fair Content Spread Maximization (FairCS) problem: given a set of candidate edges, our goal is to select a subset such that (i) it contains a fixed number of edges incident to each node (k friend suggestions per node), (ii) it satisfies fairness, defined on the probability that different demographic groups receive the content, and (iii) it maximizes content spread. To the best of our knowledge, ours is the first work to consider fairness in this context. Unfortunately, not only is the problem NP-hard, it is also impossible to approximate in polynomial time unless P = NP. By allowing approximation on content spread and relaxation of the fairness constraint, we propose a non-trivial randomized approximation algorithm for the FairCS problem based on LP-relaxation and randomized rounding. Our algorithm provides constant approximation ratios on the content spread and fairness of its output, and ours is the first linear program designed for content spread with approximation guarantees on fairness. We propose several optimizations that further help our algorithm be efficient in practice. Because it must solve an LP over the entire problem, our original algorithm does not scale to very large settings. To resolve this, we design a scalable algorithm that dynamically grows the set of considered nodes via sampling: instead of solving one expensive LP that covers the entire problem space, it iteratively solves problems over reasonably small subsets of nodes.
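As a rough illustration of the LP-relaxation and randomized-rounding step (not the paper's exact formulation), the sketch below assumes each candidate edge comes with precomputed estimates of its marginal spread gain for two demographic groups, imposes a per-node budget of k suggestions and a relaxed fairness constraint bounding the gap between the groups' gains, solves the relaxed LP, and then rounds the fractional solution randomly. The toy gains, the fairness slack eps, and the simple independent rounding are all simplifications of what the paper actually does.

```python
# Minimal sketch of LP-relaxation + randomized rounding for fair edge selection.
# Assumes per-edge marginal spread gains per group are already estimated (toy numbers here).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Hypothetical candidate edges (u, v) produced by any friendship-suggestion method.
cand = [(0, 1), (0, 2), (1, 3), (2, 3), (3, 4)]
n_nodes, k = 5, 1                              # number of nodes, per-node suggestion budget
gain_a = np.array([0.6, 0.2, 0.5, 0.1, 0.4])   # assumed marginal spread gain for group A, per edge
gain_b = np.array([0.1, 0.5, 0.2, 0.6, 0.3])   # assumed marginal spread gain for group B, per edge
eps = 0.2                                      # relaxed fairness slack on the group gap

# Objective: maximize total spread gain -> linprog minimizes, so negate the gains.
c = -(gain_a + gain_b)

# Per-node budget: at most k selected candidate edges incident to each node.
A_budget = np.zeros((n_nodes, len(cand)))
for j, (u, v) in enumerate(cand):
    A_budget[u, j] = 1.0
    A_budget[v, j] = 1.0
b_budget = np.full(n_nodes, k)

# Relaxed fairness: |gain_a . x - gain_b . x| <= eps, written as two inequalities.
A_fair = np.vstack([gain_a - gain_b, gain_b - gain_a])
b_fair = np.array([eps, eps])

# Solve the LP relaxation with x in [0, 1]^|cand|.
res = linprog(c,
              A_ub=np.vstack([A_budget, A_fair]),
              b_ub=np.concatenate([b_budget, b_fair]),
              bounds=[(0, 1)] * len(cand),
              method="highs")

# Randomized rounding: keep edge j with probability x_j. The paper's rounding scheme
# (and its approximation guarantees) is more involved; this only shows the basic step.
x = res.x
suggested = [e for j, e in enumerate(cand) if rng.random() < x[j]]
print("fractional LP solution:", np.round(x, 2))
print("suggested edges:", suggested)
```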

Our experiments confirm that our algorithms not only find solutions with near-zero unfairness but, due to the practical effectiveness of LP-relaxation and randomized techniques, also outperform all baselines drawn from a broad range of related research in content spread. We also observed that our algorithm achieves results comparable to an optimal brute-force method on very small graphs. Our scalable algorithm scaled to on the order of half a million nodes on a single workstation in reasonable time, while none of the baselines we used could process half a million nodes within 24 hours, confirming the effectiveness of our scalable approach. On half a million nodes, we observed a decrease from an initial unfairness of 3.9% to 0.04%, while the content spread increased by 43%. We were able to run our original algorithm on settings of up to 4,000 nodes, achieving a 57.99% lift in content spread and an unfairness below 0.0001. On the same setting, our scalable algorithm still achieved a 45.07% lift and an unfairness of 0.0132, with a 489x speedup over our original algorithm. At 4,000 nodes, no baseline produced a higher lift, a lower unfairness, or a lower runtime than our scalable solution.