Aligning Recommender Systems as Cause Area

post by IvanVendrov · 2019-05-08T08:56:14.686Z · score: 65 (29 votes) · EA · GW · 14 comments

Contents

  Cause Area Definition
    Recommender Systems
    Aligning Recommender Systems
  Connection with AGI Alignment
    Overlapping Technical Subproblems
      Robustness to Adversarial Manipulation
      Understanding preferences and values from natural language
      Semi-supervised learning from human feedback
      Learning to communicate to humans
      Understanding Human Factors
    Risks from Aligning Recommender Systems
      False Confidence
      Dual Use
      Perils of Partial Alignment
  Cause Prioritization Analysis
    Scale
    Neglectedness
    Solvability
    Overall Importance
  Key Points of Uncertainty
  How You Can Contribute
14 comments

By Ivan Vendrov and Jeremy Nixon


Most recent conversations about the future focus on the point where technology surpasses human capability. But they overlook a much earlier point where technology exceeds human vulnerabilities.

The Problem, Center for Humane Technology.

The short-term, dopamine-driven feedback loops that we have created are destroying how society works.

Chamath Palihapitiya, former Vice President of user growth at Facebook.


The most popular recommender systems - the Facebook news feed, the YouTube homepage, Netflix, Twitter - are optimized for metrics that are easy to measure and improve, such as number of clicks, time spent, and number of daily active users, which are only weakly correlated with what users care about. One of the most powerful optimization processes in the world is being applied to increase these metrics, involving thousands of engineers, the most cutting-edge machine learning technology, and a significant fraction of global computing power. The result is software that is extremely addictive, with a host of hard-to-measure side effects on users and society including harm to relationships, reduced cognitive capacity, and political radicalization.

In this post we argue that improving the alignment of recommender systems with user values is one of the best cause areas available to effective altruists, particularly those with computer science or product design skills.

We’ll start by explaining what we mean by recommender systems and their alignment. Then we’ll detail the strongest argument in favor of working on this cause: the likelihood that working on aligned recommender systems will have positive flow-through effects on the broader problem of AGI alignment. We then conduct a (very speculative) cause prioritization analysis, and conclude with key points of remaining uncertainty as well as some concrete ways to contribute to the cause.

Cause Area Definition

Recommender Systems

By recommender systems we mean software that assists users in choosing between a large number of items, usually by narrowing the options down to a small set. Central examples include the Facebook news feed, the YouTube homepage, Netflix, Twitter, and Instagram. Less central examples are search engines, shopping sites, and personal assistant software which require more explicit user intent in the form of a query or constraints.

Aligning Recommender Systems

By aligning recommender systems we mean any work that leads widely used recommender systems to align better with user values. Central examples of better alignment would be recommender systems which

What interventions would best lead to these improvements? Prioritizing specific interventions is out of scope for this essay, but plausible candidates include:

Concrete Examples of How Recommender Systems Could be More Aligned

Connection with AGI Alignment

Risk from the development of artificial intelligence is widely considered one of the most pressing global problems and positively shaping the development of AI is one of the most promising cause areas for effective altruists.

We argue that working on aligning modern recommender systems is likely to have large positive spillover effects on the bigger problem of AGI alignment. There are a number of common technical sub-problems whose solution seems likely to be helpful for both. But since recommender systems are so widely deployed, working on them will lead to much tighter feedback loops, allowing more rapid winnowing of the space of ideas and solutions, faster build-up of institutional knowledge and better-calibrated researcher intuitions. In addition, because of the massive economic and social benefits of increasing recommender system alignment, it’s reasonable to expect a snowball effect of increased funding and research interest after the first successes.

In the rest of this section we review these common technical sub-problems, and specific benefits from approaching them in the context of recommender systems. We then briefly consider ways in which working on recommender system alignment might actually hurt the cause of AGI alignment. But the most serious objection to our argument from an EA perspective is lack of neglectedness: recommender system alignment will happen anyway, so it’s differentially more important to work on other sub-problems of AGI alignment. We discuss this objection more below in the section on Cause Prioritization.

Overlapping Technical Subproblems

Robustness to Adversarial Manipulation

Robustness - ensuring ML systems never fail catastrophically even on unseen or adversarially selected inputs - is a critical subproblem of AGI safety. Many solutions have been proposed, including verification, adversarial training, and red teaming, but it’s unclear how to prioritize between these approaches.

Recommender systems like Facebook, Google Search, and Twitter are under constant adversarial attack by the most powerful organizations in the world, such as intelligence agencies trying to influence elections and companies doing SEO for their websites. These adversaries can conduct espionage, exploit zero-day vulnerabilities in hardware and software, and draw on resources far in excess of any realistic internal red team. There is no better test of robustness today than deploying an aligned recommender system at scale; trying to make such systems robust will yield a great deal of useful data and intuition for the larger problem of AGI robustness.

Understanding preferences and values from natural language

There are a few reasons to think that better natural language understanding differentially improves alignment for both recommender systems and AGI.

First, given how strongly the performance of deep learning systems scales with data size, it seems plausible that the sheer number of bits of human feedback ends up being a limiting factor in the alignment of most AI systems. Since language is the highest bandwidth supervisory signal (in bits/second) that individual humans can provide to an ML system, and linguistic ability is nearly universal, it is probably the cheapest and most plentiful form of human feedback.

More speculatively, natural language may have the advantage of quality as well as quantity - since humans seem to learn values at least partly through language in the form of stories, myths, holy books, and moral claims, natural language may be an unusually high-fidelity representation of human values and goals.

Semi-supervised learning from human feedback

Since it’s plausible that AGI alignment will be constrained by the amount of high-quality human feedback we can provide, a natural subproblem is making better use of the labels we get via semi-supervised or weakly supervised learning. Proposals along these lines include Paul Christiano’s Semi-supervised RL and what the authors of Concrete Problems in AI Safety call “Scalable Oversight”. One especially promising approach to the problem is active learning, where the AI helps select which examples need to be labelled.
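As a rough illustration of the active learning idea, here is a minimal sketch of uncertainty sampling, one common way of letting the model choose which examples to send for labelling. It is not taken from any of the proposals cited above; the item names, the predicted probabilities, and the helper `select_for_labelling` are all hypothetical.

```python
import math
from typing import Dict, List


def binary_entropy(p: float) -> float:
    """Entropy in bits of a yes/no prediction made with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def select_for_labelling(predictions: Dict[str, float], budget: int) -> List[str]:
    """Return the `budget` item ids whose predictions are most uncertain."""
    ranked = sorted(
        predictions,
        key=lambda item: binary_entropy(predictions[item]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs: probability that the user would endorse each item.
predicted_endorsement = {
    "video_a": 0.97,
    "video_b": 0.52,
    "video_c": 0.10,
    "video_d": 0.40,
}

# Ask the human only about the items the model is least sure of.
queries = select_for_labelling(predicted_endorsement, budget=2)
print(queries)
```

The point of the sketch is only that scarce human feedback gets spent where the model's prediction is closest to a coin flip, rather than on items it already predicts confidently.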

What are the advantages for studying semi-supervised learning in the context of recommender systems? First, because these systems are used by millions of people, they have plentiful human feedback of varying quality, letting us test algorithms at much more realistic scales than gridworlds or MuJoCo. Second, because recommender systems are a large part of many people’s lives, we expect that the feedback we get would reflect more of the complexity of human values. It seems plausible that we will need qualitatively different approaches to achieve human goals like “become physically fit” or “spend more time with my friends” than for simple goals in deterministic environments.

Learning to communicate to humans

It seems very likely that both aligned recommender systems and aligned AGI require bidirectional communication between humans and AI systems, not just a one-way supervisory signal from humans to AI. In particular, safe AI systems may need to be interpretable - to provide accurate explanations of the choices they make. They may also need to be corrigible, which among other properties requires them to actively communicate with users to elicit and clarify their true preferences.

Recommender systems seem a fertile ground for exploring and evaluating different approaches for interpretability and bidirectional communication with humans, especially in the context of conversational search and recommenders.

Understanding Human Factors

In AI Safety Needs Social Scientists, Geoffrey Irving and Amanda Askell make the case that prioritizing technical approaches to AI safety requires deeper empirical understanding of human factors. The biases, weaknesses, strengths, introspection ability, information-processing and communication limitations of actual humans and human institutions seem critical to evaluating the most promising AGI alignment proposals such as debate, amplification, and recursive reward modeling.

We agree that running human studies is likely to be valuable for future AI safety research. But we think equally valuable information could be acquired by deploying and studying aligned recommender systems. Recommender systems maintain the largest datasets of actual real-world human decisions. They have billions of users, many of whom would be willing to use experimental new interfaces for fun or for the promise of better long-term outcomes. Recommender systems are also a fertile ground for testing new social and institutional schemes of human-AI collaboration. Just in the domain of reliably aggregating human judgments (likely a key subproblem for debate and amplification) they are constantly experimenting with new techniques, from collaborative filtering to various systems for eliciting and aggregating reviews, ratings, and votes. AI safety needs social scientists, definitely - but it also needs product designers, human-computer interaction researchers, and business development specialists.

Risks from Aligning Recommender Systems

In what ways could working on recommender system alignment make AI risks worse?

False Confidence

One plausible scenario is that widespread use of aligned recommender systems instills false confidence in the alignment of AI systems, increasing the likelihood and severity of a catastrophic treacherous turn, or a slow but unstoppable trend towards the elimination of human agency. Currently the public, media, and governments have a healthy skepticism towards AI systems, and there is a great deal of pushback against using AI systems even for fairly limited tasks like criminal sentencing, financial trading, and medical decisions. But if recommender systems remain the most influential AI systems on most people’s lives, and people come to view them as highly empathetic, transparent, robust, and beneficial, skepticism will wane and increasing decision-making power will be concentrated in AI hands. If the techniques developed for aligning recommender systems don’t scale - i.e. stop working after a certain threshold of AI capability - then we may have increased overall AI risk despite making great technical progress.

Dual Use

Aligned recommender systems may be a strongly dual-use technology, enabling companies to optimize more powerfully for objectives besides alignment, such as creating even more intensely addictive products. An optimization objective that allows you to turn down anger also allows you to turn up anger; ability to optimize for users’ long term goals implies ability to insinuate yourself deeply into users’ lives.

Greater control over these systems also creates dual use censorship concerns, where organizations could dampen the recommendation of content that is negative towards them.

Perils of Partial Alignment

Working on alignment of recommender systems might simply get us worse and harder to detect versions of misalignment. For example, many ideas can’t be effectively communicated without creating an emotion or negative side effect that a partially aligned system may look to suppress. Highly warranted emotional responses (e.g. anger at failures to plan for Hurricane Katrina, or in response to genocide) could be improperly dampened. Political positions that consistently create undesirable emotions would also be suppressed, which may or may not be better than the status quo of promoting political positions that generate outrage and fear.

Cause Prioritization Analysis

Predictions are hard, especially about the future, especially in the domain of economics and sociology. So we will describe a particular model of the world which we think is likely, and do our analysis assuming that model. It’s virtually certain that this model is wrong, and fairly likely (~30% confidence) that it is wrong in a way that dramatically undermines our analysis.

The key question any model of the problem needs to answer is - why aren’t recommender systems already aligned? There are a lot of possible contingent reasons, for instance that few people have thought about it, and the few who did were not in a position to work on it. But the efficient market hypothesis implies there isn’t a giant pool of economic value lying around for anyone to pick up. That means at least one of the following structural reasons is true:

  1. Aligned recommender systems aren’t very economically valuable.
  2. Aligning recommender systems is extremely difficult and expensive.
  3. A solution to the alignment problem is a public good in which we expect rational economic actors to underinvest.

Our model says it’s a combination of (2) and (3). Notice that Google didn’t invent or fund AlexNet, the breakthrough paper that popularized image classification with deep convolutional neural networks - but it was quick to invest immense resources once the breakthrough had been made. Similarly with Monsanto and CRISPR.

We think aligning recommender systems follows the same pattern - there are still research challenges that are too hard and risky for companies to invest significant resources in. The challenges seem interdisciplinary (involving insights from ML, human-computer interaction, product design, social science) which makes it harder to attract funding and academic interest. But there is a critical threshold at which the economic incentives towards wide adoption become overpowering. Once the evidence that aligned recommender systems are practical and profitable reaches a certain threshold, tech companies and venture capitalists will pour money and talent into the field.

If this model is roughly correct, aligned recommender systems are inevitable - the only question is, how much can we speed up their creation and wide adoption? More precisely, what is the relationship between additional resources invested now and the time it takes us to reach the critical threshold?

The most optimistic case we can imagine is analogous to AlexNet - a single good paper or prototype, representing about 1-3 person-years invested, manages a conceptual breakthrough and triggers a flood of interest that brings the time-to-threshold 5 years closer.

The most pessimistic case is that the time-to-threshold is not constrained at the margin by funding, talent or attention; perhaps sufficient resources are already invested across the various tech companies. In that case additional resources will be completely wasted.

Our median estimate is that a small research sub-field (involving ~10-30 people over 3-5 years) could bring the critical threshold 3 years closer.

Assuming this model is roughly right, we now apply the Scale-Neglectedness-Solvability framework for cause prioritization (also known as ITN - Importance, Tractability, Neglectedness) as described by 80000 Hours.

Scale

The easiest problem to quantify is the direct effect on quality of life while consuming content from recommender systems. In 2017 Facebook users spent about 1 billion hours / day on the site; YouTube also claims more than a billion hours a day in 2019. Netflix in 2017 counted 140 million hours per day. Not all of this time is powered by recommender systems, but 2.4 billion user hours / day = 100 million user years / year is a reasonably conservative order of magnitude estimate.

What is the difference in experienced wellbeing in time on current recommender systems vs aligned recommender systems? 1% seems conservative, leading to 1 million QALYs lost every year simply from time spent on unaligned recommender systems.
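The arithmetic behind these two figures can be checked with a short sketch. The inputs are the numbers quoted above; the variable names are ours, and treating one user-year at 1% lower wellbeing as 0.01 QALYs lost is the simplifying assumption the estimate relies on.

```python
HOURS_PER_DAY = 24

# Figures quoted in the post.
user_hours_per_day = 2.4e9   # order-of-magnitude total across platforms
wellbeing_gap = 0.01         # assumed 1% difference in experienced wellbeing

# Dividing hours per day by 24 gives user-days consumed per day,
# which is the same quantity as user-years consumed per year.
user_years_per_year = user_hours_per_day / HOURS_PER_DAY

# Assumption: a user-year spent at 1% lower wellbeing costs 0.01 QALYs.
qalys_lost_per_year = user_years_per_year * wellbeing_gap

print(f"user-years per year: {user_years_per_year:,.0f}")
print(f"QALYs lost per year: {qalys_lost_per_year:,.0f}")
```

Because the result is linear in both inputs, halving the hours or doubling the wellbeing gap moves the estimate by the same factor.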

It’s likely that the flow-through effects on the rest of users’ lives will be even greater, if the studies showing effects on mental health, cognitive function, and relationships hold up, and if aligned recommender systems are able to significantly assist users in achieving their long term goals. Even more speculatively, if recommender systems are able to align with users’ extrapolated volition this may also have flow-through effects on social stability, wisdom, and long-termist attitudes in a way that helps mitigate existential risk.

It’s much harder to quantify the scale of the AGI alignment problem, insofar as aligning recommender systems helps solve it; we will defer to 80000 Hours’ estimate of 3 billion QALYs per year.

Neglectedness

Culturally there’s a lot of awareness of the problems with unaligned recommender systems, so the amount of potential support to draw on seems high. Companies like Google and Facebook have announced initiatives around Digital Wellbeing and Time Well Spent, but it’s unclear how fundamental these changes are. There are some nonprofits like Center for Humane Technology working on improving incentives for companies to adopt aligned recommenders, but none to our knowledge working on the technical problem itself.

How many full-time employees are dedicated to the problem? At the high end, we might count all ML, product, data analysis, and UI work on recommender systems as having some component of aligning with user values, in which case there are on the order of thousands of people working on the problem globally. We estimate the number that are substantially engaging with the alignment problem (as opposed to improving user engagement) full-time is at least an order of magnitude lower, probably less than 100 people globally.

Solvability

The direct problem - unaligned recommender systems making their users worse off than they could be - seems very solvable. There are many seemingly tractable research problems to pursue, lots of interest from the media and wider culture, and clear economic incentives for powerful actors to throw money at a clear and convincing technical research agenda. It seems like a doubling of direct effort (~100 more people) would likely solve a large fraction of the problem, perhaps all of it, within a few years.

For the AGI alignment problem, 80000 Hours’ estimate (last updated in March 2017) is that doubling the effort, which they estimate as $10M annually, would reduce AI risk by about 1%. Given the large degree of technical overlap, it seems plausible that solving aligned recommender systems would solve 1-10% of the whole AGI alignment problem, so I’ll estimate the flow-through reduction in AI risk at 0.01 - 0.1%.
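The flow-through range is just the product of the two estimates above, as this small sketch shows. The variable names are ours, and multiplying the two fractions assumes that solving a given share of the alignment problem buys the same share of the risk reduction.

```python
# 80000 Hours' estimate quoted above: doubling effort cuts AI risk by about 1%.
risk_reduction_from_doubling = 0.01

# Assumed share of the AGI alignment problem that aligned recommenders would solve.
overlap_low, overlap_high = 0.01, 0.10

# Assumption: risk reduction scales in proportion to the share of the problem solved.
flow_through_low = risk_reduction_from_doubling * overlap_low
flow_through_high = risk_reduction_from_doubling * overlap_high

print(f"flow-through reduction in AI risk: {flow_through_low:.2%} to {flow_through_high:.2%}")
```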

Overall Importance

Ivan's Note: I have very low confidence that these numbers mean anything. In the spirit of If It's Worth Doing, It's Worth Doing With Made-Up Statistics, I’m computing them anyway. May Taleb have mercy on my soul.

Converting all the numbers above into the 80000 Hours logarithmic scoring system for problem importance, we get the following overall problem scores. We use [x,y] to denote an interval of values.

Problem                        Scale   Neglectedness   Solvability   Total
Unaligned Recommenders         8       [6,8]           [6,7]         [20,23]
Risks from AI (flow-through)   15      [6,8]           [2,3]         [23,26]

The overall range is between 20 and 26, which is coincidentally about the range of the most urgent global issues as scored by 80000 Hours, with climate change at 20 and risks from artificial intelligence at 27.
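Since the scores are logarithmic, each total is the sum of the three component scores, with intervals added endpoint by endpoint. A minimal sketch of that bookkeeping follows; the helper names `point` and `add_intervals` are ours, not part of the 80000 Hours framework.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]


def point(x: int) -> Interval:
    """Represent a single score as a zero-width interval."""
    return (x, x)


def add_intervals(parts: List[Interval]) -> Interval:
    """Add intervals by summing lower bounds and upper bounds separately."""
    return (sum(lo for lo, _ in parts), sum(hi for _, hi in parts))


# Scale, Neglectedness, Solvability scores from the table above.
scores: Dict[str, List[Interval]] = {
    "Unaligned Recommenders": [point(8), (6, 8), (6, 7)],
    "Risks from AI (flow-through)": [point(15), (6, 8), (2, 3)],
}

totals = {name: add_intervals(parts) for name, parts in scores.items()}
overall = (min(lo for lo, _ in totals.values()), max(hi for _, hi in totals.values()))

for name, (lo, hi) in totals.items():
    print(f"{name}: [{lo},{hi}]")
print(f"overall range: [{overall[0]},{overall[1]}]")
```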

Key Points of Uncertainty

A wise man once said to think of mathematical proofs not as a way to be confident in our theorems, but as a way to focus our doubts on the assumptions. In a similar spirit, we hope this essay serves to focus our uncertainties about this cause area on a few key questions:

  1. Could aligning weak AI systems such as recommenders be net harmful due to the false confidence it builds? Are there ways of mitigating this effect?
  2. When will aligned recommender systems emerge, if we don’t intervene? If the answer is “never”, why? Why might aligned recommender systems not emerge in our economic environment, despite their obvious utility for users?
  3. What fraction of the whole AGI alignment problem would robustly aligning recommender systems with roughly modern capabilities solve? We estimated 1-10%, but we can imagine worlds in which it’s 0.1% or 90%.
  4. What is the direct cost that unaligned recommender systems are imposing on people’s lives? With fairly conservative assumptions we estimated 1 million QALYs per year, but we could easily see it being two orders of magnitude more or less.

How You Can Contribute

Machine learning researchers, software engineers, data scientists, policymakers, and others can immediately contribute to the goal of aligning recommender systems.

14 comments


comment by PeterMcCluskey · 2019-05-10T23:48:51.872Z · score: 12 (5 votes) · EA · GW

I suspect that principal–agent problems are the biggest single obstacle to alignment. That leads me to suspect it's less tractable than you indicate.

I'm interested in what happened with Netflix. Ten years ago their recommendation system seemed focused almost exclusively on maximizing user ratings of movies. That dramatically improved my ability to find good movies.

Yet I didn't notice many people paying attention to those benefits. Netflix has since then shifted toward less aligned metrics. I'm less satisfied with Netflix now, but I'm unclear what other users think of the changes.

comment by aarongertler · 2019-05-09T02:07:26.067Z · score: 12 (7 votes) · EA · GW

This is fantastic. I don't have high confidence in the numbers you've put forth (for example, it's hard to compare QALYs from "more entertainment"/"better articles" to QALYs from "no malaria"), but I love the way this post was put together:

  • Lots of citations (to a stunning variety of sources; it feels like you've been thinking about these questions for a long time)
  • Careful analysis of what could go wrong
  • Willingness to use numbers, even if they are made up

Even putting aside flow-through effects on alignment, I think that "microtime" is important. Even saving people a few minutes of wasted time each day can be hugely beneficial at scale (especially if that time is replaced with something that fits a user's extrapolated volition). Our lives are made up of the way we spend each hour, and we could certainly be having better hours.

In a world where this is not a promising cause area, even if the risks turn out not to be a concern, I think the most likely cause of "failure" would be something like regulatory capture, where people enter large tech companies hoping to better their algorithms but get swept up by existing incentives. I'd guess that many people who already work at FANG companies entered with the goal of improving users' lives and slowly drifted away -- or came to believe that metrics companies now use are in fact improving users' lives to a "sufficient" extent.

(If you spend all day at Netflix, and come to think of TV as a golden wonderland of possibility, why not work to get people spending as much time as possible watching TV?)

It's possible that these employees still generally feel bad about optimizing for bad metrics, but however they feel, it hasn't yet added up to deliberate anti-addictive properties for any of the biggest tech companies (as far as I'm aware). It would be nice to see evidence that people have successfully advocated for these changes from the inside (Mark Zuckerberg has recently made some noises about trying to improve the situation on Facebook, but I'm not sure how much of that is due to pressure from inside Facebook vs. external pressure or his own feelings).

"...including harm to relationships, reduced cognitive capacity, and political radicalization."

The first two links are identical; was that your intention?

Recommender systems often have facilities for deep customization (for instance, it's possible to tell the Facebook News Feed to rank specific friends’ posts higher than others) but the cognitive overhead of creating and managing those preferences is high enough that almost nobody uses them.

In addition to work on improved automated recommendation systems, it seems like there should be valuable projects out there that focus on getting more people to exercise their existing control over present-day systems (e.g. an app that gamifies changing your newsfeed settings, apps that let you more easily set limits for how you'll spend time online).

Examples:

  • FB Purity claims to have over 450,000 users; even if only 100,000 are currently blocking their own newsfeeds, that probably represents ~10,000,000 hours each year spent somewhere other than Facebook.
  • StayFocusd has saved me, personally, thousands of hours on things my extrapolated volition would have regretted.

comment by IvanVendrov · 2019-05-09T03:37:20.056Z · score: 3 (2 votes) · EA · GW

"The first two links are identical; was that your intention?"

Thanks for the catch - fixed.

comment by William_Saunders · 2019-05-12T21:58:24.693Z · score: 11 (5 votes) · EA · GW

If we want to maximize flow-through effects to AI Alignment, we might want to deliberately steer the approach adopted for aligned recommender systems to one that is also designed to scale to more difficult problems/more advanced AI systems (like Iterated Amplification). Having an idea become standard in the world of recommender systems could significantly increase the amount of non-safety researcher effort put towards that idea. Solving the problem a bit earlier with a less scalable approach could close off this opportunity.

comment by Misha_Yagudin · 2019-05-08T11:26:34.690Z · score: 9 (5 votes) · EA · GW

re: How You Can Contribute

Center for Humane Technology is hiring for 5 positions: Managing Director, Head of Humane Design Programs, Manager of Culture & Talent, Head of Policy, Research Intelligence Manager.

comment by John_Maxwell_IV · 2019-05-09T07:23:39.049Z · score: 6 (4 votes) · EA · GW

Good post!

I have a hunch that a big part of the issue here is institutional momentum around maximizing key performance indicators such as daily active users, time spent on platform, etc. Perhaps it will be important to persuade decisionmakers that although optimizing for these metrics helps the bottom line in the short run, in the long run optimizing these to the exclusion of all else hurts the brand, increases the probability of regulatory action or negative "black swan" type events, and risks having the users abandon the product. (I understand that the longer a culture gets exposed to alcohol, the greater the degree it develops "cultural antibodies" to the negative effects of alcohol which allow it to mitigate the harms... decisionmakers should worry that if users don't endorse the time they spend with the product, this hurts the long-term viability of the platform; imagine the formation of a group like Alcoholics Anonymous but for social media, for instance.) I think it'd be good if decisionmakers also started optimizing for key performance indicators like whether users think the product is a benefit to their life personally, whether the product makes society healthier/better off, etc. Or even more specific stuff, like whether users who engage in disagreements tend to come to a consensus vs walking away even angrier than when they started.

With regard to risks, here are some thoughts of mine related to scenarios in which users self-select in their use of these tools. I think maybe what I describe in this comment has already happened though.

comment by Milan_Griffes · 2019-05-08T16:20:02.298Z · score: 4 (2 votes) · EA · GW

Could you say a little bit about how this approach compares to Christiano's Iterated Amplification?

comment by IvanVendrov · 2019-05-08T18:01:42.673Z · score: 1 (1 votes) · EA · GW

To my mind they are fully complementary: Iterated Amplification is a general scheme for AI alignment, whereas this post describes an application area where we could use and learn more about various alignment schemes. I personally think using amplification for aligning recommender systems is very much worth trying. It would have great direct positive effects if it worked, and the experiment would shed light on the viability of the scheme as a whole.

comment by Milan_Griffes · 2019-05-08T18:05:46.446Z · score: 6 (4 votes) · EA · GW

Thanks. I guess I'm fuzzy on what your actual research proposal is.

Are you proposing to implement an Iterated Amplification approach on existing recommender systems?

Or are you more agnostic about specific implementations? ("Hey, better alignment of recommender systems seems important, but we don't yet know what to do about that specifically.")

comment by IvanVendrov · 2019-05-08T18:40:47.344Z · score: 3 (2 votes) · EA · GW

Definitely the latter. Though I would frame it more optimistically as "better alignment of recommender systems seems important, there's a lot of plausible solutions out there, let's prioritize them and try out the few most promising ones". Actually doing that prioritization was out of scope for this post but definitely something we want to do - and are looking for collaborators on.

comment by William_Saunders · 2019-05-18T21:35:15.602Z · score: 3 (3 votes) · EA · GW

While fully understanding a user's preferences and values requires more research, it seems like there are simpler things that could be done by the existing recommender systems that would be a win for users, i.e. Facebook having a "turn off inflammatory political news" switch (or a list of 5-10 similar switches), where current knowledge would suffice to train a classification system.

It could be the case that this is bottlenecked by the incentives of current companies, in that there isn't a good revenue model for recommender systems other than advertising, and advertising creates the perverse incentive to keep users on your system as long as possible. Or it might be the case that most recommender systems are effectively monopolies on their respective content, and users will choose an aligned system over an unaligned one if options are available, but otherwise a monopoly faces no pressure to align their system.

In these cases, the bottleneck might be "start and scale one or more new organizations that do aligned recommender systems using current knowledge" rather than "do more research on how to produce more aligned recommender systems".

comment by IvanVendrov · 2019-05-18T23:27:54.621Z · score: 5 (4 votes) · EA · GW

My mental model of why Facebook doesn't have "turn off inflammatory political news" and similar switches is that 99% of their users never toggle any such switches, so the feature won't affect any of the metrics they track, so no engineer or product manager has an incentive to add it. Why won't users toggle the switches? Part of it is laziness; but mostly I think users don't trust that the system will faithfully give them what they want based on a single short description like "inflammatory political news" - what if they miss out on an important national story? What if a close friend shares a story with them and they don't see it? What if their favorite comedian gets classified as inflammatory and filtered out?

As additional evidence that we're more bottlenecked by research than by incentives, consider Twitter's call for research to measure the "health" of Twitter conversations, and Facebook's decision to demote news content. I believe if you gave most companies a robust and well-validated metric (analogous to differential privacy) for alignment with user value, they would start optimizing for it even at the cost of some short term growth/revenue.

The monopoly point is interesting. I don't think existing recommender systems are well modelled as monopolies; they certainly behave as if they are in a life-and-death struggle with each other, probably because their fundamental product is "ways to occupy your time" and that market is extremely competitive. But a monopoly might actually be better because it wouldn't have the current race to the bottom in pursuit of monetisable eyeballs.

comment by William_Saunders · 2019-05-21T00:49:29.304Z · score: 2 (2 votes) · EA · GW

Appreciate that point that they are competing for time (as I was only thinking of monopolies over content).

If the reason it isn't used is that users don't "trust that the system will give what they want given a single short description", then part of the research agenda for aligned recommender systems is not just producing systems that are aligned, but systems where their users have a greater degree of justified trust that they are aligned (placing more emphasis on the user's experience of interacting with the system). Some of this research could potentially take place with existing classification-based filters.

comment by IvanVendrov · 2019-05-22T03:33:35.037Z · score: 2 (2 votes) · EA · GW

Agreed that's an important distinction. I just assumed that if you make an aligned system, it will become trusted by users, but that's not at all obvious.