People who do AI safety research sometimes worry that their research could also contribute to AI capabilities, thereby hastening a possible AI safety disaster. But when might this be a reasonable concern?
We can model researcher i as contributing intellectual resources s_i to safety and c_i to capabilities, both real numbers. Let the total safety investment (across all researchers) be s = Σ_i s_i, and the total capabilities investment be c = Σ_i c_i. Then, we assume that a good outcome is achieved if s > c/k, for some constant k, and a bad outcome otherwise.
The s > c/k assumption could be justified by safety and capabilities research having diminishing returns. You could then have log-uniform beliefs (over some interval) about the level of capabilities c′ required to achieve AGI, and about the amount of safety research c′/k required for a good outcome. Within the support of c′ and c′/k, linearly increasing s/c will linearly increase the chance of safe AGI.
In this model, having a positive marginal impact doesn't require us to completely abstain from contributing to capabilities. Rather, one's impact is positive if one's ratio of safety to capabilities contributions, s_i/c_i, is greater than that of the rest of the world. For example, a 50% safety / 50% capabilities project is marginally beneficial if the rest of the AI world focuses only 3% on safety.
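The threshold model and the marginal-impact condition can be sketched in a few lines (a toy illustration; the function names and example numbers are mine, not from the original):

```python
def good_outcome(s, c, k):
    """Good outcome iff total safety investment s exceeds c/k."""
    return s > c / k

def marginal_impact_positive(s_i, c_i, s_rest, c_rest):
    """A researcher raises the world's s/c ratio iff their own
    safety:capabilities ratio beats that of the rest of the world."""
    return s_i / c_i > s_rest / c_rest

# Example: a 50% safety / 50% capabilities project,
# in a world that otherwise focuses only 3% on safety.
print(marginal_impact_positive(0.5, 0.5, 3, 97))  # positive marginal impact
```

The condition follows because adding (s_i, c_i) to the world's totals moves the aggregate ratio s/c toward s_i/c_i, so the aggregate ratio rises exactly when one's own ratio exceeds the rest of the world's.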
If the AI world does only focus 3% on safety, then when is nervousness warranted? Firstly, technical researchers might make a big capabilities contribution if they are led to fixate on dangerous schemes that lie outside of current paradigms, like self-improvement perhaps. This means that MIRI's concerns about information security are not obviously unreasonable. Secondly, AI timeline research could lead one to understand the roots of AI progress, and thereby set in motion a wider trend toward more dangerous research. This could justify worries about the large compute experiments of OpenAI. It could also justify worries about the hypothetical future in which an AIS person launches a large AI project for the government. Personally, I think it's reasonable to worry about cases like these breaching the 97% barrier.
It is a high bar, however. And I think in the case of a typical AI safety researcher, these worries are a bit overblown. In this 97%-capabilities world, the median person should worry a bit less about abstaining from contributing to safety, and a bit more about the size of their contribution to safety.
My advice for math: it's often possible to think you understand something when you don't, so it's good to do at least some exercises. Also, the methodology and general "mathematical maturity" are often what you'll reuse the most in research - being able to reason by following specific allowed/disallowed steps, and knowing that you can come to understand a claim by reading Wolfram MathWorld, Wikipedia, textbooks, etc. So to some extent what matters is not so much what you learn, as that you learn something well. Having said that, the first half of a math textbook tends to be much more useful than the second half - there are diminishing returns within each subfield.
For programming, the same is often true - what you're aiming for is a general sort of maturity, and comfort with debugging and building programs. So you probably want to mostly read tutorials for the initial few weeks, then mostly do a project after that.
In both cases, I agree that a tutor is super-useful for getting unstuck, if you have the luxury of being able to afford one.
People often tell me that they encountered EA because they were Googling "How do I choose where to donate?", "How do I choose a high-impact career?" and so on. Has anyone considered writing up answers to these questions as WikiHow instructionals? It seems like they could attract a pretty good amount of traffic to EA research and the EA community in general.
I like the idea of an EA newspaper or magazine, and agree with using it to grow the EA community. But I think this pitch is somewhat inward-looking and unambitious. Moreover, journalism is the wrong business to be in for mitigating negative coverage. Posting a rebuttal in a magazine is going to increase the exposure of criticism and pushback, as will the existence of a magazine in general. Posting B-tier profiles is a very indirect way to push back against elitism, and would not attract readers. An outlet should choose a content niche that people want to read, not just what you want them to read, and B-tier profiles seem like an example of the latter.
The question, then, is what content niche would some people be eager to read about, that we are equipped to do, and want to tell them about. What topics have EAs written about previously, that lots of people have wanted to read? I can think of some possibilities:
For a broader, less inward-looking paper, I don't know exactly the right name, but I don't think "The Altruist" is it.
I think that you should engage more seriously with the case of Future Perfect. Is it succeeding? What is its niche? What has gone well/poorly? What other niches do they think might be out there? And so on.
You also need to engage more seriously with the question of where you would find talent. Who would write for this outlet? Who could be the editor? In order to excite that founding team, you might need to give them a lot of leeway in shaping its direction.
I agree with you, and with Issa that insofar as it's just a series of readings and discussions, "fellowship" is misleading.
And I agree with the OP that it's good to fund people, to incentivise students to learn and contribute. But I think paying reading group attendees is a weird place to start. Better to pay tutors, offer prizes, fund people if they recruit people to EA jobs, and so on.
I'd rather keep the EA Forum as far away from images/video/audio as possible, so that it can best support serious discourse. There are better ways to widen the reach of EA, like:
creating a magazine, like Works in Progress, Unherd, etc. Mostly this is about creating content.
using non-EA platforms, i.e. we go against our programmer instincts (wanting to build a platform) and moderator instincts (trying to regulate conversation) and just communicate on Twitter etc. Again, mostly content.
promoting simplified versions of content to get more views: promoting a paper with a Medium post, a blog post on Twitter, a Tweet with a meme, etc.
None of these is perfect, or even an uncontroversially good idea, but I think they're much better than trying to fit a round peg into a square hole by modifying the Forum into something more like Web 2.0, or a classic subreddit. In general, I find people to be systematically unimaginative about how to promote EA, and they fixate on existing entities like the Forum more than makes sense: Forum prizes (rather than prizes for content anywhere), meme pages on the Forum (rather than on r/effectivealtruism), videos on the forum, et cetera. The Forum is great for what it does, but it makes little sense to try to shoehorn such ambitions into it, when there are so many other possible ways to post and organise content.
Why do you think there is a pro-UK/US bias? In data-driven rankings in AI (Shanghairankings, CSrankings, some academic studies), I haven't noticed any. Rather, UK, US, & Can rank higher than ANZ+elsewhere, as they should. Maybe you are just talking about poor rating systems like Times/QS?
I think the math is going to be roughly that if 1/3 of the prizes go to schools 1-10, 1/3 to schools 11-100, and 1/3 to schools 101-onwards, then the hit rate (in terms of prizewinners) goes up by an order of magnitude each time you narrow your target audience. So if you're going to target non-elite schools, and you can't fully support hundreds of schools, you'd want to do that outreach at least somewhat more cheaply - making books available or something.
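A rough sketch of that arithmetic, assuming (my number, purely for illustration) a tail of ~1000 schools beyond rank 100:

```python
# Each tier wins a third of the prizes, but contains very different
# numbers of schools, so the per-school hit rate falls off sharply.
tiers = {"schools 1-10": 10, "schools 11-100": 90, "schools 101+": 1000}
share_of_prizes = 1 / 3

hit_rates = {name: share_of_prizes / n for name, n in tiers.items()}
for name, rate in hit_rates.items():
    print(f"{name}: {rate:.4f} prizewinners per school")
```

The per-school rate drops roughly an order of magnitude at each step (9x from the first tier to the second, ~11x from the second to the third under this assumed tail), which is the sense in which narrowing the target audience multiplies the hit rate.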
Re target universities, I wonder if UCLA, CMU, JHU, and Cornell could also be interesting, based on Shanghairankings, and strength in AI. Though I don't know about their undergrad programs in particular.
Is it accurate to say that when Pablo was working on this, there was 1 non-Pablo hour for every 1 Pablo hour? And now there are a similar number of non-Pablo hours happening as before, for no Pablo hours, on an ongoing basis?
To what degree do you think the x-risk research community (of ~100 people) collectively decreases x-risk? If I knew this, then I would have a rough estimate of the value of an average x-risk researcher.
OK, then it sounds like a tricky judgment call. I guess you could ask: compared to most technically-minded EA students, do you have a comparative advantage in social skills vs coding, or the reverse? And are your community-related job offers more selective and impressive than the software ones, or the reverse?
My questions would be: do you want to do community building and EAR projects in 5-10 years, and do you really have great community-building opportunities now (good org, good strategy for impact, some mentorship)?
If yes for both, then doing community-building looks good. It aids your trajectory more than software would. It won't sacrifice much career/financial security, since software is not that credential-based; you can return to it later. And going into community building now won't raise any more eyebrows than doing it in two years would.
If no for one or both questions, e.g. if you see yourself as suited to AIS engineering, then software might be better.
Getting advice on a job decision, efficiently (five steps)
When using EA considerations to decide between job offers, asking for help is often a good idea, even if those who could provide advice are busy and their time is valued. This is because advisors can spend minutes of their time to guide years of yours. It's not disrespecting their "valuable" time, if you do it right. I've had some experience both as an advisor and as an advisee, and I think a safe bet is to follow these steps:
Make sure you actually have a decision that will concretely guide months to years of your time, i.e. ask about which offer to take, not which company to apply to.
Distill the pros, cons, and neutral attributes of each option down to a page or two of text, in a format that permits inline comments (ideally a GDoc). Specifically:
To begin with, give a rough characterization of each option, describing it in neutral terms.
Do mention non-EA considerations e.g. location preferences, alongside EA-related ones.
Remove duplicates. If something is listed as a "pro" for option A, it need not also be listed as a "con" for option B. This helps with conciseness and helps avoid arbitrary double-counting of considerations. If there are many job offers, then simply choose some option A as the baseline, and measure the pros/cons of each other option relative to option A, as in the "three-way comparison example" below.
Merge pros/cons that are related to one another. This also helps with conciseness and avoiding arbitrary double-counting.
Indicate the rough importance of each pro/con. If you think some consideration is more important, then you should explicitly mark it as such. You can mark considerations as strong (+++/---) or weak (+/-) if you want.
Share it with experts whose time is less valuable before approaching the top experts in your field.
When advisors ask questions about the details of your situation, make sure to elaborate on those points in the document.
Make sure the advisors have an opportunity to give you an all-things-considered judgment within the document (to allow for criticism), or privately, in case they are reluctant to share their criticism of some options.
To make a decision, don't just add up the considerations in the list. Also, take into account the all-things-considered judgments of advisors (which includes expertise that they may not be able to articulate), as well as your personal instincts (which include self-knowledge that you may not be able to articulate).
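As a toy illustration of how the +++/--- marks might be tallied relative to a baseline option A (the option names and marks here are hypothetical):

```python
# Map the strength marks from the pros/cons doc onto integers,
# then sum each option's marks relative to baseline option A.
weights = {"+++": 3, "++": 2, "+": 1, "-": -1, "--": -2, "---": -3}

options = {
    "B (startup offer)": ["++", "-", "+"],  # marks vs. option A
    "C (grad school)":   ["+++", "--"],
}

scores = {name: sum(weights[m] for m in marks)
          for name, marks in options.items()}
print(scores)
```

Per the step above, this sum is an input rather than the answer: advisors' all-things-considered judgments and your own instincts should also carry weight.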
If you wanted to buy large AI companies, you wouldn't buy all of Google or Facebook, you'd just try to acquire AI projects. You could ask whether you can spend $20B to get a 1% chance of $2T somehow (options? crypto schemes? A big startup?) but in practice I think if you're hoping to buy a $2T company, you're not targeting properly.
This jibes with Bryan Caplan's theory that education is primarily about signalling. You can usually enter lectures (or view them on YouTube) for free, unlike movies or gyms, for example. What you're paying for is primarily the certificate, not the experience - and secondarily, the contacts, research experience, and references. The certificate doesn't benefit much from including more classes (a possible exception being if it gets you a second major). But the research experience and references do benefit from taking fewer classes. So doing fewer classes is often a big win. My impression of stats grad school totally matches Dan's assessment.
As per the thread with Pablo, I think the podcast sounds pretty good. Having said that, I do have one small suggested improvement. When I look at the logo (a Sierpinski triangle) and think about what it's supposed to represent, it makes me think of a pyramid, or of growing replicators: "one person recruits three, and so on". In particular, although this may seem kind-of unfair, it kinda reminds me of this scene from the Office. Given that movement building is a major project of the org, that's probably not the connotation that you want. I realize that most people aren't going to think of this connotation, but I'm very curious about others' thoughts, because even a few seems too many...
How would it not be a copyright violation? Seems better to require consent, presumably via a checkbox or an email. Consent also could improve the average quality a bit. Although then the question is whether the EAF/LW/AF designers can be bothered including that kind of email, or checkbox+API, etc as a feature.
Hmm, I think the EA meaning is pretty similar to the evobio meaning.
I think that in EA, "founder effects" are when a new group (e.g. people interested in existential risk) is initialized by some unusual individuals, and it grows, leading the group to retain some of those peculiarities. It's especially used for describing properties that arose from happenstance (e.g. they like Harry Potter), rather than expected differences (e.g. they like abstract thinking).
In evolutionary theory, they seem to be exclusively focused on the effects of randomly sub-sampling a population, but it's basically the same idea.
One argument for the long reflection that I think has been missed in a lot of this discussion is that it's a proposal for taking Nick's Astronomical Waste argument (AWA) seriously. Nick argues that it's worth spending millennia to reduce existential risk by a couple of percent. But launching, for example, a superintelligence with the values of humanity in 2025 could itself constitute an existential risk, in light of future human values. So AWA implies that a sufficiently wise and capable society would be prepared to wait millennia before jumping into such an action.
Now we may practically never be capable enough to coordinate to do so, but the theory makes sense.
Tangentially related: I would love to see a book of career decision worked examples. Rather than 80k's cases, which often read like biographies or testimonials, these would go deeper on the problem of choosing jobs and activities. They would present a person (real or hypothetical), along with a snapshot of their career plans and questions. Then, once the reader has formulated some thoughts, the book would outline what it would advise, what that might depend on, and what career outcomes occurred in similar cases.
A lot of fields are taught in a case-based fashion, including medicine, poker, ethics, and law. Often, a reader can make good decisions in problems they encounter by interpolating between cases, even when they would struggle to analyse those problems from first principles. Some of my favourite books have a case-based style, such as An Anthropologist on Mars by Oliver Sacks. It's not always the most efficient way to learn, but it's pretty fun.
It happens in Australian universities. Probably anywhere there's a large centralised campus. Wouldn't work as well in Oxbridge, though, because the teaching areas, and even the libraries, are spread all across the city.
Especially for referrals, since there may be very many.
Comment by RyanCarey on [deleted post]
Yeah, I haven't analysed Holden's intended meaning whatsoever, but something like what you describe would make much more sense.
It can't be right to say that every descendant of a digital person is by definition also a person. A digital person could spawn (by programming, or by any other means) a bot that plays rock-paper-scissors randomly, in one line of code. Clearly not a person!
What about the hypothesis that simple animal brains haven't been simulated because they're hard to scan? We lack a functional map of the neurons: which ones promote or inhibit one another, and other such relations.
Agree that we shouldn't expect large productivity/wellbeing changes. Perhaps a ~0.1SD improvement in wellbeing, and a single-digit improvement in productivity - small relative to effects on recruitment and retention.
I agree that it's been good overall for EA to appear extremely charitable. It's also had costs though: it sometimes encouraged self-neglect, portrayed EA as 'holier than thou', EA orgs as less productive, and EA roles as worse career moves than the private sector. Over time, as the movement has aged, professionalised, and solidified its funding base, it's been beneficial to de-emphasise sacrifice, in order to place more emphasis on effectiveness. It better reflects what we're currently doing, and who we want to recruit. So long as we take care to project an image that is coherent, and not hypocritical, I don't see a problem with accelerating the pivot. My hunch is that even apart from salaries, it would be good, and I'd be surprised if it was bad enough to be decisive for salaries.
This kind of ambivalent view of salary-increases is quite mainstream within EA, but as far as I can tell, a more optimistic view is warranted.
If 90% of engaged EAs were wholly unmotivated by money in the range of $50k-200k/yr, you'd expect >90% of EA software engineers, industry researchers, and consultants to be giving >50%, but far fewer do. You'd expect EAs to be nearly indifferent toward pay in job choice, but they're not. You'd expect that when you increase EAs' salaries, they'd just donate a large portion on to great tax-deductible charities, so >75% of the salary increase would be refunded on to other effective orgs. But when you say that the spending would be only a tenth as effective (rather than ~four-tenths), clearly you don't expect that.
Although some EAs are insensitive to money in this way, 90% seems too high. Rather, with doubled pay, I think you'd see some quality improvements from an increased applicant pool, and some improved workforce size (>10%) and retention. Some would buy themselves some productivity and happiness. And yes, some would donate. I don't think you'd draw too many hard-to-detect "fake EAs" - we haven't seen many so far. Rather, it seems more likely to help quality than hurt on the margin.
I don't think the PR risk is so huge at <$250k/yr levels. The closest thing I can think of is commentary regarding folks at OpenAI, but it's a bigger target, with higher pay. If the message gets out that EA employees are not bound to a vow of poverty, and are actually compensated for >10% of the good they're doing, I'd argue that would enlarge and improve the recruitment pool on the margin.
(NB. As an EA worker, I'd stand to gain from increased salaries, as would many in this conversation. Although not for the next few years at least given the policies of my current (university) employer.)
I think they believe in Wei Dai's UDT, or some variant of it, which is very close to Stuart's anthropic decision theory, but you'd have to ask them which, if any, published or unpublished version they find most convincing.
Rob Wiblin: ...if you were able to get even 10x leverage using science and policy by trying to help Americans, by like, improving U.S. economic policy, or doing scientific research that would help Americans, shouldn’t you then be able to blow Against Malaria Foundation out of the water by applying those same methods, like science and policy, in the developing world, to also help the world’s poorest people?
Alexander Berger: Let me give two reactions. One is I take that to be the conclusion of that post. I think the argument at the end of the post was like, “We’re hiring. We think we should be able to find better causes. Come help us.” And we did, in fact, hire a few people. And they have been doing a bunch of work on this over the last few years to try to find better causes...
The most relevant comments in the transcript seem to be in the section "GiveWell’s top charities are (increasingly) hard to beat".
researching the legal environment in the state where the charity is registered and coming up with creative ways around local regulations or going through complicated registration procedures such as the ~ 1.5 year long one with the SEC are not things that I can automate... I’m taking two different perspectives in the comment based on the following steps: (1) What is realistic to realize now to get the idea off the ground, and (2) what is realistic to expect to happen in 5–10 years assuming that step 1 has succeeded... I don’t see a way to get [tax deductibility] unfortunately... The legal risks I’m referring to are not simply that it might not be possible to get tax deductibility. It’s rather that in the worst case the responsible people at the charities may need to pay 8–9 digit settlements to the SEC or go to prison for up to five years for issuing unregistered securities.
VCs often manage to buy stakes in companies privately. Wouldn't it be natural to sidestep that issue by copying what VCs do (and staying off the blockchain)? i.e. step (1) is privately traded patronage certificates, then step (2) is public ones? If so, then one could imagine a scenario where all you need for now is to do some research, and write up a pro forma contract?
Ah, my point here was more that an evil charity that is afraid that it’ll get shorted can decide not to offer the (say) 99% of its shares that it still holds for borrowing...
I can envisage a lot of ways to ensure some lending, so this seems like a small advantage.
I’m currently very concerned about prices not reflecting downside risks, and this mechanism is the only one that may be able to keep risky charities out...
Yes, having the ability to short companies is quite a weak method for punishing companies, because they can just stop selling patronage certs if they go negative. It would be better if we could get charities to pay for their negative impact somehow. An "absolving" certificate, of sorts. Maybe the people who would want to sell these "absolving" certificates are similar to the ones who look to buy "patronage"...
My thinking has gone through the following steps: (1) I want to create charity shares. (2) Oops, charity shares are prohibitively difficult to do because of legal risks, effort, and hence very low chance of getting the buy-in from all the US- and UK-based EA charities. (3) So I need to come up with something that is almost as good but is more achievable: Project shares... and intervention shares...
Ahhhh, OK! I must say though, it rewards and punishes orgs for the performance of other orgs in their area. You portray this as a positive, but it seems like a big negative to me. It incentivises people to start new incompetent orgs in an intervention area, (or to keep incompetent orgs running) just because there are existing competent orgs. Conversely, it punishes competent orgs for the presence of harmful orgs implementing their same intervention. It's quite messy to require an external panel to divide up the tokens between orgs. Frankly, given the fact that it's a bit inelegant, I would bet that other problems will arise.
I can't promise I'll have much more to say in this thread, but in case I don't, let me say that I've found this an illuminating discussion. Thanks!
Agree that discussing terminology is not yet useful in and of itself. Though I'm intending it for the purpose of idea clarification.
Re charity Vs intervention shares, my thinking was just that it would be more transparent for intervention shares to be constituted of charity shares, and for such shares to be issued by charities. Based on reading your comment, I'm not sure whether you agree?
As for your arguments: I find myself not so convinced by (1-2). I think the process of issuing charity shares could be automated for the charities. If desired, it seems not out of the question that these entities could even run as for-profits - given that you are proposing a revolution of the NGO sector, it seems weird to restrict yourself to the most common current legal setup (although I agree that tax deductibility is nice to have).
I can see that (3) pushes weakly toward impact certs, but not strongly because ideally you also want to have specific markets, and the benefits of liquidity and specificity trade off against one another (in terms of the information that readers can gain). And even if resale markets are fairly dormant, I don't think it's a disaster - it should still be at least as good as the status quo (donations), and in many ways better (valuation is done retrospectively).
Re (4), why can't charity shares be bought/sold? Re (5), what is the built-in mechanism?
Re past/future shares: on further thought, even if you only allow patronage certs to be sold for past events on the "bottom layer", there are ways to route around this: you can sell shares in the company itself, or you can sell the rights to any future patronage shares. I'm certain this is a good thing, because it allows people to invest in orgs that will have large future impact, similar to investing in an org that you think will win an X-prize. The real question is just whether you should allow this "natively", i.e. whether you should be able to sell patronage of future activities. If you think of normal stocks, they do confer ownership of the company into the indefinite future. Stocks can also have the problem where people create a company, make a bunch of promises about what it will do, sell it, then renege on those promises - that's called securities fraud, and there are a lot of defenses built up against it. If you want to piggyback on those defenses, maybe you would want to only allow the sale of past activities "natively", and have the sale of future impact done only via the sale of regular stock in the company itself. That's my initial instinct, although there may be a lot of other considerations.