However, would it have been helpful to know that 85% of researchers recommend research agenda Y or person X?
TL;DR: No. (I know this is an annoyingly unintuitive answer)
I wouldn't be surprised if 85% of researchers think it would be a good idea to advance capabilities (or to do research that directly advances capabilities and has no "full" safety theory of change), and they'll give you reasons that sound very wrong to me. I'm assuming you'd interview anyone who sees themselves as working on "AI Safety".
[I don't actually know if this statistic would be true, but it's the kind of example of how your survey suggestion might go wrong, imo]
I agree that selecting experts is a challenge, but it seems better to survey credible experts than to exclude that evidence from the decision-making process.
I want to point out that you didn't address my intended point of "how to pick experts". You said you'd survey "credible experts" - who are they? How do you pick them? A more object-level answer would be "by forum karma" (not that I'm saying it's the best answer, but it is more object-level than saying you'd pick the "credible" ones)
Yonatan:
something that might change my mind very quickly is if you give me examples of what "language" you might want to create. Maybe you want a term for "safety washing", as an example [of an example]?
Peter:
would conceptualize movement building in a broad sense, e.g., as something involving increased contributors, contribution, and coordination; people helping with operations and communication; and people working on it while doing direct work (e.g., by going to conferences, etc.)
Are these examples of things you think might be useful to add to the language of community building, in the way that "safety washing" might be an example of something useful?
If so, it still seems too meta / too vague.
Specifically, it fails to address the problem I'm trying to point out of differentiating positive community building from negative community building.
And specifically, I think focusing on a KPI like "increased contributors" is the way that AI Safety community building accidentally becomes net negative.
See my original comment:
I think that having metrics for community building (that are not strongly grounded in a "good" theory of change) (such as metrics for increasing the number of people in the field in general) carries extra risk of that kind of failure mode.
I want to add one more thing:
This whole situation, where a large number of seemingly-useful actions turn out to be net negative, is SUPER ANNOYING imo. It is absolutely not anything against you, I wish it wasn't this way, etc.
Ah, and also:
I wish many others would consult about their ideas in public as you've done here, and you have my personal appreciation for that, fwiw
Edit: I just wrote this; it's ~1:30am here, I'm super tired, and I think this was incoherent. Please be extra picky with what you take from my message; if something doesn't make sense, then it's me, not you. I'm still leaving the comment because it sounds like you really want comments
---
Hey,
TL;DR: This sounds too meta, I don't think I understand many important points of your plan, and I think examples would help.
It involves trusting experts, or deferring to them, or polling them, or having them supervise your work, or other things.
1)
This still leaves open questions like "how do you choose those experts" - for example, do you do it based on who has the most upvotes on the forum? (I guess not.) Or what happens if you choose "experts" who are making the AI situation WORSE, and they tell you they mainly need to hire people to help them?
2)
And if you pick experts correctly, then once you talk to one or several of these experts, you might discover a bottleneck that is not at all about having a shared language, but is "they need a LaTeX editor" or "someone needs to brainstorm how to find Nobel prize winners to work on AI Safety" (I'm just making this up here). My point is, my priors are that they will give surprising answers. [My priors are from user research, and specifically this.] This is why I think picking something like "having a shared language" before talking to them is probably not a good idea (though I shared why I think so, so if it doesn't make sense, totally ignore what I said)
3)
Too meta:
"Finding a shared language" pattern matches for me (maybe incorrectly!) to solutions like "let's make a graph of human knowledge" which almost always fail (and I think when they work they're unusual). These solutions are.. "far" from the problem. Sorry I'm not so coherent.
Anyway, something that might change my mind very quickly is if you give me examples of what "language" you might want to create. Maybe you want a term for "safety washing", as an example [of an example]?
4)
Sharing from myself:
I just spent a few months trying to figure out AI Safety so that I can have some kind of opinion about questions like "who to trust" or "does this research agenda make sense". This was kind of hard in my experience, but I do think it's the place to start.
Really, a simple example to keep in mind is that you might be interviewing "experts" who are actively working on things that make the situation worse - this would ruin your entire project. And figuring this out is really hard, imo.
5)
None of this means "we should stop all community building", but it does point at some annoying complications
Hey,
Some people think community building efforts around AI Safety are net negative - see, for example, the post about "Shutting Down the Lightcone Offices".
I'm not saying they're right (it seems complicated in a way I don't know how to solve), but I do think they're pointing at a real failure mode.
I think that having metrics for community building (that are not strongly grounded in a "good" theory of change) (such as metrics for increasing the number of people in the field in general) carries extra risk of that kind of failure mode.
Excuse me if I misunderstood what you're saying - I saw you specifically wanted comments and don't have any yet, so I'm erring on the side of sharing my first (maybe wrong) thoughts
This post changed my mind, thanks
Thx
Reading this book as a book club is the first 5 months of that program.
5 months of... full-time work? Something else?
If I understand correctly, the book club is 11 meetings, where each meeting is 1 hour of video plus 1-2 hours of reading beforehand.
I'm confused about how this adds up, almost to the point where I wonder if you were testing us on purpose ;)
Thanks! This somewhat reduces my FOMO 🙃
(This is more of a vision than a solution, hope it still helps)
I have a dream! And in this dream, there is a central place to have each conversation, as opposed to having each conversation in multiple places.
As an example to imagine: "Does Org X advance safety more or capabilities more?"
This has sub questions, like "is advancing capabilities bad?" and so on.
Instead of having each conversation again and again in each place it comes up, I think it would be much better to have a community-built "FAQ" of sorts, grown organically by linking conversations to each other.
I imagine someone replying "the question of whether capabilities are good or bad is discussed in this LINK", which has the best arguments for each side, and the best sub-arguments, and so on.
The situation today is often more like "the question [...] was discussed in at least 10 different places, here are some of them: ...", which is hard to read, hard to add to the conversation, hard to know if I got all the important viewpoints, and so on.
I think we can do better, and I specifically think this could be an important step towards the forum becoming a place with high quality conversations, with readable and accessible information, clearly marked cruxes, and so on.
I'm not sure how to do this; my starting point would maybe be "look for some of these things that are discussed several times", and then ask people (or something?) why they didn't add it all to the same place. I'm not sure - this seems like non-trivial product work which is too advanced for me to just pull out of my hat or to invent many features around, so I'm just pitching it as a vision for now
I think the forum should encourage breaking up comments instead of having one comment with many points.
If my comment makes several different points, then it's an advantage to let people vote or comment on each of my points separately.
Right now, there is both a "cost" and a norm in the forum around writing long comments; imo this is worth changing
wdyt?
Adding: it would also support a culture that I think is good, where it's encouraged to link to existing conversations rather than having the same conversation multiple times. It's less "elegant" to link to a comment that also has 3 other unrelated points in it
As one example, I have a tiny project where I publicly measure the effectiveness of Israeli startups (in a Facebook group where people can comment and ask questions), and a big part of my goal is to explain to the Israeli tech ecosystem how one even measures impact, and that not all med-tech startups are equal.
I might try to estimate how much it would "cost" me to get one person to a high impact job, and if that "cost" is way more than what you're doing with the 80k outreach, I should maybe just dump the project. Or in other words, for my project to be worthwhile, it must at least POTENTIALLY be more cost effective than the 80k outreach. Similar to how, if I'd want to run an intervention for global health and wellbeing, I'd want it to at least POTENTIALLY be better than AMF. (In both these examples, 80k and AMF are also tested and scalable, which are two advantages that new projects might not always have.)
(Did this make more sense?)
I'd like to hear more about why you think this causes people to take high impact jobs (what you're measuring, what you're observing) (or, more cliché, "what do you think you know and why do you think you know it?").
I'm asking this in the spirit of "trust but verify": I do assume you did a good job here, and at the same time this seems to me like the main place a project like this might break, so it seems healthy to ask.
For reference of others, here's what was said in the post:
- We have various ways of measuring the inclinations of our new audience members, and almost all of them suggest that they are indeed significantly less inclined towards our advice, on average, than control groups of people that didn’t find us through our active outreach efforts.
- However, so long as we are still getting a good proportion of high-inclination traffic too (which it looks like we are), these efforts probably still look worth it.
- Our internal calculations of the value of this work take into account this expected decrease in inclination (and we think it looks good overall).
- All that said, the inclination of the new users we get from our outreach is a top priority, which we will continue to monitor.
Still, as I said elsewhere, seems really promising
Any opinions about using this as a baseline to compare other EA outreach efforts?
I tend to be in favor, since this seems to be working, kind of tested, scalable, potentially high upside given some tweaks, and so on. I feel more reluctant to do my own outreach after reading this (but I'm saying this in a good way - as a compliment to you)
+1 for creating that market! :)
I agree with everything, and still want to point out that not long after, Musk decided to try removing the "woke" part, so maybe he shared this meme for different reasons than you or I would share it
Fighting ‘Woke AI,’ Musk Recruits Team to Develop OpenAI Rival

Another idea:

Alternative idea: we could try using memes?
As one idea:

Thanks!
Here's the Google Sheet version for the same data:
https://docs.google.com/spreadsheets/d/1MUUwnv6V28S71SD5VqMpqkSyTi9b4qyfo76tk642TAE/edit?usp=sharing
Anyone can "File --> Make a copy" and add their own changes. Or, the formula you want is probably =IMPORTDATA("https://funds.effectivealtruism.org/api/grants").
(Let me know if anyone wants something more complicated, this here is the 2-minute version)
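If anyone prefers pulling the same data in code rather than in Sheets, here's a minimal sketch in Python. I'm assuming the endpoint serves CSV (that's what IMPORTDATA expects), but I haven't checked the exact format, so treat the parsing as illustrative:

```python
import csv
import io

import requests

URL = "https://funds.effectivealtruism.org/api/grants"

resp = requests.get(URL, timeout=30)
resp.raise_for_status()

# Assumption: the endpoint returns CSV (IMPORTDATA expects CSV/TSV).
# If it actually returns JSON, swap this for resp.json().
rows = list(csv.reader(io.StringIO(resp.text)))
print(f"{len(rows)} rows; header: {rows[0] if rows else 'empty'}")
```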
A bit off topic:
I'm currently trying to have discussions about AI Safety, and the community seems to be split across an uncountable (to me) number of Slacks and similar spaces. I keep discovering more and more of them. This is a pain point for me, as a user, since I want lots of feedback on what I write. Any chance you have thoughts/advice on this, since you thought about the subforum problem (which seems similar)?
("no" is totally valid)
Okay,
If you change your mind, I'll be on board (unless something changes?) and I'd be happy to post the pitch in the EA Software groups specifically
Mental Health and the Alignment Problem: A Compilation of Resources
Nice!
I can't see it in the post itself. If this is behind some kind of link, then I don't know which one (and I'd recommend against putting the useful information behind a link; I can elaborate if you don't agree).
(I ended up applying btw, so you can also take this as info from an applicant)
Hey! I'd like to quietly advertise my "job post template" if I may
It contains suggestions based on talking to EAs who had high friction applying to jobs.
As an example of something I'd recommend improving in this job post: I'd add a heading for "local or remote" and for "part time or full time" (or something similar), since many candidates care a lot about these aspects and it would be useful to be able to scroll and find them easily (or to use the side menu). Right now that information is "hidden" within one of the paragraphs; here's a screenshot:

(Big fan of Rethink Priorities; super excited about you running a longtermism incubator; helping lead it sounds to me like a top role that I'd be happy to tell relevant people about)
Hey,
+1
And also, I think asking these people (who are probably busy doing some important project) to email you is going to be high friction. I tried (I posted your request in the EA Israel Slack) but I'm also sharing my expectation of a problem.
(wow)
Regarding subforums, the pitch as I see it (= what JP[1] told me, in the software subforum I think, but I don't know how to find that link now) is:
- There's a disadvantage in having topics (such as EA software) spread over ~5 different spaces (slack, discord, FB, ...)
- There's an advantage in "controlling the platform" (the forum has a team of people from the community maintaining it, controlling the DB, able to make new features, ...)
- So there's a case for moving all those spaces into the subforum
I think for this to work well, it's worthwhile communicating that this (the 3 bullets above) is the intent. I'd, for example, be happy to ask[2] people from the spaces that I admin to try posting in the subforum and reading from the subforum for a month (post the same things, look for the same content) as an experiment.
I hope this would solve "not enough people post" and "people aren't sure what to post", together with some other problems like "get a critical mass of posters+readers" and "people will know what they want from the subforum, so they'll know to 'complain' if certain things are missing (as opposed to having a vibe of trying to understand what the developers meant)".
Eh, reading what I wrote, this isn't so tidy. Feel free to ask questions if you'd like me to clarify
[1] Probably some of this is my own opinion which I mixed in with what JP said, but I can't remember what
[2] I wouldn't close the existing space, and I'd do other things like "give people an opportunity to push back if they think it's stupid" (I don't want to be a dictator, and I do want people to compete if they think there's a reason to do so, as opposed to by accident), but I would be happy to nudge and make a pitch for why this is a good idea
I'm personally happy about different EA orgs being explicitly opinionated in different directions.
For me personally, it's sometimes unclear what people mean when they use very vague language like "maybe, we're not sure, we sometimes recommend, ..." (I feel like it's a culture thing).
At the same time, I think 80k got to be somewhat of a center of the community and this gives them some kind of responsibility (though if they didn't mean to become such a center then a part of me doesn't want to force the responsibility on them).
Debugging:
Do you think (A) 80k's opinions are wrong or (B) that they shouldn't present their existing opinions so explicitly? (or something else?)
Why is this post being downvoted?
You can tell me anonymously:
https://docs.google.com/forms/d/e/1FAIpQLSca6NOTbFMU9BBQBYHecUfjPsxhGbzzlFO5BNNR1AIXZjpvcw/viewform
Hey! This kind of program seems promising, and specifically I endorse the "checkup after 1 year" plan, which is often underrated.
At the same time, it seems important to have some kind of short feedback loop too.
Some ideas on shorter feedback loops:
- Have them post their ideas on the forum.
- Have them suggest an improvement to the impact of a specific existing program (like VIVID :) ), and check if the head-of-that-program implements the change.
- You had some goals like "At the end of the process, each participant is clear about their next step, whether it be an area to explore further, work experience, volunteering, or their next job". I think it's worth checking if the participants even have such a plan.
- (maybe something else)
I think a very good version of this program might have participants already raising money (or something like that) for their new mental health project. I don't think this is practical for the first rounds, but if it resonates with you as a stretch goal, then perhaps you can put some intermediate goals in place, like "pitched to one investor" (which is hard in part because it requires something to pitch). (But if this goal doesn't resonate with you, then maybe something else.)
Specifically I think "doing the math" is very hard, and even harder when you're not in touch with the relevant stakeholders
Ben, thanks for agreeing with me, I just wanted to say that my point isn't "do the math", it's "ask the org"
I personally think it's better not to have this link public (but ok to send it privately)
Reason: This group works so well partly because of such a good fit of people, and I think trolls/bots might break that.
[I'm not a mod/admin in the group, just a participant]
[when I'm not sure if it's ok to post a link publicly, I err on asking a mod/admin]
Ah, this sounds more like what I'd call a "list of services" and less like what I'd call a "marketplace".
What I mean by this is "very few (even only 1) provider in each category" (as opposed to trying to get many providers to compete).
I'd also be happy if such providers would have an open way of getting long text feedback publicly (as opposed to only 1-5 stars)
- What product [I'm asking for a concrete example] would you put on this marketplace?
- What "cost" of being a bad buyer/seller [I'm asking for a concrete example] might you suggest?
- Do you think that having such a cost would create a more healthy situation than a negative review on Amazon?
My main reason for voting "no" is that I don't expect good answers to these questions, so if you have such answers (which cause me to also say "yes" to question 3), that would probably change my mind somewhat.
My secondary (but still big) pushback is that marketplaces need a critical mass of buyers and sellers, and creating such a critical mass is hard (and, in my opinion, a core problem and not a side note that can surely be solved somehow).
I do, btw, think that improving the level of marketplaces in the world is generally really great if it can somehow be done.
1. If you hover over the number, you'll see how many voters there are

2. If you prefer to keep them separate (legit, especially since some people's vote causes more karma), then I'd ask people (A) to specifically agreevote (B) specifically to the comment they agree with (and not both), or something like that
(I decided to use the "agree" here to indicate "no", as opposed to using the upvote/downvote. I recommend you make it explicit what you prefer.
I also think you don't need a message for both "yes" and "no" since we can vote both up and down)
Hey!
Without disagreeing, I'd like to suggest:
"making the feed less bad in a specific way" is (I expect) only going to be a patch, compared to "make the feed good".
In different words: If we figure out what we want to optimize the feed for, we won't have to patch each instance of the feed not optimizing for that.
As a metaphor: I think it's a good idea to consider what I DO want in a job instead of looking for a job that doesn't have the 3 features that I didn't enjoy previously.
As an extreme example that I don't actually endorse, I'm writing it to help me point at a more general direction of ideas: Every day, pick 10 users and show them the feed (without any upvotes? without seeing the name of whoever posted it?), and ask them to vote for "this helped me personally" and for "I want more people in the community to read this post". Hopefully "don't show the daily drama to everyone" will be only one of the problems that are implicitly solved by running something like this, many other problems will be wiped out before you even notice they exist, let alone spend months of your time trying to solve each one of them. [reminder: I wouldn't actually do this]
Am I making any sense? Seems like I write more and more lines but I'm not sure if it's any better.
Reminder: I'm not disagreeing with you. This seems to be such a big recurring pain point that I do think it's worth attention, and I think it's kind of amazing/insane that we have you to do something about it
Nobody's said it, so: +1 for writing such a short post!
I'm specifically hopeful about this cause area because
- It seems like everyone has an incentive to solve it and the problem is only coordinating some solution
- Trying solutions seems cheap
I'm confused about why this hasn't been solved yet. Is it just hard to monetize? I don't know.
I think I'd start with solving the problem for 1-2 EA orgs, in the spirit of "do things that don't scale", and once that works (which will probably be hard in several unexpected ways), I'd try to scale to a consultancy that helps 10 orgs at once.
This is only based on my unverified guess about making a product that would fit what the orgs would say "hell yes" to, and my unverified-in-this-situation intuition that starting by trying to solve the problem in a scalable way before doing it for 1-2 "individuals" usually doesn't work.
(I can elaborate on my intuitions, but if someone read this and disagrees - I encourage you to ignore what I wrote)
Regardless of building a solution (consultancy?) that orgs will say yes to, I also think there's something healthy about having a single person in the org (the head of security?) who is personally responsible for security going well (having the "power" to make decisions, and having the information and knowledge to either make decisions or vet other people's opinions). This often isn't the situation with consultancies, which are not in fact responsible in the way I mean.
I can also imagine a trusted consultancy that very specifically helps hiring competent people to be "head of security".
[rough thoughts, not my expertise]
I agree these are problems, but disagree that they don't have solutions. (I was in the IDF, where we did things to address these problems.)
Also, the goal of defense is making offense very costly, not making offense impossible.
We did, for example, allow data transfer, but there were limitations on it. Specifically, USB drives were not allowed at all, and were blocked from use on the computers themselves. If you wanted to transfer data, you couldn't bring your own USB drive; you had to use a specific organizational protocol for it.
Sorry I'm not giving specifics here. My main point is that I've seen solutions to such problems in a real, working air-gapped network that I personally used for my development work
Everyone who agree-voted here, may I ask why you don't configure your own feed to ignore community posts right now?
Hey, just pointing out that a few collections of "candidates looking for EA jobs" already exist, and there's an advantage in having them centralized (so for example each org only needs to look in one collection, and each candidate needs to sign up only in one place)
What I am not saying:
I am not saying "dump this project because someone else is already doing it" (totally not! I really don't believe in that argument).
I am saying:
If you didn't consider that lists like this exist, and you're opening a new list by accident, and you see no advantage to yours, then probably use one of the existing ones.
A lazy list-of-lists that I'm aware of (sorry for not including all the links; I'm vaguely expecting that this list-of-lists exists somewhere and someone might link to it):
- 80k have a longtermist census (and my guess is that orgs use this a lot since in many ways 80k are a hiring center in EA)
- HIP are setting up something that looks promising to me
- There's a post for "who wants to be hired"
- The forum profile has an option for "seeking work"
- CEA probably has some system that takes in the info from swapcard, where people sometimes mark "looking for work"
- (maybe more; that's what's at the top of my mind)
May I upvote this post, or would that be counter productive? 🫣
(Hey, maybe you'd like to make your link clickable, by editing your post, marking the text-to-become-a-link, and then:

)
Hey, do I understand correctly that you're pointing out a problem like "there are lots of problems that will eventually lead to x-risk" + "that's bad" + "these problems somewhat feed into each other" ?
If so, speaking only for myself and not for the entire community or anything like that:
- I agree
- I personally think that AI risks will simply arrive earlier. If I change my mind and think AI risks will arrive after some of the other risks, I'll probably change what I'm working on.
Again, I speak only for myself.
(I'll also go over some of your materials, I'm happy to hear someone made a serious review of it, I'm interested)
Maybe a good time to experiment with a Swapcard alternative, if you have any other option lined up