The idea is that by speeding through the risky period you increase risk initially, but the total risk is lower - i.e. a smaller area under the grey curve here:
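To spell out the "area under the curve" idea (a sketch in my own notation, not Aschenbrenner's): if h(t) is the hazard rate of existential catastrophe at time t, the probability of surviving the period of peril up to time T is roughly

$$P(\text{survive}) \approx \exp\left(-\int_0^T h(t)\,dt\right)$$

so what matters is the integral of h(t) (the area), not the height of the peak alone. Speeding through reduces the time spent at elevated h(t), shrinking the area even if the peak itself is unchanged.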
I think this probably breaks down if the peak is high enough though (here I'm thinking of AGI x-risk). Aschenbrenner gives the example of:
On the other extreme, humanity is extremely fragile. No matter how high a fraction of our resources we dedicate to safety, we cannot prevent an unrecoverable catastrophe. ..there is nothing we can do regardless. An existential catastrophe is inevitable, and it is impossible for us to survive to reach a grand future.
And argues that
even if there is some probability we do live in this world, to maximize the moral value of the future, we should act as if we live in the other scenarios where a long and flourishing future is possible.
I'm not sure if this applies if there is some possibility of "pulling the curve sideways" to flatten it - i.e. increase the fraction of resources spent on safety whilst keeping consumption (or growth) constant. This seems to be what those concerned with x-risk are doing for the most part (rather than trying to slow down growth).
Here is an argument for how GPT-X might lead to proto-AGI in a more concrete, human-aided way:
..language modelling has one crucial difference from Chess or Go or image classification. Natural language essentially encodes information about the world—the entire world, not just the world of the Goban, in a much more expressive way than any other modality ever could. By harnessing the world model embedded in the language model, it may be possible to build a proto-AGI.
This is more a thought experiment than something that’s actually going to happen tomorrow; GPT-3 today just isn’t good enough at world modelling. Also, this method depends heavily on at least one major assumption—that bigger future models will have much better world modelling capabilities—and a bunch of other smaller implicit assumptions. However, this might be the closest thing we ever get to a chance to sound the fire alarm for AGI: there’s now a concrete path to proto-AGI that has a non-negligible chance of working.
We now have general funding for the next few months and are hiring for both a Community & Projects Manager and an Operations Manager, with input from Nicole and others at CEA. Unfortunately with the winding down of EA Grants the possibility of funding for the Community & Projects Manager salary has gone. If anyone would like to top up the salaries for either the Community & Projects Manager or Operations Manager (currently ~£21.5k/yr pro rata including free accommodation and food), please get in touch!
Sorry if this isn’t as polished as I’d hoped. Still a lot to read and think about, but posting as I won’t have time now to elaborate further before the weekend. Thanks for doing the AMA!
It seems like a crux you have identified is how “sudden emergence” happens. How would a recursive self-improvement feedback loop start? Increasing optimisation capacity is a convergent instrumental goal, but how exactly is that goal reached? To give the most pertinent example: what would the nuts and bolts be of it happening in an ML system? It’s possible to imagine a sufficiently large pile of linear algebra enabling recursive chain reactions of improvement in both algorithmic efficiency and size (e.g. capturing all global compute -> nanotech -> converting Earth to Computronium), even more so since GPT-3. But what would the trigger be for setting it off?
Does the above summary of my take on this chime with yours? Do you (or anyone else reading) know of any attempts at articulating such a “nuts-and-bolts” explanation of the “sudden emergence” of AGI in an ML system?
Or maybe there would be no trigger? Maybe a great many arbitrary goals would lead to sufficiently large ML systems brute-force stumbling upon recursive self-improvement as an instrumental goal (or mesa-optimisation)?
Responding to some quotes from the 80,000 Hours podcast:
“It’s not really that surprising, I don’t have this wild destructive preference about how they’re arranged - let’s say the atoms in this room. The general principle here is that if you want to try and predict what some future technology will look like, maybe there is some predictive power you get from thinking about X percent of the ways of doing this involve property P. But it’s important to think about where there’s a process by which this technology or artifact will emerge. Is that the sort of process that will be differentially attracted to things which are let’s say benign? If so, then maybe that outweighs the fact that most possible designs are not benign.”
What mechanism makes AI attracted to benign things? Surely only human direction? But to my mind the whole Bostrom/Yudkowsky argument is that it FOOMs out of the control of humans (and e.g. converts everything into Computronium as a convergent instrumental goal).
“There’s some intuition of just the gap between something that’s going around and let’s say murdering people and using their atoms for engineering projects and something that’s doing whatever it is you want it to be doing seems relatively large.”
This reads like a bit of a strawman. My intuition for the problem of instrumental convergence is that in many take-off scenarios the AI will want to perform (a lot) more computation, and the way it will do this is by converting all available matter to Computronium (with human-existential collateral damage). From what I’ve read, you don’t directly touch on such scenarios. I would be interested to hear your thoughts on them.
“my impression is that you typically won’t get behaviours which are radically different or that seem like the system’s going for something completely different.”
Whilst you might not typically get radically different behaviours, in the cases where ML systems do fail, they tend to fail catastrophically (in ways that a human never would)! This also fits in with the notion of hidden proxy goals from “mesa optimisers” being a major concern (as well as the accurate and sufficient specification of human goals).
I'm thinking that for me it would be something like 1/100 of a year! Maybe 1/10 tops. And for those such as the OP who think that "there's just no one inside to suffer" - would you risk making such a swap (with a high multiple) if it was somehow magically offered to you?
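(Making the exchange rate explicit - this is just my arithmetic on the numbers above: 1/100 of a year is ~3.7 days, so at the point of indifference a day of factory-farmed chicken-experience is being weighted as roughly 100 times as bad as a day of extra healthy human life is good; 1/10 would make it roughly 10 times.)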
Pretty grim thought experiment - but I wonder: what amount of living as a chicken, or pig, on a factory farm would people trade for a year of extra healthy (human) life?
Assume that you would have the consciousness of the chicken or pig during the experience (memories of your previous life would be limited to what a chicken or pig could comprehend), and that you would have some kind of memory of the experience afterwards (although this would be zero if chickens and pigs aren't sentient). Also assume that you wouldn't lose any time in your real life (say it was run as a very fast simulation, but you subjectively still experienced the time you specified).
Edit: there's another thought experiment along the same lines in MichaelStJules' comment here.
Maybe also that the talk of preventing a depression is an information hazard at this stage of the pandemic, when all-out lockdown is the biggest priority for most of the richest countries. In a few weeks, when the epidemics in the US and Western Europe are under control and lockdown can be eased with massive testing, tracing and isolating of cases, it would make more sense to freely talk about boosting the economy again (in the meantime, we should be calling for governments to take up the slack with stimulus packages - which they seem to be doing already).
I don't know that this is still or ever really was part of the mission of the EA Hotel (now CEEALAR), but one of the things I really appreciated about it from my fortnight stay there was that it provided a space for EA-aligned folks to work on things without the pressure to produce legible results. This to me seems extremely valuable, because I believe many types of impact are quantized such that no impact is legible until a lot of things fall into place and you get a "windfall" of impact all at once.
Yes, this was a significant consideration in my founding of the project. We also acknowledge it where we have collated outputs. And whilst we have had a good amount of support (see histogram here), I feel that many potential supporters have been holding back, waiting for the windfall (we have struggled with a short runway over the last year).
Makes sense from the point of view of killing germs: temperatures that are tolerable for us are also tolerable for germs. My intuition is that it's easier to get dirt (which contains germs) off hands with warmer water (similar to how it's easier to wash dishes with warmer water).
Props for putting in the work to keep this organization alive and well. It's a wonderful asset to the EA community. :)
I agree that the extra E is a bit jarring at first (someone else has pointed this out too). I worry that without it it's too similar to CEA though; and the "Enabling" also seems useful in helping to describe what we do.
Congratulations for putting in all the time and effort required to get the (former) EA Hotel registered as a proper charity!
poll on an EA Facebook group
We did do this for the initial naming. It seems like a very lengthy process though, looking at the example of FRI. I'll also note that the names that got to the top of the latest poll I've seen (from 14 Dec) don't seem that great (but then my judgement perhaps isn't the best in this area, given the reception so far to "CEEALAR"!)
Note we are still offering the same as before, with the caveat that we aren't subsidising people earning-to-give (as was originally envisaged with the EA Hotel). We are still open to short-term visitors.
We didn't apply (although we did tick the "Forward my application to the EA Meta fund" box on our application for the October 2019 round of the Long Term Future Fund).
We got rejected in the March 2019 round of the Meta Fund, and didn't receive any feedback.
In July I made an application to the Meta Fund for an "EA Events Hotel" to be also based in Blackpool, UK (for a hotel dedicated to workshops/events/retreats/bootcamps, given it's difficult to host many people at the EA Hotel for events in addition to the longer term people). This also got rejected without feedback.
Given this situation, we haven't further engaged with the Meta Fund (we've had more engagement with the Long Term Future Fund, despite the Meta Fund being the more natural fit for the EA Hotel).
Regarding explicit direction vs advice - for me it was the fact that something I thought had been dealt with acceptably seems to have - unbeknownst to me - remained a live issue in terms of it affecting funding decisions. More explicit direction at the time, in terms of "if you want to get funding from CEA you need to do this", seems like it would've been better in hindsight.
Hi Julia, ok but to me the point you raised about media was tangential, i.e. it was not directly related to the PR situations themselves. For those curious - I missed a meeting with a professional communications advisor at EAG London last year, on account of missing an email (in which the meeting was arranged for me) sent the day before whilst I was driving to London. I was overwhelmed at the time with interest in the hotel, and that wasn't the only email (or meeting) I missed.
which is less likely to have much of an effective impact, compared to other organizations such as DeepMind and OpenAI..
Do you think this is true even in terms of impact/$ (given they are spending ~1,000-10,000x what we are)?
however, the EA Hotel also has the funding it needs from other sources now
We now have ~3 months' worth of runway. It's a good start to this fundraiser, but is hardly conducive to sustainability (as mentioned below, we would like to get to 6 months' runway to be able to start a formal hiring process for our Community & Projects Manager; the industry standard for non-profits is 18 months).
Some concern about the handling of past PR situations; I think these were very difficult situations, but I think an excellent version of the hotel would have handled these better
I think this is a little unfair. It would be good to know exactly what we (or an excellent version of the hotel) could've (would've) done better regarding the PR situations (I assume this is referring to the Economist and Times articles). Oliver Habryka says here "I still think something in this space went wrong", but doesn't say what (see my reply to Habryka for detail on what happened with the media). Jonas Vollmer says in reply to Habryka's comment:
... "better than many did in the early stages (including myself in the early stages of EAF) but (due to lack of experience or training) considerably worse than most EA orgs would do these days." There are many counterintuitive lessons to be learnt, many of which I still don't fully understand, either.
but doesn't elaborate. I have also talked to someone at CEA at length about media, including what happened with the hotel, and they didn't suggest anything that we could've done better given the situation (of the media outlets publishing whether we liked it or not). So I'm genuinely curious here. Although, ok, I guess maybe we could’ve removed the flipboard sheet from the wall before the journalist came in, even though it was a surprise visit.
Potential for community health issues, and concern about handling of a staffing issue
It's true that there is potential for community health issues whenever you have a group of people living together. I think we have generally fared well in this regard so far though. It has been suggested that there is a significant reputational risk involved in funding a project such as the EA Hotel, given the interpersonal dynamics of a large group of people living together, and that it might therefore be better for it to be funded by individuals instead of grant-making organisations. However, as a counter-point: most universities provide massively communal student accommodation.
Regarding the staffing issue, I'm afraid there's not much I can say publicly, although it was my understanding at the time that we dealt with it appropriately, after taking advice from prominent community members.
Thanks for commenting Nicole. To address your points (will post a separate comment for each):
Hotel management generally (including selection of guests/projects)
In terms of general management, I agree that there is always room for improvement, but I don't think things have been too bad so far.
Regarding the selection of guests/projects, I have a lot to say about this, which I hope to cover in EA Hotel Fundraiser 10: Estimating the relative Expected Value of the EA Hotel (Part 2), and possibly also a separate post focusing more on my personal opinions. For now I will say that I think there might be some philosophical disagreement between us, although I can't be certain as I don't know the specifics of which guests/projects you are referring to in particular.
Regarding emotional investment, I agree that there is a substantial amount of it in the EA Hotel. But I don't think there is significantly more than there is for any new EA project that several people put a lot of time and effort into. And for many people, not being able to do the work they want to do (i.e. not getting funded/paid to do it) is at least as significant as not being able to live where they want to live.
Still, you're right that critical comments can (often) be perceived as antisocial. I think this partly explains why EA is considered by new people/outsiders to not be so welcoming.
[Replying to above thread] One reason I asked you to plug some numbers in is that these estimates will depend a lot on what your priors are for various parameters. We will hopefully provide some of our own numerical estimates soon, but I don't think that too much weight should be put on them (Halffull makes a good point about measurability above). Also consider that our priors may be biased relative to yours.
I'll also say that a reason for Part 2 of the EV estimate being put on the back burner for so long was that Part 1 didn't get a very good reception (i.e. people didn't see much value in it). You are the first person to ask about Part 2!
[Replying to this thread]
projects in developing regions are generally (but certainly not always) significantly higher in yield than in developed regions
I think this is missing the point. The point of the EA Hotel is not to help the residents of the hotel, it is to help them help other people (and animals). AI Safety research, and other X-risk research in general, is ultimately about preventing the extinction of humanity (and other life). This is clearly a valuable thing to be aiming for. However, as I said before, it's hard to directly compare this kind of thing (and meta level work), with shovel-ready object level interventions like distributing mosquito nets in the developing world.
does the EA Hotel provide better benefit than animal welfare efforts? Polio? Global warming? Political campaigns? Poverty alleviation?
The EA Hotel is a meta level project, as opposed to the other more object level efforts you refer to, so it's hard to do a direct comparison. Perhaps it's best to think of the Hotel as a multiplier for efforts in the EA space in general. We are enabling people to study and research topics relevant to EA, and also to start new projects and collaborations. Ultimately we hope that this will lead to significant pay-offs in terms of object level value down the line (although in many cases this could take a few years, considering that most of the people we host are in the early stages of their careers).
I would appreciate it if you could review the information a bit more thoroughly. Perhaps you could generate your own estimate using the framework developed in Fundraiser 3 and the outputs listed here. Fundraiser 10 was listed last because I want to try and do a thorough job of it (but also have other competing urgent priorities with respect to the hotel). There are also many considerations as to why any such estimates will be somewhat fuzzy, and perhaps not ideal to rely on too heavily for decision-making (I'm hoping to go into detail on this in the post).
The £5000/month was an estimate based on earlier spending. Our costs are variable dependent on occupancy, hours worked by staff, random maintenance costs etc. It's unfortunate that I didn't adjust the totaliser earlier based on the actual spend, and I considered just paying out of my own pocket to hide the mistake, given that it's likely to be a black mark against me/the EA Hotel (and there seems to be very little tolerance for mistakes in EA these days). I hope at least some people appreciate the honesty.
I bought the hotel next door with my own money, and I've not spent any of the EA Hotel's money on it. Given its relatively low cost, I see it as a decent investment largely independent of the EA Hotel (i.e. even if the EA Hotel fails, I think there's a reasonable chance property prices in Blackpool will go up in the next 5-10 years).
Perhaps in terms of maximising my positive impact it would've been best for me to donate the money to the EA Hotel; I think that remains to be seen though. In hindsight, I was probably a little over-optimistic about the EA Hotel's funding prospects at the time.
(Note the timing of the purchase wasn't ideal - it came up for auction. Strategically, I didn't want to lose the opportunity to enable the EA Hotel to easily expand (i.e. by knocking through the wall and sharing the same kitchen, appliances, stock etc.) in the event it is successful enough to warrant it.)
Personally, I still very much think this is the best use of EA money on the margin, considering the low costs per person-year of work, hits-based giving, community building and network effects, and room for more funding (i.e. the current acute need for funding). Especially in the current situation, I think it's an outstanding opportunity for small/medium-sized donations to move the needle.
However, I'm at the stage where I'm having to consider losing my own financial independence / potential for investing in the future (including in EA things) if I want to give further financial support to the EA Hotel. And I'm not quite ready to do that.
As things stand I’m likely to give people notice soon to start paying rent or leave from 1 Dec. I feel that this could then cause a death spiral: people leaving, cost/person increasing, further people leaving because of that, etc. :(
The Sequestration scenario outlined here is well articulated and struck a chord with me: the EA Hotel’s struggle to gain support (and funding) from those at the centre of the movement* seems like it could be a symptom of it.
I don’t want to be seen as arguing for any position in the debate about whether and how much to prioritize those who appear most talented—a sufficiently nuanced writeup of my thoughts would distract from my main point here.
I would be interested to read your thoughts on this, and intend to write about it more (and how it fits in with the value proposition of the EA Hotel) myself at some point.
*Note: we have had a good amount of support from those in the periphery.