Replacement for PONR concept 2022-09-02T00:38:53.759Z
Growth of prediction markets over time? 2021-09-02T13:43:09.820Z
What 2026 looks like (Daniel's median future) 2021-08-07T05:14:35.718Z
DeepMind: Generally capable agents emerge from open-ended play 2021-07-27T19:35:08.662Z
Taboo "Outside View" 2021-06-17T09:39:12.385Z
Vignettes Workshop (AI Impacts) 2021-06-15T11:02:04.064Z
Fun with +12 OOMs of Compute 2021-03-01T21:04:16.532Z
Birds, Brains, Planes, and AI: Against Appeals to the Complexity/Mysteriousness/Efficiency of the Brain 2021-01-18T12:39:30.132Z
Against GDP as a metric for timelines and takeoff speeds 2020-12-29T17:50:04.176Z
Incentivizing forecasting via social media 2020-12-16T12:11:33.789Z
Is this a good way to bet on short timelines? 2020-11-28T14:31:46.235Z
Persuasion Tools: AI takeover without AGI or agency? 2020-11-20T16:56:52.687Z
How Roodman's GWP model translates to TAI timelines 2020-11-16T14:11:38.809Z
How can I bet on short timelines? 2020-11-07T12:45:46.192Z
What considerations influence whether I have more influence over short or long timelines? 2020-11-05T19:57:16.172Z
AI risk hub in Singapore? 2020-10-29T11:51:49.741Z
Relevant pre-AGI possibilities 2020-06-20T13:15:29.008Z
Evidence on good forecasting practices from the Good Judgment Project: an accompanying blog post 2019-02-15T19:14:41.459Z
Tiny Probabilities of Vast Utilities: Bibliography and Appendix 2018-11-20T17:34:02.854Z
Tiny Probabilities of Vast Utilities: Concluding Arguments 2018-11-15T21:47:58.941Z
Tiny Probabilities of Vast Utilities: Solutions 2018-11-14T16:04:14.963Z
Tiny Probabilities of Vast Utilities: Defusing the Initial Worry and Steelmanning the Problem 2018-11-10T09:12:15.039Z
Tiny Probabilities of Vast Utilities: A Problem for Long-Termism? 2018-11-08T10:09:59.111Z
Ongoing lawsuit naming "future generations" as plaintiffs; advice sought for how to investigate 2018-01-23T22:22:08.173Z
Anyone have thoughts/response to this critique of Effective Animal Altruism? 2016-12-25T21:14:39.612Z


Comment by kokotajlod on Taboo "Outside View" · 2022-09-28T11:41:48.152Z · EA · GW

That sounds like a useful technique. "Outside view" would be a good term for it if it weren't already being used to mean so many other things. :/ How about "neutral observer," "friend's advice," or "hypothetical friend"?

Comment by kokotajlod on Forecasting thread: How does AI risk level vary based on timelines? · 2022-09-15T02:06:55.814Z · EA · GW

Oops, accidentally voted twice on this. Didn't occur to me that the LW and EAF versions were the same underlying poll.

Comment by kokotajlod on EA is about maximization, and maximization is perilous · 2022-09-05T16:45:00.217Z · EA · GW

In addition to that, it's important not just that you actually have high integrity but that people believe you do. And people will be rightly hesitant to believe that you do if you go around saying that the morally correct thing to do is maximize expected utility, but don't worry, it's always and everywhere true that the way to maximize expected utility is to act as if you have high integrity. There are two strategies available, then: actually have high integrity, which means not being 100% a utilitarian/consequentialist, or carry out an extremely convincing deception campaign to fool people into thinking you have high integrity. I recommend the former, & if you attempt the latter, fuck you.

Comment by kokotajlod on EA is about maximization, and maximization is perilous · 2022-09-05T16:35:38.722Z · EA · GW

In practice, many of the utilitarians/consequentialists don't see the negative outcomes themselves, or at least sufficiently many of them don't that things will go to shit pretty quickly. (Relatedly, see the Unilateralists' Curse, the Epistemic Prisoner's Dilemma, and pretty much the entire literature of game theory, all those collective action problems...).

Comment by kokotajlod on Reasons I’ve been hesitant about high levels of near-ish AI risk · 2022-09-04T21:20:47.696Z · EA · GW

Hey, no need to apologize, and besides I wasn't even expecting a reply since I didn't ask a question. 

Your points 1 and 2 are good. I should have clarified what I meant by "people." I didn't mean everyone, I guess I meant something like "Most of the people who are likely to read this." But maybe I should be less extreme, as you mentioned, and exclude people who satisfy 1a+1b. Fair enough.

Re point 2: Yeah. I think your post is good; explicitly thinking about and working through feelings & biases etc. is an important complement to object-level thinking about a topic. I guess I was coming from a place of frustration with the way meta-level stuff seems to get more attention/clicks/discussion on forums like this than object-level analysis. At least that's how it seems to me. But on reflection I'm not sure my impression is correct; I feel like the ideal ratio of object-level to meta stuff should be 9:1 or so, and I haven't bothered to check; maybe we aren't that far off on this forum (on the subject of timelines).


Comment by kokotajlod on Replacement for PONR concept · 2022-09-02T00:39:06.656Z · EA · GW

Note that Crunch Time is different for different people & different paths-to-impact. For example, maybe when it comes to AI alignment, crunch time begins 1 year before powerbase ability, because that's when people are deciding which alignment techniques to use on the model(s) that will seize power if they aren't aligned, and the value of learning & growing in the years immediately prior is huge. Yet at the same time it could be that for AI governance crunch time begins 5 years before powerbase ability, because coordination of labs and governments gets exponentially harder the closer you get to powerbase ability as the race to AGI heats up, and the value of learning and growing in those last few years is relatively small since it's more about implementing the obvious things (labs should coordinate, slow down, invest more in safety, etc.)

Comment by kokotajlod on A Critique of AI Takeover Scenarios · 2022-08-31T16:26:23.306Z · EA · GW

Thanks for this critique! I agree this is an important subject that is relatively understudied compared to other aspects of the problem. As far as I can tell there just isn't a science of takeover; there's military science, there's the science of how to win elections in a democracy, and there's a bit of research and a few books on how to seize power in a dictatorship... but for such an important subject, it's unfortunate that there isn't a general study of how agents in multi-agent environments accumulate influence and achieve large-scale goals over long time periods.

I'm going to give my reactions below as I read:

These passages seem to imply that the rate of scientific progress is primarily limited by the number and intelligence level of those working on scientific research. It is not clear, however, that the evidence supports this.

I mean it's clearly more than JUST the number and intelligence of the people involved, but surely those are major factors! Piece of evidence: Across many industries, performance on important metrics (e.g. price) seems to predictably improve exponentially with investment/effort (this is called the experience curve effect). Another piece of evidence: AlphaFold 2.

Later you mention the gradual accumulation of ideas and cite the common occurrence of repeated independent discoveries. I think this is quite plausible. But note that a society of AIs would be thinking and communicating much faster than a society of humans, so the process of ideas gradually accumulating in their society would also be sped up.

First, though the actual model training was rapid, the entire process of developing Alpha Zero was far more protracted. Focusing on the day of training presents a highly misleading picture of the actual rate of progress of this particular example.

Sure, and similarly if AI R&D ability is like AI Go ability, there'll be a series of better and better AIs over the course of many years that gradually get better at various aspects of R&D, until one day an AI is trained that is better than the most brilliant genius scientists. I actually expect things to be slower and more smoothed out than this, probably, because training will take more like a year. This is all part of the standard picture of AI takeover, not an objection to it.

Second, Go is a fully-observable, discrete-time, zero-sum, two-player board game. 

I agree that the real world is more complex etc. and that just doing the same sort of self-play won't work. There may be more sophisticated forms of self-play that work, though. Also, you don't need self-play to be superhuman at something; e.g. you could use decision transformers + imitation learning.

These all take time to develop and put into place, which is why the development of novel technologies takes a long time. For example, the Lockheed Martin F-35 took about fifteen years from initial design to scale production. The Gerald R. Ford aircraft carrier took about ten years to build and fit out. Semiconductor fabrication plants cost billions of dollars, and the entire process from the design of a chip to manufacturing takes years. Given such examples, it seems reasonable to expect that even a nascent AGI would require years to design and build a functioning nanofactory. Doing so in secret or without outside interference would be even more difficult given all the specialised equipment, raw materials, and human talent that would be needed. A bunch of humans hired online cannot simply construct a nanofactory from nothing in a few months, regardless of how advanced is the AGI overseeing the process.

I'd be interested to hear your thoughts on this post which details a combination of "near-future" military technologies. Perhaps you'll agree that the technologies on this list could be built in a few months or years by a developed nation with the help of superintelligent AI? Then the crux would be whether this tech would allow that nation to take over the world. I personally think that military takeover scenarios are unlikely because there are much easier and safer methods, but I still think military takeover is at least on the table -- crazier things have happened in history. 

That said, I don't concede the point -- you are right that it would take modern humans many years to build nanofactories etc., but I don't think this is strong evidence that a superintelligence would also take many years. Consider video games and speedrunning. Even if speedrunners don't allow themselves to use bugs/exploits, they still usually go significantly faster than reasonably good players. Consider also human engineers building something they already understand well how to build vs. building something for the first time ever. The point is, if you are really smart and know what you are doing, you can do stuff much faster. You said that a lot of experimentation and experience is necessary -- well, maybe it's not. In general there's a tradeoff between smarts and experimentation/experience; if you have more of one you need less of the other to reach the same level of performance. Maybe if you crank up smarts to superintelligence level -- so intelligent that the best human geniuses seem a rounding error away from the average -- you can get away with orders of magnitude less experimentation/experience. Not for everything perhaps, but for some things. Suppose there are N crazy sci-fi technologies that an AI could use to get a huge advantage: nanofactories, fusion, quantum shenanigans, bioengineering ... All it takes is for 1 of them to be such that you can mostly substitute superintelligence for experimentation. And also you can still do experimentation, and you can do it much faster than humans do it too, because you know what you are doing. Instead of toying around until hypotheses gradually coalesce in your brain, you can begin with a million carefully crafted hypotheses consistent with all the evidence you've seen so far and an experimental regime designed to optimally search through the space of hypotheses as fast as possible.

I expect it to take somewhere between a day and five years to go from what you might call human-level AI to nanobot swarms. Perhaps this isn't that different from what you think? (Maybe you'd say something like 3 to 10 years?)

Relying on a ‘front man’ to serve as the face of the AGI would be highly dangerous, as the AGI would become dependent on this person for ensuring the loyalty of its followers. Of course one might argue that a combination of bribery and threats could be sufficient, but this is not the primary means by which successful leaders in history have obtained obedience and popularity, so an AGI limited to these tools would be at a significant disadvantage. Furthermore, an AGI reliant on control over money is susceptible to intervention by government authorities to freeze assets and hamper the transfer of funds. This would not be an issue if the AGI had control over its own territory, but then it would be subject to blockade and economic sanctions. For instance, it would take an AGI considerable effort to acquire the power of Vladimir Putin, and yet he is still facing considerable practical difficulties in exerting his will on his own (and neighbouring) populations without the intervention of the rest of the world. While none of these problems are necessarily insuperable, I believe they are significant issues that must be considered in an assessment of the plausibility of various AI takeover scenarios.

History has many examples of people ruling from behind the throne, so to speak. Often they have no official title whatsoever, but the people with the official titles are all loyal to them. Sometimes the people with the official titles do rebel and stop listening to the power behind the throne, and then said power behind the throne loses power. Other times, this doesn't happen.

AGI need not rule from behind the scenes though. If it's charismatic enough it can rule over a group of Blake Lemoines. Have you seen the movie Her? Did you find the behavior of the humans super implausible in that movie -- no way they would form personal relationships with an AI, no way they would trust it?

It is also unclear how an AGI would gain the skills needed to manipulate and manage large numbers of humans in the first place. It is by no means evident why an AGI would be constructed with this capability, or how it would even be trained for this task, which does not seem very amenable to traditional reinforcement learning approaches. In many discussions, an AGI is simply defined as having such abilities, but it is not explained why such skills would be expected to accompany general problem-solving or planning skills. Even if a generally competent AGI had instrumental reasons to develop such skills, would it have the capability of doing so? Humans learn social skills through years of interaction with other humans, and even then, many otherwise intelligent and wealthy humans possess such skills only to a minimal degree. Unless a credible explanation can be given as to how such an AI would acquire such skills or why they should necessarily follow from broader capabilities, I do not think it is reasonable to simply define an AGI as possessing them, and then assume this as part of a broader takeover narrative. This presents a major issue for takeover scenarios which rely on an AGI engaging large numbers of humans in its employment for the development of weapons or novel technologies.

It currently looks like most future AIs, and in particular AGIs, will have been trained on reading the whole internet & chatting to millions of humans over the course of several months. So, that's how they'll gain those skills.

(But also, if you are really good at generalizing to new tasks/situations, maybe manipulation of humans is one of the things you can generalize to. And if you aren't really good at generalizing to new tasks/situations, maybe you don't count as AGI.)

So far all I've done is critique your arguments but hopefully one day I'll have assembled some writing laying out my own arguments on this subject.

Anyhow, thanks again for writing this! I strongly disagree with your conclusions but I'm glad to see this topic getting serious & thoughtful attention.


Comment by kokotajlod on A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines. · 2022-08-22T22:05:11.167Z · EA · GW

OK. I'll DM Nuno.

Something about your characterization of what happened continues to feel unfair & inaccurate to me, but there's definitely truth in it & I think your advice is good so I will stop arguing & accept the criticism & try to remember it going forward. :)

Comment by kokotajlod on A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines. · 2022-08-21T19:57:25.953Z · EA · GW

Thanks for this thoughtful explanation & model.

(Aside: So, did I or didn't I come across as unfriendly/hostile? I never suggested that you said that, only that maybe it was true. This matters because I genuinely worry that I did & am thinking about being more cautious in the future as a result.)

So, given that I wanted to do both 1 and 2, would you think it would have been fine if I had just made them as separate comments, instead of mentioning 1 in passing in the thread on 2? Or do you think I really should have picked one to do and not done both?

The thing about changing my mind also resonates--that definitely happened to some extent during this conversation, because (as mentioned above) I didn't realize Nuno was talking about people who put lots of probability mass on the evolution anchor. For those people, a shift up or down by a couple OOMs really matters, and so the BOTEC I did about how the environment can probably be simulated for less than 10^41 FLOP needs to be held to a higher standard of scrutiny & could end up being judged insufficient.


Comment by kokotajlod on A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines. · 2022-08-20T21:51:36.896Z · EA · GW

Did I come across as unfriendly and hostile? I am sorry if so, that was not my intent.

It seems like you think I was strongly disagreeing with your claims; I wasn't. I upvoted your response and said basically "Seems plausible idk. Could go either way." 

And then I said that it doesn't really impact the bottom line much, for reasons XYZ. And you agree. 

But now it seems like we are opposed somehow even though we seem to basically be on the same page.

For context: I think I didn't realize until now that some people actually took the evolution anchor seriously as an argument for AGI by 2100, not in the sense I endorse (which is as a loose upper bound on our probability distribution over OOMs of compute) but in the much stronger sense I don't endorse (as an actual place to clump lots of probability mass around, naively extrapolating Moore's law across many decades to reach it). I think insofar as people are doing that naive thing I don't endorse, they should totally stop. And yes, as Nuno has pointed out, insofar as they are doing that naive thing, then they should really pay more attention to the environment cost as well as the brain-simulation cost, because it could maaaybe add a few OOMs to the estimate, which would push the extrapolated date of AGI back by decades or even centuries.

Comment by kokotajlod on A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines. · 2022-08-20T21:34:27.740Z · EA · GW

Huh, I guess I didn't realize how much weight some people put on the evolution anchor.  I thought everyone was (like me) treating it as a loose upper bound basically, not something to actually clump lots of probability mass on.

In other words: The people I know who were using the evolutionary anchor (people like myself, Ajeya, etc.) weren't using it in a way that would be significantly undermined by having to push the anchor up 6 OOMs or so. Like I said, it would be a minor change to the bottom line according to the spreadsheet. Insofar as people were arguing for AGI this century in a way which can be undermined by adding 6 OOMs to the evolutionary anchor then those people are silly & should stop, for multiple reasons, one of which is that maaaybe environmental simulation costs mean that the evolution anchor really is 6 OOMs bigger than Ajeya estimates.


Comment by kokotajlod on A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines. · 2022-08-18T01:56:00.398Z · EA · GW

Sorta? Like, yeah, suppose you have 10% of your probability mass on the evolution anchor. Well, that means that like maaaaybe in 2090 or so we'll have enough compute to recapitulate evolution, and so maaaaybe you could say you have 10% credence that we'll actually build AGI in 2090 using the recapitulate evolution method. But that assumes basically no algorithmic progress on other paths to AGI. But anyhow if you were doing that, then yes it would be a good  counterargument that actually even if we had all the compute in 2090 we wouldn't have the clock time because latency etc. would make it take dozens of years at least to perform this computation. So then (that component of) your timelines would shift out even farther.

I think this matters approximately zero, because it is a negligible component of people's timelines and it's far away anyway so making it move even farther away isn't decision-relevant.

Comment by kokotajlod on A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines. · 2022-08-17T21:20:44.135Z · EA · GW

If I understand you correctly, you are saying that the Evolution Anchor might not decrease in cost with time as fast as the various neural net anchors? Seems plausible to me, could also be faster, idk. I don't think this point undermines Ajeya's report though because (a) we are never going to get to the evolution anchor anyway, or anywhere close, so how fast it approaches isn't really relevant except in very long-timelines scenarios, and (b) her spreadsheet splits up algorithmic progress into different buckets for each anchor, so the spreadsheet already handles this nuance.

Comment by kokotajlod on A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines. · 2022-08-17T21:16:37.693Z · EA · GW

Well totally this thing would take a fuckton of wall-clock time etc. but that's not a problem, this is just a thought experiment -- "If we did this bigass computation, would it work?" If the answer is "Yep, 90% likely to work" then that means our distribution over OOMs should have 90% by +18.

Comment by kokotajlod on A concern about the “evolutionary anchor” of Ajeya Cotra’s report on AI timelines. · 2022-08-17T04:23:45.922Z · EA · GW

I did this analysis a while back, but it's worth doing again, let's see what happens:

If you are spending 1e25 FLOP per simulated second simulating the neurons of the creatures, you can afford to spend 4e24 FLOP per simulated second simulating the environment & it will just be a rounding error on your overall calculation so it won't change the bottom line. So the question is, can we make a sufficiently detailed environment for 4e24 FLOP per second?

There are 5e14 square meters on the surface of the earth according to wolfram alpha.

So that's about 1e10 FLOP per second per square meter available. So, you could divide the world into 10x10 meter squares and then have a 1e12 FLOP/s computer assigned to each square to handle the physics and graphics. If I'm reading this chart right, that's about what a fancy high-end graphics card can do. (Depends on whether you want double or single precision, I think.) That feels like probably enough to me; certainly you could have a very detailed physics simulation at least. Remember also that you can e.g. use a planet 1 OOM smaller than Earth but with more habitable regions, and also dynamically allocate compute so that you have more of it where your creatures are and don't waste as much simulating empty areas. Also, if you think this is still maybe not enough, IIRC Ajeya has error bars of like +/- SIX orders of magnitude on her estimate, so you can just add 3 OOMs no problem without really changing the bottom line that much.

It would be harder if you wanted to assign a graphics card to each nematode worm, instead of each chunk of territory. There are a lot of nematode worms or similar tiny creatures--Ajeya says 1e21 alive at any given point in time. So that would only leave you with 10,000 FLOP per second per worm to do the physics and graphics! If you instead wanted a proper graphics card for each worm you'd probably have to add 7 OOMs to that, getting you up to a 100 GFLOP card. This would be a bit higher than Ajeya estimated; it would be +25 OOMs more than GPT-3 cost instead of +18.

Personally I don't think the worms matter that much, so I think the true answer is more likely to be along the lines of "a graphics card per human-sized creature" which would be something like 10 billion graphics cards which would let you have 5e14 FLOP per card which would let you create some near-photorealistic real time graphics for each human-sized creature. 

Then there's also all the various ways in which we could optimize the evolutionary simulation e.g. as described here. I wouldn't be surprised if this shaves off 6 OOMs of cost.
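The arithmetic in the BOTEC above can be checked in a few lines. This is a sketch using the comment's own rough figures (1e25 FLOP/s for brains, 4e24 FLOP/s for environment, 5e14 m² of surface, 1e21 worms), not measured data:

```python
# Back-of-envelope check of the environment-simulation figures above.
# All inputs are the comment's rough estimates, not measured data.

neuron_sim = 1e25          # FLOP per simulated second for the creatures' brains
env_budget = 0.4 * neuron_sim  # 4e24 FLOP/s left over for the environment

earth_surface_m2 = 5e14    # Earth's surface area (Wolfram Alpha figure)

per_m2 = env_budget / earth_surface_m2   # ~8e9, i.e. roughly 1e10 FLOP/s per m^2
per_square = per_m2 * 100                # ~8e11 FLOP/s per 10x10 m square

nematodes = 1e21           # Ajeya's count of tiny creatures alive at once
per_worm = env_budget / nematodes        # ~4e3 FLOP/s per worm ("10,000" after generous rounding)

print(f"per m^2: {per_m2:.0e}, per square: {per_square:.0e}, per worm: {per_worm:.0e}")
```

Note the per-worm figure comes out nearer 4,000 than 10,000 FLOP/s; either way the conclusion (a tiny budget per worm, a plausible budget per chunk of territory) is unchanged.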


Comment by kokotajlod on The animals and humans analogy for AI risk · 2022-08-13T18:53:44.226Z · EA · GW

In the relevant sense of "create," humans will not create the AIs that disempower humanity. Let me explain.

If we built AI using ordinary software ("Good Old Fashioned AI") then it would be a big file of code, every line of which would have been put there on purpose by someone & probably annotated with an explanation of why it was there and how it works. And at higher levels of abstraction, the system would also be interpretable/transparent, because the higher-level structure of the system would also have been deliberately chosen by some human or group of humans.

But instead we're going to use deep learning / artificial neural networks. These are basically meta-programs that perform a search process over possible object-level programs (circuits), until they find one that performs perfectly on the training environment. So the trained neural net -- the circuit that pops out at the end of the training process -- is a tangled spaghetti mess of complex, uber-efficient structure, that is very good at scoring highly in the training environment. But that's all we know about it; we don't know how or why it works. 

If this situation is still approximately true when we get to AGI -- if we create AGI by searching for it rather than by building it -- then we really won't have much control over how it behaves once it gets powerful enough to disempower us, because we won't know what it's thinking or why it does what it does.
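The build-vs-search distinction above can be illustrated with a toy example (my own sketch, not from the comment: a one-parameter "model" found by brute-force search stands in for a trained neural net):

```python
# Toy contrast between a program you *build* and one you *search for*.

# Built: every line is there on purpose and can be explained.
def is_positive_built(x):
    return x > 0

# Searched-for: sweep candidate "weights" and keep whichever scores best on
# the training data. The winning weight works, but nothing in the process
# explains *why* it works -- a stand-in for an opaque trained network.
def train(data):
    best_w, best_score = 0.0, -1
    for i in range(-20, 21):
        w = i / 10
        score = sum((w * x > 0) == label for x, label in data)
        if score > best_score:
            best_w, best_score = w, score
    return best_w

data = [(x, x > 0) for x in range(-5, 6) if x != 0]
w = train(data)

def is_positive_searched(x):
    return w * x > 0
```

Both functions behave identically on the training data, but only the first carries its own explanation; scaled up by many orders of magnitude, that gap is the interpretability problem the comment describes.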


Comment by kokotajlod on Why does no one care about AI? · 2022-08-08T02:16:12.459Z · EA · GW

The stuff in the news does not equal the stuff that's actually most important to learn and talk about. Instead, it's some combination of (a) stuff that sells clicks/eyeballs/retweets, and (b) the hobbyhorses of journalists and editors. Note that journalists usually have an extremely shallow understanding of what they write about, much more shallow than you'd like to believe. (See: Gell-Mann amnesia.)

(Think about the incentives. If there were an organization that was dedicated to sifting through all the world's problems and sorting them by their all-things-considered importance to the world, and then writing primarily about the problems judged to be maximally important... are you imagining that this organization's front page would be wildly popular, would become the new New York Times or Fox News or whatever and have hundreds of millions of viewers? No, it would be a niche thing. People would disagree with its conclusions about what's important, often without even having given them more than two seconds' thought, and even the people who agree with its conclusions might be bored and go read other publications instead.)

The stuff on the agendas of governments, unfortunately, also does not equal the stuff that's actually most important. Politicians have arguably less of an understanding of most things than journalists.


Comment by kokotajlod on AI timelines via bioanchors: the debate in one place · 2022-08-01T16:36:11.617Z · EA · GW

Thanks for compiling this list! I humbly suggest that Fun With +12 OOMs be added in under my name, since it's the closest to a public rebuttal I've written to Ajeya's report.

Comment by kokotajlod on Interesting vs. Important Work - A Place EA is Prioritizing Poorly · 2022-07-29T10:35:16.041Z · EA · GW

Fair enough!

Comment by kokotajlod on Interesting vs. Important Work - A Place EA is Prioritizing Poorly · 2022-07-29T02:09:22.753Z · EA · GW

I don't even know that it's more interesting. What's interesting is different for different people, but if I'm honest with myself I probably find timelines forecasting more interesting than decision theory, even though I find decision theory pretty damn interesting. 

Comment by kokotajlod on Interesting vs. Important Work - A Place EA is Prioritizing Poorly · 2022-07-28T13:45:37.121Z · EA · GW

He more recently mentioned that he noticed “people continuously vanishing higher into the tower,” that is, focusing on more abstract and harder to evaluate issues, and that very few people have done the opposite. One commenter, Ben Weinstein-Raun, suggested several reasons, among them that longer-loop work is more visible, and higher status. 

I disagree that longer-loop work is more visible and higher status, I think the opposite is true. In AI, agent foundations researchers are less visible and lower status than prosaic AI alignment researchers, who are less visible and lower status than capabilities researchers. In my own life, I got a huge boost of status  & visibility when I did less agent foundationsy stuff and more forecasting stuff (timelines, takeoff speeds, predicting ML benchmarks, etc.).

Comment by kokotajlod on Reasons I’ve been hesitant about high levels of near-ish AI risk · 2022-07-25T11:27:51.716Z · EA · GW

I beg people to think for themselves on this issue instead of making their decision about what to believe mainly on the basis of deference and bias-correction heuristics. Yes, you can't think for yourself about every issue, there just isn't enough time in the day. But you should cultivate the habit of thinking for yourself about some issues at least, and I say this should be one of them. 

Comment by kokotajlod on Without specific countermeasures, the easiest path to transformative AI likely leads to AI takeover · 2022-07-19T21:58:25.285Z · EA · GW

Can you say more about why you think this? Both why you think there's 0 chance of HFDT leading to a system that can evaluate whether ideas are good and generate creative new ideas, and why you think this is what the majority of ML researchers think?

(I've literally never met a ML researcher with your view before to my knowledge, though I haven't exactly gone around asking everyone I know & my environment is of course selected against people with your view since I'm at OpenAI.)

Comment by kokotajlod on Why AGI Timeline Research/Discourse Might Be Overrated · 2022-07-04T10:09:38.164Z · EA · GW

What about distinguishing 50% by 2050 vs. 50% by 2027?

Comment by kokotajlod on Fanatical EAs should support very weird projects · 2022-07-01T10:42:04.234Z · EA · GW

Ooof, yeah, I hadn't thought about the solipsism possibility before. If the math checks out then I'll keep my bounded utility function but also maybe add in some nonconsequentialist-ish stuff to cover this case and cases like it. (or, you can think of it as just specifying that the utility function should assign significant negative utility to you doing unvirtuous acts like this.)

That said, I'm skeptical that the math works out for this example. Just because the universe is very big doesn't mean we are very near the bound. We'd only be very near the bound if the universe was both very big and very perfect, i.e. suffering, injustice, etc. all practically nonexistent as a fraction of things happening.

So we are probably nowhere near either end of the bound, and the question is how much difference saving one child makes in a very big universe.

For reasons related to noncausal decision theory, the answer is "a small but non-negligible fraction of all the things that happen in this universe depend on what you do in this case. If you save the child, people similar to you in similar situations all across the multiverse will choose to save similar children (or alien children, or whatever)."

The question is whether that small but non-negligible positive impact is outweighed by the maybe-solipsism-is-true-and-me-enjoying-this-ice-cream-is-thus-somewhat-important possibility.

Intuitively it feels like the answer is "hell no" but it would be good to see a full accounting. I agree that if the full accounting says the answer is "yes" then that's a reductio.

Note that the best possible solipsistic world is still vastly worse than the best possible big world.

(Oops, didn't realize you were the same person that talked to me about the sequence, shoulda put two and two together, sorry!)


Comment by kokotajlod on Fanatical EAs should support very weird projects · 2022-06-30T16:01:17.161Z · EA · GW

My own thoughts on this subject.

Also relevant: Impossibility results for unbounded utility functions.

You say:

Isaacs, Beckstead & Thomas, and Wilkinson point out how weird it would be to adopt a complete and consistent decision theory that wasn't fanatical. It would involve making arbitrary distinctions between minute differences of the probability of different wagers or evaluating packages of wagers differently than one evaluates the sum of the wagers individually. Offered enough wagers, non-fanatics must make some distinctions that they will be very hard-pressed to justify.

Idk, bounded utility functions seem pretty justifiable to me.* Just slap diminishing returns on everything. Yes, more happy lives are good, but if you already have a googolplex of them, it's not so morally important to make more. Etc. As for infinities, well, I think we need a measure over infinities anyway, so let's say that our utility function is bounded by 1 and -1, with 1 being the case where literally everything that happens across infinite spacetime is as good as possible--the best possible world--and -1 is the opposite, and in between we have various cases in which good things happen with some measure and bad things happen with some measure.

*I totally feel the awkwardness/counterintuitiveness in certain cases, as the papers you link point out. E.g. when it's about suffering. But it feels much less bad than the problems with unbounded utility functions. As you say, it seems like people with unbounded utility functions should be fanatical (or paralyzed, I'd add) and fanatics... well, no one I know is willing to bite the bullet and actually start doing absurdist research in earnest. People might claim, therefore, to have unbounded utility functions, but I doubt their claims.
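A toy sketch of the kind of bounded utility function I have in mind (my own illustration, with tanh as an arbitrary choice of squashing function -- not something from the papers linked above):

```python
import math

def bounded_utility(net_goodness):
    """Squash an unbounded 'net goodness' score into (-1, 1).

    tanh is an arbitrary choice; any strictly increasing, bounded
    function with diminishing returns illustrates the same point.
    +1 corresponds to the best possible world, -1 to the worst.
    """
    return math.tanh(net_goodness)

# Diminishing returns: the same increment of goodness matters far
# less once you already have a lot of it.
gain_early = bounded_utility(1.0) - bounded_utility(0.0)
gain_late = bounded_utility(100.0) - bounded_utility(99.0)
assert gain_early > gain_late

# The bound tames fanaticism: a wager with probability p of an
# astronomically good outcome has expected utility at most p * 1,
# so a modest sure improvement can beat it.
p = 1e-30
assert bounded_utility(0.1) > p * 1.0
```

With an unbounded utility function the last comparison can flip for a sufficiently large payoff, which is exactly the fanaticism at issue.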

Comment by kokotajlod on On Deference and Yudkowsky's AI Risk Estimates · 2022-06-20T16:24:36.304Z · EA · GW

I think that insofar as people are deferring on matters of AGI risk etc., Yudkowsky is in the top 10 people in the world to defer to based on his track record, and arguably top 1. Nobody who has been talking about these topics for 20+ years has a similarly good track record. If you restrict attention to the last 10 years, then Bostrom does and Carl Shulman and maybe some other people too (Gwern?), and if you restrict attention to the last 5 years then arguably about a dozen people have a somewhat better track record than him. 

(To my knowledge. I think I'm probably missing a handful of people who I don't know as much about because their writings aren't as prominent in the stuff I've read, sorry!)

He's like Szilard. Szilard wasn't right about everything (e.g. he predicted there would be a war and the Nazis would win) but he was right about a bunch of things including that there would be a bomb, that this put all of humanity in danger, etc. and importantly he was the first to do so by several years.

I think if I were to write a post cautioning people against deferring to Yudkowsky, I wouldn't talk about his excellent track record but rather about his arrogance, inability to clearly explain his views and argue for them (at least on some important topics, he's clear on others), seeming bias towards pessimism, ridiculously high (and therefore seemingly overconfident) credences in things like p(doom), etc. These are the reasons I would reach for (and do reach for) when arguing against deferring to Yudkowsky.

[ETA: I wish to reemphasize, but more strongly, that Yudkowsky seems pretty overconfident not just now but historically. Anyone deferring to him should keep this in mind; maybe directly update towards his credences but don't adopt his credences. E.g. think "we're probably doomed" but not "99% chance of doom." Also, Yudkowsky doesn't seem to be listening to others and understanding their positions well. So his criticisms of other views should be listened to but not deferred to, IMO.]

Comment by kokotajlod on On Deference and Yudkowsky's AI Risk Estimates · 2022-06-20T16:05:46.146Z · EA · GW

Oops! Dunno what happened, I thought it was not yet posted. (I thought I had posted it at first, but then I looked for it and didn't see it & instead saw the unposted draft, but while I was looking for it I saw Richard's post... I guess it must have been some sort of issue with having multiple tabs open. I'll delete the other version.)

Comment by kokotajlod on On Deference and Yudkowsky's AI Risk Estimates · 2022-06-20T04:38:22.003Z · EA · GW

Re gradations of agency: Level 3 and level 4 seem within reach IMO. IIRC there are already some examples of neural nets being trained to watch other actors in some simulated environment and then imitate them. Also, model-based planning (i.e. level 4) is very much a thing, albeit something that human programmers seem to have to hard-code. I predict that within 5 years there will be systems which are unambiguously in level 3 and level 4, even if they aren't perfect at it (hey, we humans aren't perfect at it either).

Comment by kokotajlod on On Deference and Yudkowsky's AI Risk Estimates · 2022-06-20T04:23:03.977Z · EA · GW

Beat me to it & said it better than I could. 

My now-obsolete draft comment was going to say:

It seems to me that between about 2004 and 2014, Yudkowsky was the best person in the world to listen to on the subject of AGI and AI risks. That is, deferring to Yudkowsky would have been a better choice than deferring to literally anyone else in the world. Moreover, after about 2014 Yudkowsky would probably have been in the top 10; if you are going to choose 10 people to split your deference between (which I do not recommend, I recommend thinking for oneself), Yudkowsky should be one of those people and had you dropped Yudkowsky from the list in 2014 you would have missed out on some important stuff. Would you agree with this?

On the positive side, I'd be interested to see a top ten list from you of people you think should be deferred to as much or more than Yudkowsky on matters of AGI and AI risks.*

*What do I mean by this? Idk, here's a partial operationalization: Timelines, takeoff speeds, technical AI alignment, and p(doom).

[ETA: lest people write me off as a Yudkowsky fanboy, I wish to emphasize that I too think people are overindexing on Yudkowsky's views, I too think there are a bunch of people who defer to him too much, I too think he is often overconfident, wrong about various things, etc.]

[ETA: OK, I guess I think Bostrom probably was actually slightly better than Yudkowsky even on 20-year timespan.]

[ETA: I wish to reemphasize, but more strongly, that Yudkowsky seems pretty overconfident not just now but historically. Anyone deferring to him should keep this in mind; maybe directly update towards his credences but don't adopt his credences. E.g. think "we're probably doomed" but not "99% chance of doom." Also, Yudkowsky doesn't seem to be listening to others and understanding their positions well. So his criticisms of other views should be listened to but not deferred to, IMO.]

Comment by kokotajlod on On Deference and Yudkowsky's AI Risk Estimates · 2022-06-19T23:33:32.669Z · EA · GW

I don't defer much myself on these matters (to anyone) and I don't recommend other people do. In fact I think that if people deferred less and read & thought through the arguments themselves instead, more people in the broad EA community would update closer to Yudkowsky's position than away from it. That's what happened to me.

But that said:

It seems to me that between about 2004 and 2014, Yudkowsky was the best person in the world to listen to on the subject of AGI and AI risks. That is, deferring to Yudkowsky would have been a better choice than deferring to literally anyone else in the world. Moreover, after about 2014 Yudkowsky would probably have been in the top 10; if you are going to choose 10 people to split your deference between (which I do not recommend, I recommend thinking for oneself), Yudkowsky should be one of those people and had you dropped Yudkowsky from the list in 2014 you would have missed out on some important stuff. Would you agree with this?

On the positive side, I'd be interested to see a top ten list from you of people you think should be deferred to as much or more than Yudkowsky on matters of AGI and AI risks.*

*What do I mean by this? Idk, here's a partial operationalization: Timelines, takeoff speeds, technical AI alignment, and p(doom).

Comment by kokotajlod on Deference Culture in EA · 2022-06-09T06:16:34.777Z · EA · GW

FWIW I agree that EAs should probably defer less on average.  So e.g. I agree with your point 5.

I don't like the example you gave about MIRI -- I think filter bubbles & related issues are real problems but distinct from deference; nothing in the example you gave seems like deference to me. (Also, in my experience the people from MIRI defer less than pretty much anyone in EA. If anyone is deferring too little, it's them.)

Comment by kokotajlod on Deference Culture in EA · 2022-06-09T00:42:16.283Z · EA · GW

Yeah, I agree with that. On the margin I think more EAs should defer less. I've been frustrated with this in particular on topics I know a lot about, such as AI timelines.

Comment by kokotajlod on Deference Culture in EA · 2022-06-08T02:23:06.875Z · EA · GW

Agreed--except that on the margin I'd rather encourage EAs to defer less than more. :) But of course some should defer less, and others more, and also it depends on the situation, etc. etc.

Comment by kokotajlod on Deference Culture in EA · 2022-06-07T22:28:12.757Z · EA · GW

EA has a high deference culture? Compared to what other cultures? Idk but I feel like the difference between EA and other groups of people I've been in (grad students, City Year people, law students...) may not be that EAs defer more on average but rather that they are much more likely to explicitly flag when they are doing so. In EA the default expectation is that you do your own thinking and back up your decisions and claims with evidence*, and deference is a legitimate source of evidence so people cite it. But in other communities people would just say "I think X" or "I'm doing X" and not bother to explain why (and perhaps not even know why, because they didn't really think that much about it).

*Other communities have this norm too, I think, but not to the same extent.

Comment by kokotajlod on Michael Nielsen's "Notes on effective altruism" · 2022-06-05T03:41:40.361Z · EA · GW

In contrast to a bear attack, you don't expect to know that the "period of stress" has ended during your lifetime.

I expect to know this. Either AI will go well and we'll get the glorious transhuman future, or it'll go poorly and we'll have a brief moment of realization before we are killed, etc. (or, more realistically, a longer moment of awareness where we realize all is truly and thoroughly lost, before eventually the nanobots or whatever come for us).


Comment by kokotajlod on Replicating and extending the grabby aliens model · 2022-05-06T16:47:08.823Z · EA · GW

My bullish prior (which has a priori has 80% credence in us not being alone) with SIA and the assumption that grabby aliens are hiding gives a median of ~ chance in a grabby civilization reaching us in the next 1000 years.

Don't you mean 1-that?

Comment by kokotajlod on What examples are there of (science) fiction predicting something strange/bad, which then happened? · 2022-04-27T20:46:35.500Z · EA · GW

(Typo: You say the US when you meant the USSR.)

Comment by kokotajlod on Tiny Probabilities of Vast Utilities: Defusing the Initial Worry and Steelmanning the Problem · 2022-04-24T08:42:29.187Z · EA · GW

The philosophy literature has stuff on this. If I recall correctly I linked some of it in the bibliography of this post. It's been a while since I thought about this, I'm afraid, so I don't have references in memory. Probably you should search the Stanford Encyclopedia of Philosophy for the "Pasadena game" and the "St. Petersburg paradox."
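For context, the St. Petersburg paradox is the classic example of a gamble with infinite expected value. A quick sketch (standard textbook material, not taken from the bibliography):

```python
def st_petersburg_partial_ev(n_terms):
    """Partial expected value of the St. Petersburg game.

    A fair coin is flipped until it lands heads; if the first
    heads comes on flip n, the payout is 2**n. Each term of the
    expected-value sum contributes (2**-n) * (2**n) = 1, so the
    partial sums grow without bound and the full EV diverges.
    """
    return sum((0.5 ** n) * (2 ** n) for n in range(1, n_terms + 1))

# n terms give an expected value of exactly n.
assert st_petersburg_partial_ev(10) == 10.0
assert st_petersburg_partial_ev(1000) == 1000.0
```

Any finite ticket price is below the game's expected value, yet few people would pay much to play -- the tension this literature tries to resolve.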

Comment by kokotajlod on Tiny Probabilities of Vast Utilities: Defusing the Initial Worry and Steelmanning the Problem · 2022-04-22T20:00:04.784Z · EA · GW

Welcome to the fantastic world of philosophy, friend! :) If you are like me you will enjoy thinking and learning more about this stuff. Your mind will be blown many times over.

I do in fact think that utilitarianism as normally conceived is just wrong, and one reason why it is wrong is that it says every action is equally choiceworthy because they all have undefined expected utility.

But maybe there is a way to reconceive utilitarianism that avoids this problem. Maybe.

Personally I think you might be interested in thinking about metaethics next. What do we even mean when we say something matters, or something is good? I currently think that it's something like "what I would choose, if I was idealized in various ways, e.g. if I had more time to think and reflect, if I knew more relevant facts, etc."

Comment by kokotajlod on “Biological anchors” is about bounding, not pinpointing, AI timelines · 2022-04-21T14:32:18.568Z · EA · GW

I'm a fan of lengthy asynchronous intellectual exchanges like this one, so no need to apologize for the delay. I hope you don't mind my delay either? As usual, no need to reply to this message.

If we condition on not having extreme capabilities for persuasion or research/engineering, I’m quite skeptical that something in the "business/military/political strategy" category is a great candidate to have transformative impact on its own.

I think I agree with this.

Re: quantification: I agree; currently I don't have good metrics to forecast on, much less good forecasts, for persuasion stuff and AI-PONR stuff. I am working on fixing that problem. :)

Re persuasion: For the past two years I have agreed with the claims made in "The misinformation problem seems like misinformation."(!!!) The problem isn't lack of access to information; information is more available than it ever was before. Nor is the problem "fake news" or other falsehoods. (Most propaganda is true.) Being politically polarized and extremist correlates positively with being well-informed, not negatively! (Anecdotally, my grad school friends with the craziest/most-extreme/most-dangerous/least-epistemically-virtuous political beliefs were generally the people best informed about politics. Analogous to how 9/11 truthers will probably know a lot more about 9/11 than you or me.) This is indeed an epistemic golden age... for people who are able to resist the temptations of various filter bubbles and the propaganda of various ideologies. (And everyone thinks themself one such person, so everyone thinks this is an epistemic golden age for them.)

I do disagree with your claim that this is currently an epistemic golden age. I think it's important to distinguish between ways in which it is and isn't. I mentioned above a way that it is.

If we made a chart of some number capturing "how easy it is to convince key parts of society to recognize and navigate a tricky novel problem" ... since the dawn of civilization, what would that chart look like? My guess is that it would be pretty chaotic; that it would sometimes go quite low and sometimes go quite high

Agreed. I argued this, in fact.

 and that it would be very hard to predict the impact of a given technology or other development on epistemic responsiveness. 

Disagree. I mean, I don't know, maybe this is true. But I feel like we shouldn't just throw our hands up in the air here; we haven't even tried! I've sketched an argument for why we should expect epistemic responsiveness to decrease in the near future (propaganda and censorship are bad for epistemic responsiveness & they are getting a lot cheaper and more effective & no pro-epistemic-responsiveness force seems to be rising to counter it).

Maybe there have been one-off points in history when epistemic responsiveness was very high; maybe it is much lower today compared to peak, such that someone could already claim we have passed the "point of no return"; maybe "persuasion AI" will drive it lower or higher, depending partly on who you think will have access to the biggest and best persuasion AIs and how they will use them. 

Agreed. I argued this, in fact. (Note: "point of no return" is a relative notion; it may be that relative to us in 2010 the point of no return was e.g. the founding of OpenAI, and nevertheless relative to us now the point of no return is still years in the future.)

So I think even if we grant a lot of your views about how much AI could change the "memetic environment," it's not clear how this relates to the "point of no return."

The conclusion I built was "We should direct more research effort at understanding and forecasting this stuff because it seems important." I think that conclusion is supported by the above claims about the possible effects of persuasion tools.

What has/had higher ex ante probability of leading to a dramatic change in the memetic environment: further development of AI language models that could be used to write more propaganda, or the recent (last 20 years) explosion in communication channels and data, or many other changes over the last few hundred years such as the advent of radio and television, or the change in business models for media that we're living through now? This comparison is intended to be an argument both that "your kind of reasoning would've led us to expect many previous persuasion-related PONRs without needing special AI advances" and that "if we condition on persuasion-related PONRs being the big thing to think about, we shouldn't necessarily be all that focused on AI."

Good argument. To hazard a guess:
1. Explosion in communication channels and data (i.e. the Internet + Big Data)
2. AI language models useful for propaganda and censorship
3. Advent of radio and television
4. Change in business models for media

However I'm pretty uncertain about this, I could easily see the order being different. Note that from what I've heard the advent of radio and television DID have a big effect on public epistemology; e.g. it partly enabled totalitarianism. Prior to that, the printing press is argued to have also had disruptive effects.

This is why I emphasized elsewhere that I'm not arguing for anything unprecedented. Public epistemology / epistemic responsiveness has waxed and waned over time and has occasionally gotten extremely bad (e.g. in totalitarian regimes and the freer societies that went totalitarian) and so we shouldn't be surprised if it happens again and if someone has an argument that it might be about to happen again it should be taken seriously and investigated. (I'm not saying you yourself need to investigate this, you probably have better things to do.) Also I totally agree that we shouldn't just be focused on AI; in fact I'd go further and say that most of the improvements in propaganda+censorship will come from non-AI stuff like Big Data. But AI will help too; it seems to make censorship a lot cheaper for example.

I'd be interested in seeing literature on how big an effect size you can get out of things like focus groups and A/B testing. My guess is that going from completely incompetent at persuasion (e.g., basically modeling your audience as yourself, which is where most people start) to "empirically understanding and incorporating your audience's different-from-you characteristics" causes a big jump from a very low level of effectiveness, but that things flatten out quickly after that, and that pouring more effort into focus groups and testing leads to only moderate effects, such that "doubling effectiveness" on the margin shouldn't be a very impressive/scary idea.

  • I think most media is optimizing for engagement rather than persuasion, and that it's natural for things to continue this way as AI advances. Engagement is dramatically easier to measure than persuasion, so data-hungry AI should help more with engagement than persuasion; targeting engagement is in some sense "self-reinforcing" and "self-funding" in a way that targeting persuasion isn't (so persuasion targeters need some sort of subsidy to compete with engagement targeters); and there are norms against targeting persuasion as well. I do expect some people and institutions to invest a lot in persuasion targeting (as they do today), but my modal expectation does not involve it becoming pervasive on nearly all websites, the way yours seems to.
  • I feel like a lot of today's "persuasion" is either (a) extremely immersive (someone is raised in a social setting that is very committed to some set of views or practices); or (b) involves persuading previously-close-to-indifferent people to believe things that call for low-cost actions (in many cases this means voting and social media posting; in some cases it can mean more consequential, but still ultimately not-super-high-personal-cost, actions). (b) can lead over time to shifting coalitions and identities, but the transition from (b) to (a) seems long.
  • I particularly don't feel that today's "persuaders" have much ability to accomplish the things that you're pointing to with "chatbots," "coaches," "Imperius curses" and "drugs." (Are there cases of drugs being used to systematically cause people to make durable, sustained, action-relevant changes to their views, especially when not accompanied by broader social immersion?)

These are all good points. This is exactly the sort of thing I wish there was more research into, and that I'm considering doing more research on myself.

Re: pervasiveness on almost all websites: Currently propaganda and censorship both seem pretty widespread and also seem to be on a trend of becoming more so. (The list of things that get censored is growing, not shrinking, for example.) This is despite the fact that censorship is costly and so theoretically platforms that do it should be outcompeted by platforms that just maximize engagement. Also, IIRC Facebook uses large language models to do the censoring more efficiently and cheaply, and I assume the other companies do too. As far as I know they aren't measuring user opinions and directly using that as a feedback signal, thank goodness, but... is it that much of a stretch to think that they might? It's only been two years since GPT-3.

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-14T22:50:21.688Z · EA · GW

Oh nice, thanks, this is the sort of thing I was looking for!

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-13T16:52:05.872Z · EA · GW

Fair enough! I'm grateful for all the work you've already done and don't think it's your job to do more research in the areas that would be more convincing to me.

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-12T23:56:30.262Z · EA · GW

Agreed. Though that was a century ago & with different governments, as you pointed out elsewhere. Also no nukes; presumably nukes make escalation less likely than it was then (I didn't realize this until now when I just read the wiki--apparently the Allies almost declared war on the Soviet Union due to the invasion of Finland!).

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-12T23:38:13.292Z · EA · GW

OK, thanks!

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-12T23:34:44.233Z · EA · GW

Ah, that's a good argument, thanks! Updating downwards.

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-12T23:32:33.931Z · EA · GW

That makes sense. 2 OOMs is clearly too high now that you mention it. But I stand by my 1 OOM claim, until people convince me that this is really much more like an ordinary business-as-usual month than I currently think it is. Which could totally happen! I am not by any means an expert on this stuff; this is just my hot take!

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-12T23:30:40.264Z · EA · GW

That's why it was my upper bound. I too think it's pretty implausible. How would you feel about a bet on the +1 OOM odds?

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-12T16:37:07.107Z · EA · GW

To be clear I haven't thought about this nearly as much as you. I just think that this is clearly an unusually risky month and so the number should be substantially higher than for an average month.

Comment by kokotajlod on Samotsvety Nuclear Risk Forecasts — March 2022 · 2022-03-12T16:32:56.708Z · EA · GW

Seems less mild than the invasion of Georgia and the Armenia-Azerbaijan war. But milder than the Cuban Missile Crisis, for sure. Anyhow, roughly how many examples do you think you'd want to pick for the reference class? If it's a dozen or so, then I think you'd get a substantially higher base rate.