Posts

AGI and Lock-In 2022-10-29T01:56:10.177Z
Truthful AI 2021-10-20T15:11:10.363Z
Quantifying anthropic effects on the Fermi paradox 2019-02-15T10:47:04.239Z

Comments

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-12-05T22:25:25.711Z · EA · GW

Depends on how much of their data they'd have to back up like this. If every bit ever produced or operated on instead had to be 25 bits — that seems like a big fitness hit. But if they're only this paranoid about a few crucial files (e.g. the minds of a few decision-makers), then that's cheap.

And there's another question about how much stability contributes to fitness. In humans, cancer tends to not be great for fitness. Analogously, it's possible that most random errors in future civilizations would look less like slowly corrupting values and more like a coordinated whole splintering into squabbling factions that can easily be conquered by a unified enemy. If so, you might think that an institution that cared about stopping value-drift and an institution that didn't would both have a similarly large interest in preventing random errors.

Also, by the same token, even if there is a "singleton" at some relatively early time, mightn't it prefer to take on a non-negligible risk of value drift later in time if it means being able to, say, 10x its effective storage capacity in the meantime?

The counter-argument is that it will be super rich regardless, so it seems like satiable value systems would be happy to spend a lot on preventing really bad events from happening with small probability. Whereas insatiable value systems would notice that most resources are in the cosmos, and so also be obsessed with avoiding unwanted value drift. But yeah, if the values contain a pure time preference, and/or don't care that much about the most probable types of value drift, then it's possible that they wouldn't deem the investment worth it.

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-12-05T00:05:53.407Z · EA · GW

This is a great question. I think the answer depends on the type of storage you're doing.

If you have a totally static lump of data that you want to encode in a harddrive and not touch for a billion years, I think the challenge is mostly in designing a type of storage unit that won't age. Digital error correction won't help if your whole magnetism-based harddrive loses its magnetism. I'm not sure how hard this is.

But I think more realistically, you want to use a type of hardware that you regularly use, regularly service, and where you can copy the information to a new harddrive when one is about to fail. So I'll answer the question in that context.

As an error rate, let's use the failure rate of 3.7e-9 per byte per month ~= 1.5e-11 per bit per day from this stack overflow reply.  (It's for RAM, which I think is more volatile than e.g. SSD storage, and certainly not optimised for stability, so you could probably get that down a lot.)

Let's use the following as an error correction method: Each bit is represented by N bits; for any computation the computer does, it will use the majority vote of the N bits; and once per day,[1] each bit is reset to the majority vote of its group of bits.

If so...

  • for N=1, the probability that a bit is stable for 1e9 years is ~exp(-1.5e-11*365*1e9)=0.4%. Yikes!
  • for N=3, the probability that 2 bit flips happen in a single day is ~3*(1.5e-11)^2 and so the probability that a group of bits is stable for 1e9 years is ~exp(-3*(1.5e-11)^2*365*1e9)=1-2e-10. Much better, but there will probably still be a million errors  in that petabyte of data.
  • for N=5, the probability that 3 bit flips happen in a single day is ~(5 choose 3)*(1.5e-11)^3 and so the probability that the whole petabyte of data is safe for 1e9 years is ~99.99%. And so on this scheme, it seems that 5 petabytes of storage is enough to make 1 petabyte stable for a billion years. (Quick numerical check below.)
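As a quick sanity check of those numbers, here's a sketch in Python (using the assumed 1.5e-11 per-bit-per-day rate and the same leading-order binomial approximation; the variable names are mine):

```python
from math import comb, exp

p = 1.5e-11        # assumed per-bit flip probability per day (RAM figure above)
days = 365 * 1e9   # a billion years of daily majority-vote resets
n_groups = 8e15    # one petabyte of data, with one group of N physical bits per data bit

for n in (1, 3, 5):
    k = n // 2 + 1                          # flips needed in one day to corrupt the majority
    group_fail_per_day = comb(n, k) * p**k  # leading-order approximation
    bit_survival = exp(-group_fail_per_day * days)
    expected_errors = group_fail_per_day * days * n_groups
    print(f"N={n}: per-bit survival {bit_survival:.4f}, "
          f"expected errors in a petabyte ~{expected_errors:.2g}, "
          f"P(petabyte fully intact) ~{exp(-expected_errors):.2%}")
```

This reproduces the ~0.4% survival for N=1, the ~2 million expected errors for N=3, and the ~99.99% for N=5.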

Based on the discussion here, I think the errors in doing the majority-voting calculations are negligible compared to the cosmic ray calculations. At least if you do it cleverly, so that you don't get too many correlations and ruin your redundancy (which there are ways to do according to results on error-correcting computation — though I'm not sure if they might require some fixed amount of extra storage space to do this, in which case you might need N somewhat greater than 5).

Now this scheme requires that you have a functioning civilization that can provide electricity for the computer, that can replace the hardware when it starts failing, and stuff — but that's all things that we wanted to have anyway. And any essential component of that civilization can run on similarly error-corrected hardware.

And to account for larger-scale problems than cosmic rays (e.g. local earthquake throws harddrive to the ground and shatters it, or you accidentally erase a file when you were supposed to make a copy of it), you'd probably want backup copies of the petabyte on different places across the Earth, which you replaced each time something happened to one of them. If there's an 0.1% chance of that happening in any one day (corresponding to once/3 years, which seems like an overestimate if you're careful), and you immediately notice it and replace the copy within a day, and you have 5 copies in total, the probability that one of them keeps working at all times is ~exp(-(0.001)^5*365*1e9)~=99.96%. So combined with the previous 5, that'd be a multiple of 5*5=25.
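And a similar sketch for the geographic-redundancy step (again just a toy version of the numbers above: 5 copies, a 0.1%/day chance of losing any given copy, losses independent and repaired within a day):

```python
from math import exp

p_copy_lost_per_day = 0.001  # assumed chance that a given copy is destroyed on a given day
copies = 5
days = 365 * 1e9

p_all_lost_same_day = p_copy_lost_per_day ** copies  # every copy gone before any is replaced
print(f"P(never losing all copies over 1e9 years) ~ {exp(-p_all_lost_same_day * days):.2%}")
# ~99.96%; combined with the 5x bit-level redundancy, that's the 5*5=25x storage multiplier
```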

This felt enlightening. I'll add a link to this comment from the doc.

  1. ^

    Using a day here rather than an hour or a month isn't super-motivated. If you reset things very frequently, you might interfere with normal use of the computer, and errors in the resetting-operation might start to dominate the errors from cosmic rays. But I think a day should be above the threshold where that's much of an issue.

Comment by Lukas_Finnveden on FTX FAQ · 2022-11-13T08:13:10.799Z · EA · GW

I'm not sure how literally you mean "disprove", but on its face, "assume nothing is related to anything until you have proven otherwise" is a reasoning procedure that will never recommend any action in the real world, because we never get that kind of certainty. When humans try to achieve results in the real world, heuristics, informal arguments, and looking at what seems to have worked ok in the past are unavoidable.

Comment by Lukas_Finnveden on FTX FAQ · 2022-11-13T07:50:46.303Z · EA · GW

Global poverty probably has slower diminishing marginal returns, yeah. Unsure about animal welfare. I was mostly thinking about longtermist causes.

Re 80,000 Hours: I don't know exactly what they've argued, but I think "very valuable" is compatible with logarithmic returns. There are also diminishing marginal returns to direct workers in any given cause, so logarithmic returns on money don't mean that money becomes unimportant compared to people, or anything like that.

Comment by Lukas_Finnveden on FTX FAQ · 2022-11-13T07:28:00.309Z · EA · GW

Because utility and integrity are wholly independent variables, so there is no reason for us to assume a priori that they will always correlate perfectly. So if we wish to believe that integrity and expected value correlated for SBF, then we must show it. We must actually do the math.

This feels a bit unfair when people have argued (i) that utility and integrity will correlate strongly in practical cases (why use "perfectly" as your bar?), and (ii) that they will do so in ways that will be easy to underestimate if you just "do the math".

You might think they're mistaken, but some of the arguments do specifically talk about why the "assume 0 correlation and do the math"-approach works poorly, so if you disagree it'd be nice if you addressed that directly.

Comment by Lukas_Finnveden on FTX FAQ · 2022-11-13T06:52:32.066Z · EA · GW

Because a double-or-nothing coin-flip scales; it doesn't stop having high EV when we start dealing with big bucks.

Risky bets aren't themselves objectionable in the way that fraud is, but to just address this point narrowly: realistic estimates put risky bets at much worse EV when you control a large fraction of the altruistic pool of money. I think a decent first approximation is that EA's impact scales with the logarithm of its wealth. If you're gambling a small amount of money, that means you should be ~indifferent to 50/50 double or nothing (note that even in this case it doesn't have positive EV). But if you're gambling with the majority of the wealth that's predictably committed to EA causes, you should be much more scared about risky bets.

(Also in this case the downside isn't "nothing" — it's much worse.)
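To illustrate the log-of-wealth point with a toy calculation (my own numbers, just a sketch): the expected change in log(wealth) from a 50/50 double-or-nothing bet is roughly zero for tiny stakes, but sharply negative when the stake is a large fraction of the pool.

```python
from math import log

def expected_log_change(wealth, stake):
    """E[change in log(wealth)] from a 50/50 double-or-nothing bet of `stake`."""
    return 0.5 * log(wealth + stake) + 0.5 * log(wealth - stake) - log(wealth)

for frac in (0.001, 0.1, 0.5, 0.9):
    print(f"betting {frac:.1%} of the pool: {expected_log_change(1.0, frac):+.2e}")
# ~-5e-07 for 0.1% of the pool, but about -0.14 for 50% and -0.83 for 90%
```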

Comment by Lukas_Finnveden on [deleted post] 2022-11-13T05:23:43.772Z

conflicts of interest in grant allocation, work place appointments should be avoided

Worth flagging: Since there are more men than women in EA, I would expect a greater fraction of EA women than EA men to be in relationships with other EAs. (And trying to think of examples off the top of my head supports that theory.) If this is right, the policy "don't appoint people for jobs where they will have conflicts of interest" would systematically disadvantage women.

(By contrast, considering who you're already in a work-relationship with when choosing who to date  wouldn't have a systematic effect like that.)

My inclination here would be to (as much as possible) avoid having partners make grant/job-appointment decisions about their partners, but, if someone seems to be the best fit for a job/grant (from the perspective of people who aren't their partner), to not deny them that just because it would put them in a position closer to their partner.

(It's possible that this is in line with what you meant.)

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-11-10T06:35:49.958Z · EA · GW

Yeah, I agree that multipolar dynamics could prevent lock-in from happening in practice.

I do think that "there is a non-trivial probability that a dominant institution will in fact exist", and also that there's a non-trivial probability that a multipolar scenario will either

  • (i) end via all relevant actors agreeing to set-up some stable compromise institution(s), or
  • (ii) itself end up being stable via each actor making themselves stable and their future interactions being very predictable. (E.g. because of an offence-defence balance strongly favoring defence.)

...but arguing for that isn't really a focus of the doc.

(And also, a large part of why I believe they might happen is that they sound plausible enough, and I haven't heard great arguments for why we should be confident in some particular alternative. Which is a bit hard to forcefully argue for.)

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-11-10T06:24:05.071Z · EA · GW

If re-running evolution requires simulating the weather and if this is computationally too difficult then re-running evolution may not be a viable path to AGI.

There are many things that prevent us from literally rerunning human evolution. The evolution anchor is not a proof that we could do exactly what evolution did, but instead an argument that if something as inefficient as evolution spit out human intelligence with that amount of compute, surely humanity could do it if we had a similar amount of compute. Evolution is very inefficient — it has itself been far less optimized than the creatures it produces.

(I'd have more specific objections to the idea that chaos-theory-in-weather in particular would be an issue: I think that a weather-distribution approximated with a different random generation procedure would be as likely to produce human intelligence as a weather distribution generated by Earth's precise chaotic behavior. But that's not very relevant, because there would be far bigger differences between Earthly evolution and what-humans-would-do-with-1e40-FLOP than the weather.)

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-11-03T18:54:08.204Z · EA · GW

For instance we might get WBEs only in hypothetical-2080 but get superintelligent LLMs in 2040, and the people using superintelligent LLMs make the world unrecognisably different by 2042 itself.

I definitely don't just want to talk about what happens / what's feasible before the world becomes unrecognisably different. It seems pretty likely to me that lock-in will only become feasible after the world has become extremely strange. (Though this depends a bit on details of how to define "feasible", and what we count as the start-date of lock-in.)

And I think that advanced civilizations that tried could eventually become very knowledgable about how to create AI with a wide variety of properties, which is why I feel ok with the assumption that AIs could be made similar to humans in some ways without being WBEs.

(In particular, the arguments in this document are not novel suggestions for how to succeed with alignment in a realistic scenario with limited time! That still seems like a hard problem! C.f. my response to Michael Plant.)

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-11-03T18:36:31.154Z · EA · GW

Chaos theory is about systems where tiny deviations in initial conditions cause large deviations in what happens in the future. My impression (though I don't know much about the field) is that, assuming some model of a system (e.g. the weather), you can prove things about how far ahead you can predict the system given some uncertainty (normally about the initial conditions, though uncertainty brought about by limited compute that forces approximations should work similarly). Whether the weather corresponds to any particular model isn't really susceptible to proofs, but that question can be tackled by normal science.

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-11-03T18:16:09.491Z · EA · GW

Quoting from the post:

Thus, we suspect that an adequate solution to AI alignment could be achieved given sufficient time and effort. (Though whether that will actually happen is a different question, not addressed since our focus is on feasibility rather than likelihood.)

AI doomers tend to agree with this claim.  See e.g. Eliezer in list of lethalities:

None of this is about anything being impossible in principle.  The metaphor I usually use is that if a textbook from one hundred years in the future fell into our hands, containing all of the simple ideas that actually work robustly in practice, we could probably build an aligned superintelligence in six months.  (...) What's lethal is that we do not have the Textbook From The Future telling us all the simple solutions that actually in real life just work and are robust; we're going to be doing everything with metaphorical sigmoids on the first critical try.  No difficulty discussed here about AGI alignment is claimed by me to be impossible - to merely human science and engineering, let alone in principle - if we had 100 years to solve it using unlimited retries, the way that science usually has an unbounded time budget and unlimited retries.  This list of lethalities is about things we are not on course to solve in practice in time on the first critical try; none of it is meant to make a much stronger claim about things that are impossible in principle.

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-11-03T06:23:19.033Z · EA · GW

Thanks Lizka. I think about section 0.0 as being a ~1-page summary (in between the 1-paragraph summary and the 6-page summary) but I could have better flagged that it can be read that way. And your bullet point summary is definitely even punchier.

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-11-03T06:14:51.215Z · EA · GW

Thanks!

You've assumed from the get go that AIs will follow similar reinforcement-learning like paradigms like humans and converge on similar ontologies of looking at the world as humans. You've also assumed these ontologies will be stable - for instance a RL agent wouldn't become superintelligent, use reasoning and then decide to self modify into something that is not an RL agent.

Something like that, though I would phrase it as relying on the claim that it's feasible to build AI systems like that, since the piece is about the feasibility of lock-in. And in that context, the claim seems pretty safe to me. (Largely because we know that humans exist.)

You've assumed laws of physics as we know them today are constraints on things like computation and space colonization and oversight and alignment processes for other AIs.

Yup, sounds right.

Does this assume a clean separation between two kinds of processes - those that can be predicted and those that can't?

That's a good question. I wouldn't be shocked if something like this was roughly right, even if it's not exactly right. Let's imagine the situation from the post, where we have an intelligent observer with some large amount of compute that gets to see the paths of lots of other civilizations built by evolved species. Now let's imagine a graph where the x-axis has some increasing combination of "compute" and "number of previous examples seen", and the y-axis has something like "ability to predict important events". At first, the y-value would probably go up pretty fast with greater x, as the observer gets a better sense of what the distribution of outcomes is. But on our understanding of chaos theory, its ability to predict e.g. the weather years in advance would be limited even at astoundingly large values of compute + knowledge of the distribution. And since chaotic processes affect important real-world events in various ways (e.g. the genes of new humans seem similarly random as the weather, and that has huge effects), it seems plausible that our imagined graph would asymptote towards some limit of what's predictable.

And that's not even bringing up fundamental quantum effects, which are fundamentally unpredictable from our perspective. (With a many-worlds interpretation, they might be predictable in the sense that all of them will happen. But that still lets us make interesting claims about "fractions of everett branches", which seems pretty interchangeable with "probabilities of events".)

In any case, I don't think this impinges much on the main claims in the doc. (Though if I was convinced that the picture above was wildly wrong, I might want to give a bit of extra thought to what's the most convenient definition of lock-in.)

Comment by Lukas_Finnveden on AGI and Lock-In · 2022-10-29T16:40:05.619Z · EA · GW

I broadly agree with this. For the civilizations that want to keep thinking about their values or the philosophically tricky parts of their strategy, there will be an open question about how convergent/correct their thinking process is (although there's lots you can do to make it more convergent/correct — eg. redo it under lots of different conditions, have arguments be reviewed by many different people/AIs, etc).

And it does seem like all reasonable civilizations should want to do some thinking like this. For those civilizations, this post is just saying that other  sources of instability could be removed (if they so chose, and insofar as that was compatible with the intended thinking process).

Also, separately, my best guess is that competent civilizations (whatever that means) that were aiming for correctness would probably succeed (at least in areas where correctness is well defined). Maybe by solving metaphilosophy and doing that, maybe because they took lots of precautions like those mentioned above, maybe just because it's hard to get permanently stuck at incorrect beliefs if lots of people are dedicated to getting things right, have all the time and resources in the world, and are really open-minded. (If they're not open-minded but feel strongly attached to keeping their current views, then I become more pessimistic.)

But even if a civilization was willing to take this extreme step, I'm not sure how you'd design a filter that could reliably detect and block all "reasoning" that might exploit some flaw in your reasoning process.

By being unreasonably conservative. Most AIs could be tasked with narrowly doing their job, a few with pushing forward technology/engineering, none with doing anything that looks suspiciously like ethics/philosophy.  (This seems like a bad idea.)

Comment by Lukas_Finnveden on What is the most pressing feature to add to the forum? Upvote for in general, agreevote to agree with reasoning. · 2022-10-06T07:57:11.033Z · EA · GW

And tags / wiki entries.

Comment by Lukas_Finnveden on Samotsvety Nuclear Risk update October 2022 · 2022-10-04T13:20:40.647Z · EA · GW

We used the geometric mean of the samples with the minimum and maximum removed to better deal with extreme outliers, as described in our previous post

I don't see how that's consistent with:

What is the probability that Russia will use a nuclear weapon in Ukraine in the next MONTH?

  • Aggregate probability: 0.0859 (8.6%)
  • All probabilities: 0.27, 0.04, 0.02, 0.001, 0.09, 0.08, 0.07

What is the probability that Russia will use a nuclear weapon in Ukraine in the next YEAR?

  • Aggregate probability: 0.2294 (23%)
  • All probabilities: 0.38, 0.11, 0.11, 0.005, 0.42, 0.2, 0.11

I get that the first of those should be 0.053. Haven't run the numbers on the latter, but pretty sure the geometric mean should be smaller than 23% from eyeballing it. (I also haven't run the numbers on other aggregated numbers in this post.)
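Here's the aggregation as I understand the described procedure (a sketch; drop one minimum and one maximum, then take the geometric mean of the rest):

```python
from math import exp, log

def trimmed_geo_mean(samples):
    """Geometric mean after dropping the single smallest and single largest sample."""
    trimmed = sorted(samples)[1:-1]
    return exp(sum(log(x) for x in trimmed) / len(trimmed))

month = [0.27, 0.04, 0.02, 0.001, 0.09, 0.08, 0.07]
year = [0.38, 0.11, 0.11, 0.005, 0.42, 0.2, 0.11]
print(trimmed_geo_mean(month))  # ~0.053, vs. the reported 0.0859
print(trimmed_geo_mean(year))   # ~0.16, vs. the reported 0.2294
```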

Comment by Lukas_Finnveden on Samotsvety Nuclear Risk update October 2022 · 2022-10-04T00:06:55.697Z · EA · GW

On the other hand, the critic updated me towards higher numbers on p(nuke london|any nuke). Though I assume Samotsvety have already read it, so not sure how to take that into account. But given that uncertainty, given that that number only comes into play in confusing worlds where everyone's models are broken, and given Samotsvety's 5x higher unconditional number, I will update at least a bit in that direction.

Comment by Lukas_Finnveden on Samotsvety Nuclear Risk update October 2022 · 2022-10-04T00:03:18.806Z · EA · GW

Thanks for the links! (Fyi, the first two point to the same page.)

The critic's 0.3 assumes that you'll stay until there's nuclear exchanges between Russia and NATO. Zvi was at 75% if you leave as soon as a conventional war between NATO and Russia starts.

I'm not sure how to compare that situation with the current situation, where it seems more likely that the next escalatory step will be a nuke on a non-NATO target than conventional NATO-Russia warfare. But if you're happy to leave as soon as either a nuke is dropped anywhere or conventional NATO/Russia warfare breaks out, I'm inclined to aggregate those numbers  to something closer to 75% than 50%.

Comment by Lukas_Finnveden on Samotsvety Nuclear Risk update October 2022 · 2022-10-03T21:54:33.938Z · EA · GW

Thanks for doing this!

In this squiggle you use "ableToEscapeBefore = 0.5". Does that assume that you're following the policy "escape if you see any tactical nuclear weapons being used in Ukraine"? (Which someone who's currently on the fence about escaping London would presumably do.)

If yes, I would have expected it to be higher than 50%. Do you think very rapid escalation is likely, or am I missing something else?

Comment by Lukas_Finnveden on Announcing the Future Fund's AI Worldview Prize · 2022-09-27T10:14:51.151Z · EA · GW

I think this particular example requires an assumption of logarithmically diminishing returns, but is right with that.

(I think the point about roughly quadratic value of information applies more broadly than just for logarithmically diminishing returns. And I hadn't realised it before. Seems important + underappreciated!)

One quirk to note: If a funder (who I want to be well-informed) is 50/50 on S vs L, but my all-things-considered belief is 60/40, then I would value the first 1% they shift towards my position much more than they do (maybe 10x more?)  and will put comparatively little value on shifting them all the way (ie the last percent from 59% to 60% is much less important). You can get this from a pretty similar argument as in the above example.

(In fact, the funder's own much greater valuation of shifting 10% than 1% can be seen as a two-step process where (i) they shift to 60/40 beliefs, and then (ii) they first get a lot of value from shifting their allocation from 50 to 51, then slightly less from shifting from 51 to 52, etc...)
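Here's a toy model of the diminishing value of those marginal shifts (my own assumptions: the funder splits money between causes S and L, each with logarithmic returns, and I evaluate allocations under my 60/40 credence):

```python
from math import log

def my_value(x, p=0.6):
    """My expected value if the funder allocates fraction x to S, under log returns in each cause."""
    return p * log(x) + (1 - p) * log(1 - x)

first_pct = my_value(0.51) - my_value(0.50)  # the first 1% shifted towards my position
last_pct = my_value(0.60) - my_value(0.59)   # the last 1%, arriving at my preferred 60/40 split
print(first_pct, last_pct, first_pct / last_pct)  # under this toy model the first 1% is worth ~18x the last 1% to me
```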

Comment by Lukas_Finnveden on Eliminate or Adjust Strong Upvotes to Improve the Forum · 2022-09-05T17:00:00.905Z · EA · GW

I think that's right, except that weak upvotes no longer ever become worth 3 points (although this doesn't matter on the EA forum, given that no one has 25,000 karma), based on this lesswrong github file linked from the LW FAQ.

Comment by Lukas_Finnveden on My take on What We Owe the Future · 2022-09-02T15:25:09.184Z · EA · GW

Nitpicking:

A property of making directional claims like this is that MacAskill always has 50% confidence in the claim I’m making, since I’m claiming that his best-guess estimate is too high/low.

This isn't quite right. Conservation of expected evidence means that MacAskill's current probabilities should match his expectation of the ideal reasoning process. But for probabilities close to 0, this would typically imply that he assigns higher probability to being too high than to being too low. For example: a 3% probability is compatible with 90% probability that the ideal reasoning process would assign probability ~0% and a 10% probability that it would assign 30%. (Related.)
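Just to make the arithmetic of that example explicit:

```python
# 90% chance the ideal process lands at ~0%, 10% chance it lands at 30%:
print(0.9 * 0.0 + 0.1 * 0.30)  # = 0.03, matching the current 3% credence
```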

This is especially relevant when the ideal reasoning process is something as competent as 100 people for 1000 years. Those people could make a lot of progress on the important questions (including e.g. themselves working on the relevant research agendas just to predict whether they'll succeed), so it would be unsurprising for them to end up much closer to 0% or 100% than is justifiable today.

Comment by Lukas_Finnveden on Existential risk pessimism and the time of perils · 2022-08-15T14:19:36.740Z · EA · GW

The term "most important century" pretty directly suggests that this century is unique, and I assume that includes its unusually large amount of x-risk (given that Holden seems to think that the development of TAI is both the biggest source of x-risk this century and the reason for why this might be the most important century).

Holden also talks specifically about lock-in, which is one way the time of perils could end.

See e.g. here:

It's possible, for reasons outlined here, that whatever the main force in world events is (perhaps digital people, misaligned AI, or something else) will create highly stable civilizations with "locked in" values, which populate our entire galaxy for billions of years to come.

If enough of that "locking in" happens this century, that could make it the most important century of all time for all intelligent life in our galaxy.

I want to roughly say that if something like PASTA is developed this century, it has at least a 25% chance of being the "most important century" in the above sense.

Comment by Lukas_Finnveden on [AMA] Announcing Open Phil’s University Group Organizer and Century Fellowships · 2022-08-03T07:55:17.597Z · EA · GW

The page for the Century Fellowship outlines some things that fellows could do, which are much broader than just university group organizing:

When assessing applications, we will primarily be evaluating the candidate rather than their planned activities, but we imagine a hypothetical Century Fellow may want to:

  • Lead or support student groups relevant to improving the long-term future at top universities
  • Develop a research agenda aimed at solving difficult technical problems in advanced deep learning models
  • Start an organization that teaches critical thinking skills to talented young people
  • Run an international contest for tools that let us trace where synthetic biological agents were first engineered
  • Conduct research on questions that could help us understand how to to make the future go better
  • Establish a publishing company that makes it easier for authors to print and distribute books on important topics

Partly this comment exists just to give readers a better impression of the range of things that the century fellowship could be used for. For example, as far as I can tell, the fellowship is currently one of very few options for people who want to pursue fairly independent longtermist research and who want help with getting work authorization in the UK or US.

But I'm also curious if you have any comments on the extent to which you expect the century fellowship to take on community organizers vs researchers vs ~entrepreneurs. (Is the focus on community organizing in this post indicative, or just a consequence of the century fellowship being mentioned in a post that's otherwise about community organizing?)

Comment by Lukas_Finnveden on Punching Utilitarians in the Face · 2022-07-14T08:40:51.441Z · EA · GW

I'm not saying it's infinite, just that (even assuming it's finite) I assign non-0 probability to different possible finite numbers in a fashion such that the expected value is infinite. (Just like the expected value of the St. Petersburg gamble is infinite, although every outcome has finite size.)

Comment by Lukas_Finnveden on Punching Utilitarians in the Face · 2022-07-14T08:37:01.422Z · EA · GW

The topic under discussion is whether pascalian scenarios are a problem for utilitarianism, so we do need to take pascalian scenarios seriously, in this discussion.

Comment by Lukas_Finnveden on Punching Utilitarians in the Face · 2022-07-13T23:28:13.863Z · EA · GW

I simply don’t believe that infinities exist, and even though 0 isn’t a probability, I reject the probabilistic argument that any possibility of infinity allows them to dominate all EV calculations.

Problems with infinity don't go away just because you assume that actual infinities don't exist. Even with just finite numbers, you can face gambles that have infinite expected value, if increasingly good possibilities have insufficiently rapidly diminishing probabilities. And this still causes a lot of problems.

(I also don't think that's an esoteric possibility. I think that's the epistemic situation we're currently in, e.g. with respect to the amount of possible lives that could be created in the future.)
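To make that concrete, here's a minimal St. Petersburg-style example (toy numbers of my own): every outcome is finite, but the expected value diverges because the probabilities shrink only as fast as the payoffs grow.

```python
def partial_ev(n_terms):
    """Partial expected value of a gamble paying 2^k with probability 2^-k."""
    return sum((0.5 ** k) * (2 ** k) for k in range(1, n_terms + 1))

for n in (10, 100, 1000):
    print(n, partial_ev(n))  # each term contributes 1, so the sum grows without bound
```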

Also, as far as I know (which isn't a super strong guarantee) every nice theorem that shows that it's good to maximize expected value assumes that possible utility is bounded in both directions (for outcomes with probability >0). So there's no really strong reason to think that it would make sense to maximize expected welfare in an unbounded way, in the first place.

See also: www.lesswrong.com/posts/hbmsW2k9DxED5Z4eJ/impossibility-results-for-unbounded-utilities

Comment by Lukas_Finnveden on Some research questions that you may want to tackle · 2022-07-12T09:59:26.893Z · EA · GW

10^12 might be too low. Making up some numbers: If future civilizations can create 10^50 lives, and we think there's an 0.1% chance that 0.01% of that will be spent on ancestor simulations, then that's 10^43 expected lives in ancestor simulations. If each such simulation uses 10^12 lives worth of compute, that's a 10^31 multiplier on short-term helping.

Comment by Lukas_Finnveden on Some research questions that you may want to tackle · 2022-07-11T23:30:24.607Z · EA · GW

A proper treatment of this should take into account that short-term helping also might have positive effects in lots of simulations to a much greater extent than long-term helping. https://longtermrisk.org/how-the-simulation-argument-dampens-future-fanaticism

Comment by Lukas_Finnveden on (Even) More Early-Career EAs Should Try AI Safety Technical Research · 2022-07-01T23:02:13.927Z · EA · GW

I agree. Anecdotally, among people I know, I've found aphantasia to be more common among those who are very mathematically skilled.

(Maybe you could have some hypothesis that aphantasia tracks something slightly different than other variance in visual reasoning. But regardless, it sure seems similar enough that it's a bad idea to emphasize the importance of "shape rotating". Because that will turn off some excellent fits.)

Comment by Lukas_Finnveden on What’s the theory of change of “Come to the bay over the summer!”? · 2022-06-10T00:27:34.707Z · EA · GW

But note the hidden costs. Climbing the social ladder can trade of against building things. Learning all the Berkeley vibes can trade of against, eg., learning the math actually useful for understanding agency.

This feels like a surprisingly generic counterargument, after the (interesting) point about ladder climbing. "This could have opportunity costs" could be written under every piece of advice for how to spend time.

In fact, it applies less to this post than to most advice on how to spend time, since the OP claimed that the environment caused them to work harder.

(A hidden cost that's more tied to ladder climbing is Chana's point that some of this can be at least somewhat zero-sum.)

Comment by Lukas_Finnveden on Some potential lessons from Carrick’s Congressional bid · 2022-05-23T00:51:44.844Z · EA · GW

By the way, as an aside, the final chapter here is that Protect our Future PAC went negative in May -- perhaps a direct counter to BoldPAC's spending. (Are folks here proud of that? Is misleading negative campaigning compatible with EA values?)

I wanted to see exactly how misleading these were. I found this example of an attack ad, which (after some searching) I think cites this, this, this, and this. As far as I can tell:

  • The first source says that Salinas "worked for the chemical manufacturers’ trade association for a year", in the 90s.
  • The second source says that she was a "lobbyist for powerful public employee unions SEIU Local 503 and AFSCME Council 75 and other left-leaning groups" around 2013-2014. The video uses this as a citation for the slide "Andrea Salinas — Drug Company Lobbyist".
  • The third source says that insurers' drug costs rose by 23% between 2013-2014. (Doesn't mention Salinas.)
  • The fourth source is just the total list of contributors to Salinas's campaigns, and the video doesn't say what company she supposedly lobbied for that gave her money. The best I can find is that this page says she lobbied for Express Scripts in 2014, who is listed as giving her $250.

So my impression is that the situation boils down to: Salinas worked for a year for the chemical manufacturers’ trade association in the 90s, had Express Scripts as 1 out of 11 clients in 2014 (although the video doesn't say they mean Express Scripts, or provide any citation for the claim that Salinas was a drug lobbyist in 2013/2014), and Express Scripts gave her $250 in 2018. (And presumably enough other donors can be categorised as pharmaceutical to add up to $18k.)

So yeah, very misleading.

(Also, what's up with companies giving and campaigns accepting such tiny amounts as $250? Surely that's net-negative for campaigns by enabling accusations like this.)

Comment by Lukas_Finnveden on Replicating and extending the grabby aliens model · 2022-05-19T17:24:42.675Z · EA · GW

(1) maybe doom should be disambiguated between  "the short-lived simulation that I am in is turned of"-doom (which I can't really observe) and "the basement reality Earth I am in is turned into paperclips by an unaligned AGI"-type doom.

Yup, I agree the disambiguation is good. In aliens-context, it's even useful to disambiguate those types of doom from "Intelligence never leaves the basement reality Earth I am on"-doom. Since paperclippers probably would become grabby.

Comment by Lukas_Finnveden on Replicating and extending the grabby aliens model · 2022-05-19T15:39:38.031Z · EA · GW

When I model the existence of simulations like us, SIA does not imply doom (as seen in the marginalised posteriors for  in the appendix here). 

It does imply doom for us, since we're almost certainly in a short-lived simulation.

And if we condition on being outside of a simulation, SIA also implies doom for us, since it's more likely that we'll find ourselves outside of a simulation if there are more basement-level civilizations, which is facilitated by more of them being doomed.

It just implies that there  weren't necessarily a lot of doomed civilizations in the basement-level universe, many basement-level years ago, when our simulators were a young civilization.

Comment by Lukas_Finnveden on Discussion on Thomas Philippon's paper on TFP growth being linear · 2022-05-09T18:45:56.877Z · EA · GW

There's an excellent critique of that paper on LW: https://www.lesswrong.com/posts/yWCszqSCzoWTZCacN/report-likelihood-ratios

The conclusion is that exponentials look better for longer-run trends, if you do fair comparisons. And that linear being a better fit than exponentials in recent data is more about the error-model than the growth-model, so it shouldn't be a big update against exponential growth.

Comment by Lukas_Finnveden on How about we don't all get COVID in London? · 2022-04-15T09:32:27.327Z · EA · GW

It's table 3 I think you want to look at. For fatigue and other long covid symptoms, belief that you had covid has a higher odds ratio than does confirmed covid

That's exactly what we should expect if long covid is caused by symptomatic covid, and belief-in-covid is a better predictor of symptomatic covid than positive-covid-test. (The latter also picks up asymptomatic covid, so it's a worse predictor of symptomatic covid.)

Comment by Lukas_Finnveden on Announcing What The Future Owes Us · 2022-04-03T00:11:25.137Z · EA · GW

The future's ability to affect the past is truly a crucial consideration for those with high discount rates. You may doubt whether such acausal effects are possible, but in expectation, on e.g. an ultra-neartermist view, even a 10^-100 probability that it works is enough, since anything that happened 100 years ago is >>10^1000 times as important as today is, with an 80%/day discount rate.

Indeed, if we take the MEC approach to moral uncertainty, we can see that this possibility of ultra-neartermism + past influence will dominate our actions for any reasonable credences. Perhaps the future can contain 10^40 lives, but that pales in comparison to the >>10^1000 multiplier we can get by potentially influencing the past.

Comment by Lukas_Finnveden on Future-proof ethics · 2022-03-30T15:02:55.245Z · EA · GW

I think the title of this post doesn't quite match the dialogue. Most of the dialogue is about whether creating additional good lives is at least somewhat good. But that's different from whether each additional good life is morally equivalent to a prevented death. The former seems more plausible than the latter, to me.

Separating the two will lead to some situations where a life is bad to create but also good to save, once started. That seems more like a feature than a bug. If you ask people in surveys, my impression is that some small fraction of people say that they'd prefer to not have been born and that some larger fraction of people say that they'd not want to relive their life again — without this necessarily implying that they currently want to die.

Comment by Lukas_Finnveden on We're announcing a $100,000 blog prize · 2022-03-08T04:35:58.865Z · EA · GW

I assume it's fine to prominently link to the EA forum or LW as the place to leave comments? Like e.g. cold takes does.

Comment by Lukas_Finnveden on Yonatan Cale's Shortform · 2022-02-28T03:46:33.211Z · EA · GW

2. The best workout game I found is "thrill of the fight", I have some tips before you try it. Also, not everyone will like it

What are your tips?

Comment by Lukas_Finnveden on Simplify EA Pitches to "Holy Shit, X-Risk" · 2022-02-11T17:29:51.970Z · EA · GW

You could replace working on climate change with 'working on or voting in elections', which are also all or nothing.

(Edit: For some previous arguments in this vein, see this post .)

Comment by Lukas_Finnveden on New EA Cause Area: Run Blackwell's Bookstore · 2022-02-06T20:59:34.785Z · EA · GW

SSC argued that there was not enough money in politics

To be clear, SSC argued that there was surprisingly little money in politics. The article explicitly says "I don’t want more money in politics".

Comment by Lukas_Finnveden on External Evaluation of the EA Wiki · 2021-12-14T19:16:11.857Z · EA · GW

Here's one idea: Automatic or low-effort linking to wiki-tags when writing posts or comments. A few different versions of this:

  • When you write a comment or post that has contains the exact name of a tag/wiki article, those words automatically link to that tag. (This could potentially be turned on/off in the editor or in your personal prefs.)
  • The same as the above except it only happens if you do something special to the words, e.g. enclose them in [[double brackets]], surround them by [tag] [/tag], or capitalise correctly. (Magic the gathering forums often have something like this for linking to cards.)
  • The same as the above, except there's some helpful search function that helps you find relevant wiki articles. E.g. you type [[ or you click some particular button in the editor, and then a box for searching for tags pops up. (Similar to linking to another page in Roam. This could also be implemented for linking to posts.)

Comment by Lukas_Finnveden on What is the EU AI Act and why should you care about it? · 2021-12-08T16:30:20.566Z · EA · GW

I think this is a better link to FLI's position on the AI act: https://ec.europa.eu/info/law/better-regulation/have-your-say/initiatives/12527-Artificial-intelligence-ethical-and-legal-requirements/F2665546_en

(The one in the post goes to their opinion on liability rules. I don't know the relationship between that and the AI act.)

Comment by Lukas_Finnveden on How many EA 2021 $s would you trade off against a 0.01% chance of existential catastrophe? · 2021-11-28T22:47:09.324Z · EA · GW

Seems better than the previous one, though imo still worse than my suggestion, for 3 reasons:

  • it's more complex than asking about immediate extinction. (Why exactly a 100-year cutoff? Why 50%?)
  • since the definition explicitly allows for different x-risks to be differently bad, the amount you'd pay to reduce them would vary depending on the x-risk. So the question is underspecified.
  • The independence assumption is better if funders often face opportunities to reduce a Y%-risk that's roughly independent from most other x-risk this century. Your suggestion is better if funders often face opportunities to reduce Y percentage points of all x-risk this century (e.g. if all risks are completely disjunctive, s.t. if you remove a risk, you're guaranteed to not be hit by any other risk).
    • For your two examples, the risks from asteroids and climate change are mostly independent from the majority of x-risk this century, so there the independence assumption is better.
    • The disjunctive assumption can happen if we e.g. study different mutually exclusive cases, e.g. reducing risk from worlds with fast AI take-off vs reducing risk from worlds with slow AI take-off.
    • I weakly think that the former is more common.
    • (Note that the difference only matters if total x-risk this century is large.)

Edit: This is all about what version of this question is the best version, independent of inertia. If you're attached to percentage points because you don't want to change to an independence assumption after there's already been some discussion on the post, then your latest suggestion seems good enough. (Though I think most people have been assuming a low total amount of x-risk, so probably independence or not doesn't matter that much for the existing discussion.)

Comment by Lukas_Finnveden on How many EA 2021 $s would you trade off against a 0.01% chance of existential catastrophe? · 2021-11-28T18:47:50.428Z · EA · GW

Currently, the post says:

A risk of catastrophe where an adverse outcome would permanently cause Earth-originating intelligent life's astronomical value to be <50% of what it would otherwise be capable of.

I'm not a fan of this definition, because I find it very plausible that the expected value of the future is less than 50% of what humanity is capable of. Which e.g. raises the question: does even extinction fulfil the description? Maybe you could argue "yes", but the mix of causing an actual outcome compared with what intelligent life is "capable of" makes all of this unnecessarily dependent on both definitions and empirics about the future.

For purposes of the original question, I don't think we need to deal with all the complexity around "curtailing potential". You can just ask: how much should a funder be willing to pay to remove an 0.01% risk of extinction that's independent from all other extinction risks we're facing? (E.g., a giganormous asteroid is on its way to Earth and has an 0.01% probability of hitting us, causing guaranteed extinction. No one else will notice this in time. Do we pay $X to redirect it?)

This seems closely analogous to questions that funders are facing (are we keen to pay to slightly reduce one, contemporary extinction risk). For non-extinction x-risk reduction, this extinction-estimate will be informative as a comparison point, and it seems completely appropriate that you should also check "how bad is this purported x-risk compared to extinction" as a separate exercise.

Comment by Lukas_Finnveden on Listen to more EA content with The Nonlinear Library · 2021-11-27T17:48:40.866Z · EA · GW

I see you've started including some text from the post in each episode description, which is useful! Could you also include the URL to the post, at the top of the episode description? I often want to check out comments on interesting posts.

Comment by Lukas_Finnveden on Opportunity Costs of Technical Talent: Intuition and (Simple) Implications · 2021-11-24T09:44:01.336Z · EA · GW

For example, I can't imagine any EA donor paying a non-ML engineer/manager $400,000, even if that person could make $2,000,000 in industry.

Hm, I thought Lightcone Infrastructure might do that.

Our current salary policy is to pay rates competitive with industry salary minus 30%. Given prevailing salary levels in the Bay Area for the kind of skill level we are looking at, we expect salaries to start at $150k/year plus healthcare (but we would be open to paying $315k for someone who would make $450k in industry).

https://www.lesswrong.com/posts/eR7Su77N2nK3e5YRZ/the-lesswrong-team-is-now-lightcone-infrastructure-come-work-3

Comment by Lukas_Finnveden on Preprint is out! 100,000 lumens to treat seasonal affective disorder · 2021-11-14T10:22:16.486Z · EA · GW

For 100,000 lm (~1000 W at roughly 100 lm/W), 12 hours a day, that would be 1000 W * 12 h/day * $0.20/kWh ≈ $2.40/day.
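(Spelling out the arithmetic, with the implicit assumption of roughly 100 lm/W efficacy:)

```python
lumens = 100_000
efficacy_lm_per_w = 100   # assumed; roughly what the 1000 W figure implies
hours_per_day = 12
price_per_kwh = 0.20

watts = lumens / efficacy_lm_per_w
kwh_per_day = watts * hours_per_day / 1000
print(kwh_per_day * price_per_kwh)  # ~$2.40/day in electricity
```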