I used to be concerned about this a lot from a “what if it sparks nuclear war” POV, and I suppose I still am pretty concerned about that on some level. However, to be brutally honest I think that one of my main paradigms for evaluating geopolitics and risk has increasingly shifted to focusing on just AI risk, with a little biosecurity sprinkled in.
For example: if China invaded Taiwan would it set back AI capability timelines (because TSMC—which produces I think a majority of leading edge semiconductors—might get scuttled), and/or will great power conflict incentivize military AI development which shortens timelines?
I don’t think it’s advisable to treat these as independent probabilities you can multiply to get risk estimates, but perhaps it’s not that inaccurate. (e.g., it seems quite likely that the probability that the US retaliates is to some extent influenced by US decision-makers’ estimate of the probability that retaliation will cause a hot war with China.)
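To make that concrete with entirely made-up numbers: a naive independence multiplication can diverge from an estimate that conditions one event on the other, which is the worry about multiplying these probabilities.

```python
# Toy illustration with invented probabilities (none of these numbers are
# real estimates): comparing an independence assumption against a simple
# conditional model where retaliation is less likely given escalation fears.

p_invasion = 0.10      # hypothetical P(China invades Taiwan)
p_retaliation = 0.50   # hypothetical marginal P(US retaliates)

# Naive estimate: multiply as if the events were independent.
naive_joint = p_invasion * p_retaliation

# Conditional estimate: suppose decision-makers' fear of a hot war lowers
# the probability of retaliation given an invasion.
p_retaliation_given_invasion = 0.35  # hypothetical conditional probability
conditional_joint = p_invasion * p_retaliation_given_invasion

print(f"independence assumption: {naive_joint:.3f}")
print(f"conditional estimate:    {conditional_joint:.3f}")
```

The gap between the two numbers is exactly the error introduced by ignoring the dependence.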
I’m afraid you’ve totally lost me at this point. Saving mammoths?? Why??
And are you seriously suggesting that we can resurrect dead people whose brains have completely decayed? What?
And what is this about saving humans first? No, we don’t have to save every human first, we theoretically only need to save enough so that the process of (whatever you’re trying to accomplish?) can continue. If we are strictly welfare-maximizing without arbitrary speciesism, it may mean prioritizing saving some of the existing animals over every human currently (although this may be unlikely).
To be clear, I certainly understand that you aren’t saying you only care about saving your own life, but the post gives off those kinds of vibes nonetheless.
The transition from "good" to "wellbeing" seems rather innocent, but it opens the way to a rather popular line of reasoning: that we should care only about the number of happy observer-moments, without caring whose moments these are. Extrapolating, we stop caring about real humans and start caring about possible animals. In other words, it opens the way to a pure utilitarian-open-individualist bonanza, where the value of human life and individuality is lost and the badness of death is ignored. The last point is most important for me, as I view irreversible mortality as the main human problem.
To be totally honest, this really gives off vibes of "I personally don't want to die and I therefore don't like moral reasoning that even entertains the idea that humans (me) may not be the only thing we should care about." Gee, what a terrible world it might be if we "start caring about possible animals"!
Of course, that's probably not what you're actually/consciously arguing, but the vibes are still there. It particularly feels like motivated reasoning when you gesture to abstract, weakly-defined concepts like the "value of human life and individuality" and imply they should supersede concepts like wellbeing, which, when properly defined and when approaching questions from a utilitarian framework, should arguably subsume everything morally relevant.
You seem to dispute the (fundamental concept? application?) of utilitarianism for a variety of reasons—some of which (e.g., your very first example regarding the fog of distance) I see as reflecting a remarkably shallow/motivated (mis)understanding of utilitarianism, to be honest. (For example, the fog case seems to not understand that utilitarian decision-making/analysis is compatible with decision-making under uncertainty.)
In short, I don't find your arguments persuasive, and I think they're derived from some errors such as equivocation, weird definitions, etc.
But it could be that real goodness lies outside well-being. In my view, reaching radical life extension and death-reversal is more important than well-being, if well-being is understood as a comfortable, healthy life.
First of all, I don't understand the conflict here—why would you want life extension/death reversal if not to improve wellbeing? Wellbeing is almost definitionally what makes life worth living; I think you simply may not be translating or understanding "wellbeing" correctly. Furthermore, you don't seem to offer any justification for that view: what could plausibly make life extension and death-reversal more valuable than wellbeing (given that wellbeing is still what determines the quality of those extended lives)?
The fact that an organisation is doing good assumes that some concept of good exists within it. And we can't do good effectively without measuring it, which requires an even stricter model of good. In other words, altruism can't be effective if it avoids defining good.
You can assert things as much as you'd like, but that doesn't justify the claims. Someone does not need to objectively, 100% confidently "know" what is "good" nor how to measure it if various rough principles, intuition, and partial analysis suffice. Maybe saving people from being tortured or killed isn't good—I can't mathematically or psychologically prove to you why it is good—but that doesn't mean I should be indifferent about pressing a button which prevents 100 people from being tortured until I can figure out how to rigorously prove what is "good."
Moreover, some choices about what is good could be pivotal acts both for organisations and for AIs: like whether we should work more on biosafety, on nuclear war prevention, or on digital immortality (data preservation). Here again we are ready to make such choices for organisations, but not for AIs.
This almost feels like a non-sequitur that fails to explicitly make a point, but my impression is that it's saying "it's inconsistent/contradictory to think that we can decide what organizations should do but not be able to align AI." 1) This and the following paragraph still don't address my second point from my previous comment, and so you can't say "well, I know that (2) is a problem, but I'm talking about the inconsistency"—a sufficient justification for the inconsistency is (2) all by itself; 2) The reason we can do this with organizations more comfortably is that mistakes are far more corrigible, whereas with sufficiently powerful AI systems, screwing up the alignment/goals may be the last meaningful mistake we ever make.
But what I wanted to say here is that many of the problems we encounter in AI alignment also reappear in organisations, e.g., goodharting. Without knowing how to solve them, we can't do good effectively.
I very slightly agree with the first point, but not the second point (in part for reasons described earlier). On the first point, yes, "alignment problems" of some sort often show up in organizations. However: 1) see my point above (mistakes in organizations are more corrigible); 2) aligning humans/organizations—with whom we share some important psychological traits and have millennia of experience—is fairly different from aligning machines in terms of the challenges. So "solving" (or mostly solving) either one does not necessarily guarantee solutions to the other.
In that case, I think there are some issues with equivocation and/or oversimplification in this comparison:
EAs don’t “know what is good” in specific terms like “how do we rigorously define and measure the concept of ‘goodness’”; well-being is a broad metric which we tend to use because people understand each other, which is made easier by the fact that humans tend to have roughly similar goals and constraints. (There are other things to say here, but the more important point is next.)
One of the major problems we face with AI safety is that even if we knew how to objectively define and measure good we aren’t sure how to encode that into a machine and ensure it does what we actually want it to do (as opposed to exploiting loopholes or other forms of reward hacking).
So the following statement doesn’t seem to be a valid criticism/contradiction:
This looks contradictory: either we know what real good is and could use this knowledge in AI safety, or we don't know, and in that case we can't do anything useful.
We're looking for what would be the greatest good for humanity, and we're searching for a way to safely and effectively implement this good.
EA considers both of these tasks to be generally solved (otherwise we would not have been doing anything at all) for the social activities of a public organization, but at the same time, completely unsolved for AI alignment.
I'm confused by what's meant here: perhaps I just don't understand what you mean with the clause "for the social activities of a public organization", but I don't think that "EA considers both of these tasks to be generally solved"?
In your section on “which recommendations are most worthwhile?” you mention using the ITN framework. While this probably makes for efficient communication since many readers are familiar with it, I have some qualms with applying the ITN framework to actual decisions. Per the framing you described (which I get is meant to be simple and intuitive, but nonetheless), neglectedness seems subsumed by importance, and tractability would probably also be subsumed except that it tries to take into account implementation costs (while not addressing other potential disadvantages?).
How efficient at X would I be without using this framework, or how prone to making Y mistake would I be without this framework?
What does my expected usage/implementation of this framework look like (given cognitive or familiarity constraints)?
How efficient would I be at X or prone to making Y mistake while using this framework?
How good is an improvement in X / reduction in Y?
In reality, your description of the ITN framework seems a bit different from normal interpretations, which leads to it looking more like COILS. However, my loose sense is that it’s probably better to formally recognize the limitations of ITN for certain contexts (e.g., specific decisions) and explicitly identify/use alternate frameworks, rather than using “sort-of-ITN.”
Personally, I'm a huge fan of exploring ways of improving research efficiency and quality, and I am glad to see another person working on this.
Of potential relevance: I'm currently experimenting with an approach to supporting AI-relevant policy research via collaborative node-and-link "reality modeling / claim aggregation, summarization, and organization" (I'm not exactly sure what to call it).
For example, questions/variables like "How much emphasis would the Chinese government/Chinese companies put on (good) safety and alignment work (relative to America)" and "would more research/work on interpretability would be beneficial for powerful-AI/AGI alignment" seem like they could be fairly important in shaping various policy recommendations while also not being easy to evaluate. Additionally, there may be some questions or "unknown unknowns" that may not seem obvious to ask—especially for younger researchers and/or researchers who are trying to do research outside of their normal fields of expertise.
Having a hub for collecting and interrelating arguments/evidence on these and other questions seems like it could save time by reducing duplication/wasted effort and also increase the likelihood of encountering good points, among other potential benefits (e.g., better keeping track of back-and-forth responses and overall complexity for a question which doesn't lend itself well to traditional mathematical equations). Additionally, if junior researchers (including even interns) are the primary source of labor for this project—which I expect would be fairly practical given that much of the "research" aspect of the work is akin to conducting literature reviews (which are then formalized into the node-and-link structure)—then you could potentially get comparative-advantage benefits by having less-experienced researchers save the research time of more-experienced researchers, all while providing a "training ground/on-ramp" for less-experienced researchers to become more familiar with various fields.
Because I am still testing out which approaches/structures seem to work best, the networks in the following screenshots are quite raw: there are still some inconsistencies in terms of node types and presence, little attention paid to aesthetics, many undefined placeholder relationships, inconsistencies in terms of granularity, etc. (That being said, the purple squares are generally "claims/assertions/arguments," yellow triangles are policy proposals, blue hexagons are research questions, and green diamonds are generally "reality variables.")
Note: I do not think that the map view above should be the only way of viewing the research/analysis; for example, I think that one of the viewing methods should ideally be something like a minimal-information interface with dropdowns that let you search for and branch out from specific nodes (e.g., "what are the arguments for and against policy proposal X... what are the responses to argument Y").
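As a purely illustrative sketch (not the actual tool's schema; all names here are hypothetical), the node-and-link structure could be represented roughly like this, with node types mirroring the shapes described above and a lookup that supports the "branch out from a specific node" style of viewing:

```python
# Hypothetical, minimal claim-graph structure: typed nodes connected by
# typed links. This is an illustration of the concept, not the real system.
from dataclasses import dataclass, field

NODE_TYPES = {"claim", "policy_proposal", "research_question", "reality_variable"}

@dataclass
class Node:
    id: str
    type: str   # one of NODE_TYPES
    text: str

@dataclass
class Link:
    source: str     # node id
    target: str     # node id
    relation: str   # e.g., "supports", "opposes", "responds_to"

@dataclass
class ClaimGraph:
    nodes: dict = field(default_factory=dict)
    links: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        assert node.type in NODE_TYPES
        self.nodes[node.id] = node

    def add_link(self, link: Link) -> None:
        assert link.source in self.nodes and link.target in self.nodes
        self.links.append(link)

    def arguments_about(self, node_id: str):
        """Return (relation, source node) pairs pointing at node_id."""
        return [(l.relation, self.nodes[l.source])
                for l in self.links if l.target == node_id]

# Example usage with made-up content:
g = ClaimGraph()
g.add_node(Node("p1", "policy_proposal", "Tighten compute export controls"))
g.add_node(Node("c1", "claim", "Export controls slow frontier training runs"))
g.add_link(Link("c1", "p1", "supports"))
for relation, node in g.arguments_about("p1"):
    print(relation, "->", node.text)
```

The point of the sketch is just that once claims, proposals, questions, and variables are explicit nodes, queries like "what are the arguments for and against proposal X" become simple traversals.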
Related to this is my post on "epistemic mapping", which puts a greater emphasis on the academic literature around non-policy questions (including identifying authors, studies, the inputs for those studies (e.g., datasets), etc.) as opposed to supporting policy research directly—although both systems could probably be used for the same purposes with minor adaptations.
Also relevant—and more developed + better resourced than my project—is Modeling Transformative AI Risks (MTAIR), which puts more emphasis on quantitative analysis via input elicitation/estimation and output/effect calculation (and at the moment seems to focus a bit more on the factors and pathways of AI risk vs. the direct effects of policy, although my understanding is that it is also intended to eventually focus on policy analysis/recommendations).
One of the first definitions I see when looking up pathological is “involving or caused by a physical or mental disease” (albeit along with “compulsive, obsessive”), and the connotation of “pathological” is really negative and insulting. EA already faces accusations of being elitist and insulting to other causes, which may undermine its appeal to some people, and I am fairly skeptical of the idea that this “social stick” (as you call it) will actually be effective at persuading people who don’t already share your point of view.
Therefore, I am really averse to calling things like playpumps “pathological altruism.” There are a few things which are quite actively and (this is important) clearly harmful and thus probably warrant condemnation: for example, some situations where someone is enabling a drug habit. However, society already does have norms against many of these things, and there doesn’t need to be some new “EA social stick.”
I’m confused why this got downvoted without any comments; maybe someone thought this was duplicative of an older post/project? Regardless, assuming it isn’t duplicative I think it sounds like a good idea, and even if it is duplicative it might still be good to have another project attempt (so long as this one integrates lessons from the old one).
Does it make sense to share this with clubs from a homeschool league (i.e., one that doesn't have PF, will have different LD resolutions, and tends to have a fairly different debate culture from public school leagues)? I'm not particularly familiar with VBI.
Having a list of EA-related or EA-adjacent motions can be good, but if the goal is to make them usable at tournaments for parliamentary debate (the format with new resolutions each round, with each resolution announced only 15–25 minutes before the round) then the full list probably needs to not be public, so that some teams don't have foreknowledge of potential motions.
Alternatively, if the resolutions are meant for different debate formats or for practice or special tournaments, it might be fine to make it fully public.
Comment by Harrison D on [deleted post]
I’m curious what gave you the impression that “everyone is just trying to help out in small areas without a large coordinate effort,” as that is not really how I would describe the EA community?
Whenever your expected value calculation relies on infinity—especially if it relies on the assumption that an infinite outcome will only occur when given infinite attempts—your calculation is going to end up screwy. In this case, though, an infinite outcome is impossible: as others have pointed out, if you take the bet indefinitely you end up with nothing almost surely.
Relatedly, I think that at some point moral uncertainty might kick in and save the day.
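A quick simulation makes the ruin point concrete. The bet parameters below are invented for illustration (the original post's numbers may differ): each round has positive expected value on its own, yet repeating it drives almost every trajectory to zero.

```python
import random

random.seed(0)

# Hypothetical bet (illustrative parameters only): with probability 0.51
# your wealth triples; with probability 0.49 you lose everything. A single
# bet has positive expected value (0.51 * 3 = 1.53x), yet taking the bet
# over and over ends in ruin almost surely, since the chance of surviving
# n rounds is 0.51**n.

def wealth_after_repeated_bets(wealth=1, rounds=1000):
    """Take the bet `rounds` times (or until ruin); return final wealth."""
    for _ in range(rounds):
        if random.random() < 0.51:
            wealth *= 3
        else:
            return 0  # lost everything
    return wealth

trials = 10_000
ruined = sum(wealth_after_repeated_bets() == 0 for _ in range(trials))
print(f"fraction ruined after up to 1000 rounds: {ruined / trials:.4f}")
```

With a per-round survival probability of 0.51, the chance of lasting 1000 rounds is about 10^-292, so essentially every simulated trajectory ends at zero despite the positive single-round EV.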
I understand—and agree with—the overall point being made about “don’t just talk about the extreme things like paperclip maximizers”, but I’m still thrown off by the statement that “the mechanisms needed to prevent [paperclip maximizers] are the same as those needed to prevent the less severe and more plausible-sounding scenarios.”
It may be the case that we don’t even need to make general audiences consider paperclip maximizers at all, since the mechanisms needed to prevent them are the same as those needed to prevent the less severe and more plausible-sounding scenarios
I’m somewhat unsure what exactly you meant by this, but if your point is “solutions to near-term AI concerns like bias and unexpected failures will also provide solutions to long-term concerns about AI alignment,” that viewpoint is commonly disputed by AI safety experts.
Personally, I dislike the title framing on at least two accounts:
As Vaidehi mentioned, I don’t think that “values” have shifted, but rather implementation/interpretation of fairly steady values (e.g., truth-seeking/open-mindedness, impact-maximization) have shifted or at least expanded.
“Drift” has a connotation (if not outright denotation) of “drifting [away] [[from the ideal/target]]” whereas I think that it’s mainly just fair to say “interpretations are shifting."
I wonder if Twitter data (e.g., follows, engagement) would replicate some of these distinctions in terms of clustering, and if it might show areas of common cross-pollination? (Of course, there may be some representativeness issues with some groups, but it’d still be interesting or at least “amusing.”)
I’ve also had similar thoughts, but haven’t really thought about alternative names until now. Still, I’m not quickly thinking of obviously-great alternatives. Perhaps “EA Workshops” or “Seminars”?
Having said that, it’s worth pointing out that although “retreats” can often be used in religious contexts, there is plenty of usage in the sense of “corporate retreats.” So ultimately the label may not be that bad, it’s more a matter of how it’s framed and whether it involves a lot of people who are new to/unfamiliar with EA.
To me, summit feels a bit too grand and “culminatory” (or whatever the word is), either because I think of summits bringing together disparate groups of people (e.g., from different universities/countries) or at the end of some project.
Many issue areas most prioritized by EA – biosecurity, pandemic response, artificial intelligence – remain neglected within the State Department. If you can introduce a more rational, longtermist perspective into an often short-sighted policy process, the marginal impact of your presence can be quite significant.
How easy is it to go against the grain like that? Are there not institutional pressures to focus on short-term considerations?
I think it probably makes sense to change the title of the post for efficiency reasons (I.e., “don’t bother reading if you aren’t American”), but not because I think it contributes to EA being a more “globally welcoming and inclusive movement,” which I feel like is a less significant issue/concern here. (Yes, the argument seems to be that without saying “American EAs” the implied assumption is that all EAs are American, but I don’t think that’s a strong vibe; at the very least, I wouldn’t imply that the post shows hypocrisy in EA)
Setting aside whether or not such risks were actually significant, perhaps planetary protection could be an interesting example of where bureaucracies spent time and money to mitigate unknown risks from e.g., extraterrestrial contamination.
I'm not exactly sure what you have in mind for the research, but I think it might be interesting to at least draw parallels or have pseudo-benchmarks with policy responses to non-existential low-probability risks, such as 9/11 (or terrorism more generally) and US mass shootings.
re: "filtering", I really was only talking about "clearly uninteresting/bad" claims—i.e., things that almost no reasonable person would take seriously even before reading counterarguments. I can't think of many great examples off the top of my head—and in fact it might rarely ever require such moderation among most EAs—but perhaps one example may be conspiracy-theory claims like "Our lizard overlords will forever prevent AGI..." or non-sequiturs like "The color of the sky reflects a human passion for knowledge and discovery, and this love of knowledge can never be instilled in a machine that does not already understand the color blue."
In contrast, I do think it would probably be a good idea to allow heterodox claims like "AGI/human-level artificial intelligence will never be possible"—especially since such claims would likely be well-rebutted and thus downvoted.
Yes, de-duplication is a major reason why I support using these kinds of platforms: it just seems so wasteful to me that there are people out there who have probably done research on questions of interest to other people but their findings are either not public or not easy to find for someone doing research.
To be clear, that article only forecasts that outcome in the "business-as-usual" approach which seems to mean to them an increase of 5–8 degrees Celsius (figure 2B), which seems like a really high estimate; is that within the standard forecasted range, or is that more like the "assume all progress in renewable energy magically halts and we continue on as if nothing bad is happening" forecast?
1. How much suffering different environmental problems will cause is, as you know, difficult to put numbers on, especially in combination.
I think there is probably a range of decent estimates out there about mortality/DALYs as well as some economic costs under different scenarios (which should not include what I described above, if I understood what was meant by "business-as-usual"). It doesn't need to be precise to be helpful here; even an order of magnitude range could be very helpful, possibly even two orders of magnitude.
2. Ok, it’s only a 3-minute text with different aspects, but perhaps like this? Key point: If we continue to have overall GDP growth in rich countries this decade, we will most likely exceed the planetary boundaries even more. Is it worth that?
The estimated reading time on each post is only a loose estimate, and in this case it definitely was not a 3-minute read for me since I had to re-read multiple things to get a clear picture of what you were vs. weren't claiming + I had to read about some of the mentioned concepts, such as "planetary boundaries." Ultimately, it's just a good practice to have a tl;dr up front that summarizes your main points in 2–4 sentences.
As to the summary in this case, I would again re-emphasize my points above: I'd like to see actual rough estimates as to the potential costs of not pursuing degrowth, because "exceed the planetary boundaries" means basically nothing to me (and even what I briefly read was not very persuasive, especially if we're already exceeding the boundaries and not facing mass starvation/heat exhaustion/etc.)
we should prioritize the environment and other central societal goals and then GDP will be what it becomes. But we don't have a common word for that?
We don't have a term for "Environmental protection"?? That sounds like a failure of imagination. Even an acronym or "no catchphrase at all" seems better to me than "degrowth," which really seems like a counterproductive label.
4. Absolutely in some areas for a limited time, but on a global level we only see some relative decoupling between GDP and climate emissions, no absolute decoupling. Both climate emissions and GDP are globally higher than ever. When it comes to GDP and material footprint we see no global decoupling at all. See figure 1 about this paragraph: https://www.eea.europa.eu/publications/growth-without-economic-growth
To be blunt, that's a rather shallow, self-confirming collection/interpretation of data, especially since it doesn't even grapple with the ideas of the EKC: you shouldn't expect an aggregate/global decoupling when you have numerous developing countries with massive population sizes (China and India) as well as many other developing countries going through the low-income industrial/manufacturing stages currently. A more dispassionate review of the data would look at things with a more granular lens, which is what I did with some World Bank data for a class paper a few years ago, producing the following graphs:
When something becomes more efficient, we buy more instead of choosing more free time (rebound effect). Of course, it’s not impossible that it will be different in the future, but we need to drastically reduce our environmental impact now, otherwise we’re exceeding more tipping points that can’t be reversed, like losing most of the Amazon rainforest.
Counterpoint: If you replace coal plants with solar or other renewables (e.g., hydroelectric), you don't just adjust your consumption patterns so drastically as to make solar/etc. pollute as much as coal.
More broadly, it just seems like you aren't familiar with the research/argumentation on the EKC. If that's true, I would strongly encourage you to learn more about it if you are planning to focus on environmental concerns. The EKC is certainly debatable in terms of how powerful it is and whether it will act fast enough, but the theoretical evidence is very strong for some activities (i.e., replacing coal with renewables or at least natural gas), and some shallow review of the data (e.g., above) partially supports the idea (with some potential exceptions, such as with oil-producing economies).
There needs to be more quantification—even if only loose quantification—of the impact of environmental harms in this post. I was unclear what the problem we’re trying to avoid is: it felt a bit like hand waving, saying “we might miss these goals/targets” but without making the impact of that clear.
Could you try to summarize the post more clearly up front and/or use headers for different sections? (Or use bolding for key statements). The analysis felt a bit windy, which slowed down and undermined my reading/understanding.
My view is that “degrowth” is just a (politically) terrible catchphrase/policy label. Who even invented the term? An oil lobbyist? At least at the label level, the focus should not be on “let’s kill growth,” it should be on “[what do we want to achieve?]”—especially given the next point.
Many of the simplistic assumptions I’ve seen regarding the relationship between “growth” and environmental damage are wrong: arguments related to the Environmental Kuznets Curve (EKC) illustrate that regardless of whether the EKC is true on balance, there are definitely ways in which economic growth can also involve reducing pollution (e.g., developments in renewable energy) and economic growth makes environmental standards/health effects more important for some people.
If you aren’t opposed to donating to political campaigns: some campaign finance laws restrict the amount of money that can go directly to campaigns on a per-person basis, so at least that seems like an area where “small” donors can still matter.
I’m not sure how I never saw this response (perhaps I saw the notification but forgot to read), but thank you for the response!
I’m not familiar with the 6x6x6 synthesis; would it not require 216 participants, though? (That seems quite demanding) Or am I misunderstanding? (Also, the whole 666 thing might not make for the best optics in light of e.g., cult accusations, lol)
I’m not sure what you’re referring to regarding “curated,” but if you’re referring to the collection of ideas/claims on something like Kialo I think my point was just that you can have moderators filter out the ideas that seem clearly uninteresting/bad, duplicative, etc.
I’m not super motivated+available at the moment to do a full write up/analysis, but I’m quite skeptical of the idea that the default/equilibrium in EA would trend towards 100% grift, regardless of whether that is the standard in companies (which I also dispute, although I don’t disagree that as an organization becomes larger self-management becomes increasingly complex—perhaps more complex than can be efficiently handled by humans running on weak ancestral-social hardware).
It might be plausible that “grift” becomes more of a problem, approaching (say) 25% of spending, but there are a variety of strong horizontal (peer-to-peer) and vertical checks on blatant grift, and at some point if someone wants to just thoroughly scam people it seems like it would be more profitable to do it outside of EA.
I’d be happy to see someone else do a more thorough response, though.
I’m a bit confused and wanted to clarify what you mean by AGI vs AAGI: are you of the belief that AGI could be safely controlled (e.g., boxed) but that setting it to “autonomously” pursue the same objectives would be unsafe?
Could you describe what an AGI system might look like in comparison to an AAGI?
I support thinking about/discussing neglected problems like this, and it might be the case that there is serious room for improvement here. However, I do want to briefly push back on your selective reporting of the most favorable $/DALY estimate:
There’s also good evidence that treatment programs can be cost-effective. A review of hypertension control interventions reports a handful of studies with costs of less than $100 per DALY averted. This cutoff is sometimes referenced as a benchmark for the cost effectiveness of insecticide treated bednet programs.
When I read the study (Table 4), it seemed that most of the relevant estimates were multiple times over the $100/DALY mark (e.g., 200–1000).
This isn’t a terrible problem, but I would definitely prefer that posts like this more explicitly acknowledge/emphasize that most of the studies were not that positive. That was only the first claim made here that I investigated (admittedly because it seemed like one of the most concerning), so it leaves me a bit more skeptical as to the overall post. I’m also a bit unsure/skeptical as to how effectively the programs would scale and whether the costs incorporate certain administrative costs that might be incurred.
Still, like I said it might be the case that even with that being acknowledged there are still some interventions really worth supporting (or at least investigating further).
I feel like when I've heard people talk about this it was often targeted at specific high-importance, high-density buildings like schools, hospitals, government institutions, etc. rather than every office building.
Yeah, I recall my university organizing days and the awkwardness/difficulty of trying to balance "tell me about the careers you are interested in and why" and "here are the careers that seem highly impactful according to research/analysis."
I frequently thought things like "I'd like for people to have a way to share their perspective without feeling obligated to defend it, but I also don't want to blanket-validate everyone's perspectives by simply not being critical."
The end result ("I missed out on an opportunity") might be the same, but the process matters. There's a meaningful difference between, e.g., "having a breakdown and sending a long obscenity-filled rant-text to your former boss who then talks to your current boss and has you fired" and "not following up on an opportunity because you thought you had a better opportunity but you were probably wrong."
When I read the post title (“torched”), I was expecting to see a story of how you totally screwed something up for no good reason, but unless I missed something I would describe this as more like “passed over” rather than “torched.”
"This plan will also cause Z, which is morally bad" is its own disadvantage/con.
"... and outweighs the benefit of X" relates to the caveat listed in footnote 3: you are no longer attacking/challenging the advantage itself ("this plan causes X"), but rather just redirecting towards a disadvantage. (Unless you are claiming something like "the benefits of X are not as strong as you suggested," in which case you're attacking it on significance.)
I did include some example applications in the long introduction post (see my response to Khorton); I was worried that trying to include an example in this version might make it too long and thus lead to a high bounce rate… but perhaps I should have made it clear that I do have some applications in the old post.
Here is the first example (with updated terminology):
Consider lobbying for some policy change in a developing country—for example, on tobacco policy. Suppose that the proposal is to fund an advocacy campaign that would push for tighter controls on cigarettes, with the primary claimed advantage being “it will (increase the likelihood of passing legislation that will) reduce the mortality caused by smoking.” To evaluate this advantage, you would likely face questions such as:
Counterfactuality: What would happen without this intervention? (Imagine for example that someone claims the campaign is likely to work because there is a “growing wave of support” for the reform: this might mean that the reform—or a slightly less strong version of the reform—already has a decent chance of passing. As part of this, it may be the case that the advocacy campaign will already receive sufficient funding.)
Implementation: Do we actually have the necessary funding and can we actually meet the timeline outlined by the plan? (For example, are there any restrictions on foreign funding that have not been accounted for?)
Linkage: Supposing that the plan is implemented (or, for a given implementation of the plan), what is the resulting likelihood that the desired reform will be signed into law—and subsequently, how effective will the desired reform be in reducing mortality caused by smoking? (This introduces a recursion of this framework.)
Significance (assuming a utilitarian moral framework): How does “reducing mortality caused by smoking” translate to changes in wellbeing? If one considers the goal to simply be reducing smoking mortality, that might be achieved, but achieving it doesn’t guarantee an increase in wellbeing as measured more directly by a metric like QALYs. (For example, it’s possible that other widespread environmental problems significantly reduce the effect of smoking-mortality reduction on QALYs.)
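To make the framework concrete, here is a toy expected-value sketch of the tobacco example above. Every number is invented for illustration, and multiplying the factors treats them as independent, which in practice they often aren't:

```python
# Toy decomposition of the claimed advantage into the framework's factors.
# Every probability and figure here is hypothetical.
p_counterfactual = 0.8     # counterfactuality: chance the reform wouldn't happen anyway
p_implementation = 0.9     # implementation: funding and timeline actually work out
p_linkage = 0.3            # linkage: campaign leads to the reform becoming law
dalys_if_passed = 50_000   # significance: health impact if the reform passes

expected_dalys = p_counterfactual * p_implementation * p_linkage * dalys_if_passed
print(round(expected_dalys))  # 10800
```

The point of the decomposition is less the final number than seeing which factor dominates the uncertainty—here, the linkage term would be the natural place to push for better evidence.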
Just to clarify since I see you are new and you didn't mention it by name: Are you familiar with GiveWell? It's a fairly well-respected organization in the EA community, and people can simply donate money to GiveWell for it to allocate to the most effective charities.
Also, I want to push back on one of your potential assumptions: I'm not so confident that people donate to less-effective charities primarily as a result of "laziness", nor am I convinced that it's accurate to broadly say that "many people seem to not care who or what receives their donations." I can't read the full article that you cite, but its abstract does explicitly say "donors often support organisations that promote their own preferences, that help people with whom they feel some affinity and that support causes that relate to their own life experiences", which almost seems like the opposite of your interpretation. Regardless, I think it's probably more accurate to say that many people are not actually motivated by some impersonal desire to "maximize good" and instead are influenced by a variety of desires (and habits, aversions, etc.), including how donating makes them feel.
Thus, you might get many people donating to a seeing eye dog charity instead of the Fred Hollows Foundation not because they're actually trying to maximize how much they help other people but rather because, e.g., they've personally interacted with seeing eye dogs or such services have helped a family member, and donating to those charities therefore gives them just as many, if not more, "warm fuzzies" as donating to something far more effective at mitigating/preventing blindness, like the Fred Hollows Foundation.
And if learning about the Fred Hollows Foundation would make them feel less happy about donating to guide dog charities (e.g., "yeah, I guess these charities actually are ineffective, I'm just donating to make myself feel better") then it might even be instrumentally rational to not be intellectually curious about better alternatives.
If I'm reading you right, I don't think your points apply to near-term considerations, such as those arising from arms control in space.
That is mostly correct: I wasn't trying to respond to near-term space governance concerns, such as how to prevent space development or space-based arms races, which I think could indeed play into long-term/x-risk considerations (e.g., undermining cooperation in AI or biosecurity), and may also have near-term consequences (e.g., destruction of space satellites which undermines living standards and other issues).
But if you have a reason to be confident that none of it ends up being useful, it feels like that must be a general reason for scepticism about any kind of effort at improving governance, or even values change: that such efforts are rendered moot by the arrival of TAI. And I'm not fully sceptical about those efforts.
To summarize the point I made in response to Charles (which I think is similar, but correct me if I'm misunderstanding): I think that if an action is trying to improve things now (e.g., health and development, animal welfare, improving current institutional decision-making or social values), it can be justified under neartermist values (even if it might get swamped by longtermist calculations). But it seems that if one is trying to figure out "how do we improve governance of space settlements and interstellar travel that could begin 80–200 years from now," they run a strong risk of their efforts having effectively no impact on affairs 80–200 years from now, because AGI might develop before their efforts ever matter towards the goal, and humanity either goes extinct or the research is quickly obsolesced.
Ultimately, any model of the future needs to take into account the potential for transformative AI, and many of the pushes such as for Mars colonization just do not seem to do that, presuming that human-driven (vs. AI-driven) research and efforts will still matter 200 years from now. I'm not super familiar with these discussions, but to me this point stands out so starkly as 1) relatively easy to explain (although it may require introductions to superintelligence for some people); 2) substantially impactful on ultimate conclusions/recommendations, and 3) frequently neglected in the discussions/models I've heard so far. Personally, I would put points like this among the top 3–5 takeaway bullet points or in a summary blurb—unless there are image/optics reasons to avoid doing this (e.g., causing a few readers to perhaps-unjustifiably roll their eyes and disregard the rest of the problem profile).
Suppose before TAI arrived we came to a strong conclusion: e.g., we're confident we don't want to settle using such-and-such a method, or we're confident we shouldn't immediately embark on a mission to settle space once TAI arrives. What's the chance that work ends up making a counterfactual difference, once TAI arrives? Not quite zero, it seems to me.
This is an interesting point worth exploring further, but I think that it's helpful to distinguish—perhaps crudely?—between two types of problems:
1) Technical/scientific problems and "moral problems"—which are really just "the difficulty of understanding how our actions will relate to our moral goals, including what sub-goals we should have in order to achieve our ultimate moral goals (e.g., maximizing utility, maximizing virtue/flourishing)."
2) Social moral alignment—i.e., getting society to want to make more-moral decisions instead of being self-interested at the expense of others.
It seems to me that an aligned superintelligence would very likely be able to obsolesce every effort we make towards the first problem fairly quickly: if we can design a human-aligned superintelligent AI, we should be able to have it automate or at least inform us on everything from "how do we solve this engineering problem" to "will colonizing this solar system—or even space exploration in general—be good per [utilitarianism/etc.]?"
However, making sure that humans care about other extra-terrestrial civilizations/intelligence—and that the developers of AI care about other humans (and possibly animals)—might require some preparation such as via moral circle expansion. Additionally, I suppose it might be possible that a TAI's performance on the first problem is not as good as we expect (perhaps due to the second problem), and of course there are other scenarios I described where we can't rely as much on a (singleton) superintelligence, but my admittedly-inexperienced impression is that such scenarios seem unlikely.
This is partially an accurate objection (i.e., I do think that x-risks and other longtermist concerns tend to significantly outweigh near-term problems such as in health and development), but there is an important distinction to make with my objections to certain aspects of space governance:
Contingent on AI timelines, there is a decent chance that none of our efforts will even have a significantly valuable near-term effect (i.e., we won't achieve our goals by the time we get AGI). Consider the following from the post/article:
If the cost of travelling to other planetary bodies continues the trend in the chart above and falls by an order of magnitude or so, then we might begin to build increasingly permanent and self-sustaining settlements. Truly self-sustaining settlements are a long way off, but both NASA and China have proposed plans for a Moon base, and China recently announced plans to construct a base on Mars.
Suppose that it would take ~80 years to develop meaningful self-sustaining settlements on Mars without AGI or similar forms of superintelligence. But suppose that we get AGI/superintelligence in ~60 years: we might get misaligned AGI, in which case all the progress (and humanity) is erased; we might create aligned AGI, which might obsolesce all ~60 years of progress within 5 or so years (I would imagine even less time); or we might get something unexpected or in between, in which case maybe it does matter?
In contrast, at least with health and development causes you can argue "I let this person live another ~50 years... and then the AGI came along and did X."
Furthermore, all of this assumes that developing self-sustaining settlements is a valuable endeavor, which I think is often justified by the idea that we'll use those settlements for longer-term plans and experimentation in space exploration, which requires an even longer timeline.
I suppose "management complexity/demand" might indeed be a bit too narrow, but either way it just feels like you're basically trying to define "core competency-ness" as "difficulty of outsourcing this task [whether for management demand or other reasons]," in which case I think it would make more sense to just replace "core competency-ness" with "difficulty of outsourcing this task."
My worry is that trying to define "core competency-ness" that way feels a bit unintuitive, and could end up leading to accidental equivocation/motte-and-baileys if someone who isn't familiar with management theory/jargon interprets "core competency" as important functions X, Y, and Z, but you only mean it to refer to X and Y, reasoning that "Z is some really core part of our operation that we are competent at, but it can be outsourced, therefore it's not a core competency."
I feel like the discussion of AI is heavily underemphasized in this problem profile (in fact, in this post it is the last thing mentioned).
I used to casually think "sure, space governance seems like it could be a good idea to start on soon; space exploration needs to happen eventually, I guess," but once I started to consider the likelihood and impact of AI development within the next 200 or even ~60 years, I very heavily adjusted my thinking towards skepticism/pessimism.
That question of AI development seems like a massive gatekeeper/determinant to this overall question: I'm unclear how any present efforts towards long-term space governance and exploration matter in the case where AI 1) is extraordinarily superintelligent and agentic, and 2) operates effectively as a "singleton" -- which itself seems like a likely outcome from (1).
Some scenarios that come to my mind regarding AI development (with varying degrees of plausibility):
We create a misaligned superintelligence which leads to extinction or other forms of significant curtailment of humanity's cosmic potential, which renders all of our efforts towards space exploration unimportant for the long term.
We create an aligned, agentic superintelligent singleton which basically renders all of our efforts towards space exploration unimportant for the long term (because it will very very quickly surpass all of our previous reasoning and work).
We somehow end up with multiple highly intelligent agents (e.g., national AIs) that are somewhat "aligned" with certain values, but their intelligence does not enable them to identify/commit to positive-sum cooperation strategies (e.g., they cannot converge towards a singleton) and this curtails space expansion capabilities, but having developed norms in advance helps to (slightly?) mitigate this curtailment.
We determine that the alignment problem is unsolvable or too inherently risky to try to develop a superintelligence--at least for a few centuries--but we are also somehow able to prevent individual/unilateral actors from trying to create superintelligent agents, and so it may be worthwhile to get a "headstart" on space exploration, even if it only equates to improving our long-term future by some (0.00...1%).
Creating superintelligence/AGI proves far more elusive than we expect currently (and/or the alignment problem is difficult, see above) and thus takes many decades or even centuries longer, while at the same time space becomes a domain that could trigger hostilities or tensions that undermine coordination on existential risks (including AI).
Ultimately, I'd really like to see:
More up-front emphasis on the importance of AI alignment as a potential determinant.
Examination of the scenarios in which work on space governance would have been useful, given the first point, including how likely those scenarios appear to be.