Why do you care?
post by anonymous1 · 2022-05-08T10:44:54.290Z · EA · GW · 7 comments
In this post, I ask readers what motivates them, provide an example answer (i.e. a line of reasoning that motivates me) and, as a by-product, jot down some rudimentary considerations about AGI safety derived from this exercise. Happy reading!
I am curious to learn more about what motivates other readers of this forum. In 2015, the moral philosophy arguments for effective altruism relied heavily on utilitarianism, and I wonder whether this has evolved. If I remember correctly, moral philosophy discussions in Rationality: From AI to Zombies were cut short by the Is-Ought problem. I am also curious to hear whether anyone relies on solutions to the Is-Ought problem in their own moral philosophy/motivational framework.
I am looking for more “applied” philosophy rather than theoretical argumentation about philosophical approaches, though I welcome links between theory and practice. In brief, what drives you, fundamentally? When you ask yourself “why am I exerting so much effort toward that goal?”, what are the arguments you give yourself?
To illustrate the type of answers I am looking for, here is a reasoning that motivates me:
1. We cannot deduce from real observation whether the existence of reality has a purpose; we cannot deduce what we should do (~the Is-Ought problem). Note here the assumed definition that ‘purpose of reality’ = ‘what we should do’.
2. Current models of reality rely on the occurrence of a phenomenon that does not respect causality-as-my-brain-understands-it and ‘resulted’ in reality (~the Big Bang occurred but we don’t know why, and when we discover why, we won’t know why that, etc.)
3. Given this cosmological uncertainty, it is possible that the (literally metaphysical) laws/processes governing that precursor phenomenon apply around reality. (For example, if you are into simulation arguments: whatever launched our simulation is still around ‘outside’ the simulation)
4. And therefore it is possible that in the future, a similar causality-violating phenomenon strikes again and gives a purpose to reality (sim version: whatever launched our sim intervenes to give a hint about what the purpose of the simulation is)
5. Or, regardless of points 2-4, maybe through more research & philosophy, we will figure out a way around the Is-Ought problem down the road and find a convincing moral philosophy, a deterministic purpose informing us on what we should do.
6. By point 4 or point 5, there is a chance we’ll eventually get to know the purpose of reality i.e. know what we should do.
7. This purpose can take many forms, from “break free from the sim” or “capture the flag” to “beat heat death” or “create as many paperclips as possible”. Ultimately, the assumption here is that we find something informative about “what we should do” in a deterministic way. So for example “42” or “blue” would not qualify as purposes, in this reasoning (and we’d therefore have to wait/search/philosophise longer to get something informative).
8. Since it is something we should do, regardless of what it is, sentience should probably equip itself with as much deliberate control (incl. understanding) over its environment as possible, in order to be better able to “do” whatever in a deliberate way, and therefore to be ready to fulfill that potential upcoming purpose ASAP once it is discovered. And there is the meta implication that sentience should also invest in its ability to equip itself better/faster for deliberately controlling its environment (analogous to Omohundro’s basic AI drives arguments – cf. Post-script below). The term “deliberate” is important: by “deliberate control”, I don’t mean just “control”, which would be more akin to ~“conquest of the universe and expand the frontier of knowledge”, but the ability to deliberately direct this control over resources/knowledge towards the fulfillment of the purpose sentience sets itself to achieve, whichever it is. (So, not only better sensors and actuators for society, but also better decision-making process/more wisdom to use these sensors and actuators.)
9. So maximizing sentience’s deliberate control over its environment seems like a good starting point to fulfill the purpose of reality (whatever that purpose), it seems like a good instrumental objective for what we should do.
10. This is true unless the purpose is explicitly something on the spectrum of “do not maximize deliberate control over the environment” (which would be a surprisingly specific goal, so surprising in fact I’d get suspicious: it probably means whatever deus ex machina provides us the goal is afraid of something. In any case, if that is the purpose and you spent eons maximizing control, you have enough control to simply give up control straight away by letting everything go as soon as you discover this was the wrong thing to do.)
11. This applies also if there are competing purposes (~imagine a weird situation where the sim is fought over by multiple dei ex machina: whatever the purpose or combination of purposes you end up electing to fulfill, you are still in a better position by controlling your environment to achieve the compromise/the most likely true purpose).
12. If points 2-9 have a 0% chance of being correct, you are left with point 1, where nothing matters, and therefore wasting your time on achieving sentient deliberate control over the environment does not matter either; at least it keeps your mind busy. If points 2-9 have a non-zero chance of being correct, you should care and try to maximize sentience’s deliberate control over its environment (something like Pascal’s Wager?)
13. If point 1 is incorrect and there is already a way to deduce a goal for reality and an answer to what we should do, do share :^)
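The wager in point 12 can be made concrete with a toy expected-value comparison. This is only a sketch of the structure of the argument; every number in it is an illustrative placeholder, not an estimate:

```python
# Toy Pascal's-Wager-style expected-value comparison for point 12.
# All numbers are illustrative placeholders, not estimates.

p_purpose = 0.01                  # assumed chance that points 2-9 hold and a purpose is revealed
value_if_prepared = 1_000_000.0   # payoff of fulfilling the purpose, given prior control-maximizing
value_if_unprepared = 0.0         # payoff if we never built deliberate control
cost_of_effort = 10.0             # cost of pursuing deliberate control now

# Expected value of investing in deliberate control vs. doing nothing:
ev_invest = p_purpose * value_if_prepared - cost_of_effort
ev_idle = p_purpose * value_if_unprepared

print(ev_invest, ev_idle)
```

As long as `p_purpose` is non-zero and the payoff of being prepared is large enough to cover the effort, investing dominates idling; and if `p_purpose` is exactly zero, point 1 applies and neither option matters.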
A longer-than-expected post-script on considerations for AGI safety:
a. Points 6-9 made me think that an AI system that is still very uncertain about its goal (e.g. because it is trained through inverse reinforcement learning) could still exhibit Omohundro’s basic AI drives if it weakly ‘expects’ to have a goal in the future. So basic AI drives arise not only for “almost any goals”, as written in Omohundro’s 2008 paper, but also for weakly positive expectations of any goal. As a result, risk arises from the mere ‘conceptualization’ by a powerful AI system that it may eventually have a purpose. This might be true at the encoding of the objective function (though I am not sure about that), but it is definitely true at more advanced stages, where an AI system expects to be given instructions/to be applied to various tasks.
b. Point 10 made me think that, at the design level, there is a need for explicit, strong rewards for minimizing impact potential/Omohundro’s AI drives, even in AI systems that do not know the rest of their reward functions/their goals. There is also a need to make the relative weighting of that impact minimization immutable, even if the AI system can alter its own code (~so that we can be sure that, contrary to me, the AI system does not get suspicious of that goal).
c. Overall, writing down this reasoning made me think that blocking this “expectation of a goal” by design at an advanced stage, or damping the behaviors resulting from that expectation (e.g. “inactivity as a default” by design, until uncertainty is reduced), now seems more important to me than it did before. Solving the question of what the AI system should aim for might be secondary to solving Omohundro’s basic AI drives.
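One way to picture the design suggestion in points b-c is a reward function where the task objective is discounted by goal uncertainty while an immutable impact penalty always applies. This is a toy sketch only; the function shape, names, and weights are my own illustrative assumptions, not an established scheme:

```python
# Toy sketch of a reward with an immutable impact penalty (point b)
# and "inactivity as a default" under goal uncertainty (point c).
# All names and numbers are illustrative assumptions.

IMPACT_WEIGHT = 100.0  # meant to be immutable, i.e. not alterable by the agent


def reward(task_reward: float, goal_certainty: float, impact: float) -> float:
    """Task reward is discounted by uncertainty; impact is always penalized.

    goal_certainty in [0, 1]: 0 = no idea what the goal is, 1 = goal known.
    impact >= 0: some measure of how much the action changes the environment.
    """
    return goal_certainty * task_reward - IMPACT_WEIGHT * impact


# While the goal is unknown (certainty 0), any impactful action scores
# worse than doing nothing, so inactivity is the default:
print(reward(task_reward=5.0, goal_certainty=0.0, impact=0.1))  # negative
print(reward(task_reward=5.0, goal_certainty=0.0, impact=0.0))  # zero
```

Under these assumptions, the agent only starts trading impact for task reward once `goal_certainty` rises, which is one concrete reading of “inactivity as a default until uncertainty is reduced”.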