On Solving Problems Before They Appear: The Weird Epistemologies of Alignment 2021-10-11T08:21:20.045Z


Comment by adamShimi on The academic contribution to AI safety seems large · 2020-08-05T07:53:12.188Z · EA · GW

Hmm, I think I made my point badly in the comment above. What I mean isn't that formal methods will never be useful, just that they're not really useful yet, and will require more pure AI safety research to become useful.

The general reason is that all formal methods try to show that a program follows a specification on a model of computation. Right now, a lot of the work on formal methods applied to AI focuses on adapting known formal methods to the specific programs (say, neural networks) and to the right model of computation (in what contexts do you use these programs, how can you abstract their execution to make it simpler). But one point this work fails to address is the question of the specification.

Note that when I say specification, I mean a formal specification. In practice, it's usually a modal logic formula, in LTL for example. And here we get to the crux of my argument: nobody knows the specification for almost all the AI properties we care about. Nobody knows the specification for "Recognizing kittens" or "Correctly answering a question in English". And even for safety questions, we don't yet have a specification of "doesn't manipulate us" or "is aligned". That's the work that still needs to be done, and that's what people like Paul Christiano and Evan Hubinger, among others, are doing. But until we have such specifications, formal methods will not be really useful to either AI capability or AI safety.
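To make "formal specification" concrete, here is the kind of property that LTL expresses well. This is the textbook request/grant example; the atomic propositions request and grant are purely illustrative and do not come from any AI system:

```latex
\[
\mathbf{G}\,\bigl(\mathit{request} \rightarrow \mathbf{F}\,\mathit{grant}\bigr)
\]
```

It reads "at every point in the execution, if a request occurs, then a grant eventually follows". The difficulty I am pointing at is that nobody knows how to write a formula of this kind for "doesn't manipulate us".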

Lastly, I want to point out that working on formal methods for AI is also a means to get money and prestige. I'm not going to go full Hanson and say that's the only reason, but it's still a part of the international situation. I have examples of people getting AI-related funding in France for a project that is really, really useless for AI.

Comment by adamShimi on The academic contribution to AI safety seems large · 2020-08-02T21:14:25.896Z · EA · GW

This post annoyed me. Which is a good thing! It means that you hit where it hurts, and you forced me to reconsider my arguments. I also had to update (a bit) toward your position, because I realized that my "counter-arguments" weren't that strong.

Still, here they are:

  • I agree with the remark that much work will have both capability and safety consequences. But instead of seeing that as an argument to laud the safety aspect of capability-relevant work, I want to look for the differential technical progress. What makes me think that EA safety is more relevant than mainstream AI to safety questions is that for almost all EA safety work, the differential progress is in favor of safety, while for most research in mainstream/academic AI, the differential progress seems either neutral or in favor of capabilities. (I'd be very interested in counterexamples, on both sides.)
  • Echoing what Buck wrote, I think you might overestimate the value of research that has potential consequences for safety but is not about it. And thus I do think there's a significant value gain from focusing on safety problems specifically.
  • As for formal methods, they aren't even useful for AI capabilities, let alone for AI safety. I want to write a post about that at some point, but when you're unable to specify what you want, formal methods cannot save your ass.

With all that being said, I'm glad you wrote this post and I think I'll revisit it and think more about it.

Comment by adamShimi on Is it suffering or involuntary suffering that's bad, and when is it (involuntary) suffering? · 2020-06-23T20:39:20.443Z · EA · GW

Since many other answers treat the more general ideas, I want to focus on the "voluntary" sadness of reading, watching, or listening to sad stories. I was curious about this myself, because I noticed that reading only "positive" and "joyous" stories eventually feels empty.

The answer seems to be that sad elements in a story bring more depth than the fun/joyous ones. In that sense, sadness in stories acts as a signal of depth, but also as a way to access some deeper part of our emotions and internal life.

I'm reminded of Mark Manson's quote from this article:

If I ask you, “What do you want out of life?” and you say something like, “I want to be happy and have a great family and a job I like,” it’s so ubiquitous that it doesn’t even mean anything.
A more interesting question, a question that perhaps you’ve never considered before, is what pain do you want in your life? What are you willing to struggle for? Because that seems to be a greater determinant of how our lives turn out.

Maybe sadness and pain just tell us more about others and ourselves, and that's what we find so enthralling.

Comment by adamShimi on Causal diagrams of the paths to existential catastrophe · 2020-03-10T15:32:24.527Z · EA · GW

That answers my question, yes. :)

Comment by adamShimi on Causal diagrams of the paths to existential catastrophe · 2020-03-10T12:24:09.511Z · EA · GW

Thanks for that very in-depth answer!

I was indeed thinking about 3., even if 1. and 2. are also important. And I get that the main value of these diagrams is to force an explicit and as formal as possible statement to be made.

I guess my question was more about this: given two different causal diagrams for the same risk (made by different researchers, for example), do you have an idea of how to compare them? Like finding the first difference along the causal path, or other means of comparison. This seems important because even with clean descriptions of our views, we can still talk past each other if we cannot see where the difference truly lies.
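To illustrate what I mean by "finding the first difference along the causal path", here is a minimal sketch. It assumes each diagram is encoded as a dictionary mapping a node to the list of its direct effects; that encoding, the function name first_difference, and the node names are all hypothetical and chosen only for illustration, not taken from the post.

```python
from collections import deque

def first_difference(diagram_a, diagram_b, root):
    """Walk both diagrams breadth-first from root and return the first node
    whose direct effects differ, with the effects unique to each diagram.
    Returns None if no difference is reachable from root."""
    seen = {root}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        effects_a = set(diagram_a.get(node, ()))
        effects_b = set(diagram_b.get(node, ()))
        if effects_a != effects_b:
            return node, effects_a - effects_b, effects_b - effects_a
        # Effects are identical here, so follow them in a fixed order.
        for child in sorted(effects_a):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return None

# Hypothetical diagrams from two researchers for the same risk.
researcher_a = {
    "misaligned AGI": ["loss of control"],
    "loss of control": ["existential catastrophe"],
}
researcher_b = {
    "misaligned AGI": ["loss of control"],
    "loss of control": ["recoverable collapse"],
}

divergence = first_difference(researcher_a, researcher_b, "misaligned AGI")
```

In this toy case the two researchers agree up to "loss of control" and disagree about what follows from it, which is where the discussion should focus. A real comparison would also have to deal with nodes that are named differently but mean the same thing, which this sketch ignores.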

Comment by adamShimi on Causal diagrams of the paths to existential catastrophe · 2020-03-09T12:42:26.684Z · EA · GW

Great post! I feel these diagrams will be really useful for clarifying the possible interventions and parts of the existential risks.

Do you think they'll also serve for comparing different positions on a specific existential risk, like the trajectories in this post? Or do you envision the diagram for a specific risk as a summary of all causal pathways to this risk?

Comment by adamShimi on Cortés, Pizarro, and Afonso as Precedents for Takeover · 2020-03-03T12:59:46.744Z · EA · GW

What about diseases? I admit I know little about this period of history, but the accounts I read (for example in Guns, Germs and Steel) place the advantage in the spread of diseases to the Americas.

Basically, because the Americas lacked many big domesticated mammals, they could not have cities like European ones with cattle everywhere. The conditions of living in these big cities caused the spread of diseases. And when going to the Americas, the conquistadors took these diseases with them to a population which had never experienced them, causing most of the deaths of the early conquests.

(This is the picture from the few sources I've read. So it might be wrong or inaccurate, but if it is, I am very curious as to why.)

Comment by adamShimi on Effects of anti-aging research on the long-term future · 2020-02-29T12:07:18.406Z · EA · GW

Also interested. I did not think about it before, but since the old generation dying is one way scientific and intellectual changes are completely accepted, that would probably have some big impact on our intellectual landscape and culture.

Comment by adamShimi on My personal cruxes for working on AI safety · 2020-02-20T17:33:48.635Z · EA · GW

I'm curious about the article, but the link points to nothing. ^^

Comment by adamShimi on Michelle Graham: How evolution can help us understand wild animal welfare · 2020-02-16T14:00:59.653Z · EA · GW

Thanks a lot for this presentation and the corresponding transcript. I am quite new to thinking about animal welfare at all, and even more so about wild animal welfare, but I felt this presentation was easy to follow even from this point of view (my half-decent knowledge of evolution might have helped).

I like the clarification of evolution, and more specifically, of the fact that natural selection selects away options with bad fitness or bad relative fitness, instead of optimizing fitness to the maximum. That's a common issue when using theoretical computer science for modeling natural systems: instead of looking for the best algorithms for our classical measures (like time or space), we need to take into account the specifics of evolution (some forms of simplicity in the algorithms for example) and not necessarily optimize completely.

On the level of details and nitpicks, I have a few comments:

  • I'm not sure I understand differential reproduction correctly. Is it the fact that (in your example) blue bears have more offspring? Or that these offspring use the advantage of being blue to have even more offspring, which changes the proportion? Or both?
  • For the line representation, I think you wanted to define the line of all humans to be an inch long. Because without this, or a length for one individual, I cannot make sense of the comparison between the line of humans and the line of ants.
  • There is a red cross left in the background of the second "Assumptions of the Argument" slide, the one just after the example of rats and elephants.
  • I had never heard of exaptations! I am curious about references to the literature in general, and maybe also to the specific example you gave about feathers.
  • The hypothesis in the last question that less cognitive power entails more pain, because the signal needs to be stronger in order to be treated and registered... that's a fascinating idea. Horrible, of course, but I never thought about it that way. And that would be a counterweight to the "moral weight" argument about the relative value of different species.

Finally, on the specific topic of intervention for improving welfare, I have one worry: what of cases where two species have mutually exclusive needs? Something like a meat-eating species and the species it eats. In these cases, I feel like evolution left us with some sort of zero-sum game, and there might be necessary welfare tradeoffs because of it.

Comment by adamShimi on Illegible impact is still impact · 2020-02-14T12:40:52.647Z · EA · GW

Once upon a time, my desire to build a useful mastery and career made me neglect my family, and more precisely my little brothers. Not dramatically, but whenever we were together, I was too caught up in my own issues to act like a good older brother, or even to interact properly with them. At some point, I realized that giving time and attention to my family was also important to me, and thus that I could not simply allocate all my mental time and energy to "useful" things.

This happened before I discovered EA, and is not explicitly about the EA community, but that's what popped into my mind when reading this great post. In a sense, I refused to do illegible work (being a good brother, friend and son) because I considered it worthless in comparison with my legible aspirations.

Beyond becoming a utility monster, I think what you are pointing out is that optimizing what we can measure, even with caveats, leads to neglecting small things that matter a lot. And I agree that this issue is probably tractable, because the illegible but necessary tasks themselves tend to be small and not too much of a burden. If one wants to dedicate their career to them, great. But everyone can contribute in the ways you point out. Just a bit every day. Like calling your mom from time to time.

That is to say, just like you don't have to have an EA job to be an effective altruist, you don't have to dedicate all your life to illegible work to contribute some.

Comment by adamShimi on My personal cruxes for working on AI safety · 2020-02-13T16:35:40.593Z · EA · GW

On a tangent, what are your issues with quantum computing? Is it the hype? That might indeed be excessive given what we can do now. But the theory is fascinating, there are concrete applications where we should get positive benefits for humanity, and the actual researchers in the field try really hard to clarify what we know and what we don't about quantum computing.

Comment by adamShimi on My personal cruxes for working on AI safety · 2020-02-13T16:33:09.195Z · EA · GW

Thanks a lot for this great post! I think the part I like the most, even more than the awesome deconstruction of arguments and their underlying hypotheses, is the sheer number of times you said "I don't know" or "I'm not sure" or "this might be false". I feel it places you at the same level as your audience (including me), in the sense that you have more experience and technical competence than the rest of us, but you still don't know THE TRUTH, or sometimes even good approximations to it. And the standard way to clearly present ideas and research is to structure them so that the points we don't know are not the focus. So that was refreshing.

On the more technical side, I had a couple of questions and remarks concerning your different positions.

  • One underlying hypothesis that was not explicitly pointed out, I think, was that you are looking for priority arguments. That is, part of your argument is about whether AI safety research is the most important thing you could do (it might be so obvious in an EA meeting or on the EA Forum that it's not worth exploring, but I like making the obvious hypotheses explicit). But that's different from whether or not we should do AI safety research at all. That is one common criticism I have about taking effective altruism career recommendations at face value: we would not have, for example, pure mathematicians, because pure mathematics is never the priority. Whereas you could argue that without pure mathematics, almost all the positive technological progress we have now (from quantum mechanics to computer science) would not exist. (Note that this is not an argument for having a lot of mathematicians, just an argument for having some.)
  • For the problems-that-solve-themselves arguments, I feel like your examples have very "good" qualities for solving themselves: both personal and economic incentives are against them, they are obvious when one is confronted with the situation, and at the point where the problems become obvious, you can still solve them. I would argue that not all these properties hold for AGI. What are your thoughts about that?
  • About the "big deal" argument, I'm not sure that another big deal before AGI would invalidate the value of current AI safety research. What seems weird in your definition of a big deal is that if I assume the big deal, then I can make informed guesses and plans about the world after it, no? Something akin to The Age of Em by Hanson, where he starts with ems (whole-brain emulations) and then tries to derive what our current understanding of the various sciences can tell us about this future. I don't see why you can't do this even if there is another big deal before AGI. Maybe the only cost is more and more uncertainty.
  • The arguments you point out against the value of research now compared to research closer to AGI seem to forget about incremental research. Not all research is a breakthrough, and most if not all breakthroughs build on previous decades or centuries of quiet research work. In this sense, working on it now might be the only way to ensure the necessary breakthroughs closer to the deadline.

Comment by adamShimi on Snails used for human consumption: The case of meat and slime · 2020-02-12T14:45:22.765Z · EA · GW

Anecdotally, almost everyone I know from older generations eats snails, so it might indeed be generational. Whereas I know approximately the same number of people from each generation who dislike oysters (mostly because of the texture).

Comment by adamShimi on Morality vs related concepts · 2020-02-12T14:02:30.961Z · EA · GW

Thanks for the effort in summarizing and synthesizing this tangle of notions! Notably, I learned about axiology, and I am very glad I did.

One potential addition to the discussion of decision theory might be the use of "normative", "descriptive" and "prescriptive" within decision theory itself, which is slightly different. To quote the Decision Theory FAQ on Less Wrong:

We can divide decision theory into three parts (Grant & Zandt 2009; Baron 2008). Normative decision theory studies what an ideal agent (a perfectly rational agent, with infinite computing power, etc.) would choose. Descriptive decision theory studies how non-ideal agents (e.g. humans) actually choose. Prescriptive decision theory studies how non-ideal agents can improve their decision-making (relative to the normative model) despite their imperfections.

Because that is one way I think about these words, I got confused by your use of "prescriptive", even though you used it correctly in this context.

Comment by adamShimi on Snails used for human consumption: The case of meat and slime · 2020-02-11T13:35:31.672Z · EA · GW

Thanks for this thoughtful analysis! I must admit that I never really considered the welfare of snails as an issue, maybe because I am French and thus am culturally used to eating them.

One thing I wanted to confirm anecdotally is the consumption of snails in France. Even if snails with parsley butter is a classic French dish, it is eaten quite rarely (only for celebrations or Christmas and New Year's Eve dinners). And I know many people who don't eat snails because they find them disgusting, even though they have seen people eating them all their lives (similar to oysters in a sense).

As for what should be done, your case for the non-tractability and non-priority of snail welfare is pretty convincing. I still take from this post that undue pain (with or without sentience) is caused to snails, even from the position that it is okay to eat animals (my current position, which I am reassessing). I was quite horrified by the slime part.

Comment by adamShimi on State Space of X-Risk Trajectories · 2020-02-10T10:45:47.811Z · EA · GW

The geometric intuition underlying this post already proves useful for me!

Yesterday, while discussing with a friend why I want to change my research topic to AI Safety instead of what I currently do (distributed computing), my first intuition was that AI safety aims at shaping the future, while distributed computing is relatively agnostic about it. But a far better intuition comes when considering the vector along the current trajectory in state space, starting at the current position of the world, and whose direction and length capture the trajectory and the speed at which we follow it.

From this perspective, the difference between distributed computing/hardware/cloud computing research and AI safety research is obvious in terms of vector operations:

  • The former amounts to positive scaling of the vector, and thus makes us go along our current trajectory faster.
  • The latter amounts to rotations (and maybe scaling, but that is a bit less relevant), which allow us to change our trajectory.

And since I am not sure we are heading in the right direction, I prefer to be able to change the trajectory (at least potentially).
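Here is a minimal sketch of that distinction, assuming a two-dimensional state space purely for illustration (the post does not fix a dimension, and the vector values below are made up):

```python
import math

def scale(v, k):
    """Positive scaling: same direction, different speed."""
    return (k * v[0], k * v[1])

def rotate(v, theta):
    """Rotation by theta radians: same speed, different direction."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def direction(v):
    """Angle of the vector, in radians."""
    return math.atan2(v[1], v[0])

def speed(v):
    """Length of the vector."""
    return math.hypot(v[0], v[1])

# Hypothetical current trajectory vector of the world.
current = (3.0, 4.0)

faster = scale(current, 2.0)           # e.g. research that speeds us up
turned = rotate(current, math.pi / 2)  # e.g. research that changes direction
```

Scaling leaves direction(faster) equal to direction(current) and only changes the speed, while rotating keeps the speed and changes the direction, which is the only one of the two operations that can steer us somewhere else.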

Comment by adamShimi on Differential progress / intellectual progress / technological development · 2020-02-09T21:30:02.396Z · EA · GW

That's a great criterion! We might be able to find some weird counter-example, but it solves all of my issues. Because intellectual work/knowledge might be a part of all actions, but it isn't necessary on the main causal path.

I think this might actually deserve its own post.

Comment by adamShimi on Differential progress / intellectual progress / technological development · 2020-02-09T19:02:31.219Z · EA · GW

Thanks for the in-depth answer!

Let's take your concrete example about democracy. If I understand correctly, you separate the progress towards democracy into:

  • discovering/creating the concept of democracy, learning it, and spreading the concept itself, which falls under differential intellectual progress.
  • convincing people to implement democracy and doing the fieldwork for implementing it, which falls at least partially under differential progress.

But the thing is, I don't have a clear criterion for distinguishing the two. My first ideas were:

  • differential intellectual progress is about any interaction where the relevant knowledge of some participant increases (in the democracy example, learning the idea is relevant, learning that the teeth of your philosophy teacher are slightly ajar is not). And then differential progress is about any interaction making headway towards a change in the world (in the example the implementation of democracy). But I cannot think of a situation where no one learns anything relevant to the situation at hand. That is, for these definitions, differential progress is differential intellectual progress.
  • Another idea is that differential intellectual progress is about all the work needed for making rational agents implement a change in the world, while differential progress is about all the work needed for making humans implement a change in the world. Here the two are clearly different. My issue there stems from the word intellectual: in this case Kahneman and Tversky's work, and pretty much all of behavioral economics, is not intellectual.

Does either of these two criteria feel right to you?

Comment by adamShimi on What are information hazards? · 2020-02-08T15:54:45.685Z · EA · GW

I did find this post clear and useful; it will be my main recommendation if I want to explain this concept to someone else.

I also really like your proposal of "potential information hazards", as at that point in the post, I was wondering if all basic research should be considered an information hazard, which would make the whole concept rather vacuous. Maybe one way to address potential information hazards is to try to quantify how far removed they are from potential concrete risks?

Anyway, I'm looking forward to the next posts on dealing with these potential information hazards.

Comment by adamShimi on Differential progress / intellectual progress / technological development · 2020-02-08T15:36:23.282Z · EA · GW

Thanks a lot for this summary post! I did not know of these concepts, and I feel they are indeed very useful for thinking about these issues.

I do have some trouble with the distinction between intellectual progress and progress though. During my first reading, I felt like all the topics mentioned in the progress section were actually about intellectual progress.

Now, rereading the post, I think I see a distinction, but I have trouble crystallizing it in concrete terms. Is the difference between creating ideas and implementing them? But then it feels very reminiscent of the whole fundamental/applied research distinction, which gets blurry very fast. And even the application of ideas and solutions requires a whole lot of intellectual work.

Maybe the issue is with the word intellectual? I get that you're not the one choosing the terms, but maybe something like fundamental progress or abstraction progress or theoretical progress would be more fitting? Or did I miss some other difference?

Comment by adamShimi on State Space of X-Risk Trajectories · 2020-02-07T16:29:30.973Z · EA · GW

As a tool for existential risk research, I feel like the graphical representation will indeed be useful in crystallizing the differences in hypotheses between researchers. It might even serve as a self-assessment tool, for quickly checking some of the consequences of one's own view.

But beyond the trajectories (and maybe specific distances), are you planning on representing the other elements you mention? Like the uncertainty or the speed along trajectories? I feel like the more details about an approach can be integrated into a simple graphical representation, the more this tool will serve to disentangle disagreement between researchers.

Comment by adamShimi on Attempted summary of the 2019-nCoV situation — 80,000 Hours · 2020-02-05T23:34:46.716Z · EA · GW

Thanks a lot for this podcast! I liked the summary you provided, and I think it is great to see people struggling to make sense of a lot of complex information on a topic, almost live. Given that you repeat multiple times that neither of you is an expert on the subject, I think this podcast is a net positive: it gives information while encouraging listeners to go and look for themselves.

Another great point: the criticism of the meme about overreacting. While listening to the beginning, when you said that there was no reason to panic, I wanted to object that preparing for possible catastrophes is as important, if not more important, to do before they are obviously here. But the discussion of the meme clarified this point, and I thought it was great.