Scrutinizing AI Risk (80K, #81) - v. quick summary

post by Louis_Dixon (bdixon) · 2020-07-23T19:02:55.558Z · score: 10 (4 votes)



Epistemic status: I'm uncertain whether this accurately describes Ben's views. The podcast is great, and he's also doing a very interesting AMA. This is a very complex topic, and I would love to hear many different perspectives fleshed out in detail. Below is my attempt at a quick summary for those short on time.

Core ideas I took away (not that I necessarily agree with all of them)

  1. Brain in a box - the classic Bostrom-Yudkowsky scenario is one in which a superintelligent AGI is developed that is far more capable than anything else people are dealing with, i.e. a brain in a box. But we’d actually expect systems to develop incrementally, so we should see earlier examples of similar, concrete problems to work on.
  2. Intelligence explosion - one of the concepts behind the runaway intelligence explosion is recursive self-improvement: the AI starts to rewrite its own code or redesign its hardware and then gets much better. But many tasks go into improving a system, and even coding requires many different skills, so a system being able to improve one of its inputs doesn’t mean that its overall capacity will increase.
  3. Entanglement and capabilities - AI systems so far have usually become more capable by getting better at giving us what we want, i.e. by exploring the space of potential solutions more and more carefully. For example, house-cleaning robots only get better as they learn more about our preferences. Thermostats only become more effective and capable as they get better at moderating the temperature, because the intelligence needed to meet the goal is entangled with the goal itself. This should make us skeptical of scenarios involving extremely powerful and capable systems whose goals nonetheless diverge from ours.
  4. Hard to shape the future - if we take these arguments seriously, it might also be the case that AI safety can develop more gradually as a field over the coming decades, and that while it’s important, it might not be as much of a race as some have previously argued. To take a potentially analogous case, it’s not clear what someone in the 1500s could have done to influence the industrial revolution, even if they had strong reasons to think it would take off.

Some other points

My takeaways
