Why I think the Foundational Research Institute should rethink its approach 2017-07-20T20:46:27.298Z
A review of what affective neuroscience knows about suffering & valence. (TLDR: Affective Neuroscience is very confused about what suffering is.) 2017-01-13T02:01:11.223Z
Principia Qualia: blueprint for a new cause area, consciousness research with an eye toward ethics and x-risk 2016-12-09T05:47:23.087Z


Comment by mikejohnson on Qualia Research Institute: History & 2021 Strategy · 2021-01-28T07:13:17.622Z · EA · GW

Hi Daniel,

Thanks for the reply! I am a bit surprised at this:

> Getting more clarity on emotional valence does not seem particularly high-leverage to me. What's the argument that it is?

The quippy version is that, if we’re EAs trying to maximize utility, and we don’t have a good understanding of what utility is, more clarity on such concepts seems obviously insanely high-leverage. I’ve written about specifics relevant to FAI here: Relevance to building a better QALY here: And I discuss object-level considerations on how a better understanding of emotional valence could lead to novel therapies for well-being here: On mobile, pardon the formatting.

Your points about sufficiently advanced AIs obsoleting human philosophers are well-taken, though I would touch back on my concern that we won’t have particular clarity on philosophical path-dependencies in AI development without doing some of the initial work ourselves, and these questions could end up being incredibly significant for our long-term trajectory — I gave a talk about this for MCS that I’ll try to get transcribed (in the meantime I can share my slides if you’re interested). I’d also be curious to flip your criticism and ping your models for a positive model for directing EA donations — is the implication that there are no good places to donate to, or that narrow-sense AI safety is the only useful place for donations? What do you think the highest-leverage questions to work on are? And how big are your ‘metaphysical uncertainty error bars’? What sorts of work would shrink these bars?

Comment by mikejohnson on Qualia Research Institute: History & 2021 Strategy · 2021-01-26T09:06:50.061Z · EA · GW

Hi Daniel,

Thanks for the remarks! Prioritization reasoning can get complicated, but to your first concern:

Is emotional valence a particularly confused and particularly high-leverage topic, and one that might plausibly be particularly conducive to getting clarity on? I think it would be hard to argue in the negative on the first two questions. Resolving the third question might be harder, but I’d point to our outputs and increasing momentum. I.e., one could level this sort of skepticism at literally any cause, and I think we hold up excellently in a relative sense. We may have to jump to the object-level to say more.

To your second concern, I think a lot about AI and ‘order of operations’. Could we postulate that some future superintelligence might be better equipped to research consciousness than we mere mortals? Certainly. But might there be path-dependencies here such that the best futures happen if we gain more clarity on consciousness, emotional valence, the human nervous system, the nature of human preferences, and so on, before we reach certain critical thresholds in superintelligence development and capacity? Also — certainly.

Widening the lens a bit, qualia research is many things, and one of these things is an investment in the human-improvement ecosystem, which I think is a lot harder to invest effectively in (yet also arguably more default-safe) than the AI improvement ecosystem. Another ‘thing’ qualia research can be thought of as being is an investment in Schelling point exploration, and this is a particularly valuable thing for AI coordination.

Even if we grant that the majority of humanity's future trajectory will be determined by AGI trajectory — which seems plausible to me — I think it’s also reasonable to argue that qualia research is one of the highest-leverage areas for positively influencing AGI trajectory and/or the overall AGI safety landscape.

Comment by mikejohnson on New book — "Suffering-Focused Ethics: Defense and Implications" · 2020-06-01T10:05:04.864Z · EA · GW

Congratulations on the book! I think long works are surprisingly difficult and valuable (both to author and reader) and I'm really happy to see this.

My intuition on why there's little discussion of core values is a combination of "a certain value system [is] tacitly assumed" and "we avoid discussing it because ... discussing values is considered uncooperative." To wit, most people in this sphere are computationalists, and the people here who have thought the most about this realize that computationalism inherently denies the possibility of any 'satisfyingly objective' definition of core values (and suffering). Thus it's seen as a bit of a faux pas to dig at this -- the tacit assumption is, the more digging that is done, the less ground for cooperation there will be. (I believe this stance is unnecessarily cynical about the possibility of a formalism.)

I look forward to digging into the book. From a skim, I would just say I strongly agree about the badness of extreme suffering; when times are good we often forget just how bad things can be. A couple quick questions in the meantime:

  • If you could change people's minds on one thing, what would it be? I.e., what do you find the most frustrating/pernicious/widespread mistake on this topic?
  • One intuition pump I like to use is: 'if you were given 10 billion dollars and 10 years to move your field forward, how precisely would you allocate it, and what do you think you could achieve at the end?'
Comment by mikejohnson on Reducing long-term risks from malevolent actors · 2020-05-05T09:40:11.625Z · EA · GW

A core 'hole' here is metrics for malevolence (and related traits) visible to present-day or near-future neuroimaging.

Briefly -- Qualia Research Institute's work around connectome-specific harmonic waves (CSHW) suggests a couple angles:

(1) proxying malevolence via the degree to which the consonance/harmony in your brain is correlated with the dissonance in nearby brains;
(2) proxying empathy (lack of psychopathy) by the degree to which your CSHWs show integration/coupling with the CSHWs around you.

Both of these analyses could be done today, given sufficient resource investment. We have all the algorithms and in-house expertise.
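In code, the two proxy analyses might look something like the following. This is a toy sketch on synthetic data; the variable names, the consonance/dissonance scores, and the per-harmonic coupling measure are all illustrative assumptions on my part, not QRI's actual CSHW pipeline.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical inputs: per-timepooint scores for one subject and for
# the brains around them. In a real study these would come from
# connectome-specific harmonic wave (CSHW) decompositions of imaging data.
T = 200                                  # timepoints
subject_consonance = rng.normal(size=T)  # subject's consonance over time
nearby_dissonance = rng.normal(size=T)   # mean dissonance of nearby brains

# (1) Malevolence proxy: does the subject's internal harmony track
#     others' dissonance? A strong positive correlation would be the
#     worrying signal.
malevolence_proxy = np.corrcoef(subject_consonance, nearby_dissonance)[0, 1]

# (2) Empathy proxy: coupling between the subject's harmonic amplitudes
#     and those of a nearby brain, averaged over harmonics.
K = 10                                   # number of connectome harmonics
subj_harmonics = rng.normal(size=(K, T))
other_harmonics = rng.normal(size=(K, T))
coupling = np.mean([
    np.corrcoef(subj_harmonics[k], other_harmonics[k])[0, 1]
    for k in range(K)
])

print(malevolence_proxy, coupling)
```

With random data both numbers hover near zero; the substantive work is in extracting real consonance/dissonance and harmonic time-series from neuroimaging, which is where the resource investment mentioned above would go.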

Background about the paradigm:

Comment by mikejohnson on Intro to Consciousness + QRI Reading List · 2020-04-09T01:58:09.504Z · EA · GW

Very important topic! I touch on McCabe's work in Against Functionalism (EA forum discussion); I hope this thread gets more airtime in EA, since it seems like a crucial consideration for long-term planning.

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-30T01:54:50.987Z · EA · GW

Hey Pablo! I think Andres has a few up on Metaculus; I just posted QRI's latest piece of neuroscience here, which has a bunch of predictions (though I haven't separated them out from the text):

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T03:50:43.830Z · EA · GW

We’ve looked for someone from the community to do a solid ‘adversarial review’ of our work, but we haven’t found anyone that feels qualified to do so and that we trust to do a good job, aside from Scott, and he's not available at this time. If anyone comes to mind do let me know!

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T01:53:48.064Z · EA · GW

I think this is a great description. "What happens if we seek out symmetry gradients in brain networks, but STV isn't true?" is something we've considered, and determining ground-truth is definitely tricky. I refer to this scenario as the "Symmetry Theory of Homeostatic Regulation" (mostly worth looking at the title image, no need to read the post)

I'm (hopefully) about a week away from releasing an update to some of the things we discussed in Boston, basically a unification of Friston/Carhart-Harris's work on FEP/REBUS with Atasoy's work on CSHW -- will be glad to get your thoughts when it's posted.

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T01:37:14.931Z · EA · GW

I think we actually mostly agree: QRI doesn't 'need' you to believe qualia are real, that symmetry in some formalism of qualia corresponds to pleasure, or that there is any formalism about qualia to be found at all. If we find some cool predictions, you can strip out any mention of qualia from them, and use them within the functionalism frame. As you say, the existence of some cool predictions won't force you to update your metaphysics (your understanding of which things are ontologically 'first class objects').

But you won't be able to copy our generator by doing that -- the thing that created those novel predictions -- and I think that's significant, and gets into questions of elegance metrics and philosophy of science.

I actually think the electromagnetism analogy is a good one: skepticism is always defensible, and in 1600, 1700, 1800, 1862, and 2018, people could be skeptical of whether there's 'deep unifying structure' behind these things we call static, lightning, magnetism, shocks, and so on. But it was much more reasonable to be skeptical in 1600 than in 1862 (the year Maxwell's Equations were published), and more reasonable in 1862 than it was in 2018 (the era of the iPhone).

Whether there is 'deep structure' in qualia is of course an open question in 2019. I might suggest STV is equivalent to a very early draft of Maxwell's Equations: not a full systematization of qualia, but something that can be tested and built on in order to get there. And one that potentially ties together many disparate observations into a unified frame, and offers novel / falsifiable predictions (which seem incredibly worth trying to falsify!)

I'd definitely push back on the frame of dualism, although this might be a terminology nitpick: my preferred frame here is monism -- and perhaps this somewhat addresses your objection that 'QRI posits the existence of too many things'.

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T00:19:45.334Z · EA · GW

Thanks Matthew! I agree issues of epistemology and metaphysics get very sticky very quickly when speaking of consciousness.

My basic approach is 'never argue metaphysics when you can argue physics' -- the core strategy we have for 'proving' we can mathematically model qualia is to make better and more elegant predictions using our frameworks, with predicting pain/pleasure from fMRI data as the pilot project.

One way to frame this is that at various points in time, it was completely reasonable to be a skeptic about modeling things like lightning, static, magnetic lodestones, and such, mathematically. This is true to an extent even after Faraday and Maxwell formalized things. But over time, with more and more unusual predictions and fantastic inventions built around electromagnetic theory, it became less reasonable to be skeptical of such.

My metaphysical arguments are in my 'Against Functionalism' piece, and to date I don't believe any commenters have addressed my core claims:

But I think metaphysical arguments change distressingly few people's minds. Experiments, and especially technology, change people's minds. So that's what our limited optimization energy is pointed at right now.

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-20T00:09:15.313Z · EA · GW

QRI is tackling a very difficult problem, as is MIRI. It took many, many years for MIRI to gather external markers of legitimacy. My inside view is that QRI is on the path of gaining said markers; for people paying attention to what we're doing, I think there's enough of a vector right now to judge us positively. I think these markers will be obvious from the 'outside view' within a short number of years.

But even without these markers, I'd poke at your position from a couple angles:

I. Object-level criticism is best

First, I don't see evidence you've engaged with our work beyond very simple pattern-matching. You note that "I also think that I'm somewhat qualified to assess QRI's work (as someone who's spent ~100 paid hours thinking about philosophy of mind in the last few years), and when I look at it, I think it looks pretty crankish and wrong." But *what* looks wrong? Obviously doing something new will pattern-match to crankish, regardless of whether it is crankish, so in terms of your rationale-as-stated, I don't put too much stock in your pattern detection (and perhaps you shouldn't either). If we want to avoid accidentally falling into (1) 'negative-sum status attack' interactions, and/or (2) hypercriticism of any fundamentally new thing, neither of which is good for QRI, for MIRI, or for community epistemology, object-level criticisms (and having calibrated distaste for low-information criticisms) seem pretty necessary.

Also, we do a lot more things than just philosophy, and we try to keep our assumptions about the Symmetry Theory of Valence separate from our neuroscience - STV can be wrong and our neuroscience can still be correct/useful. That said, empirically the neuroscience often does 'lead back to' STV.

Some things I'd offer for critique:

(you can also watch our introductory video for context, and perhaps as a 'marker of legitimacy', although it makes very few claims)

I'd also suggest that the current state of philosophy, and especially philosophy of mind and ethics, is very dismal. I give my causal reasons for this here: I'm not sure if you're anchored to existing theories in philosophy of mind being reasonable or not.

II. What's the alternative?

If there's one piece I would suggest engaging with, it's my post arguing against functionalism. I think your comments presuppose that functionalism is reasonable and/or the only possible approach, and that the efforts QRI is putting into building an alternative are certainly wasted. I strongly disagree with this; as I noted in my Facebook reply,

>Philosophically speaking, people put forth analytic functionalism as a theory of consciousness (and implicitly a theory of valence?), but I don't think it works *qua* a theory of consciousness (or ethics or value or valence), as I lay out here: This is more-or-less an answer to some of Brian Tomasik's (very courageous) work, and to sum up my understanding I don't think anyone has made or seems likely to make 'near mode' progress, e.g. especially of the sort that would be helpful for AI safety, under the assumption of analytic functionalism.


I always find in-person interactions more amicable & high-bandwidth -- I'll be back in the Bay early December, so if you want to give this piece a careful read and sit down to discuss it I'd be glad to join you. I think it could have significant implications for some of MIRI's work.

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T15:50:29.065Z · EA · GW

Thanks, added.

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T08:52:55.780Z · EA · GW

Buck- for an internal counterpoint you may want to discuss QRI's research with Vaniver. We had a good chat about what we're doing at the Boston SSC meetup, and Romeo attended a MIRI retreat earlier in the summer and had some good conversations with him there also.

To put a bit of a point on this, I find the "crank philosophy" frame a bit questionable if you're using only a thin-slice outside view and not following what we're doing. Probably, one could use similar heuristics to pattern-match MIRI as "crank philosophy" also (probably, many people have already done exactly this to MIRI, unfortunately).

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T08:35:56.588Z · EA · GW

We're pretty up-front about our empirical predictions; if critics would like to publicly bet against us we'd welcome this, as long as it doesn't take much time away from our research. If you figure out a bet we'll decide whether to accept it or reject it, and if we reject it we'll aim to concisely explain why.

Comment by mikejohnson on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-19T08:22:29.867Z · EA · GW

For fuller context, here is my reply to Buck's skepticism about the 80% number during our back-and-forth on Facebook. As a specific comment, the number is loosely held, more of a conversation-starter than anything else. As a general comment, I'm skeptical of publicly passing judgment on my judgment based on one offhand (and unanswered -- it was not engaged with) comment on Facebook. Happy to discuss details in a context where we'll actually talk to each other. :)

--------------my reply from the Facebook thread a few weeks back--------------

I think the probability question is an interesting one-- one frame is asking what is the leading alternative to STV?

At its core, STV assumes that if we have a mathematical representation of an experience, the symmetry of this object will correspond to how pleasant the experience is. The latest addition to this (what we're calling 'CDNS') assumes that consonance under Selen Atasoy's harmonic analysis of brain activity (connectome-specific harmonic waves, CSHW) is a good proxy for this in humans. This makes relatively clear predictions across all human states and could fairly easily be extended to non-human animals, including insects (anything we can infer a connectome for, and the energy distribution for the harmonics of the connectome). So generally speaking we should be able to gather a clear signal as to whether the evidence points this way or not (pending resources to gather this data- we're on a shoestring budget).

Empirically speaking, the competition doesn't seem very strong. As I understand it, currently the gold standard for estimating self-reports of emotional valence via fMRI uses regional activity correlations, and explains ~16% of the variance. Based on informal internal estimations looking at coherence within EEG bands during peak states, I'd expect us to do muuuuch better.

Philosophically speaking, people put forth analytic functionalism as a theory of consciousness (and implicitly a theory of valence?), but I don't think it works *qua* a theory of consciousness (or ethics or value or valence), as I lay out here: This is more-or-less an answer to some of Brian Tomasik's (very courageous) work, and to sum up my understanding I don't think anyone has made or seems likely to make 'near mode' progress, e.g. especially of the sort that would be helpful for AI safety, under the assumption of analytic functionalism.

So in short, I think STV is perhaps the only option that is well-enough laid out, philosophically and empirically, to even be tested, to even be falsifiable. That doesn't mean it's true, but my prior is it's ridiculously worthwhile to try to falsify, and it seems to me a massive failure of the EA and x-risk scene that resources are not being shifted toward this sort of inquiry. The 80% I gave was perhaps a bit glib, but to dig a little, I'd say I'd give at least an 80% chance of 'Qualia Formalism' being true, and given that, a 95% chance of STV being true, and a 70% chance of CDNS+CSHW being a good proxy for the mathematical symmetry of human experiences.
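Multiplying these three estimates through (reading them as a chain of conditionals, which is my gloss on the passage above, not something stated explicitly) gives the implied joint probability of the full stack holding:

```python
# The three probabilities given above, chained as conditionals:
# P(Qualia Formalism) * P(STV | Formalism) * P(CDNS+CSHW proxy | STV)
p_formalism = 0.80
p_stv_given_formalism = 0.95
p_cdns_given_stv = 0.70

p_joint = p_formalism * p_stv_given_formalism * p_cdns_given_stv
print(round(p_joint, 3))  # prints 0.532
```

So the decomposed estimates imply roughly a 53% credence in the full CDNS+CSHW package, noticeably below the glib 80% figure, which is consistent with the "loosely held" caveat above.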

An obvious thing we're lacking is resources; a non-obvious thing we're lacking is good critics. If you find me too confident I'd be glad to hear why. :)

Principia Qualia: (arguments for formalism and STV laid out)
Against Functionalism: (an evaluation of what analytic functionalism actually gives us)
Quantifying Bliss: (Andres Gomez Emilsson's combination of STV plus Selen Atasoy's CSHW, which forms the new synthesis we're working from)
A Future for Neuroscience: (more on CSHW)

Happy to chat more in-depth about details.

Comment by mikejohnson on Wireheading as a Possible Contributor to Civilizational Decline · 2018-11-16T02:07:39.212Z · EA · GW

Hi Alexey,

Glad to see the thought that went into this post. I'm in agreement that this is a real danger and potentially an x-risk. I'd recommend the following links from QRI:

  • A review of what affective neuroscience knows about valence -- essentially, affective neuroscience is very confused about what pleasure is, and thus is confused about how to talk about wireheading.
  • Wireheading Done Right -- there are better and worse ways to do hedonic recalibration. Andres describes a scenario that might be the best we can hope for.
  • A Future for Neuroscience -- one way to cut through the confusion plaguing affective neuroscience (and thus avoid some 'dirty wireheading' scenarios) is to find a good algorithm-level description of pain and pleasure. I think QRI has found one.
  • An interview with Adam Ford I did on the topic -- here's the relevant excerpt:
A device that could temporarily cause extreme positive or negative valence on demand would immediately change the world.
First, it would validate valence realism in a very visceral way. I’d say it would be the strongest philosophical argument ever made.
Second, it would obviously have huge economic & ethical uses.
Third, I agree that being able to induce strong positive & negative valence on demand could help align different schools of utilitarianism. Nothing would focus philosophical arguments about the discount rate between pleasure & suffering more than a (consensual!) quick blast of pure suffering followed by a quick blast of pure pleasure. Similarly, a lot of people live their lives in a rather numb state. Giving them a visceral sense that ‘life can be more than this’ could give them ‘skin in the game’.
Fourth, it could mess a lot of things up. Obviously, being able to cause extreme suffering could be abused, but being able to cause extreme pleasure on-demand could lead to bad outcomes too. You (Andres) have written about wireheading before, and I agree with the game-theoretic concerns involved. I would also say that being able to cause extreme pleasure in others could be used in adversarial ways. More generally, human culture is valuable and fragile; things that could substantially disrupt it should be approached carefully.
A friend of mine was describing how in the 70s, the emerging field of genetic engineering held the Asilomar Conference on Recombinant DNA to discuss how the field should self-regulate. The next year, these guidelines were adopted by the NIH wholesale as the basis for binding regulation, and other fields (such as AI safety!) have attempted to follow the same model. So the culture around technologies may reflect a strong “founder effect”, and we should be on the lookout for a good, forward-looking set of principles for how valence technology should work.
One principle that seems to make sense is to not publicly post ‘actionable’ equations, pseudocode, or code for how one could generate suffering with current computing resources (if this is indeed possible). Another principle is to focus resources on positive, eusocial applications only, insofar as that’s possible– I’m especially concerned about addiction, and bad actors ‘weaponizing’ this sort of research. Another would be to be on guard against entryism, or people who want to co-opt valence research for political ends.
All of this is pretty straightforward, but it would be good to work it out a bit more formally, look at the successes and failures of other research communities, and so on.
A question I find very interesting is whether valence research is socially disruptive or socially stabilizing by default. I think we should try very hard to make it a socially stabilizing force. One way to think about this is in terms of existential risk. It’s a little weird to say, but I think the fact that so many people are jaded, or feel hopeless, is a big existential risk, because they feel like they have very little to lose. So they don’t really care what happens to the world, because they don’t have good qualia to look forward to, no real ‘skin in the game’. If valence tech could give people a visceral, ‘felt sense’ of wonder and possibility, I think the world could become a much safer place, because more people would viscerally care about AI safety, avoiding nuclear war, and so on.
Finally, one thing that I think doesn’t make much sense is handing off the ethical issues to professional bioethicists and expecting them to be able to help much. Speaking as a philosopher, I don’t think bioethics itself has healthy community & research norms (maybe bioethics needs some bioethicsethicists…). And in general, I think especially when issues are particularly complex or technical, I think the best type of research norms comes from within a community.

Generally speaking, I think we should be taking this topic a lot more seriously, and thinking about specific plans for how to make hedonic recalibration happen in pro-social, pro-human ways, rather than the (default?) cyberpunk dystopia route.

Comment by mikejohnson on Is it better to be a wild rat or a factory farmed cow? A systematic method for comparing animal welfare. · 2018-09-18T20:00:22.463Z · EA · GW

Glad to see work on this.

It seems to me there are two questions here: (1) what are the average effects of different environments (e.g. wilderness; factory farm) on animal well-being? (2) what is the average hedonic well-being of different species?

It feels like you're attempting to find a method that will give the combined score for any given animal. But maybe it'd be best to focus on each individually. Some of the methods you mentioned (e.g. cortisol levels, behavior anomalies, self-narcotization) seem fairly solid for addressing (1), if you had more data. What's the biggest hurdle to gathering more data? Can you think of any clever ways to gather lots of data cheaply? Basically it seems really useful to try to build an intra-species hedonic comparison first, and worry about inter-species comparisons later.

That said-- on inter-species comparisons, I don't think any of the methods you mention are likely to give a good answer to (2), especially as none deal directly with brain activity. It's possible (although I don't know for sure) that some of QRI's work is relevant here -- essentially, we have a method ('CDNS') that could be adapted to estimate the degree to which a given connectome is naturally 'tuned' toward harmony or dissonance. This would face many of the same data & validation challenges you mention for other proxy measures, but essentially I'm skeptical that it's possible to address (2) without something like what QRI is doing: something that actually looks at brain activity and doesn't rely on hard-coded assumptions about things that could be species-specific and are probably leaky anyway (e.g., 'brain region X is associated with pain').

If it checks out, this could give a rough inter-species comparison of natural hedonic set-points between literally any two connectomes-- cows, chickens, rats, grasshoppers, mosquitos, humans. Probably not an end-all-be-all, but a useful tool in the toolbox. More on our 'CDNS' method.

Comment by mikejohnson on Ineffective entrepreneurship: post-mortem of Hippo, the happiness app that never quite was · 2018-05-23T14:49:55.455Z · EA · GW

I admire many things about this story, not least being your willingness to try & fail, and your brutal honesty in sharing what lessons you learned (or didn't learn).

There are many bad things about silicon valley, but one thing I think it gets right is giving partial credit for failed moonshots. Thank you for Really Trying to make this happen.

Comment by mikejohnson on Moral Anti-Realism Sequence #1: What Is Moral Realism? · 2018-05-23T14:38:00.804Z · EA · GW

Thanks for putting this out there. I like how you list the two versions of moral realism you find coherent, and especially that you list what would convince you of each.

My intuition here is that the first option is the case, but also that instead of speaking about moral realism we should talk about qualia formalism. I.e., whether consciousness is real enough that it can be spoken about in crisp formal terms seems prior to whether morality is real in that same sense. I've written about this here, and spoken about it in the intro of my TSC2018 talk.

Whether qualia formalism is true seems an empirical question; if it is, we should be able to make novel and falsifiable predictions with it. This seems like a third option for moving forward, in addition to your other two.

Comment by mikejohnson on Why I prioritize moral circle expansion over artificial intelligence alignment · 2018-03-02T05:15:49.080Z · EA · GW

EA forum threads auto-hide so I’m not too worried about clutter.

I don’t think you’re fully accounting for the difference in my two models of meaning. And, I think the objections you raise to consciousness being well-defined would also apply to physics being well-defined, so your arguments seem to prove too much.

To attempt to address your specific question, I find the hypothesis that ‘qualia (and emotional valence) are well-defined across all arrangements of matter’ convincing because (1) it seems to me the alternative is not coherent (as I noted in the piece on computationalism I linked for you) and (2) it seems generative and to lead to novel and plausible predictions I think will be proven true (as noted in the linked piece on quantifying bliss and also in Principia Qualia).

All the details and sub arguments can be found in those links.

Will be traveling until Tuesday; probably with spotty internet access until then.

Comment by mikejohnson on Why I prioritize moral circle expansion over artificial intelligence alignment · 2018-02-28T18:05:47.988Z · EA · GW

This is an important point and seems to hinge on the notion of reference, or the question of how language works in different contexts. The following may or may not be new to you, but trying to be explicit here helps me think through the argument.

Mostly, words gain meaning from contextual embedding- i.e. they’re meaningful as nodes in a larger network. Wittgenstein observed that often, philosophical confusion stems from taking a perfectly good word and trying to use it outside its natural remit. His famous example is the question, “what time is it on the sun?”. As you note, maybe notions about emotional valence are similar- trying to ‘universalize’ valence may be like trying to universalize time-zones, an improper move.

But there’s another notable theory of meaning, where parts of language gain meaning through deep structural correspondence with reality. Much of physics fits this description, for instance, and it’s not a type error to universalize the notion of the electromagnetic force (or electroweak force, or whatever the fundamental unification turns out to be). I am essentially asserting that qualia is like this- that we can find universal principles for qualia that are equally and exactly true in humans, dogs, dinosaurs, aliens, conscious AIs, etc. When I note I’m a physicalist, I intend to inherit many of the semantic properties of physics, how meaning in physics ‘works’.

I suspect all conscious experiences have an emotional valence, in much the same way all particles have a charge or spin. I.e. it’s well-defined across all physical possibilities.

Comment by mikejohnson on Why I prioritize moral circle expansion over artificial intelligence alignment · 2018-02-26T03:35:03.691Z · EA · GW

Thanks, this is helpful. My general position on your two questions is indeed "Yes/No".

The question of 'what are reality's natural kinds?' is admittedly complex and there's always room for skepticism. That said, I'd suggest the following alternatives to your framing:

  • Whether the existence of qualia itself is 'crisp' seems prior to whether pain/pleasure are. I call this the 'real problem' of consciousness.

  • I'm generally a little uneasy with discussing pain/pleasure in technically precise contexts- I prefer 'emotional valence'.

  • Another reframe to consider is to disregard talk about pain/pleasure, and instead focus on whether value is well-defined on physical systems (i.e. the subject of Tegmark's worry here). Conflation of emotional valence & moral value can then be split off as a subargument.

Generally speaking, I think if one accepts that it's possible in principle to talk about qualia in a way that 'carves reality at the joints', it's not much of a stretch to assume that emotional valence is one such natural kind (arguably the 'c. elegans of qualia'). I don't think we're logically forced to assume this, but I think it's prima facie plausible, and paired with some of our other work it gives us a handhold for approaching qualia in a scientific/predictive/falsifiable way.

Essentially, QRI has used this approach to bootstrap the world's first method for quantifying emotional valence in humans from first principles, based on fMRI scans. (It also should work for most non-human animals; it's just harder to validate in that case.) We haven't yet done the legwork on connecting future empirical results here back to the computationalism vs physicalism debate, but it's on our list.

TL;DR: If consciousness is a 'crisp' thing with discoverable structure, we should be able to build/predict useful things with this that cannot be built/predicted otherwise, similar to how discovering the structure of electromagnetism let us build/predict useful things we could not have otherwise. This is probably the best route to solve these metaphysical disagreements.

Comment by mikejohnson on Why I prioritize moral circle expansion over artificial intelligence alignment · 2018-02-25T20:46:38.823Z · EA · GW

It seems to me your #2 and #4 still imply computationalism and/or are speaking about a straw man version of physicalism. Different physical theories will address your CPT reversal objection differently, but it seems pretty trivial to me.

If I understood you correctly, physicalism as a statement about consciousness is primarily a negative statement, "the computational behavior of a system is not sufficient to determine what sort of conscious activity occurs there", which doesn't by itself tell you what sort of conscious activity occurs.

I would generally agree, but would personally phrase this differently; rather, as noted here, there is no objective fact-of-the-matter as to what the 'computational behavior' of a system is. I.e., no way to objectively derive what computations a physical system is performing. In terms of a positive statement about physicalism & qualia, I'm assuming something on the order of dual-aspect monism / neutral monism. And yes insofar as a formal theory of consciousness which has broad predictive power would depart from folk intuition, I'd definitely go with the formal theory.

Comment by mikejohnson on Why I prioritize moral circle expansion over artificial intelligence alignment · 2018-02-25T05:30:04.149Z · EA · GW

Possibly the biggest unknown in ethics is whether bits matter, or whether atoms matter.

If you assume bits matter, then I think this naturally leads into a concept cluster where speaking about utility functions, preference satisfaction, complexity of value, etc, makes sense. You also get a lot of weird unresolved thought-experiments like homomorphic encryption.

If you assume atoms matter, I think this subtly but unavoidably leads to a very different concept cluster-- qualia turns out to be a natural kind instead of a leaky reification, for instance. Talking about the 'unity of value thesis' makes more sense than talking about the 'complexity of value thesis'.

TL;DR: I think you're right that if we assume computationalism/functionalism is true, then pleasure and suffering are inherently ill-defined, not crisp. They do seem well-definable if we assume physicalism is true, though.

Comment by mikejohnson on Why I prioritize moral circle expansion over artificial intelligence alignment · 2018-02-22T02:56:12.391Z · EA · GW

Second, I'd imagine that a mature science of consciousness would increase MCE significantly. Many people don't think animals are conscious, and almost no one thinks anything besides animals can be conscious. How would we even know if an AI was conscious, and if so, if it was experiencing joy or suffering? The only way would be if we develop theories of consciousness that we have high confidence in. But right now we're very limited in studying consciousness, because our tools at interfacing with the brain are crude. Advanced neurotechnologies could change that - they could allow us to potentially test hypotheses about consciousness. Again, developing these technologies would be a technical problem.

I think that's right. Specifically, I would advocate consciousness research as a foundation for principled moral circle expansion. I.e., if we do consciousness research correctly, the equations themselves will tell us how conscious insects are, whether algorithms can suffer, how much moral weight we should give animals, and so on.

On the other hand, if there is no fact of the matter as to what is conscious, we're headed toward a very weird, very contentious future of conflicting/incompatible moral circles, with no 'ground truth' or shared principles to arbitrate disputes.

Edit: I'd also like to thank Jacy for posting this- I find it a notable contribution to the space, and clearly a product of a lot of hard work and deep thought.

Comment by mikejohnson on On funding medical research · 2018-02-18T19:54:36.420Z · EA · GW

Thanks for this writeup. I found it thoughtful and compelling.

My understanding is that ME is real, serious, and understudied/underfunded. Perhaps the core reason it's understudied is that it's unclear where to start, physiologically speaking -- there's a lot of ontological uncertainty in terms of what this thing is that sometimes cripples people.

This is sometimes solvable by throwing resources at the problem, and sometimes not.

It might be helpful to survey some other diseases that followed a similar trajectory (mysterious crippling conditions that later resolved into known diseases with known causes and known treatments) and see if there are any general lessons to learn. My expectation here is that often, what makes a mystery disease 'make sense' is a new method that gives a novel/fresh window into physiology. Celiac disease could be an interesting case study: it was hugely mysterious (and hugely underdiagnosed) until (1) we got a decent IgG screen, and (2) we started to understand how gut permeability works.

I'd also suggest that you may be a little too cynical about alternative medicine; there's a huge amount of snake-oil there, but alternative medicine is also highly heterogeneous, exploring a lot of the possibility space. There will be a lot of bs, but there often are some pearls as well. Mainstream medicine is also not particularly known for immediately finding these pearls and synthesizing them back into the medical literature, so I think it's also plausible that a viable way to make progress on this problem is to survey what alt-med thinks it knows about ME, filter the bs out, and see if there's anything left that can help mainstream medicine understand what ME is and what general class of treatments might help.

Comment by mikejohnson on Lessons for Building Up a Cause · 2018-02-18T07:57:18.651Z · EA · GW

This is really great info-- I'll probably blog about this on the QRI website at some point. (Thanks!)

Another interesting resource along these lines is Luke M's set of case studies about early field growth: I particularly enjoyed his notes on how bioethicists shaped medicine.

Comment by mikejohnson on “Just take the expected value” – a possible reply to concerns about cluelessness · 2018-01-20T01:02:44.307Z · EA · GW

I’m late to the party, but I’ve really enjoyed this series of posts. Thanks for writing.

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2018-01-17T21:58:50.817Z · EA · GW

Aaronson's "Is 'information is physical' contentful?" also seems relevant to this discussion (though I'm not sure exactly how to apply his arguments):

But we should’ve learned by now to doubt this sort of argument. There’s no general principle, in our universe, saying that you can hide as many bits as you want in a physical object, without those bits influencing the object’s observable properties. On the contrary, in case after case, our laws of physics seem to be intolerant of “wallflower bits,” which hide in a corner without talking to anyone. If a bit is there, the laws of physics want it to affect other nearby bits and be affected by them in turn. ... In summary, our laws of physics are structured in such a way that even pure information often has “nowhere to hide”: if the bits are there at all in the abstract machinery of the world, then they’re forced to pipe up and have a measurable effect. And this is not a tautology, but comes about only because of nontrivial facts about special and general relativity, quantum mechanics, quantum field theory, and thermodynamics. And this is what I think people should mean when they say “information is physical.”

Comment by mikejohnson on Mental Health Shallow Review · 2017-11-20T21:42:28.072Z · EA · GW

I would definitely endorse these.

Comment by mikejohnson on Mental Health Shallow Review · 2017-11-20T20:46:51.805Z · EA · GW

This is a wonderful overview. I especially appreciated the notes about possible biases in each study.

My expectation is that the "mental health tech" field is also worth keeping an eye on, although it's often characterized by big claims and not a lot of supporting data. I'm cautiously optimistic that an app like UpLift (Spencer Greenberg et al.) might be able to improve upon existing self-administered CBT options.

There have also been a lot of promising developments in neuroscience and 'applied philosophy of mind', and if there are ways of turning these into technology, it seems plausible we could start to see some "10x results". Better ways to understand what's going on in brains will lead to better tools to fix them when they break.

The two paradigms I find most intriguing here are

  • the predictive coding / free energy paradigm (primary work by Karl Friston, Anil K. Seth, Andy Clark; for a nice summary see SSC's book review of Surfing Uncertainty and 'toward a predictive theory of depression' - also, Adam Safron is an EA who really knows his stuff here, and would be a good person to talk to about how predictive coding models could help inform mental health interventions)

  • the connectome-specific harmonic wave paradigm (primary work by Selen Atasoy; for a nice summary see this video&transcript - this has informed much of QRI's thinking about mental health)

I'd also love to survey other peoples' intuitions on what neuroscience work they think could lead to a '10x breakthrough' in mental health tech.

Comment by mikejohnson on Anti-tribalism and positive mental health as high-value cause areas · 2017-11-04T20:22:13.070Z · EA · GW

I prefer to keep discussion on the object level

I'm not seeing object-level arguments against mental health as an EA cause area. We have made some object-level arguments for, and I'm working on a longer-form description of what QRI plans in this space. Look for more object-level work and meta-level organizing over the coming months.

I'd welcome object-level feedback on our approaches. It didn't seem like your comments above were feedback-focused, but rather they seemed motivated by a belief that this was not "a good direction for EA energy to go relative to the other major ones." I can't rule that out at this point. But I don't like seeing a community member just dishing out relatively content-free dismissiveness on people at a relatively early stage in trying to build something new. If you don't see any good interventions here, and don't think we'll figure out any good interventions, it seems much better to just let us fail, rather than actively try to pour cold water on us. If we're on the verge of using lots of community resources on something that you know to be unworkable, please pour the cold water. But if your argument boils down to "this seems like a bad idea, but I can't give any object-level reasons, but I really want people to know I think this is a bad idea" then I'm not sure what value this interaction can produce.

But, that said, I'd also like to apologize if I've come on too strong in this back-and-forth, or if you feel I've maligned your motives. I think you seem smart, honest, invested in doing good as you see it, and are obviously willing to speak your mind. I would love to channel this into making our ideas better! In trying to do something new, there's approximately a 100% chance we'll make a lot of mistakes. I'd like to enlist your help in figuring out where the mistakes are and better alternatives. Or, if you'd rather preemptively write off mental health as a cause area, that's your prerogative. But we're in this tent together, and although all the evidence I have suggests we have significantly different (perhaps downright dissonant) cognitive styles, perhaps we can still find some moral trade.

Best wishes, Mike

Comment by mikejohnson on Multiverse-wide cooperation in a nutshell · 2017-11-02T21:42:38.465Z · EA · GW

This is a very clear description of some cool ideas. Thanks to you and Caspar for doing this!

Comment by mikejohnson on Anti-tribalism and positive mental health as high-value cause areas · 2017-10-28T15:02:30.755Z · EA · GW

Hi Gregory,

We have never interacted before this, at least to my knowledge, and I worry that you may be bringing some external baggage into this interaction (perhaps some poor experience with some cryonics enthusiast...). I find your "let's shut this down before it competes for resources" attitude very puzzling and aggressive, especially since you show zero evidence that you understand what I'm actually attempting to do or gather support for on the object-level. Very possibly we'd disagree on that too, which is fine, but I'm reading your responses as preemptively closed and uncharitable (perhaps veering toward 'aggressively hostile') toward anything that might 'rock the EA boat' as you see it.

I don't think this is good for EA, and I don't think it's working off a reasonable model of the expected value of a new cause area. I.e., you seem to be implying the expected cause area would be at best zero, but more probably negative, due to zero-sum dynamics. On the other hand, I think a successful new cause area would more realistically draw in or internally generate at least as many resources as it would consume, and probably much more -- my intuition is that at the upper bound we may be looking at something as synergistic as a factorial relationship (with three causes, the total 'EA pie' might be 3*2*1=6; with four causes the total 'EA pie' might be 4*3*2*1=24). More realistically, perhaps 4+3+2+1 instead of 3+2+1. This could be and probably is very wrong-- but at the same time I think it's more accurate than a zero-sum model.
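To make the arithmetic behind these three models concrete, here is a minimal sketch (my own toy illustration, not from the comment; the numbers are illustrative units, not real estimates):

```python
import math

def zero_sum(n):
    """Zero-sum model: the pie stays fixed no matter how many causes exist."""
    return 6

def additive(n):
    """Additive model: each new cause adds its own contribution (n + ... + 1)."""
    return sum(range(1, n + 1))

def factorial_synergy(n):
    """Upper-bound synergy model: causes multiply each other's value (n!)."""
    return math.factorial(n)

for n in (3, 4):
    print(n, zero_sum(n), additive(n), factorial_synergy(n))
```

With three causes all three models happen to agree (6); the disagreement only appears when a fourth cause is added, which is why the choice of model matters for evaluating new cause areas.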

At any rate, I'm skeptical that we can turn this discussion into something that will generate value to either of us or to EA, so unless you have any specific things you'd like to discuss or clarify, I'm going to leave things here. Feel free to PM me questions.

Comment by mikejohnson on Anti-tribalism and positive mental health as high-value cause areas · 2017-10-27T23:03:22.413Z · EA · GW

I worry that you're also using a fully-general argument here, one that would also apply to established EA cause areas.

This stands out at me in particular:

Naturally I don't mind if enthusiasts pick some area and give it a go, but appeals to make it a 'new cause area' based on these speculative bets look premature by my lights: better to pick winners based on which of the disparate fields shows the greatest progress, such that one forecasts similar marginal returns to the 'big three'.

There's a lot here that I'd challenge. E.g., (1) I think you're implicitly overstating how good the marginal returns on the 'big three' actually are, (2) you seem to be doubling down on the notion that "saving lives is better than improving lives" or that "the current calculus of EA does and should lean toward reduction of mortality, not improving well-being", which I challenged above, and (3) I don't think your analogy between cryonics (which, for the record, I'm skeptical on as an EA cause area) and e.g. Enthea's collation of research on psilocybin is very solid.

I would also push back on how dismissive "Naturally I don't mind if enthusiasts pick some area and give it a go, but appeals to make it a 'new cause area' based on these speculative bets look premature by my lights" sounds. Enthusiasts are the ones that create new cause areas. We wouldn't have any cause areas, save for those 'silly enthusiasts'. Perhaps I'm misreading your intended tone, however.

Comment by mikejohnson on Anti-tribalism and positive mental health as high-value cause areas · 2017-10-27T18:33:26.749Z · EA · GW

I don't think mental health has comparably good ... [c]ost per QALY or similar.

Some hypothetical future intervention could be much better, but looking for these isn't that neglected, and such progress looks intractable given we understand the biology of a given common mental illness much more poorly than a typical NTD.

I think the core argument for mental health as a new cause area is that (1) yes, current mental health interventions are pretty bad on average, but (2) there could be low-hanging fruit locked away behind things that look 'too weird to try', and (3) EA may be in a position to signal-boost the weird things ('pull the ropes sideways') that have a plausible chance of working.

Using psilocybin as an adjunct to therapy seems like a reasonable example of some low-hanging fruit that's effective, yet hasn't been Really Tried, since it is weird. And this definitely does not exhaust the set of weird & plausible interventions.

I'd also like to signal-boost @MichaelPlant's notion that "A more general worry is that effective altruists focus too much on saving lives rather than improving lives." At some point, we'll get to hard diminishing returns on how many lives we can 'save' (delay the passing of) at reasonable cost or without significant negative externalities. We may be at that point now. If we're serious about 'doing the most good we can do' I think it's reasonable to explore a pivot to improving lives -- and mental health is a pretty key component of this.

Comment by mikejohnson on Effective Altruism for Animals: Consideration for different value systems · 2017-10-25T03:53:27.296Z · EA · GW

Hi Kevin,

I think it may be useful to frame your critiques in terms of causal stories -- e.g., how strategy or structural condition X, fails to achieve goal Y, that organization Z has explicitly endorsed. Offering a gears-level model of what you think is happening, and why that's bad, is probably the best way to (1) change peoples minds, if they're wrong, and (2) allow other people to change your mind, if you're wrong.

A few more specific things that I think are worth clarifying or pushing back on:

Welfare vs exploitation framing: You note the distinction between the pro-welfare vs anti-exploitation wings of animal advocacy, and suggest that the dominance of the pro-welfare wing has created some discontent in people with alternative value systems. I think that's a fair comment, but I'd also suggest (as an observer who is not associated with the organizations you mentioned) that the welfare-centric approach may have good reasons for popularity in the marketplace of ideas. Personally, as a valence realist, I believe that caring about animal welfare is much more philosophically defensible than caring about animal exploitation, because I think welfare is more 'real' (better definable; less of a leaky reification; hews closer to what actually has value) than exploitation/justice. I certainly could be wrong and it could be there are solid reasons why I should care more about alternative framings, but I'd need to see good philosophical arguments for this.

Democratisation / accountability at ACE: I should note that I'm not affiliated with ACE whatsoever, but I have been following them as an organization. I too have some qualms about some things they've written, but it seems my qualms run in the opposite direction of yours. :) I.e., I think equity, inclusion, and diversity can be good things, but I also believe organizations have a limited 'complexity budget', and by requiring of themselves an explicit focus on these things, ACE may be watering down their core goal of helping animals. However, I would also add (1) I'm glad ACE exists, (2) my impression is they’re doing a fine job, and (3) I don't see myself as having much standing (‘skin in the game’) to critique ACE.

This is not to say your concerns are baseless, but it is to note there are people who seem to share your goals (‘being good to animals’ is a non-trivial reason why I’m doing the work I’m doing, and I assume you feel the same), yet would pull in exactly the opposite direction you would.

Probably the most effective moral trade here is that we should just let ACE be ACE.

It could be that this isn’t the best approach, and that EAA orgs should ‘pay more attention to other perspectives’. But I think the burden of proof is on those who would make this assertion to be very clear about (1) what exactly their perspective is, (2) what exactly their perspective entails, practically and philosophically, (3) whether they have any ‘skin in the game’ in relevant ways, (4) what’s uniquely ethical or effective about these perspectives, among the countless perspectives out there, and by implication (5) why EAAs (such as ACE) should change their methods and/or goals to accommodate them.

Comment by mikejohnson on Anti-tribalism and positive mental health as high-value cause areas · 2017-10-24T20:38:32.143Z · EA · GW

Can you say more about the "revealed constraints" here? What would be the appropriate preconditions for "starting the party?" I think it can and should be done - we've embraced frontline cost-effectiveness in doing good today, and we've embraced initiatives oriented towards good in the far future even in the absence of clear interventions; even so, global mental health hasn't quite fit into either of those EA approaches, despite being a high-burden problem that is extremely neglected and arguably tractable.

Right, I think an obvious case can be made that mental health is Important; making the case that it's also Tractable and Neglected requires more nuance but I think this can be done. E.g., few non-EA organizations are 'pulling the ropes sideways', have the institutional freedom to think about this as an actual optimization target, or are in a position to work with ideas or interventions that are actually upstream of the problem. My intuition is that mental health is hugely aligned with what EAs actually care about, and is much much more tractable and neglected than the naive view suggests. To me, it's a natural fit for a top-level cause area.

The problem I foresee is that EA hasn't actually added a new Official Top-Level Cause Area since... maybe EA was founded? And so I don't expect to see much of a push from the EA leadership to add mental health as a cause area -- not because they don't want it to happen, but because (1) there's no playbook for how to make it happen, and (2) there may be local incentives that hinder doing this.

More specifically: mental health interventions that actually work are likely to be weird- e.g., Michael D. Plant's ideas about drug legalization are a little weird; Enthea's ideas about psilocybin are more weird; QRI's valence research is very weird. Now, at EAG there was a talk suggesting that we 'Keep EA Weird'. But I worry that's a retcon, that weird things have been grandfathered into EA but institutional EA is not actually very weird, and despite lots of funding, it has very little funding for Actually Weird Things. Looking at what gets funded ('revealed preferences') I see support for lots of conventionally-worthy things and some appetite for moderately weird things, but almost none for things that are sufficiently weird that they could seed a new '10x+' cause area ("zero-to-one weird").

*Note to all EA leadership reading this: I would LOVE LOVE LOVE to be proven wrong here!

So, my intuition is that EAs who want this to happen will need to organize, make some noise, 'start the party', and in general nurture this mental-health-as-cause-area thing until it's mature enough that 'core EA' orgs won't need to take a status hit to fund it. I.e., if we want EA to rally around mental health, it's literally up to people like us to make that happen.

I think if we can figure out good answers to these questions we'd have a good shot:

  • Why do you think mental health is Neglected and Tractable?

  • Why us, why now, why hasn't it already been done?

  • Which threads & people in EA do you think could be rallied under the banner of mental health?

  • Which people in 'core EA' could we convince to be a champion of mental health as an EA cause area?

  • Who could tell us What It Would Actually Take to make mental health a cause area?

  • What EA, and non-EA, organizations could we partner with here? Do we have anyone with solid connections to these organizations?

(Anyone with answers to these questions, please chime in!)

Comment by mikejohnson on Anti-tribalism and positive mental health as high-value cause areas · 2017-10-18T23:38:59.280Z · EA · GW

QRI is very interested and working hard on the mental health cause area; notable documents are Quantifying Bliss and Wireheading Done Right - more to come soon. There are also other good things in this space from e.g. Michael D. Plant, Spencer Greenberg, and perhaps Enthea.

On the tribalism point, I agree tribalism causes a lot of problems; I'd also agree with what I take you to be saying that in some senses it may be a load-bearing part of an Evolutionary Stable Strategy (ESS) which may prove troublesome to tinker with. Finally, I agree that mental health is a potentially upstream factor in some of the negative-sum presentations of tribalism.

I would say it's unclear at this point whether the EA movement has the plasticity required to make mental health an Official Cause Area -- I believe the leadership is interested but "revealed constraints" seem to tell a mixed story. I'm certainly hoping it'll happen if enough people get together to 'start the party'.

(Personal opinions; not necessarily shared by others at QRI)

Comment by mikejohnson on High Time For Drug Policy Reform. Part 2/4: Six Ways It Could Do Good And Anticipating The Objections · 2017-08-14T14:53:09.273Z · EA · GW

Hi Michael,

This is fantastic work, thanks for all the effort and thought that went into these posts. Your overall case seems solid to me-- or at minimum, I think yours is 'the argument to beat'.

One thought that I had while reading:

Drug policy reform may also allow us to better understand current pain medications and develop new treatments and uses. Your focus here is on decriminalizing existing drugs such as psilocybin, opioids, and MDMA, because you believe (with substantial evidence) that these drugs have nontrivial therapeutic potential, despite their sometimes substantial drawbacks. This seems reasonable, especially in the case of drugs with fairly benign risk profiles (e.g. psilocybin).

I do worry about some of the long-term side-effects associated with certain drugs, however, and it seems to me an interesting 'unknown unknown' here is if it's possible to develop new substances, or novel brain stimulation modalities, that allow us access to the upsides of such drugs, without suffering from the downsides.

E.g., in the case of MDMA, the not-uncommon long-term effects of chronic use include heightened anxiety & cognitive impairment, which seem very serious. But at the same time, there doesn't seem to be any 'law of the universe' mandating that the pleasant feelings of love & trust elicited by MDMA that are so therapeutically useful for PTSD must be unavoidably linked to brain damage.

I'm not completely sure how this observation interacts with your arguments, but I suspect it generally supports your case, since decriminalization could lower barriers for research into even better & safer options. Quite possibly, this could be one of the major reasons why decriminalization could lead to a better future.

On the other hand, the sword of innovation cuts both ways, as there seem to be a lot of very dangerous, toxic variants of drugs coming from overseas labs that are even less safe than current options (Fentanyl, Captagon, etc). Perhaps this is a case of "Banning dangerous substances as a precautionary principle can have perverse effects if it causes people to take more dangerous drugs instead," and decriminalization would help mitigate this phenomenon. But I must admit to some uncertainty & worry here as to second-order effects.

Anyway, I think this is worth pursuing further. OpenPhil might be interested? I think probably Nick Beckstead might be a good contact there.

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-08-02T22:30:59.316Z · EA · GW

You may also like Towards a computational theory of experience by Fekete and Edelman- here's their setup:

3.4. Counterfactually stable account of implementation

To claim a computational understanding of a system, it is necessary for us to be able to map its instantaneous states and variables to those of a model. Such a mapping is, however, far from sufficient to establish that the system is actually implementing the model: without additional constraints, a large enough conglomerate of objects and events can be mapped so as to realize any arbitrary computation (Chalmers, 1994; Putnam, 1988). A careful analysis of what it means for a physical system to implement an abstract computation (Chalmers, 1994; Maudlin, 1989) suggests that, in addition to specifying a mapping between the respective instantaneous states of the system and the computational model, one needs to spell out the rules that govern the causal transitions between corresponding instantaneous states in a counterfactually resistant manner.

In the case of modeling phenomenal experience, the stakes are actually much higher: one expects a model of qualia to be not merely good (in the sense of the goodness of fit between the model and its object), but true and unique. Given that a multitude of distinct but equally good computational models may exist, why is not the system realizing a multitude of different experiences at a given time? Dodging this question amounts to conceding that computation is not nomologically related to qualia.

Construing computation in terms of causal interactions between instantaneous states and variables of a system has ramifications that may seem problematic for modeling experience. If computations and their implementations are individuated in terms of causal networks, then any given, specific experience or quale is individuated (in part) by the system’s entire space of possible instantaneous states and their causal interrelationships. In other words, the experience that is unfolding now is defined in part by the entire spectrum of possible experiences available to the system.

In subsequent sections, we will show that this explanatory problem is not in fact insurmountable, by outlining a solution for it. Meanwhile, we stress that while computation can be explicated by numbering the instantaneous states of a system and listing rules of transition between these states, it can also be formulated equivalently in dynamical terms, by defining (local) variables and the dynamics that govern their changes over time. For example, in neural-like models computation can be explicated in terms of the instantaneous state of ‘‘representational units’’ and the differential equations that together with present input lead to the unfolding of each unit’s activity over time. Under this description, computational structure results entirely from local physical interactions.

It's a little difficult to parse precisely how they believe they solve the problem of multiple realization of computational interpretations of a system, but the key passage seems to be:

Third, because of multiple realizability of computation, one computational process or system can represent another, in that a correspondence can be drawn between certain organizational aspects of one process and those of the other. In the simplest representational scenario, correspondence holds between successive states of the two processes, as well as between their respective timings. In this case, the state-space trajectory of one system unfolds in lockstep with that of the other system, because the dynamics of the two systems are sufficiently close to one another; for example, formal neurons can be wired up into a network whose dynamics would emulate (Grush, 2004) that of the falling rock mentioned above. More interesting are cases in which the correspondence exists on a more abstract level, for instance between a certain similarity structure over some physical variables ‘‘out there’’ in the world (e.g., between objects that fall like a rock and those that drift down like a leaf) and a conceptual structure over certain instances of neural activity, as well as cases in which the system emulates aspects of its own dynamics. Further still, note that once representational mechanisms have been set in place, they can also be used ‘‘offline’’ (Grush, 2004). In all cases, the combinatorics of the world ensures that the correspondence relationship behind instances of representation is highly non-trivial, that is, unlikely to persist purely as a result of a chance configurational alignment between two randomly picked systems (Chalmers, 1994).

My attempt at paraphrasing this: if we can model the evolution of a physical system and the evolution of a computational system within the same phase space for some finite time t, then as t increases we can be increasingly confident that the physical system is instantiating this computational system. In the limit (t->∞), this may offer a method for uniquely identifying which computational system a physical system is instantiating.
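This lockstep criterion can be sketched with a toy pair of dynamical systems; everything below is invented for illustration, not a proposal from the paper:

```python
# Toy "lockstep" check: run a physical system and a candidate computational
# model side by side; the longer the mapped trajectories stay in
# correspondence, the more confident we can be that the one realizes the
# other.

def trajectory(step, state, t):
    """The t+1 states visited over t steps."""
    states = [state]
    for _ in range(t):
        state = step(state)
        states.append(state)
    return states

phys_step = lambda p: (p + 1) % 3   # toy "physical" system: a 3-state cycle
model_step = lambda c: (c + 1) % 3  # candidate model: also a 3-state cycle
mapping = {0: 0, 1: 1, 2: 2}        # candidate state correspondence

def lockstep_length(t):
    """Number of steps for which the mapped physical trajectory matches
    the model's trajectory."""
    matches = 0
    for p, c in zip(trajectory(phys_step, 0, t), trajectory(model_step, 0, t)):
        if mapping[p] != c:
            break
        matches += 1
    return matches

print(lockstep_length(10))  # 11: the trajectories agree at every step
```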

My intuition here is that the closer they get to solving the problem of how to 'objectively' determine what computations a physical system is realizing, the further their framework will stray from the Turing paradigm of computation and the closer it will get to a hypercomputation paradigm (which in turn may essentially turn out to be isomorphic to physics). But, I'm sure I'm biased, too. :) Might be worth a look.

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-08-02T18:55:02.167Z · EA · GW

The counterfactual response is typically viewed as inadequate in the face of triviality arguments. However, when we count the number of automata permitted under that response, we find it succeeds in limiting token physical systems to realizing at most a vanishingly small fraction of the computational systems they could realize if their causal structure could be ‘repurposed’ as needed. Therefore, the counterfactual response is a prima facie promising reply to triviality arguments. Someone might object this result nonetheless does not effectively handle the metaphysical issues raised by those arguments. Specifically, an ‘absolutist’ regarding the goals of an account of computational realization might hold that any satisfactory response to triviality arguments must reduce the number of possibly-realized computational systems to one, or to some number close to one. While the counterfactual response may eliminate the vast majority of computational systems from consideration, in comparison to any small constant, the number of remaining possibly-realized computational systems is still too high (2^n).

That seems like a useful approach- in particular,

On the other hand, the argument suggests at least some computational hypotheses regarding cognition are empirically substantive: by identifying types of computation characteristic of cognition (e.g., systematicity, perhaps), we limit potential cognitive devices to those whose causal structure includes these types of computation in the sets of possibilities they support.

This does seem to support the idea that progress can be made on this problem! On the other hand, the author's starting assumption is that we can treat a physical system as a computational (digital) automaton, which seems like a pretty big assumption.

I think this assumption may or may not turn out to be ultimately true (Wolfram et al), but given current theory it seems difficult to reduce actual physical systems to computational automata in practice. In particular, it seems difficult to apply this framework to (1) quantum systems (which all physical systems ultimately are), and (2) biological systems which have messy levels of abstraction such as the brain (which we'd want to be able to do for the purposes of functionalism).

From a physics perspective, I wonder if we could figure out a way to feed in a bounded wavefunction and identify some minimal upper bound on the reasonable computational interpretations of the system. My instinct is that David Deutsch might be doing relevant work? But I'm not at all sure of this.
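To make the quoted counting point concrete, here's a toy enumeration: of all mappings from a small physical system onto a small automaton, only the ones that survive the counterfactual (dynamics-respecting) constraint remain. All names and dynamics are invented for illustration:

```python
# Enumerate every mapping from a 4-state physical system onto a 2-state
# automaton and count how many survive the counterfactual constraint
# (the mapping must commute with both systems' dynamics at every state).
from itertools import product

phys_states = range(4)
phys_step = lambda p: (p + 1) % 4   # toy physical dynamics: a 4-cycle
comp_step = lambda c: 1 - c         # toy 2-state flip automaton

valid, total = 0, 0
for assignment in product([0, 1], repeat=4):  # all 2^4 = 16 candidate mappings
    f = dict(zip(phys_states, assignment))
    total += 1
    if all(f[phys_step(p)] == comp_step(f[p]) for p in phys_states):
        valid += 1

print(valid, total)  # 2 16: only 2 of 16 mappings respect the dynamics
```

Even in this tiny case the counterfactual constraint prunes the candidate interpretations from 16 to 2, mirroring the quoted point that the counterfactual response leaves only a small fraction of the interpretations that unconstrained mappings would allow.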

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-08-01T21:07:05.476Z · EA · GW

That's no reason to believe that analytic functionalism is wrong, only that it is not sufficient by itself to answer very many interesting questions.

I think that's being generous to analytic functionalism. As I suggested in Objection 2,

In short, FRI’s theory of consciousness isn’t actually a theory of consciousness at all, since it doesn’t do the thing we need a theory of consciousness to do: adjudicate disagreements in a principled way. Instead, it gives up any claim on the sorts of objective facts which could in principle adjudicate disagreements.


I only claim that most physical states/processes have only a very limited collection of computational states/processes that it can reasonably be interpreted as[.]

I'd like to hear more about this claim; I don't think it's ridiculous on its face (per Brian's and Michael_PJ's comments), but it seems a lot of people have banged their heads against this without progress, and my prior is that formalizing this is a lot harder than it looks (it may be unformalizable). If you could formalize it, that would have a lot of value for a lot of fields.

So although I used that critique of IIT as an example, I was mainly going off of intuitions I had prior to it. I can see why this kind of very general criticism from someone who hasn't read the details could be frustrating, but I don't expect I'll look into it enough to say anything much more specific.

I don't expect you to either. If you're open to a suggestion about how to approach this in the future, though, I'd offer that if you don't feel like reading something but still want to criticize it, instead of venting your intuitions (which could be valuable, but don't seem calibrated to the actual approach I'm taking), you should press for concrete predictions.

The following phrases seem highly anti-scientific to me:

sounds wildly implausible | These sorts of theories never end up getting empirical support, although their proponents often claim to have empirical support | I won't be at all surprised if you claim to have found substantial empirical support for your theory, and I still won't take your theory at all seriously if you do, because any evidence you cite will inevitably be highly dubious | The heuristic that claims that a qualia-related concept is some simple other thing are wrong, and that claims of empirical support for such claims never hold up | I am almost certain that there are trivial counterexamples to the Symmetry Theory of Valence

I.e., these statements seem to lack epistemological rigor, and seem to absolutely prevent you from updating in response to any evidence I might offer, even in principle (i.e., they're actively hostile to your improving your beliefs, regardless of whether I am or am not correct).

I don't think your intention is to be closed-minded on this topic, and I'm not saying I'm certain STV is correct. Instead, I'm saying you seem to be overreacting to some stereotype you initially pattern-matched me as, and I'd suggest talking about predictions is probably a much healthier way to move forward if you want to spend more time on this. (Thanks!)

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-07-31T18:34:36.533Z · EA · GW

Speaking of the metaphysical correctness of claims about qualia sounds confused, and I think precise definitions of qualia-related terms should be judged by how useful they are for generalizing our preferences about central cases.

I agree a good theory of qualia should help generalize our preferences about central cases. I disagree that we can get there with the assumption that qualia are intrinsically vague/ineffable. My critique of analytic functionalism is that it is essentially nothing but an assertion of this vagueness.

Regarding the objection that shaking a bag of popcorn can be interpreted as carrying out an arbitrary computation, I'm not convinced that this is actually true, and I suspect it isn't.

Without a bijective mapping between physical states/processes and computational states/processes, I think my point holds. I understand it's counterintuitive, but we should expect that when working in these contexts.

I think the edge cases that you quote Scott Aaronson bringing up are good ones to think about, and I do have a large amount of moral uncertainty about them. But I don't see these as problems specific to analytic functionalism. These are hard problems, and the fact that some more precise theory about qualia may be able to easily answer them is not a point in favor of that theory, since wrong answers are not helpful.

Correct; they're the sorts of things a theory of qualia should be able to address- necessary, not sufficient.

Re: your comments on the Symmetry Theory of Valence, I feel I have the advantage here since you haven't read the work. Specifically, it feels as though you're pattern-matching me to IIT and channeling Scott Aaronson's critique of Tononi, which is a bit ironic since that forms a significant part of PQ's argument why an IIT-type approach can't work.

At any rate I'd be happy to address specific criticism of my work. This is obviously a complicated topic and informed external criticism is always helpful. At the same time, I think it's a bit tangential to my critique about FRI's approach: as I noted,

I mention all this because I think analytic functionalism- which is to say radical skepticism/eliminativism, the metaphysics of last resort- only looks as good as it does because nobody’s been building out any alternatives.

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-07-31T17:32:40.690Z · EA · GW

I think that's fair-- beneficial equilibriums could depend on reifying things like this.

On the other hand, I'd suggest that with regard to identifying entities that can suffer, false positives are much less harmful than false negatives but they still often incur a cost. E.g., I don't think corporations can suffer, so in many cases it'll be suboptimal to grant them the sorts of protections we grant humans, apes, dogs, and so on. Arguably, a substantial amount of modern ethical and perhaps even political dysfunction is due to not kicking leaky reifications out of our circle of caring. (This last bit is intended to be provocative and I'm not sure how strongly I'd stand behind it...)

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-07-27T21:37:33.495Z · EA · GW

An additional note on this:

What something better would look like - if I knew that, I'd be busy writing a paper about it. :-) That seems to be a part of the problem - everyone (that I know of) agrees that functionalism is deeply unsatisfactory, but very few people seem to have any clue of what a better theory might look like.

I'd propose that if we split the problem of building a theory of consciousness into subproblems, the task gets a lot easier. This does depend on an elegant problem decomposition. Here's a quick-and-messy version of my framework:

  • (1) figure out what sort of ontology you think can map to both phenomenology (what we're trying to explain) and physics (the world we live in);

  • (2) figure out what subset of that ontology actively contributes to phenomenology;

  • (3) figure out how to determine the boundary of where minds stop, in terms of that-stuff-that-contributes-to-phenomenology;

  • (4) figure out how to turn the information inside that boundary into a mathematical object isomorphic to phenomenology (and what the state space of the object is);

  • (5) figure out how to interpret how properties of this mathematical object map to properties of phenomenology.

The QRI approach is:

  • (1) Choice of core ontology -> physics (since it, or some future version such as string theory, maps cleanly to physical reality);

  • (2) Choice of subset of core ontology that actively contributes to phenomenology -> Andres suspects quantum coherence; I'm more agnostic (I think Barrett 2014 makes some good points);

  • (3) Identification of boundary condition -> highly dependent on (2);

  • (4) Translation of information in partition into a structured mathematical object isomorphic to phenomenology -> I like how IIT does this;

  • (5) Interpretation of what the mathematical output means -> Probably, following IIT, the dimensional magnitude of the object could correspond with the degree of consciousness of the system. More interestingly, I think the symmetry of this object may plausibly have an identity relationship with the valence of the experience.

Anyway, certain steps in this may be wrong, but that's what the basic QRI "full stack" approach looks like. I think we should be able to iterate as we go, since we can test parts of (5) (like the Symmetry Hypothesis of Valence) without necessarily having the whole 'stack' figured out.

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-07-26T18:33:54.752Z · EA · GW

I take this as meaning that you agree that accepting functionalism is orthogonal to the question of whether suffering is "real" or not?

Ah, the opposite actually- my expectation is that if 'consciousness' isn't real, 'suffering' can't be real either.

What something better would look like - if I knew that, I'd be busy writing a paper about it. :-) That seems to be a part of the problem - everyone (that I know of) agrees that functionalism is deeply unsatisfactory, but very few people seem to have any clue of what a better theory might look like. Off the top of my head, I'd like such a theory to at least be able to offer some insight into what exactly is conscious, and not have the issue where you can hypothesize all kinds of weird computations (like Aaronson did in your quote) and be left confused about which of them are conscious and which are not, and why. (roughly, my desiderata are similar to Luke Muehlhauser's)

Thanks, this is helpful. :)

The following is tangential, but I thought you'd enjoy this Yuri Harari quote on abstraction and suffering:

In terms of power, it’s obvious that this ability [to create abstractions] made Homo sapiens the most powerful animal in the world, and now gives us control of the entire planet. From an ethical perspective, whether it was good or bad, that’s a far more complicated question. The key issue is that because our power depends on collective fictions, we are not good in distinguishing between fiction and reality. Humans find it very difficult to know what is real and what is just a fictional story in their own minds, and this causes a lot of disasters, wars and problems.

The best test to know whether an entity is real or fictional is the test of suffering. A nation cannot suffer, it cannot feel pain, it cannot feel fear, it has no consciousness. Even if it loses a war, the soldier suffers, the civilians suffer, but the nation cannot suffer. Similarly, a corporation cannot suffer, the pound sterling, when it loses its value, it doesn’t suffer. All these things, they’re fictions. If people bear in mind this distinction, it could improve the way we treat one another and the other animals. It’s not such a good idea to cause suffering to real entities in the service of fictional stories.

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-07-25T17:36:44.107Z · EA · GW

Functionalism seems internally consistent (although perhaps too radically skeptical). However, in my view it also seems to lead to some flavor of moral nihilism; consciousness anti-realism makes suffering realism difficult/complicated.

If you had a precise enough theory about the functional role and source of suffering, then this would be a functionalist theory that specified objective criteria for the presence of suffering.

I think whether suffering is a 'natural kind' is prior to this analysis: e.g., to precisely/objectively explain the functional role and source of something, it needs to have a precise/crisp/objective existence.

I've always assumed that anyone who has thought seriously about philosophy of mind has acknowledged that functionalism has major deficiencies and is at best our "least wrong" placeholder theory until somebody comes up with something better.)

Part of my reason for writing this critique is to argue that functionalism isn't a useful theory of mind, because it doesn't do what we need theories of mind to do (adjudicate disagreements in a principled way, especially in novel contexts).

If it is a placeholder, then I think the question becomes, "what would 'something better' look like, and what would count as evidence that something is better?" I'd love to get your (and FRI's) input here.

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-07-23T20:57:34.226Z · EA · GW

My sense is that MIRI and FHI are fairly strong believers in functionalism, based on reading various pieces on LessWrong, personal conversation with people who work there, and 'revealed preference' research directions. OpenPhil may be more of a stretch to categorize in this way; I'm going off what I recall of Holden's debate on AI risk, some limited personal interactions with people who work there, and Luke Muehlhauser's report (he was up-front about his assumptions on this).

Of course it's harder to pin down what people at these organizations believe than it is in Brian's case, since Brian writes a great deal about his views.

So to my knowledge, this statement is essentially correct, although there may be definitional & epistemological quibbles.

Comment by mikejohnson on Why I think the Foundational Research Institute should rethink its approach · 2017-07-22T14:42:39.569Z · EA · GW

I really enjoyed your linked piece on meta-ethics. Short but insightful. I believe I'd fall into the second bucket.

If you're looking for what (2) might look like in practice, and how we might try to relate it to the human brain's architecture/drives, you might enjoy this:

I'd also agree that designing trustworthy reflection procedures is important. My intuitions here are: (1) value-drift is a big potential problem with FRI's work (even if they "lock in" caring about suffering, if their definition of 'suffering' drifts, their tacit values do too); (2) value-drift will be a problem for any system of ethics that doesn't cleanly 'compile to physics'. (This is a big claim, centering around my Objection 6, above.)

Perhaps we could generalize this latter point as "if information is physical, and value is informational, then value is physical too."