Posts

Working at EA Organizations Series: Animal Charity Evaluators 2016-03-03T22:54:14.614Z · score: 4 (4 votes)
Working at EA Organizations Series: Giving What We Can 2016-03-02T23:22:25.652Z · score: 6 (8 votes)
The EA Newsletter & Open Thread - January 2016 2016-01-07T23:19:19.163Z · score: 3 (3 votes)
Working At EA Organizations series: Charity Science 2015-11-29T19:53:48.154Z · score: 8 (8 votes)
The Effective Altruism Newsletter & Open Thread - 23 November 2015 Edition 2015-11-26T10:01:05.809Z · score: 3 (3 votes)
Working at EA organizations series: .impact 2015-11-19T10:35:28.141Z · score: 5 (5 votes)
Working at EA organizations series: Machine Intelligence Research Institute 2015-11-01T12:49:16.910Z · score: 8 (8 votes)
The Effective Altruism Newsletter & Open Thread - 26 October 2015 Edition 2015-10-27T11:58:47.670Z · score: 2 (2 votes)
Working at EA organizations series: Effective Altruism Foundation 2015-10-26T16:34:31.368Z · score: 6 (8 votes)
Working at EA organizations series: 80000 Hours 2015-10-21T19:06:21.546Z · score: 4 (4 votes)
Working at EA organizations series: Why work at an EA organization? 2015-10-18T09:42:59.474Z · score: 6 (6 votes)
The Effective Altruism Newsletter & Open Thread - 12 October 2015 Edition 2015-10-14T14:21:01.902Z · score: 5 (5 votes)
How to get more EAs to connect in person and share expertise? 2015-08-18T20:18:16.268Z · score: 8 (8 votes)
Meetup : 'The most good you can do' 2015-05-14T16:49:56.581Z · score: 0 (0 votes)
Meetup : 'The most good you can do' 2015-05-14T16:48:40.519Z · score: 0 (0 votes)

Comments

Comment by soerenmind on More info on EA Global admissions · 2019-12-29T03:25:58.722Z · score: 13 (10 votes) · EA · GW

Thanks for writing this. I don't have a solution but I'm just registering that I would expect plenty of rejected applicants to feel alienated from the EA community despite this post.

Comment by soerenmind on Are we living at the most influential time in history? · 2019-12-13T20:33:27.214Z · score: 7 (4 votes) · EA · GW

It's just an informal way to say that we're probably typical observers. It's named after Copernicus because he found that the Earth isn't as special as people thought.

Comment by soerenmind on Technical AGI safety research outside AI · 2019-10-19T13:17:35.518Z · score: 1 (1 votes) · EA · GW

Very nice list!

Comment by soerenmind on Introducing Foretold.io: A New Open-Source Prediction Registry · 2019-10-17T10:20:07.810Z · score: 1 (1 votes) · EA · GW

Great work!!!

Comment by soerenmind on The evolutionary argument against cognitive enhancement research is weak · 2019-10-17T10:19:05.419Z · score: 1 (1 votes) · EA · GW

Hmmm isn't the argument still pretty broadly applicable and useful despite the exceptions?

Comment by soerenmind on Is there a good place to find the "what we know so far" of the EA movement? · 2019-09-29T15:19:26.146Z · score: 8 (7 votes) · EA · GW

If you want a single source, I find the 80000 hours key ideas page and everything it links to quite comprehensive and well written.

Comment by soerenmind on Effective Altruism and Everyday Decisions · 2019-09-25T23:43:16.006Z · score: 9 (5 votes) · EA · GW

Like most commenters, I broadly agree with the empirical info here. It's sort of obvious, but telling others things like "don't go out of your way to use less plastic" or even just creating unnecessary waste in a social situation can be inconsiderate towards people's sensibilities. Of course, this post advocates no such thing but I want to be sure nobody goes away thinking these things are necessarily OK.

(I was recently reminded of a CEA research article about how considerateness is even more important than most people think, and EAs should be especially careful because their behavior reflects on the whole community.)

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-23T14:53:42.445Z · score: 1 (1 votes) · EA · GW

On second thoughts, I think it's worth clarifying that my claim is still true even though yours is important in its own right. On Gott's reasoning, P(high influence | world has 2^N times the # of people who've already lived) is still just 2^-N (and summing 2^-k over all k >= N gives 2^-(N-1)). As you said, these tiny probabilities are balanced out by asymptotically infinite impact.
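To spell out the arithmetic behind the parenthetical (my addition, just the geometric sum):

```latex
\sum_{k \ge N} 2^{-k} \;=\; 2^{-N}\sum_{j \ge 0} 2^{-j} \;=\; 2^{-N}\cdot 2 \;=\; 2^{-(N-1)}.
```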

I'll write up a separate objection to that claim but first a clarifying question: Why do you call Gott's conditional probability a prior? Isn't it more of a likelihood? In my model it should be combined with a prior P(number of people the world has). The resulting posterior is then the prior for further enquiries.

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-19T14:53:58.303Z · score: 7 (3 votes) · EA · GW

Interesting point!

The diverging series seems to be a version of the St Petersburg paradox, which has fooled me before. In the original version, you have a 2^-k chance of winning 2^k for every positive integer k, which leads to infinite expected payoff. One way in which it's brittle is that, as you say, the payoff is quite limited if we have some upper bound on the size of the population. Two other mathematical ways are 1) if the payoff is just 1.99^k or 2) if it is 2^(0.99k).
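To make the brittleness concrete, here's the arithmetic (my addition):

```latex
\sum_{k \ge 1} 2^{-k}\,2^{k} = \sum_{k \ge 1} 1 = \infty,
\qquad
\sum_{k \ge 1} 2^{-k}\,1.99^{k} = \sum_{k \ge 1} 0.995^{k} = \frac{0.995}{0.005} = 199,
\qquad
\sum_{k \ge 1} 2^{-k}\,2^{0.99k} = \sum_{k \ge 1} 2^{-0.01k} = \frac{2^{-0.01}}{1-2^{-0.01}} \approx 144.
```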



Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-18T13:32:48.657Z · score: 1 (1 votes) · EA · GW

If you're just presenting a prior I agree that you've not conditioned on an observation "we're very early". But to the extent that your reasoning says there's a non-trivial probability of [we have extremely high influence over a big future], you do condition on some observation of that kind. In fact, it would seem weird if any Copernican prior could give non-trivial mass to that proposition without an additional observation.

I continue my response here because the rest is more suitable as a higher-level comment.

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-18T13:26:59.739Z · score: 6 (3 votes) · EA · GW

On your prior, P(high influence) isn't tiny. But if I understand correctly, that's just because P(high influence | short future) isn't tiny, whereas P(high influence | long future) is still tiny. (I haven't checked the math, correct me if I'm wrong.)

So your argument doesn't seem to save existential risk work. The only way to get a non-trivial P(high influence | long future) with your prior seems to be by conditioning on an additional observation "we're extremely early". As I argued here, that's somewhat sketchy to do.

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-12T19:40:51.455Z · score: 13 (4 votes) · EA · GW

So your prior says, unlike Will’s, that there are non-trivial probabilities of very early lock-in. That seems plausible and important. But it seems to me that your analysis not only uses a different prior but also conditions on “we live extremely early” which I think is problematic.

Will argues that it’s very weird we seem to be at an extremely hingy time. So we should discount that possibility. You say that we’re living at an extremely early time and it’s not weird for early times to be hingy. I imagine Will’s response would be “it’s very weird we seem to be living at an extremely early time then” (and it’s doubly weird if it implies we live in an extremely hingy time).

If living at an early time implies something that is extremely unlikely a priori for a random person from the timeline, then there should be an explanation. These 3 explanations seem exhaustive:

1) We’re extremely lucky.

2) We aren’t actually early: E.g. we’re in a simulation or the future is short. (The latter doesn’t necessarily imply that xrisk work doesn’t have much impact because the future might just be short in terms of people in our anthropic reference class).

3) Early people don’t actually have outsized influence: E.g. the hazard/hinge rate in your model is low (perhaps 1/N where N is the length of the future). In a Bayesian graphical model, there should be a strong update in favor of low hinge rates after observing that we live very early (unless another explanation is likely a priori).

Both 2) and 3) seem somewhat plausible a priori so it seems we don’t need to assume that a big coincidence explains how early we live.

Comment by soerenmind on Existential Risk and Economic Growth · 2019-09-04T16:06:29.057Z · score: 4 (3 votes) · EA · GW

This sounds really cool. Will have to read properly later. How would you recommend a time-pressured reader go through this? Are you planning a summary?

Comment by soerenmind on Our forthcoming AI Safety book · 2019-09-04T15:58:02.175Z · score: 8 (6 votes) · EA · GW

Just registering that I'm not convinced this justifies the title.

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-04T15:52:20.388Z · score: 1 (1 votes) · EA · GW

Yep, see reply to Lukas.

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-04T15:48:56.292Z · score: 1 (1 votes) · EA · GW

Agreed, I was assuming that the prior for the simulation hypothesis isn't very low because people seem to put credence in it even before Will's argument.

But I found it worth noting that Will's inequality only follows from mine (the likelihood ratio) plus having a reasonably even prior odds ratio.
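In odds form (just Bayes' theorem restated, my notation):

```latex
\underbrace{\frac{P(\text{sim}\mid \text{seems like HoH})}{P(\neg\text{sim}\mid \text{seems like HoH})}}_{\text{posterior odds}}
=
\underbrace{\frac{P(\text{sim})}{P(\neg\text{sim})}}_{\text{prior odds}}
\times
\underbrace{\frac{P(\text{seems like HoH}\mid \text{sim})}{P(\text{seems like HoH}\mid \neg\text{sim})}}_{\text{likelihood ratio}}
```

So a large likelihood ratio only translates into large posterior odds if the prior odds aren't correspondingly tiny.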

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-03T15:25:31.306Z · score: 2 (2 votes) · EA · GW

2.

For me, the HoH update is big enough to make the simulation hypothesis a pretty likely explanation. It also makes it less likely that there are alternative explanations for "HoH seems likely". See my old post here (probably better to read this comment though).

Imagine a Bayesian model with a variable S="HoH seems likely" (to us) and 3 variables pointing towards it: "HoH" (prior: 0.001), "simulation" (prior=0.1), and "other wrong but convincing arguments" (prior=0.01). Note that it seems pretty unlikely there will be convincing but wrong arguments a priori (I used 0.01) because we haven't updated on the outside view yet.

Further, assume that all three causes, if true, are equally likely to cause "HoH seems likely" (say with probability 1, but the probability doesn't affect the posterior).

Apply Bayes rule: We've observed "HoH seems likely". The denominator in Bayes rule is P(HoH seems likely) ≈ 0.111 (roughly the sum of the three priors because the priors are small). The numerator for each hypothesis H equals 1 * P(H).

Bayes rule gives an equal update (ca. 1/0.111 ≈ 9x) in favor of every hypothesis, bringing up the probability of "simulation" to nearly 90%.

Note that this probability decreases if we find, or think there are, better explanations for "HoH seems likely". This is plausible but not overwhelmingly likely because we already have a decent explanation with prior 0.1. If we didn't have one, we would still have a lot of pressure to explain "HoH seems likely". The existence of the plausible explanation "simulation" with prior 0.1 "explains away" the need for other explanations such as those falling under "wrong but convincing argument".

This is just an example, feel free to plug in your numbers, or critique the model.
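A minimal sketch of that calculation in code (the priors are the illustrative numbers above; the equal likelihoods and the 'roughly mutually exclusive' approximation are assumptions from the comment):

```python
# Toy Bayesian update for the three candidate explanations of "HoH seems likely".
# Priors are the illustrative numbers from the comment; each hypothesis is assumed
# to produce the observation with probability 1, and the hypotheses are treated as
# (approximately) mutually exclusive, as in the comment.

priors = {
    "HoH": 0.001,
    "simulation": 0.1,
    "wrong but convincing arguments": 0.01,
}
likelihood = 1.0  # P("HoH seems likely" | hypothesis), assumed equal for all three

# P("HoH seems likely") is roughly the sum of the priors when they are small.
p_observation = sum(likelihood * p for p in priors.values())  # ~0.111

posteriors = {h: likelihood * p / p_observation for h, p in priors.items()}

for h, p in posteriors.items():
    print(f"{h}: prior {priors[h]:.3f} -> posterior {p:.3f} (update x{p / priors[h]:.1f})")
# Every hypothesis gets the same ~9x update, bringing "simulation" to roughly 0.90.
```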

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-03T15:02:21.706Z · score: 1 (4 votes) · EA · GW

Both seem true and relevant. You could in fact write P(seems like HoH | simulation) >> P(seems like HoH | not simulation), which leads to the other two via Bayes theorem.

Comment by soerenmind on Are we living at the most influential time in history? · 2019-09-03T14:49:36.840Z · score: 14 (8 votes) · EA · GW

Important post!

I like your simulation update against HoH. I was meaning to write a post about this. Brian Tomasik has a great paper that quantitatively models the ratio of our influence on the short vs the long term. Though you've linked it, I think it's worth highlighting more.

How the Simulation Argument Dampens Future Fanaticism

The paper cleverly argues that the simulation argument combined with anthropics either strongly dampens the expected impact of far-future altruism or strongly increases the impact of short-term altruism. That conclusion seems fairly robust to the choice of decision and anthropic theory and to uncertainty over some empirical parameters. He doesn't directly discuss how the "seems like HoH" observation affects his conclusions, but I think it makes them stronger. (I recommend Brian's simplified calculations here.)

I assume this paper didn't get as much discussion as it deserves because Brian posted it in the dark days of LW.

Comment by soerenmind on Logarithmic Scales of Pleasure and Pain: Rating, Ranking, and Comparing Peak Experiences Suggest the Existence of Long Tails for Bliss and Suffering · 2019-08-17T00:27:14.247Z · score: 5 (4 votes) · EA · GW

That's fair, I made a mathematical error there. The cluster headache math convinces me that a large chunk of total suffering there goes to few people due to lopsided frequencies. Do you have other examples? I particularly felt that the relative frequency of extreme compared to less extreme pain wasn't well supported.

Comment by soerenmind on Logarithmic Scales of Pleasure and Pain: Rating, Ranking, and Comparing Peak Experiences Suggest the Existence of Long Tails for Bliss and Suffering · 2019-08-16T14:20:48.004Z · score: 1 (1 votes) · EA · GW

Your 4 cluster headache groups contribute about equally to the total number of cluster headaches if you multiply group size by # of CH's. (The top 2% actually contribute a bit less). That's my entire point. I'm not sure if you disagree?

Comment by soerenmind on Logarithmic Scales of Pleasure and Pain: Rating, Ranking, and Comparing Peak Experiences Suggest the Existence of Long Tails for Bliss and Suffering · 2019-08-15T21:19:41.396Z · score: 3 (3 votes) · EA · GW

To the second half of your comment, I agree that extreme suffering can be very extreme and I think this is an important contribution. Maybe we have a misunderstanding about what 'the bulk' of suffering refers to. To me it means something like 75-99% and to you it means something like 45% as stated above? I should also clarify that by frequency I mean the product of 'how many people have it', 'how often' and 'for how long'.

"the people in the top 10% of sufferers will have 10X the amount, and people in the 99% [I assume you mean top 1%?] will have 100X the amount"

I'm confused, you seem to be suggesting that every level of pain accounts for the _same_ amount of total suffering here.

To elaborate, you seem to be saying that at any level of pain, 10x worse pain is also 10x less frequent. That's a power law with exponent 1. I.e. the levels of pain have an extreme distribution, but the frequencies do too (mild pains are extremely common). I'm not saying you're wrong - just that what I've seen also seems consistent with extreme pain being less than 10% of the total. I'm excited to see more data :)

Comment by soerenmind on Logarithmic Scales of Pleasure and Pain: Rating, Ranking, and Comparing Peak Experiences Suggest the Existence of Long Tails for Bliss and Suffering · 2019-08-15T10:36:16.568Z · score: 3 (3 votes) · EA · GW

Aside from my concern about extreme pain being rarer than ordinary pain, I also would find the conclusion that

"...the bulk of suffering is concentrated in a small percentage of experiences..."

very surprising. Standard computational-neuroscience decision-making views such as RL models would say that if this is true, animals would have to spend most of their everyday effort trying to avoid extreme pain. But that seems wrong. E.g. we seek food to relieve mild hunger and get a nice taste, not because we once had an extreme hunger experience that we learned from.

You could argue that the learning from extreme pain doesn't track the subjective intensity of the pain. But then people would be choosing e.g. a subjectively 10x worse pain over a <10x longer pain. In this case I'd probably say that the subjective impression is misguided or ethically irrelevant, though that's an ethical judgment.

Comment by soerenmind on Logarithmic Scales of Pleasure and Pain: Rating, Ranking, and Comparing Peak Experiences Suggest the Existence of Long Tails for Bliss and Suffering · 2019-08-15T10:21:26.599Z · score: 2 (2 votes) · EA · GW

Thanks. I was actually asking about a different frequency distribution. You're talking about the frequency of extreme pain among people with extreme pain, which has no bearing on the quote above. I'm talking about the frequency of extreme pain experiences among all pain experiences (i.e. is extreme pain much less prevalent). Hence the example about mild discomfort.

Comment by soerenmind on Logarithmic Scales of Pleasure and Pain: Rating, Ranking, and Comparing Peak Experiences Suggest the Existence of Long Tails for Bliss and Suffering · 2019-08-15T00:05:41.054Z · score: 12 (6 votes) · EA · GW

Great analysis!

"...the bulk of suffering is concentrated in a small percentage of experiences..."

This seems like your core implication. But it requires an argument about both the intensity distribution and the frequency distribution. There are only arguments about the first one, if I haven't missed anything? To illustrate, I have mild discomfort for about 8000s/day on average but extreme pain for perhaps 0.02s/day, if I get 1h of extreme pain in my life (and many people don't get any at all).

Comment by soerenmind on A philosophical introduction to effective altruism · 2019-08-04T12:36:56.632Z · score: 1 (1 votes) · EA · GW

Echoing your second point, I had the same reaction.

Comment by soerenmind on Is EA Growing? EA Growth Metrics for 2018 · 2019-07-28T11:10:57.936Z · score: 1 (1 votes) · EA · GW

Great work! I wonder if there are any ways to track quality-adjusted engagement since that's what we've mostly been optimizing for over the last few years. E.g. if low-quality page views/joins/listeners are going down, it seems hard to compensate with an equal number of high-quality ones because they're harder to create. 80k's impact-adjusted plan changes metric is the only suitable metric I can think of.

Comment by soerenmind on My recommendations for RSI treatment · 2019-07-12T22:34:19.401Z · score: 1 (1 votes) · EA · GW

PMed you the paywalled review. FYI, there seems to be some agreement that evidence transfers between different tendons, e.g. some studies are about Achilles tendons. The specific review on golfer's elbow (seen by my doc as nearly equivalent to RSI of the hand-facing tendons) is also in my message. If you want to talk to an expert about the evidence, you can probably ask to Skype with him for a fee.

Comment by soerenmind on My recommendations for RSI treatment · 2019-07-12T22:28:20.483Z · score: 1 (1 votes) · EA · GW

PMed, and yes. The exercise the doc gave me was to hold it with both hands facing down and then alternately bend it into an inverted/normal U-shape. This hits both flexors and extensors, and it combines eccentric and concentric training.

Comment by soerenmind on Age-Weighted Voting · 2019-07-12T21:26:05.907Z · score: 6 (2 votes) · EA · GW

Many policies are later revoked and aren't about trading off present vs future resources (e.g. income redistribution). So those who are still alive when a policy's effects stop will have gotten more than their fair share of voting power under this proposal, if I understand correctly. E.g. if I'm 80 when a policy against redistribution comes into effect, and it's revoked when I die at 84, my 1x vote weighting seems unfair because everyone else was also only affected for 4 years.

Comment by soerenmind on A philosophical introduction to effective altruism · 2019-07-12T20:53:30.990Z · score: 1 (1 votes) · EA · GW

Retracted because I'm no longer sure if "then" instead of "the" was intended. I still emphasize that it's a very nice read!

Comment by soerenmind on A philosophical introduction to effective altruism · 2019-07-12T12:36:56.554Z · score: 1 (1 votes) · EA · GW

Very nicely written! Typo: "and the taking action on that basis"

Comment by soerenmind on My recommendations for RSI treatment · 2019-06-18T13:09:05.842Z · score: 13 (5 votes) · EA · GW

This post seems to be missing the therapy with the best evidence base: heavy-load eccentric training. See e.g. https://www.uptodate.com/contents/overview-of-the-management-of-overuse-persistent-tendinopathy (paywalled). The combination with concentric training is almost as well supported and easier to do. The only tool needed is a flexbar. 3x15 reps twice daily for 3 months should bring results.

The website painscience.com is a great SSC-like read but I've found it lacking from time to time, for instance by omitting eccentric training.

I can also recommend a professor in Germany who specializes in tendon problems and charges ca. 150 EUR for a 30-60 minute session plus email support. I could even imagine him doing a Skype session with some convincing, but he'll want to get an ultrasound and strength test. He was recommended to me by a paid service in Germany (BetterDoc) that asks a council of medical experts for the leading expert on a disease. The professor's website: http://www.sportpraxis-knobloch.de/

Comment by soerenmind on Recap - why do some organisations say their recent hires are worth so much? (Link) · 2019-05-17T13:25:27.084Z · score: 5 (4 votes) · EA · GW

Possible ambiguity in the survey question: If the person stops working "for you or anyone for 3 years" that plausibly negates most of their life's impact, unless they find a great way to build career capital without working for anyone. So with this interpretation, the answers would be something close to the NPV of their life's impact divided by 3 (ignoring discounting).

Also, did you control for willingness to accept vs pay?

Sorry if this is addressed somewhere; I skimmed the post.

Comment by soerenmind on Alignment Newsletter One Year Retrospective · 2019-04-14T16:12:46.629Z · score: 2 (2 votes) · EA · GW

To cover more content that's not new but important, you could use a new source on one topic to summarize the state of that topic. I like that papers do this in the introduction and literature review and I think more posts and the like should do it.

Comment by soerenmind on Alignment Newsletter One Year Retrospective · 2019-04-14T16:06:09.165Z · score: 2 (2 votes) · EA · GW

Google scholar also lists recommended new papers on its homepage.

Comment by soerenmind on The case for delaying solar geoengineering research · 2019-04-03T13:17:17.766Z · score: 1 (1 votes) · EA · GW

Why not just pay Russia an (arguably fair) reparation?

Comment by soerenmind on EA Angel Group: Applications Open for Personal/Project Funding · 2019-03-20T21:50:43.484Z · score: 11 (5 votes) · EA · GW

What sort of decision timeline can applicants expect? The existing opportunities are often slow compared to e.g. VC funding which is bad for planning.

Comment by soerenmind on Request for input on multiverse-wide superrationality (MSR) · 2018-09-02T23:27:49.574Z · score: 2 (2 votes) · EA · GW

Re 4): Correlation or similarity between agents is not really a necessary condition for cooperation in the open-source PD. LaVictoire et al. (2014) and related papers showed that 'fair' agents with completely different implementations can cooperate. A fair agent, roughly speaking, has to conform to any structure that implements "I'll cooperate with you if I can show that you'll cooperate with me". So maybe that's the measure you're looking for.

A population of fair agents is also typically a Nash equilibrium in such games so you might expect that they sometimes do evolve.

Source: LaVictoire, P., Fallenstein, B., Yudkowsky, E., Barasz, M., Christiano, P., & Herreshoff, M. (2014, July). Program equilibrium in the prisoner’s dilemma via Löb’s theorem. In AAAI Multiagent Interaction without Prior Coordination workshop.
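For intuition, here's a toy, bounded-simulation caricature of such 'fair' agents. This is not the Löb-based proof-search construction from the paper - proof search is replaced by depth-limited simulation - and it's only meant to illustrate that differently implemented agents can recognise each other as cooperators:

```python
# Toy "fair" agents in a one-shot open-source prisoner's dilemma. An agent is a
# function agent(opponent, depth) -> "C" or "D", where `opponent` is another such
# function. Proof search is replaced by depth-limited simulation, so this is only
# a caricature of the idea in LaVictoire et al.

C, D = "C", "D"

def fairbot(opponent, depth=3):
    """Cooperate iff a depth-limited simulation of the opponent cooperates with me."""
    if depth == 0:
        return C  # optimistic base case so that mutual simulation terminates
    return C if opponent(fairbot, depth - 1) == C else D

def mirrorbot(opponent, depth=3):
    """A differently written fair-ish agent: play whatever the simulated opponent plays against me."""
    if depth == 0:
        return C
    return opponent(mirrorbot, depth - 1)

def cooperatebot(opponent, depth=3):
    return C

def defectbot(opponent, depth=3):
    return D

if __name__ == "__main__":
    print(fairbot(defectbot))     # D: fair agents aren't exploited
    print(fairbot(cooperatebot))  # C
    print(fairbot(mirrorbot))     # C: two differently implemented fair agents cooperate
    print(mirrorbot(fairbot))     # C
```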

Comment by soerenmind on Hi, I'm Holden Karnofsky. AMA about jobs at Open Philanthropy · 2018-03-26T21:13:22.756Z · score: 2 (2 votes) · EA · GW

How is doing research at Open Philanthropy different from doing research in academia? To make things concrete, how would the work of someone doing economic modelling or ML research in academia differ from typical research at OpenPhil?

Comment by soerenmind on 2017 AI Safety Literature Review and Charity Comparison · 2017-12-30T02:38:12.441Z · score: 1 (1 votes) · EA · GW

Good stuff! A little correction: The Center for Human-Compatible AI goes by 'CHAI' these days.

Comment by soerenmind on Announcing the 2017 donor lottery · 2017-12-17T03:16:07.135Z · score: 2 (2 votes) · EA · GW

Good stuff, joined!

Comment by soerenmind on An algorithm/flowchart for prioritizing which content to read · 2017-11-20T19:50:32.763Z · score: 1 (1 votes) · EA · GW

Another thing I found useful: When you have a few open tabs, order them from left to right in terms of how important (and urgent) to read you think they are. Every time you open a new one, put it in the right place. Often you won't even get to the less important ones anymore.

Comment by soerenmind on An intervention to shape policy dialogue, communication, and AI research norms for AI safety · 2017-10-03T15:09:41.925Z · score: 1 (3 votes) · EA · GW

OpenPhil's notion of 'accident risk' is more general than yours in describing the scenarios that aren't misuse risk, and their term makes perfect sense to me: https://www.openphilanthropy.org/blog/potential-risks-advanced-artificial-intelligence-philanthropic-opportunity

Comment by soerenmind on Why I think the Foundational Research Institute should rethink its approach · 2017-09-20T20:18:50.473Z · score: 1 (1 votes) · EA · GW

After some clarification Dayan thinks that vigour is not the thing I was looking for.

We discussed this a bit further and he suggested that the temporal difference error does track pretty closely what we mean by happiness/suffering, at least as far as the zero point is concerned. Here's a paper making the case (but it has limited scope IMO).

If that's true, we wouldn't need e.g. the theory that there's a zero point to keep firing rates close to zero.

The only problem with TD errors seems to be that they don't account for the difference between wanting and liking. But it's currently just unresolved what the function of liking is. So I came away with the impression that liking vs wanting and not the zero point is the central question.

I've seen one paper suggesting that liking is basically the consumption of rewards, which would bring us back to the question of the zero point though. But we didn't find that theory satisfying. E.g. food is just a proxy for survival. And as the paper I linked shows, happiness can follow TD errors even when no rewards are consumed.

Dayan mentioned that liking may even be an epiphenomenon of some things that are going on in the brain when we eat food/have sex etc, similar to how the specific flavour of pleasure we get from listening to music is such an epiphenomenon. I don't know if that would mean that liking has no function.

Any thoughts?

Comment by soerenmind on Why I think the Foundational Research Institute should rethink its approach · 2017-08-27T11:54:13.991Z · score: 1 (1 votes) · EA · GW

"I feel like there's a difference between (a) an agent inside the room who hasn't yet pressed the lever to get out and (b) the agent not existing at all."

Yes, that's probably the right way to think about it. I'm also considering an alternative though: Since we're describing the situation with a simple computational model, we shouldn't assume that there's anything going on that isn't captured by the model. E.g. if the agent in the room is depressed, it will be performing 'mental actions' - imagining depressing scenarios etc. But we may have to assume that away, similar to how high school physics would assume no friction etc.

So we're left with an agent that decides initially that it won't do anything at all (not even updating its beliefs) because it doesn't want to be outside of the room, and then remains inactive. The question arises whether that's an agent at all and whether it's meaningfully different from unconsciousness.

Comment by soerenmind on Why I think the Foundational Research Institute should rethink its approach · 2017-08-26T10:11:51.778Z · score: 1 (1 votes) · EA · GW

Thanks for the reply. I think I can clarify the issue about discrete time intervals. I'd be curious about your thoughts on the last sentence of my comment above if you have any.

Discrete time

"So it seems like a time step is defined as the interval between one action and the next?"

Yes. But in a Semi (or Continuous-time) Markov Decision Process (SMDP; https://en.wikipedia.org/wiki/Markov_decision_process#Continuous-time_Markov_Decision_Process) this is not the case. SMDPs allow temporally extended actions and are commonly used in RL research. Dayan's papers use a continuous SMDP. You can still have RL agents in this formalism and it tracks our situation more closely. But I don't think the formalism matters for our discussion because you can arbitrarily approximate any formalism with a standard MDP - I'll explain below.

The continuous-time experiment looks roughly like this: Imagine you're in a room and you have to press a lever to get out - and get back to what you would normally be doing and get an average reward rho per second. However, the lever is hard to press. You can press it hard and fast or light and slowly, taking a total time T to complete the press. The total energy cost of pressing is 1/T so ideally you'd press very slowly but that would mean you couldn't be outside the room during that time (opportunity costs).

In this setting, the 'action' is just the time T that you take to press the lever. We can easily approximate this with a standard MDP. E.g. you could take action 1, which completely presses the lever in one time step, costing you 1/1=1 reward in energy. Or you could take action 2, which you would have to take twice to complete the press, costing you only 1/2 reward (so 1/4 for each time you take action 2). And so forth. Does that make sense?
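A runnable sketch of that approximation (the specific value of rho is an assumption, just to make the trade-off concrete):

```python
# Discrete approximation of the lever-pressing problem: spreading the press over
# T time steps costs 1/T in energy but forgoes T steps' worth of the average
# reward rho you'd be earning outside the room.

rho = 0.25  # assumed average reward per time step outside the room

def total_cost(T):
    energy_cost = 1.0 / T        # total energy cost of a press spread over T steps
    opportunity_cost = rho * T   # reward forgone while still inside the room
    return energy_cost + opportunity_cost

best_T = min(range(1, 21), key=total_cost)
print(best_T, total_cost(best_T))  # T=2 is optimal for rho=0.25: neither maximally fast nor maximally slow
```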

Zero point

Of course, if you don't like it outside the room at all, you'll never press the lever - so there is a 'zero point' in terms of how much you like it outside, below which you'll never press it.

"It seems like vigor just says that what you're doing is better than not doing it?"

I'm not entirely sure what you mean, but I'll clarify that acting vigorously doesn't say anything about whether the agent is currently happy. It may well act vigorously just to escape punishment. Similarly, an agent that currently works to increase its life-time doesn't necessarily feel good, but its work still implies that it thinks the additional life-time it gets will be good.

But I think your criticism may be the same as what I said in the edit above - that there is an unwarranted assumption that the agent is at the zero point before it presses the lever. In the experiments this is assumed because there are no food rewards or shocks during that time. But you could still imagine that a depressed rat would feel bad anyway.

The theory that assumes nonexistence is the zero-point kind of does the same thing though. Although nonexistence is arguably a definite zero-point, the agent's utility function might still go beyond its life-time...

Does this clarify the case?

Comment by soerenmind on Why I think the Foundational Research Institute should rethink its approach · 2017-08-21T11:48:23.378Z · score: 1 (1 votes) · EA · GW

I've had a look into Dayan's suggested papers - they imply an interesting theory. I'll put my thoughts here so the discussion can be public. The theory contradicts the one you link above where the separation between pain and pleasure is a contingency of how our brain works.

You've written about another (very intuitive) theory, where the zero-point is where you'd be indifferent between prolonging and ending your life:

"This explanation may sound plausible due to its analogy to familiar concepts, but it seems to place undue weight on whether an agent’s lifetime is fixed or variable. Yet I would still feel pain and pleasure as being distinct even if I knew exactly when I would die, and a simple RL agent has no concept of death to begin with."

Dayan's research suggests that the zero point will also come up in many circumstances relating to opportunity costs, which would deal with that objection. To simplify, let's say the agent expects a fixed average rate of return rho for the foreseeable future. It is faced with a problem where it can either act fast (high energy expenditure) or act slowly (high opportunity costs as it won't get the average return for a while). If rho is negative or zero, there is no need to act quickly at all because there are no opportunity costs. But the higher the opportunity costs get, the faster the agent will want to get back to earning its average reward, so it will act quickly despite the immediate cost.

The speed with which the agent acts is called vigour in Dayan's research. The agent's vigour mathematically implies an average rate of return if the agent is rational. There can be other reasons for low vigour, such as a task that requires patience - they have some experiments here in figure 1. In their experiment the optimal vigour (one over tau*) is proportional to the square root of the average return. A recent paper has confirmed the predictions of this model in humans.
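Under the simplified lever model from my earlier comment (not Dayan's actual task), the same square-root relationship falls out:

```latex
\frac{d}{dT}\!\left(\frac{1}{T} + \rho T\right) = -\frac{1}{T^{2}} + \rho = 0
\quad\Longrightarrow\quad
T^{*} = \frac{1}{\sqrt{\rho}},
\qquad
\text{vigour} = \frac{1}{T^{*}} = \sqrt{\rho}.
```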

So when is an agent happy according to this model?

The model would imply that the agent has positive welfare when it treats its normal activity as creating positive opportunity costs while it's doing other things (and vice versa for negative welfare). This would also apply to your example where the agent expends resources to increase or decrease its life-time.

What I like about this is that the welfare depends on the agent's behaviour and not the way the rewards are internally processed and represented as numbers which is arbitrary.

I'm still not sure how you would go about calculating the welfare of an agent if you don't have a nice experimental setup like Dayan's. That might be amenable to more thinking. Moreover, all welfare is still relative and it doesn't allow comparisons between agents.

Edit: I'm not sure, though, whether there's a problem because we now have to assume that the 'inactive' time where the agent doesn't get its average reward is the zero baseline, which is also arbitrary.

Comment by soerenmind on How can we best coordinate as a community? · 2017-07-25T11:27:33.850Z · score: 0 (0 votes) · EA · GW

Great talk!

Given the value that various blogs and other online discussions have provided to the EA community, I'm a bit surprised by the relative absence of 'advancing the state of community knowledge by writing etc.' from 80k's advice. In fact, I've found that the advice to build lots of career capital and fill every gap with an internship has discouraged me from such activities in the past.

Comment by soerenmind on Cognitive Science/Psychology As a Neglected Approach to AI Safety · 2017-07-25T11:15:43.050Z · score: 0 (0 votes) · EA · GW

I see quite a bunch of relevant cognitive science work these days, e.g. this: http://saxelab.mit.edu/resources/papers/Kleiman-Weiner.etal.2017.pdf