Root out maximizers within yourself. Even 'doing the most good.' Maximizer processes are cancer, trying to convert the universe into copies of themselves. But this destroys anything that the maximizing was for.
Potentially of use in running a short workshop is the effectiveness of pedagogical techniques. From engaging with the literature on such, the highest quality systematic review I could find pointed to four techniques as showing robust effect size across many contexts and instantiations. They are
In the same way that an organism tries to extend the envelope of its homeostasis, an organization has a tendency to isolate itself from falsifiability in its core justifying claims. Beware those whose response to failure is to scale up.
Recommended. One interesting bit for me is that I think foreign dictators often appear clownish because the translations don't capture what they were speaking to, either literally in terms of them being a good speech writer, or contextually in terms of not really being familiar with the cultural context that animates a particular popular political reaction. I think this applies even if you speak nominally the same language as the dictator but don't share their culture.
Appreciate the care taken, especially in the atomistic section. One thing is that it seems to assume that the best we can do with such a research agenda is analyze correlates, whereas what we really want is a causal model.
I really enjoyed this. A related thing is about a possible reason why more debate doesn't happen. I think when rationalist style thinkers debate, especially in public, it feels a bit high stakes. There is pressure to demonstrate good epistemic standards, even though no one can define a good basis set for that. This goes doubly so for anyone who feels like they have a respectable position or are well regarded. There is a lot of downside risk to them engaging in debate and little upside. I think the thing that breaks this is actually pretty simple and is helped out by the 'sorry' command concept. If it's a free move socially to choose whether or not to debate (which avoids the thing where a person mostly wants to debate only if they're in the mood and about the thing they are interested in but don't want to defend a position against arbitrary objections that they may have answered lots of times before etc.) and also a free move to say 'actually, some of my beliefs in this area are cached sorries, so I reserve the right to not have perfect epistemics here already, and we also recognize that even if we refute specific parts of the argument, we might disagree on whether it is a smoking gun, so I can go away and think about it and I don't have to publicly update on it' then it derisks engaging in a friendly, yet still adversarial, form of debate.
If we believe that people doing a lot of this play fighting will on average increase the volume and quality of EA output both through direct discovery of more bugs in arguments and in providing more training opportunity, then maybe it should be a named thing like Crocker's rules? Like people can say 'I'm open to debating X, but I declare Kid Gloves' or something. (What might be a good name for this?)
> Costs of being vegan are in fact trivial, despite all the complaining that meat-eaters do about it. For almost everyone there is a net health benefit and the food is probably more enjoyable than the amount of enjoyment one would have derived from sticking with one's non-vegan diet, or at the very least certainly not less so. No expenditure of will-power is required once one is accustomed to the new diet. It is simply a matter of changing one's mind-set.
Appreciate some of the points, but this part seems totally disconnected from what people report along several dimensions.
This is only half formed but I want to say something about a slightly different frame for evaluation, what might be termed 'reward architecture calibration.' I think that while a mapping from this frame to various preference and utility formulations is possible, I like it more than those frames because it suggests concrete areas to start looking. The basic idea is that in principle it seems likely that it will be possible to draw a clear distinction between reward architectures that are well suited to the actual sensory input they receive and reward architectures that aren't (by dint of being in an artificial environment). In a predictive coding sense, a reward architecture that is sending constant error signals that an organism can do nothing about is poorly calibrated, since it is directing the organism's attention to the wrong things. Similarly there may be other markers that could be spotted in how a nervous system is sending signals e.g. lots of error collisions vs few, in the sense of two competing error signals pulling behavior in different directions. I'd be excited about a medium depth dive into the existing literature on distress in rats and what sorts of experiments we'd ideally want done to resolve confusions.
Literally today I was idly speculating that it would be nice to see more things that were reminiscent of the longer letters academics in a particular field would write to each other in the days of such. More willingness to explore at length. Lo and behold this very post appears. Thanks!
WRT content, you mention it in passing, but yeah this seems related to the tendency towards optimization of causal reality (inductive) or social reality (anti-inductive).
Seems like you're trying to get at what I've seen referred to as 'multifinal means' at one point. Keyword might help find related stuff.
This is sort of tangential, but related to the idea of making the distinction between inputs and outputs in running certain decision processes. I now view both consequentialism and deontological theories as examples of what I've been calling perverse monisms. A perverse monism is when there is a strong desire to collapse all the complexity in a domain into a single term. This is usually achieved via aether variables: we rearrange the model until the complexity (or uncertainty) has been shoved into a corner, either implicitly or explicitly, which makes the rest of the model look very tidy indeed.
With consequentialism we say that one should allow the inputs to vary freely while holding the outputs fixed (our idea of what the outcome should be, or heuristics that evaluate outcomes etc.). We backprop the appropriate inputs from the outputs. Deontology says we can't control outputs, but we can control inputs, so we should allow outputs to vary freely while holding the inputs to some fixed ideal.
Both of these express a hope that one can avoid the nebulosity of having a full blown confusion matrix about inputs and outputs, and of that matrix changing from problem to problem. That is to say, I have some control over which outputs to optimize for, and some control over inputs, and false positives and false negatives in my beliefs about both of those. Actual problem solving of any complexity at all both forward chains from known info about inputs and back chains from previous data about outputs, then tries to find places where the two branching chains meet. In the process of investigating this, beliefs about the inputs or outputs may also update.
More generally, I've been getting a lot of mileage out of thinking of 'philosophical positions' as different sorts of error checks that we use on decision processes.
It's also fun to think about this in terms of the heuristic that How to Measure Anything recommends:
Define parameters explicitly (what outputs do we think we care about, what inputs do we think we control)
Establish value of information (how much will it cost to test various assumptions)
Sensitivity analysis (how much does final proxy vary as a function of changes in inputs)
It's a non-linear heuristic, so the info gathered in any one step can cause you to go back and adjust one of the others, which involves that sort of bouncing back and forth between forward chaining and back chaining.
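To make the sensitivity analysis step concrete, here is a minimal sketch of a one-at-a-time version: bump each input by 10% and see how much the final proxy moves. The proxy function and all of its numbers are made up for illustration; nothing here comes from the book.

```python
def one_at_a_time_sensitivity(model, baseline, rel_step=0.1):
    """Change in model output when each input alone is bumped by rel_step."""
    base_out = model(**baseline)
    changes = {}
    for name, value in baseline.items():
        # Copy the baseline and perturb a single input.
        bumped = dict(baseline, **{name: value * (1 + rel_step)})
        changes[name] = model(**bumped) - base_out
    return changes


# Hypothetical proxy: purely illustrative, not a real impact model.
def toy_proxy(hours, quality, reach):
    return hours * quality + 0.1 * reach


baseline = {"hours": 10.0, "quality": 2.0, "reach": 50.0}
sens = one_at_a_time_sensitivity(toy_proxy, baseline)
for name, delta in sens.items():
    print(f"{name}: {delta:+.2f}")
```

Inputs with a large delta are the ones where it is worth paying for better information, which is what sends you back to the value of information step.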
It has been noted that when status hierarchies diversify, creating more niches, people are happier than when status hierarchies collapse to a single or a small number of very legible dimensions. This suggests that it would be possible to increase net happiness by studying the conditions under which these situations arise and tilting the playing field. E.g. are social media sites only having a negative impact on mental health because they compress the metrics by which success is measured?
Related: surely someone somewhere is doing critical path analysis of vaccine development. It certainly wouldn't be the case that in the middle of a crisis people just keep on doing what they've always done. Even if it isn't anyone's job to figure out what the actual non-parallelizable causal steps are in producing a tested vaccine and to trim the fat, someone would still take it on, right?
Exploit selection effects on prediction records to influence policy.
During a crisis, people tend to implement the preferred policies of whoever seems to be accurately predicting each phase of the problem. When a crisis looms on the horizon, EAs coordinate to all make different predictions thus maximizing the chance that one of them will appear prescient and thus obtain outsize influence.
A lot of people are willing to try new things right now. Rapid prototyping of online EA meetups could lead to better ability to do remote collaboration permanently. This helps cut against a key constraint in matching problems, co-location.
At a sequestration cost of $50 per ton, the average American would need to generate $1000 per year of positive impact to offset their CO2 use. The idea that the numbers are even close to comparable means priors are way, way off. The signaling commons have been polluted on this front by people impact-larping with their short showers, lack of water at restaurants and other absurdities.
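For anyone who wants to poke at the arithmetic, a quick sketch. The $50 per ton and $1000 per year figures are the ones above; the 20 tons per year is just what those two numbers imply together, not an independently sourced emissions estimate.

```python
COST_PER_TON_USD = 50      # sequestration cost from the comment above
ANNUAL_OFFSET_USD = 1000   # yearly offset cost from the comment above


def implied_tons_per_year(annual_cost_usd, cost_per_ton_usd):
    """Tons of CO2 per year implied by a yearly offset bill and a per-ton price."""
    return annual_cost_usd / cost_per_ton_usd


def annual_offset_cost(tons_per_year, cost_per_ton_usd):
    """Yearly cost of offsetting a given tonnage at a given per-ton price."""
    return tons_per_year * cost_per_ton_usd


implied_tons = implied_tons_per_year(ANNUAL_OFFSET_USD, COST_PER_TON_USD)
print(implied_tons)  # 20.0
```

Swapping in a different per-ton price or tonnage into `annual_offset_cost` shows how far the assumptions can move before the comparison changes.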
I think that much of the disconnect comes down to focusing on goals over methods. I think it is better to think of goals as orienting us in the problem-space, while most of the benefits accrue along the way. By the time you make it a substantial fraction of the way to a goal, you'll likely be in a much better position to realize the original goal was slightly off and adjust course. So 'eliminating all infectious disease' could easily be criticized as unrealistic for endless reasons, yet it is very useful for orienting us to be scope sensitive, think in terms of hits-based reasoning and so on. Similarly, even having an 'N problems of aging' list to argue about is because someone did the work of trying to figure out what it would take at a multi-year research level. If we want to talk about neglected areas of funding, I think a great place to start is neglect for funding promising methods or directions that might plausibly generate new methods with less focus on what the particular outcomes might be. Or, to sort of paraphrase Hanson and Bostrom a bit: new considerations generally trump fine tuning of existing considerations.
What could we measure that would make seemingly intractable problems trivial? Can we take moonshots at those? And I'm not talking about actually funding the moonshot once the opportunity has been identified. I'm talking about the seed research to identify plausibility, funding small numbers of people at the 1 year level to do deep dives in much weirder areas than in house researchers have been doing.
> there seems to be no way to determine what equal weights should look like, without settling on a way to normalize utility functions, e.g., by range normalization or variance normalization. I think the debate about intersubjective utility comparisons comes in at the point where you ask how to normalize utility functions.
yup, thanks. Also across time as well as across agents at a particular moment.
Like other links between VNM and Utilitarianism, this seems to sweep intersubjective utility comparison under the rug. The agents are likely using very different methods to convert their preferences to the given numbers, rendering the aggregate of them non-rigorous and subject to instability in iterated games.
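A toy illustration of why the normalization choice matters, using the range and variance normalization mentioned in the quoted reply. The two agents and their reported utilities are invented for the example, and the sketch assumes no agent is indifferent across all options (otherwise both normalizations divide by zero).

```python
from statistics import mean, pstdev


def range_normalize(utilities):
    """Rescale so the worst option is 0 and the best is 1."""
    lo, hi = min(utilities), max(utilities)
    return [(u - lo) / (hi - lo) for u in utilities]


def variance_normalize(utilities):
    """Rescale to zero mean and unit (population) variance."""
    mu, sigma = mean(utilities), pstdev(utilities)
    return [(u - mu) / sigma for u in utilities]


def best_option(agent_utils, normalize):
    """Index of the option with the highest equal-weight sum after normalizing."""
    normalized = [normalize(u) for u in agent_utils]
    totals = [sum(column) for column in zip(*normalized)]
    return totals.index(max(totals))


# Hypothetical reports over options A, B, C, D on different private scales.
agents = [
    [10, 0, 10, 0],    # agent 1, roughly a 0-10 scale
    [10, 100, 0, 10],  # agent 2, roughly a 0-100 scale
]

print(best_option(agents, range_normalize))     # 0, i.e. option A
print(best_option(agents, variance_normalize))  # 1, i.e. option B
```

Same reports, same "equal weights", different winner. Which is the point: the aggregate is only as principled as the normalization step.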
Note also that your question has a selection filter where you'd also want to figure out where the best arguments for longer timelines are. In an ideal world these two sets of things tend to live in the same place, in our world this isn't always the case.
First, doing philosophy publicly is hard and therefore rare. It cuts against Ra-shaped incentives. Much appreciation to the efforts that went into this.
>he thinks the world is metaphorically more made of liquids than solids.
Damn, the convo ended just as it was getting to the good part. I really like this sentence and suspect that thinking like this remains a big untapped source of generating sharper cruxes between researchers. Most of our reasoning is secretly analogical with deductive and inductive reasoning back-filled to try to fit it to what our parallel processing already thinks is the correct shape that an answer is supposed to take. If we go back to the idea of security mindset, then the representation that one tends to use will be made up of components, and your type system for uncertainty will be uncertainty of those components varying. So which sorts of things your representation uses as building blocks will determine the kinds of uncertainty that you have an easier time thinking about and managing. Going upstream in this way should resolve a bunch of downstream tangles since the generators for the shape/direction/magnitude (this is an example of such a choice that might impact how I think about the problem) of the updates will be clearer.
This gets at a way of thinking about metaphilosophy. We can ask what more general class of problems AI safety is an instance of, and maybe recover some features of the space. I like the capability amplification frame because it's useful as a toy problem to think about random subsets of human capabilities getting amplified, to think about the non-random ways capabilities have been amplified in the past, and what sorts of incentive gradients might be present for capability amplification besides just the AI research landscape one.
I would guess that many feel small not because of abstract philosophy but because they are in the same room as elephants whose behavior they cannot plausibly influence. Their own efforts feel small by comparison. Note that this reasoning would have cut against originally starting GiveWell though. If EA was worth doing once (splitting away from existing efforts to figure out what is neglected in light of those existing efforts), it's worth doing again. The advice I give to aspiring do-gooders these days is to ignore EA as mostly a distraction. Getting caught up in established EA philosophy makes your decisions overly correlated with existing efforts, including the motivation effects discussed here.
IIRC Interactive Brokers isn't going to let you lever up more than about 2:1, though if you have 'separate' personal and altruistic accounts you can potentially lever your altruistic side higher. e.g. if you have 50k in personal accounts and 50k in altruistic accounts, you can get 100k in margin, allowing you to lever up the altruistic side 3:1.
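Spelling out the arithmetic in that example as a sketch. The assumptions are mine: the roughly 2:1 cap applies to combined equity across both pots, and the personal side is left completely unlevered so that all the borrowed exposure lands on the altruistic side. The function name is just for illustration.

```python
def altruistic_leverage(personal_equity, altruistic_equity, account_cap=2.0):
    """Effective leverage on the altruistic pot if it absorbs all the margin.

    Assumes account_cap applies to total equity and the personal pot
    keeps exposure equal to its own equity (no leverage).
    """
    total_equity = personal_equity + altruistic_equity
    max_exposure = total_equity * account_cap
    # Exposure left for the altruistic side once the personal side is covered.
    altruistic_exposure = max_exposure - personal_equity
    return altruistic_exposure / altruistic_equity


ratio = altruistic_leverage(50_000, 50_000)
print(ratio)  # 3.0
```

With 50k in each pot and a 2:1 cap, total exposure is 200k, of which 150k sits against the 50k altruistic pot, hence 3:1.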
Lazy people can access mild leverage (1.5:1) through NTSX for low fees. Many brokerages don't grant access to the more extreme 3:1 ETFs.