Posts

Epistea Summer Experiment (ESE) 2020-01-24T10:51:00.672Z · score: 15 (8 votes)
How x-risk projects are different from startups 2019-04-05T07:35:39.513Z · score: 50 (32 votes)
Request for comments: EA Projects evaluation platform 2019-03-20T22:36:32.565Z · score: 32 (30 votes)
OpenPhil: Reflections on 2018 Generalist Research Analyst Recruiting 2019-03-08T02:41:44.804Z · score: 44 (16 votes)
What to do with people? 2019-03-06T11:04:21.556Z · score: 87 (54 votes)
Critique of “Existential Threats” chapter in Enlightenment Now 2018-11-21T10:09:54.552Z · score: 9 (12 votes)
Suggestions for developing national-level effective altruism organizations 2018-10-17T23:35:37.241Z · score: 19 (15 votes)
Why develop national-level effective altruism organizations? 2018-10-17T23:29:44.203Z · score: 28 (20 votes)
Effective Thesis project review 2018-05-31T18:45:22.248Z · score: 26 (25 votes)
Review of CZEA "Intense EA Weekend" retreat 2018-04-05T20:10:04.290Z · score: 25 (25 votes)
Optimal level of hierarchy for effective altruism 2018-03-27T22:32:15.211Z · score: 8 (11 votes)
Introducing Czech Association for Effective Altruism - history 2018-03-12T22:01:49.556Z · score: 23 (22 votes)

Comments

Comment by jan_kulveit on Neglected EA Regions · 2020-02-18T14:05:50.459Z · score: 5 (2 votes) · EA · GW

I'm not sure you've read my posts on this topic? (1,2)

In the language used there, I don't think the groups you propose would help people reach the minimum recommended resources, but they risk creating the appearance that some criteria vaguely in that direction are met.

  • e.g., in my view, the founding group must have a deep understanding of effective altruism and, essentially, the ability to go through the whole effective altruism prioritization framework, taking into account local specifics, to reach conclusions valid for their region. This is basically impossible to implement as a membership requirement in a Facebook group
  • or strong link(s) to the core of the community ... this is not fulfilled by someone from the core hanging out in many Facebook groups with otherwise unconnected people

Overall, I think sometimes small obstacles, such as having to find EAs from your country in the global FB group, on the EA Hub, or by other means, are a good thing!

Comment by jan_kulveit on Neglected EA Regions · 2020-02-18T13:42:00.787Z · score: 8 (6 votes) · EA · GW

FWIW, the "Why not to rush to translate effective altruism into other languages" post was quite influential, but in my opinion it is often wrong or misleading and advocates a very strong prior toward inaction.

Comment by jan_kulveit on Neglected EA Regions · 2020-02-17T20:11:18.468Z · score: 4 (4 votes) · EA · GW

I don't think this is actually neglected

  • in my view, bringing effective altruism into new countries/cultures is in its initial phases best understood as strategy/prioritisation research, not as "community building"
    • the importance of this increases with distance (cultural / economic / geographical / ...) from places like Oxford or the Bay Area

(more on the topic here)

  • I doubt the people who are plausibly good founders would actually benefit from such groups, and even less from some vague coordination via Facebook groups
    • actually, I think on the margin, if there are people who would move forward with localization efforts only because such FB groups exist and other similar people express interest, and would not do so otherwise, their impact could easily be negative
Comment by jan_kulveit on AI safety scholarships look worth-funding (if other funding is sane) · 2019-11-26T12:24:57.312Z · score: 2 (4 votes) · EA · GW

  • I don't think it's reasonable to think about FHI DPhil scholarships, and even less so the RSP, as mainly a funding program (maybe ~15% of the impact comes from the funding).
  • If I understand the funding landscape correctly, both EA Funds and the LTFF are potentially able to fund a single-digit number of PhDs. Actually, has anyone approached these funders with a request like "I want to work on safety with Marcus Hutter, and the only thing preventing me is funding"? Maybe I'm too optimistic, but I would expect such requests to have a decent chance of success.
Comment by jan_kulveit on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-24T14:30:50.107Z · score: 21 (3 votes) · EA · GW

Sure

a)

For example, CAIS and something like the classical "superintelligence in a box" picture disagree a lot on the surface level. However, if you look deeper, you will find many similar problems. A simple-to-explain example: the problem of manipulating the operator, which has (in my view) a "hard core" involving both math and philosophy. You want the AI to communicate with humans in a way which at the same time a) allows the human to learn from the AI if the AI knows something about the world, b) ensures the operator's values are not "overwritten" by the AI, and c) does not prohibit moral progress. In CAIS language this is connected to so-called manipulative services.

Or: one of the biggest hits of the past year is the mesa-optimisation paper. However, if you are familiar with prior work, you will notice many of the proposed solutions for mesa-optimisers are similar or identical to solutions previously proposed for so-called 'daemons' or 'misaligned subagents'. This is because the problems partially overlap (the mesa-optimisation framing is clearer and makes a stronger case for "this is what to expect by default"). Also, while on the surface level there is a lot of disagreement between e.g. MIRI researchers, Paul Christiano and Eric Drexler, you will find a "distillation" proposal targeted at the above-described problem in Eric's work from 2015, and many connected ideas in Paul's work on distillation; and while I find Eliezer harder to understand, I think his work also reflects an understanding of the problem.

b)

For example: you can ask whether the space of intelligent systems is fundamentally continuous or not (I call this "the continuity assumption"). This is connected to many agendas: if the space is fundamentally discontinuous, this would cause serious problems for some forms of IDA, debate, interpretability, and more.

(An example of discontinuity would be the existence of problems which are impossible to meaningfully factorize; there are many more ways the space could be discontinuous.)

There are powerful intuitions going both ways on this.

Comment by jan_kulveit on I'm Buck Shlegeris, I do research and outreach at MIRI, AMA · 2019-11-21T12:38:40.575Z · score: 22 (9 votes) · EA · GW

I think the picture is somewhat correct, and, perhaps surprisingly, we should not be too concerned about the dynamic.

My model for this is:

1) there are some hard and somewhat nebulous problems "in the world"

2) people try to formalize them using various intuitions/framings/kinds of math; also using some "very deep priors"

3) the resulting agendas look extremely different at the surface level, and create the impression you have

but actually

4) if you understand multiple agendas deeply enough, you get a sense of

  • how they are sometimes "reflecting" the same underlying problem
  • whether they are based on some "deep priors", how deep those priors are, and how hard they can be to argue about
  • how much they are based on "tastes" and "intuitions" ~ one model for thinking about this is that people have something comparable to the policy net in AlphaZero: a mental black box which spits out useful predictions, but is not interpretable in language

Overall, given our current state of knowledge, I think running these multiple efforts in parallel is a better approach, with a higher chance of success, than the idea that we should invest a lot in resolving disagreements/prioritizing so that everyone works on the "best agenda".

This seems to go against a core EA heuristic ("compare the options, take the best") but is actually more in line with rational allocation of resources in the face of uncertainty.


Comment by jan_kulveit on Update on CEA's EA Grants Program · 2019-11-16T16:35:38.785Z · score: 30 (11 votes) · EA · GW

Re: future of the program & ecosystem influences.

What bad things will happen if the program is just closed?

  • for the area overlapping with something "community building-ish", the CBG programme will become the sole source of funding, as the Meta Fund does not fund that. I think at least historically CBG had some problematic influence on the global development of effective altruism, not because of the direct impact of funding, but because of putting money behind a specific set of advice/evaluation criteria. (To clarify what I mean: I would expect the space would be healthier if exactly the same funding decisions were made, but less specific advice about what people should do were attached; the problem is also not necessarily on the program side, but can be thought of as Goodharting on the side of grant applicants/recipients.)
  • for x-risk, the LTFF could become too powerful a source of funding for new/small projects. In practice, while there are positive impacts of transparency, I would expect some problematic effects from mainly Oli's opinions and advice being associated with a lot of funding. (To clarify: I'm not worried about the funding decisions, but about indirect effects of the type "we are paying you, so you'd better listen to us", and about people intentionally or unintentionally Goodharting on views expressed as grant justification.)
  • for various things falling into the gaps between fund scopes, it may be less clear what to do
  • it increases the risks of trying to found something like "EA startups"
  • it can make the case for individual donors funding things stronger

All of that could be somewhat mitigated if the rest of the funding ecosystem adapts, e.g. by creating more funds with intentional overlap, or by creating other streams of funding organised e.g. along geographical lines.


Comment by jan_kulveit on Which Community Building Projects Get Funded? · 2019-11-16T15:48:46.887Z · score: 10 (7 votes) · EA · GW

As a side-note: in the case of the Bay Area, I'd expect some funding-displacement effects. BERI grant-making is strongly correlated with geography, and historically BERI funded some things which could be classified as community building. The LTFF is also somewhat Bay-centric, and there seem to be some LTFF grants which could hypothetically have been funded by several orgs. Also, some things were likely funded informally by local philanthropists.

To make the model more realistic, one should note:

  • there is some underlying distribution of "worthy things to fund"
  • some of the good projects could likely be funded from multiple sources; all other things being equal, I would expect the funding to more likely come from the nearest source

Comment by jan_kulveit on EA Hotel Fundraiser 6: Concrete outputs after 17 months · 2019-11-05T12:26:57.608Z · score: 49 (20 votes) · EA · GW

meta: I considered commenting, but instead I'm just flagging that I find it somewhat hard to have an open discussion about the EA Hotel on the EA Forum in the fundraising context. The feelings part is:

  • there is a lot of emotional investment in the EA Hotel,
  • it seems that if the hotel runs out of runway, for some people it could mean basically losing their home.

Overall my impression is that posting critical comments would feel somewhat antisocial, while posting only positives or endorsements is against good epistemics, so the personally safest thing for many people is to not say anything.

At the same time, it is blatantly obvious there must be some scepticism about both the project and the outputs: the situation where the hotel seems to be almost out of runway keeps repeating. While e.g. EA Funds collects donations of basically millions of dollars per year, the EA Hotel struggles to collect low tens of $.

I think this equilibrium where

  • people are mostly silent but also mostly not supporting the hotel, at least financially
  • the financial situation of the project is somewhat dire
  • talks with EA Grants and the EA Long-Term Future Fund are in progress, but the funders are not funding the project yet

is not good for anyone, and has some bad effects on the broader community. I'd be interested in ideas on how to move out of this state.

Comment by jan_kulveit on Only a few people decide about funding for community builders world-wide · 2019-10-25T13:52:25.269Z · score: 8 (2 votes) · EA · GW

In practice, it's almost never the only option; e.g. CZEA was able to find some private funding even before the CBG programme existed, and several other groups were at least partially professional before CBG. In general, it's more that it's better if national-level groups are funded from within EA.

Comment by jan_kulveit on Long-Term Future Fund: August 2019 grant recommendations · 2019-10-10T19:54:54.670Z · score: 3 (5 votes) · EA · GW

The reason may be fairly simple: most AI alignment researchers do not participate (post or comment) on LW/AF, or participate only a little. For more understanding of why, check this post by Wei Dai and the discussion under it.

(Also: if you follow just LW, your understanding of the field of AI safety is likely somewhat distorted.)

With hypotheses 4 & 5, I would expect at least Oli to have a strong bias toward being more enthusiastic about funding people who like to interact with LW (all other research qualities being equal), so I'm pretty sure that's not the case.

2 & 3 are somewhat true, at least on average: if we operationalize "private people" as "people you meet participating in private research retreats or visiting places like MIRI or FHI", and "online people" as "people posting and commenting on AI safety on LW", then the first group is on average better.

1 is likely true in the sense that the best LW contributors are not applying for grants.



Comment by jan_kulveit on Long-Term Future Fund: August 2019 grant recommendations · 2019-10-08T14:33:14.431Z · score: 11 (4 votes) · EA · GW

In my experience, teaching rationality is trickier than the reference class "education", and is an area which is quite hard to communicate to non-specialists. One of the main reasons seems to be that many people have a somewhat illusory idea of how well they understand the problem.

Comment by jan_kulveit on Get-Out-Of-Hell-Free Necklace · 2019-07-15T07:45:25.653Z · score: 4 (5 votes) · EA · GW

I've suggested something similar for happiness (https://www.lesswrong.com/posts/7Kv5cik4JWoayHYPD/nonlinear-perception-of-happiness). If you don't want to introduce the weird asymmetry where the negative counts and the positive does not, what you get out of it could be somewhat surprising: it possibly recovers more "common folk" altruism, where helping people who are already quite well off could be good, and if you allow more speculative views on the space of mind-states, you are at risk of recovering something closely resembling some sort of "Buddhist utilitarian calculus".

Comment by jan_kulveit on EA Forum 2.0 Initial Announcement · 2019-07-12T22:26:45.635Z · score: 12 (6 votes) · EA · GW

As humans, we are quite sensitive to signs of social approval and disapproval, and we have some 'elephant in the brain' motivation to seek social approval. This can sometimes mess with epistemics.

The karma represents something like the sentiment of the people voting on a particular comment, weighted in a particular way. For me, this often did not seem to be a signal adding any new information: when following the forum closely, I would usually have been able to predict what would get downvoted or upvoted.

What seemed problematic to me was the number of times I hesitated to write something because part of my S1 predicted it would get downvoted. Also, I did not want to be primed by karma when reading others' comments.

On a community level, I think the quality of the karma signal is roughly comparable to Facebook likes. If people are making important decisions, evaluating projects, awarding prizes, and so on based on it, it seems plausible it's actively harmful.

Comment by jan_kulveit on EA Forum 2.0 Initial Announcement · 2019-07-12T11:10:47.130Z · score: 10 (4 votes) · EA · GW

This isn't a complaint, but take it as a datapoint: I've switched off the karma display on all comments and my experience improved. The karma system tends to mess with my S1 processing.

It seems plausible that karma is causing harm in some hard-to-perceive ways. (One specific way is people updating on karma patterns, mistaking them for some voice of the community / EA movement / ...)

Comment by jan_kulveit on Is there an analysis that estimates possible timelines for arrival of easy-to-create pathogens? · 2019-06-15T10:22:35.332Z · score: 19 (7 votes) · EA · GW

I would expect that if organizations working in the area have reviews of expected technologies and how they enable individuals to manufacture pathogens (which is likely the background necessary for constructing timelines), they would not publish overly specific documents.

Comment by jan_kulveit on What new EA project or org would you like to see created in the next 3 years? · 2019-06-14T23:06:13.481Z · score: 6 (4 votes) · EA · GW

If people think this is generally a good idea, I would guess CZEA could get it running in a few weeks. Most of the work likely comes from curating the content, not from setting up the service.

Comment by jan_kulveit on Long-Term Future Fund: April 2019 grant recommendations · 2019-04-09T00:24:20.137Z · score: 5 (4 votes) · EA · GW

To clarify: I agree with the benefits of splitting the discussion threads for readability, but I was unenthusiastic about the motivation being voting.

Comment by jan_kulveit on Long-Term Future Fund: April 2019 grant recommendations · 2019-04-09T00:02:29.425Z · score: 6 (8 votes) · EA · GW

I don't think the karma/voting system should be given that much attention, or should be used as highly visible feedback on project funding.

Comment by jan_kulveit on Long-Term Future Fund: April 2019 grant recommendations · 2019-04-08T23:16:52.995Z · score: 40 (13 votes) · EA · GW

I don't think anyone should be trying to persuade IMO participants to join the EA community, and I also don't think giving them "much more directly EA content" is a good idea.

I would prefer Math Olympiad winners to think about the long term, think better, and think independently, rather than to "join the EA community". HPMoR seems OK because it is not a book trying to convince you to join a community, but mostly a book about ways of thinking, and a good read.

(If the readers eventually become EAs after reasoning independently, that's likely good; if, for example, they come to the conclusion that there are major flaws in EA and it's better to engage with the movement critically, that's also good.)

Comment by jan_kulveit on How x-risk projects are different from startups · 2019-04-06T17:05:37.220Z · score: 30 (10 votes) · EA · GW

I don't think risk of this type is given too much weight now. In my model, considerations like this at some point in the past got rounded off to an over-simplified meme like "do not start projects, they fail and it is dangerous". This is wrong and has led to some counterfactual value being lost.

This was to some extent a reaction to the previous mood, which was more like "bring in new people; seed groups; start projects; grow everything". That was also problematic.

In my view we are looking at something like pendulum swings: we were recently near the extreme position of not many projects being started, but the momentum is in the direction of more projects, and the second derivative is high. So I expect many projects will actually get started. In such a situation the important thing is to start good projects, and avoid anti-unicorns.

IMO the risk was maybe given too much weight before, but is given too little weight now by many people. Just look at many of the recent discussions, where security mindset seems rare and many want to move forward fast.

Comment by jan_kulveit on How x-risk projects are different from startups · 2019-04-06T16:54:18.392Z · score: 12 (6 votes) · EA · GW

Discussing specific examples seems very tricky. I can probably come up with a list of maybe 10 projects or actions which come with large downsides/risks, but I would expect listing them would not be that useful and could cause controversy.

A few hypothetical examples:

  • influencing a major international regulatory organisation in a way that leads to creating some sort of "AI safety certification" in a situation where we don't have the basic research yet, creating a false sense of security/fake sense of understanding
  • creating a highly distorted version of effective altruism in a major country, e.g. by bad public outreach
  • coordinating the effective altruism community in a way which leads to increased tension and possibly splits in the community
  • producing and releasing some infohazardous research
  • influencing important players in AI or AI safety in a harmful leveraged way, e.g. by bad strategic advice

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-26T00:54:09.963Z · score: 2 (1 votes) · EA · GW

My impression is that you have in mind something different from what was intended in the proposal.

What I imagined was 'priming' the argument-mappers with prompts like:

  • Imagine this project fails. How?
  • Imagine this project works, but has some unintended bad consequences. What are they?
  • What would be a strong reason not to associate this project with the EA movement?

(and the opposites). When writing their texts, the two people would be communicating and looking at the arguments from both sides.

The hope is this would produce a more complete argument map. One way to think about it is that each person is 'responsible' for the pro or con section, trying to make sure it captures as many important considerations as possible.

It seems quite natural for people to think about arguments in this way, with "sides" (sometimes even single authors present complex arguments in dialogue form).

There are possible benefits, related to why the 'debate' style is used in the justice system:

  • It levels the playing field in interesting ways (when compared to a public debate on the forum). In a public debate, what "counts" is not just the arguments, but also discussion and social skills, the status of participants, the moods and emotions of the audience, and similar factors. The proposed format would mean both the positives and the negatives have "advocates", ideally of "similar debate strength" (anonymous volunteers). This is very different from a public forum discussion, where all kinds of "elephant in the brain" biases may influence participants and bias judgements.
  • It removes some of the social costs and pains associated with project discussions. Idea authors may get discouraged by negative feedback, downvotes/karma, or similar.

Also, just looking at how discussions on the forum go now, it seems in practice it is easy for people to look at things from positive or negative perspectives: certainly I have seen arguments structured like (several different ways something could fail + why it would be too costly even if it succeeded + speculation about what harm it may cause anyway).

Overall, in my words, I'm not sure whether your view is 'in the space of argument-mapping, nothing in the vicinity of debate will work, at least when done by humans and applied to real problems', or 'there are options in this space which are bad', where I agree that something like bullet-pointed lists of positives and negatives, where the people writing them do not communicate, seems bad.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-22T10:02:09.061Z · score: 0 (3 votes) · EA · GW

My impression was based mostly on our conversations several months ago; quoting the notes from that time:

A lot of the discussion and debate derives from differing assumptions held by the participants regarding the potential for bad/risky projects: Benjamin/Brendon generally point out the lack of data/signal in this area and believe launching an open project platform could provide data to reduce uncertainty, whereas Jan is more conservative and prioritizes creating a rigorous curation and evaluation system for new projects.

I think it is fair to say you expected very low risk from creating an open platform where people would just post projects and seek volunteers and funding, while I expected that with minimal curation this creates significant risk (even if the risk comes from a small fraction of projects). Sorry if I rounded off suggestions like "let's make an open platform without careful evaluation and see" and "based on the project ideas lists which existed several years ago, the number of harmful projects seems low" to "worrying about them is premature".

Reading your recent comment, it seems more careful, and points out that large negative outcomes are more of a problem with x-risk/long-term oriented projects.

In our old discussions I also expressed some doubt about your or altruism.vc's ability to evaluate x-risk and similar projects; your recent post states that projects that impact x-risks by doing something like AI safety research have not yet applied to the EA Angel Group.

I guess part of the disagreement comes from the fact that I focus on x-risk and the long-term future, so I'm both more interested in improving the project landscape in these areas and more worried about negative outcomes.

If open platforms or similar evaluation processes also accept x-risk mitigation and similar proposals, then in my opinion the bar for how good/expert-driven the evaluations need to be is unfortunately higher, and signals like "this is a competent team", which VCs would mainly look at, are not enough.

Because I would expect the long-term impact to come mainly from long-term, meta-, exploratory, or very ambitious projects, I think you can be basically right about the low obvious risk of all the projects historically posted on the hackpad or proposed to altruism.vc, and still miss the largest term in the EV.

Milan asked this question and I answered it.

Thanks; both of those happened after I posted my comment, and I still do not see the numbers which would help me estimate the ratio of projects which applied to projects which got funded. I take it as a mildly negative signal that someone had to ask, and that this info was not included in the post, which solicits project proposals and volunteer work.

In my model it seems possible you have something like a chicken-and-egg problem: not getting many great proposals, and the group of unnamed angels not funding many proposals coming via that pipeline.

If this is the case and the actual number of successfully funded projects is low, I think it is necessary to state this clearly before inviting people to work on proposals. My vague impression is that we may disagree on this, which seems to indicate some quite deep disagreement about how funders should treat projects.

I'm not entirely sure what your reasons are for having this opinion, or what you even mean

The whole context was that Ryan suggested I should have sought some feedback from you. I actually did that, and your co-founder noted that he would try to write the feedback "today or tomorrow", on the 11th of March, which did not happen. I don't think this is a large problem, as we had already discussed the topic extensively.

When writing it I was somewhat upset about the mode of conversation where critics do not ask whether I tried to coordinate with someone, but just assume I did not. I apologize for the bad way it was written.

Overall, my summary is that we probably still disagree on many assumptions; we did invest some effort in trying to overcome them, and it seems difficult for us to reach consensus, but this should not stop us from trying to move forward.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-22T02:30:36.610Z · score: 4 (3 votes) · EA · GW

Summary of my impressions so far. Object-level:

  • It seems many would much prefer expediency in the median project case to robustness and safety in rare, low-frequency, possibly large-negative-impact cases. I do not think this is the right approach when the intention is also to evaluate long-term oriented, x-risk, meta-, cause-X, or highly ambitious projects.
  • I'm afraid there is some confusion about project failure modes. I'm more worried about projects which would succeed in having a team, working successfully in some sense, and changing the world, but achieve a large negative impact in the end.
  • I feel sad about the repeated claims that the proposal is rigid, costly, or large-scale. If something did not work in practice it could easily be changed. Spending something like 5h of time on a project idea which was likely the result of much longer deliberation and which may lead to thousands of hours of work seems reasonable. Paradoxically, just the discussion about whether the project is costly or not has likely already cost more than setting up the whole proposed infrastructure for the project plus phases 1a, 1d, and 1c would.

Meta:

  • I will not have time to participate in the discussion in the next few days. Thanks for the comments so far.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-22T00:30:37.082Z · score: 10 (3 votes) · EA · GW

Thanks Sundanshu! Sorry for not replying sooner; I was a bit overwhelmed by some of the negative feedback in the comments.

I don't think step 1b has the same bottleneck as current grant evaluators face, because it is less dependent on good judgement.

With your proposal, I think part of it may work, but I would be worried about other parts. With step 2b, I would fear nobody would feel responsible for producing the content.

With 3a or any automatic steps like that, what is lacking is some sort of (reasonably) trusted expert judgement. In my view this is actually the most critical step in the case of x-risk, long-term, meta-, and similarly difficult-to-evaluate proposals.

Overall

  • I'm sceptical that karma or a similar automated system is good at tracking what is actually important here
  • I see some beauty in automation, but I don't see it applied here in the right places

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T22:55:58.368Z · score: 7 (2 votes) · EA · GW

FWIW, part of my motivation for the design was:

1. there may be projects, mostly in the long-term, x-risk, meta- and outreach spaces, which are very negative, but not in an obvious way

2. there may be ideas, mostly in long-term and x-risk, which are infohazards

The problem with 1. is that most of the EV can be caused by just one project with a large negative impact, where the downside is not easy to notice.

It seems to me standard startup thinking does not apply here, because startups generally cannot go far below zero.

I also do not trust an arbitrary set of forum users to handle this well.

Overall, I believe very lightweight, unstructured processes trade some gain in speed and convenience in most cases for decreased robustness in the worst cases.

In general, I would feel much better if the simple system you want to try avoided projects in the long-term, x-risk, meta-, outreach, localization, and "searching for cause X" areas.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T19:42:52.863Z · score: 11 (5 votes) · EA · GW

It is possible my reading of your post somewhat blended with some other parts of the discussion, which in my opinion read the proposal quite uncharitably. Sorry for that.

Actually, from the list, I talked about it and shared the draft with people working on EA Grants and EA Funds, and with Brendon, and historically I have had some interactions with BERI. What I learned is that people have different priors over the existence of bad projects, the ratio of good projects, and the number of projects which should or should not get funded. Also, the opinions of some of the funders are at odds with the opinions of some people I trust more than the funders.

I don't know, but it seems to me you are either somewhat underestimating the amount of consultation which went into this, or overestimating how much agreement there is between the stakeholders. Also, I'm trying to factor in the interests of the project founders, and overall I'm more concerned with whether the impact on the world would be good, and with what's good for the whole system.

Despite repeated claims that the proposal is very heavy, complex, rigid, etc., I think the proposed project would in fact be quite cheap, lean, and flexible (and would work). I'm also quite flexible about modifying it in any direction which seems consensual.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T18:49:51.172Z · score: 3 (2 votes) · EA · GW

You are missing one major category here: projects which are simply bad because they do have approximately zero impact, but aren't particularly risky. I think this category is the largest of the four.

I agree that's likely. Please take the first paragraphs more as motivation than as a precise description of the categories.

Which projects have a chance of working and which don't is often pretty clear to people who have experience evaluating projects quite quickly (which is why Oli suggested 15min for the initial investigation above).

I think we are comparing apples and oranges. Insofar as the output should be some publicly understandable reasoning behind the judgement, I don't think this is doable in 15 minutes.

It sounds to me a bit like your model of ideas which get proposed is that most of them are pretty valuable. I don't think this is the case.

I don't have a strong prior on that.

To do this well they need to have a good mental map of what kind of projects have worked or not worked in the past,...

From a project-management perspective, yes, but with the slow and bad feedback loops of long-term, x-risk, and meta-oriented projects, I don't think it is easy to tell what works and what does not. (Even with projects working in the sense that they run smoothly and are producing some visible output.)

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T17:47:42.577Z · score: 3 (2 votes) · EA · GW

I'm not sure if we agree or disagree; possibly we partially agree and partially disagree. In the case of negative feedback, I think that as a funder you are at greater risk of people over-updating in the direction of "I should stop trying".

I agree that friends and one's social neighbourhood may be too positive (that's why the proposed initial reviews are anonymous, and one of the reviewers is supposed to be negative).

When funders give general opinions on what should or should not get started, or on how you value or do not value things, again, I think you are at greater risk of having too much influence on the community. I do not believe the knowledge of the funders is strictly better than the knowledge of grant applicants.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T13:24:31.627Z · score: -2 (9 votes) · EA · GW

On a meta-level

I'm happy to update the proposal to reflect some of the sentiments. Frankly, I find some of them quite strange; e.g. it seems that coalescing the steps into one paragraph and assuming all the results (reviews, discussion, an "authoritative" summary of the discussion) will just happen may make it look more flexible. OK, why not.

Also, you and Oli seem to be worried that I want to recruit people who are currently not doing high-impact direct work... instead of just asking a couple of people around me, which would often mean people already doing impactful volunteer work.

The meta-point is, I'm not sure whether you or Oli realize how big a part of solving

new EA projects evaluation pipeline

is consensus-building. Actually, I think the landscape of possible ways to do evaluations is such that it is very hard to get consensus on what the "strongest form" is. I'm quite happy to create a bunch of proposals, e.g.:

  • removing the final expert evaluation
  • removing the initial reviews
  • removing the public forum discussions
  • writing in the unrealistic assumption that the initial reviews will take 15 minutes instead of hours
  • suggesting that the volunteers will be my busy friends (whose voluntary work does not count?)
  • emphasising public feedback more, or less
  • giving a stronger or weaker voice to existing funders.

I have a stronger preference for the platform happening than for any particular option in any single one of these choices. But what is the next step? After thinking about the landscape for some time, I'm quite skeptical that any particular combination of options would be free of some large drawback.

On the object level:

Re: funder involvement

Cross-posting from another thread

Another possible point of discussion is whether the evaluation system would work better if it was tied to some source of funding. My general intuition is this would create more complex incentives, but generally I don't know and I'm looking for comments.

I think it is much harder to give open feedback if it is closely tied to funding. Feedback from funders can easily have too much influence on people, and should be very careful and nuanced, as it comes from a position of power. I would expect that adding financial incentives can easily be detrimental to the process. (For a self-referential example, just look at this discussion: do you think the fact that Oli dislikes my proposal and suggests the LTF fund could back something different with $20k will not create at least some unconscious incentives?)

Brendon and I had some discussion, and I think his opinion can be rounded to "there are almost no bad projects, so worrying about them is premature". I disagree with that. Also, given that Brendon's angel group has been working, evaluating, and funding projects since October, I would be curious which projects were funded, what the total amount of funding allocated was, and how many applications they received.

Based on what I know, I'm unconvinced that Brendon or BERI should have an outsized influence on how evaluations are done; part of the point of the platform would be to serve the broader community.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T04:09:23.379Z · score: 9 (5 votes) · EA · GW

It is very easy to replace this stage with e.g. just two reviews.

Some of the arguments for the contradictory version:

  • the point of this stage is not to produce an EV estimate, but to map the space of costs, benefits, and considerations
  • it is easier to be biased in a defined way than to be unbiased
  • it removes part of the problem with social incentives

Some arguments against it are:

  • such adversarial setups for truth-seeking are uncommon outside of the judicial process
  • it may contribute to unnecessary polarization
  • the splitting may feel unnatural

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T02:07:13.942Z · score: 3 (3 votes) · EA · GW

I don't see why continuous coordination of a team of about 6 people on Slack would be very rigid, or why people would have very narrow responsibilities.

For the panel, having a defined meeting and evaluating several projects at once seems time- and energy-conserving, especially when compared to the same set of people watching the forum often, being manipulated by karma, being in a way forced to reply to many bad comments, etc.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T01:56:58.382Z · score: 3 (2 votes) · EA · GW

On the contrary: on Slack, it is relatively easy to see an upper bound on the attention spent. On the forum, you should look not just at the time spent writing comments, but also at the time and attention of people not posting. I would be quite interested in how much time, for example, CEA+FHI+GPI employees spend reading the forum in aggregate (I guess you can technically count this).

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T01:12:16.424Z · score: 4 (3 votes) · EA · GW

I don't understand why you assume the proposal is intended as something very rigid, where e.g. if we find the proposed project hard to understand, nobody would ask for clarification, or why you assume the 2-5h is some dogma. The back-and-forth exchange could also add up to 2-5h.

With "assigning two evaluators to each project", you are just assuming the evaluators would have no say in what to work on, which is nowhere in the proposal.

Sorry, but can you for a moment also imagine some good interpretation of the proposed scheme, instead of just weak-manning every other paragraph?

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T00:52:55.462Z · score: 7 (3 votes) · EA · GW

I would be curious about your model of why the open discussion we currently have does not work well. For example here, where user nonzerosum proposed a project, the post was heavily downvoted (at some point to negative karma) without substantial discussion of the problems. I don't think the fact that I read the post after three days and wrote some basic critical arguments is good evidence that an individual reviewer and a board is much less likely to notice problems with a proposal than a broad discussion with many people contributing would be.

Also, when you make these two claims:

Setting up an EA Forum thread with good moderation would take a lot less than 20 hours.

...

I am pretty excited about someone just trying to create and moderate a good EA Forum thread, and it seems pretty plausible to me that the LTF fund would be open to putting something in the $20k ballpark into incentives for that

at the same time, I would guess this probably needs more explanation from you or the other LTF fund managers.

Generally, I'm in favour of solutions which are quite likely to work, as opposed to solutions which look cheap but are IMO likely worse.

I also don't see how a complex discussion on the forum with the high-quality reviews you imagine would cost only 5 hours, unless, of course, the time and attention of the people posting and commenting on the forum do not count. If that is the case, I strongly disagree. The forum is actually quite costly in terms of time, attention, and also the emotional impact on people trying to participate.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T00:50:37.640Z · score: 2 (1 votes) · EA · GW

With the first part, I'm not sure what you imagine as the alternative: having access to the evaluators' Google Drive so you can count how much time they spent writing? The time estimate is something like an estimate of how long it can take volunteer evaluators; if all you need is on the order of 5 minutes, you are either really fast or not explaining your decisions.

I expect much more expert time will be wasted in the forum discussions you propose.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-21T00:07:57.412Z · score: 1 (2 votes) · EA · GW

As I've already explained in the draft, I'm still very confused about what

An individual reviewer and a board is much less likely to notice problems with a proposal than a broad discussion with many people contributing would ...

should imply for the proposal. Are you suggesting that steps 1b, 1d, and 1e are useless or harmful, and that having just the forum discussion is superior?

The time of evaluators is definitely definitely definitely not free, and if you treat them as free then you end up exactly in the kind of situation that everyone is complaining about. Please respect those people's time.

Generally, I think this is quite a strange misrepresentation of how I value people's time and attention. Also, I'm not sure if you assume that the time people spend arguing on forums is basically free, or does not count because it is unstructured.

From my perspective, having this be in the open makes it a lot easier for me and other funders in the space to evaluate whether the process is going well, whether it is useful, or whether it is actively clogging up the EA funding and evaluation space. Doing this in distinct stages, and with most of the process being opaque, makes it much harder to figure out the costs of this, and the broader impact it has on the EA community, moving the expected value of this into the net-negative.

Generally, almost all of the process is open, so I don't see what should be changed. If the complaint is that the process has stages instead of unstructured discussion, and this makes it less understandable for you, I don't see why.

Comment by jan_kulveit on Request for comments: EA Projects evaluation platform · 2019-03-20T23:26:35.661Z · score: 10 (4 votes) · EA · GW

To make the discussions more useful, I'll try to briefly recapitulate parts of the discussions and conversations I had about this topic in private or via comments on the draft version. (I'm often coalescing several views into a more general claim.)

There seems to be some disagreement about how rigorous and structured the evaluations should be: you can imagine a scale where on one side you have just unstructured discussion on the forum, and on the opposite side you have "due diligence", multiple evaluators writing detailed reviews, a panel of forecasters, and so on.

My argument is: unstructured discussion on the forum is something we already have, and often the feedback project ideas get is just a few bits from voting, plus a few quick comments. Also, the prevailing sentiment of the comments is sometimes at odds with expert views or with the models likely used by funders, which may cause some bad surprises. That is too "light". The process proposed here is closer to the "heavy" end of the scale. My reason is that it seems easier to tune the "rigour" parameter down than up, and trying it on a small batch has higher learning value.

Another possible point of discussion is whether the evaluation system would work better if it was tied to some source of funding. My general intuition is this would create more complex incentives, but generally I don't know and I'm looking for comments.

Some people expressed uncertainty about whether there is a need for such a system: some because they believe there aren't many good project ideas or projects (especially unfunded ones), others because they feel the proposed projects are almost all good, there are almost no dangerous project ideas, and even small funders can choose easily. I don't have good data, but I would hope that having largely public evaluations could at least help everyone be better calibrated. Also, when comparing the "EA startup ecosystem" with the normal startup ecosystem, it seems we are often lacking what is provided by lead investors, incubators, or mentors.

Comment by jan_kulveit on Announcement: Join the EA Careers Advising Network! · 2019-03-19T21:51:44.064Z · score: 6 (6 votes) · EA · GW

Hi Evan, given that effective altruism is somewhat complex, how do you make sure the career advice given will be good? From the brief text, there does not seem to be:

  • any quality control of advisors
  • any quality control of the advice given
  • any coordination with other people doing something similar, like local groups

Overall I like the general idea, but I'm worried about the execution.

What do you imagine as a worst-case failure scenario? I can easily imagine various viral headlines like

  • I applied for EA career advice and the advisor recommended I donate a kidney! Scary!
  • EA coach made unwelcome sexual advances
  • EA advisor tried to recruit me for his (dubious investment scheme / crypto startup / project to save the world by using psychedelics a lot / ...)

Comment by jan_kulveit on Sharing my experience on the EA forum · 2019-03-19T13:07:23.786Z · score: 24 (9 votes) · EA · GW

Please try to not take the negative feedback personally. I hope it will not discourage you from contributing in the future.

My best guess about what happened with your previous post is that a lot of people either disliked the proposal, or disliked the fact that you seemed set on creating something "hard to abandon" without first seeking much input from the community. Downvoting the post is a cheap way to express such a feeling.

I agree that if people collectively downvote, in particular strong-downvote, without explaining why, the result of the collective action is bad. (On the other hand, it is easy to see why: explanations are costly. I explained why I don't like the proposal, but that may mean I bear more of the social costs of disagreement or conflict.)

Comment by jan_kulveit on Concept: EA Donor List. To enable EAs that are starting new projects to find seed donors, especially for people that aren’t well connected · 2019-03-19T02:20:37.916Z · score: 42 (16 votes) · EA · GW

I'm in favour of addressing this coordination problem, but I think this particular solution is a bad idea and should not be started. The main problem is the unilateralist's curse. Suppose there is a really bad project which 19 out of 20 altruistically minded funders (and every professional grantmaker) would decline to fund. Your design of the structure would make it much more likely that it will get funded.

In general, effective altruism has a lot of value in its brand, goodwill, and epistemic standards/culture (on the order of billions of dollars). It seems relatively easy to create a large negative impact by destroying part of this, which can be "achieved" even by a relatively small project with modest funding. A public donor list seems to be literally the worst structural option if we want to avoid bad projects.

Comment by jan_kulveit on Getting People Excited About More EA Careers: A New Community Building Challenge · 2019-03-10T18:08:25.084Z · score: 13 (6 votes) · EA · GW

To clarify the concern: I'm generally not so much worried about how you use it internally as about other people using the metric. That was probably not clear from my comment.

I understand it was probably never intended as something which others should use either for guiding their decisions or for evaluating their efforts.

Comment by jan_kulveit on Getting People Excited About More EA Careers: A New Community Building Challenge · 2019-03-10T16:05:48.501Z · score: 13 (9 votes) · EA · GW

Ultimately, the more you ground the metric in "what some sensible people think is important and makes sense right now", the more nuance it has, and the more it tracks reality. The text in my quote is a verbatim copy from the 2015 page describing the metric, so I think it's highly relevant for understanding how IASPCs were understood. I agree that the 80k career guide as a whole has much more nuance, and suggests approaches like "figure out what will be needed in the future and prepare for that".

The whole accounting still seems wrong: by definition, what's counted is "... caused them to change the career path they intend to pursue ..."; this is still several steps away from impact. If someone changes their intention to pursue jobs in EA orgs, it is counted as impact, even if the fraction of people making such plans who will succeed is low.

For specificity: would you agree that someone who was 2 years away from graduation in 2016 and decided to change their career plan to pursuing a job at CEA would have been counted as impact 10, while someone switching from a plan to go into industry to pursuing a PhD in econ would have been counted as 1, and someone deciding to stay in, let's say, cognitive neuroscience would have been counted as 0?

Comment by jan_kulveit on Suggestions for developing national-level effective altruism organizations · 2019-03-10T11:53:19.122Z · score: 3 (2 votes) · EA · GW

Here by hierarchy I mean a strictly tree-like flow of information, where the centre collects the inputs, decides, and sends commands. By fully distributed I mean "everybody talking to everybody" (a fully connected network) or a random network (random pairs of orgs talking to each other). A rough sketch of how the number of communication links differs between these structures is below.
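(The following is an illustrative sketch added for clarity, not part of the original comment thread: it just counts communication links, assuming n organisations, a tree for the hierarchy, and one channel per unordered pair for the fully connected case.)

```python
def tree_links(n: int) -> int:
    # A strict hierarchy (tree) over n organisations has exactly n - 1 information
    # channels, each connecting a node to its parent.
    return n - 1

def fully_connected_links(n: int) -> int:
    # "Everybody talking to everybody": one channel per unordered pair of organisations.
    return n * (n - 1) // 2

if __name__ == "__main__":
    for n in (5, 20, 100):
        print(f"n={n}: tree={tree_links(n)}, fully connected={fully_connected_links(n)}")
```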

Comment by jan_kulveit on Getting People Excited About More EA Careers: A New Community Building Challenge · 2019-03-10T11:08:48.035Z · score: 32 (21 votes) · EA · GW

Part of this is caused by the (over)use of a metric called impact-adjusted significant career plan changes. In a way, you get exactly what you optimise for. Quoting from the 80k website:

A typical plan change scored 10 is someone who, in large part due to us, switched to working at a highly effective organisation like GiveWell, became a major donor (>$100k/year) to effective organisations, or become a major advocate of effective causes.
A typical plan change scored 1 is someone who has taken the Giving What We Can pledge or decided to earn to give in a medium income career. We also award 1s to people who want to work on the most pressing problems and who switch to build better career capital in order to do this, for instance doing quantitative grad studies or pursuing consulting; people who have become much more involved in the effective altruism community in a way that has changed their career, and people who switch into policy or research in pressing problem areas.
A typical plan change scored 0.1 is someone shifting to gain better career capital but where they’re less obviously focused on the most pressing problems, or where they’ve switched into an option that is less obviously higher impact than what they were planning before.

Scoring the options you recommend:

  • Skills building in non-EA organisations such as start-ups ... scores either 0.1 or 1, i.e. 10x or 100x less valuable than changing one's plan to work at GiveWell
  • Earning to give ... scores 10x less valuable
  • Speculative options ... 10x-100x less valuable

It's worth emphasising that the metric being optimised is changing plans. How the difference between someone actually switching and someone merely switching their plan is handled is likely inconsistent across places and orgs.

Taken literally, the best thing for a large number of people under this metric is to switch their plan to working for OpenPhil, and to consider other options as failure.

Taken literally, the best thing for student group community builders to do is to convince everyone to switch plans in this way, and count that as success.

So it is not a bias. Quite the opposite: it is a very literal interpretation of the objective function, which was explicitly specified several years ago.
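For concreteness, here is a rough sketch (my own illustration using the weights quoted above, not 80k's actual accounting) of how the literal accounting plays out:

```python
# Weights from the quoted 2015 description of the plan-change metric.
WEIGHTS = {
    "top_EA_org_or_major_donor": 10,         # e.g. switching to work at an org like GiveWell
    "pledge_or_relevant_career_capital": 1,  # e.g. GWWC pledge, quantitative grad studies
    "less_focused_career_capital": 0.1,      # e.g. skills building in a non-EA start-up
}

def total_score(plan_changes):
    """Sum the impact-adjusted score over a list of recorded plan changes."""
    return sum(WEIGHTS[c] for c in plan_changes)

# Under the literal metric, one person planning to join a top EA org outweighs
# nine people taking the pledge or building relevant career capital, and roughly
# fifty people building less focused career capital.
print(total_score(["top_EA_org_or_major_donor"]))             # 10
print(total_score(["pledge_or_relevant_career_capital"] * 9))  # 9
print(total_score(["less_focused_career_capital"] * 50))       # ~5.0 (floating-point sum)
```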

The meta-level point people should take from this is:

  • If you are in a position of influence, you should be super careful before you introduce anything like a quantitative metric into EA culture. EAs love measuring impact, are optimisers, and will Goodhart hard.

Comment by jan_kulveit on EA is vetting-constrained · 2019-03-09T02:52:39.605Z · score: 27 (12 votes) · EA · GW

I've been intermittently working for several months on a project to provide more scalable and higher-quality feedback on project proposals. The first alpha-stage test should start on a time horizon of weeks, and I'll likely post the draft of the proposal soon.

Very rough reply: the bottleneck is a combination of both of the factors you mention, but the most constrained part of the system is actually something like the time of senior people with domain expertise and good judgement (as far as we are discussing projects oriented on long-term, meta, AI alignment, and similar topics). Adding people to the funding organisations would help a bit, but less than you would expect: the problem is that when evaluating, e.g., a somewhat meta-oriented startup also trying to do something about AI alignment, as a grantmaker you often do not have the domain experience, and need to ask domain experts, and sometimes macrostrategy experts. (If the proposal is sufficiently ambitious or complex or both, even junior domain experts would be hesitant to endorse it.) Unfortunately, the number of people with final authority is small, their time is precious, and they are often very busy with other work.

edit: To gesture toward the solution: the main thing the proposed system will try to do is "amplify" the precious experts. For some ideas on how this can be done, see Ozzie's posts; other ideas can be ported from academic peer review, and others from anything-via-debate.

[meta: I'm curious, why was this posted anonymously?]

Comment by jan_kulveit on You Have Four Words · 2019-03-07T23:25:53.036Z · score: 6 (4 votes) · EA · GW

I'll probably refer people to this post when trying to explain why you totally need complex networks when you are trying to coordinate on anything more complicated than what you can express in 4 words.

(Also: one friend pointed out that the word hierarchy comes from an organisation coordinating the effort of more than a billion people across long time horizons.)

Comment by jan_kulveit on What to do with people? · 2019-03-07T21:25:38.288Z · score: 4 (2 votes) · EA · GW

My rough understanding:

To some extent the ideas now seem to be "in the water". The maths part is now more developed under the study of complex networks. Alexander's general ideas about design inspired people to create wikis, the patterns-in-software movement, to some extent object-oriented programming and extreme programming, and some urbanists... which motivated me to read more of him.

(Btw, in another response here I pointed to Wikipedia as a project with some interesting social technology behind it. So it's probably worth noting that a lot of that social technology was originally created/thought about at wikis like Meatball and the original WikiWiki by Ward Cunningham, who was in turn inspired by Alexander.)

Comment by jan_kulveit on What to do with people? · 2019-03-07T12:32:59.515Z · score: 4 (5 votes) · EA · GW

In my view, without all the hierarchy stuff it is harder to see what to create, start, manage, and delegate. I would be significantly more worried about the meme of "just go & do things & manage others" spreading than about the meme of "figure out how to grow the structure".