Posts

AI safety and consciousness research: A brainstorm 2023-03-15T14:33:42.240Z
A new place to discuss cognitive science, ethics and human alignment 2022-11-04T14:34:13.625Z
Exploratory survey on psychology of AI risk perception 2022-08-02T20:34:27.000Z
Daniel_Friedrich's Shortform 2021-12-06T22:35:36.528Z

Comments

Comment by Daniel_Friedrich on Daniel_Friedrich's Shortform · 2023-02-26T11:52:09.953Z · EA · GW

I got access to Bing Chat. It seems:
- It only searches through archived versions of websites (it doesn't retrieve today's news articles, and it accessed an older version of my Wikipedia user page)
- When archiving, it only captures the content one can see without any interaction with the website (tested with Reddit "see spoiler" buttons, which reveal new content in the page code; it could retrieve info from posts that got less attention but weren't hidden behind a spoiler button)
I.e. it's still in a box of sorts, unless it's much more intelligent than it pretends to be.

Edit: A recent ACX post argues that text-predicting oracles might be safer, as their ability to form goals is super limited, but it offers two models of how even they could be dangerous: by simulating an agent, or via a human who decides to take bad advice like "run the paperclip maximizer code". Scott implies that thinking it would spontaneously form goals is extreme, linking a post by Veedrac. The best argument there seems to be that it only has memory equivalent to about 10 human seconds. I find this convincing for the current models, but it also seems to limit the intelligence of these systems, so I'm afraid that for future models, the incentives are aligned with reducing this safety valve.

Comment by Daniel_Friedrich on What AI Take-Over Movies or Books Will Scare Me Into Taking AI Seriously? · 2023-01-10T20:49:51.013Z · EA · GW

For me, the easiest-to-imagine model of what an AI takeover could look like is depicted in Black Mirror: Shut Up and Dance (the episodes are fully independent stories). It's probably just meant to show the scary things humans can do with current technology, but such schemes could be trivial for a superintelligence with future technology.

Comment by Daniel_Friedrich on EA Forum feature suggestion thread · 2022-08-08T11:09:31.874Z · EA · GW

It would be nice to be able to filter these by date.

Comment by Daniel_Friedrich on 2021 EA Mental Health Survey Results · 2022-05-03T16:22:40.756Z · EA · GW

I'd love to see a deeper inquiry into which problems of EAs are most effectively reduced by which interventions. The suggestion that there's a lack of "skilled therapists used to working with intelligent, introspective clients" is a significant novel consideration for me as an aspiring psychologist, and this kind of hybrid research could help me calibrate my intuitions.

Comment by Daniel_Friedrich on Seeking Survey Responses - Attitudes Towards AI risks · 2022-03-28T18:43:42.484Z · EA · GW

When coming up with a similar project,* I thought the first step should be to conduct exploratory interviews with EAs to reveal their hypotheses about the psychological factors that may go into one's decision to take AI safety seriously. My guess would be that ideological orientation would explain the most variance.

*which I most likely (98%) won't carry out
Edit: My project has been accepted for the CHERI summer research program, so I'll keep you posted!

Comment by Daniel_Friedrich on Prioritization Research for Advancing Wisdom and Intelligence · 2021-10-20T21:39:27.787Z · EA · GW

The core idea sounds very interesting: increasing rationality likely has generalizable effects, so having a measure could help evaluate wider social outreach causes.

Defining intelligence could be an AI-complete problem, but I think the problem is complicated enough even as a simple factor analysis (i.e. even without knowing what we're talking about :). Estimating impact once we know the increase in some measure of rationality seems like the easier part of the problem - for example, if we know how much promoting long-termist thinking increases support for AI regulation, we're only a few steps from getting a QALY figure. The harder part for people starting out in social outreach might be estimating how many people they can get on board with thinking more long-termistically through their specific intervention.
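To make the "few steps" concrete, here is a minimal Fermi sketch of that chain - the reach, attitude shift, policy effect, and QALY figures are all hypothetical assumptions chosen only to show the structure of the calculation, not real estimates:

```python
# Minimal Fermi sketch: from an attitude-change estimate to a rough QALY figure.
# All numbers below are hypothetical assumptions, not real estimates.

people_reached = 100_000          # people exposed to the outreach intervention (assumed)
attitude_shift = 0.02             # fraction who newly support AI regulation (assumed)
support_to_policy = 1e-6          # marginal effect of one extra supporter on the
                                  # probability that a given safety policy passes (assumed)
qalys_if_policy_passes = 1e7      # expected QALYs saved if that policy passes (assumed)

new_supporters = people_reached * attitude_shift
policy_probability_gain = new_supporters * support_to_policy
expected_qalys = policy_probability_gain * qalys_if_policy_passes

print(f"New supporters: {new_supporters:,.0f}")
print(f"Increase in policy probability: {policy_probability_gain:.4f}")
print(f"Expected QALYs: {expected_qalys:,.0f}")
# New supporters: 2,000
# Increase in policy probability: 0.0020
# Expected QALYs: 20,000
```

Each factor in a chain like this would of course need its own evidence, which is where reference points from existing attempts (below) would come in.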
So I think it might be very useful to put together a list of all attempts to calculate the impact of various social outreach strategies, so that anyone considering a new one could find some reference points - the hardest estimates here also seem to be the most important (e.g. the probability that Robert Wright would decrease excessive suspicion between powers). My intuition tells me that differences in attitudes are something intuition can predict quite well, so the wisdom of the crowd could work well here.
The best source I found when I searched for recent attempts to put changing society into numbers is this article by The Sentience Institute.

Also, this post adds some evidence-based intervention suggestions to your list.

Comment by Daniel_Friedrich on AMA: Jason Brennan, author of "Against Democracy" and creator of a Georgetown course on EA · 2021-08-18T14:16:13.781Z · EA · GW

What can an EA academic do to improve the incentives on the research side of academia? To help reward quality, or even positive impact?