Privacy as a Blind Spot: Are There Long Term Harms in Using Facebook, Google, Slack etc.?
post by florian-z · 2021-01-16T17:15:49.387Z · EA · GW · 18 comments
A big thank you to Birte Spekker [EA · GW], Alexander Herwix [EA · GW], Achim Voss, Florian Jehn [EA · GW], Manuel Allgaier [EA · GW], Peter Ruschhaupt [EA · GW], Rudi Zeidler, Ruslan Krenzler for commenting on drafts of this post.
Epistemic Status
Pretty convinced, but with an uneasy feeling that I couldn't pass an ideological Turing test. The list of threats is tentative, only meant for illustration, and probably has blind spots.
In this post, I will argue that for many EA cause areas, having secure communication and collaboration platforms is an instrumental goal; that privacy, security and safety aspects are currently undervalued by the EA community, with too much weight given to how widespread and easy to use these platforms are; and that self-hosted open-source collaboration tools are a good default alternative to the proprietary cloud services provided by tech companies.
Which platforms to use is a hard-to-reverse decision in which we are trading off higher growth in the short term (1-5 years) against risks that might appear in the next 5 to 50 years.
Over longer time-spans similar trade-offs along these instrumental values may arise. Building a good model of the risks from a lack of privacy & security seems valuable.
From talking to different people in the community, I got the impression that quite a few people fall into one of the two following categories:
- feeling a diffuse sense of discomfort about giving up their private information, but being unable to really justify where to draw the line.
- feeling held back by others' privacy demands, while seeing little to no risk from giving away the information in question.
I'm hoping to provide both groups with a clearer picture of the risks of lacking data privacy, so that group 2 can gain more understanding of the concerns and feel less held back, and group 1 can improve their mental model of privacy and feel safer.
I'm concerning myself in this post mainly with internal communication and collaboration tools. Threats arising from debating EA topics in public spaces tied to one's main online identity are out of scope for this post.
In the context of this discussion I will use the terms as follows:
privacy: protection of personal or confidential information from leaking out.
security: protection against deliberate attacks.
safety: protection against unintentional data loss.
data loss: no one has the data any more.
data leak: someone who shouldn't have the data, has them. This is generally non-reversible.
personally identifiable information (PII): information that, when used alone or with other relevant data, can identify an individual.
collaboration & communication platforms: Software tools that allow groups to work together. Concrete examples are: Slack, WhatsApp, Facebook, G-Suite/GoogleDocs, Mattermost, GitHub, Asana, Trello, etc.
self-hosted (software): Software and online services that are running on computers that we control, in the sense that no actors can access the stored data without our knowledge and consent.
Why work on privacy & security now?
My general argument goes as follows
- We want to build a movement that exists for hundreds of years.
- The political landscape might change and with it the types of adversaries we might face.
- The capabilities of these (future) adversaries should inform our strategy on operational security. Especially what tools we use, and where we store our internal data.
a) Internal conversations and documents stored on services controlled by third parties may become accessible to future adversaries.
b) Groups and organisations should do some threat modelling before choosing tools, deciding where to store their data and how to protect it.
c) Groups and orgs need to actually have options adequate to their threat landscape to choose from.
These lead me to believe that it would be quite useful to provide self-hosted options for collaboration software for EA groups and organisations who may lack the resources and technical knowledge to set them up themselves.
General prescriptions for what tools each EA org should use don't seem useful, each one knows their own requirements and threat landscape best.
I argue that we should investigate and document the concrete drawbacks and advantages of self-hosted open-source alternatives to proprietary collaboration platforms offered by companies. Some people might overestimate the associated costs.
Actively seeking out secure defaults, keeping them available as options, and setting them up can help give EA orgs an informed choice when it comes to information security.
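The threat-modelling step mentioned in (b) above can be made concrete with a very small sketch. All assets, adversaries and 1-5 scores below are hypothetical placeholders for illustration, not real assessments:

```python
# Illustrative sketch of a minimal threat-modelling pass for tool selection.
# Assets, adversaries and scores are made-up examples, not real estimates.
from dataclasses import dataclass

@dataclass
class Threat:
    asset: str          # what we want to protect
    adversary: str      # who might go after it
    likelihood: int     # 1 (rare) .. 5 (expected), rough estimate
    impact: int         # 1 (nuisance) .. 5 (catastrophic)

    @property
    def risk(self) -> int:
        # Coarse risk score: likelihood times impact.
        return self.likelihood * self.impact

threats = [
    Threat("member contact list", "data broker / leak", 3, 3),
    Threat("internal strategy docs", "hostile state actor", 2, 5),
    Threat("chat metadata (who talks to whom)", "automated surveillance", 4, 4),
]

# Rank threats so mitigation effort goes to the highest-risk items first.
for t in sorted(threats, key=lambda t: t.risk, reverse=True):
    print(f"{t.risk:>2}  {t.asset} <- {t.adversary}")
```

Even a table this small forces a group to name its assets and adversaries before debating which platform to use.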
Secondary benefits are:
- Skills and means in information security will probably be valuable in several global catastrophic risk cause areas, and working on projects that provide secure tools for EA groups and organizations can help EAs build info-sec skills together.
- Managing compliance with respect to privacy regulation is easier with self-hosted platforms.
Threats change gradually
Privacy and security should be evaluated not only against today's threat landscape, but also against future changes, weighted by their probability and their respective potential for harm.
These changes can be very gradual, so people will disagree about when the point is reached where discussions on a platform are no longer safe. Building consensus to switch might be hard, and will likely be a lot harder in 5 years' time when the community has grown.
Do we have enemies? – Threat modelling
In this section I want to illustrate some adversaries and threats that we might face in the near term future, and what their capabilities could be.
1 Threats from state actors / governments
EAs will probably be discussing topics that some states want to censor. They might also develop ideas that go directly against the interest of actors that are in power, or that some states consider illegal.
- The idea of building global cooperation on AI research might run counter to a state's interest in "winning" the race to artificial general intelligence.
- Ideas implying the overthrow of some totalitarian regime, direct opposition to some political party, or opposition to a powerful malevolent individual [EA · GW] (e.g. to preserve liberal democracy).
- Lobbying for new voting systems that would reduce established parties' chances of winning
Even if EAs do not openly oppose these actors, they might still funnel resources to interventions against these actors' interests.
Given current and near-future capabilities, I would argue that data stored on cloud services provided by large companies should be assumed to either become accessible to automated surveillance over the next 50 years, or the corresponding company will be forced to stop serving the jurisdiction.
Concrete threats:
Security & Privacy: Surveillance – collection of contents as well as meta-data (i.e. who has been talking to whom) used for:
- Automatic tracking and tagging of users by interests, leading to people appearing on an intelligence agency's lists of dissidents/separatists/terrorists (terminology varies, but in general "enemies of the state")
- Imprisonment, torture, killings
- Removal of content
- Services banning users from certain countries
- Governments banning services in their country
2 Threats from non-governmental organisations
In some cause areas, EA might be perceived to be opposing the interests of a non-state actor (a group of people with some amount of power). These could be:
- Industry (for example cigarettes, animal products)
- Charities that lose funding due to EA ideas about effective interventions spreading
- Political movements that come to see EAs as the enemy (e.g. right-wing extremists might see EAs as open-border globalists, left-wing extremists might see EAs as neo-liberals)
- Companies working on advanced AI may see EA lobby efforts (for example windfall clauses) as a danger to their profits
Capabilities:
- Significant resources to be spent on efforts against EA
- Acquisition of smaller companies that provide discussion platforms
Concrete threats arising from this are:
- Smear campaigns, astroturfing
- Death threats [EA · GW]
- Legal action
- Compilation of kill lists
Can we reduce these threats?
By involving fewer third parties (that could become compromised), and retaining control over our internal information we can prevent or reduce the impact of most of these risks.
Information hazards
People who deal with topics that contain information hazards [? · GW] probably (hopefully?) know where not to post what kind of information. But sometimes things that seem innocent in the beginning might turn out to be information hazards in retrospect, or lead down a dangerous path.
Having full control over the servers on which our conversations happen is very useful in this case, and makes containing such hazards easier.
Arguments for using tech company platforms
There are clear benefits to services provided by Facebook, Google, Slack, etc.:
- They are constantly improving, and try to make working with them as simple and comfortable as possible. This leads to an improved user experience that open source tools often only imitate months to years later.
- Sharing data and integrating workflows are easy as long as you stay within the systems of the given provider (e.g. Google Forms -> Google Sheets, using Facebook data about users' likes and group memberships to run targeted ads, etc.)
- Due to the mentioned network effects that these companies strive for, many people already have experience with the tools, because some other organisation in their life already uses them.
I agree with these statements, but I do not value these benefits highly enough to accept the increased vulnerabilities that I laid out above.
The following are arguments against my valuation that have not yet convinced me.
I'm trying not to straw-man here, but I have an uneasy feeling that I may not represent the counterarguments faithfully. Please point out any flaws in my reasoning, and help me better understand these stances.
Side note: these are laid out in a frequently asked questions style with lots of highlighted keywords to make them more skimmable.
Everybody uses it and is fine with it
It could be argued that a large portion of the EA community is already using these services, and is therefore fine with the trade-off. This could be an example of Revealed Preferences – people saying they value privacy but then not acting accordingly.
Do we know enough about peoples' motivations and reasons to conclude this? I think there are a few other possible explanations why we would see widespread use in the Community, even if the privacy argument is fully valid:
- network effects are real and positive when you're part of the network (see lock in[6:1])
- founder effects may lead to less diversity in communication and collaboration tools (initial choices get scaled up)
- preferences seem more complicated where privacy is concerned: there are experiments pointing towards a kind of strategic information avoidance.
- Acemoglu et al argue that the price of my personal data may be artificially depressed because of what others with shared characteristics have already revealed about themselves, creating the impression that users do not value their privacy much.
- selection bias – we might have unintentionally already excluded the people who are most uncomfortable with having their PII on those services.
- familiarity and no motivation to learn a different tool. "My company uses Slack. I want to use Slack for EA too."
- Lack of security knowledge and experience.
Independent of the merit of the arguments for privacy, concern for privacy is widespread among academics in some countries. It might make sense for some national groups to offer privacy-preserving infrastructure even if the worry were completely unfounded, simply to appeal to people who happen to hold these beliefs.
As EA becomes a more global community, prioritizing privacy more also helps protect visitors from more hostile places, who might not be able to use platforms that are restricted in their country.
People will leave / not become involved because of the bad user experience of the self-hosted solutions
The argument here is that people will more gladly use a polished interface: if it makes the tasks they want to achieve easier, people will do them more regularly and/or more people will participate.
I agree that the ease of use of the interface matters on the margin, but far less than the central motivating factors, i.e. what motivates people to be involved in the first place and how valuable/engaging the content is.
All else being equal, one should strive to improve user experience, but I think maintaining control over the internal information should be valued higher than user experience.
It takes too much maintenance time and effort
It costs too much valuable EA-person-time to maintain these servers. Also we might not have enough people with the skills to do that.
EA is a community that has been criticized for its skew towards tech people, and that also has trouble finding enough direct work for all its talented people.
To convince me that this is a severe problem, I would need to be convinced that the makeup of the EA community will change drastically in the next years, while at the same time not enough financial resources become available to pay professional sysadmins, or at least smaller IT companies that offer managed hosting of open-source software at reasonable prices.
People will know when to switch if they have something to hide
Again, I fear the process is gradual, in a "boiling the frog" kind of sense[6:2]. Evaluating this regularly also has a mental overhead cost: if you have a paper shredder, it can be more effective to just put all discarded documents through it instead of asking yourself every time "is this confidential enough to shred?".
What if only a minority of EAs is threatened? Won't they just leave before the rest can agree to move to a different platform?
And if it's not a gradual change but a sudden change (for example a platform being banned), would we be able to get all our data out of the old platform? How much time and resources would it cost us to get everyone on a new platform during whatever crisis that forced us to switch?
Hey, I know the people who work in/built that company, they would oppose anything nefarious happening
Arguments for this would include the Project Maven scandal, which led Google to announce new principles on what AI applications they build. (Or maybe just to hide their investments in this sector under a different Alphabet subsidiary?)
A lot of things have to go right for this kind of internal control to work:
- awareness – knowledge of the project/problem has to spread, employees have to be aware of security issues
- culture – the "nefarious thing happening" has to be commonly understood as bad
- employee incentives – lots of other job opportunities, not easily replaceable
- large enough company to resist acquisition by others with different ethical standards
Note that you also need to be sure that these factors will all persist in the future.
There are just as many counterexamples where this did not work.
In all these cases, once the damage is done, there is no way back.[1:1]
In a recent Future of Life Institute podcast, Mohamed Abdalla makes the argument that companies have a huge incentive to create an image of social responsibility. I would therefore expect it to be quite hard to evaluate from the outside if the conditions for internal self-regulation are in place and actually working.
We are facing a principal-agent problem here, where we are trying to verify that the incentives of the agent we put in charge of our infrastructure are aligned with our goals.
At the least we would have to do a detailed evaluation of these companies. Frameworks to evaluate a company's privacy orientation exist, but even companies that try to differentiate themselves on high privacy standards can still suffer security breaches[12:1] that are almost impossible to anticipate from the outside.
You still have to solve the same security issues, but with less resources
Yes, you have to secure each server that you set up, which needs expertise. And even open-source software can only be trusted insofar as you trust the people who wrote it. But you are a) a smaller target and b) in control of all future security measures.
a) target size & attack surface
The argument that large companies can better defend customers data against attacks than the customers could by themselves is sometimes called "death star logic":
"The effort behind a cyber attack is proportional to the value of breaching the system. As the value of the data held […] increases, so does the scale of its cyber threats."
In a big company, the complexity of the infrastructure increases by a lot, which also gives you a larger attack surface. The supposed advantage of a big company's resources can often be outweighed by a highly complex infrastructure that leads to leaks, security problems and outages despite all efforts[12:2].
With big companies, there is a far larger incentive for attackers to find a breach[15:1] than with some small community server with maybe 10,000 irregular users.
Small self-hosted servers will simply not attract the same attention.
b) having control
With your own servers, you can scale up security measures as the data you collaborate on or share becomes more sensitive, or if the adversarial environment changes.
Let's say a breach happens on your server. Then you have the means to actually investigate and assess the harm done, and to verify afterwards that the breach cannot happen again. As a customer of a company, you only have press releases like this to go on:
"Our investigation found that usernames and passwords recently stolen from other websites were used to sign in to a small number of Dropbox accounts. […] We’re sorry about this, and have put additional controls in place to help make sure it doesn’t happen again."
My argument here is that it would be easier to vet a few system administrators every few years, than the inner workings of a company.
Depending on how the threat landscape develops, this might even become its own effort. Maybe at some point we decide to build an EA ops-sec organisation that provides security consulting for other EA organisations.
Self-hosting does not solve the surveillance problem
Yes, your cloud provider could also be forced to give a state access, or an intelligence agency might exploit a software bug. Even warrant canaries probably don't help.
Fine-grained surveillance efforts are a lot more costly to state actors, and won't be used as routinely. Only high-value targets get special treatment from intelligence services.
Meanwhile on Facebook, Google, Slack, Office 365, etc., one should assume automated snooping efforts at least by Five Eyes intelligence services.
Should it be revealed that your provider is no longer safe, it is easy to just move your virtual server to a different jurisdiction, probably without the users even noticing.
You're overvaluing this because of your own availability heuristic
There is some evidence for this: many of these concepts may be more present to me because I used to work with Colombian human rights defenders who frequently were targets of surveillance, threats and violence. I saw many examples of people ignoring good operational security practices due to convenience and a lack of secure defaults. Also, setting up servers for collaboration tools is a big part of my day job, so I perceive it as less complicated than the average person does.
So yes, maybe I am just used to doing security analysis in this adversarial way, and maybe I see fewer drawbacks to self-hosting.
To convince me that I am overvaluing this, I would need to see that the increased community growth from using more widespread software with a marginally better user interface/experience largely outweighs the expected long-term harm to the movement from adversaries gaining access to our internal information.
Prediction Time!
(Sadly I could not figure out how to directly embed these using the markdown editor, and the regular editor breaks my footnotes.)
Will internal communications of an EA organisation be part of a data leak by 2040?
Conditional on all EA community building Slack Workspaces being replaced with Mattermost instances, would user participation in those spaces drop by more than 5%?
Will an EA aligned organisation become target of directed government repression by 2040? (for example withdrawal of operating license, denial of work visas, etc.)
What I would propose
When choosing tools for collaboration and communication, I'd recommend the following defaults:
- collect the least amount of data that still allows us to fulfill our goals
- give no data to third parties, because doing so may be irreversible
When selecting tools we should justify why we deviate from these defaults. (I expect it to be justified in many cases.)
What kind of threat analysis can we do?
Ultimately the depth of analysis is up to each organisation. A useful toolbox for risk assessment in this area is "How to Measure Anything in Cybersecurity Risk" by Douglas W. Hubbard and Richard Seiersen. They also provide example spreadsheets on the book's website.
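Hubbard and Seiersen's core move can be sketched in a few lines of standard-library Python: a Monte Carlo simulation in which each threat event has an annual probability and a 90% confidence interval for the loss if it occurs. All numbers below are invented placeholders, not real estimates for any EA org:

```python
# Minimal Monte Carlo risk model in the style of Hubbard & Seiersen:
# each event has an annual probability and a 90% CI for its loss.
# All figures are made-up placeholders for illustration only.
import math
import random

def lognormal_from_ci(low, high):
    # Convert a 90% confidence interval into lognormal parameters
    # (1.645 is the z-score for the 5th/95th percentiles).
    mu = (math.log(low) + math.log(high)) / 2
    sigma = (math.log(high) - math.log(low)) / (2 * 1.645)
    return mu, sigma

events = [
    # (annual probability, 90% CI lower loss, 90% CI upper loss)
    (0.05, 1_000, 50_000),  # e.g. a leak of member data
    (0.10, 500, 10_000),    # e.g. forced migration off a banned platform
]

random.seed(0)  # reproducible runs for this sketch
trials = 10_000
losses = []
for _ in range(trials):
    total = 0.0
    for p, low, high in events:
        if random.random() < p:       # did the event occur this year?
            mu, sigma = lognormal_from_ci(low, high)
            total += random.lognormvariate(mu, sigma)
    losses.append(total)

losses.sort()
print("mean annual loss:", sum(losses) / trials)
print("95th percentile :", losses[int(0.95 * trials)])
```

Sorting the simulated losses gives a loss-exceedance curve, which is a more defensible basis for "is self-hosting worth the effort?" than gut feeling.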
Systems-mapping is also a useful starting point to identify relevant actors and possible vulnerabilities.
As threat levels increase, other tools from the operational security toolbox (such as red-teaming) can be useful.
TL;DR plz
We may not have anything to hide yet, but we should take precautions now, because in the near-term future many of us probably will.
Footnotes
Summary: Some strategic decisions available to the effective altruism movement may be difficult to reverse. One example is making the movement’s brand explicitly political. Another is growing large. Under high uncertainty, there is often reason to avoid or delay such hard-to-reverse decisions.
This is connected to the free-speech/"cancel-culture" debate, as well as more general arguments on epistemic security and possible detrimental effects of social media. There is a lot to think and write about there regarding possible risks to the EA movement that arise in these areas. ↩︎
In this post, we summarize why we think information security (preventing unauthorized users, such as hackers, from accessing or altering information) may be an impactful career path for some people who are focused on reducing global catastrophic risks (GCRs).
Non-compliance gives all types of adversaries more ammunition to use against an organisation.
In the context of this post privacy regulation such as the General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA) is the most relevant. The argument is the same for other regulations (employment contracts, tax codes etc.).
By giving adversaries ammunition I mean leaving yourself more vulnerable than necessary to some types of attack – as Colombians say "dar papaya".
Concrete threats from non-compliance
- a government closes an EA org / takes away their tax exemption status due to non-compliance with local privacy regulations
- an activist group could organize a campaign of bombarding EA orgs with GDPR information/deletion requests (Data Subject Access Requests)
- a disgruntled ex-employee could tie up large amounts of time by taking legal action against an EA organisation that has not taken any measures to comply with privacy laws.
But Lots of startups don't care about compliance and they're doing just fine!
Yep, it would be a mistake for most of them to prioritize compliance, because it would slow their efforts to find their niche as quickly as possible and then grow as big as possible. Their incentives are geared towards extreme short-term growth, to either reach a monopoly in their market (as promoted by Peter Thiel and other venture capitalists) or get acquired. The risk of getting shut down over non-compliance is just a tiny category among the many risks of failure that a startup faces.
I'd argue that the EA movement should go for a slower and more deliberate kind of growth. Therefore it has less to gain and more to lose by ignoring compliance. ↩︎
In its November 2020 edition, The Economist had an article outlining how protectionism might move the global tech sphere towards a "Splinternet". Over a longer 10-50 year time-frame, even more drastic shifts seem possible, not only in state control but also in online discussion culture (less than 15 years ago there was no Twitter).
The article outlines how countries with different regulatory platforms serve as a kind of legal "geography" for the geopolitical battle around influence/control over users and their data.
"America combines monopolies and a strongish state with lots of competition. Mainly thanks to this profitable amalgam, the country has given rise to most of the world’s leading tech firms. China is more like Apple and Oracle, which combine being closed with lots of internal competition. The European Union is best compared to an open-source project such as Linux, which needs complex rules to work. India, Japan, Britain, Taiwan and South Korea all run differently and have technology bases to match."
Switching costs and lock-in
A local group of 10 people can easily switch platforms as soon as they perceive a threat.
For larger community platforms this becomes more and more difficult, as the community grows. Especially if the threat only affects a minority of EAs (who live in a country where discussing X is dangerous / who work on a cause area in conflict with industry Y).
Venture-capital-backed companies in the communication/collaboration space have strong incentives to provide network effects while at the same time making it hard to switch to competitors. Network effects, economies of scale and high switching costs are a common combination of so-called moats (protections against competition) for businesses. Some current regulation might force them to make data available to their users, but they still decide which data, and how. Getting a 1000-page PDF of your Facebook activities does not help you much when you want to migrate your Facebook group conversations to a different service. ↩︎ ↩︎ ↩︎
Other movements did organize illegal activities on Google Docs. Julian Oliver on Extinction Rebellion:
"They just went straight to base camp. Google for sharing like things like contact lists. They didn't have anyone with technical, shall we say, know how or operational security intuition or interest to look at it any other way. So they just reach for what's at hand. The Action Network, too, hosted over in the United States Base camp, I mean, the extinction rebellion explicitly breaks base camps terms of service. You may not use the service or any illegal purpose. Well, civil disobedience is breaking the law. That's what it is."
The Totalitarian Threat by Bryan Caplan
[…] perhaps an eternity of totalitarianism would be worse than extinction.
Current examples of state capabilities:
- Complete bans of platforms in a country (Facebook, Google in China, TikTok & WeChat came close in the US, YouTube and many other sites in Turkey)
- Currently all companies doing business in China must provide means for state surveillance of their users, as well as censorship of content.
- Companies in the US must comply with National Security Letters, if the NSA has not found other means to simply intercept all communications of their users without a warrant.
- Services complying with government sanctions, banning users from third-party countries (for US-based companies currently: Iran, Cuba, Syria, North Korea and Crimea)
- Open surveillance of all internet traffic in Kazakhstan by forcing people to install government issued root SSL certificates
- Some in the European Union are working on surveillance laws to force companies to circumvent encryption, as well as restrictions on services from third party countries, see 5.1 in New Developments in Digital Services | Short-(2021), medium-(2025) and long-term (2030) perspectives and the implications for the Digital Services Act
"Like the Chinese firewall, this European internet would block off services that condone or support unlawful conduct from third party countries."
Why are Privacy Preferences Inconsistent? by Dan Svirsky
"Even people who would otherwise pay for privacy seem able to exploit strategic ignorance – keeping their head in the sand –and deal away their data for small amounts of money."
[…] the case for the importance of thinking about "Task Y", a way in which people who are interested in EA ideas can usefully help, without moving full time into an EA career. The most useful way in which I am now thinking about "Task Y" is as an answer to the question "What can I do to help?".
also: After one year of applying for EA jobs: It is really, really hard to get hired by an EA organisation [EA · GW] by pseudonymous EA Applicant ↩︎
Examples of companies where trusting founders/employees to "do the right thing" was not enough:
- Well before the news became public, several Facebook employees were probably aware that Cambridge Analytica was violating their terms of service.
- Equifax employees knew for years that their antiquated internal systems were insecure and failed to implement the security standards that regulations demand of similar financial institutions.
- I assume employees at Finnish psychotherapy provider Vastaamo did not intend to give all therapy transcripts to blackmailers.
- Keybase seemed like an extremely privacy-conscious company; still, they were acquired by Zoom, which has a very different image.
- Cloudflare leak leading to exposure of private information.
- 400M Microsoft accounts left unsecured.
- Microsoft also targeted in SolarWinds attack which compromised probably hundreds of companies and government offices.
- Bug in Slack would have allowed anyone to log in anyone's account.
- Dropbox temporarily allowed users to log into any account with any password.
- Hackers entered Target's networks through air-conditioning systems contractor.
- A full download of all (even deleted) posts from "free-speech"/right-wing twitter clone parler was possible.
- Stealing your private YouTube Videos one frame at a time
- Hundreds of other companies who lost control over their customer's data.
Company information privacy orientation: a conceptual framework, Greenaway et al. 2015 ↩︎
As outlined in Ken Thompson's Turing Award Lecture "Reflections on Trusting Trust" it is possible to hide malicious functionality even in open source code. ↩︎
Centralization of data and or control increases incentives for attackers:
- Death Star logic in multi-tenant systems
- SolarWinds was an extremely high-value target, because its network-monitoring software was used in so many companies and government agencies, which themselves were high-value targets.
i.e. enable end-to-end encryption, only allow access via a virtual private network (VPN), requiring all users to use second factor authentication (2FA), move to a more secure jurisdiction. ↩︎
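Of the measures listed here, TOTP-based 2FA is notably cheap to support on a self-hosted server. As a rough illustration (not any particular platform's implementation), RFC 6238 one-time codes can be generated with only the Python standard library; the base32 secret below is the RFC's published test key, not a real credential:

```python
# Sketch of TOTP (RFC 6238) code generation using only the stdlib.
# The secret used below is the RFC 6238 test vector key, not a real one.
import base64
import hmac
import struct
import time

def totp(secret_b32, t=None, digits=6, step=30):
    key = base64.b32decode(secret_b32)
    # Counter = number of 30-second steps since the Unix epoch.
    counter = int((time.time() if t is None else t) // step)
    msg = struct.pack(">Q", counter)
    digest = hmac.new(key, msg, "sha1").digest()
    # Dynamic truncation per RFC 4226.
    offset = digest[-1] & 0x0F
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ"))  # current 6-digit code
```

Both sides only need the shared secret and a roughly synchronized clock, which is why self-hosted tools like Mattermost can offer 2FA without any third-party service.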