Existing approaches to AI alignment (how do we get artificial intelligence to do what we want it to do), safety (how do we make sure we minimise economic, social, or political risk), and policy (how do we legislate around alignment, safety, and the use of training data) have either focussed on the technical challenges of AI or been guided by the utilitarian approach of many interested in the field.
This is understandable. AI development is inevitably rooted in Computer Science and Mathematics. If you think Artificial General Intelligence is possible, you probably think consciousness is solvable. Safety thus becomes a matter of mathematical proof, and ethics something that can be reasoned through to a definitive solution.
Here, then, we see the link between mathematical thinking and utilitarian ethics. Utilitarianism is not just the foundation for the behavioural science and computer science on which much modern ML relies (see, for example, reinforcement learning, Skinner boxes, and Turing machines) but also influences how many tech founders think about themselves and their role in the world.
Effective Altruism is one modern offshoot. It aims to think rationally about how people can maximise the amount of good in the world. This might take the form of choosing high-earning careers in order to donate 80% of the proceeds, or via altruistic kidney donation. On the other hand, it might also take the form of choosing to work in a field you believe has a good chance of creating either a catastrophically bad or unimaginably positive global outcome. Like, for example, Artificial Intelligence.
This is a fine and noble sentiment. EA-aligned research has made discussion of the risks of Artificial General Intelligence mainstream. However, this has also led to policy discussions being dominated by what is a narrow philosophical approach. Even worse, many are not even aware that this is one way of viewing the world among many; if you’re not aware you’re driving a car, you won’t know how to brake.
This reading list therefore seeks to approach AI Policy from a broader perspective. In particular, it seeks to make the utilitarian influence on modern AI development legible, and to highlight the philosophical assumptions so often taken for granted. AI policy as it currently stands rests on rickety foundations about our relation to the world, to computers, to society, and to ethics. Again and again, it takes philosophical opinion and historical contingency for scientific fact. If we wish to step back from the precipice of existential risk, we first have to use a lamp to find out how far we are from the edge and, perhaps, find other paths forward.
Benjamin Bratton
The Stack
https://direct.mit.edu/books/book/3504/The-StackOn-Software-and-Sovereignty
Discusses how planetary computation, and the minerals and resources it requires, has influenced contemporary geopolitics. Good to read alongside Smil, and good to prompt you to think about second and third order effects of compute.
Bruno Latour
Facing Gaia
http://www.bruno-latour.fr/node/693.html
Critiques the Kantian Copernican revolution on which modern science is based.
Plato
Ring of Gyges
https://www.perseus.tufts.edu/hopper/text?doc=urn:cts:greekLit:tlg0059.tlg030.perseus-eng1
Discussion of the role of the state vs internal morality.
Amitav Ghosh
The Great Derangement
https://press.uchicago.edu/ucp/books/book/chicago/G/bo22265507.html
How easy is it for humans to grasp deep, climatic time?
St Augustine
Confessions
https://stpauls.org.uk/confessions-4331.html
Specifically the bit about “I ate the pear and still I hungered”. Augustine writes about the discordance between human will and ordered goodness. If we want AI to predict human preferences, do we want to give more weight to what humans say they want, or to what they show they want?
Plato
Crito
A first description of social contract theory, what constitutes good governance &c. Takes the form of a dialogue between Socrates and Crito. By this point, Socrates has been found guilty of corrupting the youth and will be executed in a matter of days.
Robert Filmer
Patriarcha
https://pages.uoregon.edu/dluebke/301ModernEurope/FilmerPatriarcha1680.pdf
Defence of the divine right of Kings in the context of the English Civil War, influenced Hobbes and Locke
Joseph Butler
Fifteen Sermons
https://www.ccel.org/ccel/b/butler/sermons/cache/sermons.pdf
Argues against Hobbesian philosophical egoism.
Thomas Hobbes
Leviathan
https://global.oup.com/academic/product/leviathan-9780199537280?cc=gb&lang=en&
Argues humans are mechanistic, therefore self-interested, and therefore, without a sovereign’s total monopoly on violence, would be engaged in a war of all against all.
John Locke
Two Treatises of Government
https://www.yorku.ca/comninel/courses/3025pdf/Locke.pdf
First treatise attacks Filmer; second uses natural rights to build a liberal theory of government.
Mary Wollstonecraft
A Vindication of the Rights of Woman
This volume in particular, which places the Vindication alongside A Vindication of the Rights of Men and writings on the French Revolution, explores how far humans can be perfected via rational education. In other words, can humans be educated out of their baser instincts?
A Vindication of the Rights of Woman argues that women should receive a rational education (something that was not accepted even amongst the “liberal” sort of the 18th century).
Celeste Friend
Overview of the Social Contract Theory
Good introductory overview
David Hume
A Treatise of Human Nature
Classic statement of empiricism contra rationalism. Of particular interest is his “problem of induction” and how far we can use the past to predict the future.
John Locke
An Essay Concerning Human Understanding
Foundational text of modern empiricism, particularly interesting for its philosophy of language contra rationalism.
George Berkeley
Principles of Human Knowledge, Three Dialogues
Contra Locke; argues that ideas can only associate with ideas, that the external world is not physical, but rather ideas in the mind of God.
George Berkeley
De Motu
Contra Newton, Pro Einstein avant la lettre on space.
William MacAskill
What We Owe The Future
https://whatweowethefuture.com/uk/
Good intro to longtermist thinking.
John Stuart Mill
Utilitarianism, On Liberty
Theorises utilitarianism, then applies its principles to government
Franklin Perkins
Leibniz and China: A Commerce of Light
https://philpapers.org/rec/PERLAC-5
Leibniz’s interest in Confucianism, its influence on the development of binary numbers, syncretism
Immanuel Kant
Groundwork of the Metaphysic of Morals
Broad description of the categorical imperative, role of reason, impossibility of attempting to grasp the regularity of the world and all second and third order effects
JJ Rousseau
On the Social Contract
You know why! Argues for popular consent of the governed, against divine right, inter alia.
Peter Singer
Practical Ethics
https://uk.bookshop.org/p/books/practical-ethics-peter-singer/4238681
On the moral consequences of utilitarianism.
Peter Singer
The Most Good You Can Do
https://yalebooks.co.uk/book/9780300219869/the-most-good-you-can-do/
On Effective Altruism and maximising utility.
JL Austin
How to Do Things with Words
https://www.hup.harvard.edu/books/9780674411524
Most words, he argues, don’t have anything to do with truthfulness. They are in fact better thought of as actions: performances, or, as he later called them, “speech acts.”
Jurgen Habermas
Theory of Communicative Action
https://plato.stanford.edu/entries/habermas/
Attempts to use a philosophy of language for a theory of the public sphere
Jurgen Habermas
Structural Transformation of the Public Sphere
https://mitpress.mit.edu/9780262581080/the-structural-transformation-of-the-public-sphere/
Introduced / made famous the concept of a public sphere, and then argued it played a key role in the construction of democracy and popular sovereignty.
Much critiqued now - we think in terms of public spheres rather than a sphere - but again, gives a good foundation for modern analysis of society and state.
Ernest Gellner
Nations and Nationalism
https://www.wiley.com/en-gb/Nations+and+Nationalism%2C+2nd+Edition-p-9781405134422
What is a modern nation? How is it new, what does it exclude, what are its roots? Gellner was one of the foundational thinkers of nations along with Benedict Anderson. Here he argues it’s bound up with modernity, and that this novelty is critical to understanding its fragility.
WVO Quine
Two Dogmas of Empiricism
https://philpapers.org/rec/QUITDO-3
Contra logical positivism
A.J. Ayer
Language, Truth, and Logic
How to apply verification to language.
Thomas Nagel
What is it like to be a bat?
https://www.sas.upenn.edu/~cavitch/pdf-library/Nagel_Bat.pdf
Argues that we lack an understanding of what the physical cause of a mental state would even look like.
Isaiah Berlin
The Proper Study of Mankind
https://www.penguin.co.uk/books/366279/the-proper-study-of-mankind-by-isaiah-berlin/9780099582762
Particularly “The Hedgehog and the Fox”.
Isaiah Berlin
Three Critics of the Enlightenment
https://press.princeton.edu/books/paperback/9780691157658/three-critics-of-the-enlightenment
Essay on the Counter-Enlightenment, good both for understanding the intellectual opponents of the Enlightenment and for tracing the influence of their ideas in the modern world
J.L. Mackie
Inventing Right and Wrong
Anti-objectivist ethics. See also its espousal of the argument from queerness: that objective moral entities would be just plain weird.
Max Horkheimer and Theodor W. Adorno
Dialectic of Enlightenment
https://www.sup.org/books/title/?id=1103
One of the foundational texts of the New Left. Interested in why (they thought) the Enlightenment had failed, why Nazism, Stalinism, etc had resulted, and the poverty of theory to enact positive social change.
Roberto Calasso
The Unnamable Present
https://www.penguin.co.uk/books/308742/the-unnamable-present-by-calasso-roberto/9780141988016
Against digitability, and on the link between utilitarianism, ethics, computing, big data, and totalitarianism
James C. Scott
Seeing Like A State
https://yalebooks.yale.edu/book/9780300078152/seeing-like-a-state/
Argues that measuring is a form of control, and that utopian projects from the late 19c onward turned to dystopia
Charles Taylor
Sources of the Self
https://www.hup.harvard.edu/books/9780674824263
Sets out the contingency of the modern individual in the west and its philosophical forebears.
Why do we care about this? Because it’s this modern rational identity that is being taken as somehow more true, more fundamental, than any other by those who seek to make AI an amplification of their own self-image.
Vaclav Smil
Energy and Civilisation: A History
https://mitpress.mit.edu/9780262536165/energy-and-civilization/
Looks at the history of civilisation in terms of its ability to access power.
Alasdair MacIntyre
After Virtue
https://undpress.nd.edu/9780268035044/after-virtue/
Aristotelian virtue ethics and social order
Wendell Berry
Why I am not going to buy a computer
https://www.penguin.co.uk/books/297824/the-world-ending-fire-by-berry-wendell/9780141984131
Pro simplicity, technological scepticism. Berry in general is an excellent critic of optimistic techno-utopianism
Wendell Berry
In distrust of movements
https://www.penguin.co.uk/books/297824/the-world-ending-fire-by-berry-wendell/9780141984131
Sets out Berry’s epistemology fairly straightforwardly.
Eugene McCarraher
The Enchantments of Mammon
https://www.hup.harvard.edu/books/9780674984615
Demonstrates the religious thinking at the heart of capitalist modernity. This is particularly critical to understand because it makes the utility (ha) of religious studies, religious history, and theology more legible for those coming from a STEM or atheist background.
Quinn Slobodian
Globalists
https://www.hup.harvard.edu/books/9780674979529
Demonstrates the links between behavioural science and the neoliberalism of what he terms the Geneva School. Also explains how these neoliberals saw markets in epistemological terms, which raises the question of their links to neural networks
Donna Haraway
Staying with the Trouble
https://www.dukeupress.edu/staying-with-the-trouble
Critiques the Anthropocene’s anthropocentrism. In this sense, it’s good to read alongside Bruno Latour’s Facing Gaia.
Rene Girard
All Desire is A Desire For Being
https://www.penguin.co.uk/books/445783/all-desire-is-a-desire-for-being-by-girard-rene/9780241543238
Essay collection of Girard’s work. His concept of mimesis and the scapegoat (see Violence and the Sacred) is popular among tech entrepreneurs, particularly Thiel. There are also obvious links to how they create systems and see the world, in particular the use of reinforcement learning.
Rene Girard
Things Hidden Since the Foundation of the World
https://www.sup.org/books/title/?id=2670
Expands his idea of mimesis, and applies it to the history of (particularly western) culture.
Peter Turchin
End Times
https://www.penguin.co.uk/books/447345/end-times-by-turchin-peter/9780141999289
Social effects of mechanisation, digitability, inter alia. Best read with Fukuyama
Francis Fukuyama
The End of History
Best read as a way of understanding the thought process of post cold war and pre GFC America and history as teleology.
Francis Fukuyama
Political Order and Political Decay
https://profilebooks.com/work/political-order-and-political-decay/
Answers critics of The End of History by pointing to the possibility that institutions can always calcify, collapse, and otherwise atrophy. Democracy has to be fluid, and the institutions that preserved government in one situation may not in others.
James Bridle
Ways of Being
https://us.macmillan.com/books/9780374601119/waysofbeing
An odd, quirky book. It points to the relational nature of intelligence, and draws on the idea of the “more than human” world.
Timothy LeCain
The Matter of History
https://www.timothyjameslecain.com/the-matter-of-history
Argues for a neomaterialist understanding that’s adjacent to Latour, inter alia. Like Bridle (Ways of Being), LeCain argues that it makes no sense to think of disembodied reason.
Susan Neiman
Evil in Modern Thought: An alternative history of philosophy
https://press.princeton.edu/books/paperback/9780691168500/evil-in-modern-thought
Views philosophy as an attempt to grapple with the problem of evil. Her defence of Kant in these terms is particularly interesting.
Richard Ngo
A Short Introduction to Machine Learning
https://www.alignmentforum.org/posts/qE73pqxAZmeACsAdF/a-short-introduction-to-machine-learning
Introduces what neural networks are, the difference between AI / machine learning / deep learning, inter alia.
Richard Ngo
AGI Safety From First Principles
https://www.alignmentforum.org/s/mzgtmmTKKn5MuCzFJ
Sets out current mainstream thinking on AGI and Alignment, and how people currently approach AGI Safety in rationalist and rationalist-adjacent communities
Rishi Bommasani et al
On the Opportunities and Risks of Foundation Models
https://arxiv.org/pdf/2108.07258.pdf
Defines foundation models, i.e. models trained on broad sets of data that can be used for many different tasks. It then sets out how these models might be used, what they probably can’t be used for, and the risks as they stand.
Paul Christiano, OpenAI
Learning From Human Preferences
https://openai.com/research/learning-from-human-preferences
Sets out how to efficiently use human feedback to train a reinforcement learning agent to predict human preferences. Important both for building AI agents in the near future, and for its relationship to Stuart Russell’s idea of how to solve Alignment
Stuart Russell
Human Compatible
https://www.penguin.co.uk/books/307948/human-compatible-by-russell-stuart/9780141987507
Russell’s one of the biggest names in AI development. Here he describes what alignment is, why it matters, and sets out a way to help it on its way using reinforcement learning. Since a big part of the argument of the other material presented here is why attempting to do this is tricky / leads to bad consequences, it’s good to have a clear reasoned argument by someone who knows the risks.
Brian Christian
The Alignment Problem
https://brianchristian.org/the-alignment-problem/
Goes into great, useful detail about the history of machine learning, from Leibniz through the Behavioural Scientists to the Computer Scientists and their attempts to create artificial intelligence.
Richard Sutton
The Bitter Lesson
https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf
Comes before (2019) the “big” developments in machine learning and LLMs, but I think it is an excellent reminder from an experienced researcher that human brains and intelligence in general are really, really hard and not discrete.
Jacob Steinhardt
More is Different For AI
https://bounded-regret.ghost.io/more-is-different-for-ai/
Argues for emergent features in large models. Essentially that if you throw enough data and compute at a problem, there’ll be a decent chance you’ll get interesting side effects.
Jacob Steinhardt
Future ML Systems Will Be Qualitatively Different
https://bounded-regret.ghost.io/future-ml-systems-will-be-qualitatively-different/
See above, but expands on this, building on the 1972 paper by Philip Anderson
Scott Alexander
Biological Anchors: A Trick That Might or Might Not Work
This whole sequence of Scott Alexander summarising debates in the rationalist community on AI is useful. Here he covers Cotra and Yudkowsky (yes that one)’s disagreement on whether it’s useful to think of AI in terms of human brains. In other words, if human brains are just a matter of computation, then so is evolution, and therefore with enough information and compute - Cotra argues - we’ll eventually hit AGI. Yudkowsky finds this hilariously wrong.
Graciela De Pierris and Michael Friedman
Kant and Hume on Causation
https://plato.stanford.edu/archives/win2018/entries/kant-hume-causality/
What do we mean when we say a causes b? Sounds simple, but important to understand what this means and what it does not mean when thinking of getting machines to do stuff, and of theorising later debates in the scientific method.
David Hume
An Enquiry Concerning Human Understanding.
Reworked version of the Treatise of Human Nature, which “fell dead-born from the press” as Hume put it. Immensely influential both for the western moderns, and particularly for Immanuel Kant.
Adam Smith
Wealth of Nations
Good to read what Smith did say about markets and what he didn’t. Read it, especially, in terms of its theory of knowledge. What can we know? What can we measure?
Adam Smith
Theory of Moral Sentiments
https://plato.stanford.edu/entries/smith-moral-political/
Now this is interesting, as it is a work of moral philosophy which sits alongside his economic work (which I’d argue shouldn’t be seen as economic in the modern sense, but as a more holistic philosophy).
Shoshana Zuboff
The Age of Surveillance Capitalism
https://profilebooks.com/work/the-age-of-surveillance-capitalism/
One of the most popular books on how surveillance currently works with big data, and the extent to which our personal data are being sold. Understanding how this already happens pushes us to question how far amplification of existing tech and world views will deepen inequalities.
Eric Ries
The Lean Startup
https://theleanstartup.com/
Window into how the start ups of the 2010s viewed their role, internal methodology, etc. I’d read this alongside The Power Law / Chip War / Deep Work for how VC-adjacent AI companies view themselves and people.
Cal Newport
Deep Work
https://bookshop.org/p/books/deep-work-rules-for-focused-success-in-a-distracted-world-cal-newport/8339760?ean=9781455586691
Again, background reading on how contemporary AI-adjacent people think. Newport’s guide to how to be productive over time touches on time management, but also heavily relies on quantification and the sort of thinking which permeates these spaces. Particularly important for his thinking on “flow” states, which I’d argue were quasi-religious.
Chris Miller
Chip War
https://www.simonandschuster.co.uk/books/Chip-War/Chris-Miller/9781398504127
Geopolitical background of microchip manufacture, underscores how it’s bound up with supply chains, etc. Read alongside The Stack and Vaclav Smil’s work.
Sebastian Mallaby
The Power Law: Venture Capital and the Art of Disruption
https://www.penguin.co.uk/books/309693/the-power-law-by-mallaby-sebastian/9780141988948
A sociological history of venture capital, the rise and fall of Softbank’s influence, etc. Mallaby’s bits on the networks of VCs, interpersonal relationships, start ups, banks, and why they’re broadly on the West Coast provide a macroscopic look at the “great men” myths founders tell themselves. Why does this matter? Because they tend to think of themselves as examples of perfectible, isolated reasoning agents. So when we talk about the midwestern origins - the Lutheran background, the Lutheran human geography - we see another mirror of neoliberal markets and neural networks.
Donna Haraway
The Cyborg Manifesto
Essay exploring feminist materialisms in late 20th century culture with the rise of the internet, widespread computing, etc.
Benedict Anderson
Imagined Communities
https://www.versobooks.com/en-gb/products/1126-imagined-communities
Traces the genealogy of nations back to the American Revolution, points to the importance of literacy and publishing in allowing families hundreds of miles away to imagine they belong to the same community. If this is how modern nations are sustained, it therefore raises questions as to what might replace them, and their fragility when that literacy is bound up with a flood of information.
Mary Midgley
Evolution as Religion
Discussion of what evolution is and isn’t; particularly useful to illuminate the role of biological anchors in AGI thinking.
Mary Midgley
Science as Salvation
Argues that much of what passes for science is in fact myth. Good therefore to read alongside Latour and McCarraher.
Mircea Eliade
The Myth of the Eternal Return
https://press.princeton.edu/books/paperback/9780691182971/the-myth-of-the-eternal-return
Basically argues that the default (“primitive”) view of history is cyclical and that what matters in those ontologies is the degree to which events have mythic parallels. Secular history is a product of Jewish and Christian thought in which God could be seen in history, but even this is profoundly difficult.
Much criticised - but really good for illuminating what it is that quite a lot of AI talk and transhumanism is doing, and the theological roots of its socio-histories.
Steven Shapin and Simon Schaffer
Leviathan and the Air-Pump
https://press.princeton.edu/books/paperback/9780691178165/leviathan-and-the-air-pump
How is knowledge produced? Socially, theatrically, and experimentally. This is a classic work on the production of knowledge. It reminds us of the difficulties of talking about objectivity, “scientific truth” etc.
Rudolf Carnap
An Introduction to the Philosophy of Science
https://store.doverpublications.com/0486283186.html
Goes step by step through how we can reason - and how we can know we can reason - about the world scientifically, via logic, probability, and causality. Good for introducing humanists to scientific concepts, and a broader explanation of the way in which the world is codified for machines.
Robin Hanson
The Age of Em
https://global.oup.com/academic/product/the-age-of-em-9780198754626?cc=gb&lang=en&
Hanson is bracing, in all senses of the word. The Age of Em is essentially a thought experiment about how copies of human minds (“ems”) might socially organise themselves, etc. The reason why you should read this is that it sets out how quite a few AI-adjacent people think about how brains function, how markets and ethics function, and what they view as leading necessarily from this.
You’re not wrong there, but I’d say that that context is important because of the way in which Hobbes’ thought influences modern political thought. Not so much the 18c stuff, as he apparently wasn’t read much then, but Runciman’s recent work.
Nice list. One point I'd like to make about Hobbes is that while his theoretical mechanistic notion of men predicts the war of all against all, Leviathan probably shouldn't be read outside of the context of Behemoth. That is to say that Hobbes also had empirical evidence from his personal experiences in the English Civil War as to what happens when anarchy reigns. (In fact, this seems to be a general empirical realisation that arises from English civil wars, since the chroniclers writing about the civil war between Stephen and Maude over the English throne in the 12th century described it as a period of lawlessness and wanton violence of all against all: "Christ and his saints slept".)