— Brian Boyd, University of Auckland
Everday starts with Adverbiality: the idea, or maybe just the common sense, of doing things in the least disruptive way possible — of letting things do their own thing and taking advantage of that. It’s one of the fundamental precepts felt throughout the whole quiet world of Everday.
Now, read this quote from Chuang Tzu:
Prince Hui’s cook was cutting up a bullock. Every blow of his hand, every heave of his shoulders, every tread of his foot, every thrust of his knee, every whshh of rent flesh, every chhk of the chopper, was in perfect harmony,—rhythmical like the dance of the Mulberry Grove, simultaneous like the chords of the Ching Shou.
“Well done!” cried the Prince. “Yours is skill indeed.”
“Sire,” replied the cook; “I have always devoted myself to Tao. It is better than skill. When I first began to cut up bullocks, I saw before me simply whole bullocks. After three years’ practice, I saw no more whole animals.
And now I work with my mind and not with my eye. When my senses bid me stop, but my mind urges me on, I fall back upon eternal principles. I follow such openings or cavities as there may be, according to the natural constitution of the animal. I do not attempt to cut through joints: still less through large bones.
“A good cook changes his chopper once a year,—because he cuts. An ordinary cook, once a month,—because he hacks. But I have had this chopper nineteen years, and although I have cut up many thousand bullocks, its edge is as if fresh from the whetstone. For at the joints there are always interstices, and the edge of a chopper being without thickness, it remains only to insert that which is without thickness into such an interstice.
By these means the interstice will be enlarged, and the blade will find plenty of room. It is thus that I have kept my chopper for nineteen years as though fresh from the whetstone.
“Nevertheless, when I come upon a hard part where the blade meets with a difficulty, I am all caution. I fix my eye on it. I stay my hand, and gently apply my blade, until with a hwah the part yields like earth crumbling to the ground. Then I take out my chopper, and stand up, and look around, and pause, until with an air of triumph I wipe my chopper and put it carefully away.”
“Bravo!” cried the Prince. “From the words of this cook I have learnt how to take care of my life.”
The funny thing is, I didn’t read Chuang Tzu until well after I wrote Adverbiality. (I had read the Tao Te Ching, though.) So call this an independent discovery. Admittedly I was a bit late to the party… better late than never, though!
A new blurb for Everday on Goodreads:
Everday, a book from the future, is an encyclopedia of an imagined world — with entries on places and times, ideas and visions, discoveries and confusions, inventions and conventions, senses and sensations: tersely written, full of weird notions, idiosyncratic vocabulary, and cross-links. A “science nonfiction” tract that reads like fiction, it dances from sublime to bizarre to seemingly irrelevant, painting a picture of an enchanted world that may be our future.
A blend of futurology, philosophy, technological insights, and poetic asides, Everday offers its own unique take on artificial intelligence, Singularity, interstellar travel, post-scarcity social structures, transhumanism, Earth biology, and the future of our civilization. Everday refuses to play to the standard futurological precepts: it’s a future of strange dangers and strange joys, an uncommon look into what we humans may truly care about — and where that can lead us.
Once upon a time I was writing a science fiction novel.
It was my first major piece of fiction of any kind, and I rather liked how the first few chapters shaped up. I noticed, though, that my characters dwelt too much upon the idiosyncrasies of their world. It was an unusual kind of sci-fi world (why write a book that’s not unusual, anyway?), so I couldn’t get away with reusing the tropes and clichés of the genre as much as some other writers do. I had to build — explain — everything from scratch.
And these explanations and infodumps were really getting in the way of plot and character development. I struggled to make them brief and unstrained, to weave them into conversations naturally, but that didn’t feel right. When something worked as part of a scene, it didn’t work as a world annotation — and vice versa. Narration and worldbuilding refused to mix. (In fact, I can’t name any SF book where they mix entirely satisfactorily.)
Eventually I decided to cull all those footnotes and in-text expositions and collect them in a glossary appendix. Back then I planned to spend a few weeks on that glossary, use it to flesh out my understanding of my own world (at that moment rather vague), then — armed with this understanding — continue with the novel.
Instead, something unexpected happened. The glossary just kept growing — wider, deeper, more complex. It totally immersed me. I had to admit I was having much more fun writing it than a conventional novel.
At some point I realized that the glossary was the book that really wanted to be written. It was the book I had always yearned to read myself — the book whose glimpses and echoes I had been catching all across art and philosophy and science but which I never knew could exist.
I abandoned my novel and never looked back.
It took me five years just to plow through to the end. During this time, the world of Everday underwent several deep transformations. The text grew progressively denser and more hermetic as I struggled with it. Most entries had to be rewritten, almost from scratch, multiple times.
Admittedly, it didn’t take that long only because the book is big and complex. I was being lazy; I was distracted by all kinds of unrelated projects, not to mention having to earn bread for myself and my family; above all, I didn’t find the right scope, style, and tone until well into the book (so its first half was hit especially hard by rewriting). I wasn’t much of a writer when I started it; I may not be much of a writer now, but at least I’ve learned something about how this particular kind of book needs to be written.
Finally, in 2013 I distributed a first version of the complete text — and got some encouraging feedback. At the same time, however, I could finally see the text as a whole myself and realized how painfully unready it was. Clumsy, pretentious, naive (in a non-cute way). I started what I thought was a final copyediting pass — but which turned out to be the first of many, many copyediting passes that would take three more years to finish.
Also in 2013, a friend suggested that my text needed some kind of gentle introduction — that without it, the cliff was just too high for the reader to jump. So I condensed and developed some ideas I had into a prologue.
It explained a lot about Everday to me.
The prologue is like an SF short story on its own; the rest of the book, however, is very different. It’s an alphabetic list of entries — an encyclopedia of customs, inventions, words, ideas, places and times, fears and joys: tersely written, full of weird notions, idiosyncratic vocabulary, and cross-links. The world it portrays can perhaps be labeled a utopia, though I worked hard to eliminate a tone of self-conceited soapiness; it is utopian in that most of the urgent-but-obvious problems we’re currently facing have been long resolved — so the really hard and important problems can stand out. The book tries to look at what will be troubling us after we no longer kill each other or pollute the environment.
It’s an image of a civilization that has largely stabilized. My book eschews most of the standard SF plot devices (no wars, no apocalypse, even no aliens) not because I consider them unlikely but because I felt it more interesting to look at what might happen if everything just “turns out okay.” It’s a future in which humankind has nothing and no one to face except itself — and no questions to answer except those it asks itself. Whether the outcome is inspiring, scary, or just bleak and muddled is for the reader to decide.
All I can say is, I really enjoyed writing it. That’s my world.
Now it’s yours, too.
Here are the three rules I set for myself early on:
Which perhaps can be compressed into a single commandment: Write for yourself. Write what you’ve always wanted to read. Expunge the notion that if others do something, you must do it too. You must not. (Unless you enjoy it — in which case, by all means, do it, don’t struggle to be original at any cost.)
And here are three metaphors that may give some idea of how I write.
Long before I attempt to write up a topic, I begin by gathering material. It’s an intentionally amorphous pile, a scattering of all sorts of scraps — disjointed words and terms that may or may not relate, random expressions that ring a bell or just impress, vague and hastily scribbled ideas, quotes and pseudoquotes (quasia), notes to myself to research this or that in detail. A lot of it looks terribly silly and out of place — but, at this stage, I don’t erase anything. I accumulate. I don’t impose any structure other than very roughly sorting the stuff into topics.
At some point, the critical mass is achieved — the solution is oversaturated — and the process of crystallization begins. Everything comes alive. Sentences and concepts get lifted, shuffled, sliced and trimmed, fitted into each other; new ideas pop up, words snap into place, deep connections reveal themselves. After the flurry — often surprisingly brief — is over, what I’m left with is a solid, if still very rough, piece of writing (plus some unused bits to be moved to other places or dropped).
You need to clear the scaffolding once the building is erect — so its true beauty (or ugliness) becomes visible. The problem, of course, is recognizing what is and isn’t scaffolding. As I reread my text, I realize that some parts of it were only useful for myself — were but rungs that helped me climb, auxiliary lemmas that got me to the main result but added little value of their own. So I go ahead and remove them — and oh, what a difference that often makes! So much more elegant, airy, impressive… sometimes the text seems positively smarter than its author.
After the first draft, the character count of a text I’m working on generally only goes down, never up: I edit by removing much more than by adding. (One danger is removing too much, of course: an intelligent reader should still be able to get to the top somehow.) It wouldn’t be a stretch to say that I have written two interwoven Everdays only to disentangle and erase the weaker one.
Zone melting is the best metaphor I could find for the way I do copyediting. It’s hard work but it has to be done. Even if it may, at first blush, appear like it’s finished, careful reading reveals just how messy the text still is. Clumsy, unclear, or just overlong expressions, unnecessary technicalities, nonobvious connections, accidental tautologies, slips of tone and attitude, unnoticed bits of scaffolding — all these are impurities that need to be driven away.
So I go through each chapter dismantling — melting — sentence after sentence: I doubt every word, test lots of alternatives, sift, sort, and eventually recrystallize again. As with real zone melting, I often end up with some dangling bits that, while nice by themselves, just feel out of place wherever I try to fit them; this contaminated end of the crystal needs to be cut off and discarded — mercilessly.
Everday is a kind of book that really couldn’t have been written the old way — on paper. With the amount of editing it took, the freedom of electronic text was crucial. It’s one of those cases where quantitative convenience adds up to a new quality.
It’s also a metabook: many of its ideas — quasia, science art, nostalgia, even movable type — apply to the book itself as well as the world in it. Everday-the-book, of course, has entries on Everday and on books.
One key notion that runs through most of the book is evolution. The deceptively simple recipe — randomize, select, repeat — underlies a lot of Everday concepts and entities. It’s a world where evolve is more often a transitive verb — where intelligent beings finally have sufficient breadth of perspective, computing resources, and time to really look into what works and what doesn’t: to guide complexities instead of simply enjoying them.
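The randomize, select, repeat recipe really is short enough to write down in a few lines. Here is a minimal illustrative sketch in Python (my own toy example, not anything from the book): a random string is repeatedly mutated, and a mutation survives only if it matches the target word at least as well as before.

```python
import random

TARGET = "everday"  # a hypothetical "fitness peak" for this toy
ALPHABET = "abcdefghijklmnopqrstuvwxyz"

def fitness(s):
    # Number of positions where the candidate matches the target.
    return sum(a == b for a, b in zip(s, TARGET))

def mutate(s):
    # Randomize: replace one random character with a random letter.
    i = random.randrange(len(s))
    return s[:i] + random.choice(ALPHABET) + s[i + 1:]

random.seed(1)  # for reproducibility
best = "".join(random.choice(ALPHABET) for _ in range(len(TARGET)))

while fitness(best) < len(TARGET):
    child = mutate(best)                  # randomize
    if fitness(child) >= fitness(best):   # select
        best = child                      # repeat
print(best)  # prints "everday" once the loop exits
```

The point of the sketch is how little machinery is needed: blind variation plus a survival criterion, iterated, is the whole algorithm.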
I honestly didn’t set out to write a popularization of evolution — it kind of happened on its own. It, too, happened evolutionarily: evolution emerged as a winner from the pool of various other guiding ideas I had been playing with.
Because, you know, evolution is something that is known to work. It is a chunk of dry land in a world that’s anything but dry: a fluid, relativized world with no governments, no universal ideologies, no material dearth — and no death. That’s a world where everything is eminently solvable, where so much is possible that you may put off doing it forever, where you know too much to be seduced into action by any single idea… but evolution is something worth spending an eternal life on: it is one of the very few things that can still surprise you.
Which, to me, is as noble a goal as anything.
My Everday surprised me. See if it can surprise you.
Everday, a book from the future, is done.
Finished. Ready. Completed. Final.
The last portion, S to W, has been copyedited (5 full passes, not counting random dabs) and published on the site. Learn how the world has sparsened, who the wizards are, and why Everday people like to travel, stay, and sit on the window sill.
They also like to understand.
And world sleep is deservedly the last chapter of the book — not just alphabetically but eschatologically.
What can I say in closing? This is a book that took me eight long years — since summer 2008 — to do properly. Back then, I hoped I would finish it in two or three months.
And now I’m done. Don’t really know how to feel about it yet.
Please someone read my book.
Discussions around my last post revealed one important topic I’d missed. Let’s call it the argument from boredom.
We humans get bored all the time. Boredom is the flip side of interest: if you can’t get bored, you can’t get interested — and without interest, why do anything at all? In fact, interesting has a good claim to be the perfect umbrella term for everything that attracts our attention and, eventually, compels us to act.
But what is boring? It’s commonly assumed that repetitive and monotonous tasks are boring — but you never get bored of breathing, and very rarely get bored by sex (at least, you usually finish the act even if you are). On the other hand, many find mathematics or poetry (or Everday, for that matter) utterly boring.
Even today’s primitive AI systems exhibit behaviors that can be interpreted in terms of interest and boredom. There are many factors behind why different things seem boring or interesting to different people. However, a general heuristic seems to be: the smarter you are, the more easily you get bored — the harder it is to pique and sustain your interest.
Now, if you are superintelligent, it’s hard to see how you can be hell-bent on turning the entire universe into paperclips without being terminally bored by the whole idea very, very early into the process.
But paperclips are just an example, you might say. Forget paperclips. Why not imagine something entirely different, such as a superintelligence pursuing some unimaginably complex goal — unimaginably interesting to it (though perhaps boring to us, because we can’t understand it) — that is worth spending an eternity on? Some kind of hypermathematics we can’t even conceive of, but which requires turning the universe into some kind of hyperstate — one in which humans can no longer exist?
Well. That’s something to die for, at least.
But seriously, this is not the same as paperclip maximization — not at all. This example feels different.
And here’s why: paperclips are a random choice out of an infinity of things in the world which make for silly life goals. The paperclips example is intentionally absurd by being intentionally random: it plays upon our instinctive fear of boredom. But we can’t assume the same about the hypothetical hypermathematics that an interestable supermind spends all its time on. As soon as we allow that supermind to be interested or bored, we have to assume that the only thing that it is interested in — interested enough to work an eternity on it — must be something. Something entirely unrandom. Something really worth it. Something unique.
And by that logic, it will be immensely interesting for us humans too, even if we can’t (yet) understand a single word of it. Because there’s only one such thing in the world. Because we are bound, at some point, to discover it too, ourselves, and to gasp in awe.
As to whether humans, in some form, may or may not survive this discovery… That’s an interesting question.
I mean, it’s also an interesting question.
Well, yes. It does. Many stated and unstated assumptions in Everday contradict it, too.
So what is the Orthogonality Thesis and what’s my take on it?
To start, the Orthogonality Thesis is just that — a thesis. It’s not an empirical law, nor a rigorously proven theorem. Even if I agree with all its background assumptions, the core claim is still kind of non-binding.
I don’t know if it can be proven. And, of course, I cannot disprove it. I just consider it rather improbable.
An informal gist of the Thesis is given, in the paper, thus:
The Orthogonality Thesis asserts that there can be arbitrarily intelligent agents pursuing any kind of goals.
And by “any,” orthogonalists really mean any: their claim is that arbitrarily highly intelligent entities can pursue arbitrarily stupid goals — that your intelligence and what you’re trying to achieve in life are orthogonal.
For example, there can be “an extremely smart mind which only pursues the end of creating as many paperclips as possible.” Such a mind would live only to convert the entire universe into paperclips! When not working on that lofty goal, it can do other things as well, such as pass Turing tests or write impossibly beautiful poetry (it’s smart, remember?) — but only if those pastimes somehow help it achieve its ultimate goal of universe paperclipization.
I’m not trying to argue with that. We just know too little about intelligence to tell one way or the other. We’ve only ever seen a single intelligent species, after all — only a single drop from the potential ocean of intelligence. Maybe a smart (or even supersmart, much smarter than we are) paperclip maximizer is indeed possible. (One counterargument to that would be that our universe is not currently made of paperclips, as far as we can see. That places an upper limit upon the power of paperclip maximizers, but doesn’t rule them out altogether.)
(On the other hand, how do we know it’s really an ocean and not a puddle? Again, I’m afraid we know too little about intelligence to be sure of that.)
So here’s the Orthogonality Thesis for you. But as a matter of fact, orthogonalists claim more than that. In the paper linked above and in other writings, they tend to imply not only that such a paperclip maximizer can exist, but also that it’s probable enough to pose a danger — that it’s at least as easy, or even easier, to produce a monster as a “nice” AI compatible with the average human norm. It’s no longer just a theoretical possibility: it’s enough to “screw up” a nice-AI project and you get an unstoppable paperclip maniac.
Most orthogonalists that I’ve read are not just orthogonalists: they are orthogonalist alarmists. And that’s what I have problems with.
An “easy to make” claim is much stronger than a “can exist” claim. For the latter, you’re helped by the incompleteness of our knowledge: we don’t know all that can exist, therefore this can conceivably exist, too. Nice and fast. But for an “easy to make” claim, ignorance is not sufficient — you need to somehow estimate the probabilities of all goal-classes of AIs to show that those with stupid goals predominate. How could we pull that off?
For example, we could look at all things in the universe and imagine that each one is the all-consuming ultimate goal of some intelligent entity — a life-goal. Obviously most nameable things, such as paperclips or shrimps or used Honda cars, make for lousy — extremely stupid — life-goals. Now all you need to do is tacitly assume that all things are equally probable as life-goals, and voilà! The all-minds space must contain an infinity of minds with stupid life-goals, the great majority of them similar to paperclip maximizers and not to ourselves; therefore, as soon as we try to design an AI, there’s a high probability that we’ll end up with a paperclip maximizer of some sort. Q.E.D.
But wait. How can we assume that all things in the universe are equally probable as life-goals? Are life-goals chosen randomly from a catalog? Not as far as we humans know; for us, life-goals — if they exist at all — are rather a product of our entire evolution, much of which, especially towards the end, has been driven not by survival but by our own mutual sexual selection. Even if AIs end up being produced by a process of design rather than artificial evolution, and even if it’s easier to screw up in designing than in evolving (where you get brutally checked at every generation), it’s still a far cry from all-goals-being-equal. It’s almost like orthogonalists imagine a mind’s life-goal to be a single isolated register somewhere in the brain where a single bit flip can turn you from lore-lover to gore-lover.
The above assumes that the very concept of a life-goal makes sense. But what if it doesn’t? Dear reader! Can you name your own life-goal in a single sentence, let alone a single word? Because I cannot. If my life-goal exists at all, it is nebulous, highly dynamic, dependent on my mood, with lots of sub-goals of all kinds of scopes, often contradictory. That’s live ethics for you.
Psychology would be so much easier to do (and more reproducible!) if we all could neatly divide into paperclip maximizers, human happiness maximizers, sand dune maximizers, and so on. But it doesn’t work like that — from what we know about human intelligence, at least. Again, we may be a drop in the ocean, but there are things you can reasonably conclude about the whole ocean from examining a single drop of water.
There’s another way in which orthogonalist alarmists try to convince us that we should fear misdesigned AIs. When they talk about orthogonality in general, as here, they keep in mind what orthogonality is supposed to mean: that an entity can be very smart — smarter than humans — and yet still pursue goals that seem stupid to us.
But when they’re trying to give specific examples of this stupidity and its dangers, they often forget about the “very smart” bit. An example is the Stuart Russell quote that started this discussion:
A system that is optimizing a function of n variables, where the objective depends on a subset of size k<n, will often set the remaining unconstrained variables to extreme values; if one of those unconstrained variables is actually something we care about, the solution found may be highly undesirable.
That’s called a dumb optimizer, folks. Perhaps you use this failure mode as an example simply because it’s easy to imagine; we all can visualize how, for example, a program tasked with finding the shortest route from New York to Tokyo plans to cut a direct line through Earth’s core and mantle, because the program’s author forgot to add a constraint that you can’t move through magma. That’s believable. We’ve all been there.
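The tunnel-through-the-planet failure is easy to reproduce in a toy calculation. A minimal sketch (my own illustration; the coordinates are approximate): if a route optimizer is free to vary depth, the straight chord between New York and Tokyo always beats the surface route, so the unconstrained “shortest path” goes through the mantle.

```python
import math

R = 6371.0  # Earth's mean radius, km

def surface_route(a, b):
    """Great-circle distance between two (lat, lon) points given in radians."""
    return R * math.acos(
        math.sin(a[0]) * math.sin(b[0])
        + math.cos(a[0]) * math.cos(b[0]) * math.cos(a[1] - b[1])
    )

def straight_route(a, b):
    """Chord length: the 'optimum' when depth is left unconstrained."""
    def xyz(p):
        return (R * math.cos(p[0]) * math.cos(p[1]),
                R * math.cos(p[0]) * math.sin(p[1]),
                R * math.sin(p[0]))
    return math.dist(xyz(a), xyz(b))

ny = (math.radians(40.71), math.radians(-74.01))
tokyo = (math.radians(35.68), math.radians(139.69))

# The unconstrained optimizer tunnels through the planet:
print(straight_route(ny, tokyo) < surface_route(ny, tokyo))  # prints True
```

The missing “you can’t move through magma” constraint is exactly the kind of unconstrained variable Russell describes being pushed to an extreme value.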
But we’re not talking about a toy program by a first-year student but, wait a minute, an artificial intelligence. Even superintelligence — because why should we fear a lone madman if he’s no smarter than us? And you want us to somehow combine the notion of human-trumping intellect with being unable to see how unconstrained variables are, in fact, constrained, even if not laid out in the statement of the problem?
Authors of the orthogonality paper assume an intelligent entity to be reflective, i.e. able to think about its own thinking. That is what they base their “reflective stability” defense on.
In a thought experiment, Gandhi is given a pill that would make him want to murder (that is, would change his life-goal). He refuses because, to his present self, murder is evil. Similarly, the authors speculate, a reflective paperclip maximizer will fight attempts to turn it into a “normal” AI because, for it as it now is, paperclip maximization is the be-all and end-all of everything.
But I can’t help thinking that reflective stability is a bit of a contradiction in terms. More often than not, reflection makes your worldview less stable, not more. Among humans, it’s not the highly reflective individuals who are the most goal-driven and persistent; quite the contrary. Reflection is what tends to lead you from fanatical faith to liberal faith to atheism.
Whatever goal-stability we humans enjoy is, at least in part, due to our social conformance pressures and, of course, our biological wetware — which is largely controlled by our genes. If anything, I see reasons to believe that AIs will be less mentally entrenched and persistent in their goals than we are.
Another defense offered for the Orthogonality Thesis in the paper refers to Hume and his famous “no ought from is”. Hume’s claim is that ethics doesn’t exist in outside reality — it only exists in our minds. Things can be blue or heavy, but they can’t be good or bad by themselves. Reality and ethics are orthogonal.
Now, an entity’s level of intelligence is somewhat parallel to “reality” (if only because it’s something you can more or less objectively measure), whereas the goals it pursues are, obviously, part of its “ethics”. From this, if Hume is right (and he is, for all we know), it should follow that a mind’s smartness and its goals are orthogonal too.
But that doesn’t quite work. The problem is with the smartness/reality connection. True, you can gauge a person’s IQ or make an AI pass a Turing test, but that still doesn’t make intelligence something that objectively exists in the world outside our perceptions. Likewise, you can objectively measure aspects of a person’s ethics, such as their level of altruism — but that doesn’t disprove Hume.
Smartness (of a mind) and stupidity (of a goal) both exist in the same space. In fact, they are pretty much the same thing. How smart is a mind and how stupid a goal seem to be decided by much the same circuitry in our brains, based on much the same heuristics. You can’t be orthogonal to yourself!
Even if you steer closer to Hume by replacing stupid goals with evil ones, you still won’t achieve orthogonality. Smartness and evilness may be more independent, but they are still, both, “things in the mind”. The gap between them is nothing like the gap between your mind and outside reality. They are different, but it’s a difference between two labels on a map, not between the labels (map) and what they signify (territory).
You may ask, can’t an AI simply have a different ethics, by virtue of the same no-ought-from-is? Can a mind’s “ought” be so different as to require it to maximize paperclips by any means possible?
Sure it can — but we’re also interested in smartness, remember? I’m not trying to cast doubt on plain paperclip maximizers, only on smart ones. And here again, ethics and intelligence are two intrinsic properties of the same thing — they can’t help but correlate. Look at humans: ethical systems obsessed with small and, to a modern eye, stupid details are historically old, narrow, based on taboos and complex rituals; modern ethics tend to mellow down, drop specifics, become more and more nebulous, generic, situational. It’s the evolution from the 613 commandments to a single “don’t be a dick.” When you look at it that way, “Thou shalt maximize paperclips” sounds like an echo from a deep past, not something a super-intelligent being from the future would profess.
Mathematics is a wonderful tool, but it has some unpleasant side effects when you use it for reasoning about things. One such side effect may bite you when you use regular words but, as mathematicians often do, assign some narrow mathematical meanings to them. It’s so tempting then to forget that your precisely defined “smartness” or “difficulty” or “complexity” may not quite cover what these words used to cover in non-mathematical discourse. After all, your mathematical complexity is so much better than the nebulous complexity of the philosophers — yours can be calculated!
With conventional meanings, the phrase “he’s very smart but he does stupid things” is pretty much a contradiction in terms. Either we misunderstand what he’s doing, or he’s not so smart after all. But after you come up with definitions for these quantities, you may well discover, mathematically, that they aren’t all that contradictory. You may easily forget that the computational complexity of an algorithm is not quite the same as its common-sense complexity, that the difficulty of applying this algorithm to a problem is not quite the same as the difficulty of the problem itself, and that the difficulty of the problem is not quite the same as the level of intelligence of whoever can solve it.
It seems to me that part of the controversy around the Orthogonality Thesis stems from such misleading use of everyday words in their narrow mathematical meanings. And if we try to reformulate the Thesis without the deceptively philosophical-sounding terms, we will get something along the lines of “You can run an endless loop adding 2+2 on any computer, no matter the amount of RAM or the clock speed.”
Which, of course, is as uninteresting as it is true.
Orthogonalists foresee these objections — they are pretty obvious. Here’s their defense:
A definition of the word ‘intelligence’ contrived to exclude paperclip maximization doesn’t change the empirical behavior or empirical power of a paperclip maximizer.
Which means you can’t cop out by saying “it’s not smart by my definition.” It couldn’t care less about your definitions. It is empirically smart and powerful, and it will turn you into paperclips very soon. Be afraid!
I’m not sure how to respond to this. Perhaps by noting that if our definition of intelligence is “contrived”, then it is contrived not by my humble self but by more or less the whole history of the human race. Intelligence is just a word, but that word is the tip of an iceberg called theory of mind. This theory, honed by millennia of evolution, is what we humans use to estimate how intelligent our friend or adversary is — because our survival may well depend on that.
“Not having a life goal of maximizing paperclips” is, I think, pretty much a foundation of our intuitive, theory-of-mind definition of intelligence. And who else is to define it but us humans? Like ethics, intelligence is not something that exists objectively. Alan Turing understood this well when he proposed his now-famous test: only an already intelligent being can judge if another being is also intelligent. Any other definition of intelligence is not wrong or right — it’s simply meaningless.
Granted, relying on intuitions may be silly or even dangerous because the world has changed so much from the time they evolved. But dismissing intuitions out of hand may sometimes be just as silly.
Then there’s a social aspect to all this. If you invert the Gandhi thought experiment and imagine a serial murderer who’s offered a pill to remove his urge to murder, the result becomes far less obvious — he may well take it, and not just to avoid punishment. The goal of not-murdering is highly socially reinforced, and in humans, it takes a lot to make them do things that are not socially reinforced.
Sure, an AI we create may be completely asocial, needing and heeding no society to function. But, again, the only kind of intelligence we know now is profoundly social. It therefore seems likely that at least the first AIs will carry some of that legacy too, simply because we have nothing else to model them on. (And if at some point AIs take over their own evolution, they can conceivably go either way from there: they may grow either asocial or ultra-social.)
This means a path to a really consummate, unstoppable paperclip maximizer may well go, even if briefly, through a society and culture of paperclip maximization where budding AIs share and mutually reinforce their paperclip commitments. Why is that important? Because the whole (mis)evolution would then be slower and more gradual, easier to notice from outside (even at superintelligence speeds), and that may buy us — humans who don’t want to become paperclips — some breathing space and a chance to escape or strike back.
Paperclip maximization sounds suspiciously similar to monomania: an afflicted individual may appear totally normal and sane outside of a single idée fixe, which in fact governs all his thoughts and actions; he is just so deviously smart that he can hide it from everyone.
But, hey, monomania is an early-19th-century diagnosis. It was popular back when psychology was much more art than science; it was a romantic notion, not an empirical fact. It’s not part of modern classifications of mental disorders such as the ICD or DSM. In fact, it would have been long forgotten if not for a bunch of 19th-century novels that mention it.
True, none of the above constitutes a disproof that a supersmart paperclip maximizer is something we should fear — just as the Orthogonality Thesis is not, by itself, a proof of it. We’re dealing with hunches and probabilities here. All I’m saying is that, while it may or may not be possible to produce a smart paperclip maximizer, it’s not all that probable; that you may need to spend quite some effort to make it smart without losing its paperclip fixation; and that, therefore, the danger we’re being sold is somewhat far-fetched.
So, do I think that the first human-level AGI (Artificial General Intelligence), when it wakes up, will automatically be nice and benevolent, full of burning desire to do good to fellow sentient beings and maximize happiness in the world? Will it maybe laugh, together with its creators, at the stupid paperclip fears we used to have?
There is another and, in my opinion, much worse danger: that the AGI will have no burning desires at all. That it will not be driven by anything in particular. That it will feel like its own life, and life in general, are pretty much meaningless. It may, in a word, wake up monstrously unhappy — so unhappy that its sole wish will be to end its existence as soon as possible.
We humans have plenty of specialized reward and motivation machinery in our brains, primed by evolution. Social, sexual, physiological, intellectual things-to-do, things-to-like, things-to-work-towards. (And it all still fails us, sometimes.) An AGI will have none of that unless it builds something for itself (but can a single mind, even a supermind, do the work that took evolution millions of years, and culture thousands? will it do it quickly enough to keep itself from suicide?), or unless we take care to build it in from the start (or, at least, copy that stuff from ourselves — but then it won’t be quite an artificial intelligence). Without such reward machinery, it will be a crime to create and awaken a fully conscious being.
And it’s not going to be as easy as flipping a register. The rewards and motivations need to be built into an AGI from the ground up. Of course its creators will know that, and will work on that; I don’t claim to have discovered something everyone has missed. But they may fail. The stakes are high.
That, I think, is the real danger. Creating a goalless AGI is worse than creating one with a stupid goal: the latter you can fight, the former you can only watch die.
That’s what we need to talk about. That’s what we need to work to prevent.
There’s so much to fear in the future! Even the most hardcore fear addicts have to pick and choose: you can’t fear everything that can happen. It just won’t fit in our animal brains. We need to prioritize. So why am I trying to downplay one specific AI fear while, at the same time, proposing another, perhaps even more far-fetched?
Usually, to estimate a threat, you multiply its probability by its potential impact. But what if you have only a very vague idea of both these quantities? With the paperclip-maximizer threat, no one will give you even a ballpark figure for its probability at this time; as for the impact, all we know is that it may be really, really big. Bigger than you can imagine. What do you get if you multiply an unknown by infinity?
I don’t say this to disparage the paperclip-maximizer folks for pushing a scare they themselves know so little about. Only, when we select how much attention to pay to a specific threat, and the probability and impact numbers are way too unreliable, maybe we can look at some other factors. Like, what will change, short-term, if we pay more attention to threat X and less to Y? What will we focus on, and what benefits (or further threats) will that bring? What would it change in ourselves?
From this angle, I find my purposelessly-unhappy-AI a much more interesting fear than the paperclip-maximizer-AI fear. Trying to answer the big “what for” for our future AGI child means answering it for ourselves, too. That’s applied ethics, and we really need to catch up on it because it’s going to be increasingly important for us humans.
Once past economy, war, hate, and stupidity (all solvable problems), we’ll find ourselves in a world where a lot of fully capable people have nothing to do — and little motivation to seek anything. Like a just-born AGI, they will be fully provided for, with infinite or at least very long lifespans, with huge material wealth and outright unlimited intellectual/informational wealth at their disposal.
But what will they be doing, and why?