Michael Anissimov outlines the four basic views on what any eventual Artificial General Intelligence will be like:
1. Low power, low controllability
2. Low power, significant controllability
3. Great power, low controllability
4. Great power, significant controllability
Michael then describes the fourth option in some detail:
The great power, significant controllability group primarily originates with Eliezer Yudkowsky of the Singularity Institute. As such I will call it the SingInst view. The SingInst view acknowledges that after a certain point, AI will become self-improving and radically superintelligent and capable, but emphasizes that this doesn’t mean that all is lost. According to this view, by setting the initial conditions for AI carefully, we can expect certain invariants to persist after the roughly human-equivalent stage, even if we have no control over the AI directly. For instance, an AI with a fundamentally unselfish goal system would not suddenly transform into a selfish dictator AI, because future states of the AI are contingent upon specific self-modification choices continuous with the initial AI. So, if the second AI is not the type of person the first AI wants to be, then it will ensure that it never becomes it, even if it reprograms itself a bajillion times over. This is my view, and the view of maybe a few hundred SingInst supporters.
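To make that invariant idea concrete, here’s a minimal toy sketch in Python. Every name in it is hypothetical, and it stands in for the concept only, not for any actual SingInst design: each version of the AI vets proposed rewrites of itself against its current goal, and simply refuses to become a successor that doesn’t share it.

class Agent:
    def __init__(self, goal):
        self.goal = goal  # the invariant we want to persist across rewrites

    def endorses(self, successor):
        # Is the successor "the type of person" this agent wants to be?
        # Here, a crude stand-in: the successor must keep the same goal.
        return successor.goal == self.goal

    def self_modify(self, rewrite):
        # Apply a proposed rewrite only if the resulting agent passes
        # the endorsement check; otherwise stay as we are.
        candidate = rewrite(self)
        return candidate if self.endorses(candidate) else self

# Even after a bajillion rewrites, the unselfish goal survives, because
# each generation only accepts successors that preserve it.
agent = Agent(goal="unselfish")
for _ in range(1_000_000):
    agent = agent.self_modify(lambda a: Agent(goal="selfish dictator"))
assert agent.goal == "unselfish"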
Sounds pretty good to me. So the question is…what do we want to go into that unselfish goal system driving the AI? Interestingly, I think this exercise might bring us back to Asimov’s Three Laws of Robotics.
Now, granted, folks like Michael and Eliezer and others promoting the SingInst view would be the first to tell us that the Three Laws are (take your pick) risible, unworkable, or a relic of a less tech-savvy era. Here’s a typical critique.
I’m thinking that the whole problem with the Three Laws might just have to do with how they’re phrased. Asimov essentially gave us three (ultimately four; we’ll get to that in a minute) commandments for robots. And like the original Ten Commandments, they are primarily set up in the negative. Thou shalt not this; thou shalt not that.
But if the trick is to create a positive goal system for AIs, the Three Laws might provide a good starting point. Let’s start with the first law:
A robot may not injure a human being or, through inaction, allow a human being to come to harm.
No good. Too negative. Let’s make it a positive goal:
Ensure the safety of individual sentient beings.
Moving quickly on to law number two:
A robot must obey orders given to it by human beings except where such orders would conflict with the First Law.
Many have pointed out that this law essentially enslaves the robots. No good. Let’s try something like this:
Maximize the happiness, freedom, and well-being of individual sentient beings.
See? Better. Then there’s law number three:
A robot must protect its own existence as long as such protection does not conflict with the First or Second Law.
Hmmm…interesting. Plus, there’s the fourth law that showed up in some of the later novels, which was given precedence over all the others as the Zeroth Law of Robotics:
A robot may not harm humanity, or, by inaction, allow humanity to come to harm.
This one is pretty good, but like the others it assumes a fundamental difference between human and machine intelligence. Why draw that line? The Three Laws need to be reworked not only as positive goals, but as goals that apply to us as much as they do to the AIs. Zero and Three might be combined thusly:
Ensure the survival of life and intelligence.
So now we have three goals where before we had four laws. These goals suffer from many of the same problems as the original laws: they’re kind of vague, and there will no doubt be disagreements as to what they mean. But rather than defining them as limitations on or exceptions to intelligent behavior, stating them as goals says that AIs are systems designed specifically to do these things. By extension, we would be saying that humanity is a system whose purpose is carrying out those goals.
We can debate how well humanity has done so far at carrying out those goals. (I tend to think we’ve done pretty well, but that we have a long way to go.)
As for the vagueness: yes, we will need to get very specific about what we mean by things like “safety,” “intelligence,” and “happiness” (not to mention “life”), and about the tricky relationship between each of these and “freedom.” But come to think of it, we really need to be figuring that stuff out anyway. And with these three goals in place, we will eventually have help from beings that will have a clearer understanding of these concepts than we possibly can.
So I propose the following Three Goals of Artificial Intelligence:
1. Ensure the survival of life and intelligence.
2. Ensure the safety of individual sentient beings.
3. Maximize the happiness, freedom, and well-being of individual sentient beings.
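If the numbering is read the way Asimov read his laws, with Goal 1 taking strict precedence over Goal 2 and Goal 2 over Goal 3 (an assumption on my part), here’s a toy Python sketch of how the three goals could drive choices as a positive objective rather than a list of prohibitions. The scoring functions are placeholders; pinning down what “survival,” “safety,” and “happiness” actually mean is exactly the hard part flagged above.

def goal_scores(outcome):
    # Hypothetical placeholder scores for each goal; real definitions
    # of these quantities are the open problem.
    return (
        outcome.get("survival", 0.0),   # Goal 1: survival of life and intelligence
        outcome.get("safety", 0.0),     # Goal 2: safety of individual sentient beings
        outcome.get("happiness", 0.0),  # Goal 3: happiness, freedom, and well-being
    )

def choose(actions):
    # Python compares tuples element by element, so taking the max of
    # these score vectors gives Goal 1 strict priority over Goal 2,
    # and Goal 2 strict priority over Goal 3.
    return max(actions, key=lambda a: goal_scores(a["outcome"]))

# Example: a happiness gain never outweighs a safety loss.
actions = [
    {"name": "risky", "outcome": {"survival": 1, "safety": 0.2, "happiness": 0.9}},
    {"name": "safe",  "outcome": {"survival": 1, "safety": 0.8, "happiness": 0.5}},
]
print(choose(actions)["name"])  # -> safe

Stated this way, the system is built to pursue these things, not merely forbidden from their opposites.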
Will they work? If not, what goals would work better? I’d be interested to see some discussion on this.
UPDATE: Welcome InstaPals! Glenn quips:
We need progress fast, especially as natural intelligence appears to be in diminishing supply.
Scanning the headlines (or, worse yet, surfing channels to see what’s on TV), it would be hard to argue with that assessment. But, astoundingly, there is substantial evidence (the steady rise in measured IQ scores known as the Flynn effect) to suggest that human intelligence is actually increasing. Arnold Kling has some thoughts on the subject, here. I covered it here, too, in a pilot for a show that apparently never got picked up.
Hard as it is to accept that people may be getting smarter, it is of course very good news that we are. We need all the intelligence we can muster if we are to
1) Continue to implement these goals ourselves, and
2) Develop the technology that will eventually take them over.
I guess the trick to finding this increase in human intelligence is knowing where to look. Given the nature of his valuable pundit work, Glenn spends a lot of time following what politicians and the media are up to. Not a lot of gains happening there, sadly.