The question of AI ethics is not merely a matter of estimating the probability of Really Bad Things; it is a matter of weighing the probability of Really Bad Things against the probability of Really Good Things.

This makes some of the stuff in the previous post less relevant (though not completely irrelevant). The major question becomes: in various scenarios, what is the ratio of good to bad?

When thinking about these issues, I find myself going back and forth between two concepts of the future:

- AIs increase in intelligence slowly. For a long time, AIs remain at human level, or at any rate a human-understandable level. There is no miracle of recursive self-improvement; you get out what you put in.
- The first AI smart enough to begin recursive self-improvement quickly goes beyond the human-understandable level of intelligence, and has its way with the future.

I have argued already that the first outcome would be safer than the second; the opportunity for Really Good Things is smaller, but this (I would think) is outweighed by the far lower probability of Really Bad Things. (In scenario 1, sociopathic AIs would still occur, but they would be of roughly human level and so could be contained; by the time AIs were too powerful to contain, the art of creating ethical AIs would be fairly well perfected. Also, even if an unethical AI were created, there would be a safety net of nearly-as-smart AIs to contain it.)

Of course, there isn't freedom to choose between the two outcomes, for the most part. Either recursive self-improvement (aka RSI) is a powerful possibility, or it isn't. To some extent, the outcome depends on whether algorithmic research outpaces hardware developments, or vice versa (as discussed last time). The "race between software and hardware" mindset would have it that the best strategy for avoiding Really Bad Things is to make algorithmic progress as quickly as possible, which favors open research, et cetera. But even if algorithms outpace hardware, there could still be an RSI tipping point: suddenly the hardware is good enough to support RSI, and outcome 2 occurs. So this strategy isn't well justified. What is needed is a solid analysis of RSI.

Matt Mahoney offers an argument that RSI is inherently slow. However, this model is limited by the (purposeful) restriction to strict self-improvement; in particular, the exclusion of learning. Since explosive RSI-like learning capabilities are essentially as dangerous, this isn't a particularly helpful model. Shane Legg makes similar claims for learning. As it happens, I disagree with those claims:

3 days ago Matt Mahoney referenced a paper by Shane Legg, supposedly formally proving this point:

http://www.vetta.org/documents/IDSIA-12-06-1.pdf

I read it, and must say that I disagree with the interpretations provided for the theorems. Specifically, one conclusion is that because programs of high Kolmogorov complexity are required if we want to guarantee the ability to learn sequences of comparably high Kolmogorov complexity, AI needs to be an experimental science. So, Shane Legg is assuming that highly complex programs are difficult to invent. But there is an easy counterexample to this, which also addresses your above point:

Given is T, the amount of computation time the algorithm is given between sensory inputs. Sensory inputs can ideally be thought of as coming in at the rate of 1 bit per T cpu cycles (fitting with the framework in the paper, which has data come in 1 bit at a time), although in practice it would probably come in batches. Each time period T:

- Add the new input to a memory of all data that has come in so far.
- Treat the memory as the output of a computer program in some specific language. Run the program backwards, inferring everything that can be inferred about its structure. A zero or one can only be printed by particular basic print statements. It is impossible to know for certain where conditional statements are, where loops are, and so on, but at least the space of possibilities is well defined (since we know which programming language we've chosen). Every time a choice like this occurs, we split the simulation, so that we will quickly be running a very large number of programs backwards.
- Whenever we get a complete program from this process, we need to run it forwards (again, simulating it in parallel with everything else that is going on). We record what it predicts as the NEXT data, along with the program's length (because we will be treating shorter programs as better models, and trusting what they tell us more strongly than we trust longer programs).
- Because there are so many things going on at once, this will run VERY slowly; however, we will simply terminate the process at time T and take the best prediction we have at that point. (If we haven't gotten any yet, let's just say we predict 0.)

A more sophisticated version of that algorithm was presented at the AGI conference in this paper:

http://www.agiri.org/docs/ComputationalApproximation.pdf

The algorithm will be able to learn any program, if given enough time!

NOW, why did Shane Legg's paper say that such a thing was impossible? Well, in the formalism of the paper, the above "algorithm" cheated: it isn't an algorithm at all! Fun, huh?

The reason is that I parameterized it in terms of that number T. So, technically, it is a class of algorithms; we get a specific algorithm by choosing a T-value. If we choose a very large T-value, the algorithm could be very "complex" in terms of Kolmogorov complexity. However, it will not be complex to humans, since it will just be another instance of the same general algorithm! In fact, it would just be the same algorithm given more time to do its job.

So, on those grounds, I disagree with Shane Legg: it is still possible to find algorithms analytically, despite the found algorithms being "of high complexity". Or, to rephrase: there are simple algorithms of high Kolmogorov complexity, and those algorithms do the job that can't be done by algorithms of low Kolmogorov complexity.
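The enumeration described above is essentially a time-bounded search over models, weighted by length. A toy sketch of the idea, with the hypothesis space drastically restricted to periodic bit patterns purely for illustration (the function name and the `max_len` budget are mine, not part of the quoted algorithm):

```python
from itertools import product

def predict_next(history, max_len=8):
    """Predict the next bit by shortest-consistent-model search.

    A toy stand-in for the enumeration in the email above: instead of
    all programs, the hypothesis space is just periodic bit patterns,
    with shorter patterns treated as simpler (better) models.  If no
    pattern of length <= max_len fits, fall back to predicting 0,
    mirroring the email's "predict 0" default.
    """
    n = len(history)
    for length in range(1, max_len + 1):            # shortest models first
        for pattern in product([0, 1], repeat=length):
            generated = [pattern[i % length] for i in range(n + 1)]
            if generated[:n] == list(history):
                return generated[n]                 # best model's prediction
    return 0                                        # budget exhausted
```

For example, `predict_next([0, 1, 0, 1, 0])` returns 1, since the two-bit pattern 01 is the shortest consistent model. The actual proposal searches arbitrary programs under a time budget T, which is exactly what makes it a class of algorithms parameterized by T rather than a single algorithm.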

(As an aside, the next paragraph in my email uses a mathematically flawed argument: "Once we have the ordering, I can define the game as: create the lowest video sequence not definable in set theory." What I was trying to do was diagonalize set theory, to define a set-theoretically undefinable video sequence. The right way to do this is not to take the "lowest" sequence not definable by set theory. Instead: put an ordering on set-theoretical definitions of sequences. For the Nth frame of our non-definable video sequence, look at the Nth set-theoretic video sequence; if the screen is all white in that sequence, take our sequence to be all black on that frame; otherwise, take it to be all white. :D)
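In symbols, the repaired diagonal construction reads (notation mine):

```latex
% Let s_1, s_2, s_3, \dots enumerate the set-theoretically definable
% video sequences. Define the diagonal sequence d frame by frame:
d(n) =
  \begin{cases}
    \text{all black} & \text{if frame } n \text{ of } s_n \text{ is all white,} \\
    \text{all white} & \text{otherwise.}
  \end{cases}
% Then d differs from every s_n at frame n, so d is not definable
% in set theory.
```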

Still, all I'm claiming is that we can construct an algorithm that will perform better if it is given more time. That provides a sort of counterexample, but not one that is explosive (until the point where an AI gets the ability to improve its own hardware).

[Edit: In the comments, Shane Legg points out that this doesn't really provide a counterexample. We can construct an algorithm that will do better given more time, but that does not mean that for every sequence we might want to learn, there is some amount of processing time that will allow the algorithm to converge correctly. In fact, that's false: there will be sequences on which the algorithm fails to converge, no matter how much fixed time we allow it. Shane also corrects me on the conclusions he draws: he does not, in fact, conclude that "AI must be an experimental science".]

Eliezer makes a vivid argument for the possibility of explosively fast learning. Here is the key mathematical bit:

I occasionally run into people who say something like, "There's a theoretical limit on how much you can deduce about the outside world, given a finite amount of sensory data."

Yes. There is. The theoretical limit is that every time you see 1 additional bit, it cannot be expected to eliminate more than half of the remaining hypotheses (half the remaining probability mass, rather). And that a redundant message cannot convey more information than the compressed version of itself. Nor can a bit convey any information about a quantity with which it has correlation exactly zero, across the probable worlds you imagine.

But nothing I've depicted this human civilization doing even begins to approach the theoretical limits set by the formalism of Solomonoff induction. It doesn't approach the picture you could get if you could search through every single computable hypothesis, weighted by their simplicity, and do Bayesian updates on all of them.
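The half-the-mass limit is exact for a uniform prior over hypotheses. A minimal sketch in code (the tiny three-bit hypothesis space is my own choice, purely to make the arithmetic visible; real Solomonoff induction weights all computable hypotheses by simplicity):

```python
from fractions import Fraction

# Hypothetical setup: hypotheses are the 8 possible 3-bit sequences,
# each assigned a uniform prior probability of 1/8.
hypotheses = {f"{i:03b}": Fraction(1, 8) for i in range(8)}

def update(prior, position, observed_bit):
    """Bayes-update on the observation 'bit at `position` is observed_bit'."""
    consistent = {h: p for h, p in prior.items()
                  if h[position] == str(observed_bit)}
    total = sum(consistent.values())
    return {h: p / total for h, p in consistent.items()}

post = update(hypotheses, 0, 1)
# One observed bit eliminated exactly half of the uniform prior's mass.
assert len(post) == 4 and sum(post.values()) == 1
```

With a non-uniform prior the observed bit can eliminate less than half the mass, but never more in expectation, which is the limit the quote states.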

He does, however, admit reservations about the possibility of this power being manifested in the physical universe:

No one - not even a Bayesian superintelligence - will ever come remotely close to making efficient use of their sensory information...

...is what I would like to say, but I don't trust my ability to set limits on the abilities of Bayesian superintelligences.

(Though I'd bet money on it, if there were some way to judge the bet. Just not at very extreme odds.)

But, of course, an entity (say) halfway between humans and a Bayesian Superintelligence is still rather dangerous.

> one conclusion is that because programs of high Kolmogorov complexity are required if we want to guarantee the ability to learn sequences of comparably high Kolmogorov complexity, AI needs to be an experimental science.

Um, no, I don't say that. If you experimentally generate an algorithm, what guarantee would you have that it worked?

> So, Shane Legg is assuming that highly complex programs are difficult to invent.

Um, no again. High complexity is easy to generate; just flip a coin many times and you're pretty much guaranteed to get a string with complexity roughly equal to its length.
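A brief gloss on why coin flipping works (my framing, not part of the comment): a simple counting argument shows that almost all strings are incompressible.

```python
from fractions import Fraction

def fraction_compressible(n: int, k: int) -> Fraction:
    """Upper bound on the fraction of n-bit strings x with K(x) < n - k.

    There are only 2**(n - k) - 1 programs shorter than n - k bits, so
    at most that many n-bit strings can be described that briefly;
    dividing by the 2**n strings of length n gives a bound under 2**-k.
    """
    return Fraction(2 ** (n - k) - 1, 2 ** n)

# Almost no 100-bit strings can be compressed by even 10 bits, so a
# coin-flipped string almost surely has complexity close to its length.
assert fraction_compressible(100, 10) < Fraction(1, 2 ** 10)
```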

> The algorithm will be able to learn any program, if given enough time!

Wrong again. Hint: consider computable sequences where the computation time is a function of the length of the input. I'll let you work it out from there. ;-)

Shane,

Thanks for the comments.

What I intended was "Highly complex *correct* programs are difficult to invent"... or, more specifically, highly complex programs that are usefully complex. In that case flipping a coin will not work very well. (But, giving an AIXI approximation like AIXItl an amount of time that is specified by, say, the nth place in the busy beaver function will.)

Of course, I am not claiming that giving a program astronomical amounts of time is practical... just that writing algorithms that can be given longer and shorter amounts of time is practical (and typical), and that if we allow that then the theorem in the paper no longer applies (because we're using a different definition of "algorithm"). But of course you are saying that this, also, is wrong...

"Hint: consider computable sequences where the computation time is a function of the length of the input. I'll let you work it out from there. ;-)"

Hmm...

Good point. 8\

OK, so the statement still holds for my wider class of "algorithms". Sorry for the fallacious objection.

"> one conclusion is that because programs of high Kolmogorov complexity are required if we want to guarantee the ability to learn sequences of comparably high Kolmogorov complexity, AI needs to be an experimental science.

Um, no, I don't say that. If you experimentally generate an algorithm, what guarantee would you have that it worked?"

In that case, sorry for interpreting your intentions badly. (On re-reading my email I am surprised by how strongly I stated my interpretation of the paper... I definitely made it sound as if you had explicitly stated this conclusion. My apologies!) Your theoretical work shows the inherent weakness of provable algorithms (as illustrated in figure 1). I was taking this to mean that you favored not-provable algorithms (ie, algorithms that perform well in practice but are opaque to formal analysis-- experimentally validated algorithms).

What *is* your position, then? Do you think instead that we *are* confined to the region of relatively weak, but provable, algorithms?

My views on this are given in Complexity and Intelligence.

> What *is* your position, then? Do you think instead that we *are* confined to the region of relatively weak, but provable, algorithms?

It's sort of difficult to say, because as soon as you go from the math in my paper to words to somebody else's mind, precise definitions turn into intuitions, which then turn into objections that don't make sense in terms of the original definitions! Take the link EY just provided and the surrounding debate on various lists. Some people read my paper, thought it to mean things it didn't say, started promoting these ideas as gospel, and then others attacked those positions assuming that they were positions I hold. Well, I guess that's the internet for you.

Anyhow. If you're a theory kind of a guy I'd say that this is an interesting result. It takes Goedel incompleteness from the domain of strange theorems about themselves and random numbers etc., into the realm of learning and prediction. My sense is that more interesting results in this general direction remain to be discovered.

If you're a practical guy interested in building an AGI of roughly human level intelligence, don't worry about the result. As I show in the paper, as soon as you have computable bounds on the computation time as a function of input length everything essentially becomes finite at each time step and the theoretical problems go away.

> OK, so the statement still holds for my wider class of "algorithms". Sorry for the fallacious objection.

Wrong again. Just think about it a while...

Sorry, I think I misunderstood this when I read through it again. If by "the statement" you mean my results, then the answer is yes: it does apply to what you describe.

Yep, that's what I meant.
