World's most popular travel blog for travel bloggers.

Clear, intuitive derivation of the fixed-point combinator (Y combinator)?

, , No Comments
Problem Detail: 

The fixed-point combinator FIX (aka the Y combinator) in the (untyped) lambda calculus ($\lambda$) is defined as:

FIX $\triangleq \lambda f.(\lambda x. f~(\lambda y. x~x~y))~(\lambda x. f~(\lambda y. x~x~y))$

I understand its purpose and I can trace the execution of its application perfectly fine; I would like to understand how to derive FIX from first principles.

Here is as far as I get when I try to derive it myself:

  1. FIX is a function: FIX $\triangleq \lambda_\ldots$
  2. FIX takes another function, $f$, to make it recursive: FIX $\triangleq \lambda f._\ldots$
  3. The first argument of the function $f$ is the "name" of the function, used where a recursive application is intended. Therefore, all appearances of the first argument to $f$ should be replaced by a function, and this function should expect the rest of the arguments of $f$ (let's just assume $f$ takes one argument): FIX $\triangleq \lambda f._\ldots f~(\lambda y. _\ldots y)$

This is where I do not know how to "take a step" in my reasoning. The small ellipses indicate where my FIX is missing something (although I am only able to know that by comparing it to the "real" FIX).

I already have read Types and Programming Languages, which does not attempt to derive it directly, and instead refers the reader to The Little Schemer for a derivation. I have read that, too, and its "derivation" was not so helpful. Moreover, it is less of a direct derivation and more of a use of a very specific example and an ad-hoc attempt to write a suitable recursive function in $\lambda$.

Asked By : BlueBomber

Answered By : Petr Pudlák

I haven't read this anywhere, but this is how I believe $Y$ could have been derived:

Let's have a recursive function $f$, perhaps a factorial or anything else like that. Informally, we define $f$ as pseudo-lambda term where $f$ occurs in its own definition:

$$f = \ldots f \ldots f \ldots $$

First, we realize that the recursive call can be factored out as a parameter:

$$f = \underbrace{(\lambda r . (\ldots r \ldots r \ldots))}_{M} f$$

Now we could define $f$ if we only had a way how to pass it as an argument to itself. This is not possible, of course, because we don't have $f$ at hand. What we have at hand is $M$. Since $M$ contains everything we need to define $f$, we can try to pass $M$ as the argument instead of $f$ and try to reconstruct $f$ from it later inside. Our first attempt looks like this:

$$f = \underbrace{(\lambda r . (\ldots r \ldots r \ldots))}_{M} \underbrace{(\lambda r . (\ldots r \ldots r \ldots))}_{M}$$

However, this is not completely correct. Before, $f$ got substituted for $r$ inside $M$. But now we pass $M$ instead. We have to somehow fix all places where we use $r$ so that they reconstruct $f$ from $M$. Actually, this not difficult at all: Now that we know that $f = M M$, everywhere we use $r$ we simply replace it by $(r r)$.

$$f = \underbrace{(\lambda r . (\ldots (rr) \ldots (rr) \ldots))}_{M'} \underbrace{(\lambda r . (\ldots (rr) \ldots (rr) \ldots))}_{M'}$$

This solution is good, but we had to alter $M$ inside. This is not very convenient. We can do this more elegantly without having to modify $M$ by introducing another $\lambda$ that sends to $M$ its argument applied to itself: By expressing $M'$ as $\lambda x.M(xx)$ we get

$$f = (\lambda x.\underbrace{(\lambda r . (\ldots r \ldots r \ldots))}_{M}(xx)) (\lambda x.\underbrace{(\lambda r . (\ldots r \ldots r \ldots))}_{M}(xx))$$

This way, when $M$ is substituted for $x$, $MM$ is substituted for $r$, which is by the definition equal to $f$. This gives us a non-recursive definition of $f$, expressed as a valid lambda term!

The transition to $Y$ is now easy. We can take an arbitrary lambda term instead of $M$ and perform this procedure on it. So we can factor $M$ out and define

$$Y = \lambda m . (\lambda x. m(xx)) (\lambda x.m(xx))$$

Indeed, $Y M$ reduces to $f$ as we defined it.


Note: I've derived $Y$ as it is defined in literature. The combinator you've described is a variant of $Y$ for call-by-value languages, sometimes also called $Z$. See this Wikipedia article.

Best Answer from StackOverflow

Question Source : http://cs.stackexchange.com/questions/9604

0 comments:

Post a Comment

Let us know your responses and feedback