Archive for January, 2006

Oh, wherefore art Y…

One of the most profound ideas in lambda calculus, is the Y-operator.
I’ve learned the Y-operator at least three times, and every time, I found
it extremely hard to understand. This blog is therefore an example of a
quixotic undertaking: I want to see whether it is possible for me to
explain the Y-operator so that you, gentle reader, can understand it
with minimal background. In this case, “minimal background” means
at least a few years of college level mathematics.

Background: Why Y?

The Y-operator is part of lambda calculus, a field of logic that
appeals to people who just couldn’t get enough of solving equations in
high-school mathematics. Lambda calculus is named after the “lambda”
operator, which stands for an “anonymous function”. It can be used to
syntactically (that is, mechanically) transform expressions. A short
example: “(lambda (x) (* x x))” is a method that takes one
argument (called x) and returns the square of the argument.
Here is how you use it:

((lambda (x) (* x x)) 8) ;; apply the abovementioned function to 8
=>
(* 8 8)                   ;; MECHANICALLY substiture 8 for x
=>
64                         ;; Whee!

The Y-operator is used to create recursive functions
using lambdas. That is, functions that call themselves. That is
its purpose. Y is defined as follows:

(Y F) = (F (Y F))

For this to work, F better be a function. Read it as follows:
“Y of a function F equals F applied to Y of F”.

An example: Factorial

As I said: This allows us to define recursive stuff using just
lambdas. To demonstrate, I will need a volunteer recursive function.
You over there: Factorial! could you step over here, please?

The factorial function is defined thus: The factorial of a number
n is the product of all natural numbers up to n. A way of
writing this in a fairly conventional syntax is:
fact(n) = (n == 1) ? 1 : (n * fact(n - 1)). Now, lets
do it without function definitions. Lets say we want to
find the factorial of 8. In lambda calculus we can write:

((Y (lambda (f)
           (lambda (x)
                   (if (= x 1) 1 (* x (f (-1 x)))))))
     8)

Holy lambda, Batman! That didn’t help much, did it? Well: here’s
the good thing: You don’t have to understand it. Using
what we learned about lambda, we can just blindly calculate it.
Lets see how it works.

Doing the math

Calculating the factorial for 8 will be a really long
exercise, so let’s settle for 2 instead. Here we go:

((Y (lambda (f)
           (lambda (x)
                   (if (= x 1) 1 (* x (f (-1 x)))))))
     2)

Substitute the definition of (Y F) = (F (Y F)):

(((lambda (f)
           (lambda (x)
                   (if (= x 1) 1 (* x (f (-1 x))))))
   (Y (lambda (f)
           (lambda (x)
                   (if (= x 1) 1 (* x (f (-1 x))))))))
     2)

Whoa, those parenteses are dizzying. Well, we can substitute for the
f in the first lambda:

((lambda (x)
         (if (= x 1)
             1
             (* x ((Y (lambda (f)
                               (lambda (x)
                                       (if (= x 1) 1 (* x (f (-1 x)))))))
                    (-1 x)))))

 2)

Keep substituting for the outer lambda at all times. This time x = 2 (only for the outer lambda)

(if (= 2 1)
     1
     (* 2 ((Y (lambda (f)
                       (lambda (x)
                               (if (= x 1) 1 (* x (f (-1 x)))))))
            (-1 2))))

We now know the answer to the if (and I’ll expand (-1 2) => 1 as well):

(* 2 ((Y (lambda (f)
                  (lambda (x)
                          (if (= x 1) 1 (* x (f (-1 x)))))))
       1))

We need to do another of those crazy Y expansions. It is not hard, just exchange the f
(I’ll skip one step this time):

(* 2 ((lambda (x)
               (if (= x 1)
                   1
                   (* x ((Y (lambda (f)
                                     (lambda (x)
                                             (if (= x 1) 1 (* x (f (-1 x)))))))
                         (-1 x)))))
       1))

Substitute the 1 for x in the outermost lambda:

(* 2 (if (= 1 1)
          1
          (* 1 ((Y (lambda (f)
                            (lambda (x)
                                    (if (= x 1) 1 (* x (f (-1 x)))))))
                (-1 x)))))

If we had started with a larger number, we would’ve gotten a string of
(* 4 (* 3 (* 2…. built up from these transformations. But we’re
almost done: The if is true this time, so we can substitute:

(* 2 1)

The whole Y and lambda mess just evaporated. w00t! And the answer is
of course 2 which is correct.

What happened?

What the magic (Y F) = (F (Y F)) does is to let us embed a
function within itself an inifinitely number of times. So, for factorial:

(Y fact) = (fact (fact (fact (fact (fact (Y fact))))))

In this case fact is defined as (pay attention now): A function that
takes a function as an argument and returns another function. The returned
function takes a natural number as an argument, and if the number is one, it
returns it, otherwise: it calls the function that was given as argument to the
first function
. Y keeps injecting fact into itself, so it can always go
deeper in the recursive call. But as soon as the number reaches 1, we don’t
have to go any further
, and the “final” (Y fact) collapses into
nothingness.

Y generates as many function applications as we need, so it will let us
create recursiveness, without naming a single function. (Y F) is then
the value of F when F is applied to itself an infinitely number of times.
This is called the fixed point for the function F.

Conclusion: Why Y?

The Y operator lets us use purely syntactic (mechanical) steps
to evaluate recursive functions. It gives the mathematical construct of
lambda calculus as much computing power as a programming language, without
needing to define names for functions. It is a mathematical marvel.

Can it be used for anything? I don’t really think so. But just as
Y expands a lambda expression, it can expand your mind. I hope that
you feel a little smarter going away from this blog than entering it.

You can read more about the Y-operator at wikipedia.
Of course.

And if you want to go for the whole nine yards, check out the Structure and Interpretation of Computer Programs online video
lectures! 20 hours of Lisp!

Postscript

Did you manage to read through the whole article? Did you understand it?
Let me know what you think: Write a comment, or send me an
email.

Comments (1)

What is Software Architecture

I have been working as a Software Architect for several years now, but I still find myself unable to answer the question “what is software architecture?” However, I think I can point to some of the factors that can make the architectural work successful.

First: Architecture is about vision, communication and governance. The vision bit is relatively simple: Any company has huge inefficiencies in how it operates. I think the very nature of business is such that these inefficiencies have to exist. The architect(s) job is to find out how to reduce those efficiencies through standardization, technology and automation. That is the easy part. You have to verify continuously that your vision is more or less correct. Then you have to convince others of your vision. If your vision makes sense, that is also relatively easy.

Then comes the hard part: What should people do? The tricky part is to take a vision and then make it actionable, if you will — to transform it into something that people can relate to. At the end of the day, someone has to be able to say “no, we’re not complying with the architecture”. If there is no way of defining non-compliance, there is definately no way of defining compliance, and everything is just handwaving and empty noises.

This means that as an architect, you have to be extremely aware of communication issues. This is important at least at two levels: If you cannot communicate what you mean, there is no way anyone can follow your lead, even if they want to. Second, you have to be humble: Always remember that you are basically powerless. If the developers on your team doesn’t want to implement your idea, it will not be implementated. The architecture is not what you create — it is what ends up in the code: Code is king!

At the end of the day — your final job is not to force anyone to do something, but to describe how it is done. If you can collaborate with others so your description comes fast enough, who knows: You can even change things.

What frustrates me, then, is trying to understand how to do it. How do you attack technical issues in a productive way and how do you communicate and collaborate. But literature on architecture seems to be … well, dumb! Reading SEI: “Documenting Software Architecture“, the authors tell you how you should divide your documentation into views, how views consists of primary presentations, supporting documentation, view templates and whatnot. And almost all literature on architecture is like this: Obsessed with structure, and forgetting content and communication.

Your architecture will not succeed if you write nice, view-oriented architecture documents. As a matter of fact, I think they can be directly detrimental to your success. Are the developers going to read them? Can you really blame them?

The only thing I have found that works is to create a few “architectural cartoons” and bring them to people. Use the architecture documents (mainly the “cartoons”) as basis for two-way discussion about principles and implentation. Pair program with people and try to point out the essensial concepts in the code (and use the chance to learn what is really going on — remember Code is king). Describe what the future may look like, but let people take their time.

In the end: Architecture is slow as molasses. It takes forever to get where you want to go, so all you can do is to ensure that the small steps go more or less in the right direction. And when you get close to your goal, you probably find that it’s not right, after all, the code has changed the landscape. Code is king. I guess that makes those who work with code all the time kings, too. And, fellow architect, I think we have a lot to learn from court jesters.

Comments (1)

Creative Commons Attribution 3.0 Unported
Creative Commons Attribution 3.0 Unported