Menu

- Chapter 1 Distilled
- Introduction
- 1.1 - The Elements Of Programming
- Expressions
- Naming and the Environment
- Evaluating Combinations
- Defining New Functions
- The Substitution Model
- Exercises
- Predicates
- Conditional Expressions
- Example: Newton’s Method
- Functions as Black-Box Abstractions
- 1.2 - Procedures and the Processes They Generate
- Linear Recursion and Iteration
- Tree Recursion
- Orders of Growth
- Exponentiation
- Example: Testing For Primality
- 1.3 Higher Order Functions
- Project - Blackjack

- Chapter 2 Distilled
- Introduction
- Data Abstraction
- Everything From Nothing
- Abstractions In Clojure
- Clojure's Data Structures
- Data Abstraction, Revisited
- Escher
- Project - Escher
- Symbolic Data
- Representing Sets
- Huffman Encoding Trees
- Zippers

We now come to the decisive step of mathematical abstraction: we forget about what the symbols stand for. . . .[The mathematician] need not be idle; there are many operations which he may carry out with these symbols, without ever having to look at the things they stand for. Hermann Weyl, The Mathematical Way of Thinking

We concentrated in Chapter 1 on computational processes and on the
role of functions in program design. We saw how to use primitive data
(numbers) and primitive operations (arithmetic operations), how to
combine functions to create our own functions through composition,
conditionals, and the use of parameters, and how to abstract functions
by using `defn`

. We saw that a function can be regarded as a pattern
for the local evolution of a process, and we classified, reasoned
about, and performed simple algorithmic analyses of some common
patterns for processes as embodied in functions. We also saw that
higher-order functions enhance the power of our language by enabling
us to manipulate, and thereby to reason in terms of, general methods
of computation. This is much of the essence of programming.

In this chapter we are going to look at more complex data. All the
functions in chapter 1 operate on simple numerical data, and simple
data are not sufficient for many of the problems we wish to address
using computation. Programs are typically designed to model complex
phenomena, and more often than not one must construct computational
objects that have several parts in order to model real-world phenomena
that have several aspects. Thus, whereas our focus in chapter 1 was on
building abstractions by combining functions and creating our own, we
turn in this chapter to another key aspect of any programming
language: the means it provides for building abstractions by combining
data objects to form *compound data*.

Why do we want compound data in a programming language? For the same reasons that we want to create our own functions: to elevate the conceptual level at which we can design our programs, to increase the modularity of our designs, and to enhance the expressive power of our language. Just as the ability to create functions enables us to deal with processes at a higher conceptual level than that of the primitive operations of the language, the ability to construct compound data objects enables us to deal with data at a higher conceptual level than that of the primitive data objects of the language.

Consider the task of designing a system to perform arithmetic with
rational numbers. We could imagine an operation `add-rat`

that takes
two rational numbers and produces their sum. In terms of simple data,
a rational number can be thought of as two integers: a numerator and a
denominator. Thus, we could design a program in which each rational
number would be represented by two integers (a numerator and a
denominator) and where `add-rat`

would be implemented by two functions
(one producing the numerator of the sum and one producing the
denominator). But this would be awkward, because we would then need to
explicitly keep track of which numerators corresponded to which
denominators. In a system intended to perform many operations on many
rational numbers, such bookkeeping details would clutter the programs
substantially, to say nothing of what they would do to our minds. It
would be much better if we could “glue together” a numerator and
denominator to form a pair, a *compound data object*, that our
programs could manipulate in a way that would be consistent with
regarding a rational number as a single conceptual unit.

The use of compound data also enables us to increase the modularity of
our programs. If we can manipulate rational numbers directly as
objects in their own right, then we can separate the part of our
program that deals with rational numbers per se from the details of
how rational numbers may be represented as pairs of integers. The
general technique of isolating the parts of a program that deal with
how data objects are represented from the parts of a program that deal
with how data objects are used is a powerful design methodology called
*data abstraction*. We will see how data abstraction makes programs
much easier to design, maintain, and modify.

The use of compound data leads to a real increase in the expressive power of our programming language. Consider the idea of forming a “linear combination” $ax + by$. We might like to write a function that would accept $a$, $b$, $x$, and $y$ as arguments and return the value of $ax +by$. This presents no difficulty if the arguments are to be numbers, because we can readily define the function

```
(defn linear-combination [a b x y]
(+ (* a x) (* b y)))
```

But suppose we are not concerned only with numbers. Suppose we would like to express, in procedural terms, the idea that one can form linear combinations whenever addition and multiplication are defined - for rational numbers, complex numbers, polynomials, or whatever. We could express this as a function of the form

```
(defn linear-combination [a b x y]
(add (mul a x) (mul b y)))
```

where `add`

and `mul`

are not the primitive functions `+`

and `*`

but
rather more complex things that will perform the appropriate
operations for whatever kinds of data we pass in as the arguments `a`

, `b`

,`x`

, and `y`

. The key point is that the only thing
`linear-combination`

should need to know about `a`

, `b`

, `x`

, and
`y`

is that the functions `add`

and `mul`

will perform the appropriate
manipulations. From the perspective of the function
`linear-combination`

, it is irrelevant what `a`

, `b`

, `x`

, and `y`

are and even more irrelevant how they might happen to be represented
in terms of more primitive data. This same example shows why it is
important that our programming language provide the ability to
manipulate compound objects directly: Without this, there is no way
for a function such as `linear-combination`

to pass its arguments
along to `add`

and `mul`

without having to know their detailed
structure.

We begin this chapter by implementing the rational-number arithmetic
system mentioned above, using the same primitives as scheme - `cons`

,
`car`

and `cdr`

, but then leave them in favour of using Clojure's
datatypes.

As with user-defined functions, the main issue to be addressed is that
of abstraction as a technique for coping with complexity, and we will
see how data abstraction enables us to erect suitable *abstraction
barriers* between different parts of a program.

We will see that the key to forming compound data is that a
programming language should provide some kind of “glue” so that data
objects can be combined to form more complex data objects. There are
many possible kinds of glue. Indeed, we will discover how to form
compound data using no special “data” operations at all, only
functions. This will further blur the distinction between “function”
and “data,”. We will also explore some conventional techniques for
representing trees. One key idea in dealing with compound data is the
notion of *closure* —that the glue we use for combining data objects
should allow us to combine not only primitive data objects, but
compound data objects as well. Another key idea is that compound data
objects can serve as *conventional interfaces* for combining program
modules in mix-and-match ways. We illustrate some of these ideas by
presenting a simple graphics language that exploits closure.

We will then augment the representational power of our language by
introducing *symbolic expressions*, data whose elementary parts can be
arbitrary symbols rather than only numbers. We explore various
alternatives for representing sets of objects. We will find that, just
as a given numerical function can be computed by many different
computational processes, there are many ways in which a given data
structure can be represented in terms of simpler objects, and the
choice of representation can have significant impact on the time and
space requirements of processes that manipulate the data. We will
investigate these ideas in the context of the representation of sets
and the encoding of information.

Next we will take up the problem of working with data that may be
represented differently by different parts of a program. This leads to
the need to implement *generic operations*, which must handle many
different types of data. Maintaining modularity in the presence of
generic operations requires more powerful abstraction barriers than
can be erected with simple data abstraction alone. In particular, we
introduce data directed programming as a technique that allows
individual data representations to be designed in isolation and then
combined *additively* (i.e., without modification).

TODO: Edit last few sentences