Creative Byte-ing

I would like to begin this essay with a few thought-provoking comments on the slippery term, "creativity". At its most literal, of course, it can be used in the biblical context; i.e. to bring something from nothing. However, if we accept that mortal creativity is more limited than that of omnipotent beings, then we need to provide a better explanation - one that defines where our creative output really comes from. Many people have attempted to identify precisely what human creativity is, but most have used vague and essentially meaningless phrases (and this essay will probably be no different, unfortunately!).

Creativity is the ability to bring something into existence

- Concise Oxford Dictionary

The moment of truth, the sudden emergence of a new insight, is an act of intuition. Such intuitions give the appearance of miraculous flashes, or short-circuits of reasoning. In fact they may be likened to an immersed chain, of which only the beginning and the end are visible above the surface of consciousness. The diver vanishes at one end of the chain and comes up at the other end, guided by invisible links.

- Arthur Koestler

Creativity is a fortuitous by-product of language.

- Euan MacPhail

It is obvious that invention or discovery, be it in mathematics or anywhere else, takes place by combining ideas.

- Hadamard

Countess Ada Lovelace was a close friend of Charles Babbage. She was very interested in programmable computing, and made the following statement about Babbage's Analytical Engine (the forerunner of today's digital computers):

[Although in theory it is able to] compose elaborate and scientific pieces of music of any degree of complexity or extent ... the Analytical Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform.

- Ada Lovelace

Lovelace was in essence quite correct when she made this comment. Since computers are fully deterministic devices (even their random numbers are only pseudo-random and hence deterministic), they must do precisely what they are programmed to do (barring hardware faults). To program a computer to play Chess, for instance, the coder programs the rules of the game regarding how pieces move and how the game ends, as well as methods for searching the possible combinations of moves to arrive at the best outcome. To make the game run faster, the coder may also program a set of heuristics which state in clear-cut rules how the search space can be pruned. The question thus arises: is the Chess program creative at all? Most people would say no without hesitation. It is only following some reasonably simple rules which have been explicitly given to it by the coder.
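To make this kind of rule-following search concrete, here is a minimal sketch in Python. It does not play Chess. As an assumption made purely for brevity, it plays a much smaller game: a pile of stones from which each player removes one, two or three, where whoever takes the last stone wins. The name `negamax` and the scoring convention are illustrative only. The ingredients are the ones described above: the rules of the game, a search through the possible moves, and one clear-cut pruning rule (alpha-beta) that skips moves which cannot change the outcome.

```python
from typing import Optional, Tuple


def negamax(stones: int, alpha: int = -1, beta: int = 1) -> Tuple[int, Optional[int]]:
    """Search the toy game exhaustively for the player about to move.

    Returns (score, move): score is +1 for a forced win and -1 for a forced
    loss; move is how many stones to take (None if the game is already over).
    """
    if stones == 0:
        # The opponent took the last stone, so the player to move has lost.
        return -1, None

    best_score, best_move = -2, None
    for take in (1, 2, 3):  # the rules: remove one, two or three stones
        if take > stones:
            break
        # The opponent's best result, negated, is our result for this move.
        score = -negamax(stones - take, -beta, -alpha)[0]
        if score > best_score:
            best_score, best_move = score, take
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # pruning rule: no remaining move can do better than this
    return best_score, best_move


if __name__ == "__main__":
    for pile in (4, 5, 7):
        print(pile, negamax(pile))
```

Nothing here is left to chance: given the same pile, the function returns the same verdict and the same move on every run, which is exactly the sense in which such a program "does what it is told".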

However, such programs have been used to find novel solutions to end-games in Chess, and have even been used to prove the superiority of some combinations of pieces over others in the end-game. One such example is KBR vs KNN. After playing this (tedious) end-game for hundreds of moves, a computer announced that KBR was a winning combination. This solution would probably never have been discovered by human players because it is too complex. The paradox is this: any human who came up with the solution would probably be called creative, and yet most people refuse to call the program creative because it was simply following what it was told to do. I came across an interesting comment about this particular case in a newsgroup on the Internet:

Calling that "creative" is like calling a bulldozer "muscly".

- Tomoyuki Tanaka

Let me give another example of such a double standard. In Book I of his Elements, Euclid gave a proof that the two base angles of an isosceles triangle are equal. It was a fairly complicated proof, and involved much construction of extra lines and so on. Today, a much simpler proof is used to demonstrate the theorem, and this is the one of which most people are probably aware.

The commonly cited proof

Consider triangle ABC.

Construct line AD such that AD bisects angle BAC.

Angle BAD = angle DAC (by construction).

AB = AC (given).

AD = DA (common).

Thus, the two triangles BAD and DAC are congruent (SAS).

Thus, angle ABD = angle ACD.

Q.E.D.

However, around six centuries after Euclid's books, Pappus came up with an even simpler proof than the one above, because it involved no construction.

Pappus' proof

Consider triangles ABC and ACB.

Angle BAC = angle CAB (common).

AB = AC (given).

AC = AB (given).

Thus, the two triangles ABC and ACB are congruent (SAS).

Thus, angle ABC = angle ACB.

Q.E.D.

This proof is appealing because it is simpler than any other proof (so far) and has an element of novelty: it treats the same triangle as two distinct triangles. To see this possibility requires a subtle shift in the way we think about these planar objects. Almost certainly, a high-school student who came up with such a proof would be considered creative.

In the 1950s a geometry-theorem proving program was developed and asked to prove the base angles theorem. It came up with Pappus' proof. Yet this is not considered creative by many people because, as with the Chess example, it was simply the product of a search algorithm. The program started with the theorem it wished to prove and progressively broke it down into simpler theorems, in the process widening its search space, until it arrived at other theorems it already knew, or the axioms of geometry. Though the program was able to construct new lines (as is needed in the commonly cited proof above), it was told not to do this until it had exhausted all possible avenues without construction. It is little wonder then that it came up with Pappus' proof. The program had no shift in logic that suddenly saw the "two" triangles as distinct; that view was just one of the possibilities to be explored in its search space.
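The style of search described here, working backwards from the goal to things already known, can be sketched in a few lines of Python. This is not the 1950s program: the rule base below is hypothetical and hand-written, the statement names (such as `congruent_ABC_ACB`) are invented labels, and treating a construction as something simply "known" is a simplifying assumption. The only point it illustrates is that listing construction-free alternatives first is enough to make Pappus' proof the one that is found.

```python
from typing import Dict, FrozenSet, List, Optional, Set

# Hypothetical rule base: each goal maps to alternative lists of subgoals.
# Alternatives are tried in order, so construction-free ones come first.
RULES: Dict[str, List[List[str]]] = {
    "angle_ABC_eq_angle_ACB": [
        ["congruent_ABC_ACB"],  # Pappus-style: no new lines needed
        ["congruent_BAD_CAD"],  # commonly cited proof: needs bisector AD
    ],
    "congruent_ABC_ACB": [["AB_eq_AC", "angle_BAC_eq_angle_CAB", "AC_eq_AB"]],
    "congruent_BAD_CAD": [["construct_bisector_AD", "AB_eq_AC", "AD_eq_AD"]],
}

# Statements accepted without further proof (givens, axioms, and, as a
# simplification, the permitted construction).
KNOWN: Set[str] = {
    "AB_eq_AC", "AC_eq_AB", "angle_BAC_eq_angle_CAB",
    "AD_eq_AD", "construct_bisector_AD",
}


def prove(goal: str, seen: FrozenSet[str] = frozenset()) -> Optional[List[str]]:
    """Return the statements used, in order, or None if no proof is found."""
    if goal in KNOWN:
        return [goal]
    if goal in seen:  # avoid circular reasoning
        return None
    for subgoals in RULES.get(goal, []):
        steps: List[str] = []
        for sub in subgoals:
            sub_proof = prove(sub, seen | {goal})
            if sub_proof is None:
                break  # this alternative fails; try the next one
            steps.extend(sub_proof)
        else:
            return steps + [goal]
    return None


if __name__ == "__main__":
    print(prove("angle_ABC_eq_angle_ACB"))
```

The bisector is never constructed, not because the sketch has any insight into triangles, but because the first alternative it was told to try already succeeds.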

I would still argue however that the program is creative, at least in some sense. It cannot be denied that it provided an elegant and appealing proof that most people find creative. It was even unexpected by the programmer!

There are those who say, however, that to be regarded as creative one must break the rules (or at least bend them) to arrive at truly novel responses, and this is obviously something the geometry-theorem prover would never, and could never, do. If, when asked to prove the theorem, it instead wrote some poetry on wandering clouds, they argue, one could then conceivably call it "creative" (or buggy).

AARON is a computer program that creates drawings and paintings. Some versions of AARON produce landscapes, while others produce acrobats and other human figures. AARON began in around 1980 by creating abstract pictures in a similar style to its programmer, Harold Cohen. Many of these were exhibited in modern art galleries around America. However, unsatisfied with the abstractness of the drawings, Cohen added more complex knowledge to the system (in the form of AI frames) regarding the human form. It was then able to produce infinitely varied pictures of human forms, though all had a consistent style. One had only to ask AARON to draw a picture of three acrobats and a ball, and an absolutely unique picture would be produced. The picture Liberty and Friends was produced around this time, and was part of an exhibition regarding the history of the statue.

Liberty and Friends, 1985. Ink and dyes on paper, 22x30 inches.

The colour on the image was not specified by the program - it was added afterwards by Cohen. Later, a more three-dimensional aspect was added to the system and the pictures became more detailed. Further, such features as haircuts and facial expressions were possible, as seen in Theo.

Theo, 1992. Oil on canvas, 34x24 inches.

Eventually AARON was able to use the element of colour itself in its compositions, at first filling shapes with blocks of colour, and later moving to actual simulated brush strokes. The routines controlling colour addition were written around Cohen's belief that intensity and brightness of a colour were more important than the hue. This simple heuristic tended to produce quite aesthetically pleasing results. Cohen has also attempted to interface AARON physically to a painting apparatus which paints in much the same way as a human artist does (i.e. with a brush, and by mixing colours on a palette, etc.).

An interesting point that Cohen makes about AARON is that instead of drawing things from back to front as most "graphics packages" do nowadays, AARON paints things from front to back in the way that an actual painter generally does. According to Cohen, this is a deliberate rejection of the European Renaissance-period perspective. Cohen wanted AARON to model, at least in some ways, the way humans create abstract art.

Cohen himself does not actually consider AARON "creative" in the human sense. However, he does believe that AARON produces very high quality art (or at least, what would be called art had a human produced it!) which is at least as good as many human artists, and has a distinctive style which is easily recognisable. He poses the question, if it's not art, what is it?

Copycat is a connectionist program by Melanie Mitchell and Douglas Hofstadter, developed at the University of Michigan. It was first formulated by Hofstadter in 1984 and was completed some years later. Basically, it is an analogy-forming program. Its domain is quite simple and abstract - it operates on strings of letters. Given an input of the form (abc -> abd; ijk -> ?) it returns a string it feels is a suitable answer. There are of course no correct answers, because there are many ways of seeing the problem. Even so, most people would say ijl is the most appropriate answer to the above example. Why does this appear to be a good solution? We are using concepts such as successor and rightmost when we come up with that answer, but another possible answer would be ijd, where we simply replace the rightmost letter with a literal d. Most people consider this answer less creative because it does not rely on the deeper concept of successor. Another even more literal-minded answer might be abd, where we replace the entire string with abd.
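The three readings of abc -> abd can be written out explicitly. The Python sketch below is only an illustration of the readings themselves: it is not how Copycat works internally, and the function names and the `READINGS` table are invented for the purpose. Each function encodes one way of describing what happened to abc and applies that description to a new string.

```python
from typing import Callable, Dict, Optional

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def successor_of_rightmost(target: str) -> Optional[str]:
    """Deep reading: replace the rightmost letter by its successor."""
    index = ALPHABET.index(target[-1])
    if index == len(ALPHABET) - 1:
        return None  # 'z' has no successor in this domain
    return target[:-1] + ALPHABET[index + 1]


def rightmost_becomes_d(target: str) -> Optional[str]:
    """Shallower reading: replace the rightmost letter by a literal d."""
    return target[:-1] + "d"


def whole_string_becomes_abd(target: str) -> Optional[str]:
    """Most literal reading: replace the whole string by abd."""
    return "abd"


# Hypothetical table of readings, ordered from deepest to most literal.
READINGS: Dict[str, Callable[[str], Optional[str]]] = {
    "successor of rightmost": successor_of_rightmost,
    "rightmost becomes d": rightmost_becomes_d,
    "whole string becomes abd": whole_string_becomes_abd,
}


def candidate_answers(target: str) -> Dict[str, Optional[str]]:
    """Apply every reading of abc -> abd to the target string."""
    return {name: rule(target) for name, rule in READINGS.items()}


if __name__ == "__main__":
    print(candidate_answers("ijk"))
```

All three functions agree on abc -> abd itself, which is why the example alone cannot settle which answer is "correct"; the preference for ijl comes from the depth of the concept used, not from the data.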

The strings with which Copycat works are supposed to provide an uncluttered domain in which the bare essentials of analogy making can be analysed. Many have argued that it is too simple, but this is actually a feature in Copycat's favour. As there are no real "fuzzy" rules or definitions (i.e. it's a very well-defined domain), the results of the program are more plausible - we can accept them because there is no real room to hide flaws in the program.

Copycat has three main modules through which information is filtered: the Slipnet, the Workspace, and the Coderack. Although these are all integral to the success of Copycat, the most interesting in my opinion is the Slipnet. This is an area containing nodes representing concepts such as successor and rightmost. Each concept is linked to many others in a connectionist fashion. For instance, successor is linked to predecessor and rightmost is linked to leftmost. In this way, concepts can "slip" along these connections into new concepts. Mitchell believes this to be an important aspect of analogy making, and creativity in general. If concepts can slip then we are far more tolerant of unusual situations where things don't quite fit.

Take for example, the input (abc -> abd; xyz -> ?). This is a more complicated problem because we now run into the problem of there being no successor to z. We can take the literal-minded approach and give xyd and this is in fact a common solution from both people and Copycat. However, there are certainly "better" answers which appeal to our sense of creativity more readily. These are ones that are not immediately obvious - an admiration of creativity appears to come from how deeply embedded an analogy or concept is. Another common solution to the above problem from people is xya, where we still use the successor notion, but allow the letters to wrap around the alphabet. Copycat on the other hand does not ever produce this answer, because it does not know that the domain can wrap in that way. If we similarly constrain ourselves to Copycat's domain, we are forced to come up with a significantly more creative answer.

Copycat, on finding no obvious solution involving successor, is forced to slip the concepts around a little. For instance, it may notice that abc is right at one end of the alphabet, and xyz is right at the other. This symmetry may cause successor to slip into predecessor and rightmost to slip into leftmost. The answer given then would be wyz, which is an answer that most people find very appealing, even if they do not come up with it themselves.
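The slip from "successor of rightmost" to "predecessor of leftmost" can be mimicked in a small Python sketch. This is a deliberately crude stand-in and rests on several assumptions: the real Copycat is stochastic and its slippages are driven by the structure it perceives, whereas this toy is deterministic, slips both concepts together, and does so only when the direct rule gets stuck. The names `SLIPPAGES`, `apply_rule` and `answer` are invented for illustration.

```python
from typing import Dict, Optional

ALPHABET = "abcdefghijklmnopqrstuvwxyz"

# Hypothetical links along which a concept may slip into its opposite.
SLIPPAGES: Dict[str, str] = {"successor": "predecessor", "rightmost": "leftmost"}


def apply_rule(target: str, position: str, relation: str) -> Optional[str]:
    """Replace the letter at `position` by its `relation`; None if impossible."""
    index = len(target) - 1 if position == "rightmost" else 0
    step = 1 if relation == "successor" else -1
    new_letter = ALPHABET.index(target[index]) + step
    if not 0 <= new_letter < len(ALPHABET):
        return None  # the alphabet does not wrap around in this domain
    return target[:index] + ALPHABET[new_letter] + target[index + 1:]


def answer(target: str) -> Optional[str]:
    """Try the direct rule first; if it is stuck, slip both concepts."""
    direct = apply_rule(target, "rightmost", "successor")
    if direct is not None:
        return direct
    return apply_rule(target, SLIPPAGES["rightmost"], SLIPPAGES["successor"])


if __name__ == "__main__":
    for string in ("ijk", "xyz"):
        print(string, "->", answer(string))
```

For ijk the direct rule succeeds and nothing slips; for xyz the successor of z does not exist, both concepts slip, and the result is wyz.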

Problem: (abc -> abd; xyz -> ?)

The chart supplied represents a number of runs on the xyz problem. The "Runs" bars indicate the number of times each answer was produced. The "Temp." bars indicate Copycat's internal evaluation of that answer. The lower the temperature, the better the answer is in Copycat's judgement, so although xyd is produced more commonly than wyz, it considers the latter a better answer, as do most people.

EMI (Experiments in Musical Intelligence) is a program by David Cope (a respected 20th Century composer) that takes a collection of a composer's works and creates original music in the style of that composer. It was written in 1981 on a Macintosh computer and was at first supposed to be an aid to "composer's block". Cope was asked to compose an opera but was unable to come up with anything, so he decided to write a program that would inspire him. To this end, his program took encoded versions of some Bach inventions and analysed them. It searched for typical chord structures and progressions of notes that were trademarks of Bach and incorporated these into original compositions (called "EMI-tations"). As time went by, Cope modified the program with successively more complex rules and heuristics.

It has now produced hundreds of compositions, some of which are quite good. It has been successfully used to compose music in the style of Bach, Bartok and Chopin. It works by having sample music given to it in a special format, analysing the similarities throughout the examples given, and storing these as musical "words" which it uses to create musical "sentences". The samples given to it are defined not in absolute terms, as in a normal musical score, but in terms of relations between notes: for example, "the next note is 3 semitones higher than the previous note" as opposed to "this note is an Eb and the previous note was a C". This allows it to pick out similar motifs regardless of the key in which the music is written.
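The advantage of storing relations between notes rather than the notes themselves is easy to demonstrate. The Python sketch below is not EMI's input format, which is not described here; it simply assumes that pitches are numbered in semitones using the common MIDI convention (middle C, written C4, is 60), and the helper names `pitch` and `to_intervals` are invented for illustration.

```python
from typing import Dict, List

# Semitone offsets of note names within an octave (flats only, for brevity).
SEMITONES: Dict[str, int] = {
    "C": 0, "Db": 1, "D": 2, "Eb": 3, "E": 4, "F": 5,
    "Gb": 6, "G": 7, "Ab": 8, "A": 9, "Bb": 10, "B": 11,
}


def pitch(name: str, octave: int) -> int:
    """Absolute pitch number, assuming the MIDI convention that C4 is 60."""
    return 12 * (octave + 1) + SEMITONES[name]


def to_intervals(pitches: List[int]) -> List[int]:
    """Describe a melody by the semitone step from each note to the next."""
    return [later - earlier for earlier, later in zip(pitches, pitches[1:])]


if __name__ == "__main__":
    # The same three-note motif written in two different keys.
    motif_in_c = [pitch("C", 4), pitch("Eb", 4), pitch("G", 4)]
    motif_in_f = [pitch("F", 4), pitch("Ab", 4), pitch("C", 5)]
    print(motif_in_c, to_intervals(motif_in_c))
    print(motif_in_f, to_intervals(motif_in_f))
```

The two lists of absolute pitches have no number in common, yet their interval lists are identical, so a comparison on intervals recognises the motif whatever key it appears in.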

An interesting aspect of the program, which Cope emphasises as being very important, is EMI's ability to identify "signatures". These are fragments of melody or harmony by which the composer is easily identified. For instance, Bach often included the following sequence of notes in his pieces: B-A-C-H (in German notation, B denotes B-flat and H denotes B-natural). This is a signature of the most literal kind, of course, but other composers had comparable features. Chopin for instance would frequently begin with a fairly rudimentary melody and then embellish the third or fourth measure of it with florid sections in subsequent recapitulations. The fact that EMI can pick these out is really just a statistical certainty, but the addition of these signatures to EMI's creations lends them much credibility. If one was only partially convinced of authenticity when hearing a vaguely Bach-like canon, the inclusion of the signature sequence would probably end the doubt.
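To show why picking out signatures can be a matter of plain counting, here is a rough Python sketch. It is not Cope's method, whose details are not given here. It assumes that a signature candidate is any short run of intervals that turns up in at least two of the pieces supplied, and the pieces themselves are made-up lists of pitch numbers, with the B-A-C-H motif encoded as 70, 69, 72, 71 (an assumption about register). The function names are invented for illustration.

```python
from collections import Counter
from typing import List, Set, Tuple

Pattern = Tuple[int, ...]


def interval_ngrams(pitches: List[int], n: int) -> Set[Pattern]:
    """All runs of n consecutive semitone steps occurring in one piece."""
    intervals = [later - earlier for earlier, later in zip(pitches, pitches[1:])]
    return {tuple(intervals[i:i + n]) for i in range(len(intervals) - n + 1)}


def signature_candidates(pieces: List[List[int]], n: int = 3,
                         min_pieces: int = 2) -> Set[Pattern]:
    """Interval patterns that appear in at least `min_pieces` different pieces."""
    counts: Counter = Counter()
    for piece in pieces:
        counts.update(interval_ngrams(piece, n))  # counted once per piece
    return {pattern for pattern, seen in counts.items() if seen >= min_pieces}


if __name__ == "__main__":
    pieces = [
        [60, 70, 69, 72, 71, 65],  # contains B-flat, A, C, B-natural
        [50, 72, 71, 74, 73, 60],  # the same shape, two semitones higher
        [60, 62, 64],              # an unrelated fragment
    ]
    print(signature_candidates(pieces))
```

Because the comparison is made on intervals, the motif is found even though the second piece states it in a different key; nothing more than bookkeeping is involved.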

However, the EMI-tations certainly have problems. They seem to work best with short phrases as these are easily picked out of the original works. EMI appears to be less successful over the whole duration of a piece, as it doesn't quite hang together as well as it should - the music as a whole is sometimes lacking in coherence. This is often apparent in strange key modulations and erratic ostinati. Jonathan Berger of Stanford University has compared it to the speaking ability of someone suffering from aphasia. An aphasic speaker is one whose grammar is flawless yet whose sentences are meaningless. It cannot be denied though that EMI does produce some quality music, clearly recognisable as being in the style of certain composers. This in itself is an achievement, and is more than most people can attain.

But does this make EMI creative? After all, it's merely copying other people's work, isn't it? Up to a point, I believe this to be true. EMI does take a collection of music and then "average it out" into a new composition. But if we say that this is not creative, then how can anything be considered truly original? The works produced by Mozart were influenced largely by his study of the works of Haydn - one can see definite resemblances in styles. It would seem that all music is influenced to a large degree by the composer's musical background and exposure, and indeed this would seem to be true for ALL forms of human creativity. True "creativity" is simply the skill with which an artist compresses their prior experiences. Obviously, there are also truly original moments, but these are really no more than perturbations or fractal diversions which mutate the process over time, and these could be introduced into computer programs very easily.

In an event at the University of Oregon about a year ago, three pieces of music were played to an audience. One was by Bach, one was by music theory professor Steve Larson writing in the style of Bach, and one was by EMI. The audience was asked to pick the one written by the computer. Much to Larson's dismay, they picked his composition as EMI's. More interestingly, they picked EMI's composition as actual Bach! This either demonstrates the lack of musical knowledge of the audience, or the admirable ability of EMI. It is more reasonable to assume the former, but the fact that EMI could produce music which the audience found more Bach-like than either Bach himself or a professor of music theory is very interesting.

Often, when Cope plays some of EMI's works to people, they agree that they find it aesthetically pleasing. But when people discover the music was written by a program they tend to renounce all emotional attachment to it, about which he says the following:

When I am camping in the Sierras there is an incredible beauty I see. But it is unintended by nature. The plants are not trying to express things to me and the mountain is not trying to communicate. But I'm inspired anyway.

- David Cope

I think this makes an important comment on creativity. Creativity is in the eye of the beholder. For this reason, my final definition of the term is this: creativity is the ability of an entity to produce something new which an individual (be it the creator or a spectator) finds interesting, novel or pleasing. Since a computer can do this (specifically the work of EMI and AARON), I believe that computers can be, and are, creative. Q.E.D.

References

Boden, Margaret A. The creative mind: myths & mechanisms.
Great Britain, George Weidenfeld and Nicolson Ltd., 1990

Cohen, Harold. The further exploits of AARON, Painter.
SEHR, volume 4, issue 2: Constructions of the Mind

Fowler, H. W. and Fowler, F. G. (eds.) The Concise Oxford Dictionary of Current English: Fifth Edition.
Oxford, Oxford University Press, 1964

Johnson, George. Undiscovered Bach? No, a Computer Wrote It
The New York Times Company, November 11, 1997

Mitchell, Melanie. Analogy-making as perception : a computer model.
USA, Massachusetts Institute of Technology, 1993

Midi files obtained from David Cope's homepage at UCSC.