Link: ‘Hello World’ genetic algorithm

I enjoyed a post today at Hidden Clause talking about an attempt to generate the string ‘hello, world!’ using a genetic algorithm.

In the first chapter of the documentation for the Watchmaker Framework Daniel W. Dyer presents an interesting genetic algorithm example: what if we wanted to evolve the string “HELLO WORLD”? … My first thought: implementing this in JGAP shouldn’t be that hard.

The setup reminds me of the game ‘Mastermind’ — we’re dealing with a fixed-length string and scoring based on how many of the correct characters occur in the string and if they are in the right position. You combine some better strings and sometimes mutate them to generate new strings, and repeat that process until, hopefully, you arrive at the target string, ‘hello, world!’.

Did it work? Well, almost. The first result, “‘!ello, or!d!”,  was reached after 1 million iterations on a population size of a thousand — and, it used a limited alphabet containing only the letters and punctuation in the target string (and not, for example, ‘z’).

The reason the solution was not reached, the post’s author notes, is that the algorithm doesn’t have the breadth or the convergent ability to match the statistical odds involved. Recalling the old problem of monkeys on typewriters producing Shakespeare , the author notes how unlikely it is to produce the desired result.

Of course, we’re not simply monkeying at typewriters. We’re making random changes and mutations, but we’re intelligently selecting the parents. Apparently, the method first tried wasn’t intelligent enough to arrive at the desired solution within that number of iterations, and increasing iterations or population size, apparently, isn’t going to be practical.  So we need to use a better way of selecting the parents.

We could improve the fitness function. Let’s say we take the whole alphabet and the punctuation and number them (for instance: A=1… Z=26 and space and comma and exclamation could be 27, 28, and 29, respectively). Now, assign fitness based on the difference between actual and target values of each character. Is this cheating? Well, no. Our new fitness function requires no a priori knowledge of the solution and will work equally well for a different target string. This is providing our GA with a lot more information per epoch, and therefore we should see a much faster convergence.

(Aside: there are much better way of numbering the letters and punctuation to encourage faster convergence, based on the frequency the characters are used in the English language. In fact, one could develop a system that took advantage of commonly grouped letters like ‘th’ and ‘ing’, and if dealing with sentences, commonly grouped words…  Is that cheating? Well, we must remember that the more such information we encode, the more pre-disposed or biased our output will be. Also, the point of a using a genetic algorithm is to substitute computation time for programming complexity. In a contrived problem, it’s up to us where that line lies. )

There are a number of other ways to improve the fitness function, but my instinct is that this suggestion or a similar modification would be sufficient. I hope the Hidden Clause crew reports the result of another attempt.

August 13, 2009. Tags: , , , . Emergence. Leave a comment.

Emergence Project: Representing a Textual Discourse

via information aesthetics on 11/18/08

The Emergence Project [] is a “software art” installation exhibited at Hyde Park Art Center‘s digital building facade gallery. It is based on the ideas and textual discourse that emanated out of the Chicago Humanities Festival: its presentations, performances, and panel discussions were captured, analyzed, and processed into a set of dynamic data visualizations that evolve dynamically over time.

The generative data artwork uses simple morphological rules to animate word clusters, based on linguistic proximity, similarity, and difference. It deliberately utilizes computer-generated animation to chart how complex patterns arise out of a multiplicity of simple interactions, a phenomenon known as “emergence“.

Via Visual Complexity.

January 15, 2009. Tags: . Emergence. Leave a comment.

Surprising Results from Emergent Systems

One of my favorite examples of a surprising result from an emergent system is an experiment that utilized a genetic algorithm to solve a sorting problem on a FPGA chip. The hope was to arrive at the best (minimal) solution to the sorting problem faster and with more confidence than a human could. The actual result was that the genetic algorithm found a solution that was better than what theory said was even possible!

How did that happen? It turned out that the models humans used and programmed into computer simulations were not in fact complete. They are based in theoretical simplifications that do not fully describe the physical world, in which nature is not limited to the ‘1’s and ‘0’s of electrical theory, in which nature could take advantage of the minute electromagnetic coupling of electrical components packed tightly into a silicon ship, in which higher order effects not considered by models could be utilized to solve the problem. 

By physically wiring an infinitely reprogrammable Xilinx XC6216 FPGA chip to allow the genetic algorithm to use a physical board — instead of a computer simulation of the board — the algorithm arrived at a solution that used fewer steps and fewer gates than the best solution previously offered by theory. 

“What is downright scary is this: the FPGA only used 32 of its 100 available logic gates to achieve its task, and when scientists attempted to back-engineer the algorithm of the circuit, they found that some of the working gates were not even connected to the rest through normal wiring. Yet these gates were still crucial to the functionality of the circuit. This means, according to Thompson, that either electromagnetic coupling or the radio waves between components made them affect each other in ways which the scientists could not discern (Taubes 1997).”

The structure of the genetic solution discarded the need for models or theory and was allowed to experiment with actual physical properties.  All that was needed were rules that allowed better solutions to survive in each generation until the optimal result was achieved. 

Read more: [PDF] 

Not all problems are suitable for such an evolutionary approach; it is often impractical even when it does work. But what appeals to me is the idea that we can  do what nature does. We can use simple rules to find an amazingly efficient solutions to a problem, without understanding how that solution really works. Not far in the future, the world might be full of algorithmic solutions as diverse and amazing to a computer scientist as the animals of the world are to a zoologist.

December 15, 2008. Tags: , , , . Emergence. 1 comment.


Emergence is a very interesting concept in the computing world: the idea that complex phenomenon can arise from relatively simple interactions. There are two types of emergence, somewhat controversially:  weak emergence and strong emergence.  The wikipedia article provides an introduction to the mystery, but the essential difference is that strong emergence suggests that the resultant is more than the sum of its parts; with strong emergence, you end up with something that cannot be explained by dissembling into constituents.

Imagine a simple rule for a termite: “walk around until you see a wood chip, then pick it up — unless you’re already carrying a wood chip, in which case, drop it.” Simple rules like these have been shows to lead to incredibly large and complex termite mounds , one of the oft quoted examples of emergence.

In the world of algorithms, an emergent system is one that exhibits complex results based on such simple rules. A combination of NAND gates can be used to simulate essentially any sophisticated logic function. An artificial neural network (ANN) similarly leads to complex results, and the inherent fuzzy logic allows the system to correctly solve problems it has never encountered before.

Evolutionary algorithms (EAs) utilize simple rules repeatedly until parameters converge on the desired solution.  It should be of no surprise to anyone that understands Darwin’s ideas that this can produce almost limitless complexity.  The rich history of EAs, leading to what is now known as Evolutionary Computation will be the subject of a future post.

Cellular Automaton models allow complexity to arrive through both time (generations of self-replication) and the connectionist approach.

These concepts are best illustrated by example, which I will dive into in subsequent posts. I hope to explore these different computational themes and show how they are connected by recurring philosophical ideas such as ‘Holism vs. Reductionism’, and along the way we will remind ourselves to question whether our examples exhibit strong emergence or merely weak emergence.

December 12, 2008. Tags: . Emergence. Leave a comment.