Coder Profile - Show off your skills, get a coder profile.
 
 
 
N-Queens-Series Part I (Only the fittest survive)
Programming
Introduction:

This is the first article of a series i'm going to present here.
The entire series will be dealing with the 8 queens problem,or more general the n-queens-problem.
I assume that all of you know what the problem is about.
If not just google it out, there are loads of resources. Also i will shortly explain what it's all about.

So why dedicate quite some articles to this problem?
Well at a first glance it's just a relativly easy scientific problem.
But if you look closer you'll notice that solutions to this problems can be generated in quite many ways. Not few of them offer interesting applications beyond the problem itself.
The n-queens-problem is very well suited for studying various algorithms.
It's very easy to comprehend thus you can focus on solving it efficiently.

This first article attacks the problem with a genetic algorithm. I thought it's probably the most interesting to start with and shows some general ideas.
Note that it is not proven that genetic algorithms perform particulary well.
It's just that they're related to what we all know from school that makes them so appealing.

Before we can start we need to make a few things clear.
First of all, the 8-queens-problem is all about chess.
You know chess don't you? If not, go buy a book on it and learn it.
It's definitely worth the time you spend.
The goal of the 8-queens-problem is to place 8 queens on a chess-board such that none of the queens are attacking each other.The generalization, the n-queens-problem thus formulates the goal
of placing n queens on an nxn chessboard in such a way that no queen is attacking another.

So what's all that genetic algorithm sort of thing?

It's foundations are actually quite easy to grasp and quickly explained. We model our problem in a way that allows us to apply evolutionary methods to quite a big extend.
In a genetic algorithm, GA for short, we deal with populations, chromosomes, mutation, fitness and selection. I'll explain all of this term shortly.

In a GA the problemset is represented as a set of chromosomes.
Each chromosome represents a particlar problemstate.
We now have formal tequniques to measure the fitness of each chromosome in
regard to the goal we're trying to reach.
Or in other words we can express numerically how well a specific chromosome solves a particular problem.
This numeric value is called the fitness of a chromosome.
Our goal is to let chromosomes with better fitness survive where chromosomes with lower fitness will die, eventually producing a chromosome that does indeed solve the problem.

So how do we actually make chromosomes reproduce themselves?

Well the same way every higher biological being does.
We crossover chromosomes. Crossing over means breaking two chromosomes apart, and glueing the pieces together in such a way that we get two new resulting chromosomes.

The general layout of a GA is as follows
Code Copy / Restore
  1. procedure GA(problem) return solution
  2.    initial_generation := create_random_generation()
  3.    solution := false
  4.    while not solution
  5.      selected_chromosomes := select_chromosomes(initial_generation)
  6.      new_generation := crossover(selected_chromosomes)
  7.      solution := select_solution(new_generation)
  8.    end
  9.    return solution
  10. end
How can we apply this to the n-queens-problem?

Chromsomes:
The first thing we have to do, is to model our chromosomes and thus our problemstates.
A straight forward model is a numeric array, where the index represents the column of the queen
and the value at a given index i represents the row of the queen.

A chromosome for a give state in the 8-queens-problem might look like this:

|3|6|1|8|3|4|7|2|

This array represents one possible problem-state where 8 queens are placed on the board.
You can get the actual positions by forming a set of pairs, like so:

Ps = { (1,2), (2,6), (3,1), (4,8), (5,3), (6,4), (7,7), (8,2) }

This type of chromsomes can efficiently be represented with any programming language.

The fitness:
Now that we know how to represent our chromsomes, the next step is to determine a
way to measure the fitness of a particular chromosome.
I choose to base the fitness on the number of non-attacking pairs in a chromosome.
A queen is attacking another queen if it is placed on the same row or on the same column.
It also does attack another queen if it sits on the same diagonal(s).
We can compute if a queen is attacking another queen if we thing of the row and column
as a point in a coordinate system. We can place the queens on this coordinate system
and see if they lay on the same straights. There are 4 of those straights that we
need to consider.

To get the maximum number of non-attacking pairs for a given n in the n-queens-problem
you can just compute it via (n * ( n - 1)) / 2.
So for the 8-queens-problem the maximum number of non-attacking pairs and thus a goal state
is 28.

We have two important points allready covered. We know how to represent our
chromosome and we know how we can compute how well it is suited to solve the given problem.

The fittest will survive:
Next we obviously need a way to use the information we have and bring some evolution
into play. We need a way to select chromsomes for crossover such that the fitter
the chromosome is, the more likely it will be selected for reproduction. As you all
know, only the fittest survive. This is actually a keypoint of GAs. In the literature
there are many methods discussed that lead to more or less usefull selction-rules for
a given problem. These methods are also known as selection-operators.

I choose the fitness-proportionate-selection or roulette-wheel-selection.
It's by far the most comprehensive one. As a sidenote, almost all of the selection-operators
work stochastically to determine the best chromosomes. So it's no surprise that the underlying
idea is often the same, but the actuall implementation vary in complexity and performance.
See for example tournament-selection or SUS (stochastic universal sampling).

How does this work?

Well, imagine you have a roulette-wheel. You can spin it and hopefully it halts on the color
or number you've previously chosen. Now suppose the numbers and colors aren't allready there. You'll have
to paint them on. The first thing you do is you calculate the fitness in percent of each
chromsome you have. Thus the "weaker" chromsomes get a smaller percentage whereas the stronger
(fitter) chromosomes get a greater percentage. Now based on these values assign each chromsome
a slice on the wheel so that the slice-size corresponds to the percentage-value.
Now that you painted the slices on the wheel, you can spin it and see on which slice it halts.
Bigger slices are more likely to be selected here, as they obviously occupie more space on the wheel.
This spinning is our way to select two chromosomes for reproduction.
Note that weaker chromosoms still have a small probability of being chosen.

Cross them over:
So now that we can fairly select chromsomes for reproduction, we need to actually perform the
crossover process. This is straight forward. We just choose a random index between 1 and n.
We cut our two chromosomes at this index in half. Now we have four pieces. These four pieces
we now glue together so that the head of the first chromosome gets the tail of the second
and the head of the second gets the tail of the first.
Suppose we have the following chromosomes:
Code Copy / Restore
  1. c1 = |3|6|1|8|3|4|7|2|
  2.  
  3. c2 = |5|1|7|8|2|6|3|3|
  4.  
  5. Let our point for crossover be 3
  6.  
  7. i = 3
  8.  
  9. h1 = |3|6|1|
  10. t1 = |8|3|4|7|2|
  11.  
  12. h2 = |5|1|7|
  13. t2 = |8|2|6|3|3|
  14.  
  15. Now we join them:
  16.  
  17. h1t2 = |3|6|1|8|2|6|3|3|
  18.  
  19. h2t1 = |5|1|7|8|3|4|7|2|
We have our offspring. The idea behind this is that there are sequences of genes that
perform better. For example if |3|6|1|5 did not have a single queen that attacks another
this sequence does perform quite well. Suppose you have an 4-queens-problem it would actually
be a goal state. The hope of the GA is to combine those successful sequences so that even
better sequences are produced that solve the bigger problem. It's a kind of divide and conquer
but much more random.

Mutation:
In nature there is another mechanism that kicks in randomly. It's mutation.
Mutation randomly alters the chromomsomes. We model this by walking through the chromosome
and decide based on a certain delta (mutation-rate) if we do actually mutate the current gene.
The higher mutation-rate the more often this mutations will occure.
Also i choose to not stop mutation after i mutated a given gene in a chromosome, but instead
walk further and see if another gene shall be mutated.
You can easily replace this algorithm by the non-greedy one to see if it performs better or
worse.

So now that we have our now chromosomes we replace the parents and can thus form a new generation.
Hopefully this offspring has fitter individuals. If an individual is generated that solves the
problem we can abort the algorithm and return the solution.

Over and over again:
The GA repeats this cycles until a goal-state is reached or another break-condition arises.
Possibly quite many generations are needed to solve a problem.
And that's really all there is to it.

Note that this genetic algorithm isn't the fastest on earth but it is actually a good
one to study. It is able to solve the n-queens-problem for smaller ns in acceptable time.
For bigger ns you'll quickly notice that it doesn't find a solution or it takes much time.


Referenceimplementation: http://www.coderprofile.com/source-code/438/solving-the-n-queens-problem-pa...

Next up:
The next article will model the 8-queens-problem as a constraint satisfaction problem.
We'll be solving it by simple chronological backtracking first.
In a later article we'll use more advanced algorithms from linear programming to solve
the problem, like simulated annealing and hill-climbing. I'll even present you
a variation of the hill-climbing algorithm to solve the problem even for big ns in a short time.
This algorithm is known as random-restart hill-climbing.

Yet other articles will use straight uninformed search algorithms and other will
focus on informed search algorithms using heuristics like A*.

I hope you enjoy the articles and stay tuned. Don't expect me to pull them out too soon, as i don't have
that much time. But i'll try to make the breaks as short as possible.



References:
"Artificial Intelligence a Modern Approach" Second Edition by Russel and Norvig
Newcastle University EDC


Posted By closure
Please login to rate coding articles.

Click here to register a free account with us.
Comments
Please login to post comments.
 
closure     Posted 63 Days Ago
 
 
Hi, sure i'll check it in detail later this evening.
Don't care about performance right now, firstly, as i said, GA's are not
proven to be very efficient, though they are most of the time. My implementation and
thus yours is
straight-forward not optimized for speed. That's way the executing time gets
bigger.

There are also some other reasons for this. I'll pm you the most important
later. It's basically about the datatype we choose to represent the chromoses.
This can be made better. It's all about O (big O).

But put that aside, hope you had a good time implementing this. And you did indeed
succeed.

cheers
 
Cinjection     Posted 64 Days Ago
 
 
Well here's my attempt:
http://www.coderprofile.com/content/public/profile/view-source-code.php?coder=Cinje
ction&id=461

Could I ask you to look it over to make sure I did it right? I'm not 100%
sure. It runs really slow for N>5. Is this a limitation of the algorithm, or is my
implementation wrong?
 
Cinjection     Posted 68 Days Ago
 
 
Thanks. I'll modify my code and hopefully get it to work.
 
closure     Posted 68 Days Ago
 
 
Hi,
i understand your way of thinking. Unfortunatelly, this is not applicable in GAs.
In my particular solution you start with a solution of fixed size. For example 20
chromosomes. For each cycle you crossover chromosomes. Remember that when we
crossover two chromosomes we get two resulting chromosomes. These new chromoseomes
replace their parents, and the cycle starts again. The size of the population
remains constant. This is a bit counter-intuitive to what we know from biological
evolution.

You're rigt, that your way of doing this would less likely find a solution.
But you could well start out with a greate initial population to have a bigger
gene-pool. The rest of your algorithm could stay roughly the same. I guess this would
produce solutions as well.

Hope that helps
cheers
 
Cinjection     Posted 68 Days Ago
 
 
Okay. I've written an attempt at this but it's not working correctly. I
think I'm manipulating the populations incorrectly.

1) Do I start with a population of two and then based on those I create 2 new ones
and add them to the population? I would do this every generation so my population
size would grow like this: 2 4 6 8 10 ... Is this correct? Something sounds a little
wrong though. If the first two chromosomes cannot be permuted to make a solution,
then one will never be derived (ignoring mutation). What am I doing wrong?

Could you just go over how one would handle a population of chromosomes?

Thanks in advance.
 
Cinjection     Posted 76 Days Ago
 
 
Thanks a lot. It's clearer now. I havn't gotten a chance to play with it
yet, but I will try to design my own solution a little later.
 
closure     Posted 77 Days Ago
 
 
There was a mistake in my last comment. The second queen is not (2,1) but (2,2)
 
closure     Posted 77 Days Ago
 
 
Hi cinjection, i'm glad you like the article.
To answer your questions:


1) Yes you're right. Each chromosome represents a possible state. A few of
those states, the ones that are a solution are so called goal states. The fitness is
the value
of the fitnessfunction applied to a chromosome. The value will be the higher the
better the chromosome solves the problem.
2) Let's take a chromosome for the 4-queens problem. For example 3 2 1 1.
Now see how you get all possible pairs programmatically. You take the first
queen (3,1) and form the list of pairs where this first queen is the car.
How many possibilities are there? 3. Now that you have all possibilities with (3,1)
in the car you move on and take the next queen (2,1) and from there on you
take all remaining queens and form the pairs. Remember that the pair (3,1;2,2) is
the same pair as (2,2;3,1). We need not to go backward. The number of possibilities
to form the pairs decreases when you move on. So for the first queen you have 3
possibilities, for the second you have 2 and for the third you have 1. Once you reach
the last queen there are no more pairs left.
So this is equal to forming the sum of the first n - 1 natural numbers. (n - 1
because once we reach the n we allready have all possible combinations)
And this summation can be translated to (n * (n - 1)) / 2. Thanks you Karl
Friedrich Gauß

3) Yes it does. The entire idea behind GAs is the survival of successful schemas.
The problem is that you might start out with a bad initial population with a bad
distribution of good sequences. Mutation can help to compensate this. Mutation is
allways random, but if the fitness increased we won sth. That's how nature works
as well.
4) If there is a solution, yes. It may take some time though. You can compute the
probability for a purely random algorithm, that finds the solution by pure
"luck".
This is the exact same probability a GA solving the same algorithm has to succeed.
It's worst case though and the GA will most likely perform much better.

I hope this answers your questions :)
 
Cinjection     Posted 80 Days Ago
 
 
Great article. Really interesting. I find this topic _really_ interesting.

Questions:

1) Just to confirm, Cromosones represent solutions to the problem (valid or not).
The fitness is then a measure of how well the solution solves the problem. Correct?
2) Could you explain/derive the forumula you gave for number of non-attacking
pairs: (n * ( n - 1)) / 2 ?
3) What's the point of mutation. Does it offer any benifits to the algorithm?
4) Do GA's guarantee a solution?
 
closure     Posted 81 Days Ago
 
 
I don't have MSWord but i can certainly run aspell on it. Great idea.
I'll just do that and modify my article.
 
VBAssassin     Posted 81 Days Ago
 
 
Why not just pass the article through MSWord, that will check the spelling ;-)

Thats what i do before posting an article :-)

Kind regards,
Scott
 
closure     Posted 82 Days Ago
 
 
Thanks, about an hour for the rough and ready version.
There are still some things i'm going to fix in the future. And obviously
there are still typos left. I'll try to eliminate them. Feel free to point on
them :)
 
VBAssassin     Posted 82 Days Ago
 
 
Impressive article! How long did it take you to write that?

Kind regards,
Scott

P.S. Typo in your bold heading: Chromsomes, and no space here:
Referenceimplementation
Page 1 of 1
More Articles By This Author
N-Queens-Series Part I (Only the fittest survive)
Recently Posted "Programming" Articles
N-Queens-Series Part I (Only the fittest survive)
Currying in JavaScript: Fun for the Whole Family!
[PHP] Create A Unique Page Hits Counter
Actionscript Events
Actionscript API
Quick Threading Tutorial - C++
C++ And Me (A Love Story)
First look at C
Introduction to Pseudo Code
Integrating your website with PHP
Recently Rated "Programming" Articles
[C++] Templates
Quick Threading Tutorial - C++
Intorduction to memoization.
N-Queens-Series Part I (Only the fittest survive)
[PHP] Create A Unique Page Hits Counter
Currying in JavaScript: Fun for the Whole Family!
Actionscript Events
First look at C
if/else, switch/case and ?/: Statements In Actionscript 2.0
Actionscript API
source codes Categories articles
Browse All
Business & E-Commerce (1)
Databases (1)
Design & Creativity (1)
Internet & Web Sites (1)
Life In General (2)
Operating Systems (3)
Other (2)
Programming (48)
Security (10)
Software Development (5)
Web Development (15)
search Search Inside
Programming
 
 
Part of the MyPingle Network
Development Blog :: Make A Donation :: Contact Me
Terms & Conditions :: Privacy Policy :: Documents
Version 1.44.00
Copyright © 2007 - 2008, Scott Thompson, All Rights Reserved