GENETIC
ALGORITHMS
in Search,
Optimization &
Machine Learning
DAVID E. GOLDBERG
As a graduate student at the University of
Michigan, David E. Goldberg spearheaded
a successful project applying genetic algorithms
and classifier systems to the control of natural gas
pipelines. After receiving his Ph.D. at the University
of Michigan, Dr. Goldberg joined the faculty
of the University of Alabama at Tuscaloosa,
where he is now Associate Professor
of Engineering Mechanics.

Dr. Goldberg has continued his research
in genetic algorithms and classifier systems
and received a 1985 NSF Presidential
Young Investigator Award for his
work. Dr. Goldberg has 12 years of consulting
experience in industry and government
and has published numerous
articles and papers.

Genetic Algorithms in
Search, Optimization, and
Machine Learning
David E. Goldberg
The University of Alabama
ADDISON-WESLEY PUBLISHING COMPANY, INC.
Reading, Massachusetts · Menlo Park, California · Sydney
Don Mills, Ontario · Madrid · San Juan · New York · Singapore
Amsterdam · Wokingham, England · Tokyo · Bonn

The procedures and applications presented in this book have been included for their
instructional value. They have been tested with care but are not guaranteed for any
particular purpose. The publisher does not offer any warranties or representations, nor
does it accept any liabilities with respect to the programs or applications.
Library of Congress Cataloging-in-Publication Data
Goldberg, David E. (David Edward), 1953–
  Genetic algorithms in search, optimization, and machine learning.
  Bibliography: p.
  Includes index.
  1. Combinatorial optimization. 2. Algorithms.
3. Machine learning. I. Title.
QA402.5.G635 1989    006.3    88-6276
ISBN 0-201-15767-5
Reprinted with corrections January, 1989
Copyright © 1989 by Addison-Wesley Publishing Company, Inc.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying,
recording, or otherwise, without the prior written permission of the publisher. Printed
in the United States of America. Published simultaneously in Canada.

DEFGHIJ-HA-943210

Foreword
I first encountered David Goldberg as a young, PhD-bound Civil Engineer inquiring
about my course Introduction to Adaptive Systems. He was something of an
anomaly because his severely practical field experience in the gas-pipeline industry,
and his evident interest in that industry, did not seem to dampen his interest
in what was, after all, an abstract course involving a lot of "biological stuff." After
he enrolled in the course, I soon realized that his expressed interests in control,
gas pipelines, and AI were the surface tell-tales of a wide-ranging curiosity and a
talent for careful analysis. He was, and is, an engineer interested in building, but
he was, and is, equally interested in ideas.

Not long thereafter, Dave asked if I would be willing to co-chair (with Ben
Wylie, the chairman of our Civil Engineering Department) a dissertation investigating
the use of genetic algorithms and classifier systems in the control of gas-pipeline
transmission. My first reaction was that this was too difficult a problem
for a dissertation—there are no closed analytic solutions to even simple versions
of the problem, and actual operation involves long, craftsmanlike apprenticeships.
Dave persisted, and in a surprisingly short time produced a dissertation that, in
turn, produced for him a 1985 NSF Presidential Young Investigator Award. So
much for my intuition as to what constitutes a reasonable dissertation.
In the past few years GAs have gone from an arcane subject known to a few
of my students, and their students, to a subject engaging the curiosity of many
different research communities including researchers in economics, political sci-
ence, psychology, linguistics, immunology, biology, and computer science. A ma-
jor reason for this interest is that GAs really work. GAs offer robust procedures
that can exploit massively parallel architectures and, applied to classifier systems,
they provide a new route toward an understanding of intelligence and adaptation.
David Goldberg's book provides a turnpike into this territory.
One cannot be around David Goldberg for long without being infected by
enthusiasm and energy. That enthusiasm comes across in this book. It is also
an embodiment of his passion for clear explanations and carefully worked examples.
His book does an exceptional job of making the methods of GAs and
classifier systems available to a wide audience. Dave is deeply interested in the
intellectual problems of GAs and classifier systems, but he is interested even more
in seeing these systems used. This book, I think, will be instrumental in realizing
that ambition.

John Holland
Ann Arbor, Michigan

Preface
This book is about genetic algorithms (GAs)—search procedures based on the
mechanics of natural selection and natural genetics. In writing it, I have tried to
bring together the computer techniques, mathematical tools, and research results
that will enable you to apply genetic algorithms to problems in your field. If you
choose to do so, you will join a growing group of researchers and practitioners
who have come to appreciate the natural analogues, mathematical analyses, and
computer techniques comprised by the genetic algorithm methodology.
The book is designed to be a textbook and a self-study guide. I have tested
the draft text in a one-semester, senior-level undergraduate/first-year graduate
course devoted to genetic algorithms. Although the students came from different
backgrounds (biochemistry, chemical engineering, computer science, electrical
engineering, engineering mechanics, English, mathematics, mechanical engineering,
and physics) and had wide differences in mathematical and computational
maturity, they all acquired an understanding of the basic algorithm and its theory
of operation. To reach such a diverse audience, the tone of the book is intention-
ally casual, and rigor has almost always been sacrificed in the interest of building
intuition and understanding. Worked-out examples illustrate major topics, and
computer assignments are available at the end of each chapter.
I have minimized the mathematics, genetics, and computer background re-
quired to read this book. An understanding of introductory college-level mathe-
matics (algebra and a little calculus) is assumed. Elementary notions of counting
and finite probability are used, and Appendix A summarizes the important con-
cepts briefly. I assume no particular knowledge of genetics and define all required
genetic terminology and concepts within the text. Last, some computer program-
ming ability is necessary. If you have programmed a computer in any language,
you should be able to follow the computer examples I present. All computer code
in this book is written in Pascal, and Appendix B presents a brief introduction to
the essentials of that language.
Although I have not explicitly subdivided the book into separate parts, the
chapters may be grouped in two major categories: those dealing with search and
optimization and those dealing with machine learning.
The first five chapters are devoted to genetic algorithms in search and optimization.
Chapter 1 introduces the topic of genetic search; it also describes a
simple genetic algorithm and illustrates the GA's application through a hand calculation.
Chapter 2 introduces the essential theoretical basis of GAs, covering
topics including schemata, the fundamental theorem, and extended analysis. If
you dislike theory, you can safely skip Chapter 2 without excessive loss of con-
tinuity; however, before doing so, I suggest you try reading it anyway. The mathematical
underpinnings of GAs are not difficult to follow, but their ramifications
are subtle; some attention to analysis early in the study of GAs promotes fuller
understanding of algorithm power. Chapter 3 introduces computer implementa-
tion of genetic algorithms through example. Specifically, a Pascal code called the
simple genetic algorithm (SGA) is presented along with a number of extensions.
Chapter 4 presents a historical account of early genetic algorithms together with
a potpourri of current applications. Chapter 5 examines more advanced genetic
operators and presents a number of applications illustrating their use. These include
applications of micro- and macro-level operators as well as hybrid
techniques.
Chapters 6 and 7 present the application of genetic algorithms in machine
learning systems. Chapter 6 gives a generic description of one type of genetics-based
machine learning (GBML) system, a classifier system. The theory of operation
of such a system is briefly reviewed, and one Pascal implementation called
the simple classifier system (SCS) is presented and applied to the learning of a
boolean function. Chapter 7 rounds out the picture of GBML by presenting a
historical review of early GBML systems together with a selective survey of other
current systems and topics.
ACKNOWLEDGMENTS
In writing acknowledgments for a book on genetic algorithms, there is no ques-
tion who should get top billing. I thank John H. Holland from the University of
Michigan for his encouragement of this project and for giving birth to the infant
we now recognize as the genetic algorithms movement. It hasn't been easy nurturing
such a child. At times she showed signs of stunted intellectual growth, and
the other kids on the block haven't always treated her very nicely. Nonetheless,
John stood by his baby with the quiet confidence only a father can possess, knowing
that his daughter would one day take her rightful place in the community of
ideas.
I also thank two men who have influenced me in more ways than they know:
E. Benjamin Wylie and William D. Jordan. Ben Wylie was my dissertation adviser
in Civil Engineering at the University of Michigan. When I approached him with
the idea for a dissertation about gas pipelines and genetic algorithms, he was
appropriately skeptical, but he gave me the rope and taught me the research and
organizational skills necessary not to hang myself. Bill Jordan was my Department
Head in Engineering Mechanics at The University of Alabama (he retired in
1986). He was and continues to be a model of teaching quality and administrative
fairness that I still strive to emulate.
I thank my colleagues in the Department of Engineering Mechanics at Ala-
bama, A. E. Carden, C. H. Chang, C. R. Evces, S. C. Gambrell, J. L. Hill, S. E. Jones,
D. C. Raney, and H. B. Wilson, for their encouragement and support. I also thank
my many colleagues in the genetic algorithms community. Particular thanks are
due Stewart Wilson at the Rowland Institute for Science for providing special
encouragement and a sympathetic ear on numerous occasions.
I thank my students (the notorious Bama Gene Hackers), including C. L.
Bridges, K. Deb, C. L. Karr, C. H. Kuo, R. Lingle, Jr., M. P. Samtani, P. Segrest, T.
Sivapalan, R. E. Smith, and M. Valenzuela-Rendon, for lots of long hours and hard
work. I also recognize the workmanlike assistance rendered by a string of right-hand
persons: A. L. Thomas, S. Damsky, B. Korb, and K. Y. Lee.
I acknowledge the editorial assistance provided by Sarah Bane Wood at Alabama.
I am also grateful to the team at Addison-Wesley, including Peter Gordon,
Helen Goldstein, Helen Wythe, and Cynthia Benn, for providing expert advice
and assistance during this project.
I thank the reviewers, Ken De Jong, John Holland, and Stewart Wilson, for
their comments and suggestions.
A number of individuals and organizations have granted permission to reprint
or adapt materials originally printed elsewhere. I gratefully acknowledge the per-
mission granted by the following individuals: L. B. Booker, G. E. P. Box, K. A. De
Jong, S. Forrest, J. J. Grefenstette, J. H. Holland, J. D. Schaffer, S. F. Smith, and S. W.
Wilson. I also acknowledge the permission granted by the following organizations:
Academic Press, Academic Press London Ltd. (Journal of Theoretical Biology),
the American Society of Civil Engineers, the Association for Computing
Machinery, the Conference Committee of the International Conference on Genetic
Algorithms, Kluwer Academic Publishers (Machine Learning), North-Holland
Physics Publishing, the Royal Statistical Society (Journal of the Royal
Statistical Society, C), and John Wiley and Sons, Inc.
I thank my spouse and still best friend, Mary Ann, for her patience and assis-
tance. There were more than a few evenings and weekends I didn't come home
when I said I would, and she proofread the manuscript, judiciously separating my
tolerable quips from my unacceptable quirks. Untold numbers of readers would
thank you, Nary (sic), if they knew the fate they have been spared by your sound
judgment.
This material is based upon work supported by the National Science Foundation
under Grant MSM-8451610. I am also grateful for research support provided
by the Alabama Research Institute, Digital Equipment Corporation, Intel
Corporation, Mr. Peter Prater, the Rowland Institute for Science, Texas Instruments
Incorporated, and The University of Alabama.
Last, it has become a cliché in textbooks and monographs: after thanking one
and all for their assistance, the author gallantly accepts blame for all remaining
errors in the text. This is usually done with no small amount of pomp and circumstance—a
ritualistic incantation to ward off the evil spirits of error. I will
forgo this exercise and close these acknowledgments by paraphrasing a piece of
graffiti that I first spotted on the third floor of the West Engineering Building at
the University of Michigan:

To err is human. To really foul up, use a computer.

Unfortunately, in writing this book, I find myself subject to both of these sources
of error, and no doubt many mistakes remain. I can only take comfort in knowing
that error is the one inevitable side effect of our human past and the probable
destiny of our artificially intelligent future.

Contents
FOREWORD iii
PREFACE v
1 A GENTLE INTRODUCTION TO GENETIC ALGORITHMS 1
What Are Genetic Algorithms? 1
Robustness of Traditional Optimization and Search Methods 2
The Goals of Optimization 6
How Are Genetic Algorithms Different from Traditional Methods? 7
A Simple Genetic Algorithm 10
Genetic Algorithms at Work—a Simulation by Hand 15
Grist for the Search Mill—Important Similarities 18
Similarity Templates (Schemata) 19
Learning the Lingo 21
Summary 22
Problems 23
Computer Assignments 25
2 GENETIC ALGORITHMS REVISITED: MATHEMATICAL FOUNDATIONS 27
Who Shall Live and Who Shall Die? The Fundamental Theorem 28
Schema Processing at Work: An Example by Hand Revisited 33
The Two-armed and k-armed Bandit Problem 36
How Many Schemata Are Processed Usefully? 40
The Building Block Hypothesis 41
Another Perspective: The Minimal Deceptive Problem 46
Schemata Revisited: Similarity Templates as Hyperplanes 53
Summary 54
Problems 55
Computer Assignments 56
3 COMPUTER IMPLEMENTATION OF A GENETIC ALGORITHM 59
Data Structures 60
Reproduction, Crossover, and Mutation 62
A Time to Reproduce, a Time to Cross 66
Get with the Main Program 68
How Well Does it Work? 70
Mapping Objective Functions to Fitness Form 75
Fitness Scaling 76
Codings 80
A Multiparameter, Mapped, Fixed-Point Coding 82
Discretization 84
Constraints 85
Summary 86
Problems 87
Computer Assignments 88
4 SOME APPLICATIONS OF GENETIC ALGORITHMS 89
The Rise of Genetic Algorithms 89
Genetic Algorithm Applications of Historical Interest 92
De Jong and Function Optimization 106
Improvements in Basic Technique 120
Current Applications of Genetic Algorithms 125
Summary 142
Problems 143
Computer Assignments 145
5 ADVANCED OPERATORS AND TECHNIQUES IN GENETIC SEARCH 147
Dominance, Diploidy, and Abeyance 148
Inversion and Other Reordering Operators 166
Other Micro-operators 179
Niche and Speciation 185
Multiobjective Optimization 197
Knowledge-Based Techniques 201
Genetic Algorithms and Parallel Processors 208
Summary 212
Problems 213
Computer Assignments 214
6 INTRODUCTION TO GENETICS-BASED MACHINE LEARNING 217
Genetics-Based Machine Learning: Whence It Came 218
What is a Classifier System? 221
Rule and Message System 223
Apportionment of Credit: The Bucket Brigade 225
Genetic Algorithm 229
A Simple Classifier System in Pascal 230
Results Using the Simple Classifier System 245
Summary 256
Problems 258
Computer Assignments 259
7 APPLICATIONS OF GENETICS-BASED MACHINE LEARNING 261
The Rise of GBML 261
Development of CS-1, the First Classifier System 265
Smith’s Poker Player 270
Other Early GBML Efforts 276
A Potpourri of Current Applications 293
Summary 304
Problems 306
Computer Assignments 307
8 A LOOK BACK, A GLANCE AHEAD 309
APPENDIXES 313
A REVIEW OF COMBINATORICS
AND ELEMENTARY PROBABILITY 313
Counting 313
Permutations 314
Combinations 316
Binomial Theorem 316
Events and Spaces 317
Axioms of Probability 318
Equally Likely Outcomes 319
Conditional Probability 321
Partitions of an Event 321
Bayes’ Rule 322
Independent Events 322
Two Probability Distributions: Bernoulli and Binomial 323
Expected Value of a Random Variable 323
Limit Theorems 324
Summary 324
Problems 325
PASCAL WITH RANDOM NUMBER GENERATION FOR
FORTRAN, BASIC, AND COBOL PROGRAMMERS 327
Simple1: An Extremely Simple Code 327
Simple2: Functions, Procedures, and More I/O 330
Let's Do Something 332
Last Stop Before Freeway 338
Summary 341
A SIMPLE GENETIC ALGORITHM
(SGA) IN PASCAL 343
A SIMPLE CLASSIFIER SYSTEM
(SCS) IN PASCAL 351
PARTITION COEFFICIENT TRANSFORMS FOR
PROBLEM-CODING ANALYSIS 373
Partition Coefficient Transform 374
An Example: f(x) = x² on Three Bits a Day 375
What do the Partition Coefficients Mean? 376
Using Partition Coefficients to Analyze Deceptive Problems 377
Designing GA-Deceptive Problems with Partition Coefficients 377
Summary 378
Problems 378
Computer Assignments 379
BIBLIOGRAPHY 381
INDEX 403

A Gentle Introduction to
Genetic Algorithms
In this chapter, we introduce genetic algorithms: what they are, where they came
from, and how they compare to and differ from other search procedures. We
illustrate how they work with a hand calculation, and we start to understand their
power through the concept of a schema, or similarity template.
WHAT ARE GENETIC ALGORITHMS?
Genetic algorithms are search algorithms based on the mechanics of natural se-
lection and natural genetics. They combine survival of the fittest among string
structures with a structured yet randomized information exchange to form a
search algorithm with some of the innovative flair of human search. In every
generation, a new set of artificial creatures (strings) is created using bits and
pieces of the fittest of the old; an occasional new part is tried for good measure.
While randomized, genetic algorithms are no simple random walk. They efficiently
exploit historical information to speculate on new search points with expected
improved performance.
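The generational cycle just described is compact enough to sketch in a few lines. The book's own programs are written in Pascal (Chapter 3 and Appendix C present the SGA); what follows is only an illustrative sketch in Python, not the book's code, using fitness-proportionate selection, single-point crossover, and bitwise mutation on a made-up count-of-ones fitness function:

```python
import random

def genetic_algorithm(fitness, n_bits=10, pop_size=20, p_cross=0.7,
                      p_mut=0.01, generations=30, seed=1):
    """One run of a minimal generational GA over fixed-length bit strings."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        fits = [fitness(ind) for ind in pop]
        # Fitness-proportionate ("roulette wheel") selection of parents.
        parents = rng.choices(pop, weights=fits, k=pop_size)
        nxt = []
        for a, b in zip(parents[::2], parents[1::2]):
            # Single-point crossover with probability p_cross.
            if rng.random() < p_cross:
                cut = rng.randrange(1, n_bits)
                a, b = a[:cut] + b[cut:], b[:cut] + a[cut:]
            # Bitwise mutation: flip each bit with small probability p_mut.
            nxt.append([bit ^ (rng.random() < p_mut) for bit in a])
            nxt.append([bit ^ (rng.random() < p_mut) for bit in b])
        pop = nxt
    return max(pop, key=fitness)

# Hypothetical fitness: 1 + number of ones (the +1 keeps all weights positive).
best = genetic_algorithm(lambda ind: 1 + sum(ind))
print(best)
```

After a few dozen generations the best string in the final population typically carries far more ones than a random string, illustrating how new creatures are built from bits and pieces of the fittest of the old.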
Genetic algorithms have been developed by John Holland, his colleagues, and
his students at the University of Michigan. The goals of their research have been
twofold: (1) to abstract and rigorously explain the adaptive processes of natural
systems, and (2) to design artificial systems software that retains the important
mechanisms of natural systems. This approach has led to important discoveries
in both natural and artificial systems science.
The central theme of research on genetic algorithms has been robustness,
the balance between efficiency and efficacy necessary for survival in many different
environments. The implications of robustness for artificial systems are manifold.
If artificial systems can be made more robust, costly redesigns can be
reduced or eliminated. If higher levels of adaptation can be achieved, existing
systems can perform their functions longer and better. Designers of artificial systems—both
software and hardware, whether engineering systems, computer systems,
or business systems—can only marvel at the robustness, the efficiency, and
the flexibility of biological systems. Features for self-repair, self-guidance, and reproduction
are the rule in biological systems, whereas they barely exist in the
most sophisticated artificial systems.
Thus, we are drawn to an interesting conclusion: where robust performance
is desired (and where is it not?), nature does it better; the secrets of adaptation
and survival are best learned from the careful study of biological example. Yet
we do not accept the genetic algorithm method by appeal to this beauty-of-nature
argument alone. Genetic algorithms are theoretically and empirically proven to
provide robust search in complex spaces. The primary monograph on the topic
is Holland's (1975) Adaptation in Natural and Artificial Systems. Many papers
and dissertations establish the validity of the technique in function optimization
and control applications. Having been established as a valid approach to problems
requiring efficient and effective search, genetic algorithms are now finding more
widespread application in business, scientific, and engineering circles. The rea-
sons behind the growing numbers of applications are clear. These algorithms are
computationally simple yet powerful in their search for improvement. Further-
more, they are not fundamentally limited by restrictive assumptions about the
search space (assumptions concerning continuity, existence of derivatives, unimodality,
and other matters). We will investigate the reasons behind these attractive
qualities; but before this, we need to explore the robustness of more widely
accepted search procedures.
ROBUSTNESS OF TRADITIONAL OPTIMIZATION
AND SEARCH METHODS
This book is not a comparative study of search and optimization techniques.
Nonetheless, it is important to question whether conventional search methods
meet our robustness requirements. The current literature identifies three main
types of search methods: calculus-based, enumerative, and random. Let us ex-
amine each type to see what conclusions may be drawn without formal testing.
Calculus-based methods have been studied heavily. These subdivide into two
main classes: indirect and direct. Indirect methods seek local extrema by solving
the usually nonlinear set of equations resulting from setting the gradient of the
objective function equal to zero. This is the multidimensional generalization of
the elementary calculus notion of extremal points, as illustrated in Fig. 1.1. Given
a smooth, unconstrained function, finding a possible peak starts by restricting
search to those points with slopes of zero in all directions.

FIGURE 1.1 The single-peak function is easy for calculus-based methods.

On the other hand,
direct (search) methods seek local optima by hopping on the function and mov-
ing in a direction related to the local gradient. This is simply the notion of hill-
climbing: to find the local best, climb the function in the steepest permissible
direction. While both of these calculus-based methods have been improved,
extended, hashed, and rehashed, some simple reasoning shows their lack of
robustness.
First, both methods are local in scope; the optima they seek are the best in a
neighborhood of the current point. For example, suppose that Fig. 1.1 shows a
portion of the complete domain of interest; a more complete picture is shown in
Fig. 1.2. Clearly, starting the search or zero-finding procedures in the neighborhood
of the lower peak will cause us to miss the main event (the higher peak).
Furthermore, once the lower peak is reached, further improvement must be
sought through random restart or other trickery. Second, calculus-based methods
depend upon the existence of derivatives (well-defined slope values). Even if we
allow numerical approximation of derivatives, this is a severe shortcoming. Many
practical parameter spaces have little respect for the notion of a derivative and
the smoothness this implies. Theorists interested in optimization have been too
willing to accept the legacy of the great eighteenth- and nineteenth-century
mathematicians who painted a clean world of quadratic objective functions, ideal
constraints, and ever-present derivatives. The real world of search is fraught with
discontinuities and vast multimodal, noisy search spaces, as depicted in the less
calculus-friendly function of Fig. 1.3. It comes as no surprise that methods depending
upon the restrictive requirements of continuity and derivative existence
are unsuitable for all but a very limited problem domain.

FIGURE 1.2 The multiple-peak function causes a dilemma. Which hill should we climb?

For this reason and
because of their inherently local scope of search, we must reject calculus-based
methods. They are insufficiently robust in unintended domains.
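The local scope that sinks these methods is easy to demonstrate. The following is a hedged sketch (in Python rather than the book's Pascal) of one-dimensional steepest-ascent hill climbing on a hypothetical two-peak function: started on the wrong slope, the climber stops at the lower peak and never sees the higher one.

```python
def hill_climb(f, x, step=0.01, max_iters=10_000):
    """Steepest-ascent hill climbing in one dimension: move to the better
    neighbor until no neighbor improves, then stop at a (possibly local) peak."""
    for _ in range(max_iters):
        best = max((x - step, x, x + step), key=f)
        if best == x:          # no neighbor is better: a peak (local or global)
            break
        x = best
    return x

# A made-up two-peak landscape: a low peak at x = -1 (height 0)
# and the main peak at x = 2 (height 4).
f = lambda x: -(x + 1) ** 2 if x < 0.5 else 4 - (x - 2) ** 2

print(round(hill_climb(f, x=-3), 2))  # started on the lower hill: stuck near -1
print(round(hill_climb(f, x=1), 2))   # started on the higher hill: climbs to ~2
```

Which peak is found depends entirely on the starting point; only a restart (or "other trickery") rescues the first run.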
Enumerative schemes have been considered in many shapes and sizes. The
idea is fairly straightforward; within a finite search space, or a discretized infinite
search space, the search algorithm starts looking at objective function values at
every point in the space, one at a time.

FIGURE 1.3 Many functions are noisy and discontinuous and thus unsuitable for search by traditional methods.

Although the simplicity of this type of
algorithm is attractive, and enumeration is a very human kind of search (when
the number of possibilities is small), such schemes must ultimately be discounted
in the robustness race for one simple reason: lack of efficiency. Many practical
spaces are simply too large to search one at a time and still have a chance of using
the information to some practical end. Even the highly touted enumerative
scheme dynamic programming breaks down on problems of moderate size and
complexity, suffering from a malady melodramatically labeled the "curse of dimensionality"
by its creator (Bellman, 1961). We must conclude that less clever
enumerative schemes are similarly, and more abundantly, cursed for real
problems.
Random search algorithms have achieved increasing popularity as researchers
have recognized the shortcomings of calculus-based and enumerative
schemes. Yet, random walks and random schemes that search and save the best
must also be discounted because of the efficiency requirement. Random searches,
in the long run, can be expected to do no better than enumerative schemes. In
our haste to discount strictly random search methods, we must be careful to
separate them from randomized techniques. The genetic algorithm is an example
of a search procedure that uses random choice as a tool to guide a highly exploitative
search through a coding of a parameter space. Using random choice as a
tool in a directed search process seems strange at first, but nature contains many
examples. Another currently popular search technique, simulated annealing,
uses random processes to help guide its form of search for minimal energy states.
A recent book (Davis, 1987) explores the connections between simulated annealing
and genetic algorithms. The important thing to recognize at this juncture
is that randomized search does not necessarily imply directionless search.
While our discussion has been no exhaustive examination of the myriad
methods of traditional optimization, we are left with a somewhat unsettling con-
clusion: conventional search methods are not robust. This does not imply that
they are not useful. The schemes mentioned and countless hybrid combinations
and permutations have been used successfully in many applications; however, as
more complex problems are attacked, other methods will be necessary. To put
this point in better perspective, inspect the problem spectrum of Fig. 1.4. In the
figure a mythical effectiveness index is plotted across a problem continuum for a
specialized scheme, an enumerative scheme, and an idealized robust scheme. The
gradient technique performs well in its narrow problem class, as we expect, but
it becomes highly inefficient (if useful at all) elsewhere. On the other hand, the
enumerative scheme performs with egalitarian inefficiency across the spectrum
of problems, as shown by the lower performance curve. Far more desirable would
be a performance curve like the one labeled Robust Scheme. It would be worth-
while sacrificing peak performance on a particular problem to achieve a relatively
high level of performance across the spectrum of problems. (Of course, with
broad, efficient methods we can always create hybrid schemes that combine the
best of the local search method with the more general robust scheme. We will
have more to say about this possibility in Chapter 5.) We shall soon see how
genetic algorithms help fill this robustness gap.

FIGURE 1.4 Many traditional schemes work well in a narrow problem domain.
Enumerative schemes and random walks work equally inefficiently across a broad
spectrum. A robust method works well across a broad spectrum of problems.
THE GOALS OF OPTIMIZATION
Before examining the mechanics and power of a simple genetic algorithm, we
must be clearer about our goals when we say we want to optimize a function or
a process. What are we trying to accomplish when we optimize? The conventional
view is presented well by Beightler, Phillips, and Wilde (1979, p. 1):

Man's longing for perfection finds expression in the theory of optimization.
It studies how to describe and attain what is Best, once one knows
how to measure and alter what is Good or Bad. ... Optimization theory
encompasses the quantitative study of optima and methods for finding
them.

Thus optimization seeks to improve performance toward some optimal point or
points. Note that this definition has two parts: (1) we seek improvement to approach
some (2) optimal point. There is a clear distinction between the process
of improvement and the destination or optimum itself. Yet, in judging optimization
procedures we commonly focus solely upon convergence (does the method
reach the optimum?) and forget entirely about interim performance. This emphasis
stems from the origins of optimization in the calculus. It is not, however, a
natural emphasis.
Consider a human decision maker, for example, a businessman. How do we
judge his decisions? What criteria do we use to decide whether he has done
a good or bad job? Usually we say he has done well when he makes adequate
selections within the time and resources allotted. Goodness is judged relative
to his competition. Does he produce a better widget? Does he get it to market
more efficiently? With better promotion? We never judge a businessman by an
attainment-of-the-best criterion; perfection is all too stern a taskmaster. As a result,
we conclude that convergence to the best is not an issue in business or in
most walks of life; we are only concerned with doing better relative to others.
Thus, if we want more humanlike optimization tools, we are led to a reordering
of the priorities of optimization. The most important goal of optimization is improvement.
Can we get to some good, "satisficing" (Simon, 1969) level of performance
quickly? Attainment of the optimum is much less important for
complex systems. It would be nice to be perfect; meanwhile, we can only strive
to improve. In the next chapter we watch the genetic algorithm for these qualities;
here we outline some important differences between genetic algorithms and
more traditional methods.
HOW ARE GENETIC ALGORITHMS DIFFERENT FROM
TRADITIONAL METHODS?
In order for genetic algorithms to surpass their more traditional cousins in the
quest for robustness, GAs must differ in some very fundamental ways. Genetic
algorithms are different from more normal optimization and search procedures
in four ways:
1. GAs work with a coding of the parameter set, not the parameters themselves.
2. GAs search from a population of points, not a single point.
3. GAs use payoff (objective function) information, not derivatives or other
auxiliary knowledge.
4. GAs use probabilistic transition rules, not deterministic rules.
Genetic algorithms require the natural parameter set of the optimization
problem to be coded as a finite-length string over some finite alphabet. As an
example, consider the optimization problem posed in Fig. 1.5. We wish to maximize
the function f(x) = x² on the integer interval [0, 31]. With more traditional
methods we would be tempted to twiddle with the parameter x, turning it like
the vertical hold knob on a television set, until we reached the highest objective
function value. With GAs, the first step of our optimization process is to code the
parameter x as a finite-length string. There are many ways to code the x parameter,
and Chapter 3 examines some of these in detail. At the moment, let's consider
an optimization problem where the coding comes a bit more naturally.

Consider the black box switching problem illustrated in Fig. 1.6. This problem
concerns a black box device with a bank of five input switches. For every
setting of the five switches, there is an output signal f, mathematically f = f(s),
FIGURE 1.5 A simple function optimization example, the function f(x) = x² on
the integer interval [0, 31].
where s is a particular setting of the five switches. The objective of the problem
is to set the switches to obtain the maximum possible f value. With other methods
of optimization we might work directly with the parameter set (the switch
settings) and toggle switches from one setting to another using the transition
rules of our particular method. With genetic algorithms, we first code the
switches as a finite-length string. A simple code can be generated by considering
a string of five 1's and 0's where each of the five switches is represented by a 1 if
the switch is on and a 0 if the switch is off. With this coding, the string 11110
codes the setting where the first four switches are on and the fifth switch is off.
Some of the codings introduced later will not be so obvious, but at this juncture
we acknowledge that genetic algorithms use codings. Later it will be apparent
that genetic algorithms exploit coding similarities in a very general way. As a
result, they are largely unconstrained by the limitations of other methods (continuity,
derivative existence, unimodality, and so on).

FIGURE 1.6 A black box optimization problem with five on-off switches illustrates
the idea of a coding and a payoff measure. Genetic algorithms only require
these two things: they don't need to know the workings of the black box.
In many optimization methods, we move gingerly from a single point in the
decision space to the next using some transition rule to determine the next point.
This point-to-point method is dangerous because it is a perfect prescription for
locating false peaks in multimodal (many-peaked) search spaces. By contrast, GAs
work from a rich database of points simultaneously (a population of strings),
climbing many peaks in parallel; thus, the probability of finding a false peak is
reduced over methods that go point to point. As an example, let's consider our
black box optimization problem (Fig. 1.6) again. Other techniques for solving
this problem might start with one set of switch settings, apply some transition
rules, and generate a new trial switch setting. A genetic algorithm starts with a
population of strings and thereafter generates successive populations of strings.
For example, in the five-switch problem, a random start using successive coin
flips (head = 1, tail = 0) might generate the initial population of size n = 4
(small by genetic algorithm standards):
01101
11000
01000
10011
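A random start like this is easy to sketch in Python (an illustrative sketch, not code from the text; the function name random_population is our own):

```python
import random

def random_population(n, l, rng=random):
    """Build n binary strings of length l, one fair coin flip per bit."""
    # heads = 1, tails = 0, as in the text's coin-toss procedure
    return ["".join(rng.choice("01") for _ in range(l)) for _ in range(n)]

population = random_population(n=4, l=5)
```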
After this start, successive populations are generated using the genetic algorithm.
By working from a population of well-adapted diversity instead of a single point,
the genetic algorithm adheres to the old adage that there is safety in numbers;
we will soon see how this parallel flavor contributes to a genetic algorithm's
robustness.
Many search techniques require much auxiliary information in order to work
properly. For example, gradient techniques need derivatives (calculated analytically
or numerically) in order to be able to climb the current peak, and other
local search procedures like the greedy techniques of combinatorial optimization
(Lawler, 1976; Syslo, Deo, and Kowalik, 1983) require access to most if not all
tabular parameters. By contrast, genetic algorithms have no need for all this auxiliary
information: GAs are blind. To perform an effective search for better and
better structures, they only require payoff values (objective function values) associated
with individual strings. This characteristic makes a GA a more canonical
method than many search schemes. After all, every search problem has a metric
(or metrics) relevant to the search; however, different search problems have
vastly different forms of auxiliary information. Only if we refuse to use this auxiliary
information can we hope to develop the broadly based schemes we desire.
On the other hand, the refusal to use specific knowledge when it does exist can
place an upper bound on the performance of an algorithm when it goes head to
head with methods designed for that problem. Chapter 5 examines ways to use
nonpayoff information in so-called knowledge-directed genetic algorithms; however,
at this juncture we stress the importance of the blindness assumption to
pure genetic algorithm robustness.
Unlike many methods, GAs use probabilistic transition rules to guide their
search. To persons familiar with deterministic methods this seems odd, but the
use of probability does not suggest that the method is some simple random
search; this is not decision making at the toss of a coin. Genetic algorithms use
random choice as a tool to guide a search toward regions of the search space with
likely improvement.

Taken together, these four differences—direct use of a coding, search from a
population, blindness to auxiliary information, and randomized operators—contribute
to a genetic algorithm's robustness and resulting advantage over other
more commonly used techniques. The next section introduces a simple three-operator
genetic algorithm.
A SIMPLE GENETIC ALGORITHM
The mechanics of a simple genetic algorithm are surprisingly simple, involving
nothing more complex than copying strings and swapping partial strings. The
explanation of why this simple process works is much more subtle and powerful.
Simplicity of operation and power of effect are two of the main attractions of the
genetic algorithm approach.

The previous section pointed out how genetic algorithms process populations
of strings. Recalling the black box switching problem, remember that the
initial population had four strings:

01101
11000
01000
10011
Also recall that this population was chosen at random through 20 successive flips
of an unbiased coin. We now must define a set of simple operations that take this
initial population and generate successive populations that (we hope) improve
over time.
A simple genetic algorithm that yields good results in many practical prob-
lems is composed of three operators:
1. Reproduction
2. Crossover
3. Mutation
Reproduction is a process in which individual strings are copied according
to their objective function values, f (biologists call this function the fitness function).
Intuitively, we can think of the function f as some measure of profit, utility,
or goodness that we want to maximize. Copying strings according to their fitness
values means that strings with a higher value have a higher probability of contributing
one or more offspring in the next generation. This operator, of course,
is an artificial version of natural selection, a Darwinian survival of the fittest
TABLE 1.1 Sample Problem Strings and Fitness Values

No.     String    Fitness    % of Total
1       01101     169        14.4
2       11000     576        49.2
3       01000      64         5.5
4       10011     361        30.9
Total             1170      100.0
among string creatures. In natural populations fitness is determined by a creature's
ability to survive predators, pestilence, and the other obstacles to adulthood
and subsequent reproduction. In our unabashedly artificial setting, the
objective function is the final arbiter of the string-creature's life or death.

The reproduction operator may be implemented in algorithmic form in a
number of ways. Perhaps the easiest is to create a biased roulette wheel where
each current string in the population has a roulette wheel slot sized in proportion
to its fitness. Suppose the sample population of four strings in the black box
problem has objective or fitness function values f as shown in Table 1.1 (for now
we accept these values as the output of some unknown and arbitrary black box—
later we will examine a function and coding that generate these same values).
Summing the fitness over all four strings, we obtain a total of 1170. The
percentage of population total fitness is also shown in the table. The corresponding
weighted roulette wheel for this generation's reproduction is shown in Fig.
1.7. To reproduce, we simply spin the weighted roulette wheel thus defined four
times. For the example problem, string number 1 has a fitness value of 169, which
represents 14.4 percent of the total fitness. As a result, string 1 is given 14.4
percent of the biased roulette wheel, and each spin turns up string 1 with probability
0.144.

FIGURE 1.7 Simple reproduction allocates offspring strings using a roulette
wheel with slots sized according to fitness. The sample wheel is sized for the
problem of Tables 1.1 and 1.2.

Each time we require another offspring, a simple spin of the
weighted roulette wheel yields the reproduction candidate. In this way, more
highly fit strings have a higher number of offspring in the succeeding generation.
Once a string has been selected for reproduction, an exact replica of the string
is made. This string is then entered into a mating pool, a tentative new population,
for further genetic operator action.
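The biased roulette wheel can be sketched as follows (a minimal illustration using the fitness values of Table 1.1; roulette_select is our own name, not the book's):

```python
import random

def roulette_select(population, fitnesses, rng=random):
    """One spin of a wheel whose slots are sized in proportion to fitness."""
    spin = rng.uniform(0, sum(fitnesses))   # a point on the wheel's rim
    cumulative = 0.0
    for string, fit in zip(population, fitnesses):
        cumulative += fit                   # walk the wheel slot by slot
        if spin <= cumulative:
            return string
    return population[-1]                   # guard against round-off

population = ["01101", "11000", "01000", "10011"]
fitnesses = [169, 576, 64, 361]             # the Table 1.1 values
mating_pool = [roulette_select(population, fitnesses) for _ in population]
```

Over many spins, string 2 (which holds 49.2 percent of the wheel) turns up roughly half the time.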
After reproduction, simple crossover (Fig. 1.8) may proceed in two steps.
First, members of the newly reproduced strings in the mating pool are mated at
random. Second, each pair of strings undergoes crossing over as follows: an integer
position k along the string is selected uniformly at random between 1 and
the string length less one [1, l - 1]. Two new strings are created by swapping all
characters between positions k + 1 and l inclusively. For example, consider
strings A₁ and A₂ from our example initial population:

A₁ = 0 1 1 0 | 1
A₂ = 1 1 0 0 | 0

Suppose in choosing a random number between 1 and 4, we obtain a k = 4 (as
indicated by the separator symbol |). The resulting crossover yields two new
strings where the prime (′) means the strings are part of the new generation:

A₁′ = 0 1 1 0 0
A₂′ = 1 1 0 0 1
FIGURE 1.8 A schematic of simple crossover shows the alignment of two
strings and the partial exchange of information, using a cross site chosen at
random.
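In code, simple crossover is a swap of string tails (an illustrative sketch; with k = 4 it reproduces the A₁, A₂ example above):

```python
import random

def crossover(a, b, k=None, rng=random):
    """Swap all characters after site k (k uniform on [1, l-1] if not given)."""
    if k is None:
        k = rng.randint(1, len(a) - 1)
    return a[:k] + b[k:], b[:k] + a[k:]

a1_prime, a2_prime = crossover("01101", "11000", k=4)
# a1_prime == "01100" and a2_prime == "11001", as in the text
```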
The mechanics of reproduction and crossover are surprisingly simple, involving
random number generation, string copies, and some partial string exchanges.
Nonetheless, the combined emphasis of reproduction and the structured, though
randomized, information exchange of crossover give genetic algorithms much of
their power. At first this seems surprising. How can two such simple (and computationally
trivial) operators result in anything useful, let alone a rapid and robust
search mechanism? Furthermore, doesn't it seem a little strange that chance
should play such a fundamental role in a directed search process? We will examine
a partial answer to the first of these two questions in a moment; the answer
to the second question was well recognized by the mathematician J. Hadamard
(1949, p. 29):

We shall see a little later that the possibility of imputing discovery to
pure chance is already excluded. ... On the contrary, that there is an
intervention of chance but also a necessary work of unconsciousness,
the latter implying and not contradicting the former. ... Indeed, it is obvious
that invention or discovery, be it in mathematics or anywhere else,
takes place by combining ideas.

Hadamard suggests that even though discovery is not a result—cannot be a result—of
pure chance, it is almost certainly guided by directed serendipity. Furthermore,
Hadamard hints that a proper role for chance in a more humanlike
discovery mechanism is to cause the juxtaposition of different notions. It is interesting
that genetic algorithms adopt Hadamard's mix of direction and chance
in a manner that efficiently builds new solutions from the best partial solutions
of previous trials.
To see this, consider a population of n strings (perhaps the four-string population
for the black box problem) over some appropriate alphabet, coded so
that each is a complete idea or prescription for performing a particular task (in
this case, each string is one complete switch-setting idea). Substrings within each
string (idea) contain various notions of what is important or relevant to the task.
Viewed in this way, the population contains not just a sample of n ideas; rather,
it contains a multitude of notions and rankings of those notions for task performance.
Genetic algorithms ruthlessly exploit this wealth of information by (1)
reproducing high-quality notions according to their performance and (2) crossing
these notions with many other high-performance notions from other strings.
Thus, the action of crossover with previous reproduction speculates on new ideas
constructed from the high-performance building blocks (notions) of past trials.

In passing, we note that despite the somewhat fuzzy definition of a notion, we
have not limited a notion to simple linear combinations of single features or pairs
of features. Biologists have long recognized that evolution must efficiently process
the epistasis (positionwise nonlinearity) that arises in nature. In a similar
manner, the notion processing of genetic algorithms must effectively process notions
even when they depend upon their component features in highly nonlinear
and complex ways.
Exchanging of notions to form new ideas is appealing intuitively, if we think
in terms of the process of innovation. What is an innovative idea? As Hadamard
suggests, most often it is a juxtaposition of things that have worked well in the
past. In much the same way, reproduction and crossover combine to search potentially
pregnant new ideas. This experience of emphasis and crossing is analogous
to the human interaction many of us have observed at a trade show or
scientific conference. At a widget conference, for example, various widget experts
from around the world gather to discuss the latest in widget technology.
After the lecture sessions, they all pair off around the bar to exchange widget
stories. Well-known widget experts, of course, are in greater demand and exchange
more ideas, thoughts, and notions with their lesser known widget colleagues.
When the show ends, the widget people return to their widget
laboratories to try out a surfeit of widget innovations. The process of reproduction
and crossover in a genetic algorithm is this kind of exchange. High-performance
notions are repeatedly tested and exchanged in the search for better and
better performance.
If reproduction according to fitness combined with crossover gives genetic
algorithms the bulk of their processing power, what then is the purpose of the
mutation operator? Not surprisingly, there is much confusion about the role of
mutation in genetics (both natural and artificial). Perhaps it is the result of too
many B movies detailing the exploits of mutant eggplants that consume mass
quantities of Tokyo or Chicago, but whatever the cause for the confusion, we find
that mutation plays a decidedly secondary role in the operation of genetic algorithms.
Mutation is needed because, even though reproduction and crossover
effectively search and recombine extant notions, occasionally they may become
overzealous and lose some potentially useful genetic material (1's or 0's at particular
locations). In artificial genetic systems, the mutation operator protects
against such an irrecoverable loss. In the simple GA, mutation is the occasional
(with small probability) random alteration of the value of a string position. In the
binary coding of the black box problem, this simply means changing a 1 to a 0
and vice versa. By itself, mutation is a random walk through the string space.
When used sparingly with reproduction and crossover, it is an insurance policy
against premature loss of important notions.
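Bitwise mutation is equally brief in code (an illustrative sketch; the rate 0.001 matches the hand simulation later in the chapter):

```python
import random

def mutate(string, p_m=0.001, rng=random):
    """Flip each bit independently with small probability p_m."""
    flip = {"0": "1", "1": "0"}
    return "".join(flip[bit] if rng.random() < p_m else bit for bit in string)

mutated = mutate("11011")   # almost always returned unchanged at p_m = 0.001
```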
To see that the mutation operator plays a secondary role in the simple GA, we simply
note that the frequency of mutation needed to obtain good results in empirical
genetic algorithm studies is on the order of one mutation per thousand bit (position)
transfers. Mutation rates are similarly small (or smaller) in natural populations,
leading us to conclude that mutation is appropriately considered as a
secondary mechanism of genetic algorithm adaptation.
Other genetic operators and reproductive plans have been abstracted from
the study of biological example. However, the three examined in this section,
reproduction, simple crossover, and mutation, have proved to be both computationally
simple and effective in attacking a number of important optimization
problems. In the next section, we perform a hand simulation of the simple genetic
algorithm to demonstrate both its mechanics and its power.
GENETIC ALGORITHMS AT WORK—A SIMULATION BY HAND
Let's apply our simple genetic algorithm to a particular optimization problem step
by step. Consider the problem of maximizing the function f(x) = x², where x is
permitted to vary between 0 and 31, a function displayed earlier as Fig. 1.5. To
use a genetic algorithm we must first code the decision variables of our problem
as some finite-length string. For this problem, we will code the variable x simply
as a binary unsigned integer of length 5. Before we proceed with the simulation,
let's briefly review the notion of a binary integer. As decadigited creatures, we
have little problem handling base 10 integers and arithmetic. For example, the
five-digit number 53,095 may be thought of as

5·10⁴ + 3·10³ + 0·10² + 9·10¹ + 5·10⁰ = 53,095.

In base 2 arithmetic, we of course only have two digits to work with, 0 and 1,
and as an example the number 10011 decodes to the base 10 number

1·2⁴ + 0·2³ + 0·2² + 1·2¹ + 1·2⁰ = 16 + 2 + 1 = 19.
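The same decoding is easy to express in code (a sketch; Python's built-in int(s, 2) performs the identical computation):

```python
def decode(string):
    """Decode an unsigned binary string to its base 10 value."""
    value = 0
    for bit in string:
        value = value * 2 + int(bit)   # shift left one place, add the new bit
    return value

decode("10011")   # 16 + 2 + 1 = 19
decode("11111")   # 31, the largest five-bit value
```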
With a five-bit (binary digit) unsigned integer we can obtain numbers between
0 (00000) and 31 (11111). With a well-defined objective function and coding,
we now simulate a single generation of a genetic algorithm with reproduction,
crossover, and mutation.
To start off, we select an initial population at random. We select a population
of size 4 by tossing a fair coin 20 times. We can skip this step by using the initial
population created in this way earlier for the black box switching problem. Looking
at this population, shown on the left-hand side of Table 1.2, we observe that
the decoded x values are presented along with the fitness or objective function
values f(x). To make sure we know how the fitness values f(x) are calculated
from the string representation, let's take a look at the third string of the initial
population, string 01000. Decoding this string as an unsigned binary integer, we
note that there is a single one in the 2³ = 8's position. Hence for string 01000
we obtain x = 8. To calculate the fitness or objective function we simply square
the x value and obtain the resulting fitness value f(x) = 64. Other x and f(x)
values may be obtained similarly.

You may notice that the fitness or objective function values are the same as
the black box values (compare Tables 1.1 and 1.2). This is no coincidence, and
the black box optimization problem was well represented by the particular function,
f(x), and coding we are now using. Of course, the genetic algorithm need
not know any of this; it is just as happy to optimize some arbitrary switching
function (or any other finite coding and function for that matter) as some polynomial
function with straightforward binary coding. This discussion simply reinforces
one of the strengths of the genetic algorithm: by exploiting similarities in
codings, genetic algorithms can deal effectively with a broader class of functions
than can many other procedures.
A generation of the genetic algorithm begins with reproduction. We select
the mating pool of the next generation by spinning the weighted roulette wheel
TABLE 1.2 A Genetic Algorithm by Hand

         Initial
         Population    x Value      f(x)    pselect_i   Expected    Actual Count
         (Randomly     (Unsigned    = x²    (f_i/Σf)    Count       (from Roulette
No.      Generated)    Integer)                         (f_i/f̄)     Wheel)
1        01101         13           169     0.14        0.58        1
2        11000         24           576     0.49        1.97        2
3        01000          8            64     0.06        0.22        0
4        10011         19           361     0.31        1.23        1
Sum                                 1170    1.00        4.00        4
Average                              293    0.25        1.00        1
Max                                  576    0.49        1.97        2
(shown in Fig. 1.7) four times. Actual simulation of this process using coin tosses
has resulted in string 1 and string 4 receiving one copy in the mating pool, string
2 receiving two copies, and string 3 receiving no copies, as shown in the center
of Table 1.2. Comparing this with the expected number of copies (n·pselect_i) we
have obtained what we should expect: the best get more copies, the average stay
even, and the worst die off.
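The expected copy counts are easy to verify (a small check using the Table 1.2 fitness values; n·pselect_i reduces to f_i divided by the population average f̄):

```python
fitnesses = [169, 576, 64, 361]               # Table 1.2 fitness values
average = sum(fitnesses) / len(fitnesses)     # f-bar = 292.5
expected = [f / average for f in fitnesses]   # roughly 0.58, 1.97, 0.22, 1.23
# rounding each to the nearest integer gives the actual counts 1, 2, 0, 1
```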
With an active pool of strings looking for mates, simple crossover proceeds
in two steps: (1) strings are mated randomly, using coin tosses to pair off the
happy couples, and (2) mated string couples cross over, using coin tosses to
select the crossing sites. Referring again to Table 1.2, random choice of mates has
selected the second string in the mating pool to be mated with the first. With a
crossing site of 4, the two strings 01101 and 11000 cross and yield two new
strings 01100 and 11001. The remaining two strings in the mating pool are
crossed at site 2; the resulting strings may be checked in the table.

The last operator, mutation, is performed on a bit-by-bit basis. We assume
that the probability of mutation in this test is 0.001. With 20 transferred bit positions
we should expect 20 × 0.001 = 0.02 bits to undergo mutation during a
given generation. Simulation of this process indicates that no bits undergo mutation
for this probability value. As a result, no bit positions are changed from 0
to 1 or vice versa during this generation.
Following reproduction, crossover, and mutation, the new population is
ready to be tested. To do this, we simply decode the new strings created by the
simple genetic algorithm and calculate the fitness function values from the x
values thus decoded. The results of a single generation of the simulation are
shown at the right of Table 1.2. While drawing concrete conclusions from a single
trial of a stochastic process is, at best, a risky business, we start to see how genetic
algorithms combine high-performance notions to achieve better performance. In
the table, note how both the maximal and average performance have improved
in the new population. The population average fitness has improved from 293 to
TABLE 1.2 (Continued)

Mating Pool after      Mate         Crossover Site   New          x       f(x)
Reproduction           (Randomly    (Randomly        Population   Value   = x²
(Cross Site Shown)     Selected)    Selected)
0110|1                 2            4                01100        12      144
1100|0                 1            4                11001        25      625
11|000                 4            2                11011        27      729
10|011                 3            2                10000        16      256
Sum                                                                      1754
Average                                                                   439
Max                                                                       729

Notes:
1) Initial population chosen by four repetitions of five coin tosses where heads = 1, tails = 0.
2) Reproduction performed through 1 part in 8 simulation of roulette wheel selection (three
coin tosses).
3) Crossover performed through binary decoding of 2 coin tosses (TT = 00₂ = 0 = cross site 1,
HH = 11₂ = 3 = cross site 4).
4) Crossover probability assumed to be unity, p_c = 1.0.
5) Mutation probability assumed to be 0.001, p_m = 0.001. Expected mutations = 5·4·0.001 =
0.02. No mutations expected during a single generation. None simulated.
439 in one generation. The maximum fitness has increased from 576 to 729 during
that same period. Although random processes help cause these happy circumstances,
we start to see that this improvement is no fluke. The best string of the
first generation (11000) receives two copies because of its high, above-average
performance. When this combines at random with the next highest string
(10011) and is crossed at location 2 (again at random), one of the resulting
strings (11011) proves to be a very good choice indeed.
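The whole hand-simulated generation can be condensed into one routine (a compact sketch under the same assumptions, p_c = 1.0 with mutation omitted since its expected effect here is only 0.02 bit flips; it uses Python's random.choices for the roulette wheel rather than the text's coin-toss simulation, and generation is our own name):

```python
import random

def fitness(string):
    return int(string, 2) ** 2            # f(x) = x squared on the decoded value

def generation(population, rng):
    """Roulette-wheel reproduction, random pairing, single-point crossover."""
    fits = [fitness(s) for s in population]
    pool = rng.choices(population, weights=fits, k=len(population))
    rng.shuffle(pool)                     # pair off the mating pool at random
    offspring = []
    for a, b in zip(pool[::2], pool[1::2]):
        k = rng.randint(1, len(a) - 1)    # cross site uniform on [1, l-1]
        offspring += [a[:k] + b[k:], b[:k] + a[k:]]
    return offspring

rng = random.Random(0)
new_pop = generation(["01101", "11000", "01000", "10011"], rng)
```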
This event is an excellent illustration of the ideas and notions analogy developed
in the previous section. In this case, the resulting good idea is the combination
of two above-average notions, namely the substrings 11--- and ---11.
Although the argument is still somewhat heuristic, we start to see how genetic
algorithms effect a robust search. In the next section, we expand our understanding
of these concepts by analyzing genetic algorithms in terms of schemata or
similarity templates.
The intuitive viewpoint developed thus far has much appeal. We have compared
the genetic algorithm with certain human search processes commonly
called innovative or creative. Furthermore, hand simulation of the simple genetic
algorithm has given us some confidence that indeed something interesting is
going on here. Yet, something is missing. What is being processed by genetic
algorithms and how do we know whether processing it (whatever it is) will lead
to optimal or near optimal results in a particular problem? Clearly, as scientists,
engineers, and business managers we need to understand the what and the how
of genetic algorithm performance.
To obtain this understanding, we examine the raw data available for any
search procedure and discover that we can search more effectively if we exploit
important similarities in the coding we use. This leads us to develop the important
notion of a similarity template, or schema. This in turn leads us to a keystone
of the genetic algorithm approach, the building block hypothesis.
GRIST FOR THE SEARCH MILL—IMPORTANT SIMILARITIES
For much too long we have ignored a fundamental question. In a search process
given only payoff data (fitness values), what information is contained in a population
of strings and their objective function values to help guide a directed
search for improvement? To ask this question more clearly, consider the strings
and fitness values originally displayed in Table 1.1 from the simulation of the
previous section (the black box problem) and gathered below for convenience:

String    Fitness
01101     169
11000     576
01000      64
10011     361
What information is contained in this population to guide a directed search for
improvement? On the face of it, there is not very much: four independent samples
of different strings with their fitness values. As we stare at the page, however,
quite naturally we start scanning up and down the string column, and we notice
certain similarities among the strings. Exploring these similarities in more depth,
we notice that certain string patterns seem highly associated with good performance.
The longer we stare at the strings and their fitness values, the greater is
the temptation to experiment with these high fitness associations. It seems perfectly
reasonable to play mix and match with some of the substrings that are
highly correlated with past success. For example, in the sample population, the
strings starting with a 1 seem to be among the best. Might this be an important
ingredient in optimizing this function? Certainly with our function (f(x) = x²)
and our coding (a five-bit unsigned integer) we know it is (why is this true?).

But what are we doing here? Really, two separate things. First, we are seeking
similarities among strings in the population. Second, we are looking for causal
relationships between these similarities and high fitness. In so doing, we admit a
wealth of new information to help guide a search. To see how much and precisely
what information we admit, let us consider the important concept of a schema
(plural, schemata), or similarity template.
SIMILARITY TEMPLATES (SCHEMATA)
In some sense we are no longer interested in strings as strings alone. Since important
similarities among highly fit strings can help guide a search, we question
how one string can be similar to its fellow strings. Specifically we ask, in what
ways is a string a representative of other string classes with similarities at certain
string positions? The framework of schemata provides the tool to answer these
questions.

A schema (Holland, 1968, 1975) is a similarity template describing a subset
of strings with similarities at certain string positions. For this discussion, let us
once again limit ourselves without loss of generality to the binary alphabet {0, 1}.
We motivate a schema most easily by appending a special symbol to this alphabet;
we add the * or don't care symbol. With this extended alphabet we can now
create strings (schemata) over the ternary alphabet {0, 1, *}, and the meaning of
the schema is clear if we think of it as a pattern matching device: a schema
matches a particular string if at every location in the schema a 1 matches a 1 in
the string, a 0 matches a 0, or a * matches either. As an example, consider the
strings and schemata of length 5. The schema *0000 matches two strings, namely
{10000, 00000}. As another example, the schema *111* describes a subset with
four members {01110, 01111, 11110, 11111}. As one last example, the schema
0*1** matches any of the eight strings of length 5 that begin with a 0 and have a
1 in the third position. As you can start to see, the idea of a schema gives us a
powerful and compact way to talk about all the well-defined similarities among
finite-length strings over a finite alphabet. We should emphasize that the * is only
a metasymbol (a symbol about other symbols); it is never explicitly processed
by the genetic algorithm. It is simply a notational device that allows description
of all possible similarities among strings of a particular length and alphabet.
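The pattern-matching role of a schema is captured by a two-line predicate (an illustrative sketch; matches is our own name):

```python
def matches(schema, string):
    """True if every schema position is * or equals the string's character."""
    return all(s in ("*", c) for s, c in zip(schema, string))

matches("*111*", "01110")   # True: one of the subset's four members
matches("0*1**", "01000")   # False: the third position must be a 1
```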
Counting the total number of possible schemata is an enlightening exercise.
In the previous example, with ℓ = 5, we note there are 3·3·3·3·3 = 3^5 = 243
different similarity templates because each of the five positions may be a 0, 1,
or *. In general, for alphabets of cardinality (number of alphabet characters) k
there are (k + 1)^ℓ schemata. At first blush, it appears that schemata are making
the search more difficult. For an alphabet with k elements there are only (only?) k^ℓ
different strings of length ℓ. Why consider the (k + 1)^ℓ schemata and enlarge
the space of concern? Put another way, the length 5 example now has only 2^5 =
32 different alternative strings. Why make matters more difficult by considering
3^5 = 243 schemata? In fact, the reasoning discussed in the previous section makes
things easier. Do you recall glancing up and down the list of four strings and
fitness values and trying to figure out what to do next? We recognized that if we
considered the strings separately, then we only had four pieces of information;
Chapter 1 / A Gentle Introduction to Genetic Algorithms
however, when we considered the strings, their fitness values, and the similarities
among the strings in the population, we admitted a wealth of new information to
help direct our search. How much information do we admit by considering the
similarities? The answer to this question is related to the number of unique sche-
mata contained in the population. To count this quantity exactly requires knowl-
edge of the strings in a particular population. To get a bound on the number of
schemata in a particular population, we first count the number of schemata con-
tained in an individual string, and then we get an upper bound on the total num-
ber of schemata in the population.
To see this, consider a single string of length 5: 11111, for example. This
string is a member of 2^5 schemata because each position may take on its actual
value or a don't care symbol. In general, a particular string contains 2^ℓ schemata.
As a result, a population of size n contains somewhere between 2^ℓ and n·2^ℓ
schemata, depending upon the population diversity. This fact verifies our earlier
intuition. The original motivation for considering important similarities was to get
more information to help guide our search. The counting argument shows that a
wealth of information about important similarities is indeed contained in even
moderately sized populations. We will examine how genetic algorithms effec-
tively exploit this information. At this juncture, some parallel processing appears
to be needed if we are to make use of all this information in a timely fashion.
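The bound of 2^ℓ to n·2^ℓ schemata is simple to check by enumeration. A Python sketch (the sample population below is illustrative, not taken from the text):

```python
from itertools import product

def schemata_of(string):
    """The 2**len(string) schemata that a single string belongs to:
    each position either keeps its bit or is generalized to '*'."""
    return {''.join(s) for s in product(*[(c, '*') for c in string])}

# An illustrative population of n = 4 strings of length l = 5
population = ["01101", "11000", "01000", "10011"]
unique = set().union(*(schemata_of(s) for s in population))

print(len(schemata_of("01101")))          # 32 schemata per string (2**5)
print(2**5 <= len(unique) <= 4 * 2**5)    # True: between 2**l and n * 2**l
```

Because strings share schemata (every string matches *****, for example), the unique count falls strictly between the two bounds for any diverse population.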
These counting arguments are well and good, but where does this all lead?
More pointedly, of the 2^ℓ to n·2^ℓ schemata contained in a population, how many
are actually processed in a useful manner by the genetic algorithm? To obtain the
answer to this question, we consider the effect of reproduction, crossover, and
mutation on the growth or decay of important schemata from generation to gen-
eration. The effect of reproduction on a particular schema is easy to determine;
since more highly fit strings have higher probabilities of selection, on average we
give an ever increasing number of samples to the observed best similarity pat-
terns (this is a good thing to do, as is shown in the next chapter); however,
reproduction alone samples no new points in the space. What then happens to a
particular schema when crossover is introduced? Crossover leaves a schema un-
scathed if it does not cut the schema, but it may disrupt a schema when it does.
For example, consider the two schemata 1***0 and **11*. The first is likely to be
disrupted by crossover, whereas the second is relatively unlikely to be destroyed.
As a result, schemata of short defining length are left alone by crossover and
reproduced at a good sampling rate by the reproduction operator. Mutation at nor-
mal, low rates does not disrupt a particular schema very frequently, and we are
left with a startling conclusion. Highly fit, short-defining-length schemata (we call
them building blocks) are propagated generation to generation by giving expo-
nentially increasing samples to the observed best; all this goes on in parallel with no
special bookkeeping or special memory other than our population of strings.
In the next chapter we will count how many schemata are processed usefully in
each generation. It turns out that the number is something like n^3. This compares
favorably with the number of function evaluations (n). Because this processing
leverage is so important (and apparently unique to genetic algorithms), we give
it a special name, implicit parallelism.
LEARNING THE LINGO
The power behind the simple operations of our genetic algorithm is at least in-
tuitively clearer if we think of building blocks. Some questions remain: How do
we know that building blocks lead to improvement? Why is it a near optimal
strategy to give exponentially increasing samples to the best? How can we cal-
culate the number of schemata usefully processed by the genetic algorithm?
These questions are answered fully in the next chapter, but first we need to mas-
ter the terminology used by researchers who work with genetic algorithms. Be-
cause genetic algorithms are rooted in both natural genetics and computer
science, the terminology used in the GA literature is an unholy mix of the natural
and the artificial. Until now we have focused on the artificial side of the genetic
algorithm's ancestry and talked about strings, alphabets, string positions, and the
like. We review the correspondence between these terms and their natural coun-
terparts to connect with the growing GA literature and also to permit our own
occasional slip of a natural utterance or two.
Roughly speaking, the strings of artificial genetic systems are analogous to
chromosomes in biological systems. In natural systems, one or more chromo-
somes combine to form the total genetic prescription for the construction and
operation of some organism. In natural systems the total genetic package is called
the genotype. In artificial genetic systems the total package of strings is called a
structure (in the early chapters of this book, the structure will consist of a single
string, so the text refers to strings and structures interchangeably until it is nec-
essary to differentiate between them). In natural systems, the organism formed
by the interaction of the total genetic package with its environment is called the
phenotype. In artificial genetic systems, the structures decode to form a partic-
ular parameter set, solution alternative, or point (in the solution space). The
designer of an artificial genetic system has a variety of alternatives for coding
both numeric and nonnumeric parameters. We will confront codings and coding
principles in later chapters; for now, we stick to our consideration of GA and
natural terminology.
In natural terminology, we say that chromosomes are composed of genes,
which may take on some number of values called alleles. In genetics, the position
of a gene (its locus) is identified separately from the gene's function. Thus, we
can talk of a particular gene, for example an animal's eye color gene, its locus,
position 10, and its allele value, blue eyes. In artificial genetic search we say that
strings are composed of features or detectors, which take on different values.
Features may be located at different positions on the string. The correspondence
between natural and artificial terminology is summarized in Table 1.3.
Thus far, we have not distinguished between a gene (a particular character)
and its locus (its position); the position of a bit in a string has determined its
meaning (how it decodes) uniformly throughout a population and throughout
time. For example, the string 10000 is decoded as a binary unsigned integer 16
(base 10) because implicitly the 1 is in the 16's place. It is not necessary to limit
codings like this, however. A later chapter presents more advanced structures
that treat locus and gene separately.
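The fixed positional decoding just described can be written out directly; a minimal Python sketch (the function name is ours):

```python
def decode(bits: str) -> int:
    """Decode a bit string as an unsigned binary integer with the
    leftmost bit most significant, as in the text's 10000 -> 16 example."""
    value = 0
    for b in bits:
        value = 2 * value + int(b)   # shift left, bring in the next bit
    return value

print(decode("10000"))  # 16
print(decode("11111"))  # 31
```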
TABLE 1.3 Comparison of Natural and GA Terminology
Natural          Genetic Algorithm
chromosome       string
gene             feature, character, or detector
allele           feature value
locus            string position
genotype         structure
phenotype        parameter set, alternative solution,
                 a decoded structure
epistasis        nonlinearity
SUMMARY
This chapter has laid the foundation for understanding genetic algorithms,
their mechanics and their power. We are led to these methods by our search for
robustness; natural systems are robust—efficient and efficacious—as they adapt
to a wide variety of environments. By abstracting nature's adaptation algorithm
of choice in artificial form we hope to achieve similar breadth of performance.
In fact, genetic algorithms have demonstrated their capability in a number of
analytical and empirical studies.
The chapter has presented the detailed mechanics of a simple, three-operator
genetic algorithm. Genetic algorithms operate on populations of strings, with the
string coded to represent some underlying parameter set. Reproduction, cross-
over, and mutation are applied to successive string populations to create new
string populations. These operators are simplicity itself, involving nothing more
complex than random number generation, string copying, and partial string ex-
changing; yet, despite their simplicity, the resulting search performance is wide-
ranging and impressive. Genetic algorithms realize an innovative notion exchange
among strings and thus connect to our own ideas of human search or discovery.
A simulation of one generation of the simple genetic algorithm has helped illus-
trate both the detail and the power of the method.
Four differences separate genetic algorithms from more conventional opti-
mization techniques:
1. Direct manipulation of a coding
2. Search from a population, not a single point
3. Search via sampling, a blind search
4. Search using stochastic operators, not deterministic rules
Genetic algorithms manipulate decision or control variable representations
at the string level to exploit similarities among high-performance strings. Other
methods usually deal with functions and their control variables directly. Because
genetic algorithms operate at the coding level, they are difficult to fool even when
the function may be difficult for traditional schemes.
Genetic algorithms work from a population; many other methods work from
a single point. In this way, GAs find safety in numbers. By maintaining a population
of well-adapted sample points, the probability of reaching a false peak is reduced.
Genetic algorithms achieve much of their breadth by ignoring information
except that concerning payoff. Other methods rely heavily on such information,
and in problems where the necessary information is not available or difficult to
obtain, these other techniques break down. GAs remain general by exploiting
information available in any search problem. Genetic algorithms process similar-
ities in the underlying coding together with information ranking the structures
according to their survival capability in the current environment. By exploiting
such widely available information, GAs may be applied in virtually any problem.
The transition rules of genetic algorithms are stochastic; many other methods
have deterministic transition rules. A distinction exists, however, between the
randomized operators of genetic algorithms and other methods that are simple
random walks. Genetic algorithms use random choice to guide a highly exploi-
tative search. This may seem unusual, using chance to achieve directed results
(the best points), but nature is full of precedent.
We have started a more rigorous appraisal of genetic algorithm performance
through the concept of schemata or similarity templates. A schema is a string
over an extended alphabet, {0, 1, *}, where the 0 and the 1 retain their normal
meaning and the * is a wild card or don't care symbol. This notational device
greatly simplifies the analysis of the genetic algorithm method because it explic-
itly recognizes all the possible similarities in a population of strings. We have
discussed how building blocks—short, high-performance schemata—are com-
bined to form strings with expected higher performance. This occurs because
building blocks are sampled at near optimal rates and recombined via crossover.
Mutation has little effect on these building blocks; like an insurance policy, it
helps prevent the irrecoverable loss of potentially important genetic material.
The simple genetic algorithm studied in this chapter has much to recom-
mend it. In the next chapter, we will analyze its operation more carefully. Follow-
ing this, we will implement the simple GA in a short computer program and
examine some applications in practical problems.
PROBLEMS
1.1. Consider a black box containing eight multiple-position switches. Switches
1 and 2 may be set in any of 16 positions. Switches 3, 4, and 5 are four-position
switches, and switches 6-8 have only two positions. Calculate the number of
unique switch settings possible for this black box device.
1.2. For the black box device of Problem 1.1, design a natural string coding that
uses eight positions, one position for each switch. Count the number of switch
settings represented by your coding and count the number of schemata or simi-
larity templates inherent in your coding.
1.3. For the black box device of Problem 1.1, design a minimal binary coding
for the eight switches and compare the number of schemata in this coding to the
number in your coding from Problem 1.2.
1.4. Consider a binary string of length 11, and consider a schema, 1**
Under crossover with uniform crossover site selection, calculate a lower limit on
the probability of this schema surviving crossover. Calculate survival probabilities
under the same assumptions for the following schemata: ****10"*"**,
MW ore Te9rte", P18 190",
1.5. If the distance between the outermost alleles of a particular schema is called
its defining length δ, derive an approximate expression for the survival proba-
bility of a particular schema of total length ℓ and defining length δ under the
operation of simple crossover.
1.6. Six strings have the following fitness function values: 5, 10, 15, 25, 50, 100.
Under roulette wheel selection, calculate the expected number of copies of each
string in the mating pool if a constant population size, n = 6, is maintained.
1.7. Instead of using roulette wheel selection during reproduction, suppose we
define a copy count for each string, ncount, as follows: ncount_i = f_i/f̄, where f_i
is the fitness of the ith string and f̄ is the average fitness of the population. The
copy count is then used to generate the number of members of the mating pool
by giving the integer part of ncount_i copies to the ith string and an additional
copy with probability equal to the fractional part of ncount_i. For example, with
f_i = 100 and f̄ = 80, string i would receive an ncount_i of 1.25, and thus would
receive one copy with probability 1.0 and another copy with probability 0.25.
Using the string fitness values in Problem 1.6, calculate the expected number of
copies for each of the six strings. Calculate the total number of strings expected
in the gene pool under this form of reproduction.
1.8. The form of reproduction discussed in Problem 1.7 is sometimes called
reproduction with expected number control. In a short essay, explain why this is
so. In what ways are roulette wheel selection and expected number control sim-
ilar? In what ways are they different?
1.9. Suppose the probability of a mutation at a single bit position is 0.1. Calculate
the probability of a 10-bit string surviving mutation without change. Calculate
the probability of a 20-bit string surviving mutation without change. Recalculate
the survival probabilities for both 10- and 20-bit strings when the mutation prob-
ability is 0.01.
1.10. Consider the strings and schemata of length 11. For the following schemata,
calculate the probability of surviving mutation if the probability of mutation is
0.1 at a single bit position: 0949, #111448", #1000010" LL
Recalculate the survival probabilities for a mutation probability p_m = 0.01.
COMPUTER ASSIGNMENTS
A. One of the primitive functions required in doing genetic algorithms on a
computer is the ability to generate pseudorandom numbers. The numbers are
pseudorandom because as von Neumann once said, "Anyone who considers ar-
ithmetical methods of producing random digits is, of course, in a state of sin." As
part of this assignment, go forth and sin some more. Use the random number
generator given in Appendix B to create a program where you generate 1000
random numbers between 0 and 1. Keep track of how many numbers are gen-
erated in each of the four quartiles, 0-0.25, 0.25-0.5, 0.5-0.75, 0.75-1.0, and
compare the actual counts with the expected number. Is the difference within
reasonable limits? How can you quantify whether the difference is reasonable?
B. Suppose you have 10 strings with the following probabilities of selection in
the next generation: 0.1, 0.2, 0.05, 0.15, 0.0, 0.11, 0.07, 0.04, 0.00, 0.12, 0.16. Given
that these are the only possible alternatives, calculate whether the probabilities
are consistent. Write a computer program that simulates roulette wheel selection
for these 10 strings. Spin the wheel 1000 times and keep track of the number of
selections for each string, comparing this number to the expected number of
selections.
C. Write a function that generates a pseudorandom integer between some spec-
ified lower limit and some specified upper limit. Test the program by generating
1000 numbers between 3 and 12. Keep track of the quantity of each number
selected and compare these figures to the expected quantities.
D. Create a procedure that receives two binary strings and a crossing site value,
performs simple crossover, and returns two offspring strings. Test the program
by crossing the following strings of length 10: 1011101011, 0000110100. Try
crossing site values of -3, 1, 6, and 20.
E. Create a function mutation that complements a particular bit value with
specified mutation probability p_m. Test the function by performing 1000 calls to
mutation using mutation probabilities p_m = 0.001, 0.01, 0.1. Compare the real-
ized number of mutations to the expected number.
F. Using the simple crossover operator of Assignment D, repeatedly apply the
crossover operator to strings contained within the following population of size
n = 200 and ℓ = 5:
100 copies of 11100
100 copies of 00011
Perform crossover (p_c = 1.0) for 50 generations without replacement under no
selection. Compare the initial and final distributions of strings. Also compare the
expected quantity of each string to the realized quantity in generation 50.

Genetic Algorithms
Revisited: Mathematical
Foundations
The broad brush of Chapter 1 painted an accurate, if somewhat crude, picture of
genetic algorithms and their mechanics and power. Perhaps these brush strokes
appeal to your own sense of human discovery and search. That somehow a reg-
ular though randomized procedure can achieve some of the breadth and intuitive
flair of human search seems almost too good to be true. That this discovery pro-
cedure should mirror the natural processes that created the species possessing
the procedure is a recursion of which Gödel, Escher, or Bach (Hofstadter, 1979)
could each have been proud. Despite their intuitive appeal, and despite their
symmetry, it is crucial that we back these fuzzy feelings and speculations about
genetic algorithms using cold, mathematical facts.
Actually, we have already begun a more rigorous appraisal of GAs. Toward
the end of the last chapter, the fundamental concept of a schema or similarity
template was introduced. Quantitatively, we found that there are indeed a large
number of similarities to exploit in a population of strings. Intuitively, we saw
how genetic algorithms exploit in parallel the many similarities contained in
building blocks or short, high-performance schemata. In this chapter, we make
these observations more rigorous by doing several things. First, we count the
schemata represented within a population of strings and consider which grow
and which decay during any given generation. To do this, we consider the effect
of reproduction, crossover, and mutation on a particular schema. This analysis
leads to the fundamental theorem of genetic algorithms that quantifies these
growth and decay rates more precisely; it also points to the mathematical form
Chapter 2 / Genetic Algorithms Revisited: Mathematical Foundations
of this growth. This form is connected to an important and classical problem of
decision theory, the two-armed bandit problem (and its extension, the k-armed
bandit). The mathematical similarity between the optimal (minimal loss) solution
to the two-armed and k-armed bandit and the equation describing the number
of trials given to successive generations of schemata in the simple genetic algo-
rithm is striking. Counting the number of schemata that are usefully processed
by the simple genetic algorithm reveals tremendous leverage in the building
block processing. Finally, we consider an important question: How do we know
that combining building blocks leads to high performance in arbitrary problems?
The question sparks our consideration of some relatively new tools of genetic
algorithm analysis: schema transforms and the minimal deceptive problem.
WHO SHALL LIVE AND WHO SHALL DIE?
THE FUNDAMENTAL THEOREM
The operation of genetic algorithms is remarkably straightforward. After all, we
start with a random population of n strings, copy strings with some bias toward
the best, mate and partially swap substrings, and mutate an occasional bit value
for good measure. Even though genetic algorithms directly manipulate a popu-
lation of strings in this straightforward manner, in Chapter 1 we started to rec-
ognize that this explicit processing of strings really causes the implicit processing
of many schemata during each generation. To analyze the growth and decay of
the many schemata contained in a population, we need some simple notation to
add rigor to the discussion. We consider the operation of reproduction, crossover,
and mutation on the schemata contained in the population.
We consider strings, without loss of generality, to be constructed over the
binary alphabet V = {0, 1}. As a notational convenience, we refer to strings by
capital letters and individual characters by lowercase letters subscripted by their
position. For example, the seven-bit string A = 0111000 may be represented
symbolically as follows:

A = a_1 a_2 a_3 a_4 a_5 a_6 a_7

Here each of the a_i represents a single binary feature or detector (in accordance
with the natural analogy, we sometimes call the a_i's genes), where each feature
may take on a value 1 or 0 (we sometimes call the a_i values alleles). In the particular
string 0111000, a_1 is 0, a_2 is 1, a_3 is 1, etc. It is also possible to have strings where
detectors are not ordered sequentially as in string A; for example, a string A'
could carry the same seven detectors in some permuted order. A later chapter
explores the effect of extending the representation to allow features to be located
in a manner independent of their function. For now, assume that a feature's
function may be determined by its position.
Meaningful genetic search requires a population of strings, and we consider
a population of individual strings A_j, j = 1, 2, ..., n, contained in the population
A(t) at time (or generation) t, where the boldface is used to denote a population.
Besides notation to describe populations, strings, bit positions, and alleles,
we need convenient notation to describe the schemata contained in individual
strings and populations. Let us consider a schema H taken from the three-letter
alphabet V+ = {0, 1, *}. As discussed in the previous chapter, the additional
symbol, the asterisk or star *, is a don't care or wild card symbol which matches
either a 0 or a 1 at a particular position. For example, consider the length 7
schema H = *11*0**. Note that the string A = 0111000 discussed above is an
example of the schema H, because the string alleles a_i match schema positions
h_i at the fixed positions 2, 3, and 5.
From the results of the last chapter, recall that there are 3^ℓ schemata or sim-
ilarity templates defined over a binary string of length ℓ. In general, for alphabets
of cardinality k there are (k + 1)^ℓ schemata. Furthermore, recall that in a string
population with n members there are at most n·2^ℓ schemata contained in a pop-
ulation because each string is itself a representative of 2^ℓ schemata. These count-
ing arguments give us some feel for the magnitude of information being
processed by genetic algorithms; however, to really understand the important
building blocks of future solutions, we need to distinguish between different
types of schemata.
All schemata are not created equal. Some are more specific than others. For
example, the schema 011*1** is a more definite statement about important sim-
ilarity than the schema 0******. Furthermore, certain schemata span more of the
total string length than others. For example, the schema 1****1* spans a larger
portion of the string than the schema 1*1****. To quantify these ideas, we intro-
duce two schema properties: schema order and defining length.
The order of a schema H, denoted by o(H), is simply the number of fixed
positions (in a binary alphabet, the number of 1's and 0's) present in the template.
In the examples above, the order of the schema 011*1** is 4 (symbolically,
o(011*1**) = 4), whereas the order of the schema 0****** is 1.
The defining length of a schema H, denoted by δ(H), is the distance between
the first and last specific string position. For example, the schema 011*1** has
defining length δ = 4 because the last specific position is 5 and the first specific
position is 1, and the distance between them is δ(H) = 5 − 1 = 4. In the other
example (the schema 0******), the defining length is particularly easy to calcu-
late. Since there is only a single fixed position, the first and last specific positions
are the same, and the defining length δ = 0.
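Both properties are one-liners to compute. A Python sketch (the function names are ours):

```python
def order(schema: str) -> int:
    """o(H): the number of fixed (non-'*') positions in the template."""
    return sum(c != '*' for c in schema)

def defining_length(schema: str) -> int:
    """delta(H): distance between the outermost fixed positions,
    0 for a schema with at most one fixed position."""
    fixed = [i for i, c in enumerate(schema) if c != '*']
    return fixed[-1] - fixed[0] if fixed else 0

# The examples from the text:
print(order("011*1**"), defining_length("011*1**"))   # 4 4
print(order("0******"), defining_length("0******"))   # 1 0
```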
Schemata and their properties are interesting notational devices for rigor-
ously discussing and classifying string similarities. More than this, they provide
the basic means for analyzing the net effect of reproduction and genetic operators
on building blocks contained within the population. Let us consider the individ-
ual and combined effect of reproduction, crossover, and mutation on schemata
contained within a population of strings.
The effect of reproduction on the expected number of schemata in the pop-
ulation is particularly easy to determine. Suppose at a given time step t there are
m examples of a particular schema H contained within the population A(t), where
we write m = m(H, t) (there are possibly different quantities of different sche-
mata H at different times t). During reproduction, a string is copied according to
its fitness, or more precisely a string A_i gets selected with probability p_i = f_i/Σf_j.
After picking a nonoverlapping population of size n with replacement from the
population A(t), we expect to have m(H, t + 1) representatives of the schema
H in the population at time t + 1 as given by the equation m(H, t + 1) =
m(H, t)·n·f(H)/Σf_j, where f(H) is the average fitness of the strings representing
schema H at time t. If we recognize that the average fitness of the entire popu-
lation may be written as f̄ = Σf_j/n, then we may rewrite the reproductive schema
growth equation as follows:

m(H, t + 1) = m(H, t)·f(H)/f̄
In words, a particular schema grows as the ratio of the average fitness of the
schema to the average fitness of the population. Put another way, schemata with
fitness values above the population average will receive an increasing number of
samples in the next generation, while schemata with fitness values below the
population average will receive a decreasing number of samples. It is interesting
to observe that this expected behavior is carried out with every schema H con-
tained in a particular population A in parallel. In other words, all the schemata in
a population grow or decay according to their schema averages under the oper-
ation of reproduction alone. In a moment, we examine why this might be a good
thing to do. For the time being, simply note that many things go on in parallel
with simple operations on the n strings in the population.
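The selection rule p_i = f_i/Σf_j used above can be sketched as a roulette wheel spin; in the following Python sketch, the fitness values are illustrative choices of ours, not data from the text:

```python
import random

def roulette_select(fitnesses):
    """Return an index i chosen with probability f_i / sum(f),
    i.e. fitness-proportionate (roulette wheel) selection."""
    spin = random.uniform(0, sum(fitnesses))
    running = 0.0
    for i, f in enumerate(fitnesses):
        running += f
        if spin <= running:
            return i
    return len(fitnesses) - 1  # guard against floating-point round-off

random.seed(7)                 # seeded so the demonstration is repeatable
f = [169, 576, 64, 361]        # illustrative fitness values
counts = [0] * len(f)
for _ in range(10_000):
    counts[roulette_select(f)] += 1
print(counts)  # counts roughly proportional to f; the fittest string is drawn most often
```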
The effect of reproduction on the number of schemata is qualitatively clear;
above-average schemata grow and below-average schemata die off. Can we learn
anything else about the mathematical form of this growth (decay) from the
schema difference equation? Suppose we assume that a particular schema H re-
mains above average an amount c·f̄ with c a constant. Under this assumption we
can rewrite the schema difference equation as follows:

m(H, t + 1) = m(H, t)·(f̄ + c·f̄)/f̄ = (1 + c)·m(H, t)

Starting at t = 0 and assuming a stationary value of c, we obtain the equation

m(H, t) = m(H, 0)·(1 + c)^t
Business-oriented readers will recognize this equation as the compound interest
equation, and mathematically oriented readers will recognize a geometric pro-
gression or the discrete analog of an exponential form. The effect of reproduction
is now quantitatively clear; reproduction allocates exponentially increasing (de-
creasing) numbers of trials to above- (below-) average schemata. We will connect
this rate of schemata allocation to the multiarmed bandit problem, but for right
now we will investigate how crossover and mutation affect this allocation of trials.
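Under the constant-c assumption, the geometric growth is easy to tabulate. A Python sketch (the starting count and the value of c are illustrative):

```python
def schema_count(m0: float, c: float, t: int) -> float:
    """m(H, t) = m(H, 0) * (1 + c)**t : expected copies of a schema
    that stays a constant fraction c above (c > 0) or below (c < 0)
    the population average fitness."""
    return m0 * (1 + c) ** t

# A schema 25% above average doubles its count in just over three generations:
for t in range(5):
    print(t, schema_count(4, 0.25, t))
```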
To some extent it is curious that reproduction can allocate exponentially
increasing and decreasing numbers of schemata to future generations in parallel;
many, many different schemata are sampled in parallel according to the same rule
through the use of n simple reproduction operations. On the other hand, repro-
duction alone does nothing to promote exploration of new regions of the search
space, since no new points are searched; if we only copy old structures without
change, then how will we ever try anything new? This is where crossover steps
in. Crossover is a structured yet randomized information exchange between
strings. Crossover creates new structures with a minimum of disruption to the
allocation strategy dictated by reproduction alone. The result is exponentially
increasing (or decreasing) proportions of trials allocated to many of the schemata
contained in the population.
To see which schemata are affected by crossover and which are not, consider
a particular string of length ℓ = 7 and two representative schemata within that
string:

A  = 0111000
H1 = *1****0
H2 = ***10**
Clearly the two schemata H1 and H2 are represented in the string A, but to see
the effect of crossover on the schemata, we first recall that simple crossover pro-
ceeds with the random selection of a mate, the random selection of a crossover
site, and the exchange of substrings from the beginning of the string to the cross-
over site inclusively with the corresponding substring of the chosen mate. Sup-
pose string A has been chosen for mating and crossover. In this string of length
7, suppose we roll a single die to choose the crossing site (there are six sites in
a string of length 7). Further suppose that the die turns up a 3, meaning that the
cross cut will take place between positions 3 and 4. The effect of this cross on
our two schemata H1 and H2 can be seen easily in the following example, where
the crossing site has been marked with the separator symbol |:

A  = 011 | 1000
H1 = *1* | ***0
H2 = *** | 10**
Unless string A's mate is identical to A at the fixed positions of the schema (a
possibility that we conservatively ignore), the schema H1 will be destroyed be-
cause the 1 at position 2 and the 0 at position 7 will be placed in different off-
spring (they are on opposite sides of the separator symbol marking the cross
point, or cut point). It is equally clear that with the same cut point (between bits
3 and 4), schema H2 will survive because the 1 at position 4 and the 0 at position
5 will be carried intact to a single offspring. Although we have used a specific cut
point for illustration, it is clear that schema H1 is less likely to survive crossover
than schema H2 because on average the cut point is more likely to fall between
the extreme fixed positions. To quantify this observation, we note that schema
H1 has a defining length of 5. If the crossover site is selected uniformly at random
Chapter 2 / Genetic Algorithms Revisited: Mathematical Foundations
among the ! ~ 1 = 7 ~ 1 = 6 possible sites, then clearly schema 1 is destroyed
with probability p, = 5(M,(/ — 1) = 5/6 (it survives with probability p, =
1 ~ p, = 16) Similarly, the schema H, has defining length 8(#,) = 1, and it is
destroyed during that one event in six where the cut site is selected to occur
between positions 4 and 5 such that p, = 1/6 of the survival probability is p,
1- p, = 56,
More generally, we see that a lower bound on crossover survival probability
p_s can be calculated for any schema. Because a schema survives when the cross
site falls outside the defining length, the survival probability under simple crossover
is p_s = 1 − δ(H)/(l − 1), since the schema is likely to be disrupted whenever
a site within the defining length is selected from the l − 1 possible sites. If
crossover is itself performed by random choice, say with probability p_c at a particular
mating, the survival probability may be given by the expression

    p_s ≥ 1 − p_c·δ(H)/(l − 1),

which reduces to the earlier expression when p_c = 1.0.
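The survival bound is easy to check numerically. The following is a minimal sketch in Python; the function name and the p_c keyword default are ours, not the book's:

```python
def crossover_survival(delta, l, p_c=1.0):
    """Lower bound on the probability that a schema with defining
    length delta survives one-point crossover on strings of length l."""
    return 1.0 - p_c * delta / (l - 1)

# The two schemata from the text, defined over strings of length 7:
# H1 = *1****0 has defining length 5; H2 = ***10** has defining length 1.
assert abs(crossover_survival(5, 7) - 1/6) < 1e-12   # H1 survives 1/6 of the time
assert abs(crossover_survival(1, 7) - 5/6) < 1e-12   # H2 survives 5/6 of the time
```

With p_c = 0 (no crossover), the bound correctly reduces to certain survival.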
The combined effect of reproduction and crossover may now be considered.
As when we considered reproduction alone, we are interested in calculating the
expected number of a particular schema H in the next generation. Assuming
independence of the reproduction and crossover operations, we obtain the
estimate:

    m(H, t + 1) ≥ m(H, t)·[f(H)/f̄]·[1 − p_c·δ(H)/(l − 1)]

Comparing this to the previous expression for reproduction alone, the combined
effect of crossover and reproduction is obtained by multiplying the expected
number of schemata for reproduction alone by the survival probability under
crossover p_s. Once again the effect of the operations is clear. Schema H grows or
decays depending upon a multiplication factor. With both crossover and reproduction,
that factor depends on two things: whether the schema is above or below
the population average and whether the schema has a relatively short or long
defining length. Clearly, those schemata with both above-average observed performance
and short defining lengths are going to be sampled at exponentially
increasing rates.
The last operator to consider is mutation. Using our previous definition, mutation
is the random alteration of a single position with probability p_m. In order
for a schema H to survive, all of the specified positions must themselves survive.
Therefore, since a single allele survives with probability (1 − p_m), and since each
of the mutations is statistically independent, a particular schema survives when
each of the o(H) fixed positions within the schema survives. Multiplying the
survival probability (1 − p_m) by itself o(H) times, we obtain the probability of
surviving mutation, (1 − p_m)^o(H). For small values of p_m (p_m << 1), the schema
survival probability may be approximated by the expression 1 − o(H)·p_m. We
therefore conclude that a particular schema H receives an expected number of
copies in the next generation under reproduction, crossover, and mutation as
given by the following equation (ignoring small cross-product terms):

    m(H, t + 1) ≥ m(H, t)·[f(H)/f̄]·[1 − p_c·δ(H)/(l − 1) − o(H)·p_m]
The addition of mutation changes our previous conclusions little. Short, low-order,
above-average schemata receive exponentially increasing trials in subsequent
generations. This conclusion is important, so important that we give it a
special name: the Schema Theorem, or the Fundamental Theorem of Genetic Algorithms.
Although the calculations that led us to prove the schema theorem
were not too demanding, the theorem's implications are far reaching and subtle.
To see this, we examine the effect of the three-operator genetic algorithm on
schemata in a population through another visit to the hand-calculated GA of
Chapter 1.
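The full three-operator bound can be sketched directly in Python. The function name is ours; the arguments mirror the symbols in the equation above:

```python
def schema_theorem_bound(m, f_H, f_avg, delta, order, l, p_c, p_m):
    """Schema theorem lower bound on the expected number of copies of
    schema H in the next generation (small cross-product terms ignored):
    m(H, t+1) >= m(H, t) * [f(H)/f-bar] * [1 - p_c*delta/(l-1) - o(H)*p_m]."""
    return m * (f_H / f_avg) * (1.0 - p_c * delta / (l - 1) - order * p_m)

# Schema H1 = 1**** from the hand calculation of the next section:
# m = 2, f(H1) = 468.5, population average 293, delta = 0, order 1, l = 5.
bound = schema_theorem_bound(2, 468.5, 293.0, 0, 1, 5, p_c=1.0, p_m=0.001)
assert 3.19 < bound < 3.20   # about 3.20, matching the hand calculation
```

Because δ(H1) = 0 and p_m is tiny, the bound here is essentially the pure-reproduction estimate.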
SCHEMA PROCESSING AT WORK: AN EXAMPLE
BY HAND REVISITED
Chapter 1 demonstrated the mechanics of the simple GA through a hand calculation
of a single generation. Let us return to that example, this time observing
how the GA processes schemata—not individual strings—within the population.
The hand calculation of Chapter 1 is reproduced in Table 2.1. In addition to the
information presented earlier, we also keep a running count of three particular
schemata, which we call H1, H2, and H3, where H1 = 1****, H2 = *10**, and
H3 = 1***0.
Observe the effect of reproduction, crossover, and mutation on the first
schema, H1. During the reproduction phase, the strings are copied probabilistically
according to their fitness values. Looking at the first column of the table, we
notice that strings 2 and 4 are both representatives of the schema 1****. After
reproduction, we note that three copies of the schema have been produced
(strings 2, 3, 4 in the mating pool column). Does this number correspond with
the value predicted by the schema theorem? From the schema theorem we expect
to have m·f(H)/f̄ copies. Calculating the schema average f(H1), we obtain
(576 + 361)/2 = 468.5. Dividing this by the population average f̄ = 293 and
multiplying by the number of H1 schemata at time t, m(H1, t) = 2, we obtain the
expected number of H1 schemata at time t + 1, m(H1, t + 1) = 2·468.5/293 =
3.20. Comparing this to the actual number of schemata (three), we see that we
have the correct number of copies. Taking this one step further, we realize that
crossover cannot have any further effect on this schema because a defining length
δ(H1) = 0 prevents disruption of the single bit. Furthermore, with the mutation
rate set at p_m = 0.001 we expect to have m·p_m = 3·0.001 = 0.003 or no bits
changed within the three schema copies in the three strings. As a result, we ob-
TABLE 2.1 GA Processing of Schemata—Hand Calculations

String Processing

String  Initial     x Value     Fitness    pselect   Expected   Actual Count
No.     Population  (Unsigned   f(x) = x²  f_i/Σf    Count      (from Roulette
                    Integer)                         f_i/f̄      Wheel)
1       01101       13          169        0.14      0.58       1
2       11000       24          576        0.49      1.97       2
3       01000        8           64        0.06      0.22       0
4       10011       19          361        0.31      1.23       1
Sum                             1170       1.00      4.00       4.0
Average                          293       0.25      1.00       1.0
Max                              576       0.49      1.97       2.0

Schema Processing

                 Before Reproduction
                 String            Schema Average
                 Representatives   Fitness f(H)
H1 = 1****       2, 4              469
H2 = *10**       2, 3              320
H3 = 1***0       2                 576
serve that for schema H1, we do obtain the expected exponentially increasing
number of schemata as predicted by the schema theorem.
So far, so good; but schema H1 with its single fixed bit seems like something
of a special case. What about the propagation of important similarities with longer
defining lengths? For example, consider the propagation of the schema H2 =
*10** and the schema H3 = 1***0. Following reproduction and prior to crossover,
the replication of schemata is correct. The case of H2 starts with two examples
in the initial population and ends with two copies following reproduction. This
agrees with the expected number of copies, m(H2) = 2·320/293 = 2.18, where
320 is the schema average and 293 is the population average fitness. The case of
H3 starts with a single example (string 2) and ends with two copies following
reproduction (strings 2 and 3 in the string copies column). This agrees with the
expected number of copies m(H3) = 1·576/293 = 1.97, where 576 is the schema's
average fitness and 293 is the population's average fitness. The circumstances
following crossover are a good bit different. Notice that for the short
schema, schema H2, the two copies are maintained even though crossover has
TABLE 2.1 (Continued)

String Processing

Mating Pool after     Mate        Crossover Site   New          x      f(x)
Reproduction          (Randomly   (Randomly        Population   Value  x²
(Cross Site Shown)    Selected)   Selected)
0110|1                2           4                01100        12     144
1100|0                1           4                11001        25     625
11|000                4           2                11011        27     729
10|011                3           2                10000        16     256
Sum                                                                    1754
Average                                                                 439
Max                                                                     729

Schema Processing

             After Reproduction                After All Operators
             Expected  Actual  String          Expected  Actual  String
             Count     Count   Representatives Count     Count   Representatives
H1 = 1****   3.20      3       2, 3, 4         3.20      3       2, 3, 4
H2 = *10**   2.18      2       2, 3            1.64      2       2, 3
H3 = 1***0   1.97      2       2, 3            0.0       1       4
occurred. Because the defining length is short we expect crossover to interrupt
the process only one time in four (l − 1 = 5 − 1 = 4). As a result, the schema
H2 survives with high probability. The expected number of H2 schemata is
thus m(H2, t + 1) = 2.18·0.75 = 1.64, and this compares well with the actual
count of two schemata. H3 is a schema of a different color. Because of the long
defining length (δ(H3) = 4), crossover usually destroys this schema.
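The expected counts in Table 2.1 can be recomputed in a few lines. The following sketch uses the schema averages and defining lengths from the table, assuming p_c = 1 as in the hand calculation:

```python
# Recompute the Table 2.1 expectations: reproduction scales the count
# by f(H)/f-bar; one-point crossover (p_c = 1 assumed) then scales it
# by the survival bound 1 - delta(H)/(l - 1).
l, f_avg = 5, 293.0
schemata = {            # name: (m(H, t), f(H), defining length)
    "H1 = 1****": (2, 468.5, 0),
    "H2 = *10**": (2, 320.0, 1),
    "H3 = 1***0": (1, 576.0, 4),
}
expected = {}
for name, (m, f_H, delta) in schemata.items():
    after_repro = m * f_H / f_avg
    expected[name] = (after_repro, after_repro * (1 - delta / (l - 1)))

assert round(expected["H1 = 1****"][0], 2) == 3.20  # after reproduction
assert round(expected["H2 = *10**"][1], 2) == 1.64  # survives crossover
assert round(expected["H3 = 1***0"][1], 2) == 0.00  # destroyed by crossover
```

The long-defining-length schema H3 gets an expected count of zero after crossover, exactly as the table records.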
The hand calculation has confirmed the theory developed earlier in this
chapter. Short, low-order schemata are given exponentially increasing or decreasing
numbers of samples depending on a schema's average fitness. In a moment
we will estimate the number of schemata that are processed in this way, but
before we do, we must ask and answer a pressing question. Why should we give
an exponentially increasing number of trials to the observed best schemata? In
other words, why is this particular allocation strategy a good road to follow? The
answer lies along what at first may seem like a detour, following a gambler among
the slot machines of Las Vegas.
THE TWO-ARMED AND K-ARMED BANDIT PROBLEM
The effects of reproduction, crossover, and mutation are now both quantitatively
and qualitatively clear. Schemata of short defining length, low order, and above-average
fitness (building blocks) receive exponentially increasing trials in future
generations. This is a fact, beyond all question. Yet despite careful proof of this
point, at least one nagging question remains: Why is this a good thing to do? More
to the point, why should exponentially increasing samples be given to the observed
best building blocks? This question leads to an important problem of statistical
decision theory, the two-armed bandit problem and its extension, the k-armed
bandit problem. Although this seems like a detour from our main concern
(after all, we are trying to understand genetic algorithms, not minimize our gambling
losses), we shall soon see that the optimal solution to the bandit problem
is very similar in form to the exponential allocation of trials from our tripartite
GA.
Suppose we have a two-armed slot machine, as depicted in Fig. 2.1, with two
arms named LEFT and RIGHT. Furthermore, let's assume we know that one of the
arms pays an award μ1 with variance σ1² and the other arm pays an award μ2 with
variance σ2², where μ1 ≥ μ2. Which arm should we play? Clearly, we would like to
play the arm that pays off more frequently (the arm with payoff μ1), but therein
lies the rub. Since we do not know beforehand which arm is associated with the

FIGURE 2.1 The two-armed bandit problem poses a dilemma: how do we
search for the right answer (exploration) at the same time we use that information
(exploitation)?
higher expected reward, we are faced with an interesting dilemma. Not only must
we make a decision (more precisely, a sequence of decisions) about which arm
to play, but we must at the same time collect information about which is the
better arm. This trade-off between the exploration for knowledge and the exploitation
of that knowledge is a recurrent and fundamentally important theme
in adaptive systems theory. How we address this dilemma will say a lot about the
ultimate success of our methods.
One way to approach the trade-off is to separate exploration from exploitation
by first performing a single experiment and thereafter making a single irreversible
decision that depends upon the outcome of the experiment. This is one
approach of traditional decision theory that we can describe rigorously as follows.
Suppose we have a total of N trials to allocate among the two arms. We first
allocate an equal number of trials n (2n < N) to each of the two arms
during the experimental phase. After the experiment, we allocate the remaining
N − 2n trials to the arm with best observed payoff. Assuming we know μ1, μ2,
σ1, and σ2, we can calculate the expected loss (De Jong, 1975):

    L(N, n) = |μ1 − μ2|·[(N − n)·q(n) + n·(1 − q(n))],

where q(n) is the probability that the worst arm is the observed best arm after
n trials have been attempted on each machine. This probability is well approximated
by the tail of the normal distribution:

    q(n) ≈ (1/√(2π))·e^(−x²/2)/x,  where x = ((μ1 − μ2)/√(σ1² + σ2²))·√n.
From these equations we can see that two sources of loss are associated with
the procedure. The first loss is a result of issuing n trials to the wrong arm during
the experiment. The second is a result of choosing the arm associated with the
lower payoff (μ2) even after performing the experiment. We cannot be absolutely
certain at the end of the experiment that we will pick the right arm, so we do
expect occasionally to pick the wrong arm and incur a loss on the remaining
N − 2n trials. We may solve for the optimal experiment size n* by taking the
derivative of the loss equation and setting it to zero. Figure 2.2 shows how the
optimal experiment size n* varies with the total number of trials N and the
signal-difference-to-noise ratio c, where c = (μ1 − μ2)/(σ1² + σ2²)^(1/2).
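The loss-minimizing experiment size can also be found by direct numerical scan rather than by differentiation. The following is a minimal sketch; the parameter values are hypothetical and the normal-tail form of q(n) is the approximation given above:

```python
import math

def q(n, mu1, mu2, s1, s2):
    """Normal-tail approximation to the probability that the worse arm
    is the observed best arm after n trials on each arm."""
    x = (mu1 - mu2) / math.sqrt(s1**2 + s2**2) * math.sqrt(n)
    return math.exp(-x * x / 2) / (x * math.sqrt(2 * math.pi))

def loss(N, n, mu1, mu2, s1, s2):
    """Expected loss L(N, n) of the one-shot experimental procedure."""
    qn = q(n, mu1, mu2, s1, s2)
    return abs(mu1 - mu2) * ((N - n) * qn + n * (1 - qn))

# Illustrative (hypothetical) parameters; scan for the loss-minimizing n*.
N, mu1, mu2, s1, s2 = 1000, 1.0, 0.9, 1.0, 1.0
n_star = min(range(1, N // 2), key=lambda n: loss(N, n, mu1, mu2, s1, s2))
```

The scan finds an interior optimum: too few experimental trials make the wrong-arm choice likely over the remaining N − 2n pulls, while too many waste trials on the worse arm.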
This simple procedure is easy to analyze, but there are no doubt better ways,
perhaps optimal ways, to allocate the trials to the better arm. Holland (1975) has
performed calculations that show how trials should be allocated between the two
arms to minimize expected losses. This results in the allocation of n* trials to the
worse arm and N − n* trials to the better arm, where n* is given by the following
equation:

    n* ≈ b²·ln[N²/(8π·b⁴·ln N²)],  where b = σ1/(μ1 − μ2).
FIGURE 2.2 The modified total number of trials (c²N) grows as a greater-than-exponential
function of the modified optimal experiment size (c²n*) in the one-shot,
decision-theory approach to the two-armed bandit problem.
Turning the equation around, we realize that the number of trials given to the
observed better arm is given by the equation:

    N − n* ≈ √(8π·b⁴·ln N²)·e^(n*/2b²).

In other words, to allocate trials optimally (in the sense of minimal expected
loss), we should give slightly more than exponentially increasing trials to the
observed best arm. Unfortunately, this strategy is unrealizable, as it requires
knowledge of outcomes before they occur. Nonetheless, it forms an important
bound on performance that a realizable strategy should try to approach. Certainly
many strategies can approach this ideal. The experimental approach analyzed earlier
showed how exponentially fewer trials were given to the worse arm as the
number of trials increased. Another method that comes even closer to the ideal
trial allocation is the three-operator genetic algorithm discussed earlier. The
schema theorem guarantees giving at least an exponentially increasing number
of trials to the observed best building blocks. In this way the genetic algorithm
is a realizable yet near-optimal procedure (Holland, 1973a, 1975) for searching
among alternative solutions.
With a genetic algorithm we are no longer solving a simple two-armed bandit
problem; in the usual genetic algorithm we consider the simultaneous solution
of many multiarmed bandits. To make this point more forcefully, we first consider
the form of the solution to a single k-armed bandit and then demonstrate that
the usual genetic algorithm may be thought of as the composition of many such
k-armed bandits.
The form of the bounding k-armed bandit solution was also discovered by
Holland (1973a). The minimal expected loss solution to the allocation of trials
to k competing arms is similar to the two-armed solution in that it dictates that greater
than exponentially increasing numbers of trials be given to the observed best of
the k arms. This result is not surprising, but it does connect nicely to our notions
of schema processing if we consider a set of competing schemata as a particular
k-armed bandit. To see this connection, we define the notion of a competing set
of schemata, and then count the number and size of the k-armed bandit problems
being solved within a genetic algorithm of given string length.

Two schemata A and B with individual positions a_i and b_i are competing if
at every position i = 1, 2, ..., l either a_i = b_i = *, or a_i ≠ * and b_i ≠ *, with a_i ≠ b_i for at
least one i value. For example, consider the set of eight schemata that compete
at locations 2, 3, and 5 in the following strings of length 7:
    *00*0**
    *00*1**
    *01*0**
    *01*1**
    *10*0**
    *10*1**
    *11*0**
    *11*1**

There are eight competing schemata over the three positions 2, 3, and 5
because each of the three positions may take on either a 1 or a 0 (2³ = 8).
We can start to see the connection to the k-armed bandit problem in our list
of eight competing schemata. Since these schemata are defined over the same
positions, they compete with one another for precious population slots. In order
to allocate the population slots properly, we need to allocate exponentially increasing
numbers to the observed best schemata, just as we give exponentially
increasing trials to the observed best arm in the k-armed bandit. One of the differences
between our situation in a genetic algorithm and the vanilla-flavored k-armed
bandit is that we have a number of problems proceeding in parallel. For
example, with three positions fixed over a string of length 7 there are C(7,3) = 35
such eight-armed (2³ = 8) bandit problems. In general, for schemata of order j
over strings of length l there are C(l,j) different k_j-armed bandit problems, where
k_j = 2^j. Not all of the Σ_j C(l,j) = 2^l problems are played out equally well because
crossover has a tendency to disrupt those bandits with long defining lengths, as
discussed earlier. In the next section, we count the number of schemata that are
usefully processed by the genetic algorithm.
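The counting in this paragraph is a straightforward exercise with binomial coefficients, sketched below (the l = 7, j = 3 case is the one worked in the text):

```python
from math import comb

l = 7
# Three fixed positions over a length-7 string: C(7, 3) = 35 distinct
# eight-armed (2**3 = 8) bandit problems, as in the text.
assert comb(l, 3) == 35 and 2 ** 3 == 8

# Schemata of order j define C(l, j) different k_j-armed bandits with
# k_j = 2**j arms; summed over all orders j, the number of distinct
# problems (choices of defining positions) is 2**l.
assert sum(comb(l, j) for j in range(l + 1)) == 2 ** l
```

The identity in the last line is just the binomial theorem evaluated at (1 + 1)^l.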
HOW MANY SCHEMATA ARE PROCESSED USEFULLY?
Our counting arguments thus far have indicated that we have somewhere between
2^l and n·2^l schemata being processed in a string population with length l
and size n. As we know, not all of these are processed with high probability because
crossover destroys those with relatively long defining lengths. In this section
we compute a lower bound on those schemata that are processed in a useful
manner—those that are sampled at the desirable exponentially increasing rate.

The most widely quoted counting of effective schema processing is Holland's
well-known, but poorly understood, O(n³) estimate (Goldberg, 1985a).
Simply stated, this estimate means that despite the processing of only n structures
each generation, a genetic algorithm processes something like n³ schemata. This
result is so important, Holland has given it a special name, implicit parallelism.
Even though each generation we perform computation proportional to the size
of the population, we get useful processing of something like n³ schemata in
parallel with no special bookkeeping or memory other than the population itself.
Let us rederive this estimate to understand its underlying assumptions and examine
the source of this computational leverage.
Consider a population of n binary strings of length l. We consider only schemata
that survive with a probability greater than p_s, a constant. As a result, assuming
the operation of simple crossover and a small mutation rate, we admit only
those schemata with an error rate ε < 1 − p_s. This leads us to consider only
those schemata of length l_s < ε(l − 1) + 1.

With a particular schema length, we can estimate a lower bound on the number
of unique schemata processed by an initially random population of strings.
To do this, we first count the number of schemata of length l_s or less. We then
multiply this by an appropriate population size, chosen so we expect, on average,
no more than one of each schema of order l_s/2. Suppose we wish to count the
schemata of length l_s = 5 in the following string of length l = 10:

    1011100010

To do this we calculate the number of schemata in the first cell of 5 bits,

    10111|00010

where the last bit in the cell is held fixed. That is, we want all
schemata of the form

    %%%%1*****

where the stars * are don't care symbols and the percent signs % take on either
the fixed value (the 1 or 0 at that position) or a don't care. Clearly there are
2^(l_s − 1) of these schemata because l_s − 1 = 4 positions can be fixed or take on
the don't care. To count the total number of these, we simply slide the template
of 5 along one space at a time:

    1|01110|0010

We perform this trick a total of l − l_s + 1 times, and we can estimate the total
number of schemata of length l_s or less as 2^(l_s − 1)·(l − l_s + 1). This count is the
number of such schemata in this particular string. To overestimate the number
of such schemata in the whole population, we could simply multiply by the population
size and obtain the count n·2^(l_s − 1)·(l − l_s + 1). This obviously overestimates
the correct count because surely there will be duplicates of low-order
schemata in large populations. To refine the estimate, we pick a population size
n = 2^(l_s/2). By choosing n in this manner, we expect to have one or fewer of all
schemata of order l_s/2 or more. Recognizing that the number of schemata is binomially
distributed, we conclude that half are of higher order than l_s/2 and half are
of lower order. If we count only the higher order ones, we estimate a lower
bound on the number of schemata n_s as follows:

    n_s ≥ n·2^(l_s − 1)·(l − l_s + 1)/2 = n·(l − l_s + 1)·2^(l_s − 2).

This differs from the previous overestimate by a factor of 1/2. Furthermore, the
restriction of the population size to the particular value n = 2^(l_s/2) results in the
expression:

    n_s ≥ (l − l_s + 1)·n³/4.

Since n_s = C·n³, we conclude that the number of schemata is proportional to the
cube of the population size and is thus of order n³, O(n³).
Thus we see that despite the disruption of long, high-order schemata by
crossover and mutation, genetic algorithms inherently process a large quantity of
schemata while processing a relatively small quantity of strings.
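The algebra behind the O(n³) estimate can be double-checked numerically. A minimal sketch, assuming an even l_s so that n = 2^(l_s/2) is an integer (the l = 64, l_s = 8 values are illustrative, not from the text):

```python
def schema_lower_bound(l, l_s):
    """Lower bound n_s on usefully processed schemata when the
    population size is chosen as n = 2**(l_s/2); l_s assumed even."""
    n = 2 ** (l_s // 2)
    return (l - l_s + 1) * n ** 3 // 4

# Illustrative values: l = 64, l_s = 8 gives population size n = 16.
# The two forms of the bound derived in the text must agree:
# n * (l - l_s + 1) * 2**(l_s - 2)  ==  (l - l_s + 1) * n**3 / 4.
l, l_s = 64, 8
n = 2 ** (l_s // 2)
assert n * (l - l_s + 1) * 2 ** (l_s - 2) == schema_lower_bound(l, l_s)
```

The key step is that with n = 2^(l_s/2), the factor n·2^(l_s − 2) equals n³/4, which is where the cube of the population size comes from.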
THE BUILDING BLOCK HYPOTHESIS
The picture of genetic algorithms' performance is much clearer with the perspective
afforded by schemata. Short, low-order, and highly fit schemata are sampled,
recombined, and resampled to form strings of potentially higher fitness. In
a way, by working with these particular schemata (the building blocks), we have
reduced the complexity of our problem; instead of building high-performance
strings by trying every conceivable combination, we construct better and better
strings from the best partial solutions of past samplings.

Because highly fit schemata of low defining length and low order play such
an important role in the action of genetic algorithms, we have already given them
a special name: building blocks. Just as a child creates magnificent fortresses
through the arrangement of simple blocks of wood, so does a genetic algorithm
seek near-optimal performance through the juxtaposition of short, low-order,
high-performance schemata, or building blocks.

There is, however, one catch. Chapter 1 repeatedly claimed that notions
combine to form better ideas. Just now, it was claimed that building blocks com-
bine to form better strings. While these claims seem perfectly reasonable, how
do we know whether they hold true or not?

Certainly there is a growing body of empirical evidence to support these
claims in a variety of problem classes. Starting two decades ago with two pioneering
dissertations (Bagley, 1967; Rosenberg, 1967) and continuing through
the many genetic algorithm applications demonstrated at recent conferences devoted
to genetic algorithms (Grefenstette, 1985a, 1987a), the building block hypothesis
has held up in many different problem domains. Smooth, unimodal
problems, noisy multimodal problems, and combinatorial optimization problems
have all been attacked successfully using virtually the same reproduction-crossover-mutation
GA. While limited empirical evidence proves no theory, it
does suggest that genetic algorithms are appropriate for many of the types of
problems we normally encounter.
More recently, Bethke (1981) has shed some light on this topic. Using Walsh
functions and a clever transformation of schemata, he has devised an efficient,
analytical method for determining schema average fitness values using Walsh
coefficients. This in turn permits us to identify whether, given a particular function
and coding, building blocks combine to form optima or near optima. Holland
(1987b) has extended Bethke's computation to the analysis of schema averages
when the population is not uniformly distributed.

Bethke's and Holland's Walsh-schema transform work is beyond the scope of
this discussion, although the interested reader should consult Appendix F, which
briefly discusses some important results. Nonetheless, the concepts behind these
discoveries are sufficiently important that we must try to understand the regularity
implied in building block processing, at least from an intuitive, graphic viewpoint.
To do this, let us return to the five-bit coding example we started in
Chapter 1. Recall, in that problem we were trying to maximize the function
f(x) = x², where x was coded as a five-bit unsigned integer. In this problem, what
do the building blocks look like and how do they lead to better solutions when
they are mixed with one another? Consider a simple schema, H1 = 1****. What
does this one-fixed-bit schema look like? The answer is shown in Fig. 2.3 as the
shaded region of the domain. Apparently, a schema with the high-order bit set
covers the upper half of the domain of interest. Similarly, the schema H2 = 0****
covers the lower half. Other one-bit examples prove illuminating, for example
the schema H3 = ****1 shown in Fig. 2.4. This schema covers the half domain
that decodes to an odd number (00001 = 1, 00011 = 3, 00101 = 5, etc.). The
schema H4 = ***0* also covers half the domain, but in the manner shown in Fig.
2.5. It seems that a one-bit schema covers half the domain, but the frequency of
oscillation depends on the position of the fixed bit.

Higher-order building blocks are certainly of interest. Consider the schema
H5 = 10*** as depicted in Fig. 2.6. This schema covers the lower quarter of the
upper half domain. Other two-bit schemata may be sketched similarly, such as
the schema H6 = **1*1 as shown in Fig. 2.7. Like schema H5, it too covers a
quarter domain (why is this?), but in a more broken-up fashion. For readers familiar
with Fourier series, the periodicity of the different schemata is suggestive.
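The coverage claims for these schemata can be verified by brute-force enumeration of the 32-point domain. A small sketch (the helper names are ours):

```python
def matches(schema, bits):
    """True if the 5-bit string bits is an instance of the schema
    (* marks a don't care position)."""
    return all(s == '*' or s == b for s, b in zip(schema, bits))

def members(schema):
    """All integers in [0, 32) whose 5-bit coding matches the schema."""
    return [x for x in range(32) if matches(schema, format(x, '05b'))]

# One-bit schemata each cover half the domain of f(x) = x**2:
assert members('1****') == list(range(16, 32))    # upper half (Fig. 2.3)
assert members('****1') == list(range(1, 32, 2))  # odd numbers (Fig. 2.4)

# Two-bit schemata cover a quarter of the domain:
assert members('10***') == list(range(16, 24))    # lower quarter of upper half
assert len(members('**1*1')) == 8                 # a quarter, but broken up
```

The same `members` helper also answers the "why is this?" question: any order-2 schema over 5 bits leaves 3 free positions, hence 2³ = 8 members.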
FIGURE 2.3 Sketch of schema 1**** overlaying the function f(x) = x².
FIGURE 2.4 Sketch of schema ****1.
FIGURE 2.5 Sketch of schema ***0*.
FIGURE 2.6 Sketch of schema 10***.
FIGURE 2.7 Sketch of schema **1*1.
In fact, it is this periodicity that permits the Walsh function analysis. Just as harmonic
analysis determines physical properties through an examination of the relative
magnitudes of Fourier coefficients, so does Walsh function analysis
determine the expected static performance of a genetic algorithm through an
analysis of the relative magnitudes of Walsh coefficients.
Although these transform methods are powerful mathematical tools for genetic
algorithm analysis in specific cases, generalization of these results to arbitrary
codings and functions has proved difficult. Bethke has generated a number
of test cases that are provably misleading for the simple three-operator genetic
algorithm (we call these coding-function combinations GA-deceptive). These results
suggest that functions and codings that are GA-deceptive tend to contain
isolated optima: the best points tend to be surrounded by the worst. Practically
speaking, many of the functions encountered in the real world do not have this
needle-in-a-haystack quality; there is usually some regularity in the function-coding
combination—much like the regularity of one or more schemata—that
may be exploited by the recombination of building blocks. Furthermore, we can
argue that finding a needle in a haystack is going to be difficult regardless of the
search technique adopted. Nevertheless, it is important to keep in mind that the
simple genetic algorithm depends upon the recombination of building blocks to
seek the best points. If the building blocks are misleading due to the coding used
or the function itself, the problem may require long waiting times to arrive at
near-optimal solutions.
ANOTHER PERSPECTIVE: THE MINIMAL DECEPTIVE PROBLEM
Schema visualization and Walsh-schema transforms provide insight into the workings
of genetic algorithms. From a practical standpoint, however, these techniques
are at least as computationally cumbersome as an enumerative search of
the discrete problem space. As a result, they are not widely used to analyze practical
problems in genetic search. Nonetheless, we still need to understand better
what makes a problem difficult for a simple GA. To investigate this matter further,
let's construct the simplest problem that should cause a GA to diverge from the
global optimum (Goldberg, 1987b). To do this, we want to violate the building
block hypothesis in the extreme. Put another way, we would like to have short,
low-order building blocks lead to incorrect (suboptimal) longer, higher-order
building blocks. The smallest problem where we can have such deception is a
two-bit problem. In this section, we briefly develop this minimal deceptive problem
(MDP). Despite our best efforts to fool a simple GA, it is somewhat surprising
that this GA-deceptive problem is not usually GA-hard (does not usually diverge
from the global optimum).
Suppose we have a set of four order-2 schemata over two defining positions,
each schema associated with a fitness value as follows:

    *...*0*...*0*...*    f00
    *...*0*...*1*...*    f01
    *...*1*...*0*...*    f10
    *...*1*...*1*...*    f11

The fitness values are schema averages, assumed to be constant with no variance
(this last restriction may be lifted without changing our conclusions, as we only
consider expected performance). To start, let's assume that f11 is the global
optimum:

    f11 > f00;  f11 > f01;  f11 > f10.

Since the problem is invariant to rotation or reflection in Hamming two-space,
the assumption of a particular global optimum is irrelevant to the generality of
our conclusions.
Next, we introduce the element of deception necessary to make this a tough
problem for a simple genetic algorithm. To do this, we want a problem where
one or both of the suboptimal, order-1 schemata are better than the optimal,
order-1 schemata. Mathematically, we want one or both of the following conditions
to hold:

    f(0*) > f(1*),
    f(*0) > f(*1).

In these expressions we have dropped consideration of all alleles other than the
two defining positions, and the fitness expression implies an average over all
strings contained within the specified similarity subset. Thus we would like the
following two expressions to hold:

    [f(00) + f(01)]/2 > [f(10) + f(11)]/2,
    [f(00) + f(10)]/2 > [f(01) + f(11)]/2.
Unfortunately, both expressions cannot hold simultaneously in the two-problem
(if they do, point 11 cannot be the global optimum), and without loss of generality
we assume that the first expression is true. Thus, the deceptive two-problem
is specified by the globality condition (f11 is the best) and one deception
condition (we choose f(0*) > f(1*)).
To put the problem into closer perspective, we normalize all fitness values
with respect to the fitness of the complement of the global optimum as follows:

    r = f11/f00,    c = f01/f00,    c' = f10/f00.

We may rewrite the globality condition in normalized form:

    r > c,    r > 1,    r > c'.

We may also rewrite the deception condition in normalized form:

    r < 1 + c − c'.

From these conditions, we may conclude a number of interesting facts:

    c' < 1,    c' < c.

(The first follows from the deception condition together with r > c; the second,
from the deception condition together with r > 1.)
From these, we recognize that there are two types of deceptive two-problem:

    Type I:   f01 > f00  (c > 1).
    Type II:  f00 > f01  (c < 1).
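The globality and deception conditions can be checked mechanically. The following Python sketch (the function name and structure are my own illustration, not the book's) classifies a normalized fitness triple as Type I, Type II, or not a valid minimal deceptive problem:

```python
# Hypothetical helper: decide whether normalized fitnesses (r, c, c') form a
# minimal deceptive problem, and if so, which type.
def classify_mdp(r, c, cp):
    """r = f11/f00, c = f01/f00, cp = f10/f00 (normalized fitness values)."""
    global_opt = r > c and r > 1 and r > cp   # globality: f11 is the best
    deceptive = r < 1 + c - cp                # deception: f(0*) > f(1*)
    if not (global_opt and deceptive):
        return None
    return "Type I" if c > 1 else "Type II"

print(classify_mdp(1.1, 1.05, 0.4))   # constants of Fig. 2.10 -> Type I
print(classify_mdp(1.1, 0.9, 0.5))    # constants of Fig. 2.11 -> Type II
```

Note that a triple failing either condition (for example, one where deception does not hold) is rejected outright.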
Figures 2.8 and 2.9 are representative sketches of these problems where the fitness
is graphed as a function of two boolean variables. Both cases are deceptive,
and it may be shown that neither case can be expressed as a linear combination
of the individual allele values; that is, neither case can be expressed in the form

    f(x1, x2) = b + Σ ai xi.

In the biologist's terms, we have an epistatic problem. Since it similarly may be
proved that no one-bit problem can be deceptive, the deceptive two-problem is
the smallest possible deceptive problem: it is the minimal deceptive problem
(MDP). With the MDP defined, we now turn toward a complete schema analysis
of its behavior.
Chapter 2 / Genetic Algorithms Revisited: Mathematical Foundations
FIGURE 2.8 Sketch of the Type I minimal deceptive problem (MDP): f01 > f00.

FIGURE 2.9 Sketch of the Type II minimal deceptive problem (MDP): f00 > f01.
An Extended Schema Analysis of the Two-Problem

So far we have constructed a generalized two-bit problem that seems capable of
misleading a genetic algorithm when the two defining bits of the problem become
widely separated on the string. Judging from the schema theorem, we expect
to have difficulty when the growth factor

    (f11/f̄)[1 − pc δ/(l − 1)]

is less than or equal to 1 (assuming that pm = 0). A more careful analysis requires
us to consider the details of crossover more closely.
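As a quick numeric check, the growth factor can be evaluated directly. The constants below are my own illustration (they are not worked in the text): the normalized fitness advantage r = 1.1 with the two defining bits at opposite ends of a 30-bit string.

```python
# Hypothetical illustration of the schema-theorem growth factor
# (fitness ratio) * (probability of surviving crossover).
def growth_factor(f_ratio, pc, delta, l):
    return f_ratio * (1.0 - pc * delta / (l - 1))

# Widely separated defining bits (delta = 29 on l = 30): disruption
# overwhelms the modest selective advantage.
print(growth_factor(1.1, 1.0, 29, 30) <= 1.0)   # True -- expect difficulty

# Adjacent defining bits (delta = 1): the schema still grows.
print(growth_factor(1.1, 1.0, 1, 30) > 1.0)     # True
```

This is exactly why the MDP is posed with a long defining length: the lower-bound analysis predicts trouble, and the extended analysis below checks whether trouble actually occurs.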
In an earlier section, we saw how the usual calculation of the expected number
of schemata in the next generation is a lower bound. This is so because the
derivation contains no source terms (one schema's loss is another's gain) and it
assumes that we lose the schema whenever a cross occurs between the schema's
outermost defining bits. In the two-problem this latter assumption is overly
conservative, because the mating and crossing of noncomplementary pairs conserve
the genetic material of the parents. For example, 00 crossed with 01 yields 01
and 00. The only time a loss of genetic material occurs is when complements
mate and cross. In these cases, a 00 mated and crossed with a 11 yields the pair
01 and 10, and likewise, a 01 mated and crossed with a 10 yields the pair 11 and
00. The full crossover yield table is shown in Table 2.2, where an S is used to
indicate that the offspring are the same as their parents.
In the yield table we see how complements lose material, although we also
see how this loss shows up as a gain to the other complementary pair of schemata.
Using this information, it is possible to write more accurate difference relationships
for the expected proportion P of each of the four competing schemata. To
do this we must account for the correct expected loss and gain of schemata due
TABLE 2.2 Crossover Yield Table in Two-Bit Problem

    ×     00       01       10       11
    00    S        S        S        01, 10
    01    S        S        00, 11   S
    10    S        00, 11   S        S
    11    01, 10   S        S        S
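The yield table's pattern is easy to verify mechanically. A short Python sketch (my own illustration, not from the book) crosses two-bit strings at the single interior site and confirms that only complementary pairs exchange genetic material:

```python
# Cross two 2-bit strings at the single interior crossover site.
def cross_two_bit(a, b):
    return (a[0] + b[1], b[0] + a[1])

print(cross_two_bit("00", "01"))   # ('01', '00') -- parents conserved (an S entry)
print(cross_two_bit("00", "11"))   # ('01', '10') -- complements lose material
print(cross_two_bit("01", "10"))   # ('00', '11') -- the other complementary pair
```

Each complementary cross produces exactly the other complementary pair, which is the source-term bookkeeping used in the difference equations that follow.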
to crossover. Assuming proportionate reproduction, simple crossover, and random
mating of the products of reproduction, we obtain the following autonomous,
nonlinear, difference equations:

    P00^(t+1) = P00^t (f00/f̄)[1 − p'c P11^t (f11/f̄)] + p'c P01^t P10^t (f01 f10/f̄²)
    P01^(t+1) = P01^t (f01/f̄)[1 − p'c P10^t (f10/f̄)] + p'c P00^t P11^t (f00 f11/f̄²)
    P10^(t+1) = P10^t (f10/f̄)[1 − p'c P01^t (f01/f̄)] + p'c P00^t P11^t (f00 f11/f̄²)
    P11^(t+1) = P11^t (f11/f̄)[1 − p'c P00^t (f00/f̄)] + p'c P01^t P10^t (f01 f10/f̄²)

In these equations, the superscripts are time indexes, and the subscript binary
numbers are schema indexes. The variable f̄ is simply the current (generation t)
population average fitness, which may be evaluated as follows:

    f̄ = P00 f00 + P01 f01 + P10 f10 + P11 f11.
The parameter p'c is the probability of having a cross that falls between the two
defining bits of the schema:

    p'c = pc δ/(l − 1).
Together, these equations predict the expected proportions of the four schemata
in the next generation. With specified initial proportions, we may follow
the trajectory of the expected proportions through succeeding generations. A
necessary condition for the convergence of the GA is that the expected proportion
of optimal schemata must go to unity in the limit as generations continue:

    lim (t→∞) P11^t = 1.
To examine the behavior of these equations more carefully, we look at several
numerical solutions of the extended schema equations for Type I and Type II problems.
Some theoretical results are presented briefly without proof.
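The extended schema equations above can be iterated directly. The following Python sketch (variable names and structure are my own, not the book's code) advances the four expected proportions one generation at a time under the Type I constants used in Fig. 2.10:

```python
# One generation of the extended schema equations.
# P = (P00, P01, P10, P11), f = (f00, f01, f10, f11),
# pc_prime = probability of a cross between the two defining bits.
def step(P, f, pc_prime):
    fbar = sum(p * fi for p, fi in zip(P, f))   # population average fitness
    P00, P01, P10, P11 = P
    f00, f01, f10, f11 = f
    loss = pc_prime / fbar
    gain_outer = pc_prime * P01 * P10 * f01 * f10 / fbar**2   # feeds 00 and 11
    gain_inner = pc_prime * P00 * P11 * f00 * f11 / fbar**2   # feeds 01 and 10
    return (P00 * f00 / fbar * (1 - loss * P11 * f11) + gain_outer,
            P01 * f01 / fbar * (1 - loss * P10 * f10) + gain_inner,
            P10 * f10 / fbar * (1 - loss * P01 * f01) + gain_inner,
            P11 * f11 / fbar * (1 - loss * P00 * f00) + gain_outer)

# Type I constants of Fig. 2.10: r = 1.1, c = 1.05, c' = 0.4, with pc' = 1.0
P = (0.25, 0.25, 0.25, 0.25)
f = (1.0, 1.05, 0.4, 1.1)
for _ in range(100):
    P = step(P, f, 1.0)
print(P)   # P11 grows toward 1: the Type I MDP converges
```

Because every loss term reappears as a gain to the complementary pair, the four proportions continue to sum to one at each generation, a useful sanity check on any implementation.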
MDP Results

Figure 2.10 presents computational results for a representative Type I problem.
At first, the optimum schema (schema 11) loses proportion; however, as schemata
10 and 00 lose proportion, the remaining battle is fought between schemata
11 and 01 alone, with 11 winning in the end. It may be shown that this result
generalizes to any Type I problem with nonzero starting proportions of the four
schemata. This result is surprising, as we originally designed the problem to cause
divergence from the global optimum. In short, dynamic analysis tells us that the
Type I minimal deceptive problem is not GA-hard.
Figures 2.11 and 2.12 present computational results for a Type II MDP. In Fig.
2.11 we see representative convergent results where (as in the Type I case) the
solution converges to the optimum despite its initial deception. Not all Type II
problems converge like this, however; when the complementary schema 00 has
too great an initial proportion, schema 11 may be overwhelmed, with resulting
convergence to the second-best solution. Representative divergent results are
shown in Fig. 2.12. Simple sufficient conditions for the convergence of a Type II
problem may be derived (Goldberg, 1987b); it is surprising that all Type II
problems converge to the best solution for most starting conditions.

Recent results (Bridges and Goldberg, 1987; Goldberg, 1987a) have extended
the exact schema analysis to higher order problems. Other work along
these lines may permit a constructive definition of the class of GA-hard problems.
Of course, all of this work assumes a fixed coding. The addition of reordering
operators such as inversion may be nature's answer to problems too difficult for
a simple GA. We consider such operators and their analysis in Chapter 5.
FIGURE 2.10 Numerical solution of a Type I minimal deceptive problem
(MDP): r = 1.1, c = 1.05, c' = 0.4. (Plot of population proportion versus generation.)
FIGURE 2.11 Numerical solution of a Type II minimal deceptive problem that
converges: r = 1.1, c = 0.9, c' = 0.5 with equal initial proportions. (Plot of
population proportion versus generation.)
FIGURE 2.12 Numerical solution of a Type II minimal deceptive problem that
diverges: r = 1.1, c = 0.9, c' = 0.5 with unequal initial proportions. (Plot of
population proportion versus generation.)
SCHEMATA REVISITED: SIMILARITY TEMPLATES AS HYPERPLANES

Over the past two sections we have looked at schema processing from two perspectives:
using schema visualization we have viewed schema processing as the
manipulation of important periodicities, and under the minimal deceptive problem
we have considered schema processing in a competitive, ecological setting.
Another useful vantage point may be reached if we take a more geometric view
of the underlying bit space.
To generate this geometrical vision of our search space, consider the strings
and schemata of length l = 3. Because the string length is so short, it is easy to
draw a picture of the search space (Fig. 2.13). In this representation, we view the
space as a three-dimensional vector space. Points in the space are strings or schemata
of order 3. Lines in the space, as indicated in the diagram, are schemata of
order 2. Planes in the space are schemata of order 1, and the whole space is
covered by the schema of order 0, the schema ***.

This result generalizes to spaces of higher dimension where we must abandon
the geometrical notions available to us in 3-space. Points, lines, and planes
described by schemata in three dimensions generalize to hyperplanes of varying
FIGURE 2.13 Visualization of schemata as hyperplanes in three-dimensional
space.
dimension in space. Thus we can think of a genetic algorithm cutting across
different hyperplanes to search for improved performance.
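The hyperplane view is easy to make concrete. The small Python sketch below (function names are my own illustration) enumerates the strings matched by a schema; a schema of order o over strings of length l covers a hyperplane of 2^(l − o) points:

```python
# Does a string lie on the hyperplane described by a schema?
def matches(schema, string):
    return all(s == '*' or s == b for s, b in zip(schema, string))

# Enumerate every point of the search space lying on the schema's hyperplane.
def hyperplane(schema):
    l = len(schema)
    return [format(i, f'0{l}b') for i in range(2 ** l)
            if matches(schema, format(i, f'0{l}b'))]

print(hyperplane('0*1'))        # a "line" in 3-space: ['001', '011']
print(len(hyperplane('1**')))   # a "plane" in 3-space: 4 points
print(len(hyperplane('***')))   # the whole space: 8 points
```

Counting points confirms the geometric hierarchy: order-2 schemata are lines (2 points), order-1 schemata are planes (4 points), and the order-0 schema covers all 8 vertices of the cube.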
SUMMARY

In this chapter, we have undertaken a more rigorous appraisal of genetic
algorithm performance using a more careful analysis of schemata (similarity
templates). The primary result, embodied in the fundamental theorem of genetic
algorithms, says that high-performance, short-defining-length, low-order schemata
receive at least exponentially increasing numbers of trials in successive generations.
This occurs because reproduction allocates more copies to the best
schemata and because simple crossover does not disturb short-defining-length
schemata with high frequency. Since mutation is fairly infrequent, it has little
effect on these important schemata. The exponential form of this trial allocation
turns out to be a rational way to allocate trials, as it connects with the optimal
solution to a two-armed bandit problem.

By processing similarities in this manner, a genetic algorithm reduces the
complexity of arbitrary problems. In a sense, these highly fit, short, low-order
schemata become the partial solutions to a problem (called building blocks), and
a GA discovers new solutions by speculating on many combinations of the best
partial solutions contained within the current population.
That building blocks do indeed lead to better performance is an underlying
assumption of the simple genetic algorithm called the building block hypothesis.
The Walsh-schema transform has provided us with an important tool for understanding
whether particular problems are amenable to simple GA solution. Work
in this area has also suggested that functions that are GA-hard (that is, difficult to
solve with the simple three-operator genetic algorithm) tend to have remote
optima, something akin to finding a needle in a haystack; many optimization
methods, not just genetic algorithms, have trouble finding answers in problems with
such isolated optima.

Additional insight into the workings of genetic algorithms has been obtained
through analysis of the minimal deceptive problem, the smallest problem that
can possibly be deceptive or misleading for the simple genetic algorithm. It is
surprising that for many likely initial conditions the MDP is not GA-hard. In other
words, even though we construct a difficult and misleading function (a badly
epistatic function), the genetic algorithm often refuses to be misled. This is
encouraging and is no doubt responsible for much of the empirical success of simple
genetic algorithms in epistatic problem domains.

Analysis of genetic algorithm performance has led us to understand better
why GAs work. In the next chapter a simple genetic algorithm is programmed
using the Pascal programming language, and we observe its performance in a trial
problem.
PROBLEMS

2.1. Consider three strings A1 = 11101111, A2 = 01000011, and six schemata
H1 = 1*..., H2 = 0*00..., H3 = 1*****1*, and H4 = *.... Which schemata are
matched by which strings? What are the order and defining length of each of the
schemata? Estimate the probability of survival of each schema under mutation
when the probability of a single mutation is pm = 0.001. Estimate the probability
of survival of each schema under crossover when the probability of crossover is
pc = 0.85.
2.2. A population contains the following strings and fitness values at generation
0:

    #    String    Fitness
    1    10001     20
    2    11100     10
    3    00011      5
    4    01110     15

The probability of mutation is pm = 0.01 and the probability of crossover is
pc = 1.0. Calculate the expected number of schemata of the form 1**** in
generation 1. Estimate the expected number of schemata of the form 0**1* in
generation 1.
2.3. Devise three methods of performing reproduction and calculate an estimate
of the expected number of schemata in the next generation under each method.
2.4. Suppose we perform a crossover-like operation where we pick two cross
sites and exchange string material between the two sites:

    xxx|xx|xx        xxxyyxx
    yyy|yy|yy        yyyxxyy

Calculate a lower bound on the survival probability of a schema of defining length
δ and order o under this operator. Recalculate the survival probability when we
treat the string as a ring (when the left end is assumed to be adjacent to the right
end).
2.5. How many unique schemata exist within strings of length l = 10, 20, and
30 when the underlying alphabet is binary? How many unique schemata of order
3 exist in binary strings of length l = 10, 20, and 30? Calculate reasonable upper
and lower bounds on the number of schemata processed using strings of length
10, 20, and 30 when the population size n = 50. Assume a significant building
block length equal to 10 percent of the total string length.
2.6. Suppose a schema H when present in a particular string causes the string to
have a fitness 25 percent greater than the average fitness of the current population.
If the destruction probabilities for this schema under mutation and crossover
are negligible, and if a single representative of the schema is contained in
the population at generation 0, determine when the schema H will overtake
populations of size n = 20, 50, 100, and 200.
2.7. Suppose a schema H when present in a particular string causes the string to
have a fitness 10 percent less than the average fitness of the current population.
If the destruction probabilities for this schema under mutation and crossover are
negligible, and if representatives of the schema are contained in 60 percent of
the population at generation 0, determine when the schema H will disappear
from populations of size n = 20, 50, 100, and 200.
2.8. A two-armed bandit pays equal awards with probabilities p1 = 0.7 and
p2 = 0.3. Estimate the number of trials that should be given to the observed best
arm after a total of 50 trials.
2.9. Derive a more accurate formula for calculating the number of unique schemata
contained in a randomly generated initial population of size n when the
string length is l. (Hint: Consider the probability of having no schemata of a
particular order and use the complementary probability to count the number of
schemata represented by one or more.)
2.10. Suppose a problem is coded as a single unsigned binary integer between 0
and 127 (base 10), where 0000000 = 0, 1000000 = 64, and 1111111 = 127.
Sketch the portion of the space covered by the following schemata: 1******,
*******, 1111111, 10***01, and *111***.
COMPUTER ASSIGNMENTS

A. A fitness function for a single-locus genetic algorithm is given by the function
values f1 = constant and f0 = constant. Derive the recursion relationship for the
expected proportion of 1's under reproduction alone and reproduction with mutation
in an infinitely large population. Program the finite difference relationship
and calculate the expected proportion of 1's between generation 0 and 100,
assuming equal proportions of 1's and 0's initially and a ratio of f1/f0 =
r = 1.1, 2, 10.

B. Redo Problem A including mutation using mutation probability values of
pm = 0.001, 0.01, 0.1.
C. Consider an order-2 schema where we assume constant fitness values as follows:

    *...*1*...*1*...*    f11
    *...*1*...*0*...*    f10
    *...*0*...*1*...*    f01
    *...*0*...*0*...*    f00

Write finite difference relationships for the proportion of the four schemata (11,
10, 01, 00) under reproduction, crossover, and mutation. Be sure to use the correct
loss and gain terms due to crossover. Program the finite difference relationships
and simulate the performance of a large population with the f constants of
Fig. 2.10. Compare and contrast results at pm values of 0.001, 0.01, 0.1 to the
results of Fig. 2.10.
D. For the function f(x) = x² on the integer interval [0, 31] coded as a five-bit,
unsigned binary integer, calculate the average fitness values for all 3⁵ schemata.
From this data, determine whether the function-coding combination is GA-deceptive or not.

E. Design a three-bit function that is GA-deceptive. Prove its deception by
calculating all 3³ schema averages.

Computer Implementation of a Genetic Algorithm
When first approaching genetic algorithms, many users hesitate, not knowing
where to start or how to begin. On the one hand, this aversive reaction seems
strange. After all, in the first two chapters we have seen how genetic algorithms
are mechanically quite simple, involving nothing more than random number generation,
string copies, and partial string exchanges. On the other hand, for many
business, scientific, and engineering users and programmers this stark simplicity
is itself part of the problem; these individuals are familiar with using and programming
high-level computer codes involving complex mathematics, interwoven
databases, and intricate computations. Moreover, this same audience is most
comfortable with the reassuring repeatability of deterministic computer programs.
The direct manipulation of bit strings, the construction of custom codings,
and even the randomness of GA operators can present a sequence of high hurdles
that prevent effective application.

In this chapter, we leap these obstacles by first constructing the data structures
and algorithms necessary to implement the simple genetic algorithm described
earlier. Specifically, we write a Pascal computer code called the simple
genetic algorithm (SGA), which contains nonoverlapping string populations, reproduction,
crossover, and mutation applied to the optimization of a simple function
of one variable coded as an unsigned binary integer. We also examine some
implementation issues such as discretization of parameters, coding of strings, enforcement
of constraints, and mapping of fitness that arise in applying GAs to
particular problems.
DATA STRUCTURES

Genetic algorithms process populations of strings. Therefore it comes as no surprise
that the primary data structure for the simple genetic algorithm is a string
population. There are any number of ways to implement populations. For the
SGA we choose the simplest; we construct a population as an array of individuals
where each individual contains the phenotype (the decoded parameter or parameters),
the genotype (the artificial chromosome or bit string), and the fitness
(objective function) value along with other auxiliary information. A schematic of
a population is shown in Fig. 3.1. The Pascal code of Fig. 3.2 declares a population
type corresponding to this model. For readers unfamiliar with Pascal, the essentials
of the language are presented in Appendix B; this appendix also presents
some random number generation routines and utilities. Even without formal
training, many readers should be able to decipher the essence of this code.

Referring to Fig. 3.2, we see the declaration of a number of constants: the
maximum population size, maxpop, and the maximum string length, maxstring.
These set upper bounds on the population size and the string length. Following
the constant declarations, we declare the population itself, along with its components,
in the type block. As we can see, the type population is an array of type
individual (indexed between 1 and maxpop). Type individual is a record composed
of a type chromosome called chrom, a real variable called fitness, and a
real type variable called x. These represent the artificial chromosome, the string
fitness value, and the decoded parameter value x respectively. Digging further,
we see that the type chromosome is itself an array of type allele (indexed between
1 and maxstring), which in this case is simply another name for the boolean
type (a single bit, true or false).
FIGURE 3.1 Schematic of a string population in a genetic algorithm.
const maxpop    = 100;
      maxstring = 30;

type  allele     = boolean;                        { Allele = bit position }
      chromosome = array[1..maxstring] of allele;  { String of bits }
      individual = record
                     chrom:chromosome;    { Genotype = bit string }
                     x:real;              { Phenotype = unsigned integer }
                     fitness:real;        { Objective function value }
                     parent1, parent2, xsite:integer;  { parents & cross pt }
                   end;
      population = array[1..maxpop] of individual;

FIGURE 3.2 A simple genetic algorithm, SGA, data type declarations in Pascal.
In the SGA, we apply genetic operators to an entire population at each generation,
as shown in Fig. 3.3. To implement this operation cleanly, we utilize two
nonoverlapping populations, thereby simplifying the birth of offspring and the
replacement of parents. The declarations of the two populations oldpop and
newpop are shown in Fig. 3.4 along with the declaration of a number of other global
program variables. With these two populations, it is a simple matter to create
new offspring from the members of oldpop using the genetic operators, place
those new individuals in newpop, and set oldpop to newpop when we are

FIGURE 3.3 Schematic of nonoverlapping populations used in the SGA.
var oldpop, newpop:population;             { Two non-overlapping populations }
    popsize, lchrom, gen, maxgen:integer;  { Integer global variables }
    pcross, pmutation, sumfitness:real;    { Real global variables }
    nmutation, ncross:integer;             { Integer statistics }
    avg, max, min:real;                    { Real statistics }

FIGURE 3.4 SGA global variable declarations in Pascal.
through. There are other, more storage-efficient methods of handling populations.
We could maintain a single overlapping population and pay more careful attention
to who replaces whom in successive populations. There is also no particular
reason to keep the population size constant. Natural populations certainly change
in size, and there may be motivation during artificial genetic search to permit
population size variation from generation to generation. There is, however,
stronger motivation in our current work to keep things simple, and this has
guided the choice of nonoverlapping populations of constant size. In our machine
learning work in later chapters we will need to come to terms with the population
issue once more.

With our data structures designed and built, we need to understand the three
operators (reproduction, crossover, and mutation) essential to SGA operation.
Before we can do this, we need to define some of the more important global
program variables that affect the operation of the entire code. Looking at Fig. 3.4
once again, we see a number of variables of type integer. Among them are the
variables popsize, lchrom, and gen. These important variables correspond to
what we have been calling population size (n), string length (l), and the generation
counter (t). Additionally the variable maxgen is an upper limit on the number
of generations. Also shown in Fig. 3.4 are a number of important global real
variables: pcross, pmutation, sumfitness, avg, max, and min. The variables
pcross and pmutation are the probabilities of crossover and mutation respectively
(pc and pm). The sumfitness variable is the sum of the population fitness
values (Σf). This variable is important during roulette wheel selection. There are
a few other global variables we have not discussed; a complete listing of the SGA
code is presented in Appendix C.
REPRODUCTION, CROSSOVER, AND MUTATION

The three operators of the simple tripartite algorithm can each be implemented
in straightforward code segments. This comes as no surprise, since we have been
touting the simple mechanics of these operators. Before we look at each routine,
we must remember the common thread running through the three operators:
each depends on random choice. In the code segments that follow, we assume
the existence of three random choice routines:

random   returns a real pseudorandom number between zero and one (a
         uniform random variable on the real interval [0, 1]).

flip     returns a boolean true value with specified probability (a Bernoulli
         random variable).

rnd      returns an integer value between specified lower and upper limits
         (a uniform random variable over a subset of adjacent integers).

A more complete discussion of these routines is contained in Appendix B where
several programming examples are given.
In the simple genetic algorithm, reproduction is implemented in the function
select as a linear search through a roulette wheel with slots weighted in proportion
to string fitness values. In the code shown in Fig. 3.5, we see that select
returns the population index value corresponding to the selected individual. To
do this, the partial sum of the fitness values is accumulated in the real variable
partsum. The real variable rand contains the location where the wheel has
landed after a random spin according to the computation

    rand := random * sumfitness

Here the sum of the population fitnesses (calculated in the procedure statistics)
is multiplied by the normalized pseudorandom number generated by random.
Finally the repeat-until construct searches through the weighted roulette wheel
until the partial sum is greater than or equal to the stopping point rand. The
function returns with the current population index value j assigned to select.
function select(popsize:integer; sumfitness:real;
                var pop:population):integer;
{ Select a single individual via roulette wheel selection }
var rand, partsum:real;  { Random point on wheel, partial sum }
    j:integer;           { Population index }
begin
  partsum := 0.0; j := 0;  { Zero out counter and accumulator }
  { Wheel point calc. uses random number [0,1] }
  rand := random * sumfitness;
  repeat  { Find wheel slot }
    j := j + 1;
    partsum := partsum + pop[j].fitness;
  until (partsum >= rand) or (j = popsize);
  { Return individual number }
  select := j;
end;

FIGURE 3.5 Function select implements roulette wheel selection.
This is perhaps the simplest way to implement selection. There are more
efficient codes to implement this operator (a binary search will certainly speed
things up), and there are many other ways to choose offspring with appropriate
bias toward the best. We will examine some of these in later chapters, but for
now we stay with this basic mechanism.
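The binary-search speedup mentioned above can be sketched briefly. The Python fragment below is my own illustration (not part of the SGA's Pascal code): the running fitness totals are computed once, and each spin of the wheel then costs a single binary search rather than a linear scan.

```python
import bisect
import itertools
import random

# Precompute the partial fitness sums: the "slot boundaries" of the wheel.
def make_wheel(fitnesses):
    return list(itertools.accumulate(fitnesses))

# One spin: pick a random point on the wheel, then binary-search its slot.
def select(wheel, rng=random):
    rand = rng.random() * wheel[-1]          # random point on the wheel
    return bisect.bisect_right(wheel, rand)  # index of the chosen individual

wheel = make_wheel([20.0, 10.0, 5.0, 15.0])  # fitness values of Problem 2.2
print(select(wheel))                          # an index in 0..3, biased to the fittest
```

The precomputation is amortized over a whole generation of spins, so each selection drops from O(n) to O(log n) comparisons.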
The code segment select gives us a straightforward way of choosing offspring
for the next generation. From our previous descriptions, we know our next step
is crossover. In the SGA the crossover operator is implemented in a procedure that
we, cleverly enough, have called crossover (Fig. 3.6). The routine crossover takes
two parent strings called parent1 and parent2 and generates two offspring strings
called child1 and child2. The probabilities of crossover and mutation, pcross and
pmutation, are passed to crossover, along with the string length lchrom, a crossover
count accumulator ncross, and a mutation count accumulator nmutation.

Within crossover the operations mirror our description in Chapter 1. At the
top of the routine, we determine whether we are going to perform crossover on
the current pair of parent chromosomes. Specifically, we toss a biased coin that
comes up heads (true) with probability pcross. The coin toss is simulated in the
boolean function flip, where flip in turn calls on the pseudorandom number
procedure crossover(var parent1, parent2, child1, child2:chromosome;
                    var lchrom, ncross, nmutation, jcross:integer;
                    var pcross, pmutation:real);
{ Cross 2 parent strings, place in 2 child strings }
var j:integer;
begin
  if flip(pcross) then begin    { Do crossover with p(cross) }
    jcross := rnd(1,lchrom-1);  { Cross between 1 and l-1 }
    ncross := ncross + 1;       { Increment crossover counter }
  end else                      { Otherwise set cross site to force mutation }
    jcross := lchrom;
  { 1st exchange, 1 to 1 and 2 to 2 }
  for j := 1 to jcross do begin
    child1[j] := mutation(parent1[j], pmutation, nmutation);
    child2[j] := mutation(parent2[j], pmutation, nmutation);
  end;
  { 2nd exchange, 1 to 2 and 2 to 1 }
  if jcross<lchrom then  { Skip if cross site is lchrom--no crossover }
    for j := jcross+1 to lchrom do begin
      child1[j] := mutation(parent2[j], pmutation, nmutation);
      child2[j] := mutation(parent1[j], pmutation, nmutation);
    end;
end;

FIGURE 3.6 Procedure crossover implements simple (single-point) crossover.
routine random. If a cross is called for, a crossing site is selected between 1 and
the last cross site. The crossing site is selected in the function rnd, which returns
a pseudorandom integer between specified lower and upper limits (between 1
and lchrom − 1). If no cross is to be performed, the cross site is selected as
lchrom (the full string length l) so a bit-by-bit mutation will take place despite
the absence of a cross. Finally, the partial exchange of crossover is carried out in
the two for-do constructs at the end of the code. The first for-do handles the
partial transfer of bits between parent1 and child1 and between parent2 and
child2. The second for-do construct handles the transfer and partial exchange of
material between parent2 and child1 and between parent1 and child2. In all
cases, a bit-by-bit mutation is carried out by the boolean (or allelean) function
mutation.

Mutation at a point is carried out by mutation as shown in Fig. 3.7. This
function uses the function flip (the biased coin toss) to determine whether or
not to change a true to a false (a 1 to a 0) or vice versa. Of course the function
flip will only come up heads (true) pmutation percent of the time as a result of
the call to the pseudorandom number generator random within flip itself. The
function also keeps tabs on the number of mutations by incrementing the variable
nmutation. As with reproduction, there are ways to improve our simple mutation
operator. For example, it would be possible to avoid much random number generation
if we decided when the next mutation should occur rather than calling
flip each time. Again, in this chapter, we avoid sophisticated niceties and stick
with the basics.
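The improvement hinted at above (deciding in advance where the next mutation falls) is a standard trick: the gap between successive mutations follows a geometric distribution. The sketch below is my own Python illustration of the idea, not SGA code:

```python
import math
import random

# Rather than flipping a biased coin once per bit, jump straight to the next
# mutation site using a geometrically distributed gap. Each bit is still
# flipped with probability pm, but the expected number of random draws per
# string drops from l to roughly pm*l + 1.
def mutate_string(bits, pm, rng=random):
    """Return a copy of the bit string with each bit flipped w.p. pm (0 < pm < 1)."""
    bits = list(bits)
    i = -1
    while True:
        u = rng.random()
        # Geometric gap: number of coin tosses until the next head
        i += int(math.log(1.0 - u) / math.log(1.0 - pm)) + 1
        if i >= len(bits):
            return ''.join(bits)
        bits[i] = '1' if bits[i] == '0' else '0'  # flip the mutated bit
```

At SGA-typical mutation rates (pm on the order of 0.001), almost every string passes through with a single random draw instead of lchrom of them.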
The three main pieces of our genetic algorithm puzzle have proven to be
none too puzzling. We have seen in this section how the three may be easily
coded and easily understood. The next section continues piecing together the
bigger GA picture as we coordinate reproduction, crossover, and mutation in a
single generation.
function mutation(alleleval:allele; pmutation:real;
                  var nmutation:integer):allele;
{ Mutate an allele w/ pmutation, count number of mutations }
var mutate:boolean;
begin
  mutate := flip(pmutation);  { Flip the biased coin }
  if mutate then begin
    nmutation := nmutation + 1;
    mutation := not alleleval;  { Change bit value }
  end else
    mutation := alleleval;  { No change }
end;

FIGURE 3.7 Function mutation implements a single-bit, point mutation.
A TIME TO REPRODUCE, A TIME TO CROSS

With the Big Three designed and built, creating a new population from an old
one is no big deal. The proper sequencing is shown in Fig. 3.8 in the procedure
generation. Starting at an individual index j = 1 and continuing until the population
size, popsize, has been exceeded, we pick two mates, mate1 and mate2,
using successive calls to select. We cross and mutate the chromosomes using
crossover (which itself contains the necessary invocations of mutation). In a final
flurry of mad activity, we decode the pair of chromosomes, evaluate the objective
(fitness) function values, and increment the population index j by 2.

Having already examined select, crossover, and mutation in detail, we need
only concern ourselves with the two problem-dependent routines hinted at
above. For any problem we must create a procedure that decodes the string to
procedure generation;
{ Create a new generation through select, crossover, and mutation }
{ Note: generation assumes an even-numbered popsize }
var j, mate1, mate2, jcross: integer;
begin
  j := 1;
  repeat  { select, crossover, and mutation until newpop is filled }
    mate1 := select(popsize, sumfitness, oldpop);  { pick pair of mates }
    mate2 := select(popsize, sumfitness, oldpop);
    { Crossover and mutation - mutation embedded within crossover routine }
    crossover(oldpop[mate1].chrom, oldpop[mate2].chrom,
              newpop[j].chrom, newpop[j+1].chrom,
              lchrom, ncross, nmutation, jcross, pcross, pmutation);
    { Decode string, evaluate fitness, & record parentage data for both children }
    with newpop[j] do begin
      x := decode(chrom, lchrom);
      fitness := objfunc(x);
      parent1 := mate1;
      parent2 := mate2;
      xsite := jcross;
    end;
    with newpop[j+1] do begin
      x := decode(chrom, lchrom);
      fitness := objfunc(x);
      parent1 := mate1;
      parent2 := mate2;
      xsite := jcross;
    end;
    { Increment population index }
    j := j + 2;
  until j > popsize
end;

FIGURE 3.8 Procedure generation generates a new population from the previous population.
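A compact Python rendering of the same generation loop may help clarify the control flow; passing select, crossover, decode, and objfunc in as plain functions is an assumption of this sketch rather than the book's Pascal calling convention:

```python
def generation(oldpop, select, crossover, decode, objfunc):
    """Create a new population two children at a time (even popsize assumed)."""
    newpop = []
    while len(newpop) < len(oldpop):
        mate1 = select(oldpop)        # pick a pair of mates
        mate2 = select(oldpop)
        child1, child2 = crossover(oldpop[mate1]["chrom"],
                                   oldpop[mate2]["chrom"])
        for chrom in (child1, child2):  # decode, evaluate, record parentage
            x = decode(chrom)
            newpop.append({"chrom": chrom, "x": x,
                           "fitness": objfunc(x),
                           "parents": (mate1, mate2)})
    return newpop
```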
create a parameter or set of parameters appropriate for that problem. We must also create a procedure that receives the parameter or set of parameters thus decoded and evaluates the figure of merit or objective function value associated with the given parameter set. These routines, which we call decode and objfunc, are the two places where the GA rubber meets the applications road. For different problems we will often need different decoding routines (although later on in this chapter we will examine some standard routines that have proven useful in a number of studies), and in different problems we will always need a different fitness function routine. Having said this, it is still useful to look at a particular decoding routine and a particular fitness function. To be consistent with work earlier in this book, we will continue to use binary unsigned integer coding, and we will continue to use a simple power function as the fitness function; however, we will increase the value of the exponent, using the function f(x) = x^10.
SGA uses the decoding routine shown in Fig. 3.9, the function decode. In this function, a single chromosome is decoded starting at the low-order bit (position 1) and mapped from right to left by accumulating the current power of 2 (stored in the variable powerof2) when the appropriate bit is set (value is true). The accumulated value, stored in the variable accum, is finally returned by the function decode.
The objective function used in SGA is a simple power function, similar to the function used in Chapter 1. In SGA we evaluate the function f(x) = (x/coef)^10. The actual value of coef is chosen to normalize the x parameter when a bit string of length lchrom = 30 is chosen. Thus coef = 2^30 - 1 = 1073741823. Since the x value has been normalized, the maximum value of the function will be f(x) = 1.0 when x = 2^30 - 1 for the case when lchrom = 30. A straightforward implementation of the power function is presented in Fig. 3.10 as the function objfunc.
function decode(chrom: chromosome; lbits: integer): real;
{ Decode string as unsigned binary integer - true=1, false=0 }
var j: integer;
    accum, powerof2: real;
begin
  accum := 0.0; powerof2 := 1;
  for j := 1 to lbits do begin
    if chrom[j] then accum := accum + powerof2;
    powerof2 := powerof2 * 2;
  end;
  decode := accum;
end;

FIGURE 3.9 Function decode decodes a binary string as a single, unsigned integer.
function objfunc(x: real): real;
{ Fitness function - f(x) = (x/coef)**n }
const coef = 1073741823.0;  { Coefficient to normalize domain }
      n = 10;
begin objfunc := power( x/coef, n ) end;

FIGURE 3.10 Function objfunc calculates the fitness function f(x) = (x/coef)^10 from the decoded parameter x.
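As a cross-check, the decode/objfunc pair can be written in a few lines of Python (a sketch, not the book's code; the default constant mirrors coef = 2^30 - 1):

```python
def decode(chrom):
    """Accumulate an unsigned integer from a low-order-first bit list."""
    accum, powerof2 = 0, 1
    for bit in chrom:
        if bit:
            accum += powerof2
        powerof2 *= 2
    return accum

def objfunc(x, coef=2**30 - 1, n=10):
    """Fitness f(x) = (x/coef)**n, normalized so the maximum is 1.0."""
    return (x / coef) ** n
```

Note that a chromosome of all ones decodes to exactly coef, so the best attainable fitness is 1.0, as the text states.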
GET WITH THE MAIN PROGRAM
We have described the data structures. We have built the genetic operators. We have decoded the strings, and we have figured the fitness values. Now is the time to wrap a ribbon around this parcel, test it, and ship it on for further use. In Fig. 3.11 we see the main program of SGA. At the top of the code, we start innocently enough by setting the generation counter to zero, gen := 0. We build steam as we read in program data, initialize a random population, calculate initial population statistics, and print out a special initial report using the procedure initialize. We won't dwell on the initialization code here. The interested reader should refer to Appendix C, which contains a complete copy of the SGA code.

At long last, with necessary preliminaries complete, we hit the main loop contained within the repeat-until construct. In rapid succession we increment the generation counter, generate a new generation in generation, calculate new generation statistics in statistics, print out the generation report in report, and advance the population in one fell swoop:

oldpop := newpop;

All this continues, step after relentless step, until the generation counter exceeds the maximum, thereby forcing the machinery to a grinding halt.
In our rush to see the big picture, we have missed some important details. The statistical routine statistics (Fig. 3.12) calculates the average, maximum, and minimum fitness values; it also calculates the sumfitness required by the roulette wheel. This version of statistics is again something of a minimally acceptable solution. Many other interesting population statistics could and probably should be tracked. For example, allele convergence statistics are often tabulated during a generation. Best string so far or best k strings so far could be stored for future reference. Population standard deviation or even population histograms might also be of interest in doing more detailed run postmortems. The separation of statistical functions in the routine statistics permits the easy addition of any or all of these computations.
begin  { Main program }
  gen := 0;  { Set things up }
  initialize;
  repeat  { Main iterative loop }
    gen := gen + 1;
    generation;
    statistics(popsize, max, avg, min, sumfitness, newpop);
    report(gen);
    oldpop := newpop;  { advance the generation }
  until (gen >= maxgen)
end.  { End main program }

FIGURE 3.11 Main program for a simple genetic algorithm, SGA.
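The overall control flow can be sketched in Python with the subroutines passed in as functions (an assumption of the sketch; in SGA they are global Pascal procedures operating on global state):

```python
def run_sga(initialize, generation, statistics, report, maxgen):
    """Top-level loop: initialize, then evolve until gen reaches maxgen."""
    pop = initialize()
    gen = 0
    while gen < maxgen:
        gen += 1
        pop = generation(pop)   # oldpop := newpop, in one assignment
        report(gen, statistics(pop))
    return pop
```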
The procedure report presents the full population report, including strings, fitnesses, and parameter values. A listing of report and its single subprocedure writechrom are presented in Fig. 3.13. Once again, a wide array of tabular and graphic reporting options may be useful in genetic algorithm work. The simple report procedure is a good tool because it permits side-by-side comparison of consecutive generations. In turn, this allows checking of operators and analysis of the events leading to the construction of the best individuals.
procedure statistics(popsize: integer;
                     var max, avg, min, sumfitness: real;
                     var pop: population);
{ Calculate population statistics }
var j: integer;
begin
  { Initialize }
  sumfitness := pop[1].fitness;
  min := pop[1].fitness;
  max := pop[1].fitness;
  { Loop for max, min, sumfitness }
  for j := 2 to popsize do with pop[j] do begin
    sumfitness := sumfitness + fitness;  { Accumulate fitness sum }
    if fitness > max then max := fitness;  { New max }
    if fitness < min then min := fitness;  { New min }
  end;
  { Calculate average }
  avg := sumfitness/popsize;
end;

FIGURE 3.12 Procedure statistics calculates important population statistics.
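The same bookkeeping in Python, as a one-pass sketch over a list of fitness values:

```python
def statistics(fitnesses):
    """Return (max, avg, min, sumfitness) for a nonempty fitness list."""
    sumfitness = maxf = minf = fitnesses[0]
    for f in fitnesses[1:]:
        sumfitness += f          # accumulate fitness sum
        if f > maxf:
            maxf = f             # new max
        if f < minf:
            minf = f             # new min
    return maxf, sumfitness / len(fitnesses), minf, sumfitness
```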
{ report.sga: contains writechrom, report }

procedure writechrom(var out: text; chrom: chromosome; lchrom: integer);
{ Write a chromosome as a string of 1's (true's) and 0's (false's) }
var j: integer;
begin
  for j := lchrom downto 1 do
    if chrom[j] then write(out, '1')
    else write(out, '0');
end;

procedure report(gen: integer);
{ Write the population report }
const linelength = 132;
var j: integer;
begin
  repchar(lst, '-', linelength); writeln(lst);
  repchar(lst, ' ', 50); writeln(lst, 'Population Report');
  repchar(lst, ' ', 23); write(lst, 'Generation ', gen-1:2);
  repchar(lst, ' ', 57); writeln(lst, 'Generation ', gen:2);
  writeln(lst);
  write(lst,   ' #          string          x       fitness');
  write(lst,   '    #  parents  xsite');
  writeln(lst, '          string          x       fitness');
  repchar(lst, '-', linelength); writeln(lst);
  for j := 1 to popsize do begin
    write(lst, j:2, ') ');
    { Old string }
    with oldpop[j] do begin
      writechrom(lst, chrom, lchrom);
      write(lst, ' ', x:10, ' ', fitness:6:4, ' | ');
    end;
    { New string }
    with newpop[j] do begin
      write(lst, ' ', j:2, ') (', parent1:2, ',', parent2:2, ') ',
            xsite:2, ' ');
      writechrom(lst, chrom, lchrom);
      writeln(lst, ' ', x:10, ' ', fitness:6:4);
    end;
  end;
  repchar(lst, '-', linelength); writeln(lst);
  { Generation statistics and accumulated values }
  writeln(lst, ' Note: Generation ', gen:2, ' & Accumulated Statistics: ',
          'max=', max:6:4, ', min=', min:6:4, ', avg=', avg:6:4, ', sum=',
          sumfitness:6:4, ', nmutation=', nmutation, ', ncross=', ncross);
  repchar(lst, '-', linelength); writeln(lst);
  page(lst);
end;

FIGURE 3.13 Procedures report and writechrom implement population reports.
HOW WELL DOES IT WORK?
We have trudged through the SGA code, step by step, inch by inch, gaining a
better feel for some of the ins and outs of genetic algorithm programming Of
course, this is not the only way to skin a GA cat, and a number of public domainHow Well Does It Work? n
codes are available with numerous bells and whistles for effective optimization
in a variety of domains (Booker and De Jong, 1985; De Jong, 1982; Grefenstette,
1984a, 1984b). Actually, we will be adding several important features of our own
in this and later chapters. Let us resist temptation and hold off the fancy features.
In this section, we stick with the bare-bones GA and sce how welll it works.
We have already specified our simple test problem. The bit string decodes as
an unsigned 30-bit integer. The fitness function f is the power function f(x) =
(x/c)", where c has been chosen to normalize x and m has been chosen as 10,
Some of you may cry foul, wondering why we have chosen a different function
from the one followed in Chapters 1 and 2 (f(x) = =), Actually, we have
changed the problem to make things tougher for the GA, as illustrated in Fig.
3.14, With the larger exponent, the average function value is lower, and a smaller
Proportion of the domain maps to values above some specified quantity. AS a
result, the random starting population will not contain very good points to begin;
this is a better test of GA performance.
To specify our computer simulations more precisely, let's choose a trial set of GA parameters. In De Jong's (1975) study of genetic algorithms in function optimization, a series of parametric studies across a five-function suite of problems suggested that good GA performance requires the choice of a high crossover probability, a low mutation probability (inversely proportional to the population size), and a moderate population size. Following these suggestions we adopt the following parameters for our first computer simulations:

pmutation = 0.0333  (probability of mutation)
pcross    = 0.6     (probability of crossover)
popsize   = 30      (population size, n)
FIGURE 3.14 Comparison of the functions x^2 and x^10 on the unit interval.
The string length for the hand simulations of a genetic algorithm in Chapter 1 was short (short by genetic algorithm standards): l = 5. This translated into a ridiculously small space with only 2^5 = 32 points, where there was little practical need for genetic search. Any enumerative search or random walk would have found good points quickly. Of course, at that time our aim was pedagogical clarity, and the size of the search space was of little interest. Now that we are interested in seeing a stiffer test of GA performance, the string length is increased and the exponent of the test function is increased. With a string length lchrom = 30, the search space is much larger and random walk or enumeration should not be so profitable. With lchrom = 30 there are 2^30 = 1.07(10^9) points. With over 1.07 billion points in the space, one-at-a-time methods are unlikely to do very much very quickly. Moreover, the increase in exponent has adjusted the space so that only 1.05 percent of the points have a value greater than 0.9, as shown in Fig. 3.14. These two modifications make the problem a better test of GA performance.
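The 1.05 percent figure is easy to verify: for x uniform on [0, 1], the fraction with x^10 > 0.9 is 1 - 0.9^(1/10). A two-line check in Python:

```python
# Fraction of the unit interval whose image under x**10 exceeds 0.9
p = 1.0 - 0.9 ** (1.0 / 10.0)
print(f"{100 * p:.2f} percent")  # about 1.05 percent
```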
We start the simple genetic algorithm and let it run for seven generations. The statistical report for the run is shown in Fig. 3.15 and the initial generation (gen = 0) and the first generation are shown side by side in Fig. 3.16. The initial population starts out with a population average fitness of 0.0347. The average fitness of the function on the specified interval may be calculated to be 0.0909. In some sense, we have been unlucky (but not unrealistically so) in our random choice of a population. Additionally, glancing at our best member fitness in the initial population, f_max = 0.2824, we should expect to have 30(1 - 0.2824^0.1) = 3.56 or approximately four strings in a random population of 30 with fitness greater than the best value actually observed.
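The expected-count arithmetic can be checked the same way (Python; 0.2824 is the observed best fitness from the initial report):

```python
# Expected number of members, out of 30 uniform-random strings, whose
# fitness x**10 exceeds the observed best value 0.2824
p_better = 1.0 - 0.2824 ** (1.0 / 10.0)
expected = 30 * p_better
print(round(expected, 2))  # about 3.56
```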
Population size (popsize)           =  30
Chromosome length (lchrom)          =  30
Maximum # of generations (maxgen)   =  10
Crossover probability (pcross)      =  6.0000000000E-01
Mutation probability (pmutation)    =  3.3300000000E-02

Initial population maximum fitness  =  2.8241322552E-01
Initial population average fitness  =  3.4713832780E-02
Initial population minimum fitness  =  9.3406159573E-10
Initial population sum of fitness   =  1.0414149834E+00

FIGURE 3.15 Initial report from an SGA, simple genetic algorithm, run.
[Population report tables, Figures 3.16 through 3.18: each report lists, for every one of the 30 members, the old string, its decoded x value, and its fitness alongside the new string with its parents, crossing site, x value, and fitness. The generation 7 statistics line records max = 0.9807, min = 0.0408, sum = 26.2997, with accumulated counts nmutation = 201 and ncross = 71.]

FIGURE 3.18 SGA run, generation report t = 6-7.
MAPPING OBJECTIVE FUNCTIONS TO FITNESS FORM

In many problems, the objective is more naturally stated as the minimization of some cost function g(x) rather than the maximization of some utility or profit function u(x). Even if the problem is naturally stated in maximization form, this alone does not guarantee that the utility function will be nonnegative for all x as we require in a fitness function (recall that a fitness function must be a nonnegative figure of merit). As a result, it is often necessary to map the underlying natural objective function to a fitness function form through one or more mappings.
The duality of cost minimization and profit maximization is well known. In normal operations research work, to transform a minimization problem to a maximization problem we simply multiply the cost function by a minus one. In genetic algorithm work, this operation alone is insufficient because the measure thus obtained is not guaranteed to be nonnegative in all instances. With GAs the following cost-to-fitness transformation is commonly used:

f(x) = C_max - g(x)   when g(x) < C_max,
     = 0              otherwise.

There are a variety of ways to choose the coefficient C_max. C_max may be taken as an input coefficient, as the largest g value observed thus far, as the largest g value in the current population, or as the largest of the last k generations. Perhaps more appropriately, C_max should vary depending on the population variance. We will consider this last possibility in Chapter 4.
When the natural objective function formulation is a profit or utility function we have no difficulty with the direction of the function: maximized profit or utility leads to desired performance. We may still have a problem with negative utility function u(x) values. To overcome this, we simply transform fitness according to the equation:

f(x) = u(x) + C_min   when u(x) + C_min > 0,
     = 0              otherwise.

We may choose C_min as an input coefficient, as the absolute value of the worst u value in the current or last k generations, or as a function of the population variance. We postpone further consideration of these possibilities to a later chapter.
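Both mappings are straightforward to encode; a minimal Python sketch of the two transformations:

```python
def cost_to_fitness(g, c_max):
    """Minimization: f = c_max - g when g < c_max, else 0."""
    return c_max - g if g < c_max else 0.0

def utility_to_fitness(u, c_min):
    """Maximization with possibly negative utility:
    f = u + c_min when that sum is positive, else 0."""
    return u + c_min if u + c_min > 0 else 0.0
```

Either way, every fitness the roulette wheel sees is guaranteed nonnegative.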
All this monkeying about with objective functions should arouse suspicion about the underlying relationship between objective functions and fitness functions. In nature, fitness (the number of offspring that survive to reproduction) is a tautology. Large numbers of offspring survive because they are fit, and they are fit because large numbers of offspring survive. Survival in natural populations is the ultimate and only taskmaster of any import. By contradistinction, in genetic algorithm work we have the opportunity and perhaps the duty to regulate the level of competition among members of the population to achieve the interim and ultimate algorithm performance we desire. This is precisely what we do when we perform fitness scaling.
FITNESS SCALING
Regulation of the number of copies is especially important in small population genetic algorithms. At the start of GA runs it is common to have a few extraordinary individuals in a population of mediocre colleagues. If left to the normal selection rule (pselect_i = f_i / sum of f_j), the extraordinary individuals would take over a