Preface..................................................... III
CHAPTER 1
Populations in Genetic Studies................................... 1
1.1 Commonly Used Populations in Genetic Studies ................. 2
1.1.1 Bi-Parental Populations.............................. 2
1.1.2 Multi-Parental Populations........................... 5
1.1.3 Considerations in Developing Genetic Populations .......... 9
1.2 Preliminary Analysis of Genotypic Data....................... 12
1.2.1 Collection and Coding of Genotypic Data................ 12
1.2.2 Gene Frequency and Genotypic Frequency................ 17
1.2.3 Fitness Test on Genotypic Frequencies................... 18
1.3 Genetic Effect and Genetic Variance.......................... 20
1.3.1 Calculation of Population Mean and Phenotypic Variance .... 20
1.3.2 One-Locus Additive and Dominance Model............... 23
1.3.3 Population Mean and Genetic Variance at One Locus ....... 24
1.4 ANOVA on Single Environment Trials........................ 27
1.4.1 Linear Decomposition on Phenotypic Observation .......... 27
1.4.2 Decomposition of Sum of Squares of Phenotypic Deviations ... 28
1.4.3 Single Environmental ANOVA on Rice Grain Length ........ 31
1.5 ANOVA on Multi-Environment Trials......................... 32
1.5.1 Linear Decomposition on Phenotypic Observation .......... 32
1.5.2 Decomposition of Sum of Squares of Phenotypic Deviations ... 33
1.5.3 Multi-Environmental ANOVA on Rice Grain Length ........ 38
1.6 Estimation of Genotypic Values and the Broad-Sense Heritability .... 39
1.6.1 Genotypic Values and Broad-Sense Heritability
from Single Environmental Trials....................... 39
1.6.2 Genotypic Values and Broad-Sense Heritability
from Multi-Environmental Trials....................... 41
1.6.3 Estimation of Genotypic Values Under Heterogeneous Error
Variances......................................... 42
Exercises................................................... 45
CHAPTER 2
Estimation of the Two-Point Recombination Frequencies ............... 51
2.1 Generation Transition Matrix............................... 51
2.1.1 Usefulness of the Transition Matrix in Linkage Analysis ...... 51
2.1.2 Transition Matrix of One Generation of Backcrossing ........ 53
2.1.3 Transition Matrix of One Generation of Selfing ............ 55
2.1.4 Transition Matrix of Doubled Haploid................... 58
2.1.5 Transition Matrix of Repeated Selfing................... 59
2.1.6 Expression of the Two-Locus Genotypic Frequencies
in Matrix Format................................... 61
2.2 Theoretical Genotypic Frequencies at Two Loci ................. 62
2.2.1 Theoretical Frequencies of 10 Genotypes at Two Loci ....... 62
2.2.2 Theoretical Frequencies of 4 Homozygotes in Permanent
Populations....................................... 65
2.2.3 Genotypic Frequencies of Two Co-Dominant Loci
in Temporary Populations ............................ 65
2.2.4 Genotypic Frequencies of One Co-Dominant Locus and One
Dominant Locus in Temporary Populations............... 69
2.2.5 Genotypic Frequencies of One Co-Dominant Locus and One
Recessive Locus in Temporary Populations................ 69
2.2.6 Genotypic Frequencies of Two Dominant Loci in Temporary
Populations....................................... 74
2.2.7 Genotypic Frequencies of One Dominant Locus and One
Recessive Locus in Temporary Populations................ 74
2.2.8 Genotypic Frequencies of Two Recessive Loci in Temporary
Populations....................................... 77
2.3 Estimation of Two-Point Recombination Frequency .............. 77
2.3.1 Maximum Likelihood Estimation of Recombination
Frequency in DH Populations.......................... 77
2.3.2 General Procedure on the Maximum Likelihood Estimation
of Recombination Frequency.......................... 81
2.3.3 Estimation of Recombination Frequency Between One
Co-Dominant and One Dominant Marker in F2 Population.... 86
2.3.4 Initial Values in Newton Algorithm..................... 87
2.3.5 EM Algorithm in Estimating Recombination Frequency in F2
Populations ....................................... 90
2.3.6 Effects on the Estimation of Recombination Frequency from
Segregation Distortion............................... 92
Exercises................................................... 95
CHAPTER 3
Three-Point Analysis and Linkage Map Construction.................. 101
3.1 Three-Point Analysis and Mapping Function.................... 102
3.1.1 Genetic Interference and Coefficient of Interference ......... 102
3.1.2 Mapping Function.................................. 105
3.2 Construction of Genetic Linkage Maps........................ 107
3.2.1 Marker Grouping Algorithm........................... 107
3.2.2 Marker Ordering Algorithm........................... 111
3.2.3 Use of the k-Optimal Algorithm in Linkage Map Construction . 113
3.2.4 Rippling of the Ordered Markers....................... 117
3.2.5 Integration of Multiple Maps.......................... 118
3.3 Comparison of the Recombination Frequency Estimation in Different
Populations............................................. 121
3.3.1 LOD Score in Testing the Linkage Relationship in Different
Populations....................................... 121
3.3.2 Accuracy of the Estimated Recombination Frequency ........ 123
3.3.3 Least Population Size to Declare the Significant Linkage
Relationship and Close Linkage........................ 124
3.4 Linkage Analysis in Random Mating Populations ................ 127
3.4.1 Linkage Dis-Equilibrium in Random Mating Populations ..... 127
3.4.2 Generation Transition Matrix from Diploid Genotypes
to Haploid Gametes................................. 130
3.4.3 Gametic and Genotypic Frequencies in Populations After
Several Generations of Random Mating.................. 132
Exercises................................................... 134
CHAPTER 4
Single Marker Analysis and Simple Interval Mapping .................. 139
4.1 Single Marker Analysis.................................... 140
4.1.1 Phenotypic Means of Different Genotypes at One Marker
Locus............................................ 140
4.1.2 Single Marker Analysis by t-Test in Populations with Two
Genotypes........................................ 143
4.1.3 Single Marker Analysis by t-Test in Populations with Three
Genotypes........................................ 146
4.1.4 ANOVA in Single Marker Analysis in Populations with Three
Genotypes........................................ 150
4.1.5 Likelihood Ratio Test in Single Marker Analysis............ 151
4.1.6 Problems with Single Marker Analysis................... 153
4.2 Simple Interval Mapping................................... 154
4.2.1 Frequencies of the QTL Genotypes in a Marker Interval ...... 154
4.2.2 Maximum Likelihood Estimation of Phenotypic Means
of QTL Genotypes.................................. 161
4.2.3 Testing for the Existence of QTL....................... 166
4.2.4 Estimation of Genetic Effects of QTL and Its Contribution
to Phenotypic Variance.............................. 167
4.2.5 Applications of Simple Interval Mapping in DH and F2
Populations....................................... 168
4.2.6 Phenomenon of ‘Ghost’ QTL in Simple Interval Mapping .... 171
4.2.7 Other Problems with Simple Interval Mapping ............. 172
4.3 Threshold Values of LOD Score in QTL Mapping ................ 174
4.3.1 Significance Level and Critical Value of One Test Statistic .... 174
4.3.2 Distribution of the LRT Statistic at Single Scanning Positions
in the Absence of Any QTL........................... 176
4.3.3 Factors Affecting the Distribution of the Genome-Wide
Largest LOD Score................................. 177
4.3.4 Number of Effective Tests and the Empirical LOD Score
Thresholds in QTL Mapping.......................... 180
4.3.5 Permutation Test and the Empirical LOD Score Thresholds
in QTL Mapping................................... 184
Exercises................................................... 189
CHAPTER 5
Inclusive Composite Interval Mapping............................. 195
5.1 Importance of the Control on Background Genetic Variation in QTL
Mapping............................................... 196
5.2 Inclusive Composite Interval Mapping in DH Populations .......... 199
5.2.1 Additive Genetic Model of One Single QTL............... 199
5.2.2 Additive Genetic Model for Multiple QTLs............... 201
5.2.3 One-Dimensional Scanning and Hypothesis Testing
for Additive QTLs.................................. 202
5.2.4 Application of ICIM in a DH Mapping Population in Barley . . 204
5.3 Inclusive Composite Interval Mapping in F2 Populations ........... 208
5.3.1 Additive and Dominant Model of One Single QTL .......... 208
5.3.2 Additive and Dominant Model for Multiple QTLs .......... 212
5.3.3 One-Dimensional Scanning and Hypothesis Testing
in Additive and Dominant QTL Mapping................ 213
5.3.4 Application of ICIM in an F2 Mapping Population .......... 214
5.4 Type II Error in Hypothesis Testing and Statistical Power in QTL
Detection.............................................. 216
5.4.1 Type II Error and Statistical Power in Hypothesis Testing .... 216
5.4.2 Probability of Two Types of Error and the Appropriate
Sample Size....................................... 220
5.4.3 Distribution and Effect Models of QTLs Used in Power
Analysis by Simulations.............................. 222
5.4.4 Calculation of the Detection Power and False Discovery Rate
in QTL Mapping ...................................224
5.5 Comparison of IM and ICIM by Simulation.................... 230
5.5.1 QTL Detection Power and FDR from IM................. 230
5.5.2 QTL Detection Power and FDR from ICIM............... 232
5.5.3 Detection Powers Counted by Marker Intervals ............ 234
5.5.4 Suitable Population Size Required in QTL Mapping ......... 235
5.6 Avoiding the Overfitting Problem in the First Step of Model
Selection in ICIM........................................ 237
Exercises ...................................................240
CHAPTER 6
QTL Mapping for Epistasis and Genotype-by-Environment Interaction ..... 245
6.1 Epistatic QTL Mapping in DH Populations.................... 246
6.1.1 Linear Regression in Epistatic QTL Mapping and the
Statistical Properties................................ 246
6.1.2 Two-Dimensional Scanning on Di-Genic Epistatic QTLs ..... 248
6.1.3 Genetic Variance on Epistatic QTLs with Linkage .......... 253
6.1.4 Simulation Study on Epistatic QTL Mapping in DH
Populations....................................... 254
6.2 Epistatic QTL Mapping in F2 Populations..................... 257
6.2.1 The Di-Genic Epistasis Model in F2 Populations ........... 257
6.2.2 Epistatic QTL Mapping Procedure in F2 Population ........ 258
6.2.3 Detection Power of Epistatic QTLs in F2 Populations ....... 265
6.3 Genetic Analysis and Detection Power of the Most Common
Di-Genic Interactions..................................... 268
6.3.1 Genetic Effects in Di-Genic Interactions.................. 268
6.3.2 Decomposition of Genetic Variance at the Presence
of Di-Genic Epistasis................................ 270
6.3.3 Power Simulation of Epistatic QTL Mapping.............. 276
6.3.4 Issues in Epistatic QTL Mapping....................... 281
6.4 Mapping of the QTL by Environment Interactions ............... 282
6.4.1 Mapping of the Additive QTL by Environment Interactions ... 282
6.4.2 Mapping of the Epistatic QTL and Environment Interactions . 285
6.4.3 QTL and Environment Interactions in One Actual RIL
Population in Maize................................. 287
Exercises................................................... 291
CHAPTER 7
Genetic Analysis in Hybrid F1 of Two Heterozygous Parents
and Double-Cross F1 of Four Homozygous Parents.................... 295
7.1 Linkage Analysis in the Hybrid F1 Derived from Two Heterozygous
Parents................................................ 296
7.1.1 Categories of Polymorphism Markers .................... 296
7.1.2 Unknown Linkage Phases in Heterozygous Parents
and Genotypes in Their F1 Progenies at Two Loci.......... 298
7.1.3 Estimation of the Recombination Frequency Between Two
Fully-Informative Markers............................ 299
7.1.4 Haploid Type Rebuilding in the Heterozygous Parents ....... 302
7.2 Estimation of the Recombination Frequency for Incompletely
Informative Markers...................................... 305
7.2.1 Theoretical Frequencies of Identifiable Genotypes Between
the Complete Marker and Other Three Categories of Markers . 306
7.2.2 Theoretical Frequencies of Identifiable Genotypes Between
Two Markers Belonging to Category II, III, or IV .......... 308
7.2.3 Theoretical Frequencies of Identifiable Genotypes Between
Two Category IV Markers............................ 311
7.2.4 Haploid Type Rebuilding at the Presence of All Categories
of Markers........................................ 316
7.3 Linkage Analysis in Double Cross F1 Derived from Four Pure-Line
Parents................................................ 319
7.3.1 Marker Categories and Estimation of Recombination
Frequency in the Double Cross F1 Population............. 319
7.3.2 Equivalence Between the Double Cross F1 of Pure-Line
Parents and Hybrid F1 of Heterozygous Parents............ 323
7.3.3 Genotypic Frequencies at Three Complete Markers ......... 325
7.3.4 Imputation of Incomplete and Missing Marker Information ... 327
7.4 QTL Mapping in the Double Cross F1 Population Derived from Four
Pure-Line Parents........................................ 331
7.4.1 One-QTL Genetic Model in Double Cross F1 Population ..... 332
7.4.2 The Linear Regression Model of the Phenotype on Marker
Type for Multiple QTLs.............................. 335
7.4.3 Inclusive Composite Interval Mapping (ICIM) in the Double
Cross F1 Population................................. 336
Exercises................................................... 338
CHAPTER 8
Genetic Analysis in Multi-Parental Pure-Line Progeny Populations........ 345
8.1 Linkage Analysis in Four-Parental Pure-Line Populations .......... 346
8.1.1 Development Procedure and Marker Classification
in Four-Parental Pure-Line Populations.................. 346
8.1.2 Theoretical Frequencies of Genotypes and Estimation
of Recombination Frequency at Two Complete Loci......... 348
8.1.3 Estimation of the Recombination Frequency Involving
Incomplete Markers................................. 355
8.1.4 Situations When Number of Inbred Parents Smaller Than
Four............................................. 359
8.2 Linkage Analysis in Eight-Parental Pure-Line Populations ......... 360
8.2.1 Development Procedure and Marker Classification
in Eight-Parental Pure-Line Populations................. 360
8.2.2 Marker Classification and Genotypic Coding
in Eight-Parental Pure-Line Populations................. 362
8.2.3 Theoretical Frequencies of Genotypes at Two Complete Loci . . 363
8.2.4 Estimation of the Recombination Frequency Between Any
Two Categories of Markers............................ 368
8.2.5 Situations When the Number of Inbred Parents Smaller Than
Eight............................................ 370
8.3 QTL Mapping in Four-Parental Pure-Line Populations ............ 370
8.3.1 Genetic Constitution at Three Complete Loci ............. 371
8.3.2 Imputation of the Incomplete and Missing Marker
Information....................................... 372
8.3.3 The Linear Regression Model of Phenotype on Marker Types. . 378
8.3.4 Inclusive Composite Interval Mapping (ICIM)
in Four-Parental Pure-Line Populations.................. 380
8.4 QTL Mapping in Eight-Parental Pure-Line Populations ........... 383
8.4.1 Genetic Constitution at Three Complete Loci ............. 383
8.4.2 The Linear Regression Model of Phenotype on Marker Types. . 389
8.4.3 Inclusive Composite Interval Mapping (ICIM)
in Eight-Parental Pure-Line Populations................. 391
Exercises................................................... 394
CHAPTER 9
QTL Mapping in Other Genetic Populations........................ 399
9.1 Selective Genotyping Analysis and Bulked Segregant Analysis....... 400
9.1.1 Statistical Principles of Selective Genotyping Analysis ....... 400
9.1.2 Likelihood Ratio Test and LOD Score Statistics from Selective
Genotyping Analysis................................ 402
9.1.3 Bulked Segregant Analysis............................ 403
9.1.4 Problems with Selective Genotyping Analysis and Bulked
Segregant Analysis.................................. 404
9.2 QTL Mapping in Populations of Chromosomal Segment Substitution
Lines.................................................. 404
9.2.1 Characteristics of Chromosomal Segment Substitution Lines . . 404
9.2.2 Mapping Methods in Populations of Chromosomal Segment
Substitution Lines.................................. 407
9.2.3 QTL Mapping for Grain Length in a CSSL Population
in Rice........................................... 412
9.3 QTL Mapping in Genetic Populations of Multiple Parents Crossed
with One Common Parent................................. 414
9.3.1 Generalized Linear Regression and Model Selection ......... 415
9.3.2 Parameter Estimation and Hypothesis Testing in JICIM ..... 415
9.3.3 QTL Mapping for Flowering Time in an Arabidopsis NAM
Population........................................ 417
9.4 Mendelization of Quantitative Trait Genes..................... 419
9.4.1 Preliminary Mapping of One QTL on Grain Width of Rice in
One RIL Population................................. 420
9.4.2 Validation of the Grain Width QTL by Chromosomal
Segment Substitution Lines........................... 421
9.4.3 Mendelization of a Stable QTL on Grain Width ............ 424
9.4.4 Fine Mapping and Functional Analysis of the Gene at a Stable
Grain Width QTL.................................. 425
9.5 Association Mapping in Natural Populations.................... 427
9.5.1 Linkage Disequilibrium is the Prerequisite of Gene Mapping ... 427
9.5.2 Linkage Disequilibrium in Random Mating Populations ...... 429
9.5.3 Factors Influencing Linkage Disequilibrium ............... 432
9.5.4 Comparison of Linkage and Association Approaches in Gene
Mapping......................................... 435
Exercises................................................... 439
CHAPTER 10
More on the Frequently Asked Questions in QTL Mapping .............. 443
10.1 Genetic Variance and Contribution to Phenotypic Variation of the
Detected QTL.......................................... 443
10.1.1 Genetic Variance and Phenotypic Contribution from One
QTL .......................................... 443
10.1.2 Genetic Variance and Phenotypic Contribution of Linked
QTLs .......................................... 445
10.1.3 Phenotypic Contribution and the QTL Detection Power.... 448
10.2 On the Use of Composite Traits in QTL Mapping ............... 450
10.2.1 Composite Traits and Their Applications in Genetic
Studies and Breeding.............................. 450
10.2.2 QTL Mapping on Component and Composite Traits in One
Maize RIL Population............................. 451
10.2.3 Genetic Effects and Genetic Variances on Composite Traits . 455
10.2.4 Power Analysis in QTL Mapping on Composite Traits ..... 461
10.2.5 Heritability of Composite Traits...................... 465
10.3 Effects on QTL Detection by the Increase in Marker Density ....... 470
10.3.1 Effects of Denser Markers on Independent QTLs ......... 470
10.3.2 Effect of Denser Markers on Linked QTLs.............. 471
10.4 Imputation of Missing Marker Types and Their Effects in QTL
Mapping in Bi-Parental Populations......................... 474
10.4.1 Imputation of Missing and Incomplete Marker Types ...... 474
10.4.2 QTLs on Plant Height in an F2 Population in Rice ........ 477
10.4.3 Effects of Missing Marker Types on QTL Detection ....... 479
10.5 Effects of Segregation Distortion on Genetic Studies ............. 481
10.5.1 Segregation Distortion Loci in One Rice F2 Population ..... 481
10.5.2 Effects of Segregation Distortion on QTL Mapping
in Populations with Three Genotypes at Each Locus ...... 482
10.5.3 Genetic Distance That can be Affected by Segregation
Distortion....................................... 486
10.5.4 Effects of Segregation Distortion on QTL Mapping
in Populations with Two Genotypes at Each Locus....... 487
10.6 Non-Normality of the Phenotypic Distribution ................. 488
10.6.1 Phenotypic Model and Distribution of Quantitative Traits . . 488
10.6.2 QTL Mapping on Phenotypic Traits of the Non-Normal
Distributions.................................... 489
Exercises................................................... 492
References.................................................. 495
Index...................................................... 503
Appendix A: Journal Articles Making Up This Book .................. 509
Appendix B: Dissertations of Post-Graduates Making Up This Book ...... 513
Appendix C: Integrated Software Packages Making Up This Book ........ 515