EDP Sciences

Linkage Analysis and Gene Mapping

by Jiankang WANG (author), Huihui LI (author), Luyan ZHANG (author)
July 2023
530 pages Download after purchase
120,99 €

Presentation

One major task of genetic studies is to construct a genetic map through linkage analysis, locate the genetic loci of important traits on the constructed linkage map, identify favorable alleles that are of value to human beings, and investigate their biochemical pathways from genotype to phenotype. This book presents linkage analysis and gene mapping methodologies that are applicable to self-pollinated, cross-pollinated, and asexually propagated species, and to genetic populations derived from two homozygous parents, two heterozygous parents, and multiple homozygous parents. Chapter 1 begins with genetic mating designs and various types of genetic populations, followed by the structure of commonly used populations and methods for analyzing phenotypic data. Chapters 2 and 3 introduce the estimation of recombination frequency and the construction of linkage maps for twenty bi-parental populations. Chapter 4 deals with two classical gene mapping methods in which no background control is considered, while chapters 5 and 6 describe inclusive composite interval mapping (ICIM) with background control. Chapter 7 focuses on populations derived from two heterozygous parents, which can be two individuals in a random mating population, two clonal cultivars, or two single crosses from four homozygous inbred lines. Chapter 8 covers the pure-line progeny populations derived from four to eight homozygous parents. The last two chapters cover populations, methods, and commonly asked questions in genetic mapping that could not be included in the earlier chapters. This book is intended for readers working on plant and animal genetics, population and quantitative genetics, and plant and animal breeding.

Contents

Preface ..... III

CHAPTER 1

Populations in Genetic Studies ..... 1

1.1 Commonly Used Populations in Genetic Studies ..... 2

1.1.1 Bi-Parental Populations ..... 2

1.1.2 Multi-Parental Populations ..... 5

1.1.3 Considerations in Developing Genetic Populations ..... 9

1.2 Preliminary Analysis of Genotypic Data ..... 12

1.2.1 Collection and Coding of Genotypic Data ..... 12

1.2.2 Gene Frequency and Genotypic Frequency ..... 17

1.2.3 Fitness Test on Genotypic Frequencies ..... 18

1.3 Genetic Effect and Genetic Variance ..... 20

1.3.1 Calculation of Population Mean and Phenotypic Variance ..... 20

1.3.2 One-Locus Additive and Dominance Model ..... 23

1.3.3 Population Mean and Genetic Variance at One Locus ..... 24

1.4 ANOVA on Single Environment Trials ..... 27

1.4.1 Linear Decomposition on Phenotypic Observation ..... 27

1.4.2 Decomposition of Sum of Squares of Phenotypic Deviations ..... 28

1.4.3 Single Environmental ANOVA on Rice Grain Length ..... 31

1.5 ANOVA on Multi-Environment Trials ..... 32

1.5.1 Linear Decomposition on Phenotypic Observation ..... 32

1.5.2 Decomposition of Sum of Squares of Phenotypic Deviations ..... 33

1.5.3 Multi-Environmental ANOVA on Rice Grain Length ..... 38

1.6 Estimation of Genotypic Values and the Broad-Sense Heritability ..... 39

1.6.1 Genotypic Values and Broad-Sense Heritability from Single Environmental Trials ..... 39

1.6.2 Genotypic Values and Broad-Sense Heritability from Multi-Environmental Trials ..... 41

1.6.3 Estimation of Genotypic Values Under Heterogeneous Error Variances ..... 42

Exercises ..... 45

CHAPTER 2

Estimation of the Two-Point Recombination Frequencies ..... 51

2.1 Generation Transition Matrix ..... 51

2.1.1 Usefulness of the Transition Matrix in Linkage Analysis ..... 51

2.1.2 Transition Matrix of One Generation of Backcrossing ..... 53

2.1.3 Transition Matrix of One Generation of Selfing ..... 55

2.1.4 Transition Matrix of Doubled Haploid ..... 58

2.1.5 Transition Matrix of Repeated Selfing ..... 59

2.1.6 Expression of the Two-Locus Genotypic Frequencies in Matrix Format ..... 61

2.2 Theoretical Genotypic Frequencies at Two Loci ..... 62

2.2.1 Theoretical Frequencies of 10 Genotypes at Two Loci ..... 62

2.2.2 Theoretical Frequencies of 4 Homozygotes in Permanent Populations ..... 65

2.2.3 Genotypic Frequencies of Two Co-Dominant Loci in Temporary Populations ..... 65

2.2.4 Genotypic Frequencies of One Co-Dominant Locus and One Dominant Locus in Temporary Populations ..... 69

2.2.5 Genotypic Frequencies of One Co-Dominant Locus and One Recessive Locus in Temporary Populations ..... 69

2.2.6 Genotypic Frequencies of Two Dominant Loci in Temporary Populations ..... 74

2.2.7 Genotypic Frequencies of One Dominant Locus and One Recessive Locus in Temporary Populations ..... 74

2.2.8 Genotypic Frequencies of Two Recessive Loci in Temporary Populations ..... 77

2.3 Estimation of Two-Point Recombination Frequency ..... 77

2.3.1 Maximum Likelihood Estimation of Recombination Frequency in DH Populations ..... 77

2.3.2 General Procedure on the Maximum Likelihood Estimation of Recombination Frequency ..... 81

2.3.3 Estimation of Recombination Frequency Between One Co-Dominant and One Dominant Marker in F2 Population ..... 86

2.3.4 Initial Values in Newton Algorithm ..... 87

2.3.5 EM Algorithm in Estimating Recombination Frequency in F2 Populations ..... 90

2.3.6 Effects on the Estimation of Recombination Frequency from Segregation Distortion ..... 92

Exercises ..... 95

CHAPTER 3

Three-Point Analysis and Linkage Map Construction ..... 101

3.1 Three-Point Analysis and Mapping Function ..... 102

3.1.1 Genetic Interference and Coefficient of Interference ..... 102

3.1.2 Mapping Function ..... 105

3.2 Construction of Genetic Linkage Maps ..... 107

3.2.1 Marker Grouping Algorithm ..... 107

3.2.2 Marker Ordering Algorithm ..... 111

3.2.3 Use of the k-Optimal Algorithm in Linkage Map Construction ..... 113

3.2.4 Rippling of the Ordered Markers ..... 117

3.2.5 Integration of Multiple Maps ..... 118

3.3 Comparison of the Recombination Frequency Estimation in Different Populations ..... 121

3.3.1 LOD Score in Testing the Linkage Relationship in Different Populations ..... 121

3.3.2 Accuracy of the Estimated Recombination Frequency ..... 123

3.3.3 Least Population Size to Declare the Significant Linkage Relationship and Close Linkage ..... 124

3.4 Linkage Analysis in Random Mating Populations ..... 127

3.4.1 Linkage Dis-Equilibrium in Random Mating Populations ..... 127

3.4.2 Generation Transition Matrix from Diploid Genotypes to Haploid Gametes ..... 130

3.4.3 Gametic and Genotypic Frequencies in Populations After Several Generations of Random Mating ..... 132

Exercises ..... 134

CHAPTER 4

Single Marker Analysis and Simple Interval Mapping ..... 139

4.1 Single Marker Analysis ..... 140

4.1.1 Phenotypic Means of Different Genotypes at One Marker Locus ..... 140

4.1.2 Single Marker Analysis by t-Test in Populations with Two Genotypes ..... 143

4.1.3 Single Marker Analysis by t-Test in Populations with Three Genotypes ..... 146

4.1.4 ANOVA in Single Marker Analysis in Populations with Three Genotypes ..... 150

4.1.5 Likelihood Ratio Test in Single Marker Analysis ..... 151

4.1.6 Problems with Single Marker Analysis ..... 153

4.2 Simple Interval Mapping ..... 154

4.2.1 Frequencies of the QTL Genotypes in a Marker Interval ..... 154

4.2.2 Maximum Likelihood Estimation of Phenotypic Means of QTL Genotypes ..... 161

4.2.3 Testing for the Existence of QTL ..... 166

4.2.4 Estimation of Genetic Effects of QTL and Its Contribution to Phenotypic Variance ..... 167

4.2.5 Applications of Simple Interval Mapping in DH and F2 Populations ..... 168

4.2.6 Phenomenon of ‘Ghost’ QTL in Simple Interval Mapping ..... 171

4.2.7 Other Problems with Simple Interval Mapping ..... 172

4.3 Threshold Values of LOD Score in QTL Mapping ..... 174

4.3.1 Significance Level and Critical Value of One Test Statistic ..... 174

4.3.2 Distribution of the LRT Statistic at Single Scanning Positions in the Absence of Any QTL ..... 176

4.3.3 Factors Affecting the Distribution of the Genome-Wide Largest LOD Score ..... 177

4.3.4 Number of Effective Tests and the Empirical LOD Score Thresholds in QTL Mapping ..... 180

4.3.5 Permutation Test and the Empirical LOD Score Thresholds in QTL Mapping ..... 184

Exercises ..... 189

CHAPTER 5

Inclusive Composite Interval Mapping ..... 195

5.1 Importance of the Control on Background Genetic Variation in QTL Mapping ..... 196

5.2 Inclusive Composite Interval Mapping in DH Populations ..... 199

5.2.1 Additive Genetic Model of One Single QTL ..... 199

5.2.2 Additive Genetic Model for Multiple QTLs ..... 201

5.2.3 One-Dimensional Scanning and Hypothesis Testing for Additive QTLs ..... 202

5.2.4 Application of ICIM in a DH Mapping Population in Barley ..... 204

5.3 Inclusive Composite Interval Mapping in F2 Populations ..... 208

5.3.1 Additive and Dominant Model of One Single QTL ..... 208

5.3.2 Additive and Dominant Model for Multiple QTLs ..... 212

5.3.3 One-Dimensional Scanning and Hypothesis Testing in Additive and Dominant QTL Mapping ..... 213

5.3.4 Application of ICIM in an F2 Mapping Population ..... 214

5.4 Type II Error in Hypothesis Testing and Statistical Power in QTL Detection ..... 216

5.4.1 Type II Error and Statistical Power in Hypothesis Testing ..... 216

5.4.2 Probability of Two Types of Error and the Appropriate Sample Size ..... 220

5.4.3 Distribution and Effect Models of QTLs Used in Power Analysis by Simulations ..... 222

5.4.4 Calculation of the Detection Power and False Discovery Rate in QTL Mapping ..... 224

5.5 Comparison of IM and ICIM by Simulation ..... 230

5.5.1 QTL Detection Power and FDR from IM ..... 230

5.5.2 QTL Detection Power and FDR from ICIM ..... 232

5.5.3 Detection Powers Counted by Marker Intervals ..... 234

5.5.4 Suitable Population Size Required in QTL Mapping ..... 235

5.6 Avoiding the Overfitting Problem in the First Step of Model Selection in ICIM ..... 237

Exercises ..... 240

CHAPTER 6

QTL Mapping for Epistasis and Genotype-by-Environment Interaction ..... 245

6.1 Epistatic QTL Mapping in DH Populations ..... 246

6.1.1 Linear Regression in Epistatic QTL Mapping and the Statistical Properties ..... 246

6.1.2 Two-Dimensional Scanning on Di-Genic Epistatic QTLs ..... 248

6.1.3 Genetic Variance on Epistatic QTLs with Linkage ..... 253

6.1.4 Simulation Study on Epistatic QTL Mapping in DH Populations ..... 254

6.2 Epistatic QTL Mapping in F2 Populations ..... 257

6.2.1 The Di-Genic Epistasis Model in F2 Populations ..... 257

6.2.2 Epistatic QTL Mapping Procedure in F2 Population ..... 258

6.2.3 Detection Power of Epistatic QTLs in F2 Populations ..... 265

6.3 Genetic Analysis and Detection Power of the Most Common Di-Genic Interactions ..... 268

6.3.1 Genetic Effects in Di-Genic Interactions ..... 268

6.3.2 Decomposition of Genetic Variance at the Presence of Di-Genic Epistasis ..... 270

6.3.3 Power Simulation of Epistatic QTL Mapping ..... 276

6.3.4 Issues in Epistatic QTL Mapping ..... 281

6.4 Mapping of the QTL by Environment Interactions ..... 282

6.4.1 Mapping of the Additive QTL by Environment Interactions ..... 282

6.4.2 Mapping of the Epistatic QTL and Environment Interactions ..... 285

6.4.3 QTL and Environment Interactions in One Actual RIL Population in Maize ..... 287

Exercises ..... 291

CHAPTER 7

Genetic Analysis in Hybrid F1 of Two Heterozygous Parents and Double-Cross F1 of Four Homozygous Parents ..... 295

7.1 Linkage Analysis in the Hybrid F1 Derived from Two Heterozygous Parents ..... 296

7.1.1 Categories of Polymorphism Markers ..... 296

7.1.2 Unknown Linkage Phases in Heterozygous Parents and Genotypes in Their F1 Progenies at Two Loci ..... 298

7.1.3 Estimation of the Recombination Frequency Between Two Fully-Informative Markers ..... 299

7.1.4 Haploid Type Rebuilding in the Heterozygous Parents ..... 302

7.2 Estimation of the Recombination Frequency for Incompletely Informative Markers ..... 305

7.2.1 Theoretical Frequencies of Identifiable Genotypes Between the Complete Marker and Other Three Categories of Markers ..... 306

7.2.2 Theoretical Frequencies of Identifiable Genotypes Between Two Markers Belonging to Category II, III, or IV ..... 308

7.2.3 Theoretical Frequencies of Identifiable Genotypes Between Two Category IV Markers ..... 311

7.2.4 Haploid Type Rebuilding at the Presence of All Categories of Markers ..... 316

7.3 Linkage Analysis in Double Cross F1 Derived from Four Pure-Line Parents ..... 319

7.3.1 Marker Categories and Estimation of Recombination Frequency in the Double Cross F1 Population ..... 319

7.3.2 Equivalence Between the Double Cross F1 of Pure-Line Parents and Hybrid F1 of Heterozygous Parents ..... 323

7.3.3 Genotypic Frequencies at Three Complete Markers ..... 325

7.3.4 Imputation of Incomplete and Missing Marker Information ..... 327

7.4 QTL Mapping in the Double Cross F1 Population Derived from Four Pure-Line Parents ..... 331

7.4.1 One-QTL Genetic Model in Double Cross F1 Population ..... 332

7.4.2 The Linear Regression Model of the Phenotype on Marker Type for Multiple QTLs ..... 335

7.4.3 Inclusive Composite Interval Mapping (ICIM) in the Double Cross F1 Population ..... 336

Exercises ..... 338

CHAPTER 8

Genetic Analysis in Multi-Parental Pure-Line Progeny Populations ..... 345

8.1 Linkage Analysis in Four-Parental Pure-Line Populations ..... 346

8.1.1 Development Procedure and Marker Classification in Four-Parental Pure-Line Populations ..... 346

8.1.2 Theoretical Frequencies of Genotypes and Estimation of Recombination Frequency at Two Complete Loci ..... 348

8.1.3 Estimation of the Recombination Frequency Involving Incomplete Markers ..... 355

8.1.4 Situations When Number of Inbred Parents Smaller Than Four ..... 359

8.2 Linkage Analysis in Eight-Parental Pure-Line Populations ..... 360

8.2.1 Development Procedure and Marker Classification in Eight-Parental Pure-Line Populations ..... 360

8.2.2 Marker Classification and Genotypic Coding in Eight-Parental Pure-Line Populations ..... 362

8.2.3 Theoretical Frequencies of Genotypes at Two Complete Loci ..... 363

8.2.4 Estimation of the Recombination Frequency Between Any Two Categories of Markers ..... 368

8.2.5 Situations When the Number of Inbred Parents Smaller Than Eight ..... 370

8.3 QTL Mapping in Four-Parental Pure-Line Populations ..... 370

8.3.1 Genetic Constitution at Three Complete Loci ..... 371

8.3.2 Imputation of the Incomplete and Missing Marker Information ..... 372

8.3.3 The Linear Regression Model of Phenotype on Marker Types ..... 378

8.3.4 Inclusive Composite Interval Mapping (ICIM) in Four-Parental Pure-Line Populations ..... 380

8.4 QTL Mapping in Eight-Parental Pure-Line Populations ..... 383

8.4.1 Genetic Constitution at Three Complete Loci ..... 383

8.4.2 The Linear Regression Model of Phenotype on Marker Types ..... 389

8.4.3 Inclusive Composite Interval Mapping (ICIM) in Eight-Parental Pure-Line Populations ..... 391

Exercises ..... 394

CHAPTER 9

QTL Mapping in Other Genetic Populations ..... 399

9.1 Selective Genotyping Analysis and Bulked Segregant Analysis ..... 400

9.1.1 Statistical Principles of Selective Genotyping Analysis ..... 400

9.1.2 Likelihood Ratio Test and LOD Score Statistics from Selective Genotyping Analysis ..... 402

9.1.3 Bulked Segregant Analysis ..... 403

9.1.4 Problems with Selective Genotyping Analysis and Bulked Segregant Analysis ..... 404

9.2 QTL Mapping in Populations of Chromosomal Segment Substitution Lines ..... 404

9.2.1 Characteristics of Chromosomal Segment Substitution Lines ..... 404

9.2.2 Mapping Methods in Populations of Chromosomal Segment Substitution Lines ..... 407

9.2.3 QTL Mapping for Grain Length in a CSSL Population in Rice ..... 412

9.3 QTL Mapping in Genetic Populations of Multiple Parents Crossed with One Common Parent ..... 414

9.3.1 Generalized Linear Regression and Model Selection ..... 415

9.3.2 Parameter Estimation and Hypothesis Testing in JICIM ..... 415

9.3.3 QTL Mapping for Flowering Time in an Arabidopsis NAM Population ..... 417

9.4 Mendelization of Quantitative Trait Genes ..... 419

9.4.1 Preliminary Mapping of One QTL on Grain Width of Rice in One RIL Population ..... 420

9.4.2 Validation of the Grain Width QTL by Chromosomal Segment Substitution Lines ..... 421

9.4.3 Mendelization of a Stable QTL on Grain Width ..... 424

9.4.4 Fine Mapping and Functional Analysis of the Gene at a Stable Grain Width QTL ..... 425

9.5 Association Mapping in Natural Populations ..... 427

9.5.1 Linkage Disequilibrium is the Prerequisite of Gene Mapping ..... 427

9.5.2 Linkage Disequilibrium in Random Mating Populations ..... 429

9.5.3 Factors Influencing Linkage Disequilibrium ..... 432

9.5.4 Comparison of Linkage and Association Approaches in Gene Mapping ..... 435

Exercises ..... 439

CHAPTER 10

More on the Frequently Asked Questions in QTL Mapping ..... 443

10.1 Genetic Variance and Contribution to Phenotypic Variation of the Detected QTL ..... 443

10.1.1 Genetic Variance and Phenotypic Contribution from One QTL ..... 443

10.1.2 Genetic Variance and Phenotypic Contribution of Linked QTLs ..... 445

10.1.3 Phenotypic Contribution and the QTL Detection Power ..... 448

10.2 On the Use of Composite Traits in QTL Mapping ..... 450

10.2.1 Composite Traits and Their Applications in Genetic Studies and Breeding ..... 450

10.2.2 QTL Mapping on Component and Composite Traits in One Maize RIL Population ..... 451

10.2.3 Genetic Effects and Genetic Variances on Composite Traits ..... 455

10.2.4 Power Analysis in QTL Mapping on Composite Traits ..... 461

10.2.5 Heritability of Composite Traits ..... 465

10.3 Effects on QTL Detection by the Increase in Marker Density ..... 470

10.3.1 Effects of Denser Markers on Independent QTLs ..... 470

10.3.2 Effect of Denser Markers on Linked QTLs ..... 471

10.4 Imputation of Missing Marker Types and Their Effects in QTL Mapping in Bi-Parental Populations ..... 474

10.4.1 Imputation of Missing and Incomplete Marker Types ..... 474

10.4.2 QTLs on Plant Height in an F2 Population in Rice ..... 477

10.4.3 Effects of Missing Marker Types on QTL Detection ..... 479

10.5 Effects of Segregation Distortion on Genetic Studies ..... 481

10.5.1 Segregation Distortion Loci in One Rice F2 Population ..... 481

10.5.2 Effects of Segregation Distortion on QTL Mapping in Populations with Three Genotypes at Each Locus ..... 482

10.5.3 Genetic Distance That can be Affected by Segregation Distortion ..... 486

10.5.4 Effects of Segregation Distortion on QTL Mapping in Populations with Two Genotypes at Each Locus ..... 487

10.6 Non-Normality of the Phenotypic Distribution ..... 488

10.6.1 Phenotypic Model and Distribution of Quantitative Traits ..... 488

10.6.2 QTL Mapping on Phenotypic Traits of the Non-Normal Distributions ..... 489

Exercises ..... 492

References ..... 495

Index ..... 503

Appendix A: Journal Articles Making Up This Book ..... 509

Appendix B: Dissertations of Post-Graduates Making Up This Book ..... 513

Appendix C: Integrated Software Packages Making Up This Book ..... 515

Additional Information

Characteristics

Language(s): English

Audience(s): Research, Students

Publisher: EDP Sciences & Science Press

Collection: Current Natural Sciences

Published: 7 July 2023

EAN13 (hardcopy): 9782759830428

Reference eBook [PDF]: L30435

EAN13 eBook [PDF]: 9782759830435

Interior: Colour

Pages count eBook [PDF]: 530

Size: 14.4 MB (PDF)
