
Association analysis revealed importance of dominance effects on days to silk of maize nested association mapping (NAM) population

2017-05-19 MONIR Md. Mamun, ZHU Jun




MONIR Md. Mamun, ZHU Jun* (Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China)

Summary: A full model and a multi-locus additive model were used to analyze days to silk (DS, female flowering) in the maize nested association mapping (NAM) population. Analysis with the full model revealed that DS in the maize NAM population was controlled by many loci with small additive, dominance, and epistasis effects and their environmental interactions. Dominance-related effects had a large impact on the trait: the estimated total heritability was 79.86%, of which 50.52% was due to dominance-related effects. Environment-specific genetic effects were also important for DS, explaining 27.31% of the phenotypic variation. The full model identified 50 highly significant (-log10 PEW > 5) quantitative trait SNPs (QTSs), whereas the additive model identified 47 with low heritability (31.65%). Using the association analysis results for DS, the genotypes and total genetic effects of superior lines and superior hybrids were predicted, which could be useful for future breeding programs.

Key words: genome-wide association study; maize; days to silk; dominance effects

Flowering time is an important trait that reflects the capability of plants to adapt to local environments[1-2]. The transition from vegetative growth to flowering, which integrates different environmental cues, is crucial for plant reproductive success[3]. Flowering time is considered a major selection criterion in plant breeding[4]. Maize originated from Balsas teosinte (Zea mays ssp. parviglumis) in the Mexican highlands approximately 9 000 years ago and has since adapted to diverse ecological conditions[1]. Dissecting the genetic mechanisms of maize flowering time is crucial for evolutionary analysis and future breeding programs. Several studies have investigated the genetic architecture of maize flowering time by using quantitative trait locus (QTL) mapping and genome-wide association studies (GWAS)[1-2,5].

Dominance and epistasis are important phenomena in quantitative genetics. The complexity of genetic architecture can be largely attributed to epistasis, which plays a significant role in heterosis, inbreeding depression, adaptation, reproductive isolation, and speciation[6]. However, most GWAS of different organisms have ignored the impacts of dominance, epistasis, and environmental interaction. Ignoring these factors could be a major cause of the missing heritability in GWAS. Heterozygous genotypes are generally found at high proportions in random-mating and other specially designed populations. However, in whole-genome sequencing data with a large number of single nucleotide polymorphisms (SNPs), a small portion of heterozygous genotypes can also be found in inbred lines of animals and crops, and these could have large impacts on phenotypic traits[7-8]. In this study, we attempted to assess the impacts of heterozygous genotypes on days to silk (DS) in the maize nested association mapping (NAM) population. To that end, a full model with additive, dominance, and epistasis effects and their environmental interactions was fitted with QTXNetwork[9] to dissect the genetic architecture of DS. The maize NAM population was constructed by only five generations of selfing within 25 diverse families[1,5,10]. However, the data set contained no heterozygous genotypes, only a small portion of missing genotypes. The missing genotypes were replaced by heterozygous genotypes in this study. An additive model with only additive (a) and additive-by-environment interaction (ae) effects was also analyzed for comparison. The genotypes and total genetic effects of the best line (BL), superior line (SL), and superior hybrid (SH) were tabulated to assess the scope for improvement in future maize breeding.
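The replacement of missing genotypes by heterozygous genotypes can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the genotype symbols QQ, Qq, and qq follow the notation used in the Results, whereas the missing-value marker "NA" and the function names are assumptions made for this example.

```python
from collections import Counter

MISSING = "NA"  # assumed marker for a missing call; the real data files may differ


def recode_missing_as_het(genotypes, missing=MISSING):
    """Return a copy of the calls with every missing call replaced by 'Qq'."""
    return ["Qq" if g == missing else g for g in genotypes]


def het_frequency(genotypes):
    """Fraction of heterozygous ('Qq') calls at one locus."""
    counts = Counter(genotypes)
    return counts["Qq"] / len(genotypes)


# Toy locus: ten lines, one missing call.
locus = ["QQ", "qq", "QQ", "NA", "qq", "qq", "QQ", "QQ", "qq", "QQ"]
recoded = recode_missing_as_het(locus)
print(recoded)
print(het_frequency(recoded))
```

After this recoding, the heterozygote frequency at a locus equals its former missing-call rate, which is why the heterozygote frequencies at the detected loci are low.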

1 Materials and methods

1.1 Genotype and phenotype data

The maize nested association mapping (NAM) population developed in the United States (US-NAM) was used in this study. It was derived by crossing 25 diverse lines with B73, followed by self-pollination for five generations[5,10]. Days to silk (DS) was scored in nine environments. However, to reduce computational complexity, data from only four environments were analyzed. We downloaded the genotype and phenotype data sets from http://www.panzea.org/.

1.2 Statistical analysis

A newly developed approach, implemented in QTXNetwork, was used for association mapping. The approach has two distinct parts. First, the generalized multifactor dimensionality reduction (GMDR) method, implemented in the GMDR-GPU module[11] of QTXNetwork, was used to scan SNPs in 1D for main effects and in 2D and 3D for epistasis interactions. Second, association mapping was conducted on the detected SNPs by using the quantitative trait SNP (QTS) module of QTXNetwork. Two models were used for association mapping in this study: the full genetic model and the multi-locus additive model. The full genetic model includes SNP locus effects (a, d, aa, ad, da, dd) as fixed effects, and environment (e) and locus-by-environment interactions (ae, de, aae, ade, dae, dde) as random effects, for four environments (1 for E1, 2 for E2, 3 for E4, and 4 for E9). The statistical approaches of the full and additive models[12] were used for conducting the association analyses.
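To make the effect terms of the full genetic model concrete, the sketch below builds the coefficients of a, d, aa, ad, da, and dd from genotype calls. The coding is an assumption for illustration (additive coefficient +1, 0, -1 and dominance coefficient 0, 1, 0 for QQ, Qq, qq, with epistasis coefficients taken as products); it is a common convention and is not confirmed by the text above, and the function names are hypothetical.

```python
# Assumed coding (a common convention, not confirmed by the text above):
# additive coefficient:  QQ -> +1, Qq -> 0, qq -> -1
# dominance coefficient: Qq -> 1, homozygotes -> 0
ADDITIVE = {"QQ": 1, "Qq": 0, "qq": -1}
DOMINANCE = {"QQ": 0, "Qq": 1, "qq": 0}


def main_effect_codes(genotype):
    """Coefficients of the a and d effects for one locus."""
    return {"a": ADDITIVE[genotype], "d": DOMINANCE[genotype]}


def epistasis_codes(genotype_i, genotype_j):
    """Coefficients of aa, ad, da and dd for a pair of loci (products of main codes)."""
    ci, cj = main_effect_codes(genotype_i), main_effect_codes(genotype_j)
    return {
        "aa": ci["a"] * cj["a"],
        "ad": ci["a"] * cj["d"],
        "da": ci["d"] * cj["a"],
        "dd": ci["d"] * cj["d"],
    }


print(main_effect_codes("Qq"))
print(epistasis_codes("QQ", "Qq"))
print(epistasis_codes("Qq", "Qq"))
```

Under this coding, the d and dd coefficients are non-zero only for heterozygous calls, so in a population of inbred lines the dominance-related effects are estimated from the small fraction of heterozygous genotypes.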

Henderson method Ⅲ[13] was used to calculate the F-statistic for association analysis. A total of 2 000 permutations were conducted to calculate the critical F-value for controlling the experiment-wise type Ⅰ error (αEW<0.05). Parameters were estimated by using the Markov chain Monte Carlo (MCMC) algorithm with 20 000 Gibbs sampler iterations[9,14-16]. The experiment-wise critical P-value (PEW-value) was calculated by controlling the experiment-wise type Ⅰ error (PEW<0.05).
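The permutation step can be illustrated with a small sketch. It is not the QTXNetwork implementation: for simplicity it uses a single-marker regression F statistic in place of Henderson method Ⅲ, and the data, function names, and the choice of the maximum F over markers as the experiment-wise statistic are assumptions made for this example.

```python
import random


def f_statistic(x, y):
    """F statistic (1 and n-2 df) for a simple regression of y on marker code x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        return 0.0
    ss_reg = sxy ** 2 / sxx
    ss_res = syy - ss_reg
    if ss_res <= 0:
        return float("inf")
    return ss_reg / (ss_res / (n - 2))


def permutation_critical_f(markers, y, n_perm=2000, alpha=0.05, seed=0):
    """Experiment-wise critical F: the (1 - alpha) quantile of the maximum F over
    all markers, computed on phenotypes shuffled n_perm times."""
    rng = random.Random(seed)
    shuffled = list(y)
    max_f = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_f.append(max(f_statistic(x, shuffled) for x in markers))
    max_f.sort()
    index = min(n_perm - 1, int((1 - alpha) * n_perm))
    return max_f[index]


# Toy data: 60 lines, 20 markers coded -1/+1; only marker 0 affects the trait.
rng = random.Random(1)
n_lines, n_markers = 60, 20
markers = [[rng.choice([-1, 1]) for _ in range(n_lines)] for _ in range(n_markers)]
y = [2.0 * markers[0][k] + rng.gauss(0, 1) for k in range(n_lines)]

critical_f = permutation_critical_f(markers, y, n_perm=2000)
significant = [i for i, x in enumerate(markers) if f_statistic(x, y) > critical_f]
print(round(critical_f, 2), significant)
```

Shuffling the phenotypes breaks any marker-trait association while keeping the correlation among markers intact, so the quantile of the maximum statistic controls the error rate over the whole scan rather than per marker.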

2 Results

2.1 Estimated heritability using full model

Days to silk (DS) in the maize NAM population is a highly heritable trait[5]. The total heritability estimated with the full model was 79.86% for DS, mostly due to dominance and dominance-related epistasis effects (Table 1), indicating the importance of analyzing dominance-related effects even in inbred lines. A recent study showed that environment-specific effects are relatively unimportant for leaf orientation traits in the maize NAM population, contributing only 4.98%-7.32% of the phenotypic variation[7]. Unlike for the leaf orientation traits, a large amount of heritability for DS was attributed to environment-specific effects, which indicates that the genetic effects varied across environments.

Table 1 Estimated heritability (%) of genetic effects for days to silk using full model and additive model

2.2 Genetic architecture of DS

Association analyses for DS identified multiple loci with different genetic effects. The full model identified a total of 50 highly significant (-log10 PEW > 5) QTSs (Fig. 1; Table S1, available at http://www.zjujournals.com/agr/EN/article/showSupportInfo.do?id=10459). The identified QTSs had 64 genetic main effects and 54 environment-specific effects; environment-specific effects of QTSs therefore play important roles in DS of the NAM population. Despite the low frequency of heterozygous genotypes at the identified loci (8.21%-9.24% for the loci with dominance effects, and 3.51%-9.03% for the loci with dominance-related epistasis interactions), we observed large impacts of dominance-related effects on DS: although only three QTSs had highly significant dominance effects, there were five pairs of QTSs with highly significant dominance-related epistasis interactions (Table S1). Flowering time in plants results from interacting molecular pathways[17], and epistasis effects have been observed in Arabidopsis[18] and rice[19]. In this study, the full model identified a total of 24 pairs of QTSs with highly significant epistasis effects for DS of the NAM population. In contrast to self-fertilizing crop species, flowering time in the maize NAM population was reported to be controlled by many loci with small effects in QTL mapping[5].
Similar to the previous QTL mapping of DS in the NAM population, association analysis with the full model estimated small genetic effects for the DS QTSs. The largest positive individual effect was a dominance effect of only 1.43 days at QTS S10_113745101 (-log10 PEW = 47.3), which could explain 2.92% of the phenotypic variation. The largest negative individual effect was an additive × environment 1 (ae1) effect of -0.912 day at QTS S1_172281879 (-log10 PEW = 51.5), which contributed 0.85% of the phenotypic variation, although the total additive effect of this QTS in environment 1 (a + ae1) was only -0.559 day. Like the individual genetic effects of loci, the estimated epistasis effects were also small. The largest epistasis effect was a dominance × dominance (dd) effect of only 2.688 days between QTSs S4_53677782 and S8_37237820 (-log10 PEW = 22.3), which could explain 10.31% of the phenotypic variation. QTS S3_159869611 had the largest positive additive effect (… 61.1), and QTS S2_109001252 had the largest negative additive effect (… 43.3).
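The two figures reported for QTS S1_172281879 also determine its main additive effect. As a worked check that uses only those two reported values (the result is derived here, not quoted from the source):

```latex
\begin{aligned}
a + ae_1 &= -0.559\ \text{day}, \qquad ae_1 = -0.912\ \text{day} \\
a &= (a + ae_1) - ae_1 = -0.559 - (-0.912) = 0.353\ \text{day}
\end{aligned}
```

The implied main additive effect is therefore positive (about 0.35 day) and is more than offset by the negative interaction with environment 1, which is why the total effect in that environment is smaller in magnitude than ae1 alone.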

Fig. 1 G×G plot of detected significant QTSs (PEW<0.05) for DS by using the full model (DS_ADI) and additive model (DS_A) approaches

2.3 Candidate gene annotation

Candidate genes corresponding to the DS QTSs were collected from the Gramene database (http://ensembl.gramene.org/Zea_mays/). Functions of the candidate genes were searched in UniProt (http://www.uniprot.org/uniprot/) with the accession numbers of the genes collected from Gramene. Descriptions of some candidate genes were collected from the NCBI Gene database, and additional functional information was collected through a literature search in Google. Functions of some candidate genes are tabulated in supplementary Table S2 (available at http://www.zjujournals.com/agr/EN/article/showSupportInfo.do?id=10459). We observed that some of the candidate genes were members of well-known gene families that have crucial functions in plant life. For example, QTS S1_172281879 is a variant near the C3HC4-type RING finger family protein gene GRMZM2G116714. The C3HC4-type RING finger genes play important roles in various physiological processes, including growth, development, and stress responses[20]. QTS S3_54472637 is a variant of the MYB transcription factor gene GRMZM2G051256. MYB transcription factors play regulatory roles in developmental processes and defense responses in plants[21]. The functions of most of the candidate genes are still unknown.

2.4 Prediction of best line,superior line,and superior hybrid for DS

Based on the association mapping results, the best line (BL), superior line (SL), and superior hybrid (SH) can be predicted for DS, which may help breeders in future breeding programs (Table 2). The overall total genetic effect of the homozygous non-B73 allele (QQ) combination was 2.25 days across environments, but it varied from 0.20 to 4.18 days among the four environments. The predicted total genetic effect of the F1 hybrid (1.95 days) was smaller than that of the homozygous non-B73 allele (QQ) genotype.

Table 2 Prediction of total genetic effects of days to silk

The maximum positive total genetic effect across environments was found for line Z012E0020 (6.83 days), designated the positive best line (best line (+)), whereas the environment-specific positive best lines were Z008E0050 (9.89 days) in environment 1, Z012E0124 (9.72 days) in environment 2, Z007E0043 (6.89 days) in environment 3, and Z012E0058 (9.27 days) in environment 4 (Table S3, available at http://www.zjujournals.com/agr/EN/article/showSupportInfo.do?id=10459). The maximum negative total genetic effect across environments was found for line Z019E0177 (-5.72 days), designated the negative best line (best line (-)), and its total genetic values ranged from -1.87 to -8.56 days in the four environments. The environment-specific negative best lines were Z024E0182 (-9.05 days) in environment 1, Z024E0114 (-6.16 days) in environment 2, Z010E0020 (-5.48 days) in environment 3, and Z024E0094 (-8.69 days) in environment 4. The total genetic values of the environment-specific best lines varied widely across environments: from -2.50 to -9.05 days for line Z024E0182, from -2.57 to -7.36 days for line Z024E0114, from -2.11 to -5.48 days for line Z010E0020, and from -1.41 to -8.69 days for line Z024E0094. Therefore, no single line was the best across all environments for DS.

The predicted negative superior line (superior line (-)), with its optimum combination of homozygous genotypes (QQ, qq), could provide insight for crop improvement. The overall total genetic effect of the predicted superior line was -7.11 days, which was earlier than that of the existing best line Z019E0177 (-5.72 days).

Similarly, the total genetic effect of the negative superior hybrid, which uses the optimum combination of homozygous (QQ, qq) and heterozygous (Qq) genotypes, was -11.80 days, which was 6.08 days earlier than the existing best line Z019E0177, indicating that the predicted superior hybrid has greater scope than the predicted superior line for further improvement. We tabulated the optimum genotypes at the loci of the predicted lines (Table S4, available at http://www.zjujournals.com/agr/EN/article/showSupportInfo.do?id=10459), which could be helpful to breeders for further crop improvement.
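The idea behind the superior line and superior hybrid predictions can be illustrated with a simplified sketch. It assumes main effects only (additive a and dominance d per locus) with genotypic values QQ = a, Qq = d, and qq = -a, so that each locus can be optimized independently. The analysis in this paper also includes epistasis and environment interactions, which make the real search over genotype combinations harder. The effect values, locus names, and function names below are invented for illustration.

```python
def genotype_values(a, d):
    """Genotypic values at one locus under the assumed coding QQ=+a, Qq=d, qq=-a."""
    return {"QQ": a, "Qq": d, "qq": -a}


def predict_best_combination(effects, allowed, earliest=True):
    """Pick, locus by locus, the allowed genotype with the smallest (or largest)
    value and return the chosen genotypes with their summed genetic effect."""
    pick = min if earliest else max
    chosen, total = {}, 0.0
    for locus, (a, d) in effects.items():
        values = genotype_values(a, d)
        best = pick(allowed, key=lambda g: values[g])
        chosen[locus] = best
        total += values[best]
    return chosen, total


# Hypothetical per-locus effects in days: (additive a, dominance d).
effects = {"locus1": (0.5, -0.9), "locus2": (-0.3, 0.2), "locus3": (0.4, 0.1)}

# Superior line (-): homozygous genotypes only.
sl_genotypes, sl_total = predict_best_combination(effects, ["QQ", "qq"])
# Superior hybrid (-): heterozygous genotypes are also allowed.
sh_genotypes, sh_total = predict_best_combination(effects, ["QQ", "Qq", "qq"])

print(sl_genotypes, round(sl_total, 2))
print(sh_genotypes, round(sh_total, 2))
```

Because the search space for the hybrid contains every homozygous combination, its optimum can never be later than that of the superior line, which matches the ordering of the reported predictions (-11.80 days versus -7.11 days).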

2.5 Association mapping with additive model

The additive model identified 47 highly significant QTSs, of which 31 were also identified by the full model (Fig. 1). As with the full model, the effects estimated from the additive model were small. The total heritability estimated with the additive model was 31.65%, less than half of the total heritability of the full model (Table 1), illustrating the problem of missing heritability when using the additive model. Therefore, ignoring dominance and epistasis interactions may lead to substantial underestimation of the heritability of complex traits.
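The comparison between the two models can be checked with simple arithmetic on the heritability estimates quoted in the text. The variable names are chosen for this example, and the difference between the two totals is only a rough indication, not a formal variance decomposition, because the two models detected partly different sets of QTSs.

```python
# Heritability estimates (%) quoted in the text.
h2_full_total = 79.86
h2_full_dominance_related = 50.52
h2_additive_total = 31.65

ratio_additive_to_full = h2_additive_total / h2_full_total
dominance_share_of_full = h2_full_dominance_related / h2_full_total
not_captured_by_additive = h2_full_total - h2_additive_total

print(f"additive / full model heritability: {ratio_additive_to_full:.3f}")
print(f"dominance-related share of full-model heritability: {dominance_share_of_full:.3f}")
print(f"heritability not captured by the additive model: {not_captured_by_additive:.2f} points")
```

The ratio is about 0.40, consistent with the statement that the additive model recovers less than half of the heritability estimated by the full model.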

3 Discussion

The role of heterozygous genotypes has been ignored in GWAS under the assumption that most genetic variation in animals and plants results from additive effects of multiple loci. Environmental impacts have also been ignored or adjusted for by subtracting their effects from the phenotypic data. However, ignoring or adjusting for important factors can result in missing information about the genetic architecture of complex traits. The full model approach was designed to estimate or predict the effects of different types of factors (additive, dominance, epistasis, and their environmental interactions), which can provide more information about the underlying mechanisms of complex traits. In this study, maize days to silk was analyzed with the full model approach, which revealed new insights into this complex trait. DS is related to the adaptation of maize to various environments and is a major criterion in selection breeding[1]. We observed that the genetic effects of multiple loci varied among environments; the estimated heritability of environment-specific effects was 27.31%. In the full model analysis, dominance and dominance-related epistasis interactions had large effects on DS. An additive model was also analyzed in this study, and it yielded a lower heritability than the full model. The correlation between predicted genotypic values and phenotypes was very high for the full model (r≈0.96), suggesting that the analysis results can accurately predict the phenotypes. Epistasis effects were unimportant for DS in a previous QTL mapping study[5]. However, we observed a large impact of epistasis effects on DS, contributing around 49.37% of the phenotypic variation (Table 1). This result is concordant with the results observed in Arabidopsis[18] and rice[19].

By calculating the total genetic effects of lines, we observed that no single line had a large genetic effect across all environments; rather, different lines had large effects in different environments. This result suggests that maize flowering time is very sensitive to the environment and that different environments need different combinations of genotypes for better performance. The predicted genotypes of SL and SH support the same hypothesis, because the superior genotypes of loci differed among environments (Table S4). The predicted SL and SH had larger genetic effects than the best lines, suggesting scope for further improvement of maize days to silk with the predicted genotype combinations.

References

[1] LI Y X, LI C, BRADBURY P J, et al. Identification of genetic variants associated with maize flowering time using an extremely large multi-genetic background population. The Plant Journal: For Cell and Molecular Biology, 2016, 86(5): 391-402.

[2] XU J, LIU Y, LIU J, et al. The genetic architecture of flowering time and photoperiod sensitivity in maize as revealed by QTL review and Meta analysis. Journal of Integrative Plant Biology, 2012, 54(6): 358-373.

[3] GRILLO M A, LI C, HAMMOND M, et al. Genetic architecture of flowering time differentiation between locally adapted populations of Arabidopsis thaliana. The New Phytologist, 2013, 197(4): 1321-1331.

[4] JUNG C, MULLER A E. Flowering time control and applications in plant breeding. Trends in Plant Science, 2009, 14(10): 563-573.

[5] BUCKLER E S, HOLLAND J B, BRADBURY P J, et al. The genetic architecture of maize flowering time. Science, 2009, 325(5941): 714-718.

[6] YANG J, ZHU J. Methods for predicting superior genotypes under multiple environments based on QTL effects. Theoretical and Applied Genetics, 2005, 110(7): 1268-1274.

[7] MONIR M M. Comparing different genetic models and statistical approaches of GWAS for complex traits. Hangzhou: Zhejiang University, 2016: 44-64.

[8] LIYUAN Z. Genetic association studies for complex traits of crops and linear-model-based multiple dimensionality reduction method developing. Hangzhou: Zhejiang University, 2016: 10-23.

[9] ZHANG F T, ZHU Z H, TONG X R, et al. Mixed linear model approaches of association mapping for complex traits based on omics variants. Scientific Reports, 2015, 5: 10298.

[10] TIAN F, BRADBURY P J, BROWN P J, et al. Genome-wide association study of leaf architecture in the maize nested association mapping population. Nature Genetics, 2011, 43(2): 159-162.

[11] ZHU Z, TONG X, ZHU Z, et al. Development of GMDR-GPU for gene-gene interaction analysis and its application to WTCCC GWAS data for type 2 diabetes. PloS One, 2013, 8(4): e61943.

[12] MONIR M M, ZHU J. Comparing GWAS results of complex traits using full genetic model and additive models for revealing genetic architecture. Scientific Reports, 2017, 7: 38600.

[13] SEARLE S R, CASELLA G, MCCULLOCH C E. Variance Components. New York, USA: John Wiley & Sons, 2009.

[14] YANG J, ZHU J, WILLIAMS R W. Mapping the genetic architecture of complex traits in experimental populations. Bioinformatics, 2007, 23(12): 1527-1536.

[15] YANG J, HU C C, HU H, et al. QTLNetwork: Mapping and visualizing genetic architecture of complex traits in experimental populations. Bioinformatics, 2008, 24(5): 721-723.

[16] QI T, JIANG B, ZHU Z, et al. Mixed linear model approach for mapping quantitative trait loci underlying crop seed traits. Heredity, 2014, 113(3): 224-232.

[17] KOMEDA Y. Genetic regulation of time to flower in Arabidopsis thaliana. Annual Review of Plant Biology, 2004, 55: 521-535.

[18] EL-LITHY M E, BENTSINK L, HANHART C J, et al. New Arabidopsis recombinant inbred line populations genotyped using SNPWave and their use for mapping flowering-time quantitative trait loci. Genetics, 2006, 172(3): 1867-1876.

[19] UWATOKO N, ONISHI A, IKEDA Y, et al. Epistasis among the three major flowering time genes in rice: Coordinate changes of photoperiod sensitivity, basic vegetative growth and optimum photoperiod. Euphytica, 2007, 163(2): 167-175.

[20] MA K, XIAO J H, LI X H, et al. Sequence and expression analysis of the C3HC4-type RING finger gene family in rice. Gene, 2009, 444(1/2): 33-45.

[21] CHEN Y H, YANG X Y, HE K, et al. The MYB transcription factor superfamily of Arabidopsis: Expression analysis and phylogenetic comparison with the rice MYB family. Plant Molecular Biology, 2006, 60(1): 107-124.

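Association analysis revealed the importance of dominance effects on days to silk in the maize nested association mapping population (in English).

MONIR Md. Mamun, ZHU Jun* (Institute of Bioinformatics, Zhejiang University, Hangzhou 310058, China)

A full association mapping model and a multi-locus additive model were used to analyze the genetic effects on days to silk in the maize nested association mapping population. Association analysis with the full model revealed that days to silk in maize is controlled by the additive, dominance, and epistasis effects of many minor-effect genes and by their environmental interactions, among which dominance effects are the most important. Of the estimated total heritability (79.86%), the heritability related to dominance effects was as high as 50.52%, followed by the heritability of environmental interaction effects (27.31%). The numbers of highly significant (-log10 PEW > 5) quantitative trait SNPs detected were 50 for the full model and 47 for the additive model (heritability = 31.65%). Based on the association analysis results for days to silk, the genotype combinations and corresponding genetic effect values of the optimal inbred lines and optimal hybrid combinations were predicted, which can guide precise molecular selection of superior loci in maize populations.

Key words: genome-wide association study; maize; days to silk; dominance effects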

CLC number: Q 348

Document code: A

DOI: 10.3785/j.issn.1008-9209.2017.02.236

Journal of Zhejiang University (Agriculture and Life Sciences), 2017, 43(2): 146-152

Foundation item: Supported by the National Natural Science Foundation of China (No. 31371250).

*Corresponding author: ZHU Jun (http://orcid.org/0000-0002-8509-8304), E-mail: jzhu@zju.edu.cn

Received: 2017-02-23; Accepted: 2017-03-13
