Research Article | Host-Microbe Biology

Genome-Wide Association Analyses in the Model Rhizobium Ensifer meliloti

Brendan Epstein, Reda A. I. Abou-Shanab, Abdelaal Shamseldin, Margaret R. Taylor, Joseph Guhlin, Liana T. Burghardt, Matthew Nelson, Michael J. Sadowsky, Peter Tiffin
Julia Oh, Editor
Brendan Epstein
a Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, Minnesota, USA
Reda A. I. Abou-Shanab
b Biotechnology Institute, University of Minnesota, Saint Paul, Minnesota, USA
Abdelaal Shamseldin
c Environmental Biotechnology Department, Genetic Engineering and Biotechnology Research Institute (GEBRI) at City of Scientific Research and Technology Applications (SRTA-City), New Borg El Arab, Alexandria, Egypt
Margaret R. Taylor
b Biotechnology Institute, University of Minnesota, Saint Paul, Minnesota, USA
Joseph Guhlin
a Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, Minnesota, USA
Liana T. Burghardt
a Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, Minnesota, USA
Matthew Nelson
a Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, Minnesota, USA
b Biotechnology Institute, University of Minnesota, Saint Paul, Minnesota, USA
Michael J. Sadowsky
a Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, Minnesota, USA
b Biotechnology Institute, University of Minnesota, Saint Paul, Minnesota, USA
d Department of Soil, Water, and Climate, University of Minnesota, Saint Paul, Minnesota, USA
Peter Tiffin
a Department of Plant and Microbial Biology, University of Minnesota, Saint Paul, Minnesota, USA
Julia Oh
The Jackson Laboratory for Genomic Medicine
Roles: Editor
DOI: 10.1128/mSphere.00386-18

Figures

  • FIG 1

    (A) Distribution of number of variants per LD group (at r² ≥ 0.95), (B) distribution of genomic distance spanned by LD groups found on the chromosome or the megaplasmids (including only groups found only on one replicon), and (C) number of groups containing only PAVs, only SNPs, or both as well as the number of LD groups found within and across replicons. There were 22,057 SNPs and 10,674 PAVs that were not grouped with other variants and 9,501 LD groups with a median of three variants per group, and the largest group contained 6,970 variants. Half of all variants are in groups that contain ≤12 variants. Only variants used for association testing (minor allele frequency ≥ 5%, missingness ≤ 20%) were grouped.
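
The filtering and grouping described in this caption can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' pipeline: genotypes are assumed to be haploid calls coded 0/1 with NaN for missing data, r² is taken as the squared Pearson correlation between two variants, and the grouping rule (each ungrouped variant seeds a group that absorbs later variants at or above the threshold) is a simple greedy stand-in. All function names are hypothetical.

```python
import numpy as np

def filter_variants(G, min_maf=0.05, max_missing=0.20):
    """Return column indices of G (strains x variants) that pass both filters."""
    G = np.asarray(G, dtype=float)
    missing = np.isnan(G).mean(axis=0)        # fraction of strains without a call
    freq = np.nanmean(G, axis=0)              # frequency of the allele coded 1
    maf = np.minimum(freq, 1.0 - freq)        # minor allele frequency
    return np.where((maf >= min_maf) & (missing <= max_missing))[0]

def r_squared(a, b):
    """Squared Pearson correlation over strains where both variants are called."""
    ok = ~np.isnan(a) & ~np.isnan(b)
    if ok.sum() < 2:
        return 0.0
    a, b = a[ok], b[ok]
    if a.std() == 0 or b.std() == 0:
        return 0.0                            # a monomorphic variant carries no LD signal
    return float(np.corrcoef(a, b)[0, 1] ** 2)

def ld_groups(G, threshold=0.95):
    """Greedy grouping (an assumption): later variants join the first seed they match."""
    G = np.asarray(G, dtype=float)
    n = G.shape[1]
    assigned = [False] * n
    groups = []
    for i in range(n):
        if assigned[i]:
            continue
        members = [i]
        assigned[i] = True
        for j in range(i + 1, n):
            if not assigned[j] and r_squared(G[:, i], G[:, j]) >= threshold:
                assigned[j] = True
                members.append(j)
        groups.append(members)
    return groups
```

In this sketch, a group with a single member corresponds to what the caption calls an ungrouped variant.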

  • FIG 2

    (A) Phenotypic distributions of the focal traits and (B) proportion of phenotypic variance explained (PVE) by relatedness among strains (i.e., the K-matrix) alone, as predicted by a linear mixed model, and by both relatedness and large-effect variants through a Bayesian sparse linear mixed model (BSLMM) implemented in GEMMA. PVE was calculated for all variants, only SNPs, and only PAVs. The gray lines indicate the lower 95% of the empirical null distributions from permuted data sets.
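
For readers unfamiliar with the K-matrix, the following sketch computes one common definition of it, the centered relatedness matrix K = (1/p) Σ (x_i − mean(x_i))(x_i − mean(x_i))ᵀ summed over p variants, as described in the GEMMA manual. The caption does not say which relatedness estimator was used, so treating K as the centered matrix is an assumption, as are the 0/1 genotype coding, the mean imputation of missing calls, and the function name.

```python
import numpy as np

def centered_relatedness(G):
    """Centered relatedness matrix from a strains x variants genotype array."""
    G = np.asarray(G, dtype=float)
    col_mean = np.nanmean(G, axis=0)                  # per-variant mean over called strains
    filled = np.where(np.isnan(G), col_mean, G)       # mean-impute missing calls (assumption)
    centered = filled - col_mean                      # each variant now has mean zero
    return centered @ centered.T / G.shape[1]         # strains x strains
```

The result is symmetric, and each of its rows sums to zero because every variant has been centered across strains.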

  • FIG 3

    Evaluation of the expected proportion of variance explained (PVE) for A17 biomass by the most strongly associated variants as determined by association testing and forward model selection. Panel A shows the cumulative PVE explained by the 10 most strongly associated variants (black line, more than 10 variants rarely explained more variation than expected by chance) as well as the cumulative PVE from each of 100 randomly permuted data sets that make up the empirical null distribution (gray lines). For A17 biomass, the actual data explain more variance than the permuted data; however, panel B shows that only the first 3 variants explain more of the residual PVE (i.e., after accounting for PVE of the previous variants) than expected by chance. In panel B, the vertical gray lines represent the lower 95% of the null distribution.
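
The logic of panels A and B can be sketched as follows. This is a simplified illustration, not the published analysis: it uses ordinary least squares without any relatedness correction, picks variants greedily by the largest gain in R², and reads "residual PVE" as the gain in cumulative PVE divided by the variance still unexplained before the variant was added. The function names and defaults are hypothetical.

```python
import numpy as np

def r2_fit(X, y):
    """Fraction of variance in y explained by a least-squares fit on X plus intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - resid.var() / y.var()

def forward_selection(G, y, n_steps=10):
    """Greedily add the variant that most increases cumulative PVE."""
    chosen, cumulative = [], []
    for _ in range(min(n_steps, G.shape[1])):
        best_j, best_r2 = None, -np.inf
        for j in range(G.shape[1]):
            if j in chosen:
                continue
            r2 = r2_fit(G[:, chosen + [j]], y)
            if r2 > best_r2:
                best_j, best_r2 = j, r2
        chosen.append(best_j)
        cumulative.append(best_r2)
    return chosen, cumulative

def residual_pve(cumulative):
    """Share of the remaining variance explained by each added variant."""
    out, prev = [], 0.0
    for c in cumulative:
        out.append((c - prev) / (1.0 - prev) if prev < 1.0 else 0.0)
        prev = c
    return out

def permuted_null(G, y, n_perm=100, n_steps=10, seed=0):
    """Cumulative PVE curves for randomly permuted phenotypes (one row per permutation)."""
    rng = np.random.default_rng(seed)
    return np.array([forward_selection(G, rng.permutation(y), n_steps)[1]
                     for _ in range(n_perm)])
```

Under these assumptions, `np.quantile(permuted_null(G, y), 0.95, axis=0)` gives a per-step threshold analogous to the "lower 95% of the null distribution" mentioned in the caption.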

  • FIG 4

    The proportion of remaining phenotypic variance of the focal traits explained by adding each additional top variant, as in Fig. 3B.

Tables

  • TABLE 1

    Mean r², a measure of nonindependence between segregating variants, is generally low between pairs of variants of different types or on different replicons, while the median size and spanned distance of LD groups is less on the megaplasmids than on the chromosome

    | Variant type or location | Mean r² between variants | No. ungrouped variants | No. of LD groups | Median no. of variants per LD group | Median LD group spanned distance (a) |
    |---|---|---|---|---|---|
    | All | 0.06 | 32,821 | 9,501 | 3 | N/A |
    | SNPs only | 0.07 | 22,057 | 8,364 | 3 | N/A |
    | PAVs only | 0.02 | 10,764 | 632 | 2 | N/A |
    | Between SNPs and PAVs | 0.03 | N/A | 505 | 7 | N/A |
    | Chromosome SNPs | 0.24 | 789 | 900 | 7 | 173,406 |
    | pSymB SNPs | 0.05 | 13,671 | 4,478 | 3 | 518 |
    | pSymA SNPs | 0.12 | 7,597 | 2,912 | 3 | 1,063 |

    (a) Spanned distance calculated only for LD groups with SNPs that were all on the same replicon.

  • TABLE 2

    Candidate genes tagged by variants in LD groups that explained more variation than expected based on the empirical null distribution (see Fig. S7 for QQ-plots)

    | Trait | Replicon | Position (d) | Annotation (MaGe (a) locus tag) |
    |---|---|---|---|
    | 2-Aminoethanol | pSymA | 52580 | fixI: nitrogen fixation protein FixI (SMEL_v1_mpb0065) |
    | Formic acid | pSymA | 38256 | napA: nitrate reductase, periplasmic, large subunit (SMEL_v1_mpb0048) |
    | Gentamicin | pSymA | 796714 | Transcriptional regulator, ROK family (SMEL_v1_mpb0963) |
    | | pSymA | 263510 | nifH: nitrogenase Fe protein (SMEL_v1_mpb0322) |
    | | pSymA | 282760 | Putative aldehyde dehydrogenase (SMEL_v1_mpb0345) |
    | Spectinomycin | pSymA | PAV | Conserved protein of unknown function (SMEL_v1_mpb0259) |
    | Streptomycin | Chrom. | PAV | Multisensor signal transduction histidine kinase (SMEL_v1_0575) |
    | Desiccation | pSymB | 1161576 | Putative aldehyde or xanthine dehydrogenase (SMEL_v1_mpa1160) |
    | A17 biomass | pSymA | 269841, 269869, 270090, 270096, 270157, 270283, 270292 | nifA: Nif-specific regulatory protein (SMEL_v1_mpb0330) |
    | | pSymA | 271348 | nifA: (SMEL_v1_mpb0330); unknown (SMEL_v1_mpb0331) |
    | | pSymA | 274195 | Unannotated |
    | | pSymA | 276359, 276443, 276563 | gabD: succinate-semialdehyde dehydrogenase I, NADP-dependent (SMEL_v1_mpb0338) |
    | | pSymB | 1231268 | queC: 7-cyano-7-deazaguanine synthase (SMEL_v1_mpa1230) |
    | | pSymB | 1376015 | Diguanylate cyclase/phosphodiesterase (SMEL_v1_mpa1374) |
    | | pSymB | 669804 | Sulfotransferase family (SMEL_v1_mpa0678) |
    | R108 biomass | pSymA | 305290, 305308, 305353 | fixN: cytochrome c oxidase subunit 1 homolog (SMEL_v1_mpb0374) |
    | AMT (b) | pSymA | 648346, 649133 | Diguanylate cyclase/phosphodiesterase (SMEL_v1_mpb0802) |
    | AP (c) | pSymA | PAV | fixS: FixS2 nitrogen fixation protein (SMEL_v1_mpb0492) |

    (a) http://www.genoscope.cns.fr/agc/microscope/home/index.php.

    (b) Annual mean temperature.

    (c) Annual precipitation.

    (d) Variants are sorted by genomic position, not ranking or LD group.

  • TABLE 3

    For most traits, phenotypic variance explained by genome-wide relatedness (“PVE LMM”) was greater than the phenotypic variance explained by just the top variants

    | Trait | PVE top variants (a) | PVE LMM (b) |
    |---|---|---|
    | 2-Aminoethanol | 0.05 | 0.00 |
    | Gentamicin resistance | 0.14 | 0.49 |
    | Spectinomycin resistance | 0.43 | 0.50 |
    | Streptomycin resistance | 0.34 | 0.58 |
    | Annual mean temperature | 0.09 | 0.12 |
    | Annual precipitation | 0.10 | 0.23 |
    | Formic acid | 0.08 | 0.30 |
    | Desiccation tolerance | 0.19 | 0.31 |
    | A17 biomass | 0.33 | 0.74 |
    | R108 biomass | 0.19 | 0.53 |
    | R108 nodule number | 0.06 | 0.19 |

    (a) Maximum cumulative PVE among 1 to 25 variants chosen by model selection after subtracting the median of the empirical null distribution obtained from random permutations.

    (b) After subtracting median of the null distribution.
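
The null-corrected values in this table can be illustrated with a short sketch. It follows one reading of footnote a, which is an assumption: at each model size k, subtract the median cumulative PVE across permutations from the observed cumulative PVE, then take the maximum over k. The function name and input layout (one row per permutation, one column per model size) are hypothetical.

```python
import numpy as np

def excess_pve(observed_cumulative, null_cumulative):
    """Largest gap between observed cumulative PVE and the per-step null median."""
    observed = np.asarray(observed_cumulative, dtype=float)
    null_median = np.median(np.asarray(null_cumulative, dtype=float), axis=0)
    return float(np.max(observed - null_median))
```

For the "PVE LMM" column the same subtraction applies to a single value rather than a curve, so the inputs would each have length one.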

Supplemental Material

  • FIG S1

    (A) Distribution of number of variants in each LD group at r² = 0.80. There were 27,509 LD groups total, and half of the variants were found in LD groups with 37 or fewer variants. (B) Number of LD groups containing PAVs or SNPs, and number of LD groups containing SNPs found on each replicon, also at r² = 0.80. (C) Distributions of number of variants per LD group for LD groups at r² = 0.95 containing only SNPs, only PAVs, or only SNPs on one of the replicons. Download FIG S1, PDF file, 1.3 MB.

    Copyright © 2018 Epstein et al.

    This content is distributed under the terms of the Creative Commons Attribution 4.0 International license.

  • FIG S2

    Distributions for all 20 phenotypic traits. The five focal traits are in red, and the 11 traits with PVE > 0 are in bold. Download FIG S2, PDF file, 0.3 MB.

  • FIG S3

    PVE explained by top variants from GWA after model selection. As in Fig. 4, the points represent the proportion of remaining variance explained (“PRVE”) by adding each additional variant, and, for clarity, only the top 10 of 25 variants are shown. The gray lines indicate the lower 95% of PRVE estimates from analyses of phenotypes randomly permuted 100 times. The five focal traits have red titles, and the ten traits for which at least the first variant explained more phenotypic variance than expected by chance have bolded titles. Download FIG S3, PDF file, 0.9 MB.

  • FIG S4

    PVE by LMM (the K-matrix alone) and by the BSLMM. The gray lines indicate the lower 95% of random permutations. Note that random permutations were only conducted for BSLMM on the five focal traits because of computational limitations. The five focal traits are in red, and the nine traits with PVE by either the K-matrix or BSLMM >95% of the random permutations are in bold. Download FIG S4, PDF file, 0.6 MB.

  • FIG S5

    Size and composition (brown SNPs, blue PAVs) of top 10 LD groups from the testing of the real phenotype data (column 1). (Column 2) The distribution of the mean proportion of SNPs in the top 10 LD groups from 100 GWA runs with randomly permuted phenotypes (LD groups containing both SNPs and PAVs were counted fractionally). The vertical line indicates the mean proportion of SNPs in the real data, and the percentage of random runs with a value greater than the real data is annotated to the right of the line. The genome-wide mean proportion of SNPs is 0.73. (Columns 3 and 4) The distribution of mean LD group size and mean MAF, respectively, for the permutations compared to the real data. Download FIG S5, PDF file, 1.3 MB.

  • TEXT S1

    Supplemental methods describing phenotype measurement methods in detail. Download Text S1, PDF file, 0.1 MB.

  • FIG S6

    Positions of strains and loadings of the 19 core bioclim variables on the first four PCA axes (variance explained in parentheses). Precipitation-related variables are shown in blue, and temperature-related variables are shown in red. Annual mean temperature (AMT) and annual precipitation (AP) were chosen as the representative variables for further analysis. Variable abbreviations: AMT, annual mean temperature (temp.); MDR, mean diurnal range; IT, isothermality; TS, temp. seasonality; TWaM, max. temp. warmest month; TCM, min. temp. coldest month; TAR, temp. annual range; TWeQ, mean temp. wettest quarter; TDQ, mean temp. driest quarter; TWaQ, mean temp. warmest quarter; TCQ, mean temp. coldest quarter; AP, annual precipitation (precip.); PWeM, precip. wettest month; PDM, precip. driest month; PS, precip. seasonality; PWeQ, precip. wettest quarter; PDQ, precip. driest quarter; PWaQ, precip. warmest quarter; PCQ, precip. coldest quarter. Download FIG S6, PDF file, 0.3 MB.
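
A principal component analysis of this kind can be sketched with a singular value decomposition. The example assumes the 19 bioclim variables are arranged as columns of a strains x variables array and are standardized to unit variance before the decomposition; the caption does not state how the variables were scaled, so that choice, like the function name, is an assumption.

```python
import numpy as np

def pca(X, n_components=4):
    """Return strain scores, variable loadings, and variance explained per axis."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # standardize each variable
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)                # proportion of variance per axis
    scores = U[:, :n_components] * s[:n_components]    # positions of strains
    loadings = Vt[:n_components].T                     # variables x components
    return scores, loadings, explained[:n_components]
```

Variables whose loadings point in similar directions on the leading axes are largely redundant, which is the usual rationale for keeping one representative temperature variable and one representative precipitation variable.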

  • TABLE S1

    Descriptions of the 20 phenotypes selected for association analyses. The phenotypes are listed by category. For binary traits, instead of a mean and standard deviation, the number of resistant strains (strains that grew) and the number of susceptible strains (strains that did not grow) are reported (resistant/susceptible in the Mean column). Download Table S1, PDF file, 0.1 MB.

  • FIG S7

    QQ plots for the P values from likelihood ratio tests in linear mixed model GWA. The five focal traits have red titles, and the eleven traits with PVE greater than expected by chance have bolded titles. The P value distributions for four of these 11 traits showed no evidence for inflation, five traits (2-aminoethanol utilization, annual mean temperature, R108 nodule number, and plant biomass on both plant genotypes) showed some mild inflation, and resistance to spectinomycin and streptomycin both showed substantial inflation; that is, the distribution of P values was strongly skewed toward lower (more significant) values. Download FIG S7, PDF file, 0.9 MB.
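
The inflation described in this caption can be quantified with a short sketch. It builds the expected and observed quantiles that a QQ plot compares, and it also computes the genomic inflation factor, a standard summary that the caption does not mention and that is added here only for illustration. The calculation assumes the likelihood ratio tests have one degree of freedom; the function names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

CHI2_1DF_MEDIAN = 0.4549364  # approximate median of a chi-square with 1 df

def qq_points(pvalues):
    """Expected and observed -log10(P), ordered from most to least significant."""
    p = np.sort(np.clip(np.asarray(pvalues, dtype=float), 1e-300, 1.0))
    n = len(p)
    expected = -np.log10((np.arange(1, n + 1) - 0.5) / n)
    observed = -np.log10(p)
    return expected, observed

def inflation_factor(pvalues):
    """Median 1-df chi-square statistic divided by its expectation under the null."""
    p = np.clip(np.asarray(pvalues, dtype=float), 1e-300, 1.0)
    nd = NormalDist()
    # A two-sided P value p corresponds to |z| = -inv_cdf(p / 2), and chi-square = z**2.
    z = np.array([nd.inv_cdf(pi / 2.0) for pi in p])
    return float(np.median(z ** 2) / CHI2_1DF_MEDIAN)
```

A factor near 1 corresponds to no inflation, and values well above 1 correspond to P values skewed toward lower (more significant) values, as the caption reports for spectinomycin and streptomycin resistance.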

Genome-Wide Association Analyses in the Model Rhizobium Ensifer meliloti
Brendan Epstein, Reda A. I. Abou-Shanab, Abdelaal Shamseldin, Margaret R. Taylor, Joseph Guhlin, Liana T. Burghardt, Matthew Nelson, Michael J. Sadowsky, Peter Tiffin
mSphere Oct 2018, 3 (5) e00386-18; DOI: 10.1128/mSphere.00386-18

KEYWORDS

BSLMM
GWAS
Medicago
Rhizobium
Sinorhizobium
bacteria
chip heritability
genetic architecture
genomics
linkage disequilibrium
symbiosis

Online ISSN: 2379-5042