Team II Comparative Genomics Group
Revision as of 02:39, 16 April 2018
Introduction
Background
Comparative genomics is the study of comparing genome sequences to better understand the structure and function of genes.
Fosfomycin
Fosfomycin is a natural antibacterial produced by various Streptomyces and Pseudomonas species. It is the only antibiotic currently in clinical use that targets a Mur enzyme. It is a broad-spectrum bactericidal antibiotic that can be employed against both Gram-positive and Gram-negative bacteria. It interferes with cell wall synthesis by inhibiting MurA (UDP-N-acetylglucosamine enolpyruvyl transferase), the enzyme that catalyzes the initial, phosphoenolpyruvate-dependent step, as shown below.
Resistance to fosfomycin arises through a wide range of mechanisms, including reduced uptake, target site modification, expression of antibiotic-degrading enzymes, and rescue of the UDP-MurNAc biogenesis pathway (e.g., mutation within the MurA enzyme).
Objectives
To identify genetic determinants that could be a potential cause for Fosfomycin heteroresistance in the isolates provided.
Data
The following is the metadata of our study:
Whole Genome Approach
The Whole Genome approach to comparative genomics attempts to broadly identify similarities and differences across samples.
Similarity analysis
Identifying similarities among our samples would tell us if phenotypic similarities correlate with overall genome similarity and help us choose representatives.
We computed MinHash (Mash) distances between all samples with a known antibiotic resistance phenotype and clustered them using complete-linkage hierarchical clustering.
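The distance computation can be illustrated with a minimal sketch. It assumes the standard Mash formula D = -(1/k) ln(2j / (1 + j)), where j is the Jaccard similarity estimated from bottom-s MinHash sketches. The function names, the MD5 hash, and the toy sequences are illustrative assumptions, not what we ran; the real tool also uses canonical k-mers and a different hash function.

```python
import hashlib
import math

def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def sketch(seq, k, size):
    """Bottom-s MinHash sketch: the `size` smallest k-mer hash values."""
    hashes = {int(hashlib.md5(km.encode()).hexdigest(), 16) for km in kmers(seq, k)}
    return sorted(hashes)[:size]

def jaccard_estimate(sk_a, sk_b, size):
    """Estimate Jaccard similarity from two bottom-s sketches."""
    merged = sorted(set(sk_a) | set(sk_b))[:size]
    shared = set(sk_a) & set(sk_b)
    return sum(1 for h in merged if h in shared) / len(merged)

def mash_distance(j, k):
    """Mash distance from a Jaccard estimate; 1.0 when nothing is shared."""
    if j == 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)

# Toy sequences (hypothetical): seq_b differs from seq_a by one substitution.
seq_a = "ACGTTGCAAGGCTTAACGGATCCATTGACCGTA"
seq_b = "ACGTTGCAAGGCTTAACGCATCCATTGACCGTA"
K, SIZE = 5, 1000

sk_a, sk_b = sketch(seq_a, K, SIZE), sketch(seq_b, K, SIZE)
d_same = mash_distance(jaccard_estimate(sk_a, sk_a, SIZE), K)
d_diff = mash_distance(jaccard_estimate(sk_a, sk_b, SIZE), K)
```

With a sketch size larger than the number of distinct k-mers, as in this toy case, the estimate equals the exact Jaccard similarity; on whole genomes the sketch is far smaller than the k-mer set and the value is an approximation.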
The dendrogram shows no clear demarcation between the different resistance phenotypes, suggesting that no broad genomic signature distinguishes the heteroresistant samples from the non-heteroresistant samples.
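For readers unfamiliar with complete linkage, the following is a rough sketch of the rule: the distance between two clusters is the distance between their farthest pair of members, and the closest two clusters are merged repeatedly. The distance matrix and the cut-off threshold are hypothetical values chosen for illustration, not our data.

```python
def complete_linkage(dist, threshold):
    """Agglomerative clustering on a symmetric distance matrix.

    Clusters are merged while the farthest-pair distance between the two
    closest clusters stays at or below `threshold`.
    """
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Complete linkage: cluster distance is the maximum pairwise distance.
                d = max(dist[i][j] for i in clusters[a] for j in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        if best[0] > threshold:
            break
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

# Hypothetical pairwise distances for four samples; sample 3 is an outlier.
toy_dist = [
    [0.0,    0.0001, 0.0002, 0.020],
    [0.0001, 0.0,    0.0001, 0.021],
    [0.0002, 0.0001, 0.0,    0.019],
    [0.020,  0.021,  0.019,  0.0],
]
toy_clusters = complete_linkage(toy_dist, threshold=0.001)
```

Cutting at a threshold is only one way to read a dendrogram; a full analysis would keep the whole merge history so the tree can be drawn and inspected.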
After removing the outliers, we found a single highly similar cluster of isolates that spanned all three phenotypes (resistant, heteroresistant, and susceptible).
These 11 samples have pairwise Mash distances of approximately 0.0001, suggesting that any genetic element responsible for the different antibiotic resistance phenotypes must lie within the extremely small fraction of sequence that differs between these samples. We believe this cluster is a good representative of our complete dataset.
Difference analysis
To identify the genomic differences between our samples, we performed a genome-wide association study (GWAS). A GWAS correlates the presence or absence of variants in a genome with the presence or absence of a trait, which in our case is heteroresistance.
We executed a pan-genome GWAS using the tool 'bacterial GWAS'. Bacterial GWAS runs Prodigal on assembled contigs to annotate ORFs and performs CD-HIT clustering to construct a pan-genome. It then performs a GWAS using a logistic regression model on the given phenotypes (presence or absence of heteroresistance) and outputs a list of significant predicted genes along with the frequency of their occurrence in each phenotype. We then ran BLAST on the predicted genes to identify their function.
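To make the association step concrete, here is a minimal sketch of a single-gene logistic regression: phenotype (1 = heteroresistant) is regressed on gene presence (1 = present) and fitted by Newton-Raphson. This is not the 'bacterial GWAS' implementation; the function name and the counts are hypothetical, and a real analysis would also compute significance and correct for multiple testing.

```python
import math

def fit_logistic(x, y, iterations=25):
    """Fit y ~ sigmoid(b0 + b1 * x) by Newton-Raphson; x and y are 0/1 lists."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            # Gradient of the log-likelihood.
            g0 += yi - p
            g1 += (yi - p) * xi
            # Entries of the (negated) Hessian, X^T W X.
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        if det == 0:
            break  # singular system, e.g. the gene is present in all or no samples
        # Solve the 2x2 system for the Newton step.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1

# Hypothetical counts: gene present in 6 of 8 heteroresistant samples
# and in 2 of 8 non-heteroresistant samples.
presence = [1] * 6 + [0] * 2 + [1] * 2 + [0] * 6
hetero = [1] * 8 + [0] * 8
intercept, slope = fit_logistic(presence, hetero)
odds_ratio = math.exp(slope)
```

For one binary predictor the fitted slope equals the log odds ratio of the 2x2 presence-by-phenotype table, so a large positive slope means the gene is enriched in heteroresistant samples.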
Phylogeny Approach
Phylogeny-based approaches aim to pare down the analysis by focusing on small changes in a targeted set of genes between samples. Our group chose to focus on comparing highly conserved genes and single nucleotide polymorphisms. With these approaches we attempted both to sequence type our samples and to understand the underlying mechanisms of heteroresistance in Klebsiella pneumoniae.
Multilocus Sequence Typing (MLST)
Traditional MLST schemes focus on allelic diversity across a small subset of highly conserved genes commonly referred to as housekeeping genes. Compiling allelic variants into compound identification profiles creates unique types which have been shown to have specificity down to the strain level. However, because housekeeping genes are highly conserved, MLST schemes have difficulty distinguishing between closely related organisms, such as samples from the same culture.
MLST schemes take years and large amounts of funding to establish and verify. Luckily, an MLST scheme already existed for Klebsiella pneumoniae. We chose to use this existing scheme along with the known phenotypic profiles of our samples in the hope of being able to easily sequence type heteroresistance. STing, an MLST tool developed by the Jordan Lab at Georgia Tech, was used to quickly assign allelic profiles to our sample set.
The MLST scheme for Klebsiella pneumoniae contains 7 genes and is as follows:
- gapA - Glyceraldehyde-3-phosphate dehydrogenase A
- infB - Translation initiation factor IF-2
- mdh - Malate dehydrogenase
- pgi - Glucose-6-phosphate isomerase
- phoE - Outer membrane pore protein E
- rpoB - RNA polymerase beta subunit
- tonB - Protein TonB
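As a rough illustration of how an allelic profile over these seven loci maps to a sequence type, consider the lookup below. The allele numbers, the `ST-A`/`ST-B` labels, and the `sequence_type` helper are all hypothetical; they are not taken from the real Klebsiella pneumoniae scheme or from STing.

```python
# The seven loci of the scheme, in a fixed order.
LOCI = ("gapA", "infB", "mdh", "pgi", "phoE", "rpoB", "tonB")

# Hypothetical scheme table: allele numbers per locus -> sequence type label.
SCHEME = {
    (1, 1, 1, 1, 1, 1, 1): "ST-A",
    (1, 1, 2, 1, 1, 1, 4): "ST-B",
}

def sequence_type(alleles):
    """Look up the sequence type for a dict of locus -> allele number."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return SCHEME.get(profile, "novel")

# A sample whose profile matches the second (hypothetical) entry.
sample = {"gapA": 1, "infB": 1, "mdh": 2, "pgi": 1, "phoE": 1, "rpoB": 1, "tonB": 4}
sample_st = sequence_type(sample)

# A sample with an unseen allele combination is reported as novel.
unknown = dict(sample, tonB=9)
unknown_st = sequence_type(unknown)
```

Because a type is defined by the full combination of alleles, two samples that differ at any single locus receive different types, while samples that differ only outside these seven genes are indistinguishable.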
SHOW SOME RESULTS HERE
Single Nucleotide Polymorphism (SNP) Analysis
SNP analysis compares genetic sequences across samples in search of single nucleotide differences. Even single nucleotide changes have been shown to drastically affect gene expression, transcriptional mechanisms, and protein composition and configuration. Our team chose to compare these sites using [https://academic.oup.com/bioinformatics/article/31/17/2877/183216 kSNP3]. kSNP3 relies on pre-existing string manipulation programs to k-merize the sequences and compares the resulting k-mers across genomes based on sequence similarity. kSNP3 can also annotate the SNPs it finds using either NCBI reference genomes or provided GenBank files.
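The k-mer idea can be sketched as follows, under the assumption that a SNP locus is an odd-length k-mer whose flanking bases match across genomes while the central base differs. This toy version ignores reverse complements and is not kSNP3's actual code; the function name and sequences are illustrative only.

```python
from collections import defaultdict

def central_base_snps(genomes, k):
    """Find flanks whose central base differs between genomes.

    genomes: dict of name -> sequence; k must be odd.
    Returns {(left_flank, right_flank): {name: central_base}}.
    """
    if k % 2 != 1:
        raise ValueError("k must be odd so that each k-mer has a central base")
    half = k // 2
    by_flank = defaultdict(dict)
    for name, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            flank = (kmer[:half], kmer[half + 1:])
            by_flank[flank].setdefault(name, set()).add(kmer[half])
    snps = {}
    for flank, calls in by_flank.items():
        # Skip flanks that are ambiguous within a single genome.
        if all(len(bases) == 1 for bases in calls.values()):
            alleles = {name: next(iter(bases)) for name, bases in calls.items()}
            if len(set(alleles.values())) > 1:
                snps[flank] = alleles
    return snps

# Toy genomes (hypothetical): "b" carries a G -> C substitution.
toy_genomes = {
    "a": "ACGTTGCAAGGCTTAACGGATCCATTGACCGTA",
    "b": "ACGTTGCAAGGCTTAACGCATCCATTGACCGTA",
}
toy_snps = central_base_snps(toy_genomes, k=7)
```

Because loci are defined by their flanking sequence alone, this style of SNP discovery needs neither a reference genome nor a multiple sequence alignment.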
Using self-contained scripts, a k-mer size of 23, and GenBank files from the Functional Annotation team, we analyzed both the whole sample population and our previously identified cluster of interest. Initial analysis of the trees generated by kSNP3 did not reveal clear phylogenetic groupings along phenotypic populations. The tree for our cluster of interest is shown below.
INSERT KSNP TREES HERE
After obtaining the trees and the SNPs contained within, our focus shifted toward finding SNPs which were homogeneous within and unique to our phenotypic populations.
In the end, we were unable to discover any SNPs which were homogeneous across our heteroresistant sample population and were not found in the susceptible and/or resistant populations. This led us to conclude that heteroresistance in our samples is not explained by any single nucleotide polymorphism shared across the heteroresistant population.
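The filter described above can be sketched as a search for loci where every sample of the target phenotype carries the same base and no other sample carries it. The data layout, sample names, and `phenotype_specific_snps` helper are hypothetical and do not reflect kSNP3's output format.

```python
def phenotype_specific_snps(snp_calls, phenotypes, target):
    """Return loci where all `target` samples share one base found in no other sample.

    snp_calls: {locus: {sample: base}}; phenotypes: {sample: label}.
    """
    hits = {}
    for locus, calls in snp_calls.items():
        inside = {calls.get(s) for s, p in phenotypes.items() if p == target}
        outside = {calls.get(s) for s, p in phenotypes.items() if p != target}
        # Require exactly one base, called in every target sample, absent elsewhere.
        if len(inside) == 1 and None not in inside and not (inside & outside):
            hits[locus] = next(iter(inside))
    return hits

# Hypothetical samples and calls for illustration.
toy_phenotypes = {
    "s1": "heteroresistant",
    "s2": "heteroresistant",
    "s3": "susceptible",
    "s4": "resistant",
}
toy_calls = {
    "locus1": {"s1": "A", "s2": "A", "s3": "G", "s4": "G"},  # specific to target
    "locus2": {"s1": "A", "s2": "C", "s3": "C", "s4": "C"},  # not homogeneous
    "locus3": {"s1": "T", "s2": "T", "s3": "T", "s4": "G"},  # not unique
}
toy_hits = phenotype_specific_snps(toy_calls, toy_phenotypes, "heteroresistant")
```

This criterion is deliberately strict: a SNP present in most but not all heteroresistant samples, or one that acts together with other variants, would not be reported.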
Results and Discussion
References
Castañeda-García, Alfredo, Jesús Blázquez, and Alexandro Rodríguez-Rojas. "Molecular mechanisms and clinical impact of acquired and intrinsic fosfomycin resistance." Antibiotics 2.2 (2013): 217-236.
Nikolaidis I, Favini-Stabile S, Dessen A. 2014. Resistance to antibiotics targeted to the bacterial cell wall. Protein Sci 23: 243–259.
Kidd, Timothy J et al. “A Klebsiella Pneumoniae Antibiotic Resistance Mechanism That Subdues Host Defences and Promotes Virulence.” EMBO Molecular Medicine 9.4 (2017): 430–447.
Guo, Qinglan et al. “Glutathione-S-Transferase FosA6 of Klebsiella Pneumoniae Origin Conferring Fosfomycin Resistance in ESBL-Producing Escherichia Coli.” Journal of Antimicrobial Chemotherapy 71.9 (2016): 2460–2465.
Gardner, Shea N., Tom Slezak, and Barry G. Hall. "kSNP3.0: SNP detection and phylogenetic analysis of genomes without genome alignment or reference genome." Bioinformatics 31.17 (2015): 2877-2878.
Kim, Mincheol, et al. "Towards a taxonomic coherence between average nucleotide identity and 16S rRNA gene sequence similarity for species demarcation of prokaryotes." International journal of systematic and evolutionary microbiology 64.2 (2014): 346-351.