Team II Webserver Group

From Compgenomics 2018
Yinquan (talk | contribs)
Shrey (talk | contribs)

Revision as of 18:18, 20 April 2018

Introduction

Background

A web server hosts web pages and serves them on request. It accepts requests over HTTP (Hypertext Transfer Protocol) and stores and processes data in response. Here, we build a web server that predicts a phenotype from the genetic information it is given.

Goal

The web server should provide the following:

1. An easy-to-use tool to help distinguish between Klebsiella phenotypes, by implementing the work of the comparative genomics group.

2. A robust and easy-to-use web-based de novo assembly tool.

3. A feature to visualize and download the results of the 258 genomes.

Design Principles

1. Minimal

2. Mobile Friendly

3. Short Load Time

4. Contrasting Colors

Functionalities Offered

Genome Assembly

Users will have the option of uploading either an assembled genome or short reads from NGS methods. If short reads are provided, a de novo assembly will be performed using the assembler Skesa. The resulting assembly will be available for download and will be used for downstream processes.
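The branch between the two upload types can be sketched as follows. This is only an illustration, not the server's actual code: it assumes that reads arrive as FASTQ and assembled genomes as FASTA, and the function names `detect_sequence_format` and `needs_assembly` are hypothetical.

```python
import io


def detect_sequence_format(handle):
    """Return 'fasta' or 'fastq' based on the first non-blank line of a text handle."""
    for line in handle:
        stripped = line.strip()
        if not stripped:
            continue  # skip leading blank lines
        if stripped.startswith(">"):
            return "fasta"  # FASTA records start with '>'
        if stripped.startswith("@"):
            return "fastq"  # FASTQ records start with '@'
        break
    raise ValueError("input is neither FASTA nor FASTQ")


def needs_assembly(handle):
    """Assumption: FASTQ uploads are reads to assemble; FASTA uploads are assemblies."""
    return detect_sequence_format(handle) == "fastq"


if __name__ == "__main__":
    upload = io.StringIO("@read1\nACGT\n+\nIIII\n")
    print(needs_assembly(upload))
```

A check like this would run before any assembler is started, so that an already assembled genome skips straight to the downstream steps.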

Strain and Species Identification

Strain identification is performed using the k-mer based approach implemented in StrainSeeker. Each user-provided input sample is reduced to a pool of unique k-mers that is compared to k-mer pools from samples in the NCBI sequence database. The observed and expected k-mer pool overlaps between user-provided samples and NCBI samples are used to taxonomically place samples with unknown strain identity.
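The idea of reducing a sample to a pool of unique k-mers and measuring its overlap with a reference pool can be illustrated with a small sketch. It is not StrainSeeker's implementation: it covers only the observed overlap, not the expected overlap or the taxonomic placement, and the names `kmer_pool` and `pool_overlap` are hypothetical.

```python
def kmer_pool(sequence, k):
    """Return the set of unique k-mers in a sequence (empty if the sequence is shorter than k)."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def pool_overlap(sample_pool, reference_pool):
    """Fraction of reference k-mers that are also present in the sample pool."""
    if not reference_pool:
        return 0.0
    return len(sample_pool & reference_pool) / len(reference_pool)


if __name__ == "__main__":
    sample = kmer_pool("ACGTAC", 3)     # ACG, CGT, GTA, TAC
    reference = kmer_pool("GTACCC", 3)  # GTA, TAC, ACC, CCC
    print(pool_overlap(sample, reference))  # 0.5
```

Using sets makes each k-mer count once regardless of how often it occurs, which matches the "pool of unique k-mers" described above.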

Antibiotic-resistance Profiling

Each user-provided sample will be compared to the Comprehensive Antibiotic Resistance Database (CARD) using the toolkit RGI. RGI discovers high-confidence homologues of known antibiotic-resistance genes using DIAMOND homology searches. RGI also incorporates SNP models to predict genetic variants that are likely to confer new antibiotic resistances. The entire antibiotic-resistance profile of each strain is visualized in a wheel chart (provided by RGI) that allows the user to explore the results by drug class, mechanism of resistance, and antibiotic target.
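Exploring results by drug class or mechanism amounts to grouping hits by one column of a results table. The sketch below shows that grouping on a toy tab-separated table. The column name "Drug Class", the semicolon-separated values, and the gene names are assumptions made for illustration and should be checked against the actual RGI output; `count_by_category` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def count_by_category(table_text, column):
    """Count hits per category in a tab-separated table with a header row.

    Assumption: a cell may list several categories separated by ';'.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter="\t")
    counts = Counter()
    for row in reader:
        for value in row[column].split(";"):
            value = value.strip()
            if value:
                counts[value] += 1
    return counts


if __name__ == "__main__":
    # Toy table with hypothetical gene names and an assumed column layout.
    toy = (
        "Gene\tDrug Class\n"
        "geneA\tcarbapenem; cephalosporin\n"
        "geneB\tcarbapenem\n"
    )
    print(count_by_category(toy, "Drug Class"))
```

The same function can be pointed at a different column, for example one holding the resistance mechanism, to produce the other views mentioned above.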

Virulence Factor Profiling

Each user-provided sample will be searched with BLAST against the Virulence Factor Database (VFDB) to identify homologues of known virulence factors. A non-redundant BLAST output in tabular format (outfmt 6) will be provided for download by the user.
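One way to make a tabular BLAST output non-redundant is to keep only the best-scoring hit per query. The sketch below assumes the default outfmt 6 column order (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). Whether the server defines "non-redundant" this way is an assumption, and `best_hits` is a hypothetical name.

```python
def best_hits(lines):
    """Keep the highest-bitscore line per query from default outfmt 6 rows."""
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, bitscore = fields[0], float(fields[11])
        # Replace the stored hit only when the new one scores strictly higher.
        if query not in best or bitscore > best[query][1]:
            best[query] = (line, bitscore)
    return [line for line, _ in best.values()]


if __name__ == "__main__":
    # Hypothetical rows; query and subject names are placeholders.
    rows = [
        "q1\tvfA\t98.0\t100\t2\t0\t1\t100\t1\t100\t1e-50\t180.0",
        "q1\tvfB\t90.0\t100\t10\t0\t1\t100\t1\t100\t1e-30\t120.0",
        "q2\tvfC\t95.0\t80\t4\t0\t1\t80\t1\t80\t1e-40\t150.0",
    ]
    for hit in best_hits(rows):
        print(hit)
```

Queries are returned in the order in which they first appear, so the filtered file keeps the layout of the original output.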

Tools

1) Skesa

2) StrainSeeker

StrainSeeker is a program for detecting bacterial strains from raw sequencing reads. Compared to other similar programs, it offers the following advantages:

1. Fully customizable database - use your own strains of interest or download our database

2. Detect novel strains that are related to strains in the database

3. Quickly handle large amounts of data

4. Results given all the way down to the strain level!

Input: fastq or fasta sample

Output tree: [[File: StrainSeeker.png |700px| center ]]

3) RGI

4) VFDB