Help me with this Computational Genomics project

Cancelled. Posted 6 years ago. Paid on delivery.

Browse the TCGA Data Portal: [login to view URL]

You will use the GBM and LGG RNA-sequencing datasets. The following web pages will help you understand this data set:

[login to view URL]

[login to view URL]

Each case (patient) is represented as a file. File names will look like:

Compressed: [login to view URL]

Fully uncompressed: [login to view URL]

For each gene, there should be an entry such as ENSG00000242268.2 and a corresponding value.

Convert the entries to gene names using EnsembltoGeneID.txt. You will need to combine these files to create your data tables.
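The conversion and merge steps could be sketched as below. This is a minimal sketch under stated assumptions: both EnsembltoGeneID.txt and the per-case files are taken to be two-column, tab-separated text without header rows, and the function names (`load_id_map`, `load_expression`, `combine`) are hypothetical. Check the real file layouts before relying on it.

```python
import csv
import io


def load_id_map(handle):
    """Read an assumed two-column, tab-separated Ensembl-ID-to-gene-name table."""
    id_map = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) >= 2:
            # Drop the version suffix so ENSG00000242268.2 matches ENSG00000242268.
            id_map[row[0].split(".")[0]] = row[1]
    return id_map


def load_expression(handle, id_map):
    """Read one per-case file of (Ensembl ID, value) rows, keyed by gene name."""
    values = {}
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 2:
            continue
        gene = id_map.get(row[0].split(".")[0])
        if gene is not None:  # IDs missing from the mapping are skipped
            values[gene] = float(row[1])
    return values


def combine(per_sample):
    """Merge {sample: {gene: value}} into one table {gene: {sample: value}}."""
    table = {}
    for sample, values in per_sample.items():
        for gene, value in values.items():
            table.setdefault(gene, {})[sample] = value
    return table


# Toy data only: GENE_A is a placeholder name, not the real symbol.
id_map = load_id_map(io.StringIO("ENSG00000242268\tGENE_A\n"))
case = load_expression(io.StringIO("ENSG00000242268.2\t1.5\n"), id_map)
table = combine({"sample_1": case})
```

If several Ensembl IDs map to the same gene name, this sketch keeps only the last value read; whether to sum, average, or keep them separate is a decision for the analysis.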

Use [login to view URL] to convert filenames to sample IDs such as TCGA-06-5859-01A-01R-1849-01.

Then you need to annotate these cases for subtypes using Table S1 on [login to view URL](15)01692-X

(You may also annotate the sample IDs using https://wiki.nci.nih.gov/display/TCGA/TCGA+barcode)
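A sample ID like the one above can be split into its barcode fields with a few lines of Python. The sketch below assumes the full seven-part aliquot barcode layout (project, tissue source site, participant, sample and vial, portion and analyte, plate, center); the helper names are hypothetical.

```python
from typing import NamedTuple


class Barcode(NamedTuple):
    project: str
    tss: str          # tissue source site
    participant: str
    sample_type: str  # two-digit code; 01-09 are tumor types, 10-19 normal
    vial: str
    portion: str
    analyte: str      # e.g. R for RNA
    plate: str
    center: str


def parse_barcode(barcode):
    """Split a full aliquot barcode into its fields."""
    parts = barcode.split("-")
    if len(parts) != 7:
        raise ValueError(f"expected 7 dash-separated fields, got {len(parts)}")
    project, tss, participant, sample, portion, plate, center = parts
    return Barcode(project, tss, participant, sample[:2], sample[2:],
                   portion[:2], portion[2:], plate, center)


def patient_id(barcode):
    """First three fields identify the patient, e.g. TCGA-06-5859."""
    return "-".join(barcode.split("-")[:3])


def is_tumor(barcode):
    """True when the sample-type code falls in the tumor range 01-09."""
    return 1 <= int(parse_barcode(barcode).sample_type) <= 9


fields = parse_barcode("TCGA-06-5859-01A-01R-1849-01")
```

The patient-level ID is usually what links an expression file to a row in a clinical or subtype table, which is why it is exposed as its own helper here.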

Part A:

Annotate samples by expression- and methylation-based clusters. Each cluster will contain both LGG and GBM samples. Compare LGG and GBM samples within the IDH-mut (LGm1-2-3) samples. Then compare LGG and GBM samples within the IDH-wt (LGm4-5-6) samples.
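One way to set up the four comparison groups is to map each methylation cluster to its IDH group and then bucket samples by study. This is a sketch only: the row keys `sample`, `study`, and `cluster` are hypothetical stand-ins for whatever the columns in Table S1 are actually called, and the rows shown are toy data.

```python
# Cluster-to-IDH-group mapping as given in the brief.
IDH_GROUP = {
    "LGm1": "IDH-mut", "LGm2": "IDH-mut", "LGm3": "IDH-mut",
    "LGm4": "IDH-wt", "LGm5": "IDH-wt", "LGm6": "IDH-wt",
}


def split_samples(annotations):
    """Group sample IDs by (IDH group, study), e.g. ("IDH-mut", "LGG")."""
    groups = {}
    for row in annotations:
        idh = IDH_GROUP.get(row["cluster"])
        if idh is None:
            continue  # skip samples with a missing or unknown cluster label
        groups.setdefault((idh, row["study"]), []).append(row["sample"])
    return groups


# Toy annotation rows with made-up sample names.
rows = [
    {"sample": "S1", "study": "LGG", "cluster": "LGm2"},
    {"sample": "S2", "study": "GBM", "cluster": "LGm1"},
    {"sample": "S3", "study": "GBM", "cluster": "LGm5"},
    {"sample": "S4", "study": "LGG", "cluster": "NA"},
]
groups = split_samples(rows)
```

Each t-test comparison then uses two of these buckets, for example `("IDH-mut", "LGG")` against `("IDH-mut", "GBM")`.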

For each gene, compare the two groups with a t-test and generate a p-value. Then order the genes by increasing p-value and explain the top 20 genes you find.
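The per-gene test and ranking could look like the sketch below, which uses `scipy.stats.ttest_ind` across the rows of two expression matrices. The function name and the matrix layout (genes in rows, samples in columns) are assumptions, and so is the choice of Welch's unequal-variance variant; pass `equal_var=True` if the classic Student's t-test is intended.

```python
import numpy as np
from scipy import stats


def rank_genes_by_ttest(genes, lgg, gbm, top=20):
    """Return the `top` genes with the smallest t-test p-values.

    lgg, gbm: arrays of shape (n_genes, n_samples), rows aligned with `genes`.
    """
    # One test per row (gene); equal_var=False selects Welch's t-test.
    result = stats.ttest_ind(lgg, gbm, axis=1, equal_var=False)
    pvalues = np.asarray(result.pvalue)
    order = np.argsort(pvalues)  # ascending; NaN p-values sort last
    return [(genes[i], float(pvalues[i])) for i in order[:top]]


# Toy data: three placeholder genes, ten samples per group.
rng = np.random.default_rng(0)
lgg = rng.normal(0.0, 1.0, size=(3, 10))
gbm = rng.normal(0.0, 1.0, size=(3, 10))
gbm[1] += 10.0  # make the second gene clearly different between groups
ranked = rank_genes_by_ttest(["GENE_A", "GENE_B", "GENE_C"], lgg, gbm, top=2)
```

Running the same function once on the IDH-mut matrices and once on the IDH-wt matrices gives the two top-20 lists the brief asks for.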

Finally, compare the two lists you find.
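Comparing the two ranked lists can start with a simple overlap summary. A minimal sketch, with a hypothetical function name and placeholder gene names:

```python
def compare_gene_lists(list_a, list_b):
    """Split two gene lists into shared and list-specific genes, keeping rank order."""
    set_a, set_b = set(list_a), set(list_b)
    return {
        "shared": [g for g in list_a if g in set_b],
        "only_a": [g for g in list_a if g not in set_b],
        "only_b": [g for g in list_b if g not in set_a],
    }


# Placeholder names standing in for the IDH-mut and IDH-wt top lists.
summary = compare_gene_lists(["G1", "G2", "G3"], ["G3", "G4", "G1"])
```

Genes in `shared` differ between LGG and GBM in both subtypes, while the two list-specific groups point to subtype-dependent differences.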

The main theme of your report should be the comparison of low-grade and high-grade gliomas within the different glioma subtypes. The genes you find might give insight into the molecular reasons behind glioma aggression.

Write a report with abstract, introduction, results and discussion sections. You may use the papers you presented in the class as examples.

The abstract should include the summary of your analyses.

The introduction should include information on glioma, TCGA, RNA-sequencing, and the t-test. You should include relevant studies as references.

Results should include your findings in your analyses. You need to include figures and/or tables as visuals.

The discussion should include your interpretations of results and speculations for future research that should be done.

Part B:

Annotate samples by expression- and methylation-based clusters. Each cluster will contain both LGG and GBM samples. Compare LGG and GBM samples within the IDH-mut (LGm1-2-3) samples. Then use regression ([login to view URL]) to find genes that correlate with patient age for the two subtypes.

For each gene, you need to run a regression (age should be the explanatory variable and the gene's expression the dependent variable) and generate a p-value. Then order the genes by increasing p-value and explain the top 20 genes you find.
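The regression step can be sketched with `scipy.stats.linregress`, fitting one simple linear regression per gene with age as the explanatory variable. The function name and matrix layout are assumptions, and the sketch expects samples with missing age to have been dropped beforehand.

```python
import numpy as np
from scipy import stats


def rank_genes_by_age(genes, expression, age, top=20):
    """Return the `top` genes whose expression is most significantly linear in age.

    expression: array of shape (n_genes, n_samples); age: length n_samples.
    """
    pvalues = []
    for row in expression:
        # x = age (explanatory), y = expression of one gene (dependent).
        fit = stats.linregress(age, row)
        pvalues.append(fit.pvalue)  # two-sided test that the slope is zero
    order = np.argsort(pvalues)
    return [(genes[i], float(pvalues[i])) for i in order[:top]]


# Toy data: the second placeholder gene rises with age, the first is noise.
rng = np.random.default_rng(1)
age = np.arange(20.0, 60.0, 2.0)
expression = np.vstack([
    rng.normal(0.0, 1.0, size=age.size),
    0.5 * age + rng.normal(0.0, 0.1, size=age.size),
])
ranked_age = rank_genes_by_age(["GENE_A", "GENE_B"], expression, age, top=2)
```

Calling this once per subtype, with that subtype's samples and ages, produces the two lists to compare.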

Finally, compare the two lists you find.

Data Entry Java Python

Project ID: #16233057
