Academic year 2021/2022
- Course ID
- Prof. Francesca Cordero
- Prof. Marco Beccuti
- Degree course
- Cellular and Molecular Biology
- 1st year
- Related or integrative
- Course disciplinary sector (SSD)
- INF/01 - Informatica (Computer Science)
- Formal authority
- Type of examination
- Written followed by interview
- Formally none.
R language: data structures and the usage of functions.
Knowledge of algorithms, artificial intelligence and basic biology would help.
- Preparatory for
- Systems biology
Course summary
This course contributes to the learning objectives of the Biomolecular area of the Master's degree in Cellular and Molecular Biology (Biologia Cellulare e Molecolare), providing the knowledge and practical skills needed to analyze deep sequencing data obtained with high-throughput techniques.
In recent years it has become essential to understand how to design and analyze a biological experiment based on high-throughput techniques. Since the complete sequence of the human genome became available to researchers, the diagnosis and treatment of human diseases have evolved on the basis of new insights from genomics and transcriptomics. The emerging field of bioinformatics is crucial for processing and mining the data, and for generating new testable hypotheses.
The course provides a general overview of the main technologies used to investigate the genomic and transcriptomic layers: microarrays in the past and next generation sequencing today. It also gives a complete overview of the major computational pipelines and algorithms needed to analyze deep sequencing data.
Students will acquire advanced knowledge of experimental design and will know the main steps involved in the analysis and visualization of deep sequencing data.
The course is organized in two closely connected modules. One module introduces computer science concepts, local and global alignment algorithms, and clustering algorithms.
These concepts are then integrated into the explanation of how to analyze deep sequencing data and interpret the results obtained.
Expected learning outcomes
Students will be able to understand a biological problem and to propose an optimal experimental design to answer the biological questions posed. They will also be able to understand and discuss scientific papers based on the technologies covered in the course.
High throughput technologies introduction
From spotted/printed arrays to Affymetrix.
Analysis of microarray data. History of DNA sequencing.
Next Generation Sequencing:
Coverage in massively parallel experiments, library multiplexing, single-end (SE) versus paired-end (PE) reads
Library preparation, FASTQ file.
Quality control, PCA and hierarchical clustering (HCL)
Basic notions of alignment. How to map reads from deep sequencing data.
TopHat algorithm for read alignment, annotation files and SAM/BAM output file formats.
Mapping and transcript reconstruction
Transcript quantification and differential expression analysis
Differential expression analysis and results visualisation
Interpretation of the results:
Gene Ontology: structure and annotation. Hypergeometric test.
Gene Ontology applications and Gene Set Enrichment Analysis
Applications of deep sequencing technology: DNA, RNA, small non-coding RNA, fusion genes.
Dynamic programming for sequence alignment: local, global and multiple sequence alignment algorithms.
Clustering and Classification algorithms.
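As an illustration of the dynamic programming topic listed above, the following minimal Python sketch scores a global alignment (Needleman-Wunsch style) and a local alignment (Smith-Waterman style) of two sequences. It is not course material: the function names and the scoring scheme (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen only for the example.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best global alignment score of a and b (illustrative scoring values)."""
    rows, cols = len(a) + 1, len(b) + 1
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per character.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = score[i - 1][j] + gap      # gap inserted in b
            left = score[i][j - 1] + gap    # gap inserted in a
            score[i][j] = max(diag, up, left)
    return score[-1][-1]


def local_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Best local alignment score of a and b (illustrative scoring values)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Clamping at 0 lets an alignment restart anywhere in either sequence.
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            best = max(best, score[i][j])
    return best
```

The only differences between the two functions are the initialization of the first row and column, the clamp at zero, and where the final score is read: the bottom-right cell for the global case, the maximum over the whole table for the local case.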
The course consists of formal in-class lectures.
Emphasis is given to the scientific topics of the syllabus, by surveying the state of the art.
Learning assessment methods
Exercises and discussions with students during the lessons are the main tools for monitoring learning.
The final written test lasts 90 minutes and assesses the student's ability to answer theoretical questions and to solve problems related to the topics introduced during the course. It generally consists of 7 or 8 open questions and/or exercises. The score for each question is indicated on the exam paper and is proportional to the complexity of the question.
The oral exam consists of the review of a scientific article, chosen by the student and related to the topics of the course. To conclude the presentation of the paper, the student must prepare a slide indicating the advantages and limitations reported in the paper, based on the experience acquired by the end of the course.
Suggested readings and bibliography
RNA-seq Data Analysis: A Practical Approach, by Eija Korpelainen, Jarno Tuimala, Panu Somervuo, Mikael Huss and Garry Wong
Bioinformatics Algorithms: An Active Learning Approach, by Pavel Pevzner and Phillip Compeau
Lessons: from 07/03/2019 to 31/05/2019
- Enrollment opening date
- 01/03/2020 at 00:00
- Enrollment closing date
- 31/12/2022 at 23:55