




Academic year 2018/2019

Course ID
Teaching staff
Francesca Cordero
Dott. Marco Beccuti
Degree course
Cellular and Molecular Biology
1st year
Teaching period
Semester 2
Related or integrative
Course disciplinary sector (SSD)
INF/01 - informatica
Formal authority
Type of examination
Written followed by interview
Prerequisites
Formally none.
R language: data structures and the usage of functions.
Knowledge of algorithms, artificial intelligence and basic biology would help.
Preparatory for
Systems biology

Course summary


Course objectives

This course contributes to the learning objectives of the biomolecular area of the Master's degree in Cellular and Molecular Biology (Biologia Cellulare e Molecolare). It provides the knowledge and practical skills needed to analyze deep sequencing data obtained with high-throughput techniques.

In recent years it has become essential to understand how to design and analyze a biological experiment based on high-throughput techniques. Since the complete sequence of the human genome became available to researchers, the diagnosis and treatment of human diseases have evolved on the basis of new insights from genomics and transcriptomics. The emerging field of bioinformatics is crucial for processing and mining the data, and for generating new testable hypotheses.

The course provides a general overview of the main technologies used to investigate the genomic and transcriptomic layers: microarrays in the past and next generation sequencing today. It also gives a complete overview of the major computational pipelines and algorithms needed to analyze deep sequencing data.

Students will acquire advanced knowledge of experimental design and will learn the main steps involved in the analysis and visualization of deep sequencing data.

The course is organized in two closely connected modules. On one side, the course introduces computer science concepts, local and global alignment algorithms, and clustering algorithms.
Those concepts are then integrated into the explanation of how to analyze deep sequencing data and interpret the results obtained.


Learning outcomes

Students will be able to understand a biological problem and to propose an optimal experimental design to answer the biological questions posed. They will also be able to understand and discuss scientific papers based on the technologies covered in the course.


Course delivery

The course consists of formal in-class lectures.

Emphasis is placed on the scientific topics of the syllabus, surveying the state of the art.



Learning assessment methods

Exercises and discussions with the students during the lessons are the main tools for monitoring learning progress.

The final written test lasts 90 minutes and assesses the student's ability to answer theoretical questions and to solve problems related to the topics introduced during the course. It generally consists of 7 or 8 open questions and/or exercises. The score for each question is indicated on the exam paper and is proportional to the complexity of the question.

The oral exam consists of a review of a scientific article chosen by the student and related to the topics of the course. At the end of the presentation, the student must show a prepared slide indicating the advantages and limitations of the work reported in the paper, based on the experience he/she has acquired during the course.



Introduction to high-throughput technologies

Microarray technologies:
From spotted/printed arrays to Affymetrix.
Analysis of microarray data. History of DNA sequencing.

Next Generation Sequencing:
Coverage in massively parallel experiments, library multiplexing, single-end (SE) versus paired-end (PE) reads.
Library preparation, FASTQ files.
Quality control, PCA and HCL
Basic notions of alignment. How to map reads from deep sequencing data.
TopHat algorithm for read alignment, annotation files, and the SAM/BAM output file format.
Mapping and transcript reconstruction
Transcript reconstruction
Transcript quantification and differential expression analysis
Differential expression analysis and results visualisation
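
The notion of coverage listed at the start of this block can be made concrete with a small calculation. The sketch below assumes the usual back-of-the-envelope estimate (total sequenced bases divided by the size of the target); the function name and all numbers are hypothetical and chosen only for illustration.

```python
def expected_coverage(num_fragments, read_length, target_size, paired_end=False):
    """Approximate mean coverage: sequenced bases divided by target size.

    num_fragments: number of sequenced fragments (hypothetical value)
    read_length:   length of each read in bases
    target_size:   size of the genome or target region in bases
    paired_end:    PE runs produce two reads per fragment, SE runs one
    """
    reads_per_fragment = 2 if paired_end else 1
    total_bases = num_fragments * reads_per_fragment * read_length
    return total_bases / target_size


# Hypothetical run: 1 million fragments, 100-base reads, 5 Mb target.
se = expected_coverage(1_000_000, 100, 5_000_000)
pe = expected_coverage(1_000_000, 100, 5_000_000, paired_end=True)
print(se, pe)
```

This is only a mean value over the whole target. In RNA-seq experiments the coverage of a given transcript also depends on its expression level.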

Interpretation of the results:
Gene Ontology: structure and annotation. Hypergeometric test.
Gene Ontology applications and Gene Set Enrichment Analysis
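
As an illustration of the hypergeometric test named above, the following minimal sketch computes a one-sided enrichment p-value for a single Gene Ontology term using only the Python standard library. The function name, argument names, and the toy numbers are assumptions made for this example and are not course material.

```python
from math import comb


def hypergeom_enrichment_pvalue(population, annotated, sample, overlap):
    """Return P(X >= overlap) under the hypergeometric distribution.

    population: total number of genes considered
    annotated:  genes in the population carrying the GO term
    sample:     size of the gene list under study (e.g. DE genes)
    overlap:    genes in the list that carry the GO term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the probability of every outcome at least as extreme as observed.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy example: 10 genes, 4 annotated, a list of 3 genes, 2 of them annotated.
p = hypergeom_enrichment_pvalue(10, 4, 3, 2)
print(p)
```

In a real analysis one p-value is computed per term, so a multiple-testing correction is normally applied afterwards.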

Applications of deep sequencing technology: DNA, RNA, small non-coding RNA, fusion genes.

Dynamic programming for sequence alignment: local, global, and multiple sequence alignment algorithms.
Clustering and classification algorithms.
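
To give an idea of the dynamic programming approach to global alignment, here is a minimal sketch of the Needleman-Wunsch score computation. The scoring values (match +1, mismatch -1, linear gap penalty -2) and the function name are assumptions chosen for illustration; the course may use a different scoring scheme.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of sequences a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per base.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    # Each cell takes the best of: align two bases, or open a gap in a or b.
    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,       # a[i-1] aligned to a gap
                score[i][j - 1] + gap,       # b[j-1] aligned to a gap
            )
    return score[-1][-1]


print(global_alignment_score("ACGT", "AGT"))
```

Local alignment (Smith-Waterman) follows the same recurrence, except that cell values are not allowed to drop below zero and the result is the maximum over the whole table.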

Suggested readings and bibliography


RNA-seq Data Analysis: A Practical Approach, by Eija Korpelainen, Jarno Tuimala, Panu Somervuo, Mikael Huss, and Garry Wong

Bioinformatics Algorithms: An Active Learning Approach, by Pavel Pevzner and Phillip Compeau


Class schedule

Lessons: from 07/03/2019 to 31/05/2019

Last update: 24/03/2019 10:38