Academic year 2020/2021
- Course ID
- Prof.ssa Francesca Cordero
Prof. Michele Caselle
Dott. Matteo Osella
- Degree course
- Cellular and Molecular Biology
- 2nd year
- Related or integrative
- Course disciplinary sector (SSD)
- FIS/02 - Theoretical physics, mathematical models and methods
INF/01 - Computer science
- Formal authority
- Type of examination
- Written followed by interview
Course summary
This course contributes to the learning objectives of the Biomolecular area of the Master's degree in Cellular and Molecular Biology (Biologia Cellulare e Molecolare). It provides knowledge and practical skills ranging from the use of complex data structures for genome mapping and assembly to the analysis of the temporal behavior of macromolecules.
The goal of this course is to train the next generation of scientists in technology-intensive, quantitative, systems-level approaches to molecular biology. We aim to teach students the data structures and mathematical formalisms needed to analyze data and make sense of them.
The first part of the course focuses on the advantages of using advanced data structures to map and assemble entire genomes. An overview of the main algorithms for the pattern matching problem is also given.
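As an illustration of the pattern matching algorithms mentioned above, the following minimal Python sketch builds a Burrows-Wheeler transform by sorting rotations and counts exact occurrences of a pattern with backward search. The function names `bwt` and `count_matches` and the `$` end marker are assumptions made for this example, not course material. The rotation sort is only practical for short strings.

```python
from collections import Counter


def bwt(text, end="$"):
    """Burrows-Wheeler transform: last column of the sorted rotations."""
    s = text + end  # the end marker must sort before every other symbol
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def count_matches(bwt_str, pattern):
    """Count exact occurrences of pattern using backward search on the BWT."""
    counts = Counter(bwt_str)

    # first[ch] = index of the first row starting with ch in the sorted rotations
    first, total = {}, 0
    for ch in sorted(counts):
        first[ch] = total
        total += counts[ch]

    # occ[ch][i] = number of occurrences of ch in bwt_str[:i]
    occ = {ch: [0] * (len(bwt_str) + 1) for ch in counts}
    for i, current in enumerate(bwt_str):
        for ch in counts:
            occ[ch][i + 1] = occ[ch][i] + (1 if ch == current else 0)

    lo, hi = 0, len(bwt_str)  # half-open range of rows matching the suffix so far
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
print(transformed)                        # annb$aa
print(count_matches(transformed, "ana"))  # 2
```

Read mappers built on this idea store sampled versions of the `occ` table rather than the full table used here.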
Since different types of information can be represented as networks in order to model the cell, the course covers the construction and interpretation of large biological networks. The meaning of the nodes and edges in a network representation depends on the type of data used to build the network, and this must be taken into account when analyzing it. Several examples will be discussed during the course.
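A small Python sketch can make this point concrete. The gene names and both edge lists below are invented for illustration only: the same three nodes are connected once by undirected co-expression edges and once by directed regulatory edges, so the node degree has a different meaning in each case.

```python
from collections import Counter


def degree(edges):
    """Undirected degree: number of edges touching each node."""
    deg = Counter()
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def in_out_degree(edges):
    """Directed degrees: (incoming, outgoing) edge counts per node."""
    indeg, outdeg = Counter(), Counter()
    for source, target in edges:
        outdeg[source] += 1
        indeg[target] += 1
    return indeg, outdeg


# Hypothetical co-expression network: an edge means "expression levels correlate".
coexpression = [("geneA", "geneB"), ("geneB", "geneC"), ("geneA", "geneC")]

# Hypothetical regulatory network: an edge means "source regulates target".
regulation = [("geneA", "geneB"), ("geneA", "geneC")]

print(degree(coexpression)["geneA"])        # 2 correlated partners
indeg, outdeg = in_out_degree(regulation)
print(outdeg["geneA"], indeg["geneA"])      # 2 targets, 0 regulators
```

In the first network a degree of 2 for `geneA` only says that it correlates with two genes, while in the second an out-degree of 2 says that it regulates two targets.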
The second part of the course studies the complexity of biological regulatory networks. This complexity often defies the biologist's intuition and calls for proper mathematical methods to model the structure of these networks and to delineate their dynamical properties. We will then introduce and explain the Petri net formalism.
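To give a flavour of the formalism, the sketch below encodes a toy Petri net in Python. Places hold tokens, and a transition can fire only when every input place holds enough tokens. The places (`gene`, `mRNA`, `protein`), the two transitions and the dictionary layout are assumptions chosen for this example, not a model taken from the course.

```python
def enabled(marking, transition):
    """A transition is enabled if each input place holds enough tokens."""
    return all(marking.get(place, 0) >= n
               for place, n in transition["consume"].items())


def fire(marking, transition):
    """Return the marking reached by firing the transition once."""
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)  # leave the original marking untouched
    for place, n in transition["consume"].items():
        new_marking[place] = new_marking.get(place, 0) - n
    for place, n in transition["produce"].items():
        new_marking[place] = new_marking.get(place, 0) + n
    return new_marking


# Hypothetical toy net: the gene and the mRNA act as catalysts, so each
# transition consumes them and puts them back together with the product.
transcription = {"consume": {"gene": 1}, "produce": {"gene": 1, "mRNA": 1}}
translation = {"consume": {"mRNA": 1}, "produce": {"mRNA": 1, "protein": 1}}

marking = {"gene": 1, "mRNA": 0, "protein": 0}
print(enabled(marking, translation))   # False: no mRNA token yet
marking_1 = fire(marking, transcription)
marking_2 = fire(marking_1, translation)
print(marking_2)                       # {'gene': 1, 'mRNA': 1, 'protein': 1}
```

Firing transitions repeatedly in different orders generates the possible sequences of markings, which is the basis for studying the dynamical properties of the modelled network.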
Some of the general areas of the course include:
Large-scale genetic network analysis and reconstruction
Molecular modeling of genetic regulatory circuits
Real time, single cell analyses of genetic regulatory circuits
Algorithm development for comparison of DNA, RNA, and protein sequences
Expected learning outcomes
KNOWLEDGE AND LEARNING SKILLS. Theoretical approaches for the quantitative study of gene regulation
USE OF KNOWLEDGE AND LEARNING SKILLS. At the end of the course, the student is expected to be able to:
- identify the optimal structure for analyzing deep sequencing data
- discuss the main features of biological networks
- use mathematical modelling to discuss relevant issues in Systems Biology
- understand the main results published in a research paper
- prepare a presentation based on a research paper in Systems Biology
Advanced data structures and algorithms for mapping, alignment and assembly:
Introduction to the course and Assembly algorithms
Color code in NGS
Hash table, seed idea and pattern matching
SHRiMP algorithm, multiple pattern matching, tree data structures
Burrows-Wheeler Transform (BWT) and exact pattern matching
BWT, positions and the multiple pattern matching problem; introduction to systems biology
How to represent transcription factor binding sites (TFBS)
Algorithms for motif finding with or without mismatches
Entropy applied to TFBS. Algorithms for TFBS detection based on greedy approaches, random sampling and Gibbs sampling
Petri Nets Formalism
Network - Graph Theory
Network - Types of architecture
Network - Measures and example
Quantitative description of biological systems using mathematical and physical methods. In particular, after a short introduction to statistical mechanics, we shall discuss the applications of network theory and computer simulations to the study of complex biological systems.
The course is organized in two parts, each consisting of 24 hours of formal in-class lectures.
If online teaching is required, the blackboard lectures in the classroom will be replaced by recorded and live video lectures.
Learning assessment methods
The final written test lasts 90 minutes and assesses the student's ability to answer theoretical questions and to solve problems related to the topics covered in the course. It generally consists of 7 or 8 open questions and/or exercises. The score for each question is indicated on the exam paper and is proportional to the complexity of the question.
The second part of the exam is an oral test.
The written exam must be taken before the oral exam.
The same procedure also applies to online exams held by video conference.
Suggested readings and bibliography
Bioinformatics Algorithms: An Active Learning Approach, by Pavel Pevzner and Phillip Compeau
Network Science, by Albert-László Barabási
Introduction to Systems Biology, by Uri Alon, Chapman & Hall/CRC