Algorithms for genomic data analysis 1000-718ADG

1. Mapping of sequencing reads
◦ pattern matching algorithms, text indexing
◦ approximate pattern matching based on text indexes
◦ techniques for finding approximate occurrences of a pattern with low similarity
2. Structural variant calling
◦ based on sequencing reads
◦ based on optical mapping data
3. RNA-seq data processing
◦ read mapping vs. determination of the k-mer spectrum
4. Metagenomic data analysis
◦ composition- and homology-based read classification
◦ linked-read deconvolution
5. De novo genome assembly
◦ Overlap-Layout-Consensus approach
◦ de Bruijn graph approach
◦ contig merging and scaffolding
6. Pangenomics
◦ pangenome models and their construction methods
◦ pangenome-based sequencing data analysis

Course coordinators

Jakub Kaźmierczyk
Karolina Niewiarowska

Learning outcomes

Knowledge:
- knowledge of algorithmic techniques used in DNA sequence analysis
- knowledge of methods for analyzing high-throughput DNA sequencing data

Skills:
- the ability to choose the proper sequencing technique for a given biological problem
- the ability to properly design experiments using large-scale genomic technologies and to analyze the output data
- the ability to implement selected algorithms for the analysis of next-generation sequencing data

Competences:
- knows the limitations of their own knowledge and is able to formulate questions to deepen their understanding of the issue under consideration
- understands the need for critical analysis of their own work

Assessment criteria

The final assessment is based on lab projects and, optionally, an oral exam.

Bibliography

V. Mäkinen, D. Belazzougui, F. Cunial, A. Tomescu, Genome-Scale Algorithm Design, Cambridge University Press, 2015.
X. Wang, Next-Generation Sequencing Data Analysis, CRC Press, 2016.