Education

Learning resources

Where to start with data science in pharma — two short steps in Python or R, then a full library for when you want to go deeper. Mostly free.

Courses at KU & SDU

Start with a course.

Instructor-led courses in pharmaceutical data science at the University of Copenhagen and the University of Southern Denmark. Several are open to PhD students and external participants — follow each link for level, dates and enrolment.

KU course

Hands-on Introduction to Pharmaceutical Data Science

Graduate School of Health & Medical Sciences · University of Copenhagen

A PhD course introducing pharmaceutical data science in Python. Free for PhD students at Danish universities (except CBS) and NorDoc faculties; others may apply and pay a fee.

PhD course · Course

KU course

AI- & data-driven drug design – An introduction

University of Copenhagen

An introduction to AI- and data-driven approaches to drug design.

PhD course

KU course

Big data, artificial intelligence and machine learning in drug safety

University of Copenhagen · SMIM22002U

Big data, artificial intelligence and machine-learning methods applied to drug safety.

University course

KU course

Pharmacoepidemiology, post-authorisation safety studies and real-world data

University of Copenhagen · SMIM25001U

Pharmacoepidemiology, post-authorisation safety studies (PASS) and the use of real-world data.

University course

KU course

Medicinal and Biostructural Chemistry

University of Copenhagen · SFAK24001U

Medicinal chemistry and the structural basis of drug–target interactions.

University course

KU course

Structure-based Drug Research

University of Copenhagen · SLKKIL112U

Structure-based approaches to drug discovery and research.

University course

KU course

Pharmaceutical Modelling

University of Copenhagen · SFAB21002U

Modelling approaches across the pharmaceutical sciences. Open as an elective to BSc and MSc students.

BSc/MSc elective

KU course

Design and Analysis of Experiments

University of Copenhagen · SFKKIF102U

Statistical design of experiments and analysis of the resulting data. Open as an elective to BSc and MSc students.

BSc/MSc elective

Where to begin

Two steps to get going.

Pick the language your group uses, Python or R, and follow two steps. Everything else is in the library below, ready whenever you want to go deeper.

Step 1

Getting started

Install the tools and learn the basics from zero. The way of thinking carries over between Python and R, so either choice works.

Step 2

Going further

Once you can write a bit of code, move into data science: wrangling data, making figures, and doing statistics — with reproducible, shareable workflows.

The full library

Browse everything by topic.

Hand-picked resources grouped by topic — open a section to explore. Mostly free; options that carry a certificate are marked.

Programming foundations 24 resources

Start here if you've never written code. Pick Python or R based on what your colleagues use.

Certificate · Coursera · Python · Beginner

Python for Everybody

University of Michigan · Charles Severance

The single best on-ramp from no programming to working Python: kind, well paced, with auto-graded exercises. Free at py4e.com; the same content is also available as a paid Coursera Specialization with a certificate.

Interactive book + videos · ~30 hours

Free · Python · Beginner

A Whirlwind Tour of Python

Jake VanderPlas

A fast, focused intro for people who already program in some other language and just want the Python they need to read scientific code.

Free book / Jupyter notebooks · ~6 hours

Free · Python · Beginner → Advanced

Real Python (selected free tutorials)

Real Python

Topical, readable Python tutorials. The free articles alone cover most of what you need; the paid track adds video courses.

Mixed (free + paid) · Ad hoc

Free · R · Beginner

R for Data Science (2nd ed.)

Hadley Wickham, Mine Çetinkaya-Rundel & Garrett Grolemund

The canonical free R + tidyverse book. Best starting point for R if you'll do data wrangling, biostats, or pharmacometrics.

Free book · ~40 hours

Free · R · Beginner

swirl — Learn R, in R

swirlstats

Teaches R inside the R console with hands-on prompts. Pair it with R for Data Science when you want practice rather than reading.

Interactive (R package) · ~15 hours

Free · R · All

Posit (RStudio) Cheatsheets & Recipes

Posit

One-page printable cheatsheets for ggplot2, dplyr, tidyr, lubridate, and the rest of the tidyverse. The fastest reference once you know the basics.

Reference sheets · Reference

Certificate · Coursera (Johns Hopkins) · R · Beginner → Intermediate

Data Science Specialization

Johns Hopkins · Coursera

Long-running R-based specialisation that covers tooling, EDA, regression models, ML, and a capstone. Stackable; each course can be audited for free.

Coursera specialization (10 courses) · ~120 hours

Certificate · Coursera (UMich) · Python · Intermediate

Applied Data Science with Python Specialization

University of Michigan · Coursera

Pandas, plotting, ML with scikit-learn, text analysis, network analysis. Practical, applied; assumes you know basic Python (do Python for Everybody first).

Coursera specialization (5 courses) · ~80 hours

Free · Python · Beginner

futurecoder

futurecoder.io

Learn Python from scratch in the browser — it runs your code and gives instant feedback, with nothing to install to begin.

Interactive course · Self-paced

Free · Python · Beginner

Think Python

Allen B. Downey

A clear introduction to Python for people who have never programmed before. Free online from the author.

Free book · ~25 hours

Free · Python · Beginner

Programming with Python (Software Carpentry)

The Carpentries

A hands-on novice Python lesson aimed at researchers — used as the recommended primer for the ULLA AI in Drug Discovery course.

Workshop lesson · ~4 hours

Free · Python · Beginner → Intermediate

Python for Data Analysis (3rd ed.)

Wes McKinney

Written by the creator of pandas. Free to read online; the practical reference for data wrangling once you know basic Python.

Free book · ~30 hours

Free · Python · Beginner

Python Beginner's Guide

Python Software Foundation

The official starting point — how to install Python and where to learn, for both new and experienced programmers.

Link hub / reference · Reference

Free · Python · Beginner

W3Schools Python Tutorial

W3Schools

A modular, look-it-up reference for Python basics — handy when you just need to remember how to do one small thing.

Reference tutorial · Reference

Free · Mixed · Beginner

freeCodeCamp

freeCodeCamp

Free, project-based coding courses you can fit around a busy schedule. Good for building momentum and habit.

Interactive courses · Self-paced

Free · R · Beginner

fasteR — Fast Lane to Learning R

Norm Matloff

A comprehensive single-document introduction to R for new programmers — general programming principles plus R's data types and objects.

Free tutorial · ~15 hours

Free · R · Beginner

Hands-On Programming with R

Garrett Grolemund

A friendly introduction written for non-programmers, teaching R through small hands-on projects.

Free book · ~15 hours

Free · R · Beginner

R-Ladies Sydney — Basic Basics

R-Ladies Sydney

Get R and RStudio installed and find your way around the IDE. A very gentle start, with short videos.

Tutorial + videos · ~3 hours

Free · R · Beginner

W3Schools R Tutorial

W3Schools

A modular reference for R basics — nice for quickly looking up how to do a specific thing.

Reference tutorial · Reference

Free · R · Beginner → Intermediate

An Introduction to R

R Core Team

The official introduction, straight from the source. A little dry and technical, but authoritative.

Official manual (PDF) · Reference

Free · R · Intermediate

data.table — Introduction

data.table project

The main features of data.table for fast, memory-efficient data wrangling. Further vignettes cover specific topics in depth.

Vignette · Reference

Free · R · Advanced

Advanced R

Hadley Wickham

Deeper R concepts for experienced programmers — not for beginners, but excellent once you want to understand how R really works.

Free book · ~40 hours

Paid · R · Beginner → Advanced

DataCamp — R courses

DataCamp

Polished interactive courses for R and more. Mostly a paid plan, though the introductory R course has historically been free.

Online platform · Self-paced

SDU course · Free · R · Beginner

R4PhD — Introduction to R and the tidyverse

University of Southern Denmark · O'Neill & Harsted

An open site tied to a hands-on SDU course (run 1–3× a year, also on demand). Assumes no prior data-science experience. 2026 PhD dates: Feb 4–5 and 25–26.

Course + open site · 4-day course / self-study

Statistics & data analysis 8 resources

Build statistical intuition before reaching for ML. Pharmacy curricula typically cover the basics; these go deeper.

Free · Theory · Beginner → Intermediate

StatQuest with Josh Starmer

Josh Starmer · YouTube

Visual, intuition-first explanations of every statistical and ML concept you're likely to meet — bias-variance, regularisation, PCA, ROC curves, transformers. The friendliest single resource for self-study stats.

YouTube channel · 5–25 min per topic

Free · Both · Intermediate

Introduction to Statistical Learning (ISLR)

James, Witten, Hastie, Tibshirani

The most-used graduate-level intro to statistical learning. Free PDF, with separate R and Python lab books that walk through the methods on real data.

Free book + R + Python labs · ~80 hours

Free · R · Intermediate

Statistical Rethinking

Richard McElreath

A complete Bayesian-stats course. The book is paid but the full lecture series, code, and homework are free. The clearest treatment of causal inference + multilevel models you'll find at this level.

Free YouTube lectures + book code · ~120 hours

Free · R · Intermediate

Modern Statistics with R

Måns Thulin

Practical, applied stats with tidyverse syntax. Covers regression, mixed models, survival analysis, and Bayesian methods.

Free book · ~60 hours

Free · Theory · All

3Blue1Brown — Essence of Linear Algebra / Calculus / Probability

Grant Sanderson · YouTube

Animated math explainers with unmatched visual intuition. The 'Essence of' series is the recommended primer before any ML course.

YouTube series · ~10 hours per series

Certificate · edX (MIT) · Python · Intermediate → Advanced

MITx MicroMasters in Statistics & Data Science

MIT · edX

Rigorous graduate-level treatment of probability, stats, and ML, plus a capstone. Audit free; verified track requires payment per course. Stackable into MIT degrees.

edX MicroMasters (4–5 courses) · ~12 months part-time

Certificate · edX (Stanford) · Both · Intermediate

Statistical Learning with R / Python

Stanford Online · Hastie & Tibshirani

The video-lecture companion to the ISLR book, taught by the authors. Audit free or pay for the verified certificate.

edX course (mirrors ISLR book) · ~50 hours

Certificate · Coursera (Imperial) · Python + theory · Intermediate

Mathematics for Machine Learning Specialization

Imperial College London · Coursera

Linear algebra, multivariate calculus, and PCA, framed for ML. Bridge between 3Blue1Brown intuition and reading actual ML papers.

Coursera specialization (3 courses) · ~50 hours

Reproducibility & workflow 5 resources

Bioinformatics & Galaxy 6 resources

For omics, sequencing, and registry/structured biological data. Galaxy is point-and-click; the rest are programmatic.

Free · Galaxy / point-and-click · Beginner → Advanced

Galaxy Training Network

Galaxy Project · GTN

Hundreds of GUI-driven, runnable tutorials covering RNA-seq, variant calling, proteomics, single-cell, machine learning, and more. The fastest path into bioinformatics for non-coders.

Hands-on tutorials · ~1–4 hours per tutorial

Free · R · Intermediate

Bioconductor — Course Materials

Bioconductor

Annual workshops and labs from the R/Bioconductor ecosystem. Authoritative for omics analysis in R.

Course archive · Varies

Free · Python · Beginner → Advanced

Rosalind — Bioinformatics problems

Rosalind

Project-Euler-style bioinformatics challenges. Great for building Python fluency on biological data without committing to a long course.

Problem set · Self-paced

Free · Python · Beginner → Intermediate

Biopython Tutorial & Cookbook

Biopython

Working with sequences, alignments, structures, and biological databases in Python. Complements Galaxy when you need scripted control.

Tutorial / cookbook · ~10 hours

Certificate · Coursera (Johns Hopkins) · Python + Galaxy + R · Intermediate

Genomic Data Science Specialization

Johns Hopkins · Coursera

Covers Galaxy, Bioconductor, Python for genomics, statistics, and a capstone. Strong applied bioinformatics path.

Coursera specialization (8 courses) · ~80 hours

Free · Mixed · All

National Health Data Science Sandbox

HeaDS · University of Copenhagen

Coordinated at HeaDS (KU). Training modules and a model for building and curating computational resources for research and teaching.

Training platform · Browse

Cheminformatics & drug discovery 3 resources

Machine learning & AI for healthcare 10 resources

From classical ML (scikit-learn) through deep learning, with a focus on healthcare applications and responsible practice.

Free · Python · Intermediate

Practical Deep Learning for Coders

fast.ai · Jeremy Howard, Rachel Thomas

Top-down deep learning course that has you training real models in week 1 and explaining them in week 7. The 'Deep Learning for Coders' book that accompanies it is also free as Jupyter notebooks.

Free video course + book · ~70 hours

Free · Python · Intermediate

Hugging Face NLP / LLM Course

Hugging Face

End-to-end course on transformers, fine-tuning, datasets, and evaluation. The standard introduction to modern NLP/LLM tooling.

Free course · ~30 hours

Free · Python · Intermediate

Made With ML

Goku Mohandas

Goes beyond model training into MLOps — testing, CI/CD, monitoring, and shipping ML systems responsibly. Useful when you need a model to actually run somewhere.

Free course · ~50 hours

Certificate · Coursera (Stanford / DeepLearning.AI) · Python · Beginner → Intermediate

Machine Learning Specialization

DeepLearning.AI · Stanford · Coursera

Andrew Ng's reboot of the classic Coursera ML course. Audit free; pay (~£40/month) for graded assignments and a certificate.

Coursera specialization · ~95 hours

Certificate · Coursera (DeepLearning.AI) · Python · Intermediate

AI for Medicine Specialization

DeepLearning.AI · Coursera

Three courses on diagnosis, prognosis, and treatment with ML — covers medical image segmentation, survival analysis, and causal inference for medicine. Closest 'pharma-flavoured' specialization at this level.

Coursera specialization (3 courses) · ~80 hours

Certificate · edX (Harvard) · R · Beginner → Intermediate

HarvardX Data Science Professional Certificate

Harvard · edX (Rafael Irizarry)

Audit each course for free; pay for the certificate. Covers R, viz, probability, inference, ML, and a capstone. The most-recommended single learning path for stats + ML in R.

edX professional certificate (9 courses) · ~150 hours

Certificate · Coursera (Stanford) · Python · Intermediate

AI in Healthcare Specialization

Stanford · Coursera

Stanford School of Medicine's overview of AI in healthcare — from data fundamentals through evaluation and ethics. Most pharma-relevant of the Stanford Coursera offerings.

Coursera specialization (5 courses) · ~110 hours

Certificate · Microsoft Learn (badges + paid Microsoft certifications) · Python · Beginner → Intermediate

Microsoft Learn — Machine Learning paths

Microsoft Learn

Free, hands-on learning paths with optional paid Microsoft Certified credentials at the end (Azure Data Scientist Associate is the relevant one). Useful if your org uses Azure.

Free modules + Microsoft credentials · ~10–80 hours per path

Free · Python · Beginner

Kaggle Learn

Kaggle (Google)

Bite-sized free courses on pandas, ML, computer vision, time series, geospatial. Best as a refresher between bigger commitments.

Short interactive courses · ~3–4 hours per course

Free · Mixed · All

OpenML — open ML data + benchmarks

OpenML

Open repository of curated ML datasets and benchmark results. Useful when you need a teaching dataset other than the iris or MNIST data everyone else uses.

Datasets + benchmarks · Reference

Pharmacometrics & PKPD 3 resources

Data visualisation 4 resources

Regulatory, ethics & data governance 4 resources

Hand-curated and last reviewed 2026-05-08. These are starting points the centre believes in — quality is subjective. Spot a broken link or a better resource? Open an issue or pull request on the website repo.