scikit-learn Pipelines

Introduction

Machine learning projects frequently require a sequence of preprocessing steps to prepare data for model training. These steps can range from filling in missing values and normalizing numerical data to encoding categorical data. The scikit-learn library simplifies this process through its Pipeline class, which lets you bundle the preprocessing steps and the model into one unified workflow that behaves like a single estimator: calling fit or predict on the pipeline runs each step in order.
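As a minimal sketch of the idea, the example below chains missing-value imputation, scaling, and a classifier into one Pipeline. The toy data, the step names ("impute", "scale", "model"), and the choice of LogisticRegression are assumptions made for illustration, not something prescribed by this tutorial.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Toy numeric data with missing values (made up for illustration).
X = np.array([
    [1.0, 200.0],
    [2.0, np.nan],
    [np.nan, 180.0],
    [8.0, 40.0],
    [9.0, np.nan],
    [10.0, 30.0],
])
y = np.array([0, 0, 0, 1, 1, 1])

# Each step is a (name, estimator) pair; the names are arbitrary labels.
pipe = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="mean")),  # fill NaNs with column means
    ("scale", StandardScaler()),                 # zero mean, unit variance
    ("model", LogisticRegression()),             # final estimator
])

# fit() fits and applies each transformer in order, then fits the model.
pipe.fit(X, y)

# predict() applies the already-fitted transformers, then the model.
predictions = pipe.predict(X)
print(predictions)
```

Because the pipeline exposes the same fit and predict methods as any other estimator, it can usually be passed wherever a single estimator is expected, for example to cross_val_score.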