The Pipeline class in scikit-learn's pipeline module offers a convenient way to chain normalization and preprocessing steps together with a final predictor, such as a classifier, and to apply the whole chain to a dataset as a single unit.

Such a pipeline is especially useful in the context of cross-validation, where we are interested in testing and comparing different feature selection techniques, dimensionality reduction approaches, and classifiers. All in all, Pipelines are not only a great time-saver, but they also allow us to write clutter-free code and stay organized while searching for the ideal combination of techniques for solving a pattern classification task.
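As a rough illustration of the cross-validation use case, the sketch below chains a scaler, a dimensionality reduction step, and a classifier, and then evaluates the whole chain with 5-fold cross-validation. The dataset (Iris), the step names ("scale", "reduce", "clf"), and the specific estimators are assumptions chosen for the example, not a recommendation.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Example data; any feature matrix X and label vector y would do.
X, y = load_iris(return_X_y=True)

# Each step is a (name, estimator) pair; the names are arbitrary labels.
pipe = Pipeline([
    ("scale", StandardScaler()),                 # standardize features
    ("reduce", PCA(n_components=2)),             # project onto 2 components
    ("clf", LogisticRegression(max_iter=1000)),  # final classifier
])

# The pipeline is refit on the training part of every fold, so the
# scaler and PCA never see the held-out samples of that fold.
scores = cross_val_score(pipe, X, y, cv=5)
print("mean accuracy: %.3f" % scores.mean())
```

Swapping one technique for another then only means replacing a single entry in the list of steps, while the cross-validation call stays the same.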

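In the current implementation of Pipeline, every step in the chain except the last has to be an object with fit and transform methods (if a step also provides a fit_transform method, the pipeline calls that during fitting). The final step only needs a fit method, which is why it can be a classifier or another predictor.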
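To make the fit/transform interface concrete, here is a minimal sketch of a hand-written step used inside a Pipeline. ColumnSelector and its cols parameter are hypothetical names invented for this example, and the tiny dataset is made up; inheriting from TransformerMixin is one convenient way to get a fit_transform method built from fit and transform.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline


class ColumnSelector(BaseEstimator, TransformerMixin):
    """Hypothetical transformer that keeps only the given feature columns."""

    def __init__(self, cols=(0, 1)):
        self.cols = cols

    def fit(self, X, y=None):
        # Nothing to learn from the data; return self so calls can be chained.
        return self

    def transform(self, X):
        # Select the requested columns and keep the result two-dimensional.
        return np.asarray(X)[:, list(self.cols)]


# Made-up toy data: the third column is constant and carries no information.
X = np.array([[0.0, 1.0, 5.0],
              [1.0, 0.0, 5.0],
              [0.2, 0.9, 5.0],
              [0.9, 0.1, 5.0]])
y = np.array([0, 1, 0, 1])

pipe = Pipeline([
    ("select", ColumnSelector(cols=(0, 1))),  # custom transformer step
    ("clf", LogisticRegression()),            # final step: a classifier
])

pipe.fit(X, y)
print(pipe.predict(X))
```

Calling fit on the pipeline fits and applies the selector first and then fits the classifier on the selected columns; predict pushes new data through the same transform before classifying it.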