Upcoming
2259 SEK
This book explores multivariate statistics from both traditional and modern perspectives. The first section covers core topics like multivariate normality, MANOVA, discrimination, PCA, and canonical correlation analysis. The second section includes modern concepts such as gradient boosting, random forests, variable importance, and causal inference.

A key theme is leveraging classical multivariate statistics to explain advanced topics and prepare for contemporary methods. For example, linear models provide a foundation for understanding regularization with AIC and BIC, leading to a deeper analysis of regularization through generalization error and the VC theorem. Discriminant analysis introduces the weighted Bayes rule, which leads into modern classification techniques for class-imbalanced machine learning problems. Steepest descent serves as a precursor to matching pursuit and gradient boosting. Axis-aligned trees like CART, a classical tool, set the stage for more recent methods like super greedy trees.

Another central theme is the concept of training error. Introductory courses often emphasize that reducing training error can lead to overfitting, yet training error is also known as empirical risk. In regression, it is the residual sum of squares, and minimizing it leads to least squares solutions. But is this always the best approach? Empirical risk is vital in statistical learning theory and is key to determining whether learning is possible. The principle of empirical risk minimization is crucial and shows that reducing training error is not harmful when regularization is applied. This principle is further examined through discussions on penalization, matching pursuit, gradient boosting, and super greedy trees.

Key Features:
- Covers both classical and contemporary multivariate statistics.
- Each chapter includes a carefully selected set of exercises that vary in degree of difficulty and are both applied and theoretical.
- The book can also serve as a reference for researchers due to the diverse topics covered, including new material on super greedy trees, rule-based variable selection, and machine learning for causal inference.
- Extensive treatment of trees that provides a comprehensive and unified approach to understanding trees in terms of partitions and empirical risk minimization.
- New content on random forests, including random forest quantile classifiers for class-imbalanced problems, multivariate random forests, subsampling for confidence regions, and super greedy forests.
- An entire chapter is dedicated to random survival forests, featuring new material on random hazard forests extending survival forests to time-varying covariate

- Format: Hardcover
- ISBN: 9781032758794
- Language: English
- Number of pages: 504
- Publication date: 2025-03-20
- Publisher: Chapman & Hall/CRC