Tidy Modeling with R (R语言简洁建模, reprint edition)
Max Kuhn, Julia Silge
Publication date: March 2023
Pages: 363
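“The tidymodels framework combines human-centered design with statistical best practices, and I can’t think of a better way to learn it than from Max and Julia.”
- Hadley Wickham
Chief Scientist, RStudio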
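“This book provides a unified and systematic approach to building, analyzing, and evaluating statistical models in R.”
- Balasubramanian Narasimhan
Senior Research Scientist, Stanford University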

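tidymodels is a collection of R packages for modeling and machine learning. Whether you are just starting out or have years of modeling experience, this practical book shows data analysts, business analysts, and data scientists how the tidymodels framework offers a consistent, flexible approach to your work.
RStudio engineers Max Kuhn and Julia Silge demonstrate ways to create models by focusing on an R dialect called the tidyverse. Software that adopts tidyverse principles shares a high-level design philosophy as well as low-level grammar and data structures, so learning one part of the ecosystem makes it easier to master the next. You will see why the tidymodels framework is so widely used.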
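With this book, you will:
● Learn the steps needed to build a model from start to finish
● Understand how to use different modeling and feature engineering approaches fluently
● Examine how to avoid common modeling pitfalls, such as overfitting
● Learn practical methods for preparing data for modeling
● Tune models for the best performance
● Use good statistical practices to compare, evaluate, and choose among models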
Preface
Part I. Introduction
1. Software for Modeling
  Fundamentals for Modeling Software
  Types of Models
  Connections Between Types of Models
  Some Terminology
  How Does Modeling Fit into the Data Analysis Process?
  Chapter Summary
2. A Tidyverse Primer
  Tidyverse Principles
  Examples of Tidyverse Syntax
  Chapter Summary
3. A Review of R Modeling Fundamentals
  An Example
  What Does the R Formula Do?
  Why Tidiness Is Important for Modeling
  Combining Base R Models and the Tidyverse
  The tidymodels Metapackage
  Chapter Summary
Part II. Modeling Basics
4. The Ames Housing Data
  Exploring Features of Homes in Ames
  Chapter Summary
5. Spending Our Data
  Common Methods for Splitting Data
  What About a Validation Set?
  Multilevel Data
  Other Considerations for a Data Budget
  Chapter Summary
6. Fitting Models with parsnip
  Create a Model
  Use the Model Results
  Make Predictions
  parsnip-Extension Packages
  Creating Model Specifications
  Chapter Summary
7. A Model Workflow
  Where Does the Model Begin and End?
  Workflow Basics
  Adding Raw Variables to the workflow()
  How Does a workflow() Use the Formula?
  Creating Multiple Workflows at Once
  Evaluating the Test Set
  Chapter Summary
8. Feature Engineering with Recipes
  A Simple recipe() for the Ames Housing Data
  Using Recipes
  How Data Are Used by the recipe()
  Examples of Steps
  Tidy a recipe()
  Column Roles
  Chapter Summary
9. Judging Model Effectiveness
  Performance Metrics and Inference
  Regression Metrics
  Binary Classification Metrics
  Multiclass Classification Metrics
  Chapter Summary
Part III. Tools for Creating Effective Models
10. Resampling for Evaluating Performance
  The Resubstitution Approach
  Resampling Methods
  Estimating Performance
  Parallel Processing
  Saving the Resampled Objects
  Chapter Summary
11. Comparing Models with Resampling
  Creating Multiple Models with Workflow Sets
  Comparing Resampled Performance Statistics
  Simple Hypothesis Testing Methods
  Bayesian Methods
  Chapter Summary
12. Model Tuning and the Dangers of Overfitting
  Model Parameters
  Tuning Parameters for Different Types of Models
  What Do We Optimize?
  The Consequences of Poor Parameter Estimates
  Two General Strategies for Optimization
  Tuning Parameters in tidymodels
  Chapter Summary
13. Grid Search
  Regular and Nonregular Grids
  Evaluating the Grid
  Finalizing the Model
  Tools for Creating Tuning Specifications
  Tools for Efficient Grid Search
  Chapter Summary
14. Iterative Search
  A Support Vector Machine Model
  Bayesian Optimization
  Simulated Annealing
  Chapter Summary
15. Screening Many Models
  Modeling Concrete Mixture Strength
  Creating the Workflow Set
  Tuning and Evaluating the Models
  Efficiently Screening Models
  Finalizing a Model
  Chapter Summary
Part IV. Beyond the Basics
16. Dimensionality Reduction
  What Problems Can Dimensionality Reduction Solve?
  A Picture Is Worth a Thousand…Beans
  A Starter Recipe
  Recipes in the Wild
  Feature Extraction Techniques
  Modeling
  Chapter Summary
17. Encoding Categorical Data
  Is an Encoding Necessary?
  Encoding Ordinal Predictors
  Using the Outcome for Encoding Predictors
  Feature Hashing
  More Encoding Options
  Chapter Summary
18. Explaining Models and Predictions
  Software for Model Explanations
  Local Explanations
  Global Explanations
  Building Global Explanations from Local Explanations
  Back to Beans!
  Chapter Summary
19. When Should You Trust Your Predictions?
  Equivocal Results
  Determining Model Applicability
  Chapter Summary
20. Ensembles of Models
  Creating the Training Set for Stacking
  Blend the Predictions
  Fit the Member Models
  Test Set Results
  Chapter Summary
21. Inferential Analysis
  Inference for Count Data
  Comparisons with Two-Sample Tests
  Log-Linear Models
  A More Complex Model
  More Inferential Analysis
  Chapter Summary
Appendix. Recommended Preprocessing
References
Index
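Title: R语言简洁建模 (reprint edition)
Authors: Max Kuhn, Julia Silge
Publisher in China: Southeast University Press (东南大学出版社)
Publication date: March 2023
Pages: 363
ISBN: 978-7-5766-0590-7
Original title: Tidy Modeling with R
Original publisher: O'Reilly Media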
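Max Kuhn
Max Kuhn is a software engineer at RStudio, where he works on improving R's modeling capabilities. He has applied models of many kinds in the pharmaceutical and diagnostics industries for more than 18 years.

Julia Silge
Julia Silge is a software engineer at RStudio, where she develops open source modeling tools. She holds a PhD in astrophysics and has worked as a data scientist in the technology and nonprofit sectors.
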
The animal on the cover of Tidy Modeling with R is a European robin (Erithacus rubecula), native to continental Europe, the United Kingdom, western Russia, and northern Africa.
European robins are grey and brown, with a signature orange chest and white stomach. Males and females look similar, with the primary differentiator being the shape of the beak. They live around forests, plants, or trees. They don’t necessarily migrate significant distances for winter (except for those living further north in Scandinavia and Russia), but females will move short distances away from males, who maintain the same territories in both winter and summer.
European robins typically begin breeding in March and will lay several clutches of 4 to 6 eggs each with an incubation period of 13 to 14 days. The clutches can overlap to a certain extent, with the male feeding the newborns while the female sits on the next clutch. The male bonds with the female by bringing her food, which can look like a mother feeding her young to an untrained observer. Their diet consists of insects, seeds, nuts, and fruits.
European robins have a very high mortality rate in the first year of life. After that, their life expectancy significantly increases. Ten percent of deaths after the first year are a result of fights between robins, as the males are very aggressive and protective of their territories. The orange chest gradually manifests itself in young robins; the initial lack of orange coloring reduces the chances of a fight over territory in their first year, when their chances of death are already so high.
Since 1960, the European robin has been the national bird of Britain, but it is less popular in other parts of Europe. They are a popular symbol of Christmas because Victorian mailmen were known as “redbreasts” due to their uniforms. The American robin is named for its similarity in appearance to the European robin, but they are not actually closely related.
List price: 118.00 yuan