Python for Data Analysis (Reprint Edition)
Wes McKinney
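Publication date: June 2013
Pages: 472
“The scientific and data analysis community has been waiting for this book for years: it offers concrete, practical advice along with insight into how all the pieces fit together. It should become a classic reference for scientific computing in Python for years to come.”
- Fernando Perez
Assistant researcher at UC Berkeley and one of the original creators of IPython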

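Are you looking for a complete guide to manipulating, processing, cleaning, and crunching structured data in Python? This book is packed with case studies that show you how to solve a broad set of data analysis problems effectively, using several Python libraries, including NumPy, pandas, matplotlib, and IPython.
Written by Wes McKinney, the main author of the pandas library, Python for Data Analysis is also a practical guide to using Python for data-intensive applications in scientific computing. It is suited to analysts who are new to Python and to Python programmers who are new to scientific computing.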

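· Use the IPython interactive shell as your primary development environment
· Learn basic and advanced NumPy (Numerical Python) features
· Get started with the data analysis tools in the pandas library
· Use high-performance tools to load, clean, transform, merge, and reshape data
· Create scatter plots and static or interactive visualizations with matplotlib
· Apply the pandas groupby facility to slice, dice, and summarize datasets
· Learn how to solve problems in web analytics, social sciences, finance, and economics through detailed examples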

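Wes McKinney is the main author of pandas, a popular open source library for data analysis in Python. He started out as a quantitative analyst at AQR Capital Management and later founded Lambda Foundry, an enterprise data analysis company. Wes is an active speaker and participant in the Python and open source communities.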
Chapter 1: Preliminaries
    What Is This Book About?
    Why Python for Data Analysis?
    Essential Python Libraries
    Installation and Setup
    Community and Conferences
    Navigating This Book
    Acknowledgements
Chapter 2: Introductory Examples
    1.usa.gov data from bit.ly
    MovieLens 1M Data Set
    US Baby Names 1880-2010
    Conclusions and The Path Ahead
Chapter 3: IPython: An Interactive Computing and Development Environment
    IPython Basics
    Using the Command History
    Interacting with the Operating System
    Software Development Tools
    IPython HTML Notebook
    Tips for Productive Code Development Using IPython
    Advanced IPython Features
    Credits
Chapter 4: NumPy Basics: Arrays and Vectorized Computation
    The NumPy ndarray: A Multidimensional Array Object
    Universal Functions: Fast Element-wise Array Functions
    Data Processing Using Arrays
    File Input and Output with Arrays
    Linear Algebra
    Random Number Generation
    Example: Random Walks
Chapter 5: Getting Started with pandas
    Introduction to pandas Data Structures
    Essential Functionality
    Summarizing and Computing Descriptive Statistics
    Handling Missing Data
    Hierarchical Indexing
    Other pandas Topics
Chapter 6: Data Loading, Storage, and File Formats
    Reading and Writing Data in Text Format
    Binary Data Formats
    Interacting with HTML and Web APIs
    Interacting with Databases
Chapter 7: Data Wrangling: Clean, Transform, Merge, Reshape
    Combining and Merging Data Sets
    Reshaping and Pivoting
    Data Transformation
    String Manipulation
    Example: USDA Food Database
Chapter 8: Plotting and Visualization
    A Brief matplotlib API Primer
    Plotting Functions in pandas
    Plotting Maps: Visualizing Haiti Earthquake Crisis Data
    Python Visualization Tool Ecosystem
Chapter 9: Data Aggregation and Group Operations
    GroupBy Mechanics
    Data Aggregation
    Group-wise Operations and Transformations
    Pivot Tables and Cross-Tabulation
    Example: 2012 Federal Election Commission Database
Chapter 10: Time Series
    Date and Time Data Types and Tools
    Time Series Basics
    Date Ranges, Frequencies, and Shifting
    Time Zone Handling
    Periods and Period Arithmetic
    Resampling and Frequency Conversion
    Time Series Plotting
    Moving Window Functions
    Performance and Memory Usage Notes
Chapter 11: Financial and Economic Data Applications
    Data Munging Topics
    Group Transforms and Analysis
    More Example Applications
Chapter 12: Advanced NumPy
    ndarray Object Internals
    Advanced Array Manipulation
    Broadcasting
    Advanced ufunc Usage
    Structured and Record Arrays
    More About Sorting
    NumPy Matrix Class
    Advanced Array Input and Output
    Performance Tips
Appendix: Python Language Essentials
    The Python Interpreter
    The Basics
    Data Structures and Sequences
    Functions
    Files and the operating system
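Title: Python for Data Analysis (Reprint Edition)
Author: Wes McKinney
Chinese publisher: Southeast University Press
Publication date: June 2013
Pages: 472
ISBN: 978-7-5641-4204-9
Original title: Python for Data Analysis
Original publisher: O'Reilly Media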
Wes McKinney
 
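Wes McKinney is a New York-based data analysis expert and entrepreneur. After earning a bachelor's degree in mathematics from MIT in 2007, he went to work in quantitative finance at AQR Capital Management in Greenwich, CT. Frustrated by cumbersome data analysis tools, he learned Python and began building the pandas project in 2008. He is now an active member of the scientific Python community and a strong advocate for using Python in data analysis, finance, and statistical applications.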
 
 
The animal on the cover of Python for Data Analysis is a golden-tailed, or pen-tailed, tree shrew (Ptilocercus lowii). The golden-tailed tree shrew is the only species in the genus Ptilocercus and family Ptilocercidae; all the other tree shrews are of the family Tupaiidae. Tree shrews are identified by their long tails and soft red-brown fur. As its nickname suggests, the golden-tailed tree shrew has a tail that resembles the feather on a quill pen. Tree shrews are omnivores, feeding primarily on insects, fruit, seeds, and small vertebrates.

Found predominantly in Indonesia, Malaysia, and Thailand, these wild mammals are known for their chronic consumption of alcohol. Malaysian tree shrews were found to spend several hours consuming the naturally fermented nectar of the bertam palm, equalling about 10 to 12 glasses of wine with 3.8% alcohol content. Despite this, no golden-tailed tree shrew has ever been intoxicated, thanks largely to their impressive ethanol breakdown, which includes metabolizing the alcohol in a way not used by humans. Also more impressive than any of their mammal counterparts, including humans? Brain to body mass ratio.

Despite its name, the golden-tailed tree shrew is not a true shrew; it is more closely related to primates. Because of this close relation, tree shrews have become an alternative to primates in medical experimentation on myopia, psychosocial stress, and hepatitis.