Python Data Analysis Cookbook

Updated: 2021-07-14 11:06:29

Latest chapter: Index

Complete, 209 chapters in total

Cover
Copyright page
Credits
About the Author
About the Reviewers
www.PacktPub.com
eBooks, discount offers, and more
Preface
Why do you need this book?
Data analysis, data science, big data – what is the big deal?
A brief history of data analysis with Python
A conjecture about the future
What this book covers
What you need for this book
Who this book is for
Sections
Conventions
Reader feedback
Customer support
Chapter 1. Laying the Foundation for Reproducible Data Analysis
Introduction
Setting up Anaconda
Installing the Data Science Toolbox
Creating a virtual environment with virtualenv and virtualenvwrapper
Sandboxing Python applications with Docker images
Keeping track of package versions and history in IPython Notebook
Configuring IPython
Learning to log for robust error checking
Unit testing your code
Configuring pandas
Configuring matplotlib
Seeding random number generators and NumPy print options
Standardizing reports, code style, and data access
Chapter 2. Creating Attractive Data Visualizations
Introduction
Graphing Anscombe's quartet
Choosing seaborn color palettes
Choosing matplotlib color maps
Interacting with IPython Notebook widgets
Viewing a matrix of scatterplots
Visualizing with d3.js via mpld3
Creating heatmaps
Combining box plots and kernel density plots with violin plots
Visualizing network graphs with hive plots
Displaying geographical maps
Using ggplot2-like plots
Highlighting data points with influence plots
Chapter 3. Statistical Data Analysis and Probability
Introduction
Fitting data to the exponential distribution
Fitting aggregated data to the gamma distribution
Fitting aggregated counts to the Poisson distribution
Determining bias
Estimating kernel density
Determining confidence intervals for mean, variance, and standard deviation
Sampling with probability weights
Exploring extreme values
Correlating variables with Pearson's correlation
Correlating variables with the Spearman rank correlation
Correlating a binary and a continuous variable with the point biserial correlation
Evaluating relations between variables with ANOVA
Chapter 4. Dealing with Data and Numerical Issues
Introduction
Clipping and filtering outliers
Winsorizing data
Measuring central tendency of noisy data
Normalizing with the Box-Cox transformation
Transforming data with the power ladder
Transforming data with logarithms
Rebinning data
Applying logit() to transform proportions
Fitting a robust linear model
Taking variance into account with weighted least squares
Using arbitrary precision for optimization
Using arbitrary precision for linear algebra
Chapter 5. Web Mining, Databases, and Big Data
Introduction
Simulating web browsing
Scraping the Web
Dealing with non-ASCII text and HTML entities
Implementing association tables
Setting up database migration scripts
Adding a table column to an existing table
Adding indices after table creation
Setting up a test web server
Implementing a star schema with fact and dimension tables
Using HDFS
Setting up Spark
Clustering data with Spark
Chapter 6. Signal Processing and Timeseries
Introduction
Spectral analysis with periodograms
Estimating power spectral density with the Welch method
Analyzing peaks
Measuring phase synchronization
Exponential smoothing
Evaluating smoothing
Using the Lomb-Scargle periodogram
Analyzing the frequency spectrum of audio
Analyzing signals with the discrete cosine transform
Block bootstrapping time series data
Moving block bootstrapping time series data
Applying the discrete wavelet transform
Chapter 7. Selecting Stocks with Financial Data Analysis
Introduction
Computing simple and log returns
Ranking stocks with the Sharpe ratio and liquidity
Ranking stocks with the Calmar and Sortino ratios
Analyzing returns statistics
Correlating individual stocks with the broader market
Exploring risk and return
Examining the market with the non-parametric runs test
Testing for random walks
Determining market efficiency with autoregressive models
Creating tables for a stock prices database
Populating the stock prices database
Optimizing an equal weights two-asset portfolio
Chapter 8. Text Mining and Social Network Analysis
Introduction
Creating a categorized corpus
Tokenizing news articles in sentences and words
Stemming, lemmatizing, filtering, and TF-IDF scores
Recognizing named entities
Extracting topics with non-negative matrix factorization
Implementing a basic terms database
Computing social network density
Calculating social network closeness centrality
Determining the betweenness centrality
Estimating the average clustering coefficient
Calculating the assortativity coefficient of a graph
Getting the clique number of a graph
Creating a document graph with cosine similarity
Chapter 9. Ensemble Learning and Dimensionality Reduction
Introduction
Recursively eliminating features
Applying principal component analysis for dimension reduction
Applying linear discriminant analysis for dimension reduction
Stacking and majority voting for multiple models
Learning with random forests
Fitting noisy data with the RANSAC algorithm
Bagging to improve results
Boosting for better learning
Nesting cross-validation
Reusing models with joblib
Hierarchically clustering data
Taking a Theano tour
Chapter 10. Evaluating Classifiers, Regressors, and Clusters
Introduction
Getting classification straight with the confusion matrix
Computing precision, recall, and F1-score
Examining a receiver operating characteristic and the area under a curve
Visualizing the goodness of fit
Computing MSE and median absolute error
Evaluating clusters with the mean silhouette coefficient
Comparing results with a dummy classifier
Determining MAPE and MPE
Comparing with a dummy regressor
Calculating the mean absolute error and the residual sum of squares
Examining the kappa of classification
Taking a look at the Matthews correlation coefficient
Chapter 11. Analyzing Images
Introduction
Setting up OpenCV
Applying Scale-Invariant Feature Transform (SIFT)
Detecting features with SURF
Quantizing colors
Denoising images
Extracting patches from an image
Detecting faces with Haar cascades
Searching for bright stars
Extracting metadata from images
Extracting texture features from images
Applying hierarchical clustering on images
Segmenting images with spectral clustering
Chapter 12. Parallelism and Performance
Introduction
Just-in-time compiling with Numba
Speeding up numerical expressions with Numexpr
Running multiple threads with the threading module
Launching multiple tasks with the concurrent.futures module
Accessing resources asynchronously with the asyncio module
Distributed processing with execnet
Profiling memory usage
Calculating the mean, variance, skewness, and kurtosis on the fly
Caching with a least recently used cache
Caching HTTP requests
Streaming counting with the Count-min sketch
Harnessing the power of the GPU with OpenCL
Appendix A. Glossary
Appendix B. Function Reference
IPython
Matplotlib
NumPy
pandas
Scikit-learn
SciPy
Seaborn
Statsmodels
Appendix C. Online Resources
IPython notebooks and open data
Mathematics and statistics
Appendix D. Tips and Tricks for Command-Line and Miscellaneous Tools
IPython notebooks
Command-line tools
The alias command
Command-line history
Reproducible sessions
Docker tips
Index