shoshin23 · 8 years ago
I would love to get started with data analysis. Could anyone point me to a reliable guide that covers the basics? I'm pretty well versed in Python and have used Pandas before as part of some side projects.

Thanks!

barry-cotter · 8 years ago
R

http://www-bcf.usc.edu/~gareth/ISL/

This book provides an introduction to statistical learning methods. It is aimed at upper-level undergraduate students, master's students and Ph.D. students in the non-mathematical sciences. The book also contains a number of R labs with detailed explanations on how to implement the various methods in real-life settings, and should be a valuable resource for a practicing data scientist.

Python

https://www.edx.org/micromasters/data-science

mtzet · 8 years ago
As a rookie trying to get into the field myself, I think there are quite a few ways to go about it.

The programming part with R, Python, Julia, etc. seems to get the most attention here. I think the most important part is learning how to load datasets into your system of choice and work with them to get some nice plots out. The book "R for Data Science"[1] seems like a good intro for this with R and the tidyverse.

Somewhat more overlooked here are the statistical models. I second the recommendation of "An Introduction to Statistical Learning"[2], possibly supplemented with its big brother "The Elements of Statistical Learning"[3] if you're more mathematically inclined and want more details. I like their emphasis on starting with simple models and working your way up. I also found their discussion of how to go from data to a mathematical model very lucid.

[1] http://r4ds.had.co.nz/

[2] http://www-bcf.usc.edu/~gareth/ISL/

[3] http://web.stanford.edu/~hastie/ElemStatLearn/
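
For anyone coming from Python rather than R, the "load a dataset and work with it" step might look roughly like the sketch below. It assumes pandas is installed; the inline CSV, the column names and the `load_and_summarize` helper are all made up for illustration and stand in for a real file.

```python
import io

import pandas as pd

# Hypothetical inline CSV standing in for a real file on disk.
CSV_TEXT = """city,year,temp_c
Oslo,2016,6.5
Oslo,2017,7.1
Lima,2016,19.8
Lima,2017,20.4
"""


def load_and_summarize(csv_source):
    """Load a CSV into a DataFrame and return the mean temperature per city."""
    df = pd.read_csv(csv_source)
    # groupby sorts the group keys, so the result is indexed Lima, Oslo.
    return df.groupby("city")["temp_c"].mean()


summary = load_and_summarize(io.StringIO(CSV_TEXT))
print(summary)
```

With matplotlib installed, `summary.plot(kind="bar")` should turn the same result into a quick bar chart.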

petters · 8 years ago
Bayesian Methods for Hackers https://github.com/CamDavidsonPilon/Probabilistic-Programmin... is really fun. But it does not exactly teach the basics of data science.
vazamb · 8 years ago
While it's a really cool approach, I would strongly discourage a beginner from starting with this. Bayesian methods are not what most people use, and the book does not teach any of the necessary basics of analyzing real-world data.
stared · 8 years ago
Go with scikit-learn, and learn some basic statistics and ML.

Some links: http://p.migdal.pl/2016/03/15/data-science-intro-for-math-ph...
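
To give a feel for what starting with scikit-learn looks like, here is a minimal sketch (not taken from the linked post). It assumes scikit-learn is installed; the choice of the bundled iris dataset, logistic regression, and a 25% held-out split are arbitrary illustrations.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Small dataset that ships with scikit-learn: 150 flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the rows so the score reflects unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# A simple baseline classifier; max_iter is raised to avoid convergence warnings.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"held-out accuracy: {accuracy:.2f}")
```

Most scikit-learn estimators follow the same fit / predict / score pattern, so swapping in another model is usually a one-line change.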

amrrs · 8 years ago
akilism · 8 years ago
There's a pretty good book out there, "Data Science from Scratch". It's all in Python and really lays out the workings of popular data science / analysis algorithms without relying heavily on external libraries.
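
To illustrate what "from scratch" means in practice, here is a small sketch of simple least-squares linear regression using only built-in Python. This is not the book's code; the function names and the toy data are invented for the example.

```python
def mean(xs):
    """Arithmetic mean of a non-empty sequence of numbers."""
    return sum(xs) / len(xs)


def fit_line(xs, ys):
    """Least-squares fit of y = alpha + beta * x; returns (alpha, beta)."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta


# Toy data generated from y = 1 + 2x, so the fit should recover those values.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
alpha, beta = fit_line(xs, ys)
print(alpha, beta)
```

Writing the formula out like this makes it clear that the slope is just the covariance of x and y divided by the variance of x.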