Resource Center¶
Here are links to a variety of data science and business analytics resources.
General business analytics and data science¶
Analytics Magazine - Published by INFORMS, the Institute for Operations Research and the Management Sciences. INFORMS is the premier professional society for analytics, and it’s inexpensive to join as a student. Full disclosure: I’ve been an INFORMS member since about 1986 (when it was still ORSA/TIMS). We were doing analytics before it was called analytics. :)
Awesome Data Science - giant curated list of data science resources
There are numerous groups on Reddit related to analytics and data science. These can be very good resources for unvarnished conversations and opinions about careers and grad school, as well as technical advice.
Programming tutorial hubs¶
Software Carpentry and Data Carpentry - helping scientists learn to do computational work with R, Python, SQL and other tools
Online courses¶
There are numerous online courses available through DataCamp, Coursera, edX, Udemy and others. Here are a few Python and R ones I’ve checked out over the years.
Intro to Data Science in Python - I did this short course in Feb 2017 (Coursera UMich). Great fun. If you want a good pandas/python learning challenge, try the assignments.
Python for Everybody course - This site includes a bunch of videos and supplementary files. The whole thing was created by a professor at the University of Michigan and is meant to be a totally open set of freely available learning materials for Python in the context of data analysis.
Coursera has some well-regarded R-based data science courses.
Learning R¶
Online tutorials, books and examples for getting started¶
R-bloggers - The aggregator for R-related blogs.
Quick-R - This is a great site dedicated to helping R newbies get over the somewhat steep R learning curve.
Cookbook for R - Another great site for learning R. In their words: “The goal of the cookbook is to provide solutions to common tasks and problems in analyzing data.”
R for Data Science - Free, online version of the book, R for Data Science by Hadley Wickham and Garrett Grolemund.
The Official R Manuals - These are accessible from the main R Project page in the Documentation section.
Contributed Documentation - Many people have written tutorials, books, and other free documentation for various aspects of R. This is part of the magic of the R community.
Introducing R to a non-programmer in one hour - Just what it says.
Webinars from RStudio - The creators of the hugely popular RStudio IDE have a ton of learning resources on their site.
Teach yourself Shiny - A somewhat recent development by the folks at RStudio is Shiny, an R package for building web apps. Learn to create interactive, R-driven web apps!
Packages¶
The R ecosystem relies on high-quality packages and its community of package developers. Here are some collections of package descriptions and links.
RStartHere - A very comprehensive and well-organized list of packages for doing data science in R.
Awesome R - Curated list of R packages by category (IDE, data manipulation, etc.)
Learning Python¶
Online tutorials, books and examples for getting started¶
Software Carpentry - Lessons - Software Carpentry is one of my all time favorite resources for teaching and learning practical programming skills. This link takes you to their list of “Lessons” (really entire mini-courses). In addition to a lesson on Python, you’ll find lessons on tons of stuff that is useful for business analytics and data science. Highly, highly recommended.
Whirlwind Tour of Python - Jake VanderPlas - Free 100-page PDF and associated Jupyter notebooks for those who want to learn Python for data science and have some prior programming knowledge.
Python for Everybody - Charles Severance - This is a remixed, freely available textbook on learning Python to do data analysis.
Blogs and listservs¶
Practical Business Python - Super relevant blog for business students learning Python.
Pycoders Weekly - Weekly email newsletter. Always has interesting stuff and almost always something directly data science related.
Libraries¶
Awesome Python - A curated list of awesome Python frameworks, libraries, software and resources
Statistics¶
If you are rusty on statistics, there’s a really good OpenIntro Stats book available as a free online book, or you can pay what you want for a paperback copy. It includes R-based material.
You can also find high-quality free online statistics courses through the Open Learning Initiative as well as places like Coursera and edX.
Cross Validated is a great Q&A forum for all things statistics. Lots of R-related content.
Publicly available data¶
Kaggle Datasets - need to create a free Kaggle account
OpenML Datasets - site with many ML resources
cs109 Resources (2014) - Many links to datasets (as well as links to Python and misc data science stuff)
https://github.com/rstudio/RStartHere#data - From the RStartHere site
Workflow and reproducible analysis¶
Data Science Workflow: Overview and Challenges - Blog post by Philip Guo, who did his dissertation on this topic.
Cookiecutter Data Science - “A logical, reasonably standardized, but flexible project structure for doing and sharing data science work.”