Before getting started with data science, it's important to realize that many different roles are involved:
- Data Engineer: ingests, explores, transforms, cleans, and understands data
- Data Scientist: shapes and evaluates data, builds models, and communicates and shares results
- Business Analyst: understands the problem domain, evaluates models, and communicates results
- App Developer: consumes data and models to create applications
Data science is often called a team sport because these roles need to collaborate. Instead of working in siloed environments, it helps to have a central platform where all the roles can work together. For this I recommend getting started with the Data Science Experience. Here you can import data from many different sources, clean and transform the data, use the tools and libraries of your choice (RStudio, Jupyter Notebooks, SPSS, TensorFlow), train and test models, set up automation of tasks, easily communicate results (PixieDust), and share models for consumption as API endpoints. You get all of this built on top of Apache Spark and IBM’s Cloud Platform.
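To make the clean, transform, train, and test steps concrete, here is a minimal sketch in plain Python. It is not tied to the Data Science Experience or any of the libraries listed here; the data is made up, and the function names (`clean`, `split`, `train`, `evaluate`) are hypothetical helpers chosen for illustration. In a real notebook you would more likely reach for pandas and scikit-learn.

```python
import random
from statistics import fmean

def clean(rows):
    """Drop rows with a missing value and convert the rest to floats."""
    return [(float(x), float(y)) for x, y in rows if x is not None and y is not None]

def split(rows, test_fraction=0.25, seed=0):
    """Shuffle reproducibly, then split into train and test sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    return shuffled[:cut], shuffled[cut:]

def train(rows):
    """Fit y = slope * x + intercept by ordinary least squares."""
    xs, ys = zip(*rows)
    mean_x, mean_y = fmean(xs), fmean(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def evaluate(model, rows):
    """Mean absolute error of the fitted line on held-out rows."""
    slope, intercept = model
    return fmean(abs(slope * x + intercept - y) for x, y in rows)

# Made-up data: y = 2x + 1, plus two rows with missing values.
raw = [(x, 2 * x + 1) for x in range(20)] + [(None, 3), (5, None)]

data = clean(raw)                       # clean / transform
train_rows, test_rows = split(data)     # hold out a test set
model = train(train_rows)               # train
error = evaluate(model, test_rows)      # test
print(f"slope={model[0]:.2f} intercept={model[1]:.2f} test MAE={error:.4f}")
```

Each function maps to one stage of the workflow, which is also roughly how the work divides across the roles above: cleaning sits with the data engineer, training and evaluation with the data scientist.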
Great podcast about data science: Partially Derivative
- Overview of Watson Machine Learning
- Getting started with Watson Machine Learning
- How to start a data science project in the Data Science Experience
- Adding a custom library to a Jupyter Notebook in the Data Science Experience
- How to Use dashDB in the Data Science Experience
- A visual introduction to Machine Learning
- Tour of Machine Learning Algorithms
- Machine Learning Progression for Software Engineers
- Recommended Resources for Beginners
- Notes on Data Science w/Chris Albon
21 Short Videos Introducing Features of the Data Science Experience
- Using Correlations to Understand Your Data in R
- Collection of examples, notebooks and exercises in data science and machine learning
- Analyze traffic data from the city of San Francisco
- Analyze and create data visualizations with Jupyter Notebooks
- Starcraft Replay Analysis
- Developing IBM Streams applications w/Python
- IBM Watson Data Lab Github
- Examples of pattern classification
- Deep Learning Anomaly Detection
- Cognitive IoT Machine Learning
- Realtime anomaly detection
- Python Connectors for Loading and Saving Data in Notebooks
- Train and Deploy Core ML Models for iOS and macOS
- Access IBM PowerAI platform w/Nimbix Cloud
- Achieve faster training of machine learning with TensorFlow and PowerAI
- Plotly Library for Pandas: Interactive graphs directly from dataframes
- scikit-learn: Machine Learning in Python