EDSP Challenge Mentoring Program
Welcome!
The content of this repository is available for you to work through in preparation for completing the Enterprise Data Science Program Challenge.
Modules
The topics available in this program:
Module 1: Python
The core Python concepts you’ll need as you begin your journey into Data Science and Machine Learning.
Module 2: Math & Statistics
Data science draws on many areas of mathematics, including statistics, calculus, and linear algebra. This module focuses on statistics and probability, which serve as a foundation for the later modules.
Module 3: Business Understanding
This module guides you through setting up an AI/ML use case end to end from a business understanding and project management perspective.
Module 4: Data Understanding & Preparation (for Data Science & Analytics)
This module introduces the next steps of the data science process: Data Understanding and Data Preparation. The process begins with data profiling to understand the current state of the data and whether it can satisfy the business objectives of the project. Since data is often messy and incomplete, some data preparation is typically necessary before it can meet the needs of machine learning development. Exploratory data analysis (EDA) aims to identify the relationships among the elements (features) of the dataset, along with the features that exert the greatest influence on the Target (the value being predicted). Since potential correlations may be obscured within the data, and because machine learning algorithms expect data in particular formats (e.g., all numerical values), some form of feature engineering may be necessary to reveal the full predictive power of the data and to make it satisfy machine learning requirements.
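As a rough illustration of the steps described above, the sketch below uses only the Python standard library. The records, column names (`age`, `plan`, `spend`), and helper functions are hypothetical and are not taken from the module material; in practice a library such as pandas would usually handle this work.

```python
from statistics import mean

# Hypothetical toy records; None marks a missing value.
# "spend" plays the role of the Target (the value being predicted).
rows = [
    {"age": 25, "plan": "basic", "spend": 20.0},
    {"age": 40, "plan": "premium", "spend": 65.0},
    {"age": None, "plan": "basic", "spend": 30.0},
    {"age": 55, "plan": "premium", "spend": 80.0},
]

def missing_counts(rows):
    """Data profiling: count missing values per column."""
    return {col: sum(r[col] is None for r in rows) for col in rows[0]}

def impute_mean(rows, col):
    """Data preparation: fill missing numeric values with the column mean."""
    fill = mean(r[col] for r in rows if r[col] is not None)
    return [{**r, col: fill if r[col] is None else r[col]} for r in rows]

def one_hot(rows, col):
    """Feature engineering: replace a categorical column with 0/1 indicators."""
    categories = sorted({r[col] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != col}
        for category in categories:
            new_row[f"{col}_{category}"] = 1 if r[col] == category else 0
        encoded.append(new_row)
    return encoded

def pearson(xs, ys):
    """EDA: Pearson correlation between one feature and the Target."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

profile = missing_counts(rows)
prepared = one_hot(impute_mean(rows, "age"), "plan")
target = [r["spend"] for r in prepared]
correlations = {
    col: pearson([r[col] for r in prepared], target)
    for col in prepared[0]
    if col != "spend"
}
```

Here `profile` reports which columns have gaps, `prepared` holds all-numeric rows, and `correlations` gives a first look at which features move with the Target. Mean imputation and one-hot encoding are only two of many possible choices.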
Module 5: Modeling, Evaluation & Deployment
In this module, we discuss model training, evaluation and hyperparameter tuning.
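To make the three terms concrete, here is a minimal sketch using only the Python standard library. It uses a k-nearest-neighbours classifier as a stand-in model; the toy data, the function names, and the choice of algorithm are assumptions for illustration and do not reflect the module's actual content.

```python
from collections import Counter

# Hypothetical toy data: (feature_1, feature_2) -> label.
# The last training point is deliberately mislabeled to simulate noise.
train = [
    ((1.0, 1.0), "a"), ((1.2, 0.8), "a"), ((0.8, 1.1), "a"),
    ((3.0, 3.0), "b"), ((3.2, 2.9), "b"), ((2.9, 3.3), "b"),
    ((1.1, 0.95), "b"),
]
validation = [
    ((1.1, 0.9), "a"), ((3.1, 3.1), "b"),
    ((0.9, 1.2), "a"), ((2.8, 3.0), "b"),
]
test = [((1.0, 0.9), "a"), ((3.0, 3.1), "b")]

def predict(train, point, k):
    """Model: majority label among the k closest training points."""
    by_distance = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], point)),
    )
    labels = [label for _, label in by_distance[:k]]
    return Counter(labels).most_common(1)[0][0]

def accuracy(train, data, k):
    """Evaluation: fraction of points in `data` that are predicted correctly."""
    correct = sum(predict(train, x, k) == y for x, y in data)
    return correct / len(data)

def tune(train, validation, candidates):
    """Hyperparameter tuning: score each k on the validation set, keep the best."""
    scores = {k: accuracy(train, validation, k) for k in candidates}
    best_k = max(scores, key=scores.get)
    return best_k, scores

best_k, scores = tune(train, validation, [1, 3, 5])

# Final evaluation on data that was used for neither training nor tuning.
test_accuracy = accuracy(train, test, best_k)
```

With `k=1` the mislabeled point misleads one validation prediction, while larger values of `k` smooth it out. Keeping a separate test set means the reported accuracy is not inflated by the tuning step.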
Module 6: Presentation of Analytical Results
Presenting your analytical results is critical for success in the EDSP Challenge. In fact, 50% of the points are awarded for a compelling presentation that caters to both business and technical audiences. The goal of the challenge is not to evaluate how good your model is, but to evaluate your entire approach to solving the problem and how you present your process and results.
Module 7: Responsible AI
This module will cover:
- Microsoft’s Ethical Obligation & Responsible AI Principles
- Interpreting ML Model Behavior & Explaining Their Inferences
- Detecting & Mitigating Unintended Bias in Training Data and ML Models
- Microsoft Tools for Responsible AI: InterpretML, Fairlearn, and More
Module 8: Sample Challenge
In this module, you will have one week to complete an example challenge that closely mimics the EDSP Challenge. You will present your results to your team and mentor to receive feedback and learn from each other.
Module 9: MLOps
MLOps is not required for the Data Science Challenge. However, it is a critical need for many of our customers, so we’ve added this bonus module to help prepare you to discuss and implement MLOps processes.