How to Analyze and Clean a Dataset [Part 8] | Machine Learning for Beginners
108 likes
11,524 views
May 8, 2023
📊🧹 Welcome to this video, where Bea Stollnitz, a Principal Cloud Advocate at Microsoft, guides you through analyzing and cleaning a dataset. This video is part of our Machine Learning for Beginners series, where we'll cover various machine learning topics and their implementation using Python code in Jupyter notebooks.

In this video, you'll learn:
✅ How to import a messy dataset and analyze it using Pandas 🐼
✅ How to clean and prepare the data for further analysis
✅ How to filter and drop columns that aren't relevant to your question
✅ How to calculate new columns based on existing data

🎃 We'll use a real-world pumpkin dataset, which contains plenty of missing values and inconsistencies. Our goal is to find the cheapest month to buy pumpkins. Follow along as we clean the data and prepare it for visualization and analysis in the next video.

Make sure to subscribe and hit the notification bell 🔔 so you won't miss our next video, where we'll dive deeper into various machine learning topics and guide you through their implementation using Python code in Jupyter notebooks. See you there!

📙 Follow along
The Jupyter Notebook to follow along with this lesson is available here: https://github.com/microsoft/ML-For-B...

#DataCleaning #Python #Pandas #machinelearning #ml

📚 Learn more:
This course is based on the free, open source, 26 lesson ML For Beginners curriculum from Microsoft, which can be found at https://aka.ms/ml-beginners.

📇 Connect with Bea:
Blog: https://bea.stollnitz.com/blog/
LinkedIn: / beatrizstollnitz
Twitter: / beastollnitz

0:00 - Intro
0:18 - The Pumpkin dataset
0:47 - How should we clean the data so we can predict price by bushels?
1:20 - Open the notebook to clean the data
1:36 - Filter the data by package size
1:49 - Check for empty cells for the columns we care about
2:10 - Drop columns we are not interested in
2:18 - Create a new data frame with average prices by bushels and the month
3:00 - Review the data
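
🧪 A rough sketch of the cleaning steps listed in the chapters above, for readers who want to try them before opening the notebook. This is not the notebook's code: the column names ("Package", "Date", "Low Price", "High Price"), the helper name clean_pumpkins, and the four sample rows are all assumptions made up for illustration, and the sketch does not adjust prices for different carton sizes.

```python
import pandas as pd

# Hypothetical sample rows standing in for the real pumpkin CSV.
# Column names and values are assumptions, not taken from the video.
raw = pd.DataFrame({
    "City Name": ["BALTIMORE", "BALTIMORE", "BOSTON", "BOSTON"],
    "Package": ["24 inch bins", "1 1/9 bushel cartons",
                "1/2 bushel cartons", "1 1/9 bushel cartons"],
    "Date": ["09/24/16", "09/24/16", "10/01/16", "10/08/16"],
    "Low Price": [270.0, 18.0, 15.0, 16.0],
    "High Price": [280.0, 18.0, 15.0, 20.0],
    "Grade": [None, None, None, None],
})


def clean_pumpkins(df):
    """Hypothetical helper: filter, trim, and derive Month and Price columns."""
    # Filter by package size: keep only rows sold by the bushel.
    bushel = df[df["Package"].str.contains("bushel", regex=False)]

    # Drop columns we are not interested in by selecting the ones we need.
    bushel = bushel[["Package", "Date", "Low Price", "High Price"]]

    # Check for empty cells in the columns we care about.
    missing = bushel.isnull().sum()
    print(missing)

    # Calculate new columns: the low/high midpoint and the month of sale.
    price = (bushel["Low Price"] + bushel["High Price"]) / 2
    month = pd.to_datetime(bushel["Date"], format="%m/%d/%y").dt.month

    return pd.DataFrame({"Month": month,
                         "Package": bushel["Package"],
                         "Price": price})


cleaned = clean_pumpkins(raw)

# Average price per month, then the month with the lowest average.
by_month = cleaned.groupby("Month")["Price"].mean()
cheapest_month = by_month.idxmin()
print(by_month)
print("Cheapest month in this sample:", cheapest_month)
```

With the real dataset you would load the CSV with pd.read_csv instead of building the frame by hand; the result for the sample rows says nothing about actual pumpkin prices.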


Microsoft Developer

589K subscribers