11,524 views • May 8, 2023 • #Python #Pandas #DataCleaning
How to Analyze and Clean a Dataset [Part 8] | Machine Learning for Beginners
108 likes
📊🧹 Welcome to this video, where Bea Stollnitz, a Principal Cloud Advocate at Microsoft, guides you through analyzing and cleaning a dataset. This video is part of our Machine Learning for Beginners series, where we'll cover various machine learning topics and their implementation using Python code in Jupyter notebooks.
In this video, you'll learn:
✅ How to import a messy dataset and analyze it using Pandas 🐼
✅ How to clean and prepare the data for further analysis
✅ How to filter and drop columns that aren't relevant to your question
✅ How to calculate new columns based on existing data
🎃 We'll use a real-world pumpkin dataset, which contains plenty of missing values and inconsistencies. Our goal is to find the cheapest month to buy pumpkins. Follow along as we clean the data and prepare it for visualization and analysis in the next video.
Make sure to subscribe and hit the notification bell 🔔 so you won't miss our next video, where we'll dive deeper into various machine learning topics and guide you through their implementation using Python code in Jupyter notebooks. See you there!
📙 Follow along
The Jupyter Notebook to follow along with this lesson is available here:
https://github.com/microsoft/ML-For-B...
#DataCleaning #Python #Pandas #machinelearning #ml
📚 Learn more:
This course is based on Microsoft's free, open-source, 26-lesson ML For Beginners curriculum, which can be found at https://aka.ms/ml-beginners.
📇 Connect with Bea:
Blog: https://bea.stollnitz.com/blog/
LinkedIn: / beatrizstollnitz
Twitter: / beastollnitz
0:00 - Intro
0:18 - The Pumpkin dataset
0:47 - How should we clean the data so we can predict price by bushels?
1:20 - Open the notebook to clean the data
1:36 - Filter the data by package size
1:49 - Check for empty cells for the columns we care about
2:10 - Drop columns we are not interested in
2:18 - Create a new data frame with average prices by bushels and the month
3:00 - Review the data…
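
🧪 A rough sketch of the cleaning steps
The chapters above (filter by package size, check for empty cells, drop columns, build a new data frame of average price and month) could look roughly like this in Pandas. This is only a sketch and not the code from the notebook: the sample rows are invented for illustration, the column names ("Package", "Date", "Low Price", "High Price", "Origin") are assumptions about the pumpkin dataset, the helper name clean_pumpkins is hypothetical, and taking the midpoint of the low and high price as the "average price" is also an assumption.

```python
import pandas as pd

# Invented sample rows for illustration only; the real pumpkin dataset's
# columns and values may differ (column names here are assumptions).
raw = pd.DataFrame({
    "Package": ["1 1/9 bushel cartons", "24 inch bins",
                "1/2 bushel cartons", "1 1/9 bushel cartons"],
    "Date": ["9/24/16", "9/24/16", "10/1/16", "10/8/16"],
    "Low Price": [15.0, 160.0, 9.0, 18.0],
    "High Price": [17.0, 170.0, 11.0, 20.0],
    "Origin": ["MICHIGAN", None, "OHIO", "OHIO"],
})


def clean_pumpkins(df):
    """Hypothetical helper: keep bushel rows and derive month and price."""
    # Filter the data by package size: keep only rows sold by the bushel.
    bushels = df[df["Package"].str.contains("bushel", case=False, na=False)]

    # Drop columns we are not interested in by selecting the ones we keep.
    bushels = bushels[["Package", "Date", "Low Price", "High Price"]]

    # Calculate new columns from existing data: the month of each sale and
    # the midpoint of the low and high price (an assumed definition of
    # "average price").
    month = pd.to_datetime(bushels["Date"], format="%m/%d/%y").dt.month
    price = (bushels["Low Price"] + bushels["High Price"]) / 2

    return pd.DataFrame(
        {"Month": month, "Package": bushels["Package"], "Price": price}
    )


# Check for empty cells: count the missing values in each column.
missing = raw.isnull().sum()
print(missing)

cleaned = clean_pumpkins(raw)
print(cleaned)
```

Selecting the columns to keep, rather than listing the ones to drop, keeps the sketch short when the dataset has many columns that don't matter for the question.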