Cleaning the Netflix Dataset with Pandas
Learn how to clean the Netflix dataset using Python and Pandas.
Project Description
In this project, you will load the Netflix Movies and TV Shows dataset from Kaggle and clean it using Pandas. The dataset has missing values, wrong data types, and mixed-type columns, which is exactly the kind of mess you find in real data.
The goal is not just to drop nulls, but to understand why values are missing and make deliberate decisions about each column.
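Before deciding what to do with each column, it helps to see how much is actually missing. The snippet below is a minimal sketch of a per-column missing-value report. The helper name `missing_report` is hypothetical, and the sample rows are made up to stand in for the real file; the column names are assumptions about the Kaggle CSV.

```python
import pandas as pd


def missing_report(df: pd.DataFrame) -> pd.DataFrame:
    """Return the count and percentage of missing values per column, worst first.

    Hypothetical helper for illustration; not part of pandas.
    """
    counts = df.isna().sum()
    report = pd.DataFrame(
        {
            "missing": counts,
            "percent": (counts / len(df) * 100).round(1),
        }
    )
    return report.sort_values("missing", ascending=False)


# Made-up sample rows; the real dataset has many more columns and rows.
sample = pd.DataFrame(
    {
        "title": ["A", "B", "C", "D"],
        "director": [None, "Jane Doe", None, None],
        "country": ["US", None, "IN", "US"],
    }
)

report = missing_report(sample)
print(report)
```

A column that is mostly empty may call for a placeholder value, while a column with only a handful of gaps might be worth filling or dropping row by row. The report only informs that decision; it does not make it for you.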
Project Requirements
Download the Netflix dataset from Kaggle.
Inspect the DataFrame with .info(), .describe(), and .head().
Identify and handle missing values column by column.
Fix mixed-type columns (e.g., duration stored as "90 min").
Parse date columns into proper datetime objects.
Export the cleaned DataFrame to a new CSV file.
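The requirements above can be sketched end to end on a tiny inline sample. This is one possible approach, not a reference solution. The column names (`director`, `date_added`, `duration`), the date format, the "Unknown" placeholder, and the output file name are all assumptions, so check them against the actual Kaggle file before reusing any of it.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

# Made-up rows standing in for the Kaggle CSV (column names are assumed).
raw_csv = io.StringIO(
    "title,type,director,date_added,duration\n"
    'Film A,Movie,Jane Doe,"September 25, 2021",90 min\n'
    'Show B,TV Show,,"July 1, 2020",2 Seasons\n'
    "Film C,Movie,,,104 min\n"
)
df = pd.read_csv(raw_csv)

# 1. Inspect before changing anything.
df.info()
print(df.describe(include="all"))
print(df.head())

# 2. Missing values, column by column. Here a missing director is kept
#    and labeled explicitly rather than dropping the whole row.
df["director"] = df["director"].fillna("Unknown")

# 3. Mixed-type column: split "90 min" / "2 Seasons" into a number and a unit.
parts = df["duration"].str.extract(
    r"(?P<duration_value>\d+)\s*(?P<duration_unit>\w+)"
)
df["duration_value"] = pd.to_numeric(parts["duration_value"]).astype("Int64")
df["duration_unit"] = parts["duration_unit"].str.lower().str.rstrip("s")

# 4. Parse dates; values that do not match the assumed format become NaT.
df["date_added"] = pd.to_datetime(
    df["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
)

# 5. Export to a new CSV (written to the temp directory in this sketch).
out_path = Path(tempfile.gettempdir()) / "netflix_cleaned.csv"
df.to_csv(out_path, index=False)
```

Keeping the unit in its own column avoids mixing minutes and seasons in one numeric field, so movies and TV shows can still be filtered and compared separately.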
Technologies to Use
Python
Pandas
Jupyter Notebook
What You Will Learn
You will practice making real decisions about messy data, not just running .dropna() and moving on. You will also get comfortable reading a dataset before transforming it, which is a habit that matters a lot in real projects.
Want to See a Solution?
A full walkthrough of this project is available on Towards Data Science: 🔗 How to Clean Your Data in Python
