E-Commerce Data Analysis with Pandas

Explore the UCI Online Retail Dataset.

Project Description

In this project, you will work with the UCI Online Retail Dataset, a real transactions dataset from a UK-based online store with over 500,000 rows. You will clean it, filter it, and compute your first business metric.

This project is closer to real work than most beginner exercises. The data is not clean, and some rows make no sense, so you have to deal with that before doing any analysis.

Project Requirements

  • Download the UCI Online Retail Dataset (available on Kaggle or the UCI ML Repository)

  • Sample 10% of rows to keep things manageable

  • Clean the data: remove nulls, fix data types, and filter out returns and free items

  • Convert InvoiceDate to a proper datetime object

  • Create a Revenue column (Quantity × UnitPrice)

  • Find the top 10 countries by total revenue and plot the result
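
The requirements above can be sketched roughly as follows. This is a minimal illustration, not the reference solution: the function names `clean_retail` and `top_countries` are hypothetical, the tiny `demo` frame stands in for the real file, and the choice of which columns to check for nulls is an assumption. The column names `Quantity`, `UnitPrice`, `InvoiceDate`, and `Country` are assumed to match the dataset.

```python
import pandas as pd


def clean_retail(df: pd.DataFrame, frac: float = 0.1, seed: int = 42) -> pd.DataFrame:
    """Sample, clean, and add a Revenue column (hypothetical helper)."""
    # Sample a fraction of rows; a fixed seed keeps the sample reproducible.
    out = df.sample(frac=frac, random_state=seed)
    # Assumption: only the columns used below are checked for nulls.
    out = out.dropna(subset=["Quantity", "UnitPrice", "InvoiceDate", "Country"])
    # Fix data types: parse dates and coerce the numeric columns.
    out = out.assign(
        InvoiceDate=pd.to_datetime(out["InvoiceDate"], errors="coerce"),
        Quantity=pd.to_numeric(out["Quantity"], errors="coerce"),
        UnitPrice=pd.to_numeric(out["UnitPrice"], errors="coerce"),
    )
    # Values that could not be parsed became NaT/NaN; drop those rows too.
    out = out.dropna(subset=["InvoiceDate", "Quantity", "UnitPrice"])
    # Treat negative quantities as returns and zero prices as free items.
    out = out[(out["Quantity"] > 0) & (out["UnitPrice"] > 0)].copy()
    out["Revenue"] = out["Quantity"] * out["UnitPrice"]
    return out


def top_countries(df: pd.DataFrame, n: int = 10) -> pd.Series:
    """Total revenue per country, largest first (hypothetical helper)."""
    return df.groupby("Country")["Revenue"].sum().nlargest(n)


# Made-up rows for illustration only; they are not taken from the dataset.
demo = pd.DataFrame(
    {
        "InvoiceDate": [
            "2010-12-01 08:26:00",
            "2010-12-01 08:28:00",
            "2010-12-02 09:00:00",
            "2010-12-02 09:05:00",
            None,
        ],
        "Quantity": [6, -2, 3, 10, 1],
        "UnitPrice": [2.5, 2.5, 0.0, 1.0, 4.0],
        "Country": ["United Kingdom", "United Kingdom", "France", "France", "Germany"],
    }
)

# frac=1.0 keeps every demo row; use the default 0.1 on the real data.
cleaned = clean_retail(demo, frac=1.0)
top = top_countries(cleaned)
print(top)
```

On the real data, you would load the file first, for example with `pd.read_csv("OnlineRetail.csv")` (the file name is a placeholder, and some copies of the dataset may need an explicit `encoding` argument). One way to plot the result is `top.plot(kind="barh")`, which requires Matplotlib to be installed.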

Technologies to Use

  • Python

  • Pandas

  • Matplotlib / Seaborn

  • Jupyter Notebook

What You Will Learn

You will practice cleaning a large, realistic dataset and computing a derived metric. You will also understand why negative quantities and zero prices exist in real transaction data, and how to handle them without deleting useful rows.
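
One way to handle negative quantities and zero prices without deleting rows is to flag them and exclude them only from the metrics where they do not belong. The sketch below is illustrative: the helper `flag_special_rows` and the columns `IsReturn`, `IsFreeItem`, and `LineValue` are hypothetical names, and the sample rows are made up.

```python
import pandas as pd


def flag_special_rows(df: pd.DataFrame) -> pd.DataFrame:
    """Mark returns and free items instead of dropping them (hypothetical helper)."""
    out = df.copy()
    # Assumption: a negative quantity marks a return.
    out["IsReturn"] = out["Quantity"] < 0
    # Assumption: a zero unit price marks a free item.
    out["IsFreeItem"] = out["UnitPrice"] == 0
    out["LineValue"] = out["Quantity"] * out["UnitPrice"]
    return out


# Made-up rows for illustration only.
demo = pd.DataFrame({"Quantity": [6, -2, 3], "UnitPrice": [2.5, 2.5, 0.0]})
flagged = flag_special_rows(demo)

# Gross revenue counts only ordinary sales; returns are totalled separately.
is_sale = ~flagged["IsReturn"] & ~flagged["IsFreeItem"]
gross_revenue = flagged.loc[is_sale, "LineValue"].sum()
refunds = flagged.loc[flagged["IsReturn"], "LineValue"].sum()
net_revenue = gross_revenue + refunds

print(gross_revenue, refunds, net_revenue)
```

Keeping the flagged rows means the same frame can later answer questions such as how much value was returned, which a frame with those rows deleted could not.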

Want to See a Solution?

A full walkthrough of this project is available on Towards Data Science: "EDA in Public: Cleaning and Exploring Sales Data with Pandas"
