How to analyse pharmaceutical sales data using Python, Pandas, Matplotlib
In this project, you will analyze pharmaceutical sales data. Analysing sales data based on historical data is a very common data analysis task. This is a great way to start working with data science projects.
Create a folder for a project on your computer called "Analysing-pharmaceutical-sales-data"
Download the salesdaily.csv file from this Kaggle project. Place these data sets in a folder called "data" in your project folder.
Start a new Jupyter notebook
Answer the following questions using Pandas and Matplotlib:
What are the total sales quantities for each drug category (ATC code)?
Which individual drug brands have the highest total sales?
Which three drugs have the highest sales in January 2015, July 2016, September 2017.
Which drug has sold the most often in 2017?
Which drug category has the highest average daily sales?
Are respiratory drugs (R03) sold more during specific months?
Python 3
Pandas, Matplotlib
In this exercise, we have learned to use Python Pandas and Matplotlib libraries, the most popular data manipulation and visualization libraries in data science.
Check out this Analysing Pharmaceutical Sales Data in Python article from Towards Data Science if you need some inspiration on how to solve this project.
Join the Community
roadmap.sh is the 6th most starred project on GitHub and is visited by hundreds of thousands of developers every month.