Prescriptions NI
Context
This is an exercise in presenting prescription data for Northern Ireland in a dashboard for further analysis. Tools used in this project:
- Posit's Shiny for Python, for creating dashboards from data frames.
- Pandas, for data analysis and manipulation.
- Beautiful Soup, to scrape the source pages and automate the data download.
The Journey
Data
The data was in .csv format with Windows-1252 encoding.
Data cleaning
The Windows-1252 encoding was a bit of a gotcha, as most .csv files are UTF-8. It took a fair amount of head scratching to figure this out, because the mismatch caused errors when creating dataframes.
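As a minimal sketch of the encoding issue, the snippet below builds a small in-memory sample (the column names and values are made up for illustration and are not taken from the real data set), shows that the bytes are not valid UTF-8, and then reads them by passing `encoding="cp1252"` to `pandas.read_csv`. `cp1252` is Python's codec name for Windows-1252.

```python
import io

import pandas as pd

# Hypothetical sample row; the pound sign becomes the single byte 0xA3 in Windows-1252.
raw = "Practice,Item,Gross Cost (£)\n1,Paracetamol,12.50\n".encode("cp1252")

# The same bytes are not valid UTF-8, which is why a default read fails.
try:
    raw.decode("utf-8")
    decodes_as_utf8 = True
except UnicodeDecodeError:
    decodes_as_utf8 = False

# Telling pandas the real encoding lets the file load cleanly.
df = pd.read_csv(io.BytesIO(raw), encoding="cp1252")
print(decodes_as_utf8)
print(df.columns.tolist())
```

With a real file, the same `encoding="cp1252"` argument can be passed alongside the file path.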
The data set spans many years and so had a few inconsistencies that needed to be cleaned or made uniform. Column names were in title case in some years and upper case in others, which caused further problems when processing these columns with Pandas. Missing values were also recorded inconsistently: some fields held - for no value, others were left blank/null, and some contained a single space. This also tripped things up when trying to present multiple years together.
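One possible way to handle both problems is sketched below. It is not necessarily how this project does it: `load_year`, `MISSING_MARKERS`, and the sample rows are hypothetical names and data chosen for illustration. The sketch treats `-`, a single space, and an empty field as missing at read time via the `na_values` parameter of `pandas.read_csv`, and lower-cases the column names so that files from different years line up when concatenated.

```python
import io

import pandas as pd

# Markers that the source files use for "no value" (assumed from the description above).
MISSING_MARKERS = ["-", " ", ""]


def load_year(csv_source):
    """Read one year's CSV with uniform missing values and lower-case column names."""
    df = pd.read_csv(csv_source, na_values=MISSING_MARKERS)
    df.columns = df.columns.str.strip().str.lower()
    return df


# Hypothetical samples: one year with upper-case headers, one with title-case headers.
year_a = io.StringIO("PRACTICE,ITEM,QUANTITY\n1,Aspirin,-\n2,Ibuprofen, \n5,Naproxen,20\n")
year_b = io.StringIO("Practice,Item,Quantity\n3,Codeine,\n4,Paracetamol,10\n")

combined = pd.concat([load_year(year_a), load_year(year_b)], ignore_index=True)
print(combined)
```

Normalising at load time keeps the later dashboard code free of per-year special cases, at the cost of discarding the original header casing.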
In its current state this works. To fetch the data you will need to uncomment the download functions; be aware that the total download is over 4 GB.
Next steps
Further iteration on the data processing to make it more efficient.