The data scientist at Big Mart collected sales data for 2013 for 1559 products across 10 sotres in different cities. The goal of this is to help the retailer better understand the properties of products and outlets that play crucial roles in increasing sales. The dataset was inspected and cleaned by handling inconsistencies in the data names, outliers, dropping duplicates, and handling null values for example. Python libraries used were Numpy, Pandas, MatplotLib and Seaborn. The data was preprocessed for machine learning and a linear regression and decision tree model were used to predict future sales based on the data provided.
This project demonstrates execution of SQL queries with Python.
This Tableau Dashboard for the Chicago Crime dataset provides visuals for the distribution of each crime type, compares crimes count for months of the year, day of the week, and across the years.
This dataset is used to predict whether a patient is likely to get a stroke based on the input parameters like gender, age, various diseases, and smoking status. The data is cleaned and prepared for the machine learning. A logistic regression model and KNN Model were used to predict the likelyhood of a patient having a stroke.
This is a subset of IMDB's publicly available dataset. This dataset is used to analyze what makes a movie successful.
This is a project utilizing Tableau using data from Data Analyst Salary on Kaggle's website. It answers six different questions regarding average salaries for Data Analysts.