{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## LabDS01 ##\n", "\n", "Consider the following files, which contain information about movies:\n", "\n", "https://github.com/masterfloss/datamovies/raw/main/movies_ratings.tsv\n", "\n", "https://github.com/masterfloss/datamovies/raw/main/moviesPT3.xlsx\n", "\n", "Perform the following group of activities:\n", "\n", "- Business Understanding\n", "\n", "- Data Understanding\n", "\n", "- Data Preparation\n", "\n", "Also perform the following tasks:\n", "\n", "1. Create two dataframes, dfrates and dfmovies.\n", "2. Some columns (Title.1, CodeIGAC, Dim, Director, Type, genre, premiere date, ID Imdb1, ISO, distributor) correspond to the first year of exhibition of the movie. Replace the missing values in these columns with the value of the previous cell.\n", "3. Verify whether the columns 'Title.1' and 'Title' are equal, and explain why.\n", "4. Create two extra columns: gross income per spectator and gross income per session.\n", "5. Remove missing values.\n", "6. Merge the two dataframes.\n", "7. Create a boxplot.\n", "8. Create histograms to verify the distributions.\n", "9. Compute the correlation between the variables.\n", "10. Create a heatmap of the correlations.\n", "\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import pandas as pd\n", "import seaborn as sns\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.7" } }, "nbformat": 4, "nbformat_minor": 4 }