Embark on an engaging project that combines web scraping, ETL (Extract, Transform, Load), and exploratory data analysis (EDA). Dive into the Amazon ecosystem as you use these techniques to uncover hidden gems and transform raw data into meaningful insights.
Hey, welcome to ‘Amazon Dashboard’, where we bring together web scraping, Plotly, and a Flask-powered API to create an interactive data visualization experience. Uncover patterns, manipulate data, and interact with insights in real time.
As a global e-commerce giant and digital services provider, Amazon complements the tech experiences offered by both Samsung and iPhone. Together, the dynamic trio of Samsung, iPhone, and Amazon creates a seamless, interconnected digital experience. From the latest smartphones to smart home integration and effortless online shopping, this collaboration defines the modern intersection of technology and convenience.
“Amazon_Web_Scrapping_Data_Visualizations” integrates web scraping, ETL processes, and data visualizations using Beautiful Soup, Plotly.js, and Flask. The exploration focuses on Apple and Samsung products sold on Amazon, offering insights into pricing, star ratings, and global ratings, presented through interactive charts and a dynamic dashboard.
The ETL process for ‘Amazon ETL & Visualizations’ breaks into three deliverables.
Deliverable 1: Web Scraping Using Beautiful Soup.
Deliverable 2: Data Cleaning Using Jupyter Notebook.
Deliverable 3: Flask-powered API & Data Visualizations Using Plotly.js
Hey, let’s collect the data with web scraping using Beautiful Soup and analyze it with Pandas.
Before you begin, ensure you have the following installed:
Python 3.6 or higher
Beautiful Soup 4
Requests (for making HTTP requests)
Pandas (for data analysis)
We scrape data from the Amazon product page URLs for Apple and Samsung smartphones:
The structure of the webpages can be viewed here:
Defines a list of URLs for iPhone and Samsung products on Amazon.
Defines a function scrape_amazon_product to extract relevant information from Amazon product pages using BeautifulSoup and requests.
The function extracts the product title, price, star ratings, number of global ratings, customer star percentages, and feature ratings.
It also contains a function extract_model_number to extract the model number from the product title.
The script showcases a systematic approach to web scraping.
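The scraping step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: the CSS selectors, the `User-Agent` header, the returned keys, the helper `parse_product_page`, and the model-number pattern are all assumptions, and Amazon's markup changes often enough that the selectors should be checked against a live page.

```python
import re
from typing import Optional

import requests
from bs4 import BeautifulSoup

# Assumed selectors: placeholders that may not match Amazon's current markup.
TITLE_SELECTOR = "#productTitle"
PRICE_SELECTOR = "span.a-price span.a-offscreen"
STARS_SELECTOR = "#acrPopover"
GLOBAL_RATINGS_SELECTOR = "#acrCustomerReviewText"


def _text(soup: BeautifulSoup, selector: str) -> Optional[str]:
    """Return the stripped text of the first match, or None if absent."""
    node = soup.select_one(selector)
    return node.get_text(strip=True) if node else None


def extract_model_number(title: Optional[str]) -> Optional[str]:
    """Pull a model token such as 'iPhone 14' or 'Galaxy S23' from a title.

    The pattern is an assumption about how titles are worded.
    """
    if not title:
        return None
    match = re.search(r"(iPhone\s?\d+|Galaxy\s?[A-Z]\d+)", title)
    return match.group(1) if match else None


def parse_product_page(html: str) -> dict:
    """Extract the raw (still unparsed) fields from one product page."""
    soup = BeautifulSoup(html, "html.parser")
    title = _text(soup, TITLE_SELECTOR)
    stars_node = soup.select_one(STARS_SELECTOR)
    return {
        "title": title,
        "model": extract_model_number(title),
        "price": _text(soup, PRICE_SELECTOR),
        "star_ratings": stars_node.get("title") if stars_node else None,
        "global_ratings": _text(soup, GLOBAL_RATINGS_SELECTOR),
    }


def scrape_amazon_product(url: str) -> dict:
    """Fetch one product page and parse it."""
    headers = {"User-Agent": "Mozilla/5.0"}  # assumed; some header is usually needed
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    return parse_product_page(response.text)
```

Keeping the parsing in its own function means it can be tried on a saved HTML file without sending any requests to Amazon.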
Defines dictionaries ‘iphone_scraped_data’ and ‘samsung_scraped_data’ to store scraped data for iPhone & Samsung products.
Loops through each brand’s URLs, calls the scrape_amazon_product function, and stores the scraped data in the dictionaries. Saves the scraped data to JSON and CSV files.
Creates DataFrames ‘scraped_iphone_df’ and ‘scraped_samsung_df’ from the scraped data. Splits the ‘title’ column into multiple columns based on commas. Saves the modified DataFrames to CSV files.
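The loop, save, and title-split steps above might look something like the sketch below. It assumes each brand's results are stored in a dictionary keyed by URL, and the names `scrape_all`, `save_scraped_data`, and the `title_part_N` columns are hypothetical, chosen only for illustration.

```python
import json
from pathlib import Path
from typing import Callable, Dict, Iterable

import pandas as pd


def scrape_all(urls: Iterable[str], scrape_fn: Callable[[str], dict]) -> Dict[str, dict]:
    """Call the scraping function on every URL and key the results by URL."""
    return {url: scrape_fn(url) for url in urls}


def save_scraped_data(scraped: Dict[str, dict], stem: str, out_dir: Path) -> pd.DataFrame:
    """Write the raw data to JSON, split the title on commas, and write a CSV."""
    out_dir.mkdir(parents=True, exist_ok=True)
    json_path = out_dir / f"{stem}.json"
    json_path.write_text(json.dumps(scraped, indent=2), encoding="utf-8")

    # One row per URL, one column per scraped field.
    df = pd.DataFrame.from_dict(scraped, orient="index")

    # Split 'title' into one column per comma-separated part.
    parts = df["title"].str.split(",", expand=True)
    parts = parts.apply(lambda col: col.str.strip())
    parts.columns = [f"title_part_{i + 1}" for i in range(parts.shape[1])]

    df = pd.concat([df.drop(columns="title"), parts], axis=1)
    df.to_csv(out_dir / f"{stem}.csv", index_label="url")
    return df
```

Titles with fewer comma-separated parts than the longest one are left with empty cells in the trailing columns, which the cleaning stage then has to handle.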
Hey, let’s clean the data for robust Python Data Analysis experiences using Pandas and Regular Expressions.
Before you begin, ensure you have the following installed:
Python 3.6 or higher
Regular Expressions (Python's built-in re module)
Data Analysis (Pandas, NumPy, Scikit-learn)
We load the data from two files, ‘scraped_iphone_data.csv’ and ‘scraped_samsung_data.csv’, using Pandas.
Drops unnecessary columns.
Extracts and converts storage capacity, color, price, star ratings, global ratings, and customer stars percentages to appropriate data types using regular expressions.
Creates new columns, including ‘brand’ and ‘model_year’, and reorders the columns with appropriate names.
Saves cleaned ‘iphone_samsung_df’ data to JSON and CSV files.
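As a rough illustration of the regex-based type conversion above, the sketch below turns raw scraped strings into numbers. The input column names, the raw string formats (for example "$1,049.99" or "4.4 out of 5 stars"), and the helper names are assumptions; the real notebook may extract the fields differently.

```python
import pandas as pd


def _first_number(series: pd.Series) -> pd.Series:
    """Drop thousands separators, then extract the first integer or decimal."""
    text = series.astype(str).str.replace(",", "", regex=False)
    found = text.str.extract(r"(\d+(?:\.\d+)?)", expand=False)
    return pd.to_numeric(found, errors="coerce")


def clean_scraped(df: pd.DataFrame, brand: str) -> pd.DataFrame:
    """Return a typed copy of the raw columns plus a constant 'brand' column."""
    return pd.DataFrame(
        {
            "brand": brand,
            "storage_capacity_gb": _first_number(df["storage_capacity"]),
            "price": _first_number(df["price"]),
            "star_ratings": _first_number(df["star_ratings"]),
            # Nullable integer type keeps missing counts as <NA> instead of NaN.
            "global_ratings": _first_number(df["global_ratings"]).astype("Int64"),
        }
    )
```

The two cleaned brand frames could then be stacked with `pd.concat([apple_df, samsung_df], ignore_index=True)` to form the merged ‘iphone_samsung_df’.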
Hi, let’s explore the world of Amazon, Apple, and Samsung like never before!
Before you begin, ensure you have the following installed:
Flask (the web framework used to create the web application)
jsonify (converts Python dictionaries to JSON responses)
render_template (renders HTML templates)
HTML, CSS, Bootstrap (for the structure and style of the web page)
JavaScript (for dynamic behavior)
SweetAlert (for dashboard popup boxes)
D3.js (for data manipulation)
Plotly (for interactive graphs)
Renders the “index.html” template when users access the home page (“/”).
The routes below read the merged iPhone and Samsung data from a CSV file, convert it to JSON, and return it as a JSON response:
/api/iphone_samsung_details
/api/iphone_details
/api/samsung_details
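A minimal sketch of how these routes could be wired up in Flask is shown below. The CSV file name, the `load_records` helper, and the idea that the brand-specific routes filter on a ‘brand’ column holding "Apple" or "Samsung" are assumptions made for illustration.

```python
import json
from typing import List, Optional

import pandas as pd
from flask import Flask, jsonify, render_template

app = Flask(__name__)
# Hypothetical file name; point this at the cleaned, merged CSV.
app.config["DATA_FILE"] = "iphone_samsung_data.csv"


def load_records(brand: Optional[str] = None) -> List[dict]:
    """Read the merged CSV and return its rows as JSON-safe dictionaries."""
    df = pd.read_csv(app.config["DATA_FILE"])
    if brand is not None:
        df = df[df["brand"] == brand]
    # Round-tripping through to_json turns NaN into null and NumPy types into plain ones.
    return json.loads(df.to_json(orient="records"))


@app.route("/")
def home():
    return render_template("index.html")


@app.route("/api/iphone_samsung_details")
def iphone_samsung_details():
    return jsonify(load_records())


@app.route("/api/iphone_details")
def iphone_details():
    return jsonify(load_records("Apple"))


@app.route("/api/samsung_details")
def samsung_details():
    return jsonify(load_records("Samsung"))
```

Reading the CSV on every request keeps the sketch short; for a larger file it would be reasonable to load the data once at startup instead.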
Displays a SweetAlert popup with a success message when the page loads, providing a user-friendly experience.
Creates a dropdown menu to select different iPhone and Samsung models. Upon selection, populates the product information and builds the pie and bar charts for the selected product.
Also initializes the brand-level charts for Apple and Samsung.
Populating Product Information:
Retrieves specific information (brand, color, price, storage capacity) for the selected product and displays it.
Constructs a pie chart displaying the distribution of star ratings for the selected product.
Building Bar Chart for Product Star Ratings:
Constructs a bar chart displaying the prices of Apple & Samsung phones.
Bar chart displaying the distribution of star ratings for Samsung brand models.
Line chart showing the evolution of brand models over different model years for both Apple and Samsung with respect to price.
Bubble chart showing the relationship between global ratings, price, and brand models for both Apple and Samsung.
The size of each bubble is determined by the corresponding overall “star_ratings”.
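The dashboard builds its charts in the browser with Plotly.js, but the mapping from data to bubble chart can be illustrated in Python, since a Plotly trace is just a JSON-style object. The sketch below is not the dashboard's own code: the function name, the record keys (‘brand’, ‘model’, ‘price’, ‘global_ratings’, ‘star_ratings’), the choice of axes, and the scaling factor are assumptions.

```python
from typing import Dict, List


def bubble_trace(records: List[dict], brand: str, size_scale: float = 10.0) -> Dict:
    """Build one scatter trace per brand, sized by star rating.

    x = number of global ratings, y = price, hover text = model name.
    """
    rows = [r for r in records if r.get("brand") == brand]
    return {
        "type": "scatter",
        "mode": "markers",
        "name": brand,
        "x": [r["global_ratings"] for r in rows],
        "y": [r["price"] for r in rows],
        "text": [r["model"] for r in rows],
        # Star ratings sit in a narrow range, so scale them up to visible marker sizes.
        "marker": {"size": [r["star_ratings"] * size_scale for r in rows]},
    }
```

A dictionary shaped like this could be serialized to JSON and passed as one element of the data array given to `Plotly.newPlot` on the page.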
Summary
Apple prices fall in higher ranges than Samsung prices.
5-star ratings make up a larger share of Apple's ratings (68.8%) than of Samsung's (62.4%).
Among Apple models, the price spikes for the iPhone 14 Pro Max (2022), whereas the price of the Samsung Galaxy S Ultra shows minimal fluctuation (2023).
For Apple iPhones:
The iPhone 14 has the highest star rating (4.4) among the Apple models.
The iPhone 14 Pro Max has the highest price ($1849.0) among the Apple models.
The iPhone 13 Pro Max has the highest number of global ratings (420) among the Apple models.
For Samsung Galaxy Phones:
The Galaxy S23 5G has the highest star rating (4.5) among the Samsung models.
The Galaxy S22 Ultra has the highest price ($839.99) among the Samsung models.
The Galaxy S21 5G has the highest number of global ratings (173) among the Samsung models.
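Per-brand highlights like the ones above can be derived from the cleaned data with a grouped `idxmax`. The sketch below assumes the merged DataFrame has ‘brand’ and ‘model’ columns alongside the numeric metrics; the function name is hypothetical.

```python
import pandas as pd


def top_per_brand(df: pd.DataFrame, metric: str) -> pd.DataFrame:
    """Return, for each brand, the model with the highest value of `metric`."""
    # idxmax gives the row label of the maximum within each brand group.
    best_rows = df.groupby("brand")[metric].idxmax()
    return df.loc[best_rows, ["brand", "model", metric]].set_index("brand")
```

Calling it once each for ‘star_ratings’, ‘price’, and ‘global_ratings’ gives the three per-brand leaders. When several models tie, `idxmax` keeps the first one it encounters.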