A Highly Scientific Analysis of Chinese Restaurant Names in NYC
Authors
Description
This project is an exploratory analysis of Chinese restaurant names in NYC.
369 Chinese restaurant names were scraped from Yelp's website with a Selenium bot I built and stored in a Pandas DataFrame.
Data cleaning and preprocessing were then carried out with Pandas and NumPy. Exploratory analysis and visualizations were done with Python libraries including Matplotlib, Seaborn and Plotly.
Important Caveat
This analysis only considers unique Chinese restaurant names, meaning that restaurants that have multiple establishments with the same name will be treated as a single entity.
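To make the caveat concrete, here is a minimal sketch of collapsing repeated names into unique entries. It uses plain Python rather than the project's actual Pandas code (where `DataFrame.drop_duplicates` would serve the same purpose). The `unique_names` helper, the case-insensitive matching rule and the sample names are all assumptions for illustration.

```python
def unique_names(names):
    """Return names with duplicates removed, keeping first-seen order.

    Names are compared case-insensitively after trimming whitespace.
    This matching rule is an assumption, not the project's documented rule.
    """
    seen = set()
    result = []
    for name in names:
        cleaned = name.strip()
        key = cleaned.lower()
        if key and key not in seen:
            seen.add(key)
            result.append(cleaned)
    return result


# Hypothetical sample: one name listed at two locations.
scraped = ["Golden Dragon", "golden dragon ", "Lucky Garden"]
deduped = unique_names(scraped)
print(deduped)
```

With this rule, a chain with several locations contributes its name only once to every count in the analysis.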
Network Analysis of Words in NYC Chinese Restaurant Names
Please wait for the interactive viz to load. It should only take a few seconds.
Viewing Instructions
- Hover over a node to see the word name
- The size of the node is relative to the total number of connections that word has to other words in restaurant names
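The node sizing described above can be sketched as follows. This is a rough illustration, assuming that two words are "connected" when they appear in the same restaurant name and that a node's size follows its number of distinct neighbours; the project's actual graph construction may differ. The `word_degrees` function and the sample names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def word_degrees(names):
    """Map each word to the number of distinct words it co-occurs with."""
    neighbors = defaultdict(set)
    for name in names:
        # Lowercase and split on whitespace; a set drops repeated words.
        words = set(name.lower().split())
        for a, b in combinations(sorted(words), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    return {word: len(linked) for word, linked in neighbors.items()}


# Hypothetical sample names, not taken from the scraped data.
sample = ["Golden Dragon", "Golden Garden", "Lucky Garden"]
degrees = word_degrees(sample)
print(degrees)
```

In this sketch a word that only ever appears in single-word names has no connections and therefore does not appear in the result.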
Categories
Word Cloud of Words in NYC Chinese Restaurant Names
Interactive Visualizations
Hover over the charts with your mouse to see the tooltip. Apply filters by clicking on the legend.
The most populated category is 'places', with 'restaurant' being the most common word in that category.
The most popular family names in NYC Chinese restaurant names are Hong and Chen, which together make up 18.75% of the family names present.
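A combined share like the one above can be computed as in the sketch below. The `top_share` helper and the sample list are hypothetical and are not meant to reproduce the 18.75% figure; they only show the arithmetic, assuming the share is the top names' count divided by the count of all family-name occurrences.

```python
from collections import Counter


def top_share(items, n=2):
    """Return the n most common items and their combined share in percent."""
    counts = Counter(items)
    total = sum(counts.values())
    if total == 0:
        return [], 0.0
    top = counts.most_common(n)
    share = 100 * sum(count for _, count in top) / total
    return [item for item, _ in top], share


# Hypothetical family names, not the project's data.
family_names = ["Hong", "Chen", "Hong", "Lee", "Wang", "Chen", "Li", "Zhang"]
names, share = top_share(family_names)
print(names, f"{share:.2f}%")
```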
Languages and Packages Used
Python 3.9.13 | NumPy | Pandas | Matplotlib | Seaborn | Plotly | Selenium
Environments Used
VS Code | Jupyter Notebook
Data Source
Restaurant Data Scraped from Yelp
Methods
Exploratory data analysis | Web Scraping | Feature Engineering
Analysis