Tableau presentation of cleaned sample travel data from Kaggle.com
The goal of this project was to practice data cleaning and standardization in Excel, followed by data analysis and presentation in Tableau.
NOTE: You can preview the story from my Tableau workbook at https://public.tableau.com/shared/KX46SMJWF?:display_count=n&:origin=viz_share_link. However, the workbook is optimized for viewing in Tableau software.
I downloaded a Kaggle.com dataset in CSV format (see kaggle-travel-dataset-original.csv in repository files) from https://www.kaggle.com/datasets/rkiattisak/traveler-trip-data. The dataset is an LLM-generated sample published by Kiattisak Rattanaporn. I cleaned the data in Excel, then imported the cleaned CSV into Tableau. The workbook currently consists of a dashboard and multiple worksheets.
The cleaned dataset is a single CSV file with 13 fields and 137 records. Important fields include:
- Customer demographics (nationality, age, gender)
- Travel destination
- Travel start dates and end dates
- Duration of each trip
- Cost of transportation and accommodation
- Accommodation type and transportation method
Data cleaning included the following steps:
- Removed 2 null records that only contained trip ID
- Standardized the nationality field to match country names
- Added countries to destinations where only a city name was present
- Stripped the 'USD' label from the currency fields, because all values are in USD
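The cleaning itself was done by hand in Excel. For readers who prefer a scripted approach, the steps above could be approximated in Python roughly as follows. This is only a sketch: the column names, the lookup tables, and the sample rows are all made up for illustration and may not match the real CSV headers or values.

```python
import csv
import io

# Hypothetical lookup tables; a real run would need entries for every value in the data.
NATIONALITY_TO_COUNTRY = {"American": "United States", "British": "United Kingdom"}
CITY_TO_COUNTRY = {"Paris": "France", "Tokyo": "Japan"}
# Hypothetical column names; check them against the actual CSV header.
COST_FIELDS = ("Accommodation cost", "Transportation cost")


def clean_rows(rows):
    """Yield cleaned copies of rows, skipping records that hold only a trip ID."""
    for row in rows:
        others = [v for k, v in row.items() if k != "Trip ID"]
        if not any((v or "").strip() for v in others):
            continue  # null record: nothing besides the trip ID
        row = dict(row)
        # Standardize nationality to a country name where a mapping is known
        nat = row["Traveler nationality"].strip()
        row["Traveler nationality"] = NATIONALITY_TO_COUNTRY.get(nat, nat)
        # Append the country when the destination is a known city on its own
        dest = row["Destination"].strip()
        if "," not in dest and dest in CITY_TO_COUNTRY:
            dest = f"{dest}, {CITY_TO_COUNTRY[dest]}"
        row["Destination"] = dest
        # Drop the 'USD' label so the cost fields hold only the number
        for field in COST_FIELDS:
            row[field] = row[field].replace("USD", "").strip()
        yield row


# Made-up sample rows standing in for the original CSV
sample = io.StringIO(
    "Trip ID,Destination,Traveler nationality,Accommodation cost,Transportation cost\n"
    "1,Paris,American,1200 USD,600 USD\n"
    "2,,,,\n"
    '3,"Tokyo, Japan",British,1500 USD,300 USD\n'
)
cleaned = list(clean_rows(csv.DictReader(sample)))
```

In this sample, the record with only a trip ID is dropped and the other two rows come out with standardized nationalities, full destinations, and numeric-only cost strings.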
Some travel destination records contain only a country, without a city. Instead of attempting to analyze travel destinations by city and country separately, I chose to focus only on the top 10 most visited cities in the Tableau dashboard. Data on the sole traveler from New Zealand was excluded from the "Aggregated by Traveler Nationality" map view, because that traveler's accommodation cost was an extremely high outlier.
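These two choices were made inside Tableau, but the same logic can be sketched in Python. The example below assumes cleaned destinations look like "City, Country" and that country-only entries contain no comma; the column name "Traveler nationality" and the sample values are hypothetical.

```python
from collections import Counter


def top_cities(destinations, n=10):
    """Count visits per city, ignoring country-only entries (those without a comma)."""
    cities = Counter(
        d.split(",")[0].strip() for d in destinations if "," in d
    )
    return cities.most_common(n)


def exclude_nationality(rows, nationality):
    """Return rows whose traveler nationality differs from the one given."""
    return [r for r in rows if r["Traveler nationality"] != nationality]


# Made-up values for illustration only
destinations = [
    "Paris, France", "Tokyo, Japan", "Paris, France", "Japan",
    "Bali, Indonesia", "Paris, France", "Tokyo, Japan",
]
top = top_cities(destinations, n=2)

travelers = [
    {"Traveler nationality": "Canada"},
    {"Traveler nationality": "New Zealand"},
]
map_rows = exclude_nationality(travelers, "New Zealand")
```

Filtering by nationality works here only because a single traveler carries the outlier; with more records per country, filtering on the cost value itself would likely be a better fit.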
Thanks to Kiattisak Rattanaporn for providing the original Kaggle dataset under Attribution 4.0 International (CC BY 4.0).