AndreasHadjisavvas99/Advanced-Databases

Advanced Databases - Project

Team 57

  • Loukia Pavlana (el18711)
  • Andreas Hadjisavvas (el18701)

Project Overview

This project was developed for the course Advanced Topics in Databases. It processes large-scale datasets of New York City taxi trips using Apache Spark and HDFS. The main goal was to run complex queries and data transformations that analyze the ride data, including tips, peak hours, and fares.

Key Features:

  • Analyzed ride data to identify the routes with the highest tips, peak hours, and fare patterns.
  • Used Apache Spark's DataFrame/SQL API and RDD API for data transformations and query execution.
  • Leveraged distributed computing for efficient processing of large datasets.
  • Implemented query optimizations for performance improvements.
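
To illustrate how one aggregation can be written with both Spark APIs named above, here is a minimal sketch that computes the average tip per pickup hour. It is not taken from the project's code: the sample rows, the column names `pickup_ts` and `tip`, and all function names are hypothetical. The plain-Python function is a reference for the same map and reduce-by-key pattern and runs without Spark; the two Spark functions assume `pyspark` is installed and that a `SparkSession` is passed in.

```python
from datetime import datetime

# Hypothetical sample rows: (pickup timestamp, tip amount).
# These values are made up for illustration, not project data.
SAMPLE_TRIPS = [
    ("2022-01-01 08:15:00", 2.0),
    ("2022-01-01 08:45:00", 4.0),
    ("2022-01-01 17:05:00", 1.0),
    ("2022-01-02 17:30:00", 3.0),
    ("2022-01-02 17:55:00", 8.0),
]


def to_hour_pair(trip):
    """Map step: (pickup_ts, tip) -> (hour, (tip, 1))."""
    pickup_ts, tip = trip
    hour = datetime.strptime(pickup_ts, "%Y-%m-%d %H:%M:%S").hour
    return hour, (tip, 1)


def merge(a, b):
    """Reduce step: add tip sums and trip counts."""
    return a[0] + b[0], a[1] + b[1]


def avg_tip_by_hour_local(trips):
    """Plain-Python reference of the map / reduce-by-key pattern."""
    totals = {}
    for hour, pair in map(to_hour_pair, trips):
        totals[hour] = merge(totals[hour], pair) if hour in totals else pair
    return {hour: s / n for hour, (s, n) in totals.items()}


def avg_tip_by_hour_rdd(spark, trips):
    """Same computation with the RDD API (requires pyspark)."""
    rdd = spark.sparkContext.parallelize(trips)
    return dict(
        rdd.map(to_hour_pair)
        .reduceByKey(merge)
        .mapValues(lambda p: p[0] / p[1])
        .collect()
    )


def avg_tip_by_hour_df(spark, trips):
    """Same computation with the DataFrame API (requires pyspark)."""
    from pyspark.sql import functions as F  # imported lazily on purpose

    df = spark.createDataFrame(trips, ["pickup_ts", "tip"])
    rows = (
        df.groupBy(F.hour(F.to_timestamp("pickup_ts")).alias("hour"))
        .agg(F.avg("tip").alias("avg_tip"))
        .collect()
    )
    return {row["hour"]: row["avg_tip"] for row in rows}
```

With a session created by `SparkSession.builder.getOrCreate()`, both Spark variants should return the same mapping as `avg_tip_by_hour_local(SAMPLE_TRIPS)`. The DataFrame version leaves the aggregation plan to Spark's optimizer, while the RDD version spells out the map and reduce steps by hand.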

Technologies Used:

  • Apache Spark
  • HDFS
  • Python

Course Information:

  • Course: Προχωρημένα Θέματα Βάσεων Δεδομένων (Advanced Topics in Databases)
  • Institution: National Technical University of Athens
