
NFL Data Pipeline

Python, GCP, Mage, R, BigQuery

Project Overview

This project builds a data engineering pipeline for predicting weekly NFL fantasy points for each player. The pipeline starts by pulling NFL data with the nfl-data-py library. Mage orchestrates the workflow, automating the extract, transform, and load (ETL) steps. The raw data is stored in Google Cloud Platform (GCP), which serves as the project's data lake. From there, the data is processed and loaded into BigQuery, which serves as the data warehouse. Finally, R queries the data in BigQuery and applies machine learning models to predict each player's weekly fantasy points. Together, these technologies cover data management, storage, and analysis, and the resulting predictions can help fantasy football players make informed decisions.
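As a rough illustration of the extract, transform, and load flow described above, here is a minimal Python sketch. It is not the project's actual code: the function names, the selected columns (player_id, season, week, fantasy_points), and the in-memory list standing in for the GCP and BigQuery destinations are all assumptions made for the example. In Mage, each step would typically live in its own loader, transformer, or exporter block.

```python
from typing import Iterable


def extract_weekly(years: Iterable[int]) -> list[dict]:
    """Extract: download weekly player stats with nfl-data-py (needs network)."""
    # Imported lazily so the rest of the sketch runs without the package.
    import nfl_data_py as nfl

    return nfl.import_weekly_data(list(years)).to_dict("records")


def transform(rows: list[dict]) -> list[dict]:
    """Transform: keep a few fields (assumed column names) and drop rows
    that have no player id."""
    keep = ("player_id", "season", "week", "fantasy_points")
    return [{k: r.get(k) for k in keep} for r in rows if r.get("player_id")]


def load(rows: list[dict], sink: list[dict]) -> int:
    """Load: append rows to a sink. A plain list stands in for the
    data lake / warehouse here; returns the number of rows written."""
    sink.extend(rows)
    return len(rows)


def run_pipeline(rows: list[dict], sink: list[dict]) -> int:
    """Chain transform and load over already-extracted rows."""
    return load(transform(rows), sink)
```

To run it end to end, pass the output of extract_weekly([2023]) to run_pipeline along with an empty list. Keeping the three steps as separate functions mirrors how an orchestrator schedules and retries them independently.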

Check out my blog post, which goes into more detail on how I created this amazing project!

Technologies

Python

GCP

Mage

R

BigQuery