Accessible Activity Data Platform

This was my Computer Science Capstone Project at The University of Sydney, completed during my exchange year. It was meant to be a seven-person group project but, as often happens with these projects, two of us ended up doing all the work. It was a bit stressful, but I learnt a lot.

About the project:

I designed and built a microservice-based web app that automates the discovery and management of accessible physical activity data for a real client. I led the project end-to-end, acting as project manager, main developer, and customer liaison, which included organizing weekly meetings.

The app replaces the Excel spreadsheet process the client used previously with a scalable backend, a clean web interface, and an AI-powered scraping pipeline. The pipeline automatically finds relevant activity providers online and turns unstructured web content into structured data using independent service modules built with Python, Flask, and SQLAlchemy.

The system integrates PostgreSQL for persistent storage, Pandas for data exploration and cleaning, and several external APIs, including the Google Maps, Google Geocoding, and Google Custom Search APIs. Dynamic content scraping is handled with Playwright and BeautifulSoup, while the Gemini API is used to convert raw text into structured, schema-compliant candidate records.
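To illustrate what "schema-compliant candidate records" can mean in practice, here is a minimal sketch of validating a model's JSON output before it is accepted as a candidate. It uses only the standard library; the field names, the `CandidateRecord` class, and the `parse_candidate` helper are hypothetical and are not taken from the real project, which cannot be shared.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Hypothetical schema: the real field names are not part of this write-up.
REQUIRED_FIELDS = {"name": str, "address": str, "activity_type": str}
OPTIONAL_FIELDS = {"website": str, "wheelchair_accessible": bool}


@dataclass
class CandidateRecord:
    name: str
    address: str
    activity_type: str
    website: Optional[str] = None
    wheelchair_accessible: Optional[bool] = None


def parse_candidate(raw_json: str) -> CandidateRecord:
    """Parse model output and reject anything that does not match the schema."""
    data = json.loads(raw_json)  # raises ValueError on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")

    cleaned = {}
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected_type):
            raise ValueError(f"missing or invalid field: {name}")
        cleaned[name] = data[name]

    for name, expected_type in OPTIONAL_FIELDS.items():
        value = data.get(name)
        if value is None:
            continue  # optional fields may be absent or null
        if not isinstance(value, expected_type):
            raise ValueError(f"invalid type for field: {name}")
        cleaned[name] = value

    # Keys outside the schema are silently dropped.
    return CandidateRecord(**cleaned)
```

Calling `parse_candidate` on a well-formed response returns a typed record, while malformed JSON or a missing required field raises `ValueError`, so bad model output never reaches storage.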

A candidate workflow protects data quality by checking new entries before they are added to the main dataset. The frontend is built with HTML, Jinja templates, JavaScript, and TailwindCSS.
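The candidate workflow can be pictured roughly as follows. This is an in-memory sketch under stated assumptions, not the project's actual code: the `Status` values, the `ActivityStore` class, the duplicate check on name and address, and the explicit approve/reject step are all hypothetical, and the real system stores its data in PostgreSQL rather than Python lists.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Candidate:
    name: str
    address: str
    status: Status = Status.PENDING

    def key(self) -> Tuple[str, str]:
        # Normalised key used for the (assumed) duplicate check.
        return (self.name.strip().lower(), self.address.strip().lower())


@dataclass
class ActivityStore:
    candidates: List[Candidate] = field(default_factory=list)
    main_dataset: List[Candidate] = field(default_factory=list)

    def submit(self, name: str, address: str) -> Candidate:
        """Queue a new entry as a pending candidate."""
        candidate = Candidate(name, address)
        known = {c.key() for c in self.main_dataset}
        known |= {c.key() for c in self.candidates if c.status is Status.PENDING}
        if candidate.key() in known:
            raise ValueError("duplicate entry")
        self.candidates.append(candidate)
        return candidate

    def review(self, candidate: Candidate, approve: bool) -> None:
        """Approve or reject a pending candidate; only approved ones are promoted."""
        if candidate.status is not Status.PENDING:
            raise ValueError("candidate already reviewed")
        candidate.status = Status.APPROVED if approve else Status.REJECTED
        if approve:
            self.main_dataset.append(candidate)
```

The point of the design is that nothing reaches `main_dataset` directly: every entry passes through `submit` and an explicit `review` decision first.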

This project demonstrates my ability to lead technical projects, design production-ready systems, and deliver AI-driven automation across the full stack.

The project:

Unfortunately, because this project was done for a university course and had real clients involved, it cannot be shared freely. However, I can show some video demonstrations of how it worked: