This project delivers an end-to-end automation flow for property listing data from Properti123.
Running on an Ubuntu VPS, Python scripts collect listing URLs, scrape detailed property attributes,
normalize address fields, and store everything in PostgreSQL tables. The same dataset is then exposed
through Flask API endpoints for downstream analysis or visualization.
The implementation focuses on practical reliability: HTTP retries with exponential backoff,
status flags in property_link.available_data, batch-style processing scripts, fuzzy address matching
with TheFuzz, and notifications through Telegram/LINE messaging bots. The codebase demonstrates applied
backend and data engineering skills using Python, SQL, and Linux scheduling.
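As an illustration of the retry behavior mentioned above, here is a minimal sketch of HTTP retry with exponential backoff using only the standard library. The function names (backoff_delays, call_with_retry, fetch_html), the default delays, and the choice of urllib are assumptions made for this example and are not taken from the project's scripts.

```python
import random
import time
import urllib.error
import urllib.request


def backoff_delays(retries, base_delay=1.0, factor=2.0, max_delay=60.0):
    """Wait time before each retry: base, base*factor, ... capped at max_delay."""
    return [min(base_delay * factor ** attempt, max_delay) for attempt in range(retries)]


def call_with_retry(func, retries=4, base_delay=1.0, jitter=0.0,
                    retry_on=(Exception,), sleep=time.sleep):
    """Call func(); on a retryable error, wait with a growing delay and try again.

    Hypothetical helper for illustration. `sleep` is injectable so the
    behavior can be checked without actually waiting.
    """
    delays = backoff_delays(retries, base_delay)
    for attempt in range(retries + 1):
        try:
            return func()
        except retry_on:
            if attempt == retries:
                raise  # out of retries: surface the last error
            # Optional random jitter spreads out retries from parallel workers.
            sleep(delays[attempt] + random.uniform(0, jitter))


def fetch_html(url, timeout=10):
    """Fetch a page body as text, retrying on network-level errors.

    Note: HTTPError is a subclass of URLError, so HTTP error statuses
    (including 404) are retried too in this simplified sketch.
    """
    def _get():
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8", errors="replace")

    return call_with_retry(_get, retry_on=(urllib.error.URLError, TimeoutError))
```

With the defaults above, a request that keeps failing waits roughly 1, 2, 4, and 8 seconds between attempts before the last error is raised. A production scraper would likely skip retries for permanent errors such as 404.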
The schematic diagram below is provided for reference:
Key Features
- Automated web scraping
- URL discovery and deduplication into property_link
- Detailed listing extraction into property_detil
- Address normalization into property_address
- Telegram/LINE notification for filtered listings
- Flask API endpoints for data consumption
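
The URL deduplication and status-flag features listed above could be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual schema: sqlite3 stands in for PostgreSQL so the example runs anywhere, the url column and the integer meaning of available_data (0 = not yet scraped, 1 = scraped) are guesses, and the helper names are hypothetical.

```python
import sqlite3

# Assumed, simplified schema. The real property_link table may differ.
SCHEMA = """
CREATE TABLE property_link (
    url TEXT PRIMARY KEY,
    available_data INTEGER NOT NULL DEFAULT 0
)
"""


def _count(conn):
    return conn.execute("SELECT COUNT(*) FROM property_link").fetchone()[0]


def add_links(conn, urls):
    """Insert discovered URLs, skipping ones already stored; return how many were new."""
    before = _count(conn)
    # The PRIMARY KEY on url does the deduplication. In PostgreSQL the
    # equivalent statement would end with: ON CONFLICT (url) DO NOTHING
    conn.executemany(
        "INSERT OR IGNORE INTO property_link (url) VALUES (?)",
        [(u,) for u in urls],
    )
    conn.commit()
    return _count(conn) - before


def pending_links(conn, limit=100):
    """Return one batch of URLs whose details have not been scraped yet."""
    rows = conn.execute(
        "SELECT url FROM property_link WHERE available_data = 0 "
        "ORDER BY url LIMIT ?",
        (limit,),
    )
    return [row[0] for row in rows]


def mark_scraped(conn, url):
    """Flip the status flag once the detail scraper has processed a URL."""
    conn.execute("UPDATE property_link SET available_data = 1 WHERE url = ?", (url,))
    conn.commit()
```

A batch script would create a connection, call add_links with freshly discovered URLs, loop over pending_links, and call mark_scraped after each successful detail scrape, so an interrupted run can resume where it stopped.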