feat(backend): Add cable graph service and data collectors

## Changelog

### New Features

#### Cable Graph Service
- Add cable_graph.py, a service that finds the shortest path between cable landing points
- Implement haversine great-circle distance calculation
- Handle cables crossing the antimeridian (dateline) via longitude normalization
- Build a NetworkX graph for optimal path finding
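The ideas listed above can be sketched roughly as follows, assuming NetworkX is available. The landing points, node names, and helpers (`haversine_km`, `normalize_lon`) are illustrative stand-ins, not the actual `cable_graph.py` API:

```python
import math

import networkx as nx


def normalize_lon(lon: float) -> float:
    """Wrap a longitude difference into [-180, 180) so dateline crossings behave."""
    return ((lon + 180.0) % 360.0) - 180.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius, km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(normalize_lon(lon2 - lon1))
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


# Illustrative landing points (lat, lon) and cable segments joining them.
points = {"SG": (1.29, 103.85), "HK": (22.28, 114.16), "TK": (35.68, 139.69)}
g = nx.Graph()
for a, b in [("SG", "HK"), ("HK", "TK"), ("SG", "TK")]:
    g.add_edge(a, b, weight=haversine_km(*points[a], *points[b]))

# Shortest path by summed great-circle length.
path = nx.shortest_path(g, "SG", "TK", weight="weight")
```

Because great-circle distances satisfy the triangle inequality, the direct SG–TK edge here beats any detour via HK.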

#### Data Collectors
- Add ArcGISCableCollector for fetching submarine cable data from an ArcGIS GeoJSON API
- Add FAOLandingPointCollector for fetching landing point data from the FAO CSV API
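A minimal sketch of the GeoJSON-to-record step such a collector's `transform()` might perform. The feature properties (`id`, `name`) and the output shape are assumptions for illustration, not the actual ArcGIS schema:

```python
# A small ArcGIS-style GeoJSON FeatureCollection, inlined so the sketch
# needs no network access.
geojson = {
    "type": "FeatureCollection",
    "features": [
        {
            "type": "Feature",
            "properties": {"id": "c1", "name": "Example Cable"},
            "geometry": {
                "type": "LineString",
                "coordinates": [[103.85, 1.29], [114.16, 22.28]],
            },
        }
    ],
}


def transform(raw: dict) -> list[dict]:
    """Flatten GeoJSON features into records keyed the way _save_data expects."""
    records = []
    for feat in raw.get("features", []):
        props = feat.get("properties", {})
        records.append(
            {
                "source_id": props.get("id"),
                "name": props.get("name"),
                "geometry": feat.get("geometry"),
            }
        )
    return records


records = transform(geojson)
```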

### Backend Changes

#### API Updates
- auth.py: Update authentication logic
- datasources.py: Add datasource endpoints and management
- visualization.py: Add visualization API endpoints
- config.py: Update configuration settings
- security.py: Improve security settings

#### Models & Schemas
- task.py: Update task model with new fields
- token.py: Update token schema

#### Services
- collectors/base.py: Improve base collector with better error handling
- collectors/__init__.py: Register new collectors
- scheduler.py: Update scheduler logic
- tasks/scheduler.py: Add task scheduling

### Frontend Changes
- AppLayout.tsx: Improve layout component
- index.css: Add global styles
- DataSources.tsx: Enhance data sources management page
- vite.config.ts: Add Vite configuration for earth module
---

Author: rayd1o
Date: 2026-03-11 16:38:49 +08:00
Parent: 6cb4398f3a
Commit: aaae6a53c3

18 changed files with 990 additions and 146 deletions

`collectors/base.py` (excerpt):

```diff
@@ -17,7 +17,20 @@ class BaseCollector(ABC):
     priority: str = "P1"
     module: str = "L1"
     frequency_hours: int = 4
-    data_type: str = "generic"  # Override in subclass: "supercomputer", "model", "dataset", etc.
+    data_type: str = "generic"
+
+    def __init__(self):
+        self._current_task = None
+        self._db_session = None
+        self._datasource_id = 1
+
+    def update_progress(self, records_processed: int):
+        """Update task progress - call this during data processing"""
+        if self._current_task and self._db_session and self._current_task.total_records > 0:
+            self._current_task.records_processed = records_processed
+            self._current_task.progress = (
+                records_processed / self._current_task.total_records
+            ) * 100
+
     @abstractmethod
     async def fetch(self) -> List[Dict[str, Any]]:
@@ -35,13 +48,11 @@ class BaseCollector(ABC):
         from app.models.collected_data import CollectedData
 
         start_time = datetime.utcnow()
-        datasource_id = getattr(self, "_datasource_id", 1)  # Default to 1 for built-in collectors
+        datasource_id = getattr(self, "_datasource_id", 1)
 
-        # Check if collector is active
         if not collector_registry.is_active(self.name):
             return {"status": "skipped", "reason": "Collector is disabled"}
 
-        # Log task start
         task = CollectionTask(
             datasource_id=datasource_id,
             status="running",
@@ -51,16 +62,21 @@ class BaseCollector(ABC):
         await db.commit()
         task_id = task.id
+        self._current_task = task
+        self._db_session = db
 
         try:
             raw_data = await self.fetch()
+            task.total_records = len(raw_data)
+            await db.commit()
 
             data = self.transform(raw_data)
 
             # Save data to database
             records_count = await self._save_data(db, data)
 
             # Log task success
             task.status = "success"
             task.records_processed = records_count
+            task.progress = 100.0
             task.completed_at = datetime.utcnow()
             await db.commit()
@@ -94,8 +110,7 @@ class BaseCollector(ABC):
         collected_at = datetime.utcnow()
         records_added = 0
 
-        for item in data:
-            # Create CollectedData entry
+        for i, item in enumerate(data):
             record = CollectedData(
                 source=self.name,
                 source_id=item.get("source_id") or item.get("id"),
@@ -125,7 +140,12 @@ class BaseCollector(ABC):
             db.add(record)
             records_added += 1
 
+            if i % 100 == 0:
+                self.update_progress(i + 1)
+                await db.commit()
+
         await db.commit()
+        self.update_progress(len(data))
         return records_added
 
     async def save(self, db: AsyncSession, data: List[Dict[str, Any]]) -> int:
```
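The progress-batching pattern in `_save_data` (update and commit every 100 records, plus a final update after the loop) can be mimicked standalone. `Progress` here is a stand-in for the `CollectionTask` row, not the project's class:

```python
class Progress:
    """Stand-in for a CollectionTask row tracking collection progress."""

    def __init__(self, total_records: int):
        self.total_records = total_records
        self.records_processed = 0
        self.progress = 0.0

    def update(self, records_processed: int) -> None:
        self.records_processed = records_processed
        if self.total_records > 0:
            self.progress = records_processed / self.total_records * 100


task = Progress(total_records=250)
commits = 0
for i in range(250):
    # Mirror the diff: report progress (and commit) every 100 records.
    if i % 100 == 0:
        task.update(i + 1)
        commits += 1  # stands in for `await db.commit()`
task.update(250)  # final update once the whole batch is saved
```

Batching the commits keeps the task row reasonably fresh for the UI without paying a database round-trip per record.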