# WebSocket Protocol Specification

## Connection

```
ws://localhost:8000/ws?token=<access_token>
```

**Connection Steps:**

1. Client connects to the WebSocket endpoint
2. Server validates the JWT
3. Server sends a `connection_established` message
4. Client sends a `subscription` message with `action: "subscribe"` (optional)
5. Server begins sending data frames

**Connection Limits:**

- Maximum concurrent connections per user: 3
- Connection timeout (no activity): 5 minutes
- Heartbeat interval: 30 seconds

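
The endpoint format above can be assembled programmatically. The following is a minimal sketch, not the project's client code: `build_ws_url` is a hypothetical helper, and percent-encoding the token is an assumption about how the server parses the query string.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_ws_url(base_url: str, access_token: str) -> str:
    """Return base_url with the access token attached as the `token` query parameter."""
    if not access_token:
        raise ValueError("access_token must be non-empty")
    parts = urlsplit(base_url)
    if parts.scheme not in ("ws", "wss"):
        raise ValueError(f"expected a ws:// or wss:// URL, got {parts.scheme!r}")
    # urlencode percent-encodes characters that are unsafe in a query string.
    query = urlencode({"token": access_token})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(build_ws_url("ws://localhost:8000/ws", "<access_token>"))
```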
---

## Message Format

### General Structure

```json
{
  "type": "message_type",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": { ... }
}
```

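
The envelope above can be built and checked with a few lines of Python. This is a sketch under stated assumptions: `make_message`, `parse_message`, and `utc_timestamp` are hypothetical names, and treating all three top-level fields as required is inferred from the examples rather than stated by the specification.

```python
import json
from datetime import datetime, timezone
from typing import Optional

REQUIRED_FIELDS = ("type", "timestamp", "data")


def utc_timestamp(now: Optional[datetime] = None) -> str:
    """Format a timezone-aware datetime like 2024-01-20T10:30:00.000Z."""
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    return moment.isoformat(timespec="milliseconds").replace("+00:00", "Z")


def make_message(msg_type: str, data: dict, now: Optional[datetime] = None) -> str:
    """Serialize a message in the general envelope structure."""
    envelope = {"type": msg_type, "timestamp": utc_timestamp(now), "data": data}
    return json.dumps(envelope)


def parse_message(raw: str) -> dict:
    """Decode a message and reject envelopes with missing top-level fields."""
    message = json.loads(raw)
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")
    missing = [field for field in REQUIRED_FIELDS if field not in message]
    if missing:
        raise ValueError(f"missing fields: {', '.join(missing)}")
    return message
```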
### All Message Types

| Type | Direction | Description |
|------|-----------|-------------|
| connection_established | Server → Client | Initial connection confirmation |
| heartbeat | Bidirectional | Keep-alive ping/pong |
| data_frame | Server → Client | Main data payload |
| control_frame | Client → Server | Camera/display control |
| alert_notification | Server → Client | Real-time alert |
| error | Bidirectional | Error reporting |
| sync_request | Client → Server | Request full sync |
| subscription | Client → Server | Subscribe/unsubscribe channels |
| subscription_confirmed | Server → Client | Confirmation of a subscribe/unsubscribe request |

---

## Connection Established

**Server → Client**

```json
{
  "type": "connection_established",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "connection_id": "conn_a1b2c3d4",
    "server_version": "1.0.0",
    "session_id": "sess_xyz789",
    "heartbeat_interval": 30,
    "supported_channels": ["gpu_clusters", "submarine_cables", "ixp_nodes", "alerts"]
  }
}
```

---

## Heartbeat

### Client → Server (Ping)
```json
{
  "type": "heartbeat",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "ping"
  }
}
```

### Server → Client (Pong)
```json
{
  "type": "heartbeat",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "pong",
    "latency_ms": 45
  }
}
```

**Client Behavior:**
- Send a ping every 30 seconds
- If no pong is received within 10 seconds, reconnect
- Track latency for monitoring

**Server Behavior:**
- Send a pong immediately on receiving a ping
- Track connection health

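
The client behavior above (ping every 30 seconds, reconnect after 10 seconds without a pong, track latency) can be sketched as a small state tracker. `HeartbeatMonitor` is a hypothetical helper, not part of the specification. Times are passed in as seconds so the logic stays independent of any particular WebSocket library, and the latency it records is the client-side round trip, which may differ from the server-reported `latency_ms`.

```python
from typing import Optional


class HeartbeatMonitor:
    """Tracks ping/pong timing for one connection (illustrative only)."""

    def __init__(self, interval_s: float = 30.0, pong_timeout_s: float = 10.0) -> None:
        self.interval_s = interval_s
        self.pong_timeout_s = pong_timeout_s
        self.last_ping_at: Optional[float] = None
        self.awaiting_pong = False
        self.last_latency_ms: Optional[float] = None

    def should_ping(self, now: float) -> bool:
        """True when no ping is outstanding and the interval has elapsed."""
        if self.awaiting_pong:
            return False
        return self.last_ping_at is None or now - self.last_ping_at >= self.interval_s

    def on_ping_sent(self, now: float) -> None:
        self.last_ping_at = now
        self.awaiting_pong = True

    def on_pong(self, now: float) -> None:
        """Record the round-trip time of the outstanding ping."""
        if self.awaiting_pong and self.last_ping_at is not None:
            self.last_latency_ms = (now - self.last_ping_at) * 1000.0
            self.awaiting_pong = False

    def should_reconnect(self, now: float) -> bool:
        """True when the outstanding ping has gone unanswered past the timeout."""
        return (
            self.awaiting_pong
            and self.last_ping_at is not None
            and now - self.last_ping_at > self.pong_timeout_s
        )
```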
---

## Data Frame (Main Payload)

### Full Update

**Server → Client**

```json
{
  "type": "data_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "update_type": "full",
    "sequence": 12345,
    "payload": {
      "meta": {
        "generated_at": "2024-01-20T10:30:00Z",
        "data_sources": 9,
        "total_records": 20800
      },
      "gpu_clusters": {
        "total": 1500,
        "last_updated": "2024-01-20T10:00:00Z",
        "data": [
          {
            "id": "epoch-gpu-001",
            "name": "Frontier",
            "country": "US",
            "city": "Oak Ridge, TN",
            "lat": 35.9327,
            "lng": -84.3107,
            "gpu_count": 37888,
            "gpu_type": "AMD MI250X",
            "total_flops": 1.54e9,
            "rank": 1,
            "visual": {
              "size": 1.0,
              "color": "#FF6B6B",
              "pulse": true
            }
          }
        ]
      },
      "submarine_cables": {
        "total": 436,
        "last_updated": "2024-01-20T09:00:00Z",
        "data": [
          {
            "id": "cable-001",
            "name": "FASTER",
            "length_km": 11600,
            "capacity_tbps": 60,
            "status": "active",
            "landing_points": [
              {"lat": 37.7749, "lng": -122.4194},
              {"lat": 35.6762, "lng": 139.6503}
            ],
            "visual": {
              "width": 2.0,
              "color": "#4ECDC4",
              "animated": true
            }
          }
        ]
      },
      "ixp_nodes": {
        "total": 1200,
        "last_updated": "2024-01-20T09:30:00Z",
        "data": [
          {
            "id": "ixp-001",
            "name": "Equinix Ashburn",
            "country": "US",
            "city": "Ashburn, VA",
            "lat": 39.0438,
            "lng": -77.4874,
            "member_count": 250,
            "traffic_tbps": 15.5,
            "visual": {
              "size": 0.8,
              "color": "#45B7D1"
            }
          }
        ]
      },
      "cloud_infra": {
        "total": 500,
        "last_updated": "2024-01-20T08:00:00Z",
        "data": [
          {
            "provider": "AWS",
            "region": "us-east-1",
            "data_center_count": 15,
            "capacity_mw": 500,
            "lat": 39.0438,
            "lng": -77.4874,
            "visual": {
              "size": 1.2,
              "color": "#FF9900"
            }
          }
        ]
      }
    }
  }
}
```

### Incremental Update

**Server → Client**

```json
{
  "type": "data_frame",
  "timestamp": "2024-01-20T10:35:00.000Z",
  "data": {
    "update_type": "incremental",
    "sequence": 12346,
    "base_sequence": 12345,
    "changes": {
      "gpu_clusters": {
        "updated": [
          {
            "id": "epoch-gpu-002",
            "rank": 2,
            "gpu_count": 40000
          }
        ],
        "added": [],
        "removed": []
      },
      "alerts": {
        "new": [
          {
            "id": 1234,
            "severity": "warning",
            "message": "API response time > 30s",
            "source": "Epoch AI"
          }
        ],
        "resolved": [1230]
      }
    }
  }
}
```

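
One way a client might apply an incremental frame to its local cache is sketched below. Several points are assumptions rather than documented behavior: that `base_sequence` must equal the last sequence the client applied, that a mismatch should trigger a `sync_request`, that `updated` entries are partial objects merged by `id`, and that `removed` holds ids in the same way `resolved` does. `apply_incremental` is a hypothetical name.

```python
from typing import Dict, Optional


def apply_incremental(state: Dict[str, dict], last_sequence: int, frame_data: dict) -> Optional[int]:
    """Merge an incremental data_frame into state (channel -> {id: entity}).

    Returns the new sequence number, or None when the frame does not build on
    last_sequence and the caller should request a full sync instead.
    """
    if frame_data.get("update_type") != "incremental":
        raise ValueError("expected an incremental data_frame")
    if frame_data["base_sequence"] != last_sequence:
        return None  # gap detected; state is left untouched

    for channel, change in frame_data["changes"].items():
        entities = state.setdefault(channel, {})
        # New entities: "added" for data channels, "new" for alerts.
        for item in change.get("added", []) + change.get("new", []):
            entities[item["id"]] = dict(item)
        # Partial updates overwrite only the fields they carry.
        for item in change.get("updated", []):
            entities.setdefault(item["id"], {}).update(item)
        # Deletions: "removed" for data channels, "resolved" for alerts.
        for entity_id in change.get("removed", []) + change.get("resolved", []):
            entities.pop(entity_id, None)
    return frame_data["sequence"]
```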
---

## Control Frame (Client → Server)

### Camera Position
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "camera_set",
    "camera": {
      "position": {
        "latitude": 35.6762,
        "longitude": 139.6503,
        "altitude": 5000000
      },
      "target": {
        "latitude": 35.6762,
        "longitude": 139.6503,
        "altitude": 0
      },
      "rotation": {
        "pitch": -45,
        "yaw": 0,
        "roll": 0
      }
    }
  }
}
```

### Camera Animation
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "camera_animate",
    "animation": {
      "type": "fly_to",
      "target": {
        "latitude": 39.0438,
        "longitude": -77.4874,
        "altitude": 3000000
      },
      "duration_seconds": 3.0,
      "easing": "ease_in_out"
    }
  }
}
```

### Auto-Cruise Control
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "cruise_control",
    "enabled": true,
    "config": {
      "speed": 1.0,
      "route": "global",
      "pause_on_interaction": true
    }
  }
}
```

### Layer Visibility
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "layer_visibility",
    "layers": {
      "gpu_clusters": true,
      "submarine_cables": true,
      "ixp_nodes": true,
      "cloud_infra": false,
      "satellites": false,
      "alerts": true
    }
  }
}
```

### Focus Request
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "focus_entity",
    "entity_type": "gpu_cluster",
    "entity_id": "epoch-gpu-001",
    "show_info": true
  }
}
```

### Time Range Filter
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "set_time_range",
    "time_range": {
      "start": "2024-01-01T00:00:00Z",
      "end": "2024-01-20T23:59:59Z",
      "aggregation": "hourly"
    }
  }
}
```

---

## Alert Notification

**Server → Client**

```json
{
  "type": "alert_notification",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "alert": {
      "id": 1234,
      "severity": "critical",
      "title": "Data Collection Failed",
      "message": "TOP500 data source failed to collect data",
      "source": "TOP500",
      "timestamp": "2024-01-20T10:25:00Z",
      "actions": ["acknowledge", "retry", "view_details"]
    },
    "badge_update": {
      "critical": 2,
      "warning": 5,
      "info": 10
    }
  }
}
```

---

## Sync Request

**Client → Server**

```json
{
  "type": "sync_request",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "request_type": "full",
    "channels": ["gpu_clusters", "submarine_cables", "ixp_nodes"]
  }
}
```

**Server Response:**
The server replies with a `data_frame` whose `update_type` is `"full"`.

---

## Subscription Management

### Subscribe
```json
{
  "type": "subscription",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "subscribe",
    "channels": ["gpu_clusters", "alerts"]
  }
}
```

### Unsubscribe
```json
{
  "type": "subscription",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "unsubscribe",
    "channels": ["alerts"]
  }
}
```

**Server Response:**
```json
{
  "type": "subscription_confirmed",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "subscribe",
    "channels": ["gpu_clusters", "alerts"],
    "active_subscriptions": ["gpu_clusters", "alerts"]
  }
}
```

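
A server-side handler for these messages might look roughly like the sketch below. It is illustrative only: `handle_subscription` is a hypothetical function, the channel list is copied from the `connection_established` example, returning `CHANNEL_NOT_FOUND` for unknown channels and `INVALID_FRAME` for unknown actions is an assumed mapping onto the documented error codes, and sorting `active_subscriptions` is a choice made here for deterministic output.

```python
from typing import Set

SUPPORTED_CHANNELS = ("gpu_clusters", "submarine_cables", "ixp_nodes", "alerts")


def _error(code: str, message: str, timestamp: str) -> dict:
    return {"type": "error", "timestamp": timestamp, "data": {"code": code, "message": message}}


def handle_subscription(active: Set[str], data: dict, timestamp: str) -> dict:
    """Apply a subscribe/unsubscribe request to `active` and build the reply."""
    action = data.get("action")
    channels = list(data.get("channels", []))
    if action not in ("subscribe", "unsubscribe"):
        return _error("INVALID_FRAME", f"Unknown subscription action: {action}", timestamp)
    unknown = [name for name in channels if name not in SUPPORTED_CHANNELS]
    if unknown:
        return _error("CHANNEL_NOT_FOUND", f"Invalid channel name: {unknown[0]}", timestamp)

    if action == "subscribe":
        active.update(channels)
    else:
        active.difference_update(channels)
    return {
        "type": "subscription_confirmed",
        "timestamp": timestamp,
        "data": {
            "action": action,
            "channels": channels,
            "active_subscriptions": sorted(active),
        },
    }
```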
---

## Error Messages

### Connection Error
```json
{
  "type": "error",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "code": "INVALID_TOKEN",
    "message": "Invalid or expired authentication token",
    "action": "reconnect_with_fresh_token"
  }
}
```

### Rate Limit Error
```json
{
  "type": "error",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "code": "RATE_LIMITED",
    "message": "Too many requests",
    "retry_after_seconds": 30
  }
}
```

### Data Error
```json
{
  "type": "error",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "code": "DATA_FETCH_FAILED",
    "message": "Failed to fetch data from source: Epoch AI",
    "source": "Epoch AI",
    "will_retry": true,
    "retry_in_seconds": 60
  }
}
```

### Validation Error
```json
{
  "type": "error",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "code": "INVALID_CONTROL_FRAME",
    "message": "Invalid camera position",
    "details": {
      "field": "camera.position.altitude",
      "constraint": "Must be positive"
    }
  }
}
```

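
The validation error above suggests the server checks `camera_set` frames field by field. A minimal sketch of such a check follows. Only the "altitude must be positive" rule comes from the example. The latitude and longitude ranges are assumptions based on ordinary geographic bounds, and `validate_camera_position` is a hypothetical name. The function returns the `data` object of an `error` message, or `None` when the position is acceptable.

```python
from typing import Optional


def _is_number(value: object) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_camera_position(position: dict) -> Optional[dict]:
    """Check camera.position and return error data for the first violation."""

    def error(field: str, constraint: str) -> dict:
        return {
            "code": "INVALID_CONTROL_FRAME",
            "message": "Invalid camera position",
            "details": {"field": f"camera.position.{field}", "constraint": constraint},
        }

    # Assumed ranges: standard geographic bounds in degrees.
    for field, low, high in (("latitude", -90.0, 90.0), ("longitude", -180.0, 180.0)):
        value = position.get(field)
        if not _is_number(value) or not low <= value <= high:
            return error(field, f"Must be between {low:g} and {high:g}")

    altitude = position.get("altitude")
    if not _is_number(altitude) or altitude <= 0:
        return error("altitude", "Must be positive")
    return None
```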
---

## Error Codes Reference

| Code | HTTP Equivalent | Description |
|------|-----------------|-------------|
| INVALID_TOKEN | 401 | JWT validation failed |
| TOKEN_EXPIRED | 401 | Token has expired |
| RATE_LIMITED | 429 | Too many requests |
| CHANNEL_NOT_FOUND | 404 | Invalid channel name |
| INVALID_FRAME | 400 | Malformed JSON or structure |
| INVALID_CONTROL_FRAME | 400 | Control action validation failed |
| DATA_FETCH_FAILED | 500 | Backend data collection failed |
| INTERNAL_ERROR | 500 | Server internal error |

---

## Connection State Machine

```
DISCONNECTED
│
├─→ CONNECTING (token validation)
│
├─→ AUTHENTICATED ──→ ESTABLISHED
│                           │
├─→ ERROR (reconnect)       ├─→ RECEIVING DATA
│                           │
└───────────────────────────┴─→ DISCONNECTING
                                │
                                └─→ DISCONNECTED
```

---

## Reconnection Strategy

1. **Initial Retry:** On disconnect, retry after 1 second
2. **Exponential Backoff:** If that attempt fails, wait 2, 4, 8, then 16 seconds before each subsequent attempt
3. **Max Retries:** Give up after 5 failed attempts
4. **Token Refresh:** If the token has expired, refresh it before reconnecting

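
The schedule above (1, 2, 4, 8, 16 seconds, five attempts in total) can be sketched as follows. The function and exception names are hypothetical. The `connect` and `refresh_token` callables stand in for whatever the real client uses, and signalling an expired token through a dedicated exception is an assumption made for this example.

```python
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class TokenExpiredError(Exception):
    """Hypothetical signal that the server rejected an expired token."""


def reconnect_delays(max_retries: int = 5, first_delay_s: float = 1.0) -> List[float]:
    """Delay before each attempt: 1 s first, then doubling."""
    return [first_delay_s * 2 ** attempt for attempt in range(max_retries)]


def reconnect(
    connect: Callable[[], T],
    refresh_token: Callable[[], None],
    sleep: Callable[[float], None] = time.sleep,
    max_retries: int = 5,
) -> T:
    """Retry `connect` on the backoff schedule and return its result."""
    for delay in reconnect_delays(max_retries):
        sleep(delay)
        try:
            return connect()
        except TokenExpiredError:
            refresh_token()  # the next attempt runs with the fresh token
        except ConnectionError:
            pass  # fall through to the next, longer delay
    raise ConnectionError(f"gave up after {max_retries} attempts")
```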
---

## Data Flow Diagram

```
┌──────────┐                     ┌──────────┐
│   UE5    │◄───── WebSocket ───►│  Server  │
│  Client  │                     │          │
└────┬─────┘                     └────┬─────┘
     │                                │
     │  1. Connect (with JWT)         │
     ├───────────────────────────────►│
     │  2. Connection Established     │
     │◄───────────────────────────────┤
     │                                │
     │  3. Control Frame (Camera)     │
     ├───────────────────────────────►│
     │                                │
     │  4. Data Frame (Update)        │
     │◄───────────────────────────────┤
     │  5. Heartbeat (30s interval)   │
     │◄──────────────────────────────►│
     │                                │
     │  6. Alert Notification         │
     │◄───────────────────────────────┤
```

---

## Performance Considerations

| Metric | Target | Notes |
|--------|--------|-------|
| Data frame size | < 1 MB | Compressed if larger |
| Update latency | < 5 seconds | End-to-end |
| Heartbeat latency | < 100 ms | Server processing |
| Max connections | 1000 per server | With load balancing |

**Optimization Strategies:**
- Incremental updates for frequent changes
- Binary encoding for large datasets (MessagePack/Protocol Buffers)
- Compression for data frames (gzip)
- Chunking for large payloads

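
The compression and chunking strategies above can be illustrated with the standard library. This is a sketch under assumptions: the 1 MB target is read as 1,000,000 bytes, the function names are hypothetical, and the specification does not say how a receiver learns that a frame is compressed or chunked, so the `compressed` flag is simply returned to the caller here.

```python
import gzip
import json
from typing import List, Tuple

MAX_FRAME_BYTES = 1_000_000  # assumed reading of the "< 1 MB" target


def encode_frame(message: dict, threshold: int = MAX_FRAME_BYTES) -> Tuple[bytes, bool]:
    """Serialize a message, gzip-compressing it only when it exceeds the threshold."""
    raw = json.dumps(message, separators=(",", ":")).encode("utf-8")
    if len(raw) <= threshold:
        return raw, False
    return gzip.compress(raw), True


def decode_frame(payload: bytes, compressed: bool) -> dict:
    """Inverse of encode_frame."""
    raw = gzip.decompress(payload) if compressed else payload
    return json.loads(raw.decode("utf-8"))


def split_chunks(payload: bytes, size: int) -> List[bytes]:
    """Split a payload into pieces of at most `size` bytes, in order."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [payload[i:i + size] for i in range(0, len(payload), size)]
```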