# WebSocket Protocol Specification

## Connection

```
ws://localhost:8000/ws?token=<access_token>
```

**Connection Steps:**
1. Client connects to the WebSocket endpoint
2. Server validates the JWT token
3. Server sends a `connection_established` message
4. Client sends a `subscription` message with `"action": "subscribe"` (optional)
5. Server begins sending data frames

**Connection Limits:**
- Maximum concurrent connections per user: 3
- Connection timeout (no activity): 5 minutes
- Heartbeat interval: 30 seconds
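
The endpoint and limits above can be captured in a few lines of Python. This is a minimal sketch, not the server's actual code: `build_ws_url` and `can_open_connection` are hypothetical helper names, and percent-encoding the token is an assumption about how the query string should be formed.

```python
from urllib.parse import urlencode

# Limits quoted in the list above.
MAX_CONNECTIONS_PER_USER = 3
IDLE_TIMEOUT_SECONDS = 5 * 60
HEARTBEAT_INTERVAL_SECONDS = 30


def build_ws_url(token: str, host: str = "localhost:8000") -> str:
    """Return the endpoint URL with the access token as a query parameter."""
    # urlencode percent-escapes characters that are unsafe in a query string.
    return f"ws://{host}/ws?{urlencode({'token': token})}"


def can_open_connection(open_connections: int) -> bool:
    """Hypothetical server-side check against the per-user connection cap."""
    return open_connections < MAX_CONNECTIONS_PER_USER
```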

---

## Message Format

### General Structure

```json
{
  "type": "message_type",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": { ... }
}
```

### All Message Types

| Type | Direction | Description |
|------|-----------|-------------|
| connection_established | Server → Client | Initial connection confirmation |
| heartbeat | Bidirectional | Keep-alive ping/pong |
| data_frame | Server → Client | Main data payload |
| control_frame | Client → Server | Camera/display control |
| alert_notification | Server → Client | Real-time alert |
| error | Bidirectional | Error reporting |
| sync_request | Client → Server | Request full sync |
| subscription | Client → Server | Subscribe/unsubscribe channels |
| subscription_confirmed | Server → Client | Confirmation of a subscription change |
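
As an illustration of the envelope described above, the following sketch builds and parses messages with the three top-level fields. The function names `make_message` and `parse_message` are hypothetical, and rejecting unknown types is an assumption about how strict an implementation might be.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Types from the table above, plus `subscription_confirmed`, which this
# document uses as the server's reply in the Subscription Management section.
KNOWN_TYPES = {
    "connection_established", "heartbeat", "data_frame", "control_frame",
    "alert_notification", "error", "sync_request", "subscription",
    "subscription_confirmed",
}


def make_message(msg_type: str, data: dict, now: Optional[datetime] = None) -> str:
    """Serialize an envelope with a UTC timestamp at millisecond precision."""
    if msg_type not in KNOWN_TYPES:
        raise ValueError(f"unknown message type: {msg_type}")
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.isoformat(timespec="milliseconds").replace("+00:00", "Z")
    return json.dumps({"type": msg_type, "timestamp": stamp, "data": data})


def parse_message(raw: str) -> dict:
    """Decode an envelope and check that the three required fields exist."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("envelope must be a JSON object")
    missing = {"type", "timestamp", "data"} - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return msg
```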

---

## Connection Established

**Server → Client**

```json
{
  "type": "connection_established",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "connection_id": "conn_a1b2c3d4",
    "server_version": "1.0.0",
    "session_id": "sess_xyz789",
    "heartbeat_interval": 30,
    "supported_channels": ["gpu_clusters", "submarine_cables", "ixp_nodes", "alerts"]
  }
}
```

---

## Heartbeat

### Client → Server (Ping)
```json
{
  "type": "heartbeat",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "ping"
  }
}
```

### Server → Client (Pong)
```json
{
  "type": "heartbeat",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "pong",
    "latency_ms": 45
  }
}
```

**Client Behavior:**
- Send ping every 30 seconds
- If no pong received in 10 seconds, reconnect
- Track latency for monitoring

**Server Behavior:**
- Send pong immediately on receiving ping
- Track connection health
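
The client rules above amount to a small timer state machine. The sketch below is one way to express it, assuming a single outstanding ping at a time and a caller-supplied monotonic clock; `HeartbeatMonitor` and its method names are hypothetical. Latency here is measured on the client as round-trip time, which may differ from the `latency_ms` value the server reports.

```python
from dataclasses import dataclass, field
from typing import List, Optional

PING_INTERVAL_S = 30.0  # "Send ping every 30 seconds"
PONG_TIMEOUT_S = 10.0   # "If no pong received in 10 seconds, reconnect"


@dataclass
class HeartbeatMonitor:
    """Tracks one outstanding ping. Times are seconds from a monotonic clock."""

    last_ping_sent: Optional[float] = None
    awaiting_pong: bool = False
    latencies_ms: List[float] = field(default_factory=list)

    def should_ping(self, now: float) -> bool:
        if self.awaiting_pong:
            return False
        return self.last_ping_sent is None or now - self.last_ping_sent >= PING_INTERVAL_S

    def on_ping_sent(self, now: float) -> None:
        self.last_ping_sent = now
        self.awaiting_pong = True

    def on_pong(self, now: float) -> None:
        if not self.awaiting_pong:
            return  # unsolicited pong: nothing to measure
        self.awaiting_pong = False
        self.latencies_ms.append((now - self.last_ping_sent) * 1000.0)

    def should_reconnect(self, now: float) -> bool:
        return self.awaiting_pong and now - self.last_ping_sent > PONG_TIMEOUT_S
```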

---

## Data Frame (Main Payload)

### Full Update

**Server → Client**

```json
{
  "type": "data_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "update_type": "full",
    "sequence": 12345,
    "payload": {
      "meta": {
        "generated_at": "2024-01-20T10:30:00Z",
        "data_sources": 9,
        "total_records": 20800
      },
      "gpu_clusters": {
        "total": 1500,
        "last_updated": "2024-01-20T10:00:00Z",
        "data": [
          {
            "id": "epoch-gpu-001",
            "name": "Frontier",
            "country": "US",
            "city": "Oak Ridge, TN",
            "lat": 35.9327,
            "lng": -84.3107,
            "gpu_count": 37888,
            "gpu_type": "AMD MI250X",
            "total_flops": 1.54e9,
            "rank": 1,
            "visual": {
              "size": 1.0,
              "color": "#FF6B6B",
              "pulse": true
            }
          }
        ]
      },
      "submarine_cables": {
        "total": 436,
        "last_updated": "2024-01-20T09:00:00Z",
        "data": [
          {
            "id": "cable-001",
            "name": "FASTER",
            "length_km": 11600,
            "capacity_tbps": 60,
            "status": "active",
            "landing_points": [
              {"lat": 37.7749, "lng": -122.4194},
              {"lat": 35.6762, "lng": 139.6503}
            ],
            "visual": {
              "width": 2.0,
              "color": "#4ECDC4",
              "animated": true
            }
          }
        ]
      },
      "ixp_nodes": {
        "total": 1200,
        "last_updated": "2024-01-20T09:30:00Z",
        "data": [
          {
            "id": "ixp-001",
            "name": "Equinix Ashburn",
            "country": "US",
            "city": "Ashburn, VA",
            "lat": 39.0438,
            "lng": -77.4874,
            "member_count": 250,
            "traffic_tbps": 15.5,
            "visual": {
              "size": 0.8,
              "color": "#45B7D1"
            }
          }
        ]
      },
      "cloud_infra": {
        "total": 500,
        "last_updated": "2024-01-20T08:00:00Z",
        "data": [
          {
            "provider": "AWS",
            "region": "us-east-1",
            "data_center_count": 15,
            "capacity_mw": 500,
            "lat": 39.0438,
            "lng": -77.4874,
            "visual": {
              "size": 1.2,
              "color": "#FF9900"
            }
          }
        ]
      }
    }
  }
}
```

### Incremental Update

**Server → Client**

```json
{
  "type": "data_frame",
  "timestamp": "2024-01-20T10:35:00.000Z",
  "data": {
    "update_type": "incremental",
    "sequence": 12346,
    "base_sequence": 12345,
    "changes": {
      "gpu_clusters": {
        "updated": [
          {
            "id": "epoch-gpu-002",
            "rank": 2,
            "gpu_count": 40000
          }
        ],
        "added": [],
        "removed": []
      },
      "alerts": {
        "new": [
          {
            "id": 1234,
            "severity": "warning",
            "message": "API response time > 30s",
            "source": "Epoch AI"
          }
        ],
        "resolved": [1230]
      }
    }
  }
}
```
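
A client might apply an incremental frame to its local copy roughly as follows. This is a sketch under several assumptions that the examples do not settle: `updated` entries are partial records merged by `id`, `removed` holds bare ids, a mismatch between `base_sequence` and the local sequence means the client should fall back to a `sync_request`, and the local `state` layout (`sequence`, `channels`, `alerts`) is invented for illustration.

```python
def apply_incremental(state: dict, frame_data: dict) -> bool:
    """Apply the `data` object of an incremental data_frame to `state`.

    Returns False, leaving `state` untouched, when the frame does not build
    on the sequence the client currently holds (a full sync is then needed).
    """
    if frame_data.get("update_type") != "incremental":
        raise ValueError("not an incremental frame")
    if frame_data["base_sequence"] != state["sequence"]:
        return False  # gap or replay detected

    for channel, change in frame_data["changes"].items():
        if channel == "alerts":
            alerts = state.setdefault("alerts", {})
            for alert in change.get("new", []):
                alerts[alert["id"]] = alert
            for alert_id in change.get("resolved", []):
                alerts.pop(alert_id, None)
            continue
        entities = state["channels"].setdefault(channel, {})
        for record in change.get("added", []):
            entities[record["id"]] = dict(record)
        for patch in change.get("updated", []):
            # Merge only the fields present in the patch.
            entities.setdefault(patch["id"], {}).update(patch)
        for entity_id in change.get("removed", []):
            entities.pop(entity_id, None)

    state["sequence"] = frame_data["sequence"]
    return True
```

Returning a boolean rather than raising keeps the sequence-gap case, which is expected after a dropped connection, separate from genuinely malformed frames.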

---

## Control Frame (Client → Server)

### Camera Position
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "camera_set",
    "camera": {
      "position": {
        "latitude": 35.6762,
        "longitude": 139.6503,
        "altitude": 5000000
      },
      "target": {
        "latitude": 35.6762,
        "longitude": 139.6503,
        "altitude": 0
      },
      "rotation": {
        "pitch": -45,
        "yaw": 0,
        "roll": 0
      }
    }
  }
}
```

### Camera Animation
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "camera_animate",
    "animation": {
      "type": "fly_to",
      "target": {
        "latitude": 39.0438,
        "longitude": -77.4874,
        "altitude": 3000000
      },
      "duration_seconds": 3.0,
      "easing": "ease_in_out"
    }
  }
}
```

### Auto-Cruise Control
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "cruise_control",
    "enabled": true,
    "config": {
      "speed": 1.0,
      "route": "global",
      "pause_on_interaction": true
    }
  }
}
```

### Layer Visibility
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "layer_visibility",
    "layers": {
      "gpu_clusters": true,
      "submarine_cables": true,
      "ixp_nodes": true,
      "cloud_infra": false,
      "satellites": false,
      "alerts": true
    }
  }
}
```

### Focus Request
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "focus_entity",
    "entity_type": "gpu_cluster",
    "entity_id": "epoch-gpu-001",
    "show_info": true
  }
}
```

### Time Range Filter
```json
{
  "type": "control_frame",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "set_time_range",
    "time_range": {
      "start": "2024-01-01T00:00:00Z",
      "end": "2024-01-20T23:59:59Z",
      "aggregation": "hourly"
    }
  }
}
```
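
Control frames are validated before they take effect; the `INVALID_CONTROL_FRAME` example in the Error Messages section rejects a non-positive `camera.position.altitude`. The sketch below shows what such a check could look like for `camera_set`. Only the altitude rule comes from this document. The latitude and longitude bounds are ordinary geographic ranges added as an assumption, and both function names are hypothetical.

```python
def validate_camera_set(data: dict) -> list:
    """Return (field, constraint) pairs for every problem; empty means valid."""
    problems = []
    camera = data.get("camera", {})
    for part in ("position", "target"):
        point = camera.get(part, {})
        # Assumption: standard geographic bounds; the spec does not state them.
        if not -90 <= point.get("latitude", 0) <= 90:
            problems.append((f"camera.{part}.latitude", "Must be between -90 and 90"))
        if not -180 <= point.get("longitude", 0) <= 180:
            problems.append((f"camera.{part}.longitude", "Must be between -180 and 180"))
    # From the Validation Error example. A missing altitude counts as 0 here
    # and is therefore rejected as well.
    if camera.get("position", {}).get("altitude", 0) <= 0:
        problems.append(("camera.position.altitude", "Must be positive"))
    return problems


def validation_error(field: str, constraint: str) -> dict:
    """Build the `data` object of an INVALID_CONTROL_FRAME error message."""
    return {
        "code": "INVALID_CONTROL_FRAME",
        "message": "Invalid camera position",
        "details": {"field": field, "constraint": constraint},
    }
```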

---

## Alert Notification

**Server → Client**

```json
{
  "type": "alert_notification",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "alert": {
      "id": 1234,
      "severity": "critical",
      "title": "Data Collection Failed",
      "message": "TOP500 data source failed to collect data",
      "source": "TOP500",
      "timestamp": "2024-01-20T10:25:00Z",
      "actions": ["acknowledge", "retry", "view_details"]
    },
    "badge_update": {
      "critical": 2,
      "warning": 5,
      "info": 10
    }
  }
}
```

---

## Sync Request

**Client → Server**

```json
{
  "type": "sync_request",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "request_type": "full",
    "channels": ["gpu_clusters", "submarine_cables", "ixp_nodes"]
  }
}
```

**Server Response:**
The server replies with a `data_frame` message whose `update_type` is `"full"` (see Full Update above).

---

## Subscription Management

### Subscribe
```json
{
  "type": "subscription",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "subscribe",
    "channels": ["gpu_clusters", "alerts"]
  }
}
```

### Unsubscribe
```json
{
  "type": "subscription",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "unsubscribe",
    "channels": ["alerts"]
  }
}
```

**Server Response:**
```json
{
  "type": "subscription_confirmed",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "action": "subscribe",
    "channels": ["gpu_clusters", "alerts"],
    "active_subscriptions": ["gpu_clusters", "alerts"]
  }
}
```
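
On the server side, the subscribe and unsubscribe messages above reduce to set operations on the connection's active channels. The following is a minimal sketch with a hypothetical `handle_subscription` function. It assumes a request naming any unknown channel is rejected as a whole with `CHANNEL_NOT_FOUND`, and it returns `active_subscriptions` sorted for determinism, whereas the example above lists them in subscription order. The reply omits the envelope `timestamp` for brevity.

```python
# Channel names advertised in the connection_established example.
SUPPORTED_CHANNELS = {"gpu_clusters", "submarine_cables", "ixp_nodes", "alerts"}


def _error(code: str, message: str) -> dict:
    return {"type": "error", "data": {"code": code, "message": message}}


def handle_subscription(active: set, data: dict) -> dict:
    """Update `active` in place and return the reply's type and data."""
    action, channels = data["action"], data["channels"]
    unknown = [c for c in channels if c not in SUPPORTED_CHANNELS]
    if unknown:
        # Assumption: one bad channel rejects the whole request.
        return _error("CHANNEL_NOT_FOUND", f"Invalid channel name: {unknown[0]}")
    if action == "subscribe":
        active.update(channels)
    elif action == "unsubscribe":
        active.difference_update(channels)
    else:
        return _error("INVALID_FRAME", f"Unknown subscription action: {action}")
    return {
        "type": "subscription_confirmed",
        "data": {
            "action": action,
            "channels": list(channels),
            "active_subscriptions": sorted(active),
        },
    }
```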

---

## Error Messages

### Connection Error
```json
{
  "type": "error",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "code": "INVALID_TOKEN",
    "message": "Invalid or expired authentication token",
    "action": "reconnect_with_fresh_token"
  }
}
```

### Rate Limit Error
```json
{
  "type": "error",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "code": "RATE_LIMITED",
    "message": "Too many requests",
    "retry_after_seconds": 30
  }
}
```

### Data Error
```json
{
  "type": "error",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "code": "DATA_FETCH_FAILED",
    "message": "Failed to fetch data from source: Epoch AI",
    "source": "Epoch AI",
    "will_retry": true,
    "retry_in_seconds": 60
  }
}
```

### Validation Error
```json
{
  "type": "error",
  "timestamp": "2024-01-20T10:30:00.000Z",
  "data": {
    "code": "INVALID_CONTROL_FRAME",
    "message": "Invalid camera position",
    "details": {
      "field": "camera.position.altitude",
      "constraint": "Must be positive"
    }
  }
}
```

---

## Error Codes Reference

| Code | HTTP Equivalent | Description |
|------|-----------------|-------------|
| INVALID_TOKEN | 401 | JWT validation failed |
| TOKEN_EXPIRED | 401 | Token has expired |
| RATE_LIMITED | 429 | Too many requests |
| CHANNEL_NOT_FOUND | 404 | Invalid channel name |
| INVALID_FRAME | 400 | Malformed JSON or structure |
| INVALID_CONTROL_FRAME | 400 | Control action validation failed |
| DATA_FETCH_FAILED | 500 | Backend data collection failed |
| INTERNAL_ERROR | 500 | Server internal error |
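
The table maps directly onto a lookup, and a client can use it to choose a reaction to each error. The sketch below is illustrative only: the mapping mirrors the table, but `next_step` and the action names it returns are hypothetical, as is the 30-second fallback used when `retry_after_seconds` is absent.

```python
ERROR_HTTP_EQUIVALENT = {
    "INVALID_TOKEN": 401,
    "TOKEN_EXPIRED": 401,
    "RATE_LIMITED": 429,
    "CHANNEL_NOT_FOUND": 404,
    "INVALID_FRAME": 400,
    "INVALID_CONTROL_FRAME": 400,
    "DATA_FETCH_FAILED": 500,
    "INTERNAL_ERROR": 500,
}


def next_step(error_data: dict) -> tuple:
    """Suggest a client reaction as (action, delay_seconds)."""
    code = error_data["code"]
    if code in ("INVALID_TOKEN", "TOKEN_EXPIRED"):
        return ("refresh_token_and_reconnect", 0)
    if code == "RATE_LIMITED":
        return ("retry", error_data.get("retry_after_seconds", 30))
    if code == "DATA_FETCH_FAILED" and error_data.get("will_retry"):
        # The server has announced its own retry; the client only waits.
        return ("wait_for_server_retry", error_data.get("retry_in_seconds", 0))
    if ERROR_HTTP_EQUIVALENT.get(code) in (400, 404):
        return ("fix_request", 0)  # resending the same frame will not help
    return ("report", 0)
```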

---

## Connection State Machine

```
DISCONNECTED
    │
    ├─→ CONNECTING (token validation)
    │
    ├─→ AUTHENTICATED ──→ ESTABLISHED
    │                           │
    ├─→ ERROR (reconnect)       ├─→ RECEIVING DATA
    │                           │
    └───────────────────────────┴─→ DISCONNECTING
                                      │
                                      └─→ DISCONNECTED
```
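
The diagram leaves some edges open to interpretation, so the transition table below is one plausible reading rather than the definitive one: it assumes token validation either authenticates or errors, that an error may lead to a fresh `CONNECTING` attempt or to teardown, and that both `ESTABLISHED` and `RECEIVING_DATA` can begin disconnecting. The `State` enum and `transition` helper are hypothetical names.

```python
from enum import Enum, auto


class State(Enum):
    DISCONNECTED = auto()
    CONNECTING = auto()
    AUTHENTICATED = auto()
    ESTABLISHED = auto()
    RECEIVING_DATA = auto()
    ERROR = auto()
    DISCONNECTING = auto()


# Allowed next states for each state (an interpretation of the diagram).
TRANSITIONS = {
    State.DISCONNECTED: {State.CONNECTING},
    State.CONNECTING: {State.AUTHENTICATED, State.ERROR},
    State.AUTHENTICATED: {State.ESTABLISHED, State.ERROR},
    State.ESTABLISHED: {State.RECEIVING_DATA, State.ERROR, State.DISCONNECTING},
    State.RECEIVING_DATA: {State.ERROR, State.DISCONNECTING},
    State.ERROR: {State.CONNECTING, State.DISCONNECTING},
    State.DISCONNECTING: {State.DISCONNECTED},
}


def transition(current: State, target: State) -> State:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```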

---

## Reconnection Strategy

1. **Immediate Retry:** On disconnect, retry after 1 second
2. **Exponential Backoff:** If failed, wait 2, 4, 8, 16 seconds
3. **Max Retries:** 5 attempts before giving up
4. **Token Refresh:** If token expired, refresh before reconnecting
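
The schedule above can be sketched as a generator plus a retry loop. This reading counts the 1-second retry as the first of the five attempts, so the waits are 1, 2, 4, 8, and 16 seconds. The `connect`, `sleep`, `token_expired`, and `refresh_token` callables are hypothetical hooks supplied by the caller, which also keeps the loop easy to test without real delays.

```python
from typing import Callable, Iterator, Optional


def reconnect_delays(max_attempts: int = 5, first_delay: float = 1.0) -> Iterator[float]:
    """Yield the wait before each attempt, doubling every time."""
    for attempt in range(max_attempts):
        yield first_delay * 2 ** attempt


def reconnect(
    connect: Callable[[], bool],
    sleep: Callable[[float], None],
    token_expired: Callable[[], bool] = lambda: False,
    refresh_token: Optional[Callable[[], None]] = None,
) -> bool:
    """Try to reconnect; return True on success, False after the last attempt."""
    for delay in reconnect_delays():
        sleep(delay)
        if refresh_token is not None and token_expired():
            refresh_token()  # step 4: refresh before reconnecting
        if connect():
            return True
    return False
```

In production code `sleep` would typically be `time.sleep` or an async equivalent.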

---

## Data Flow Diagram

```
┌──────────┐                     ┌──────────┐
│   UE5    │◄───── WebSocket ───►│  Server  │
│  Client  │                     │          │
└────┬─────┘                     └────┬─────┘
     │                                │
     │  1. Connect (with JWT)         │
     │  2. Connection Established     │
     │                                │
     │  3. Control Frame (Camera)     │
     ├───────────────────────────────►│
     │                                │
     │  4. Data Frame (Update)        │
     │◄───────────────────────────────┤
     │  5. Heartbeat (30s interval)   │
     │◄──────────────────────────────►│
     │                                │
     │  6. Alert Notification         │
     │◄───────────────────────────────┤
```

---

## Performance Considerations

| Metric | Target | Notes |
|--------|--------|-------|
| Data frame size | < 1 MB | Compressed if larger |
| Update latency | < 5 seconds | End-to-end |
| Heartbeat latency | < 100 ms | Server processing |
| Max connections | 1000 per server | With load balancing |

**Optimization Strategies:**
- Incremental updates for frequent changes
- Binary encoding for large datasets (MessagePack/Protocol Buffers)
- Compression for data frames (gzip)
- Chunking for large payloads
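
Two of these strategies, gzip compression above the size target and chunking, can be sketched with the standard library alone. The 1,000,000-byte threshold is an assumed reading of "< 1 MB", and the function names are hypothetical. How the receiver learns that a frame is compressed or how chunks are reassembled is not specified here, so the sketch only returns a flag and an ordered list.

```python
import gzip
import json
from typing import List, Tuple

MAX_FRAME_BYTES = 1_000_000  # assumed byte value for the "< 1 MB" target


def encode_frame(message: dict, limit: int = MAX_FRAME_BYTES) -> Tuple[bytes, bool]:
    """Return (payload, compressed); gzip is applied only above `limit` bytes."""
    raw = json.dumps(message, separators=(",", ":")).encode("utf-8")
    if len(raw) <= limit:
        return raw, False
    return gzip.compress(raw), True


def chunk(payload: bytes, size: int) -> List[bytes]:
    """Split `payload` into consecutive pieces of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [payload[i:i + size] for i in range(0, len(payload), size)]
```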