planet/websocket_protocol.md
2026-03-05 11:46:58 +08:00

WebSocket Protocol Specification

Connection

ws://localhost:8000/ws?token=<access_token>

Connection Steps:

  1. Client connects to WebSocket endpoint
  2. Server validates JWT token
  3. Server sends connection_established message
  4. Client sends a subscription message with action "subscribe" (optional)
  5. Server begins sending data frames
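
The first connection step amounts to placing the access token in the query string of the endpoint shown above. A minimal Python sketch follows; the helper name build_ws_url is hypothetical, and the token is assumed to be an already-issued JWT string.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_ws_url(base_url: str, token: str) -> str:
    """Return base_url with the access token set as the `token` query parameter.

    build_ws_url is a hypothetical helper, not part of the protocol itself.
    """
    parts = urlsplit(base_url)
    # urlencode escapes any characters that are not safe in a query string.
    query = urlencode({"token": token})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


if __name__ == "__main__":
    print(build_ws_url("ws://localhost:8000/ws", "abc.def.ghi"))
    # ws://localhost:8000/ws?token=abc.def.ghi
```

The resulting URL can be handed to whichever WebSocket client library the application uses.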

Connection Limits:

  • Maximum concurrent connections per user: 3
  • Connection timeout (no activity): 5 minutes
  • Heartbeat interval: 30 seconds

Message Format

General Structure

{
    "type": "message_type",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": { ... }
}
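
As an illustration of the envelope above, the following Python sketch builds and parses messages with the three top-level fields. The function names make_message and parse_message are hypothetical, and the timestamp is assumed to be UTC with millisecond precision, as in the examples in this document.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, Optional

REQUIRED_FIELDS = {"type", "timestamp", "data"}


def make_message(msg_type: str, data: Dict[str, Any],
                 now: Optional[datetime] = None) -> str:
    """Serialize a message using the general structure (type, timestamp, data)."""
    now = now or datetime.now(timezone.utc)
    # Render as e.g. 2024-01-20T10:30:00.000Z (UTC, millisecond precision).
    ts = now.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    ts = ts.replace("+00:00", "Z")
    return json.dumps({"type": msg_type, "timestamp": ts, "data": data})


def parse_message(raw: str) -> Dict[str, Any]:
    """Decode a message and check that the three envelope fields are present."""
    msg = json.loads(raw)
    if not isinstance(msg, dict):
        raise ValueError("message must be a JSON object")
    missing = REQUIRED_FIELDS - msg.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return msg


if __name__ == "__main__":
    fixed = datetime(2024, 1, 20, 10, 30, tzinfo=timezone.utc)
    print(make_message("heartbeat", {"action": "ping"}, now=fixed))
```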

All Message Types

| Type | Direction | Description |
|------|-----------|-------------|
| connection_established | Server → Client | Initial connection confirmation |
| heartbeat | Bidirectional | Keep-alive ping/pong |
| data_frame | Server → Client | Main data payload |
| control_frame | Client → Server | Camera/display control |
| alert_notification | Server → Client | Real-time alert |
| error | Bidirectional | Error reporting |
| sync_request | Client → Server | Request full sync |
| subscription | Client → Server | Subscribe/unsubscribe channels |
| subscription_confirmed | Server → Client | Confirmation of a subscription request |
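
The direction column can be enforced with a simple lookup before a message is dispatched. The sketch below is one possible way to do that in Python; the names ALLOWED_SENDERS and is_allowed are hypothetical, and the sender labels "client" and "server" are an assumption of this example.

```python
# Which side may send each message type, taken from the table above.
ALLOWED_SENDERS = {
    "connection_established": {"server"},
    "heartbeat": {"client", "server"},
    "data_frame": {"server"},
    "control_frame": {"client"},
    "alert_notification": {"server"},
    "error": {"client", "server"},
    "sync_request": {"client"},
    "subscription": {"client"},
    # Server reply shown under Subscription Management.
    "subscription_confirmed": {"server"},
}


def is_allowed(msg_type: str, sender: str) -> bool:
    """Return True if `sender` ("client" or "server") may send `msg_type`."""
    return sender in ALLOWED_SENDERS.get(msg_type, set())


if __name__ == "__main__":
    print(is_allowed("control_frame", "client"))  # True
    print(is_allowed("control_frame", "server"))  # False
```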

Connection Established

Server → Client

{
    "type": "connection_established",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "connection_id": "conn_a1b2c3d4",
        "server_version": "1.0.0",
        "session_id": "sess_xyz789",
        "heartbeat_interval": 30,
        "supported_channels": ["gpu_clusters", "submarine_cables", "ixp_nodes", "alerts"]
    }
}

Heartbeat

Client → Server (Ping)

{
    "type": "heartbeat",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "ping"
    }
}

Server → Client (Pong)

{
    "type": "heartbeat",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "pong",
        "latency_ms": 45
    }
}

Client Behavior:

  • Send ping every 30 seconds
  • If no pong is received within 10 seconds, reconnect
  • Track latency for monitoring

Server Behavior:

  • Send pong immediately on receiving ping
  • Track connection health
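
The client behavior above can be sketched as a small bookkeeping class that is independent of any WebSocket library. This is a minimal sketch: HeartbeatMonitor and its method names are hypothetical, times are plain seconds supplied by the caller, and the latency it records is the client-measured round trip rather than the server-reported latency_ms field.

```python
from typing import Optional


class HeartbeatMonitor:
    """Tracks when to ping, when a pong is overdue, and the last round trip."""

    def __init__(self, interval: float = 30.0, pong_timeout: float = 10.0):
        self.interval = interval          # send a ping every 30 seconds
        self.pong_timeout = pong_timeout  # reconnect if no pong within 10 seconds
        self.last_ping_at: Optional[float] = None
        self.awaiting_pong = False
        self.latency_ms: Optional[float] = None

    def should_ping(self, now: float) -> bool:
        if self.awaiting_pong:
            return False
        return self.last_ping_at is None or now - self.last_ping_at >= self.interval

    def on_ping_sent(self, now: float) -> None:
        self.last_ping_at = now
        self.awaiting_pong = True

    def on_pong(self, now: float) -> None:
        if self.last_ping_at is not None:
            # Client-side round-trip time, kept for monitoring.
            self.latency_ms = (now - self.last_ping_at) * 1000.0
        self.awaiting_pong = False

    def should_reconnect(self, now: float) -> bool:
        return (self.awaiting_pong and self.last_ping_at is not None
                and now - self.last_ping_at > self.pong_timeout)
```

A client loop would call should_ping and should_reconnect on each tick, passing a monotonic clock reading such as time.monotonic().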

Data Frame (Main Payload)

Full Update

Server → Client

{
    "type": "data_frame",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "update_type": "full",
        "sequence": 12345,
        "payload": {
            "meta": {
                "generated_at": "2024-01-20T10:30:00Z",
                "data_sources": 9,
                "total_records": 20800
            },
            "gpu_clusters": {
                "total": 1500,
                "last_updated": "2024-01-20T10:00:00Z",
                "data": [
                    {
                        "id": "epoch-gpu-001",
                        "name": "Frontier",
                        "country": "US",
                        "city": "Oak Ridge, TN",
                        "lat": 35.9327,
                        "lng": -84.3107,
                        "gpu_count": 37888,
                        "gpu_type": "AMD MI250X",
                        "total_flops": 1.54e9,
                        "rank": 1,
                        "visual": {
                            "size": 1.0,
                            "color": "#FF6B6B",
                            "pulse": true
                        }
                    }
                ]
            },
            "submarine_cables": {
                "total": 436,
                "last_updated": "2024-01-20T09:00:00Z",
                "data": [
                    {
                        "id": "cable-001",
                        "name": "FASTER",
                        "length_km": 11600,
                        "capacity_tbps": 60,
                        "status": "active",
                        "landing_points": [
                            {"lat": 37.7749, "lng": -122.4194},
                            {"lat": 35.6762, "lng": 139.6503}
                        ],
                        "visual": {
                            "width": 2.0,
                            "color": "#4ECDC4",
                            "animated": true
                        }
                    }
                ]
            },
            "ixp_nodes": {
                "total": 1200,
                "last_updated": "2024-01-20T09:30:00Z",
                "data": [
                    {
                        "id": "ixp-001",
                        "name": "Equinix Ashburn",
                        "country": "US",
                        "city": "Ashburn, VA",
                        "lat": 39.0438,
                        "lng": -77.4874,
                        "member_count": 250,
                        "traffic_tbps": 15.5,
                        "visual": {
                            "size": 0.8,
                            "color": "#45B7D1"
                        }
                    }
                ]
            },
            "cloud_infra": {
                "total": 500,
                "last_updated": "2024-01-20T08:00:00Z",
                "data": [
                    {
                        "provider": "AWS",
                        "region": "us-east-1",
                        "data_center_count": 15,
                        "capacity_mw": 500,
                        "lat": 39.0438,
                        "lng": -77.4874,
                        "visual": {
                            "size": 1.2,
                            "color": "#FF9900"
                        }
                    }
                ]
            }
        }
    }
}

Incremental Update

Server → Client

{
    "type": "data_frame",
    "timestamp": "2024-01-20T10:35:00.000Z",
    "data": {
        "update_type": "incremental",
        "sequence": 12346,
        "base_sequence": 12345,
        "changes": {
            "gpu_clusters": {
                "updated": [
                    {
                        "id": "epoch-gpu-002",
                        "rank": 2,
                        "gpu_count": 40000
                    }
                ],
                "added": [],
                "removed": []
            },
            "alerts": {
                "new": [
                    {
                        "id": 1234,
                        "severity": "warning",
                        "message": "API response time > 30s",
                        "source": "Epoch AI"
                    }
                ],
                "resolved": [1230]
            }
        }
    }
}
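
A client can merge an incremental frame into its local copy only when base_sequence matches the sequence it last applied. The Python sketch below shows one way to do this. The function name apply_incremental and the shape of the local state are hypothetical, and it assumes that "removed" and "resolved" carry entity IDs, which the example above suggests but does not confirm for "removed".

```python
from typing import Any, Dict


def apply_incremental(state: Dict[str, Any], frame_data: Dict[str, Any]) -> bool:
    """Merge an incremental data_frame into `state`.

    Returns False when base_sequence does not match the local sequence,
    in which case the caller would typically send a sync_request.
    """
    if frame_data.get("base_sequence") != state["sequence"]:
        return False
    for channel, change in frame_data.get("changes", {}).items():
        if channel == "alerts":
            alerts = state.setdefault("alerts", {})
            for alert in change.get("new", []):
                alerts[alert["id"]] = dict(alert)
            for alert_id in change.get("resolved", []):
                alerts.pop(alert_id, None)
            continue
        entities = state["channels"].setdefault(channel, {})
        for entity in change.get("added", []):
            entities[entity["id"]] = dict(entity)
        for patch in change.get("updated", []):
            # Partial update: only the fields present in the patch change.
            entities.setdefault(patch["id"], {}).update(patch)
        for entity_id in change.get("removed", []):  # assumed to be IDs
            entities.pop(entity_id, None)
    state["sequence"] = frame_data["sequence"]
    return True
```

Here state is assumed to look like {"sequence": 12345, "channels": {"gpu_clusters": {id: entity}}, "alerts": {id: alert}}, populated from the most recent full update.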

Control Frame (Client → Server)

Camera Position

{
    "type": "control_frame",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "camera_set",
        "camera": {
            "position": {
                "latitude": 35.6762,
                "longitude": 139.6503,
                "altitude": 5000000
            },
            "target": {
                "latitude": 35.6762,
                "longitude": 139.6503,
                "altitude": 0
            },
            "rotation": {
                "pitch": -45,
                "yaw": 0,
                "roll": 0
            }
        }
    }
}

Camera Animation

{
    "type": "control_frame",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "camera_animate",
        "animation": {
            "type": "fly_to",
            "target": {
                "latitude": 39.0438,
                "longitude": -77.4874,
                "altitude": 3000000
            },
            "duration_seconds": 3.0,
            "easing": "ease_in_out"
        }
    }
}

Auto-Cruise Control

{
    "type": "control_frame",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "cruise_control",
        "enabled": true,
        "config": {
            "speed": 1.0,
            "route": "global",
            "pause_on_interaction": true
        }
    }
}

Layer Visibility

{
    "type": "control_frame",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "layer_visibility",
        "layers": {
            "gpu_clusters": true,
            "submarine_cables": true,
            "ixp_nodes": true,
            "cloud_infra": false,
            "satellites": false,
            "alerts": true
        }
    }
}

Focus Request

{
    "type": "control_frame",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "focus_entity",
        "entity_type": "gpu_cluster",
        "entity_id": "epoch-gpu-001",
        "show_info": true
    }
}

Time Range Filter

{
    "type": "control_frame",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "set_time_range",
        "time_range": {
            "start": "2024-01-01T00:00:00Z",
            "end": "2024-01-20T23:59:59Z",
            "aggregation": "hourly"
        }
    }
}

Alert Notification

Server → Client

{
    "type": "alert_notification",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "alert": {
            "id": 1234,
            "severity": "critical",
            "title": "Data Collection Failed",
            "message": "TOP500 data source failed to collect data",
            "source": "TOP500",
            "timestamp": "2024-01-20T10:25:00Z",
            "actions": ["acknowledge", "retry", "view_details"]
        },
        "badge_update": {
            "critical": 2,
            "warning": 5,
            "info": 10
        }
    }
}

Sync Request

Client → Server

{
    "type": "sync_request",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "request_type": "full",
        "channels": ["gpu_clusters", "submarine_cables", "ixp_nodes"]
    }
}

Server Response: Same as data_frame with update_type: "full"


Subscription Management

Subscribe

{
    "type": "subscription",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "subscribe",
        "channels": ["gpu_clusters", "alerts"]
    }
}

Unsubscribe

{
    "type": "subscription",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "unsubscribe",
        "channels": ["alerts"]
    }
}

Server Response:

{
    "type": "subscription_confirmed",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "action": "subscribe",
        "channels": ["gpu_clusters", "alerts"],
        "active_subscriptions": ["gpu_clusters", "alerts"]
    }
}
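
On the server side, subscription handling reduces to updating a per-connection channel list and echoing it back. The following Python sketch illustrates this under stated assumptions: handle_subscription is a hypothetical name, the channel list is taken from supported_channels in the connection_established example, and it is assumed that unsubscribe requests are also answered with subscription_confirmed. Timestamps are omitted for brevity.

```python
from typing import Any, Dict, List

# From supported_channels in the connection_established example.
SUPPORTED_CHANNELS = ["gpu_clusters", "submarine_cables", "ixp_nodes", "alerts"]


def _error(code: str, message: str) -> Dict[str, Any]:
    return {"type": "error", "data": {"code": code, "message": message}}


def handle_subscription(active: List[str], data: Dict[str, Any]) -> Dict[str, Any]:
    """Apply a subscription message to `active` (modified in place) and build the reply."""
    action = data.get("action")
    channels = data.get("channels", [])
    unknown = [c for c in channels if c not in SUPPORTED_CHANNELS]
    if unknown:
        return _error("CHANNEL_NOT_FOUND", f"Invalid channel name: {unknown[0]}")
    if action == "subscribe":
        for channel in channels:
            if channel not in active:
                active.append(channel)
    elif action == "unsubscribe":
        for channel in channels:
            if channel in active:
                active.remove(channel)
    else:
        return _error("INVALID_FRAME", f"Unknown subscription action: {action}")
    return {
        "type": "subscription_confirmed",
        "data": {
            "action": action,
            "channels": list(channels),
            "active_subscriptions": list(active),
        },
    }
```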

Error Messages

Connection Error

{
    "type": "error",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "code": "INVALID_TOKEN",
        "message": "Invalid or expired authentication token",
        "action": "reconnect_with_fresh_token"
    }
}

Rate Limit Error

{
    "type": "error",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "code": "RATE_LIMITED",
        "message": "Too many requests",
        "retry_after_seconds": 30
    }
}

Data Error

{
    "type": "error",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "code": "DATA_FETCH_FAILED",
        "message": "Failed to fetch data from source: Epoch AI",
        "source": "Epoch AI",
        "will_retry": true,
        "retry_in_seconds": 60
    }
}

Validation Error

{
    "type": "error",
    "timestamp": "2024-01-20T10:30:00.000Z",
    "data": {
        "code": "INVALID_CONTROL_FRAME",
        "message": "Invalid camera position",
        "details": {
            "field": "camera.position.altitude",
            "constraint": "Must be positive"
        }
    }
}
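
A validation error like the one above could be produced by a check along the following lines. This is a sketch, not the server's actual validator: validate_camera_set is a hypothetical name, the positive-altitude rule comes from the example, and the latitude and longitude ranges are an added assumption based on ordinary geographic bounds.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional


def _is_number(value: Any) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def validate_camera_set(data: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return an INVALID_CONTROL_FRAME error message, or None if the frame is valid."""
    position = data.get("camera", {}).get("position", {})
    rules = [
        ("altitude", lambda v: v > 0, "Must be positive"),
        # The two range checks below are assumptions, not stated in the spec.
        ("latitude", lambda v: -90 <= v <= 90, "Must be between -90 and 90"),
        ("longitude", lambda v: -180 <= v <= 180, "Must be between -180 and 180"),
    ]
    for name, is_ok, constraint in rules:
        value = position.get(name)
        if not _is_number(value) or not is_ok(value):
            ts = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
            return {
                "type": "error",
                "timestamp": ts.replace("+00:00", "Z"),
                "data": {
                    "code": "INVALID_CONTROL_FRAME",
                    "message": "Invalid camera position",
                    "details": {
                        "field": f"camera.position.{name}",
                        "constraint": constraint,
                    },
                },
            }
    return None
```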

Error Codes Reference

| Code | HTTP Equivalent | Description |
|------|-----------------|-------------|
| INVALID_TOKEN | 401 | JWT validation failed |
| TOKEN_EXPIRED | 401 | Token has expired |
| RATE_LIMITED | 429 | Too many requests |
| CHANNEL_NOT_FOUND | 404 | Invalid channel name |
| INVALID_FRAME | 400 | Malformed JSON or structure |
| INVALID_CONTROL_FRAME | 400 | Control action validation failed |
| DATA_FETCH_FAILED | 500 | Backend data collection failed |
| INTERNAL_ERROR | 500 | Server internal error |
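
A client might translate these codes into a recovery step as sketched below. The function plan_recovery and the action labels it returns are hypothetical; only the codes and the retry_after_seconds, will_retry, and retry_in_seconds fields come from the examples above, and the 30-second fallback is an assumption.

```python
from typing import Any, Dict, Tuple


def plan_recovery(error_data: Dict[str, Any]) -> Tuple[str, float]:
    """Map the `data` object of an error message to (action, delay_seconds)."""
    code = error_data.get("code")
    if code in ("INVALID_TOKEN", "TOKEN_EXPIRED"):
        return ("refresh_token_and_reconnect", 0.0)
    if code == "RATE_LIMITED":
        # Fall back to 30 seconds if the server omits the hint (assumption).
        return ("wait_then_retry", float(error_data.get("retry_after_seconds", 30)))
    if code == "DATA_FETCH_FAILED" and error_data.get("will_retry"):
        return ("wait_for_server_retry", float(error_data.get("retry_in_seconds", 0)))
    if code in ("CHANNEL_NOT_FOUND", "INVALID_FRAME", "INVALID_CONTROL_FRAME"):
        # Client-side mistake: resending the same message will not help.
        return ("fix_request", 0.0)
    return ("report", 0.0)


if __name__ == "__main__":
    print(plan_recovery({"code": "RATE_LIMITED", "retry_after_seconds": 30}))
    # ('wait_then_retry', 30.0)
```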

Connection State Machine

DISCONNECTED
    │
    ├─→ CONNECTING (token validation)
    │
    ├─→ AUTHENTICATED ──→ ESTABLISHED
    │                           │
    ├─→ ERROR (reconnect)       ├─→ RECEIVING DATA
    │                           │
    └───────────────────────────┴─→ DISCONNECTING
                                        │
                                        └─→ DISCONNECTED
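
The diagram can be encoded as a transition table so that illegal state changes are caught early. The Python sketch below reflects one possible reading of the diagram; the exact set of allowed transitions (in particular those out of ERROR) is an assumption, and the names State, TRANSITIONS, and transition are hypothetical.

```python
from enum import Enum


class State(Enum):
    DISCONNECTED = "DISCONNECTED"
    CONNECTING = "CONNECTING"
    AUTHENTICATED = "AUTHENTICATED"
    ESTABLISHED = "ESTABLISHED"
    RECEIVING_DATA = "RECEIVING_DATA"
    ERROR = "ERROR"
    DISCONNECTING = "DISCONNECTING"


# One interpretation of the diagram above; adjust to the real server behavior.
TRANSITIONS = {
    State.DISCONNECTED: {State.CONNECTING},
    State.CONNECTING: {State.AUTHENTICATED, State.ERROR},
    State.AUTHENTICATED: {State.ESTABLISHED},
    State.ESTABLISHED: {State.RECEIVING_DATA, State.DISCONNECTING},
    State.RECEIVING_DATA: {State.DISCONNECTING},
    State.ERROR: {State.CONNECTING, State.DISCONNECTING},
    State.DISCONNECTING: {State.DISCONNECTED},
}


def transition(current: State, target: State) -> State:
    """Return `target` if the move is allowed, otherwise raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target
```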

Reconnection Strategy

  1. Initial Retry: On disconnect, retry after 1 second
  2. Exponential Backoff: If the retry fails, wait 2, 4, 8, then 16 seconds before each subsequent attempt
  3. Max Retries: 5 attempts before giving up
  4. Token Refresh: If token expired, refresh before reconnecting
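
The retry schedule above can be sketched as follows. It assumes the five attempts are the 1-second retry plus the four backoff retries; reconnect_delays and reconnect are hypothetical names, and connect is any caller-supplied callable that returns True on success and refreshes an expired token itself before connecting.

```python
import time
from typing import Callable, List


def reconnect_delays(max_retries: int = 5, first: float = 1.0,
                     factor: float = 2.0) -> List[float]:
    """Delays in seconds before each attempt: 1, 2, 4, 8, 16 by default."""
    return [first * factor ** i for i in range(max_retries)]


def reconnect(connect: Callable[[], bool],
              sleep: Callable[[float], None] = time.sleep,
              max_retries: int = 5) -> bool:
    """Try to reconnect with exponential backoff; give up after max_retries."""
    for delay in reconnect_delays(max_retries):
        sleep(delay)
        if connect():
            return True
    return False


if __name__ == "__main__":
    print(reconnect_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0]
```

Passing sleep as a parameter keeps the loop testable without real waiting.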

Data Flow Diagram

┌──────────┐                     ┌──────────┐
│   UE5    │◄───── WebSocket ───►│  Server  │
│  Client  │                     │          │
└────┬─────┘                     └────┬─────┘
     │                                │
     │  1. Connect (with JWT)         │
     ├───────────────────────────────►│
     │  2. Connection Established     │
     │◄───────────────────────────────┤
     │                                │
     │  3. Control Frame (Camera)     │
     ├───────────────────────────────►│
     │                                │
     │  4. Data Frame (Update)        │
     │◄───────────────────────────────┤
     │  5. Heartbeat (30s interval)   │
     │◄──────────────────────────────►│
     │                                │
     │  6. Alert Notification         │
     │◄───────────────────────────────┤

Performance Considerations

| Metric | Target | Notes |
|--------|--------|-------|
| Data frame size | < 1 MB | Compressed if larger |
| Update latency | < 5 seconds | End-to-end |
| Heartbeat latency | < 100 ms | Server processing |
| Max connections | 1000 per server | With load balancing |

Optimization Strategies:

  • Incremental updates for frequent changes
  • Binary encoding for large datasets (MessagePack/Protocol Buffers)
  • Compression for data frames (gzip)
  • Chunking for large payloads
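
The compression and chunking strategies can be illustrated with the Python standard library. In this sketch the 1 MB target is treated as 1,000,000 bytes of serialized JSON, which is an assumption, and the names encode_frame and split_into_chunks are hypothetical. How the receiver learns that a frame is compressed or chunked is not defined in this document and is left out.

```python
import gzip
import json
from typing import Any, Dict, List, Tuple

MAX_FRAME_BYTES = 1_000_000  # "< 1 MB" target; exact byte count is an assumption


def encode_frame(message: Dict[str, Any]) -> Tuple[bytes, bool]:
    """Serialize a message; gzip it only if it exceeds the size target.

    Returns (payload, compressed_flag).
    """
    raw = json.dumps(message, separators=(",", ":")).encode("utf-8")
    if len(raw) <= MAX_FRAME_BYTES:
        return raw, False
    return gzip.compress(raw), True


def split_into_chunks(payload: bytes, chunk_size: int) -> List[bytes]:
    """Split a payload into consecutive chunks of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [payload[i:i + chunk_size] for i in range(0, len(payload), chunk_size)]


if __name__ == "__main__":
    payload, compressed = encode_frame({"type": "heartbeat", "data": {"action": "ping"}})
    print(compressed)  # False
    print(split_into_chunks(b"abcdefgh", 3))  # [b'abc', b'def', b'gh']
```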