This documentation is part of the "Projects with Books" initiative at zenOSmosis.

The source code for this project is available on GitHub.

Transport Errors

Purpose and Scope

This document explains transport-level errors in the rust-muxio system: failures that occur at the connection and network layer rather than in RPC method execution. Transport errors include connection failures, unexpected disconnections, stream cancellations, and network I/O errors. These errors are distinct from RPC service errors (see RPC Service Errors), which represent application-level failures like method-not-found or serialization errors.

Transport errors affect the underlying communication channel and typically result in all pending requests being cancelled. The system provides automatic cleanup mechanisms to ensure that no requests hang indefinitely when transport fails.


Transport Error Categories

The system handles three primary categories of transport errors, each with distinct handling mechanisms and propagation paths.

Connection Establishment Errors

Connection errors occur during the initial WebSocket handshake or TCP connection setup. These are synchronous errors that prevent the client from being created.

Key Error Type: std::io::Error with kind ConnectionRefused

extensions/muxio-tokio-rpc-client/src/rpc_client.rs:118-121
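
Because these failures surface synchronously as a std::io::Error, a caller can inspect the error kind before deciding whether to retry. The following is a minimal sketch using only the standard library; is_retryable_connect_error is a hypothetical helper, not part of rust-muxio, and the set of kinds treated as retryable is an assumption made for illustration.

```rust
use std::io;

/// Hypothetical helper (not part of rust-muxio): decides whether a failed
/// connection attempt looks transient, based on its `std::io::ErrorKind`.
fn is_retryable_connect_error(err: &io::Error) -> bool {
    matches!(
        err.kind(),
        // Nothing is listening yet; the server may come up later.
        io::ErrorKind::ConnectionRefused
            // The attempt timed out or was reset mid-handshake.
            | io::ErrorKind::TimedOut
            | io::ErrorKind::ConnectionReset
    )
}
```

A retry loop could call this helper on the error returned by the connect attempt and back off only when it returns true.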

Runtime Disconnection Errors

Disconnections occur after a successful connection when the WebSocket stream encounters an error or closes unexpectedly. These trigger automatic cleanup of all pending requests.

Key Error Type: FrameDecodeError::ReadAfterCancel

extensions/muxio-tokio-rpc-client/src/rpc_client.rs:102

Frame-Level Decode Errors

Frame decode errors occur when the binary framing protocol receives malformed data. These are represented by FrameDecodeError variants from the core library.

Key Error Types:

  • FrameDecodeError::CorruptFrame - Invalid frame structure
  • FrameDecodeError::ReadAfterCancel - Stream cancelled by transport
  • FrameDecodeError::UnexpectedEOF - Incomplete frame data

src/rpc/rpc_dispatcher.rs:187-206
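
To show how application code might branch on these variants, here is a small sketch. The FrameDecodeError enum below is a local stand-in that mirrors only the three variant names listed above (the real enum lives in the core library and may contain more), and Reaction and react_to are hypothetical names introduced for this example.

```rust
/// Local stand-in mirroring the three variant names listed above.
/// The real `FrameDecodeError` may define additional variants.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FrameDecodeError {
    CorruptFrame,
    ReadAfterCancel,
    UnexpectedEOF,
}

/// Hypothetical application-level policy for each kind of failure.
#[derive(Debug, PartialEq, Eq)]
enum Reaction {
    Reconnect,
    InvestigateFraming,
}

fn react_to(err: FrameDecodeError) -> Reaction {
    match err {
        // The transport cancelled the stream, so the connection is gone.
        FrameDecodeError::ReadAfterCancel => Reaction::Reconnect,
        // Malformed or truncated frames point at the framing layer itself.
        FrameDecodeError::CorruptFrame | FrameDecodeError::UnexpectedEOF => {
            Reaction::InvestigateFraming
        }
    }
}
```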


Transport Error Types and Code Entities


Error Propagation Through RpcDispatcher

When a transport error occurs, the RpcDispatcher is responsible for propagating the error to all pending request handlers. This prevents requests from hanging indefinitely.

sequenceDiagram
    participant Transport as "WebSocket Transport"
    participant Client as "RpcClient"
    participant Dispatcher as "RpcDispatcher"
    participant Handlers as "Response Handlers"
    participant App as "Application Code"
    
    Note over Transport: Connection failure detected
    Transport->>Client: ws_receiver error
    Client->>Client: shutdown_async()
    Note over Client: is_connected.swap(false)
    
    Client->>Dispatcher: dispatcher.lock().await
    Client->>Dispatcher: fail_all_pending_requests(error)
    
    Note over Dispatcher: Take ownership of handlers\nstd::mem::take()
    
    loop For each pending request
        Dispatcher->>Handlers: Create RpcStreamEvent::Error
        Note over Handlers: rpc_request_id: Some(id)\nframe_decode_error: ReadAfterCancel
        Handlers->>Handlers: handler(error_event)
        Handlers->>App: Resolve Future with error
    end
    
    Note over Dispatcher: response_handlers now empty
    Note over Client: State change handler called\nRpcTransportState::Disconnected

Dispatcher Error Propagation Sequence


Automatic Disconnection Handling

The RpcClient implements automatic disconnection handling through three concurrent tasks that monitor the WebSocket connection and coordinate cleanup.

Client Task Architecture

| Task | Responsibility | Error Detection | Cleanup Action |
|---|---|---|---|
| Receive Loop | Reads WebSocket messages | Detects ws_receiver.next() errors or None | Spawns shutdown_async() |
| Send Loop | Writes WebSocket messages | Detects ws_sender.send() errors | Spawns shutdown_async() |
| Heartbeat Loop | Periodic ping messages | Detects channel closed | Exits task |

extensions/muxio-tokio-rpc-client/src/rpc_client.rs:139-257

Shutdown Synchronization

The client provides both synchronous and asynchronous shutdown paths to handle different scenarios:

Asynchronous Shutdown (shutdown_async):

  • Called by background tasks when detecting errors
  • Acquires dispatcher lock to prevent new RPC calls
  • Calls fail_all_pending_requests with ReadAfterCancel error
  • Invokes state change handler with RpcTransportState::Disconnected

extensions/muxio-tokio-rpc-client/src/rpc_client.rs:80-108

Synchronous Shutdown (shutdown_sync):

  • Called from Drop implementation
  • Does not acquire locks (avoids deadlock during cleanup)
  • Only invokes state change handler
  • Aborts all background tasks

extensions/muxio-tokio-rpc-client/src/rpc_client.rs:56-77

Key Synchronization Mechanism: the is_connected flag (an AtomicBool)

The is_connected flag uses SeqCst ordering to ensure:

  1. Only one shutdown path executes
  2. Send loop drops messages if disconnected
  3. Emit function rejects outgoing RPC data

extensions/muxio-tokio-rpc-client/src/rpc_client.rs:61
extensions/muxio-tokio-rpc-client/src/rpc_client.rs:85
extensions/muxio-tokio-rpc-client/src/rpc_client.rs:231-236
extensions/muxio-tokio-rpc-client/src/rpc_client.rs:294-297
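
The "only one shutdown path executes" guarantee follows from AtomicBool::swap returning the previous value: exactly one caller sees true. The sketch below models this with the standard library only. Connection, cleanups_run, and race_shutdowns are illustrative names, not the actual RpcClient internals.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread;

/// Simplified model of a client's connection flag (names are illustrative).
struct Connection {
    is_connected: AtomicBool,
    cleanups_run: AtomicUsize,
}

impl Connection {
    fn new() -> Self {
        Self {
            is_connected: AtomicBool::new(true),
            cleanups_run: AtomicUsize::new(0),
        }
    }

    /// `swap` returns the previous value, so exactly one caller observes
    /// `true` and performs the cleanup; every later caller returns early.
    fn shutdown(&self) -> bool {
        if !self.is_connected.swap(false, Ordering::SeqCst) {
            return false;
        }
        self.cleanups_run.fetch_add(1, Ordering::SeqCst);
        true
    }
}

/// Races `n` threads into `shutdown` and reports how many cleanups ran.
fn race_shutdowns(n: usize) -> usize {
    let conn = Arc::new(Connection::new());
    let handles: Vec<_> = (0..n)
        .map(|_| {
            let conn = Arc::clone(&conn);
            thread::spawn(move || conn.shutdown())
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    conn.cleanups_run.load(Ordering::SeqCst)
}
```

However many threads race, race_shutdowns reports a single cleanup, which is the property the send loop, receive loop, and Drop path rely on.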


graph TB
    subgraph "Before Disconnect"
        PH1["response_handlers HashMap\nrequest_id → handler"]
        PR1["Pending Request 1\nawaiting response"]
        PR2["Pending Request 2\nawaiting response"]
        PR3["Pending Request 3\nawaiting response"]
        PH1 --> PR1
        PH1 --> PR2
        PH1 --> PR3
    end

    subgraph "Cancellation Process"
        DC["Disconnect detected"]
        FP["fail_all_pending_requests()\nstd::mem::take(&mut handlers)"]
        DC --> FP
    end

    subgraph "Handler Invocation"
        LE["Loop over handlers"]
        CE["Create RpcStreamEvent::Error\nrpc_request_id: Some(id)\nframe_decode_error: ReadAfterCancel"]
        CH["handler(error_event)"]
        FP --> LE
        LE --> CE
        CE --> CH
    end

    subgraph "After Cancellation"
        PH2["response_handlers HashMap\n(empty)"]
        ER1["Request 1 fails with\nRpcServiceError::TransportError"]
        ER2["Request 2 fails with\nRpcServiceError::TransportError"]
        ER3["Request 3 fails with\nRpcServiceError::TransportError"]
        CH --> ER1
        CH --> ER2
        CH --> ER3
        PH2 -.handlers cleared.-> ER1
    end

    style DC fill:#ffcccc
    style FP fill:#ffcccc
    style ER1 fill:#ffcccc
    style ER2 fill:#ffcccc
    style ER3 fill:#ffcccc

Pending Request Cancellation

When a transport error occurs, all pending RPC requests must be cancelled to prevent application code from hanging indefinitely. The system achieves this through the fail_all_pending_requests method.

Cancellation Mechanism

Implementation Details

The fail_all_pending_requests method in RpcDispatcher:

  1. Takes ownership of all response handlers using std::mem::take

    • Leaves response_handlers empty
    • Prevents new errors from affecting already-cancelled requests
  2. Creates synthetic error events for each pending request:

    • RpcStreamEvent::Error with FrameDecodeError::ReadAfterCancel
    • Includes rpc_request_id for correlation
    • Omits rpc_header and rpc_method_id (not needed for cancellation)
  3. Invokes each handler with the error event:

    • Wakes up the waiting Future in application code
    • Results in RpcServiceError::TransportError propagated to caller

src/rpc/rpc_dispatcher.rs:428-456

Critical Design Note: Handler removal via std::mem::take prevents the catch-all handler from processing error events for already-cancelled requests, avoiding duplicate error notifications.
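
The three steps above can be sketched with standard-library types. This is a simplified model, not the actual RpcDispatcher code: StreamEvent, Handler, Dispatcher, and demo_fail_all are stand-ins defined here, and the real RpcStreamEvent::Error carries more fields than shown.

```rust
use std::collections::HashMap;
use std::sync::mpsc;

/// Simplified stand-in for the error event delivered to each handler.
#[derive(Debug, Clone, PartialEq, Eq)]
enum StreamEvent {
    Error { rpc_request_id: Option<u32>, reason: &'static str },
}

type Handler = Box<dyn FnMut(StreamEvent) + Send>;

#[derive(Default)]
struct Dispatcher {
    response_handlers: HashMap<u32, Handler>,
}

impl Dispatcher {
    fn fail_all_pending_requests(&mut self, reason: &'static str) {
        // Step 1: take ownership of every handler, leaving an empty map.
        let handlers = std::mem::take(&mut self.response_handlers);
        for (id, mut handler) in handlers {
            // Steps 2 and 3: build a synthetic error event and deliver it.
            handler(StreamEvent::Error { rpc_request_id: Some(id), reason });
        }
    }
}

/// Registers `n` pending requests, fails them all, and returns the events
/// received (sorted by request ID) plus the number of handlers remaining.
fn demo_fail_all(n: u32) -> (Vec<StreamEvent>, usize) {
    let (tx, rx) = mpsc::channel::<StreamEvent>();
    let mut dispatcher = Dispatcher::default();
    for id in 0..n {
        let tx = tx.clone();
        dispatcher.response_handlers.insert(
            id,
            Box::new(move |ev: StreamEvent| {
                let _ = tx.send(ev);
            }),
        );
    }
    drop(tx); // only the handlers hold senders now
    dispatcher.fail_all_pending_requests("ReadAfterCancel");

    let mut events: Vec<StreamEvent> = rx.iter().collect();
    events.sort_by_key(|ev| match ev {
        StreamEvent::Error { rpc_request_id, .. } => *rpc_request_id,
    });
    (events, dispatcher.response_handlers.len())
}
```

Because the map is emptied before any handler runs, a handler that triggers further dispatcher activity cannot be invoked a second time for the same request.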


Transport State Change Notifications

Applications can register a state change handler to receive notifications when the transport connection state changes. This enables reactive UI updates and connection retry logic.

State Change Handler Interface

RpcTransportState Enum

| State | Meaning | Triggered When |
|---|---|---|
| Connected | Transport is active | Client successfully connects, or handler registered on already-connected client |
| Disconnected | Transport has failed | WebSocket error detected, connection closed, or client dropped |

extensions/muxio-rpc-service-caller/src/transport_state.rs

Handler Invocation Guarantees

The state change handler is invoked with the following guarantees:

  1. Immediate callback on registration: If the client is already connected when set_state_change_handler is called, the handler is immediately invoked with Connected

  2. Single disconnection notification: The is_connected atomic flag ensures only one thread invokes the Disconnected handler

  3. Thread-safe invocation: Handler is called while holding the state_change_handler mutex, preventing concurrent modifications

extensions/muxio-tokio-rpc-client/src/rpc_client.rs:315-334
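
These guarantees can be modelled in a few lines of standard-library Rust. The sketch below is an approximation under stated assumptions: Client, TransportState, StateHandler, and demo_notifications are stand-ins defined for this example, and the real set_state_change_handler signature may differ.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::{Arc, Mutex};

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TransportState {
    Connected,
    Disconnected,
}

type StateHandler = Box<dyn Fn(TransportState) + Send>;

/// Simplified model of a client holding a flag and an optional handler.
struct Client {
    is_connected: AtomicBool,
    state_change_handler: Mutex<Option<StateHandler>>,
}

impl Client {
    fn connected() -> Self {
        Self {
            is_connected: AtomicBool::new(true),
            state_change_handler: Mutex::new(None),
        }
    }

    fn set_state_change_handler(&self, handler: StateHandler) {
        let mut slot = self.state_change_handler.lock().unwrap();
        *slot = Some(handler);
        // Guarantee 1: notify immediately if already connected.
        if self.is_connected.load(Ordering::SeqCst) {
            if let Some(h) = slot.as_ref() {
                h(TransportState::Connected);
            }
        }
    }

    fn shutdown(&self) {
        // Guarantee 2: only the caller that flips the flag notifies.
        if self.is_connected.swap(false, Ordering::SeqCst) {
            // Guarantee 3: the handler runs while the mutex is held.
            if let Some(h) = self.state_change_handler.lock().unwrap().as_ref() {
                h(TransportState::Disconnected);
            }
        }
    }
}

/// Registers a recording handler, shuts down twice, returns what was seen.
fn demo_notifications() -> Vec<TransportState> {
    let seen = Arc::new(Mutex::new(Vec::<TransportState>::new()));
    let sink = Arc::clone(&seen);
    let client = Client::connected();
    client.set_state_change_handler(Box::new(move |s: TransportState| {
        sink.lock().unwrap().push(s)
    }));
    client.shutdown();
    client.shutdown(); // second call is a no-op
    let recorded = seen.lock().unwrap().clone();
    recorded
}
```

One consequence of invoking the handler under the mutex is that, in this model, a handler must not call set_state_change_handler on the same client, since that would try to take the lock again.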


Error Handling in Stream Events

The RpcDispatcher and RpcRespondableSession track stream-level errors through the RpcStreamEvent::Error variant. These errors are distinct from transport disconnections: they represent protocol-level decode failures during frame reassembly.

RpcStreamEvent::Error Structure

Error Event Processing

When RpcDispatcher::read_bytes encounters a decode error:

  1. Error logged: Tracing output includes method ID, header, and request ID context
  2. Queue unaffected: Unlike response events, error events do not remove entries from rpc_request_queue
  3. Handler not invoked: Catch-all response handler processes the error but does not delete the queue entry

src/rpc/rpc_dispatcher.rs:187-206

Design Rationale: Error events do not automatically clean up queue entries because:

  • Partial streams may still be recoverable
  • Application code may need to inspect incomplete payloads
  • Explicit deletion via delete_rpc_request gives caller control

TODO in codebase: Consider auto-removing errored requests from queue or marking them with error state.

src/rpc/rpc_dispatcher.rs:205


Mutex Poisoning and Error Recovery

The RpcDispatcher uses Mutex to protect the rpc_request_queue. If a thread panics while holding this lock, the mutex becomes "poisoned" and subsequent lock attempts return an error. The dispatcher treats poisoned mutexes as fatal errors.

Poisoning Handling Strategy

In catch-all response handler:

src/rpc/rpc_dispatcher.rs:104-118

In read_bytes:

src/rpc/rpc_dispatcher.rs:367-370

Rationale: A poisoned queue indicates inconsistent shared state. Continuing could result in:

  • Incorrect request routing
  • Lost response data
  • Silent data corruption

The dispatcher fails fast to provide clear debugging signals rather than attempting partial recovery.
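
For readers unfamiliar with poisoning, the standard-library behaviour can be reproduced in isolation. The sketch below is independent of rust-muxio: lock_is_poisoned_after_panic is a hypothetical function, and the Vec<u32> merely stands in for a shared queue.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Poisons a mutex by panicking while its guard is held, then reports
/// whether a later `lock()` call returns an error.
fn lock_is_poisoned_after_panic() -> bool {
    let queue = Arc::new(Mutex::new(Vec::<u32>::new()));
    let worker_queue = Arc::clone(&queue);

    // The worker panics mid-update. Its guard is dropped during unwinding,
    // which marks the mutex as poisoned.
    let result = thread::spawn(move || {
        let mut guard = worker_queue.lock().unwrap();
        guard.push(1);
        panic!("simulated failure while holding the lock");
    })
    .join();
    assert!(result.is_err(), "the worker thread should have panicked");

    // Every subsequent lock attempt now yields Err(PoisonError).
    // Calling `.unwrap()` or `.expect(..)` here would propagate the failure.
    let poisoned = queue.lock().is_err();
    poisoned
}
```

The standard library also allows recovering the guard through PoisonError::into_inner, but doing so means accepting possibly inconsistent state, which is the risk the rationale above describes.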


Best Practices for Handling Transport Errors

1. Register State Change Handlers Early

Always register a state change handler before making RPC calls to ensure disconnection events are captured.

2. Handle Cancellation Errors Gracefully

Pending RPC calls will fail with RpcServiceError::TransportError containing ReadAfterCancel. Application code should distinguish these from service-level errors.
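
One way to make that distinction is to match on the error before deciding how to react. The sketch below uses local stand-in enums: CallError, DecodeError, NextStep, and next_step are hypothetical names, and the real RpcServiceError and FrameDecodeError definitions may be shaped differently.

```rust
/// Stand-in for the decode error carried by a transport failure.
#[derive(Debug, PartialEq)]
enum DecodeError {
    ReadAfterCancel,
}

/// Stand-in for the error returned by a failed RPC call.
#[derive(Debug, PartialEq)]
enum CallError {
    /// The connection failed; every pending call sees this.
    Transport(DecodeError),
    /// The remote service rejected or failed this one call.
    Service(String),
}

/// Hypothetical follow-up actions an application might take.
#[derive(Debug, PartialEq)]
enum NextStep {
    Reconnect,
    SurfaceToUser(String),
}

fn next_step(result: Result<Vec<u8>, CallError>) -> Option<NextStep> {
    match result {
        // The call succeeded; nothing to recover from.
        Ok(_) => None,
        // Cancelled by the transport: retrying on this connection is futile.
        Err(CallError::Transport(DecodeError::ReadAfterCancel)) => {
            Some(NextStep::Reconnect)
        }
        // Application-level failure: the connection itself is still usable.
        Err(CallError::Service(message)) => Some(NextStep::SurfaceToUser(message)),
    }
}
```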

3. Check Connection Before Making Calls

Use is_connected() to avoid starting RPC operations when transport is down.

extensions/muxio-tokio-rpc-client/src/rpc_client.rs:284-286

4. Understand Disconnect Timing

The send and receive loops detect disconnections independently:

  • Receive loop: Detects server-initiated disconnects immediately
  • Send loop: Detects errors when attempting to send data
  • Heartbeat: May detect connection issues if both loops are idle

Do not assume instant disconnection detection for all failure modes.


Differences from RPC Service Errors

Transport errors (this page, 7.2) differ from RPC service errors (page 7.1, RPC Service Errors) in critical ways:

| Aspect | Transport Errors | RPC Service Errors |
|---|---|---|
| Layer | Connection/framing layer | RPC protocol/application layer |
| Scope | Affects all pending requests | Affects single request |
| Recovery | Requires reconnection | Retry may succeed |
| Detection | WebSocket errors, frame decode failures | Method dispatch failures, serialization errors |
| Propagation | fail_all_pending_requests | Individual handler callbacks |
| Error Type | std::io::Error, FrameDecodeError | RpcServiceError variants |

When to use this page vs. page 7.1:

  • Use this page for connection failures, disconnections, stream cancellations
  • Use RPC Service Errors for method-not-found, parameter validation, handler panics
