Document Processing Saga

Wolverine Sagas

Contains the DocumentProcessingSaga, the master state machine that controls the asynchronous document generation process across distributed services.

Architecture

MILTON.DocumentGenerator uses Wolverine Sagas to orchestrate the processing of an entire document. The crucial characteristic of this Saga is that it does not own or modify any domain state.

Instead, the Saga operates as a pure orchestrator:

  • It maintains its own internal state (the queue of PendingBlocks, ProcessedCount, and TotalBlockCount).
  • It delegates all data fetching and database modifications back to the API.
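As a rough illustration of what "internal state only" means, the sketch below models the Saga's bookkeeping in Python. It is not the Wolverine saga class: the field names document_id, block_id, and block_type are assumptions made for the example, and only PendingBlocks, ProcessedCount, and TotalBlockCount come from the description above.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProcessableBlockRef:
    """A pointer to a block: an identifier and a type, no domain content."""
    block_id: str    # hypothetical field name
    block_type: str  # e.g. "Scanner" or "Requirement"


@dataclass
class SagaState:
    """Orchestration bookkeeping only; the document's data stays behind the API."""
    document_id: str  # hypothetical correlation key
    pending_blocks: deque = field(default_factory=deque)
    processed_count: int = 0
    total_block_count: int = 0

    @property
    def is_complete(self) -> bool:
        # Done when nothing is queued and every known block has been counted.
        return not self.pending_blocks and self.processed_count == self.total_block_count


# One document with a single Scanner block waiting to be dispatched.
state = SagaState(
    "doc-1",
    deque([ProcessableBlockRef("b1", "Scanner")]),
    total_block_count=1,
)
```

Nothing in this state describes the content of a section, requirement, or test case, which is why the Saga can be persisted and resumed without touching the relational database.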

DocumentProcessingSaga

  1. Initiation: The Saga is started when the API publishes a DocumentCreatedEvent. This event contains a flat list of ProcessableBlockRefs (which represent all the structural sections, requirements, or test cases in the document).
  2. Iteration: The Saga pops blocks off the queue one at a time and inspects the block type.
  3. Dispatch: Based on the BlockType, it sends a Prepare[Type]GenerationCommand (e.g., PrepareScannerGenerationCommand) to the API rather than to a worker.
  4. Transition: The Saga then waits. Once the API completes the generation loop (fetching data, calling the worker, applying the results), the API publishes a BlockProcessedEvent. The Saga consumes this event, increments its progress, and enqueues any dynamically spawned blocks (e.g., if one block spawned multiple sub-requirements).
  5. Real-time Notifications: Throughout its lifecycle, the Saga sends SignalR updates via the IUserNotifier so that clients can track progress.
  6. Completion: Once all blocks are processed, it publishes a DocumentGenerationCompletedEvent and sends a GeneratePdfCommand (handled back in the API) to compile the final document.
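The six steps above can be sketched as a small in-memory model. This is a simplified Python illustration under stated assumptions, not the Wolverine implementation: the class name, the method names, and the outbox list (standing in for RabbitMQ and IUserNotifier) are invented for the example, block refs are reduced to (id, type) tuples, and PrepareRequirementGenerationCommand is simply the Prepare[Type]GenerationCommand pattern applied to a Requirement block.

```python
from collections import deque


class DocumentProcessingSagaModel:
    """In-memory model of the six steps; `outbox` records what the saga would emit."""

    def __init__(self):
        self.pending = deque()
        self.processed_count = 0
        self.total_block_count = 0
        self.outbox = []  # (kind, name, payload) tuples

    def start(self, block_refs):
        # 1. Initiation: DocumentCreatedEvent carries a flat list of block refs.
        self.pending.extend(block_refs)
        self.total_block_count = len(block_refs)
        self.outbox.append(("notify", "Started", self.total_block_count))
        self._dispatch_next()

    def handle_block_processed(self, spawned_blocks=()):
        # 4. Transition: count the finished block, enqueue anything it spawned.
        self.processed_count += 1
        self.pending.extend(spawned_blocks)
        self.total_block_count += len(spawned_blocks)
        # 5. Notification after every processed block.
        progress = (self.processed_count, self.total_block_count)
        self.outbox.append(("notify", "Progress", progress))
        self._dispatch_next()

    def _dispatch_next(self):
        if self.pending:
            # 2. Iteration and 3. Dispatch: one block at a time, routed by type.
            block_id, block_type = self.pending.popleft()
            command = f"Prepare{block_type}GenerationCommand"
            self.outbox.append(("send", command, block_id))
        else:
            # 6. Completion: queue drained, hand off PDF compilation.
            self.outbox.append(("publish", "DocumentGenerationCompletedEvent", None))
            self.outbox.append(("send", "GeneratePdfCommand", None))
            self.outbox.append(("notify", "Completed", None))


saga = DocumentProcessingSagaModel()
saga.start([("b1", "Scanner"), ("b2", "Requirement")])
saga.handle_block_processed()                                # b1 finished
saga.handle_block_processed([("b3", "Requirement")])         # b2 finished, spawned b3
saga.handle_block_processed()                                # b3 finished
sends = [name for kind, name, _ in saga.outbox if kind == "send"]
```

In this model only one Prepare command is ever outstanding, which mirrors the "one at a time" iteration described in step 2; the real Saga additionally relies on Wolverine for message delivery and state persistence.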

Saga Orchestration Sequence

The following diagram illustrates how the DocumentProcessingSaga manages the complete lifecycle of a Document, delegating actual work to the API.

sequenceDiagram
    participant API as API
    participant Broker as RabbitMQ
    participant Saga as DocumentProcessingSaga
    participant Notifier as IUserNotifier (SignalR)

    API->>Broker: Publish DocumentCreatedEvent(ProcessableBlocks)
    Broker->>Saga: Start Saga
    Saga->>Notifier: NotifyProgressAsync(Started)
    
    loop For each Pending Block
        Saga->>Broker: Send Prepare[Type]GenerationCommand
        Note over API, Broker: API gathers data, delegates to Worker, applies results
        API->>Broker: Publish BlockProcessedEvent(Success, SpawnedBlocks)
        Broker->>Saga: Consume BlockProcessedEvent
        Saga->>Saga: Enqueue SpawnedBlocks (if any)
        Saga->>Notifier: NotifyProgressAsync(Increment)
    end
    
    Saga->>Broker: Publish DocumentGenerationCompletedEvent
    Saga->>Broker: Send GeneratePdfCommand
    Saga->>Notifier: NotifyCompletedAsync

Business Logic Intent

The rationale behind separating the Saga from the worker logic is to ensure resilience and separation of concerns:

  • Resilience: If the API restarts or the AI provider times out during block generation, the Saga simply remains waiting for the next BlockProcessedEvent, and the outstanding command can be retried. Wolverine stores the state of the document pipeline durably in PostgreSQL, so progress is not lost in between.
  • Message-Only Pipeline: By forcing the Saga to send commands to the API (which then uses S3 Claim-Checks to route to the workers), the Document Generator service never needs to know the schema of the relational database or directly manage EF Core entity lifecycles.
  • Dynamic Workloads: Certain document blocks (such as a Requirement block) may dynamically spawn new sub-blocks, for example 10 at once. The Saga accepts SpawnedBlocks in the BlockProcessedEvent, so the document can grow and be processed recursively in real time.
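The resilience and dynamic-workload points can be illustrated together with a load, handle, save cycle. The Python sketch below is only a model: the JSON string stands in for whatever Wolverine actually persists in PostgreSQL (the storage format is not described here), and the function name, dictionary keys, and block ids are assumptions made for the example.

```python
import json


def apply_block_processed(raw_state, spawned_blocks=()):
    """Load persisted state, handle one BlockProcessedEvent, and save it again.

    `raw_state` is a JSON string standing in for the durably stored saga row.
    """
    state = json.loads(raw_state)
    state["processed_count"] += 1
    # Spawned blocks join the queue and raise the total, so the document grows.
    state["pending"].extend([list(ref) for ref in spawned_blocks])
    state["total_block_count"] += len(spawned_blocks)
    return json.dumps(state)


# State as it might look while the only block of a document is in flight.
stored = json.dumps({"pending": [], "processed_count": 0, "total_block_count": 1})

# Any process could restart here; nothing but `stored` needs to survive.

# The Requirement block finishes and spawns two sub-blocks (hypothetical ids).
stored = apply_block_processed(
    stored, [("r1.1", "Requirement"), ("r1.2", "Requirement")]
)
state = json.loads(stored)
```

Because each event is handled against freshly loaded state, a pause of any length between messages changes nothing about the outcome, and spawned blocks are just more entries in the same queue.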
