Document Processing Saga
Wolverine Sagas
Contains the DocumentProcessingSaga, the master state machine that controls the asynchronous document generation process across distributed services.
Architecture
The MILTON.DocumentGenerator leverages Wolverine Sagas to orchestrate the processing of an entire document. The crucial characteristic of this Saga is that it does not own or modify any domain state.
Instead, the Saga operates as a pure orchestrator:
- It maintains its own internal state (the PendingBlocks queue, ProcessedCount, and TotalBlockCount).
- It delegates all data fetching and database modifications back to the API.
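As a rough illustration of how little the Saga needs to hold, the state can be modeled as a queue of block references plus two counters. This is a language-neutral sketch in Python, not the actual Wolverine saga class. The fields of ProcessableBlockRef (block_id, block_type) and the is_finished helper are assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProcessableBlockRef:
    # Assumed shape: an identifier plus the block type, never the block content.
    block_id: str
    block_type: str


@dataclass
class SagaState:
    # Orchestration bookkeeping only; domain data stays behind the API.
    pending_blocks: deque = field(default_factory=deque)
    processed_count: int = 0
    total_block_count: int = 0

    @property
    def is_finished(self) -> bool:
        # Hypothetical helper: done when nothing is queued and every block is counted.
        return not self.pending_blocks and self.processed_count == self.total_block_count


refs = [ProcessableBlockRef("b1", "Scanner"), ProcessableBlockRef("b2", "Scanner")]
state = SagaState(pending_blocks=deque(refs), total_block_count=len(refs))
```

Because the state carries only references and counters, persisting it is cheap and the Saga never needs the relational schema.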
DocumentProcessingSaga
- Initiation: The Saga starts when the API publishes a DocumentCreatedEvent. This event contains a flat list of ProcessableBlockRef entries, which represent all the structural sections, requirements, or test cases in the document.
- Iteration: The Saga pops blocks off the queue one at a time and inspects each block's type.
- Dispatch: Based on the BlockType, it issues a Prepare[Type]GenerationCommand (e.g., PrepareScannerGenerationCommand) directly to the API.
- Transition: The Saga then waits. Once the API completes the generation loop (fetching data → calling worker → applying data), the API publishes a BlockProcessedEvent. The Saga consumes this event, increments its progress, and enqueues any dynamically spawned blocks (e.g., if one block spawned multiple sub-requirements).
- Real-time Notifications: Throughout its lifecycle, the Saga sends SignalR updates via the IUserNotifier so that clients can track progress.
- Completion: Once all blocks are processed, it publishes a DocumentGenerationCompletedEvent and sends a GeneratePdfCommand (which is handled back in the API) to compile the final document.
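The lifecycle above can be sketched as a small state machine in which each handler returns the messages it would emit. This is a simplified Python model under stated assumptions, not the real Wolverine implementation: the class name DocumentProcessingSagaModel, the BlockRef fields, and the tuple-based message shapes are all hypothetical, and notifications are modeled as outgoing tuples rather than IUserNotifier calls.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class BlockRef:
    # Stands in for ProcessableBlockRef; both fields are assumed.
    block_id: str
    block_type: str


@dataclass
class DocumentProcessingSagaModel:
    pending_blocks: deque = field(default_factory=deque)
    processed_count: int = 0
    total_block_count: int = 0
    completed: bool = False

    def start(self, blocks):
        """Models handling DocumentCreatedEvent: queue all blocks, dispatch the first."""
        self.pending_blocks.extend(blocks)
        self.total_block_count = len(blocks)
        started = ("NotifyProgress", 0, self.total_block_count)
        return [started] + self._dispatch_next()

    def on_block_processed(self, spawned_blocks=()):
        """Models handling BlockProcessedEvent: count it, queue spawned blocks, move on."""
        self.processed_count += 1
        self.pending_blocks.extend(spawned_blocks)
        self.total_block_count += len(spawned_blocks)
        progress = ("NotifyProgress", self.processed_count, self.total_block_count)
        return [progress] + self._dispatch_next()

    def _dispatch_next(self):
        if self.pending_blocks:
            block = self.pending_blocks.popleft()
            # Command name follows the Prepare[Type]GenerationCommand pattern.
            return [(f"Prepare{block.block_type}GenerationCommand", block.block_id)]
        # Only one block is in flight at a time, so an empty queue means all done.
        self.completed = True
        return [
            ("DocumentGenerationCompletedEvent",),
            ("GeneratePdfCommand",),
            ("NotifyCompleted",),
        ]


saga = DocumentProcessingSagaModel()
first = saga.start([BlockRef("b1", "Scanner"), BlockRef("b2", "Scanner")])
second = saga.on_block_processed()
last = saga.on_block_processed()
```

In this model, `first` carries the initial progress notification plus the command for block b1, and `last` carries the completion event, the PDF command, and the final notification. Returning messages from handlers keeps the orchestrator free of side effects, which makes the flow easy to test in isolation.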
Saga Orchestration Sequence
The following diagram illustrates how the DocumentProcessingSaga manages the complete lifecycle of a Document, delegating actual work to the API.
sequenceDiagram
    participant API as API
    participant Broker as RabbitMQ
    participant Saga as DocumentProcessingSaga
    participant Notifier as IUserNotifier (SignalR)
    API->>Broker: Publish DocumentCreatedEvent(ProcessableBlocks)
    Broker->>Saga: Start Saga
    Saga->>Notifier: NotifyProgressAsync(Started)
    loop For each Pending Block
        Saga->>Broker: Send Prepare[Type]GenerationCommand
        Note over API, Broker: API gathers data, delegates to Worker, applies results
        API->>Broker: Publish BlockProcessedEvent(Success, SpawnedBlocks)
        Broker->>Saga: Consume BlockProcessedEvent
        Saga->>Saga: Enqueue SpawnedBlocks (if any)
        Saga->>Notifier: NotifyProgressAsync(Increment)
    end
    Saga->>Broker: Publish DocumentGenerationCompletedEvent
    Saga->>Broker: Send GeneratePdfCommand
    Saga->>Notifier: NotifyCompletedAsync
Business Logic Intent
The rationale behind separating the Saga from the worker logic is to ensure resilience and separation of concerns:
- Resilience: If the API restarts or the AI provider times out during a block generation, the Saga simply pauses at the current block and the step can be retried. Wolverine durably stores the state of the document pipeline in PostgreSQL.
- Message-Only Pipeline: By forcing the Saga to send commands to the API (which then uses S3 Claim-Checks to route to the workers), the Document Generator service never needs to know the schema of the relational database or directly manage EF Core entity lifecycles.
- Dynamic Workloads: Certain document blocks (like a Requirement block) might spawn 10 new sub-blocks dynamically. The Saga accepts SpawnedBlocks in the BlockProcessedEvent, allowing the document to grow and be processed recursively in real time.
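To make the dynamic-workload behavior concrete, here is a minimal Python sketch of a queue that grows while it is being drained. The drain and spawn functions and the block names are hypothetical, and appending spawned blocks to the end of the queue (FIFO order) is an assumption; the real Saga may order them differently.

```python
from collections import deque


def drain(initial_blocks, spawn):
    """Process blocks one at a time; spawn(block) returns the blocks it spawns."""
    pending = deque(initial_blocks)
    total = len(initial_blocks)
    processed = []
    while pending:
        block = pending.popleft()
        children = spawn(block)  # stands in for BlockProcessedEvent.SpawnedBlocks
        pending.extend(children)
        total += len(children)   # the total grows as blocks are discovered
        processed.append(block)
    return processed, total


def spawn(block):
    # Hypothetical rule: a requirement block spawns 10 sub-blocks, which spawn nothing.
    if block == "requirement":
        return [f"requirement/sub-{i}" for i in range(1, 11)]
    return []


processed, total = drain(["section", "requirement", "test-case"], spawn)
```

Here three initial blocks become 13 processed blocks, because the requirement block contributes 10 more. The loop terminates only because spawned sub-blocks eventually stop spawning, so progress reported as ProcessedCount over TotalBlockCount can move backwards whenever new blocks are enqueued.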