Mesh-aware debugging: systematic traversal and statespace replay for identifying resource allocation issues in distributed microservices

Debugging distributed microservice systems is inherently challenging due to their asynchronous behavior, nondeterminism, and dynamic resource allocation. This paper introduces a novel mesh-aware debugging methodology that adapts macrostep-based techniques from parallel computing to modern, cloud-native environments. Leveraging collective breakpoints and execution tree traversal, the approach enables deterministic replay and systematic exploration of state spaces in a service mesh environment. The proposed system extends existing capabilities by incorporating replay functionality, dynamic service discovery, and fine-grained communication control using code instrumentation. To validate the methodology, a test application modeling deadlock scenarios among competing processes was developed. Experimental results reveal how race conditions and timing variability affect deadlock emergence. The findings validate the effectiveness of macrostep debugging in identifying hard-to-reproduce faults, offering a powerful tool for diagnosing complex behaviors in distributed systems.
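To make the idea of execution tree traversal and deterministic replay concrete, the following is a minimal sketch and not the system described in the paper. It assumes a hypothetical workload of two processes (`P1`, `P2`) that acquire two resources (`A`, `B`) in opposite order; the names `PROGRAMS`, `enabled`, `step`, `explore`, and `replay` are illustrative only. Each scheduling decision is one branch of the tree, and a path (the sequence of scheduled processes) is enough to replay a run deterministically.

```python
from typing import Dict, List, Optional, Tuple

Op = Tuple[str, str]  # ("acquire" | "release", resource name)

# Hypothetical workload (an assumption, not taken from the paper):
# two processes take the same two resources in opposite order.
PROGRAMS: Dict[str, List[Op]] = {
    "P1": [("acquire", "A"), ("acquire", "B"), ("release", "B"), ("release", "A")],
    "P2": [("acquire", "B"), ("acquire", "A"), ("release", "A"), ("release", "B")],
}

Pcs = Dict[str, int]                  # next operation index per process
Owners = Dict[str, Optional[str]]     # current holder of each resource
Path = Tuple[str, ...]                # scheduling decisions taken so far


def enabled(pcs: Pcs, owners: Owners) -> List[str]:
    """Return the processes whose next operation can run in this state."""
    ready = []
    for name, ops in PROGRAMS.items():
        if pcs[name] == len(ops):
            continue  # process has finished
        kind, res = ops[pcs[name]]
        if kind == "acquire" and owners.get(res) is not None:
            continue  # blocked: resource is held by someone
        ready.append(name)
    return ready


def step(pcs: Pcs, owners: Owners, name: str) -> Tuple[Pcs, Owners]:
    """Execute the next operation of `name` and return the new state."""
    kind, res = PROGRAMS[name][pcs[name]]
    new_pcs = dict(pcs)
    new_pcs[name] += 1
    new_owners = dict(owners)
    new_owners[res] = name if kind == "acquire" else None
    return new_pcs, new_owners


def initial_state() -> Tuple[Pcs, Owners]:
    return {name: 0 for name in PROGRAMS}, {}


def explore(pcs: Pcs, owners: Owners, path: Path = ()) -> List[Path]:
    """Depth-first traversal of the execution tree; collect deadlocking paths."""
    ready = enabled(pcs, owners)
    if not ready:
        finished = all(pcs[n] == len(PROGRAMS[n]) for n in PROGRAMS)
        return [] if finished else [path]  # stuck but unfinished: deadlock
    deadlocks: List[Path] = []
    for name in ready:
        next_pcs, next_owners = step(pcs, owners, name)
        deadlocks += explore(next_pcs, next_owners, path + (name,))
    return deadlocks


def replay(path: Path) -> Tuple[Pcs, Owners]:
    """Re-execute a recorded path deterministically and return the final state."""
    pcs, owners = initial_state()
    for name in path:
        if name not in enabled(pcs, owners):
            raise ValueError(f"{name} is not enabled; path cannot be replayed")
        pcs, owners = step(pcs, owners, name)
    return pcs, owners


if __name__ == "__main__":
    for bad_path in explore(*initial_state()):
        print("deadlock after schedule:", bad_path, "->", replay(bad_path))
```

In this toy model the traversal reports every schedule that ends with both processes blocked, and `replay` reproduces the same blocked state from the recorded schedule alone. A real service mesh would add network delays, dynamic discovery, and instrumentation hooks that this sketch deliberately leaves out.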