Distributed Deep Graph Learning with DistDGL: A Case Study in a Cloud Environment

The goal of this work was to evaluate a state-of-the-art distributed graph neural network framework in order to gain deeper insight into its operation and to explore opportunities for advancing both graph neural network architectures and distributed training systems. The framework was tested in a community cloud environment using 1–4 virtual machines, varying batch sizes, and different datasets. Based on the evaluation, we designed a reference architecture that makes the framework easy to reproduce and apply in future research. The experiments show that increasing the batch size can significantly reduce training time, but at the cost of accuracy. In contrast, adding more virtual machines improves training speed without degrading accuracy; however, scalability depends strongly not only on the model and the infrastructure but also on the characteristics of the dataset: sparse graphs scale more effectively, while dense graphs are more challenging. Overall, the experiments successfully reproduced a state-of-the-art setup and demonstrated measurable speedup through distributed training, providing a foundation for future research.