Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/115059
| Title: | DeepZoning : re-accelerate CNN inference with zoning graph for heterogeneous edge cluster |
| Authors: | Wang, J; Ma, R; Yang, X; Qi, Q; Zhuang, Z; Wang, J; Liao, J; Guo, S |
| Issue Date: | Mar-2025 |
| Source: | ACM transactions on architecture and code optimization, Mar. 2025, v. 22, no. 1, 10 |
| Abstract: | Parallelizing CNN inference on heterogeneous edge clusters with data parallelism has gained popularity as a way to meet real-time requirements without sacrificing model accuracy. However, existing algorithms struggle to find the optimal parallel granularity for complex CNNs, whose structure is a directed acyclic graph (DAG) rather than a chain, and their parallel dimension is inflexible. Distributing the workload of modern CNNs across heterogeneous devices has also been proven to be an NP-hard problem. In this article, we introduce DeepZoning, a versatile and cooperative inference framework that combines both model and data parallelism to accelerate CNN inference. DeepZoning employs two algorithms at different levels: (1) a low-level Adaptive Workload Partition algorithm that uses linear programming and takes spatial and channel dimensions into optimization during the search for feature map distribution on heterogeneous devices, and (2) a high-level Model Partition algorithm that finds the optimal model granularity and organizes complex CNNs into sequential zones to balance communication and computation during execution. Our experimental evaluations show that DeepZoning is effective, achieving up to a 3.02× speed improvement on our experimental prototype compared to state-of-the-art algorithms. |
| Keywords: | Cooperative CNN inference; Edge computing; Graph partition; Model deployment |
| Publisher: | Association for Computing Machinery |
| Journal: | ACM transactions on architecture and code optimization |
| ISSN: | 1544-3566 |
| EISSN: | 1544-3973 |
| DOI: | 10.1145/3701995 |
| Rights: | This work is licensed under a Creative Commons Attribution International 4.0 License (https://creativecommons.org/licenses/by/4.0/). ©2025 Copyright held by the owner/author(s). The following publication Wang, J., Ma, R., Yang, X., Qi, Q., Zhuang, Z., Wang, J., Liao, J., & Guo, S. (2025). DeepZoning: Re-accelerate CNN Inference with Zoning Graph for Heterogeneous Edge Cluster. ACM Trans. Archit. Code Optim., 22(1), Article 10 is available at https://doi.org/10.1145/3701995. |
| Appears in Collections: | Journal/Magazine Article |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| 3701995.pdf | | 4.37 MB | Adobe PDF | View/Open |



