A deep reinforcement learning approach for online and concurrent 3D bin packing optimisation with bin replacement strategies

Tsang, YP; Mo, DY; Chung, KT; Lee, CKM

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/111321

DC Field	Value	Language
dc.contributor	Department of Industrial and Systems Engineering	en_US
dc.creator	Tsang, YP	en_US
dc.creator	Mo, DY	en_US
dc.creator	Chung, KT	en_US
dc.creator	Lee, CKM	en_US
dc.date.accessioned	2025-02-17T08:29:37Z	-
dc.date.available	2025-02-17T08:29:37Z	-
dc.identifier.issn	0166-3615	en_US
dc.identifier.uri	http://hdl.handle.net/10397/111321	-
dc.language.iso	en	en_US
dc.publisher	Elsevier BV	en_US
dc.subject	3D bin packing problem	en_US
dc.subject	Deep reinforcement learning	en_US
dc.subject	Dual bin strategy	en_US
dc.subject	Online optimisation	en_US
dc.subject	Robotic warehouse	en_US
dc.title	A deep reinforcement learning approach for online and concurrent 3D bin packing optimisation with bin replacement strategies	en_US
dc.type	Journal/Magazine Article	en_US
dc.identifier.volume	164	en_US
dcterms.abstract	In the realm of robotic palletisation, the quest for optimal space utilisation remains vital but also presents a critical challenge, particularly due to the constraints of decision complexity and the need for real-time decision-making without complete prior information. The widely adopted rule-based heuristic approaches are easy to use but fail to adapt dynamically to the complex and changing landscape of online 3D bin packing. This study is motivated by the need for a system that is both more agile and intelligent, capable of managing the intricacies of dual-bin scenarios and the variable inflow of items. This study introduces a novel deep reinforcement learning (DRL) optimiser, employing a double deep Q-network (DDQN) to obtain optimal packing policies in an online environment with two proposed bin replacement strategies. This approach surpasses the limitations of previous methods by facilitating the simultaneous management of multiple bins and enabling on-the-fly adjustments to decisions based on limited prior knowledge. In a case study involving a logistics company, the proposed optimiser demonstrated a significant improvement in average space utilisation across various lookahead scenarios, outperforming traditional heuristics in simulation experiments. The proposed optimiser contributes significantly to the economic and environmental sustainability of robotic warehouses, positioning itself as a cornerstone for the future of smart logistics.	en_US
dcterms.accessRights	embargoed access	en_US
dcterms.bibliographicCitation	Computers in industry, Jan. 2025, v. 164, 104202	en_US
dcterms.isPartOf	Computers in industry	en_US
dcterms.issued	2025-01	-
dc.identifier.eissn	1872-6194	en_US
dc.identifier.artn	104202	en_US
dc.description.validate	202502 bcch	en_US
dc.description.oa	Not applicable	en_US
dc.identifier.FolderNumber	a3407	-
dc.identifier.SubFormID	50066	-
dc.description.fundingSource	RGC	en_US
dc.description.fundingSource	Others	en_US
dc.description.fundingText	The Hong Kong Polytechnic University	en_US
dc.description.pubStatus	Published	en_US
dc.date.embargo	2027-01-31	en_US
dc.description.oaCategory	Green (AAM)	en_US
Appears in Collections:	Journal/Magazine Article

Open Access Information

Status	embargoed access
Embargo End Date	2027-01-31

Page views

35

Citations as of Apr 14, 2025