Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/115747
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | Department of Electrical and Electronic Engineering | en_US |
| dc.creator | Kwong, NW | en_US |
| dc.creator | Chan, YL | en_US |
| dc.creator | Tsang, SH | en_US |
| dc.creator | Huang, Z | en_US |
| dc.creator | Lam, KM | en_US |
| dc.date.accessioned | 2025-10-27T07:06:45Z | - |
| dc.date.available | 2025-10-27T07:06:45Z | - |
| dc.identifier.issn | 1520-9210 | en_US |
| dc.identifier.uri | http://hdl.handle.net/10397/115747 | - |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers | en_US |
| dc.rights | © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US |
| dc.rights | The following publication Kwong, N. W., Chan, Y. L., Tsang, S. H., Huang, Z., & Lam, K. M. (2025). "Multi-frame spatiotemporal feature and hierarchical learning approach for no-reference screen content video quality assessment" in IEEE Transactions on Multimedia, vol. 27, pp. 6235-6247 is available at https://doi.org/10.1109/TMM.2025.3599071. | en_US |
| dc.subject | No reference | en_US |
| dc.subject | Screen content video quality assessment | en_US |
| dc.subject | Temporal pyramid transformer | en_US |
| dc.title | Multi-frame spatiotemporal feature and hierarchical learning approach for no-reference screen content video quality assessment | en_US |
| dc.type | Journal/Magazine Article | en_US |
| dc.identifier.spage | 7632 | en_US |
| dc.identifier.epage | 7647 | en_US |
| dc.identifier.doi | 10.1109/TMM.2025.3599071 | en_US |
| dcterms.abstract | The rapid adoption of remote work, online conferencing, and shared-screen collaboration has significantly increased the usage of screen content videos (SCVs), creating a growing need for reliable quality assessment to maintain excellent quality of service. While several full-reference SCV quality assessment (SCVQA) methods have been proposed, their practical application is often limited by the unavailability of reference videos. Existing no-reference SCVQA (NR-SCVQA) methods rely on handcrafted features and focus solely on specific distortions and features, potentially limiting their generalization ability. Moreover, they fail to explore the underlying spatiotemporal information of SCVs, which could hinder their performance. In this work, we propose a novel deep learning-based NR-SCVQA model specifically tailored to capture the comprehensive spatiotemporal features of SCVs and thereby overcome these issues and the challenges posed by the SCVQA task. Our approach incorporates a dual-channel spatiotemporal convolutional neural network (DCST-CNN) module to extract both content-aware and edge-aware spatiotemporal quality features, which enables effective learning of spatiotemporal quality feature representations for the downstream SCVQA task. Building upon the DCST-CNN, we further propose a Temporal Pyramid Transformer (TPT) module to fuse spatiotemporal features across multiple temporal scales, enabling the model to capture both short-term and long-term temporal dependencies within an SCV for hierarchical learning. The proposed DCST-CNN and TPT modules work together to provide a robust and accurate NR-SCVQA framework. We conduct experiments on SCVQA databases to validate the effectiveness of our model, which outperforms existing state-of-the-art NR-SCVQA methods. The results demonstrate the strength and applicability of our approach in real-world SCVQA tasks. | en_US |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | IEEE transactions on multimedia, 2025, v. 27, p. 7632-7647 | en_US |
| dcterms.isPartOf | IEEE transactions on multimedia | en_US |
| dcterms.issued | 2025 | - |
| dc.identifier.scopus | 2-s2.0-105013314683 | - |
| dc.identifier.eissn | 1941-0077 | en_US |
| dc.description.validate | 202510 bchy | en_US |
| dc.description.oa | Accepted Manuscript | en_US |
| dc.identifier.SubFormID | G000286/2025-09 | - |
| dc.description.fundingSource | Self-funded | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.description.oaCategory | Green (AAM) | en_US |
| Appears in Collections: | Journal/Magazine Article | |
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| Kwong_Multi-Frame_Spatiotemporal_Feature.pdf | Pre-Published version | 3.35 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.



