Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/105635
DC Field: Value [Language]
dc.contributor: Department of Computing
dc.creator: Cai, S
dc.creator: Zuo, W
dc.creator: Davis, LS
dc.creator: Zhang, L
dc.date.accessioned: 2024-04-15T07:35:34Z
dc.date.available: 2024-04-15T07:35:34Z
dc.identifier.isbn: 978-3-030-01263-2
dc.identifier.isbn: 978-3-030-01264-9 (eBook)
dc.identifier.issn: 0302-9743
dc.identifier.uri: http://hdl.handle.net/10397/105635
dc.description: 15th European Conference, Munich, Germany, September 8-14, 2018 [en_US]
dc.language.iso: en [en_US]
dc.publisher: Springer [en_US]
dc.rights: © Springer Nature Switzerland AG 2018 [en_US]
dc.rights: This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/978-3-030-01264-9_12. [en_US]
dc.subject: Variational autoencoder [en_US]
dc.subject: Video summarization [en_US]
dc.title: Weakly-supervised video summarization using variational encoder-decoder and web prior [en_US]
dc.type: Conference Paper [en_US]
dc.identifier.spage: 193
dc.identifier.epage: 210
dc.identifier.volume: 11218
dc.identifier.doi: 10.1007/978-3-030-01264-9_12
dcterms.abstract: Video summarization is a challenging under-constrained problem because the underlying summary of a single video strongly depends on users’ subjective understandings. Data-driven approaches, such as deep neural networks, can deal with the ambiguity inherent in this task to some extent, but it is extremely expensive to acquire the temporal annotations of a large-scale video dataset. To leverage the plentiful web-crawled videos to improve the performance of video summarization, we present a generative modelling framework to learn the latent semantic video representations to bridge the benchmark data and web data. Specifically, our framework couples two important components: a variational autoencoder for learning the latent semantics from web videos, and an encoder-attention-decoder for saliency estimation of raw video and summary generation. A loss term to learn the semantic matching between the generated summaries and web videos is presented, and the overall framework is further formulated into a unified conditional variational encoder-decoder, called variational encoder-summarizer-decoder (VESD). Experiments conducted on the challenging datasets CoSum and TVSum demonstrate the superior performance of the proposed VESD to existing state-of-the-art methods. The source code of this work can be found at https://github.com/cssjcai/vesd. [An illustrative sketch of this pipeline appears after the record below.]
dcterms.accessRights: open access [en_US]
dcterms.bibliographicCitation: Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics), 2018, v. 11218, p. 193-210
dcterms.isPartOf: Lecture notes in computer science (including subseries Lecture notes in artificial intelligence and lecture notes in bioinformatics)
dcterms.issued: 2018
dc.identifier.scopus: 2-s2.0-85055675665
dc.relation.conference: European Conference on Computer Vision [ECCV]
dc.identifier.eissn: 1611-3349
dc.description.validate: 202402 bcch
dc.description.oa: Accepted Manuscript [en_US]
dc.identifier.FolderNumber: COMP-1043 [en_US]
dc.description.fundingSource: RGC [en_US]
dc.description.pubStatus: Published [en_US]
dc.identifier.OPUS: 13568017 [en_US]
dc.description.oaCategory: Green (AAM) [en_US]
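
The abstract above describes the VESD pipeline: a variational encoder that infers a latent code from frame features, an attention-based summarizer that scores frame saliency conditioned on that code, a decoder that reconstructs a video-level representation, and a semantic-matching loss against web videos. Below is a minimal, hedged sketch of that structure in PyTorch. It is not the authors' implementation (their source code is at https://github.com/cssjcai/vesd); all module names, dimensions, and loss weights here are assumptions for illustration only.

```python
# Illustrative sketch of a VESD-style model. NOT the authors' code; every
# name, dimension, and weight below is an assumption for exposition.
import torch
import torch.nn as nn
import torch.nn.functional as F

class VariationalEncoder(nn.Module):
    """Maps a sequence of frame features to a latent Gaussian q(z | video)."""
    def __init__(self, feat_dim=1024, hidden_dim=256, latent_dim=64):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden_dim, batch_first=True)
        self.to_mu = nn.Linear(hidden_dim, latent_dim)
        self.to_logvar = nn.Linear(hidden_dim, latent_dim)

    def forward(self, frames):                    # frames: (B, T, feat_dim)
        _, h = self.rnn(frames)                   # h: (1, B, hidden_dim)
        h = h.squeeze(0)
        return self.to_mu(h), self.to_logvar(h)

class Summarizer(nn.Module):
    """Attention over frames conditioned on z, yielding per-frame saliency."""
    def __init__(self, feat_dim=1024, latent_dim=64):
        super().__init__()
        self.scorer = nn.Sequential(
            nn.Linear(feat_dim + latent_dim, 128), nn.Tanh(), nn.Linear(128, 1))

    def forward(self, frames, z):
        z_rep = z.unsqueeze(1).expand(-1, frames.size(1), -1)
        logits = self.scorer(torch.cat([frames, z_rep], dim=-1)).squeeze(-1)
        saliency = torch.softmax(logits, dim=1)                        # (B, T)
        summary = torch.bmm(saliency.unsqueeze(1), frames).squeeze(1)  # (B, feat_dim)
        return saliency, summary

class Decoder(nn.Module):
    """Reconstructs a video-level feature from the summary and z."""
    def __init__(self, feat_dim=1024, latent_dim=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim + latent_dim, 512), nn.ReLU(),
            nn.Linear(512, feat_dim))

    def forward(self, summary, z):
        return self.net(torch.cat([summary, z], dim=-1))

def vesd_loss(recon, target, mu, logvar, summary, web_feat, beta=1.0, gamma=1.0):
    """ELBO-style objective plus a semantic-matching term against web videos.
    The beta/gamma weights are placeholders, not the paper's values."""
    rec = F.mse_loss(recon, target)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    match = 1.0 - F.cosine_similarity(summary, web_feat, dim=-1).mean()
    return rec + beta * kld + gamma * match

if __name__ == "__main__":
    enc, summ, dec = VariationalEncoder(), Summarizer(), Decoder()
    frames = torch.randn(2, 120, 1024)   # 120 frames of pre-extracted CNN features
    web_feat = torch.randn(2, 1024)      # pooled feature of a topic-related web video
    mu, logvar = enc(frames)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterization trick
    saliency, summary = summ(frames, z)
    recon = dec(summary, z)
    loss = vesd_loss(recon, frames.mean(dim=1), mu, logvar, summary, web_feat)
    loss.backward()
    print(f"loss = {loss.item():.4f}, saliency shape = {tuple(saliency.shape)}")
```

The reparameterization step keeps sampling differentiable, so the summarizer and variational encoder can be trained jointly; the cosine "match" term above is a stand-in for the paper's semantic-matching loss between generated summaries and web videos.
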
Appears in Collections: Conference Paper
Files in This Item:
File: Cai_Weakly-Supervised_Video_Summarization.pdf
Description: Pre-Published version
Size: 1.73 MB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Page views: 167 (last week: 4) as of Nov 9, 2025
Downloads: 101 as of Nov 9, 2025
SCOPUS™ citations: 12 as of Dec 19, 2025
Web of Science™ citations: 51 as of Dec 18, 2025

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.