Fast Encoding and Decoding for Implicit Video Representation

stp2y, October 16, 2024

[Submitted on 28 Sep 2024 (v1), last revised 15 Oct 2024 (this version, v2)]

Authors: Hao Chen and 3 other authors

Abstract: Despite the abundant availability and content richness for video data, its high-dimensionality poses challenges for video research. Recent advancements have explored the implicit representation for videos using neural networks, demonstrating strong performance in applications such as video compression and enhancement. However, the prolonged encoding time remains a persistent challenge for video Implicit Neural Representations (INRs). In this paper, we focus on improving the speed of video encoding and decoding within implicit representations. We introduce two key components: NeRV-Enc, a transformer-based hyper-network for fast encoding; and NeRV-Dec, a parallel decoder for efficient video loading. NeRV-Enc achieves an impressive speed-up of $\mathbf{10^4\times}$ by eliminating gradient-based optimization. Meanwhile, NeRV-Dec simplifies video decoding, outperforming conventional codecs with a loading speed $\mathbf{11\times}$ faster, and surpassing RAM loading with pre-decoded videos ($\mathbf{2.5\times}$ faster while being $\mathbf{65\times}$ smaller in size).

Submission history

From: Hao Chen
[v1] Sat, 28 Sep 2024 18:21:52 UTC (29,724 KB)
[v2] Tue, 15 Oct 2024 03:18:39 UTC (29,769 KB)
