We introduce Progressive Visual Token Compression (PVC) for large vision-language models (VLMs), which unifies visual inputs as videos and progressively compresses vision tokens across video ...
Abstract: Perceptual Video Compression (PVC) is a promising approach to enhancing compression efficiency. The Human Visual System (HVS) possesses many important perceptual characteristics, which can ...