2024 IEEE International Conference on Multimedia and Expo (ICME)

July 15 2024 to July 19 2024

Niagara Falls, ON, Canada

ISBN: 979-8-3503-9015-5

Table of Contents

Copyright Page
Welcome Message from the General Chairs
Joint edge detection learning for recurrent homography estimation
AdaStyleSpeech: A Fast Stylized Speech Synthesis Model Based on Adaptive Instance Normalization
A Patch-wise Adversarial Denoising Could Enhance the Robustness of Adversarial Training
Improving Transferability of Adversarial Examples with Adversaries Competition
Delve into Source and Target Collaboration in Semi-supervised Domain Adaptation for Semantic Segmentation
Powerful Lossy Compression for Noisy Images
Neighborhood-Adaptive Context Enhancement Learning For Scene Graph Generation
A Multimodal Transformer for Live Streaming Highlight Prediction
Towards Low-resource License Plate Recognition via Feature Shuffling
Build a Cross-modality Bridge for Image-to-Point Cloud Registration
Exploring Interactive Semantic Alignment for Efficient HOI Detection with Vision-language Model
PGDM: Multimodal Panoramic Image Generation with Diffusion Models
NID-SLAM: Neural Implicit Representation-based RGB-D SLAM In Dynamic Environments
TLVC: Temporal Bit-rate Allocation for Learned Video Compression
Noisy-Residual Continuous Diffusion Models for Real Image Denoising
MeshStyle: Text-driven Efficient and High-Quality 3D Mesh Stylization via Hypergraph Convolution
LR-MAE: Locate while Reconstructing with Masked Autoencoders for Point Cloud Self-supervised Learning
A Region-Growing Supervised Geometry-Weighted Transformer for Normal Estimation
Ultralight-weight Binary Neural Network with 1K Parameters for Image Super-Resolution
Overcoming Language Priors for Visual Question Answering Based on Knowledge Distillation
Contextual Interaction Enhancement Network for Smoke Detection
Improving Few-Shot Neural Radiance Field with Image Based Rendering
SVT: Spectral Video Transformer for Video Restoration in Under-Display Camera
Enhancing Zero-shot 3D Photography via Mesh-represented Image Inpainting
DSENet: An Object-Wise Density-Informed Coarse-to-Fine Object Detector for Aerial Image
Training-Free Semantic Video Composition via Pre-trained Diffusion Model
Structure-aware Residual-center Representation for Self-Supervised Open-set 3D Cross-modal Retrieval
FE-VAD: High-Low Frequency Enhanced Weakly Supervised Video Anomaly Detection
Enabling Practical and Pervasive Content Delivery from Emerging LEO Mega-Constellations
Three-Stage Temporal Deformable Network for Blurry Video Frame Interpolation
Beyond Global Cues: Unveiling the Power of Fine Details in Image Matching
Rethinking Image Deraining via Text-guided Detail Reconstruction
Multi-Stage Fusion for Event-based Multimodal Tracker
ToW3D: Consistency-aware Interactive Point-based Mesh Editing on GANs
Video Object Segmentation with Dynamic Query Modulation
Agnostic Feature Compression with Semantic Guided Channel Importance Analysis
Talking Portrait with Discrete Motion Priors in Neural Radiation Field
Robust Knowledge Distillation and Self-Contrast Reasoning for Debiased Visual Question Answering
PA-SAM: Prompt Adapter SAM for High-Quality Image Segmentation
Source-Free Domain Adaptation for Point Cloud Semantic Segmentation
Geo GCN: Geometric-based Graph CNN for Learning on Point Cloud
HQOD: Harmonious Quantization for Object Detection
ITportrait: Image-Text Coupled 3D Portrait Domain Adaptation
ETAU: Towards Emotional Talking Head Generation Via Facial Action Unit
CAM-Guided Translation for Unpaired Weakly-Supervised Medical Image Segmentation
ProTA: Probabilistic Token Aggregation for Text-Video Retrieval
ConfR: Conflict Resolving for Generalizable Deepfake Detection
Robust 3D Face Alignment with Multi-Path Neural Architecture Search
Document Image Dewarping Guided by 3D Geometry and Layout Priors
Multimodal Knowledge Graph Embeddings via Lorentz-based Contrastive Learning
Reconstructing Prototype From Contaminated Face With Variations Across Heterogeneous Domains
FNFORMER: A Transformer-Based Face Normal Estimator
Encoding Semantic Priors into the Weights of Implicit Neural Representation
Multi-modal Learnable Queries for Image Aesthetics Assessment
Two-Step Temporal Divisive Clustering for Unsupervised Action Segmentation
Decoupling Spatio-Temporal Network for Fine-Grained Temporal Action Segmentation
Chain-of-Thought Prompting for Demographic Inference with Large Multimodal Models
Learning Motion Priors with DETR for Visual Tracking
Hierarchically Aggregated Identification Transformer Network for Camouflaged Object Detection
Towards Omni-supervised Referring Expression Segmentation
Towards Real-world Continuous Super-Resolution: Benchmark and Method
HFF-Net: A High-Frequency Fidelity Model for Accelerated Parallel MRI Reconstruction
Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence Detection
X-ReID: Cross-Instance Transformer for Identity-Level Person Re-Identification
DepthRefiner: Adapting RGB Trackers to RGBD Scenes via Depth-Fused Refinement
FedDGP: Disentangling Global and Personal Models for Federated Learning
BFD: Binarized Frequency-enhanced Distillation for Vision Transformer
Effective and Efficient Few-shot Fine-tuning for Vision Transformers
MergeNet: Explicit Mesh Reconstruction from Sparse Point Clouds via Edge Prediction
One-Class HEVC Double Compression Detection with Same Coding Parameters
An Aesthetic-Guided Multimodal Framework for Video Summarization
Semantic Bridging and Feature Anchoring for Class Incremental Learning
STCMOT: Spatio-Temporal Cohesion Learning for UAV-Based Multiple Object Tracking
Parameter Efficient Fine-Tuning on Selective Parameters for Transformer-Based Pre-Trained Models
Enhancing Adversarial Transferability on Vision Transformer by Permutation-Invariant Attacks
Exploring 3D-aware Lifespan Face Aging via Disentangled Shape-Texture Representations
HpEIS: Learning Hand Pose Embeddings for Multimedia Interactive Systems
ICFRNet: Image Complexity Prior Guided Feature Refinement for Real-time Semantic Segmentation
Multi-scale Transformer with Prompt Learning for Remote Sensing Image Dehazing
Coarse-to-fine Alignment Makes Better Speech-image Retrieval
SSETPAN: Spatial-Spectral Enhanced Transformer based network for pansharpening