Ruixiang Xue (薛瑞翔)

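Research

Spatial Intelligence

AI-Based Point Cloud Compression · 3D Gaussian Splatting Compression · Implicit Neural Representation · Generative Novel View Synthesis for Street Scenes · AI-Assisted Photography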

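Projects

AI-Based Point Cloud Compression

Research and development of deep-learning point cloud compression algorithms targeting MPEG AI-PCC.

Designed Transformer networks tailored to point cloud data representations to improve the efficiency of spatial feature extraction.
Used the density changes introduced by point cloud downsampling to guide the selection of reconstruction modes.
Under public test conditions, achieved a 39% geometry compression gain and a 34% attribute compression gain over the baseline.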

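3D Gaussian Splatting Coding

Compact, progressive compression and region-adaptive quality control for 3D Gaussian Splatting scenes.

Proposed RecastGS, which reorganizes a pretrained 3DGS into a region-aware layered representation and combines it with prompt-driven target-region extraction and progressive distillation.
Proposed LayeredCGS, which achieves feed-forward layered 3DGS compression through cross-layer context modeling and produces a truncatable bitstream.
Achieved a 35% BD-Rate gain over the feed-forward baseline FCGS, and up to 2 dB higher ROI PSNR at comparable bitrates.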

ECCV26

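AI-Assisted Photography

A spatial reconstruction system for virtual photography: it generates a controllable 3D shooting space from a single photo, then applies generative neural rendering to improve the look of the photo.

Virtual studio: Marble 3DGS, SplatTransform, UE import, immersive shooting.

Person-scene compositing: scene splats combined with a standalone person 3DGS asset workflow.

Spatial reframing: reproduces Apple-like nearby-view reconstruction from a single photo.

Virtual Studio Pipeline: from a single reference image to a UE virtual studio, then to a photo produced by generative neural rendering.

Reference image (single photo) → Marble 3DGS (scene asset) → SplatTransform (preprocessing) → Unreal Engine (virtual studio) → Camera shooting (immersive roaming) → Person 3DGS (composited into the shot) → Neural rendering (photo-grade retouching)

Spatial Reframing Pipeline: an Apple-like reframing workflow for limited viewpoint changes.

Input photo (user image) → Single-image 3DGS (geometric prior) → Subject cues (mask + depth) → Camera reframing (small-range motion) → Confidence map (holes + alpha) → Generative inpainting (low-confidence regions) → Reframed photo (final view)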

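Education

Nanjing University

Ph.D. Student in Information and Communication Engineering

2021.09 - 2027.06

NJU Vision Lab, advisors: Prof. Zhan Ma and Researcher Tong Chen
Research interests: AI-based point cloud compression, 3D Gaussian Splatting compression, implicit neural representation, AI-assisted photography
Rated Excellent in the Ph.D. midterm assessment

Hangzhou Dianzi University

B.Eng. in Electronic Information Engineering

2017.09 - 2021.06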

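Work Experience

Geely

AI Center · Algorithm Intern

2026.05 - Present

Research on novel view extrapolation for street scenes, exploring mutual enhancement between feed-forward reconstruction models and generative models.

Use feed-forward street-scene reconstruction models to improve the geometric consistency of generative novel view synthesis models.
Use generative priors to refine the rendering-quality loss of feed-forward street-scene reconstruction methods, improving novel view extrapolation over large viewpoint changes.

OPPO

Research Institute · Algorithm Intern

2024.02 - 2024.11

Worked on AI-based point cloud compression algorithms and MPEG AI-PCC standardization.

Developed and evaluated AI-based point cloud compression algorithms for MPEG AI-PCC standardization.
Developed point cloud codec software and evaluated its compression performance on a range of test sequences.
Attended multiple MPEG international meetings, submitted 10 standardization proposals, and contributed to 5 invention patent applications.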

Publications

3D Gaussian Splatting Compression with Object Scalability

ECCV 2026 · European Conference on Computer Vision

Ruixiang Xue, Tong Chen, Zhan Ma

CCF-B · 3D Gaussian Splatting Compression

Paper Code

Abstract

We introduce a framework towards scalable, finer-grained object-level 3DGS compression. First, a post-training method named RecastGS is proposed to reorganize pretrained 3DGS into a layered representation and progressively distills cumulative submodels to improve rate-distortion efficiency. Leveraging multi-view SAM predictions from user click prompts, Gaussians are further partitioned into user-defined regions of interest (ROI), enabling region-adaptive quality control without retraining. Second, built upon this reorganized region-aware layered hierarchy, a feed-forward 3DGS compression method named LayeredCGS is proposed to compress position using a lightweight point cloud codec and attributes with a layer-wise context model to exploit cross-layer correlations. Extensive experiments show that LayeredCGS achieves 35% BD-Rate gain over the existing feed-forward method FCGS. With progressive distillation in RecastGS enabled, our method further outperforms most per-scene optimization methods. Moreover, the proposed method supports ROI-aware compression and flexible bitstream truncation, achieving up to 2 dB higher ROI PSNR at comparable bitrates compared with the uniform quality allocation baseline while enabling low-latency preview and progressive quality refinement. The code will be released at https://github.com/RuixiangXue/ScalableGSC.
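
The flexible bitstream truncation described in the abstract can be illustrated with a minimal sketch. It assumes the bitstream is a concatenation of layer payloads, each preceded by a 4-byte length header, so a decoder can stop after any whole layer. This container layout and the names `pack_layers`, `truncate`, and `unpack_layers` are hypothetical and are not taken from LayeredCGS.

```python
import struct

def pack_layers(layers):
    """Concatenate layer payloads, each prefixed by a 4-byte big-endian length."""
    return b"".join(struct.pack(">I", len(p)) + p for p in layers)

def truncate(bitstream, budget):
    """Return the longest prefix made of whole layers that fits in `budget` bytes."""
    pos = 0
    while pos + 4 <= len(bitstream):
        (size,) = struct.unpack_from(">I", bitstream, pos)
        end = pos + 4 + size
        if end > budget or end > len(bitstream):
            break  # the next layer would exceed the budget or is incomplete
        pos = end
    return bitstream[:pos]

def unpack_layers(bitstream):
    """Split a (possibly truncated) stream of whole layers back into payloads."""
    layers, pos = [], 0
    while pos + 4 <= len(bitstream):
        (size,) = struct.unpack_from(">I", bitstream, pos)
        layers.append(bitstream[pos + 4:pos + 4 + size])
        pos += 4 + size
    return layers

# Hypothetical payloads: one base layer and two enhancement layers.
stream = pack_layers([b"base", b"enh1-data", b"enh2"])
preview = unpack_layers(truncate(stream, 21))  # keeps the first two layers
```

Cutting at layer boundaries is what lets a low-latency preview decode from the base layer alone, with later layers refining quality as they arrive.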
BibTeX

@inproceedings{xue2026objectscalable3dgs, title={3D Gaussian Splatting Compression with Object Scalability}, author={Xue, Ruixiang and Chen, Tong and Ma, Zhan}, booktitle={European Conference on Computer Vision (ECCV)}, year={2026} }
A Versatile Point Cloud Compressor Using Universal Multiscale Conditional Coding – Part I: Geometry

TPAMI · IEEE Transactions on Pattern Analysis and Machine Intelligence

Vol. 47, No. 1, pp. 269-287, Jan. 2025 · DOI: 10.1109/TPAMI.2024.3462938

Jianqiang Wang, Ruixiang Xue, Jiaxin Li, Dandan Ding, Yi Lin, Zhan Ma

SCI Q1 · CCF-A · IF 18.6 · Co-first author · AI-Based Point Cloud Compression

Project Paper Code

Abstract

A universal multiscale conditional coding framework, Unicorn, is proposed to compress the geometry and attribute of any given point cloud. Geometry compression is addressed in Part I of this paper, while attribute compression is discussed in Part II. We construct the multiscale sparse tensors of each voxelized point cloud frame and properly leverage lower-scale priors in the current and (previously processed) temporal reference frames to improve the conditional probability approximation or content-aware predictive reconstruction of geometry occupancy in compression. Unicorn is a versatile, learning-based solution capable of compressing static and dynamic point clouds with diverse source characteristics in both lossy and lossless modes. Following the same evaluation criteria, Unicorn significantly outperforms standard-compliant approaches like MPEG G-PCC, V-PCC, and other learning-based solutions, yielding state-of-the-art compression efficiency while presenting affordable complexity for practical implementations.
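
The multiscale construction mentioned in the abstract can be sketched on plain voxel coordinates. The example below is a simplified illustration, assuming non-negative integer voxel coordinates and a fixed downscaling factor of 2. The helper names are hypothetical, and the learned conditional probability model that Unicorn applies on top of such scales is not shown.

```python
def downscale(voxels):
    """Halve the resolution: a parent voxel is occupied if any of its 8 children is."""
    return {(x // 2, y // 2, z // 2) for (x, y, z) in voxels}

def build_pyramid(voxels, num_scales):
    """Return occupied-voxel sets from the finest scale (index 0) to the coarsest."""
    pyramid = [set(voxels)]
    for _ in range(num_scales - 1):
        pyramid.append(downscale(pyramid[-1]))
    return pyramid

def occupancy_codes(children):
    """Map each occupied parent voxel to an 8-bit code of its occupied children."""
    codes = {}
    for (x, y, z) in children:
        parent = (x // 2, y // 2, z // 2)
        bit = ((x % 2) << 2) | ((y % 2) << 1) | (z % 2)  # child index 0..7
        codes[parent] = codes.get(parent, 0) | (1 << bit)
    return codes

voxels = {(0, 0, 0), (1, 1, 1), (2, 2, 2), (3, 2, 2)}
pyramid = build_pyramid(voxels, 3)
codes = occupancy_codes(pyramid[0])  # symbols to code, given the scale-1 parents
```

In a multiscale coder, the coarser scale is decoded first and serves as the prior for the occupancy symbols of the next finer scale.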
BibTeX

@article{wang2025unicorngeometry, title={A Versatile Point Cloud Compressor Using Universal Multiscale Conditional Coding -- Part I: Geometry}, author={Wang, Jianqiang and Xue, Ruixiang and Li, Jiaxin and Ding, Dandan and Yi, Lin and Ma, Zhan}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, volume={47}, number={1}, pages={269--287}, year={2025}, doi={10.1109/TPAMI.2024.3462938} }
NeRI: Implicit Neural Representation of LiDAR Point Cloud Using Range Image Sequence

ICASSP 2024 · IEEE International Conference on Acoustics, Speech, and Signal Processing

ICASSP 2024, pp. 8020-8024 · DOI: 10.1109/ICASSP48485.2024.10446596

Ruixiang Xue, Jiaxin Li, Tong Chen, Dandan Ding, Xun Cao, Zhan Ma

CCF-B · First author · Implicit Neural Representation · AI-Based Point Cloud Compression

Paper Code

Abstract

This paper proposes the NeRI, an implicit neural representation (INR) based LiDAR point cloud compressor. In NeRI, we first transform a sequence of 3D LiDAR frames into a 2D range image sequence through range image projection over time. Then, we employ a neural network conditioned on the temporal frame index and associated LiDAR sensor pose to fit input range images as closely as possible. The optimized network parameters, which implicitly represent the input LiDAR data, are later lossily compressed. NeRI decoder is then initialized using decoded parameters to generate range images for reconstructing the 3D LiDAR sequence accordingly. Extensive experimental results demonstrate the significant superiority of NeRI regarding the compression efficiency and decoding speed compared to state-of-the-art 2D and 3D compressors for LiDAR point cloud.
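
The range image projection step in the abstract can be sketched as a spherical projection of one LiDAR frame. This is a generic illustration, not NeRI's implementation: the function name, the image size, and the vertical field-of-view values are assumptions, and empty pixels are marked with 0.0.

```python
import math

def to_range_image(points, height, width, fov_up_deg, fov_down_deg):
    """Project 3D points (x, y, z) onto a height x width image of ranges."""
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    image = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the sensor origin has no defined direction
        azimuth = math.atan2(y, x)    # horizontal angle in [-pi, pi]
        elevation = math.asin(z / r)  # vertical angle in [-pi/2, pi/2]
        if not (fov_down <= elevation <= fov_up):
            continue  # outside the assumed vertical field of view
        col = int((azimuth + math.pi) / (2 * math.pi) * width)
        row = int((fov_up - elevation) / (fov_up - fov_down) * height)
        col = min(col, width - 1)
        row = min(row, height - 1)
        # Keep the nearest return when several points fall into one pixel.
        if image[row][col] == 0.0 or r < image[row][col]:
            image[row][col] = r
    return image

frame = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (0.0, 3.0, 0.0), (0.0, 0.0, 5.0)]
image = to_range_image(frame, height=4, width=10, fov_up_deg=15, fov_down_deg=-15)
```

Applying this per frame yields the 2D range image sequence that a network conditioned on frame index and sensor pose can then be fitted to.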
BibTeX

@inproceedings{xue2024neri, title={NeRI: Implicit Neural Representation of LiDAR Point Cloud Using Range Image Sequence}, author={Xue, Ruixiang and Li, Jiaxin and Chen, Tong and Ding, Dandan and Cao, Xun and Ma, Zhan}, booktitle={ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)}, pages={8020--8024}, year={2024}, doi={10.1109/ICASSP48485.2024.10446596} }
Poster
A Versatile Point Cloud Compressor Using Universal Multiscale Conditional Coding – Part II: Attribute

TPAMI · IEEE Transactions on Pattern Analysis and Machine Intelligence

Vol. 47, No. 1, pp. 252-268, Jan. 2025 · DOI: 10.1109/TPAMI.2024.3462945

Jianqiang Wang, Ruixiang Xue, Jiaxin Li, Dandan Ding, Yi Lin, Zhan Ma

SCI Q1 · CCF-A · IF 18.6 · Second author · AI-Based Point Cloud Compression

Project Paper Code

Abstract

A universal multiscale conditional coding framework, Unicorn, is proposed to code the geometry and attribute of any given point cloud. Attribute compression is discussed in Part II of this paper, while geometry compression is given in Part I of this paper. We first construct the multiscale sparse tensors of each voxelized point cloud attribute frame. Since attribute components exhibit very different intrinsic characteristics from the geometry element, e.g., 8-bit RGB color versus 1-bit occupancy, we process the attribute residual between lower-scale reconstruction and current-scale data. Similarly, we leverage spatially lower-scale priors in the current frame and (previously processed) temporal reference frame to improve the probability estimation of attribute intensity through conditional residual prediction in lossless mode or enhance the attribute reconstruction through progressive residual refinement in lossy mode for better performance. The proposed Unicorn is a versatile, learning-based solution capable of compressing a great variety of static and dynamic point clouds in both lossy and lossless modes. Following the same evaluation criteria, Unicorn significantly outperforms standard-compliant approaches like MPEG G-PCC, V-PCC, and other learning-based solutions, yielding state-of-the-art compression efficiency with affordable encoding/decoding runtime.
BibTeX

@article{wang2025unicornattribute, title={A Versatile Point Cloud Compressor Using Universal Multiscale Conditional Coding -- Part II: Attribute}, author={Wang, Jianqiang and Xue, Ruixiang and Li, Jiaxin and Ding, Dandan and Yi, Lin and Ma, Zhan}, journal={IEEE Transactions on Pattern Analysis and Machine Intelligence}, volume={47}, number={1}, pages={252--268}, year={2025}, doi={10.1109/TPAMI.2024.3462945} }
GRNet: Geometry Restoration for G-PCC Compressed Point Clouds Using Auxiliary Density Signaling

TVCG · IEEE Transactions on Visualization and Computer Graphics

Vol. 30, No. 10, pp. 6740-6753, Oct. 2024 · DOI: 10.1109/TVCG.2023.3336936

Gexin Liu, Ruixiang Xue, Jiaxin Li, Dandan Ding, Zhan Ma

SCI Q1 · CCF-A · IF 6.5 · Second author · AI-Based Point Cloud Compression

Paper

Abstract

The lossy Geometry-based Point Cloud Compression (G-PCC) inevitably impairs the geometry information of point clouds, which deteriorates the quality of experience (QoE) in reconstruction and/or misleads decisions in tasks such as classification. To tackle it, this work proposes GRNet for the geometry restoration of G-PCC compressed large-scale point clouds. By analyzing the content characteristics of original and G-PCC compressed point clouds, we attribute the G-PCC distortion to two key factors: point vanishing and point displacement. Visible impairments on a point cloud are usually dominated by an individual factor or superimposed by both factors, which are determined by the density of the original point cloud. To this end, we employ two different models for coordinate reconstruction, termed Coordinate Expansion and Coordinate Refinement, to attack the point vanishing and displacement, respectively. In addition, 4-byte auxiliary density information is signaled in the bitstream to assist the selection of Coordinate Expansion, Coordinate Refinement, or their combination. Before being fed into the coordinate reconstruction module, the G-PCC compressed point cloud is first processed by a Feature Analysis Module for multiscale information fusion, in which kNN-based Transformer is leveraged at each scale to adaptively characterize neighborhood geometric dynamics for effective restoration. Following the common test conditions recommended in the MPEG standardization committee, GRNet significantly improves the G-PCC anchor and remarkably outperforms state-of-the-art methods on a great variety of point clouds (e.g., solid, dense, and sparse samples) both quantitatively and qualitatively. Meanwhile, GRNet runs fairly fast and uses a smaller-size model when compared with existing learning-based approaches, making it attractive to industry practitioners.
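
The 4-byte auxiliary density signaling in the abstract can be illustrated with a small sketch. It assumes the density is carried as a single little-endian float32, which is one natural way to spend 4 bytes but is not confirmed by the abstract. The thresholds and the direction of the density-to-mode mapping in `select_mode` are purely illustrative assumptions, not GRNet's actual decision rule.

```python
import struct

def encode_density(density):
    """Serialize the density as 4 bytes (assumed format: little-endian float32)."""
    return struct.pack("<f", density)

def decode_density(payload):
    """Recover the density value from its 4-byte representation."""
    (density,) = struct.unpack("<f", payload)
    return density

def select_mode(density, low=0.25, high=0.75):
    """Pick a restoration mode from the signaled density.

    Assumption: denser clouds are treated as dominated by point vanishing
    (handled by expansion) and sparser ones by point displacement (handled by
    refinement). Both thresholds are hypothetical.
    """
    if density >= high:
        return "expansion"
    if density <= low:
        return "refinement"
    return "expansion+refinement"

payload = encode_density(0.5)           # encoder side: 4 bytes in the bitstream
mode = select_mode(decode_density(payload))  # decoder side
```

Because the density is computed from the original point cloud at the encoder, signaling it lets the decoder choose a mode without access to the original.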
BibTeX

@article{liu2024grnet, title={GRNet: Geometry Restoration for G-PCC Compressed Point Clouds Using Auxiliary Density Signaling}, author={Liu, Gexin and Xue, Ruixiang and Li, Jiaxin and Ding, Dandan and Ma, Zhan}, journal={IEEE Transactions on Visualization and Computer Graphics}, volume={30}, number={10}, pages={6740--6753}, year={2024}, doi={10.1109/TVCG.2023.3336936} }

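Honors

Nanjing University First-Class Academic Scholarship
2021 - 2025

Grand Prize, 12th Zhejiang Provincial College Students' Business Plan Competition
2020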