WildRefer #62

Open
linukc opened this issue Sep 23, 2024 · 0 comments

linukc commented Sep 23, 2024

We propose two novel datasets, STRefer and LifeRefer, which focus on large-scale, human-centric daily-life scenarios and come with abundant 3D object and natural language annotations.
For STRefer, we uniformly sampled 662 scenes from the original STCrowd dataset, spanning a total length of 65 minutes, and annotated 5,458 natural language descriptions for 3,581 subjects. A scene here means one frame of synchronized LiDAR point cloud and image. The content of each scene differs from the others because the capture location or time changes. We split the data into training and testing sets at a 4:1 ratio without data leakage. LifeRefer contains 25,380 natural language descriptions for 11,864 subjects across 3,172 scenes, with a total length of 103 minutes. Similarly, we split it into 14,650 training samples and 10,730 testing samples without data leakage.
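The summary does not say how leakage is avoided. One common approach, shown here only as a minimal sketch, is to assign whole capture sequences to one side of the split so that near-duplicate frames never appear in both training and testing. The field names (`sequence`, `frame`, `description`), the `split_by_group` helper, and the toy data are all hypothetical and are not taken from the STRefer or LifeRefer tooling.

```python
import random
from collections import defaultdict


def split_by_group(samples, group_key, train_ratio=0.8, seed=0):
    """Split samples so that every group lands entirely in train or in test.

    Assumption: grouping by capture sequence is what prevents leakage here;
    the original datasets may use a different criterion.
    """
    groups = defaultdict(list)
    for sample in samples:
        groups[group_key(sample)].append(sample)

    # Shuffle group ids deterministically, then cut at the requested ratio.
    group_ids = sorted(groups)
    random.Random(seed).shuffle(group_ids)
    n_train = round(len(group_ids) * train_ratio)

    train = [s for g in group_ids[:n_train] for s in groups[g]]
    test = [s for g in group_ids[n_train:] for s in groups[g]]
    return train, test


# Hypothetical toy data: 10 capture sequences with 10 annotated frames each.
samples = [
    {"sequence": f"seq_{i // 10}", "frame": i, "description": f"description {i}"}
    for i in range(100)
]

# train_ratio=0.8 corresponds to the 4:1 split mentioned for STRefer.
train, test = split_by_group(samples, lambda s: s["sequence"], train_ratio=0.8)

train_sequences = {s["sequence"] for s in train}
test_sequences = {s["sequence"] for s in test}
print(len(train), len(test), train_sequences.isdisjoint(test_sequences))
```

Splitting at the group level means the sample-level ratio is only approximately 4:1 when groups differ in size, which is the usual trade-off for a leakage-free split.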

Paper Project Code

@article{lin2023wildrefer,
  title={Wildrefer: 3d object localization in large-scale dynamic scenes with multi-modal visual data and natural language},
  author={Lin, Zhenxiang and Peng, Xidong and Cong, Peishan and Hou, Yuenan and Zhu, Xinge and Yang, Sibei and Ma, Yuexin},
  journal={arXiv preprint arXiv:2304.05645},
  year={2023}
}