AI Tool Profile

Spatial-information Guided Adaptive Context-aware Network for Efficient RGB-D Semantic Segmentation

Efficient RGB-D semantic segmentation, which plays a vital role in analyzing and recognizing environmental information, has received considerable attention in mobile robotics.

Website: github.com
Pricing model: Free
Starting price: Free

GitHub Link

The GitHub link is https://github.com/mvme-hbut/sgacnet

Introduction

This GitHub repository hosts SGACNet (Spatial-information Guided Adaptive Context-aware Network), a network for efficient RGB-D semantic segmentation that incorporates spatial-information guidance and adaptive context awareness. The branch is up to date with the main branch of the CyunXiong/SGACNet repository.

Content

We provide the weights for our selected ESANet-R34-NBt1D (with ResNet34 NBt1D backbones) on NYUv2, SUN RGB-D, and Cityscapes. Download and extract the models to ./trained_models.

Please navigate to the cloned directory. Note that we are using Python 3.7+, Torch 1.3.1, and torchvision 0.4.2.

ImageNet pretrained weights for our selected backbones on the above datasets can also be downloaded. They are stored in <dir>/trained_models/imagenet.

Note that some parameters are different for Cityscapes. Evaluation on SUN RGB-D is similar to NYUv2.

Yang Zhang, Chenyun Xiong, Junjie Liu, Xuhui Ye, and Guodong Sun. Spatial-information Guided Adaptive Context-aware Network for Efficient RGBD Semantic Segmentation[J]. IEEE Sensors Journal, 2023.
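The setup described above can be sanity-checked before running anything. The following is a minimal sketch, not part of the SGACNet repository: the helper name find_setup_problems is hypothetical, and it assumes that <dir> refers to the cloned directory, so that both ./trained_models and ./trained_models/imagenet live under it. It only checks the Python version and the two directories; it does not verify the Torch or torchvision versions or the contents of the weight files.

```python
import sys
from pathlib import Path


# Hypothetical helper for illustration; it is not provided by the repository.
def find_setup_problems(repo_dir):
    """Return a list of human-readable problems found in the local setup."""
    problems = []

    # The text states that Python 3.7+ is used.
    if sys.version_info < (3, 7):
        problems.append("Python 3.7+ is required")

    # Assumption: <dir> is the cloned directory, so both folders sit under it.
    root = Path(repo_dir)
    for sub in ("trained_models", "trained_models/imagenet"):
        if not (root / sub).is_dir():
            problems.append("missing directory: " + sub)

    return problems


if __name__ == "__main__":
    # Run from inside the cloned directory.
    for problem in find_setup_problems("."):
        print(problem)
```

An empty result only means the expected folders exist; whether the extracted weights match the chosen dataset still has to be checked by hand.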

Alternatives & Similar Tools

LongLLaMA: handles very long text contexts, up to 256,000 tokens

LongLLaMA is a large language model designed to handle very long text contexts, up to 256,000 tokens. It's based on OpenLLaMA and uses a technique called Focused Transformer (FoT) for training. The repository provides a smaller 3B version of LongLLaMA for free use. It can also be used as a replacement for LLaMA models with shorter contexts.

LAMA: Human motion data to realistic complex 3D model actions

LAMA combines a reinforcement learning framework with a motion matching algorithm. Reinforcement learning helps the model make appropriate decisions in various scenarios, while motion matching ensures that synthesized actions match real human actions. In addition, LAMA uses a motion editing framework based on manifold learning to cover the possible variations in interactions and operations.
