Manifold DivideMix: A Semi-Supervised Contrastive Learning Framework for Severe Label Noise

Deep neural networks achieve strong results on many tasks, but their performance degrades when training data contains noisy labels, leading to poor generalization on the test set.

GitHub Link

The GitHub link is https://github.com/fahim-f/manifolddividemix

Introduction

The "Manifold DivideMix" project on GitHub presents a semi-supervised contrastive learning framework designed to tackle label noise in training data. Conventional neural networks struggle with noisy labels, which leads to poor generalization. The proposed approach uses self-supervised training to create a meaningful embedding space for each sample, incorporating both in-distribution and out-of-distribution noisy samples. An iterative Manifold DivideMix algorithm is introduced to identify clean and noisy samples, while the MixEMatch algorithm enhances semi-supervised learning by applying mixup augmentation in both the input and hidden representations. Extensive experiments on synthetic and real-world datasets demonstrate the effectiveness of the method.
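To make the mixup idea concrete, below is a minimal NumPy sketch of the mixup operation the MixEMatch description refers to: two batches are combined with a coefficient drawn from a Beta distribution, and labels are mixed with the same coefficient. Applying the same combination to a hidden layer's activations instead of raw inputs is what "mixup in hidden representations" (manifold mixup) means. The function name, the `alpha` value, and the toy data are illustrative assumptions, not the project's actual code.

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(a, b, alpha=4.0, rng=rng):
    """Convex combination of two batches with lambda ~ Beta(alpha, alpha).

    DivideMix-style methods typically keep lambda >= 0.5 so the mixed
    sample stays closer to the first argument (the cleaner batch).
    Hypothetical helper for illustration only.
    """
    lam = rng.beta(alpha, alpha)
    lam = max(lam, 1.0 - lam)
    return lam * a + (1.0 - lam) * b, lam

# Toy batches standing in for inputs (or hidden activations) and one-hot labels.
x1, y1 = rng.normal(size=(4, 8)), np.eye(3)[[0, 1, 2, 0]]
x2, y2 = rng.normal(size=(4, 8)), np.eye(3)[[2, 2, 1, 0]]

x_mix, lam = mixup(x1, x2)
y_mix = lam * y1 + (1.0 - lam) * y2  # labels are mixed with the same lambda
```

In the manifold variant, `x1` and `x2` would be activations taken at a randomly chosen layer of the network rather than raw input images.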

Content

Fahimeh Fooladgar¹, Minh Nguyen Nhat To¹, Parvin Mousavi², Purang Abolmaesumi¹ (¹University of British Columbia). Codes will be uploaded soon ...

Alternatives & Similar Tools

LongLLaMA: handles very long text contexts, up to 256,000 tokens

LongLLaMA is a large language model designed to handle very long text contexts, up to 256,000 tokens. It's based on OpenLLaMA and uses a technique called Focused Transformer (FoT) for training. The repository provides a smaller 3B version of LongLLaMA for free use. It can also be used as a replacement for LLaMA models with shorter contexts.