MT4CrossOIE: Multi-stage Tuning for Cross-lingual Open Information Extraction

GitHub Link

The GitHub link is https://github.com/CSJianYang/Multilingual-Multimodal-NLP/tree/main/MT4CrossOIE

Introduction

Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages.

Content

Cross-lingual open information extraction aims to extract structured information from raw text across multiple languages. Previous work uses a shared cross-lingual pre-trained model to handle the different languages but underuses the potential of language-specific representations. In this paper, we propose an effective multi-stage tuning framework called MT4CrossOIE, designed to enhance cross-lingual open information extraction by injecting language-specific knowledge into the shared model. Specifically, the cross-lingual pre-trained model is first tuned in a shared semantic space (e.g., the embedding matrix) with the encoder fixed, and the other components are then optimized in the second stage. After sufficient training, we freeze the pre-trained model and tune multiple extra low-rank language-specific modules using a mixture-of-LoRAs for model-based cross-lingual transfer. In addition, we leverage two-stage prompting to encourage a large language model (LLM) to annotate the multilingual raw data for data-based cross-lingual transfer. The model is trained with multilingual objectives on our proposed dataset OpenIE4++ by combining the model-based and data-based transfer techniques. Experimental results on various benchmarks emphasize the importance of aggregating multiple plug-and-play language-specific modules and demonstrate the effectiveness of MT4CrossOIE in cross-lingual OIE.
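To make the mixture-of-LoRAs idea above concrete, here is a minimal PyTorch sketch (not the authors' implementation) of a frozen linear layer augmented with several low-rank, language-specific adapters whose updates are mixed by a learned gate. Class and parameter names such as MixtureOfLoRAsLinear, num_languages, and rank are illustrative assumptions, not taken from the MT4CrossOIE codebase.

```python
# A minimal sketch of a "mixture-of-LoRAs" layer: the shared pre-trained weight is
# frozen, and several low-rank, language-specific adapters are mixed by gate weights.
# All names here are illustrative assumptions, not the paper's actual code.
import torch
import torch.nn as nn


class MixtureOfLoRAsLinear(nn.Module):
    """Frozen linear layer plus a set of low-rank (LoRA) adapters, one per language."""

    def __init__(self, in_features: int, out_features: int, num_languages: int = 3,
                 rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        self.base.weight.requires_grad_(False)   # shared backbone stays frozen
        self.base.bias.requires_grad_(False)
        # Language-specific low-rank factors: A (down-projection) and B (up-projection).
        self.lora_A = nn.Parameter(torch.randn(num_languages, rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(num_languages, out_features, rank))
        self.scaling = alpha / rank
        # Learned gate that mixes the language-specific adapters (softmax over languages).
        self.gate = nn.Parameter(torch.zeros(num_languages))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, in_features)
        out = self.base(x)
        weights = torch.softmax(self.gate, dim=0)                  # (num_languages,)
        for lang in range(self.lora_A.shape[0]):
            delta = x @ self.lora_A[lang].T @ self.lora_B[lang].T  # low-rank update
            out = out + weights[lang] * self.scaling * delta
        return out


# Usage: only the LoRA factors and the gate receive gradients.
layer = MixtureOfLoRAsLinear(in_features=768, out_features=768, num_languages=3)
hidden = torch.randn(2, 16, 768)
print(layer(hidden).shape)  # torch.Size([2, 16, 768])
```

In this sketch only the LoRA factors and the gate are trainable, mirroring the description above of freezing the shared pre-trained model while tuning extra low-rank language-specific modules.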

Alternatives & Similar Tools

LongLLaMA: handles very long text contexts, up to 256,000 tokens

LongLLaMA is a large language model designed to handle very long text contexts, up to 256,000 tokens. It's based on OpenLLaMA and uses a technique called Focused Transformer (FoT) for training. The repository provides a smaller 3B version of LongLLaMA for free use. It can also be used as a replacement for LLaMA models with shorter contexts.