multilingual-nlp

This repository contains the code, data, and models for the paper "CrossSum: Beyond English-Centric Cross-Lingual Summarization for 1,500+ Language Pairs", published in the Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (ACL’23), July 9-14, 2023.

cross-lingual-summarization cross-lingual-transfer multilingual-nlp

Updated Mar 26, 2024
Python

DmitryRyumin / EMNLP-2023-Papers

EMNLP 2023 Papers: Explore cutting-edge research from EMNLP 2023, the premier conference for advancing empirical methods in natural language processing. Stay updated on the latest in machine learning, deep learning, and natural language processing with code included. ⭐ support NLP!

Updated May 18, 2024
Python

BatsResearch / LexC-Gen

Generate synthetic labeled data for extremely low-resource languages using bilingual lexicons.

multilingual sentiment-analysis topic-modeling synthetic-data synthetic-dataset-generation low-resource-languages lexicon-based multilingual-nlp llm

Updated Oct 3, 2024
Python

ArkS0001 / IIT-Bombay-Whisper-Hindi-ASR-Model-Machine-Learning-Intern

Whisper is an automatic speech recognition (ASR) system trained on 680,000 hours of multilingual and multitask supervised data collected from the web. We show that the use of such a large and diverse dataset leads to improved robustness to accents, background noise and technical language. Moreover, it enables transcription in multiple languages, as

multilingual voice-recognition openai whisper nlp-machine-learning speechtotext multilingual-nlp llm openai-whisper

Updated Apr 29, 2024
Jupyter Notebook

cisnlp / Glot500

Glot500: Scaling Multilingual Corpora and Language Models to 500 Languages -- ACL 2023

multilingual nlp natural-language-processing acl dataset glot xlm multilingual-models xlm-r multilingual-nlp glot500

Updated Apr 20, 2024
Python

cambridgeltl / prompt4bli

On Bilingual Lexicon Induction with Large Language Models (EMNLP 2023). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.

machine-translation prompt pytorch llama prompts zero-shot-learning mt5 bilingual-lexicon-extraction few-shot-learning multilingual-models multilingual-nlp low-resource-machine-translation bilingual-lexicon-induction word-translation in-context-learning large-language-models bilingual-dictionary-induction prompt-engineering prompting llms

Updated Aug 12, 2024
Python

harmonydata / harmony_r

R library for Harmony

nlp data-science natural-language-processing cran r ai psychology multilingual-nlp

Updated Oct 19, 2023
HTML

cambridgeltl / sail-bli

Self-Augmented In-Context Learning for Unsupervised Word Translation (ACL 2024). Keywords: Bilingual Lexicon Induction, Word Translation, Large Language Models, LLMs.

machine-translation prompt pytorch llama self-learning zero-shot-learning bilingual-lexicon-extraction few-shot-learning multilingual-models multilingual-nlp low-resource-machine-translation bilingual-lexicon-induction word-translation in-context-learning large-language-models bilingual-dictionary-induction prompt-engineering prompting llms llama2

Updated Aug 12, 2024
Python

negar-foroutan / multiLMs-lang-neutral-subnets

[EMNLP 2022] Discovering Language-neutral Sub-networks in Multilingual Language Models.

mt5 lottery-ticket-hypothesis mbert cross-lingual-transfer multilingual-language-models multilingual-nlp

Updated Apr 1, 2024
Python

BatsResearch / LexC-Gen-Data-Archive

Data Repository for LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons

multilingual sentiment-analysis topic-modeling synthetic-data multilingual-nlp

Updated Oct 3, 2024

longxudou / multispider

MultiSpider: Towards Benchmarking Multilingual Text-to-SQL Semantic Parsing

multilingual natural-language-processing japanese-language semantic-parsing german-language spanish-language text-to-sql french-language multilingual-nlp

Updated Mar 12, 2024
Python

e-hossam96 / CMU-CS11-737

Solutions of the CMU Multilingual Natural Language Processing Course

nlp natural-language-processing neural-network pytorch artificial-intelligence multilingual-nmt multilingual-nlp multilingual-sequence-labeling

Updated Jan 21, 2023
Shell

Rajarshi1001 / IITK-SemEval-2024-Task-1

IITK at SemEval Task 1: Semantic Textual Relatedness for African and Asian Languages

semantics transformer lexical-analysis google-distance contrastive-learning sentence-transformers multilingual-nlp tsdae

Updated Mar 27, 2024
Jupyter Notebook

Luci-MG / NLP-POSTagging-AutoCorrection

This project covers POS tagging using HMMs and neural networks (LSTM, BiLSTM) across multiple languages, and explores autocorrection methods with various n-gram models and error correction techniques.

nlp python3 postagging autocorrection multilingual-nlp

Updated Oct 3, 2024
Python

MaLA-LM / mala-500

MaLA-500: Massive Language Adaptation of Large Language Models

multilingual-nlp large-language-models

Updated Apr 24, 2024
Python

multilingual-nlp

Here are 42 public repositories matching this topic...

embeddings-benchmark / mteb

bigscience-workshop / xmtf

epfl-dlab / llm-latent-language

shijie-wu / crosslingual-nlp

FSoft-AI4Code / TheVault

