Detecting Online Polarization (SemEval-2026 Task 9)

Completed

NLP

Sep 2025 – Jan 2026

Overview

Multilingual polarization detection, framed as binary classification (Polarized vs. Non-Polarized) and evaluated with Macro-F1 so that both classes count equally regardless of their frequency.
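As a rough illustration of why Macro-F1 suits an imbalanced binary task, the sketch below computes it in plain Python and compares it with accuracy. The function names and the toy labels (1 = Polarized, 0 = Non-Polarized) are assumptions made for this example, not the task's official scorer or data.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 for one class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Convention assumed here: F1 is 0.0 when the class never appears.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy data (hypothetical): 8 Non-Polarized, 2 Polarized examples.
y_true = [0] * 8 + [1] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = [0] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}")                    # 0.800
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")    # 0.444
```

The majority-class predictor looks acceptable on accuracy but scores poorly on Macro-F1, because the F1 of the ignored class is zero and carries half the weight.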

Problem

Online polarization is a growing societal concern, but detecting it automatically across multiple languages remains challenging due to linguistic diversity and class imbalance.

Solution

A multi-stage pipeline from TF-IDF baselines through transformer models (XLM-R, mDeBERTa-v3, RemBERT) with focal loss, weighted sampling, and soft-voting ensembles.
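The focal loss mentioned above can be sketched for a single binary prediction as follows. This is a minimal, standard-library version for illustration only. In a transformer training loop the same formula would be applied to tensors. The defaults `gamma=2.0` and `alpha=0.25` are commonly cited values and are assumptions here, not the settings used in this project.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25, eps=1e-7):
    """Focal loss for one example.

    p     -- predicted probability of the positive class (Polarized)
    y     -- true label, 1 or 0
    gamma -- focusing parameter; 0 recovers alpha-weighted cross-entropy
    alpha -- weight for the positive class (1 - alpha for the negative class)
    """
    p = min(max(p, eps), 1.0 - eps)           # avoid log(0)
    p_t = p if y == 1 else 1.0 - p            # probability given to the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# An easy positive (confidently correct) vs. a hard positive (confidently wrong).
easy = binary_focal_loss(0.9, 1)
hard = binary_focal_loss(0.1, 1)
print(f"easy example loss: {easy:.5f}")
print(f"hard example loss: {hard:.5f}")
```

The `(1 - p_t) ** gamma` factor shrinks the loss of well-classified examples, so gradient signal concentrates on hard, often minority-class, examples.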

Highlights

• TF-IDF + Logistic Regression baseline
• Neural baselines: BiLSTM, BiLSTM + Attention with language identity features
• Transformers: XLM-R (base/large), InfoXLM, mDeBERTa-v3, RemBERT
• Imbalance strategies: focal loss, weighted sampling / inverse-frequency weighting
• Ensembling: soft voting / weighted combination across models
• Parameter-efficient tuning exploration (QLoRA-style) for compute constraints
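Two of the highlights above, inverse-frequency weighting and soft voting, can be sketched in a few lines of plain Python. The function names, the normalization `n / (k * count)`, the 0.5 decision threshold, and the toy probabilities are assumptions chosen for illustration. They do not reproduce the project's actual models or outputs.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weight n / (k * count), so rarer classes get larger weights."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def soft_vote(prob_lists, weights=None):
    """Weighted average of per-model P(Polarized) for each example.

    prob_lists -- one list of probabilities per model, all the same length
    weights    -- optional per-model weights; equal weighting if omitted
    """
    if weights is None:
        weights = [1.0] * len(prob_lists)
    total = sum(weights)
    n_examples = len(prob_lists[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, prob_lists)) / total
        for i in range(n_examples)
    ]


# Toy label distribution (hypothetical): 8 Non-Polarized, 2 Polarized.
class_weights = inverse_frequency_weights([0] * 8 + [1] * 2)
print(class_weights)  # {0: 0.625, 1: 2.5}

# Hypothetical probabilities from three models on two examples.
model_probs = [[0.9, 0.2], [0.6, 0.4], [0.3, 0.1]]
averaged = soft_vote(model_probs)
weighted = soft_vote(model_probs, weights=[2.0, 1.0, 1.0])
predictions = [1 if p >= 0.5 else 0 for p in averaged]
print(predictions)  # [1, 0]
```

The class weights can be used either to weight the loss or, expanded per example, as sampling probabilities for a weighted sampler. Soft voting averages probabilities rather than hard labels, so a confident model can outweigh two uncertain ones.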

Tech Stack

Python

HuggingFace Transformers

NLP

Multilingual ML

Related Projects

EchoSpace-AR

Detects surrounding sounds and renders a directional in-headset icon that moves toward the sound source to improve environmental awareness, especially for Deaf and hard-of-hearing users.

ML-Based Solutions for 6G THz Drone Communications

ML-based channel selection / capacity optimization for 6G THz-band drone networks (NTN), considering ultra-massive MIMO and MAC-level issues.

Sentiment Analysis on IMDB Reviews

Multi-model sentiment classification on 50k IMDB reviews using classical ML, ANN, and BERT.