Detecting Online Polarization (SemEval-2026 Task 9)
Overview
Multilingual polarization detection framed as binary classification (Polarized vs. Non-Polarized), evaluated with Macro-F1 so that both classes count equally despite class imbalance.
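To illustrate why Macro-F1 is a fairer metric than accuracy on imbalanced data, here is a minimal pure-Python sketch. The label encoding (1 = Polarized, 0 = Non-Polarized) and the toy data are assumptions for illustration, not taken from the task data.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # Convention: F1 is 0 when the class is never predicted and never present.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred, classes=(0, 1)):
    """Unweighted mean of per-class F1, so each class counts equally."""
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


def accuracy(y_true, y_pred):
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy data (assumed): 9 Non-Polarized examples, 1 Polarized example,
# and a degenerate classifier that always predicts Non-Polarized.
y_true = [0] * 9 + [1]
y_pred = [0] * 10

print(f"accuracy = {accuracy(y_true, y_pred):.3f}")
print(f"macro-F1 = {macro_f1(y_true, y_pred):.3f}")
```

On this toy data, the always-majority classifier reaches 0.9 accuracy but only about 0.47 Macro-F1, because its F1 on the Polarized class is zero.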
Problem
Online polarization is a growing societal concern, but detecting it automatically across multiple languages remains challenging due to linguistic diversity and class imbalance.
Solution
A multi-stage pipeline that progresses from TF-IDF baselines to transformer models (XLM-R, mDeBERTa-v3, RemBERT), using focal loss and weighted sampling to handle class imbalance and soft-voting ensembles to combine model predictions.
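As a rough sketch of the focal loss idea mentioned above, the following computes the binary focal loss for a single example in plain Python. The values gamma=2.0 and alpha=0.25 are common defaults from the focal loss literature and are assumptions here; the project's actual hyperparameters and framework code are not stated in this summary.

```python
import math


def binary_focal_loss(prob_pos, label, gamma=2.0, alpha=0.25):
    """Focal loss for one example.

    prob_pos: predicted probability of the positive (Polarized) class.
    label:    1 for Polarized, 0 for Non-Polarized (assumed encoding).
    gamma:    focusing parameter; gamma=0 reduces to weighted cross-entropy.
    alpha:    weight on the positive class; (1 - alpha) goes to the negative class.
    """
    eps = 1e-12
    # p_t is the probability the model assigned to the true class.
    p_t = prob_pos if label == 1 else 1.0 - prob_pos
    alpha_t = alpha if label == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # avoid log(0)
    # (1 - p_t)^gamma shrinks the loss of well-classified examples.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


easy = binary_focal_loss(0.9, 1)  # confident and correct
hard = binary_focal_loss(0.1, 1)  # confident and wrong
print(f"easy example loss: {easy:.5f}")
print(f"hard example loss: {hard:.5f}")
```

The modulating factor makes the hard example dominate the loss far more than it would under plain cross-entropy, which is why focal loss is often paired with imbalanced classification.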
Highlights
- TF-IDF + Logistic Regression baseline
- Neural baselines: BiLSTM, BiLSTM + Attention with language identity features
- Transformers: XLM-R (base/large), InfoXLM, mDeBERTa-v3, RemBERT
- Imbalance strategies: focal loss, weighted sampling / inverse-frequency weighting
- Ensembling: soft voting / weighted combination across models
- Parameter-efficient tuning exploration (QLoRA-style) for compute constraints
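Two of the highlights above, inverse-frequency weighting and soft voting, can be sketched in a few lines of plain Python. The function names, the "balanced" weight formula n / (k * count), the 0.5 decision threshold, and the toy probabilities are assumptions for illustration, not the project's actual code.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Per-class weights proportional to 1 / class frequency.

    Uses the common "balanced" formula n / (k * count), so rarer
    classes receive larger weights.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * cnt) for cls, cnt in counts.items()}


def soft_vote(model_probs, weights=None):
    """Weighted average of per-model P(Polarized) for each example.

    model_probs: one list of probabilities per model, all the same length.
    weights:     optional per-model weights; defaults to equal weighting.
    """
    if weights is None:
        weights = [1.0] * len(model_probs)
    total = sum(weights)
    n_examples = len(model_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total
        for i in range(n_examples)
    ]


# Toy labels (assumed): three Non-Polarized, one Polarized.
print(inverse_frequency_weights([0, 0, 0, 1]))

# Toy outputs from two hypothetical models on two examples.
model_a = [0.2, 0.8]
model_b = [0.6, 0.4]
avg = soft_vote([model_a, model_b], weights=[3.0, 1.0])
preds = [1 if p >= 0.5 else 0 for p in avg]  # 0.5 threshold is an assumption
print(avg, preds)
```

The class weights can feed either a weighted loss or a per-example sampling probability. The per-model ensemble weights would typically be chosen on a validation set, for example in proportion to each model's Macro-F1.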