Projects

  • Computer Vision

    Video Segmentation Tool

    Overview: A semi-automatic video segmentation tool that accelerates the preparation of segmentation ground truth for video sequences by combining GUI-based manual annotation with AI-driven label propagation.

    Technologies: Python, PyQt6, PyTorch, OpenCV, OSVOS (One-Shot Video Object Segmentation)

    Key Features:

    • Manual annotation tools (pencil, polygon)
    • Semi-automatic label propagation using Optical Flow and OSVOS
    • Frame-by-frame navigation with multi-label support
    • MVC architecture for scalable design

    Impact: Combines manual annotation capabilities with automated segmentation algorithms to significantly reduce manual labeling effort for video datasets.
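
    Code Sketch: A minimal illustration of the optical-flow propagation step described above: a binary annotation mask is warped from frame t to frame t+1 using OpenCV's Farneback dense flow. This is a simplified sketch rather than the tool's actual code; the function name and flow parameters are illustrative, and the OSVOS refinement stage is omitted.

      import cv2
      import numpy as np

      def propagate_mask(prev_frame, next_frame, prev_mask):
          """Warp the annotation mask of the previous frame onto the next frame."""
          prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
          next_gray = cv2.cvtColor(next_frame, cv2.COLOR_BGR2GRAY)
          # Backward flow: for each pixel of the next frame, where it came from in the previous one.
          flow = cv2.calcOpticalFlowFarneback(next_gray, prev_gray, None,
                                              0.5, 3, 15, 3, 5, 1.2, 0)
          h, w = prev_mask.shape[:2]
          grid_x, grid_y = np.meshgrid(np.arange(w), np.arange(h))
          map_x = (grid_x + flow[..., 0]).astype(np.float32)
          map_y = (grid_y + flow[..., 1]).astype(np.float32)
          # Nearest-neighbour sampling keeps the label values discrete.
          return cv2.remap(prev_mask, map_x, map_y, cv2.INTER_NEAREST)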

  • Computer Vision

    Skin Cancer Detection

    Overview: A comprehensive medical AI system for skin cancer detection using Swin Transformer with parameter-efficient bottleneck adapters. Supports both binary classification (benign vs malignant) and multiclass classification (HAM10000 7-class dataset) for clinical-grade dermatoscopic image analysis.

    Technologies: Python, PyTorch, Swin Transformer, Vision Transformer (ViT), Parameter-Efficient Fine-Tuning, Bottleneck Adapters, Medical Imaging

    Key Features:

    • Parameter-efficient fine-tuning with bottleneck adapters
    • Swin Transformer and ViT architectures for medical imaging
    • Binary and 7-class classification capabilities
    • HAM10000 dataset integration for dermatoscopic analysis

    Impact: Advances dermatological AI by combining efficient transfer learning techniques with state-of-the-art vision transformers for accurate skin cancer detection and classification in clinical settings.
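
    Code Sketch: An illustrative example of the bottleneck-adapter idea behind the parameter-efficient fine-tuning above. The class and parameter names are assumptions, and for brevity the adapter is applied to the pooled backbone features; in adapter-based PEFT the same block is normally inserted inside each transformer layer.

      import torch
      import torch.nn as nn

      class BottleneckAdapter(nn.Module):
          """Down-project -> non-linearity -> up-project, with a residual connection."""
          def __init__(self, dim: int, reduction: int = 16):
              super().__init__()
              self.down = nn.Linear(dim, dim // reduction)
              self.act = nn.GELU()
              self.up = nn.Linear(dim // reduction, dim)
              nn.init.zeros_(self.up.weight)   # start as an identity mapping
              nn.init.zeros_(self.up.bias)

          def forward(self, x: torch.Tensor) -> torch.Tensor:
              return x + self.up(self.act(self.down(x)))

      class AdapterClassifier(nn.Module):
          """Frozen pretrained backbone; only the adapter and head are trained."""
          def __init__(self, backbone: nn.Module, feat_dim: int, num_classes: int):
              super().__init__()
              self.backbone = backbone
              for p in self.backbone.parameters():
                  p.requires_grad = False          # freeze pretrained weights
              self.adapter = BottleneckAdapter(feat_dim)
              self.head = nn.Linear(feat_dim, num_classes)  # 2 for binary, 7 for HAM10000

          def forward(self, x):
              feats = self.backbone(x)             # assumed to return pooled (B, feat_dim) features
              return self.head(self.adapter(feats))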

  • Computer Vision

    Face Demography Analysis

    Overview: A real-time facial emotion detection system using YOLOv8 for face detection and FER for emotion analysis across 7 categories.

    Technologies: Python, YOLOv8, OpenCV, Facial Expression Recognition (FER), ONNX

    Key Features:

    • Real-time emotion recognition across 7 categories
    • Multiple input sources (images, videos, webcam)
    • Two-stage detection pipeline
    • Command-line interface with flexible parameters

    Applications: Combines YOLOv8 face detection with emotion recognition on real-time video streams, making it suitable for demographic analysis and human-computer interaction.
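
    Code Sketch: A minimal sketch of the two-stage pipeline, assuming a YOLOv8 face-detection checkpoint named yolov8n-face.pt and the open-source fer package; both names are illustrative stand-ins rather than the project's exact configuration.

      import cv2
      from ultralytics import YOLO
      from fer import FER

      face_model = YOLO("yolov8n-face.pt")   # stage 1: face detection (checkpoint name assumed)
      emotion_model = FER()                  # stage 2: 7-category emotion recognition

      cap = cv2.VideoCapture(0)              # webcam; an image or video path also works
      while cap.isOpened():
          ok, frame = cap.read()
          if not ok:
              break
          for box in face_model(frame, verbose=False)[0].boxes.xyxy.cpu().numpy():
              x1, y1, x2, y2 = box.astype(int)
              crop = frame[y1:y2, x1:x2]
              result = emotion_model.detect_emotions(crop)
              if result:
                  emotions = result[0]["emotions"]         # e.g. {'happy': 0.91, ...}
                  label = max(emotions, key=emotions.get)
                  cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                  cv2.putText(frame, label, (x1, y1 - 5),
                              cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
          cv2.imshow("emotions", frame)
          if cv2.waitKey(1) & 0xFF == ord("q"):
              break
      cap.release()
      cv2.destroyAllWindows()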

  • Signal Processing

    Baseline Models for EEG Analysis

    Overview: A comprehensive framework for EEG signal analysis and classification, providing baseline implementations of classical machine learning models for the research community.

    Technologies: Python, NumPy, SciPy, scikit-learn, PyTorch, MNE-Python, XGBoost

    Key Features:

    • Support for multiple EEG datasets (CHB-MIT, BCI Competition 2a, LEE, Klinik)
    • Classical ML models with signal processing techniques
    • Modular architecture with cross-validation
    • Integration with Weights & Biases for experiment tracking

    Impact: Provides a standardized framework for EEG signal processing tasks, including epilepsy detection and motor imagery classification, with reproducible baseline performance metrics across datasets.
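
    Code Sketch: An illustrative baseline in the spirit described above: log band-power features computed with Welch's method and classified by an SVM under 5-fold cross-validation. The synthetic array stands in for a real dataset such as BCI Competition 2a; shapes and band limits are assumptions.

      import numpy as np
      from scipy.signal import welch
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.svm import SVC
      from sklearn.model_selection import cross_val_score

      BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13), "beta": (13, 30)}

      def band_power_features(epochs: np.ndarray, sfreq: float) -> np.ndarray:
          """epochs: (n_epochs, n_channels, n_samples) -> (n_epochs, n_channels * n_bands)."""
          freqs, psd = welch(epochs, fs=sfreq, nperseg=int(sfreq), axis=-1)
          feats = []
          for lo, hi in BANDS.values():
              mask = (freqs >= lo) & (freqs < hi)
              feats.append(np.log(psd[..., mask].mean(axis=-1) + 1e-12))
          return np.concatenate(feats, axis=-1)

      # Synthetic stand-in data: 100 two-second epochs, 22 channels at 250 Hz.
      rng = np.random.default_rng(0)
      X = band_power_features(rng.standard_normal((100, 22, 500)), sfreq=250)
      y = rng.integers(0, 2, size=100)

      clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
      print(cross_val_score(clf, X, y, cv=5).mean())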

  • Signal Processing

    Vibration Noise Detection

    Overview: A signal processing system for detecting and analyzing vibration-induced noise patterns using machine learning techniques and acoustic signal analysis.

    Technologies: Python, Signal Processing, Machine Learning, Audio Analysis, Spectral Analysis

    Key Features:

    • Vibration pattern recognition and classification
    • Spectral analysis for noise characterization
    • Real-time signal processing capabilities
    • Anomaly detection in mechanical systems

    Applications: Enables predictive maintenance and fault detection in mechanical systems by analyzing vibration signatures and identifying abnormal noise patterns for industrial monitoring.
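
    Code Sketch: An illustrative sketch of the spectral-analysis plus anomaly-detection idea above, not the project's code: per-window spectrogram band energies are fed to an IsolationForest to flag abnormal vibration signatures. The sampling rate and synthetic signals are assumptions.

      import numpy as np
      from scipy.signal import spectrogram
      from sklearn.ensemble import IsolationForest

      def spectral_features(signal: np.ndarray, fs: float, n_bands: int = 16) -> np.ndarray:
          """Group each STFT window's spectrum into coarse bands and return log energies."""
          freqs, times, sxx = spectrogram(signal, fs=fs, nperseg=1024, noverlap=512)
          bands = np.array_split(sxx, n_bands, axis=0)          # group frequency bins
          return np.log(np.stack([b.sum(axis=0) for b in bands], axis=1) + 1e-12)

      fs = 10_000                                    # assumed 10 kHz sensor signal
      t = np.arange(0, 10, 1 / fs)
      healthy = np.sin(2 * np.pi * 120 * t) + 0.1 * np.random.randn(t.size)
      faulty = healthy + 0.5 * np.sin(2 * np.pi * 900 * t)   # extra high-frequency component

      detector = IsolationForest(contamination=0.05, random_state=0)
      detector.fit(spectral_features(healthy, fs))
      flags = detector.predict(spectral_features(faulty, fs))  # -1 marks anomalous windows
      print(f"{(flags == -1).mean():.0%} of windows flagged as anomalous")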

  • Signal Processing

    Self-supervised EEG Embedding

    Overview: A framework that learns meaningful EEG representations without labeled data, across multiple tasks and datasets, using self-supervised learning techniques.

    Technologies: PyTorch Lightning, Python, Neural Networks, EEG Signal Processing

    Key Features:

    • Self-supervised learning framework for EEG
    • Task-agnostic embeddings for multiple datasets
    • Modular neural architecture with "Minion Networks"
    • Cross-validation support for robust evaluation

    Impact: Creates transferable embeddings across different neurological signal analysis tasks using auxiliary tasks like temporal context prediction and channel reconstruction.
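
    Code Sketch: A minimal PyTorch Lightning illustration of one pretext task mentioned above, channel reconstruction: one EEG channel is masked and an encoder/decoder is trained to recover it. Layer sizes and class names are assumptions and do not reproduce the project's "Minion Network" architecture.

      import torch
      import torch.nn as nn
      import pytorch_lightning as pl

      class ChannelReconstruction(pl.LightningModule):
          def __init__(self, n_channels: int = 22, n_samples: int = 500, emb_dim: int = 128):
              super().__init__()
              self.encoder = nn.Sequential(              # produces the reusable embedding
                  nn.Conv1d(n_channels, 64, kernel_size=7, padding=3), nn.GELU(),
                  nn.Conv1d(64, emb_dim, kernel_size=7, padding=3), nn.GELU(),
                  nn.AdaptiveAvgPool1d(1), nn.Flatten(),
              )
              self.decoder = nn.Linear(emb_dim, n_channels * n_samples)
              self.n_channels, self.n_samples = n_channels, n_samples

          def training_step(self, batch, batch_idx):
              x = batch[0] if isinstance(batch, (list, tuple)) else batch  # (B, C, T)
              masked = x.clone()
              ch = torch.randint(0, self.n_channels, (1,)).item()
              masked[:, ch, :] = 0.0                       # hide one channel
              recon = self.decoder(self.encoder(masked)).view_as(x)
              loss = nn.functional.mse_loss(recon[:, ch], x[:, ch])  # reconstruct the hidden channel
              self.log("train_loss", loss)
              return loss

          def configure_optimizers(self):
              return torch.optim.Adam(self.parameters(), lr=1e-3)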

  • Signal Processing

    EEG-LTENT

    Overview: EEG-LTENT (Learned Task-agnostic Embeddings for Neural Time-series) provides a foundation model approach for EEG signal analysis. Instead of training task-specific models from scratch, this framework learns universal EEG representations that can be fine-tuned for various downstream applications.

    Technologies: Python, PyTorch, Conformer Architecture, Vector Quantization, Masked Autoencoding, Self-Supervised Learning, CNN-Transformer Hybrid

    Key Features:

    • Task-agnostic learning: Pre-train once, adapt to many tasks (classification, regression, clustering)
    • Self-supervised approach using masked autoencoding on unlabeled EEG data
    • Discrete representations through vector quantization for interpretable embedding spaces
    • Universal embeddings that transfer across datasets, tasks, and subjects

    Impact: Revolutionizes EEG analysis by creating a foundation model that learns from unlabeled data and transfers knowledge across different neurological tasks, significantly improving performance on limited labeled datasets.
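
    Code Sketch: A generic VQ-VAE-style illustration of the vector-quantization step that produces discrete EEG token representations (nearest-codebook lookup with a straight-through estimator). Class names, codebook size, and dimensions are assumptions, not the EEG-LTENT implementation.

      import torch
      import torch.nn as nn
      import torch.nn.functional as F

      class VectorQuantizer(nn.Module):
          def __init__(self, codebook_size: int = 512, dim: int = 128, beta: float = 0.25):
              super().__init__()
              self.codebook = nn.Embedding(codebook_size, dim)
              nn.init.uniform_(self.codebook.weight, -1 / codebook_size, 1 / codebook_size)
              self.beta = beta

          def forward(self, z: torch.Tensor):
              """z: (B, T, dim) continuous encoder outputs -> (quantized, code indices, VQ loss)."""
              flat = z.reshape(-1, z.shape[-1])                         # (B*T, dim)
              idx = torch.cdist(flat, self.codebook.weight).argmin(-1)  # nearest codebook entry
              idx = idx.view(z.shape[:-1])                              # (B, T)
              z_q = self.codebook(idx)                                  # (B, T, dim)
              # Codebook and commitment losses; straight-through estimator keeps encoder gradients.
              loss = F.mse_loss(z_q, z.detach()) + self.beta * F.mse_loss(z, z_q.detach())
              z_q = z + (z_q - z).detach()
              return z_q, idx, loss

      # Example: quantize a batch of 8 sequences of 50 embedded EEG patches.
      z = torch.randn(8, 50, 128)
      z_q, codes, vq_loss = VectorQuantizer()(z)
      print(z_q.shape, codes.shape, vq_loss.item())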

  • Computer Vision

    Image Captioning

    Overview: A comprehensive image captioning system that automatically generates descriptive text for images using deep learning techniques and attention mechanisms.

    Technologies: Python, PyTorch, Computer Vision, Natural Language Processing, CNN, RNN, Attention

    Key Features:

    • Encoder-decoder architecture with attention mechanism
    • CNN-based image feature extraction
    • RNN-based text generation
    • Multi-modal learning approach

    Impact: Bridges computer vision and natural language processing to generate human-like descriptions of visual content for accessibility and automated content description applications.
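
    Code Sketch: An illustrative additive (Bahdanau-style) attention block of the kind used in such encoder-decoder captioners: at each decoding step the RNN's hidden state scores the CNN's spatial features and a weighted context vector is produced. Dimensions and names are assumptions, not the project's exact model.

      import torch
      import torch.nn as nn

      class AdditiveAttention(nn.Module):
          def __init__(self, feat_dim: int, hidden_dim: int, attn_dim: int = 256):
              super().__init__()
              self.w_feat = nn.Linear(feat_dim, attn_dim)
              self.w_hidden = nn.Linear(hidden_dim, attn_dim)
              self.v = nn.Linear(attn_dim, 1)

          def forward(self, features: torch.Tensor, hidden: torch.Tensor):
              """features: (B, num_regions, feat_dim) from the CNN encoder,
              hidden: (B, hidden_dim) decoder state -> context (B, feat_dim), weights (B, num_regions)."""
              scores = self.v(torch.tanh(self.w_feat(features) +
                                         self.w_hidden(hidden).unsqueeze(1)))  # (B, R, 1)
              alpha = torch.softmax(scores, dim=1)
              context = (alpha * features).sum(dim=1)
              return context, alpha.squeeze(-1)

      # Example: 49 spatial regions (7x7 grid) of 2048-d CNN features, 512-d decoder state.
      ctx, weights = AdditiveAttention(2048, 512)(torch.randn(4, 49, 2048), torch.randn(4, 512))
      print(ctx.shape, weights.shape)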

  • NLP

    PDF-QA using LLM

    Overview: A Python application that processes PDF documents and extracts structured information using Large Language Models, designed specifically for medical reports.

    Technologies: OpenAI GPT, Meta's LLaMA, Python, PyTorch, Hugging Face Transformers, LangChain

    Key Features:

    • Dual LLM support (GPT and LLaMA)
    • Automated PDF content extraction with structured output
    • Batch processing capabilities
    • Resource usage monitoring

    Impact: Parses PDF files with large language models to extract patient information, test results, and panel summaries, supporting flexible LLM backends and structured CSV/JSON output.
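
    Code Sketch: A minimal sketch of the GPT extraction path, assuming pypdf for text extraction and the OpenAI chat API; the model name, file name, and field list are illustrative, and the project additionally supports a LLaMA backend.

      import json
      from pypdf import PdfReader
      from openai import OpenAI

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      def extract_report(pdf_path: str) -> dict:
          """Extract structured fields from a medical-report PDF via an LLM."""
          text = "\n".join(page.extract_text() or "" for page in PdfReader(pdf_path).pages)
          prompt = (
              "Extract patient information, test results and panel summaries from the "
              "medical report below. Respond with a JSON object using the keys "
              "'patient', 'results', 'summary'.\n\n" + text
          )
          response = client.chat.completions.create(
              model="gpt-4o-mini",                      # illustrative model choice
              messages=[{"role": "user", "content": prompt}],
              response_format={"type": "json_object"},
          )
          return json.loads(response.choices[0].message.content)

      if __name__ == "__main__":
          print(extract_report("report.pdf"))           # placeholder file name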

  • NLP

    Persian Text Classification using GloVe

    Overview: Classifies Persian text into ten categories using GloVe embeddings, addressing the unique challenges of Persian-language processing in multi-class classification.

    Technologies: Python, Jupyter Notebook, GloVe Embedding, Machine Learning

    Key Features:

    • Multi-class Persian text classification framework
    • GloVe word embedding implementation for Persian
    • Comprehensive preprocessing pipeline
    • Specialized handling of Persian language characteristics

    Applications: Leverages GloVe embeddings to convert Persian text into numerical representations for effective machine learning classification across ten distinct text categories.
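
    Code Sketch: An illustrative version of the GloVe-based representation above: each document is represented by the average of its tokens' pretrained word vectors and classified with a linear model. The embedding file name and the whitespace tokenizer are assumptions; the project uses a fuller Persian preprocessing pipeline.

      import numpy as np
      from sklearn.linear_model import LogisticRegression

      def load_glove(path: str) -> dict:
          """Load a GloVe text file: one word followed by its vector components per line."""
          vectors = {}
          with open(path, encoding="utf-8") as f:
              for line in f:
                  word, *vals = line.rstrip().split(" ")
                  vectors[word] = np.asarray(vals, dtype=np.float32)
          return vectors

      def doc_vector(text: str, glove: dict, dim: int = 300) -> np.ndarray:
          """Average the vectors of in-vocabulary tokens; zero vector if none are found."""
          vecs = [glove[t] for t in text.split() if t in glove]
          return np.mean(vecs, axis=0) if vecs else np.zeros(dim, dtype=np.float32)

      def train_classifier(texts, labels, glove):
          X = np.stack([doc_vector(t, glove) for t in texts])
          return LogisticRegression(max_iter=1000).fit(X, labels)

      # Usage (assuming a Persian GloVe file and the 10-class dataset are available):
      # glove = load_glove("glove.fa.300d.txt")          # hypothetical file name
      # clf = train_classifier(train_texts, train_labels, glove)
      # preds = clf.predict(np.stack([doc_vector(t, glove) for t in test_texts]))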

  • Computer Vision

    Tumor Classification Using Machine Learning

    Overview: A comprehensive machine learning project for tumor subtype classification using mutation and copy number variation data from genomic analysis.

    Technologies: Python, Scikit-learn, Pandas, NumPy, Jupyter Notebook, Deep Neural Networks

    Key Features:

    • Classification of tumor subtypes (PDM vs SCM)
    • Multiple ML approaches including SVM, XGBoost, and few-shot learning
    • Feature selection and dimensionality reduction techniques
    • 5-fold cross-validation evaluation

    Results: Achieved 69.17% accuracy with Logistic Regression using Tumor Mutation Burden, copy number variation, and missense-mutation features, with metaheuristic optimization for feature selection.
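
    Code Sketch: An illustrative version of the evaluation setup above: genomic features (tumor mutation burden, copy-number and missense-mutation counts) scored with logistic regression under 5-fold cross-validation. The synthetic DataFrame is a stand-in for the real PDM/SCM cohort; column names and distributions are assumptions.

      import numpy as np
      import pandas as pd
      from sklearn.pipeline import make_pipeline
      from sklearn.preprocessing import StandardScaler
      from sklearn.linear_model import LogisticRegression
      from sklearn.model_selection import StratifiedKFold, cross_val_score

      # Synthetic stand-in cohort of 200 samples.
      rng = np.random.default_rng(0)
      df = pd.DataFrame({
          "tumor_mutation_burden": rng.gamma(2.0, 3.0, 200),
          "copy_number_variations": rng.poisson(15, 200),
          "missense_mutations": rng.poisson(40, 200),
          "subtype": rng.choice(["PDM", "SCM"], 200),
      })

      X = df.drop(columns="subtype")
      y = (df["subtype"] == "PDM").astype(int)

      model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
      cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
      scores = cross_val_score(model, X, y, cv=cv)
      print(f"5-fold accuracy: {scores.mean():.2%}")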