
thuhcsi/CA-EMER


Evaluating and Improving Explanation Coherence for Multimodal Emotion Recognition

Hugging Face Demo Page

The official repository for "Evaluating and Improving Explanation Coherence for Multimodal Emotion Recognition".

Replace every occurrence of YOUR_PATH with the corresponding local directory.

Preparations

Environment

See configs/HumanOmni.yml for SFT (Python 3.10) and configs/r1-v.yml for GRPO (Python 3.11).

For the qwen3_caption_vllm environment, please refer to the Qwen3-Omni repository.

Pretrained Models

Model HuggingFace
HumanOmni-0.5B HF
HumanOmni-7B HF
Qwen3-Omni-30B-A3B-Instruct HF
bert-base-uncased HF
siglip-base-patch16-224 HF
siglip-so400m-patch14-384 HF
whisper-large-v3 HF

Data Format

SFT

{
    "video": "VIDEO_PATH",
    "conversations": [
        {
            "from": "human",
            "value": "<video>\n<audio>\nAs an emotional recognition expert; throughout the video, which emotion conveyed by the characters is the most obvious to you? Output the thinking process in <think> </think> and final emotion in <answer> </answer> tags."
        },
        {
            "from": "gpt",
            "value": "<think>THINK_CONTENT</think>\n<answer>EMOTION_LABEL</answer>"
        }
    ]
}
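
When preparing or checking SFT targets, it can help to confirm that each "gpt" value really follows the <think>...</think> / <answer>...</answer> layout shown above. The following is a minimal sketch, not code from this repository; the function name split_think_answer is hypothetical, and it assumes each tag pair appears at most once per string.

```python
import re

# Hypothetical helper (not part of this repository): matches the SFT target layout
# "<think>...</think>\n<answer>...</answer>" shown in the data format above.
_TARGET_PATTERN = re.compile(
    r"<think>(.*?)</think>\s*<answer>(.*?)</answer>", re.DOTALL
)


def split_think_answer(text):
    """Return (think, answer) with surrounding whitespace stripped,
    or None if the expected tags are missing."""
    match = _TARGET_PATTERN.search(text)
    if match is None:
        return None
    return match.group(1).strip(), match.group(2).strip()
```

A record whose target returns None here does not match the SFT format and probably needs fixing before training.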

GRPO / Inference

{
    "video": "VIDEO_PATH",
    "conversations": [
        {
            "from": "human",
            "value": "<video>\n<audio>\nAs an emotional recognition expert; throughout the video, which emotion conveyed by the characters is the most obvious to you? Output the thinking process in <think> </think> and final emotion in <answer> </answer> tags."
        },
        {
            "from": "gpt",
            "value": "EMOTION_LABEL"
        }
    ]
}
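
The two formats differ only in the "gpt" value: SFT keeps the full <think>/<answer> target, while GRPO and inference keep just the emotion label. A minimal sketch of converting one into the other is given below. The helper names sft_to_grpo and to_jsonl are hypothetical, and the one-object-per-line layout is an assumption based on the --input_jsonl argument of inference_batch.py; the training scripts may expect a different container.

```python
import copy
import json
import re

# Hypothetical helpers (not part of this repository).
_ANSWER_PATTERN = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)


def sft_to_grpo(record):
    """Return a copy of an SFT record whose "gpt" turn holds only the emotion label."""
    converted = copy.deepcopy(record)  # leave the caller's record untouched
    for turn in converted["conversations"]:
        if turn["from"] != "gpt":
            continue
        match = _ANSWER_PATTERN.search(turn["value"])
        if match is not None:
            turn["value"] = match.group(1).strip()
    return converted


def to_jsonl(records):
    """Serialize records as one JSON object per line (assumed JSONL layout)."""
    return "".join(json.dumps(r, ensure_ascii=False) + "\n" for r in records)
```

The string returned by to_jsonl can be written to a file and passed as $INPUT_JSONL, provided the assumed layout matches what the scripts read.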

Training

SFT

bash srun_sft_humanomni.sh

GRPO

To use FG-CE, first run srun_fgce.sh, then fill in POD_IP in srun_grpo_humanomni.sh before launching GRPO training.

bash srun_grpo_humanomni.sh

Inference

conda activate r1-v
torchrun --nproc_per_node=$GPUS --nnodes=1 \
    --master_addr=localhost --master_port=12345 \
    inference_batch.py \
    --model_path $MODEL_PATH \
    --bert_path $BERT_PATH \
    --input_jsonl $INPUT_JSONL \
    --output_dir $OUTPUT_DIR
