A Python package for evaluating radiology report generation using multiple standard and medical-specific metrics.


jogihood/rrg-metric


A Python package for evaluating Radiology Report Generation (RRG) using multiple metrics, including:
BLEU, ROUGE, METEOR, BERTScore, F1RadGraph, F1CheXbert, SembScore, and RaTEScore.

Features

  • Multiple evaluation metrics supported:
    • BLEU
    • ROUGE
    • METEOR
    • BERTScore
    • F1 RadGraph
    • F1 CheXbert
    • SembScore (CheXbert vector similarity)
    • RaTEScore (Entity-aware metric)
  • Easy-to-use API
  • Support for batch processing
  • Detailed per-sample and aggregated results
  • Visualization tools for correlation analysis

TODO

  • Add CLI usage
  • Add GREEN score

Installation

  1. Clone the repository:
git clone https://github.com/jogihood/rrg-metric.git
cd rrg-metric
  2. Create and activate a conda environment using the provided environment.yml:
conda env create -f environment.yml
conda activate rrg-metric

Alternatively, you can install the required packages using pip:

pip install -r requirements.txt

Usage

Metric Computation

Here's a simple example of how to use the package:

import rrg_metric

# Example usage
predictions = ["Normal chest x-ray", "Bilateral pleural effusions noted"]
ground_truth = ["Normal chest radiograph", "Small bilateral pleural effusions present"]

# Compute BLEU score
results = rrg_metric.compute(
    metric="bleu",
    preds=predictions,
    gts=ground_truth,
    per_sample=True,
    verbose=True,
)

print(f"Total BLEU score: {results['total_results']}")
if results['per_sample_results']:
    print(f"Per-sample scores: {results['per_sample_results']}")
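To score the same predictions with several metrics, one option is to loop over metric names and collect the aggregated score from each call. The sketch below is a minimal illustration and not part of the package: `run_metrics` and `fake_compute` are hypothetical names. The helper takes the compute function as an argument so it can be tried with a stand-in before any models are downloaded. It assumes only what the example above shows: `compute` accepts `metric`, `preds`, and `gts` and returns a dict with a `total_results` key.

```python
from typing import Callable, Dict, List


def run_metrics(
    compute_fn: Callable[..., dict],
    metrics: List[str],
    preds: List[str],
    gts: List[str],
) -> Dict[str, object]:
    """Call compute_fn once per metric and collect each aggregated score.

    Hypothetical helper, not part of rrg_metric.
    """
    totals = {}
    for name in metrics:
        result = compute_fn(metric=name, preds=preds, gts=gts)
        totals[name] = result["total_results"]
    return totals


def fake_compute(metric, preds, gts, per_sample=False, verbose=False):
    """Stand-in with the same keyword arguments as the documented compute().

    Returns the fraction of exact matches; a placeholder, not a real metric.
    """
    matches = sum(p == g for p, g in zip(preds, gts))
    return {"total_results": matches / len(preds), "per_sample_results": None}


if __name__ == "__main__":
    preds = ["Normal chest x-ray", "No acute findings"]
    gts = ["Normal chest radiograph", "No acute findings"]
    print(run_metrics(fake_compute, ["bleu", "rouge"], preds, gts))
    # With the package installed, pass the real function instead:
    # run_metrics(rrg_metric.compute, ["bleu", "rouge"], preds, gts)
```

Passing the function in, rather than importing `rrg_metric` inside the helper, keeps the loop easy to test without loading any model weights.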

Visualization

The package provides visualization tools for correlation analysis between metric scores and radiologist error counts:

For preprocessing tools related to radiology error validation (ReXVal), please check: https://github.com/jogihood/rexval-preprocessor

import rrg_metric
import matplotlib.pyplot as plt

# Example data
metric_scores = [0.8, 0.7, 0.9, 0.6, 0.85]
error_counts = [1, 2, 0, 3, 1]

# Create correlation plot
ax, tau, tau_ci = rrg_metric.plot_corr(
    metric="BLEU",
    metric_scores=metric_scores,
    radiologist_error_counts=error_counts,
    error_type="total",   # or "significant"
    color='blue',         # custom color
    scatter_alpha=0.6,    # scatter point transparency
    show_tau=True         # show Kendall's tau in title
)

print(f"Kendall's tau: {tau:.3f}")
print(f"95% CI: [{tau_ci[0]:.3f}, {tau_ci[1]:.3f}]")
plt.show()
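For readers unfamiliar with Kendall's tau, the sketch below computes the standard tau-b variant in plain Python by comparing every pair of samples. It is an illustration of the statistic only: this README does not state how `plot_corr` computes tau or its confidence interval, so `kendall_tau_b` is a hypothetical helper and may not match the package's values exactly. The confidence interval is not reproduced here.

```python
import math
from typing import Sequence


def kendall_tau_b(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-b computed by comparing every pair of samples (O(n^2))."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    concordant = discordant = ties_x = ties_y = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                # Tied in both variables: counts toward both tie totals.
                ties_x += 1
                ties_y += 1
            elif dx == 0:
                ties_x += 1
            elif dy == 0:
                ties_y += 1
            elif (dx > 0) == (dy > 0):
                concordant += 1
            else:
                discordant += 1
    total_pairs = n * (n - 1) // 2
    denom = math.sqrt((total_pairs - ties_x) * (total_pairs - ties_y))
    if denom == 0:
        # Every pair is tied in at least one variable; tau is undefined.
        return float("nan")
    return (concordant - discordant) / denom


if __name__ == "__main__":
    metric_scores = [0.8, 0.7, 0.9, 0.6, 0.85]
    error_counts = [1, 2, 0, 3, 1]
    print(round(kendall_tau_b(metric_scores, error_counts), 3))  # -0.949
```

A strongly negative value, as with the example data, means that higher metric scores tend to go with fewer radiologist-identified errors.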

Parameters

compute(metric, preds, gts, per_sample=False, verbose=False)

Required Parameters:

  • metric (str): The evaluation metric to use. Must be one of: ["bleu", "rouge", "meteor", "bertscore", "f1radgraph", "chexbert", "ratescore"]
  • preds (List[str]): List of model predictions/generated texts
  • gts (List[str]): List of ground truth/reference texts

Optional Parameters:

  • per_sample (bool, default=False): If True, returns scores for each individual prediction-reference pair
  • verbose (bool, default=False): If True, displays progress bars and loading messages
  • f1radgraph_model_type / f1radgraph_reward_level: Parameters for F1RadGraph. The default values are recommended
  • cache_dir: Cache directory for Hugging Face model downloads
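Because some of these metrics load large models, it can be worth checking the arguments before calling `compute`. The sketch below is a hypothetical pre-flight check, not part of the package; whether `compute` performs similar validation itself is not documented here. The metric names are copied from the `metric` parameter description above.

```python
from typing import List

# Metric names copied from the `metric` parameter description in this README.
SUPPORTED_METRICS = (
    "bleu", "rouge", "meteor", "bertscore",
    "f1radgraph", "chexbert", "ratescore",
)


def check_inputs(metric: str, preds: List[str], gts: List[str]) -> None:
    """Raise ValueError if the arguments cannot form valid prediction-reference pairs.

    Hypothetical helper, not part of rrg_metric.
    """
    if metric not in SUPPORTED_METRICS:
        raise ValueError(f"Unknown metric {metric!r}; expected one of {SUPPORTED_METRICS}")
    if len(preds) != len(gts):
        raise ValueError(f"Got {len(preds)} predictions but {len(gts)} references")
    if not preds:
        raise ValueError("preds and gts are empty")
    for i, (pred, gt) in enumerate(zip(preds, gts)):
        if not isinstance(pred, str) or not isinstance(gt, str):
            raise ValueError(f"Pair {i} is not a pair of strings")


if __name__ == "__main__":
    check_inputs("bleu", ["Normal chest x-ray"], ["Normal chest radiograph"])
    print("inputs look valid")
```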

plot_corr(metric, metric_scores, radiologist_error_counts, error_type="total", ax=None, **params)

Required Parameters:

  • metric (str): Name of the metric being visualized
  • metric_scores (List[float]): List of metric scores
  • radiologist_error_counts (List[float]): List of radiologist error counts

Optional Parameters:

  • error_type (str, default="total"): Type of error to plot. Must be either "total" or "significant"
  • ax (matplotlib.axes.Axes, default=None): Matplotlib axes for plotting. If None, creates new figure and axes
  • Additional parameters for plot customization (see docstring for details)

Requirements

  • Python 3.10+
  • Other dependencies listed in requirements.txt
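To confirm that an interpreter meets the Python 3.10+ requirement before installing the dependencies, a check along these lines can be used. `meets_minimum` is a hypothetical helper written for this illustration; it relies only on the standard `sys.version_info`.

```python
import sys


def meets_minimum(version=sys.version_info, minimum=(3, 10)) -> bool:
    """Return True if `version` is at least `minimum`, comparing (major, minor)."""
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if meets_minimum():
        print("Python version is sufficient")
    else:
        print("Python 3.10 or newer is required")
```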

Contributing

This repository is still under active development. If you encounter any issues or bugs, I would really appreciate it if you could submit a Pull Request. Your contributions will help make this package more robust and useful for the community!
