Model Zoo

The following table lists models from the HuggingFace Model Hub that are supported in Lightning IR. For each model, the table reports re-ranking effectiveness as nDCG@10 on the officially released run files, which contain 1,000 passages per query, for TREC Deep Learning 2019 and 2020.

Native models were fine-tuned using Lightning IR, and each model's HuggingFace model card provides the Lightning IR configuration for reproduction. Non-native models were fine-tuned externally but are supported in Lightning IR for inference.
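For reference, the nDCG@10 metric reported in the table can be sketched as follows. This is an illustrative, self-contained implementation using the exponential-gain formulation common for graded relevance judgments; the official results are computed with the standard TREC evaluation tooling, which may differ in detail:

```python
import math

def ndcg_at_10(ranked_relevances, all_relevances):
    """Compute nDCG@10 from graded relevance labels.

    ranked_relevances: labels of the returned documents, in rank order.
    all_relevances: labels of all judged documents for the query (any order).
    """
    def dcg(labels):
        # Exponential gain, log2 position discount, cut off at rank 10.
        return sum((2 ** rel - 1) / math.log2(rank + 2)
                   for rank, rel in enumerate(labels[:10]))

    ideal = dcg(sorted(all_relevances, reverse=True))
    return dcg(ranked_relevances) / ideal if ideal > 0 else 0.0
```

A perfect ranking scores 1.0; any swap of a more relevant document below a less relevant one lowers the score.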

Reproduction

The following command and configuration can be used to reproduce the results:

config.yaml

trainer:
  logger: false
model:
  class_path: CrossEncoderModule # for cross-encoders
  # class_path: BiEncoderModule # for bi-encoders
  init_args:
    model_name_or_path: {MODEL_NAME}
    evaluation_metrics:
    - nDCG@10
data:
  class_path: LightningIRDataModule
  init_args:
    inference_datasets:
    - class_path: RunDataset
      init_args:
        run_path_or_id: msmarco-passage/trec-dl-2019/judged
    - class_path: RunDataset
      init_args:
        run_path_or_id: msmarco-passage/trec-dl-2020/judged

lightning-ir re_rank --config config.yaml

Model Name                                       Native   TREC DL 2019   TREC DL 2020
-------------------------------------------------------------------------------------
Cross-Encoders
webis/monoelectra-base                           yes      0.751          0.769
webis/monoelectra-large                          yes      0.750          0.791
castorini/monot5-base-msmarco                    no       0.723          0.714
castorini/monot5-large-msmarco                   no       0.720          0.728
castorini/monot5-3b-msmarco                      no       0.726          0.752
Soyoung97/RankT5-base                            no       0.734          0.745
Soyoung97/RankT5-large                           no       0.737          0.759
Soyoung97/RankT5-3b                              no       0.721          0.776
Bi-Encoders
webis/bert-bi-encoder                            yes      0.711          0.714
sentence-transformers/msmarco-bert-base-dot-v5   no       0.705          0.735
webis/colbert                                    yes      0.751          0.749
colbert-ir/colbertv2.0                           no       0.732          0.746
webis/splade                                     yes      0.736          0.723
naver/splade-v3                                  no       0.715          0.749