MSMARCO Models

MS MARCO is a large-scale information retrieval corpus built from real user search queries issued to the Bing search engine. The provided models can be used for semantic search: given keywords, a search phrase, or a question, the model finds passages that are relevant to the query.

The training data consists of over 500k examples, while the complete corpus consists of over 8.8 million passages.

Usage

python

from sentence_transformers import SentenceTransformer, util

# Load a model tuned for dot-product scoring
model = SentenceTransformer("sentence-transformers/msmarco-distilbert-dot-v5")

# Encode the query and the candidate passages
query_embedding = model.encode("How big is London")
passage_embeddings = model.encode([
    "London has 9,787,426 inhabitants at the 2011 census",
    "London is known for its financial district",
])

# Dot-product score between the query and each passage
print("Similarity:", util.dot_score(query_embedding, passage_embeddings))

For more details on usage, see Applications - Information Retrieval.

Performance

Performance is evaluated on TREC-DL 2019 and TREC-DL 2020, which are query-passage retrieval tasks in which, for each query, multiple passages have been annotated with their relevance to that query. We also evaluate on the MS MARCO Passage Retrieval dataset.

Approach	MRR@10 (MS MARCO Dev)	NDCG@10 (TREC DL 19 Reranking)	NDCG@10 (TREC DL 20 Reranking)	Queries/sec (GPU / CPU)	Docs/sec (GPU / CPU)
Models tuned with normalized embeddings
sentence-transformers/msmarco-MiniLM-L6-cos-v5	32.27	67.46	64.73	18,000 / 750	2,800 / 180
sentence-transformers/msmarco-MiniLM-L12-cos-v5	32.75	65.14	67.48	11,000 / 400	1,500 / 90
sentence-transformers/msmarco-distilbert-cos-v5	33.79	70.24	66.24	7,000 / 350	1,100 / 70
sentence-transformers/multi-qa-MiniLM-L6-cos-v1		65.55	64.66	18,000 / 750	2,800 / 180
sentence-transformers/multi-qa-distilbert-cos-v1		67.59	66.46	7,000 / 350	1,100 / 70
sentence-transformers/multi-qa-mpnet-base-cos-v1		67.78	69.87	4,000 / 170	540 / 30
Models tuned for dot-product
sentence-transformers/msmarco-distilbert-base-tas-b	34.43	71.04	69.78	7,000 / 350	1,100 / 70
sentence-transformers/msmarco-distilbert-dot-v5	37.25	70.14	71.08	7,000 / 350	1,100 / 70
sentence-transformers/msmarco-bert-base-dot-v5	38.08	70.51	73.45	4,000 / 170	540 / 30
sentence-transformers/multi-qa-MiniLM-L6-dot-v1		66.70	65.98	18,000 / 750	2,800 / 180
sentence-transformers/multi-qa-distilbert-dot-v1		68.05	70.49	7,000 / 350	1,100 / 70
sentence-transformers/multi-qa-mpnet-base-dot-v1		70.66	71.18	4,000 / 170	540 / 30
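
The two metrics in the table can be sketched as follows. This is a minimal illustration, not the evaluation script used to produce the numbers above: the function names are hypothetical, the NDCG variant shown uses linear gains with a log2 discount (some tools use exponential gains instead), and the table appears to report values scaled by 100.

```python
import math


def reciprocal_rank_at_k(ranked_relevance, k=10):
    """Reciprocal rank of the first relevant item in the top k, or 0.0 if none.

    ranked_relevance: relevance labels (0 = not relevant) in ranked order.
    MRR@k is the mean of this value over all queries.
    """
    for rank, rel in enumerate(ranked_relevance[:k], start=1):
        if rel > 0:
            return 1.0 / rank
    return 0.0


def dcg_at_k(gains, k=10):
    """Discounted cumulative gain with linear gains and a log2 rank discount."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains[:k], start=1))


def ndcg_at_k(ranked_gains, all_gains, k=10):
    """DCG of the system ranking divided by the DCG of the ideal ranking.

    ranked_gains: graded relevance labels in the order the system ranked them.
    all_gains: graded relevance labels of every judged candidate for the query.
    """
    ideal = dcg_at_k(sorted(all_gains, reverse=True), k)
    return dcg_at_k(ranked_gains, k) / ideal if ideal > 0 else 0.0


# Toy example: the first relevant passage is at rank 3.
print(reciprocal_rank_at_k([0, 0, 1, 0]))
# Toy example: the system swapped the two best passages.
print(ndcg_at_k([2, 3, 0], [3, 2, 0]))
```

A perfect ranking yields an NDCG of 1.0, and any ranking that places a less relevant passage above a more relevant one scores lower.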

Notes:

We provide two types of models. The first produces normalized embeddings and can be used with dot-product, cosine similarity, or Euclidean distance (all three scoring functions produce the same ranking). The second is tuned for dot-product: these models produce embeddings of different lengths (norms) and must be used with dot-product to find close items in a vector space.
Models with normalized embeddings will prefer the retrieval of shorter passages, while models tuned for dot-product will prefer the retrieval of longer passages. Depending on your task, you might prefer one or the other type of model.
Encoding speeds are given per second and were measured on a V100 GPU and an 8-core Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz.
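
The claim that all three scoring functions agree for normalized embeddings follows from the identity ||a - b||^2 = 2 - 2 (a . b) for unit vectors: a larger dot product always means a smaller Euclidean distance, and cosine similarity equals the dot product when both norms are 1. The sketch below checks this on toy 3-dimensional vectors; the vectors and helper names are illustrative assumptions, not model output.

```python
import math


def normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ranking(scores, descending=True):
    """Indices of the passages, best first."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=descending)


# Toy stand-ins for a query embedding and three passage embeddings.
query = normalize([1.0, 2.0, 0.5])
passages = [normalize(p) for p in ([2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [-1.0, 0.5, 3.0])]

by_dot = ranking([dot(query, p) for p in passages])
by_cos = ranking([cosine(query, p) for p in passages])
# For a distance, smaller is better, so sort ascending.
by_l2 = ranking([euclidean(query, p) for p in passages], descending=False)

print(by_dot, by_cos, by_l2)
```

For the dot-product models the embeddings are not unit length, so this equivalence does not hold and only dot-product should be used.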

Changes in v5

Models with normalized embeddings were added: these are the v3 cosine-similarity models with an additional normalization layer on top.
New models were trained with MarginMSE loss: sentence-transformers/msmarco-distilbert-dot-v5 and sentence-transformers/msmarco-bert-base-dot-v5.
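
As a rough illustration of MarginMSE: the loss is commonly described as the mean squared error between the student's score margin (positive passage score minus negative passage score) and a teacher's margin for the same query and passages. The sketch below works on plain lists of scores; the function name is hypothetical, and the assumption that the teacher scores come from a cross-encoder is not stated in this section.

```python
def margin_mse(student_pos, student_neg, teacher_pos, teacher_neg):
    """Mean squared error between student and teacher score margins.

    Each argument is a list with one score per (query, positive, negative) triplet.
    The teacher scores are assumed to be precomputed (e.g. by a cross-encoder).
    """
    errors = [
        ((sp - sn) - (tp - tn)) ** 2
        for sp, sn, tp, tn in zip(student_pos, student_neg, teacher_pos, teacher_neg)
    ]
    return sum(errors) / len(errors)


# The student only needs to match the teacher's margin, not its absolute scores:
# both margins are 6.0 here, so the loss is zero.
print(margin_mse([10.0], [4.0], [8.0], [2.0]))
```

Because only the margin is matched, the student is free to use a different score scale than the teacher, which suits models scored with an unbounded dot-product.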

Changes in v4

Just one new model was trained with better hard negatives, leading to a small improvement compared to v3.

Changes in v3

The v2 models were used to find similar passages for all training queries. An MS MARCO Cross-Encoder based on the electra-base model was then used to classify whether these retrieved passages answer the question.

Passages that received a low score from the cross-encoder were saved as hard negatives: they got a high score from the bi-encoder but a low score from the (better) cross-encoder.

We then trained the v2 models with these new hard negatives.
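
The mining procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions: the scoring functions are passed in as callables, the toy scores are invented, and `top_k` and `threshold` are hypothetical parameters whose actual values are not given here.

```python
def mine_hard_negatives(query, passages, bi_score, cross_score, top_k=3, threshold=0.5):
    """Return passages the bi-encoder ranks highly but the cross-encoder rejects."""
    # Step 1: retrieve the top_k candidates according to the bi-encoder.
    candidates = sorted(passages, key=lambda p: bi_score(query, p), reverse=True)[:top_k]
    # Step 2: keep the candidates that the cross-encoder scores below the threshold.
    return [p for p in candidates if cross_score(query, p) < threshold]


passages = [
    "London has 9,787,426 inhabitants at the 2011 census",
    "London is known for its financial district",
    "Paris is the capital of France",
    "How big is a London bus",
]

# Invented scores standing in for real bi-encoder and cross-encoder outputs.
toy_bi = dict(zip(passages, [0.90, 0.80, 0.10, 0.85]))
toy_cross = dict(zip(passages, [0.95, 0.10, 0.01, 0.05]))

hard_negatives = mine_hard_negatives(
    "How big is London",
    passages,
    bi_score=lambda q, p: toy_bi[p],
    cross_score=lambda q, p: toy_cross[p],
)
print(hard_negatives)
```

In this toy run the census passage is retrieved and kept out of the negatives because the cross-encoder scores it highly, while the two superficially similar passages become hard negatives.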
