Creates a representation model that uses maximal marginal relevance (MMR) to choose the keywords and phrases representing each topic. MMR weighs each candidate's similarity to the documents against its similarity to the keywords and phrases already selected, so the chosen representation is relevant while maximising diversity.
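For orientation, the greedy MMR score is commonly written as below. This is a sketch that assumes the underlying representation model follows the standard formulation. The symbols are notation introduced here, not names from the package: w is a candidate word, d the embedding of the documents, S the set of words already selected, sim the cosine similarity, and λ the value of the diversity argument.

```latex
\mathrm{MMR}(w) = (1 - \lambda)\,\operatorname{sim}(w, d)
                  \;-\; \lambda \max_{s \in S} \operatorname{sim}(w, s)
```

At each step the candidate with the highest score is added to S. With λ = 0 only relevance to the documents counts; as λ approaches 1 the penalty for resembling an already selected word dominates.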

Usage

bt_representation_mmr(
  fitted_model,
  embedding_model,
  diversity = 0.1,
  top_n_words = 10
)

Arguments

fitted_model

Output of bt_fit_model() or another BERTopic topic model. The model must already have been fitted to data.

embedding_model

Embedding model used to embed the keywords selected as potential representative words. Currently only sentence transformer models are supported.

diversity

How diverse the representation words/phrases should be, on a scale from 0 (not diverse) to 1 (completely diverse).

top_n_words

Number of keywords/phrases to extract.
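To make the effect of diversity and top_n_words concrete, here is a minimal base R sketch of greedy MMR selection on toy two-dimensional vectors. It is illustrative only: mmr_select(), cosine_sim(), and the toy data are hypothetical and not part of the package, and the scoring rule (relevance weighted by 1 - diversity, minus diversity times the highest similarity to an already selected word) is assumed to match the standard MMR formulation rather than taken from the package source.

```r
# Cosine similarity between the rows of two numeric matrices.
cosine_sim <- function(a, b) {
  a_unit <- a / sqrt(rowSums(a^2))
  b_unit <- b / sqrt(rowSums(b^2))
  a_unit %*% t(b_unit)
}

# Greedy MMR selection (hypothetical helper, not a package function).
# topic_vec: 1 x d matrix; word_vecs: n x d matrix with words as row names.
mmr_select <- function(topic_vec, word_vecs, diversity = 0.1, top_n = 10) {
  relevance <- as.vector(cosine_sim(word_vecs, topic_vec))
  word_sim <- cosine_sim(word_vecs, word_vecs)
  top_n <- min(top_n, nrow(word_vecs))

  # Start with the single most relevant word.
  selected <- which.max(relevance)
  candidates <- setdiff(seq_len(nrow(word_vecs)), selected)

  while (length(selected) < top_n) {
    # Highest similarity of each candidate to any already selected word.
    max_sim <- apply(word_sim[candidates, selected, drop = FALSE], 1, max)
    score <- (1 - diversity) * relevance[candidates] - diversity * max_sim
    best <- candidates[which.max(score)]
    selected <- c(selected, best)
    candidates <- setdiff(candidates, best)
  }
  rownames(word_vecs)[selected]
}

# Toy data: unit vectors at 5, 10 and 50 degrees from the topic vector.
angles <- c(hotel = 5, hotels = 10, breakfast = 50) * pi / 180
word_vecs <- cbind(cos(angles), sin(angles))
rownames(word_vecs) <- names(angles)
topic_vec <- matrix(c(1, 0), nrow = 1)

mmr_select(topic_vec, word_vecs, diversity = 0.1, top_n = 2)
# "hotel" "hotels"
mmr_select(topic_vec, word_vecs, diversity = 0.7, top_n = 2)
# "hotel" "breakfast"
```

With a low diversity value the near-duplicate "hotels" is kept because it is highly relevant; with a higher value it is penalised for resembling "hotel" and the less similar "breakfast" is chosen instead.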

Value

MaximalMarginalRelevance representation model