If you have already performed dimensionality reduction on your embeddings, you can pass the reduced embeddings to the embeddings argument. In that case, make sure to supply bt_compile_model() with a base reducer (the output of bt_base_reducer()).
NOTE: The BERTopic model you are working with is a pointer to a Python object at a location in memory. bt_fit_model() changes the input model in place, so you do not need to assign its output. If you do assign the output, be aware that the output model and the input model are the same object: the unfitted input model cannot be recovered unless you explicitly save it before fitting.
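The in-place behaviour described in the note can be illustrated with a small base R sketch. It does not use BertopicR: toy_model and toy_fit() are hypothetical stand-ins, and an R environment is used as an analogy for the Python object pointer because environments are also passed by reference rather than copied.

```r
# Hypothetical stand-in for a BERTopic model: an R environment, which is
# passed by reference instead of being copied on modification.
toy_model <- new.env()
toy_model$fitted <- FALSE

# Hypothetical stand-in for a fitting function that modifies its input
# in place and also returns it.
toy_fit <- function(model) {
  model$fitted <- TRUE
  invisible(model)
}

result <- toy_fit(toy_model)

identical(result, toy_model)  # TRUE: both names refer to the same object
toy_model$fitted              # TRUE: the input was changed in place
```

Under these assumptions, assigning the output gives a second name for the same object, not an independent copy, which is why the unfitted state has to be saved beforehand if it is needed later.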
Arguments
- model
Output of bt_compile_model() or another bertopic topic model
- documents
The documents to fit the topic model on
- embeddings
Your embeddings, either reduced in dimensionality or not. If no embeddings are provided, the embedding model and reduction_model supplied to bt_compile_model() are used to calculate the embeddings and reduce their dimensionality.
- topic_labels
Pre-existing labels, for supervised topic modelling
Examples
if (FALSE) {
# create the model with default parameters, then fit the model to the data
model <- bt_compile_model()
bt_fit_model(model = model, documents = docs)
# create the model with document embeddings already created and reduced
# embeddings
embedder <- bt_make_embedder("all-minilm-l6-v2")
embeddings <- bt_do_embedding(embedder, docs)
# reduced embeddings
reducer <- bt_make_reducer_umap()
reduced_embeddings <- bt_do_reducing(reducer, embeddings)
# model
model <- bt_compile_model(embedding_model = bt_empty_embedder(), reduction_model = bt_empty_reducer())
bt_fit_model(model = model, documents = docs, embeddings = reduced_embeddings)
}