Updates topics and their representations based on the document-topic classification given in new_topics. As when initialising the model with bt_compile_model, you must supply a vectoriser/ctfidf model if you want to manipulate the topic representations. These can be the same models as those used in bt_compile_model.
NOTE: The bertopic model you are working with is a pointer to a Python object in memory. This means the input model and the output model cannot be told apart unless you explicitly save the model before performing this operation. You do not need to assign the output of bt_update_topics because, as with bt_fit_model, the function changes the input model in place. If you do assign the function output, be aware that the output model and the input model are the same object.
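The in-place behaviour can be illustrated without BertopicR at all. The sketch below is only an analogy: it uses a base R environment, which has reference semantics much like an R handle to a Python object, as a stand-in for the model. The helper update_in_place() is hypothetical and is not part of the package.

```r
# Stand-in for a fitted model: an environment is modified by reference,
# much like an R handle to a Python object (an analogy, not BertopicR code).
model <- new.env()
model$topics <- c(-1L, 0L, 1L, -1L)

# Hypothetical helper that mutates its input and also returns it.
update_in_place <- function(m, new_topics) {
  m$topics <- new_topics
  invisible(m)
}

# Snapshot the values you want to keep BEFORE updating.
topics_before <- model$topics

updated <- update_in_place(model, c(0L, 0L, 1L, 1L))

# Both names refer to the same object, so the "input" has changed too.
same_object   <- identical(updated, model)
input_changed <- !identical(model$topics, topics_before)
```

Here both same_object and input_changed are TRUE. Only topics_before, which was copied out beforehand, still holds the original assignments.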
Usage
bt_update_topics(
fitted_model,
documents,
new_topics = NULL,
representation_model = NULL,
vectoriser_model = NULL,
ctfidf_model = NULL
)
Arguments
- fitted_model
Output of bt_fit_model() or another bertopic topic model. The model must have been fitted to data.
- documents
Documents to which the model was fitted
- new_topics
Topics to update the model with
- representation_model
Model for updating topic representations
- vectoriser_model
Model for vectorising input for topic representations (Python object)
- ctfidf_model
Model for performing class-based TF-IDF (c-TF-IDF) (Python object)
Details
NOTE: Using this function to update outlier topics may lead to errors if topic reduction or topic merging techniques are used afterwards. The reason is that once you assign one -1 document to topic 1 and another -1 document to topic 2, it is unclear how the -1 topic should be mapped: to topic 1 or to topic 2?
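The ambiguity is easy to see by cross-tabulating topic assignments before and after outlier reduction. The vectors below are made-up illustrative data, not output from a real model, and the sketch uses base R only.

```r
# Made-up topic assignments for six documents (illustrative only).
old_topics <- c(-1L, -1L, -1L, 0L, 1L, 2L)
new_topics <- c( 1L,  2L, -1L, 0L, 1L, 2L)

# Rows: original topic. Columns: topic after outlier reduction.
mapping <- table(old = old_topics, new = new_topics)
print(mapping)

# Number of distinct new topics each original topic has been spread across.
targets_per_old <- rowSums(mapping > 0)
targets_per_old[["-1"]]
```

In this toy data the original -1 topic is spread across three different new topics, while every other topic maps to exactly one. That one-to-many relationship is what a later reduction or merging step cannot resolve.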
Examples
if (FALSE) {
  # topic_model is a fitted model and docs are the documents it was fitted to

  # reduce outliers to obtain a new topic distribution
  outliers <- bt_outliers_ctfidf(fitted_model = topic_model, documents = docs, threshold = 0.2)

  # update the model (in place) with the new topic distribution
  bt_update_topics(fitted_model = topic_model, documents = docs, new_topics = outliers$new_topics)

  # update the topic representations only
  # (update_vec_model is assumed to be a vectoriser created beforehand, e.g. with bt_make_vectoriser())
  bt_update_topics(fitted_model = topic_model, documents = docs, vectoriser_model = update_vec_model)
}