Package: faSTM 0.0.0.9000

faSTM: Fast Structural Topic Models

A modern implementation of the Structural Topic Model. faSTM fits the logistic-normal STM (with prevalence and content covariates) via a multithreaded Rust core, with an opt-in stochastic-variational path for large corpora. It is self-contained: prepared text is read from 'quanteda' or 'tidytext' objects, and both model inspection (labelTopics with FREX/lift/score, findThoughts, semantic coherence, exclusivity, topic correlations) and estimateEffect() (method-of-composition posterior propagation) are built in. The fitted object is structurally compatible with 'stm', so existing analyses migrate with minimal changes.

Authors: Neal Caren [aut, cre], Margaret Roberts [cph], Brandon Stewart [cph], Dustin Tingley [cph]

faSTM_0.0.0.9000.tar.gz
faSTM_0.0.0.9000.zip (r-4.7) | faSTM_0.0.0.9000.zip (r-4.6) | faSTM_0.0.0.9000.zip (r-4.5)
faSTM_0.0.0.9000.tgz (r-4.6-x86_64) | faSTM_0.0.0.9000.tgz (r-4.6-arm64) | faSTM_0.0.0.9000.tgz (r-4.5-x86_64) | faSTM_0.0.0.9000.tgz (r-4.5-arm64)
faSTM_0.0.0.9000.tar.gz (r-4.7-arm64) | faSTM_0.0.0.9000.tar.gz (r-4.7-x86_64) | faSTM_0.0.0.9000.tar.gz (r-4.6-arm64) | faSTM_0.0.0.9000.tar.gz (r-4.6-x86_64)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
faSTM/json (API)

# Install 'faSTM' in R:
install.packages('faSTM', repos = c('https://nealcaren.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/nealcaren/fastm/issues

Pkgdown/docs site: https://nealcaren.github.io

Datasets:
  • congress - U.S. Congressional Speeches
  • poliblog - CMU 2008 Political Blog Corpus


rust, cargo

3.78 score | 20 scripts | 71 exports | 4 dependencies

Last updated from: b4ac5c1f0d. Checks: 11 WARNING, 1 OK, 1 FAIL. Indexed: yes.

Target | Result | Time
linux-devel-arm64 | WARNING | 273
linux-devel-x86_64 | WARNING | 294
source / vignettes | OK | 531
linux-release-arm64 | WARNING | 275
linux-release-x86_64 | WARNING | 289
macos-release-arm64 | WARNING | 262
macos-release-x86_64 | WARNING | 396
macos-oldrel-arm64 | WARNING | 268
macos-oldrel-x86_64 | WARNING | 344
windows-devel | WARNING | 612
windows-release | WARNING | 626
windows-oldrel | WARNING | 295
wasm-release | FAIL | 167

Exports: align_corpus, alignCorpus, ame, as_corpus, asSTMCorpus, augment, calcfrex, calclift, calcscore, check_residuals, checkBeta, checkResiduals, cloud, coherence, content_topics, convertCorpus, effect_estimates, estimateEffect, eval_heldout, eval.heldout, exclusivity, find_thoughts, find_topic, findThoughts, findTopic, fit_new_documents, fitNewDocuments, frex_scores, from_tidy, glance, label_topics, labelTopics, make_dt, make_heldout, make.dt, make.heldout, makeDesignMatrix, many_topics, manyTopics, multi_stm, optimizeDocument, permutation_test, plot_topic_network, plotModels, plotQuote, posterior_theta_samples, read_ldac, readLdac, s, sage_labels, sageLabels, search_k, searchK, select_best, select_model, selectModel, semantic_coherence, semanticCoherence, stm, thetaPosterior, tidy, toLDAvis, topic_corr_graph, topic_correlation, topic_lasso, topic_proportions, topic_terms, topicCorr, topicQuality, write_ldac, writeLdac

Dependencies: generics, lattice, MASS, Matrix

Validation: parity with stm, and fit quality
Same model, same numbers | Topic labels (probability, FREX, lift, score) | Semantic coherence and exclusivity | Different fit, comparable quality | What this means for your analysis

Last update: 2026-06-19
Started: 2026-06-19

Beyond stm: faSTM's extensions
Multiple content covariates | Framing, not agenda | estimateEffect(): cluster-robust SEs and weights | Random effects in prevalence | Average marginal effects | Topic prevalence over time | Coherence: NPMI and C_V | A tidyverse-friendly surface | Summary

Last update: 2026-06-19
Started: 2026-06-19

faSTM: the stm vignette, run on faSTM
Ingesting data | Estimating the structural topic model | Model selection and search | Interpreting topics | Covariate effects on topic prevalence | Topical content | Interactions | More visualization | Out-of-sample documents

Last update: 2026-06-19
Started: 2026-06-18

Readme and manuals

Help Manual

Help page | Topics
Align a new corpus to a fitted model's vocabulary | align_corpus
Align a new corpus to a reference vocabulary (stm-compatible) | alignCorpus
Average marginal effects from an estimateEffect fit | ame
Build a faSTM corpus from prepared text | as_corpus
Convert search_k diagnostics to long form for plotting | as.data.frame.faSTM_searchk
Coerce inputs into an stm-style corpus (stm-compatible) | asSTMCorpus
Augment: most-likely topic for each document-term token | augment.faSTM
stm-compatible label scorers (FREX / lift / score) | calcfrex calclift calcscore
Residual dispersion check (is K large enough?) | check_residuals
Flag words that load almost entirely on one topic | checkBeta
Topic coherence (Mimno / NPMI / c_v) | coherence
U.S. Congressional Speeches (Party x Chamber, 1987-2011) | congress
Marginal content words by one content covariate | content_topics
Convert documents/vocab between corpus formats (stm-compatible) | convertCorpus
Extract estimateEffect estimates as a tidy data.frame (no plotting) | effect_estimates
Estimate covariate effects on topic prevalence (method of composition) | estimateEffect
Evaluate held-out log-likelihood of a fit on a held-out set | eval_heldout
Topic exclusivity (FREX-summary, frexw default 0.7) | exclusivity
Representative documents for each topic | find_thoughts
Find topics whose top words include given words | find_topic
Infer topic proportions for new documents | fit_new_documents
Fit a structural topic model and return its raw arrays. | fit_stm
Infer topics for new documents (stm-compatible signature) | fitNewDocuments
FREX scores for every word and topic | frex_scores
Build a faSTM corpus from a tidy (long) term-count table | from_tidy
One-row model summary for a faSTM fit | glance.faSTM
Out-of-sample topic inference: for each new document, run the variational E-step against fixed globals (β, μ, Σ⁻¹) and return θ. Documents are passed sparse: 'words' are 0-based ids into the fitted model's vocabulary (out-of-vocabulary terms dropped by the R caller) with their 'counts', concatenated, plus per-document term counts 'doc_nterms'. | infer_theta_new
Label topics by top words (prob, FREX, lift, score) | label_topics
LDA topic-word matrix via topica's CVB0 (deterministic collapsed variational Bayes), to seed a "replicate stm's LDA init" STM fit. Mirrors stm's collapsed-Gibbs LDA initialization; the result is fed back as 'init_beta'. Returns K*V row-major topic-word probabilities. | lda_init_beta
Document-topic proportions as a data frame | make_dt
Create a held-out version of a corpus for document-completion validation | make_heldout
Build a (sparse) design matrix for new data (stm-compatible) | makeDesignMatrix
Select models across a range of K | many_topics
Cross-run topic stability | multi_stm
Per-document variational E-step (stm-compatible) | optimizeDocument
Permutation test for a binary covariate's effect on topics | permutation_test
Topic correlation network | plot_topic_network
Plot a fitted model | plot.faSTM
Plot estimated covariate effects on topic prevalence | plot.faSTM_effect
Plot search_k diagnostics | plot.faSTM_searchk
CMU 2008 Political Blog Corpus (poliblog5k) | poliblog
Draw from the per-document topic-proportion posterior | posterior_theta_samples
Predict topic proportions for new documents | predict.faSTM
Read/write a corpus in LDA-C (Blei) sparse format | read_ldac write_ldac
Spline term for prevalence formulas | s
Labels for a content (SAGE) model | sage_labels
Search over the number of topics K | search_k
Pick one model from a 'select_model' run | select_best
Fit several models and keep the ones on the quality frontier | select_model
Semantic coherence (Mimno et al. 2011) | semantic_coherence
Fit a structural topic model (fast Rust backend, stm-compatible object) | stm
Tidy a faSTM fit (topic-term or document-topic distributions) | tidy.faSTM
Tidy an estimateEffect fit (one row per term per topic) | tidy.faSTM_effect
Topic-correlation network as an igraph graph | topic_corr_graph
Topic correlation graph (positive correlations of topic proportions) | topic_correlation
Predict a document-level outcome from topic proportions (lasso) | topic_lasso
Expected topic proportions (the numbers behind the summary plot) | topic_proportions
Top terms per topic, with their numeric scores (tidy) | topic_terms
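
Illustration of the raw-array layouts named in the infer_theta_new and lda_init_beta entries. This is a minimal base-R sketch under stated assumptions, not faSTM's own code: the helper flatten_new_docs(), the toy vocabulary, and the toy documents are all hypothetical, and 'doc_nterms' is assumed to mean the number of distinct in-vocabulary terms in each document. No faSTM function is called.

```r
# Hypothetical helper (not part of faSTM): flatten tokenised documents into
# concatenated 0-based word ids, their counts, and per-document term counts.
flatten_new_docs <- function(docs, vocab) {
  words <- integer(0)
  counts <- integer(0)
  doc_nterms <- integer(length(docs))
  for (d in seq_along(docs)) {
    ids <- match(docs[[d]], vocab)               # 1-based vocab positions, NA if out of vocabulary
    ids <- ids[!is.na(ids)]                      # drop out-of-vocabulary terms
    tab <- tabulate(ids, nbins = length(vocab))  # occurrences of each vocab entry
    present <- which(tab > 0L)                   # distinct terms in this document
    words <- c(words, present - 1L)              # shift to 0-based ids
    counts <- c(counts, tab[present])
    doc_nterms[d] <- length(present)             # assumption: distinct terms per document
  }
  list(words = words, counts = counts, doc_nterms = doc_nterms)
}

# Toy data, invented for illustration only.
vocab <- c("budget", "health", "tax", "vote")
docs <- list(c("tax", "tax", "budget", "senate"),  # "senate" is out of vocabulary
             c("vote", "health", "vote"))
sparse <- flatten_new_docs(docs, vocab)

# A flat K*V vector in row-major order (all of topic 1, then all of topic 2)
# is reshaped into a K x V matrix with byrow = TRUE.
K <- 2L
V <- 4L
flat <- c(0.5, 0.25, 0.125, 0.125,   # topic 1
          0.125, 0.125, 0.25, 0.5)   # topic 2
beta <- matrix(flat, nrow = K, ncol = V, byrow = TRUE)
```

R fills matrices column by column by default, so omitting byrow = TRUE would silently scramble topics and words in a row-major vector. With the toy data above, the first document contributes two (id, count) pairs and the second contributes two more, so 'doc_nterms' is what lets a consumer split the concatenated vectors back into documents.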