Releases: juliasilge/tidytext
tidytext 0.2.3
tidytext 0.2.2
- Access NRC lexicon via textdata package
tidytext 0.2.1
- Fix bug in `augment()` function for stm topic model.
- Warn when tf-idf is negative, thanks to @EmilHvitfeldt (#112).
- Switch from importing broom to importing generics, for lighter dependencies (#133).
- Add functions for reordering factors (such as for ggplot2 bar plots) thanks to @tmastny (#110).
- Update to `tibble()` where appropriate, thanks to @luisdza (#136).
- Clarify documentation about impact of lowercase conversion on URLs (#139).
- Change how sentiment lexicons are accessed from package (remove NRC lexicon entirely, access AFINN and Loughran lexicons via textdata package so they are no longer included in this package).
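
As a rough illustration of the factor-reordering helpers mentioned above, the sketch below uses `reorder_within()` to order terms by a count separately within each group. The data frame and its column names (`term`, `group`, `n`) are made up for this example and do not come from the package.

```r
library(tidytext)

# Hypothetical counts of two terms in two groups (illustrative data only)
df <- data.frame(
  term  = c("a", "b", "a", "b"),
  group = c("g1", "g1", "g2", "g2"),
  n     = c(3, 1, 2, 4)
)

# Build a factor whose levels are ordered by n within each group
df$term_ordered <- reorder_within(df$term, df$n, df$group)

levels(df$term_ordered)
```

In a faceted ggplot2 bar chart, a factor built this way is usually paired with `scale_x_reordered()` or `scale_y_reordered()` so that the axis labels show only the original term.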
tidytext 0.2.0
tidytext 0.1.9
tidytext 0.1.8
tidytext 0.1.7
- `unnest_tokens` can now unnest a data frame with a list column (which formerly threw the error `unnest_tokens expects all columns of input to be atomic vectors (not lists)`). The unnested result repeats the objects within each list. (It's still not possible when `collapse = TRUE`, in which tokens can span multiple lines.)
- Add `get_tidy_stopwords()` to obtain stopword lexicons in multiple languages in a tidy format.
- Add a dataset `nma_words` of negators, modals, and adverbs that affect sentiment analysis (#55).
- Updated various vignettes/docs/tests so package can build on R-oldrel.
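
The list-column behavior described above can be sketched as follows. The tibble `docs` and its `meta` column are hypothetical, and the example assumes a tidytext version that includes this change, with `collapse` left at its default.

```r
library(tibble)
library(tidytext)

# Hypothetical input: a text column plus a list column of arbitrary objects
docs <- tibble(
  id   = 1:2,
  text = c("Tidy text mining", "with R"),
  meta = list(c("a", "b"), 1:3)
)

# One row per word; the list-column entry is repeated for each token
tokens <- unnest_tokens(docs, word, text)

tokens
```

Each row of `tokens` carries the `meta` entry of the document it came from, so the list column survives tokenization.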
tidytext 0.1.5
tidytext 0.1.4
- Fix tidier for quanteda dictionary for correct class (#71).
- Add a pkgdown site.
- Convert NSE from underscored function to tidyeval (`unnest_tokens`, `bind_tf_idf`, all sparse casters) (#67, #74).
- Added tidiers for topic models from the `stm` package (#51).
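
With the tidyeval interface noted above, column names are passed unquoted. The following is a minimal sketch of `bind_tf_idf()` on a small table of made-up counts; the table `counts` and its values are assumptions for illustration only.

```r
library(tibble)
library(tidytext)

# Hypothetical term counts, one row per document-term pair
counts <- tibble(
  document = c("d1", "d1", "d2"),
  term     = c("apple", "pear", "apple"),
  n        = c(2, 1, 3)
)

# Columns are given as bare names: term, document, count
weighted <- bind_tf_idf(counts, term, document, n)

weighted
```

The result keeps the input columns and adds `tf`, `idf`, and `tf_idf`. A term that appears in every document, such as `apple` here, gets an `idf` of zero.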
tidytext 0.1.3
- `get_sentiments` now works regardless of whether `tidytext` has been loaded or not (#50).
- `unnest_tokens` now supports data.table objects (#37).
- Fixed `to_lower` parameter in `unnest_tokens` to work properly for all tokenizing options.
- Updated `tidy.corpus`, `glance.corpus`, tests, and vignette for changes to quanteda API
- Removed the deprecated `pair_count` function, which is now in the in-development widyr package
- Added tidiers for LDA models from the `mallet` package
- Added the Loughran and McDonald dictionary of sentiment words specific to financial reports
- `unnest_tokens` preserves custom attributes of data frames and data.tables