---
output: rmarkdown::github_document
---
```{r include=FALSE}
knitr::opts_chunk$set(message = FALSE, warning = FALSE, echo = TRUE,
                      dev="png", fig.retina = 2, fig.width = 10, fig.height = 6)
```
# misinfo
Tools to Perform 'Misinformation' Analysis on a Text Corpus
## Description
Tools to Perform 'Misinformation' Analysis on a Text Corpus
## What's Inside The Tin
The following functions are implemented:
- `mi_analyze_document`: Run one or more documents through a misinformation/bias sentiment analysis
- `mi_plot_document_summary`: Plot raw frequency count summary from a processed corpus
- `mi_plot_individual_frequency`: Plot individual word/phrase frequency counts from a processed corpus
- `mi_use_builtin`: Use a built-in 'sentiment' word/phrase list
- `word_doc_to_txt`: Convert a Word document to a plaintext document
## Installation
```{r eval=FALSE}
devtools::install_github("hrbrmstr/misinfo")
```
```{r message=FALSE, warning=FALSE, error=FALSE, include=FALSE}
options(width=120)
```
## Usage
```{r message=FALSE, warning=FALSE, error=FALSE}
library(misinfo)
library(hrbrthemes)
library(tidyverse)
# current version
packageVersion("misinfo")
```
### Try it out on the built-in corpus
```{r}
mi_analyze_document(
  path = system.file("extdat", package="misinfo"),
  pattern = ".*txt$",
  bias_type = "explanatory",
  sentiment_list = mi_use_builtin("explanatory")
) -> corpus
```
```{r}
glimpse(corpus)
```
```{r}
corpus$frequency_summary
```
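For intuition about what a frequency summary like the one above reports, here is a minimal base-R sketch of counting word-list hits in a single document and normalizing by the document's word count. It is an illustration under stated assumptions, not the package's implementation: `word_list`, `doc`, `count_hits`, and the `keyword`/`ct` column names are hypothetical, and the real tokenization and matching rules may differ.

```r
# Hypothetical word list and document; neither comes from the misinfo package.
word_list <- c("because", "therefore", "due to")
doc <- "The outage happened because the cache failed. Therefore we restarted it."

# Lowercase the text, then split on runs of characters that are not
# letters or apostrophes to get a rough word count.
text <- tolower(doc)
tokens <- strsplit(text, "[^a-z']+")[[1]]
tokens <- tokens[nzchar(tokens)]
total_words <- length(tokens)

# Count fixed-string occurrences of a word or phrase in the text.
# This is plain substring matching, so it would also match inside longer words.
count_hits <- function(term, text) {
  m <- gregexpr(term, text, fixed = TRUE)[[1]]
  if (m[1] == -1) 0L else length(m)
}
hits <- vapply(word_list, count_hits, integer(1), text = text)

# Raw counts alongside counts normalized by document length.
data.frame(
  keyword = word_list,
  ct = hits,
  pct = hits / total_words,
  row.names = NULL
)
```

Normalizing by `total_words` is what makes documents of different lengths comparable; raw counts alone favor longer documents.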
```{r}
corpus$individual_frequency
```
```{r fig.height=8}
mi_plot_individual_frequency(corpus)
```
```{r fig.height=8}
mi_plot_individual_frequency(corpus, FALSE)
```
```{r}
mi_plot_document_summary(corpus)
```
```{r}
mi_plot_document_summary(corpus, FALSE)
```
### Compare a corpus across a bunch of biases
```{r}
map_df(c("uncertainty", "sourcing", "retractors", "explanatory"), ~{
  # compute individual bias scores
  mi_analyze_document(
    path = system.file("extdat", package="misinfo"),
    pattern = ".*txt$",
    bias_type = .x,
    sentiment_list = mi_use_builtin(.x)
  ) -> corpus

  mutate(corpus$frequency_summary, bias = corpus$bias_type)

}) -> biases

mutate(biases, pct = ifelse(pct == 0, NA, pct)) %>%
  ggplot(aes(sprintf("Doc #%s", doc_num), bias, fill=pct)) +
  geom_tile(color="#2b2b2b", size=0.125) +
  scale_x_discrete(expand=c(0,0)) +
  scale_y_discrete(expand=c(0,0)) +
  viridis::scale_fill_viridis(direction=-1, na.value="white") +
  labs(x=NULL, y=NULL, title="Word List Usage Heatmap (normalized)") +
  theme_ipsum_rc(grid="")

distinct(biases, doc_num, doc_id, total_words) %>%
  knitr::kable(format="markdown")
```