---
title: "keyword counting"
author: "Paul Bradshaw"
date: "8 December 2016"
output: html_document
---
This is based on steps outlined in a [blog post by John Victor Anderson](http://johnvictoranderson.org/?p=115).
First, we need to export the column of keywords:
```{r}
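bbcartfull <- read.csv("artdata.csv", header=FALSE)
# The keywords are in the 16th column, called V16 because there were no column headers
# Row and column names are switched off so that row numbers and a header are not counted as keywords later
write.table(bbcartfull$V16, 'keywordsastext.txt', row.names=FALSE, col.names=FALSE)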
```
Now we re-import that data as a character object using `scan`:
```{r}
keywords <- scan('keywordsastext.txt', what="char", sep=",")
# Convert all text to lower case so that, for example, "Art" and "art" are counted as the same keyword
keywords <- tolower(keywords)
```
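Before we can generate a table we need a series of conversions: split each entry into words at the spaces, flatten the resulting list into a single character vector, and then count how often each word occurs: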
```{r}
keywords.split <- strsplit(keywords, " ")
keywordsvec <- unlist(keywords.split)
keywordstable <- table(keywordsvec)
```
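To see what each of those conversions does, here is a minimal sketch on a made-up vector. The `sample_` names and their contents are invented for illustration and are not taken from `artdata.csv`.

```{r}
# Hypothetical sample data, standing in for the real keywords vector
sample_keywords <- c("art history", "modern art", "history")
# strsplit returns a list with one character vector per entry
sample_split <- strsplit(sample_keywords, " ")
# unlist flattens that list into a single character vector of words
sample_vec <- unlist(sample_split)
# table counts how many times each distinct word appears
sample_table <- table(sample_vec)
sample_table
```

In this sample, "art" and "history" each appear twice and "modern" once.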
That table can be written straight to a CSV:
```{r}
write.csv(keywordstable, 'keywordcount.csv')
```
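If the most frequent keywords should come first, one option is to sort the table before exporting it. This is only a sketch on invented counts: `sample_table`, `sample_sorted` and `sample_df` are hypothetical names, and the same calls could be applied to `keywordstable` instead.

```{r}
# Hypothetical sample counts, standing in for keywordstable
sample_table <- table(c("art", "history", "art", "modern", "art", "history"))
# Sort so that the most frequent keyword comes first
sample_sorted <- sort(sample_table, decreasing = TRUE)
# Convert to a data frame with one row per keyword and readable column names
sample_df <- as.data.frame(sample_sorted, stringsAsFactors = FALSE)
names(sample_df) <- c("keyword", "count")
sample_df
```

The resulting data frame can be passed to `write.csv` in the same way as the table, with `row.names=FALSE` if the row numbers are not wanted in the file.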