doc: simple examples and how to get ranked array of sentences
jhrcook committed Jan 4, 2021
1 parent b678481 commit cd75d30
Showing 1 changed file with 30 additions and 2 deletions.
32 changes: 30 additions & 2 deletions README.md
@@ -5,16 +5,44 @@
[![SwiftFormat](https://img.shields.io/badge/SwfitFormat-enabled-A166E6)](https://github.com/nicklockwood/SwiftFormat)
![GitHub Actions CI](https://github.com/jhrcook/TextRank/workflows/GitHub%20Actions%20CI/badge.svg)

A Swift package that implements the ['TextRank' algorithm](https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf) for summarization and keyword extraction.
It is based off of the [original Python implementation](https://github.com/summanlp/textrank).
A Swift package that implements the ['TextRank' algorithm](https://web.eecs.umich.edu/~mihalcea/papers/mihalcea.emnlp04.pdf) for summarization.
This algorithm uses the [PageRank (PR) algorithm](https://en.wikipedia.org/wiki/PageRank) to rank nodes by centrality in a weighted, undirected network where the nodes are sentences and edges indicate the degree of similarity of two sentences.
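To make the idea concrete, here is a minimal, self-contained sketch of ranking sentences with weighted PageRank. It does not use this package's API and is not necessarily how the package computes things internally: the function names `similarity` and `pageRank`, the damping factor, the tolerance, and the iteration cap are all assumptions chosen for illustration. The similarity measure is the word-overlap formula from the TextRank paper, and the scores are normalized to sum to 1.

```swift
import Foundation

/// Word-overlap similarity from the TextRank paper:
/// |A ∩ B| / (log|A| + log|B|). Returns 0 when the ratio is undefined.
func similarity(_ a: Set<String>, _ b: Set<String>) -> Double {
    guard !a.isEmpty, !b.isEmpty else { return 0 }
    let denominator = log(Double(a.count)) + log(Double(b.count))
    guard denominator > 0 else { return 0 }
    return Double(a.intersection(b).count) / denominator
}

/// Weighted PageRank by power iteration on a symmetric weight matrix.
/// The default parameter values are illustrative assumptions.
/// Nodes with no edges keep only the baseline score (1 - damping) / n.
func pageRank(
    weights: [[Double]],
    damping: Double = 0.85,
    tolerance: Double = 1e-6,
    maxIterations: Int = 200
) -> (scores: [Double], iterations: Int, didConverge: Bool) {
    let n = weights.count
    guard n > 0 else { return ([], 0, true) }
    // Total edge weight leaving each node, used to normalize its votes.
    let totals = weights.map { $0.reduce(0.0, +) }
    var scores = [Double](repeating: 1.0 / Double(n), count: n)

    for iteration in 1 ... maxIterations {
        var next = [Double](repeating: (1 - damping) / Double(n), count: n)
        for j in 0 ..< n where totals[j] > 0 {
            for i in 0 ..< n where weights[j][i] > 0 {
                next[i] += damping * scores[j] * weights[j][i] / totals[j]
            }
        }
        // Stop once the scores barely change between iterations.
        let change = zip(scores, next).reduce(0.0) { $0 + abs($1.0 - $1.1) }
        scores = next
        if change < tolerance { return (scores, iteration, true) }
    }
    return (scores, maxIterations, false)
}

// Three toy "sentences", already reduced to sets of words.
let sentences: [Set<String>] = [
    ["swift", "package", "ranks", "sentences"],
    ["pagerank", "ranks", "sentences", "centrality"],
    ["pagerank", "stop", "words"],
]

// Build the undirected graph: edge weight = similarity, no self-loops.
let weights: [[Double]] = sentences.indices.map { i in
    sentences.indices.map { j in
        i == j ? 0.0 : similarity(sentences[i], sentences[j])
    }
}

let result = pageRank(weights: weights)
```

In this toy graph the second sentence shares words with both of the others, so it ends up with the highest score; the third sentence has the weakest connection and ranks last.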

*This package is functional, but young. Please open an issue if you find any bugs or have any feature requests.*

Stop words were acquired from [Ranks NL](https://www.ranks.nl/stopwords).
Please open an issue to request additional language support.

### Example

```swift
import TextRank

let textRank = TextRank(text: myParagraph)
let rankedResults = textRank.runPageRank()
```

`rankedResults` is a `TextGraph.PageRankResult` struct with three properties:

1. `didConverge`: whether or not the PR algorithm converged
2. `iterations`: the number of iterations required for the PR algorithm to converge
3. `results`: a `TextGraph.NodeList` which is a type alias for `[Sentence: Float]` that holds the final rankings from the PR algorithm

The `PageRankResult.results` object holds the final node list after running PR.
Under the hood, it is a dictionary mapping each sentence to a rank.
The keys are of type `Sentence`, which has a `text` property holding the original sentence and a `words` property holding the set of words in the sentence.
Below is an example of how to obtain an array of sentences sorted by their rankings in decreasing order.

```swift
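let sortedSentences: [String] = rankedResults
    .results
    // Highest rank first, i.e. decreasing order.
    .sorted { $0.value > $1.value }
    // Keep only the original text of each `Sentence`.
    .map { $0.key.text }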
```
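
Building on this, a short summary can be made from only the top few sentences. The sketch below is self-contained, so it defines stand-in `Sentence` and `PageRankResult` types that mirror only what is described above; they are not the package's real definitions. The helper `topSentences(in:count:)` is hypothetical, and the scores in the example are made up.

```swift
// Stand-in for the package's `Sentence` type (assumption: hashable,
// with the `text` and `words` properties described above).
struct Sentence: Hashable {
    let text: String
    let words: Set<String>
}

// Stand-in for `TextGraph.PageRankResult` with its three documented properties.
struct PageRankResult {
    let didConverge: Bool
    let iterations: Int
    let results: [Sentence: Float]
}

/// Hypothetical helper: returns the text of the `count` highest-ranked
/// sentences, best first, or `nil` if PageRank did not converge.
func topSentences(in result: PageRankResult, count: Int) -> [String]? {
    guard result.didConverge else { return nil }
    return result.results
        .sorted { $0.value > $1.value }
        .prefix(max(0, count))
        .map { $0.key.text }
}

// Made-up scores, for illustration only.
let example = PageRankResult(
    didConverge: true,
    iterations: 12,
    results: [
        Sentence(text: "First sentence.", words: ["first", "sentence"]): 0.2,
        Sentence(text: "Second sentence.", words: ["second", "sentence"]): 0.5,
        Sentence(text: "Third sentence.", words: ["third", "sentence"]): 0.3,
    ]
)

let summary = topSentences(in: example, count: 2) ?? []
```

Returning `nil` when `didConverge` is `false` is one possible design choice; depending on the use case, it may be acceptable to use the rankings from the last iteration instead.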

---

### Similar projects

This code base is based on the [original Python implementation](https://github.com/summanlp/textrank).
There is another Swift package, ['SwiftTextRank'](https://github.com/goncharik/SwiftTextRank), that implements this algorithm.
