Merge pull request #8 from LeidenUniversityLibrary/documentation
Add install and usage documentation
Showing 6 changed files with 147 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,29 @@ | ||
--- | ||
title: Extract metadata from converted files | ||
# SPDX-FileCopyrightText: 2023-present Leiden University Libraries <[email protected]> | ||
# SPDX-License-Identifier: CC-BY-4.0 | ||
--- | ||
|
||
Nexis Uni includes fairly standardised metadata for each article, such as the | ||
title, name of the publication, publication date, a byline and the number of | ||
words in the article body. | ||
The `analyse` command creates a CSV file that includes these metadata for each | ||
input file. | ||
|
||
# Usage | ||
|
||
The input directory must contain the Markdown files produced by the `convert` | ||
command. | ||
The output directory, which defaults to the input directory if not specified, | ||
will contain a file named *analysis-results.csv*. | ||
|
||
Run with Hatch: | ||
|
||
```sh | ||
hatch run nexis analyse -i INPUT_DIRECTORY -o OUTPUT_DIRECTORY | ||
``` | ||
|
||
If you installed the package, you can run: | ||
|
||
```sh | ||
nexis analyse -i INPUT_DIRECTORY -o OUTPUT_DIRECTORY | ||
``` |
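Once the command has finished, the resulting CSV can be inspected with Python's standard library. The sketch below is a minimal example under stated assumptions: the column names are not listed in this documentation, so it only reports whatever header the file contains. The function name `load_results`, the `converted` directory and the UTF-8 encoding are illustrative assumptions, not part of the tool.

```python
import csv
from pathlib import Path


def load_results(csv_path):
    """Return (header, rows) for a results CSV; each row is a dict keyed by header."""
    # UTF-8 is an assumption about how the file was written.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        header = reader.fieldnames or []
    return list(header), rows


if __name__ == "__main__":
    # "converted" is a hypothetical output directory; adjust it to your own.
    results_file = Path("converted") / "analysis-results.csv"
    if results_file.exists():
        header, rows = load_results(results_file)
        print(f"{len(rows)} rows, columns: {', '.join(header)}")
    else:
        print(f"{results_file} not found; run `nexis analyse` first")
```

Printing the header first is a cheap way to learn which metadata fields are present before writing any further analysis code.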
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
--- | ||
title: Extract search terms from converted files | ||
# SPDX-FileCopyrightText: 2023-present Leiden University Libraries <[email protected]> | ||
# SPDX-License-Identifier: CC-BY-4.0 | ||
--- | ||
|
||
In the files, Nexis Uni marks the phrases or terms that caused an article to | ||
match the search. | ||
This markup lets us find which terms occur in which article. | ||
|
||
The `terms` command produces a CSV file that links filenames to counts of terms. | ||
|
||
# Usage | ||
|
||
The input directory must contain the Markdown files produced by the `convert` | ||
command. | ||
The output directory, which defaults to the input directory if not specified, | ||
will contain a file named *terms-results.csv*. | ||
|
||
Run with Hatch: | ||
|
||
```sh | ||
hatch run nexis terms -i INPUT_DIRECTORY -o OUTPUT_DIRECTORY | ||
``` | ||
|
||
If you installed the package, you can run: | ||
|
||
```sh | ||
nexis terms -i INPUT_DIRECTORY -o OUTPUT_DIRECTORY | ||
``` | ||
|
||
!!! note | ||
Marked phrases and terms are only extracted from the body of the articles. |
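
For batch processing, the same command line can be assembled and run from Python. This is a sketch only: the helper names `build_command` and `run_nexis` are hypothetical, it uses only the `-i` and `-o` options shown above, and it assumes that `nexis` is available on the `PATH` (that is, the package is installed).

```python
import subprocess


def build_command(subcommand, input_dir, output_dir=None):
    """Build the argument list for a `nexis` subcommand as documented above."""
    command = ["nexis", subcommand, "-i", str(input_dir)]
    if output_dir is not None:
        # Without -o, the tool is documented to write its CSV to the input directory.
        command += ["-o", str(output_dir)]
    return command


def run_nexis(subcommand, input_dir, output_dir=None):
    """Run the command; raises subprocess.CalledProcessError on a non-zero exit."""
    return subprocess.run(build_command(subcommand, input_dir, output_dir), check=True)
```

Under these assumptions, `run_nexis("terms", "converted")` would be expected to leave *terms-results.csv* in the hypothetical `converted` directory.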
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,48 @@ | ||
--- | ||
title: Installation | ||
# SPDX-FileCopyrightText: 2023-present Leiden University Libraries <[email protected]> | ||
# SPDX-License-Identifier: CC-BY-4.0 | ||
--- | ||
|
||
The scripts are written in Python and require Python 3.9 or newer to run. | ||
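
To confirm that an interpreter meets this requirement before installing anything, a short check like the following can help. It is a minimal sketch; the names `MINIMUM_VERSION` and `meets_minimum` are illustrative and not part of the package.

```python
import sys

MINIMUM_VERSION = (3, 9)  # taken from the requirement stated above


def meets_minimum(version_info=None, minimum=MINIMUM_VERSION):
    """Return True if the given (or current) interpreter version is new enough."""
    if version_info is None:
        version_info = sys.version_info
    # Compare only the major and minor components.
    return tuple(version_info[:2]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {current}: {status}")
```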
|
||
The scripts have not been published to PyPI. | ||
To use them, either install the package from the git repository using pip or | ||
run the scripts with Hatch. | ||
We describe how to use Hatch below. | ||
|
||
# Install Hatch | ||
|
||
Follow the [Hatch installation instructions][Hatch] to install Hatch. | ||
|
||
[Hatch]: https://hatch.pypa.io/latest/install/ | ||
|
||
# Clone the git repository | ||
|
||
```sh | ||
git clone https://github.com/LeidenUniversityLibrary/nexis-analysis.git | ||
cd nexis-analysis | ||
``` | ||
|
||
# Run a `nexis` command | ||
|
||
After the previous steps, you should be in the `nexis-analysis` directory. | ||
To check that the tool works, run: | ||
|
||
```sh | ||
hatch run nexis --help | ||
``` | ||
|
||
This should show the available commands: | ||
|
||
```output | ||
Usage: nexis [OPTIONS] COMMAND [ARGS]... | ||
Options: | ||
--help Show this message and exit. | ||
Commands: | ||
analyse Extract information from GFM documents in a directory | ||
convert Convert .docx files in a directory to GitHub-flavoured Markdown | ||
terms Extract marked-up search terms or phrases from GFM documents... | ||
``` |