[OMP] [main] Port the CSV importexport tool for OMP 3.5 #1845

453 changes: 279 additions & 174 deletions plugins/importexport/csv/CSVImportExportPlugin.php

Large diffs are not rendered by default.

129 changes: 129 additions & 0 deletions plugins/importexport/csv/README.md
@@ -0,0 +1,129 @@
# CSV Import Export Plugin

## Table of Contents
- [Overview](#overview)
- [Usage Instructions](#usage-instructions)
- [CSV File Structure and Field Descriptions](#csv-file-structure-and-field-descriptions)
- [Required Fields and Headers](#required-fields-and-headers)
- [Authors Data Organization](#authors-data-organization)
- [Examples](#examples)
- [Common Use Cases](#common-use-cases)
- [Best Practices and Troubleshooting](#best-practices-and-troubleshooting)
- [Limitations and Special Considerations](#limitations-and-special-considerations)

## Overview
The CSV Import Export Plugin is a command-line tool for importing submission data from a CSV file into OMP. It allows you to batch-import submissions using a properly formatted CSV file.

## Usage Instructions
### How to Run
Use the following command in your terminal:
```
php tools/importExport.php CSVImportExportPlugin [path_to_csv_file] [username]
```
- **[path_to_csv_file]**: The path to the CSV file containing submission data.
- **[username]**: The username of the user to whom the imported submissions will be assigned.

**Example:**
```
php tools/importExport.php CSVImportExportPlugin /home/user/submissions.csv johndoe
```

### Command Parameters Table

| Parameter | Description | Example |
|-------------------|---------------------------------------------------------|--------------------------------|
| [path_to_csv_file]| Path to the CSV file containing submission data | /home/user/submissions.csv |
| [username] | Username of the user to whom the imported submissions will be assigned | johndoe |

## CSV File Structure and Field Descriptions

The CSV file should have the following structure and fields:

| Column Name | Description | Required | Example Value |
|-------------------------|--------------------------------------------------------------|:--------:|------------------------------------------------|
| pressPath | Identifier for the press | Yes | leo |
| authorString | Authors list; separate multiple authors with semicolons | Yes | "Given1,Family1,[email protected];John,Doe,[email protected]" |
| title | Title of the submission | Yes | Title text |
| abstract | Summary or abstract of the submission | Yes | Abstract text |
| seriesPath | Series identifier | No | (leave empty if not applicable) |
| year | Year of the submission | No | 2024 (leave empty if not applicable) |
| isEditedVolume | Flag indicating whether the submission is an edited volume (1 = Yes, 0 = No) | No | 1 (leave empty if not applicable) |
| locale | Locale code (e.g., en) | Yes | en |
| filename | Name of the file with submission content | Yes | submission.pdf |
| doi | Digital Object Identifier (if applicable) | No | 10.1111/hex.12487 |
| keywords | Keywords separated by semicolons | No | keyword1;keyword2;keyword3 |
| subjects | Subjects separated by semicolons | No | subject1;subject2 |
| bookCoverImage | Filename for the cover image | No | coverImage.png |
| bookCoverImageAltText | Alternative text for the cover image | No | Alt text, with commas |
| categories | Categories separated by semicolons | No | Category 1;Category 2;Category 3 (leave empty if not applicable) |
| genreName | Genre of the submission | No | MANUSCRIPT (leave empty if not applicable) |

**Note:** Ensure that fields with commas are properly quoted.
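
As a rough illustration of why quoting matters, the sketch below parses one quoted and one unquoted line with PHP's built-in `str_getcsv`. The shortened three-column lines are made up for the example and are not a valid import file.

```php
<?php
// Illustrative sketch only; not part of the plugin.
// The alt text contains a comma, so it must be wrapped in double quotes.
$quotedLine   = 'coverImage.png,"Alt text, with commas",MANUSCRIPT';
$unquotedLine = 'coverImage.png,Alt text, with commas,MANUSCRIPT';

// The separator, enclosure, and escape characters are passed explicitly.
$quotedFields   = str_getcsv($quotedLine, ',', '"', '\\');
$unquotedFields = str_getcsv($unquotedLine, ',', '"', '\\');

// Quoted: the alt text stays in one field (3 fields in total).
// Unquoted: the comma splits the alt text in two (4 fields in total),
// which shifts every following column by one position.
printf("quoted: %d fields, unquoted: %d fields\n", count($quotedFields), count($unquotedFields));
```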

### Required Fields and Headers

The CSV must contain exactly the following headers in the specified order:

**Expected Headers:**
```
pressPath,authorString,title,abstract,seriesPath,year,isEditedVolume,locale,filename,doi,keywords,subjects,bookCoverImage,bookCoverImageAltText,categories,genreName
```

**Required Fields (each must have a non-empty value in every row):**
```
pressPath,authorString,title,abstract,locale,filename
```

**Warning:** The CSV headers must exactly match those in sample.csv, in the same order. Any deviation (additional headers, missing headers, or reordering) will cause the CLI command to crash.
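
Because a header mismatch crashes the command, it can help to check the header row before running the import. The following is a minimal sketch of such a pre-flight check. The function name `findHeaderMismatch` and its messages are assumptions made for this example; they are not part of the plugin.

```php
<?php
// Hypothetical pre-flight check; not part of the plugin.
// The expected headers are copied from the list above.
const EXPECTED_HEADERS = [
    'pressPath', 'authorString', 'title', 'abstract', 'seriesPath', 'year',
    'isEditedVolume', 'locale', 'filename', 'doi', 'keywords', 'subjects',
    'bookCoverImage', 'bookCoverImageAltText', 'categories', 'genreName',
];

/**
 * Returns null when the header row matches exactly, or a short
 * description of the first problem found.
 */
function findHeaderMismatch(array $actualHeaders): ?string
{
    if ($actualHeaders === EXPECTED_HEADERS) {
        return null;
    }

    $missing = array_diff(EXPECTED_HEADERS, $actualHeaders);
    if ($missing) {
        return 'Missing headers: ' . implode(', ', $missing);
    }

    $extra = array_diff($actualHeaders, EXPECTED_HEADERS);
    if ($extra) {
        return 'Unexpected headers: ' . implode(', ', $extra);
    }

    return 'Headers are present but in the wrong order.';
}
```

To check a real file, the first row could be read with `fgetcsv` and passed to the function.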

## Authors Data Organization

Author information is processed by the `AuthorsProcessor` class (see AuthorsProcessor.php). In the CSV, author details should be provided in the `authorString` field following these rules:
- Multiple authors must be separated by a semicolon (`;`).
- Each author entry must contain comma-separated values in the following order:
  - Given Name (required)
  - Family Name (required)
  - Email Address (optional; if omitted, the tool defaults to the provided contact email)

**Example:**
```
"Given1,Family1,[email protected];John,Doe,"
```
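
The rules above can be sketched as a small parser. This is only an illustration of the format: the actual logic lives in AuthorsProcessor.php and may differ. The function name `parseAuthorString`, the `$fallbackEmail` parameter, and the example.org addresses are hypothetical.

```php
<?php
// Hypothetical helper illustrating the authorString format; not the plugin's code.
function parseAuthorString(string $authorString, string $fallbackEmail): array
{
    $authors = [];

    // Authors are separated by semicolons.
    foreach (explode(';', $authorString) as $entry) {
        if (trim($entry) === '') {
            continue;
        }

        // Each entry is "Given Name,Family Name,Email"; the email may be empty.
        $parts = array_map('trim', explode(',', $entry));
        [$givenName, $familyName, $email] = array_pad($parts, 3, '');

        $authors[] = [
            'givenName'  => $givenName,
            'familyName' => $familyName,
            // Fall back to the contact email when none is given.
            'email'      => $email !== '' ? $email : $fallbackEmail,
        ];
    }

    return $authors;
}

$authors = parseAuthorString('Given1,Family1,given1@example.org;John,Doe,', 'contact@example.org');
```

Here the second author has no email address, so the sketch assigns the fallback address to it.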

**Note:** All assets referenced in the CSV (e.g., files specified in `filename` or `bookCoverImage`) must reside in the same directory as the CSV file.

## Examples

### Command Example
**Command:**
```
php tools/importExport.php CSVImportExportPlugin /home/user/submissions.csv johndoe
```

**Example Output:**
```
Submission: "Title text" successfully imported.
Submission: "Another Title" successfully imported.
...
All submissions imported. 2 successes, 0 failures.
```

### Sample CSV File Snippet
```
pressPath,authorString,title,abstract,seriesPath,year,isEditedVolume,locale,filename,doi,keywords,subjects,bookCoverImage,bookCoverImageAltText,categories,genreName
leo,"Given1,Family1,[email protected];John,Doe,[email protected]",Title text,Abstract text,,2024,1,en,submission.pdf,10.1111/hex.12487,keyword1;keyword2,subject1;subject2,coverImage.png,"Alt text, with commas",Category 1;Category 2,MANUSCRIPT
```

## Common Use Cases
- **Batch Importing Submissions:** Import multiple submissions at once using a CSV file.
- **Data Migration:** Transfer submission data from legacy systems to OMP.
- **Automated Imports:** Integrate the tool into scripts for periodic data imports.

## Best Practices and Troubleshooting
- **Verify CSV Structure:** Always check your CSV against the sample structure provided above and ensure it strictly adheres to the required header order.
- **Check for Required Fields:** Ensure all mandatory fields (e.g., pressPath, authorString, title, abstract, locale, filename) are provided.
- **Validate Authors Format:** Confirm that the `authorString` field follows the format: Given Name, Family Name, Email (with multiple authors separated by semicolons).
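
The required-field check can be approximated per row before importing. The sketch below assumes each row has already been combined with the header names into an associative array; the function name `missingRequiredFields` is an assumption for this example, not plugin API.

```php
<?php
// Hypothetical row check; not part of the plugin.
// The mandatory fields are copied from the list in this README.
const REQUIRED_FIELDS = ['pressPath', 'authorString', 'title', 'abstract', 'locale', 'filename'];

/**
 * Returns the names of mandatory fields that are absent or empty in a row.
 */
function missingRequiredFields(array $row): array
{
    $missing = [];
    foreach (REQUIRED_FIELDS as $field) {
        if (trim((string) ($row[$field] ?? '')) === '') {
            $missing[] = $field;
        }
    }
    return $missing;
}

$row = [
    'pressPath' => 'leo',
    'authorString' => 'John,Doe,',
    'title' => 'Title text',
    'abstract' => '',
    'locale' => 'en',
    'filename' => 'submission.pdf',
];
$missing = missingRequiredFields($row);
```

For the row above, only `abstract` is reported because it is empty.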

## Limitations and Special Considerations
- The tool is command-line only; no web interface is available.
- **Warning:** CSV header mismatches—such as extra headers, missing headers, or headers in an incorrect order—will cause the CLI command to crash. Ensure the CSV exactly matches the header format provided in sample.csv.
95 changes: 95 additions & 0 deletions plugins/importexport/csv/classes/caches/CachedDaos.php
@@ -0,0 +1,95 @@
<?php

/**
 * @file plugins/importexport/csv/classes/caches/CachedDaos.php
 *
 * Copyright (c) 2013-2025 Simon Fraser University
 * Copyright (c) 2003-2025 John Willinsky
 * Distributed under the GNU GPL v3. For full terms see the file docs/COPYING.
 *
 * @class CachedDaos
 *
 * @ingroup plugins_importexport_csv
 *
 * @brief Cached DAOs
 */

namespace APP\plugins\importexport\csv\classes\caches;

use APP\facades\Repo;
use APP\press\PressDAO;
use APP\publicationFormat\PublicationDateDAO;
use APP\publicationFormat\PublicationFormatDAO;
use APP\section\DAO as SectionDAO;
use APP\submission\DAO as SubmissionDAO;
use PKP\author\DAO as AuthorDAO;
use PKP\category\DAO as CategoryDAO;
use PKP\db\DAO;
use PKP\db\DAORegistry;
use PKP\publication\DAO as PublicationDAO;
use PKP\submission\GenreDAO;
use PKP\submissionFile\DAO as SubmissionFileDAO;
use PKP\user\DAO as UserDAO;

class CachedDaos
{
    /**
     * @var DAO[] Array for caching already initialized DAOs.
     */
    private static array $daos = [];

    public static function getCategoryDao(): CategoryDAO
    {
        return self::$daos['CategoryDAO'] ??= Repo::category()->dao;
    }

    public static function getSubmissionDao(): SubmissionDAO
    {
        return self::$daos['SubmissionDAO'] ??= Repo::submission()->dao;
    }

    public static function getUserDao(): UserDAO
    {
        return self::$daos['UserDAO'] ??= Repo::user()->dao;
    }

    public static function getPressDao(): PressDAO
    {
        return self::$daos['PressDAO'] ??= DAORegistry::getDAO('PressDAO');
    }

    public static function getGenreDao(): GenreDAO
    {
        return self::$daos['GenreDAO'] ??= DAORegistry::getDAO('GenreDAO');
    }

    public static function getSeriesDao(): SectionDAO
    {
        return self::$daos['SeriesDAO'] ??= Repo::section()->dao;
    }

    public static function getPublicationDao(): PublicationDAO
    {
        return self::$daos['PublicationDAO'] ??= Repo::publication()->dao;
    }

    public static function getAuthorDao(): AuthorDAO
    {
        return self::$daos['AuthorDAO'] ??= Repo::author()->dao;
    }

    public static function getPublicationFormatDao(): PublicationFormatDAO
    {
        return self::$daos['PublicationFormatDAO'] ??= DAORegistry::getDAO('PublicationFormatDAO');
    }

    public static function getPublicationDateDao(): PublicationDateDAO
    {
        return self::$daos['PublicationDateDAO'] ??= DAORegistry::getDAO('PublicationDateDAO');
    }

    public static function getSubmissionFileDao(): SubmissionFileDAO
    {
        return self::$daos['SubmissionFileDAO'] ??= Repo::submissionFile()->dao;
    }
}
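
The getters in `CachedDaos` rely on the null coalescing assignment operator (`??=`), which evaluates its right-hand side only when the array entry is unset or null. A minimal standalone sketch of that pattern follows; `LazyRegistry` and its call counter are hypothetical names used only to make the behaviour observable.

```php
<?php
// Standalone sketch of lazy initialization with ??=; not part of the plugin.
final class LazyRegistry
{
    /** @var object[] Instances created so far, indexed by key. */
    private static array $instances = [];

    /** Number of times a factory was actually invoked. */
    public static int $factoryCalls = 0;

    public static function get(string $key, callable $factory): object
    {
        // The right-hand side runs only if the entry is unset or null.
        return self::$instances[$key] ??= self::build($factory);
    }

    private static function build(callable $factory): object
    {
        self::$factoryCalls++;
        return $factory();
    }
}

$first  = LazyRegistry::get('ExampleDAO', fn () => new stdClass());
$second = LazyRegistry::get('ExampleDAO', fn () => new stdClass());

var_dump($first === $second); // bool(true): the second call reuses the cached instance
```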
92 changes: 92 additions & 0 deletions plugins/importexport/csv/classes/caches/CachedEntities.php
@@ -0,0 +1,92 @@
<?php

/**
 * @file plugins/importexport/csv/classes/caches/CachedEntities.php
 *
 * Copyright (c) 2013-2025 Simon Fraser University
 * Copyright (c) 2003-2025 John Willinsky
 * Distributed under the GNU GPL v3. For full terms see the file docs/COPYING.
 *
 * @class CachedEntities
 *
 * @ingroup plugins_importexport_csv
 *
 * @brief Cached entities
 */

namespace APP\plugins\importexport\csv\classes\caches;

use APP\facades\Repo;
use APP\press\Press;
use Exception;
use PKP\security\Role;
use PKP\user\User;

class CachedEntities
{
    private static array $presses = [];

    private static array $genreIds = [];

    private static array $userGroupIds = [];

    private static array $seriesIds = [];

    private static ?User $user = null;

    /**
     * Returns the cached Press for the given path, retrieving it on first access.
     */
    public static function getCachedPress(string $pressPath): ?Press
    {
        $pressDao = CachedDaos::getPressDao();
        return self::$presses[$pressPath] ??= $pressDao->getByPath($pressPath);
    }

    /**
     * Returns the cached genre ID for the given press and genre name.
     * A missing genre is cached as null, so it is looked up only once.
     */
    public static function getCachedGenreId(int $pressId, string $genreName): ?int
    {
        $customKey = "{$genreName}_{$pressId}";

        if (array_key_exists($customKey, self::$genreIds)) {
            return self::$genreIds[$customKey];
        }

        $genreDao = CachedDaos::getGenreDao();
        $genre = $genreDao->getByKey($genreName, $pressId);

        return self::$genreIds[$customKey] = $genre?->getId();
    }

    /**
     * Returns the cached ID of the first author user group of the press.
     */
    public static function getCachedUserGroupId(int $pressId, string $pressPath): ?int
    {
        return self::$userGroupIds[$pressPath] ??= Repo::userGroup()
            ->getArrayIdByRoleId(Role::ROLE_ID_AUTHOR, $pressId)[0] ?? null;
    }

    /**
     * Returns the cached series ID for the given series path and press.
     * A missing series is cached as null, so it is looked up only once.
     */
    public static function getCachedSeriesId(string $seriesPath, int $pressId): ?int
    {
        $customKey = "{$seriesPath}_{$pressId}";

        // Check for the key itself: reading an unset key would raise a warning.
        if (array_key_exists($customKey, self::$seriesIds)) {
            return self::$seriesIds[$customKey];
        }

        $seriesDao = CachedDaos::getSeriesDao();
        $series = $seriesDao->getByPath($seriesPath, $pressId);

        return self::$seriesIds[$customKey] = $series?->getId();
    }

    /**
     * Returns the cached user, retrieving it by username on first access.
     *
     * @throws Exception If no user is cached yet and no username is given.
     */
    public static function getCachedUser(?string $username = null): ?User
    {
        if (self::$user) {
            return self::$user;
        }

        if (!$username) {
            throw new Exception('User not found');
        }

        return self::$user = CachedDaos::getUserDao()->getByUsername($username);
    }
}
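
`getCachedGenreId` checks whether the cache key exists instead of relying on `??=`. One plausible reason is that `??=` treats a cached null as "not cached", so a lookup that finds nothing would be repeated on every call. The standalone sketch below compares the two approaches; the `$lookup` closure is a hypothetical stand-in for a DAO call that finds no record.

```php
<?php
// Standalone sketch comparing two caching styles for nullable results; not plugin code.
$lookups = 0;

// Stand-in for a DAO query that finds nothing and therefore returns null.
$lookup = function (string $key) use (&$lookups): ?int {
    $lookups++;
    return null;
};

// Style 1: ??= stores null, but null counts as "missing", so the lookup runs again.
$cacheA = [];
$cacheA['missing'] ??= $lookup('missing');
$cacheA['missing'] ??= $lookup('missing');
$lookupsWithCoalesce = $lookups; // 2

// Style 2: array_key_exists() sees the stored null and skips the second lookup.
$lookups = 0;
$cacheB = [];
for ($i = 0; $i < 2; $i++) {
    if (!array_key_exists('missing', $cacheB)) {
        $cacheB['missing'] = $lookup('missing');
    }
}
$lookupsWithKeyCheck = $lookups; // 1
```

The key-existence check costs a few more lines but avoids repeated queries for entities that do not exist.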