add options to connect to database for c. annotation #13

Open: wants to merge 11 commits into master
16 changes: 16 additions & 0 deletions README.md
@@ -58,3 +58,19 @@ Installation instructions:
- Do you want to install any cache files (y/n)? -> y, install GRCh37 or GRCh38 cache
- Do you want to install any FASTA files (y/n)? -> y, install GRCh37 or GRCh38 FASTA
- Do you want to install any plugins (y/n)? -> n

## Connect an Ensembl database

To annotate c. (HGVS coding DNA) alterations, you must connect to a MySQL database. For example, to connect to Ensembl's public database, set the following
in [application.properties](target/classes/application.properties):

```
database.host = ensembldb.ensembl.org
database.port = 5306
database.user = anonymous
# password is not used when connecting to ensembl
database.password =
```

However, Ensembl discourages relying on its public database because it is very slow, so we recommend setting up a local copy of the database by following
the [instructions](https://useast.ensembl.org/info/docs/webcode/mirror/install/ensembl-data.html) on the Ensembl website.
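
As a rough illustration of how the four `database.*` keys above could be read and sanity-checked before they reach VEP, here is a minimal sketch in plain Java. The class `DatabaseSettingsSketch` and its `parse` method are hypothetical and not part of this PR (the PR itself injects the values with `@Value`); the range check on the port is likewise an assumption about what a useful validation might look like.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper (not in the PR): parses and validates the database.* keys.
final class DatabaseSettingsSketch {
    final String host;
    final int port;
    final String user;
    final String password;

    private DatabaseSettingsSketch(String host, int port, String user, String password) {
        this.host = host;
        this.port = port;
        this.user = user;
        this.password = password;
    }

    // Reads properties-format text; "database.password =" yields an empty string.
    static DatabaseSettingsSketch parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String host = props.getProperty("database.host", "").trim();
        String portText = props.getProperty("database.port", "").trim();
        String user = props.getProperty("database.user", "").trim();
        String password = props.getProperty("database.password", "");
        if (host.isEmpty()) {
            throw new IllegalArgumentException("database.host is required");
        }
        int port;
        try {
            port = Integer.parseInt(portText);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("database.port must be a number: " + portText);
        }
        if (port < 1 || port > 65535) {
            throw new IllegalArgumentException("database.port out of range: " + port);
        }
        return new DatabaseSettingsSketch(host, port, user, password);
    }

    public static void main(String[] args) {
        String text = String.join("\n",
                "database.host = ensembldb.ensembl.org",
                "database.port = 5306",
                "database.user = anonymous",
                "# password is not used when connecting to ensembl",
                "database.password =");
        DatabaseSettingsSketch s = parse(text);
        System.out.println(s.host + ":" + s.port + " as " + s.user);
    }
}
```

Failing fast on a missing host or a malformed port gives a clearer error than letting VEP start with an incomplete set of connection flags.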
67 changes: 65 additions & 2 deletions src/main/java/org/genomenexus/vep_wrapper/VepController.java
@@ -44,7 +44,7 @@ public void getVepAnnotation(
try {
out = response.getOutputStream();
response.setContentType("application/json");
vepRunner.run(Arrays.asList(region + "/" + allele), false, responseTimeout, out);
vepRunner.run(Arrays.asList(region + "/" + allele), false, responseTimeout, out, false);
} catch (IOException | InterruptedException | VepLaunchFailureException e) {
e.printStackTrace();
// TODO: throw and handle errors with global exception handler
@@ -83,7 +83,7 @@ public void fetchVepAnnotationByRegionsPOST(
try {
out = response.getOutputStream();
response.setContentType("application/json");
vepRunner.run(regions, true, responseTimeout, out);
vepRunner.run(regions, true, responseTimeout, out, false);
} catch (IOException | InterruptedException | VepLaunchFailureException e) {
e.printStackTrace();
// TODO: throw and handle errors with global exception handler
@@ -98,4 +98,67 @@ public void fetchVepAnnotationByRegionsPOST(
return;
}

@RequestMapping(value = "/vep/human/hgvsc/{hgvsc}",

Review comment (Member): Since the database is optional, should we hide or disable this endpoint when no database is configured?

method = RequestMethod.GET,
produces = "application/json")
@ApiOperation(value = "Retrieves VEP results for single c. variant specified in hgvs syntax (https://ensembl.org/info/docs/tools/vep/vep_formats.html)",
nickname = "fetchVepHgvscAnnotationByGET")
public void getVepHgvscAnnotation(
@ApiParam(value="ENST00000618231.3:c.9G>C", required=true)
@PathVariable
String hgvsc,
@ApiParam("Maximum time (in seconds) to let VEP construct a response (0 = no limit)")
@RequestParam(defaultValue = "0")
Integer responseTimeout,
HttpServletResponse response) {
OutputStream out = null;
try {
out = response.getOutputStream();
response.setContentType("application/json");
vepRunner.run(Arrays.asList(hgvsc), false, responseTimeout, out, true);
} catch (IOException | InterruptedException | VepLaunchFailureException e) {
e.printStackTrace();
// TODO: throw and handle errors with global exception handler
} finally {
try {
response.flushBuffer();
} catch (Throwable e) {
e.printStackTrace();
// TODO: throw and handle errors with global exception handler
}
}
}

@RequestMapping(value = "/vep/human/hgvsc",

Review comment (Member): Same here: should we hide or disable this endpoint when no database is configured?

method = RequestMethod.POST)
@ApiOperation(value = "Retrieves VEP results for multiple c. variants specified in hgvs syntax (https://ensembl.org/info/docs/tools/vep/vep_formats.html)",
nickname = "fetchVepHgvscAnnotationsByPOST")
public void fetchVepHgvscAnnotationsPOST(
@ApiParam(value = "List of variants in ENSEMBL hgvsc format. For example:\n" +
"[\"ENST00000618231.3:c.9G>C\", \"ENST00000471631.1:c.28_33delTCGCGG\"]",
required = true)
@RequestBody
List<String> hgvscStrings,
@ApiParam("Maximum time (in seconds) to let VEP construct a response (0 = no limit)")
@RequestParam(defaultValue = "0")
Integer responseTimeout,
HttpServletResponse response) {
OutputStream out = null;
try {
out = response.getOutputStream();
response.setContentType("application/json");
vepRunner.run(hgvscStrings, true, responseTimeout, out, true);
} catch (IOException | InterruptedException | VepLaunchFailureException e) {
e.printStackTrace();
// TODO: throw and handle errors with global exception handler
} finally {
try {
response.flushBuffer();
} catch (Throwable e) {
e.printStackTrace();
// TODO: throw and handle errors with global exception handler
}
}
return;
}
}
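
One possible answer to the review question on the two `hgvsc` endpoints is to check at request time whether a database host has been configured and refuse the request otherwise. The sketch below is plain Java with hypothetical names (`HgvscEndpointGuardSketch`, `isDatabaseConfigured`, `statusFor`); it assumes the property would be declared with an empty default, for example `@Value("${database.host:}")`, so that the application can start without a database. Responding with 503 is also an assumption; hiding the endpoints entirely would need a different mechanism.

```java
// Hypothetical guard (not in the PR): decides whether the hgvsc endpoints
// should serve a request, based on whether a database host is configured.
final class HgvscEndpointGuardSketch {
    static final int OK = 200;
    static final int SERVICE_UNAVAILABLE = 503;

    // A blank or missing host is treated as "no database configured".
    static boolean isDatabaseConfigured(String databaseHost) {
        return databaseHost != null && !databaseHost.trim().isEmpty();
    }

    // Status code a controller method could set before calling vepRunner.run(...).
    static int statusFor(String databaseHost) {
        return isDatabaseConfigured(databaseHost) ? OK : SERVICE_UNAVAILABLE;
    }

    public static void main(String[] args) {
        System.out.println(statusFor("ensembldb.ensembl.org"));
        System.out.println(statusFor(""));
    }
}
```

In the controller, the same check could run at the top of `getVepHgvscAnnotation` and `fetchVepHgvscAnnotationsPOST`, setting the status on the `HttpServletResponse` and returning early when no database is available.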
52 changes: 42 additions & 10 deletions src/main/java/org/genomenexus/vep_wrapper/VepRunner.java
@@ -43,6 +43,18 @@ public class VepRunner {
@Value("${vep.fastaFileRelativePath:homo_sapiens/98_GRCh37/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz}")
private String vepFastaFileRelativePath;

@Value("${database.host}")
private String databaseHost;

@Value("${database.port}")
private String databasePort;

@Value("${database.user}")
private String databaseUser;

@Value("${database.password}")
private String databasePassword;

private Path vepFastaFilePath;
@Autowired
private void setVepFastaFilePath() {
@@ -64,21 +76,21 @@ private void createTmpDirIfNecessary() throws IOException {
}

/**
* Create a file containing the regions received in the input query.
* Write the user supplied regions from the "regions" argument to an output file.
* CAUTION : this function does not sort the regions into chromosomal order. The
* Create a file containing the variants received in the input query.
* Write the user supplied variants from the "variants" argument to an output file.
* CAUTION : this function does not sort the variants into chromosomal order. The
* VEP command line tool is very slow when the input is not sorted. It is expected
* that users of the VepRunner will always send requests that have been pre-sorted.
*
* @param regions - the regions as passed by the user
* @param variants - the variants as passed by the user
* @param vepInputFile - the file to be written
* @throws IOException if the input file cannot be written

**/
private void constructFileForVepProcessing(List<String> regions, Path vepInputFile) throws IOException {
private void constructFileForVepProcessing(List<String> variants, Path vepInputFile) throws IOException {
try (PrintWriter out = new PrintWriter(Files.newBufferedWriter(vepInputFile))) {
for (String region : regions) {
out.println(region);
for (String variant : variants) {
out.println(variant);
}
out.close();
} catch (IOException e) {
@@ -111,7 +123,7 @@ public boolean timeIsExpired(Instant timeToKillProcess) {
return Instant.now().isAfter(timeToKillProcess);
}

public void run(List<String> regions, Boolean convertToListJSON, Integer responseTimeout, OutputStream responseOut)
public void run(List<String> variants, Boolean convertToListJSON, Integer responseTimeout, OutputStream responseOut, boolean useDatabase)
throws IOException, InterruptedException, VepLaunchFailureException {

printWithTimestamp("Running vep");
@@ -120,7 +132,26 @@ public void run(List<String> regions, Boolean convertToListJSON, Integer respons
Path constructedInputFile = createTempFileForVepInput();

// get vep parameters (use -Dvep.params to change)
String vepParameters = System.getProperty("vep.params", String.join(" ",
String vepParameters;
if (useDatabase) {
vepParameters = System.getProperty("vep.params", String.join(" ",
"--database",
"--host " + databaseHost,
"--user " + databaseUser,
"--password " + databasePassword,
"--port " + databasePort,
"--everything",
"--hgvsg",
"--xref_refseq",
"--format hgvs",
"--fork " + vepForkCount,
"--fasta " + vepFastaFilePath,
"--json",
"-i " + constructedInputFile,
"-o STDOUT",
"--no_stats"));
} else {
vepParameters = System.getProperty("vep.params", String.join(" ",
"--cache",
"--offline",
"--everything",
@@ -134,6 +165,7 @@ public void run(List<String> regions, Boolean convertToListJSON, Integer respons
"-i " + constructedInputFile,
"-o STDOUT",
"--no_stats"));
}

// build command
List<String> commandElements = new ArrayList<String>();
@@ -143,7 +175,7 @@ public void run(List<String> regions, Boolean convertToListJSON, Integer respons
}

printWithTimestamp("writing constructed input file");
constructFileForVepProcessing(regions, constructedInputFile);
constructFileForVepProcessing(variants, constructedInputFile);

printWithTimestamp("processing requests");
printWithTimestamp("process command elements: " + commandElements);
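
In the database branch of `run` above, the default `database.password` is empty, so the joined parameter string contains `--password` with no value directly before `--port`. How VEP parses that is not shown in this PR. A minimal sketch of one alternative, which builds the connection flags as a list and leaves out `--password` when it is empty, is given below. The class and method names are hypothetical; only the flags themselves (`--database`, `--host`, `--port`, `--user`, `--password`) come from the diff.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper (not in the PR): builds the VEP database flags.
final class VepDatabaseArgsSketch {

    // Returns the connection flags; --password is omitted when no password is set.
    static List<String> databaseArgs(String host, String port, String user, String password) {
        List<String> args = new ArrayList<>(Arrays.asList(
                "--database",
                "--host", host,
                "--port", port,
                "--user", user));
        if (password != null && !password.isEmpty()) {
            args.add("--password");
            args.add(password);
        }
        return args;
    }

    // Copy of the arguments that is safe to log: the password value is masked.
    static List<String> redactForLog(List<String> args) {
        List<String> copy = new ArrayList<>(args);
        for (int i = 0; i + 1 < copy.size(); i++) {
            if ("--password".equals(copy.get(i))) {
                copy.set(i + 1, "****");
            }
        }
        return copy;
    }

    public static void main(String[] args) {
        List<String> vepArgs = databaseArgs("ensembldb.ensembl.org", "5306", "anonymous", "");
        System.out.println(redactForLog(vepArgs));
    }
}
```

The redaction step matters because `run` prints the full command elements with `printWithTimestamp`, which would write a non-empty password to the log.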
4 changes: 4 additions & 0 deletions src/main/resources/application.properties
@@ -0,0 +1,4 @@
database.host = ensembldb.ensembl.org
database.port = 5306
database.user = anonymous
database.password =