- Introduction
- Broadsea - Quick start
- Broadsea - Advanced Usage
- Shutdown Broadsea
- Broadsea Intended Uses
- Troubleshooting
- Hardware/OS Requirements for Installing Docker
- License
Broadsea runs the core OHDSI technology stack using cross-platform Docker container technology.
Information on Observational Health Data Sciences and Informatics (OHDSI)
This repository contains the Docker Compose file used to launch the OHDSI Broadsea Docker containers:
- OHDSI R HADES - in RStudio Server
- OHDSI Atlas - including WebAPI REST services
- OHDSI Ares
- OHDSI Perseus (Experimental)
Additionally, Broadsea offers limited support for services not specifically needed for OHDSI applications that often are useful:
- OpenLDAP for testing security in Atlas
- Open Shiny Server for deploying Shiny apps without a commercial license
- Posit Connect for sites with commercial Posit licenses, for deploying Shiny apps
- DBT for ETL design
Throughout this README, docker compose commands are written as docker compose (no hyphen), per the Docker Compose V2 standard outlined by Docker.
For Broadsea 3.5, you will need Docker version 1.27.0+.
- Linux, Mac, or Windows with WSL
- Docker 1.27.0+
- Git
- Chromium-based web browser (Chrome, Edge, etc.)
If using Mac Silicon (M1, M2), you may need to set the DOCKER_ARCH variable in Section 1 of the .env file to "linux/arm64". Some Broadsea services still need to run via emulation of linux/amd64 and are hard-coded as such.
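One way to decide on a value is to look at the host architecture. The sketch below is not part of Broadsea: suggest_docker_arch is a hypothetical helper, and mapping every non-ARM machine to linux/amd64 is an assumption, so check Section 1 of your .env file for the values it actually expects.

```shell
#!/bin/sh
# Hypothetical helper (not part of Broadsea): map a machine name, as printed
# by `uname -m`, to a candidate DOCKER_ARCH value.
suggest_docker_arch() {
  case "$1" in
    # Apple Silicon reports arm64; Linux on ARM reports aarch64.
    arm64|aarch64) echo "linux/arm64" ;;
    # Assumption: every other machine runs the amd64 images.
    *) echo "linux/amd64" ;;
  esac
}

# Print a suggested line for the current host; copy it into .env by hand.
echo "DOCKER_ARCH=\"$(suggest_docker_arch "$(uname -m)")\""
```

The script only prints a suggestion and does not modify the .env file.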
- Download and install Docker. See the installation instructions at the Docker Web Site
- git clone this GitHub repo:
git clone https://github.com/OHDSI/Broadsea.git
- In a command line / terminal window - navigate to the directory where this README.md file is located and start the Broadsea Docker Containers using the below command. On Linux you may need to use 'sudo' to run this command. Wait up to one minute for the Docker containers to start.
docker compose --profile default up -d
- In your web browser open the URL:
"http://127.0.0.1"
- Click on the Atlas link to open Atlas in a new browser window
- Click on the Hades link to open HADES (RStudio) in a new browser window.
- The default RStudio userid is 'ohdsi' and the default password is located in the ./secrets/hades/HADES_PASSWORD file.
The .env file that comes with Broadsea has default and sample values. For advanced use, modify the values as appropriate, as covered below.
Broadsea leverages Docker Secrets to handle sensitive passwords and secret keys.
In Broadsea 3.0, these were handled via plain-text environment variables, which is not best security practice.
Now in Broadsea 3.5, each sensitive password or secret key is stored in its own file; the paths to these files are then set in the relevant Section of the .env file. Please refer to the default ./secrets folder for examples of how to set up these files for your site.
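As a rough illustration of the file-per-secret layout, the sketch below writes one value to one file. write_secret is a hypothetical helper, and storing the value without a trailing newline is an assumption; compare against the sample files in ./secrets before relying on it. The example writes to a scratch directory so that no real secret is overwritten.

```shell
#!/bin/sh
# Hypothetical helper (not part of Broadsea): store one secret value in one file.
write_secret() {
  mkdir -p "$(dirname "$1")"
  # Assumption: the file holds only the value, with no trailing newline.
  printf '%s' "$2" > "$1"
}

# Demonstrate against a scratch directory. For a real site the target would be
# a path such as ./secrets/hades/HADES_PASSWORD.
demo_dir=$(mktemp -d)
write_secret "$demo_dir/secrets/hades/HADES_PASSWORD" "change-me"
ls -l "$demo_dir/secrets/hades/HADES_PASSWORD"
```

Passing a secret as a command-line argument leaves it in the shell history, so for real passwords you may prefer to create the file with an editor.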
In Section 1 of the .env file, set BROADSEA_HOST as the IP address or host name (without http/https) of the remote server.
Broadsea makes use of Docker profiles to allow for either a full default deployment ("default"), or a more a-la-carte approach in which you can pick and choose which services you'd like to deploy.
You can use this syntax for this approach, substituting profile names in:
docker compose --env-file .env --profile profile1 --profile profile2 ... up -d
Profile | Description |
---|---|
default | |
atlas-from-image | |
atlas-from-git | |
webapi-from-image | |
webapi-from-git | |
atlasdb | |
solr-vocab-no-import | |
solr-vocab-with-import | |
ares | |
content | |
omop-vocab-pg-load | |
phoebe-pg-load | |
openldap | |
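The a-la-carte syntax can be assembled from a list of profile names. The sketch below only prints the command; compose_up_cmd is a hypothetical helper, and the three profiles are picked from the table above purely for illustration, not as a tested combination.

```shell
#!/bin/sh
# Hypothetical helper (not part of Broadsea): print the docker compose command
# that would start the given profiles. The body runs in a subshell so its
# variables do not leak into the caller.
compose_up_cmd() (
  cmd="docker compose --env-file .env"
  for profile in "$@"; do
    cmd="$cmd --profile $profile"
  done
  printf '%s up -d\n' "$cmd"
)

# Print (not run) the command for three profiles taken from the table.
compose_up_cmd atlasdb webapi-from-image atlas-from-image
```

Printing the command first makes it easy to review the profile list before running it.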
We also offer profiles for Perseus and other useful services, but please note, these are EXPERIMENTAL and not guaranteed to work:
Profile | Description |
---|---|
perseus | |
perseus-shareddb | |
perseus-files-manager | |
perseus-user | |
perseus-backend | |
perseus-frontend | |
perseus-vocabularydb | |
perseus-cdm-builder | |
perseus-solr | |
perseus-athena | |
perseus-usagi | |
perseus-r-serve | |
perseus-dqd | |
perseus-swagger | |
perseus-white-rabbit | |
open-shiny-server | |
posit-connect | |
pgadmin4 | |
jupyter-notebook | |
gaia-catalog | |
gaia-degauss | |
Broadsea uses Traefik as a proxy for all containers within. The Traefik dashboard is enabled by default at /dashboard/, and can be useful for debugging the proxy network.
Traefik can be set up with SSL to enable HTTPS:
- Obtain a crt and key file. Rename them to "broadsea.crt" and "broadsea.key", respectively.
- In Section 1 of the .env file:
- Update the BROADSEA_CERTS_FOLDER to the folder that holds these cert files.
- Update the HTTP_TYPE to "https"
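The renaming step above can be scripted. In the sketch below, stage_certs is a hypothetical helper and the input file names (site.crt, site.key) are placeholders; the example uses dummy files in a scratch directory rather than a real certificate.

```shell
#!/bin/sh
# Hypothetical helper (not part of Broadsea): copy a certificate and its key
# into one folder under the file names broadsea.crt and broadsea.key.
# Usage: stage_certs <crt file> <key file> <destination folder>
stage_certs() {
  mkdir -p "$3"
  cp "$1" "$3/broadsea.crt"
  cp "$2" "$3/broadsea.key"
}

# Demonstrate with placeholder files in a scratch directory.
work=$(mktemp -d)
printf 'placeholder certificate\n' > "$work/site.crt"
printf 'placeholder key\n' > "$work/site.key"
stage_certs "$work/site.crt" "$work/site.key" "$work/certs"
ls "$work/certs"
```

BROADSEA_CERTS_FOLDER would then be set to the destination folder.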
To adjust which app links to display on the Broadsea content page ("/"), refer to Section 12 of the .env file. Use "show" to display the div or "none" to hide it.
To load a new OMOP Vocabulary into a Postgres schema, review and fill out Section 9 of the .env file. Please note: this service will attempt to run the CPT4 import process for the CONCEPT table, so you will need a UMLS API Key from https://uts.nlm.nih.gov/uts/profile; store this in a file and set the path to the file as UMLS_API_KEY_FILE.
The Broadsea atlasdb Postgres instance is listed by default, but you can use an external Postgres instance. You need to copy your Athena downloaded files into ./omop_vocab/files.
Note: with WebAPI 2.14, you will need to use the webapi-from-git profile and set WEBAPI_MAVEN_PROFILE to webapi-docker,webapi-solr
To enable the use of SOLR for fast OMOP Vocab search in Atlas, review and fill out Section 7 of the .env file. You can either point to an existing SOLR instance, or have Broadsea build one. The JDBC jar file is needed in the Broadsea root folder in order for Solr to perform the dataimport step.
To enable a security provider for authentication and identity management in Atlas/WebAPI, review and fill out Sections 4 and 5 in the .env file.
Atlas database based security is pre-configured by the Broadsea-AtlasDB project and can be used as a demo. To enable this security:
- Update these environment variables in Sections 2, 4, and 5 in the .env file:
  - Section 2:
    - ATLAS_USER_AUTH_ENABLED="true"
  - Section 4:
    - ATLAS_SECURITY_PROVIDER_TYPE="db"
    - ATLAS_SECURITY_PROVIDER_NAME="DB Security"
    - ATLAS_SECURITY_USE_FORM="true"
    - ATLAS_SECURITY_USE_AJAX="true"
  - Section 5:
    - WEBAPI_SECURITY_PROVIDER="AtlasRegularSecurity"
    - SECURITY_AUTH_JDBC_ENABLED="true"
- Start the Broadsea docker containers
- Log in to ATLAS with one of the demo users defined:

Role | Username | Password |
---|---|---|
Admin | admin | admin |
Atlas user | ohdsi | ohdsi |
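Before starting the containers, it can help to confirm that the seven settings listed above are in place. The sketch below is an assumption-laden check, not a Broadsea tool: env_has is a hypothetical helper, it expects each setting on its own line in the exact form KEY="value", and it runs against a scratch file that you would replace with the path to your own .env.

```shell
#!/bin/sh
# Hypothetical helper (not part of Broadsea): succeed only if FILE contains
# the exact line KEY="VALUE".
# Usage: env_has <file> <key> <value>
env_has() {
  grep -q -x -F -- "$2=\"$3\"" "$1"
}

# Scratch copy of the relevant settings; point env_file at your real .env instead.
env_file=$(mktemp)
cat > "$env_file" <<'EOF'
ATLAS_USER_AUTH_ENABLED="true"
ATLAS_SECURITY_PROVIDER_TYPE="db"
ATLAS_SECURITY_PROVIDER_NAME="DB Security"
ATLAS_SECURITY_USE_FORM="true"
ATLAS_SECURITY_USE_AJAX="true"
WEBAPI_SECURITY_PROVIDER="AtlasRegularSecurity"
SECURITY_AUTH_JDBC_ENABLED="true"
EOF

# Compare each expected KEY=VALUE pair with the file and count the misses.
missing=0
while IFS='=' read -r key value; do
  env_has "$env_file" "$key" "$value" || { echo "not set: $key"; missing=$((missing + 1)); }
done <<'EOF'
ATLAS_USER_AUTH_ENABLED=true
ATLAS_SECURITY_PROVIDER_TYPE=db
ATLAS_SECURITY_PROVIDER_NAME=DB Security
ATLAS_SECURITY_USE_FORM=true
ATLAS_SECURITY_USE_AJAX=true
WEBAPI_SECURITY_PROVIDER=AtlasRegularSecurity
SECURITY_AUTH_JDBC_ENABLED=true
EOF
echo "settings missing: $missing"
```

Because the match is on whole lines, a value with a missing closing quote is reported as not set.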
The Docker implementation of WebAPI does not come with all JDBC drivers supported by OHDSI (for example, Snowflake). To add a JDBC driver to the WebAPI build, refer to Section 3 of the .env file and edit the WEBAPI_ADDITIONAL_JDBC_FILE_PATH variable to point to your JDBC driver file.
Some deployments require a Java Keystore (cacerts) file that establishes trust with Root Certificate Authorities for LDAP or Snowflake connections.
To allow this, alter the env variable WEBAPI_CACERTS_FILE to point to your cacerts file. WebAPI can then leverage it for these external Java SSL connections.
For Snowflake, you will need to also set the CDM_SNOWFLAKE_PRIVATE_KEY_FILE env variable in Section 3.
OpenLDAP is provided for testing purposes, and is not recommended for any production deployment. Refer to Section 13 of the .env file to establish user accounts (using secrets files) for this LDAP instance. A GUI-based LDAP explorer, such as Apache Directory Studio, is recommended for managing this instance.
To build either Atlas or WebAPI from a git repo instead of from Docker Hub, use Section 6 to specify the Git repo paths. Branches and commits can be in the URL after a "#".
With Atlas 2.12.0 and above, a new concept recommendation feature is available, based upon the Phoebe project. Review and fill out Section 10 of the .env file to load the concept_recommended table needed for this feature into a Postgres hosted OMOP Vocabulary.
To mount files prepared for Ares (see CDM Post Processing), add your Ares data folder path to ARES_DATA_FOLDER in Section 11. By default, it will use the Broadsea shared volume cdm-postprocessing-data/ares used by the aresindexer service.
DBT provides a command-line tool for ETL design. See Section 16 for configuring DBT.
Perseus offers a full suite of services for data profiling, vocabulary mapping, ETL design, and ETL execution. See Section 16 for configuring Perseus.
New to Broadsea, there's now a profile for deploying the pgAdmin4 web application for database management of Postgres. See Section 18 for setting up the initial default admin username and the password secret file.
Once you have a CDM database available, it is important to run summary level statistics and data quality analyses prior to publishing the source to users. Broadsea provides services for running Achilles, DataQualityDashboard, and AresIndexer. See Section 17 for setting up the CDM connection details and the various application settings needed.
The credentials for the RStudio user can be established in Section 8 of the .env file (with a password stored in a secrets file).
To permanently retain the "rstudio" user files in the "rstudio" user home directory, and make local R packages available to RStudio in the Broadsea Methods container the following steps are required:
- In the same directory where the docker-compose.yml is stored create a sub-directory tree called "home/rstudio" and a sub-directory called "site-library"
- Set the file permissions for the "home/rstudio" sub-directory tree and the "site-library" sub-directory to public read, write and execute.
- Add the below volume mapping statements to the end of the broadsea-methods-library section of the docker-compose.yml file.
volumes:
- ./home/rstudio:/home/rstudio
- ./site-library:/usr/local/lib/R/site-library
Any files added to the home/rstudio or site-library sub-directories on the Docker host can be accessed by RStudio in the container.
The Broadsea Methods container RStudio /usr/lib/R/site-library originally contains the "littler" and "rgl" R packages. Volume mapping masks the original files in the directory so you will need to add those 2 packages to your Docker host site-library sub-directory if you need them.
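The first two steps (creating the directories and opening their permissions) can be sketched as below. prepare_rstudio_dirs is a hypothetical helper, and the example runs against a scratch directory; for a real deployment the base directory would be the one that holds docker-compose.yml.

```shell
#!/bin/sh
# Hypothetical helper (not part of Broadsea): create home/rstudio and
# site-library under a base directory, then grant read, write and execute
# to everyone on both trees.
prepare_rstudio_dirs() {
  mkdir -p "$1/home/rstudio" "$1/site-library"
  chmod -R a+rwx "$1/home" "$1/site-library"
}

# Demonstrate in a scratch directory and show the resulting permissions.
base=$(mktemp -d)
prepare_rstudio_dirs "$base"
ls -ld "$base/home/rstudio" "$base/site-library"
```

World-writable directories are convenient on a single-user machine; on a shared host you may want narrower permissions.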
New to Broadsea, there's now a profile for launching a simple, single user instance of Jupyter Data Science Notebook.
The OHDSI/GIS working group is developing an OHDSI plugin to enable ETL of geographic data for analyzing both social and environmental determinants of health outcomes.
New to Broadsea, this profile launches a suite of tools to ETL and make available geographic data that is searchable through a catalog. Once the data is loaded it is available to HADES through the exposure_occurrence table.
NOTE: most of the containers in this suite are built from GitHub, so the build time can be lengthy (especially gaia-core).
For configuration see section 19 of the .env file. By default the following services are available:
- http://localhost:5000 - the catalog.
- Select the OHDSI/GIS Collection and explore a little
- Explore the variable level metadata on the landing pages in the OHDSI/GIS collection
- Once you are in a landing page for one of the SVI layers, if you click a red dot next to a variable, it will load that variable into the database using the OHDSI/GIS toolchain (this can take time as it includes the full ETL process starting from download to re-projection, transformation, and load into database)
- http://localhost:8787 - HADES RStudio with the gaia package installed (see section 8 of the .env for credentials)
- http://localhost:8983 - SOLR (for those really interested in metadata, indexing, and search these next two urls give the csv for Doug's notebooks)
- csv of OHDSI/GIS collection: http://localhost:8983/solr/dcat/select?indent=true&q.op=OR&q=gdsc_collections%3AOHDSI%2FGIS&useParams=&csv.mv.separator=|&wt=csv
- csv of entire set of collections http://localhost:8983/solr/dcat/select?indent=true&q.op=OR&q=*%3A*&rows=113&useParams=&csv.mv.separator=|&wt=csv
- With pgAdmin, set the host to localhost, the port to 5433, the user to postgres, and the password to SuperSecret to explore the database.
New to Broadsea, this profile launches a degauss geocoding instance with a simple API that can be called through http.
http://localhost:5150/geocode?address=URLencodedAddress
NOTE: this builds from github and is a long build ...
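The address in the geocode URL has to be URL-encoded. A minimal sketch of building such a URL in plain shell follows; urlencode and geocode_url are hypothetical helpers, the sample address is illustrative only, and the encoder assumes ASCII input.

```shell
#!/bin/sh
# Hypothetical helper: percent-encode every character except the RFC 3986
# unreserved set (letters, digits, '.', '~', '_', '-'). Assumes ASCII input.
# The body runs in a subshell so its variables do not leak into the caller.
urlencode() (
  LC_ALL=C
  s=$1
  out=""
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out="$out$c" ;;
      *) out="$out$(printf '%%%02X' "'$c")" ;;
    esac
    s=$rest
  done
  printf '%s\n' "$out"
)

# Hypothetical helper: build the geocode URL for a plain-text address.
geocode_url() {
  printf 'http://localhost:5150/geocode?address=%s\n' "$(urlencode "$1")"
}

# Print the URL for a sample address.
geocode_url "1600 Pennsylvania Ave NW, Washington DC"
# With the gaia-degauss profile running, the URL could be requested with:
#   curl -s "$(geocode_url '1600 Pennsylvania Ave NW, Washington DC')"
```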
To configure an open-source Shiny Server, refer to Section 14 of the .env file. Use the OPEN_SHINY_SERVER_APP_ROOT variable to point to a folder that will host Shiny apps.
The pattern for using Posit Connect deviates from the rest of Broadsea due to the many configuration options available. A sample .gcfg file is included, but you likely will need to make modifications to it. See Posit Connect configuration guide for more information.
If you want to keep a container for use later, you can use docker compose stop. This may be useful when you plan to restart the services later and want to persist the container's state and networks. If you want to remove the containers and recreate them later, use docker compose down. This will remove the containers and networks, but it will keep the volumes.
Use the following CLI commands to stop and start Broadsea's containers.
docker compose stop
docker compose start
Or target a specific profile using --profile
docker compose --profile profile1 stop
docker compose --profile profile1 start
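Use the following commands to take down and then bring up Broadsea's containers. Because docker compose down removes the containers, they must be recreated with up rather than restarted with start.
docker compose down
docker compose up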
Or target a specific profile using --profile
docker compose --profile profile1 down
docker compose --profile profile1 up
By default Docker will create volumes and persist them. Any saved files or custom configs stored on these volumes will persist even when the containers are removed. However, if you want to remove these volumes, you can pass -v with docker compose down, and new volumes will be created the next time you run docker compose up.
docker compose down -v
docker compose up
Broadsea can deploy the OHDSI stack on any of the following infrastructure alternatives:
- laptop / desktop
- internally hosted server
- cloud provider hosted server
- cluster of servers (internally or cloud provider hosted)
It supports any database management system that the OHDSI stack supports, though some services are specific to Postgresql.
It supports any OS where Docker containers can run, including Windows, Mac OS X, and Linux (including Ubuntu, CentOS & CoreOS).
docker compose ps
Logs per container are available using this syntax:
docker logs containername
Follow the instructions here - Install Docker for Mac
Docker for Mac includes both Docker Engine & Docker Compose
For Mac Silicon, you may need to enable "Use Rosetta for x86/amd64 emulation on Apple Silicon" in the "Features in Development" Settings menu.
Follow the instructions here - Install Docker for Windows
Docker for Windows includes both Docker Engine & Docker Compose
64bit Windows 10 Pro, Enterprise and Education (1511 November update, Build 10586 or later). In the future Docker will support more versions of Windows 10. The Hyper-V package must be enabled. The Docker for Windows installer will enable it for you, if needed. (This requires a reboot).
Note. Docker for Windows is the preferred Docker environment for Broadsea, but Docker-Toolbox may be used instead if your machine doesn't meet the above requirements. (See info below.)
Follow the instructions here - Install Docker Toolbox on Windows
64bit Windows 7 or higher. The Hyper-V package must be enabled. The Docker for Windows installer will enable it for you, if needed. (This requires a reboot).
Follow the instructions here:
Install Docker for Linux
Install Docker Compose for Linux
Docker requires a 64-bit installation. Additionally, your kernel must be 3.10 at minimum. The latest 3.10 minor version or a newer maintained version is also acceptable.
Kernels older than 3.10 lack some of the features required to run Docker containers.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use the Broadsea software except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.