2018 06 05
Andre Merzky edited this page Jun 5, 2018 · 2 revisions
- use cases -> requirements -> architecture -> implementation -> testing -> release
- Milestones:
- Use cases, Requirements (early April)
- Feasibility, Prototype (early June)
- Implementation (end June)
- updates:
- Team 1: Ioannis, Will, Jumana
- ticket
- use case document
- TODO Will: tests for slurm LRMS (other LRMS missing)
- JD: data from BW, Titan, Stampede-2
- TODO: use an RP script (BoT `df /tmp/`) to get LSF size out of band
- TODO: look into documented expected node failure rate (focus on BW)
- comet: failure rate 0.5%, node-MTF: 2.8 days
- comet: less than 200GB of storage should not happen, application must clean up
- TODO: look into other machines
- TODO: suggest to have this documented
- Team 2: Vivek, Srinivas, George
- literature study
- non-mpi tests pass
- mpi-tests now complete.
- WIP: integration test on RP layer, targeting BW
- TODO begin to look into tests on remote resources (focus on BW)
- toward integration
- TODO IP: sync with devel
- TODO: check tagging reqs on scheduler
- Use case is proof, not tests and prototype
- don't delay EnTK integration
- how does that relate to tags?
- TODO: check LSF update approach with use cases
- TODO: press forward with integration
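The `df /tmp/` probe from the Team 1 TODOs could be sketched in Python roughly as below. This is a minimal sketch and not RP code: the function name `tmp_usage_gb` is hypothetical, and running it as one task per node (BoT) is an assumption about how it would be deployed.

```python
import shutil


def tmp_usage_gb(path="/tmp"):
    """Return (total, used, free) in GB for the filesystem holding `path`.

    Hypothetical helper: reports roughly what `df /tmp/` would show.
    """
    usage = shutil.disk_usage(path)   # named tuple of byte counts
    gb = 1000 ** 3                    # decimal GB
    return usage.total / gb, usage.used / gb, usage.free / gb


if __name__ == "__main__":
    total, used, free = tmp_usage_gb()
    print(f"/tmp: total={total:.1f}GB used={used:.1f}GB free={free:.1f}GB")
```

Collecting the printed line from each task's stdout would give the per-node numbers without logging in to the nodes.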
- Link Collection
- RP Architecture
- RCT Code Walkthroughs: