2015 09 17

Andre Merzky edited this page Sep 24, 2015 · 5 revisions
  • Agenda:
    • open TODOs:
      • WIP AM/MS: prepare action/support plan for activities on BW
        • objectives, challenges, timelines, phase 1
      • WIP IP: Anaconda support on client side?
        • client side seems to 'just work', which is good
        • agent side is expected to fail, and does
        • there seem to be differences between system and user-space installs of RP (sdist builds)
        • IP: look into SuperMUC bootstrapper
        • TODO VB: check if Anaconda support is needed by Justin
      • TODO (MT?): add allocation info to resource doc

      • HOLD AM: check if we can switch to HeartbeatMonitor for pilot health checks
      • HOLD AM: suggest alternatives for PTY layer resource consumption
      • HOLD MS: Anaconda/SuperMUC (October)
      • HOLD MS: add NAMD examples eventually? (Tom Bishop)
      • HOLD AM: set up example on how to use synapse as RP workload
      • HOLD AM: check documentation of state diagram in released docs
      • HOLD MT: move semantic elements of tools into RP.utils
      • HOLD AM: proposal for JSON export to persistent storage
      • HOLD MS: proposal for persistent experimental data storage
    • Development Progress:
      • release plan:
        • 0.36: mid September
          • 1 week merging of branches (agent split, profiling)
          • 1 week of testing
          • -> delayed: 1 week
          • TODO: start tutorial preps in parallel
        • 0.37: September 23rd
          • documentation, examples, tutorials
          • -> as planned
        • 0.38: end October
          • module refactor
          • final state model
          • -> as planned
      • testing:
        • TODO AT:
          • move to RADICAL-Jenkins (with one fixture)
          • TODO AT: get stable (blue)
          • TODO AT: look into mail notifications
      • Yarn:
        • TODO IP: toward dynamic multi node (lower priority)
        • TODO AM: daemon startup over LMs?
        • WIP IP: check what (non-)queue system is used on Chameleon(?) cloud
          • no batch system
        • DONE IP: open ticket for '+ssh://'
        • using Chameleon VM now works
        • TODO IP: pull request for launcher...
      • Spark
        • HOLD GC: compare to Yarn integration
      • BW/OSG:
        • mostly complete, in testing and profiling phase
        • regression on OpenMPI layer, MPI-CU support is gone
        • regression on clean agent termination (due to more generic launch method support)
        • support seems valid for all Crays (BW, Archer, Edison, Titan)
        • seems as flexible in layout as we hope
        • DONE SJ: OTP token for our allocation is still pending
      • State of application kernels?
      • CECAM
        • Action Items from Extasy
          • TODO AM: what will be covered in the RP tutorial part
        • Documentation Tickets
          • which is the target env for installation?
          • workflow.iu.edu -> 50 tutorial accounts
            • TODO SJ: clarify account usage and XSEDE allocation
            • same accounts for Extasy
          • pre/post exec: not after application error
        • Intro: SJ
        • install: VB
        • resources: MT
        • data: AT
        • examples: MS
        • tutorial: AM
    • Data Roadmap:
    • Experiments:
      • micro vs. macro benchmarks
      • profile status
    • Publications:
    • AOB:
      • CECAM Tutorial
        • online documentation vs. online tutorial
        • begin to work on interactive examples (which involve user activity)
          • how to submit n tasks of size A and m tasks of size B, toward hosts X and Y
          • TODO AT: simple repex example
            • TODO AT: check with SJ about suitable example / exercise mode
          • TODO VB: simple MD example
          • TODO AM: simple RP example
        • execution env, software stack, applications/libraries
        • WIP AM: assign documentation tickets
      • SC15 Tutorial
  • Notes: