This repository has been archived by the owner on May 17, 2022. It is now read-only.

Things To Do #1

Open
13 of 28 tasks
krishnan-r opened this issue Jul 18, 2017 · 4 comments

Comments

krishnan-r (Owner) commented Jul 18, 2017

Issues and things to fix

  • When there are more than roughly 200 tasks to show, the timeline lags while appearing and scrolling

    • This depends on the user's browser and machine resources
    • TODO: Beyond a certain threshold, hide individual tasks entirely.
      • This needs to be done in the backend listener itself for scalability.
  • Some jobs do not have names

    • For example, when reading a parquet file, the job name is null
    • TODO: Use the first stage name instead, as done in the Spark UI
  • Timeline annotations do not appear when the number of tasks is very large.

    • The timeline loads asynchronously...
    • TODO: Fix this, or add an option for the user to show annotations by toggling a checkbox
  • Cases where the Spark application is started and stopped multiple times in the same cell cause display conflicts, because job IDs and stage IDs are duplicated

    • This could happen if jobs are called from an imported Python script and the context is stopped and started multiple times.
    • TODO: Either clear the previous application's display or append the appId to each jobId/stageId to make it unique.
    • TODO: Handle cases where a stage is attempted again (never encountered this, though)
  • When running multiple cells, if an intermediate cell fails, further executions detect the wrong cell

    • Restart and Run All doesn't work
    • The cell queue used to detect the current cell needs to be cleared in the frontend
    • Further execution requests are possibly discarded in the kernel.
    • TODO: How to detect this?
  • In some browsers, like Internet Explorer, when the frontend extension fails to load, Python throws a 'comm' error

    • TODO: Suppress the error
    • TODO: Replicate issue and identify possible causes
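One way to resolve the duplicate jobId/stageId conflict mentioned above is to namespace every display element with the Spark application ID, which changes on each context restart. A minimal sketch of the idea (the function name is hypothetical, not from the codebase):

```python
def display_key(app_id, kind, local_id):
    """Build a display/DOM key that stays unique across SparkContext restarts.

    `app_id` is Spark's application id (e.g. "app-20170718-0001"), `kind` is
    "job" or "stage", and `local_id` is the per-application job/stage id that
    would otherwise collide when the context is stopped and restarted.
    """
    return f"{app_id}-{kind}-{local_id}"

# Job 0 from two separate application runs no longer collides:
k1 = display_key("app-20170718-0001", "job", 0)
k2 = display_key("app-20170718-0002", "job", 0)
```

The same keys could be emitted by the backend listener so that the frontend never has to disambiguate runs itself.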

Pending Features

  • Handle skipped stages' names and number of tasks properly in the progress bars
  • Show failed tasks in red
    • In the timeline
    • In the table of jobs
    • Also show reason of failure.
  • Dynamically update executors in task graph
  • Aggregate the number of active tasks over a finite interval to make the graph smoother
  • Add annotations to task graph regarding start and end of jobs
    • Change the current charting library, as annotations are not properly implemented.
  • Popup with more details when clicking on an item in the timeline
  • Ability to cancel jobs: the Cancel button
    • TODO: What is the right API to do this?
    • Using the SparkContext
      • setJobGroup / cancelJobGroup
      • Currently there is no access to the SparkContext
      • The current communication mechanism prevents messages to the kernel while the kernel is busy.
    • However, the Spark UI has an internal REST API to kill individual jobs
      • This is the (kill) link that appears in the UI
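The setJobGroup/cancelJobGroup route above could be wired up with a per-cell job-group ID. A sketch of the bookkeeping, assuming the extension eventually gains access to the SparkContext (the helper name and ID format are hypothetical; the pyspark calls are shown only as comments):

```python
import uuid

def new_job_group(cell_id):
    """Derive a unique job-group id for a notebook cell (hypothetical helper).

    The random suffix keeps re-executions of the same cell distinct, so
    cancelling one run does not cancel a later run of the same cell.
    """
    return f"sparkmonitor-cell-{cell_id}-{uuid.uuid4().hex[:8]}"

# With access to the SparkContext `sc`, the flow would be roughly:
#   group = new_job_group(cell_id)
#   sc.setJobGroup(group, "jobs from cell %s" % cell_id)
#   ... run the cell's Spark jobs ...
#   sc.cancelJobGroup(group)   # wired to the Cancel button
```

The alternative noted above, the Spark UI's internal kill endpoint, would avoid the kernel-busy problem since it bypasses the kernel entirely, at the cost of relying on an undocumented interface.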

Look and Feel

  • In Firefox, prevent the table CSS from expanding rows to fill the container.

  • jQuery UI dialog CSS styles conflict with matplotlib output

    • This can be fixed
  • Add scrollbars to the table when the number of jobs/stages is large.

  • Add a visual indicator that shows the overall status of a cell: running/completed

  • Possibly show the number of active executors somewhere as a number.

  • Display the overall cell execution time somewhere

New Features

  • Add an option to remove the display altogether from a cell

    • For trivial operations like a read or viewing count/take, the user may prefer to hide the display.
    • Maybe a global option to hide all displays
    • Respond to "Cell -> Clear All/Current Output" and toggle options in the menu
    • Too many displays in a notebook create clutter
  • When automatically creating a SparkConf in the user's namespace in a new notebook, create a cell that displays the conf so that the user does not recreate it by mistake.
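The auto-inserted cell idea above could be implemented by generating the cell's source text on the Python side and asking the frontend to insert it. A sketch of the source generation only (the function is hypothetical; it assumes the injected SparkConf supports `toDebugString()` as pyspark's does):

```python
def conf_display_cell(varname="conf"):
    """Build the source text for an auto-inserted notebook cell that shows
    the SparkConf already created in the user's namespace, so the user is
    less likely to recreate it by mistake."""
    return (
        f"# A SparkConf named `{varname}` was created for you by sparkmonitor.\n"
        f"# Inspect it here instead of creating a new one:\n"
        f"print({varname}.toDebugString())\n"
    )
```

The frontend extension would then insert a cell with this text via the notebook API when a new notebook opens.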

Other Possible Future Things/Ideas

  • Include a configuration system for the user to configure things
    • Option to disable the extension altogether.
    • Configure other parameters such as the refresh interval, display themes, etc.
    • Jupyter nbextension configurator integration
  • Use a package manager for JavaScript dependencies instead of storing them in the repo itself
  • Build and minify the JavaScript for production
  • Upload the module to the PyPI registry
  • Write Tests
  • Document Code
  • Future Integration/compatibility with JupyterLab??
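The configuration system mentioned above could start as a small defaults-plus-overrides loader on the Python side, later fed into the Jupyter nbextension configurator. A minimal sketch (file path, key names, and defaults are all assumptions for illustration):

```python
import json
import os

# Hypothetical defaults covering the options listed above.
DEFAULTS = {
    "enabled": True,            # option to disable the extension altogether
    "refresh_interval_ms": 1000,
    "theme": "default",
}

def load_config(path="~/.sparkmonitor.json"):
    """Return DEFAULTS overridden by any keys found in the user's config file.
    Missing file means all defaults apply."""
    cfg = dict(DEFAULTS)
    full = os.path.expanduser(path)
    if os.path.exists(full):
        with open(full) as f:
            cfg.update(json.load(f))
    return cfg
```

Keeping the defaults in one dict makes it easy to surface the same keys in a configurator UI later.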

alfozan commented Jun 21, 2018

Future Integration/compatibility with JupyterLab will be useful!

krishnan-r (Owner, Author) commented:

Yes. Once JupyterLab reaches 1.0 I will give it a shot. It should have better extension APIs


Tagar commented Jan 21, 2019

It would be great to support Livy connections too (through the %sparkmagic extension).

Thanks!


Ftagn92 commented Feb 5, 2019

Hello,
I have added a Python 3 kernel to my Jupyter Docker image.
Is there a way to have sparkmonitor working with both 2.x and 3.x?

It works fine with a Python 2 kernel, but when I switch to the 3.x kernel, the conf test raises an error:


print(conf.toDebugString())

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-1-0a5e403cf2b8> in <module>
----> 1 print(conf.toDebugString())

NameError: name 'conf' is not defined

Thanks for your help
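The NameError in the traceback above suggests that under the Python 3 kernel the extension's startup hook never ran, so `conf` was never injected into the user namespace. As a diagnostic, a generic get-or-create guard can distinguish "extension failed to inject" from other errors (sketch only; in a real notebook the factory would be `pyspark.SparkConf`, assuming pyspark is importable under that kernel):

```python
def get_or_create(namespace, name, factory):
    """Return namespace[name]; create it with factory() if the extension
    failed to inject it (e.g. under a kernel where the hook did not run)."""
    if name not in namespace:
        namespace[name] = factory()
    return namespace[name]

# In a notebook one might write, assuming pyspark is installed:
#   from pyspark import SparkConf
#   conf = get_or_create(globals(), "conf", SparkConf)
```

If the fallback path is taken, the monitoring display will still be absent, since the listener was never attached; the guard only avoids the hard NameError.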
