Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Pipeline Truncation & Depth Query String Support #8

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

soundofjw
Copy link

Brings over UI updates and truncation support originally implemented by Kahn Academy’s appengine-mapreduce repo (https://github.com/Khan/appengine-mapreduce).

For the python backend:

  • get_status_tree supports a depth parameter, to enable non-root
    pipeline lookup. Executed with new methods db_get_in_batches and
    get_pipelines_within_depth.
  • _LowMemoryPipelineRecord and _LowMemorySlotRecord add truncation
    support for pipeline parameters.
  • get_pipeline_values provides a means to load full values from a
    pipeline.
  • _PipelineValuesHandler is a status handler that retrieves full
    pipeline values for the UI.
  • _register_json_primitive and the _TYPE_TO_ENCODER map use the
    name of a type, rather than it’s class, for the case of unexpected
    classpaths.

On the UI Side:

  • Adds support for pipeline_values de-truncation from the status_ui
    handler.
  • Enables depth query string argument to enable non-root pipeline
    inspection.

This pull requests resolves #1 [for python] ! 👯 💃 👯

Josh Whelchel added 5 commits November 16, 2014 09:58
Brings over backend changes made by Kahn Academy’s appengine-mapreduce
(https://github.com/Khan/appengine-mapreduce).

Notably:
* `get_status_tree` supports a `depth` parameter, to enable non-root
pipeline lookup. Executed with new methods `db_get_in_batches` and
`get_pipelines_within_depth`.
* `_LowMemoryPipelineRecord` and `_LowMemorySlotRecord` add truncation
support for pipeline parameters.
* `get_pipeline_values` provides a means to load full values from a
pipeline.
* `_PipelineValuesHandler` is a status handler that retrieves full
pipeline values for the UI.
* `_register_json_primitive` and the `_TYPE_TO_ENCODER ` map use the
name of a type, rather than it’s class, for the case of unexpected
classpaths.
Adds support for `pipeline_values` de-truncation from the `status_ui`
handler.

Enables `depth` query string argument to enable non-root pipeline
inspection.
Enables a pipeline class to provide its own value for
`_DEFAULT_MAX_ATTEMPTS`.
Provide `root_pipeline_key` property to fix UI and prevent the strain
required by using
`_PipelineRecord.root_pipeline.get_value_for_datastore(pipeline_record)`
.
Add Pipeline Truncation & Depth Query String Support
'Pipeline ID "%s" points to child ID "%s" which does not exist.'
% (pipeline_record.key().name(), child_pipeline_key.name()))
# If we limited the depth, we expect missing children, so don't warn.
if depth is None:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we continue to raise the PipelineStatusError if depth is None?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The children may be missing due to eventual consistency. Also, I think this is a good case for best-effort display of the data given.

See Khan@87095b9

@AngryBrock
Copy link
Contributor

Hey, I've left some comments on the request. No huge changes, just nitpicks and requests for documentation/clarification. Overall this looks very backwards-compatible while adding valuable functionality to the UI for large pipeline jobs. Thanks for porting the code over here.

@soundofjw
Copy link
Author

👍 These are all great suggestions. I'll try to get these fixed up by end of week. :)

try:
val = unicode(value)
except UnicodeEncodeError:
val = strip_accents(value)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just use utf8 so you know that val is always a str and never a unicode? Upstream processes (read: browsers) should be able to handle utf8 encoded strings.

if isinstance(value, basestring):
    val = value.encode('utf8')
else:
    val = str(value)

@MattFaus
Copy link
Contributor

Thanks so much for doing this porting! I'm finally getting around to reworking our repo such that it is a proper fork of this one. Will you address the comments so we can get this pulled in and others can benefit?

For posterity the changes in this pull request correspond to these commits in our repo:

Khan@87095b9
Khan@b43f8ed
Khan@6db6c6e
Khan@b31ad76
Khan@6f9b30e
Khan@9e0b62f

@soundofjw
Copy link
Author

@MattFaus Yes! I'll take some time to accomplish that this weekend. Thanks 👍

@soundofjw soundofjw self-assigned this Jun 12, 2015
@soundofjw
Copy link
Author

I'm having nightmares about this still hanging about - I'll wrap this up.
This functionality saves us a lot of time. :]

👯

+ Added + cleaned up docstrings
+ Restored missed `depthUrlArg()` refactor of `DEPTH` argument
+ Removed extraneous `strip_accents` method in favor of @MattFaus's
unicode `.encode` suggestion
+ Added debug logging for non-int depth parameter
@soundofjw
Copy link
Author

Ok! Changes made, ready for re-review 👍

@soundofjw
Copy link
Author

Resolved merge conflicts.

@AngryBrock
Copy link
Contributor

I will let @MattFaus comment on the edits (Loudr@cedb557) before I merge this in.

surpresses bad state for abort, aborted log
@soundofjw
Copy link
Author

Any news? this is quite dated :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When the pipeline info exceeds 32MB the UI fails
3 participants