Pipelines API Reference #3390

Open
fireproofsocks opened this issue Oct 31, 2022 · 2 comments

fireproofsocks commented Oct 31, 2022

The documentation is missing critical details about the HTTP interface that drives this software. A REST API is a standard interface; a Python SDK is a derivative implementation of that interface. A language-agnostic HTTP API should not require any particular language or SDK. In other words, if we know what the REST endpoints are, we can build our own SDK. Documenting the REST API is required; the SDK is optional.

One easy place for improvement is to clarify that some endpoints are not exposed at /apis but instead at /pipeline/apis/. E.g. requests to /apis/v1beta1/pipelines fail, but requests to /pipeline/apis/v1beta1/pipelines succeed. It seems like there are actually 2 APIs here, each one supporting its own endpoints (?). In any case, adding a note would help clear up the confusion.
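As a minimal sketch of the path difference described above, a small helper can pin the `/pipeline` prefix in one place. The function name `pipeline_api_url` and the constant `PIPELINE_API_PREFIX` are hypothetical, and the prefix is taken only from the behaviour reported in this issue, not from official documentation.

```python
# Prefix reported in this issue as working; requests to "/apis/v1beta1/..."
# without the leading "/pipeline" reportedly fail.
PIPELINE_API_PREFIX = "/pipeline/apis/v1beta1/"


def pipeline_api_url(host: str, resource: str) -> str:
    """Build a Pipelines API URL under the /pipeline prefix (hypothetical helper)."""
    return host.rstrip("/") + PIPELINE_API_PREFIX + resource.lstrip("/")


print(pipeline_api_url("http://example.com", "pipelines"))
# → http://example.com/pipeline/apis/v1beta1/pipelines
```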

Secondly, there are no notes regarding authentication. A REST API should not rely on cookies because they are antithetical to its stateless design. Authentication for REST APIs would more commonly be implemented as a request header value, e.g. a bearer token. It would be very helpful to call this out and say something like "currently, cookies are required" so people familiar with REST conventions are not left thoroughly confused.

Demonstrating a working curl example would go a long way in clarifying how to use the software programmatically.
For example, a working curl request to list available pipelines looks something like this:

curl -X GET --cookie "authservice_session=xxxx" -H 'Content-Type: application/json' http://example.com/pipeline/apis/v1beta1/pipelines
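For anyone who prefers Python over curl, the same request can be sketched with only the standard library. The helper name `build_list_pipelines_request` is hypothetical; the URL path and the `authservice_session` cookie name are the ones from the curl command above. The sketch only builds the request; the send step is left commented out because it needs a reachable Kubeflow host and a valid cookie value.

```python
import json
import urllib.request


def build_list_pipelines_request(host: str, session_value: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that lists pipelines (hypothetical helper)."""
    url = f"{host.rstrip('/')}/pipeline/apis/v1beta1/pipelines"
    return urllib.request.Request(
        url,
        headers={
            # Same cookie that curl sends via --cookie
            "Cookie": f"authservice_session={session_value}",
            "Content-Type": "application/json",
        },
        method="GET",
    )


req = build_list_pipelines_request("http://example.com", "xxxx")

# To actually send it (assumes the host is reachable and the cookie is valid):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```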

However, you would need to log in to get a valid value for the authservice_session cookie. The login flow (using email/username and password) can be manually submitted. Here is a Python example provided by Benjamin Tan:

# Example Python Script to get a valid authservice_session cookie value
import re
import requests
from urllib.parse import urlsplit


def get_istio_auth_session(url: str, username: str, password: str) -> dict:
    """
    Determine if the specified URL is secured by Dex and try to obtain a session cookie.
    WARNING: only Dex `staticPasswords` and `LDAP` authentication are currently supported
             (we default to using `staticPasswords` if both are enabled)

    :param url: Kubeflow server URL, including protocol
    :param username: Dex `staticPasswords` or `LDAP` username
    :param password: Dex `staticPasswords` or `LDAP` password
    :return: auth session information
    """
    # define the default return object
    auth_session = {
        "endpoint_url": url,    # KF endpoint URL
        "redirect_url": None,   # KF redirect URL, if applicable
        "dex_login_url": None,  # Dex login URL (for POST of credentials)
        "is_secured": None,     # True if KF endpoint is secured
        "session_cookie": None  # Resulting session cookies in the form "key1=value1; key2=value2"
    }

    # use a persistent session (for cookies)
    with requests.Session() as s:

        ################
        # Determine if Endpoint is Secured
        ################
        resp = s.get(url, allow_redirects=True)
        if resp.status_code != 200:
            raise RuntimeError(
                f"HTTP status code '{resp.status_code}' for GET against: {url}"
            )

        auth_session["redirect_url"] = resp.url

        # if we were NOT redirected, then the endpoint is UNSECURED
        if len(resp.history) == 0:
            auth_session["is_secured"] = False
            return auth_session
        else:
            auth_session["is_secured"] = True

        ################
        # Get Dex Login URL
        ################
        redirect_url_obj = urlsplit(auth_session["redirect_url"])

        # if we are at `/auth?=xxxx` path, we need to select an auth type
        if re.search(r"/auth$", redirect_url_obj.path):

            #######
            # TIP: choose the default auth type by including ONE of the following
            #######

            # OPTION 1: set "staticPasswords" as default auth type
            redirect_url_obj = redirect_url_obj._replace(
                path=re.sub(r"/auth$", "/auth/local", redirect_url_obj.path)
            )
            # OPTION 2: set "ldap" as default auth type
            # redirect_url_obj = redirect_url_obj._replace(
            #     path=re.sub(r"/auth$", "/auth/ldap", redirect_url_obj.path)
            # )

        # if we are at `/auth/xxxx/login` path, then no further action is needed (we can use it for login POST)
        if re.search(r"/auth/.*/login$", redirect_url_obj.path):
            auth_session["dex_login_url"] = redirect_url_obj.geturl()

        # else, we need to be redirected to the actual login page
        else:
            # this GET should redirect us to the `/auth/xxxx/login` path
            resp = s.get(redirect_url_obj.geturl(), allow_redirects=True)
            if resp.status_code != 200:
                raise RuntimeError(
                    f"HTTP status code '{resp.status_code}' for GET against: {redirect_url_obj.geturl()}"
                )

            # set the login url
            auth_session["dex_login_url"] = resp.url

        ################
        # Attempt Dex Login
        ################
        resp = s.post(
            auth_session["dex_login_url"],
            data={"login": username, "password": password},
            allow_redirects=True
        )
        if len(resp.history) == 0:
            raise RuntimeError(
                f"Login credentials were probably invalid - "
                f"No redirect after POST to: {auth_session['dex_login_url']}"
            )

        # store the session cookies in a "key1=value1; key2=value2" string
        auth_session["session_cookie"] = "; ".join(
            [f"{c.name}={c.value}" for c in s.cookies])

    return auth_session


# Provide the host where your Kubeflow install lives and a valid username + password
print(get_istio_auth_session(
    'http://example.com', 'username', 'password'))

This prints a Python dict (similar to, but not strictly, JSON) like:

{'endpoint_url': 'http://example.com', 'redirect_url': 'http://example.com/dex/auth/local?req=syej6jou3tdtcto25xdydnkdu', 'dex_login_url': 'http://example.com/dex/auth/local?req=syej6jou3tdtcto25xdydnkdu', 'is_secured': True, 'session_cookie': 'authservice_session=xxxxxxxx'}

from which the cookie value can be extracted.
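One way to do that extraction, as a rough sketch: split the `session_cookie` string on the same `"; "` separator the script uses to join it. The function name `parse_cookie_string` and the sample dict below are illustrative only, with a placeholder cookie value rather than output from a real server.

```python
def parse_cookie_string(cookie_string: str) -> dict:
    """Split a "key1=value1; key2=value2" string into a dict (hypothetical helper)."""
    cookies = {}
    for pair in cookie_string.split(";"):
        name, sep, value = pair.strip().partition("=")
        if sep:  # skip fragments that contain no "="
            cookies[name] = value
    return cookies


# Illustrative stand-in for the dict returned by get_istio_auth_session()
auth_session = {"session_cookie": "authservice_session=xxxxxxxx"}

cookies = parse_cookie_string(auth_session["session_cookie"])
print(cookies["authservice_session"])
# → xxxxxxxx
```

Using `partition` rather than `split("=")` keeps any `=` padding inside the cookie value intact.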

Many thanks to Benjamin Tan for providing the above information. Critical info like this belongs in the official docs.

@varodrig
Contributor

@kubeflow/wg-pipeline-leads to review

@varodrig
Contributor

/area pipelines
