
feat: replaced werkzeug/flask with uvicorn/starlette #375

Merged · 41 commits · May 3, 2023
Conversation

viniarck
Member

@viniarck viniarck commented Apr 20, 2023

Closes #347
Closes #372
Closes #301
Closes #168
Closes #280
Closes #225

Summary

See updated changelog file.

Summary of upcoming starlette/uvicorn changes for NApp developers to be aware:

  • Any unit test that calls a REST endpoint of the kytos-ng platform should use an async client; `get_test_client` will return an instance of `httpx.AsyncClient`. Although it's an async client, it can test both sync and async endpoints, but the test case itself must be async. This is a constraint to ensure that pytest and event loops play well together (as mentioned earlier, `unittest.TestCase` isn't compatible).
  • By default, prefer async endpoints over sync endpoints for IO-bound work. If you're using pymongo or any other blocking lib, you should stick with sync endpoints to avoid blocking the event loop, since starlette runs sync endpoints in a threadpool. Notice that race conditions can still happen in async code, but they're easier to manage since context switching is explicit. However, `threading.Lock` isn't compatible with async endpoints, so if you have a dependency that uses a `threading.Lock` and can't be migrated or moved, you should stick with a sync endpoint too.
  • starlette with uvicorn generally outperforms flask with werkzeug in most cases. Latency has also improved even for cases where sync endpoints are used, since uvicorn's threadpool machinery is a bit more optimized.
  • kytos core dependencies will ship httpx, which provides both a sync and an async version of the requests API with roughly the same usability. You don't need to replace existing requests usage in our NApps, but new HTTP calls should preferably use httpx, since by default it works synchronously but is compatible with asyncio too.
  • uvicorn supports auto reload, but auto reloading the entire process isn't trivial, especially considering the foreground mode and how NApps start/stop over their life cycle. uvicorn provides a slightly better experience when serving the UI files, so even though we won't yet have full-blown hot reload for UI changes, the workflow when developing the UI for a NApp won't have that much friction, since a page refresh is expected to pick up any new changes in .kytos files.
  • starlette unlocks a Python-based WebSocket implementation; in the future, we could also allow WebSocket routes for bidirectional communication with certain NApps.
  • The last significant IO-blocking lib we have is pymongo. One day we might introduce an async option with motor, but APM instrumentation doesn't work with it yet, and its implementation just wraps threads with asyncio (it could still be handy, but it's worth waiting to see how it evolves). Other than that, you should be able to reach for asyncio and asyncio-compatible libs for pretty much any other IO parts of our code base.

Benchmark

Here are the benchmark request stress tests with uvicorn (with all of the recent draft PRs) and werkzeug. In summary, uvicorn outperforms in most cases and overall has lower latency metrics for both async and ThreadPool-based (sync) routes:

  • GET topology/v3 500 reqs/sec over 60 secs with uvicorn, sync route:
❯ jq -ncM '{method: "GET", url: "http://localhost:8181/api/kytos/topology/v3/"}' | vegeta attack -format=json -rate 500/1s -duration=60s -timeout=60s | tee results.bin | vegeta report
Requests      [total, rate, throughput]         30000, 500.02, 500.01
Duration      [total, attack, wait]             59.999s, 59.998s, 1.184ms
Latencies     [min, mean, 50, 90, 95, 99, max]  564.334µs, 1.922ms, 1.26ms, 2.258ms, 3.878ms, 18.72ms, 70.42ms
Bytes In      [total, mean]                     258030000, 8601.00
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:30000  
Error Set:
  • GET topology/v3 500 reqs/sec over 60 secs with werkzeug, sync route (this was the case where werkzeug led to instability):
❯ jq -ncM '{method: "GET", url: "http://localhost:8181/api/kytos/topology/v3/"}' | vegeta attack -format=json -rate 500/1s -duration=60s -timeout=60s | tee results.bin | vegeta report
Requests      [total, rate, throughput]         30000, 500.01, 256.45
Duration      [total, attack, wait]             1m47s, 59.999s, 46.706s
Latencies     [min, mean, 50, 90, 95, 99, max]  2.356ms, 6.704s, 329.991ms, 32.181s, 55.272s, 1m0s, 1m0s
Bytes In      [total, mean]                     224220616, 7474.02
Bytes Out     [total, mean]                     0, 0.00
Success       [ratio]                           91.21%
Status Codes  [code:count]                      0:2636  200:27364  
Error Set:
Get "http://localhost:8181/api/kytos/topology/v3/": read tcp 127.0.0.1:49031->127.0.0.1:8181: read: connection reset by peer
Get "http://localhost:8181/api/kytos/topology/v3/": read tcp 127.0.0.1:36911->127.0.0.1:8181: read: connection reset by peer
Get "http://localhost:8181/api/kytos/topology/v3/": read tcp 127.0.0.1:53479->127.0.0.1:8181: read: connection reset by peer
Get "http://localhost:8181/api/kytos/topology/v3/": read tcp 127.0.0.1:33709->127.0.0.1:8181: read: connection reset by peer
  • POST of_lldp/v1/polling_time 200 reqs/sec over 60 secs with uvicorn, async route (note this endpoint doesn't perform additional IO, which would make the difference even more evident):
❯ jq -ncM '{method: "POST", url: "http://localhost:8181/api/kytos/of_lldp/v1/polling_time", body: { "polling_time": 4 } | @base64, header: {"Content-Type": ["application/json"]}}' | vegeta attack -format=json -rate 200/1s -duration=60s -timeout=120s | tee results.bin | vegeta report
Requests      [total, rate, throughput]         12000, 200.02, 200.01
Duration      [total, attack, wait]             59.996s, 59.995s, 950.506µs
Latencies     [min, mean, 50, 90, 95, 99, max]  469.586µs, 1.12ms, 1.008ms, 1.603ms, 1.805ms, 2.232ms, 17.973ms
Bytes In      [total, mean]                     384000, 32.00
Bytes Out     [total, mean]                     216000, 18.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:12000  
Error Set:
  • POST of_lldp/v1/polling_time 200 reqs/sec over 60 secs with werkzeug, sync route (as expected, latency metrics are worse with werkzeug compared to uvicorn):
❯ jq -ncM '{method: "POST", url: "http://localhost:8181/api/kytos/of_lldp/v1/polling_time", body: { "polling_time": 4 } | @base64, header: {"Content-Type": ["application/json"]}}' | vegeta attack -format=json -rate 200/1s -duration=60s -timeout=120s | tee results.bin | vegeta report
Requests      [total, rate, throughput]         12000, 200.02, 200.01
Duration      [total, attack, wait]             59.998s, 59.995s, 2.914ms
Latencies     [min, mean, 50, 90, 95, 99, max]  715.538µs, 3.044ms, 3.116ms, 3.668ms, 3.831ms, 4.229ms, 19.844ms
Bytes In      [total, mean]                     396000, 33.00
Bytes Out     [total, mean]                     216000, 18.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      200:12000  
Error Set:
  • POST kytos/flow_manager/v2/flows/{dpid} 200 reqs/sec over 60 secs with uvicorn, sync route (even though the pymongo driver is still IO-blocking, it performed better, especially comparing the mean and the latencies below the 50th percentile):
❯ jq -ncM '{method: "POST", url: "http://localhost:8181/api/kytos/flow_manager/v2/flows/00:00:00:00:00:00:00:01", body: { "force": true, "flows": [ { "priority": 10, "match": { "in_port": 1, "dl_vlan": 100 }, "actions": [ { "action_type": "output", "port": 1 } ] } ] } | @base64, header: {"Content-Type": ["application/json"]}}' | vegeta attack -format=json -rate 200/1s -duration=60s -timeout=120s | tee results.bin | vegeta report
Requests      [total, rate, throughput]         12000, 200.01, 140.00
Duration      [total, attack, wait]             1m26s, 59.996s, 25.718s
Latencies     [min, mean, 50, 90, 95, 99, max]  11.722ms, 12.601s, 632.181ms, 52.68s, 1m8s, 1m21s, 1m24s
Bytes In      [total, mean]                     432000, 36.00
Bytes Out     [total, mean]                     1464000, 122.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      202:12000  
Error Set:


  • POST kytos/flow_manager/v2/flows/{dpid} 200 reqs/sec over 60 secs with werkzeug:
❯ jq -ncM '{method: "POST", url: "http://localhost:8181/api/kytos/flow_manager/v2/flows/00:00:00:00:00:00:00:01", body: { "force": true, "flows": [ { "priority": 10, "match": { "in_port": 1, "dl_vlan": 100 }, "actions": [ { "action_type": "output", "port": 1 } ] } ] } | @base64, header: {"Content-Type": ["application/json"]}}' | vegeta attack -format=json -rate 200/1s -duration=60s -timeout=120s | tee results.bin | vegeta report
Requests      [total, rate, throughput]         12000, 200.02, 98.24
Duration      [total, attack, wait]             2m2s, 59.995s, 1m2s
Latencies     [min, mean, 50, 90, 95, 99, max]  75.724ms, 58.496s, 1m0s, 1m17s, 1m23s, 1m47s, 1m58s
Bytes In      [total, mean]                     444000, 37.00
Bytes Out     [total, mean]                     1464000, 122.00
Success       [ratio]                           100.00%
Status Codes  [code:count]                      202:12000  
Error Set:

Local Tests

I ran local tests with all linked PRs (check out their PR summaries for more info).

End-to-End Tests

E2E tests with this PR and related starlette PRs can be found here; they're passing.

viniarck added 20 commits April 5, 2023 13:42
removed flask, flask-socketio, flask_cors
api_client
dead_letter
auth
Deleted autouse ev_loop fixture to avoid ev loop conflicts
Replaced werkzeug with uvicorn
Adapted APIServer methods accordingly
Used httpx when fetching ui web latest release tag
Refactored DeadLetter endpoints to be async
Introduced  to validate async routes
Broken down functions for reusability
@viniarck viniarck requested a review from a team as a code owner April 20, 2023 19:12
@viniarck viniarck marked this pull request as draft April 20, 2023 19:13
@viniarck viniarck marked this pull request as ready for review May 1, 2023 20:18