Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FDS-2127] Update URL validation to use requests.options to verify connectivity #1469

Conversation

BryanFauble
Copy link
Collaborator

Problem:

  1. Valid URLs like https://doi.org/10.1158/0008-5472.can-23-0128 were being rejected as the urlopen was not connecting to the server hosting this webpage.

Solution:

  1. Swap to requests.options that will verify the webserver in question is reachable. Read more about the options method here: https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods/OPTIONS

Testing:

  1. I set up a few integration tests around the changes.

@@ -2,17 +2,28 @@
[![Build Status](https://img.shields.io/endpoint.svg?url=https%3A%2F%2Factions-badge.atrox.dev%2FSage-Bionetworks%2Fschematic%2Fbadge%3Fref%3Ddevelop&style=flat)](https://actions-badge.atrox.dev/Sage-Bionetworks/schematic/goto?ref=develop) [![Documentation Status](https://readthedocs.org/projects/sage-schematic/badge/?version=develop)](https://sage-schematic.readthedocs.io/en/develop/?badge=develop) [![PyPI version](https://badge.fury.io/py/schematicpy.svg)](https://badge.fury.io/py/schematicpy)

# Table of contents
- [Schematic](#schematic)
Copy link
Collaborator Author

@BryanFauble BryanFauble Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a markdown plugin in VSCODE that automatically creates/updates these. If you don't wish to have this update I will delete it from the PR.

@@ -142,3 +142,9 @@ def temporary_file_copy(request, helpers: Helpers) -> Generator[str, None, None]
# Teardown
if os.path.exists(temp_csv_path):
os.remove(temp_csv_path)


@pytest.fixture(name="dmge", scope="function")
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I moved this up to conftest so I could use it in my new tests as well

@@ -0,0 +1,73 @@
import pandas as pd
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the start of moving some files into an tests/integration directory. We can start slow and add new items into the appropriate locations while we slowly move items over in a strangler fig pattern: https://shopify.engineering/refactoring-legacy-code-strangler-fig-pattern

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I appreciate the additive nature of these changes

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Bryan! I like the change.

Copy link

sonarcloud bot commented Aug 23, 2024

valid_url = True
response_code = response.getcode()
response = requests.options(url, allow_redirects=True)
Copy link
Member

@thomasyu888 thomasyu888 Aug 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the FAIR team, was there a specific reason to use urllib other than the fact that it is a builtin package?

I think something like this works too

from urllib import request

req = request.Request(url, method='OPTIONS')
response = request.urlopen(req)
print(response.status, response.headers)

note: I prefer using requests package, but thought I would ask

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I also prefer using requests package. What Brian has there seems perfect to me.

Copy link
Member

@thomasyu888 thomasyu888 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔥 Thanks for hopping on this so quick, I'm going to pre-approve but will leave it up to the FAIR team to approve.

@BryanFauble
Copy link
Collaborator Author

@thomasyu888 @linglp I moved this to a non-forked branch/PR: #1472

@BryanFauble BryanFauble deleted the fds-2127-url-validation branch August 26, 2024 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants