Fix the AccessLoggingFilter #12

Open
LucaCinquini opened this issue Mar 14, 2016 · 3 comments

@LucaCinquini
Member

Who: Luca

It has been reported that in a small percentage of cases, data downloads do not complete successfully because of the AccessLoggingFilter. The root cause within the filter is not yet known.

@LucaCinquini LucaCinquini self-assigned this Mar 14, 2016
@LucaCinquini LucaCinquini added this to the Release 0.8.0 milestone Mar 14, 2016
@LucaCinquini
Member Author

I have created a new version of the AccessLoggingFilter (with the same name) which does the following:

o Does NOT wrap the HTTP response to count the bytes actually transferred, in case this is the source of the problem
o Does NOT rely on esg.ini to check the size of the file on disk for every request/response, in case this is the source of the problem. This has the additional benefit that data downloads will NOT crash if the file mount point is NOT found in the esg.ini file (as happened previously)

The new filter also does the following:

o Retrieves the size of the input file from the HTTP response "Content-Length" (same as the TDS does)
o Retrieves the "duration" time (time to deliver the response) from the filter execution time - same as the TDS servlet does
o Logs the original URL that was requested, i.e. it does NOT strip the first part of the URL in an attempt to resolve it against the mount point in the file system
o Sets success=true if the response code is 200 (NOTE: the response code will be 200 even if the client interrupts the request. It will NOT be 200 if a problem arises on the server side).
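
As a rough illustration of the rules listed above (this is not the actual filter code), the sketch below derives the logged values from what the filter can observe without wrapping the response. The class name `AccessLogRecord`, the method names, the use of milliseconds for the duration, and the use of -1 for a value that was not measured are all assumptions made for the example; the -1 convention is only inferred from the `xfer_size` value in the sample row posted in this thread.

```java
// Hypothetical value class; field names loosely mirror the access_logging columns.
final class AccessLogRecord {
    final String url;       // original requested URL, logged unmodified
    final boolean success;  // true only when the response status is 200
    final long dataSize;    // taken from the "Content-Length" response header
    final long xferSize;    // not measured, since the response is not wrapped
    final long duration;    // filter execution time (assumed to be milliseconds)

    private AccessLogRecord(String url, boolean success, long dataSize,
                            long xferSize, long duration) {
        this.url = url;
        this.success = success;
        this.dataSize = dataSize;
        this.xferSize = xferSize;
        this.duration = duration;
    }

    // Builds a record from values a filter could collect around chain.doFilter():
    // the request URL, the response status, the Content-Length header value,
    // and timestamps taken before and after the chain ran.
    static AccessLogRecord of(String requestedUrl, int statusCode,
                              String contentLengthHeader,
                              long startMillis, long endMillis) {
        long dataSize = parseContentLength(contentLengthHeader);
        long duration = endMillis - startMillis;
        boolean success = (statusCode == 200);
        // Assumption: -1 marks "bytes transferred not counted".
        return new AccessLogRecord(requestedUrl, success, dataSize, -1L, duration);
    }

    // Returns the header value as a non-negative long, or -1 (assumed sentinel)
    // when the header is missing or malformed.
    static long parseContentLength(String header) {
        if (header == null) {
            return -1L;
        }
        try {
            long value = Long.parseLong(header.trim());
            return value >= 0 ? value : -1L;
        } catch (NumberFormatException e) {
            return -1L;
        }
    }
}
```

In a real servlet filter, the two timestamps would be taken immediately before and after the call to `chain.doFilter(request, response)`, and the status and header would be read from the response afterwards.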

@LucaCinquini
Member Author

This has been tested successfully at JPL after loading 356452 entries in the esgf_node_manager.access_logging table. However, it is possible that problems arise only with large database tables and in high-concurrency situations.

@LucaCinquini
Member Author

Example of entry for restricted download:

select user_id, url, remote_addr, user_agent, service_type, batch_update_time, to_timestamp(date_fetched), success, duration, user_idp, data_size, xfer_size, duration
from esgf_node_manager.access_logging
order by date_fetched desc limit 1;

Result (1 row), shown one column per line:

user_id           | https://esgf-node.jpl.nasa.gov/esgf-idp/openid/rootAdmin
url               | https://esgf-dev.jpl.nasa.gov/thredds/fileServer/esg_dataroot/obs4MIPs/observations/atmos/hus/mon/grid/NASA-JPL/MLS/v20111025/hus_MLS_L3_v03-3x_200408-201012.nc
remote_addr       | 137.79.241.180
user_agent        |
service_type      | thredds
batch_update_time | 0
to_timestamp      | 2016-03-14 09:43:31-07
success           | t
duration          | 20033
user_idp          | esgf-node.jpl.nasa.gov
data_size         | 45919888
xfer_size         | -1
duration          | 20033
