Develop #122

Open
wants to merge 46 commits into
base: develop
46 commits
ae7d3cc
ADD: Check if a referral WHOIS server is actually alive
Sir-Fenrir Jun 6, 2016
90c5f0a
FIX: Registrar key not existing (resulted in failure for .eu domains)
Sir-Fenrir Jun 6, 2016
13c0a4a
ADD: Caching for WHOIS servers. Only the top level, referred servers …
Sir-Fenrir Jun 8, 2016
e103b98
FIX: Single instance caching
Sir-Fenrir Jun 9, 2016
2476e71
FIX: Encoding issues. Normalizing is back on.
Sir-Fenrir Jun 9, 2016
8ae3acd
Merge pull request #1 from MasterFenrir/develop
Jun 9, 2016
b8fc96b
REF: Changed README.md
Sir-Fenrir Jun 9, 2016
9be54dc
REF: Renamed some stuff to not confuse this package with pythonwhois.…
Sir-Fenrir Jun 9, 2016
5b99c51
FIX: Cache is now saved in your home folder, OS independent.
Sir-Fenrir Jun 9, 2016
4069177
REF: No more forced caching, can now be set manually by giving the ca…
Sir-Fenrir Jun 9, 2016
e9201d6
ADD: Cooldown capabilities.
Sir-Fenrir Jun 9, 2016
0d1f71d
ADD: Comments
Sir-Fenrir Jun 9, 2016
0bc8ed5
FIX: Package being packaged in setup.py
Sir-Fenrir Jun 9, 2016
b6ea0ea
Merge pull request #1 in ~GEDEELD/the-whois-oracle from feature/rate-…
weswes666 Jun 10, 2016
08ea923
FIX: Package being packaged in setup.py
Sir-Fenrir Jun 10, 2016
4e7048d
ADD: Configuration for cool down
Sir-Fenrir Jun 10, 2016
57466bc
REF: prints
Sir-Fenrir Jun 10, 2016
7d3a8aa
REF: Thread is started automatically again
Sir-Fenrir Jun 10, 2016
97c2045
ADD: The values for the default cool down and the cool down period ca…
Sir-Fenrir Jun 10, 2016
2aa6654
REF: Some comments and variable names
Sir-Fenrir Jun 12, 2016
393f171
ADD: Explanation to the README.md about caching and cool down
Sir-Fenrir Jun 12, 2016
85f8bd2
FIX: Clarification in the Readme
Sir-Fenrir Jun 13, 2016
e2d8cdd
REF: Placed CoolDownTracker into its own file
Sir-Fenrir Jun 13, 2016
faa0fe0
REF: Made the name for the method that resets the cool down clearer
Sir-Fenrir Jun 13, 2016
9607741
FIX: Order in which the cool down is decided.
Sir-Fenrir Jun 13, 2016
271a5f2
REF: Removed threading, thanks to Wes. Thanks Wes!
Sir-Fenrir Jun 13, 2016
feb5065
REF: net.py looks a bit neater now, but I didn't go all out because t…
Sir-Fenrir Jun 14, 2016
4d87efe
REF: Made decrement_cool_downs a little clearer
Sir-Fenrir Jun 14, 2016
a01f663
ADD: A holder for WHOIS responses. It contains information about the …
Sir-Fenrir Jun 14, 2016
d9aa27c
REF: Increased default cool down length from 1 second to 2
Sir-Fenrir Jun 15, 2016
e3f9618
Merge pull request #2 in ~GEDEELD/the-whois-oracle from feature/confi…
weswes666 Jun 15, 2016
744ee34
REF: Renamed whois_response.py to raw_whois_response.py
Sir-Fenrir Jun 15, 2016
2d5e2ca
REF: Removed the fix for parsing empty responses and moved it to the …
Sir-Fenrir Jun 15, 2016
b0f201e
ENH: Wording in a comment
Sir-Fenrir Jun 16, 2016
0b049a0
Merge pull request #3 in ~GEDEELD/the-whois-oracle from feature/VOY-1…
weswes666 Jun 16, 2016
6300274
FIX: Fixed failure for gg domains
Sir-Fenrir Jun 17, 2016
aaa7a6c
REF: Changed the default cooldown to 4 seconds
Sir-Fenrir Jun 17, 2016
b8771d0
Merge pull request #4 in ~GEDEELD/the-whois-oracle from bugfix/fix-fo…
weswes666 Jun 17, 2016
4a44d4d
REF: Version increase
Sir-Fenrir Jun 17, 2016
3092c8a
DEL Checking whether the referral server is alive now only happens in…
Sir-Fenrir Jun 17, 2016
08f9be9
FIX: postal -> postalcode
Sir-Fenrir Jun 17, 2016
91bd710
REF: Changed the increase in cool down rate from 2 to 1.5
Sir-Fenrir Jun 20, 2016
9842f54
ADD: WhoisResult. Contains the list of responses, whether the list is…
Sir-Fenrir Jun 20, 2016
e1e5728
ADD: Added a new method for compatibility with the original pythonwhois
Sir-Fenrir Jun 21, 2016
9a8c17e
Merge pull request #5 in ~GEDEELD/the-whois-oracle from bugfix/now-on…
weswes666 Jun 21, 2016
f236888
ENH: Small version increase
Sir-Fenrir Jun 23, 2016
42 changes: 31 additions & 11 deletions README.md
@@ -1,7 +1,9 @@
-pythonwhois
-===========
+The WHOIS Oracle, forked from pythonwhois
+=========================================

-A WHOIS retrieval and parsing library for Python.
+Because it is all knowing!
+A WHOIS retrieval and parsing library for Python, forked from pythonwhois
+and updated by me.

 ## Dependencies

@@ -11,6 +13,32 @@ None! All you need is the Python standard library.

 The manual (including install instructions) can be found in the doc/ directory. A HTML version is also viewable [here](http://cryto.net/pythonwhois).

+## Cache configuration
+A cache can be set with `pythonwhois.set_persistent_cache`. If a cache is set,
+whois-oracle looks there for the WHOIS servers of TLDs. For domains with thin
+WHOIS servers, only the 'head' WHOIS server is cached, not the referral servers.
+Caching a referral server would make it
+impossible to get the correct information, because the information for the domain
+might not be on that WHOIS server at all.
+
+## Cool down configuration
+This feature is not useful for single lookups, but it comes in handy for bulk lookups.
+Every WHOIS server gets a certain waiting time before it is queried again, to prevent spamming
+and possibly refused connections. This can be configured by passing a configuration file
+to `pythonwhois.set_cool_down_config`. This file can contain the following two elements, but doesn't have to.
+`[general]`
+`default_cool_down_length : 1`
+This is the general part. Currently, only one variable can be defined. It is optional to do so.
+
+`[whois.eu]`
+`cool_down_length : 10`
+`max_requests_minute : 5`
+`max_requests_hour : 20`
+`max_requests_day : 50`
+This is how sections for specific WHOIS servers are defined. The section
+name is the name of the server, and the section can contain the listed properties.
+None of them are required. Multiple WHOIS servers can be added to the configuration file.
+
 ## Goals

 * 100% coverage of WHOIS formats.
@@ -53,14 +81,6 @@ The manual (including install instructions) can be found in the doc/ directory.

 Do note that `ipwhois` does not offer a normalization feature, and does not (yet) come with a command-line tool. Additionally, `ipwhois` is maintained by Philip Hane and not by me; please make sure to file bugs relating to it in the `ipwhois` repository, not in that of `pythonwhois`.

-## Important update notes
-
-*2.4.0 and up*: A lot of changes were made to the normalization, and the performance under Python 2.x was significantly improved. The average parsing time under Python 2.7 has dropped by 94% (!), and on my system averages out at 18ms. Performance under Python 3.x is [unchanged](https://github.com/joepie91/python-whois/issues/27). `pythonwhois` will now expand a lot of abbreviations in normalized mode, such as airport codes, ISO country codes, and US/CA/AU state abbreviations. The consequence of this is that the library is now bigger (as it ships a list of these abbreviations). Also note that there *may* be licensing consequences, in particular regarding the airport code database. More information about that can be found below.
-
-*2.3.0 and up*: Python 3 support was fixed. Creation date parsing for contacts was fixed; correct timestamps will now be returned, rather than unformatted ones - if your application relies on the broken variant, you'll need to change your code. Some additional parameters were added to the `net` and `parse` methods to facilitate NIC handle lookups; the defaults are backwards-compatible, and these changes should not have any consequences for your code. Thai WHOIS parsing was implemented, but is a little spotty - data may occasionally be incorrectly split up. Please submit a bug report if you run across any issues.
-
-*2.2.0 and up*: The internal workings of `get_whois_raw` have been changed, to better facilitate parsing of WHOIS data from registries that may return multiple partial matches for a query, such as `whois.verisign-grs.com`. This change means that, by default, `get_whois_raw` will now strip out the part of such a response that does not pertain directly to the requested domain. If your application requires an unmodified raw WHOIS response and is calling `get_whois_raw` directly, you should use the new `never_cut` parameter to keep pythonwhois from doing this post-processing. As this is a potentially breaking behaviour change, the minor version has been bumped.
-
 ## It doesn't work!

 * It doesn't work at all?
116 changes: 0 additions & 116 deletions pwhois

This file was deleted.
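
To illustrate the cool down configuration format introduced in the README.md changes of this pull request, the sketch below parses the sample file with Python 3's standard `configparser`. It is not the parser used by whois-oracle: `load_cool_down_settings`, `SAMPLE_CONFIG`, `LIMIT_KEYS`, and the layout of the returned dictionary are hypothetical names chosen for this example. Only the section and option names are taken from the README.

```python
import configparser

# Sample file content, copied from the README changes in this pull request.
SAMPLE_CONFIG = """\
[general]
default_cool_down_length : 1

[whois.eu]
cool_down_length : 10
max_requests_minute : 5
max_requests_hour : 20
max_requests_day : 50
"""

# Optional per-server request limits named in the README.
LIMIT_KEYS = ("max_requests_minute", "max_requests_hour", "max_requests_day")


def load_cool_down_settings(text, fallback_length=1.0):
    """Hypothetical helper: return (default_length, {server: settings})."""
    parser = configparser.ConfigParser()
    parser.read_string(text)

    # The [general] section and its single option are both optional.
    default_length = parser.getfloat(
        "general", "default_cool_down_length", fallback=fallback_length)

    servers = {}
    for section in parser.sections():
        if section == "general":
            continue
        # Every other section name is treated as a WHOIS server name.
        settings = {
            "cool_down_length": parser.getfloat(
                section, "cool_down_length", fallback=default_length),
        }
        for key in LIMIT_KEYS:
            # None means "no limit configured" in this sketch.
            settings[key] = parser.getint(section, key, fallback=None)
        servers[section] = settings
    return default_length, servers
```

Because every option is read with a `fallback`, an empty file or a server section with no properties still yields usable settings, which matches the README's statement that none of the properties are required.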

35 changes: 27 additions & 8 deletions pythonwhois/__init__.py
@@ -1,11 +1,30 @@
 from . import net, parse

+
 def get_whois(domain, normalized=[]):
-    raw_data, server_list = net.get_whois_raw(domain, with_server_list=True)
-    # Unlisted handles will be looked up on the last WHOIS server that was queried. This may be changed to also query
-    # other servers in the future, if it turns out that there are cases where the last WHOIS server in the chain doesn't
-    # actually hold the handle contact details, but another WHOIS server in the chain does.
-    return parse.parse_raw_whois(raw_data, normalized=normalized, never_query_handles=False, handle_server=server_list[-1])
-
-def whois(*args, **kwargs):
-    raise Exception("The whois() method has been replaced by a different method (with a different API), since pythonwhois 2.0. Either install the older pythonwhois 1.2.3, or change your code to use the new API.")
+    final_result = net.get_whois_raw_wrapped(domain, with_server_list=True)
+    # Unlisted handles will be looked up on the last WHOIS server that was queried. This may be changed to also query
+    # other servers in the future, if it turns out that there are cases where the last WHOIS server in the chain doesn't
+    # actually hold the handle contact details, but another WHOIS server in the chain does.
+    if len(final_result.server_list) > 0:
+        handle_server = final_result.server_list[-1]
+    else:
+        handle_server = ""
+    return parse.parse_raw_whois(final_result.responses, normalized=normalized, never_query_handles=False,
+                                 handle_server=handle_server)
+
+
+def set_persistent_cache(path_to_cache):
+    """
+    Set a persistent cache. If the file does not yet exist, it is created.
+    :param path_to_cache: The place where the cache is stored or needs to be created
+    """
+    net.server_cache.set_persistent_location(path_to_cache)
+
+
+def set_cool_down_config(path_to_config):
+    """
+    Set a cool down configuration file, describing specific settings for certain WHOIS servers.
+    :param path_to_config: The path to the configuration file, this needs to exist
+    """
+    net.cool_down_tracker.set_cool_down_config(path_to_config)
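
The new `get_whois` body guards against an empty server list before taking the last server, where the old code indexed `server_list[-1]` unconditionally. The snippet below isolates that guard so it can be tried without network access. `WhoisResult` here is a hypothetical stand-in with only the two attributes the diff reads (`responses` and `server_list`), and `pick_handle_server` is a name invented for this example; the real result class in the pull request may look different.

```python
from collections import namedtuple

# Hypothetical stand-in exposing only the attributes used in the diff above.
WhoisResult = namedtuple("WhoisResult", ["responses", "server_list"])


def pick_handle_server(result):
    """Return the last queried WHOIS server, or "" if none was queried."""
    if len(result.server_list) > 0:
        return result.server_list[-1]
    # Indexing an empty list with [-1] would raise IndexError here.
    return ""


chained = WhoisResult(responses=["raw text"], server_list=["whois.a", "whois.b"])
empty = WhoisResult(responses=[], server_list=[])
print(pick_handle_server(chained))
print(repr(pick_handle_server(empty)))
```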
Empty file added pythonwhois/caching/__init__.py
Empty file.
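
The README changes state that only the 'head' WHOIS server for a TLD is cached, never the referral servers. A minimal sketch of that lookup pattern follows, assuming a plain dictionary as the cache. `find_head_server`, `fake_discover`, and the `whois.nic.<tld>` server names are all made up for illustration, and the TLD is extracted naively as the last dot-separated label; none of this is taken from `net.py`.

```python
def find_head_server(domain, cache, discover):
    """Return the head WHOIS server for the domain's TLD, caching by TLD.

    `discover` is a hypothetical callable mapping a TLD to its head server.
    Referral servers are deliberately never stored, because they are
    specific to a domain rather than to the TLD.
    """
    tld = domain.rsplit(".", 1)[-1].lower()  # naive TLD extraction
    server = cache.get(tld)
    if server is None:
        server = discover(tld)
        cache[tld] = server
    return server


discover_calls = []


def fake_discover(tld):
    # Records each lookup so the effect of the cache is visible.
    discover_calls.append(tld)
    return "whois.nic." + tld  # made-up server name


cache = {}
first = find_head_server("example.eu", cache, fake_discover)
second = find_head_server("another.EU", cache, fake_discover)
```

After both calls, `fake_discover` has run once: the second lookup is answered from the cache because both domains share the `eu` TLD.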
59 changes: 59 additions & 0 deletions pythonwhois/caching/whois_server_cache.py
@@ -0,0 +1,59 @@
+import ast
+import os
+
+
+def read_cache(file_path):
+    if os.path.isfile(file_path):
+        return ast.literal_eval(open(file_path).read())
+    else:
+        if os.path.dirname(file_path):
+            os.makedirs(os.path.dirname(file_path))
+        return {}
+
+
+def write_cache(cache, file_path):
+    cache_file = open(file_path, 'w')
+    cache_file.write(str(cache))
+
+
+class WhoisServerCache:
+    """
+    Cache handler for ease of use. Do not instantiate; import server_cache instead.
+    Otherwise an inconsistent cache can happen as a result of multiple caches.
+    """
+
+    def __init__(self):
+        self.cache = {}
+        self.persistent = False
+        self.file_path = None
+
+    def get_server(self, tld):
+        """
+        Get a WHOIS server for a given TLD
+        :param tld: The TLD to get the WHOIS server for
+        :return: The WHOIS server if it is known, or None otherwise
+        """
+        return self.cache.get(tld)
+
+    def put_server(self, tld, whois_server):
+        """
+        Store a new WHOIS server in the cache. If the cache is persistent,
+        it is also written to disk again. Because the WHOIS servers
+        don't change that often, it simply writes to a file.
+        :param tld: The TLD to store a WHOIS server for
+        :param whois_server: The WHOIS server to store
+        """
+        self.cache[tld] = whois_server
+        if self.file_path is not None:
+            write_cache(self.cache, self.file_path)
+
+    def set_persistent_location(self, file_path):
+        """
+        Store the cache in a persistent location
+        :param file_path: The path to store the cache
+        """
+        self.file_path = file_path
+        self.cache = read_cache(file_path)
+
+
+server_cache = WhoisServerCache()
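
In the diff, `read_cache` and `write_cache` open files without closing them explicitly, and `os.makedirs` raises if the parent directory already exists while the cache file does not. One possible variant, offered only as a suggestion and not as code from this pull request, keeps the same `str(dict)` / `ast.literal_eval` storage format but uses context managers and checks for the directory first. The `demo` function and its file names are hypothetical, and it assumes Python 3 for `tempfile.TemporaryDirectory`.

```python
import ast
import os
import tempfile


def read_cache(file_path):
    """Load the cache dict, or return {} and prepare the parent directory."""
    if os.path.isfile(file_path):
        with open(file_path) as cache_file:
            return ast.literal_eval(cache_file.read())
    directory = os.path.dirname(file_path)
    # Only create the directory when it is named and does not exist yet.
    if directory and not os.path.isdir(directory):
        os.makedirs(directory)
    return {}


def write_cache(cache, file_path):
    """Write the cache dict as its literal representation."""
    with open(file_path, "w") as cache_file:
        cache_file.write(str(cache))


def demo():
    """Round-trip a one-entry cache through a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nested", "whois_cache.txt")
        cache = read_cache(path)  # file is missing: {} and "nested" is created
        cache["eu"] = "whois.eu"
        write_cache(cache, path)
        return read_cache(path)
```

Calling `demo()` writes the cache and reads it back, so the returned dictionary should equal the one that was stored.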