tld Package

Extracts the top level domain (TLD) from a given URL. The list of TLD names is taken from Mozilla: http://mxr.mozilla.org/mozilla/source/netwerk/dns/src/effective_tld_names.dat?raw=1

By default, an exception is raised when the TLD does not exist; if the fail_silently argument is set to True, the failure is silent and None is returned instead. The package knows about active and inactive TLDs. To match against active TLDs only, set the active_only argument to True (default: False).

Installation

Latest stable version on PyPI:

$ pip install tld

Latest development version:

$ pip install -e hg+http://bitbucket.org/barseghyanartur/tld#egg=tld

Usage example

To get the top level domain name from a given URL:

>>> from tld import get_tld
>>> print(get_tld("http://www.google.co.uk"))
google.co.uk
>>> print(get_tld("http://www.google.idontexist", fail_silently=True))
None

To sync the TLD names with the most recent version, run the following from your terminal:

$ python tld/update.py

or simply do:

>>> from tld.utils import update_tld_names
>>> update_tld_names()

utils Module

tld.utils.update_tld_names(fail_silently=False)

Updates the local copy of TLDs file.

Parameters: fail_silently (bool) – If set to True, no exception is raised on failure; False is returned instead.
Returns: bool – True on success, False on failure.
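
The fail_silently contract described here (raise on failure, or swallow the error and return False) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the package's code: sketch_update_names and its fetch and path parameters are hypothetical names introduced for this example, and fetch is passed in so that the sketch runs without network access.

```python
import os
import tempfile


def sketch_update_names(fetch, path, fail_silently=False):
    """Write the bytes returned by fetch() to path; report success as a bool.

    Hypothetical helper for illustration only, not part of the tld package.
    """
    try:
        data = fetch()
        with open(path, "wb") as handle:
            handle.write(data)
    except Exception:
        if fail_silently:
            return False  # swallow the error and signal failure
        raise  # otherwise let the caller see the original exception
    return True


def broken_fetch():
    """Stand-in for a download that fails."""
    raise IOError("simulated download failure")


target = os.path.join(tempfile.mkdtemp(), "effective_tld_names.dat")
print(sketch_update_names(lambda: b"// sample\ncom\n", target))  # True
print(sketch_update_names(broken_fetch, target, fail_silently=True))  # False
```

With fail_silently left at False, the second call would re-raise the IOError instead of returning False.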

tld.utils.get_tld(url, active_only=False, fail_silently=False)

Extracts the top level domain based on Mozilla’s effective TLD names dat file. Returns a string. May raise TldBadUrl if a bad URL is provided, or TldDomainNotFound if no TLD match is found.

Parameters:
  • url – URL to get top level domain from.
  • active_only – If set to True, only active patterns are matched.
  • fail_silently – If set to True, no exceptions are raised and None is returned on failure.
Returns: String with top level domain, or None on failure.
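
To illustrate what matching a URL against a list of effective TLD names involves, here is a minimal, self-contained sketch. It is not the package's implementation. The three-entry SAMPLE_SUFFIXES set stands in for the Mozilla file, the name sketch_get_tld is hypothetical, wildcard and exception rules and the active_only option are ignored, and a plain ValueError stands in for TldBadUrl and TldDomainNotFound.

```python
from urllib.parse import urlparse

# Tiny stand-in for the Mozilla effective TLD names file (assumption).
SAMPLE_SUFFIXES = {"com", "uk", "co.uk"}


def sketch_get_tld(url, fail_silently=False):
    """Return the longest known suffix plus the one label in front of it."""
    host = urlparse(url).hostname
    if host:
        labels = host.lower().split(".")
        # Start at 1 so at least one label remains in front of the suffix;
        # scanning left to right tries the longest candidate suffix first.
        for i in range(1, len(labels)):
            if ".".join(labels[i:]) in SAMPLE_SUFFIXES:
                return ".".join(labels[i - 1:])
    if fail_silently:
        return None
    raise ValueError("bad URL or no matching TLD: %r" % (url,))


print(sketch_get_tld("http://www.google.co.uk"))  # google.co.uk
print(sketch_get_tld("http://www.google.idontexist", fail_silently=True))  # None
```

Trying the longest suffix first is what makes "co.uk" win over "uk" in the first call, so the result is google.co.uk rather than co.uk.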
