tld package¶

Submodules¶

tld.base module¶

class tld.base.BaseTLDSourceParser[source]¶

Bases: object

Base TLD source parser.

classmethod get_tld_names(fail_silently: bool = False, retry_count: int = 0)[source]¶

Get TLD names.

Parameters:	fail_silently (bool)
	retry_count (int)
Returns:

include_private = True¶

uid = None¶

classmethod update_tld_names(fail_silently: bool = False) → bool[source]¶

Update the local copy of the TLD file.

Parameters:	fail_silently –
Returns:

classmethod validate()[source]¶

Constructor.

class tld.base.Registry[source]¶

Bases: type

REGISTRY = {'mozilla': <class 'tld.utils.MozillaTLDSourceParser'>, 'mozilla_public_only': <class 'tld.utils.MozillaPublicOnlyTLDSourceParser'>}¶

classmethod get(key: str, default: Optional[tld.base.BaseTLDSourceParser] = None) → Optional[tld.base.BaseTLDSourceParser][source]¶

classmethod items() → ItemsView[str, tld.base.BaseTLDSourceParser][source]¶

classmethod reset() → None[source]¶
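The registry shape listed above (a metaclass holding a REGISTRY dict keyed by parser uid, with get, items and reset) can be sketched with the standard library alone. This is an illustration under assumptions, not the library's code: SketchRegistry, SketchParser and the two subclasses are hypothetical names, and skipping classes whose uid is None is an assumed rule.

```python
from typing import Dict, ItemsView, Optional


class SketchRegistry(type):
    """Hypothetical metaclass that records every class defining a uid."""

    REGISTRY: Dict[str, type] = {}

    def __new__(mcs, name, bases, attrs):
        new_cls = super().__new__(mcs, name, bases, attrs)
        # Assumption: classes whose uid is None (abstract bases) are skipped.
        uid = attrs.get("uid")
        if uid is not None:
            mcs.REGISTRY[uid] = new_cls
        return new_cls

    @classmethod
    def get(mcs, key: str, default: Optional[type] = None) -> Optional[type]:
        return mcs.REGISTRY.get(key, default)

    @classmethod
    def items(mcs) -> ItemsView[str, type]:
        return mcs.REGISTRY.items()

    @classmethod
    def reset(mcs) -> None:
        mcs.REGISTRY = {}


class SketchParser(metaclass=SketchRegistry):
    uid = None  # abstract base: not registered


class SketchMozillaParser(SketchParser):
    uid = "mozilla"


class SketchPublicOnlyParser(SketchParser):
    uid = "mozilla_public_only"
```

Defining a subclass is enough to register it, so a lookup such as SketchRegistry.get("mozilla") needs no separate registration call.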

tld.conf module¶

tld.defaults module¶

tld.exceptions module¶

exception tld.exceptions.TldBadUrl(url)[source]¶

Bases: ValueError

TldBadUrl.

Raised when a bad URL is given.

exception tld.exceptions.TldDomainNotFound(domain_name)[source]¶

Bases: ValueError

TldDomainNotFound.

Raised when the domain name is not found in (does not match) the local TLD policy.

exception tld.exceptions.TldImproperlyConfigured[source]¶

Bases: Exception

TldImproperlyConfigured.

Raised when the code is improperly configured. A typical case is calling get_tld with both search_public and search_private set to False.

exception tld.exceptions.TldIOError[source]¶

Bases: OSError

TldIOError.

Raised when reading or writing fails.

tld.helpers module¶

tld.helpers.project_dir(base: str) → str[source]¶

Project dir.

tld.helpers.PROJECT_DIR(base: str) → str¶

Project dir.

tld.registry module¶

class tld.registry.Registry[source]¶

Bases: type

REGISTRY = {'mozilla': <class 'tld.utils.MozillaTLDSourceParser'>, 'mozilla_public_only': <class 'tld.utils.MozillaPublicOnlyTLDSourceParser'>}¶

classmethod get(key: str, default: Optional[tld.base.BaseTLDSourceParser] = None) → Optional[tld.base.BaseTLDSourceParser][source]¶

classmethod items() → ItemsView[str, tld.base.BaseTLDSourceParser][source]¶

classmethod reset() → None[source]¶

tld.result module¶

class tld.result.Result(tld: str, domain: str, subdomain: str, parsed_url: urllib.parse.SplitResult)[source]¶

Bases: object

Container.

domain¶

extension¶

Alias of tld.

Return type:	str

fld¶

First level domain.

Returns:
Return type:	str

parsed_url¶

subdomain¶

suffix¶

Alias of tld.

Return type:	str

tld¶
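How the attributes of this container relate to one another can be sketched as follows. SketchResult is a hypothetical stand-in, not the library class; in particular, building fld by joining domain and tld with a dot is an assumption based on the "first level domain" description.

```python
from urllib.parse import SplitResult, urlsplit


class SketchResult:
    """Hypothetical container mirroring the attribute names listed above."""

    __slots__ = ("tld", "domain", "subdomain", "parsed_url")

    def __init__(self, tld: str, domain: str, subdomain: str,
                 parsed_url: SplitResult) -> None:
        self.tld = tld
        self.domain = domain
        self.subdomain = subdomain
        self.parsed_url = parsed_url

    @property
    def extension(self) -> str:
        # Alias of tld.
        return self.tld

    # suffix is a second alias of tld, sharing the same property object.
    suffix = extension

    @property
    def fld(self) -> str:
        # Assumption: first level domain = registered name + "." + suffix.
        return f"{self.domain}.{self.tld}"


parts = SketchResult("co.uk", "google", "www",
                     urlsplit("http://www.google.co.uk"))
print(parts.fld)     # google.co.uk
print(parts.suffix)  # co.uk
```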

tld.trie module¶

class tld.trie.Trie[source]¶

Bases: object

An ad hoc trie data structure that stores TLDs in reverse notation order.

add(tld: str, private: bool = False) → None[source]¶

class tld.trie.TrieNode[source]¶

Bases: object

Class representing a single Trie node.

children¶

exception¶

leaf¶

private¶
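Storing TLDs "in reverse notation order" means that a suffix such as co.uk is inserted label by label from the right (uk, then co). The sketch below illustrates that idea with the node attributes listed above. SketchTrie, SketchTrieNode and longest_suffix are hypothetical names; the sketch leaves the exception attribute unused and does not handle wildcard rules, so it is not a substitute for the library's implementation.

```python
from typing import Dict, Optional


class SketchTrieNode:
    """One label in the trie; attribute names mirror the listing above."""

    __slots__ = ("children", "exception", "leaf", "private")

    def __init__(self) -> None:
        self.children: Optional[Dict[str, "SketchTrieNode"]] = None
        self.exception: Optional[str] = None  # unused in this sketch
        self.leaf = False
        self.private = False


class SketchTrie:
    def __init__(self) -> None:
        self.root = SketchTrieNode()

    def add(self, tld: str, private: bool = False) -> None:
        node = self.root
        # "co.uk" is walked as ["uk", "co"]: reverse notation order.
        for label in reversed(tld.split(".")):
            if node.children is None:
                node.children = {}
            node = node.children.setdefault(label, SketchTrieNode())
        node.leaf = True
        node.private = private

    def longest_suffix(self, hostname: str) -> Optional[str]:
        """Hypothetical lookup: longest stored suffix that ends hostname."""
        labels = hostname.split(".")
        node = self.root
        matched = 0
        for depth, label in enumerate(reversed(labels), start=1):
            if node.children is None or label not in node.children:
                break
            node = node.children[label]
            if node.leaf:
                matched = depth
        return ".".join(labels[-matched:]) if matched else None


trie = SketchTrie()
for name in ("uk", "co.uk", "com"):
    trie.add(name)
print(trie.longest_suffix("www.google.co.uk"))  # co.uk
```

Walking from the rightmost label lets the lookup stop as soon as a label is unknown, while still remembering the deepest leaf seen so far.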

tld.utils module¶

class tld.utils.BaseMozillaTLDSourceParser[source]¶

Bases: tld.base.BaseTLDSourceParser

classmethod get_tld_names(fail_silently: bool = False, retry_count: int = 0) → Optional[Dict[str, tld.trie.Trie]][source]¶

Parse.

Parameters:	fail_silently (bool)
	retry_count (int)
Returns:

tld.utils.get_fld(url: Union[str, urllib.parse.SplitResult], fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = None, **kwargs) → Optional[str][source]¶

Extract the first level domain.

Extract the first level domain based on Mozilla’s effective TLD names dat file. Returns a string. May raise TldBadUrl or TldDomainNotFound if a bad URL is provided or no TLD match is found, respectively.

Parameters:	url (str | SplitResult) – URL to get the first level domain from.
	fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
	fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).
	search_public (bool) – If set to True, search in public domains.
	search_private (bool) – If set to True, search in private domains.
	parser_class (Type[BaseTLDSourceParser]) – TLD source parser class to use.
Returns:	String with the first level domain; None on failure.
Return type:	str or None
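Why a fix_protocol option is useful can be seen with urllib.parse alone: without a scheme, urlsplit treats the host as part of the path. The sketch below covers only the missing-protocol case. ensure_scheme is a hypothetical helper, and the set of prefixes it leaves untouched is an assumption rather than the library's documented rule.

```python
from urllib.parse import urlsplit


def ensure_scheme(url: str) -> str:
    """Hypothetical helper: prepend https:// when no usable scheme is present."""
    # Assumption: "//host", "http://" and "https://" prefixes are left alone.
    if url.startswith(("//", "http://", "https://")):
        return url
    return f"https://{url}"


# Without a scheme the host ends up in the path, so hostname is None.
print(urlsplit("www.example.com/path").hostname)                 # None
print(urlsplit(ensure_scheme("www.example.com/path")).hostname)  # www.example.com
```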

tld.utils.get_tld(url: Union[str, urllib.parse.SplitResult], fail_silently: bool = False, as_object: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = None) → Union[str, tld.result.Result, None][source]¶

Extract the top level domain.

Extract the top level domain based on Mozilla’s effective TLD names dat file. Returns a string by default. May raise TldBadUrl or TldDomainNotFound if a bad URL is provided or no TLD match is found, respectively.

Parameters:	url (str | SplitResult) – URL to get the top level domain from.
	fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
	as_object (bool) – If set to True, a `tld.utils.Result` object is returned, with `domain`, `suffix` and `tld` properties.
	fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).
	search_public (bool) – If set to True, search in public domains.
	search_private (bool) – If set to True, search in private domains.
	parser_class (Type[BaseTLDSourceParser]) – TLD source parser class to use.
Returns:	String with the top level domain (if `as_object` is set to False) or a `tld.utils.Result` object (if `as_object` is set to True); None on failure.
Return type:	str, tld.result.Result or None

tld.utils.get_tld_names(fail_silently: bool = False, retry_count: int = 0, parser_class: Type[tld.base.BaseTLDSourceParser] = None) → Dict[str, tld.trie.Trie][source]¶

Build the TLD names container if it is empty. Recursive.

Parameters:	fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
	retry_count (int) – If greater than 1, an exception is raised in order to avoid infinite loops.
	parser_class (BaseTLDSourceParser) – TLD source parser class to use.
Returns:	Container of TLD names, keyed by local TLD file path.
Return type:	Dict[str, tld.trie.Trie]
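The role of retry_count as a recursion guard can be sketched as follows. Everything here is hypothetical (load_names, _refresh, the in-memory _CACHE, and the use of OSError); it only illustrates the "load, refresh on a miss, retry, give up after more than one retry" pattern that the parameter description implies.

```python
from typing import Dict, List, Optional

_CACHE: Dict[str, List[str]] = {}  # hypothetical in-memory container


def _refresh(path: str) -> None:
    """Hypothetical stand-in for re-downloading the TLD file."""
    _CACHE[path] = ["com", "co.uk"]


def load_names(path: str, fail_silently: bool = False,
               retry_count: int = 0) -> Optional[List[str]]:
    # Guard: more than one retry means the refresh did not help.
    if retry_count > 1:
        if fail_silently:
            return None
        raise OSError(f"could not load TLD names from {path}")
    if path in _CACHE:
        return _CACHE[path]
    # Cache miss: refresh once, then recurse with an incremented counter.
    _refresh(path)
    return load_names(path, fail_silently, retry_count + 1)


print(load_names("res/names.txt"))  # ['com', 'co.uk']
```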

tld.utils.get_tld_names_container() → Dict[str, tld.trie.Trie][source]¶

Get container of all tld names.

Returns:
Return type:	dict

tld.utils.is_tld(value: Union[str, urllib.parse.SplitResult], search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = None) → bool[source]¶

Check whether the given value is a TLD.

Parameters:	value (str | SplitResult) – Value to check.
	search_public (bool) – If set to True, search in public domains.
	search_private (bool) – If set to True, search in private domains.
	parser_class (Type[BaseTLDSourceParser]) – TLD source parser class to use.
Returns:	True if the value is a TLD, False otherwise.
Return type:	bool

class tld.utils.MozillaTLDSourceParser[source]¶

Bases: tld.utils.BaseMozillaTLDSourceParser

Mozilla TLD source.

local_path = 'res/effective_tld_names.dat.txt'¶

source_url = 'https://publicsuffix.org/list/public_suffix_list.dat'¶

uid = 'mozilla'¶

class tld.utils.MozillaPublicOnlyTLDSourceParser[source]¶

Bases: tld.utils.BaseMozillaTLDSourceParser

Mozilla TLD source (public suffixes only).

include_private = False¶

local_path = 'res/effective_tld_names_public_only.dat.txt'¶

source_url = 'https://publicsuffix.org/list/public_suffix_list.dat?publiconly'¶

uid = 'mozilla_public_only'¶

tld.utils.parse_tld(url: Union[str, urllib.parse.SplitResult], fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = None) → Union[Tuple[None, None, None], Tuple[str, str, str]][source]¶

Parse TLD into parts.

Parameters:	url (str | SplitResult)
	fail_silently (bool)
	fix_protocol (bool)
	search_public (bool)
	search_private (bool)
	parser_class (Type[BaseTLDSourceParser])
Returns:	Tuple (tld, domain, subdomain)
Return type:	tuple
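Because the signature above allows a (None, None, None) tuple, callers may want to check the parts before combining them. A minimal sketch follows; Parts and first_level_domain are hypothetical names, and joining domain and tld with a dot is an assumption about how the parts relate.

```python
from typing import Optional, Tuple

# Shape of the documented return value: (tld, domain, subdomain).
Parts = Tuple[Optional[str], Optional[str], Optional[str]]


def first_level_domain(parts: Parts) -> Optional[str]:
    """Hypothetical helper: combine parsed parts, tolerating the failure tuple."""
    tld, domain, _subdomain = parts
    if tld is None or domain is None:
        return None
    return f"{domain}.{tld}"


print(first_level_domain(("co.uk", "google", "www")))  # google.co.uk
print(first_level_domain((None, None, None)))          # None
```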

tld.utils.pop_tld_names_container(tld_names_local_path: str) → None[source]¶

Remove TLD names container item.

Parameters:	tld_names_local_path –
Returns:

tld.utils.process_url(url: Union[str, urllib.parse.SplitResult], fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = <class 'tld.utils.MozillaTLDSourceParser'>) → Union[Tuple[List[str], int, urllib.parse.SplitResult], Tuple[None, None, urllib.parse.SplitResult]][source]¶

Process URL.

Parameters:	parser_class (Type[BaseTLDSourceParser])
	url (str | SplitResult)
	fail_silently (bool)
	fix_protocol (bool)
	search_public (bool)
	search_private (bool)
Returns:

tld.utils.reset_tld_names(tld_names_local_path: str = None) → None[source]¶

Reset tld_names to an empty value.

If tld_names_local_path is given, remove only the specified entry from tld_names instead.

Parameters:	tld_names_local_path (str) –
Returns:

class tld.utils.Result(tld: str, domain: str, subdomain: str, parsed_url: urllib.parse.SplitResult)[source]¶

Bases: object

Container.

domain¶

extension¶

Alias of tld.

Return type:	str

fld¶

First level domain.

Returns:
Return type:	str

parsed_url¶

subdomain¶

suffix¶

Alias of tld.

Return type:	str

tld¶

tld.utils.update_tld_names[source]¶

Update TLD names.

Parameters:	fail_silently – parser_uid –
Returns:

tld.utils.update_tld_names_cli() → int[source]¶

CLI wrapper for update_tld_names.

Since update_tld_names returns True on success, we need to negate the result to match CLI semantics.
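The negation mentioned above follows from the convention that a process exit code of 0 means success. A minimal sketch, where exit_code_from is a hypothetical helper and not part of the library:

```python
def exit_code_from(success: bool) -> int:
    """Hypothetical helper: map a True/False result to a process exit code."""
    # True (success) becomes 0, False (failure) becomes 1.
    return int(not success)


print(exit_code_from(True))   # 0
print(exit_code_from(False))  # 1
```

A value produced this way could be passed directly to sys.exit in a command-line entry point.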

tld.utils.update_tld_names_container(tld_names_local_path: str, trie_obj: tld.trie.Trie) → None[source]¶

Update TLD Names container item.

Parameters:	tld_names_local_path – trie_obj –
Returns:

Module contents¶

tld.get_fld(url: Union[str, urllib.parse.SplitResult], fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = None, **kwargs) → Optional[str][source]¶

Extract the first level domain.

Extract the first level domain based on Mozilla’s effective TLD names dat file. Returns a string. May raise TldBadUrl or TldDomainNotFound if a bad URL is provided or no TLD match is found, respectively.

Parameters:	url (str | SplitResult) – URL to get the first level domain from.
	fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
	fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).
	search_public (bool) – If set to True, search in public domains.
	search_private (bool) – If set to True, search in private domains.
	parser_class (Type[BaseTLDSourceParser]) – TLD source parser class to use.
Returns:	String with the first level domain; None on failure.
Return type:	str or None

tld.get_tld(url: Union[str, urllib.parse.SplitResult], fail_silently: bool = False, as_object: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = None) → Union[str, tld.result.Result, None][source]¶

Extract the top level domain.

Extract the top level domain based on Mozilla’s effective TLD names dat file. Returns a string by default. May raise TldBadUrl or TldDomainNotFound if a bad URL is provided or no TLD match is found, respectively.

Parameters:	url (str | SplitResult) – URL to get the top level domain from.
	fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
	as_object (bool) – If set to True, a `tld.utils.Result` object is returned, with `domain`, `suffix` and `tld` properties.
	fix_protocol (bool) – If set to True, a missing or wrong protocol is ignored (https is prepended instead).
	search_public (bool) – If set to True, search in public domains.
	search_private (bool) – If set to True, search in private domains.
	parser_class (Type[BaseTLDSourceParser]) – TLD source parser class to use.
Returns:	String with the top level domain (if `as_object` is set to False) or a `tld.utils.Result` object (if `as_object` is set to True); None on failure.
Return type:	str, tld.result.Result or None

tld.get_tld_names(fail_silently: bool = False, retry_count: int = 0, parser_class: Type[tld.base.BaseTLDSourceParser] = None) → Dict[str, tld.trie.Trie][source]¶

Build the TLD names container if it is empty. Recursive.

Parameters:	fail_silently (bool) – If set to True, no exceptions are raised and None is returned on failure.
	retry_count (int) – If greater than 1, an exception is raised in order to avoid infinite loops.
	parser_class (BaseTLDSourceParser) – TLD source parser class to use.
Returns:	Container of TLD names, keyed by local TLD file path.
Return type:	Dict[str, tld.trie.Trie]

tld.is_tld(value: Union[str, urllib.parse.SplitResult], search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = None) → bool[source]¶

Check whether the given value is a TLD.

Parameters:	value (str | SplitResult) – Value to check.
	search_public (bool) – If set to True, search in public domains.
	search_private (bool) – If set to True, search in private domains.
	parser_class (Type[BaseTLDSourceParser]) – TLD source parser class to use.
Returns:	True if the value is a TLD, False otherwise.
Return type:	bool

tld.parse_tld(url: Union[str, urllib.parse.SplitResult], fail_silently: bool = False, fix_protocol: bool = False, search_public: bool = True, search_private: bool = True, parser_class: Type[tld.base.BaseTLDSourceParser] = None) → Union[Tuple[None, None, None], Tuple[str, str, str]][source]¶

Parse TLD into parts.

Parameters:	url (str | SplitResult)
	fail_silently (bool)
	fix_protocol (bool)
	search_public (bool)
	search_private (bool)
	parser_class (Type[BaseTLDSourceParser])
Returns:	Tuple (tld, domain, subdomain)
Return type:	tuple

class tld.Result(tld: str, domain: str, subdomain: str, parsed_url: urllib.parse.SplitResult)[source]¶

Bases: object

Container.

domain¶

extension¶

Alias of tld.

Return type:	str

fld¶

First level domain.

Returns:
Return type:	str

parsed_url¶

subdomain¶

suffix¶

Alias of tld.

Return type:	str

tld¶

tld.update_tld_names[source]¶

Update TLD names.

Parameters:	fail_silently – parser_uid –
Returns: