FeaturizeIcoxxlist
- class lobsterpy.featurize.core.FeaturizeIcoxxlist(path_to_icoxxlist, path_to_structure, bin_width=0.02, interactions_tol=0.001, max_length=6.0, min_length=0.0, normalization='formula_units', are_cobis=False, are_coops=False)
Bases: object
Class to featurize ICOXXLIST.lobster as a bond-weighted distribution function (BWDF).
- Parameters:
path_to_icoxxlist (str | Path) – path to ICOXXLIST.lobster
path_to_structure (str | Path) – path to structure file (e.g., CONTCAR (preferred), POSCAR)
bin_width (float) – bin width for the BWDF
interactions_tol (float) – tolerance for interactions
max_length (float) – maximum bond length for BWDF computation
min_length (float) – minimum bond length for BWDF computation
normalization (Literal['formula_units', 'area', 'counts', 'none']) – normalization strategy for BWDF
are_cobis (bool) – bool indicating if file contains COBI/ICOBI data
are_coops (bool) – bool indicating if file contains COOP/ICOOP data
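A minimal usage sketch, assuming lobsterpy is installed and that the two paths passed in point to real LOBSTER output and structure files. The keyword arguments mirror the signature above. The helper name `featurize_bwdf` and the index label `"my_compound"` are made up for illustration and are not part of lobsterpy.

```python
from __future__ import annotations

from pathlib import Path


def featurize_bwdf(path_to_icoxxlist: str | Path, path_to_structure: str | Path):
    """Build a FeaturizeIcoxxlist object and return its binned BWDF dataframe.

    Illustrative helper only; it is not part of lobsterpy.
    """
    # Imported lazily so this block can be loaded without lobsterpy installed.
    from lobsterpy.featurize.core import FeaturizeIcoxxlist

    featurizer = FeaturizeIcoxxlist(
        path_to_icoxxlist=path_to_icoxxlist,
        path_to_structure=path_to_structure,
        bin_width=0.02,  # default values taken from the signature above
        max_length=6.0,
        min_length=0.0,
        normalization="formula_units",
        are_cobis=False,  # True if the file contains COBI/ICOBI data
        are_coops=False,  # True if the file contains COOP/ICOOP data
    )
    # "my_compound" is a placeholder index name for the resulting dataframe.
    return featurizer.get_binned_bwdf_df(ids="my_compound")
```

Calling `featurize_bwdf("ICOXXLIST.lobster", "CONTCAR")` with real files should return a one-row dataframe of binned BWDF features.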
- get_icoxx_neighbors_data(site_index=None)
Get the neighbors data with ICOXX values for a structure.
Uses a distance-based neighbor list as the reference for mapping the neighbor data.
- Parameters:
site_index (int | None) – index of the site for which neighbors data is returned. Default is None (all sites).
- Returns:
Neighbors data as a dictionary with the following keys:
"ref_rdf_data": radial distribution function (RDF) data
"input_icoxx_list": complete ICOXXLIST.lobster data as a list of tuples
"mapped_icoxx_data": ICOXX values mapped to RDF data
"missing_interactions": list of interactions that are present in the RDF data but not in the ICOXX data
"wasserstein_dist_to_rdf": Wasserstein distance computed between ref_rdf_data and mapped_icoxx_data
- Return type:
dict
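To illustrate what "missing_interactions" means, the sketch below compares a distance-based reference list with an ICOXX list and returns the reference entries that have no counterpart. The tuple layout `(site_i, site_j, length)`, the rounding tolerance, and the function name are assumptions made for this example; the matching that lobsterpy performs internally may differ.

```python
def find_missing_interactions(ref_pairs, icoxx_pairs, decimals=3):
    """Return reference interactions that have no counterpart in the ICOXX list.

    Each interaction is a hypothetical (site_i, site_j, length) tuple.
    """

    def key(pair):
        i, j, length = pair
        # Treat (i, j) and (j, i) as the same bond and round the bond length.
        return (min(i, j), max(i, j), round(length, decimals))

    seen = {key(pair) for pair in icoxx_pairs}
    return [pair for pair in ref_pairs if key(pair) not in seen]


# Three neighbor pairs found by distance, only two of them listed with ICOXX values.
ref = [(0, 1, 2.814), (0, 2, 3.980), (1, 2, 2.814)]
icoxx = [(1, 0, 2.814), (2, 1, 2.814)]
missing = find_missing_interactions(ref, icoxx)
print(missing)
```

Here the pair between sites 0 and 2 is the only one reported as missing.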
- calc_bwdf()
Compute BWDF from ICOXXLIST.lobster data.
- Returns:
BWDF as a dictionary for each atom pair and for the entire structure:
"A-B": BWDF for atom pair A-B, e.g., "Na-Cl": {"icoxx_binned": np.array, "icoxx_counts": np.array}
"summed": BWDF for the entire structure, e.g., "summed": {"icoxx_binned": np.array, "icoxx_counts": np.array}
"centers": bin centers for BWDF
"edges": bin edges for BWDF
"bin_width": bin width
"wasserstein_dist_to_rdf": Wasserstein distance between RDF and ICOXX data
- Return type:
dict
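The idea behind the returned dictionary can be sketched in plain Python: bonds are sorted into length bins, and the ICOXX values in each bin are summed per atom pair and for the whole structure. This is an unnormalized toy version that uses lists instead of NumPy arrays. The input layout `(element_a, element_b, length, icoxx)`, the function name, and the assumption that the length range is a whole multiple of the bin width are illustrative choices, not lobsterpy's implementation.

```python
def bin_icoxx(bonds, min_length=0.0, max_length=6.0, bin_width=0.02):
    """Sum ICOXX values per bond-length bin for each atom pair and overall."""
    # Assumes (max_length - min_length) is a whole multiple of bin_width.
    n_bins = round((max_length - min_length) / bin_width)
    edges = [min_length + k * bin_width for k in range(n_bins + 1)]
    centers = [(lo + hi) / 2 for lo, hi in zip(edges, edges[1:])]

    def empty():
        return {"icoxx_binned": [0.0] * n_bins, "icoxx_counts": [0] * n_bins}

    result = {"summed": empty()}
    for elem_a, elem_b, length, icoxx in bonds:
        if not (min_length <= length <= max_length):
            continue  # bond lies outside the requested length window
        # A bond sitting exactly on max_length goes into the last bin.
        idx = min(int((length - min_length) / bin_width), n_bins - 1)
        for key in (f"{elem_a}-{elem_b}", "summed"):
            entry = result.setdefault(key, empty())
            entry["icoxx_binned"][idx] += icoxx
            entry["icoxx_counts"][idx] += 1
    result.update(centers=centers, edges=edges, bin_width=bin_width)
    return result


# Made-up bonds: (element A, element B, bond length, ICOXX value).
bonds = [("Na", "Cl", 2.8, -0.5), ("Na", "Cl", 2.9, -0.25), ("Na", "Na", 3.9, -0.125)]
bwdf = bin_icoxx(bonds, min_length=2.0, max_length=4.0, bin_width=0.5)
print(bwdf["Na-Cl"]["icoxx_binned"], bwdf["summed"]["icoxx_binned"])
```

With a bin width of 0.5 both Na-Cl bonds fall into the bin from 2.5 to 3.0, so their ICOXX values are added together there.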
- calc_site_bwdf(site_index)
Compute BWDF from ICOXXLIST.lobster data for a site.
- Parameters:
site_index (int) – index of the site for which BWDF needs to be computed
- Returns:
BWDF as a dictionary for the site in the following format:
"X": BWDF for the site X, e.g., "0": {"icoxx_binned": np.array, "icoxx_counts": np.array}
"centers": bin centers for BWDF
"edges": bin edges for BWDF
"bin_width": bin width
"wasserstein_dist_to_rdf": Wasserstein distance between RDF and ICOXX data
- Return type:
dict
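For readers unfamiliar with the Wasserstein distance reported under "wasserstein_dist_to_rdf", the sketch below computes the first-order (earth mover's) distance between two histograms that share the same equally spaced bins. It assumes non-negative weights. How lobsterpy converts RDF and ICOXX data into such weights is not stated here, so this shows the metric itself and is not the library's code path.

```python
from itertools import accumulate


def wasserstein_1d(hist_u, hist_v, bin_width):
    """First-order Wasserstein distance between two histograms on the same bins."""
    total_u, total_v = sum(hist_u), sum(hist_v)
    if total_u <= 0 or total_v <= 0:
        raise ValueError("both histograms need a positive total weight")
    # Normalize each histogram to unit mass and build its cumulative distribution.
    cdf_u = accumulate(w / total_u for w in hist_u)
    cdf_v = accumulate(w / total_v for w in hist_v)
    # The distance is the area between the two step-shaped cumulative distributions.
    return bin_width * sum(abs(a - b) for a, b in zip(cdf_u, cdf_v))


# All mass moved by two bins of width 0.5, i.e. a distance of 1.0.
shifted = wasserstein_1d([1, 0, 0], [0, 0, 1], bin_width=0.5)
identical = wasserstein_1d([2, 2], [2, 2], bin_width=0.5)
print(shifted, identical)
```

A value of zero means the two distributions have the same shape; larger values mean more weight has to be moved over longer distances to turn one into the other.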
- calc_label_bwdf(bond_label)
Compute BWDF from ICOXXLIST.lobster data for a bond label.
- Parameters:
bond_label (str) – bond label for which BWDF needs to be computed
- Returns:
BWDF as a dictionary for the bond label in the following format:
"X": BWDF for the bond label X, e.g., "20": {"icoxx_binned": np.array, "icoxx_counts": np.array}
"centers": bin centers for BWDF
"edges": bin edges for BWDF
"bin_width": bin width
"wasserstein_dist_to_rdf": Wasserstein distance between RDF and ICOXX data
- Return type:
dict
- get_binned_bwdf_df(ids=None)
Return a pandas dataframe with computed BWDF features as columns.
- Parameters:
ids (str | None) – set index name in the pandas dataframe. Default is None.
- Returns:
A pandas dataframe object with BWDF as columns. Each column contains the sum of the ICOXX values that fall into the corresponding bin.
- Return type:
DataFrame
- get_site_df(site_index, ids=None)
Return a pandas dataframe with computed BWDF features for a site as columns.
- Parameters:
site_index (int) – index of the site in a structure for which BWDF needs to be computed
ids (str | None) – set index name in the pandas dataframe. Default is None.
- Returns:
A pandas dataframe object with BWDF as columns for a site. Each column contains the sum of the ICOXX values that fall into the corresponding bin.
- Return type:
DataFrame
- get_stats_df(ids=None)
Return a pandas dataframe with statistical information from BWDF as columns.
- Parameters:
ids (str | None) – set index name in the pandas dataframe. Default is None.
- Returns:
A pandas dataframe object with BWDF statistical information as columns. Columns include sum, mean, std, min, max, skew, kurtosis, weighted mean and weighted std.
- Return type:
DataFrame
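A rough sketch of how some of the listed statistics could be derived from a binned BWDF. Several choices here are assumptions and not taken from lobsterpy: the dictionary keys, the use of the population standard deviation, and the definition of the weighted mean and weighted std as bond-length statistics weighted by the absolute binned ICOXX values. Skew and kurtosis are left out because their conventions vary between libraries.

```python
import math
import statistics


def bwdf_stats(centers, icoxx_binned):
    """Summary statistics of a binned BWDF (illustrative definitions only)."""
    # Assumption: bin centers are weighted by the absolute binned ICOXX values.
    weights = [abs(value) for value in icoxx_binned]
    total = sum(weights)
    if total == 0:
        raise ValueError("BWDF is zero everywhere, weighted statistics are undefined")
    w_mean = sum(w * c for w, c in zip(weights, centers)) / total
    w_var = sum(w * (c - w_mean) ** 2 for w, c in zip(weights, centers)) / total
    return {
        "sum": sum(icoxx_binned),
        "mean": statistics.fmean(icoxx_binned),
        "std": statistics.pstdev(icoxx_binned),  # population standard deviation
        "min": min(icoxx_binned),
        "max": max(icoxx_binned),
        "w_mean": w_mean,
        "w_std": math.sqrt(w_var),
    }


stats = bwdf_stats(centers=[2.0, 3.0, 4.0], icoxx_binned=[-3.0, 0.0, -1.0])
print(stats["sum"], stats["w_mean"], stats["w_std"])
```

In this toy input three quarters of the bonding weight sits at 2.0, so the weighted mean distance of 2.5 lies closer to 2.0 than to 4.0.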
- get_sorted_bwdf_df(ids=None)
Return a pandas dataframe with BWDF values sorted by distances, ascending.
- Parameters:
ids (str | None) – set index name in the pandas dataframe. Default is None.
- Returns:
A pandas dataframe object with binned BWDF values sorted by distance.
- Return type:
DataFrame
- get_sorted_dist_df(ids=None, mode='negative')
Return a pandas dataframe with distances ordered by their BWDF values. Only positive or only negative BWDF values are considered (see mode), and the distances are sorted in descending order of the absolute BWDF value.
- Parameters:
ids (str | None) – set index name in the pandas dataframe. Default is None.
mode (Literal['positive', 'negative']) – either "positive" or "negative"; selects whether BWDF values above or below zero are considered for distance featurization.
- Returns:
A pandas dataframe object with binned distances sorted by BWDF values.
- Return type:
DataFrame
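The ordering described for this method can be illustrated with a few lines of plain Python: keep only the bins whose BWDF value has the requested sign, then list their distances from the largest to the smallest absolute BWDF value. The function name and the list-based inputs are assumptions for this sketch; the dataframe layout that lobsterpy returns is not reproduced.

```python
def sort_distances_by_bwdf(centers, icoxx_binned, mode="negative"):
    """Return bin centers ordered by descending absolute BWDF value of one sign."""
    if mode not in ("positive", "negative"):
        raise ValueError('mode must be "positive" or "negative"')
    sign = 1 if mode == "positive" else -1
    # Keep only bins whose BWDF value is strictly above (or below) zero.
    selected = [(value, center) for value, center in zip(icoxx_binned, centers) if sign * value > 0]
    selected.sort(key=lambda item: abs(item[0]), reverse=True)
    return [center for _, center in selected]


centers = [2.25, 2.75, 3.25, 3.75]
binned = [0.1, -0.75, 0.0, -0.125]
negative_first = sort_distances_by_bwdf(centers, binned, mode="negative")
positive_first = sort_distances_by_bwdf(centers, binned, mode="positive")
print(negative_first, positive_first)
```

With mode set to "negative" the distance 2.75 comes first because its bin holds the most negative value, and the empty bin at 3.25 is dropped in both modes.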