FeaturizeIcoxxlist

class lobsterpy.featurize.core.FeaturizeIcoxxlist(path_to_icoxxlist, path_to_structure, bin_width=0.02, interactions_tol=0.001, max_length=6.0, min_length=0.0, normalization='formula_units', are_cobis=False, are_coops=False)

Bases: object

Class to featurize ICOXXLIST.lobster as a bond-weighted distribution function (BWDF).

Parameters:
  • path_to_icoxxlist (str | Path) – path to ICOXXLIST.lobster

  • path_to_structure (str | Path) – path to structure file (e.g., CONTCAR (preferred), POSCAR)

  • bin_width (float) – bin width for the BWDF

  • interactions_tol (float) – tolerance for interactions

  • max_length (float) – maximum bond length for BWDF computation

  • min_length (float) – minimum bond length for BWDF computation

  • normalization (Literal['formula_units', 'area', 'counts', 'none']) – normalization strategy for BWDF

  • are_cobis (bool) – bool indicating if file contains COBI/ICOBI data

  • are_coops (bool) – bool indicating if file contains COOP/ICOOP data
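As a minimal usage sketch of the constructor above: the helper `featurizer_kwargs` and the folder name `my_calc` are hypothetical, and the sketch assumes the calculation folder holds the usual LOBSTER output files (ICOHPLIST.lobster, ICOBILIST.lobster or ICOOPLIST.lobster) next to a CONTCAR. The keyword names and defaults are taken from the signature documented here.

```python
from pathlib import Path


def featurizer_kwargs(calc_dir, kind="icohp"):
    """Build keyword arguments for FeaturizeIcoxxlist (hypothetical helper)."""
    # Standard LOBSTER output file name for each interaction type.
    file_names = {
        "icohp": "ICOHPLIST.lobster",
        "icobi": "ICOBILIST.lobster",
        "icoop": "ICOOPLIST.lobster",
    }
    calc_dir = Path(calc_dir)
    return {
        "path_to_icoxxlist": calc_dir / file_names[kind],
        "path_to_structure": calc_dir / "CONTCAR",
        "bin_width": 0.02,
        "interactions_tol": 0.001,
        "max_length": 6.0,
        "min_length": 0.0,
        "normalization": "formula_units",
        "are_cobis": kind == "icobi",
        "are_coops": kind == "icoop",
    }


# "my_calc" is a placeholder folder name, not a real data set.
kwargs = featurizer_kwargs("my_calc", kind="icobi")

if kwargs["path_to_icoxxlist"].exists():
    # Only runs where lobsterpy and the LOBSTER output files are available.
    from lobsterpy.featurize.core import FeaturizeIcoxxlist

    featurizer = FeaturizeIcoxxlist(**kwargs)
    bwdf = featurizer.calc_bwdf()
    print(sorted(bwdf))  # atom-pair keys plus "summed", "centers", "edges", ...
```

Setting are_cobis or are_coops from the file type keeps the flags consistent with the file that is actually passed in.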

get_icoxx_neighbors_data(site_index=None)

Get the neighbors data with icoxx values for a structure.

Uses a distance-based neighbor list as the reference for mapping the neighbor data.

Parameters:

site_index (int | None) – index of the site for which neighbors data is returned. Default is None (all sites).

Returns:

Neighbors data as a dictionary with the following entries:

  • “ref_rdf_data”: radial distribution function (RDF) data

  • “input_icoxx_list”: complete ICOXXLIST.lobster data as a list of tuples

  • “mapped_icoxx_data”: ICOXX values mapped to RDF data

  • “missing_interactions”: list of interactions that are present in RDF data but not in ICOXX data

  • “wasserstein_dist_to_rdf”: Wasserstein distance computed between ref_rdf_data and mapped_icoxx_data

Return type:

dict
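To give a feel for what a Wasserstein distance between an RDF and mapped ICOXX data measures, here is a pure-Python sketch for two histograms on the same equal-width bins. This is an illustration only; the function `wasserstein_1d` is hypothetical, and the sketch does not claim to reproduce how the library computes “wasserstein_dist_to_rdf”. Weights must be non-negative, so negative ICOXX values (e.g. ICOHPs) would need to be passed as absolute values.

```python
def wasserstein_1d(hist_a, hist_b, bin_width):
    """First Wasserstein distance between two histograms on shared equal-width bins."""
    total_a, total_b = sum(hist_a), sum(hist_b)
    cdf_a = cdf_b = 0.0
    distance = 0.0
    for weight_a, weight_b in zip(hist_a, hist_b):
        # Accumulate the normalized cumulative distributions bin by bin.
        cdf_a += weight_a / total_a
        cdf_b += weight_b / total_b
        # The area between the two CDFs is the 1D Wasserstein distance.
        distance += abs(cdf_a - cdf_b) * bin_width
    return distance


# Hypothetical histograms: all weight sits two bins apart.
rdf_like = [1.0, 0.0, 0.0]
icoxx_like = [0.0, 0.0, 1.0]
shifted = wasserstein_1d(rdf_like, icoxx_like, bin_width=0.5)
identical = wasserstein_1d(rdf_like, rdf_like, bin_width=0.5)
```

A value of zero means both distributions place their weight at the same distances; larger values mean the weight has to be moved further along the distance axis to make them match.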

calc_bwdf()

Compute BWDF from ICOXXLIST.lobster data.

Returns:

BWDF as a dictionary for each atom pair and for the entire structure:

  • “A-B”: BWDF for atom pair A-B, e.g., “Na-Cl”: {“icoxx_binned”: np.array, “icoxx_counts”: np.array}

  • “summed”: BWDF for entire structure, e.g., “summed”: {“icoxx_binned”: np.array, “icoxx_counts”: np.array}

  • “centers”: bin centers for BWDF

  • “edges”: bin edges for BWDF

  • “bin_width”: bin width

  • “wasserstein_dist_to_rdf”: Wasserstein distance between RDF and ICOXX data

Return type:

dict
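The binning idea behind a BWDF can be sketched in plain Python: each bond contributes its ICOXX value to the distance bin that contains its bond length. The function `bin_bwdf` and the example bonds are hypothetical, the result uses lists rather than NumPy arrays, and normalization is deliberately not modelled, so this is a sketch of the concept rather than the library's implementation.

```python
def bin_bwdf(bonds, bin_width=0.02, min_length=0.0, max_length=6.0):
    """Bin (bond length, ICOXX) pairs into an unnormalized BWDF histogram."""
    n_bins = round((max_length - min_length) / bin_width)
    edges = [min_length + i * bin_width for i in range(n_bins + 1)]
    centers = [edge + bin_width / 2 for edge in edges[:-1]]
    icoxx_binned = [0.0] * n_bins
    icoxx_counts = [0] * n_bins
    for length, icoxx in bonds:
        if not (min_length <= length < max_length):
            continue  # bond lies outside the requested distance window
        index = min(int((length - min_length) / bin_width), n_bins - 1)
        icoxx_binned[index] += icoxx  # sum of ICOXX values in this bin
        icoxx_counts[index] += 1  # number of bonds in this bin
    return {
        "icoxx_binned": icoxx_binned,
        "icoxx_counts": icoxx_counts,
        "centers": centers,
        "edges": edges,
        "bin_width": bin_width,
    }


# Hypothetical bonds: (bond length, ICOXX value); the last one exceeds max_length.
example_bonds = [(0.6, -1.0), (0.7, -0.5), (1.6, -0.25), (2.5, -9.0)]
result = bin_bwdf(example_bonds, bin_width=0.5, min_length=0.0, max_length=2.0)
```

Keeping both the summed values and the bond counts per bin mirrors the “icoxx_binned” and “icoxx_counts” entries described in the return value.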

calc_site_bwdf(site_index)

Compute BWDF from ICOXXLIST.lobster data for a site.

Parameters:

site_index (int) – index of the site for which BWDF needs to be computed

Returns:

BWDF as a dictionary for the site in the following format:

  • “X”: BWDF for the site X, e.g., “0”: {“icoxx_binned”: np.array, “icoxx_counts”: np.array}

  • “centers”: bin centers for BWDF

  • “edges”: bin edges for BWDF

  • “bin_width”: bin width

  • “wasserstein_dist_to_rdf”: Wasserstein distance between RDF and ICOXX data

Return type:

dict
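A site-resolved BWDF restricts the binning to bonds that involve one site. The sketch below illustrates that filtering step on a hypothetical bond list of (site index A, site index B, bond length, ICOXX value) tuples; the function `site_bwdf` is not part of the library, and the upper distance cutoff is omitted for brevity.

```python
from collections import defaultdict


def site_bwdf(bonds, site_index, bin_width=0.02, min_length=0.0):
    """Sum ICOXX values per distance bin for bonds that involve one site."""
    binned = defaultdict(float)
    for site_a, site_b, length, icoxx in bonds:
        if site_index not in (site_a, site_b):
            continue  # bond does not touch the requested site
        bin_index = int((length - min_length) / bin_width)
        binned[bin_index] += icoxx
    return dict(binned)


# Hypothetical bonds: (site index A, site index B, bond length, ICOXX value).
example_bonds = [
    (0, 1, 2.25, -0.5),
    (0, 2, 2.25, -0.25),
    (1, 2, 3.75, -0.125),
]
site0 = site_bwdf(example_bonds, site_index=0, bin_width=0.5)
site2 = site_bwdf(example_bonds, site_index=2, bin_width=0.5)
```

Here the result maps a bin index to the summed ICOXX value, so empty bins simply do not appear.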

calc_label_bwdf(bond_label)

Compute BWDF from ICOXXLIST.lobster data for a bond label.

Parameters:

bond_label (str) – bond label for which BWDF needs to be computed

Returns:

BWDF as a dictionary for the bond label in the following format:

  • “X”: BWDF for the bond label, e.g., “20”: {“icoxx_binned”: np.array, “icoxx_counts”: np.array}

  • “centers”: bin centers for BWDF

  • “edges”: bin edges for BWDF

  • “bin_width”: bin width

  • “wasserstein_dist_to_rdf”: Wasserstein distance between RDF and ICOXX data

Return type:

dict

get_binned_bwdf_df(ids=None)

Return a pandas dataframe with computed BWDF features as columns.

Parameters:

ids (str | None) – set index name in the pandas dataframe. Default is None.

Returns:

A pandas dataframe object with BWDF as columns. Each column contains the sum of ICOXX values in the corresponding bin.

Return type:

DataFrame

get_site_df(site_index, ids=None)

Return a pandas dataframe with computed BWDF features for a site as columns.

Parameters:
  • site_index (int) – index of the site in a structure for which BWDF needs to be computed

  • ids (str | None) – set index name in the pandas dataframe. Default is None.

Returns:

A pandas dataframe object with BWDF as columns for a site. Each column contains the sum of ICOXX values in the corresponding bin.

Return type:

DataFrame

get_stats_df(ids=None)

Return a pandas dataframe with statistical information from BWDF as columns.

Parameters:

ids (str | None) – set index name in the pandas dataframe. Default is None.

Returns:

A pandas dataframe object with BWDF statistical information as columns. Columns include sum, mean, std, min, max, skew, kurtosis, weighted mean and weighted std.

Return type:

DataFrame
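Of the listed statistics, the weighted mean and weighted standard deviation are the least self-explanatory. One plausible reading, sketched below, weights each bin center by the absolute BWDF value in that bin; this weighting scheme and the function `weighted_mean_std` are assumptions made for illustration and are not confirmed by this page.

```python
import math


def weighted_mean_std(centers, values):
    """Distance mean and standard deviation weighted by absolute BWDF values."""
    weights = [abs(value) for value in values]
    total = sum(weights)
    mean = sum(w * c for w, c in zip(weights, centers)) / total
    variance = sum(w * (c - mean) ** 2 for w, c in zip(weights, centers)) / total
    return mean, math.sqrt(variance)


# Hypothetical bin centers and binned BWDF values.
centers = [1.0, 2.0, 3.0]
values = [-2.0, 0.0, -2.0]
mean, std = weighted_mean_std(centers, values)
```

With equal weight at 1.0 and 3.0 and none at 2.0, the weighted mean sits at 2.0 and the weighted standard deviation is 1.0.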

get_sorted_bwdf_df(ids=None)

Return a pandas dataframe with BWDF values sorted by distance in ascending order.

Parameters:

ids (str | None) – set index name in the pandas dataframe. Default is None.

Returns:

A pandas dataframe object with binned BWDF values sorted by distance.

Return type:

DataFrame

get_sorted_dist_df(ids=None, mode='negative')

Return a pandas dataframe with distances sorted by BWDF values. Only positive or only negative BWDF values are considered, and distances are sorted in descending order of the absolute BWDF value.

Parameters:
  • ids (str | None) – set index name in the pandas dataframe. Default is None.

  • mode (Literal['positive', 'negative']) – either “positive” or “negative”; defines whether BWDF values above or below zero are considered for distance featurization.

Returns:

A pandas dataframe object with binned distances sorted by BWDF values.

Return type:

DataFrame
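The ordering described for get_sorted_dist_df can be sketched on plain lists: keep only bins whose BWDF value has the requested sign, then order their distances by descending absolute value. The function `sorted_distances` and the example data are hypothetical, and the sketch returns a list rather than a dataframe.

```python
def sorted_distances(centers, values, mode="negative"):
    """Return bin centers ordered by descending |BWDF| for one sign of BWDF values."""
    if mode not in ("positive", "negative"):
        raise ValueError("mode must be 'positive' or 'negative'")
    if mode == "negative":
        pairs = [(c, v) for c, v in zip(centers, values) if v < 0]
    else:
        pairs = [(c, v) for c, v in zip(centers, values) if v > 0]
    # The largest absolute BWDF value comes first.
    pairs.sort(key=lambda pair: abs(pair[1]), reverse=True)
    return [center for center, _ in pairs]


# Hypothetical bin centers and binned BWDF values.
centers = [1.0, 1.5, 2.0, 2.5]
values = [-0.5, 0.2, -2.0, 0.0]
negative_first = sorted_distances(centers, values, mode="negative")
positive_first = sorted_distances(centers, values, mode="positive")
```

Bins with a value of exactly zero are dropped in both modes, because they are neither above nor below zero.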