BatchIcoxxlistFeaturizer

class lobsterpy.featurize.batch.BatchIcoxxlistFeaturizer(path_to_lobster_calcs, normalization='formula_units', bin_width=0.02, bwdf_df_type='stats', sorted_dists_mode='negative', interactions_tol=0.001, max_length=6.0, min_length=0.0, read_icobis=False, read_icoops=False, n_jobs=4)

Bases: object

BatchFeaturizer to generate BWDF-derived features from ICOXXLIST.lobster data.

Parameters:
  • path_to_lobster_calcs (str | Path) – path to the root directory containing all LOBSTER calculations

  • max_length (float) – maximum bond length for BWDF computation

  • min_length (float) – minimum bond length for BWDF computation

  • normalization (Literal['formula_units', 'area', 'counts', 'none']) – normalization strategy for BWDF

  • bin_width (float) – bin width for BWDF

  • bwdf_df_type (Literal['binned', 'stats', 'sorted_bwdf', 'sorted_dists']) –

    Type of BWDF dataframe to generate

    • "binned": Binned BWDF function.

    • "stats": Statistical features of the BWDF function.

    • "sorted_bwdf": BWDF values sorted by distance, ascending.

    • "sorted_dists": Distances sorted by the absolute value of their BWDF values, descending, using either only positive or only negative BWDF values.

  • sorted_dists_mode (Literal['positive', 'negative']) – only applies if bwdf_df_type == "sorted_dists". Corresponds to the "mode" parameter of get_sorted_dist_df and defines whether BWDF values above or below zero are considered for distance featurization.

  • n_jobs (int) – number of parallel processes to run

  • interactions_tol (float)

  • read_icobis (bool) – whether to read ICOBILIST.lobster from the path

  • read_icoops (bool) – whether to read ICOOPLIST.lobster from the path
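To make the roles of min_length, max_length and bin_width concrete, the sketch below bins a few bonds by length and sums their ICOXX values per bin, which is the general idea behind a bond-weighted distribution function. It is a NumPy-only illustration under stated assumptions, not lobsterpy's implementation: the helper name binned_bwdf and the sample bond data are hypothetical, and no normalization is applied.

```python
import numpy as np


def binned_bwdf(lengths, icoxx, min_length=0.0, max_length=6.0, bin_width=0.02):
    """Sum ICOXX values into bond-length bins (illustrative helper, not lobsterpy API)."""
    lengths = np.asarray(lengths, dtype=float)
    icoxx = np.asarray(icoxx, dtype=float)

    # Number of bins spanning [min_length, max_length] with the requested width.
    n_bins = int(round((max_length - min_length) / bin_width))
    edges = np.linspace(min_length, max_length, n_bins + 1)

    # Each bond contributes its ICOXX value (the weight) to the bin of its length.
    hist, _ = np.histogram(lengths, bins=edges, weights=icoxx)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hist


# Made-up bonds: two short bonding interactions and one longer, weakly positive one.
centers, hist = binned_bwdf(
    lengths=[1.1, 1.2, 2.6],
    icoxx=[-2.0, -1.0, 0.5],
    min_length=0.0,
    max_length=3.0,
    bin_width=0.5,
)
```

A wider bin_width merges neighbouring bonds into the same bin, as happens here for the two short bonds, while bonds outside the min_length to max_length window are ignored by the histogram.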

get_df()

Generate a pandas DataFrame with BWDF features for all calculations.

Returns:

A pandas DataFrame with BWDF features as columns. Whether the features are binned, sorted or statistical depends on the "bwdf_df_type" parameter set when the class is initialized.

Return type:

DataFrame
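A possible way to call the featurizer is sketched below, using only the constructor arguments and the get_df() method documented on this page. The wrapper name build_bwdf_dataframe and the directory in the usage comment are hypothetical, and the expected layout of the calculation directories is not shown here, so treat this as a starting point rather than a verified recipe.

```python
from pathlib import Path


def build_bwdf_dataframe(root_dir, df_type="binned"):
    """Return the BWDF feature DataFrame for all calculations under root_dir.

    Hypothetical convenience wrapper; only the class name, keyword arguments
    and get_df() are taken from the documentation above.
    """
    # Imported inside the function so the module loads without lobsterpy installed.
    from lobsterpy.featurize.batch import BatchIcoxxlistFeaturizer

    featurizer = BatchIcoxxlistFeaturizer(
        path_to_lobster_calcs=Path(root_dir),
        bwdf_df_type=df_type,           # "binned", "stats", "sorted_bwdf" or "sorted_dists"
        normalization="formula_units",  # documented default
        bin_width=0.02,                 # documented default
        min_length=0.0,                 # documented default
        max_length=6.0,                 # documented default
        n_jobs=4,                       # documented default
    )
    return featurizer.get_df()


# Example call (placeholder directory, not a real path):
# df = build_bwdf_dataframe("path/to/lobster_calcs", df_type="stats")
```

Passing df_type="sorted_dists" would additionally make the sorted_dists_mode argument relevant; it is left at its default in this sketch.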