Advanced Jobflow Tutorial#

The tutorial has been written by J. George (Federal Institute for Materials Research and Testing in Germany in Berlin, Friedrich Schiller University). Suggestions and corrections by A. Naik, C. Ertural, and A. S. Rosen have been included.

Jobflow#

We will use jobflow to write and execute workflows on our local machines. One can also start these computations on supercomputers and convert the flows from jobflow to Workflows that can be executed with Fireworks. This allows for parallel execution of jobs that are in the same workflow.

We recommend doing the basic jobflow tutorials Five-minute Quickstart, Introdution to jobflow and Defining jobs in jobflow before starting with this tutorial.

The tutorial will cover the development of a workflow to compute the equation of state and of a workflow to compute vibrational properties (phonons). It aims at the computational materials science community rather than chemistry but the workflow development should not be very different. The tutorial will assume background in solid-state chemistry and physics.

Install All Relevant Software#

We start by installing all relevant software. We will use jobflow but also two typical software from computational materials science. ASE and pymatgen. pymatgen works especially well together with jobflow as both have been developed in the context of the Materials Project.

%%capture
!pip install jobflow[strict]
%%capture
!pip install ase
%%capture
!pip install pymatgen

Workflow to Compute the Equation of State (e.g., to Compute a Bulk Modulus)#

In this first tutorial, we will build flows and jobs to automatize the computation of the equation of state. This tutorial has been inspired by the equation of state tutorial from ASE: https://wiki.fysik.dtu.dk/ase/tutorials/eos/eos.html.

Please also check out this nice paper that compares different equations of state: K. Latimer, S. Dwaraknath, K. Mathew, D. Winston, K. A. Persson, npj Comput Mater 2018, 4, 40. You can use the different equations of state with the help of pymatgen.

How Does the Computation Work Without Workflow?#

We will start by looking at how we would compute the results without workflow. To avoid needing large computing resources, we will choose a very fast calculator from ase (i.e., an object to compute energies, forces and stresses from ab initio theories or force fields for a given structure) and small unit cells during the computations. The exact method “Effective Medium Theory” will be introduced below.

Let’s import Maker:

We will start by constructing a simple unit cell with the Atoms object in ASE: Ag in fcc structure type. The Atoms object in ASE is used to represent the structure. It can also be directly connected to electronic energies, forces and stresses when a calculator is added.

from ase.build import bulk

atoms = bulk("Ag", "fcc", 3.9)
print(atoms)
Atoms(symbols='Ag', pbc=True, cell=[[0.0, 1.95, 1.95], [1.95, 0.0, 1.95], [1.95, 1.95, 0.0]])

It is possible to visualize the structure.

from ase.visualize.plot import plot_atoms
plot_atoms(atoms)
<Axes: >
_images/d106bb63a214987d5e13c80137c83bf7e30e7778d23f879c27201d997398579d.png

Besides using an ASE Atoms object, we will later on also use Structure objects from pymatgen. They are very handy as they can be easily converted to json which is used by jobflow to encode job outputs. There are conversion tools from the Atoms object to the Structure object available in pymatgen. Let’s create a Structure object from an Atoms object. This converter will only convert the structural information but not information on the computations that can also be stored in the Atoms object.

from pymatgen.io.ase import AseAtomsAdaptor

structure = AseAtomsAdaptor().get_structure(atoms)
print(structure)
print(structure.as_dict())
Full Formula (Ag1)
Reduced Formula: Ag
abc   :   2.757716   2.757716   2.757716
angles:  60.000000  60.000000  60.000000
pbc   :       True       True       True
Sites (1)
  #  SP      a    b    c
---  ----  ---  ---  ---
  0  Ag      0    0    0
{'@module': 'pymatgen.core.structure', '@class': 'Structure', 'charge': 0, 'lattice': {'matrix': [[0.0, 1.95, 1.95], [1.95, 0.0, 1.95], [1.95, 1.95, 0.0]], 'pbc': (True, True, True), 'a': 2.7577164466275352, 'b': 2.7577164466275352, 'c': 2.7577164466275352, 'alpha': 59.99999999999999, 'beta': 59.99999999999999, 'gamma': 59.99999999999999, 'volume': 14.829749999999999}, 'sites': [{'species': [{'element': 'Ag', 'occu': 1}], 'abc': [0.0, 0.0, 0.0], 'xyz': [0.0, 0.0, 0.0], 'label': 'Ag', 'properties': {}}]}

Effective Medium Theory (EMT)#

To calculate energies, forces and stresses, we will use effective-medium theory as described in K. W. Jacobsen, J. K. Norskov, M. J. Puska, Phys. Rev. B 1987, 35, 7423–7442. We mainly use this approximation to be able to perform the computations without high-performance computers and still have reasonable results for simple solids such as Ag. The method is based on an ansatz that assumes that the total electron density can be expressed as a sum of the electron densities of individual atoms. The electron densities of the individual atoms are calculated separately assuming that each atom is embedded in a homogeneous electron gas. It’s not necessary to understand this approximation in detail to perform the exercises. Typically, our workflows in computational materials science use DFT (e.g., as implemented in VASP).

Let’s add the EMT calculator to the Atoms object.

from ase.calculators.emt import EMT

atoms.calc = EMT()

Structure Optimization#

We need a structural optimization for the equation of state workflow. To do so, we will use a standard optimizer from ASE (BFGS) here to perform the optimization. We add an additional Filter, the ExpCellFilter, to also optimize the cell parameters and not only the atomic positions.

from ase.optimize import BFGS
from ase.constraints import ExpCellFilter

print(atoms)
opt = BFGS(ExpCellFilter(atoms))
opt.run(fmax=0.00001)
print(atoms)
Atoms(symbols='Ag', pbc=True, cell=[[0.0, 1.95, 1.95], [1.95, 0.0, 1.95], [1.95, 1.95, 0.0]], calculator=EMT(...))
      Step     Time          Energy         fmax
BFGS:    0 07:33:12        0.088477        1.5173
BFGS:    1 07:33:12        0.018361        0.6601
BFGS:    2 07:33:12       -0.000016        0.0864
BFGS:    3 07:33:12       -0.000365        0.0064
BFGS:    4 07:33:12       -0.000367        0.0001
BFGS:    5 07:33:12       -0.000367        0.0000
Atoms(symbols='Ag', pbc=True, cell=[[-2.298712018099952e-16, 2.0317762972959157, 2.0317762972959117], [2.0317762972959135, 2.3373590085817006e-15, 2.0317762972959117], [2.031776297295913, 2.0317762972959126, 1.2589072243282675e-16]], calculator=EMT(...))

We have printed the Atoms object before and after the optimization. As you can see, the cell parameters have changed. The atomic positions will typically not change due to the high symmetry of the chosen structure.

Next, we are printing the results for energies, forces and stresses.

print(atoms.calc.results)
{'energy': -0.00036686253946172087, 'energies': array([-0.00036686]), 'free_energy': -0.00036686253946172087, 'forces': array([[0., 0., 0.]]), 'stress': array([-3.44910465e-09, -3.44910586e-09, -3.44910471e-09,  5.07608054e-16,
        1.58738221e-17, -7.53100562e-17])}

Computing Energies and Volumes for the Equation of State#

After this full optimization of the structure, we will now scale the cell parameters to get the energy-volume curve. At each of the scaled structures, we will perform an optimization with constant volume and evaluate the energy of the structures at the different volumes. We start with current cell parameters after the full optimization:

cell = atoms.get_cell()
print(cell)
Cell([[-2.298712018099952e-16, 2.0317762972959157, 2.0317762972959117], [2.0317762972959135, 2.3373590085817006e-15, 2.0317762972959117], [2.031776297295913, 2.0317762972959126, 1.2589072243282675e-16]])

We will first test the scaling with a volume change of 1%:

new_cell = cell * 1.01
print(new_cell)
print(cell)
[[-2.32169914e-16  2.05209406e+00  2.05209406e+00]
 [ 2.05209406e+00  2.36073260e-15  2.05209406e+00]
 [ 2.05209406e+00  2.05209406e+00  1.27149630e-16]]
Cell([[-2.298712018099952e-16, 2.0317762972959157, 2.0317762972959117], [2.0317762972959135, 2.3373590085817006e-15, 2.0317762972959117], [2.031776297295913, 2.0317762972959126, 1.2589072243282675e-16]])

Now, we will perform the full energy volume-curve computation. The optimizations that we will perform will now be at constant volume. After the optimization, we are collecting the volumes and energies.

import numpy as np
from ase.io.trajectory import Trajectory
from copy import deepcopy

cell = atoms.get_cell()

volumes = []
energies = []
for x in [0.95, 0.97, 0.99, 1.00, 1.01, 1.03, 1.05]:
    optimized_atoms = deepcopy(atoms)
    optimized_atoms.set_cell(cell * x, scale_atoms=True)
    opt = BFGS(ExpCellFilter(optimized_atoms, constant_volume=True), logfile="./optimization.log")
    opt.run(fmax=0.00001)
    volumes.append(optimized_atoms.get_volume())
    energies.append(optimized_atoms.get_potential_energy())

print(volumes)
print(energies)
[14.382304526892272, 15.309918086569299, 16.27658340882234, 16.774812103096398, 17.283105684632318, 18.330290104980214, 19.418941860846967]
[0.14168852968516887, 0.04714559046761835, 0.004531478880869244, -0.00036686253946172087, 0.004180145974302718, 0.03765776040276947, 0.0980053988235543]

Plotting and Evaluating the Equation of State#

We will now plot the equation of state:

from ase.units import kJ
from ase.eos import EquationOfState

eos = EquationOfState(volumes, energies)
v0, e0, B = eos.fit()
print(B, 'eV Angstrom**(-3)')
print(B / kJ * 1.0e24, 'GPa')
eos.plot('outputs/Ag-eos.png')
0.6242845498684262 eV Angstrom**(-3)
100.02141105258441 GPa
<Axes: title={'center': 'sj: E: -0.000 eV, V: 16.776 Å$^3$, B: 100.021 GPa'}, xlabel='volume [Å$^3$]', ylabel='energy [eV]'>
_images/5b3a77a766329a70c0d5777fd61a79526a4a3552d91de475ec293d3161d3685c.png

The result is acceptable for us. You can compare the result from EMT to DFT results here: https://materialsproject.org/materials/mp-124?chemsys=Ag#equations_of_state The volume from EMT is different by 1 Angstrom³ and Bulk modulus from eMT is off by a factor of 10 depending on the equation of state.

Structure of the Workflow#

We now have a workflow that starts with full structural optimization. After the structural optimization, structures with different volumes have to be created. Then, several structural optimizations with constant volume can be started. They are independent of each other and could run in parallel. At the very end, energies and volumes from these constant-volume optimizations are collected and the equation of state is fitted. An illustration of the workflow follows.

Workflow

Workflow Implementation (Equation of State)#

Jobflow implements Job objects that are connected to a Flow. We typically use the Maker class from Jobflow to create jobs and flows. There is also a tutorial on how to write a Maker. A Job is an entity that can be executed independently. We therefore have four different job types to write for the workflow above: a full structural optimization, a job to create structures with different volumes, a structural optimization with the constraint of constant volume, and a job that plots and summarizes the results for the database. As the full structural optimization and the structural optimization with the constraint of constant volume are so similar, we will write a combined class inheriting from Maker class to create these jobs.

Structural Optimization (Job)#

Let’s work on an atomic job to optimize the structure. We will use a Structure object to carry the structural information as it can be easily transformed into json. We need the json format as we will save our results in databases that build upon the json format. Serialization is more complicated for an Atoms object. Check out https://materialsproject.github.io/maggma/concepts to learn more.

To write an atomic job in jobflow, we have to inherit from the Maker class and adapt the make method. You can find more about jobs here: https://materialsproject.github.io/jobflow/tutorials/3-defining-jobs.html. Please don’t forget to add the decorators (e.g., @dataclass) as they are central for the execution of the workflow. The decorator @dataclass will add special methods to the class. See here for more info: https://docs.python.org/3/library/dataclasses.html

from jobflow.core.maker import Maker
/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/maggma/utils.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from tqdm.autonotebook import tqdm

We will write a job OptimizeJob which inherits from Maker and implement the make method. We add a name to the maker which will later on help us identify which job is running. The make method will take a structure as an input.

from dataclasses import dataclass
from jobflow import job, Maker
from pymatgen.core.structure import Structure


@dataclass
class OptimizeJob(Maker):
    """
    Class to carry out a full optimization of a structure with the EMT calculator

    Parameters
    ----------
    name
        Name of the job.
    """
    name: str = "EMT-Optimization"

    @job
    def make(self, structure: Structure):
        pass

In the next step, we simply add steps to perform the optimization. Thus, we take the same classes, methods and functions as before:

from dataclasses import dataclass
from jobflow import job
from pymatgen.core.structure import Structure


@dataclass
class OptimizeJob(Maker):
    """
    Class to carry out a full optimization of a structure with the EMT calculator

    Parameters
    ----------
    name
        Name of the job.
    """
    name: str = "EMT-Optimization"

    @job
    def make(self, structure: Structure):
        adaptor = AseAtomsAdaptor()
        atoms = adaptor.get_atoms(structure)  # get an atoms object
        atoms.calc = EMT()  # add emt as a calculator

        opt = BFGS(ExpCellFilter(atoms), logfile="./optimization.log")  # optimize the structure including cell parameters
        opt.run(fmax=0.00001)  # run the optimization

We have now built a Maker to optimize the structure. We might want to add fmax as a parameter of our Maker to change the default values. We will not do this for all potential settings as it just takes a lot of time to implement… It’s possible to summarize such settings within one dict, e.g. called optimize_kwargs.

@dataclass
class OptimizeJob(Maker):
    """
    Class to carry out a full optimization of a structure with the EMT calculator

    Parameters
    ----------
    name
        Name of the job.
    fmax
        float that determines the forces after optimization
    """
    name: str = "EMT-Optimization"
    fmax: float = 0.00001

    @job
    def make(self, structure: Structure):
        adaptor = AseAtomsAdaptor()
        atoms = adaptor.get_atoms(structure)
        atoms.calc = EMT()
        opt = BFGS(ExpCellFilter(atoms), logfile="./optimization.log")
        opt.run(fmax=self.fmax)

We will add another parameter constant_volume to be able to do a constant volume optimization with the same maker.

To be able to connect different interdependent jobs, we also need to save our results (method should return the results we want to use further down the line) so that we can transfer our results between jobs. For now, it will be enough to save the optimized structure and the energy at the end. You can also save forces, stresses and so on… Typically, a schema based on Pydantic is developed to save the results. This is again just a lot of work and will not help us to understand the automation itself. It’s again just important that all outputs can be serialized to a json.

@dataclass
class OptimizeJob(Maker):
    """
    Class to carry out a full optimization of an atoms object

    Parameters
    ----------
    name
        Name of the job.
    fmax
        float that determines the forces after optimization
    constant_volume
        bool that if true will enforce an optimization with constant volume
    """
    name: str = "EMT-Optimization"
    fmax: float = 0.00001
    constant_volume: bool = False

    @job
    def make(self, structure: Structure):
        adaptor = AseAtomsAdaptor()
        atoms = adaptor.get_atoms(structure)
        atoms.calc = EMT()
        opt = BFGS(ExpCellFilter(atoms, constant_volume=self.constant_volume), logfile="./optimization.log")
        opt.run(fmax=self.fmax)

        out_structure = adaptor.get_structure(atoms)

        results = {}
        results["optimized_structure"] = out_structure
        results["input_structure"] = structure
        results["energy"] = atoms.get_potential_energy()
        return results

For test purposes, we will simply now chain two optimizations, put them in a Flow (i.e., the object for workflows) and see what happens. We also always have to define an output for the flow. Very often, this is the output of the last job in the flow. We will use run_locally to execute the jobs locally. atomate2 has tutorials on how to transfer such jobs to Fireworks or how to run them with a job script.

from jobflow.managers.local import run_locally
from jobflow import Flow

atoms = bulk("Ag", "fcc", 4)
structure = AseAtomsAdaptor().get_structure(atoms)
maker_fullopt = OptimizeJob()
maker_opt_constant_vol = OptimizeJob(constant_volume=True)
job1 = maker_fullopt.make(structure=structure)
job2 = maker_opt_constant_vol.make(structure=job1.output["optimized_structure"])

flow = Flow([job1, job2], output=job2.output)

response = run_locally(flow)
print(response)
2023-06-05 07:33:13,510 INFO Started executing jobs locally
2023-06-05 07:33:13,659 INFO Starting job - EMT-Optimization (9f6083c2-f4cf-4206-bee8-71bb9337f670)
2023-06-05 07:33:13,745 INFO Finished job - EMT-Optimization (9f6083c2-f4cf-4206-bee8-71bb9337f670)
2023-06-05 07:33:13,746 INFO Starting job - EMT-Optimization (b0edc587-a76e-43a8-9681-a3bcf79530b4)
2023-06-05 07:33:13,765 INFO Finished job - EMT-Optimization (b0edc587-a76e-43a8-9681-a3bcf79530b4)
2023-06-05 07:33:13,766 INFO Finished executing jobs locally
{'9f6083c2-f4cf-4206-bee8-71bb9337f670': {1: Response(output={'optimized_structure': Structure Summary
Lattice
    abc : 2.87336550651202 2.8733655065120205 2.87336550651202
 angles : 59.99999999999999 60.00000000000001 59.99999999999999
 volume : 16.774810547282314
      A : -1.0240902968258463e-16 2.031776234482168 2.0317762344821686
      B : 2.0317762344821686 2.323574183069503e-17 2.0317762344821686
      C : 2.0317762344821686 2.031776234482168 -2.8552862920234554e-17
    pbc : True True True
PeriodicSite: Ag (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000], 'input_structure': Structure Summary
Lattice
    abc : 2.8284271247461903 2.8284271247461903 2.8284271247461903
 angles : 60.00000000000001 60.00000000000001 60.00000000000001
 volume : 16.0
      A : 0.0 2.0 2.0
      B : 2.0 0.0 2.0
      C : 2.0 2.0 0.0
    pbc : True True True
PeriodicSite: Ag (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000], 'energy': -0.000366862539422641}, detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False)}, 'b0edc587-a76e-43a8-9681-a3bcf79530b4': {1: Response(output={'optimized_structure': Structure Summary
Lattice
    abc : 2.87336550651202 2.8733655065120205 2.87336550651202
 angles : 59.99999999999999 60.00000000000001 59.99999999999999
 volume : 16.774810547282314
      A : -1.0240902968258463e-16 2.031776234482168 2.0317762344821686
      B : 2.0317762344821686 2.323574183069503e-17 2.0317762344821686
      C : 2.0317762344821686 2.031776234482168 -2.8552862920234554e-17
    pbc : True True True
PeriodicSite: Ag (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000], 'input_structure': Structure Summary
Lattice
    abc : 2.87336550651202 2.8733655065120205 2.87336550651202
 angles : 59.99999999999999 60.00000000000001 59.99999999999999
 volume : 16.774810547282314
      A : -1.0240902968258463e-16 2.031776234482168 2.0317762344821686
      B : 2.0317762344821686 2.323574183069503e-17 2.0317762344821686
      C : 2.0317762344821686 2.031776234482168 -2.8552862920234554e-17
    pbc : True True True
PeriodicSite: Ag (0.0000, 0.0000, 0.0000) [0.0000, 0.0000, 0.0000], 'energy': -0.000366862539422641}, detour=None, addition=None, replace=None, stored_data=None, stop_children=False, stop_jobflow=False)}}

As you can see, this works. Of course, this is not very useful here but could be useful, for example, for DFT optimizations, where typically at least two full structural optimizations have to be started due to basis set effects. Btw, you can also create folders for each atomic job in your workflow by using the parameter create_folders from run_locally.

Create Structures with Different Volumes and Start the Constant-Volume Optimizations (Job)#

We now need to build a job that creates structures with different volumes and starts the optimizations with constant volume.

We will start by using a job that is called get_ev_curve. It will replace itself with a Response object that will start a completely new flow or job. As outputs, we will collect all optimized structures and the corresponding energies! We will reuse our maker OptimizeJob from above.

from jobflow import Response, job


@job
def get_ev_curve(structure: Structure, volumes=[0.95, 0.97, 0.99, 1.00, 1.01, 1.03, 1.05]):
    groundstate_vol = structure.volume
    const_vol_maker = OptimizeJob(constant_volume=True)
    job_list = []
    outputs = {
        "optimized_structures": [],
        "energies": [],
    }
    for vol_inc in volumes:
        struct_here = deepcopy(structure)
        struct_here.scale_lattice(groundstate_vol * vol_inc)
        relax_job = const_vol_maker.make(struct_here)
        job_list.append(relax_job)
        outputs["optimized_structures"].append(relax_job.output["optimized_structure"])
        outputs["energies"].append(relax_job.output["energy"])
    return Response(replace=Flow(job_list, output=outputs))

We can now build a flow including the full optimization and the job to change the volumes and start the constant-volume optimizations. We need to transfer the optimized structure to the second job. Please remember that job1.output["optimized_structure"] is a reference to the future output. It can only be accessed if the job has been already executed. You cannot access this output while generating the flow for the first time.

atoms = bulk("Ag", "fcc", 4)
structure = AseAtomsAdaptor().get_structure(atoms)
maker_fullopt = OptimizeJob()
maker_opt_constant_vol = OptimizeJob(constant_volume=True)
job1 = maker_fullopt.make(structure=structure)
job2 = get_ev_curve(structure=job1.output["optimized_structure"])

flow = Flow([job1, job2], output=job2.output)

response = run_locally(flow)
2023-06-05 07:33:13,788 INFO Started executing jobs locally
2023-06-05 07:33:13,790 INFO Starting job - EMT-Optimization (7edf44ab-5252-410a-b771-95cfa1c3ec18)
2023-06-05 07:33:13,933 INFO Finished job - EMT-Optimization (7edf44ab-5252-410a-b771-95cfa1c3ec18)
2023-06-05 07:33:13,938 INFO Starting job - get_ev_curve (c4931155-ab22-4a7f-aa1a-3565ca23273c)
2023-06-05 07:33:13,953 INFO Finished job - get_ev_curve (c4931155-ab22-4a7f-aa1a-3565ca23273c)
2023-06-05 07:33:13,963 INFO Starting job - EMT-Optimization (b615b31c-6eca-455e-a07f-3a69626df173)
2023-06-05 07:33:14,004 INFO Finished job - EMT-Optimization (b615b31c-6eca-455e-a07f-3a69626df173)
2023-06-05 07:33:14,009 INFO Starting job - EMT-Optimization (d173a5ea-b987-4bc1-9054-790a72f312e9)
2023-06-05 07:33:14,046 INFO Finished job - EMT-Optimization (d173a5ea-b987-4bc1-9054-790a72f312e9)
2023-06-05 07:33:14,048 INFO Starting job - EMT-Optimization (41229ecd-5f81-4d6e-8739-0b7d0eaae89d)
2023-06-05 07:33:14,083 INFO Finished job - EMT-Optimization (41229ecd-5f81-4d6e-8739-0b7d0eaae89d)
2023-06-05 07:33:14,086 INFO Starting job - EMT-Optimization (031eba3c-48a0-421c-98e3-e447c40fead2)
2023-06-05 07:33:14,122 INFO Finished job - EMT-Optimization (031eba3c-48a0-421c-98e3-e447c40fead2)
2023-06-05 07:33:14,125 INFO Starting job - EMT-Optimization (ee98abbc-4d1d-4c7d-933b-36e2bc234487)
2023-06-05 07:33:14,162 INFO Finished job - EMT-Optimization (ee98abbc-4d1d-4c7d-933b-36e2bc234487)
2023-06-05 07:33:14,166 INFO Starting job - EMT-Optimization (ca7b29fb-8c0e-4446-84f1-4e6f5da5a511)
2023-06-05 07:33:14,201 INFO Finished job - EMT-Optimization (ca7b29fb-8c0e-4446-84f1-4e6f5da5a511)
2023-06-05 07:33:14,206 INFO Starting job - EMT-Optimization (9e7d21b0-e9aa-475a-aca4-791b9055130a)
2023-06-05 07:33:14,225 INFO Finished job - EMT-Optimization (9e7d21b0-e9aa-475a-aca4-791b9055130a)
2023-06-05 07:33:14,227 INFO Starting job - store_inputs (c4931155-ab22-4a7f-aa1a-3565ca23273c, 2)
2023-06-05 07:33:14,231 INFO Finished job - store_inputs (c4931155-ab22-4a7f-aa1a-3565ca23273c, 2)
2023-06-05 07:33:14,232 INFO Finished executing jobs locally

Summary Job#

We will now write a job get_results_ev_curve that will collect all results, make plots and store the information in a database. It will start from a list of structures and energies and then plot an energy-volume curve to a file and return the most important results.

@job
def get_results_ev_curve(list_structures, list_energies):
    list_volumes = []
    for struct in list_structures:
        list_volumes.append(struct.volume)

    eos = EquationOfState(list_volumes, list_energies)
    v0, e0, B = eos.fit()
    eos.plot('eos.png')

    results = {}
    results["V0"] = v0
    results["e0"] = e0
    results["B"] = B

    return Response(results)

Full Workflow#

Now, we combine all jobs to a larger workflow and check out the output. To do so, we will need a universally unique identifier (uuid). In our case, this will be te one of the last job. Again, we also need to transfer all information between the jobs correctly.

atoms = bulk("Ag", "fcc", 4)
structure = AseAtomsAdaptor().get_structure(atoms)
maker_fullopt = OptimizeJob()
job1 = maker_fullopt.make(structure=structure)
job2 = get_ev_curve(structure=job1.output["optimized_structure"])
job3 = get_results_ev_curve(list_structures=job2.output["optimized_structures"], list_energies=job2.output["energies"])
flow = Flow([job1, job2, job3], output=job3.output)

response = run_locally(flow)
print(response[flow.jobs[-1].uuid][1].output)
2023-06-05 07:33:14,254 INFO Started executing jobs locally
2023-06-05 07:33:14,257 INFO Starting job - EMT-Optimization (8368a494-f934-4723-aa95-c04d7d6c2620)
2023-06-05 07:33:14,369 INFO Finished job - EMT-Optimization (8368a494-f934-4723-aa95-c04d7d6c2620)
2023-06-05 07:33:14,371 INFO Starting job - get_ev_curve (b983d91a-8bc6-480e-8366-b6a9c0571190)
2023-06-05 07:33:14,382 INFO Finished job - get_ev_curve (b983d91a-8bc6-480e-8366-b6a9c0571190)
2023-06-05 07:33:14,388 INFO Starting job - EMT-Optimization (d2d8fad0-c134-42b7-b27b-2b39f51f376d)
2023-06-05 07:33:14,408 INFO Finished job - EMT-Optimization (d2d8fad0-c134-42b7-b27b-2b39f51f376d)
2023-06-05 07:33:14,409 INFO Starting job - EMT-Optimization (a63e1e86-b7af-48eb-9930-48e5c1ecf6f6)
2023-06-05 07:33:14,429 INFO Finished job - EMT-Optimization (a63e1e86-b7af-48eb-9930-48e5c1ecf6f6)
2023-06-05 07:33:14,430 INFO Starting job - EMT-Optimization (a7ef62fa-e3a0-4038-9ac9-e34c1e30e93c)
2023-06-05 07:33:14,450 INFO Finished job - EMT-Optimization (a7ef62fa-e3a0-4038-9ac9-e34c1e30e93c)
2023-06-05 07:33:14,451 INFO Starting job - EMT-Optimization (e87b4146-e1e7-4ea8-88c4-fd78b5777f00)
2023-06-05 07:33:14,471 INFO Finished job - EMT-Optimization (e87b4146-e1e7-4ea8-88c4-fd78b5777f00)
2023-06-05 07:33:14,472 INFO Starting job - EMT-Optimization (005b0c32-84e3-483c-8fce-f9b7cc7fee71)
2023-06-05 07:33:14,492 INFO Finished job - EMT-Optimization (005b0c32-84e3-483c-8fce-f9b7cc7fee71)
2023-06-05 07:33:14,493 INFO Starting job - EMT-Optimization (a25a0c07-d6f1-454d-b16e-fbbec6d7356d)
2023-06-05 07:33:14,524 INFO Finished job - EMT-Optimization (a25a0c07-d6f1-454d-b16e-fbbec6d7356d)
2023-06-05 07:33:14,529 INFO Starting job - EMT-Optimization (8a016b88-223d-4eea-9e8e-0b0e32471dfa)
2023-06-05 07:33:14,548 INFO Finished job - EMT-Optimization (8a016b88-223d-4eea-9e8e-0b0e32471dfa)
2023-06-05 07:33:14,550 INFO Starting job - store_inputs (b983d91a-8bc6-480e-8366-b6a9c0571190, 2)
2023-06-05 07:33:14,553 INFO Finished job - store_inputs (b983d91a-8bc6-480e-8366-b6a9c0571190, 2)
2023-06-05 07:33:14,555 INFO Starting job - get_results_ev_curve (13478f69-db15-437f-a59f-e5e4c3af719f)
2023-06-05 07:33:14,668 INFO Finished job - get_results_ev_curve (13478f69-db15-437f-a59f-e5e4c3af719f)
2023-06-05 07:33:14,670 INFO Finished executing jobs locally
{'V0': 16.7748202620472, 'e0': -0.00036673834964062735, 'B': 0.6250761058809655}
_images/a405f44f3286e2389ed38d6c88e1f4e8d572d2cab160ea362ef1253a24b8a5cb.png

It works. We get the same results!

Validation of the Results with Pydantic#

I will just show you now how to validate your data with a pydantic schema. This allows to describe your outputs and make sure that you also get all relevant data in the right format. Again, this is only one tiny example. In practice, you would add many more options! This is how such a schema can look in practice: materialsproject/atomate2

from pydantic import BaseModel, Field


class EOS_Schema(BaseModel):
    """Schema to store outputs from EV curve."""

    V0: float = Field("Volume in Angstrom**3")
    e0: float = Field("Energy in eV")
    B: float = Field(
        "Bulk modulus",
    )


@job(output_schema=EOS_Schema)
def get_results_ev_curve(list_structures, list_energies):
    list_volumes = []
    for struct in list_structures:
        list_volumes.append(struct.volume)

    eos = EquationOfState(list_volumes, list_energies)
    v0, e0, B = eos.fit()
    eos.plot('eos.png')

    results = {}
    results["V0"] = v0
    results["e0"] = e0
    results["B"] = B
    schema = EOS_Schema(**results)
    return Response(schema)

We will add this now to the workflow and run it again.

atoms = bulk("Ag", "fcc", 4)
structure = AseAtomsAdaptor().get_structure(atoms)
maker_fullopt = OptimizeJob()
job1 = maker_fullopt.make(structure=structure)
job2 = get_ev_curve(structure=job1.output["optimized_structure"])
job3 = get_results_ev_curve(list_structures=job2.output["optimized_structures"], list_energies=job2.output["energies"])
flow = Flow([job1, job2, job3], output=job3.output)

response = run_locally(flow)
print(response[flow.jobs[-1].uuid][1].output)
2023-06-05 07:33:14,858 INFO Started executing jobs locally
2023-06-05 07:33:14,860 INFO Starting job - EMT-Optimization (86cea8ef-1e00-4a09-b604-2238cb99bf8c)
2023-06-05 07:33:14,988 INFO Finished job - EMT-Optimization (86cea8ef-1e00-4a09-b604-2238cb99bf8c)
2023-06-05 07:33:14,989 INFO Starting job - get_ev_curve (dccd3cb7-652e-4b69-90b0-8d00dbb0c52c)
2023-06-05 07:33:14,998 INFO Finished job - get_ev_curve (dccd3cb7-652e-4b69-90b0-8d00dbb0c52c)
2023-06-05 07:33:15,002 INFO Starting job - EMT-Optimization (eda78823-82e9-4c2a-8367-def45da82b4b)
2023-06-05 07:33:15,021 INFO Finished job - EMT-Optimization (eda78823-82e9-4c2a-8367-def45da82b4b)
2023-06-05 07:33:15,022 INFO Starting job - EMT-Optimization (fb47478b-f189-495c-880d-b7b9849fc985)
2023-06-05 07:33:15,048 INFO Finished job - EMT-Optimization (fb47478b-f189-495c-880d-b7b9849fc985)
2023-06-05 07:33:15,051 INFO Starting job - EMT-Optimization (7c7455f5-c945-4e9b-b601-f2833680d6a1)
2023-06-05 07:33:15,080 INFO Finished job - EMT-Optimization (7c7455f5-c945-4e9b-b601-f2833680d6a1)
2023-06-05 07:33:15,080 INFO Starting job - EMT-Optimization (c7abcb7a-318c-42a4-95c0-1cc0ac22385c)
2023-06-05 07:33:15,099 INFO Finished job - EMT-Optimization (c7abcb7a-318c-42a4-95c0-1cc0ac22385c)
2023-06-05 07:33:15,100 INFO Starting job - EMT-Optimization (54c778cb-d65c-4e8d-8e74-33e29d231719)
2023-06-05 07:33:15,118 INFO Finished job - EMT-Optimization (54c778cb-d65c-4e8d-8e74-33e29d231719)
2023-06-05 07:33:15,119 INFO Starting job - EMT-Optimization (91553d98-46c1-46e7-b64d-679824ff6598)
2023-06-05 07:33:15,137 INFO Finished job - EMT-Optimization (91553d98-46c1-46e7-b64d-679824ff6598)
2023-06-05 07:33:15,138 INFO Starting job - EMT-Optimization (e2a0b1f1-0300-42c0-a580-ad871e3ce710)
2023-06-05 07:33:15,156 INFO Finished job - EMT-Optimization (e2a0b1f1-0300-42c0-a580-ad871e3ce710)
2023-06-05 07:33:15,156 INFO Starting job - store_inputs (dccd3cb7-652e-4b69-90b0-8d00dbb0c52c, 2)
2023-06-05 07:33:15,159 INFO Finished job - store_inputs (dccd3cb7-652e-4b69-90b0-8d00dbb0c52c, 2)
2023-06-05 07:33:15,160 INFO Starting job - get_results_ev_curve (ea4b7f1e-6f73-4508-9c58-2231e2980f0f)
2023-06-05 07:33:15,313 INFO Finished job - get_results_ev_curve (ea4b7f1e-6f73-4508-9c58-2231e2980f0f)
2023-06-05 07:33:15,314 INFO Finished executing jobs locally
V0=16.7748202620472 e0=-0.00036673834964062735 B=0.6250761058809655
_images/a405f44f3286e2389ed38d6c88e1f4e8d572d2cab160ea362ef1253a24b8a5cb.png

You have seen all basic functionalities to write workflows with jobflow.

Exercises#

  1. Please build a maker for the whole workflow! You can find an example here: materialsproject/atomate2

  2. Then, use the maker for different structures (e.g., bcc structure type) or elements (e.g., Al or other elements that work with EMT).

Workflow to Compute Phonons (Vibrational Properties of Crystals)#

Introduction#

We will now write a workflow to compute phonons with Phonopy. We will only compute phonon densities of states and not phonon band structures. It is possible to implement this. However, this is significantly more work and does not help you to learn about the automation itself. If you would like to try this, everything needed is implemented in pymatgen but this is really a more advanced task. Again, we will use the EMT calculator to speed up the calculations.

Install Phonopy#

We need to install phonopy.

%%capture
!pip install phonopy

Workflow Structure#

We will compute phonons with the finite differences approach which implemented in Phonopy. You can check out this reference in case you are interested: https://onlinelibrary.wiley.com/doi/10.1002/anie.200906780.

We will first optimize the crystal structure, then generate displaced structures and evaluate forces for the displaced structures. This will allow us to compute force constants for the whole unit cell. Those force constants can then be fourier-transformed to the dynamical matrix within Phonopy. This dynamical matrix will then be used to to get frequencies and eigenvalues describing the vibrations (phonons) in the crystal structure.

We can reuse the OptimizeJob Maker from before for the structural optimization. We need a job that creates the displaced structures and one that starts the evaluations of the forces. This implies that we need an additional static job that just returns forces based on the displaced structure. At the very end, we need a job to summarize the job, create plots and return the results to a database. We will not implement the last job fully.

workflow for phonons

Compute Forces of Displaced Structures#

Let’s build a static job first that computes forces for a structure. The structure is very similar to the OptimizeJob

@dataclass
class StaticJob(Maker):
    """
    Class to compute energy and forces for a structure

    Parameters
    ----------
    name
        Name of the job.
    """
    name: str = "EMT-StaticRun"

    @job
    def make(self, structure: Structure):
        adaptor = AseAtomsAdaptor()
        atoms = adaptor.get_atoms(structure)
        atoms.calc = EMT()
        results = {}
        results["input_structure"] = structure
        results["energy"] = atoms.get_potential_energy()
        results["force"] = atoms.get_forces()
        return results

Create Displaced Structures (Job)#

Now, we need to build jobs that will create displacements for the finite displacement method based on an optimized Structure, run all static runs and then collect the results.

Let’s start with the job that creates the displacements. It starts from an optimized structure and one can also define the size of the supercell matrix for the finite displacement approach:

from phonopy import Phonopy
from phonopy.units import VaspToTHz
from pymatgen.io.phonopy import (
    get_phonopy_structure,
    get_pmg_structure,
)


@job
def create_displacements(structure: Structure, supercell_matrix):
    factor = VaspToTHz  # we need this conversion factor. We will rely on eV
    cell = get_phonopy_structure(structure)  # this transforms the structure into a format that phonopy can use
    phonon = Phonopy(
        cell,
        supercell_matrix,  # we need to compute forces for displaced supercells of our initial structure
        factor=factor,
    )
    phonon.generate_displacements()  # phonopy will generate the displacements

    supercells = phonon.supercells_with_displacements  # these are all displaced structures

    displacements = []
    for icell in supercells:
        displacements.append(get_pmg_structure(icell))  # we will collect all structures in a pymatgen structure format

    return displacements

Start Forces Computations (Job)#

Now, we will generate static jobs for all generated structures. It’s important to remember again that the outputs contain references. You can store the output references in a list or dictionary, but you cannot perform a numerical operation on the outputs before the job has been executed. In the next job, we will be able to resolve the displaced output structures from the previous jobs, but cannot perform any numerical operation on the forces from the static jobs as these computations still have to be performed. This has to be done in an additional job.

@job
def run_displacements(displacements):
    static_maker = StaticJob()
    runs = []
    forces = []
    for displacement in displacements:
        displacement_run = static_maker.make(displacement)
        forces.append(displacement_run.output[
                          "force"])  # this is possible but you cannot transform the array to a numpy array yet as the static_maker jobs haven't been run yet!
        runs.append(displacement_run)

    output = {"forces": forces}  #we will collect all forces from the runs.

    return Response(replace=Flow(runs, output))

Plots and Summary (Job)#

Now, we will write a job that evaluates all results. We again need to recreate the correct phonopy settings. Then, we use the plotting tools from pymatgen to plot the phonon density of states.

from pymatgen.phonon.plotter import PhononDosPlotter
from pymatgen.io.phonopy import (
    get_ph_dos
)


@job
def summarize_phonopy_results(structure: Structure, forces, supercell_matrix, kpoints=[10, 10, 10]):
    forces_set = []
    for force in forces:
        forces_set.append(np.array(force))

    factor = VaspToTHz
    cell = get_phonopy_structure(structure)
    phonon = Phonopy(
        cell,
        supercell_matrix,
        factor=factor,
    )
    phonon.generate_displacements()
    phonon.produce_force_constants(forces=forces_set)
    phonon.save("phonopy.yaml")
    phonon.run_mesh(kpoints)
    phonon.run_total_dos()
    phonon.write_total_dos(filename="phonon_dos.yaml")
    dos = get_ph_dos("phonon_dos.yaml")
    new_plotter_dos = PhononDosPlotter()
    new_plotter_dos.add_dos(label="total", dos=dos)
    new_plotter_dos.save_plot(filename="phonon_dos.eps")
    return {"PhononDOS": dos}

Workflow#

Now we connect the jobs with a structural optimization.

from jobflow.managers.local import run_locally
from jobflow import Flow
from ase.build import bulk

from pymatgen.io.ase import AseAtomsAdaptor
from pymatgen.core.structure import Structure

atoms = bulk("Ag", "fcc", 3.9)
supercell_matrix = [[2, 0, 0], [0, 2, 0], [0, 0, 2]]
structure = AseAtomsAdaptor().get_structure(atoms)
maker_fullopt = OptimizeJob()
job1 = maker_fullopt.make(structure=structure)
job2 = create_displacements(structure=job1.output["optimized_structure"], supercell_matrix=supercell_matrix)
job3 = run_displacements(displacements=job2.output)
job4 = summarize_phonopy_results(structure=job1.output["optimized_structure"], forces=job3.output["forces"],
                                 supercell_matrix=supercell_matrix)
flow = Flow([job1, job2, job3, job4], output=job4.output)

response = run_locally(flow)

new_plotter_dos = PhononDosPlotter()
new_plotter_dos.add_dos(label="total", dos=response[flow.jobs[-1].uuid][1].output["PhononDOS"])
new_plotter_dos.show()
2023-06-05 07:33:17,938 INFO Started executing jobs locally
2023-06-05 07:33:17,941 INFO Starting job - EMT-Optimization (49ae69ce-1132-41f0-aa9f-930588f93906)
2023-06-05 07:33:18,051 INFO Finished job - EMT-Optimization (49ae69ce-1132-41f0-aa9f-930588f93906)
2023-06-05 07:33:18,057 INFO Starting job - create_displacements (cddc05c3-798f-460c-b6e5-82813f8041a3)
2023-06-05 07:33:18,189 INFO Finished job - create_displacements (cddc05c3-798f-460c-b6e5-82813f8041a3)
2023-06-05 07:33:18,190 INFO Starting job - run_displacements (a4aa4844-9e4d-4a67-be28-33cd039f3658)
2023-06-05 07:33:18,196 INFO Finished job - run_displacements (a4aa4844-9e4d-4a67-be28-33cd039f3658)
2023-06-05 07:33:18,198 INFO Starting job - EMT-StaticRun (8c705b7a-a2d6-4cb6-9458-dbb3a22cd79e)
2023-06-05 07:33:18,221 INFO Finished job - EMT-StaticRun (8c705b7a-a2d6-4cb6-9458-dbb3a22cd79e)
2023-06-05 07:33:18,222 INFO Starting job - store_inputs (a4aa4844-9e4d-4a67-be28-33cd039f3658, 2)
2023-06-05 07:33:18,223 INFO Finished job - store_inputs (a4aa4844-9e4d-4a67-be28-33cd039f3658, 2)
2023-06-05 07:33:18,224 INFO Starting job - summarize_phonopy_results (891a0156-72ec-4741-a84e-7ade5caaec52)
The PostScript backend does not support transparency; partially transparent artists will be rendered opaque.
2023-06-05 07:33:18,857 INFO Finished job - summarize_phonopy_results (891a0156-72ec-4741-a84e-7ade5caaec52)
2023-06-05 07:33:18,857 INFO Finished executing jobs locally
_images/920b659ab65c6cf08195bf4d82907a4e5c6cc0bd41ac2b9c73e613f3608b10af.png

If you are interested in a full phonon workflow based on DFT, please check out the atomate2 implementation: phonon workflow in atomate2

Exercise#

  1. Please write a maker for the whole flow

  2. Combine this new maker to a larger workflow. This new workflow will include three phonon runs. One will be performed at the fully optimized volume, one at a volume 1% larger, the other one at a volume 1% smaller than the fully optimized volume. This is a typical setting for computing Grüneisen parameters.