Pyscript Usage
Pyscript is a ChemML feature that lets a user run custom Python snippets inside a workflow and obtain output as they would when writing code directly. It covers tasks beyond those available in the GUI, for example inserting print statements or running specific Python scripts on output datasets.
This tutorial shows how to print specific information using Pyscript.
As an example, we take our existing dataset of crystal structures (available in the ChemML library) and generate their inorganic descriptors. We then concatenate the resulting dataframes and display the shape of the concatenated dataframe using Pyscript.
A dedicated template for this workflow is available under the heading “Inorganic Descriptors:” in the Template workflow tile of the GUI. For details on file formats, inputs, and outputs, we recommend viewing each clickable task block and its Parameters and Input/Output tiles.
[1]:
from chemml.wrapper.notebook import ChemMLNotebook
ui = ChemMLNotebook()
The computation graph will be displayed here:
The ChemML Wrapper's config file has been successfully saved ...
config file path: inorganic_crystal.txt
current directory: /mnt/c/Aatish/UB/Mr. Hachmann/master_chemml_wrapper_v2/chemml/docs
what's next? run the ChemML Wrapper using the config file with the following codes:
>>> from chemml.wrapper.engine import run
>>> run(INPUT_FILE = 'path_to_the_config_file', OUTPUT_DIRECTORY = 'CMLWrapper_out')
... you can also create a python script of the above codes and run it on any cluster that ChemML is installed.
The workflow gives a precise representation of all the intermediate steps, the blocks used to develop the model, the saved data, and the inputs/outputs of each block. Once the workflow is finalized, we save the input script under our desired file name in .txt format.
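The output above suggests wrapping the run call in a Python script so that it can be executed on a cluster. A minimal sketch of such a script follows. The `run` import and its `INPUT_FILE` and `OUTPUT_DIRECTORY` arguments are taken from the message above; the helper names `parse_args` and `main` and the `--output-dir` option are hypothetical choices for illustration, not part of ChemML.

```python
import argparse


def parse_args(argv=None):
    """Parse the config file path and output directory (hypothetical CLI)."""
    parser = argparse.ArgumentParser(
        description="Run a saved ChemML Wrapper config file."
    )
    parser.add_argument("config", help="path to the config file saved from the GUI")
    parser.add_argument(
        "--output-dir",
        default="CMLWrapper_out",
        help="directory where the wrapper writes its results",
    )
    return parser.parse_args(argv)


def main(argv=None):
    args = parse_args(argv)
    # Imported inside the function so that argument parsing works even on a
    # machine where ChemML is not installed.
    from chemml.wrapper.engine import run

    run(INPUT_FILE=args.config, OUTPUT_DIRECTORY=args.output_dir)
```

Saved to a file whose last line calls `main()`, the script could be submitted to a cluster job with the config file path as its argument.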
Note: In this case, we specify the output directory as ‘inorganic_pyscript’.
[2]:
from chemml.wrapper.engine import run
run(INPUT_FILE = '/mnt/c/Aatish/UB/Mr. Hachmann/master_chemml_wrapper_v2/chemml/docs/inorganic_crystal.txt', OUTPUT_DIRECTORY = 'inorganic_pyscript')
=================================================
=================================================
Fri May 7 09:38:06 2021
parsing the input file: /mnt/c/Aatish/UB/Mr. Hachmann/master_chemml_wrapper_v2/chemml/docs/inorganic_crystal.txt ...
=================================================
======= block#1: (chemml, load_crystal_structures)
| run ...
| ... done!
| execution time: 9.24s (0h 0m 9.24s)
=======
======= block#2: (chemml, CoordinationNumberAttributeGenerator)
| run ...
| ... done!
| execution time: 36.34s (0h 0m 36.34s)
=======
======= block#3: (chemml, CoulombMatrixAttributeGenerator)
| run ...
| ... done!
| execution time: 0.12s (0h 0m 0.12s)
=======
======= block#6: (chemml, EffectiveCoordinationNumberAttributeGenerator)
| run ...
| ... done!
| execution time: 14.02s (0h 0m 14.02s)
=======
======= block#4: (pandas, concat)
| run ...
| ... done!
| execution time: 0.01s (0h 0m 0.01s)
=======
======= block#5: (chemml, PyScript)
| run ...
shape of features: (18, 38)
| ... done!
| execution time: 0.00s (0h 0m 0.00s)
=======
Total execution time: 59.73s (0h 0m 59.73s)
2021-05-07 09:39:06
The tasks assigned through Pyscript are executed and their output is displayed. In this case, we chose to print the shape of the concatenated features, which appears in the output under “block#5”.
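For readers who want to see the equivalent operation in plain Python, the sketch below mimics the concat and print steps (block#4 and block#5) with pandas alone. The three small dataframes and their column names are made-up stand-ins rather than real ChemML descriptor output, and the column-wise concatenation (`axis=1`) is an assumption inferred from the reported feature shape.

```python
import pandas as pd

# Hypothetical stand-ins for three descriptor tables, one row per crystal.
# These values are illustrative only and are not ChemML output.
coordination = pd.DataFrame({"cn_mean": [4.0, 6.0, 8.0]})
coulomb = pd.DataFrame({"cm_0": [0.1, 0.2, 0.3], "cm_1": [1.1, 1.2, 1.3]})
effective_cn = pd.DataFrame({"ecn_mean": [3.9, 5.8, 7.7]})

# Join the tables side by side so each row keeps all of its descriptors
# (assumed axis; the tutorial does not state it explicitly).
features = pd.concat([coordination, coulomb, effective_cn], axis=1)

# The same kind of statement a Pyscript block could use to report the result.
print("shape of features:", features.shape)  # prints: shape of features: (3, 4)
```

With three rows and four columns in total, the printed shape is `(3, 4)`; in the tutorial run the same print statement reports `(18, 38)`.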