One of the things I like most from being a computational chemist, is the possibility to automate tasks by using scripts.
There is usually no problem for automating almost anything, but unfortunately, there are a few tricky to automate tasks, for example, when it comes to generate atomic/molecular figures.
Let’s think that we have to calculate the energies of a lot of different molecules, and after doing that, we certainly need to check the results. We can easily “grep” the total energy (or any other numerical/ASCII value) from the output files but usually, the only way to take a look to the geometry of the structures is by going trough all the outputs and opening them with a graphic molecular viewing program (such as molden, vmd, xmakemol, jmol, …). This task can be quite painful, specially for the geek people who does not like wasting their energy using the mouse if it is not completely necessary. Continue reading
I already shared my now script intended for a small “in home” computing cluster. This time, I am sharing two other variations of that script, designed for the queuing systems of two different supercomputers: Mare Nostrum at BSC, Juropa at JSC and HLRN.
Both of them are the predecessors of the newer now script, and they only have the very basic features, but if you need an script that works right out of the box for any of the supercomputers mentioned above, this is the quick solution.
When running a cell relaxation, it is sometimes nice to monitor what is going on with the cell parameters. If you run the cell relaxation with Quantum Espresso, you can use my python script “pw2cellvec” to parse all the information you need.
So, first of all, run a “vc-relax” calcuculation with a large (about 2 times bigger) cutoff for both wavefunction (ecutwfc) and charge (ecutrho) and wait for a couple of iterations. These calculations are very expensive, due to the large cutoffs.
Then, just run the script and you will get something like this:
user> pw2cellvec.py -o pwout
AA a= 6.02104498011 b= 6.02109150665 c= 9.45688483193 Vol.= 296.919819153
au a= 11.3781260524 b= 11.3782139748 c= 17.8709224124 c/a= 1.57063846279
alpha= 89.9989981833 beta= 90.0012313781 gamma= 119.99694776
- - Continue reading
In order to check the Homo/Lumo gap from a Quantum Espresso (pw.x) calculation it is possible to plot the Density Of States and estimate the distance of empty states around the Fermi level (E=0), but what about if we only want a numerical (accurate) value?
This can be done very quick with “pw2gap” script, which can be downloaded from here.
It goes trough all the eigenvalues at the end of the output, including the case of spin-polarization (spin up/down) and k-points, and it simply prints the energies of the HOMO and LUMO orbitals as well as, of course, the difference between them. As simple as that.
It looks like this (for the example of the wustite):
user> pw2gap.py -o pwout
HOMO= 9.4503 LUMO= 11.5898 eV
H/L GAP = 2.1395
Once, I was making some DOS figures with quantum espresso by using the sumpdos.x program by Andrea Ferretti (included in the QE package), but I felt that I am missing some features. Thereby, I wrote my own script, which does something similar, but it is much more automatized, it is faster to use, and it even produces ready-to-publish graphs automatically.
It is part of the quantum espresso distribution, so if you have the source code of quantum espresso, you have it in the …/*espresso*/PP/tools/sum_states.py. If you don’t have it, or you just want to read the source-code online, you can check for sum_states.py in the quantum espresso repository. Continue reading
The Denstiy Of States (DOS) and the Inverse Partitipation Ratio (IPR) are two interesting properties for understanding the electronic structure of a system.
The DOS is just a histogram counting the amount of states (molecular orbitals/wavefunctions) per energy unit, and analyzing these distributions, we can understand better the electronic behavior of our system.
The IPR is a way to analyze the “amount of localization” of these states, so that the larger the value of IPR the higher the localization around an specific covalent bond.
We used these two properties in an article published in Phys. Rev. B, named “Polymorphism in phase-change materials: melt-quenched and as-deposited amorphous structures in Ge2Sb2Te5 from“, which you can check for more information. Continue reading
Performing job monitoring tasks while running jobs on a cluster/supercomputer can be done by several tools (such as qstat, showstart, …). Unfortunately the most of these standard tools are made by computer scientist for computer scientists, but now, there is an alternative: now.
At some point in my life, I started to get sick of qstats, greps, awks, … because it takes a couple of seconds every time one has to write them. If we multiply these seconds by the number of times I need them and by the number of computational members in my group, we get enough time to prepare a couple of new inputs, or write some post/comments in my blog.
So, I wrote my own job monitoring/visualization script, based on some ideas of my friend Iñaki during my time in the theoretical chemistry group in Donostia. That script, initially did nothing but execute these programs and display ONLY the information I need, and ALL the information I need. This information includes: Continue reading
Sometimes, specially in small computational groups, it is usual that the people uses the workstations for running calculations, and in case of a “trusted” network, were all the colleagues can log-in to each other’s computers, it is sometimes also the case, that everybody calculates everywhere. But at the moment of submitting a calculation, if your own PC is already busy, how do I find a computer in my network, which is not running anything at the moment?
You could ssh to each host and run <it>top</it> or something like that, but that is really slow when you have a lot of computers, and you have vs. to repeat the task very often.
The solution: “checkpc”
checkpc is a python script which can scan a whole network of computers and return the load of all PCs within it in less than 3 seconds. Continue reading
When we perform molecular dynamics (MD) simulations, sometimes we want to store a frame every time step, in order to increase the quality of the statistics (i.e. when calculating radial distribution functions). But on the other hand, due to their big size, these trajectories might be very difficult to handle by visualization programs (VMD, jmol, …), because they usually load the hole file into memory, and storing every frame (or each few frames) in a long MD with a lot of atoms can produce a trajectory file of several MB or even GB.
The solution: shrinking the big trajectory into a smaller one with my “shrink_traj” script.
This script takes a trajectory file (in xyz format) and copies each nth frame to another file, resulting on a smaller and easier to handle file specially suitable for visualization.
A typical problem in the life of a computational scientist: your calculations take longer than the wall time in the supercomputer. If the job can be restarted, we can work around that by re-sending it manually but that is quite tedious sometimes.
My solution: use the “resend” BASH script 😉 (which you can download here)
As in the case of the most of my scripts, there is a “-h” option. This helps if you don’t remember the syntax, this option will remind you about the few possibilities.
You need the job script you submit to the queue, and the input(s). The script keeps checking the qstat for the current user, and searching for the path from where the resend was executed. Whenever there is no job running or in queue with the same path, it submits another one. If there is a job with the same path, it waits for one minute and tries again.
The resend script is as easy to use as:
user> nohup resend -n 5 -f /path/to/script.cmd &
were “-n” option specifies the number of times the job will be resent (default: 3) and “-f” the path to the batch job script (default: ./job.cmd). It is useful to include the last option in order to be able identify the process (by i.e. “ps”) if necessary.
For killing the script, you can use:
this will break the internal loop and exit the script.
From the same directory, or just search the PID and kill it.
Another trick: if you already have executed the program, but you notice that you would like to run the job some more times, you can resend it again and the job will be sent as many times as the sum of resend’s both scripts request.