In talking with the Hyak IT people, I stumbled on a useful ability: running top (Activity Monitor, for you Mac people) on execute nodes. It’s super simple. You just ssh in to the execute node from your mox login node.
For example, I’m currently running PBJelly on execute node n2185. You can find this information via the scontrol show job JOBID command.
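If you'd rather not eyeball the scontrol output, here is a minimal sketch of pulling the node name out of it. The job_nodelist helper name is made up for illustration, and it assumes the output contains a NodeList=... field, which scontrol show job prints for a running job.

```shell
# Hypothetical helper (not part of Slurm): read `scontrol show job` output
# on stdin and print the value of the NodeList= field.
job_nodelist() {
    # Put every whitespace-separated KEY=VALUE pair on its own line, then keep
    # only the pair that starts with NodeList= (this skips ReqNodeList= and
    # ExcNodeList=, which start with different text).
    tr -s ' \t' '\n' | sed -n 's/^NodeList=//p'
}

# Usage on the login node (JOBID is a placeholder for your job number):
#   scontrol show job JOBID | job_nodelist
```

For a single-node job this prints just the node name. Multi-node jobs report a range such as n[2185-2186], which would need expanding before it could be used as a hostname.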
From there, on your mox login node, you just ssh via ssh NODE (e.g. ssh n2185), with no user credentials. This gives you shell access to the execute node while things are running. Also, it should give us direct access to the local node scratch directory, which shouldn’t have file number limitations like the
This also revealed some disappointing CPU usage for PBJelly. It’s essentially running single-threaded, even though it was told not to. Going to have to try parallel-sql or GNU Parallel next time.
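For a quick number instead of watching top interactively, something like the sketch below could work. The sum_cpu helper is a made-up name, and it assumes the node has a ps that accepts -u and -o pcpu,comm (as procps on Linux does). Note that ps reports %CPU averaged over each process's lifetime, not the instantaneous value top shows.

```shell
# Hypothetical helper: read `ps -o pcpu,comm` output on stdin and print the
# total of the %CPU column (first column), skipping the header line.
sum_cpu() {
    awk 'NR > 1 { total += $1 } END { printf "%.1f\n", total }'
}

# Usage from the login node (n2185 is the node from the example above):
#   ssh n2185 ps -u "$USER" -o pcpu,comm | sum_cpu
```

A total hovering around 100 on a node with many cores would be consistent with a job that is effectively using one core.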
Also, got a time extension on our PBJelly run (these have to be granted by people with admin privileges) up to the scheduled Hyak downtime. Fingers crossed it will finish.