openlava Quick Test

Submitted on February 22, 2013 – 12:46 pm

After years of working with PBS and LSF, I ran into Jeff Layton’s “Share the Load” review of the openlava resource manager in the Feb 2013 issue of Admin Magazine, and nostalgia took over. So I built two CentOS 6.3 VMs and decided to give openlava a shot. To make a long story short: things look broken in the latest build of openlava. The version described in the article was 2.0-206.1.x86_64, and I installed the latest available from openlava.org, 2.0-209.2.x86_64. Doesn’t seem like a huge difference, but it is, as I found out.

First things first: I followed the instructions in the article to the letter, copying and pasting to be certain. Luckily, the article is available online. There were no issues during the installation. Everything went as outlined in the article until I tried submitting a test job. In his review of openlava, Layton uses the following syntax:

bsub -R "type=all" < test1.script

 “I used the option -R “type=all” because I have a compute node that is different from the master node. Consequently, I need to tell openlava that it can use any node type, even ones it doesn’t understand, for running the job.”

Apparently, a few minutes after the article was published, openlava developers decided to take the “type=all” option out. The syntax no longer works:

[openlava@lavatest01 ~]$ bsub -R "type=all" < test1.script

Bad resource requirement syntax. Job not submitted.

Attempting to submit the job without the “type=all” resource directive seemed to work:

[openlava@lavatest01 ~]$ bsub < test1.script

Job <322> is submitted to default queue <normal>.

However, the job sits in pending indefinitely. Checking on the detailed status reveals the reason:

[openlava@lavatest01 ~]$ bjobs -l

 Job <322>, User <openlava>, Project <default>, Status <PEND>, Queue <normal>, Command <test1.script>

Fri Feb 22 11:44:25: Submitted from host <lavatest01>, CWD <$HOME>;

 PENDING REASONS:

 Not the same type as the submission host: 1 host;

 Job slot limit reached;

The “not the same type” error is exactly the problem the “type=all” option was supposed to address.
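For anyone who wants to pull just the pending reasons out of a long “bjobs -l” listing, here is a minimal sketch. It assumes the output is shaped like the excerpt above: a “PENDING REASONS:” header followed by one reason per line, ending at the next all-caps section header, if any. The pending_reasons helper is a made-up name for illustration, not an openlava command, and the sample text is copied from the listing above.

```shell
#!/bin/sh
# Sketch only: pending_reasons is a hypothetical helper, not part of openlava.
# It reads `bjobs -l` style text on stdin and prints the pending reasons.
pending_reasons() {
    awk '
        # Start collecting after the section header.
        /PENDING REASONS:/ { found = 1; next }
        # Assumption: any later all-caps "SOMETHING:" line starts a new section.
        found && /^[ \t]*[A-Z][A-Z ]*:[ \t]*$/ { exit }
        # Print non-blank lines with leading whitespace stripped.
        found && NF { sub(/^[ \t]+/, ""); print }
    '
}

# Sample input copied from the bjobs -l output above.
sample_output=' Job <322>, User <openlava>, Project <default>, Status <PEND>, Queue <normal>, Command <test1.script>

Fri Feb 22 11:44:25: Submitted from host <lavatest01>, CWD <$HOME>;

 PENDING REASONS:

 Not the same type as the submission host: 1 host;

 Job slot limit reached;'

printf '%s\n' "$sample_output" | pending_reasons
```

On a live cluster the same helper would be fed from the real command, for example “bjobs -l 322 | pending_reasons”. As a side note, LSF documents “select[type==any]” as the resource requirement for running on any host type; I have not verified whether this openlava build accepts it.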

To be certain, I removed the only compute node from the cluster and enabled the head node to run jobs. I submitted another simple job to run a “find” command on a local filesystem. The job submitted without problems and, after showing up as “pending” for a few seconds, appeared in the active state:

Job <424> is submitted to default queue <normal>.

[openlava@lavatest01 scripts]$ bjobs

JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME

424     openlav PEND  normal     lavatest01                 test01.sh  Feb 22 11:52

[openlava@lavatest01 scripts]$ bjobs -l

Job <424>, User <openlava>, Project <default>, Status <RUN>, Queue <normal>, Command <test01.sh>

Fri Feb 22 11:52:27: Submitted from host <lavatest01>, CWD </opt/openlava/scripts>

Fri Feb 22 11:52:36: Started on <lavatest01>;

[openlava@lavatest01 scripts]$ bjobs

JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME

424     openlav RUN   normal     lavatest01     lavatest01     test01.sh  Feb 22 11:52

At this point I very much needed to see something work, but my celebration was short-lived. The job seemed to be “active”, but it wasn’t going anywhere. It wasn’t doing anything. No output file, no errors – just mysterious silence. The whole script takes a couple of seconds to run if executed manually, but two hours later it was still in the queue “running”:

[openlava@lavatest01 scripts]$ bjobs

JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME

526     openlav RUN   normal     lavatest01     lavatest1     test01.sh  Feb 22 11:58
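A job stuck in RUN like this is easy to miss when there are more than a handful in the queue. The sketch below lists the job ID and execution host of every RUN row in “bjobs” output, assuming the column layout shown above (STAT in the third column, EXEC_HOST in the sixth). The running_jobs helper is a hypothetical name, and the sample text is the listing above.

```shell
#!/bin/sh
# Sketch only: running_jobs is a hypothetical helper, not part of openlava.
# It reads the default `bjobs` table on stdin and prints "JOBID EXEC_HOST"
# for every job whose STAT column is RUN.
running_jobs() {
    # Assumption: RUN rows always have an EXEC_HOST, so it is field 6.
    # (PEND rows have an empty EXEC_HOST and are skipped by the STAT test.)
    awk '$3 == "RUN" { print $1, $6 }'
}

# Sample input copied from the bjobs output above.
sample_bjobs='JOBID   USER    STAT  QUEUE      FROM_HOST   EXEC_HOST   JOB_NAME   SUBMIT_TIME
526     openlav RUN   normal     lavatest01     lavatest1     test01.sh  Feb 22 11:58'

printf '%s\n' "$sample_bjobs" | running_jobs
```

On a live cluster this would be “bjobs | running_jobs”. From there, LSF’s usual follow-ups are “bpeek” to look at a running job’s output, “bhist -l” for its event history, and “bkill” to remove it; I am assuming the openlava build ships these as LSF does.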

I consulted the skimpy documentation on openlava.org and found nothing of help. I then joined the openlava-users group on Google to see if anyone was having the same issue. Unfortunately, there is not much activity there: more questions than answers, it would seem, and mostly about compiling openlava.

So my openlava test fell a bit short of expectations. I’ll probably stop by a bar on the way back from work to make up for this. At least it’s Friday.


5 Comments »

  • Antonio Arena says:

    Hi Igor!

    I had the same problem yesterday. So I called in the big guns, LSF, and LSF failed too. So I figured it’s a kernel problem. You have to use kernel 2.6.32-358.0.1.el6.x86_64 to get OL to work. I have it working on both CentOS 6.3 and CentOS 6.4.

    I also want to add that I have never seen -R “type=all” in all my years working with LSF. You can use something like this:

    lsrun -R – hostname

    so that it gets submitted on all resources. In this example you wouldn’t go through the queueing system. You can also use it with the bsub command.
