Posts tagged ‘qconf’

Sun Grid Engine’s top engineer Richard Hierlmeier wrote an article (along with some bash scripts that implement it; by the way, why not put them into your CVS?) about using SDM in a compute cloud. EC2 is the example there, and I suppose GoGrid could be used as well without too many changes. The article is Using SDM Cloud Adapter to Manage Solaris Zones.

I ran into a problem with my SGE cluster. I have a number of Solaris 10 machines running under virtualization; all of the servers are configured identically and have the same environment. I installed the SGE master on one machine and SGE execution hosts on the others. Some of the execution hosts work fine, but on others I get a strange error from “install_execd”:

Checking hostname resolving
---------------------------
Cannot contact qmaster. The command failed:
./bin/sol-x86/qconf -sh
The error message was:
error: commlib error: access denied (client IP resolved to host name “”. This is not identical to clients host name “”)
ERROR: unable to contact qmaster using port 10500 on host “solaris-master.devnet.int.corp”

When I ran “qconf -sh” by hand, I got the same error:

bash-3.00# qconf -sh
error: commlib error: access denied (client IP resolved to host name “”. This is not identical to clients host name “”)
ERROR: unable to contact qmaster using port 10500 on host “solaris-master.devnet.int.corp”

I checked connectivity: ping works, the hostname resolves, and a telnet connection to port 10500 succeeds. Then I checked the connection from the master host, and there were no problems there either. I compared the environment on the execution hosts that work with the ones that show the error, and it is the same; the master host configuration has nothing suspicious-looking in it either. I tried to find something useful on the web, with no results: some people have the same problem, but no one knows why it happens or how to fix it. Then I tried rebooting the execution hosts, with no effect.
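
Since the error message complains about the client IP resolving to a different host name, one way to double-check the “hostname resolves” step is a round trip: name to address, then address back to name. Below is a minimal sketch of that check. The function names are my own invention and not part of SGE, and it assumes “getent hosts” reflects the same name service that the qmaster uses, which may not hold on every setup.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for a forward/reverse lookup round trip (not part of SGE).

# Lower-case a host name and strip a trailing dot so names compare cleanly.
normalize_host() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/\.$//'
}

# Forward lookup: print the first address the name service returns for a name.
forward_lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Reverse lookup: print the primary name the name service returns for an address.
reverse_lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $2 }'
}

# Succeed only if name -> address -> name comes back to the same name.
check_round_trip() {
    local name addr back
    name=$(normalize_host "$1")
    addr=$(forward_lookup "$name")
    if [ -z "$addr" ]; then
        echo "no address found for $name"
        return 1
    fi
    back=$(normalize_host "$(reverse_lookup "$addr")")
    if [ "$back" != "$name" ]; then
        echo "mismatch: $name -> $addr -> ${back:-<nothing>}"
        return 1
    fi
    echo "ok: $name -> $addr -> $back"
}
```

To use it, source the file on both the master and the execution host and run, for example, “check_round_trip solaris-master.devnet.int.corp”, then repeat with the execution host's own name. A mismatch on either side would at least be consistent with the commlib message, though a clean result does not rule the problem out.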

But when I ran “reboot” on the master host, wow, it helped! So, guys, if you get the same errors with SGE, try rebooting your master host; it may help.