I ran into a problem with my SGE cluster. I have a number of Solaris 10 machines running under virtualization; all servers are configured identically and have the same environment. I installed the SGE master on one machine and SGE execution hosts on the others. Some execution hosts work fine, but on others I get a strange error from “install_execd”:
Checking hostname resolving
---------------------------
Cannot contact qmaster. The command failed:
./bin/sol-x86/qconf -sh
The error message was:
error: commlib error: access denied (client IP resolved to host name "". This is not identical to clients host name "")
ERROR: unable to contact qmaster using port 10500 on host "solaris-master.devnet.int.corp"

When I run “qconf -sh” I get:
bash-3.00# qconf -sh
error: commlib error: access denied (client IP resolved to host name "". This is not identical to clients host name "")
ERROR: unable to contact qmaster using port 10500 on host "solaris-master.devnet.int.corp"

I checked the connection: ping works, the hostname resolves, and a telnet connection to port 10500 succeeds. I then checked the connection from the master host and found no problems there either. I compared the environment on the execution hosts that work with the ones that fail, and they are identical; the master host configuration also showed nothing suspicious. I searched the web for something useful, with no results: some people have the same problem, but no one knows why it happens or how to fix it. Rebooting the execution hosts had no effect.
But when I ran “reboot” on the master host, it helped! So if you get the same errors with SGE, try rebooting your master host; it may help.
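For anyone who wants to repeat the name-resolution check before resorting to a reboot, here is a minimal sketch in shell. The error text complains that the client IP resolved to a host name different from the one the client reports, so the sketch resolves a name to an IP, resolves that IP back to a name, and compares the two. The function names are hypothetical, and it assumes that "getent hosts" is available and that its first output line is "IP name ...". In my case these checks looked fine and only the master reboot helped, so treat it as a first diagnostic step, not a fix.

```shell
#!/bin/bash
# Sketch only: function names are hypothetical helpers, not part of SGE.
# Assumes `getent hosts` prints "IP  canonical-name  [aliases...]".

# Print the first IP address the name service returns for a host name.
forward_lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# Print the canonical host name the name service returns for an IP address.
reverse_lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $2 }'
}

# Succeed only if both names are non-empty and equal, ignoring case.
same_name() {
    local a b
    a=$(printf '%s' "$1" | tr '[A-Z]' '[a-z]')
    b=$(printf '%s' "$2" | tr '[A-Z]' '[a-z]')
    [ -n "$a" ] && [ "$a" = "$b" ]
}

# Resolve name -> IP -> name and report whether the round trip agrees.
check_round_trip() {
    local name ip back
    name=$1
    ip=$(forward_lookup "$name")
    back=""
    # Only attempt the reverse lookup if the forward lookup returned something.
    if [ -n "$ip" ]; then
        back=$(reverse_lookup "$ip")
    fi
    if same_name "$name" "$back"; then
        echo "OK: $name -> $ip -> $back"
    else
        echo "MISMATCH: $name -> ${ip:-<nothing>} -> ${back:-<nothing>}"
        return 1
    fi
}
```

To use it, source the file on both the execution host and the master, then run check_round_trip "$(hostname)" and check_round_trip solaris-master.devnet.int.corp on each. A MISMATCH on either side points at /etc/hosts or DNS; if everything reports OK and qconf still fails, the reboot of the master host is the thing to try.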
P says:
Thanks! I had the same problem and this solved it 🙂
Alexey Bokov Reply:
August 20th, 2010 at 12:52 pm
You’re welcome 🙂