Posts tagged ‘Solaris’

If anyone is interested, there’s a new update for Sun Grid Engine 6.2: update 4. It is almost entirely bug fixes and man page changes; the list of changes is here. The CVS source tag is V62u4_TAG (it applies to Grid Engine, ARCo, and SGE Inspect). By the way, as far as I know Hedeby is still at 1.0u3.

Sun Grid Engine’s top engineer Richard Hierlmeier wrote an article (and some bash scripts that implement it; by the way, why not put them into your CVS?) about using SDM in a compute cloud (EC2 is the example here; I suppose GoGrid could be used too without many changes): Using SDM Cloud Adapter to Manage Solaris Zones.

Sun released a new version of Sun Grid Engine: 6.2 Update 3. What’s new:

Upd. Sun Studio 12 Update 1 is also available.

Stallman dismisses cloud computing as industry bluster. “It’s stupidity. It’s worse than stupidity: it’s a marketing hype campaign,” he said. Huh. I agree that anything containing the “cloud” keyword carries too much marketing (and the GoGrid guys are the very first for this ‘too much marketing’), but it’s not really stupid. Look at Amazon EC2: it’s a really great thing. For the last 7 years my work has involved clusters of various sizes, and for the last year my “server provider” has been Amazon, and I can say that Amazon is much more convenient than any company-owned datacenter. For me it’s a big deal that I can get 100 servers for 10 minutes and run some job on them. There’s too much marketing noise in the cloud industry, but it works, and it works almost fine.
P.S. Another point in favor of cloud computing: Steve Ballmer on defining the cloud.

Recently I found another strange thing in EC2: if I run “shutdown now” on an EC2 instance, AWS still treats the node as alive and marks it as “running” in the ec2-describe-instances host list. For those who want a simple way to shut down EC2 instances from a script, I recommend this code (I use it in our Convergence – Data-Aware routing on EC2 project):

ec2-describe-instances | grep `hostname` | awk "{ print \"ec2-terminate-instances \" \$2; }" >> /root/tools/

You can run it directly or schedule it via at:

at -f /root/tools/ "now + $TIMEOUT minutes"

Just don’t forget to replace $TIMEOUT with something useful 🙂
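A slightly more defensive variant of the same idea, as a sketch only: it looks at INSTANCE lines alone, so a hostname that shows up elsewhere in the output is ignored. The function name terminate_cmd_for_host is made up for this example, and it assumes (as the one-liner above does) that the instance ID is the second column of each INSTANCE line printed by ec2-describe-instances.

```shell
#!/bin/sh
# Hypothetical helper, not part of the EC2 API tools.
# Reads ec2-describe-instances output on stdin and prints the
# ec2-terminate-instances command for the instance whose line mentions
# the given host name.
# Assumption: field 2 of an INSTANCE line is the instance ID.
# Note: "awk -v" needs a POSIX awk (on Solaris use nawk or /usr/xpg4/bin/awk).
terminate_cmd_for_host() {
    awk -v host="$1" '
        $1 == "INSTANCE" && index($0, host) > 0 {
            print "ec2-terminate-instances " $2
        }'
}

# Typical use on the instance itself (commented out: needs the EC2 API tools):
# ec2-describe-instances | terminate_cmd_for_host "$(hostname)"
```

The printed command can then be saved to a script and scheduled with at, exactly as above.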

Below I try to summarize my experience with SGE and its use on various platforms (Solaris 10, Ubuntu, OpenSolaris, etc.). If you use Solaris, check out my Solaris – common questions and its differences from Linux: your problems may be with Solaris rather than with SGE.
So let’s go:

  • when installing SGE, after export SGE_ROOT=<my_sge_path> I try to run util/ and get a ‘can’t find script /util/arch’ error, as shown below:
    root@domU-12-31-39-03-CC-95:/opt/ge6.2u2_1# util/ $SGE_ROOT
    can’t find script /util/arch
    this error can be fixed by setting the SDM_DIST environment variable:
    export SDM_DIST=$SGE_ROOT
  • I got a commlib error:
    error: commlib error: access denied (client IP resolved to host name “”. This is not identical to clients host name “”)
    ERROR: unable to contact qmaster using port 10500 on host “”
    rebooting the SGE master host helps; see Sun Grid Engine: execution host can’t connect to master host with “commlib error: access denied”
  • to be continued..
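The first fix can be wrapped in a small helper, sketched here. The function name setup_sge_env is hypothetical, and the check for util/arch under the given directory is my own assumption, based on the path in the error message above.

```shell
#!/bin/sh
# Hypothetical helper: export SGE_ROOT and SDM_DIST together, so that
# SDM_DIST is never left empty when SGE_ROOT is set.
setup_sge_env() {
    # $1: SGE installation directory
    if [ ! -f "$1/util/arch" ]; then
        # Assumption: a usable installation has util/arch under its root.
        echo "no util/arch under '$1'; is this an SGE installation?" >&2
        return 1
    fi
    SGE_ROOT=$1
    SDM_DIST=$SGE_ROOT   # the workaround described in the list above
    export SGE_ROOT SDM_DIST
}

# Example (path taken from the prompt shown above):
# setup_sge_env /opt/ge6.2u2_1
```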

Shravan Kumar shared with me a demo by Lew Tucker (Sun’s CTO) in which he demonstrates the Virtual Data Center. It’s not the usual marketing speech; it’s quite interesting. As I understand it, Lew Tucker is talking about The APIs for the Sun Cloud, a RESTful API for creating and managing cloud resources, including compute, storage, and networking components. It looks very attractive, so I’m going to check it out shortly 🙂

Here are more Lew Tucker demos: cloud APIs and how to create and deploy a MySQL virtual machine

Lew Tucker and Dave Douglas demonstrate a web developer application for the cloud storage service

All of these videos (and other Sun cloud-related info) are available at the home page of Sun’s CommunityOne East web event; you may also check out Sun’s A Guide to Getting Started with Cloud Computing.

Newcomers to Solaris (and experienced Linux users :-)) often have trouble with everyday routines that work differently in Solaris than in most Linux distributions such as Ubuntu. Below I list the most “popular” problems and questions about the differences between Solaris and Linux and try to figure them out.

  • grep doesn’t have the -r switch, so there’s no recursive search through directories; here are alternatives for recursive grep on Solaris:
    grep 'somestring' `find . -name '*'`
    find . | xargs grep 'somestring'
  • tar doesn’t support the -z option, so tar xfz my_archive.tar.gz fails with the error “tar: z: unknown function modifier”. To unpack a tar.gz archive on Solaris you can use:
    gzip -dc my_archive.tar.gz | tar xvpf -
  • how to set environment variables in Solaris (I set them in ~/.bashrc, but it doesn’t work): to make Solaris read the .bashrc file and apply it to the user environment, create the file ~/.bash_profile in your home directory and put the lines listed below in it:
    if [ -f ~/.bashrc ]; then
      . ~/.bashrc
    fi
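The grep and tar workarounds above can be wrapped into two small functions, sketched below. The names rgrep and untgz are made up for this example. The grep variant uses find’s -exec … {} + form, which is specified by POSIX and so should be available on Solaris as well. Unlike the backtick and xargs versions, it copes with file names containing spaces and does not hand directories to grep.

```shell
#!/bin/sh
# Hypothetical helper: recursive grep without "grep -r".
# $1: pattern, $2: directory to search (default: current directory)
rgrep() {
    # /dev/null is an extra file argument, so grep prints the file name
    # even when find passes it only a single file.
    find "${2:-.}" -type f -exec grep "$1" /dev/null {} +
}

# Hypothetical helper: unpack a .tar.gz without "tar -z".
# $1: archive, $2: destination directory (default: current directory)
untgz() {
    # gzip reads the archive from the caller's directory; only tar
    # runs inside the destination directory.
    gzip -dc "$1" | (cd "${2:-.}" && tar xpf -)
}

# Examples:
# rgrep 'somestring' /etc
# untgz my_archive.tar.gz /tmp/unpacked
```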

To be continued…

Useful links: Solaris Infrequently Asked and Obscure Questions

I ran into problems with my SGE cluster. I have a number of Solaris 10 machines running under virtualization; all servers are configured identically and have the same environment. On one machine I installed the SGE master, on the others SGE execution hosts. Some execution hosts work well, but on others I get a strange error from “install_execd”:

Checking hostname resolving
Cannot contact qmaster. The command failed:
./bin/sol-x86/qconf -sh
The error message was:
error: commlib error: access denied (client IP resolved to host name “”. This is not identical to clients host name “”)
ERROR: unable to contact qmaster using port 10500 on host “”

When I run “qconf -sh” I get:

bash-3.00# qconf -sh
error: commlib error: access denied (client IP resolved to host name “”. This is not identical to clients host name “”)
ERROR: unable to contact qmaster using port 10500 on host “”

I checked the connection: ping works, the hostname resolves, and a telnet connection to port 10500 works. Then I checked the connection from the master host: no problems there either. I compared the environment on the execution hosts that work with the hosts that show the error: the environment is the same, and the master host configuration has nothing suspicious-looking either. I tried to find something useful on the web, with no results; some people have the same problem, but no one knows what is happening or how to fix it. Then I tried rebooting the execution hosts: no effect.

But when I ran “reboot” on the master host, wow, it helped! So, guys, if you get the same errors with SGE, try a “reboot” on your master host; it may help.
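The manual name-resolution check described above can be scripted, as a rough sketch. It only compares the two things the error message talks about (the name a host resolves to and the name it calls itself); it does not reproduce SGE’s own check, and in my case everything resolved fine and only the reboot helped. The function names same_name and check_resolution are made up for this example, and getent is assumed to be available.

```shell
#!/bin/sh
# Hypothetical helpers for comparing forward and reverse name resolution.

# Succeeds only when both names are non-empty and identical.
same_name() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# $1: host name to check (default: this machine's hostname)
check_resolution() {
    name=${1:-$(hostname)}
    # getent prints "address  canonical-name [aliases...]"
    ip=$(getent hosts "$name" | awk 'NR == 1 { print $1 }')
    back=$(getent hosts "$ip" | awk 'NR == 1 { print $2 }')
    if same_name "$name" "$back"; then
        echo "OK: $name -> $ip -> $back"
    else
        echo "MISMATCH: $name -> ${ip:-<nothing>} -> ${back:-<nothing>}"
        return 1
    fi
}

# Run on the master and on each execution host:
# check_resolution
```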