Archive for the ‘programming’ Category

This week I spent some good time with the BigchainDB folks in Berlin at the Microsoft/BigchainDB hackathon, hacking, designing and coding together. Besides the other outcomes (which should be more than barely working Python code), a few thoughts came to my mind and I want to share them here.

So we have devices: any IoT-like device such as a Raspberry Pi or the Riddle&Code crypto devices, which support AES, Diffie-Hellman and a secure way to keep keys inside a crypto module; or smart cars like Kia or Volkswagen, which collect car telemetry; or even musical instruments, or things that are not devices in the common meaning at all, like luxury clothes or paintings. We would like to get data from the device or prove the device's authenticity, and we want to do it in a secure way. For example, we would like to get telemetry/mileage information from a smart car and share it with a potential new owner or an insurance company as an important insight into the car's status.

  • Here we come to the first important topic in this area: how to identify a device in a secure way. From that point of view there are two types of devices (or non-devices; remember the paintings and musical instruments mentioned before). The first type provides some secure features like ‘I can keep a private/public key’, and the second is about ‘I don’t have any keys at all’. In the first case we’re in good shape: the device has its own unique keypair, and this keypair can be used anywhere to identify the device. In the second case (99.95% of devices or even more) we have only properties which should be treated as publicly exposed information, for example:
    – serial number (VIN): almost never changed and mostly unique, but not always. For a VIN, for example, there are cases where it can change (say you replace the engine in your car: then you have a new engine with a new VIN, but the car itself is the same).
    – consumer properties like color, size, weight and so on. Some of these properties are immutable (like weight or size) as long as the device is in a proper state; some, like color, can change easily (after washing, for example) without any effect on the main functionality of the device.
    – manufacturer properties: year, factory id, name of the device, the person who did the assembly and so on. These properties are immutable because the device got them as part of its existence/historical data. Yes, the serial number is also a kind of manufacturer property; I separate it from the others because the serial number is going to be treated as a unique identifier, or at least it has a much better chance of being treated that way than any other property.

For these non-secure devices we have two options, kindly provided by blockchain technology. Option one is to use One Authority center to


The world will never be the same: Android SDK in Visual Studio

Just check out the Visual Studio 2015 preview.

Hello world C++

Below I put some snapshots from Google Chrome’s release cycle presentation with some comments.

To convert a text file to lowercase you can use sed (GNU sed, which supports \L):

sed -e 's/.*/\L&/' in.txt > out.txt

or a perl one-liner:

perl -ne 'utf8::decode $_; $_ = lc $_; utf8::encode $_; print' in.txt > out.txt
Both work fine for Unicode files too.
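
For comparison, here is a minimal Python sketch of the same lowercasing step. It assumes the input is UTF-8 and reuses the in.txt / out.txt names from the commands above; the function name is made up for this example.

```python
def lowercase_file(src="in.txt", dst="out.txt", encoding="utf-8"):
    """Copy src to dst, lowercasing every line (Unicode-aware)."""
    with open(src, encoding=encoding) as infile, \
         open(dst, "w", encoding=encoding) as outfile:
        for line in infile:
            # str.lower() handles non-ASCII letters as well as ASCII ones
            outfile.write(line.lower())
```

Calling lowercase_file() with no arguments reads in.txt and writes out.txt in the current directory.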

Okay, we have a text file with a list of URLs and want Firefox screenshots of these pages, and we also need the screenshots in a normalized resolution (for example, all images as 300×400 thumbnails). First of all you need to install the Command line print Firefox add-on. Then create a simple script which, in a loop, runs Firefox with the needed URL, prints the screenshot and closes Firefox (in my case via kill; maybe it’s too brutal). It may look like this (urls_list.txt is the file with URLs, one URL per line :-)). After running this script you will have many *.png files with screenshots of the URLs: 0.png for the first URL in urls_list.txt, 1.png for the second and so on.

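id=0
while read -r line
do
  firefox -print "$line" -printmode png -printdelay 10 -printfile ${id}.png
  # brutal: kill every firefox process ([f] keeps grep from matching itself)
  ps ax | grep [f]irefox | awk '{ print $1 }' | xargs kill -9 ;
  id=$((id+1))
done < urls_list.txt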

Now that we have the screenshots (in general they all have different resolutions) we need to normalize them, that is, create 300×400 thumbnails for all images. convert helps!

for f in *.png;
do
  convert -thumbnail '300x400!' "${f}" "thumb_${f}"
done

And we have many thumb_*.png files, all with 300×400 resolution. A little note: the resolution without the ! sign works differently. The resize is then proportional, so the image is scaled to fit inside 300×400 and only the limiting dimension reaches its target value.
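
The difference between 300x400 and 300x400! can be sketched in a few lines of Python. This only illustrates the arithmetic; it is not how ImageMagick is implemented, the function name is made up, and the rounding may differ from convert by a pixel.

```python
def thumbnail_size(width, height, box_w=300, box_h=400, force=False):
    """Return the output size for a WxH box, with or without the '!' flag."""
    if force:
        # '300x400!': ignore the aspect ratio and hit the box exactly
        return box_w, box_h
    # '300x400': keep the aspect ratio and fit inside the box
    scale = min(box_w / width, box_h / height)
    return max(1, int(width * scale + 0.5)), max(1, int(height * scale + 0.5))

print(thumbnail_size(1200, 800))              # proportional fit
print(thumbnail_size(1200, 800, force=True))  # exact 300x400
```

For a 1200×800 screenshot the proportional fit is limited by the width, so the result is 300×200, while the forced variant stretches it to 300×400.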

Hash tables :

  • Network Applications of Bloom Filters: A Survey, by Andrei Broder and Michael Mitzenmacher
  • Coding bloom filters
  • General Purpose Hash Function Algorithms
  • IP-Lookup with a Blooming Tree Array: A New Lookup Algorithm for High Performance Routers

C++ Q/A :

  • Bjarne Stroustrup’s C++ Style and Technique FAQ

Here’s an svn and git comparison (yeah, I know: ‘geeks, stop migrating from svn to git and vice versa, just develop code and don’t spend time on this!’).

    What we have: git and svn installed on the same server, with both the svn and the git repository empty. I add the same data and make the same changes in both repositories, and in another folder I check how updating works, so these tests show how things work both for ‘committers’ and for ‘those who update changes’. The tests cover typical version control operations: I measure the time for adding, pushing and pulling various types of data, and how long the corresponding update takes. Detailed information about each test is below the table.

    For git I use the git+ssh protocol, for svn I use the svn protocol. I don’t provide any special configuration for either git or svn; all settings are defaults. I would also like to note that these tests are incomplete: I don’t check how delete, branching, merging and some other functions work, so in the future I will try to add these points.

    Main result of this test: git is about 2-3 times faster than svn in a typical configuration. There’s only one really bad point: git works badly on big files (roughly 100 MB and up in these tests), and for some really big files (~1 GB) git can even exit with an out-of-memory error. There are special git-related projects, like git-bigfiles, that should help you to work with git and big files together.

    Test | svn | git | svn/git
    Test 1: adding boost 1_43 | add (1 sec) + commit (600 sec) = 601 sec | add (3 sec) + commit (86 sec) + push (612 sec) = 701 sec | 0.85
    Test 2: checkout repository after Test 1 | svn co = 109 sec | clone = 18 sec | 6.05
    Test 3: small changes in sources (add first line with comment to some files, 346 files changed) | svn commit = 5 sec | commit (2 sec) + push (2 sec) = 4 sec | ~1
    Test 4: update after Test 3 | svn update = 15 sec | git pull = 7 sec | ~2
    Test 5: bigger change (add first line with comment to 5363 files) | svn commit = 103 sec | commit (6 sec) + push (18 sec) = 24 sec | 4.29
    Test 6: update after Test 5 | svn update = 28 sec | git pull = 12 sec | 2.3
    Test 7: real-life project add (2.6 GB of data) | svn add (21 sec) + svn commit (1h 2m 25s) = 3766 sec | add (38 sec) + commit (1m 23s) + push (18m 28s) = 1m 51s + 18m 28s = 20m 19s = 1219 sec | 3.1
    Test 8: update after Test 7 | svn update (11m 58s) = 718 sec | git pull = 5m 27s = 327 sec | ~2.2
    Test 9: source change (change namespace visibility) | svn commit (1m 28 sec) = 98 sec | commit (5 sec) + push (3 sec) = 8 sec | 12.25
    Test 10: update after Test 9 | svn up = 22 sec | pull = 9 sec | 2.4
    Test 11: full checkout of project | svn co (12m 45s) = 765 sec | pull (4m 3s) = 243 sec | 3.14
    Test 12: big file test, apache log (50 MB) | svn add (3 sec) + commit (1m 27s) = 1m 30s = 90 sec | add (3 sec) + commit (6 sec) + push (1m 18s) = 1m 17s = 77 sec | 1.16
    Test 13: update after Test 12 | up = 26 sec | pull = 17 sec | 1.52
    Test 14: big file test, C++ sources in one file (70 MB) | add (3 sec) + commit (1m 15s) = 1m 18s = 78 sec | add (2 sec) + commit (2 sec) + push (56 sec) = 1m = 60 sec | 1.3
    Test 15: update after Test 14 | up = 28 sec | pull = 14 sec | ~2
    Test 16: 100 MB xml with encoded data | add (3 sec) + commit (8m 37s) = 8m 40s = 520 sec | add (9 sec) + commit (1 sec) + push (24m 34s) = 24m 44s = 1484 sec | 0.35
    Test 17: update after Test 16 | up = 52 sec | pull = 47 sec | 1.10
    Test 18: 300 MB xml with encoded data | add (1 sec) + commit (9m 26s) = 9m 27s = 567 sec | add (3 sec) + commit (8 sec) + push (17m 12s) = 17m 33s = 1053 sec | 0.53
    Test 19: update after Test 18 | 1m 39s | pull = 1m 29s | 1.1
    Test 20: 1200 MB xml with encoded data | 1h 24m 34s | failed | ?
    Test 21: update after Test 20 | 5m 47s | failed | ?

    So, on average, git is about 2.6 times faster than svn.
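
    The 2.6 figure matches the plain arithmetic mean of the svn/git column for Tests 1-19. That is an inference from the numbers, since the post does not say how the average was computed. A small Python sketch to check it, with the "~" values taken at face value (~1 as 1.0, ~2 as 2.0, ~2.2 as 2.2):

```python
from statistics import mean, median

# svn/git column of the table, Tests 1..19 (Tests 20 and 21 have no ratio)
SVN_GIT_RATIOS = [0.85, 6.05, 1.0, 2.0, 4.29, 2.3, 3.1, 2.2, 12.25, 2.4,
                  3.14, 1.16, 1.52, 1.3, 2.0, 0.35, 1.10, 0.53, 1.1]

average = mean(SVN_GIT_RATIOS)   # the "about 2.6" figure
middle = median(SVN_GIT_RATIOS)  # less sensitive to the Test 9 outlier

print(f"mean svn/git ratio:   {average:.2f}")
print(f"median svn/git ratio: {middle:.2f}")
```

    The mean comes out at about 2.56 and the median at 2.0, so a single outlier like Test 9 (12.25) pulls the average up noticeably.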

    I understand that these tests may not be very precise, and I don’t check how delete or branching work, but I did check my own most-used operations. Overall we can see that git is more than 2 times faster than svn. Everything looks fine with git except one really bad thing: work with big files (more than 100 MB). For this type of file svn really works faster, and for very big files (more than 1 GB) git even crashed during the ‘git push’ command with a ‘fatal: Out of memory, malloc failed’ message.

    Test 1: boost 1_43 C++ library: 29135 files and 31609 objects including folders, 286 MB total size

    Test 3
    I just added a first line to some cpp files:

    find ./ -name "*a???.cpp" -exec sed -i 1i"//test comment `date` : {}" {} \;

    Test 5
    Change all *.cpp files in the same way :

    find ./ -name "*.cpp" -exec sed -i 1i"//big test comment2 `date` : {}" {} \;

    Test 7: Add a real-life repository: many sources in cpp, perl, erlang, makefiles, bash scripts, MS Word documents, images, PDFs, compiled binaries, resource files, etc. In total it contains 4605 files, about 2.6 GB in size.

    Test 9:
    Refactoring simulation: use the “std::” specifier for all cerr and cout. Not sure it’s really safe in real-life development, but it’s okay for test purposes.

    find ./ -name "*.h"   -exec sed -i "s/  cerr/std::cerr/g" {} \;
    find ./ -name "*.cpp" -exec sed -i "s/ cerr/std::cerr/g" {} \;
    find ./ -name "*.cpp" -exec sed -i "s/ cout/std::cout/g" {} \;

    Test 11:  checkout whole project

    Test 12: adding 50 mb text file – apache logs

    Test 14: adding 72 mb file – many c++ sources concatenated in one big file

    Test 16: small xml : 100 mb – xml file sample from real life

    Test 18: medium xml : 300 mb – same ( but bigger ) real-life xml file

    Test 20: large xml : 1200 mb – sample of pretty big real-life xml

    git push failed with an out-of-memory error (as far as I can see): “fatal: Out of memory, malloc failed”

    I collected some useful links about different version control systems (git, svn, cvs, mercurial). Take a look if you’re interested in using different VCS systems.

    Q1: What value will be printed ?
    double d1 = 16, d2=2.0;
    double d3 = (1/2) * ( sqrt(d1) + d2/2);
    cout << d3 << endl;

    Q2: Please write your own itoa implementation

    Q3: What is a pure virtual destructor? Should (or can) pure virtual functions be implemented?

    Q4: Give an example of an abort function implementation

    Q5: Why do we need a semicolon after the closing brace in a C++ class declaration?

    Q6: Can we call a destructor directly? What may we need it for?

    Q7: Can we call a constructor directly? What may we need it for?

    Q8: Can we throw an exception in a destructor?

    Q9: What happens if we call pure virtual methods in a constructor or destructor? Same question about ‘non-pure’ virtual methods?

    Q10: Why don’t we have virtual constructors in C++?

    Q11: What is the difference between new and new []? Same thing about delete and delete []?

    Q12: What is the difference between static_cast, dynamic_cast, reinterpret_cast and const_cast?

    Q13: Why do we need explicit keyword in C++ ?

    Q14: Explain difference between

    const MyClass c;
    const MyClass &c2 = c;
    const MyClass *c3 = &c;
    const MyClass * const c4 = &c;

    Q15: Difference between const_iterator and iterator

    Q16: Implement Rand5 ( which returns randomly 0..4 ) use only Random7 ( which returns 0..6)

    Q17: Can we use object of class ( or structure ) which doesn’t have name? Can we declare and use classes and structures without names ?

    Q18: What is the factory pattern?

    (to be updated and continued very soon 🙂)

    Answers (if you need them 🙂) are in the full post ‘C/C++ interview questions’.
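
    As an illustration for Q16, one common approach is rejection sampling: call Random7 and throw away the values 5 and 6, so the remaining values 0..4 stay equally likely. Below is a minimal Python sketch of the idea, not the answer from the full post; random7 here is a stand-in built on the standard library.

```python
import random

def random7():
    """Stand-in for Random7: uniform integer in 0..6."""
    return random.randint(0, 6)

def rand5(rand7=random7):
    """Uniform integer in 0..4 using only calls to rand7 (rejection sampling)."""
    while True:
        value = rand7()
        if value < 5:
            # 0..4 are kept as they are; 5 and 6 are rejected and re-drawn
            return value

print([rand5() for _ in range(10)])
```

    Each call needs 7/5 = 1.4 draws from Random7 on average. Something like Random7() % 5 would not work, because 0 and 1 would come up twice as often as 2, 3 and 4.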

    Using json in bash :

    • jsawk: works with an array of JSON objects read from stdin and filters them using JavaScript to produce a results array that is printed to stdout. For example, get some JSON (from a Yahoo image search for the ‘madonna’ query) and add some info (Madonna’s real name) to it
    • wget
      cat json_example.json | ./jsawk "this.ResultSet.RealName=\"Louise Ciccone\"" > updated_info.json

    • you can use some bash/awk/sed stuff to parse json. For example – list all image urls from json
    • wget
      cat json_example.json | sed -e 's/[{}]/''/g' | awk -v k="text" '{n=split($0,a,","); for (i=1; i<=n; i++) print a[i]}' | grep '"ClickUrl":' | sed 's/:/ /1' | awk -F" " '{ print $2 }'

    • or use a library for the language you like 🙂
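
    For the last option, here is a minimal Python sketch using the standard json module. It does the same two things as the commands above: it adds RealName under ResultSet and lists every ClickUrl. The sample document is hypothetical and only mimics the key names used above; the real search response may be shaped differently, which is why the helper walks the whole document instead of assuming a fixed path.

```python
import json

def collect_values(node, key):
    """Return every value stored under `key`, at any depth of parsed JSON."""
    found = []
    if isinstance(node, dict):
        for name, value in node.items():
            if name == key:
                found.append(value)
            found.extend(collect_values(value, key))
    elif isinstance(node, list):
        for item in node:
            found.extend(collect_values(item, key))
    return found

# Hypothetical sample; only the key names come from the commands above.
sample = ('{"ResultSet": {"Result": ['
          '{"ClickUrl": "http://example.com/a.jpg"}, '
          '{"ClickUrl": "http://example.com/b.jpg"}]}}')

document = json.loads(sample)
document["ResultSet"]["RealName"] = "Louise Ciccone"  # same edit as the jsawk call
print("\n".join(collect_values(document, "ClickUrl")))
```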

    via v.kruchkov

    We have a git repo at github (  and would like to move it onto some corporate box inside our network, for example on This repository is shared by a group of developers, and everyone who has ssh access onto shall have the right to commit to the git repository. So first of all prepare a place for the git repo at the new hosting:

    cd var
    mkdir git_repo
    chmod 777 git_repo
    cd git_repo
    mkdir my_project
    cd my_project
    git init --bare --shared=all
    cd objects
    chmod g+ws *

    Next clone existing repo and change its settings to point onto

    git clone --bare
    git remote add -t master -m master origin ssh://
    git push origin master

    to start work with new repo just use

    git clone ssh://

    have fun.

    If you’re looking for a free lightweight Java class decompiler, take a look at Emmanuel Dupuy’s JD: nice stuff, about 1 MB in size.

    How to dump an Oracle table into a tab-separated text file (csv): my way is to run sqlplus, spool with HTML markup switched on, and convert the resulting HTML into text using sed replacements. It works fast enough, and better than parsing the text that spool produces without markup. The bash script which implements it is in the full post.
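
    The conversion step can also be sketched in Python. This is not the sed-based script from the full post, just an illustration of turning spooled HTML table markup into tab-separated lines. It assumes the spool contains ordinary tr / th / td tags; the class and function names are made up for this example.

```python
from html.parser import HTMLParser

class TableToRows(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside <tr>
        self._cell = None  # text chunks of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # collapse line breaks and runs of spaces inside a cell
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None

def html_to_tsv(html):
    """Return the table rows found in `html` as tab-separated lines."""
    parser = TableToRows()
    parser.feed(html)
    parser.close()
    return "\n".join("\t".join(row) for row in parser.rows)
```

    Feeding it the spooled file contents returns one line per table row, header row included.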

    One way to route a job onto a specific host is to use queues: you may create a unique queue for each host in your SGE grid (using qconf -aq) and specify this queue name in the submit parameters:

    qsub -q <queue_name> $SGE_ROOT/examples/jobs/

    If you would like to deploy jobs onto the grid from an application (C or Java), SGE supports a special API: the Distributed Resource Management Application API (DRMAA). Here are some examples in C++ and Java which may help to figure this stuff out. There are the SGE DRMAA Javadocs, the drmaa package JavaDocs and the common help, with C library functions listed in section 3. To specify the queue name, the drmaa_set_attribute function should be used as shown below:

    drmaa_set_attribute(jt, DRMAA_NATIVE_SPECIFICATION, "-q queue_name", error, DRMAA_ERROR_STRING_BUFFER - 1);

    Another way to route a job onto a specific host is to specify request attributes in qsub: qsub -l <request_attr_name> (for a Java example please see below). You may also add a “soft” or “hard” resource requirements modifier (for more see the SGE glossary: hard/soft resource requirements).

    drmaa_set_attribute(jt, DRMAA_NATIVE_SPECIFICATION, "-hard -q queue_name", error, DRMAA_ERROR_STRING_BUFFER - 1);

    Here’s a listing of a drmaa C++ example which runs a job on a specified queue. To build it you may use the simple bash script listed below; it works on Solaris 10, and for Linux I suppose it’s better to use the g++ compiler:

    cc $INC $LIB $LIB_NAME sge_drmaa_test_example.c -o sge_drmaa_test_example.out

    If you get the error below when you run this example

    sge_drmaa_test_example.out: fatal: open failed: No such file or directory

    please check the LD_LIBRARY_PATH environment variable; it should be set like this (Solaris 10 x86):

    export LD_LIBRARY_PATH=$SGE_ROOT/lib/sol-x86/

    The Java implementation also uses DRMAA, but it looks a little different from C++: instead of drmaa_set_attribute you call JobTemplate::setNativeSpecification:

    job_template.setNativeSpecification("-hard -q " + queue_name);

    Another way to run a job on the needed host is to specify the hostname as a request attribute. It looks like

    jt.setNativeSpecification("-l hostname=dev-host1");

    Here’s a Java source for the sge drmaa example, or a Java drmaa example archive: the zip contains the source file, an Eclipse project and compiled binaries. To create the jar you may use Eclipse export or run inside the bin folder

    jar cf SgeDrmaaJobRunner.jar net/bokov/sge/*.class

    To run this jar (and run /tools/ which is already deployed on all executors) on Solaris 10 I use this command

    java -cp $SGE_ROOT/lib/drmaa.jar:SgeDrmaaJobRunner.jar -Djava.library.path=$LD_LIBRARY_PATH net.bokov.sge.SgeDrmaaJobRunner soft host  not_wait  /tools/ host2-dev-net

    You can also specify not just one queue name, but a comma-separated list of queue names as the parameter:

    qsub -q queue_1,queue_2 $SGE_ROOT/examples/jobs/

    At least qsub allows this syntax 🙂
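
    The native specification strings passed to drmaa_set_attribute and setNativeSpecification above are built from the same few options (-q, -l hostname=..., -hard / -soft). A small Python sketch that assembles such a string; the helper name and its parameters are hypothetical, and only the option syntax comes from the examples above.

```python
def native_specification(queues=(), hostname=None, requirement=None):
    """Build a string such as '-hard -q queue_1,queue_2' from its parts.

    queues      -- list or tuple of queue names (joined with commas, no spaces)
    hostname    -- optional host requested via '-l hostname=...'
    requirement -- None, 'hard' or 'soft'
    """
    if requirement not in (None, "hard", "soft"):
        raise ValueError("requirement must be None, 'hard' or 'soft'")
    parts = []
    if requirement:
        parts.append("-" + requirement)
    if queues:
        parts += ["-q", ",".join(queues)]
    if hostname:
        parts += ["-l", "hostname=" + hostname]
    return " ".join(parts)

print(native_specification(["queue_1", "queue_2"], requirement="hard"))
print(native_specification(hostname="dev-host1"))
```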

    A little hint on ‘how to add $file to a Maven repository’:
    mvn -X deploy:deploy-file -Dfile=$file -DrepositoryId=_rid_ -DgroupId=_gr_id_ -DartifactId=_art_id_ -Dversion=_ver_num_ -Dpackaging=jar -Durl=_url_

    It may look like this (I add ‘str_processing.jar’):
    mvn -X deploy:deploy-file \
    -Dfile=$PROJECT_HOME/lib/str_processing.jar \
    -DrepositoryId=repo-bokov \
    -DgroupId=net.bokov.str \
    -DartifactId=str-processing \
    -Dversion=1.0u1 \
    -Dpackaging=jar \
    -Durl=_url_

    CppCMS is a C++ framework for high-load web projects and applications. As far as I understand, CppCMS is at least 5 times faster than WordPress + PHP, but currently it is used only by the project owner to support the CppCMS blog. To get more info about this project you may look at these links: How to install wiki++ and Wiki++ installation package