Recently I crawled through Amazon and found some bugs in their service: some categories show a wrong item counter, and the guys also have really bad HTML on their product pages.

Let's look at the Drill Bits and Sets category. Amazon tells us there are 24,151 results, but the sub-category counts add up to much less than 24,000 (look at the screenshot). Maybe some results are stored in sub-categories and the rest only at the current category level? Hmm, but in that case I can't see any results beyond 9,600, because I can only page up to page number 400. Same thing with router accessories. So either they have hidden items that can't be shown to users, or the result counter is buggy.

The next point deals with the HTML code. I opened the HTML source for Drill Bits and Sets and saw that it is very, very badly formatted: the code is filled with empty lines and useless spaces (not to mention that they include CSS inline in the page, and that there are too many images on the page which keep reloading). This is why the HTML is so big, 223K (huh, guys, most web pages are no more than 30-60K).

I checked how much useless stuff is in this HTML. I removed all lines which are empty or contain only spaces or tabs using sed 's/^ //' | sed '/^$/d' and got a page about 5% smaller than the original, 213K. The original page has 5573 lines; after cleaning up the empty lines I got HTML with 3054 lines, so almost half of the lines in Amazon's HTML are dummy.

I know that almost all browsers can download compressed HTML, so I checked what happens with zipped files: the original file is 39K in zip (I used zip with best compression) and the 'cleaned' file is 37K. So we have roughly 2 to 3K of dummy binary traffic for every document. Against 39K that is somewhere between 5% and 10% (yeah, I know, it's a very rough estimate), a dummy load that can be dropped simply by removing the empty lines from their HTML.
Guys, why don't you do it? Cutting up to 10% of web traffic just by removing empty lines is a big cost saving, isn't it?
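
If you want to repeat the measurement on a page of your own, here is a minimal Python sketch of the same idea. It is not the exact procedure I ran: it drops every line that is empty or whitespace-only, and it compresses with gzip at the highest level instead of the zip tool, on the assumption that gzip is closer to what browsers actually negotiate. The function names and the sample page are made up for illustration.

```python
import gzip


def strip_blank_lines(html: str) -> str:
    """Drop lines that are empty or contain only spaces/tabs."""
    kept = [line for line in html.splitlines() if line.strip()]
    return "\n".join(kept) + ("\n" if kept else "")


def size_report(html: str) -> dict:
    """Compare line counts, raw sizes and gzip sizes before and after cleanup."""
    cleaned = strip_blank_lines(html)
    raw = html.encode("utf-8")
    raw_clean = cleaned.encode("utf-8")
    return {
        "lines_before": len(html.splitlines()),
        "lines_after": len(cleaned.splitlines()),
        "bytes_before": len(raw),
        "bytes_after": len(raw_clean),
        # Assumption: gzip level 9 stands in for "zip with best compression".
        "gzip_before": len(gzip.compress(raw, compresslevel=9)),
        "gzip_after": len(gzip.compress(raw_clean, compresslevel=9)),
    }


if __name__ == "__main__":
    # Hypothetical sample; replace it with the saved HTML of a real page.
    sample = "<html>\n\n   \n<body>\n\t\n<p>drill bits</p>\n\n</body>\n</html>\n"
    for name, value in size_report(sample).items():
        print(f"{name}: {value}")
```

On a tiny sample like this the gzip numbers mean little, since the fixed gzip header dominates; the comparison only becomes interesting on a full page of a few hundred kilobytes.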
