Most read posts this month



Nov 4th 2009 × CubeCart 4 security vulnerability: is your store at risk?

Whilst reading up on XSS attacks today, I found this recently reported exploit in CubeCart 4 that can give an attacker full administrative access to the store.

Not only that, it can help them dump your entire store DB – products, cats, users, orders, the works. Anyway, you get the idea. “CubeCart responded and informed their customers about this vulnerability” – as technical advisor for a site that runs on CC4, I can testify that the site owners were not informed of any such thing. Nice.


Jun 9th 2009 × CubeCart problems: no shopping basket and login functionality for some users

There’s yet another one of these problems that are woven deep into CubeCart that you just wouldn’t know about… It displays different versions of pages to search engines and to humans: for anything it takes to be a search engine, it disables the shopping basket and checkout functionality as well as the login and registration.

First of all, it needs to be said that troubleshooting this and finding the reason for users being unable to purchase off a CubeCart site is an absolute nightmare. When a customer says they use ‘bog standard’ IE8 and reports their ‘add to basket’ button doing nothing whatsoever, one tends to think ‘has the javascript handler gone wrong?’. You go on a wild goose chase, trying to reproduce the problem and failing despite installing various javascript exception tracking modules, looking through logs and quizzing puzzled shoppers. All the while, you can’t help but wonder about the possible extent of the problem: how many users get this? And then you catch a break by accidentally discovering that customers unable to purchase also lack the login/register links – time to start connecting the dots…

From session.inc.php, which controls the login / register links in CubeCart:
if (!$cc_session->user_is_search_engine() || $config['sef'] == false)

The template does not get shown to search engines? That makes sense… having different versions of pages shown to users and to spiders is not only a bad practice (google really don’t appreciate “cloaking” techniques), there is just NO need for it whatsoever. In order to prevent spiders from indexing pages that are deemed irrelevant to e-commerce and spill page rank / relevance, they could have been disallowed from within the robots.txt file. A rel=”nofollow” could have been applied to links to such pages… but what have the clever folks at CubeCart done instead?

They added a boolean method to the session class that decides if the user is a bot or not, user_is_search_engine(). It does so by comparing the contents of a file called spiders.txt, filled with “known” extracts from the user agent strings of various spiders from around the web, against the user agent string of the visitor. To be fair, the original idea for this kind of testing comes from OS Commerce…

The problem comes when a legitimate customer is shut out of the site’s e-commerce functions because their user agent string is customised in a ‘bad’ way. How does that work? The original CubeCart spiders.txt file has lines that reject things like:

‘googlebot’ but also just ‘google’,
‘msnbot’ but also just ‘msn’

And so forth… multiply by x number of toolbars and custom strings CubeCart may never know about and you have a HUGE problem. For instance, users with Google Desktop get their user agent string set to Mozilla/5.0 (compatible; Google Desktop) so they promptly get rejected. Luckily, that’s not a very popular application but the “MSN” string and the absolutely COUNTLESS numbers of people that have got a user agent string like this one: Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; Sky Broadband; GTB6; SLCC1; .NET CLR 2.0.50727; Media Center PC 5.0; .NET CLR 3.0.04506; InfoPath.2; .NET CLR 3.5.21022; MSN OptimizedIE8;ENGB) presents a VERY real problem. MSN optimized IE8? Not on my shop site, mate.
edit: check your spiders.txt version; if it predates August 2008, you are affected.
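
To make the failure mode concrete, here is a minimal sketch of loose substring matching of the kind described above. It is not CubeCart's actual code: the function name `ua_looks_like_spider()` and both needle lists are stand-ins for illustration, and the shopper string is a shortened version of the IE8 example above.

```php
<?php
// Hypothetical helper, NOT CubeCart's real implementation: returns true when
// any needle from a spiders.txt-style list occurs anywhere in the user agent.
function ua_looks_like_spider(string $userAgent, array $needles): bool
{
    $ua = strtolower($userAgent);
    foreach ($needles as $needle) {
        $needle = strtolower(trim($needle));
        if ($needle !== '' && strpos($ua, $needle) !== false) {
            return true;
        }
    }
    return false;
}

// Assumed lists: the loose one mirrors the 'google' / 'msn' entries quoted above.
$looseList  = ['googlebot', 'google', 'msnbot', 'msn'];
$strictList = ['googlebot', 'msnbot'];

$shopper = 'Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.0; Trident/4.0; '
         . 'Sky Broadband; GTB6; MSN OptimizedIE8;ENGB)';
$desktop = 'Mozilla/5.0 (compatible; Google Desktop)';
$crawler = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)';

var_dump(ua_looks_like_spider($shopper, $looseList));   // bool(true)  real customer rejected
var_dump(ua_looks_like_spider($desktop, $looseList));   // bool(true)  real customer rejected
var_dump(ua_looks_like_spider($shopper, $strictList));  // bool(false)
var_dump(ua_looks_like_spider($crawler, $strictList));  // bool(true)  actual crawler still caught
```

The shorter the needle, the more real browsers it swallows; 'msn' and 'google' are short enough to match toolbar and ISP branding that has nothing to do with crawlers.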

I have been looking at the user sessions table and have thus far found over 3000 genuine users that have been rejected by CubeCart’s loose user agent matching routine. That’s a lot of business to lose and the store owners are understandably upset. It’s not free software and at a testing time like the credit crunch we’re enjoying, having your own store work against you is far from ideal. The real frustration comes from the fact that people had reported an intermittent loss of shopping cart functionality on the CubeCart forums and on their bug / ticketing system. Reported and dismissed – apparently, too difficult to trace or unsupported due to store being customised. Every programmer makes mistakes, but being unable to rectify them and failing to provide support to your paying customers – it’s just bad business. I am sorry to say, CubeCart has failed to impress once again…

The fix to the CubeCart user agent problem:

1. apply nofollow to the links for login, register and checkout
2. empty the contents of spiders.txt in your cubecart root folder (don’t delete it)
or
2. change user_is_search_engine() to always return false.

To test if your store is affected, use Firefox (check this post on how to change your user agent string), set it to the one I put as an example above and visit your shop site, then try to add a product to your basket.

update: I am being told that this problem is no longer to be found in current releases of CubeCart. Well done, the team :) Now, how many existing customers on versions pre-dating August 2008 have been notified?


Feb 16th 2009 × E-commerce and product search algorithms: get relevant search results, not just random DB dumps

Everyone has been down that route – trying to make a good search. Most have failed…

The one thing that really annoys me when shopping around is searches on sites that give you irrelevant results. For instance, take a search for ‘black pack’ – I think you’d agree, a generic string with which I’d expect to get ‘backpacks’ and ‘daypacks’ and the like, in black. For the experiment, I have chosen a site at random from google: gooutdoors.co.uk (see the search results yourselves here, in a new window). NB: I am in no way affiliated with gooutdoors.co.uk and this is not a dig at them or a link back for their site in any way, I have even applied a nofollow tag to the results link.

When you had expected to see backpacks and got things like “Silva Ranger 3 Compass”, “Lifesystems HeadNet Mosquito Hat” and “Wayfayrer Beef Stew and Dumplings” (sic) instead, you can’t help but think something has gone terribly wrong with the search script. With over 70% of users likely to just ‘bounce’ off such a site after not being able to find what they were after immediately, we need to take a look into the why’s of getting such irrelevant search results.

Upon clicking on the Lifesystems Mosquito Hat from the above result set and scanning for the words ‘pack’ and ‘black’, we notice them within the product features:

  • Can screw down into a small “stuff pack”

  • Ultrafine black mesh

Should these results have been displayed to me? No. Why do we get this problem? Lazy coding. The most basic search practice out there is to do something like:

1. Break string into words.
2. Compose the search query targeting known data fields like title, description, features, word by word, imploding into the query. At this point the where statement can look like ‘where (description like ‘%black%’ or features like ‘%black%’ or title like ‘%black%’) and (description like ‘%pack%’ … etc etc)’
3. Display the results and hope for the best.
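
The three steps above amount to something like the following. This is a sketch of the naive approach for illustration only, not a recommendation; the function name `naive_search_where()` and the field names are assumptions, and it builds a parameterised clause rather than pasting user input into the SQL.

```php
<?php
// Hypothetical illustration of the naive "LIKE everything" search.
// Returns [whereClause, params] for use with a prepared statement.
// $fields must be trusted column names, never user input.
function naive_search_where(string $query, array $fields): array
{
    // Step 1: break the string into words.
    $words = preg_split('/\s+/', trim($query), -1, PREG_SPLIT_NO_EMPTY);

    // Step 2: one OR-group per word, AND-ed together.
    $groups = [];
    $params = [];
    foreach ($words as $word) {
        $ors = [];
        foreach ($fields as $field) {
            $ors[] = "$field LIKE ?";
            $params[] = '%' . $word . '%'; // note: % and _ in $word are not escaped here
        }
        $groups[] = '(' . implode(' OR ', $ors) . ')';
    }
    return [implode(' AND ', $groups), $params];
}

[$where, $params] = naive_search_where('black pack', ['title', 'description', 'features']);
echo $where, "\n";
// (title LIKE ? OR description LIKE ? OR features LIKE ?) AND (title LIKE ? OR description LIKE ? OR features LIKE ?)
```

Any product whose features mention ‘black’ mesh and a stuff ‘pack’ satisfies this clause, which is how a mosquito hat ends up in a backpack search.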

There’s a marketing school of thought here that you’re better off displaying ’something’ than no hits – but this is NOT how it’s done. Here is another favourite search of mine that works on the site above:
the this, Found 253 product(s) – page 1 of 22

It’s fair to say, certain words should not be used to score results, they are just too generic to be considered. Unless you are typing something like ‘the north face’, ‘the’ should be dismissed, in the same way as ‘this’ should be removed. In fact, over time I have built a database of ‘bad keywords’ to drop from search strings that you can see as an appendix to this post.
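
Dropping such words can be sketched as follows. The function name `strip_badwords()`, the tiny word list and the list of protected phrases are assumptions for illustration; a real store would use the full bad words list from the appendix and its own brand names.

```php
<?php
// Hypothetical sketch: remove generic words from a search string, but keep
// protected phrases (e.g. a brand name containing a stop word) whole.
// Protected phrases are returned first, followed by the remaining useful words.
function strip_badwords(string $query, array $badwords, array $protected = []): array
{
    $query = strtolower(trim($query));
    $kept = [];
    foreach ($protected as $phrase) {
        $phrase = strtolower($phrase);
        if (strpos($query, $phrase) !== false) {
            $kept[] = $phrase;                          // keep the phrase as one token
            $query = str_replace($phrase, ' ', $query); // and take it out of the string
        }
    }
    $words = preg_split('/\s+/', $query, -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        if (!in_array($word, $badwords, true)) {
            $kept[] = $word;
        }
    }
    return $kept;
}

$badwords = ['the', 'this', 'a', 'for']; // assumed subset of the full list

print_r(strip_badwords('the this', $badwords));
// nothing useful left, so the search can say so instead of listing 253 products

print_r(strip_badwords('The North Face jacket for this winter', $badwords, ['the north face']));
// keeps 'the north face', 'jacket' and 'winter'
```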

So, what is the alternative? Oddly enough, I have found the most accurate search results are achieved via manual tagging and backed up by product knowledge. It goes like this:

1. Assign tags to each product. You can build an aliases table for common tags and errors. For example, you want to alias misspellings like berghouse, burghaus etc to berghaus (you’d be surprised how many people make mistakes).
2. Build the search algorithm to break down the string into parts and analyse them. Drop all common words that won’t help and keep the ‘useful’ bits only (See below for badwords)
3. What words you have left, treat as tags and fetch all products they have been applied to.
4. Refine for relevance. This is done by scoring each product on the number of tag hits. Basically – if I search for Berghaus RG1 Jacket, that’s a possible score of 3 tagwords. If the shop has the RG1 in stock, the listings should ONLY show me that result (or any other 3 point hits) and none of the results with 2 or fewer (jacket + berghaus). If the RG1 is not being stocked, this leaves an array of jackets by Berghaus and an array of jackets. Once again, go for relevance and show the first group of results only.
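
Steps 3 and 4 can be sketched roughly like this. Everything named here is hypothetical: `best_tag_matches()`, the `$aliases` map and the in-memory `$productTags` array (standing in for a tags table), as are the two made-up products next to the RG1. It assumes bad words have already been dropped from the search string.

```php
<?php
// Hypothetical sketch of tag scoring: count how many search words hit each
// product's tags and return only the products with the highest score.
function best_tag_matches(array $words, array $productTags, array $aliases = []): array
{
    // Normalise common misspellings to the canonical tag.
    $tags = [];
    foreach ($words as $word) {
        $word = strtolower($word);
        $tags[] = $aliases[$word] ?? $word;
    }
    $tags = array_unique($tags);

    // Score = number of search tags applied to the product.
    $scores = [];
    foreach ($productTags as $product => $assigned) {
        $score = count(array_intersect($tags, $assigned));
        if ($score > 0) {
            $scores[$product] = $score;
        }
    }
    if ($scores === []) {
        return [];
    }

    // Show the first (most relevant) group only.
    return array_keys($scores, max($scores), true);
}

$aliases = ['berghouse' => 'berghaus', 'burghaus' => 'berghaus'];
$productTags = [
    'Berghaus RG1 Jacket'    => ['berghaus', 'rg1', 'jacket'],
    'Berghaus Fleece Jacket' => ['berghaus', 'fleece', 'jacket'], // made-up product
    'Generic Rain Jacket'    => ['rain', 'jacket'],               // made-up product
];

print_r(best_tag_matches(['Berghouse', 'RG1', 'jacket'], $productTags, $aliases));
// only the RG1 (3 hits); the 2-hit and 1-hit jackets are left out
```

If the RG1 is removed from `$productTags`, the same search falls back to the 2-hit group, i.e. the other Berghaus jacket, and still hides the generic one.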

Advantages: always gets the right and relevant results, provided the product tags are well maintained.
Disadvantages: needs to be managed and updated, and you need to monitor people’s mistakes in searches and allow for them.
Bottom line: The increased conversion ratio will justify the man hours put into tagging your product base. It’s a credit crunch, we all need to work harder!

I hope this gives you some ideas anyway.

Here is a suggested list of ‘bad words’ that can be safely dropped from search strings:

$badwords = array(
"a", "a's", "able", "about", "above", "according", "accordingly", "across", "actually",
"afterwards", "again", "against", "ain't", "all", "allow", "allows", "almost", "alone",
"along", "already", "also", "although", "always", "am", "among", "amongst", "an", "and",
"another", "any", "anybody", "anyhow", "anyone", "anything", "anyway", "anyways", "anywhere",
"apart", "appear", "appreciate", "appropriate", "are", "aren't", "around", "as", "aside",
"ask", "asking", "associated", "at", "available", "away", "awfully", "b", "be", "became",
"because", "become", "becomes", "becoming", "been", "before", "beforehand", "behind",
"being", "believe", "below", "beside", "besides", "best", "better", "between", "beyond",
"both", "brief", "but", "by", "c", "c'mon", "c's", "came", "can", "can't", "cannot", "cant",
"cause", "causes", "certain", "certainly", "changes", "clearly", "co", "com", "come", "comes",
"concerning", "consequently", "consider", "considering", "contain", "containing", "contains",
"corresponding", "could", "couldn't", "course", "currently", "d", "definitely", "described",
"despite", "did", "didn't", "different", "do", "does", "doesn't", "doing", "don't", "done",
"down", "downwards", "during", "e", "each", "edu", "eg", "eight", "either", "else",
"elsewhere", "enough", "entirely", "especially", "et", "etc", "even", "ever", "every",
"everybody", "everyone", "everything", "everywhere", "ex", "exactly", "example", "except",
"f", "far", "few", "fifth", "first", "five", "followed", "following", "follows", "for",
"former", "formerly", "forth", "four", "from", "further", "furthermore", "g", "get", "gets",
"getting", "given", "gives", "go", "goes", "going", "gone", "got", "gotten", "greetings",
"h", "had", "hadn't", "happens", "hardly", "has", "hasn't", "have", "haven't", "having",
"he", "he's", "hello", "help", "hence", "her", "here", "here's", "hereafter", "hereby",
"herein", "hereupon", "hers", "herself", "hi", "him", "himself", "his", "hither",
"hopefully", "how", "howbeit", "however", "i", "i'd", "i'll", "i'm", "i've", "ie", "if",
"ignored", "immediate", "in", "inasmuch", "inc", "indeed", "indicate", "indicated",
"indicates", "inner", "insofar", "instead", "into", "inward", "is", "isn't", "it",
"it'd", "it'll", "it's", "its", "itself", "j", "just", "k", "keep", "keeps", "kept",
"know", "knows", "known", "l", "last", "lately", "later", "latter", "latterly", "least",
"less", "lest", "let", "let's", "like", "liked", "likely", "little", "look", "looking",
"looks", "ltd", "m", "mainly", "many", "may", "maybe", "me", "mean", "meanwhile", "merely",
"might", "more", "moreover", "most", "mostly", "much", "must", "my", "myself", "n", "name",
"namely", "nd", "near", "nearly", "necessary", "need", "needs", "neither", "never",
"nevertheless", "new", "next", "nine", "no", "nobody", "non", "none", "noone", "nor",
"normally", "not", "nothing", "novel", "now", "nowhere", "o", "obviously", "of", "off",
"often", "oh", "ok", "okay", "old", "on", "once", "one", "ones", "only", "onto", "or",
"other", "others", "otherwise", "ought", "our", "ours", "ourselves", "out", "outside",
"over", "overall", "own", "p", "particular", "particularly", "per", "perhaps", "placed",
"please", "plus", "possible", "presumably", "probably", "provides", "q", "que", "quite",
"qv", "r", "rather", "rd", "re", "really", "reasonably", "regarding", "regardless",
"regards", "relatively", "respectively", "right", "s", "said", "same", "saw", "say",
"saying", "says", "second", "secondly", "see", "seeing", "seem", "seemed", "seeming",
"seems", "seen", "self", "selves", "sensible", "sent", "serious", "seriously", "seven",
"several", "shall", "she", "should", "shouldn't", "since", "six", "so", "some", "somebody",
"somehow", "someone", "something", "sometime", "sometimes", "somewhat", "somewhere", "soon",
"sorry", "specified", "specify", "specifying", "still", "sub", "such", "sup", "sure", "t",
"t's", "take", "taken", "tell", "tends", "th", "than", "thank", "thanks", "thanx", "that",
"that's", "thats", "the", "their", "theirs", "them", "themselves", "then", "thence", "there",
"there's", "thereafter", "thereby", "therefore", "therein", "theres", "thereupon", "these",
"they", "they'd", "they'll", "they're", "they've", "think", "third", "this", "thorough",
"thoroughly", "those", "though", "three", "through", "throughout", "thru", "thus", "to",
"together", "too", "took", "toward", "towards", "tried", "tries", "truly", "try", "trying",
"twice", "two", "u", "un", "under", "unfortunately", "unless", "unlikely", "until", "unto",
"up", "upon", "us", "use", "used", "useful", "uses", "using", "usually", "v", "value",
"various", "very", "via", "viz", "vs", "w", "want", "wants", "was", "wasn't", "way", "we",
"we'd", "we'll", "we're", "we've", "welcome", "well", "went", "were", "weren't", "what",
"what's", "whatever", "when", "whence", "whenever", "where", "where's", "whereafter", "whereas",
"whereby", "wherein", "whereupon", "wherever", "whether", "which", "while", "whither", "who",
"who's", "whoever", "whole", "whom", "whose", "why", "will", "willing", "wish", "with",
"within", "without", "won't", "wonder", "would", "wouldn't", "x", "y", "yes", "yet",
"you", "you'd", "you'll", "you're", "you've", "your", "yours", "yourself", "yourselves", "z"
);

Oct 19th 2008 × CubeCart: a ticking timebomb that goes off 1 second at a time

There are certainly some positive things that can be said about CubeCart, a low-end budget entry choice for an e-commerce platform. It’s certainly capable of doing the rudimentary functions it’s supposed to do: lets you add products into categories and lets customers look at them (even buy some!). It also supports a number of popular payment gateways / merchant accounts. Not rocket science.

My gripe with it starts at the price tag: $179.95. You don’t get a lot for your money that you can’t get for free elsewhere (eg OS Commerce, Magento, BossCart JV, VirtueMart, Mambo… to name but a few). In fact, you get nothing.

Regardless, let’s assume you are a self-styled web 2.0 start-up that wants a piece of the pie, but with limited or no technical knowledge, scant time and resources. And a tight budget (those VCs must be sleeping). You have splashed cash on CubeCart and have spent considerable time skinning the shop and populating it with your product range. It’s time to look at the business logic behind your shopping process, the SEO, the landing pages, the sizing and shipping options… And to discover that most of these trivial functions are available as paid-for (as in, you have to pay afresh) ‘community’ add-ons. Should you decide to display products based on brand/manufacturer, that’s also a commercial add-on. You get the picture? It’s a scam. By this time you’re well over your budget and behind schedule so you figure, stopping now means it’s all gone to waste. But perhaps you should, because CubeCart is certainly full of nasty surprises. Read on…

I had the dubious “pleasure” of supporting and improving a CubeCart 4.2.0 shop. Having had to implement and skin a ton of buggy or not totally adequate plugins that were purchased, I discovered another negative side to all of this: the CubeCart modders tend to base64 encode all of their work. That’s right, you can get shipping by country and you can pay for it but by god, you can’t change it. Which would have been ok if any of those fine developers actually had a clue and understood how an e-commerce business operates. Experience has taught me one thing: e-commerce is not about what a programmer thinks it should be. You don’t just measure the strength of your platform by the number of product attributes or skins you can apply. It is also about making the shop owner’s job as easy and as organised as possible. What does that mean?

For instance… viewing orders in the main admin window or even as popups in a new window is cumbersome and difficult, especially if you get more than, er… 1 a day. Imagine 40 orders waiting to be shipped, your couriers arriving in 30 mins, your phone ringing off the hook, new orders coming in, RMAs arriving and replacements needing to go out… and how do you manage this? Certainly not by looking at the CubeCart orders list, unless you have the memory of an elephant and can remember the contents of each order just by glancing at its order ID or customer name. Of course, you can just click into an order and then see the items contained within, go to your warehouse, find the item, pack it, get back to the PC, print the invoice, login to your couriers’ site, print the label, affix label to box and set aside for pickup. 39 more to go, go back to the orders screen and see if you remember where you were…

Never mind, all of this can be fixed eventually by a consultant like myself, a process that makes the original prudent investment of $179.95 seem like a very pleasant memory indeed! But the problems do not stop there… For example, after a while of the store being operational, the shop owners are bound to notice certain… trends. Like, why do people phone in and report that somewhere between clicking add to basket and going to the checkout screen, their basket gets lost (something also reported on the forums). Intermittently. As you fix this sessions PHP error, you think… we’re ready for the big time, everything is sorted… Wrong again. CubeCart has more surprises in-store for you (great pun, if I say so myself). As the months go by, the time it takes to display your product pages starts to increase. A lot…

Now, I use 24mb ADSL by be.there in London – pretty fast. I used to get the whole page (with all images) in something like 0.8 – 1.3 secs. Imagine my surprise when the delay between a link click and the start of page rendition (the ‘waiting for response’ phase) was over 4 seconds on its own! With nothing else changed to the best of my knowledge, my first reaction was to look at the MySQL database. I scanned it, profiled it, analysed it – to no avail. Deleted various search histories and session data entries, added some indexes… still nothing. Upgraded the MySQL version to 5.0.58 – no change (other than Fedora Core’s YUM removing Zend and MySQL support out of php.ini and Plesk refusing to work and pop3 passwords being rejected). Once I managed to fix all of that and turned my attention back to the website, the evidence still sided with my original suspicion: a database-related delay. It was time to do some query profiling… I found and modified the CubeCart DB class and added a timer and an echo wrapper around all mysql_query() calls. And there it was, waiting for me at the bottom of the page:

SELECT DISTINCT O.productId, I.name, I.image, I.price, I.sale_price FROM CubeCart_order_inv AS O, CubeCart_inventory AS I
WHERE I.productId = O.productId AND O.cart_order_id IN
(SELECT DISTINCT cart_order_id FROM CubeCart_order_inv WHERE productId = 274)
AND O.productId <> 274 LIMIT 3

3 rows in set (3.74 sec). Nice. Nicer still… the site does not even use the ‘customers who bought this also bought’ feature! What’s happened? Well, over the last 6 months, the orders and orders inventory tables have grown to something over 2500 records – which has caused this nasty nested query that returns 3 measly product IDs (that may not even be relevant for an up-sell here) to bomb the server.
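
A timing wrapper of the kind described above can be sketched along these lines. This is not the original modification to the CubeCart DB class: `timed_query()` is a hypothetical name, and it takes any callable that runs the query so it is not tied to mysql_query(). The `$fakeDb` closure is a stand-in that only sleeps, so the example runs without a database.

```php
<?php
// Hypothetical sketch: time any query runner and log slow statements.
// $run is whatever executes the SQL (a DB class method, PDO wrapper, etc.).
function timed_query(callable $run, string $sql, array &$log, float $slowAfter = 0.5)
{
    $start = microtime(true);
    $result = $run($sql);
    $elapsed = microtime(true) - $start;

    // Only record statements that took at least $slowAfter seconds.
    if ($elapsed >= $slowAfter) {
        $log[] = sprintf('%.2f sec: %s', $elapsed, $sql);
    }
    return $result;
}

$log = [];

// Stand-in for a real query: sleeps for 20 ms and returns a fake row count.
$fakeDb = function (string $sql): int {
    usleep(20000);
    return 3;
};

$rows = timed_query($fakeDb, 'SELECT 1', $log, 0.01);
echo $rows, " rows\n";
echo implode("\n", $log), "\n"; // dump the slow-query log at the bottom of the page
```

Collecting the entries in an array and printing them once at the end keeps the page markup intact while still showing which statement is eating the time.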

This begs the question: haven’t the CubeCart development crew done any testing on their system? Or hasn’t anyone that’s used CubeCart before gotten to 2000+ orders? Small wonder…

How many more nasty surprises are there waiting to be discovered in CubeCart, we’ll never know. Just a word of advice: do not pick this for your platform, even if given half a choice!


Sep 15th 2008 × Ecommerce platforms: BossCart JV

I am pushing on in my quest to find a perfect free e-commerce platform that has great SEO, ease of product, brand, stock and category management and industrial strength order management. Not an easy task, to be sure… The latest package I’d like to talk about today is BossCart JV by BossCart.

What I had presumed to be a lightweight version of their bespoke commercial product is actually quite different. For starters, it uses mootools (1.11) and the “commercial” cart is under jQuery but let’s not draw any conclusions on the strength of the frameworks based upon this just yet, heh :) . It also seems to lack one very important facet of trading: brands/manufacturers (this has been left for the ‘full’ version once again). Other than that, first impressions as a user: it appears to have been coded with the spirit of Web 2.0 in mind – search tags, lightbox imaging, product ratings, the chunky yet slick looks…

Since this IS fragged.org, we write about what we don’t like first and assume the rest is fine or dismiss it as boring… With that in mind, let’s pop the bonnet and see if this baby can organically give good SERPs. n.b. you can always fix the css/theme so this is very superficial

I picked http://jv-cart.bosscart.com/golf-iron-supplier as the page to dissect first.

I always wondered what JV stood for and some light got shed here:

<img src="http://jv-cart.bosscart.com/components/com_virtuemart/shop_image/product/0a2d9237bb75f96c09db6f5d91de7287.jpg"  width="135" border="0" alt="Titleist Pro V1 Premium Refinished Golf Balls" style="float:left"/>

Seems that, at least in part, jv-bosscart is based on Joomla / Virtuemart. This explains a lot – mootools 1.11 and not 1.2, for starters (Joomla have yet to move over).

The URL is SEO friendly enough, although the title does not appear to be even remotely relevant. We find this:

<meta name="description" content="" />
<meta name="keywords" content="sample Boss Cart JV golf shop,http://jv-cart.bosscart.com" />

It may seem like a little thing to simply go and set it manually – but the last e-commerce site I worked on had 1200 products and 50+ categories. Today’s e-commerce software needs to try and back the site owners up and fill in wherever possible…
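
One way software could ‘fill in’ is to derive a fallback meta description from the product description when none has been entered. The sketch below is an assumption about how that might be done, not how BossCart JV or VirtueMart work; `fallback_meta_description()` and the 155-character limit are illustrative choices.

```php
<?php
// Hypothetical sketch: build a meta description from product copy.
// Uses byte-based string functions; multibyte text would need the mb_* equivalents.
function fallback_meta_description(string $html, int $maxLength = 155): string
{
    // Strip markup, decode entities and collapse runs of whitespace.
    $text = html_entity_decode(strip_tags($html), ENT_QUOTES, 'UTF-8');
    $text = trim(preg_replace('/\s+/', ' ', $text));

    if (strlen($text) <= $maxLength) {
        return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    }

    // Cut at the last space before the limit so words stay whole.
    $cut = substr($text, 0, $maxLength);
    $lastSpace = strrpos($cut, ' ');
    if ($lastSpace !== false) {
        $cut = substr($cut, 0, $lastSpace);
    }
    return htmlspecialchars(rtrim($cut, ' ,;:.'), ENT_QUOTES, 'UTF-8') . '...';
}

$description = '<p>Titleist Pro V1 Premium   Refinished Golf Balls.</p>';
printf('<meta name="description" content="%s" />' . "\n", fallback_meta_description($description));
```

A hand-written description should always win; the fallback only needs to be better than the empty `content=""` found on the sample page.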

The default theme seems to have a topbar and a side menu element floated to the left, with the body of the page that follows. Obvious disadvantage of that is that repetitive text (categories, site pages, header bits) will always precede the real important page headings and body texts that will determine the page relevance.

The page also appears to be using a mix of css-driven-design and tables, as well as plenty of inline css.

Also stuff that you can clean up as you customise it. Links not having title attributes is yet another hindrance that will need fixing.

The first real page-relevant bit of code in the source is at line 228 (a bit too far down for my tastes). The most important dynamic content, keywords, heading tags, descriptions – they should be as close to the top of the page as possible and without too much markup. This may be a semantic view but it works.

The question here is, are there good enough framework templates to get you started re-skinning the default one?

How effective is the organic SEO out of the box? I decided to search on google for the products in the test shop, obviously w/o any proactive marketing. As examples: Bay Hill Plasma irons comes as no. 1, Bay Hill Irons as the 4th site down. This may be due to the number of inbound links to the BossCart site but it’s also no fluke. The product descriptions are explicit and help matters also. Titleist Golf Balls also comes up on page 2.

The provisional verdict for BossCart JV: There are a lot of areas that need work – messages here and there, visual glitches, optimisations and other bits. We’ve not looked at the admin interface yet, but it has potential. I would say, as an out-of-the-box free solution, it probably does more for a startup business than OS Commerce or CubeCart.

I will post updates here as soon as I find more about it.