Blog

CogniBlog

Thoughts from the Cognidox world
Tags >> Open Source Software

The Central Office of Information (COI) is the UK Government's centre of excellence for marketing and communications. They have just published a report on the costs, usability and quality of selected UK Government websites in 2009-10.

It's a detailed report and the data is available to download. It shows how the UK Government spent £94 million on website development and running costs plus £32 million on web staff in 2009 - 2010. By looking at the analytics it's possible to correlate the costs of building these sites with the number of visitors. One headline statistic was that the UK Trade and Investment website averaged 28,000 users per month but cost over £4 million to build - so each site visitor cost £11.78. 

As an exercise, I took a deeper look at the websites in the COI report to see what technology they're using. These are leading central government websites so it's an interesting sample.

I wanted to characterise the sites from the information available on the Internet - were they built using Microsoft technology as evidenced by IIS web server, ASP.NET framework and Windows Server? Or were they Open Source based, with Linux OS and Apache web server for example?

In the end I extracted data for 38 of the websites, and found 25 were using Microsoft and 13 were Open Source. The majority of the Microsoft sites were running Windows Server 2003, with one instance each of 2000 and 2008.

That in itself bucks a global trend, in that over 60% of all websites are based on Apache whereas IIS 5, 6 and 7 account for 1%, 20% and 3% respectively. Microsoft and its partners have clearly had a strong influence over UK Government procurement decisions.

How much evidence was there of Content Management Systems (CMS) usage? Very little. There was a small pocket of Vignette CMS (now owned by Open Text), some Drupal and Joomla, and one instance of Microsoft SharePoint 2007. There was a weak correlation between the age of the site and use of CMS - it's more common in recently published web sites.

Were the Microsoft based websites more expensive than the Open Source based ones? I ran a non-parametric statistical test (Mann-Whitney) on the two samples and the short answer is that there's no significant difference. This suggests that the overall costs of building these sites dwarfed any difference in licensing costs.
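For anyone who wants to repeat that kind of comparison, here is a minimal sketch of a Mann-Whitney U test in plain Python. The cost figures are invented for illustration and are not taken from the COI report. The p-value uses the normal approximation without tie or continuity corrections, which is rough for samples this small; a statistics package would be the better choice for real analysis.

```python
from math import erfc, sqrt


def mann_whitney_u(sample_a, sample_b):
    """Return (U, two-sided p-value) for two independent samples.

    Uses average ranks for tied values and a plain normal
    approximation for the p-value (no tie or continuity correction).
    """
    combined = sorted(
        [(value, "a") for value in sample_a] + [(value, "b") for value in sample_b]
    )
    n = len(combined)
    rank_sum_a = 0.0
    i = 0
    while i < n:
        # Find the run of equal values starting at position i.
        j = i
        while j < n and combined[j][0] == combined[i][0]:
            j += 1
        # Positions i..j-1 hold ranks i+1..j; tied values share the average.
        average_rank = (i + 1 + j) / 2.0
        for k in range(i, j):
            if combined[k][1] == "a":
                rank_sum_a += average_rank
        i = j

    n_a, n_b = len(sample_a), len(sample_b)
    u_a = rank_sum_a - n_a * (n_a + 1) / 2.0
    u_b = n_a * n_b - u_a
    u = min(u_a, u_b)

    mean_u = n_a * n_b / 2.0
    sd_u = sqrt(n_a * n_b * (n_a + n_b + 1) / 12.0)
    z = (u - mean_u) / sd_u
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u, p_two_sided


# Hypothetical build costs in GBP millions (made up, NOT from the report).
microsoft_costs = [1.2, 3.4, 0.8, 2.9, 5.1]
open_source_costs = [0.9, 2.2, 4.0, 1.5]

u_stat, p_value = mann_whitney_u(microsoft_costs, open_source_costs)
print(f"U = {u_stat}, p = {p_value:.3f}")
```

A large p-value, as with these made-up numbers, only says the test could not tell the two groups apart; it does not prove the costs are the same.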

Do these sites seem like good value for the UK taxpayer? At an average non-staff cost of £2.5 million per website, absolutely not. I read the report many times to understand how HM Revenue and Customs could spend £35 million on www.businesslink.gov.uk. I just can't figure out how.

What's the lesson? The licensing model of the underlying technology isn't a significant factor in determining website costs. Free & Open Source Software won't matter when a Consultancy or Outsourcing company loads up a contract with tasks requiring many person weeks of expensive billable time.

If there isn't a FOSS advantage, there's still clearly a commercial off-the-shelf (COTS) advantage. One of the main purposes of these sites (apart from serving static information pages) is to provide a portal for file download.

Commercial open source software packages such as CogniDox allow you to do this in a completely secure and flexible manner. It costs thousands of pounds, not millions, and it delivers those features out of the box. And it has competitors such as Alfresco and Nuxeo that can also do the same.


IT Reseller Magazine (http://www.itrportal.com/) is today reporting a survey by Networked Planet (an Enterprise Search software company based in Oxford, UK) that shows 52% of office employees admit they have saved a document to the Company intranet or network (shared drives) and never been able to find it again. Around 39% of them say it's because no-one told them where to file the document, as there were no company guidelines.

The solution of course is better Enterprise Search. It's interesting that Networked Planet offers an extension on their core product specifically for Microsoft Office SharePoint Server (MOSS), so the implication might be that SharePoint users are also vulnerable to the mis-filed document problem. That might be especially true of MOSS07, because SharePoint 2010 has added more tagging and metadata features. It's also maybe a reflection of the fact that the Microsoft FAST search server option is an extra $14K in licensing costs.

But I would hazard a guess that it has less to do with the inherent features of SharePoint and more to do with the way that IT departments react to business user demands and use SharePoint as a quick fix solution for document sharing and departmental Intranets i.e. barely one step up from the shared network drive and WebDAV. It isn't really an alternative to taking a short amount of time (1-2 days) for the business to do this properly. Call it "information architecture" or "common sense business planning" as you prefer, but it's still time well spent. And something that deserves a champion at the CxO level.

Enterprise Search software helps, but the other characteristics of a document repository - good categorization, multiple category allocation, metadata tagging, content file types and a company-wide view rather than departmental Intranets - are equally important. Imagine the dilemma of a new starter in your company - how do they know what to search for? If they can navigate instead through a logical hierarchy it means they can self-train in company knowledge (and you don't need to sit by their side for a week).

Networked Planet has a nice suite of products for the .NET / Microsoft Server market. They're far cheaper than other Enterprise Search software solutions available. But the cost (around $17K) is still more than the free Open Source products such as Apache Lucene/Solr or Xapian. The problem with those is that they move the costs from licensing to system integration - you have to know what you are doing. This is why we get a bit excitable when talking about the Solr or Xapian or Swish-e integrated search feature in CogniDox. The value of what we have integrated is more than you pay for the entire CogniDox license.


CMS Wire has posted an article on full text search in web and enterprise content management and concludes that Lucene with Solr is the de facto choice.

The reasons given aren't new - the quality of the search results, the stability of the (open source) software, the utility of the APIs, etc are all cited. It doesn't mention some of the major sites that now use Lucene/Solr - I find that the LinkedIn search experience is particularly enhanced by use of it.

But it vindicates what we've known for some time, when we too decided to adopt Lucene/Solr as our search technology.


Theme of the week for me has been "free software".

It started from a strange source - reading about the judgement in the BSkyB versus EDS law case. Briefly, BSkyB hired EDS in 2000 to build a customer relationship management (CRM) system for two call centres in Scotland. The project was worth £48m, and EDS gave assurances that the system could go live within nine months and be completed within 18 months. However, by 2002 BSkyB had taken the development back in-house (it then spent  £265m on the project) and in 2004 sued for damages. It took years but they won - EDS are guilty of "fraudulent misrepresentation giving rise to damages" estimated at £200m. As you'd expect there were arguments from both sides. EDS argued that BSkyB had been vague about what it wanted. Legal commentators seem to agree that it finally came down to the (lack of) credibility of the EDS principal witness, and that's why they lost their case.

But what seems not to bother anyone is the overall cost of the project. I'm guessing there was more to this CRM than the average Salesforce.com project, and I used to work on call distribution systems in my Telecoms days so I know what these can cost, but a budget of £48 million? And then the cost of £265 million to do it in-house? Amazingly high.

Sometimes when you see coverage of open source technology by certain websites or in analyst reports there's a nudge-nudge suggestion that this isn't really proper grown-up technology. Free software - you get what you pay for - is their implication.

But if this is a glimpse into the world of proprietary Enterprise software I think we need to hope that things have changed since 2002.

It puts the arguments between definitions of freeware, free software and open source software into perspective. The arguments between advocates of "free beer" and "free as in freedom" can be fascinating, but pragmatism says free/open source software is really more about not being ripped off. Not being ripped off by the project costs; by the false promises of a vendor's sales team; by the inflexible terms of a license agreement; and by the inability to change the software if you want.

There's a common thread between community projects, companies who offer free or freemium product, commercial open source vendors and free software projects. It is transparency, good value-for-money and respect for the long-term relationship with a customer.

Government is known for the same car-crash examples of ICT projects as the above. The UK government came out this week waving a large open source banner, but achievements included the example that "over 25% of secondary schools use the Linux operating system on at least one computer". Not good enough.


Open Source CMS Market Share Report

Posted by: paul

The 2009 edition of the Open Source CMS Market Share Report was released today by the water&stone digital agency.  A free copy of the survey can be downloaded from CMSwire at http://www.cmswire.com/downloads/cms-market-share/

Ric Shreves, who has led the reporting on this and previous editions in 2007 and 2008, has developed a very thorough and clever method for analysing the market share of an open source product or project. You can't just use company financial data because the software is downloaded for free, so he has developed an arsenal of metrics ranging from number of downloads to the level of community activity around the product. He uses these to order the products in terms of their usage and mindshare.

A CMS (it stands for Content Management System) is software used for creating Enterprise web sites and for uploading content such as news, blogs, documents; or any unstructured information content, without requiring technical knowledge of web site design. An analogy is sometimes made with the introduction of the Caxton printing press - without one you couldn't successfully enter the new printing industry. Today, you can't keep up with the creation and update of web sites without the modern equivalent - the CMS.

At the risk of offending my web designer colleagues, it can sometimes seem that every web design company out there offers its own CMS. But the ones in this report are much more complex in nature, with a wide array of features.

The highlights are that Joomla!, Drupal and Wordpress are the most commonly used (in that order); and that there is a gulf of usage between these and the rest. Although Joomla! was the most used, it only came 5th in response to the question of how much people approved of it.

I think one problem with the report is that there is a difference in the type of company that chooses to use Wordpress rather than Joomla! or Drupal. Wordpress is easy to install and use, and it is therefore excellent for a one-person company that needs to get started quickly. Joomla! and Drupal typically need server installation skills but they do offer myriad features and add-on extensions, either free or low cost. This blog for example is produced by an extension called MyBlog from Azrul.com. The website is built using Joomla! 1.5 and the blog creation is a component like any other.

We also use Joomla! for the customer portal sites we create - CogniDox provides content from the document repository and we can add other components, such as a customer trouble ticketing system, as extensions.

I'm pleased to see Joomla! do so well in this report. It hasn't been getting as much press as Drupal recently, mostly because there are now companies offering Drupal support, and they have (quite rightly) promoted the tool along with their capabilities. There is still room for improvement on that front, but Joomla! is an excellent CMS.


There was an interesting survey published just over a week ago - Axios Systems, an IT services management software vendor, commissioned interviews with 1500 IT executives in North America, Europe and Asia.

The headline was that 57% of interviewees felt their IT systems, processes and services were not delivering the value expected of them by their companies.

I was taken with two other points - 67% said that they still had no way of directly measuring the business value of their IT systems in real time, and 63% said that cost reduction would be their #1 driver over the next 12 months. 

"Let's go change what we cannot measure" would seem to be the unfortunate implication. 

However, I'm inclined to see this as an insight into the current state of mind of Enterprise IT groups. Many "go achieve more with a reduced budget" memos have been written in 2009, and the effect on morale must be significant. One problem of measuring quantifiable benefits of enterprise software is that there will be dozens (if not hundreds) of individual applications in use by a company at a given time, and knowing where value was added to the mountains of data is never an easy matter. Subjectively, one knows when an IT application has made something faster, better, easier; but it isn't always easy to see it on the top or bottom line. In this climate, it may be a better idea to put the metrics on hold and follow companies that are exploring the cost reduction promise of open source, cloud computing and virtualization technologies.

A good time to consider the cost of switching from an expensive proprietary solution is when the annual support and maintenance invoice falls due. This is often higher than the Year 1 cost of a commercial open source alternative (and support costs in subsequent years for the alternative will be substantially lower). 

I read a case study recently about a Legal firm moving from an expensive  traditional client/server document management system to open source - a quote that stood out was that it took the project team months to get management approval for the switch because "it all seemed too good to be true". Certainly good enough to overcome the prohibitive costs often involved in such a change.

But it would be a mistake to focus only on cost - the 'fit' of services offered, published APIs, plug-in modularity, feature extensibility when required, and avoidance of proprietary data format lock-in are equally salient. Open data is just as important as open source if you ever decide to change your mind about an application or vendor.


It's been one of those months where two or three thoughts or threads connect together.

First, we've been exploring tighter integration between CogniDox and Microsoft Windows Office applications. This led us to sign up for the BizSpark technology seeding program (many thanks to BLN for sponsoring our application) and to work with Microsoft technologies; if not for the first time then at least more than usual for us. Completely unconnected, of course, but in the same period Microsoft have started to release software under open source license, and have launched other programs such as WebsiteSpark to rival the dominance of the LAMP stack for web development.

Second, there's been (yet another) flurry of debate concerning the difference between free software and open source software. Eric Barroca, the CEO at one of our document management competitors (Nuxeo) wrote in his blog that he couldn't see a real difference between a free demo from a proprietary software company and the community edition of an open source product. The big claim of open source software that it generates free sales leads is wrong - it is free trials that generate free leads. Matt Asay, VP Business Development at another competitor (Alfresco) says that the open source software sales cycle averages 60-90 days, whereas traditional enterprise software averages 6 to 9 months.

Third, and continuing with the theme of software sales cycles, there was a disturbing expose about some of the unpleasant tricks that are used to sell enterprise software licenses. Read the article for details, but we are talking about demo vapourware, underbids followed by expensive add-ons, lock-in through high exit cost, and enforced software upgrades amongst other tricks. I have bought enterprise software in the past, and sometimes it does feel like a war with the vendors.

I read as many news articles about open source business models and trends in the enterprise software business as anyone, but am bemused when some seem to ignore common sense. Companies that have already made a major investment in Microsoft technology are going to keep looking for ways to extract value from that investment. The vast majority of companies adopting enterprise software (such as a document management system) will consider total cost of ownership, including the non value-add costs of integration and trouble-shooting, at least as much as the license model. A commercial open source vendor with a sales team running to dozens of people is likely to behave in much the same way as a traditional proprietary software vendor, unless they have also rejected the tradition of sales targets, quotas and territories. Which they haven't...

That much seems quite obvious, yet curiously at odds with the earnest, evangelical tone of many blogs on the subject of open source enterprise software. The key differentiator for open source vendors in enterprise software is that they devote more energy to building their revenue from an ongoing relationship with their customers. It's all about mutual respect and transparency.


Open Source in UK Government

Posted by: paul

IDC’s latest Worldwide Open Source Software 2009-2013 Forecast reports that global open source software (OSS) revenue was $2.9 billion in 2008, and is expected to grow by 34% in 2009 to $3.9 billion. IDC predict global revenue .

This is higher than IDC previously predicted and they attribute their change of mind to the current recession, the growing acceptance of OSS and inclusion of revenue from hybrid products from the larger IT vendors and the Cloud / SaaS vendors.

However, OSS still only represents around 2-3% of the $138 billion market for proprietary software.

I find it intriguing that they see such accelerated growth for proprietary software from now to 2013. It seems to rely on an assumption that the recession will end and we’ll go back to buying software ‘the way we used to’. It doesn’t seem to tally with their other conclusions about growing OSS acceptance and the increase in hybrid products that rely on embedded OSS. Also, how many consumer or enterprise markets have seen a price fall but then recover to former levels? That may happen for commodities, but software-as-a-commodity?

It’s more likely that the days of the traditional enterprise software majors are on the decline. Companies will get more and more comfortable with the idea of paying for support and services rather than per-seat licenses. Will software buyers ever again opt for steep up-front costs and little or no recovery plan in the event of failure when they discover that they can have a pay-as-you-go alternative?

In which case, we will see a far more rapid convergence of the trend lines than is predicted by IDC.

If the recession is pushing the private sector to re-consider procurement patterns, then the public sector and Government should follow suit. It’s our money and there’s not a great deal of difference between spending tax money on a Banker’s bonus or on yet another failed / delayed IT project.

There’s a new community website / networking group called UKGovOSS that apparently feels the same way. They have published a report which quotes some interesting facts. For example, reported ICT spending by UK local authorities is expected to reach a record level of £3.2 billion of expenditure in 2008/09. The survey found what is commonly seen in the private sector - open source is used extensively for web servers, databases and web publishing tools, but the desktop and end-user applications are mainly proprietary software (Microsoft).

The drivers towards OSS are also familiar - lower cost, freedom from dependency on particular suppliers, and the functionality of the software itself.

CogniDox isn’t used (yet) in the public sector but this is a very worthwhile group and we are happy to join. It will be interesting to see whether the depth of open source talent and skills available in the UK becomes evident to those in power.

It’s already been pointed out by others that the UK Government site on open source at http://www.cabinetoffice.gov.uk/government_it/open_source.aspx is based on Microsoft's ASP.NET web application framework, so there is clearly a way to go yet.

 


This week we made our quarterly release of software, which we labeled version 8.0 (as opposed to 7.x) because (a) there were substantial changes to the internal architecture to support search engine plugins, and (b) it has a re-worked configuration system which makes life easier for CogniDox sysadmins.

The headline feature was therefore the fact that user companies can select between using the 'classic' Swish-e or the Apache Solr search engine. We've blogged in the past about Enterprise Search, so I'll just summarise to say that Solr offers features such as frequent incremental indexing, federated search, ‘more like this’ searches and spelling suggestions when no results are found. All of which combine into a single benefit - one search has a far greater probability of giving you useful results.
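To make those Solr features a little more concrete, here is a minimal sketch of how a search request that asks for 'more like this' results and spelling suggestions might be assembled. It is not the CogniDox integration code. The base URL and the field names `title` and `content` are hypothetical, and the sketch assumes the MoreLikeThis and SpellCheck components have been enabled for the request handler in the Solr configuration.

```python
from urllib.parse import urlencode


def build_solr_query_url(base_url, terms, mlt_fields=("title", "content"), rows=10):
    """Build a Solr select URL requesting similar documents and spelling hints.

    base_url and the field names are illustrative assumptions; the
    parameter names (q, rows, wt, mlt, mlt.fl, spellcheck,
    spellcheck.collate) are standard Solr request parameters.
    """
    params = [
        ("q", terms),                      # the user's search terms
        ("rows", rows),                    # number of results to return
        ("wt", "json"),                    # response format
        ("mlt", "true"),                   # ask for 'more like this' documents
        ("mlt.fl", ",".join(mlt_fields)),  # fields used to judge similarity
        ("spellcheck", "true"),            # ask for spelling suggestions
        ("spellcheck.collate", "true"),    # suggest a corrected whole query
    ]
    return base_url.rstrip("/") + "/select?" + urlencode(params)


# Hypothetical Solr location, for illustration only.
url = build_solr_query_url("http://localhost:8983/solr", "document review")
print(url)
```

The point is that one request can carry the main query plus the extras, which is what lets a single search return useful results more often.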

We also support a third search plugin in Flax Search Services, which is based on Xapian. FSS is still in-development, but it will be supported when it is ready for showtime within months. As we've said before, this is promising technology. Solr on the other hand has the power of the Apache brand. It shows how far F/OSS has come when I can say 'brand' in this context :-)

I've also blogged before about another feature - better document import - so I'll also gloss over that apart from saying that it feels good to rip out a feature and start again when it brings good usability improvements.

For ages now we've stopped using Microsoft Office Project (too expensive) and have been looking at open source alternatives. We've used GanttProject and OpenProj for task scheduling and resource levelling. It's fair to say we do a lot less of this now than in the past - less spread out geographically, smaller development groups and an Agile methodology explains why - but it still comes in handy when you need to juggle constraints. In the v8.0 release support is provided for GanttProject, an alternative to Microsoft Project that provides project scheduling and management for Linux, Windows and MacOS X users. CogniDox includes an example PDF converter script for GanttProject .gan files. There isn't much to choose between the two mentioned, but GanttProject can import and export to MS-Project .mpx and .xml file formats whereas OpenProj only has save as .xml.

Another open source project we use is FreeMind, a free open source mind mapping application that allows a user to capture relationships as a diagram that represents ideas arranged around a central concept. CogniDox v8.0 can render FreeMind .mm files into Flash movies as the viewable version of the document. Mind mapping is one of those things that people either love or hate. One of the big 'preparation tasks' that companies face before they introduce a document categorisation system is to decide what are the information categories. We tried doing this using outline tools but it just seemed more natural to use a mind map where you start with "My Company" in the middle and then add departments, products, projects and anything else that made sense as links. So we've experimented with FreeMind as an Information Architect support tool. But it also has merit for animating a process or workflow, and that's what we are doing when we create a Flash movie from the static map. It also allows you to create linked maps, so you could create a procedure that links you to the appropriate content in CogniDox or on other systems.
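As an illustration of the category-planning idea, a FreeMind .mm file is plain XML with nested node elements, so a mind map can be flattened into a list of candidate category paths with a few lines of code. This is a minimal sketch, not part of CogniDox: the sample map is invented, and it assumes every node carries its label in a TEXT attribute (nodes that use rich text content would need extra handling).

```python
import xml.etree.ElementTree as ET

# A tiny invented mind map in FreeMind's XML layout.
SAMPLE_MM = """<map version="0.9.0">
  <node TEXT="My Company">
    <node TEXT="Departments">
      <node TEXT="Engineering"/>
      <node TEXT="Sales"/>
    </node>
    <node TEXT="Products"/>
  </node>
</map>"""


def category_paths(mm_xml, separator=" / "):
    """Return one path string per node, walking the map depth-first."""
    root = ET.fromstring(mm_xml)
    paths = []

    def walk(node, trail):
        # Extend the trail with this node's label and record the full path.
        trail = trail + [node.get("TEXT", "")]
        paths.append(separator.join(trail))
        for child in node.findall("node"):
            walk(child, trail)

    for top_level in root.findall("node"):
        walk(top_level, [])
    return paths


for path in category_paths(SAMPLE_MM):
    print(path)
```

A flat list like this is easy to review with colleagues before anyone commits the hierarchy to a document repository.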

Finally, there was the usual crop of small features that were direct requests from the user base. One was worthy of note: the weekly reminder email to users now includes a useful project collaboration feature – it reports on outstanding document review and approval requests from other users on shared documents. The use case for this is when you are preparing for a product or project gate review - if there are outstanding un-issued or un-approved documents you may decide to cancel the gate review until these are done as you would fail the gate criteria until this is so.

Integrating open source technologies such as Solr, Flax, GanttProject and FreeMind with CogniDox to create a platform of useful tools reminds me of a fundamental difference between proprietary and open source products. 

Proprietary software is nearly always 'streaky' - excellent modules mixed with so-so efforts. In the overall interest of the product's reputation, any concerns about these efforts are frequently suppressed or denied. When you merge open source projects into one integrated whole, you can be as objective as you like about the module you are integrating. If you get it wrong, you jettison it and find a replacement. You don't need to have a 'tiger team', or the issues of managing out under-performing contributors. To paraphrase the CSI TV show, it's all about the code.


We published a paper just two days ago on open source Enterprise Search tools such as Lucene/Solr and Xapian/Flax, which basically asked whether these tools are now comparable for this purpose with the proprietary products from the likes of Autonomy and Microsoft FAST.

It's a very hot topic at the moment, and Matt Asay (VP Business Development, Alfresco) covers Lucene/Solr in particular in his CNET blog.

At the risk of simplification, the answer is more or less "yes", but the integration of these powerful tools can be held back by the fact that companies need to invest time to learn how. Then there is the issue of who do you call when you need support later on?

These problems are being addressed by companies whose business model is to provide those implementation, customization and technical support services. For Lucene/Solr, the leading name is Lucid Imagination based in San Mateo, California. One of their customers is Comcast Interactive Media, a division of the CableTV/ISP giant that specialises in online media. Their view is that Lucene/Solr has 80% of the features of rival proprietary search products (and they didn't need the other 20%).

For Xapian, the equivalent source of services and support is the Flax team. They are local to us in Cambridge (UK) and are very actively developing their Flax Search Service.

In June of this year, In-Q-Tel, the technology arm of the CIA, invested an undisclosed amount in Lucid Imagination. I guess that if ever an enterprise knew a thing about searching massively large datasets, it's the intelligence agencies! Both Lucene/Solr and Xapian/Flax are demonstrating that they are capable of scaling to more than 100 million documents.

The other problem with Enterprise Search engines is that it is hard to see the value until after you have integrated the service and can see the results on an actual document search. We're now in the final testing stages of our next release (v8.0.0) and are able to see that for ourselves. We've developed plug-ins for both search engines, and are building up a rich picture of the strengths of each.


