Open-source software is frequently mentioned as a choice for companies looking to create their own enterprise search applications. But making the best use of open source calls for knowing what to expect.
Open-source as a software delivery concept originated with the “open” Unix operating system, originally developed by AT&T Bell Laboratories. Many Unix “flavors” were produced by commercial enterprises and educational institutions over forty years. Hardware vendors adopted it as an alternative to their proprietary operating systems, encouraged by a government mandate in the 1980s that government computational procurements would standardize on Unix. That edict was not sustained, but many hardware and software companies shifted direction or launched new Unix initiatives based on that mandate.
We can learn a lesson from the case of Linux, the popular Unix alternative to Windows for PCs, which is maintained by a dedicated open-source community pioneered by its namesake, Linus Torvalds. The community is necessary for enriching this OS.
Any organization creating products or services based on Linux must provide assurances that its products will continue to work in customer environments that are almost always heterogeneous. Delivering an application based on a version of Linux that works only at a single point-in-time does not last in the real world. Thus, these “Linux shops” need to continue with modifications, improvements, and sometimes mundane changes just to stay in business. That means having internal or external experts to continue providing high-level technical support.
A recent InformationWeek article referred to the Department of Defense (DoD) embracing open-source software; one case mentioned was the open-source Drupal application for content management at the WhiteHouse.gov site. The article makes excellent points about misconceptions relating to open-source that also apply to enterprise search. To this we add what must be known before starting down the open-source path:
1) Open-source software is, by definition, free to be downloaded, but that does not mean it is cost-free. Experts will be required to install, implement, tune, and administer the software. It will not arrive with a “quick install” that brings up a workable interface for the administrator or end-users. Those must be designed, developed, tested, and supported.
2) Just as with commercial search software applications, there are ongoing costs. Even when you have finished tailoring an application to enterprise needs, it is never “done.” The surrounding environment will change (e.g., operating systems) and users will demand enhancements and more features and functions.
3) Content that workers want to search will grow, and types of content will change. Scaling the application and adding new types of content will require changes to indexing “instructions” and how search results are displayed.
4) As noted in the cited article, legal requirements and licensing must be researched and understood by the enterprise IT and technical staff that will support, enhance, and modify open-source code. There are community standards to follow, and various packages come with “use” requirements.
5) There are a number of open-source search engines, but by far the most widely deployed is Apache Lucene. It is the basis for many commercial products that include a search function. Those commercial ventures tend to be active members of the open-source community, which continues to improve Lucene in ways that benefit all users.
When deciding to use Lucene or any other open-source search engine, the enterprise needs a “deep bench” of gurus. The best option is often a service provider that fills that role, so make sure there is a large community of experts that can support your open-source choice. Lucene has a large user community and a growing body of third-party developers to help you implement and sustain the software.
Free software does have costs that must be factored in to make “free” a successful business choice, so adopt with eyes wide open.
— Lynda W. Moulton consults at LWM Technology Services on knowledge management strategies for enterprises.
This post at Lucid Imagination, Lucene/Solr and search technology gurus, makes some excellent points. I thought we needed an update to my earlier post on open source: http://www.lucidimagination.com/blog/2009/12/16/open-source-the-industry-standard/ Lynda
To be clear, I am a total agnostic when it comes to the use of open source vs. commercial solutions. Any reading of my previous comments that tries to portray me as "insinuating" otherwise is not correct. I have recommended and used open source software and endorse it under many circumstances. As with any business decision, there are always special considerations of which the decider should be aware. The growth of the open source developer community, as I stated, is a very positive development - great for the marketplace and adopters.
Okay, now that I see the link that you referenced (sorry I know it is the same one as in the original article) I have to point out to you that the Open Group has nothing to do with Open Source. They are absolutely distinct. The Open Group formed to create open standards around Unix implementations.
You'll notice in the chronology that there is not a single reference to Linux anywhere on the page you referenced (and rightly so, because it has nothing to do with Unix).
The point of my original comment was to point out these misunderstandings and they still stand.
Open Systems, the Open Group, and X/Open have nothing to do with Open Source or the software development paradigm. Unix was not developed using an Open Source methodology. Source code was licensed from AT&T to other vendors who then created their own variations. All of this was done in proprietary frameworks.
If you want to understand the Open Source paradigm and development strategy I still suggest that you watch RevolutionOS and go to the FSF web site.
The fact that the DoD has finally told its users that they should consider it affirms something that we in the software development industry have known for the last several years.
Even after further reading, I'm trying to understand your position. You seem to insinuate that Open Source software is maintained by some unfunded, nebulous "community" and fail to point out that IBM, Sun, Oracle, Novell and other major software and hardware vendors are large supporters of many Open Source initiatives.
It also seems that you blame the lack of Unix adoption on "open source" principles (which hopefully we've put to rest now). Unix didn't "die" because of "open source" principles; it "died" because it was a multi-user operating system with no graphical user interface and suddenly computers were everywhere on people's desks.
Unix (despite its death) actually continues to thrive in commercial data centers, although it is being replaced by Linux to some degree.
It just seems that you are casting Fear Uncertainty and Doom around Open Source while using examples that don't pertain to Open Source at all.
The point is still that the Open Source paradigm is becoming prominent and any commercial vendor who does not learn this lesson soon will be doomed to decreasing market share, where there is a comparable Open Source tool set.
I'm not really able to read all of your comment because some formatting stuff got posted in as well. I meant no disrespect and if the original article alluded to the evolutionary roots of Linux as Unix, I would not argue. However to use the word "open" (even with the quotes) to describe Unix is not even a close approximation. I assure you, I know from personal experience, having cut my teeth on AT&T SYS V (and its predecessors) back in the dinosaur age.
I've been using Open Source since the early 90's and GNU long before that. I'm very sensitive to its proper characterization, because like it or not it is changing the entire paradigm of our industry.
I look forward to reading the link that you reference if the comment gets cleaned up. Thanks
I now see that my original post was edited in a way that may have led to your misunderstanding of my intent. The original statement was: Open Source as a software delivery concept originated with the UNIX operating system (OS) “open system.” After UNIX, originally developed by AT&T Bell Laboratories, many “flavors” were produced by commercial enterprises and educational institutions over forty-years.
I approved the copy and should have noticed the change.
The link is provided so that readers can see for themselves the way in which the AT&T Unix case influenced the concept of "open systems" and what evolved to be known as "open source."
While it is true that our current state of "use" under GNU is different, there is an evolution and history there that relates to the "open system" concept. I am not a developer, nor an historian, but do stand by the rest of the considerations put forth in the post.
You and I can read and interpret the history differently but having lived through the "Unix" wars when I was in the software development business, the link seems to follow what I recall pretty vividly. To a technical guru, I am probably being overly simplistic, but the points I was making related to business choices, not technical ones.
I'm sorry, but you are misinformed. Unix was not the start of Open Source. Unix was originally developed by AT&T and was sold commercially. As a matter of fact, the entire Novell/SCO controversy would have never happened if Unix were open source.
Unix did have an influence on the Open Source concept though. It inspired Richard Stallman to create the Free Software Foundation and GNU, and to campaign for years against the proprietary software model (Richard even claims a part of Linux, but that is a different subject altogether).
Neither is Linux a form of Unix. It may have many of the same command names, but the kernel was developed from a smaller, less well-known hobby OS called Minix. This is an extremely important distinction and is well understood. To call Linux a Unix variation is like saying that OS X is a z/OS variation.
A great movie on the whole subject is RevolutionOS.
Given these misunderstandings of the whole concept, I have to take your caveats under advisement with a large grain of salt.
We have been using OSS for years, saving money and finding huge advantages technically as well.
Your point about the heterogeneous nature of normal work environments is something that we understand well. OSS systems work and play well in a heterogeneous environment. Currently we are connecting to mainframes using 3270 emulation software, running file and print on Novell OES, managing secure file transfers, parsing data, running three-tier SOA applications, developing Java software, doing continuous test and build and a variety of other things, all with Open Source.
Lucene is a preeminent search tool and is widely used throughout the industry; we use it as well. It is integrated into a commercial/Open Source offering sold by Novell called Teaming and Conferencing and available via Open Source (without support and some of the nice features) as Kablink. The nice thing about Lucene in this environment is that it catalogs a variety of documents the minute a person posts them into either their personal or team workspace. Indexing is automatic and seamless.
I think your approach is excellent; that is to "buy commercial" or "quick and dirty development" to get started and then build a more adaptable and customized solution.
To that I would add this recommendation: enterprises without previous experience using an enterprise search solution should begin with a low-cost, out-of-the-box search project just to "get their feet wet." Using any product will teach them more about search than any course, consultant or book. Without experience implementing a search project, search product selection committees or development teams have little basis for knowing which functions are good, unworkable, or highly productive, and therefore what to specify for the next choice or for their own customized solution.
Thank you David and Lynda both for thoughtful responses from different perspectives. I've always claimed to be agnostic about build vs. buy vs. rent (SaaS), with the decision to be based on all the factors mentioned. But I have to admit that in most critical applications, as David says, build, especially on open source foundations, is most likely to satisfy. And I've used a commercial app to provide instant (or quick) service while building the better solution, and built a quick solution to prove the point that it's worth investing in the ready-made one.