The Macrosite for News, Analysis and Opinion about the Future of the Internet
Bandying Semantics: A Peek at 'Linked Data'

Written by Rob Salkowitz
9/23/2009 11 comments

Now that more of the big data sites on the Web are getting on board the semantic Web bandwagon and applying machine-readable RDF tags to their published content, the outlines of "Web 3.0" are starting to emerge in the form of the linked data movement.

But will all those semantic smarts really ease the information overload problem, or just find new ways to annoy us?

The first phase in the development of the semantic Web essentially aims at defining the technology standards for making digital content "meaningful" to computers in ways that approximate the intuitions of human beings. That way, software could identify connections between pieces of content that are specifically relevant to the context in which they are being retrieved. No more "your search returned 1,100,095 entries"; no more frustration trying to find online info about that guy who has to introduce himself as "Jon Hamm, but not the famous one."

The research community working on semantic Web technologies has put enough of a framework in place that more and more closed data systems are marking up their content using the richer RDF semantic metadata tags. The utility of semantic search within any given system depends on a critical mass of data being properly tagged, and that takes time. But the process has been ongoing and is now starting to bear fruit.
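
As a rough illustration of what that tagging buys you, the sketch below models RDF-style statements as plain subject-predicate-object triples in Python. The `example.org` URIs, the predicate names, and the `find_people` helper are all invented for this example; a real system would use an RDF library and published vocabularies rather than bare tuples.

```python
# Minimal sketch: RDF-style statements as (subject, predicate, object) triples.
# All URIs and predicate names below are hypothetical placeholders.

TRIPLES = [
    ("http://example.org/person/1", "name", "Jon Hamm"),
    ("http://example.org/person/1", "occupation", "actor"),
    ("http://example.org/person/2", "name", "Jon Hamm"),
    ("http://example.org/person/2", "occupation", "accountant"),
]


def find_people(triples, name, occupation):
    """Return the subjects that carry both the given name and occupation."""
    named = {s for s, p, o in triples if p == "name" and o == name}
    employed = {s for s, p, o in triples if p == "occupation" and o == occupation}
    return sorted(named & employed)


if __name__ == "__main__":
    # A keyword search for "Jon Hamm" would match both people;
    # the tagged query picks out only the one we mean.
    print(find_people(TRIPLES, "Jon Hamm", "accountant"))
```

The point is only that once the name and the occupation are separate, labeled statements, software can tell the two Jon Hamms apart without guessing from surrounding text.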

The linked data movement takes the semantic concept to the next level by automatically creating linkages between data sets, not just between data elements in the same system. Thomson Reuters Calais Release 4.0, for example, promises to deliver unstructured data from all over the Web, including Wikipedia, DBpedia, the Internet Movie Database, and Shopping.com, as relevant results to natural-language queries.
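
To make the cross-data-set idea concrete, here is a minimal Python sketch of two independently maintained data sets joined through a shared identifier. It is not how Calais or any named service works; the `ENCYCLOPEDIA` and `SHOP` records, their field names, the `example.org` URIs, and the `link` function are all assumptions made up for illustration.

```python
# Minimal sketch: two independent data sets joined through a shared URI.
# The URIs, field names, and records are hypothetical placeholders.

ENCYCLOPEDIA = {
    "http://example.org/film/42": {"title": "Example Film", "year": 1999},
}

SHOP = [
    {"about": "http://example.org/film/42", "format": "DVD", "price": 9.99},
    {"about": "http://example.org/film/7", "format": "DVD", "price": 4.99},
]


def link(encyclopedia, shop):
    """Attach shop offers to the encyclopedia entries that share a URI."""
    linked = {}
    for uri, facts in encyclopedia.items():
        offers = [offer for offer in shop if offer["about"] == uri]
        linked[uri] = {**facts, "offers": offers}
    return linked


if __name__ == "__main__":
    for uri, record in link(ENCYCLOPEDIA, SHOP).items():
        print(uri, record)
```

Neither data set needs to know about the other in advance; the link falls out of both using the same identifier for the same thing, which is why agreeing on identifiers matters so much.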

Linked data is becoming the hot new thing partly because it's being talked up by Web honcho Sir Tim Berners-Lee, and is the subject of an increasing number of industry conclaves and academic conferences. It's also cool because it promises to solve a genuine problem caused by the first and second generations of Web technology -- the flood of information that forces people to become the integration point between literally thousands of disconnected systems.

My question is, what new frustrations and unintended consequences will Web 3.0 bring? I'm all for better search results and automating electronic chimp-work, and I haven't used any of the latest-and-greatest tools, but my limited experience with anticipatory, context-sensitive "helper" systems leaves me cold.

Part of the problem is that the results are literally correct -- that is, I can appreciate why the algorithm returned those particular references and linkages, or recommended certain others -- but the association of concepts is just a little bit off in ways I could never describe properly to a machine. The deep connection between concepts is too subjective and experiential to define with metadata, and too complicated to capture in a static taxonomy.

Oddly, the gross imprecision and frustrations of Google no longer bother me. My expectations are set, and I've adjusted my workstyle to fit the idiosyncrasies of the tool. Incremental improvements are welcome. Smart, contextual solutions for very specific problems, such as BingTravel (formerly FareCast, which combs through airline pricing information to provide optimum fares and predictions about whether prices will go up or down), strike me as breakthroughs.

The promises of semantic search are vastly greater. I have no doubt that the information and patterns that can be exposed by mashing up linked semantic data will be profound, interesting, and more relevant by orders of magnitude than what we have now. But I also suspect that the frustrations will be much more insidious: like speaking to someone who understands your literal words (not just your grammar), but not your metaphors or idiomatic usage.

At the margins of natural language search, weird moments and unintentional humor will abound; and yet somehow, I don't think all those PhD scholars would be amused to think that they are creating the computer equivalent of Borat.

Oh well. I guess we have to leave something for them to discuss at the Web 4.0 conferences in 2015.

— Rob Salkowitz is the author of Generation Blend: Managing Across the Technology Age Gap (2008) and co-author of Listening to the Future (2009). His next book is Young World Rising: How Youth, Technology and Entrepreneurship Are Transforming the Global Economy.

aum007
Rank: Cyborg
Wednesday October 28, 2009 7:35:56 AM

Rob,

You and I both know of countless ideas on the Internet that were supposed to change the way we use and access the Internet. But so many of them ended up on the scrap heap of Internet rejects.

Will the semantic Web (in its current form) join that list?

Ultimately, it's the crowds and the money behind it that decide in which direction the Internet goes.

Regards

Ashish.

Mary Jander
Thinkernetter
Tuesday September 29, 2009 9:39:12 AM

Great example, EliteC! Hopefully, we won't be asking ourselves "Where's the beef?" when some of the old/new products hit the market to help us find ourselves online.

EliteC
IQ Crew
Monday September 28, 2009 8:57:48 PM

There will always be someone who introduces something as new when, most of the time, it is an item already on the market with a minor change. In a sense it is new because of the change, but not in the way it is presented. For example: Hardee's is introducing its new Big Carl in competition with McDonald's Big Mac; the difference is the grill and an extra patty. However, it is still a burger; one just has three pieces of bread and the other three pieces of meat.

robsalk
Thinkernetter
Friday September 25, 2009 3:15:06 PM

@ Root Maniac - That trust element is exactly where the semantic Web concept may disappoint. The social life of information is complicated - we as human beings apply all kinds of subjective and experiential filters to make instantaneous assessments of trustworthiness, linkages between concepts, etc. I don't doubt that networked IT systems will eventually have the horsepower to make the requisite number of computational steps (if they don't already), but I find it very hard to believe that the process outcome would resemble human judgement. Or if it did, it would do so with just enough imprecision to annoy us, confuse us, or mislead us just often enough to compromise the integrity of the process.

With Google-type search, you are asking a deaf-mute for directions, and your expectation is that it will point to stuff in broad gestures, based on its limited understanding of what you are asking. Semantic systems are supposed to speak your language and answer you with relevant information, which implies a higher expectation of trust. If that trust does not materialize because it is a) limited or b) being manipulated, the level of frustration will be a lot higher.

Root Maniac
IQ Crew
Friday September 25, 2009 2:58:42 PM

Any semantic analysis system is going to need a reliable mechanism of knowing which data can be trusted, otherwise any results it returns will be tainted by marketers gaming the system, ignorance, and outright lies. It's hard enough for a human browsing through Google results to decide which ones are relevant and reliable, let alone a machine. What mechanisms do these developers propose for ensuring the links between data sets will reliably reflect actual correlations, and not spurious ones created by mercantile manipulation, boneheaded bloviating, or plain old maliciousness?

Mary Jander
Thinkernetter
Friday September 25, 2009 9:47:10 AM

Back in the early 1990s, a few vendors were trying to make hay by advertising their wares as artificial intelligence. Then, when it was apparent that the science was still highly flawed, the word AI became marketing anathema.

Here we have AI re-emerging for the Web.

My question is whether suppliers will look to avoid using the term and instead attempt to demonstrate that they've reinvented the entire concept using "new" technology.

Perhaps they have; but more likely developers have built on the AI techniques that have been long behind many different products and services.

Just a Friday morning thought.

mathemagician
IQ Crew
Thursday September 24, 2009 1:38:31 PM

Oh, and I forgot the most important thing of all:

We will either prove or disprove the "six degrees of separation from Kevin Bacon" by exhaustion.

mathemagician
IQ Crew
Thursday September 24, 2009 1:35:43 PM

We will find that:

  • Harrison Ford was not only in silent films as well as contemporary films, but has been remarkably well preserved and active for somebody who died in 1957 (there was an actor named Harrison Ford who made silent films and died in 1957; he is no relation to the Harrison Ford, think Star Wars, whom we can all immediately recall).
  • There will be a proof that 1+1 = 0, due to a typo, and all sorts of weird things will now be possible.  I hope such a system is not hooked up to a vital or threatening system (like a doomsday device).
  • Many more people will not be able to fly internationally due to their names matching the list of "people of interest".

I'm sure we can come up with other possibilities, but I think the creation of links is the proper first step.  The next big hurdle will be linking them together and resolving the various identifiers (and disambiguating the entities they represent).  One interesting startup, Freebase, is trying to build up such a list with a common set of URIs (Uniform Resource Identifiers) that can be shared and reused.  They have a large set of stuff and it looks promising.

Of course, Semantic Web technology is attempting to solve two problems:

  1. What is the context of your search/request/data and how can I present it to you?
  2. Developing/evolving machine intelligence to make computers more responsive to normal human input (like speech).

One interesting problem will be how to present the results of your semantic-based queries when there are a large number of results (if I ask "what are the various viewpoints of the works of William Shakespeare?", I expect to see a LOT of answers, even if they are grouped and summarized along some common-sense lines).

A last note:  IBM is working with Jeopardy (yes...Alex Trebek and those folks) to have a Blue Gene system be a live participant on their show.  It will listen to the answer like the two other contestants and respond with the question verbally.  No date has been announced, but they are working on it to demonstrate how far semantic technology and AI have come, as well as speech recognition and speech synthesis.  I don't know how it will buzz in to provide the question, though.

nasimson
Rank: Web master
Thursday September 24, 2009 12:17:22 AM

The semantic Web will have its own improvement curve, and its own share of mistakes and blunders. But in the end we all (including our companion machines) will come out wiser. I am watching my young nephews and nieces grow and learn, and it's interesting how they make innocent, intelligent mistakes while learning. It's exciting to see how the Web evolves beyond an information highway into a 'global human knowledge base'. Even more exciting will be the transition to extracting succinct wisdom from terabytes of RDF and ontologies.

TJHSR610
Rank: Cave Painter
Wednesday September 23, 2009 7:04:10 PM

® 2001-2009 CWH Dubai Techn, Int'l USA/UAE has explored this technology and is in the alpha testing phase.

Furthermore, with the current mandates for computerization of U.S. national health care records, the necessity is very apparent: direct and instant communication between all medical devices and servers, for actual triage and treatment by attending medical doctors anywhere in the world, including every prescribed medicine. Even the major business communities shall require this in the very near future.

Moreover, this is twofold, as the need for upgraded security shall be a must!

Therefore, more spending is needed to secure the global networks with more layered security, such as CWH's 23rd Century designs and software implementations!

(E-Acute Emergency Health-Care & E-Health-Care)

©-®2001-2009 CWH Dubai Technologies, International, LLC. U.S.A. / U.A.E.

  All Rights Reserved and Protected U.S. & International Laws
