The Macrosite for News, Analysis and Opinion about the Future of the Internet

How One Editor Streamlines His Newsfeed

Written by Scott Koegler
10/15/2010 14 comments

With the election season coming up fast, it's increasingly difficult to keep track of any one candidate's activity when news, commentary, and quotes are being distributed from hundreds or even thousands of sources.

The solution, at least in the mind of David Gewirtz, publisher and editor in chief of ZATZ Publishing, was to leverage the WordPress platform's content management system and augment it with some custom code to retrieve the flood of commentary and place it into easily accessible categories based on candidates.

His efforts have resulted in the website TrackYourCandidate.com.

This isn't the first time WordPress, or for that matter other CMS platforms like Joomla and Drupal, has been extended for a specific purpose. Each of these systems hosts a wide selection of plug-ins that add functionality to the basic system.

In fact, Gewirtz uses one such plug-in, WPRobot, to scan RSS feeds and populate a database with articles. But sifting through the constantly increasing mass of content requires other tools to format, categorize, and select articles appropriate for publication.

Gewirtz, a former computer science professor, explains: "What was nice about this combination is that I could put all my development time into the AI Editor, and the WordPress platform and various plug-ins could do the rest of the work." The AI Editor, short for Artificial Intelligence Editor, is the program Gewirtz created to sift through the accumulated RSS items to detect those that are pertinent to each political candidate.

WPRobot periodically collects items from a series of RSS feeds and adds them to the WordPress database as content items. Once the RSS articles have been stored, the AI Editor takes over and analyzes each article based on a series of criteria. As the AI Editor evaluates each article, it changes the content item's status indicator from “Draft” to either “Trash” or “Published.” In this way only those articles vetted as valid are published.
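The status change described above can be sketched in a few lines of Python. This is only an illustration, not Gewirtz's code: the `Item` class, the `is_pertinent` check, and the sample items are assumptions made for the example, and the real AI Editor applies far more criteria than a name match. The status strings mirror the values WordPress uses for a post's status.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Values mirror WordPress post status strings.
    DRAFT = "draft"
    TRASH = "trash"
    PUBLISHED = "publish"


@dataclass
class Item:
    # Hypothetical stand-in for a stored RSS item.
    title: str
    body: str
    status: Status = Status.DRAFT


def is_pertinent(item: Item, candidate: str) -> bool:
    # Placeholder criterion (an assumption): the candidate's name appears in the text.
    text = f"{item.title} {item.body}".lower()
    return candidate.lower() in text


def review(item: Item, candidate: str) -> Item:
    # Only drafts are evaluated; items that already have a verdict are left alone.
    if item.status is Status.DRAFT:
        item.status = Status.PUBLISHED if is_pertinent(item, candidate) else Status.TRASH
    return item


items = [
    Item("Dino Rossi speaks in Seattle", "The candidate addressed voters."),
    Item("Local bake sale a success", "Cookies sold out by noon."),
]
statuses = [review(item, "Dino Rossi").status for item in items]
```

Because nothing leaves the draft state without a verdict, an item the editor has not yet looked at can never appear on the site.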

When I asked Gewirtz about the skill level required to build this system, he told me, "A good programmer could do it, but I'd probably say [the AI Editor] is substantially beyond the entry level."

Of course, not all RSS entries are created, or appear to be, equal. But when you look at TrackYourCandidate.com, you see there is consistency to the entries. This is another function of the AI Editor. As Gewirtz describes it: "If a post manages to make it through all of the AI Editor’s steps, including successfully writing it up as an English-language post (and this involves a large combination of steps to go from terse or incomplete RSS item to complete three-paragraph article and relevant excerpt), the AI Editor then assigns the post to a candidate category, updates the candidate’s newsworthiness metrics for the week, and then eventually sets the WordPress post element to ‘published,’ which is when it goes out to the world."
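The last steps Gewirtz mentions, assigning a category and updating a weekly newsworthiness metric, might look something like the Python sketch below. Everything here is an assumption for illustration: the article does not say how the metric is computed, so the sketch simply counts published posts per candidate per ISO week, and the names `publish`, `leaderboard`, "Candidate A," and "Candidate B" are hypothetical.

```python
from collections import Counter
from datetime import date

# (candidate, ISO year, ISO week) -> number of posts published that week
newsworthiness = Counter()
# post id -> candidate category
categories = {}


def publish(post_id: int, candidate: str, published_on: date) -> None:
    # Assign the post to the candidate's category and bump the weekly count.
    iso = published_on.isocalendar()
    categories[post_id] = candidate
    newsworthiness[(candidate, iso[0], iso[1])] += 1


def leaderboard(year: int, week: int) -> list:
    # Rank candidates for one week: most posts first, ties broken by name.
    rows = [(c, n) for (c, y, w), n in newsworthiness.items() if (y, w) == (year, week)]
    return sorted(rows, key=lambda row: (-row[1], row[0]))


publish(1, "Candidate A", date(2010, 10, 14))
publish(2, "Candidate A", date(2010, 10, 15))
publish(3, "Candidate B", date(2010, 10, 15))

iso = date(2010, 10, 15).isocalendar()
ranking = leaderboard(iso[0], iso[1])
```

Keying the counter by week means a candidate's standing resets naturally as the news cycle moves on, which fits the "for the week" wording in the quote.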

Aside from delivering a technical achievement that leverages open-source efforts, the site is a great resource regardless of your location or political affiliation. It's still in beta, but definitely worth exploring.

— Scott Koegler was a CIO for 15 years and has been writing about technology for the last 18 years. He is editor of www.ec-bp.org, a newsletter that addresses supply chain technologies, and manages other newsletters at www.YourCompanyNewsletter.com. You can contact him at scott@koegler.net.

Scott Koegler
Thinkernetter
Monday October 18, 2010 2:33:52 PM

I completely agree that some semantic processing could be used to advantage in this situation. The ability to apply relationship connections to the findings would clean up the results considerably.

mathemagician
IQ Crew
Monday October 18, 2010 1:34:26 PM

I just went to the site (10/18/2010) and clicked on Patrick Murphy (who I am aware of as an incumbent in a fierce battle in PA) and found the first four articles were way off the mark:

  1. Patrick Murphy engaged (!):  His wife would like to know about this...it's actually about a young couple in Richmond, VA.
  2. Patrick Murphy catches the opening touchdown pass:  a Staten Island game against New Dorp
  3. Superintendent Patrick Murphy says enrollment is up:  in Arlington
  4. Quarterback Patrick Murphy was 6 for 12 for 69 yards:  a football game in Stamford, CT

I know Patrick Murphy is a talented person, but this is a bit much...

After these four articles, it does get to the politician named Patrick Murphy for a while and then veers off to 'Bama (as in Alabama State)...

Sounds like they are doing straightforward text searching and not any semantic searching.

This points out the promise and challenge of semantic web technology:  we people are used to filtering out this "noise" but the computers don't have that common sense.

SecTech
Thinkernetter
Monday October 18, 2010 11:59:43 AM

It's mid-October, the leaves are turning, the weather is getting colder and the hyenas are on the prowl.  Yes, those hungry-for-your-vote entities called "politicians."  I don't know about anybody else, but I am so sick of hearing them, and of hearing and reading about them, that I literally turn off the TV when a political ad comes on.  They never have anything to say that I believe.  This year the dirty campaigning seems dirtier than ever.  When is enough too much?

I voted already.  I'm done.  This year was a record year for the number of write-in votes utilized.  I am at the point where I don't want any politician to speak or say anything unless he can back up what he says with verifiable information sources that are available to the public to check.  The "he said, she said" and "he did, she did" stuff is part of what is wrong with the political process and it carries through to the ones in office.

I mean really, how stupid do they think we are telling us that when they get elected to congress they are going to stop government waste, lower taxes, do this, do that... Are they really so ignorant as to not realize that as new office holders they will barely have the power to blow their noses by themselves?  There is something broken in the American political process and until it's fixed, nothing will change, no matter who is elected, although some electees could make things a whole lot worse.

Somewhere along the line Government for the People by the People has become Government for the Special Interest Groups and those with Money by those with Money.  The Ideal that was created at the creation of this nation has gone astray and those in power have no interest in having the status quo change.

DavidGewirtz
Rank: Cave Painter
Friday October 15, 2010 5:09:35 PM

Generally, it runs on its own (that's the key to the design), but there is a "Flag Article" option that allows readers to mark an article for review. Those are checked by a human.

In terms of balance, take a look at the content. It's working surprisingly well. I'm actually pretty amazed myself.

--David

Michael P. Kassner
Thinkernetter
Friday October 15, 2010 4:09:44 PM

I suspected so, but felt inclined to ask whether the algorithm factored in quantity as well as quality. 

Now I am anxious to hear more about this. Your work potentially could be applied to a multitude of topics, besides politics. Is this at all similar to what the .govs are using to data-mine certain illegal activities? 

One final note, I'd like to thank you for your work and Scott for making me aware of it. 

DavidGewirtz
Rank: Cave Painter
Friday October 15, 2010 3:56:30 PM

What a great article, thank you!

Could the system be biased? Of course. But not as much as you'd think. First, there is always some algorithmic bias because every programmer has a bit of a "signature" in his or her code: small decisions that have an impact on the final output.

In TrackYourCandidate's case, I put some serious care in making sure there's no specific bias to a given candidate or party, and was careful to list more than a hundred independent and no-name party candidates as well. But I missed at least one. Yesterday, I got a call from a campaign manager for an independent candidate in Texas and was asked to add him in. I didn't find him when I built my original list of candidates.

I did bias the list in a few ways. I added a few newsmakers that weren't declared candidates who I thought worthy of watching, like Bloomberg from New York. He's not a House, Senate, Gov, or Presidential candidate, but he's interesting, so he got the nod. Same with Meghan McCain. She's obviously not a candidate, but I thought I might someday cover her, so I added her into the mix.

The site is biased, though, in that it prefers to throw out articles rather than publish them. There's an entire series of tests and heuristics required before an article is allowed to be published, but fail one test, and the article is tossed. So if one candidate were to be covered by a messier feed or sites where it was harder to extract good content, that candidate would be covered less.
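To make the "fail one test, and the article is tossed" rule concrete, here is a rough Python sketch. The three checks are invented for illustration, since the actual tests and heuristics aren't described here. The point is only the shape of the rule: every check has to pass, and the first failure ends the evaluation.

```python
from typing import Callable, Dict, Optional

Check = Callable[[str], bool]

# Hypothetical checks; the real heuristics are not described in this thread.
CHECKS: Dict[str, Check] = {
    "long_enough": lambda text: len(text.split()) >= 8,
    "no_markup_debris": lambda text: "<" not in text and ">" not in text,
    "ends_cleanly": lambda text: text.rstrip().endswith((".", "!", "?")),
}


def first_failure(text: str) -> Optional[str]:
    """Return the name of the first failed check, or None if all checks pass."""
    for name, check in CHECKS.items():
        if not check(text):
            return name
    return None


clean = "The candidate spoke to voters about the budget on Friday."
messy = "The candidate spoke to voters <div class=ad> about the budget on Friday."

clean_result = first_failure(clean)   # passes every check
messy_result = first_failure(messy)   # tossed because of leftover markup
```

The two sample strings carry the same news, but the one from the messier feed is thrown out, which is exactly the kind of coverage gap described above.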

As for RSS stream spamming, that's been accounted for. TyC pulls one candidate from any given stream at any given time and then goes on to the next candidate. In a sense, it only looks at one post for one candidate before testing the next. So even if one candidate were to bot-up and spam the world, only one of those spammed articles would be considered and possibly posted.
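A simplified Python sketch of that idea follows. It assumes a stream can be treated as a list of (candidate, title) pairs and that "one post per candidate" means the first one seen in a pass; the function name and the sample data are hypothetical, and the real scheduling is surely more involved.

```python
from typing import Dict, List, Tuple


def take_one_per_candidate(stream: List[Tuple[str, str]]) -> Dict[str, str]:
    """From (candidate, title) pairs, keep only the first title seen per candidate."""
    picked: Dict[str, str] = {}
    for candidate, title in stream:
        # setdefault ignores every later post for a candidate already picked.
        picked.setdefault(candidate, title)
    return picked


# One candidate floods the stream with a thousand posts; another has a single post.
stream = [("Candidate A", f"Spam post {n}") for n in range(1000)]
stream.append(("Candidate B", "Town hall recap"))

picked = take_one_per_candidate(stream)
```

However many posts the flooding candidate pushes into the stream, each candidate contributes at most one item to the pass, so volume alone buys no extra coverage.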

This is still definitely experimental and there are still holes. For example, there's one candidate with the same name as a race car driver and TyC often picks the news on the race car driver. I have a fix for this, but I've also got other responsibilities, so it's on the "to do" list, rather than done now.

Oh, one neat thing about the balance of coverage. If you check the TyC leaderboard (at least as of yesterday), Dino Rossi is at the top of the pack. I never heard of him before TyC (Seattle people have, of course), but he organically rose to the top of the leaderboard.

Keep up the good questions. I'll check back later tonight and answer any more I can.

SteveGNYC
IQ Crew
Friday October 15, 2010 3:39:33 PM

Scott - I agree. It takes a very neutral person NOT to bias something. The only way is to put it all out there, as much as possible, and trust the end reader / informationist to make their own decision based on the information gathered through these feeds and others which may not be represented. Without this, it sounds like a recipe for censorship, not transparency or vetted truth.

Like you say, the Reader's job is to determine the believability of what is presented and to evaluate it (and their decision) accordingly.

SteveGNYC
IQ Crew
Friday October 15, 2010 3:35:38 PM

I would agree that this would possibly be welcomed by many if not all. I just don't think it's realistic to expect such "truths" to be vetted in moments, as the feeds would want to disperse information - correct or incorrect. 

Like Scott, I agree that the purest way has to be to let all sources be heard / streamed and let the reader be the discriminator and/or vetting agent. Without this, I think there's a very good chance that the pre-filtering human's bias would slash or trash items, allowing through only the info s/he believes to be valid / true / etc.

Is there any such animal as an unbiased human filter?

Michael P. Kassner
Thinkernetter
Friday October 15, 2010 11:24:11 AM

My concern is the argument that the feed could be manipulated like search results. Get a botnet with all sorts of feeds, and the amount of traffic would overwhelm other sources and skew the results.

This concerns me more than vetting content. That is up to me, the reader.

Scott Koegler
Thinkernetter
Friday October 15, 2010 11:09:57 AM

I think the current unattended method is really the purest way to do this. As soon as someone starts to evaluate the different content, bias is introduced. This keeps the biases at the source level.

And obviously there are plenty of truths and lies told and reported about every candidate. But that's why there are sites dedicated to sussing out the verity of these things. It's a big job, and not one to be taken on as part of a news aggregation function.

What TrackYourCandidate does is just what it intends - pulls together what has been written about the candidates into a single resource. Figuring out whether you believe any or all of the posts is your prerogative. As a starting point, I suggest you always consider the source.
