We’ve moved to http://www.haleyAI.com

June 25, 2008

Zigtag for social semantic tagging


I started to use Radar Networks’ Twine at the invitation of CEO Nova Spivack after writing this earlier this year (also see this). I enjoyed it for a while, mainly because a lot of technology folks, especially the semantic web community, were hooking up with each other on Twine. But I found it tedious to work through beta issues and to be bothered with recommendations or news about who was saying or bookmarking things about what. (I should have turned off the emails sooner!)

I was also disappointed that Twine was taking an apparently folksonomic approach to tagging. It was as if Radar Networks was riding semantic web buzz without really embracing it openly or sharing the momentum that the invite-only community was investing in. That may not sound fair – I believe that there are semantics in the back room, but that’s how it felt and it’s still the way it looks. But probably the worst part is the process that you have to go through to add a bookmark – which is the whole point, of course! (I ultimately sacrificed popup blockers, but the process still seems laborious compared to other alternatives.)

I stumbled across Zigtag almost accidentally while working for a VC firm with a portfolio of semantic startups. What I like most about Zigtag is that they make it obvious that they are building an ontology of tags and encourage users to select semantic tags (i.e., concepts) rather than folksonomic “words”. They also provide tools for managing tags that allow you to move smoothly and incrementally from a folksonomic to a more semantic approach.


The key to the semantic approach for Zigtag is that shared tags are just that – they are more precise than strings. They are not only words – they have definitions.
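
To illustrate the difference, here is a minimal sketch in Python. Zigtag does not publish its model, so the `Concept` class, its fields, the identifiers, and the example URLs are all hypothetical; the only point is that two tags can share a word and still be distinct concepts with their own definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """A semantic tag: an identifier, a label, and a definition (hypothetical model)."""
    concept_id: str
    label: str
    definition: str


# Two concepts that share the same word but mean different things.
JAGUAR_CAT = Concept("c-001", "jaguar", "A large cat native to the Americas.")
JAGUAR_CAR = Concept("c-002", "jaguar", "A British maker of luxury cars.")

# Folksonomic tagging: bookmarks are tagged with bare strings.
folksonomic = {
    "http://example.org/big-cats": {"jaguar"},
    "http://example.org/xj-review": {"jaguar"},
}

# Semantic tagging: bookmarks are tagged with concepts.
semantic = {
    "http://example.org/big-cats": {JAGUAR_CAT},
    "http://example.org/xj-review": {JAGUAR_CAR},
}


def urls_tagged(index, tag):
    """Return the sorted URLs whose tag set contains `tag`."""
    return sorted(url for url, tags in index.items() if tag in tags)
```

Looking up the string "jaguar" returns both bookmarks, while looking up `JAGUAR_CAT` returns only the one about the animal.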

Unfortunately, like Twine, Zigtag’s ontological model remains hidden.

My initial experience with Zigtag resulted in immediate jubilation. The Firefox plug-in works for me. It lets me type in tags with nice completion and recommendations from the tags that others have defined. Within 15 minutes I was writing to compliment Zigtag on a practical, elegant approach to the semantic bookmarking problem. I liked it much better than Twine right off the bat – and despite its bookmarklet, I like Twine a lot! Within a few minutes I had an email from their founder, Reg Cheramy. An hour later we were talking. We talked about his early meeting with Michael Arrington, how his work compares to the bulletin board or discussion forum emphasis in Twine, how he facilitates semantic tagging given a very large ontology and vocabulary, and so on.

Whether Reg took my advice to emphasize groups more or was already headed in that direction is unclear, but Zigtag now has group functionality that seems as good as (and in some ways better than) Twine’s. If you go to the Zigtag web site, you can find groups to join, but unlike Twine’s web site, Zigtag does not recommend groups for you based on your interests. I’m not sure this is a problem, though. Recommendations can be distracting. Nonetheless, if people want recommendations for more than content, it would be a simple step for Zigtag given the fact that they already recommend content that others have bookmarked.

I’m not too concerned with recommendations, even of content, so I cannot comment on Zigtag versus Twine on that front. Generally, there is plenty of RSS and recommendation noise to go around. I prefer the linked approach to finding information rather than searching and I don’t expect recommendations to become excellent in the near term. For more on this, you might want to check out the recent news about Vulcan’s EVRI investment at Webware or ReadWriteWeb.

I like to use Zigtag from the sidebar in Firefox. Actually, I owe Reg additional thanks for, in effect, causing me to abandon Internet Explorer for Firefox. I use it primarily to organize my bookmarks semantically and across machines. For those that want to do the same, you might also be interested in Mitch Kapor’s Foxmarks.

I’m fine with finding groups on my own and I like seeing people and what they tend to tag, too. Now that I know they are available on the web site, though, I want them in the sidebar. The fact that they are indirect on the web site, not presented in the sidebar, and not proactively recommended probably explains why there are relatively few (especially compared to Twine). It would be nice, for example, to see groups and people organized along with bookmarks according to how heavily they use tags as I pivot through various facets.

So, on a feature basis, I like Zigtag more than Twine for two primary reasons:

  1. Zigtag’s Firefox plug-in is a great user interface, while Twine’s bookmarklet is awkward in every sense that matters to me.
  2. Zigtag emphasizes and leverages shared tagging with tags that have clearly documented interpretations, while Twine is too folksonomic.

The picture shown in this post shows that Zigtag already “knows” a lot about semantics. Part of the reason is that they must have a roomful of people watching for tags that people enter that are not defined. Quite a few of the tags I’ve added have become defined within hours (sometimes minutes) of when I enter them. We’ll see how this scales up, but I like it – a lot.

The key question for both these sites is:

Are you going to share your ontology? If not, why not? If so, when or why not now?

Note that I am not suggesting they should. But if they have a reason not to, it would be nice to understand that.

It also would be nice to know whether the effort I expend on either site will be lost if they are acquired or I want to switch. That’s how it looks at Twine today.

Zigtag exports my bookmarks. I can get them from or over to Delicious, no problem. But I want their semantics, too. I would really appreciate preservation of the text, preferably the semantics, of my tags. Perhaps my bookmarks could simply be output as OWL referencing their ontology? At least then I could move without losing the effort that I have put into them, whether folksonomic or semantic. I also want to know if their ontology is any good and, if so, I’d appreciate export to OWL so that I could use bookmarks for other purposes that interest me.
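
To make the request concrete, here is a rough sketch of what such an export could look like, written as Turtle (one RDF syntax in which OWL ontologies can be expressed). Everything specific is an assumption: the `bookmark_to_turtle` function, the `http://example.org/ontology#` namespace, and the concept names are hypothetical stand-ins for an ontology that neither site has published. The Dublin Core `dc:title` and `dc:subject` properties are real, but choosing them here is my own guess at a reasonable vocabulary.

```python
def bookmark_to_turtle(url, title, concept_ids,
                       ontology_ns="http://example.org/ontology#"):
    """Serialize one bookmark as Turtle that points at ontology concepts.

    `ontology_ns` is a placeholder namespace (hypothetical); `concept_ids`
    are assumed to be valid local names within that namespace.
    """
    if not concept_ids:
        raise ValueError("a bookmark needs at least one concept to export")
    # Escape backslashes and double quotes so the title is a valid literal.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    subjects = ", ".join(f"ont:{cid}" for cid in concept_ids)
    return "\n".join([
        f"@prefix ont: <{ontology_ns}> .",
        "@prefix dc: <http://purl.org/dc/elements/1.1/> .",
        "",
        f"<{url}>",
        f'    dc:title "{escaped}" ;',
        f"    dc:subject {subjects} .",
    ])
```

Because each tag is written as a reference into the ontology rather than as a string, another tool that can read the same ontology would receive the concepts, not just the words.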

The background issue of data portability, for bookmarks, social networks, and other personal profile data is huge.

If I had OWL export and an open ontology, I would be less worried about my investment in Zigtag or Twine. Consider Techcrunch’s recent comments:

Zigtag’s biggest obstacle is the slew of other social bookmarking sites already available. The semantic tagging feature is fairly unique, but its appeal is still untested, especially against automated semantic taggers like Twine. Frankly, a lot of people are just going to stick with the simple but effective Delicious interface.

It’s hard to argue with the first sentence, but the second seems harsh. Twine is getting credit that it may not deserve. Also, Zigtag recommends tags, too. But the third sentence is a problem for Zigtag as well as Twine, although the latter benefits from superior PR.

Another question, of course, is how Zigtag and Twine will fare once they try to make money. Radar Networks has stated that Twine will start running ads by the end of the year. Zigtag has made no public announcements. Delicious selectively advertises (e.g., on search pages), perhaps to feed intelligence to Yahoo’s advertising network. The advertisements are so selective that the value of other bookmarking sites may be limited to the intelligence that they provide to established advertising networks. If so, this will hold down valuations and slow innovation. We’ll see, but obviously, I hope not.


April 18, 2008

A Common Upper Ontology for Advanced Placement tests

I have previously written about the lack of a common upper ontology in the semantic web and commercial software markets (e.g., business rules). For example, the lack of understanding of time limits the intelligence and ease of use of software in business process management (BPM) and complex event processing (CEP). The lack of understanding of money limits the intelligence and utility of business rules management systems (BRMS) in financial services and the capital markets, and in enterprise decision management (EDM). And, more fundamentally, understanding time and money (among other things, such as location, which includes distance) requires a core understanding of amounts.

The core principle here is that software needs to have a common core of understanding that makes sense to most people and across almost every application. These are the concepts of Pareto’s 80/20 Principle. A concept like building could easily be out, but concepts like money and time (and whatever it takes to really understand money and time) are in. Location, including distance, is in. Luminosity could be out, but probably not if color is in. Charge and current could be out, but not if electricity or magnetism is in. The cutoff is less scientific than practical, but what is in has to be deeply consistent and completely rational (i.e., logically rigorous).[2] (more…)

April 3, 2008

Cyc is more than encyclopedic

I had the pleasure of visiting with some fine folks at Cycorp in Austin, Texas recently. Cycorp is interesting for many reasons, but chiefly because they have expended more effort developing a deeper model of common world knowledge than any other group on the planet. They are different from current semantic web startups. Unlike Metaweb’s Freebase, for example, Cycorp is defining the common sense logic of the world, not just populating databases (which is an unjust simplification of what Freebase is doing, but is proportionally fair when comparing their ontological schemata to Cyc’s knowledge). Not only does Cyc have the largest and most practical ontology on earth, they have almost incomprehensible numbers of formulas[1] describing the world. (more…)

March 26, 2008

Agile decision services without XML details

Externalizing enterprise decision management using service-oriented architecture orchestrated by business process management increases agility and allows continuous performance improvement, but…

How do you implement the rules of EDM in an SOA decision service?  (more…)

March 21, 2008

Ontology of time in progress – amounts needed

Recent posts on money and time have produced some excellent comments and correspondence. There is even a recent OMG effort that is right on the money, at least concerning time. For details, see the Date-Time Foundational Vocabulary RFP. I am particularly impressed with SBVR “Foundation” Vocabularies, which I understand Mark Linehan of IBM presented last week at an OMG meeting in DC[1].

Mark’s suggestions include establishing standard upper ontologies for:

  1. Time & dates
  2. Monetary amount
  3. Location
  4. Unit of measure
  5. Quantities, cardinalities, and ratios
  6. Arithmetic operations

I will skip operations for now since they are not taxonomic concepts but functional relationships involving such concepts.  I believe the post on CEP and BPM covered time in adequate detail and the post on Siebel’s handling of foreign exchange covered the currency exchange aspects of money.  It only touched on the more general concept of amounts that I will focus on here.

The remaining concepts are common to almost every application conceivable.  They are some of the most primitive, domain-independent concepts of a critical and practical upper ontology.  They include: (more…)

March 14, 2008

In the names of CEP and BPM

Have you heard the one about how to drive BPM people crazy?

Ask them the question that drives CEP people crazy!

Last fall, at the RuleML conference in Orlando, (more…)

March 11, 2008

Over $100m in 12 months backs natural language for the semantic web

Radar Networks is accelerating down the path towards the world’s largest body of knowledge about what people care about using Twine to organize their bookmarks.  Unlike social bookmarking sites, Twine uses natural language processing technology to read and categorize people’s bookmarks in a substantial ontology.  Using this ontology, Twine not only organizes their bookmarks intelligently but also facilitates social networking and collaborative filtering that result in more relevant suggestions of others’ bookmarks than other social bookmarking sites can provide.

Twine should rapidly eclipse social bookmarking sites, like Digg and Reddit. This is no small feat!

The underlying capabilities of Twine present Radar Networks with many other opportunities, too.  Twine could spider out from bookmarks and become a general competitor to Google, as Powerset hopes to become.  Twine could become the semantic web’s Wikipedia, to which Metaweb’s Freebase aspires. (more…)

March 3, 2008

Oracle should teach Siebel CRM about location and money

Not long ago I posted on the need to understand common concepts well. My example then concerned the need to understand time well enough to answer a question like, “How much did IBM’s earnings change last quarter?”. Recently, in contemplating some training issues related to the integration of Haley Authority within Siebel, I came across example phrasings from the documentation on Siebel’s web site, including:

  • if an account’s location contains “CA” then add 50000 in “USD” for the account
  • if an account’s location contains “CA” then add 70000 in “USD” on today for the account

Two things are immediately obvious.

  1. Oracle does not understand location.
  2. Oracle has an interesting, but nonetheless poor understanding of money.

Of course, I am intimately familiar with Authority’s understanding of money. However, Siebel needs more than Authority understands. (more…)

February 22, 2008

Rules are not enough. Knowledge is core to reuse.

James Taylor’s blog post today on rules being core to BPM and SOA, in which he discussed reuse, had a particularly strong impact on me following a trip yesterday. During a meeting with the insurance and retail banking practice leaders at a large consulting firm, we looked for synergies between applications related to investment and applications related to risk. Of course, during that conversation, we discussed whether operational rules could be usefully shared across these currently siloed areas, but we ended up discussing what they had in common in terms of business concepts, definitions, and fundamental truths or enterprise-wide governance. It was clear to us that this was the most fruitful area to develop core, reusable knowledge assets.

In his post, James agrees with the Butler Group’s statement:

Possibly the most important aspect of a rules repository, certainly in respect of the stated promise of BPM, Service Oriented Architecture (SOA), and BRMS, is the ability for the developer to re-use rules within multiple process deployments.

I have several problems with this statement: (more…)

February 19, 2008

Understanding events and processes takes time

We have been teaching a computer to answer questions like, “How much did IBM’s earnings change last quarter?” It takes a fair bit of knowledge, including how to understand English, to answer this question. But teaching it what a “quarter” is brought back memories of debates with some former CMU colleagues about what units are and how to model time. Since quite a few people ask me for help with knowledge engineering and ontological matters, I thought some might be interested in parts of those debates. As you will see, a strong upper ontology of common knowledge is required to understand common business knowledge. Leveraging such an ontology is the only way to deliver business rules for under $50.

Sentences like “do something if more than a number of possibly related things have happened within a timeframe of something else happening” or “do something if nothing happens within a timeframe following something happening” are extremely common in business process management (BPM), complex event processing (CEP), and workflow.  With a sense of time, a business rules management system (BRMS) can support BPM, CEP, and workflow applications almost trivially.  Without a sense of time, most BRMS force users to perform computations.  
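
As a small illustration of the first pattern (“more than a number of things within a timeframe of something else happening”), here is a sketch in Python. The function name `exceeds_within` and its parameters are hypothetical, and this is the kind of computation a user is forced to write by hand, not a description of how any particular BRMS or CEP engine works.

```python
from datetime import datetime, timedelta


def exceeds_within(event_times, trigger_time, window, threshold):
    """Return True if more than `threshold` events fall within `window` after `trigger_time`.

    Both ends of the interval [trigger_time, trigger_time + window] are included.
    """
    end = trigger_time + window
    count = sum(1 for t in event_times if trigger_time <= t <= end)
    return count > threshold


# Example: did more than two events occur within a week of the trigger?
trigger = datetime(2008, 2, 1)
events = [trigger + timedelta(days=d) for d in (1, 2, 3, 40)]
more_than_two = exceeds_within(events, trigger, timedelta(days=7), threshold=2)
```

Even in this tiny case the author of the rule has to decide what “within” means at the boundaries, which is exactly the kind of knowledge a shared ontology of time should supply.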

For example, without a sense of time and an infrastructure that supports it, the sentence “call a customer if no response is received within 30 days of notifying the customer of a delinquency” has to be transformed into something like “if a notice is mailed on a date and the notice is a delinquency and the date of notification has a day number then compute the date for checking by adding 30 to the day number and check for a response to the delinquency notice on the date for checking”.  The checking on a date for a response to a notice must also be implemented as a database (or persistent queue) of events to be polled or triggered by application code.  Then a second rule is required to implement the check, as in “if checking whether a response has been received to a notice and the notice was given on a date of notice and the notice was given to a customer and there exists no record of communication with the customer since the date of notice then call the customer”.  (Note that this is actually how most BRMS products would implement this.  The natural language approach I prefer handles the original sentence.)
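
The date arithmetic and polling that the transformed rules imply can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function `customers_to_call`, the tuple-based records, and the customer names are hypothetical, and a real system would read from the database or persistent queue mentioned above instead of in-memory lists.

```python
from datetime import date, timedelta


def customers_to_call(notices, responses, today, grace=timedelta(days=30)):
    """Return customers whose delinquency notice has gone unanswered past `grace`.

    notices:   list of (customer, notice_date) pairs
    responses: list of (customer, response_date) pairs
    """
    to_call = []
    for customer, notice_date in notices:
        # First rule: compute the date for checking from the notice date.
        check_date = notice_date + grace
        if today <= check_date:
            continue  # not yet time to check this notice
        # Second rule: look for any communication since the date of notice.
        answered = any(
            c == customer and notice_date <= r <= today
            for c, r in responses
        )
        if not answered:
            to_call.append(customer)
    return to_call


notices = [
    ("acme", date(2008, 1, 1)),
    ("globex", date(2008, 1, 1)),
    ("initech", date(2008, 2, 10)),
]
responses = [("globex", date(2008, 1, 15))]
pending = customers_to_call(notices, responses, today=date(2008, 2, 19))
```

Here only "acme" needs a call: "globex" responded, and the 30 days for "initech" have not yet elapsed. All of this bookkeeping stands in for a single English sentence.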

The discussion here reflects the general structure and content that a usable ontology for business process management requires.  Most users of business rules management tools will find the need to understand and engineer this discussion in their tool of choice.  As my Haley Systems customers know, much of this is reflected in Authority’s built-in ontology and English vocabulary, but quite a few of the points discussed here reflect improvements, especially concerning the confusion between units and amounts.

As you will see, the discussion takes careful thinking. Some readers may find it onerous. If at any time you have had enough (or if you simply cannot take any more!), please skip to the end and decide whether to fill in the conclusions by revisiting the body.

