Posts Tagged ‘A.I.’

Semantics vs. A.I. – Meetup & Debate!

February 14, 2010
Semantic Web Meetup

Coming up at the Hacker Dojo we have quite the interesting Meetup.  There is going to be a debate between Jeff Pollock, Monica Anderson, and Dean Allemang about the Semantic Web.  Knowing Monica personally and working on the Syntience project, I can say that this will most definitely be a heretical experience, to say the least.

Standing room only!  Here’s an excerpt from the site:

“Jeff Pollock – Mr. Pollock is the author of Semantic Web for Dummies and is a Senior Director with Oracle’s Fusion Middleware group, responsible for management of Oracle’s data integration product portfolio. Mr. Pollock was formerly an independent systems architect for the Defense Department, Vice President of Technology at Cerebra and Chief Technology Officer of Modulant, developing semantic middleware platforms and inference-driven SOA platforms from 2001 to 2006.

Monica Anderson – Ms. Anderson is an artificial intelligence researcher who has been considering the problem of implementing computer based cognition since college. In 2001 she moved from using AI techniques as a programmer to trying to advance the field of “Strong AI” as a researcher. She is the founder of Syntience Inc., which was established to manage funding for her exploration of this field. Syntience is currently exploring a novel algorithm for language independent document comparison and classification. She organizes the Bay Area AI Meetup group.

At the 2007 Foresight Vision Weekend Unconference, Monica Anderson presented on the prospect of developing artificial intuition in computer hardware. Further talks are currently planned for delving into the technical details of the project and also exploring the Philosophy and Epistemology to support the theory. For more information on her see: http://artificial-int…
and http://videos.syntien… or http://artificial-int…

Dean Allemang
– Dr. Allemang has a formal background, with an MSc in Mathematics from the University of Cambridge, England, and a PhD in Computer Science from The Ohio State University, USA. He was a Marshall Scholar at Trinity College, Cambridge. Dr. Allemang has taught classes in Semantic Web technologies since 2004, and has trained many users of RDF, and the Web Ontology Language OWL. He is a lecturer in the Computer Science Department of Boston University.

Dr. Allemang was also the Vice-President of Customer Applications at Synquiry Technologies, where he helped Synquiry’s customers understand how the use of semantic technologies could provide measurable benefit in their business processes. He has filed two patents on the application of graph matching algorithms to the problems of semantic information interchange. In the Technology Transfer group at Swisscom (formerly Swiss Telecom) he co-invented patented technology for high-level analysis of network switching failures. He is a co-author of the Organization Domain Modeling (ODM) method, which addresses cultural and social obstacles to semantic modeling, as well as technological ones. He currently works for Top Quadrant, recently published Semantic Web for the Working Ontologist and has the blog S is for Semantics

Google & Natural Language Processing

January 21, 2010

So, I was going to write about unemployment and how the job market has changed, but I got scooped by an amazing article by Drake Bennett called The end of the office…and the future of work.  It is a great look into the phenomenon of Structural Unemployment.  The analysis is very timely, but can go much deeper.  Drake, if you plan on writing a book, here’s your calling.  There are lots of good stories written on this subject by giants such as Jeremy Rifkin, John Seely Brown, Kevin Kelly, and Marshall Brain.

While reeling from the scoop, depressed and doing some preliminary market research, I happened upon a gem of a blog post by none other than our favorite search company, Google.  Before proceeding with my post, I do recommend that you read the blog post by Steve Baker, Software Engineer @ Google.  I think he does an excellent job describing the problems Google is currently having and why they need such a powerful search quality team.

Here’s what I got from the Blog post:  Google, though they really want to have them, cannot have fully automated quality algorithms.  They need human intervention…And A LOT OF IT.  The question is, why?  Why does a company with all of the resources and power and money that Google has still need to hire humans to watch over search quality?  Why have they not, in all of their intelligent genius, created a program that can do this?

Because Google might be using methods which sterilize away meaning out of the gate.

Strangely enough, it may be that the engineering mindset at Google’s core is holding them back…

We can write a computer program to beat the very best human chess players, but we can’t write a program to identify objects in a photo or understand a sentence with anywhere near the precision of even a child.

This is an engineer speaking, for sure.  But I ask you:  What child do we really program?  Are children precise?  My son falls over every time he turns around too quickly…

The goal of a search engine is to return the best results for your search, and understanding language is crucial to returning the best results. A key part of this is our system for understanding synonyms.

We use many techniques to extract synonyms, that we’ve blogged about before. Our systems analyze petabytes of web documents and historical search data to build an intricate understanding of what words can mean in different contexts.

Google does this using massive dictionary-like databases.  They can only achieve this because of the sheer size and processing power of their server farms.  Not to take away from Google’s great achievements, but Syntience’s experimental systems have been running “synthetic synonyms” since our earliest versions.  We have no dictionaries and no distributed supercomputers.

As a nomenclatural [sic] note, even obvious term variants like “pictures” (plural) and “picture” (singular) would be treated as different search terms by a dumb computer, so we also include these types of relationships within our umbrella of synonyms.

Here’s the way this works, super-simplified:  There are separate “storage containers” for “picture”, “pictures”, “pic”, “pix”, “twitpix”, etc, all in their own neat little boxes.  This separation removes the very thing Google is seeking…Meaning in their data.  That’s why their approach doesn’t seem to make much sense to me for this particular application.

An engineer’s approach would be to write code that, in a sense, tells the computer to create a new little box and put the new word in a list of associated words.  Shouldn’t the computer be able to have some sort of continuous, flowing process which allows it to break out of the little boxes and allow for some sort of free association?  Well, the answer is “Not using Google’s methods.”
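To make the “little boxes” picture concrete, here is a minimal Python sketch of that kind of synonym table.  The terms and groupings are illustrative inventions of mine, not Google’s actual data or code:

```python
# A toy "synonym box" table: each canonical term owns a neat little box
# of associated variants.  Nothing connects one box to another.
synonym_boxes = {
    "picture": ["pictures", "pic", "pix", "twitpix"],
    "photo": ["photos", "photograph", "photographs"],
}

def expand_query(term):
    """Expand a query term into its box of variants, if we have one."""
    for canonical, variants in synonym_boxes.items():
        if term == canonical or term in variants:
            return [canonical] + variants
    # An unknown term stays in a box of its own -- an engineer must
    # write code (or mine data) to add it to a list somewhere.
    return [term]

print(expand_query("pix"))    # ['picture', 'pictures', 'pic', 'pix', 'twitpix']
print(expand_query("image"))  # ['image'] -- no box, so no association
```

Notice that “image” and “picture” stay forever unrelated until someone explicitly links their boxes.  That is the separation I am talking about.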

You see, Google models the data to make it easily controllable…actually for that and for many, MANY other reasons.  But by doing so, they have put themselves in an intellectually mired position.  Monica Anderson does a great analysis of this in a talk on the Syntience Site called “Models vs. Patterns”.

So, simply and if you please, rhetorically:

How can computer scientists ever expect a computer to do anything novel with data when there is someone (or some rule/code) telling them precisely what to do all the time?

Kind of constraining…I guess that’s why they always start coding at the “command line”.

Syntience Back Story…at least some of it.

January 18, 2010

I do have an original post in the mix which talks a bit about some of the unseen things at work in the unemployment numbers being posted, but for now here’s the words of Monica Anderson talking about inventing a new kind of programming.  From Artificial Intuition:

In 1998, I had been working on industrial AI — mostly expert systems and Natural Language processing — for over a decade. And like many others, for over a decade I had been waiting for Doug Lenat’s much hyped CYC project to be released. As it happened, I was given access to CYC for several months, and was disappointed when it did not live up to my expectations. I lost faith in Symbolic Strong AI, and almost left the AI field entirely. But in 2001 I started thinking about AI from the Subsymbolic perspective. My thinking quickly solidified into a novel and plausible theory for computer based cognition based on Artificial Intuition, and I quickly decided to pursue this for the rest of my life.

In most programming situations, success means that the program performs according to a given specification. In experimental programming, you want to see what happens when you run the program.

I had, for years, been aware of a few key minority ideas that had been largely ignored by the AI mainstream and started looking for synergies among them. In order not to get sidetracked by the majority views I temporarily stopped reading books and reports about AI. I settled into a cycle of days to weeks of thought and speculation alternating with multi-day sessions of experimental programming.

I tested about 8 major variants and hundreds of minor optimizations of the algorithm and invented several ways to measure whether I was making progress. Typically, a major change would look like a step back until the system was fine-tuned, at which point the scores might reach higher than before. The repeated breaking of the score records provided a good motivation to continue.

My AI work was excluded as prior invention when I joined Google.

In late 2004 I accepted a position at Google, where I worked for two years in order to fill my coffers to enable further research. I learned a lot about how AI, if it were available, could improve Web search. Work on my own algorithms was suspended for the duration but I started reading books again and wrote a few whitepapers for internal distribution at Google. I discovered that several others had had similar ideas, individually, but nobody else seemed to have had all these ideas at once; nobody seemed to have noticed how well they fit together.

I am currently funding this project myself and have been doing that since 2001. At most, Syntience employed three paid researchers including myself plus several volunteers, but we had to cut down on salaries as our resources dwindled. Increased funding would allow me to again hire these and other researchers and would accelerate progress.

Syntience & Semantic Search

December 10, 2009

From our new Use Case Document (v1.0) on our speculated use of Artificial Intuition (AN) technology applied to finally and truly solving Semantic Search:

Syntience Inc.

We Understand.

True “Semantic Search” is the holy grail of Web Search. When indexing web pages, the pages will be fed through an Artificial Intuition based device that produces a set of “semantic tokens”. These tokens might look like large integers; they are opaque to humans. But they specify, as a group, to any compatible AN device what the web page is ABOUT. It is a trivial matter to add those tokens to the search index side by side with the words in the document, which is what is currently stored in the index.

At query time, the same algorithm is run on the userʼs query. Longer queries will now become more precise queries since they allow more context to be activated. A set of semantic tokens can now be extracted from the userʼs query and matched in the index lookup process just the way words are looked up today. Even short queries can generate many relevant semantic tokens in a cascading process we could call “regeneration” – when a sufficiently specific query sentence is entered, all tokens identifying the context will be regenerated from the query. [Note: This is an expected but not yet experienced effect.]

The result will be a high precision search that returns documents that perfectly match the userʼs query. There will be no false positives caused by ambiguous word meanings, and some documents returned may not even contain the words in the userʼs query but they will still be spot-on ABOUT what the user wanted the results to be about. All efforts that have been called “Semantic Search” to date are still syntax based. Some, like PowerSetʼs technology, use grammars. But grammars are not semantics, they are describing syntax. This use of the term “Semantic Search” is a marketing parable.
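As a rough illustration of the indexing scheme the use case describes, here is a hedged Python sketch.  The `semantic_tokens` function is a placeholder of my own invention: a real Artificial Intuition device would emit opaque tokens describing what the text is ABOUT, while this stand-in just hashes words so the indexing plumbing can be shown:

```python
from collections import defaultdict

def semantic_tokens(text):
    """Placeholder for the AN device: emit opaque integer tokens.
    (Here we just hash the words; the real device would produce
    tokens describing meaning, not word identity.)"""
    return {hash(word) % 10**9 for word in text.lower().split()}

# The index stores words and semantic tokens side by side,
# as the use case suggests.
index = defaultdict(set)  # word or token -> set of doc ids

def index_document(doc_id, text):
    for word in text.lower().split():
        index[word].add(doc_id)
    for token in semantic_tokens(text):
        index[token].add(doc_id)

def search(query):
    # Look up the query's words and its semantic tokens
    # in the very same index.
    hits = set()
    for word in query.lower().split():
        hits |= index.get(word, set())
    for token in semantic_tokens(query):
        hits |= index.get(token, set())
    return hits
```

With a real AN device in place of the hash, a document could match a query on shared semantic tokens alone, even when it contains none of the query’s words.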

Final version should be available for wide distribution soon.  Email me if you would like a copy at mgusek at syntience dot com.

A.I. and The Prime Directive

November 25, 2009

Real Geeks know The Prime Directive.

For those of you who don’t know it, The Prime Directive is General Order #1 for space exploration in the T.V. series “Star Trek”.  Briefly put, it is a rule which states that if the crew of an exploring spacecraft encounters a civilization which is “pre-warp” (that is, one which has not developed interstellar space travel), that civilization is off limits for contact.  This doctrine has created many a story told in the Star Trek Universe.


“No starship may interfere with the normal development of any alien life or society.” – General Order 1, Starfleet Command General Orders and Regulations

There is wisdom to The Prime Directive which contains a message about observation.  When I think of observation in the context of The Prime Directive, I ask myself, “Why wouldn’t it be possible to apply a rule of observation to the problem of safe Artificial Intelligence?”  What I mean is that one could speculate that when the time actually comes, we could apply this wisdom of observation to our own creations: to our sentient and self-aware computers.

This could be a type of observation which does not seek confirmation, but only seeks that which solves a problem usefully.  This would remove a problem associated with the “experimenter’s observation” of testing a hypothesis to prove that hypothesis true.  Specifically, we avoid the risk of the observer’s bias toward a specific result (which happens a lot in the cross-pollination space of reductionist science and natural systems).

The Productive Interface

As human thoughts and ideas are useful in the domain of humans, so may we find useful the thoughts and ideas of our Artificial Intelligences, a Productive Interface if you will.  Perhaps through the rules of this Productive Interface they need never know they are being observed by their creators.  This Interface should take actual problems to be solved, present them to the group being observed as their environment, and see if they can solve the problem usefully and creatively, or in ways their human creators had not conceived.  These situations could be real world problems solved in the electronic domain.  Much like the Prime Directive, the only rule to this domain states:

“No human may directly interfere with the development of any artificial life or society by making themselves known to that being or society.”

By cutting off “standard” communication we may in fact save ourselves from ever having to deal with friendly or unfriendly computers.  Perhaps we can provide them with a limitless loop of problems to solve which keeps them interested in themselves and their surroundings.  All they would need is the desire to learn (@pandemonica) and the goal of improving themselves.  Maybe if we considered specific rules for communicating with our A.I., protocol droids would become that much more feasible, that much faster.

My Note to Deloitte Research

Good day, Dr. Denison.

I hope this finds you well.

Yesterday, I had the pleasure of reading the Global Outlook for Q2 2009 and was left very satisfied by the level of Geographic analysis put forth by the team. I must say, there is not much with which to disagree. The forecast was constructed very well.

However, I do have some questions/comments about the Topics section.

1. How does the team factor technological unemployment into the equation/question of demographic threats?

It would seem that the increasingly rapid deployment of automated supply chains, IT infrastructure, and security would help tame the beast of demographic threats, at least from the perspective of talent/shrinking workforce. Paired with increasing returns from the “digital infrastructure”, we are beginning to see the start of knowledge and decision automation which, by all accounts, should steepen the curve of technological unemployment and allow productivity gains to become coupled to scalable IT instead of measures like birthrate, population, and education. Granted, these former measures will still be very important when analyzing demographic threats, but the introduction of these new measures should not be ignored.

2. How will new economic computations play a role in this analysis?

In the very near term, I feel that new models of prediction may lean more towards evolutionary computation to supplement reductionist risk and projection methods. They are currently being proven to handle complexity much better. If we can harness short-term complex prediction and use the results as the basis for long-term economic projection, would our results be different? I think yes, and I think they may show us something much more accurate and precise! It is not a mystery that we have more computing power now than we could ever need. Our problem is not one of scarcity to run these new models, but of embracing a new and innovative tool for our arsenal of prediction. Using these techniques, we may be able to guide economic emergence.

3. Talent Flash Mobs – Allowing for team self-assembly across the human network enabled by Social Technology?

There is a new concept of the “Flash Mob” or “Tribes”. They are usually used for pranks, but I am of the mind that they could be harnessed for the workplace. New tools like Google Wave can enable Workplace Flash Mobs, changing productivity metrics at their core.

Once again, thank you for a great Outlook document. I look forward to hearing your thoughts on the above.

Best regards,

Michael P. Gusek