Articles on and examples of using Scientio technology.


Neural Nets are so very 90s

Feb-18 2008
I gave a short presentation at a one-day security event in London last week, put on by the LTN. Although Scientio creates horizontal (i.e. market-agnostic) products, it's good to talk to people with specific problems. There was a wide range of solutions on display, from a really innovative camera system that looked in every direction at once to umpteen university-generated systems that recognised faces, or how people walked, etc.
 
Microsoft's ex-FBI security guru Edward P. Gibson talked, and was very good value, witty and impassioned.
 
I talked about ConceptMine as a way of screening large amounts of text - an obvious security benefit.
 
The thing that struck me was the preponderance of Neural Nets in people's designs. This amazed me because, much as I like Neural Nets, they have some seriously bad characteristics. I wrote the UK's first commercial Neural Net system back in the 80s, and produced the odd paper at the time. I recently looked at a copy of the standard journal on NNs, and was horrified at how little advance there had been in the intervening 20 years.
 
Neural Nets are still functional black boxes - you can train them, and you can prove they've learned something, but you can't easily find out what that is.
They are still very prone to overtraining - and you can't easily tell when it has happened.
Overtraining is the tendency of any system that learns incrementally to overcook: it keeps fitting the training data past the point where that helps. In statistics it's called overfitting.
Basically, except in trivial cases, the data you use to train nets is a sample from some (much) bigger set of data. As Neural Nets train they get better and better at reproducing the samples they know, and at some point their performance starts to get worse on the data that's not in the training set.
Overtraining is closely allied to Ockham's Razor - as Neural Nets fine-tune, they effectively add more complexity than the data warrants. Neural Nets have the double whammy of it not being clear in advance how many hidden nodes to use, or how many layers. Back in the 80s I experimented with minimising the structure using genetic algorithms. Then I got fed up and created the forerunner of our current fuzzy logic rule induction algorithm, as used in XMLMiner.
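To make the point about training data versus everything else concrete, here is a minimal sketch in plain Python. It is not anything from Scientio's products or from any particular library: the tiny one-hidden-layer net, the noisy sine data, and every name in it (make_data, train, best_epoch and so on) are assumptions made up for illustration. It trains on a small sample, measures the error each epoch on a larger held-out sample standing in for the "(much) bigger set", and reports the epoch where the held-out error was lowest. Whether and when the held-out error turns upward depends on the data, the seed and the settings, so treat the numbers as something to experiment with rather than a result.

```python
import math
import random

def make_data(n, rng, noise=0.3):
    """Noisy samples of sin(x) on [-3, 3] (an assumed toy problem)."""
    xs = [rng.uniform(-3, 3) for _ in range(n)]
    return [(x, math.sin(x) + rng.gauss(0, noise)) for x in xs]

def predict(params, x):
    """Forward pass: returns the net's output and the hidden activations."""
    w1, b1, w2, b2 = params
    acts = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return sum(v * a for v, a in zip(w2, acts)) + b2, acts

def mse(params, data):
    """Mean squared error of the net over a data set."""
    return sum((predict(params, x)[0] - y) ** 2 for x, y in data) / len(data)

def train(train_set, val_set, hidden=10, epochs=200, lr=0.005, seed=0):
    """Plain stochastic gradient descent; records (train, held-out) error per epoch."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b1 = [rng.uniform(-1, 1) for _ in range(hidden)]
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        for x, y in train_set:
            out, acts = predict((w1, b1, w2, b2), x)
            err = out - y
            for j in range(hidden):
                # Backpropagate through tanh before touching w2[j].
                grad_pre = err * w2[j] * (1 - acts[j] ** 2)
                w2[j] -= lr * err * acts[j]
                w1[j] -= lr * grad_pre * x
                b1[j] -= lr * grad_pre
            b2 -= lr * err
        params = (w1, b1, w2, b2)
        history.append((mse(params, train_set), mse(params, val_set)))
    return history

def best_epoch(history):
    """Index of the epoch with the lowest held-out error."""
    return min(range(len(history)), key=lambda i: history[i][1])

if __name__ == "__main__":
    rng = random.Random(42)
    train_set = make_data(20, rng)   # the small sample the net gets to see
    val_set = make_data(200, rng)    # stand-in for the rest of the bigger set
    history = train(train_set, val_set)
    stop = best_epoch(history)
    print(f"lowest held-out error at epoch {stop}: {history[stop][1]:.3f}")
    print(f"held-out error at the final epoch: {history[-1][1]:.3f}")
    print(f"training error, first vs last epoch: "
          f"{history[0][0]:.3f} vs {history[-1][0]:.3f}")
```

If the lowest held-out error comes well before the final epoch while the training error keeps dropping, that gap is the overtraining. Note that you only see it because some data was held back; the training error alone tells you nothing.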
 
This algorithm is resistant to overtraining and has no 'fiddle factors' (i.e. parameters that you need to set by experiment, which are a great source of cheating in papers), and it explains very clearly in English what it has learned.
 
If only I could find a way of getting to these people at the moment they think of designing in the dreaded Neurals!
 
 
Posted by FbaAdministrator
Tags: XmlMiner, General
