To: iwar@onelist.com (Information Warfare Mailing List)
From: Fred Cohen <fc@all.net>
Date: Wed, 7 Aug 2002 06:25:28 -0700 (PDT)
Subject: [iwar] [fc:NSF,.Intelligence.Community.Work.on.Data-Mining.Research]
NSF, Intelligence Community Work on Data-Mining Research
By Jay Wrolstad
NewsFactor Network
August 02, 2002
http://sci.newsfactor.com/perl/story/18872.html
The NSF is working with the CIA's technology branch to develop data-mining
techniques in order to analyze communications and hopefully prevent terrorist
activity. The work will involve detection of specific keywords and topics
across a variety of media.
Prompted by homeland security issues brought to the fore by the September
11th terrorist attacks, the U.S. intelligence community and the National
Science Foundation (NSF, http://www.nsf.gov/) are researching innovative data-mining techniques
designed primarily to aid law enforcement agencies at various levels. Some
US$8 million from the Intelligence Technology Innovation Center (ITIC), which
is under the Central Intelligence Agency's administrative umbrella but is
funded separately, will be spent to develop data-mining techniques that can
extract underlying patterns -- and create predictive abilities -- from
massive sets of data, such as television broadcasts and Web pages.
Real-Time Pattern Recognition
Gary Strong, program officer for NSF's
Directorate for Computer & Information Sciences and Engineering (CISE), told
NewsFactor that the research will involve experts in computer science and
will focus on two areas: data streams and data sharing. "With audio and video
streaming there is little hope of saving information because the databases
are constantly in flux and you have to make real-time decisions on what to
save," said Strong. Consequently, researchers will work on "mining"
underlying patterns and trends while pinpointing changes in those patterns.
This work will involve both topic and word "spotting," or detecting specific
words or word clusters.
Data-Sharing Policies
Because the intelligence
community and law enforcement agencies have traditionally lacked the capacity
or legal authority to share data, this research will evaluate new policies
for sharing that incorporate "probable cause" conditions, said Strong.
Efforts to use government-owned databases in a coordinated way currently
present problems because of incompatibility among the databases, not to
mention privacy restrictions.
Developing data-mining techniques within these constraints is a challenge
regardless of national security implications, he added. "We now have an
opportunity to develop a way to allow searches of protected information, such
as medical records, while protecting privacy of the data," Strong noted.
Cooperative Agreement
Besides national security, other applications for the
research range from natural disaster response to bioinformatics, which
involves searching through large numbers of documents to manage biological
functions. Cooperation between the NSF and the ITIC is made possible through
the interagency Knowledge Discovery and Dissemination (KDD) program.
Through KDD, the NSF identifies projects and programs in which research might
be related to national security and then consults the research community to
focus its efforts, where appropriate, in that direction. An NSF-sponsored
workshop held in December identified some 40 potential data-mining projects
of interest to the intelligence community. Of those, 15 were chosen to
receive funding over the next three years as part of the cooperative venture.
Projects Outlined
In one chosen project, SRI International will investigate
ways to enable machines to recognize individuals by the way they talk, a
sophisticated capability that goes far beyond existing voice-recognition
technology.
Strong said this research includes "talk printing," or identifying the
specific ways in which individuals talk, including pauses or speech
inflections. In another project, researchers at Columbia University are
working on a system to track patterns in data types -- such as broadcast news
programs, online chat rooms, e-mail and voice mail -- and then automatically
generate a summary of information about a specific event. "They will take
large numbers of messages and produce short summaries that take into
consideration both time factors and changing news reports to determine the
most accurate information," Strong said. Meanwhile, scientists at IBM's T.J.
Watson Research Center hope to create a topic-spotting method that can search
for a specific area of interest in all languages.