US20050021545A1 - Very-large-scale automatic categorizer for Web content - Google Patents

Info

Publication number
US20050021545A1
Authority
US
United States
Prior art keywords
node
nodes
child nodes
data object
confidence rating
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US10/923,431
Inventor
Daniel Lulich
Farzin Guilak
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2001-05-07
Filing date
2004-08-20
Publication date
2005-01-27
Application filed by Microsoft Corp
Priority to US10/923,431
Publication of US20050021545A1
Assigned to MICROSOFT TECHNOLOGY LICENSING, LLC: assignment of assignors interest (see document for details). Assignors: MICROSOFT CORPORATION
Current legal status: Abandoned

Abstract

A method and apparatus for efficiently classifying and categorizing data objects such as electronic text, graphics, and audio based documents within very-large-scale hierarchical classification trees is provided. In accordance with one embodiment of the invention, a first node of a plurality of nodes of a subject hierarchy is selected. Previously classified data objects corresponding to a selected first node of a subject hierarchy as well as any associated sub-nodes of the selected node are aggregated to form a content class of data objects. Similarly, data objects corresponding to sibling nodes of the selected node and any associated sub-nodes of the sibling nodes are then aggregated to form an anti-content class of data objects. Features are then extracted from each of the content class of data objects and the anti-content class of data objects to facilitate characterization of said previously classified data objects.
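The aggregation step described in the abstract can be illustrated with a minimal Python sketch. It is not the applicants' implementation: the `Node` class, its field names, and the toy hierarchy are hypothetical, and feature extraction is left out entirely. The sketch only shows how a content class (the selected node plus its sub-nodes) and an anti-content class (the sibling nodes plus their sub-nodes) might be assembled.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """Hypothetical hierarchy node; names and fields are illustrative only."""
    name: str
    documents: list = field(default_factory=list)
    children: list = field(default_factory=list)
    parent: Optional["Node"] = None

    def add_child(self, child):
        # Link the child to this node and return it for chaining.
        child.parent = self
        self.children.append(child)
        return child


def subtree_documents(node):
    """Collect documents filed at a node and at all of its sub-nodes."""
    docs = list(node.documents)
    for child in node.children:
        docs.extend(subtree_documents(child))
    return docs


def content_and_anti_content(selected):
    """Return (content class, anti-content class) for the selected node."""
    content = subtree_documents(selected)
    anti = []
    if selected.parent is not None:
        for sibling in selected.parent.children:
            if sibling is not selected:
                anti.extend(subtree_documents(sibling))
    return content, anti


# Toy hierarchy with made-up document identifiers.
root = Node("root")
sports = root.add_child(Node("sports", ["s1"]))
sports.add_child(Node("soccer", ["s2", "s3"]))
arts = root.add_child(Node("arts", ["a1"]))
arts.add_child(Node("music", ["a2"]))

content, anti = content_and_anti_content(sports)
print(content)  # ['s1', 's2', 's3']
print(anti)     # ['a1', 'a2']
```

Selecting `sports` here pulls in the documents of its sub-node `soccer` as positive examples, while everything filed under the sibling `arts` becomes the negative set.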

Description

Claims (24)

1. A method of training a classifier system by utilizing previously classified data objects comprising one or more electronic documents and organized into a subject hierarchy of a plurality of nodes, the method comprising:
selecting one node of the plurality of nodes;
aggregating those of the previously classified data objects corresponding to the selected node and any associated sub-nodes of the selected node, to form a content class of data objects, said content class of data objects comprising a content class of the one or more electronic documents;
aggregating those of the previously classified data objects corresponding to any associated sibling nodes of the selected node and any associated sub-nodes of the sibling nodes to form an anti-content class of data objects, said anti-content class of data objects comprising an anti-content class of the one or more electronic documents; and
extracting features from at least one of the content class of data objects and the anti-content class of data objects to facilitate characterization of said previously classified data objects.
13. A method of classifying a data object, the method comprising:
selecting a first node of a hierarchically organized classifier having a plurality of nodes;
determining if the first node of said plurality of nodes is the parent of one or more child nodes;
upon determining that said first node is the parent of one or more child nodes, selecting a first of said one or more child nodes and classifying said data object at the first of said one or more child nodes to produce a confidence rating, said data object comprising an electronic document;
recursively selecting each of said one or more child nodes that remain and classifying the data object at each selected one or more child nodes to respectively produce a confidence rating for each selected one or more child nodes; and
assigning the data object to each node of said plurality of nodes having produced an acceptable confidence rating.
35. An apparatus comprising:
a storage medium having stored therein a plurality of programming instructions designed to implement a plurality of functions of a category name service for providing a category name to a data object, including first one or more functions to
select a first node of a hierarchically organized classifier having a plurality of nodes,
determine if the first node of said plurality of nodes is a parent of one or more child nodes,
select a first of said one or more child nodes and classify said data object at the first of said one or more child nodes to produce a confidence rating if said first node is the parent of one or more child nodes,
select each of said one or more child nodes that remain and classify the data object at each selected one or more child nodes to respectively produce a confidence rating for each selected one or more child nodes,
assign the data object to each node of said plurality of nodes having produced an acceptable confidence rating; and
a processor coupled to the storage medium to execute the programming instructions.
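The classification traversal of claim 13 can be sketched as follows. This is a hedged illustration rather than the claimed system: the `ClassifierNode` class, the keyword-overlap `confidence` function, and the 0.5 threshold are all assumptions chosen for brevity. The choice to descend only into children whose rating is acceptable is likewise an assumption of this sketch; the claim itself only requires that each child be classified and that the data object be assigned to every node with an acceptable confidence rating.

```python
from dataclasses import dataclass, field


@dataclass(eq=False)
class ClassifierNode:
    """Hypothetical classifier node holding a toy keyword model."""
    name: str
    keywords: frozenset = frozenset()
    children: list = field(default_factory=list)


def confidence(node, words):
    """Toy rating: fraction of the node's keywords found in the document."""
    if not node.keywords:
        return 0.0
    return len(node.keywords & words) / len(node.keywords)


def classify(node, words, threshold=0.5, assigned=None):
    """Rate the document at each child and recurse into accepted children."""
    if assigned is None:
        assigned = []
    for child in node.children:
        rating = confidence(child, words)
        if rating >= threshold:
            assigned.append((child.name, rating))
            # Assumption of this sketch: only accepted nodes are explored further.
            classify(child, words, threshold, assigned)
    return assigned


soccer = ClassifierNode("soccer", frozenset({"goal", "ball"}))
tennis = ClassifierNode("tennis", frozenset({"racket", "serve"}))
sports = ClassifierNode("sports", frozenset({"game", "team"}), [soccer, tennis])
arts = ClassifierNode("arts", frozenset({"paint", "music"}))
root = ClassifierNode("root", children=[sports, arts])

document_words = {"team", "game", "goal", "ball", "score"}
result = classify(root, document_words)
print([name for name, _ in result])  # ['sports', 'soccer']
```

A real classifier would replace `confidence` with a trained model per node; the traversal logic stays the same.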
US10/923,431 (priority date 2001-05-07, filing date 2004-08-20): Very-large-scale automatic categorizer for Web content. Status: Abandoned. Publication: US20050021545A1 (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
US10/923,431, US20050021545A1 (en) | 2001-05-07 | 2004-08-20 | Very-large-scale automatic categorizer for Web content

Applications Claiming Priority (3)

Application Number | Priority Date | Filing Date | Title
US28941801P | 2001-05-07 | 2001-05-07 |
US09/963,178, US6826576B2 (en) | 2001-05-07 | 2001-09-25 | Very-large-scale automatic categorizer for web content
US10/923,431, US20050021545A1 (en) | 2001-05-07 | 2004-08-20 | Very-large-scale automatic categorizer for Web content

Related Parent Applications (1)

Application Number | Title | Priority Date | Filing Date
US09/963,178 (Continuation), US6826576B2 (en) | Very-large-scale automatic categorizer for web content | 2001-05-07 | 2001-09-25

Publications (1)

Publication Number | Publication Date
US20050021545A1 | 2005-01-27

Family

ID=26965626

Family Applications (2)

Application Number | Status | Title | Priority Date | Filing Date
US09/963,178, US6826576B2 (en) | Expired - Lifetime | Very-large-scale automatic categorizer for web content | 2001-05-07 | 2001-09-25
US10/923,431, US20050021545A1 (en) | Abandoned | Very-large-scale automatic categorizer for Web content | 2001-05-07 | 2004-08-20

Country Status (3)

Country | Link
US (2) | US6826576B2 (en)
EP (1) | EP1386250A4 (en)
WO (1) | WO2002091216A1 (en)

Patent Citations (31)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US5201048A (en) * | 1988-12-01 | 1993-04-06 | Axxess Technologies, Inc. | High speed computer system for search and retrieval of data within text and record oriented files
US5303361A (en) * | 1989-01-18 | 1994-04-12 | Lotus Development Corporation | Search and retrieval system
US4991094A (en) * | 1989-04-26 | 1991-02-05 | International Business Machines Corporation | Method for language-independent text tokenization using a character categorization
US5461698A (en) * | 1991-05-10 | 1995-10-24 | Siemens Corporate Research, Inc. | Method for modelling similarity function using neural network
US5724567A (en) * | 1994-04-25 | 1998-03-03 | Apple Computer, Inc. | System for directing relevance-ranked data objects to computer users
US5717913A (en) * | 1995-01-03 | 1998-02-10 | University Of Central Florida | Method for detecting and extracting text data using database schemas
US5848186A (en) * | 1995-08-11 | 1998-12-08 | Canon Kabushiki Kaisha | Feature extraction system for identifying text within a table image
US6058205A (en) * | 1997-01-09 | 2000-05-02 | International Business Machines Corporation | System and method for partitioning the feature space of a classifier in a pattern classification system
US5875446A (en) * | 1997-02-24 | 1999-02-23 | International Business Machines Corporation | System and method for hierarchically grouping and ranking a set of objects in a query context based on one or more relationships
US5940821A (en) * | 1997-05-21 | 1999-08-17 | Oracle Corporation | Information presentation in a knowledge base search and retrieval system
US6389436B1 (en) * | 1997-12-15 | 2002-05-14 | International Business Machines Corporation | Enhanced hypertext categorization using hyperlinks
US6167402A (en) * | 1998-04-27 | 2000-12-26 | Sun Microsystems, Inc. | High performance message store
US6334131B2 (en) * | 1998-08-29 | 2001-12-25 | International Business Machines Corporation | Method for cataloging, filtering, and relevance ranking frame-based hierarchical information structures
US20010042085A1 (en) * | 1998-09-30 | 2001-11-15 | Mark Peairs | Automatic document classification using text and images
US6615242B1 (en) * | 1998-12-28 | 2003-09-02 | At&T Corp. | Automatic uniform resource locator-based message filter
US6654787B1 (en) * | 1998-12-31 | 2003-11-25 | Brightmail, Incorporated | Method and apparatus for filtering e-mail
US6418433B1 (en) * | 1999-01-28 | 2002-07-09 | International Business Machines Corporation | System and method for focussed web crawling
US6393427B1 (en) * | 1999-03-22 | 2002-05-21 | Nec Usa, Inc. | Personalized navigation trees
US20030195872A1 (en) * | 1999-04-12 | 2003-10-16 | Paul Senn | Web-based information content analyzer and information dimension dictionary
US6249785B1 (en) * | 1999-05-06 | 2001-06-19 | Mediachoice, Inc. | Method for predicting ratings
US6868498B1 (en) * | 1999-09-01 | 2005-03-15 | Peter L. Katsikas | System for eliminating unauthorized electronic mail
US6321267B1 (en) * | 1999-11-23 | 2001-11-20 | Escom Corporation | Method and apparatus for filtering junk email
US6519580B1 (en) * | 2000-06-08 | 2003-02-11 | International Business Machines Corporation | Decision-tree-based symbolic rule induction system for text categorization
US6772196B1 (en) * | 2000-07-27 | 2004-08-03 | Propel Software Corp. | Electronic mail filtering system and methods
US6931433B1 (en) * | 2000-08-24 | 2005-08-16 | Yahoo! Inc. | Processing of unsolicited bulk electronic communication
US6965919B1 (en) * | 2000-08-24 | 2005-11-15 | Yahoo! Inc. | Processing of unsolicited bulk electronic mail
US6901398B1 (en) * | 2001-02-12 | 2005-05-31 | Microsoft Corporation | System and method for constructing and personalizing a universal information classifier
US6867498B2 (en) * | 2002-08-30 | 2005-03-15 | Micron Technology, Inc. | Metal line layout of an integrated circuit
US6868143B1 (en) * | 2002-10-01 | 2005-03-15 | Bellsouth Intellectual Property | System and method for advanced unified messaging
US6732157B1 (en) * | 2002-12-13 | 2004-05-04 | Networks Associates Technology, Inc. | Comprehensive anti-spam system, method, and computer program product for filtering unwanted e-mail messages
US20070208731A1 (en) * | 2006-03-06 | 2007-09-06 | Fuji Xerox Co., Ltd. | Document information processing apparatus, method of document information processing, computer readable medium and computer data signal

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
TWI396980B (en) * | 2005-05-18 | 2013-05-21 | Ibm | Cross descriptor learning system, method and program product therefor
US20070005529A1 (en) * | 2005-05-18 | 2007-01-04 | Naphade Milind R | Cross descriptor learning system, method and program product therefor
US8214310B2 (en) * | 2005-05-18 | 2012-07-03 | International Business Machines Corporation | Cross descriptor learning system, method and program product therefor
US8886686B2 (en) * | 2005-07-05 | 2014-11-11 | Oracle International Corporation | Making and using abstract XML representations of data dictionary metadata
US20130086095A1 (en) * | 2005-07-05 | 2013-04-04 | Oracle International Corporation | Making and using abstract xml representations of data dictionary metadata
US8209407B2 (en) * | 2006-02-10 | 2012-06-26 | The United States Of America, As Represented By The Secretary Of The Navy | System and method for web service discovery and access
US20080040510A1 (en) * | 2006-02-10 | 2008-02-14 | Elizabeth Warner | Web services broker and method of using same
US7644067B2 (en) * | 2006-06-14 | 2010-01-05 | Kabushiki Kaisha Toshiba | System and method for accessing content from selected sources via a document processing device
US20070294204A1 (en) * | 2006-06-14 | 2007-12-20 | Kabushiki Kaisha Toshiba | System and method for accessing content from selected sources via a document processing device
US20090171897A1 (en) * | 2007-12-28 | 2009-07-02 | Ulrich Spinola | Method and system for case management
US20100082627A1 (en) * | 2008-09-24 | 2010-04-01 | Yahoo! Inc. | Optimization filters for user generated content searches
US8793249B2 (en) * | 2008-09-24 | 2014-07-29 | Yahoo! Inc. | Optimization filters for user generated content searches
US20110196872A1 (en) * | 2008-10-10 | 2011-08-11 | The Regents Of The University Of California | Computational Method for Comparing, Classifying, Indexing, and Cataloging of Electronically Stored Linear Information
WO2010042888A1 (en) * | 2008-10-10 | 2010-04-15 | The Regents Of The University Of California | A computational method for comparing, classifying, indexing, and cataloging of electronically stored linear information
US20100169319A1 (en) * | 2008-12-30 | 2010-07-01 | International Business Machines Corporation | Verification of Data Categorization
US8346738B2 (en) * | 2008-12-30 | 2013-01-01 | International Business Machines Corporation | Verification of data categorization
US20140089302A1 (en) * | 2009-09-30 | 2014-03-27 | Gennady LAPIR | Method and system for extraction

Also Published As

Publication number | Publication date
EP1386250A4 (en) | 2006-06-14
WO2002091216A1 (en) | 2002-11-14
EP1386250A1 (en) | 2004-02-04
US20020174095A1 (en) | 2002-11-21
US6826576B2 (en) | 2004-11-30

Similar Documents

Publication | Publication Date | Title
US6826576B2 (en) | | Very-large-scale automatic categorizer for web content
US6938025B1 (en) | | Method and apparatus for automatically determining salient features for object classification
CN102576358B (en) | | Word pair acquisition device, word pair acquisition method, and program
EP1565846B1 (en) | | Information storage and retrieval
US6665681B1 (en) | | System and method for generating a taxonomy from a plurality of documents
EP0960376B1 (en) | | Text processing and retrieval system and method
US8131724B2 (en) | | System for similar document detection
US7333985B2 (en) | | Dynamic content clustering
EP1426882A2 (en) | | Information storage and retrieval
US7024405B2 (en) | | Method and apparatus for improved internet searching
KR20060048779A (en) | | Phrases Identification in Information Retrieval Systems
KR20060048780A (en) | | Phrases-based Indexing in Information Retrieval Systems
KR20060048777A (en) | | Phrase-based generation of document descriptions
KR20060048778A (en) | | Phrases-based Search in Information Retrieval Systems
EP1508105A2 (en) | | System and method for automatically discovering a hierarchy of concepts from a corpus of documents
WO2009154570A1 (en) | | System and method for aligning and indexing multilingual documents
CN107967290A (en) | | A kind of knowledge mapping network establishing method and system, medium based on magnanimity scientific research data
Kruger et al. | | DEADLINER: Building a new niche search engine
CA2500264A1 (en) | | Method and apparatus for automatically determining salient features for object classification
KR20050096912A (en) | | Method and apparatus for automatically determining salient features for object classification
Kourik | | Performance of classification tools on unstructured text
Sahni et al. | | Topic Modeling on Online News and News Clustering
HK1024076B (en) | | Text processing and retrieval system and method

Legal Events

Date | Code | Title | Description
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION
 | AS | Assignment | Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNOR: MICROSOFT CORPORATION; REEL/FRAME: 034766/0001; Effective date: 20141014
