US20220121637A1 - Structured document indexing and searching - Google Patents

Structured document indexing and searching

Info

Publication number
US20220121637A1
Authority
US
United States
Prior art keywords
structured data
data structure
structured
path
hash
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US17/549,688
Inventor
Bruce R. Tietjen
Ronald P. Millett
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Perfect Search Corp
Original Assignee
Perfect Search Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Perfect Search Corp
Priority to US17/549,688
Assigned to PERFECT SEARCH CORPORATION. Assignment of assignors interest (see document for details). Assignors: MILLETT, RONALD P.; TIETJEN, BRUCE R.
Publication of US20220121637A1
Abandoned (current legal status)

Abstract

Searching for data contained in a structured data structure. A method includes receiving a query. The query includes a structured data structure path and a first element related to the structured data structure path. One or more patterns are created comprising at least a portion of the structured data structure path and one or more elements related to the first element. For each of the one or more patterns, a hash is created. The created hashes are looked up in a hash index to identify one or more structured data structures correlated to the hashes. The one or more structured data structures are identified to a user.
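
As a rough illustration of the search flow summarized in the abstract, the following Python sketch expands a query into patterns, hashes each pattern, and looks the hashes up in an index. Everything concrete here is an assumption made for illustration and is not taken from the application: the function names (`pattern_hash`, `related_elements`, `make_patterns`, `search`), the use of SHA-256, the `path=element` pattern format, the sample data, and the choice of a lowercase variant as a "related" element.

```python
import hashlib


def pattern_hash(pattern: str) -> str:
    """Hash a pattern string (SHA-256 is an illustrative choice)."""
    return hashlib.sha256(pattern.encode("utf-8")).hexdigest()


def related_elements(element: str) -> list[str]:
    """Hypothetical notion of 'related': the element plus its lowercase form."""
    variants = [element]
    if element.lower() != element:
        variants.append(element.lower())
    return variants


def make_patterns(path: str, element: str) -> list[str]:
    """Combine the path with each related element into a 'path=element' pattern."""
    return [f"{path}={e}" for e in related_elements(element)]


def search(hash_index: dict[str, set[str]], path: str, element: str) -> set[str]:
    """Return IDs of the structures correlated to any of the pattern hashes."""
    hits: set[str] = set()
    for pattern in make_patterns(path, element):
        hits |= hash_index.get(pattern_hash(pattern), set())
    return hits


# A tiny prebuilt index mapping pattern hashes to document IDs (made-up data).
hash_index = {
    pattern_hash("/book/author=smith"): {"doc-1"},
    pattern_hash("/book/author=Jones"): {"doc-2"},
}

results = search(hash_index, "/book/author", "Smith")
```

Because only hashes are stored, a lookup of this kind can only confirm that some indexed pattern produced the same hash; whether and how candidate matches are verified afterwards is not covered by this sketch.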

Claims (20)

1. A system for indexing data contained in a structured data structure, the system comprising:
at least one processor; and
at least one computer readable medium coupled to the processor comprising computer executable instructions that when executed by the processor implement:
a pattern generator, wherein the pattern generator is configured to:
identify a structured data structure path in a structured data structure comprising a plurality of records, each of the records comprising data values, wherein a particular record can be reached by following the structured data structure path;
identify a first data value from the record;
create one or more patterns comprising at least a portion of the structured data structure path combined with one or more elements related to the first data value such that at least one of the patterns comprises the structured data structure path and the first data value;
a hasher configured to, for each of the one or more patterns, including at least one pattern that includes the first data value and at least a portion of the structured data structure path, create a hash; and
an indexer configured to index created hashes in a hash index by correlating the hashes in the hash index with the structured data structure, including indexing the hash created for the at least one pattern comprising both the structured data structure path and the first data value.
8. In a data storage environment, a method of indexing data contained in a structured data structure, the method comprising:
identifying a structured data structure path in a structured data structure comprising a plurality of records, each of the records comprising data values, wherein a particular record can be reached by following the structured data structure path;
identifying a first data value from the record;
creating one or more patterns comprising at least a portion of the structured data structure path combined with one or more elements related to the first data value such that at least one of the patterns comprises the structured data structure path and the first data value;
for each of the one or more patterns, including at least one pattern that includes the first data value and at least a portion of the structured data structure path, creating a hash; and
indexing created hashes in a hash index by correlating the hashes in the hash index with the structured data structure, including indexing the hash created for the at least one pattern comprising both the structured data structure path and the first data value.
16. In a data storage environment, a method of searching for data contained in a structured data structure, the method comprising:
receiving a query, wherein the query comprises a structured data structure path and a first data value;
for the at least a portion of the structured data structure path combined with the first data value, creating a hash;
looking up the created hash in a hash index to identify one or more structured data structures, wherein the hash index comprises a correlation of hashes with structured data structures, including the hash for the at least a portion of the structured data structure path combined with the first data value, the hashes in the hash index being based on hashes of structured data structure paths combined with values in records of the structured data structure that are reached by following the structured data structure paths to the records;
identifying to a user the one or more structured data structures correlated to the hash for the at least a portion of the structured data structure path combined with the first data value.
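
The indexing and searching steps recited in the claims above can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the structured data structure is modeled as nested dicts and lists, list items stand in for records, patterns are built as `path=value` strings over the full path and each trailing portion of it, and SHA-256 stands in for the unspecified hash. All function names and sample documents are hypothetical.

```python
import hashlib
from collections import defaultdict


def hash_pattern(pattern: str) -> str:
    """Hasher: map a pattern string to a hash (SHA-256 is an assumption)."""
    return hashlib.sha256(pattern.encode("utf-8")).hexdigest()


def walk(node, path=()):
    """Yield (path, value) for every leaf; list items share the parent's path."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from walk(child, path + (key,))
    elif isinstance(node, list):
        for child in node:
            yield from walk(child, path)
    else:
        yield path, str(node)


def patterns_for(path, value):
    """Pattern generator: the full path and each trailing portion, plus the value."""
    return ["/".join(path[i:]) + "=" + value for i in range(len(path))]


def index_document(hash_index, doc_id, document):
    """Indexer: correlate every pattern hash with the document ID."""
    for path, value in walk(document):
        for pattern in patterns_for(path, value):
            hash_index[hash_pattern(pattern)].add(doc_id)


def lookup(hash_index, path, value):
    """Search: hash the queried path portion combined with the value, then look it up."""
    key = hash_pattern("/".join(path) + "=" + value)
    return set(hash_index.get(key, set()))


hash_index = defaultdict(set)
index_document(hash_index, "doc-1", {"library": {"book": [{"title": "Dune", "year": 1965}]}})
index_document(hash_index, "doc-2", {"library": {"book": [{"title": "Emma", "year": 1815}]}})

full_match = lookup(hash_index, ("library", "book", "title"), "Dune")
partial_match = lookup(hash_index, ("title",), "Emma")
```

Indexing the trailing portions of each path is one way to read "at least a portion of the structured data structure path"; it lets a query that names only `title` match without knowing the enclosing elements, at the cost of more index entries per value.
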

US17/549,688 (priority date 2016-05-26, filing date 2021-12-13): Structured document indexing and searching. Abandoned. US20220121637A1 (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
US17/549,688 | US20220121637A1 (en) | 2016-05-26 | 2021-12-13 | Structured document indexing and searching

Applications Claiming Priority (3)

Application Number | Publication | Priority Date | Filing Date | Title
US201662342072P | | 2016-05-26 | 2016-05-26 |
US15/607,058 | US11200217B2 (en) | 2016-05-26 | 2017-05-26 | Structured document indexing and searching
US17/549,688 | US20220121637A1 (en) | 2016-05-26 | 2021-12-13 | Structured document indexing and searching

Related Parent Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
US15/607,058 | Continuation | US11200217B2 (en) | 2016-05-26 | 2017-05-26 | Structured document indexing and searching

Publications (1)

Publication Number | Publication Date
US20220121637A1 (en) | 2022-04-21

Family

ID=60418885

Family Applications (2)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US15/607,058 | Active (2039-01-18) | US11200217B2 (en) | 2016-05-26 | 2017-05-26 | Structured document indexing and searching
US17/549,688 | Abandoned | US20220121637A1 (en) | 2016-05-26 | 2021-12-13 | Structured document indexing and searching

Family Applications Before (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US15/607,058 | Active (2039-01-18) | US11200217B2 (en) | 2016-05-26 | 2017-05-26 | Structured document indexing and searching

Country Status (1)

Country | Link
US (2) | US11200217B2 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
CA3055172C (en)* | 2017-03-03 | 2022-03-01 | Perkinelmer Informatics, Inc. | Systems and methods for searching and indexing documents comprising chemical information
US10572576B1 (en)* | 2017-04-06 | 2020-02-25 | Palantir Technologies Inc. | Systems and methods for facilitating data object extraction from unstructured documents
CN116628127A (en)* | 2023-06-07 | 2023-08-22 | 中国银行股份有限公司 | Data storage method and related device

Citations (5)

* Cited by examiner, † Cited by third party

Publication number | Priority date | Publication date | Assignee | Title
US20030101416A1 (en)* | 2001-11-26 | 2003-05-29 | Evolution Consulting Group Plc | Creating XML documents
US20040064466A1 (en)* | 2002-09-27 | 2004-04-01 | Oracle International Corporation | Techniques for rewriting XML queries directed to relational database constructs
US20060067334A1 (en)* | 2004-08-18 | 2006-03-30 | Ougarov Andrei V | System and methods for dynamic generation of point / tag configurations
US20140324871A1 (en)* | 2013-04-30 | 2014-10-30 | Wal-Mart Stores, Inc. | Decision-tree based quantitative and qualitative record classification
US20160171018A1 (en)* | 2014-12-16 | 2016-06-16 | Microsoft Technology Licensing, Llc | Append structured data system for maintaining structured format compatibility

Also Published As

Publication number | Publication date
US20170344548A1 (en) | 2017-11-30
US11200217B2 (en) | 2021-12-14

Legal Events

Date | Code | Title | Description
 | AS | Assignment | Owner name: PERFECT SEARCH CORPORATION, UTAH. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: TIETJEN, BRUCE R.; MILLETT, RONALD P.; SIGNING DATES FROM 20160601 TO 20160701; REEL/FRAME: 058377/0059
 | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED
 | STCB | Information on status: application discontinuation | Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

