US20190294672A1 - Information extraction from natural language texts - Google Patents

Information extraction from natural language texts

Info

Publication number
US20190294672A1
Authority
US
United States
Prior art keywords
information
information objects
objects
natural language
conflicting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US15/938,307
Other versions
US10437931B1 (en)
Inventor
Stepan Evgenyevich Matskevich
Ilya Aleksandrovich Bulgakov
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Abbyy Development Inc
Original Assignee
Abbyy Production LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from RU2018110386A (external priority; patent RU2681356C1)
Priority claimed from RU2018110387A (external priority; patent RU2691855C1)
Application filed by Abbyy Production LLC
Assigned to ABBYY PRODUCTION LLC. Assignment of assignors interest (see document for details). Assignors: BULGAKOV, ILYA ALEKSANDROVICH; MATSKEVICH, STEPAN EVGENYEVICH
Priority to US16/545,463 (patent US10691891B2)
Publication of US20190294672A1
Application granted
Publication of US10437931B1
Assigned to ABBYY DEVELOPMENT INC. Assignment of assignors interest (see document for details). Assignor: ABBYY PRODUCTION LLC
Expired - Fee Related (current legal status)
Anticipated expiration

Abstract

Systems and methods for extracting facts from natural language texts. An example method of information extraction comprises extracting, from a natural language text, a first plurality of information objects; extracting, from the natural language text, a second plurality of information objects; identifying a set of conflicting information objects, such that a first information object of the set of conflicting information objects belongs to the first plurality of information objects and a second information object of the set of conflicting information objects belongs to the second plurality of information objects; and producing a final list of information objects extracted from the natural language text, by applying, to the set of conflicting information objects, a conflict arbitration function which performs at least one of: modifying the first information object, deleting the first information object, or merging two or more information objects of the set of conflicting information objects.
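The flow described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `InfoObject` fields, the use of overlapping character spans as the conflict test, and the `keep_more_confident` arbitration rule, which only demonstrates the "delete" action.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class InfoObject:
    cls: str           # hypothetical class label, e.g. "Person"
    start: int         # character offset where the text fragment starts
    end: int           # character offset one past the end of the fragment
    confidence: float  # hypothetical extractor confidence in [0, 1]
    source: str        # which extraction pass produced the object


def overlaps(a: InfoObject, b: InfoObject) -> bool:
    """Assumed conflict test: the two objects cover overlapping text."""
    return a.start < b.end and b.start < a.end


def find_conflicts(first: List[InfoObject],
                   second: List[InfoObject]) -> List[Tuple[InfoObject, InfoObject]]:
    """Pair each object of the first plurality with conflicting objects of the second."""
    return [(a, b) for a in first for b in second if overlaps(a, b)]


def keep_more_confident(a: InfoObject, b: InfoObject) -> InfoObject:
    """Illustrative arbitration: delete the less confident object of the pair."""
    return a if a.confidence >= b.confidence else b


def final_list(first: List[InfoObject],
               second: List[InfoObject],
               arbitrate: Callable[[InfoObject, InfoObject], InfoObject] = keep_more_confident,
               ) -> List[InfoObject]:
    """Build the final list, assuming each object takes part in at most one conflict."""
    conflicts = find_conflicts(first, second)
    in_conflict = {obj for pair in conflicts for obj in pair}
    kept = [obj for obj in first + second if obj not in in_conflict]
    kept.extend(arbitrate(a, b) for a, b in conflicts)
    return sorted(kept, key=lambda obj: obj.start)


text = "Dr. John Smith joined Acme Corp in Boston."
first_pass = [InfoObject("Person", 4, 14, 0.9, "first"),
              InfoObject("Location", 35, 41, 0.8, "first")]
second_pass = [InfoObject("Person", 0, 14, 0.7, "second"),
               InfoObject("Organization", 22, 31, 0.6, "second")]

for obj in final_list(first_pass, second_pass):
    print(obj.cls, repr(text[obj.start:obj.end]), obj.source)
```

The arbitration rule is passed in as a parameter so that a different policy, such as merging or modifying objects, could be substituted without changing the conflict detection.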

Description

Claims (20)

1. A method, comprising:
extracting, by a computer system, a first plurality of information objects from a natural language text by applying, to a plurality of attributes of the natural language text, a machine learning classifier yielding a degree of association of a fragment of the natural language text with a pre-defined class of information objects;
extracting, from the natural language text, a second plurality of information objects;
identifying a set of conflicting information objects, such that a first information object of the set of conflicting information objects belongs to the first plurality of information objects and a second information object of the set of conflicting information objects belongs to the second plurality of information objects; and
producing a final list of information objects extracted from the natural language text, by applying, to the set of conflicting information objects, a conflict arbitration function which performs at least one of: modifying the first information object, deleting the first information object, or merging two or more information objects of the set of conflicting information objects.
14. A computer system, comprising:
a memory;
a processor, coupled to the memory, the processor configured to:
extract, from a natural language text, a first plurality of information objects;
extract, from the natural language text, a second plurality of information objects;
identify a set of conflicting information objects, such that a first information object of the set of conflicting information objects belongs to the first plurality of information objects and a second information object of the set of conflicting information objects belongs to the second plurality of information objects; and
produce a final list of information objects extracted from the natural language text, by applying, to the set of conflicting information objects, a conflict arbitration function which performs at least one of: modifying the first information object, deleting the first information object, or merging two or more information objects of the set of conflicting information objects, wherein the conflict arbitration function implements a machine learning classifier yielding at least one of: a likelihood of the first information object and the second information object representing a same object, a level of confidence of the first information object, or a level of confidence of the second information object.
18. A computer-readable non-transitory storage medium comprising executable instructions that, when executed by a computer system, cause the computer system to:
extract, from a natural language text, a first plurality of information objects by applying, to a plurality of attributes of the natural language text, a machine learning classifier yielding a degree of association of a fragment of the natural language text with a pre-defined class of information objects;
extract, from the natural language text, a second plurality of information objects;
identify a set of conflicting information objects, such that a first information object of the set of conflicting information objects belongs to the first plurality of information objects and a second information object of the set of conflicting information objects belongs to the second plurality of information objects; and
produce a final list of information objects extracted from the natural language text, by applying, to the set of conflicting information objects, a conflict arbitration function which performs at least one of: modifying the first information object, deleting the first information object, or merging two or more information objects of the set of conflicting information objects.
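
The two classifier roles named in the claims can be sketched as follows. This is a hypothetical illustration under stated assumptions, not the claimed implementation: the attribute names, the hand-set weights in `PERSON_WEIGHTS`, the logistic scoring, the span-overlap stand-in for a trained "same object" classifier, and the `merge_threshold` value are all invented here. The "modify" action is not illustrated.

```python
import math
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical weights for a "Person" class; a trained classifier would learn these.
PERSON_WEIGHTS: Dict[str, float] = {
    "bias": -2.0,
    "is_capitalized": 1.5,
    "preceded_by_title": 2.5,
    "contains_digit": -3.0,
}
TITLES = {"Mr.", "Ms.", "Mrs.", "Dr.", "Prof."}


def attributes(tokens: List[str], index: int) -> Dict[str, float]:
    """Compute illustrative attributes of the token at `index`."""
    token = tokens[index]
    previous = tokens[index - 1] if index > 0 else ""
    return {
        "bias": 1.0,
        "is_capitalized": float(token[:1].isupper()),
        "preceded_by_title": float(previous in TITLES),
        "contains_digit": float(any(ch.isdigit() for ch in token)),
    }


def degree_of_association(tokens: List[str], index: int,
                          weights: Dict[str, float] = PERSON_WEIGHTS) -> float:
    """Logistic score in (0, 1): how strongly the token fits the class."""
    feats = attributes(tokens, index)
    z = sum(weights[name] * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


@dataclass(frozen=True)
class InfoObject:
    cls: str           # hypothetical class label
    start: int         # span start offset
    end: int           # span end offset (exclusive)
    confidence: float  # level of confidence in [0, 1]


def same_object_likelihood(a: InfoObject, b: InfoObject) -> float:
    """Stand-in for a trained classifier: span overlap ratio, 0 if classes differ."""
    if a.cls != b.cls:
        return 0.0
    shared = max(0, min(a.end, b.end) - max(a.start, b.start))
    covered = max(a.end, b.end) - min(a.start, b.start)
    return shared / covered if covered else 0.0


def arbitrate(first: InfoObject, second: InfoObject,
              merge_threshold: float = 0.5) -> List[InfoObject]:
    """Merge likely-identical objects; otherwise keep the more confident one."""
    if same_object_likelihood(first, second) >= merge_threshold:
        # Merge: widest span, highest confidence of the pair.
        return [InfoObject(first.cls,
                           min(first.start, second.start),
                           max(first.end, second.end),
                           max(first.confidence, second.confidence))]
    if first.confidence < second.confidence:
        return [second]  # delete the first information object
    return [first]       # keep the first object unchanged


tokens = "Dr. Smith visited room 42b".split()
print([round(degree_of_association(tokens, i), 3) for i in range(len(tokens))])
```

Keeping the scoring function and the arbitration function separate mirrors the claims, where one classifier drives extraction and another may drive conflict arbitration.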
US15/938,307 | 2018-03-23 | 2018-03-28 | Information extraction from natural language texts | Expired - Fee Related | US10437931B1 (en)

Priority Applications (1)

Application Number | Publication | Priority Date | Filing Date | Title
US16/545,463 | US10691891B2 (en) | 2018-03-23 | 2019-08-20 | Information extraction from natural language texts

Applications Claiming Priority (4)

Application Number | Publication | Priority Date | Filing Date | Title
RU2018110387 | - | 2018-03-23 | - | -
RU2018110386A | RU2681356C1 (en) | 2018-03-23 | 2018-03-23 | Classifier training used for extracting information from texts in natural language
RU2018110386 | - | 2018-03-23 | - | -
RU2018110387A | RU2691855C1 (en) | 2018-03-23 | 2018-03-23 | Training classifiers used to extract information from natural language texts

Related Child Applications (1)

Application Number | Relation | Publication | Priority Date | Filing Date | Title
US16/545,463 | Continuation | US10691891B2 (en) | 2018-03-23 | 2019-08-20 | Information extraction from natural language texts

Publications (2)

Publication Number | Publication Date
US20190294672A1 (en) | 2019-09-26
US10437931B1 | 2019-10-08

Family

ID=67984252

Family Applications (3)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US15/938,509 | Abandoned | US20190294665A1 (en) | 2018-03-23 | 2018-03-28 | Training information extraction classifiers
US15/938,307 | Expired - Fee Related | US10437931B1 (en) | 2018-03-23 | 2018-03-28 | Information extraction from natural language texts
US16/545,463 | Active | US10691891B2 (en) | 2018-03-23 | 2019-08-20 | Information extraction from natural language texts

Family Applications Before (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US15/938,509 | Abandoned | US20190294665A1 (en) | 2018-03-23 | 2018-03-28 | Training information extraction classifiers

Family Applications After (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US16/545,463 | Active | US10691891B2 (en) | 2018-03-23 | 2019-08-20 | Information extraction from natural language texts

Country Status (1)

Country | Link
US (3) | US20190294665A1 (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20200320167A1 (en)* | 2019-04-02 | 2020-10-08 | Genpact Limited | Method and system for advanced document redaction
US11048864B2 (en)* | 2019-04-01 | 2021-06-29 | Adobe Inc. | Digital annotation and digital content linking techniques
US11244159B2 (en)* | 2019-04-24 | 2022-02-08 | Hitachi, Ltd. | Article recognition system and article recognition method
KR20220094797A (en)* | 2020-12-29 | 2022-07-06 | 케이웨어 (주) | Data management server for managing metadata and control method thereof
US20220245350A1 (en)* | 2021-02-03 | 2022-08-04 | Cambium Assessment, Inc. | Framework and interface for machines
US20220309043A1 (en)* | 2019-06-26 | 2022-09-29 | Koninklijke Philips N.V. | Data quality checking based on derived relations between table columns
WO2023067576A1 (en)* | 2021-10-22 | 2023-04-27 | Open Text Corporation | Composite extraction systems and methods for artificial intelligence platform
US20230129464A1 (en) | 2020-08-24 | 2023-04-27 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US20230206670A1 (en)* | 2020-06-12 | 2023-06-29 | Microsoft Technology Licensing, Llc | Semantic representation of text in document
US20230306002A1 (en)* | 2022-03-24 | 2023-09-28 | Sap Se | Help documentation enabler
US11977854B2 (en) | 2021-08-24 | 2024-05-07 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US11989507B2 (en) | 2021-08-24 | 2024-05-21 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US11989527B2 (en) | 2021-08-24 | 2024-05-21 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12067362B2 (en) | 2021-08-24 | 2024-08-20 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12073180B2 (en) | 2021-08-24 | 2024-08-27 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US10740381B2 (en) | 2018-07-18 | 2020-08-11 | International Business Machines Corporation | Dictionary editing system integrated with text mining
US11720621B2 (en)* | 2019-03-18 | 2023-08-08 | Apple Inc. | Systems and methods for naming objects based on object content
US11119759B2 (en) | 2019-12-18 | 2021-09-14 | Bank Of America Corporation | Self-learning code conflict resolution tool
US11693855B2 (en)* | 2019-12-20 | 2023-07-04 | International Business Machines Corporation | Automatic creation of schema annotation files for converting natural language queries to structured query language
CN111274391B (en)* | 2020-01-15 | 2023-09-01 | 北京百度网讯科技有限公司 | SPO extraction method and device, electronic equipment and storage medium
KR102491753B1 (en)* | 2020-08-03 | 2023-01-26 | (주)한국플랫폼서비스기술 | Method and system for framework's deep learning a data using by query

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20050076110A1 (en)* | 2003-07-11 | 2005-04-07 | Boban Mathew | Generic inbox system and method
US8280719B2 (en)* | 2005-05-05 | 2012-10-02 | Ramp, Inc. | Methods and systems relating to information extraction
US20070016399A1 (en)* | 2005-07-12 | 2007-01-18 | International Business Machines Corporation | Method and apparatus for detecting data anomalies in statistical natural language applications
US8594996B2 (en) | 2007-10-17 | 2013-11-26 | Evri Inc. | NLP-based entity recognition and disambiguation
US8725666B2 (en) | 2010-02-26 | 2014-05-13 | Lawrence Livermore National Security, Llc. | Information extraction system
US9110852B1 (en) | 2012-07-20 | 2015-08-18 | Google Inc. | Methods and systems for extracting information from text
US9600227B2 (en)* | 2013-11-21 | 2017-03-21 | Google Technology Holdings LLC | System and method for speech-based navigation and interaction with a device's visible screen elements using a corresponding view hierarchy
RU2571373C2 (en)* | 2014-03-31 | 2015-12-20 | Общество с ограниченной ответственностью "Аби ИнфоПоиск" | Method of analysing text data tonality
US20190042548A1 (en)* | 2017-08-07 | 2019-02-07 | Zachary Peoples | Methods for arbitrating online disputes and anticipating outcomes using machine intelligence

Cited By (50)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11048864B2 (en)* | 2019-04-01 | 2021-06-29 | Adobe Inc. | Digital annotation and digital content linking techniques
US20200320167A1 (en)* | 2019-04-02 | 2020-10-08 | Genpact Limited | Method and system for advanced document redaction
US11562134B2 (en)* | 2019-04-02 | 2023-01-24 | Genpact Luxembourg S.à r.l. II | Method and system for advanced document redaction
US12124799B2 (en) | 2019-04-02 | 2024-10-22 | Genpact Usa, Inc. | Method and system for advanced document redaction
US11244159B2 (en)* | 2019-04-24 | 2022-02-08 | Hitachi, Ltd. | Article recognition system and article recognition method
US20220309043A1 (en)* | 2019-06-26 | 2022-09-29 | Koninklijke Philips N.V. | Data quality checking based on derived relations between table columns
US20230206670A1 (en)* | 2020-06-12 | 2023-06-29 | Microsoft Technology Licensing, Llc | Semantic representation of text in document
US12374141B2 (en)* | 2020-06-12 | 2025-07-29 | Microsoft Technology Licensing, Llc | Semantic representation of text in document
US12242812B2 (en) | 2020-08-24 | 2025-03-04 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12260181B2 (en) | 2020-08-24 | 2025-03-25 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US20230132455A1 (en)* | 2020-08-24 | 2023-05-04 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US20230186032A1 (en)* | 2020-08-24 | 2023-06-15 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US20230206003A1 (en)* | 2020-08-24 | 2023-06-29 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US20230129464A1 (en) | 2020-08-24 | 2023-04-27 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US11763096B2 (en) | 2020-08-24 | 2023-09-19 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12406146B2 (en) | 2020-08-24 | 2025-09-02 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12236199B2 (en) | 2020-08-24 | 2025-02-25 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US11829725B2 (en) | 2020-08-24 | 2023-11-28 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12400085B2 (en) | 2020-08-24 | 2025-08-26 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US20230130903A1 (en)* | 2020-08-24 | 2023-04-27 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12217009B2 (en) | 2020-08-24 | 2025-02-04 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12242814B2 (en) | 2020-08-24 | 2025-03-04 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12039282B2 (en) | 2020-08-24 | 2024-07-16 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12050876B2 (en) | 2020-08-24 | 2024-07-30 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12260182B2 (en) | 2020-08-24 | 2025-03-25 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12254277B2 (en) | 2020-08-24 | 2025-03-18 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12254278B2 (en) | 2020-08-24 | 2025-03-18 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12131127B2 (en) | 2020-08-24 | 2024-10-29 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12131126B2 (en) | 2020-08-24 | 2024-10-29 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12242813B2 (en) | 2020-08-24 | 2025-03-04 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
US12147773B2 (en) | 2020-08-24 | 2024-11-19 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data applied to a query answer system with a shared syntax applied to the query, factual statements and reasoning
US12159117B2 (en) | 2020-08-24 | 2024-12-03 | Unlikely Artificial Intelligence Limited | Computer implemented method for the automated analysis or use of data
KR20220094797A (en)* | 2020-12-29 | 2022-07-06 | 케이웨어 (주) | Data management server for managing metadata and control method thereof
KR102597181B1 (en) | 2020-12-29 | 2023-11-02 | 케이웨어 (주) | Data management server for managing metadata and control method thereof
US20220245350A1 (en)* | 2021-02-03 | 2022-08-04 | Cambium Assessment, Inc. | Framework and interface for machines
US11989507B2 (en) | 2021-08-24 | 2024-05-21 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US11977854B2 (en) | 2021-08-24 | 2024-05-07 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12430505B2 (en) | 2021-08-24 | 2025-09-30 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12430503B2 (en) | 2021-08-24 | 2025-09-30 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12073180B2 (en) | 2021-08-24 | 2024-08-27 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12067362B2 (en) | 2021-08-24 | 2024-08-20 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12008333B2 (en) | 2021-08-24 | 2024-06-11 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12430504B2 (en) | 2021-08-24 | 2025-09-30 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12353827B2 (en) | 2021-08-24 | 2025-07-08 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US11989527B2 (en) | 2021-08-24 | 2024-05-21 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12164868B2 (en) | 2021-08-24 | 2024-12-10 | Unlikely Artificial Intelligence Limited | Computer implemented methods for the automated analysis or use of data, including use of a large language model
US12321704B2 (en) | 2021-10-22 | 2025-06-03 | Open Text Corporation | Composite extraction systems and methods for artificial intelligence platform
WO2023067576A1 (en)* | 2021-10-22 | 2023-04-27 | Open Text Corporation | Composite extraction systems and methods for artificial intelligence platform
US12141528B2 (en) | 2021-10-22 | 2024-11-12 | Open Text Corporation | Composite extraction systems and methods for artificial intelligence platform
US20230306002A1 (en)* | 2022-03-24 | 2023-09-28 | Sap Se | Help documentation enabler

Also Published As

Publication number | Publication date
US10691891B2 (en) | 2020-06-23
US20190294665A1 (en) | 2019-09-26
US10437931B1 (en) | 2019-10-08
US20190384816A1 (en) | 2019-12-19

Similar Documents

Publication | Title
US10691891B2 (en) | Information extraction from natural language texts
US10078688B2 (en) | Evaluating text classifier parameters based on semantic features
US9928234B2 (en) | Natural language text classification based on semantic features
US10007658B2 (en) | Multi-stage recognition of named entities in natural language text based on morphological and semantic features
RU2662688C1 (en) | Extraction of information from sanitary blocks of documents using micromodels on basis of ontology
US9626358B2 (en) | Creating ontologies by analyzing natural language texts
US20200342059A1 (en) | Document classification by confidentiality levels
US20180060306A1 (en) | Extracting facts from natural language texts
US11379656B2 (en) | System and method of automatic template generation
US20190392035A1 (en) | Information object extraction using combination of classifiers analyzing local and non-local features
US10198432B2 (en) | Aspect-based sentiment analysis and report generation using machine learning methods
US10445428B2 (en) | Information object extraction using combination of classifiers
US20180157642A1 (en) | Information extraction using alternative variants of syntactico-semantic parsing
US20180032508A1 (en) | Aspect-based sentiment analysis using machine learning methods
US20170161255A1 (en) | Extracting entities from natural language texts
US20180113856A1 (en) | Producing training sets for machine learning methods by performing deep semantic analysis of natural language texts
US10303770B2 (en) | Determining confidence levels associated with attribute values of informational objects
US20170052950A1 (en) | Extracting information from structured documents comprising natural language text
US20180081861A1 (en) | Smart document building using natural language processing
RU2618374C1 (en) | Identifying collocations in the texts in natural language
US20180181559A1 (en) | Utilizing user-verified data for training confidence level models
US20190065453A1 (en) | Reconstructing textual annotations associated with information objects
US10706369B2 (en) | Verification of information object attributes
RU2681356C1 (en) | Classifier training used for extracting information from texts in natural language
RU2691855C1 (en) | Training classifiers used to extract information from natural language texts

Legal Events

DateCodeTitleDescription
ASAssignment

Owner name:ABBYY PRODUCTION LLC, RUSSIAN FEDERATION

Free format text:ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MATSKEVICH, STEPAN EVGENYEVICH;BULGAKOV, ILYA ALEKSANDROVICH;REEL/FRAME:045375/0364

Effective date:20180328

FEPPFee payment procedure

Free format text:ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCFInformation on status: patent grant

Free format text:PATENTED CASE

ASAssignment

Owner name:ABBYY DEVELOPMENT INC., NORTH CAROLINA

Free format text:ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ABBYY PRODUCTION LLC;REEL/FRAME:059249/0873

Effective date:20211231

FEPPFee payment procedure

Free format text:MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPSLapse for failure to pay maintenance fees

Free format text:PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCHInformation on status: patent discontinuation

Free format text:PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FPLapsed due to failure to pay maintenance fee

Effective date:20231008

