US20220067591A1 - Machine learning model selection and explanation for multi-dimensional datasets - Google Patents

Machine learning model selection and explanation for multi-dimensional datasets

Info

Publication number
US20220067591A1
Authority
US
United States
Prior art keywords
machine learning
result
learning models
dimensions
query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
US17/445,667
Inventor
Jignesh Patel
Junda Chen
Dylan Paul Bacon
Jiatong Li
Ushmal Ramesh
Rogers Jeffrey Leo John
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DataChat AI
Original Assignee
DataChat AI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-08-25
Filing date
2021-08-23
Publication date
2022-03-03
Application filed by DataChat AI
Priority to US17/445,667
Assigned to DataChat.ai. Assignment of assignors interest (see document for details). Assignors: BACON, Dylan Paul; LEO JOHN, ROGERS JEFFREY; CHEN, JUNDA; LI, JIATONG; PATEL, JIGNESH; RAMESH, Ushmal
Publication of US20220067591A1
Legal status: Pending

Abstract

In general, techniques are described for various aspects of accessing datasets. A device comprising a memory configured to store a multi-dimensional dataset, and a processor, may perform the techniques. The processor may apply a plurality of machine learning models to the multi-dimensional dataset to obtain a result output by each of the plurality of machine learning models. The processor may next determine a correlation of one or more dimensions of the multi-dimensional dataset to the results output by each of the machine learning models, and select, based on the correlation determined between the dimensions and the result output by each of the machine learning models, a subset of the plurality of machine learning models to obtain the result for each of the subset of the machine learning models. The processor may then output the result for each of the subset of the plurality of machine learning models.
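
The steps in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method disclosed in the application: the record does not say how the correlation is computed or how it drives selection, so the sketch assumes a Pearson correlation between each dimension and each model's output, and keeps a model when at least one dimension reaches a hypothetical `min_corr` threshold. The names `pearson`, `select_models`, `min_corr`, and the toy models are all invented for the example.

```python
from math import sqrt
from typing import Callable, Dict, List, Sequence

Row = Dict[str, float]
Model = Callable[[Row], float]


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def select_models(
    rows: List[Row],
    models: Dict[str, Model],
    dimensions: List[str],
    min_corr: float = 0.5,  # assumed selection threshold, not from the source
) -> Dict[str, dict]:
    """Apply every model, correlate each dimension with its output, keep a subset."""
    selected = {}
    for name, model in models.items():
        outputs = [model(row) for row in rows]  # result output by this model
        corr = {d: pearson([row[d] for row in rows], outputs) for d in dimensions}
        # Assumed rule: keep the model if any dimension correlates strongly enough.
        if max(abs(c) for c in corr.values()) >= min_corr:
            selected[name] = {"result": outputs, "correlation": corr}
    return selected


rows = [
    {"age": 20.0, "income": 30.0},
    {"age": 30.0, "income": 45.0},
    {"age": 40.0, "income": 50.0},
    {"age": 50.0, "income": 80.0},
]
models = {
    "linear_on_income": lambda r: 2.0 * r["income"] + 1.0,  # hypothetical model
    "constant": lambda r: 1.0,                               # hypothetical model
}
chosen = select_models(rows, models, ["age", "income"])
print(sorted(chosen))  # ['linear_on_income']
```

The constant model is dropped because its output does not vary with any dimension, so every correlation is treated as zero. A real system would likely use trained models and a more robust relevance measure.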

Claims (23)

What is claimed is:
1. A device configured to interpret a multi-dimensional dataset, the device comprising:
a memory configured to store the multi-dimensional dataset; and
one or more processors configured to:
apply a plurality of machine learning models to the multi-dimensional dataset to obtain a result output by each of the plurality of machine learning models;
determine a correlation of one or more dimensions of the multi-dimensional dataset to the results output by each of the plurality of machine learning models;
select, based on the correlation determined between the one or more dimensions and the result output by each of the plurality of machine learning models, a subset of the plurality of machine learning models to obtain the result for each of the subset of the plurality of machine learning models; and
output the result for each of the subset of the plurality of machine learning models.
2. The device of claim 1, wherein the one or more processors are configured to output the result as a sentence using plain language.
3. The device of claim 1, wherein the one or more processors are configured to output the result for at least one of the subset of the plurality of machine learning models as a graph identifying a relevance of each of the one or more dimensions to the result for each of the subset of the plurality of machine learning models.
4. The device of claim 3, wherein the graph comprises an impact graph.
5. The device of claim 1, wherein the one or more processors are configured to output the result for each of the subset of the plurality of machine learning models as a graphical representation of a decision tree.
6. The device of claim 1, wherein the one or more processors are further configured to:
determine, based on a comparison of the correlation determined between the one or more dimensions and the result output by each of the plurality of machine learning models to a relevance threshold, one or more low relevance dimensions of the multi-dimensional dataset that have low relevance to the result output by each of the plurality of machine learning models; and
output an indication explaining that the one or more low relevance dimensions have low relevance to the result.
7. The device of claim 6, wherein the one or more processors are configured to output a sentence in plain language that explain the one or more low relevance dimensions having low relevance to the result.
8. The device of claim 1, wherein the one or more processors are further configured to refrain from transforming the one or more dimensions of the multi-dimensional dataset prior to application of the plurality of machine learning models.
9. The device of claim 1, wherein the one or more processors are further configured to:
determine, based on the results for each of the one or more of the plurality of machine learning models, one or more of a plurality of charts to explain the corresponding result;
rank the one or more of the plurality of charts to identify a highest ranked chart;
select the highest ranked chart; and
output the highest ranked chart as a visual chart.
10. The device of claim 9, wherein the one or more processors are further configured to:
generate an explanation in plain language explaining a formulation of the visual chart; and
output the explanation.
11. The device of claim 1, wherein the one or more processors are further configured to:
generate a pipeline report explaining how the device produced the plurality of the machine learning models; and
output the pipeline report.
12. A method of interpreting a multi-dimensional dataset, the method comprising:
applying a plurality of machine learning models to the multi-dimensional dataset to obtain a result output by each of the plurality of machine learning models;
determining a correlation of the one or more dimensions of the multi-dimensional dataset to the results output by each of the plurality of machine learning models;
selecting, based on the correlation determined between the one or more dimensions and the result output by each of the plurality of machine learning models, a subset of the plurality of machine learning models to obtain the result for each of the subset of the plurality of machine learning models; and
outputting the result for each of the subset of the plurality of machine learning models.
13. The method of claim 12, wherein outputting the result comprises outputting the result as a sentence using plain language.
14. The method of claim 12, wherein outputting the result comprises outputting the result for at least one of the subset of the plurality of machine learning models as a graph identifying a relevance of each of the one or more dimensions to the result for each of the subset of the plurality of machine learning models.
15. The method of claim 14, wherein the graph comprises an impact graph.
16. The method of claim 12, wherein outputting the result comprises outputting the result for each of the subset of the plurality of machine learning models as a graphical representation of a decision tree.
17. The method of claim 12, further comprising:
determining, based on a comparison of the correlation determined between the one or more dimensions and the result output by each of the plurality of machine learning models to a relevance threshold, one or more low relevance dimensions of the multi-dimensional dataset that have low relevance to the result output by each of the plurality of machine learning models; and
outputting an indication explaining that the one or more low relevance dimensions have low relevance to the result.
18. The method of claim 17, wherein outputting the indication comprises outputting a sentence in plain language that explain the one or more low relevance dimensions having low relevance to the result.
19. The method of claim 12, further comprising refraining from transforming the one or more dimensions of the multi-dimensional dataset prior to application of the plurality of machine learning models.
20. The method of claim 12, further comprising:
determining, based on the results for each of the one or more of the plurality of machine learning models, one or more of a plurality of charts to explain the corresponding result;
ranking the one or more of the plurality of charts to identify a highest ranked chart;
selecting the highest ranked chart; and
outputting the highest ranked chart as a visual chart.
21. The method of claim 20, further comprising:
generating an explanation in plain language explaining a formulation of the visual chart; and
outputting the explanation.
22. The method of claim 12, further comprising:
generating a pipeline report explaining how the device produced the plurality of the machine learning models; and
outputting the pipeline report.
23. A non-transitory computer-readable storage medium storing instructions that, when executed, cause one or more processors to:
apply a plurality of machine learning models to a multi-dimensional dataset to obtain a result output by each of the plurality of machine learning models;
determine a correlation of the one or more dimensions of the multi-dimensional dataset to the result output by each of the plurality of machine learning models;
select, based on the correlation determined between the one or more dimensions and the result output by each of the plurality of machine learning models, a subset of the plurality of machine learning models to obtain the result for each of the subset of the plurality of machine learning models; and
output the result for each of the subset of the plurality of machine learning models.
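
Two of the dependent claims lend themselves to a short illustration: flagging low relevance dimensions against a relevance threshold and reporting them in plain language (claims 6, 7, 17, and 18), and ranking candidate charts to pick the highest ranked one (claims 9 and 20). The Python sketch below is a hedged example only. The claims do not specify the threshold value, the sentence wording, or how charts are scored, so `threshold=0.2`, the sentence template, the chart scores, and all function names are assumptions made for the example.

```python
from typing import Dict, List, Tuple


def low_relevance_dimensions(
    correlation: Dict[str, float],
    threshold: float = 0.2,  # assumed relevance threshold, not from the claims
) -> List[str]:
    """Return dimensions whose absolute correlation is below the threshold."""
    return sorted(d for d, c in correlation.items() if abs(c) < threshold)


def explain_low_relevance(dims: List[str]) -> str:
    """Build a plain-language sentence about the low relevance dimensions."""
    if not dims:
        return "No dimensions have low relevance to the result."
    return (
        "The following dimensions have low relevance to the result: "
        + ", ".join(dims)
        + "."
    )


def highest_ranked_chart(candidates: Dict[str, float]) -> Tuple[str, float]:
    """Rank candidate charts by a (hypothetical) score and return the top one."""
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[0]


# Hypothetical correlations between dimensions and one model's result.
correlation = {"age": 0.85, "income": -0.6, "zip_code": 0.05}
dims = low_relevance_dimensions(correlation)
print(explain_low_relevance(dims))
# The following dimensions have low relevance to the result: zip_code.

# Hypothetical chart scores; how charts are scored is not stated in the claims.
print(highest_ranked_chart({"bar": 0.4, "scatter": 0.9, "line": 0.7}))
# ('scatter', 0.9)
```

Using the absolute value of the correlation means a strongly negative relationship (here, `income`) still counts as relevant; that is a design choice of the sketch rather than something the claims require.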
Application US17/445,667, priority date 2020-08-25, filing date 2021-08-23: Machine learning model selection and explanation for multi-dimensional datasets. Status: Pending. Publication: US20220067591A1 (en)

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
US17/445,667, US20220067591A1 (en) | 2020-08-25 | 2021-08-23 | Machine learning model selection and explanation for multi-dimensional datasets

Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
US202063070074P | 2020-08-25 | 2020-08-25 |
US17/445,667, US20220067591A1 (en) | | 2021-08-23 | Machine learning model selection and explanation for multi-dimensional datasets

Publications (1)

Publication Number | Publication Date
US20220067591A1 (en) | 2022-03-03

Family

ID=77775037

Family Applications (3)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US17/445,667 | Pending | US20220067591A1 (en) | 2020-08-25 | 2021-08-23 | Machine learning model selection and explanation for multi-dimensional datasets
US17/445,665 | Active 2042-08-04 | US12019996B2 (en) | 2020-08-25 | 2021-08-23 | Conversational syntax using constrained natural language processing for accessing datasets
US18/752,541 | Pending | US20250028912A1 (en) | 2020-08-25 | 2024-06-24 | Conversational syntax using constrained natural language processing for accessing datasets

Family Applications After (2)

Application Number | Status | Publication | Priority Date | Filing Date | Title
US17/445,665 | Active 2042-08-04 | US12019996B2 (en) | 2020-08-25 | 2021-08-23 | Conversational syntax using constrained natural language processing for accessing datasets
US18/752,541 | Pending | US20250028912A1 (en) | 2020-08-25 | 2024-06-24 | Conversational syntax using constrained natural language processing for accessing datasets

Country Status (5)

Country | Link
US (3) | US20220067591A1 (en)
EP (2) | EP4204990A1 (en)
JP (2) | JP2023539232A (en)
CA (2) | CA3188880A1 (en)
WO (2) | WO2022047466A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US11836138B1 (en) | 2022-05-27 | 2023-12-05 | Snowflake Inc. | Overlap results data generation on a cloud data platform
CN117194637A (en)* | 2023-09-18 | 2023-12-08 | 深圳市大数据研究院 | Multi-level visual assessment report generation method and device based on large language model
US12169767B1 (en)* | 2018-04-06 | 2024-12-17 | Curai, Inc. | Ensemble machine learning systems and methods
US12321343B1 (en) | 2025-02-06 | 2025-06-03 | Morgan Stanley Services Group Inc. | Natural language to SQL on custom enterprise data warehouse powered by generative artificial intelligence

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JP2025044222A (en)* | 2023-09-19 | 2025-04-01 | ソフトバンクグループ株式会社 | system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20140114707A1 (en)* | 2012-10-19 | 2014-04-24 | International Business Machines Corporation | Interpretation of statistical results
US20150032429A1 (en)* | 2006-04-19 | 2015-01-29 | Tableau Software Inc. | Systems and Methods for Generating Models of a Dataset for a Data Visualization
US20180060738A1 (en)* | 2014-05-23 | 2018-03-01 | DataRobot, Inc. | Systems and techniques for determining the predictive value of a feature
US20180253658A1 (en)* | 2017-03-01 | 2018-09-06 | Microsoft Technology Licensing, Llc | Understanding business insights and deep-dive using artificial intelligence
US20190325335A1 (en)* | 2018-04-20 | 2019-10-24 | H2O.Ai Inc. | Model interpretation

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
JPH0944508A (en)* | 1995-07-27 | 1997-02-14 | Toshiba Corp | Database natural language interface device and method
US7725307B2 (en)* | 1999-11-12 | 2010-05-25 | Phoenix Solutions, Inc. | Query engine for processing voice based queries including semantic decoding
JP2002342361A (en)* | 2001-05-15 | 2002-11-29 | Mitsubishi Electric Corp | Information retrieval device
US20050043940A1 (en)* | 2003-08-20 | 2005-02-24 | Marvin Elder | Preparing a data source for a natural language query
US20050091036A1 (en)* | 2003-10-23 | 2005-04-28 | Hazel Shackleton | Method and apparatus for a hierarchical object model-based constrained language interpreter-parser
US8676859B2 (en)* | 2010-01-21 | 2014-03-18 | Hewlett-Packard Development Company, L.P. | Method and system for analyzing data stored in a database
US9959311B2 (en)* | 2015-09-18 | 2018-05-01 | International Business Machines Corporation | Natural language interface to databases
US20180052842A1 (en) | 2016-08-16 | 2018-02-22 | Ebay Inc. | Intelligent online personal assistant with natural language understanding
EP3625689A4 (en)* | 2017-05-17 | 2021-04-28 | Sigopt, Inc. | Systems and methods implementing an intelligent optimization platform
US10963525B2 (en) | 2017-07-07 | 2021-03-30 | Avnet, Inc. | Artificial intelligence system for providing relevant content queries across unconnected websites via a conversational environment
US10521489B2 (en)* | 2017-11-30 | 2019-12-31 | Microsoft Technology Licensing, Llc | Machine learning to predict numerical outcomes in a matrix-defined problem space
US20190236487A1 (en)* | 2018-01-30 | 2019-08-01 | Microsoft Technology Licensing, Llc | Machine learning hyperparameter tuning tool
US10565528B2 (en)* | 2018-02-09 | 2020-02-18 | Sas Institute Inc. | Analytic system for feature engineering improvement to machine learning models
US11157704B2 (en)* | 2018-06-18 | 2021-10-26 | DataChat.ai | Constrained natural language processing
US11227114B1 (en)* | 2018-11-28 | 2022-01-18 | Kensho Technologies, Llc | Natural language interface with real-time feedback

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US20150032429A1 (en)* | 2006-04-19 | 2015-01-29 | Tableau Software Inc. | Systems and Methods for Generating Models of a Dataset for a Data Visualization
US20140114707A1 (en)* | 2012-10-19 | 2014-04-24 | International Business Machines Corporation | Interpretation of statistical results
US20180060738A1 (en)* | 2014-05-23 | 2018-03-01 | DataRobot, Inc. | Systems and techniques for determining the predictive value of a feature
US20180253658A1 (en)* | 2017-03-01 | 2018-09-06 | Microsoft Technology Licensing, Llc | Understanding business insights and deep-dive using artificial intelligence
US20190325335A1 (en)* | 2018-04-20 | 2019-10-24 | H2O.Ai Inc. | Model interpretation

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
US12169767B1 (en)* | 2018-04-06 | 2024-12-17 | Curai, Inc. | Ensemble machine learning systems and methods
US11836138B1 (en) | 2022-05-27 | 2023-12-05 | Snowflake Inc. | Overlap results data generation on a cloud data platform
US12008001B2 (en)* | 2022-05-27 | 2024-06-11 | Snowflake Inc. | Overlap queries on a distributed database
CN117194637A (en)* | 2023-09-18 | 2023-12-08 | 深圳市大数据研究院 | Multi-level visual assessment report generation method and device based on large language model
US12321343B1 (en) | 2025-02-06 | 2025-06-03 | Morgan Stanley Services Group Inc. | Natural language to SQL on custom enterprise data warehouse powered by generative artificial intelligence

Also Published As

Publication number | Publication date
CA3188921A1 (en) | 2022-03-03
EP4204989A1 (en) | 2023-07-05
EP4204990A1 (en) | 2023-07-05
US20220067303A1 (en) | 2022-03-03
JP2023539232A (en) | 2023-09-13
WO2022047466A1 (en) | 2022-03-03
US20250028912A1 (en) | 2025-01-23
JP2023539225A (en) | 2023-09-13
WO2022047465A1 (en) | 2022-03-03
US12019996B2 (en) | 2024-06-25
CA3188880A1 (en) | 2022-03-03

Similar Documents

Publication | Publication Date | Title
US11157704B2 (en) | | Constrained natural language processing
US11775572B2 (en) | | Directed acyclic graph based framework for training models
CN112567394B (en) | | Techniques for building knowledge graphs in limited knowledge domains
US12019996B2 (en) | | Conversational syntax using constrained natural language processing for accessing datasets
CN115398436B (en) | | Noisy Data Augmentation for Natural Language Processing
US20240126795A1 (en) | | Conversational document question answering
JP2022547631A (en) | | Stopword data augmentation for natural language processing
JP7726995B2 (en) | | Enhanced Logit for Natural Language Processing
WO2022040547A1 (en) | | Techniques for providing explanations for text classification
CN112487157A (en) | | Template-based intent classification for chat robots
CN116235164A (en) | | Automated transformation outside the scope of chatbots
CN116583837A (en) | | Distance-based LOGIT values for natural language processing
US20250094480A1 (en) | | Document processing and retrieval for knowledge-based question answering
CN119669317B (en) | | Information display method and device, electronic device, storage medium and program product
US11334223B1 (en) | | User interface for data analytics systems
US20240143934A1 (en) | | Multi-task model with context masking
US12204532B2 (en) | | Parameterized narrations for data analytics systems
US20240256529A1 (en) | | Constrained natural language user interface
CN119768794A (en) | | Adaptive training data augmentation to facilitate training of named entity recognition models

Legal Events

Date | Code | Title | Description
 | AS | Assignment | Owner name: DATACHAT.AI, WISCONSIN. Free format text: ASSIGNMENT OF ASSIGNORS INTEREST; ASSIGNORS: PATEL, JIGNESH; CHEN, JUNDA; BACON, DYLAN PAUL; AND OTHERS; SIGNING DATES FROM 20210820 TO 20210823; REEL/FRAME: 057287/0500
 | STPP | Information on status: patent application and granting procedure in general | Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION
 | STPP | Information on status: patent application and granting procedure in general | Free format text: NON FINAL ACTION MAILED
 | STPP | Information on status: patent application and granting procedure in general | Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER
 | STPP | Information on status: patent application and granting procedure in general | Free format text: FINAL REJECTION MAILED

