Technical Field
The present invention belongs to the technical field of childcare and health care, and relates to an intelligent parenting knowledge service method and system.
Background Art
With the rapid development of the Internet, computer technology has made life more convenient across all industries, and the medical field is no exception. A large amount of professional data on pediatric diseases and user consultation records lies scattered across the Internet. For solving pediatric disease problems, this information is like gold dust scattered in sand: valuable, but unsystematic and incomplete, lacking flexible, intelligent organization and user-friendly presentation, so that it ultimately cannot be put to good use. Developing a parenting and health care robot system that collects and organizes this information to serve parents is therefore an urgent task.
A survey shows that the market currently offers a series of medical websites such as “寻医问药网” and “中国健康网”, as well as the portal websites of major hospitals. Most of them provide childhood disease lookup and online doctor consultation services. In practice, however, their disease lookup services are mostly limited to retrieving disease information; they do not effectively structure and organize that information into a knowledge base that would serve users better. In addition, they mostly present information in a single form, namely plain text exposition and lists. Parents of a sick child who are anxiously searching for useful information are often overwhelmed by the sheer volume of text and end up with little. Interaction that ignores the user's state of mind in this way cannot serve users promptly and effectively at critical moments. Furthermore, although their broad, all-inclusive disease search does cover pediatric diseases, this breadth means they cannot focus on childhood diseases.
Moreover, the online doctor consultation services offered by these websites are not truly available 24 hours a day, and some even require an appointment in advance. Yet the onset of a child's illness follows no schedule. When a child falls ill in the middle of the night and no doctor happens to be available, these systems not only fail to serve the user but may even delay the child's treatment.
In addition, the disease information on these websites is entered manually by professionals in the back office, so it tends to lag. When a pediatric epidemic breaks out rapidly, such websites often cannot update their information in time, so users cannot learn of the outbreak promptly and take preventive measures.
Parenting platforms such as “育儿网”, “babytree”, and “YY网” also exist on the market. They focus more closely on children and are therefore more specialized in parenting. However, they concentrate only on health care and the popularization of parenting knowledge, and cannot offer authoritative services for looking up, diagnosing, and treating children's diseases. Moreover, they address the common symptoms of most children, whereas in reality children develop very differently and parenting experience varies from family to family. It is therefore necessary to follow each child's growth and build a knowledge and experience base tailored to each child. Such an experience base can be shared, but it cannot be reused wholesale.
At present, there are many applications at home and abroad for adult disease consultation and prevention, and most have been well received and effective. Among these applications, however, comprehensive adult health consultation predominates, single-specialty consultation is markedly less common, and applications devoted to pediatric disease consultation are rarer still. For example, domestic comprehensive health consultation applications such as “春雨医生”, “好大夫”, “青苹果健康”, and “掌上医生” are popular but target comprehensive adult disease consultation, while parenting applications such as “育儿指南”, “育儿问答”, “天天育儿”, and “春雨育儿医生” provide a good deal of parenting experience and online doctor consultation in question-and-answer form but still have the following shortcomings:
(1) The disease lookup services of the above systems are mostly limited to retrieving disease information. They do not effectively structure and organize it and present it to users as a knowledge base, so users still have to search through and sift useful information from cluttered web data and cannot obtain precisely the parenting knowledge they want. In addition, they mostly present information in a single form, namely plain text exposition and lists. Parents of a sick child who are anxiously searching for useful information are often overwhelmed by the sheer volume of text and end up with little. Interaction that ignores the user's state of mind in this way can hardly serve users promptly and effectively at critical moments.
(2) Most of the above systems are forum-style question-and-answer services, which provide fragmented knowledge and cannot present the complete course of a pediatric disease from onset through treatment to recovery. Moreover, different problems arise as a child grows, and every parent needs to compare the causes and consequences of a problem rather than draw conclusions from the current symptoms alone. These systems therefore cannot offer parents a process-based parenting knowledge service.
Summary of the Invention
The technical problem to be solved by the present invention is to provide an intelligent parenting knowledge service method that effectively structures and organizes disease information and presents it to users in the form of a knowledge base. This solves the problems that existing parenting knowledge service methods cannot present the complete course of a pediatric disease from onset through treatment to recovery, lack user-friendly and intelligent presentation, and present information in a single form.
Another object of the present invention is to provide an intelligent parenting knowledge service system that constitutes a process-based social system for pediatric knowledge services.
The technical solution adopted by the present invention is an intelligent parenting knowledge service method, carried out according to the following steps:
Step 1, information collection: a crawler program periodically downloads and parses the latest child disease data and stores it in a MySQL database;
Step 2, construction of the expert robot knowledge base: comprising three sub-steps, namely disease consultation, disease prediction, and doctor recommendation;
The disease consultation: the disease name provided by the user is passed directly to the disease prediction module; or symptom words are obtained through continued dialogue with the user, a word segmenter splits the colloquial symptom description entered by the user, the segmented symptom words are identified using the disease symptom vocabulary in the MySQL database to obtain a symptom word list, and the list is passed to the disease prediction module;
The disease prediction: comprising prediction by disease name and prediction by symptom word list. Prediction by disease name searches the MySQL database for the disease name obtained in the disease consultation and shows the user the prediction result in the form of a knowledge graph. Prediction by symptom word list matches the symptom word list obtained in the disease consultation against the symptoms of each disease in the MySQL database and, according to the degree of matching, shows the user the prediction result in the form of a knowledge graph;
The doctor recommendation: based on the user's previous disease consultation records and click records on disease prediction results, information on doctors relevant to the user's disease is recommended to the user;
Step 3, construction of the personal robot knowledge base: the user continually enters his or her own knowledge into the personal robot, which stores it in the MySQL database as question-answer pairs, one answer per question, and provides process-based parenting knowledge services to different users in the form of a knowledge community;
Step 4, system interaction: user interaction is implemented through a PC client and a mobile client.
The present invention is further characterized in that, in step 2, the method of predicting a disease for the user from the disease name obtained in the disease consultation is as follows: for retrieval by disease name, the open-source full-text search framework Lucene is used to build indexes in advance on the “name”, “overview”, and “details” fields of the disease data in the MySQL database. Different weights are assigned when the indexes are built, with “name” given the largest weight. Each search result is scored according to how closely the disease name provided by the user matches the disease name in the MySQL database, and the high-scoring disease prediction results are returned to the user.
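The idea of field-weighted scoring can be illustrated without the search framework. The following is a minimal plain-Java sketch, not Lucene's API or its scoring formula: the class `WeightedFieldSearch`, the weights 3.0 / 1.5 / 1.0, and the substring-match rule are all assumptions chosen only to show "name" outweighing the other fields.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class WeightedFieldSearch {
    // Hypothetical weights; the text only requires that "name" be the largest.
    static final double NAME_WEIGHT = 3.0;
    static final double OVERVIEW_WEIGHT = 1.5;
    static final double DETAILS_WEIGHT = 1.0;

    static final class Disease {
        final String name, overview, details;
        Disease(String name, String overview, String details) {
            this.name = name;
            this.overview = overview;
            this.details = details;
        }
    }

    // Score = sum of the weights of the fields that contain the query text.
    static double score(Disease d, String query) {
        double s = 0.0;
        if (d.name.contains(query)) s += NAME_WEIGHT;
        if (d.overview.contains(query)) s += OVERVIEW_WEIGHT;
        if (d.details.contains(query)) s += DETAILS_WEIGHT;
        return s;
    }

    // Return every disease with a non-zero score, highest score first.
    static List<Disease> search(List<Disease> all, String query) {
        List<Disease> hits = new ArrayList<>();
        for (Disease d : all) {
            if (score(d, query) > 0) hits.add(d);
        }
        hits.sort(Comparator.comparingDouble((Disease d) -> score(d, query)).reversed());
        return hits;
    }
}
```

With this rule, a record whose name matches the query always ranks above one that only mentions the query in its overview or details.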
Further, in step 2, the method of matching the symptom word list obtained in the disease consultation against the symptoms of each disease in the MySQL database and predicting a disease for the user according to the degree of matching uses a tf-idf cosine similarity matching algorithm, carried out as follows:
First, all disease symptoms in the MySQL database are put in a fixed order. Then, following this order of symptoms, a vector P is constructed for each disease in the MySQL database, and a hypothetical disease vector P′ is constructed from the symptom word list obtained in the disease consultation. Once both disease vectors are constructed, the cosine of the angle between them is calculated by the following formula:
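The formula itself is not reproduced in this text. The standard cosine of two vectors is consistent with the description; the component notation $P_k$, $P'_k$ and the symptom count $n$ are introduced here for illustration only:

```latex
\[
\cos(P, P') \;=\; \frac{\sum_{k=1}^{n} P_k \, P'_k}
{\sqrt{\sum_{k=1}^{n} P_k^{2}} \;\sqrt{\sum_{k=1}^{n} {P'_k}^{2}}}
\]
```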
After the cosine between each disease vector P and the hypothetical disease vector P′ has been calculated, the cosine values are sorted in descending order, the disease information corresponding to the ten largest values is selected, and this disease prediction result is returned to the user.
Further, the disease vectors are constructed as follows: if a given symptom does occur for a disease, the vector component at that symptom's position is the tf-idf value of the symptom with respect to the disease; if the symptom does not occur, the component at that position is 0;
The tf-idf value is calculated as follows:
tf-idf(w_ij) = tf(w_ij) × idf(w_i)
The tf value represents the frequency with which a symptom word occurs in the data of the current disease, and is calculated as:
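The formula is missing from this text. A form consistent with the symbol definitions that follow it would be:

```latex
\[
\mathrm{tf}(w_{ij}) \;=\; \frac{n(w_{ij})}{D(j)}
\]
```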
where n(w_ij) is the number of times the i-th symptom word occurs in the data of the j-th disease, and D(j) is the total number of words in the data of the j-th disease after word segmentation;
The idf value represents how rare a symptom is, and is calculated as:
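The formula is missing from this text. The conventional inverse document frequency, written with the symbols defined next, would be as follows; the logarithm base and any smoothing term cannot be recovered from the text and are left unspecified:

```latex
\[
\mathrm{idf}(w_i) \;=\; \log \frac{A}{a(w_i)}
\]
```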
where A is the current total number of diseases, and a(w_i) is the number of diseases in which the i-th symptom word occurs.
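The vector construction, cosine comparison, and top-ranked selection described above can be sketched in a few lines of Java. This is an illustration under stated assumptions, not the patented implementation: the class and method names are hypothetical, the natural logarithm is assumed for idf, and how the components of the hypothetical vector P′ are weighted is left to the caller because the text does not say.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class TfIdfMatcher {
    // tf(w_ij) = n(w_ij) / D(j)
    static double tf(int occurrences, int totalWords) {
        return (double) occurrences / totalWords;
    }

    // idf(w_i) = log(A / a(w_i)); the natural log is an assumption.
    static double idf(int totalDiseases, int diseasesWithSymptom) {
        return Math.log((double) totalDiseases / diseasesWithSymptom);
    }

    // One component per symptom in the fixed global order; 0 when the symptom is absent.
    static double[] vector(List<String> symptomOrder, Map<String, Double> tfIdfBySymptom) {
        double[] v = new double[symptomOrder.size()];
        for (int k = 0; k < v.length; k++) {
            Double w = tfIdfBySymptom.get(symptomOrder.get(k));
            v[k] = (w == null) ? 0.0 : w;
        }
        return v;
    }

    static double cosine(double[] p, double[] q) {
        double dot = 0, np = 0, nq = 0;
        for (int k = 0; k < p.length; k++) {
            dot += p[k] * q[k];
            np += p[k] * p[k];
            nq += q[k] * q[k];
        }
        if (np == 0 || nq == 0) return 0.0; // an all-zero vector matches nothing
        return dot / (Math.sqrt(np) * Math.sqrt(nq));
    }

    // Rank diseases by cosine similarity to the query vector and keep the first k.
    static List<String> topK(Map<String, double[]> diseaseVectors, double[] query, int k) {
        Map<String, Double> score = new HashMap<>();
        for (Map.Entry<String, double[]> e : diseaseVectors.entrySet()) {
            score.put(e.getKey(), cosine(e.getValue(), query));
        }
        List<String> names = new ArrayList<>(score.keySet());
        names.sort((a, b) -> Double.compare(score.get(b), score.get(a)));
        return new ArrayList<>(names.subList(0, Math.min(k, names.size())));
    }
}
```

Calling `topK(vectors, query, 10)` corresponds to selecting the ten diseases with the largest cosine values.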
Further, step 2 also includes the following: if the user gives too few symptom words in the disease consultation module to construct a hypothetical disease vector, the disease is predicted as follows. When the user gives one symptom word, the MySQL database is searched and the symptom list of every disease is traversed. If the symptom list of some disease contains the current symptom word, and the ratio of the tf-idf value of the current symptom word to the sum of the tf-idf values of all symptom words in that disease's symptom list is close to 1, the symptom word with the second-largest tf-idf value in that list is taken and the user is asked whether it applies. If it does, the user is asked about the symptom with the third-largest tf-idf value, and so on. When the number of confirmed symptom words reaches 65% of the total number of symptom words for the disease, that disease prediction result is returned to the user. If the proportion stays below 65%, the other disease symptom lists that contain the current symptom word are examined in turn by the same method.
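The follow-up questioning loop can be sketched as below. This is a simplified illustration, not the method as claimed: the names are hypothetical, the "ratio close to 1" pre-check is omitted because the text gives no threshold for it, and a "no" answer is assumed to end the questioning for the current disease. The user is modeled as a `Predicate<String>` that answers whether a symptom applies.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

class FollowUpPredictor {
    static final double CONFIRM_RATIO = 0.65; // threshold stated in the text

    // symptomWeights maps each symptom word of one disease to its tf-idf value.
    // Returns true once confirmed symptoms reach 65% of that disease's symptom list.
    static boolean confirms(Map<String, Double> symptomWeights, String known,
                            Predicate<String> askUser) {
        if (!symptomWeights.containsKey(known)) return false;
        List<String> rest = new ArrayList<>(symptomWeights.keySet());
        rest.remove(known);
        // Ask about the remaining symptoms in descending tf-idf order.
        rest.sort((a, b) -> Double.compare(symptomWeights.get(b), symptomWeights.get(a)));
        int total = symptomWeights.size();
        int confirmed = 1; // the symptom the user already gave
        for (String symptom : rest) {
            if ((double) confirmed / total >= CONFIRM_RATIO) return true;
            if (!askUser.test(symptom)) return false; // assumption: stop on the first "no"
            confirmed++;
        }
        return (double) confirmed / total >= CONFIRM_RATIO;
    }

    // Try each disease that contains the known symptom, in the map's iteration order.
    static String predict(Map<String, Map<String, Double>> diseases, String known,
                          Predicate<String> askUser) {
        for (Map.Entry<String, Map<String, Double>> e : diseases.entrySet()) {
            if (confirms(e.getValue(), known, askUser)) return e.getKey();
        }
        return null; // no disease reached the threshold
    }
}
```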
Further, the doctor recommendation in step 2 works as follows: the Lucene framework is used to index the doctors' information, which includes name, department, hospital, profile, conditions treated, specialties, and contact details. Different fields are given different weights: conditions treated and specialties are weighted above all other fields, followed by the profile. The ten top-ranked doctors are recommended to the user. At the end of the whole process, the system also asks the user whether he or she is satisfied with this consultation, and adds the record of the consultation to the personal robot module.
Further, in step 3, answers to a user's question are retrieved as follows: first, all question and answer data stored in the MySQL database are indexed field by field. When a user asks a question, the question is segmented into words, stop words and function words are removed, and a nested query built from the main content words and the user name is run against the index.
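The retrieval logic (filter by user, then match on content words) can be sketched in memory as follows. This is an illustration only: the class `PersonalQaSearch`, the tiny English stop-word list, and splitting on non-alphanumeric characters are assumptions standing in for the index and the Chinese word segmenter that the text describes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

class PersonalQaSearch {
    // Hypothetical stop-word list for the example.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList(
            "the", "a", "is", "my", "what", "to", "do", "when", "has", "for"));

    static final class Qa {
        final String user, question, answer;
        Qa(String user, String question, String answer) {
            this.user = user;
            this.question = question;
            this.answer = answer;
        }
    }

    // Split into lower-case words and drop stop words.
    static Set<String> contentWords(String text) {
        Set<String> words = new LinkedHashSet<>();
        for (String w : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (!w.isEmpty() && !STOP_WORDS.contains(w)) words.add(w);
        }
        return words;
    }

    // Nested condition: (user matches) AND (question shares content words);
    // the stored pair with the largest overlap wins.
    static String answer(List<Qa> store, String user, String question) {
        Set<String> query = contentWords(question);
        Qa best = null;
        int bestOverlap = 0;
        for (Qa qa : store) {
            if (!qa.user.equals(user)) continue;
            Set<String> overlap = contentWords(qa.question);
            overlap.retainAll(query);
            if (overlap.size() > bestOverlap) {
                bestOverlap = overlap.size();
                best = qa;
            }
        }
        return best == null ? null : best.answer;
    }
}
```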
Another technical solution adopted by the present invention is an intelligent parenting knowledge service system comprising an information collection module, an expert robot module, a personal robot module, and a system interaction module;
The information collection module is connected to the MySQL database and uses a crawler program to periodically obtain the latest child disease data;
The expert robot module intuitively provides users with various knowledge services related to pediatric diseases in the form of a knowledge graph; it comprises a disease consultation module, a disease prediction module, and a doctor recommendation module;
The disease consultation module is connected to the disease prediction module; it obtains the disease name or symptom words provided by the user and passes the disease name or the symptom word list to the disease prediction module;
The disease prediction module searches the MySQL database for the disease name and shows the user the prediction result in the form of a knowledge graph; it also matches the symptom word list against the symptoms of each disease in the MySQL database and, according to the degree of matching, shows the user the prediction result in the form of a knowledge graph;
The doctor recommendation module is connected to both the disease consultation module and the disease prediction module; based on the consultation records of the disease consultation module and the click records on disease prediction results, it recommends doctors relevant to the user's disease;
The personal robot module stores the knowledge entered by the user in the MySQL database as question-answer pairs, one answer per question, and provides process-based parenting knowledge services to different users in the form of a knowledge community;
The system interaction module comprises a PC client and a mobile client.
Further, the disease consultation module comprises the mmseg4J word segmenter and a disease symptom vocabulary. The mmseg4J segmenter splits the colloquial symptom description entered by the user. The disease symptom vocabulary contains the words found in the disease symptoms and clinical manifestations stored in the MySQL database, together with their synonyms, and is used to identify the symptom words produced by the mmseg4J segmenter.
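How segmentation and vocabulary lookup combine can be shown with a small stand-in. The sketch below uses plain forward maximum matching against a vocabulary that maps each surface form (including synonyms) to a canonical symptom word. It is not the mmseg4J API and does not reproduce the MMSEG disambiguation rules; the class name and the vocabulary contents are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SymptomExtractor {
    // Forward maximum matching: at each position take the longest vocabulary
    // entry that starts there; characters that start no entry are skipped.
    // The vocabulary maps a surface form or synonym to its canonical symptom word.
    static List<String> extract(String input, Map<String, String> vocabulary) {
        int maxLen = 0;
        for (String w : vocabulary.keySet()) maxLen = Math.max(maxLen, w.length());

        List<String> symptoms = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            String match = null;
            for (int len = Math.min(maxLen, input.length() - i); len >= 1; len--) {
                String candidate = input.substring(i, i + len);
                if (vocabulary.containsKey(candidate)) {
                    match = candidate;
                    break;
                }
            }
            if (match == null) {
                i++; // not the start of any known symptom word
                continue;
            }
            String canonical = vocabulary.get(match);
            if (!symptoms.contains(canonical)) symptoms.add(canonical);
            i += match.length();
        }
        return symptoms;
    }
}
```

Because the input is scanned without relying on spaces, the same loop applies to unsegmented Chinese text, where an entry such as “发烧” could map to the canonical word “发热”.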
Further, the PC client and the mobile client use speech technology to give the system a human voice and data visualization technology to present data more vividly, and include prompts phrased in a variety of ways.
The beneficial effects of the present invention are as follows: the invention builds a process-based social system for pediatric knowledge services that gives users more complete records of the parenting process and helps them analyze existing parenting knowledge more comprehensively. It makes full use of natural language processing and data-related algorithms to analyze and model the latest pediatric medical data obtained, and builds an expert robot module and a personal robot module. The expert robot module provides various knowledge services related to pediatric diseases more intuitively in the form of a knowledge graph; the personal robot module provides process-based parenting knowledge services to different users in the form of a knowledge community. The invention also has the following advantages:
(1) The invention establishes a knowledge base dedicated to pediatric disease data and uses a knowledge graph to show disease names, symptoms, treatments, pediatricians, routes to care, and the relationships among them. From the symptom terms provided by the user, the disease prediction module intelligently analyzes which disease types the user's child may have, and provides the corresponding nursing knowledge together with information on relevant specialists and medical visits. Although some similar parenting forums and websites exist, they do not perform knowledge analysis on their information and cannot give users a complete parenting knowledge graph. The intelligent parenting knowledge service system of the invention provides users with a complete pediatric disease knowledge graph, in which a single graph shows all the data associated with the queried symptoms. Both query results and user experience are better than keyword search in traditional forums and websites.
(2) The personal robot service is an original feature of this system. Every parent accumulates experience and insight while raising a child, and this experience serves as knowledge for problems that the parent or others encounter later. By cultivating a parenting robot of one's own from the child's earliest years and recording day-to-day parenting experience, the user can consult existing parenting knowledge at any time later on, including symptoms, treatments, and medication, so that knowledge is reused. The knowledge of a personal robot can also be opened to other users through the parenting community. Other users can obtain in full the parenting knowledge shared by each personal robot, which helps them understand thoroughly the diagnosis and treatment problems met in parenting, and is more comprehensive and more useful for reference than the fragmented parenting knowledge provided by traditional forum-style question-and-answer services.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of the expert robot service of the present invention.
Fig. 2 is a flow chart of the personal robot service of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
An intelligent parenting knowledge service method, carried out according to the following steps:
Step 1, information collection: a crawler program periodically downloads and parses the latest child disease data and stores it in the MySQL database. Information collection is the first procedure the system runs silently in the background; users cannot see it directly, but it supplies the raw material for the knowledge construction module, and as the foundation of the whole system its importance is self-evident. The collection process is as follows: first, several reasonably authoritative websites are selected; their data is then crawled periodically and parsed, with useless tags and advertisements removed; finally the data is persisted to the local MySQL database.
Scheduled crawling: to keep the system's data updated within a certain period, and thereby capture information on newly emerging epidemic diseases, the system must collect data on a schedule. The collection frequency must not be too high, however, because collecting too fast overloads the server and degrades the responsiveness of other functions.
Here we implement a context listener class for the Tomcat server to act as a daemon, and in it we invoke the timer provided by the concurrent package available in Java 6.0 and later; the timer in turn calls our own custom crawler program. In the end it is only necessary to configure the listener in the web.xml file. Once the server has started, it handles business logic requests from the front end while periodically crawling the data of the websites we specify and parsing it. The crawl window can be set between midnight and 3 a.m. every night, when few people generally access the server, so crawling puts little pressure on the server and helps the system run smoothly.
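The timer part of this arrangement can be sketched with `ScheduledExecutorService` from `java.util.concurrent`. This is a minimal sketch under assumptions: the class `CrawlScheduler` and its methods are hypothetical, the crawl is assumed to start at midnight each day, and the servlet listener wiring is left out because it needs the servlet API. In the described setup, `startDaily` would be called when the context is initialized and the returned executor shut down when the context is destroyed.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class CrawlScheduler {
    // Time from "now" until the next midnight, the start of the quiet window.
    static Duration delayUntilMidnight(LocalDateTime now) {
        LocalDateTime nextMidnight = now.toLocalDate().plusDays(1).atStartOfDay();
        return Duration.between(now, nextMidnight);
    }

    // Run the crawl task once a day, starting at the next midnight.
    static ScheduledExecutorService startDaily(Runnable crawl) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        long initialDelaySeconds = delayUntilMidnight(LocalDateTime.now()).getSeconds();
        timer.scheduleAtFixedRate(crawl, initialDelaySeconds,
                TimeUnit.DAYS.toSeconds(1), TimeUnit.SECONDS);
        return timer;
    }
}
```

A single-threaded executor is used here so that two crawl runs can never overlap; the crawl task itself may still use its own worker threads.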
Custom crawler:
To meet our own needs, we designed the crawler program ourselves. To build a custom crawler that fetches specific data from specific websites, we must study the structure of those websites and then derive a general scheme. To write a custom crawler that obtains specific content, we first simulate what a person does when browsing the site. For example, after opening a medical information website, because our system concerns children we first click “儿科” (Pediatrics) and then one of its departments, “小儿感染科” (Pediatric Infectious Diseases); all diseases of that department are then listed on the right. Finally, we click “疾病详细资料” (Disease Details) to view the details of a disease.
Analyzing this browsing process, our crawler can be divided roughly into two steps. The first step collects the URLs of the disease detail pages and puts them into a “disease detail URL queue”; the second step traverses this queue with multiple threads to collect the disease detail data.
To obtain the disease detail URLs, we must traverse all the sub-departments under “儿科” (Pediatrics), obtain the total number of diseases under each sub-department, and from that total work out how many pages the diseases of the current sub-department occupy. The URLs of the disease list pages are built from the page numbers and put into a “disease list URL list”. Since every page under every department has the same content structure, it is then only necessary to traverse the “disease list URL list”, parse the pages one by one, obtain the disease detail URLs, and put them into the “disease detail URL queue”.
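Turning a disease total into list-page URLs is a ceiling division followed by a loop. The sketch below is illustrative: the class name, the page size, and the `?page=N` URL pattern are assumptions, since the text does not name the sites or their URL scheme.

```java
import java.util.ArrayList;
import java.util.List;

class ListPageUrls {
    // Pages needed to show `total` diseases at `pageSize` per page (ceiling division).
    static int pageCount(int total, int pageSize) {
        return (total + pageSize - 1) / pageSize;
    }

    // Build one list-page URL per page number for a sub-department.
    static List<String> build(String departmentUrl, int total, int pageSize) {
        List<String> urls = new ArrayList<>();
        int pages = pageCount(total, pageSize);
        for (int page = 1; page <= pages; page++) {
            urls.add(departmentUrl + "?page=" + page); // hypothetical URL pattern
        }
        return urls;
    }
}
```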
Multiple threads take URLs from the “disease detail URL queue”, download the pages, and parse them. Note that there may be a great many URLs here, so each must be marked as successfully crawled, not yet crawled, or failed. A failed crawl may be due to network problems and must be retried. For each newly added URL, it must also be checked whether the URL is already in the queue. When the queue holds many URLs this check can be time-consuming. The optimization is to compute the MD5 value of each URL and store it in a hash table; checking against this hash table each time saves a great deal of time.
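The queue with MD5-based duplicate detection and per-URL status can be sketched as follows. The class `UrlFrontier` and its method names are hypothetical; concurrent collections are used so that several crawler threads can share one instance, and failed URLs are simply re-queued with no retry limit, which a real crawler would probably want to add.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

class UrlFrontier {
    enum Status { PENDING, DONE, FAILED }

    // Keyed by the MD5 of the URL, so the duplicate check is one hash lookup.
    private final Map<String, Status> statusByHash = new ConcurrentHashMap<>();
    private final Queue<String> queue = new ConcurrentLinkedQueue<>();

    static String md5(String url) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(url.getBytes(StandardCharsets.UTF_8));
            return String.format("%032x", new BigInteger(1, digest));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // every Java platform must provide MD5
        }
    }

    // Enqueue only URLs that have not been seen; returns false for duplicates.
    boolean add(String url) {
        if (statusByHash.putIfAbsent(md5(url), Status.PENDING) != null) return false;
        queue.add(url);
        return true;
    }

    // Next URL to crawl, or null when the queue is empty.
    String next() {
        return queue.poll();
    }

    // Record the outcome; a failed URL goes back on the queue for another attempt.
    void mark(String url, boolean success) {
        statusByHash.put(md5(url), success ? Status.DONE : Status.FAILED);
        if (!success) queue.add(url);
    }

    Status status(String url) {
        return statusByHash.get(md5(url));
    }
}
```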
整个爬取过程中,包括两个操作:下载,解析。下载html文本我们会用到apache的开源项目httpClient包发送请求;而解析文本,则会用到正则表达式和Jsoup。Jsoup用来获取HTML节点中需要的内容,正则表达式用来匹配那些没有节点限制的内容,例如匹配疾病总数等。The whole crawling process includes two operations: downloading and parsing. To download html text, we will use Apache's open source project httpClient package to send requests; and to parse text, we will use regular expressions and Jsoup. Jsoup is used to obtain the required content in HTML nodes, and regular expressions are used to match those content without node restrictions, such as matching the total number of diseases, etc.
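As an illustration of the regular-expression half of the parsing step, the sketch below pulls a disease total out of page text. The system described here uses Java libraries (httpClient, Jsoup); Python's `re` module is used only to show the idea, and the sample markup and the wording "共 N 种疾病" are assumptions, not the real site's format.

```python
import re
from typing import Optional

# Hypothetical page fragment; the real site's markup is not given in the text.
SAMPLE_HTML = '<div class="summary">儿科综合 共 128 种疾病</div>'

# Matches the count regardless of which HTML node it sits in.
TOTAL_RE = re.compile(r"共\s*(\d+)\s*种疾病")

def extract_total(html: str) -> Optional[int]:
    """Return the total number of diseases, or None if it is not found."""
    match = TOTAL_RE.search(html)
    return int(match.group(1)) if match else None

total = extract_total(SAMPLE_HTML)
```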
当数据采集下来以后,就要持久化到本地数据库,以供知识构建模块使用。本系统采用mysql关系型数据库,持久化过程使用javaee标准的jpa框架,这样可以免去直接操作sql语句的繁琐,从而以面向对象的方式访问数据,有利于系统的扩展和维护。After the data is collected, it must be persisted to the local database for use by the knowledge building blocks. The system adopts mysql relational database, and the persistence process uses the javaee standard jpa framework, which can avoid the tedious operation of sql statements directly, so as to access data in an object-oriented manner, which is beneficial to system expansion and maintenance.
步骤2,专家机器人知识库构建:以知识图谱的形式直观的为用户提供儿科疾病相关的各类知识服务;专家机器人模块主要是模拟病人看病的过程,通过不断与用户进行交流,逐渐衍生出了疾病咨询、疾病预测、医生推荐这三个子模块。这三个子模块围绕着现实生活中问诊的流程展开,因此也会有一定的次序。而其知识库的构建,也将因为各自具体情况的不同,而有所不同。Step 2, construction of the expert robot knowledge base: provide users with various knowledge services related to pediatric diseases intuitively, in the form of a knowledge graph. The expert robot module mainly simulates the process of a patient seeing a doctor; out of continuous communication with the user, three sub-modules emerge: disease consultation, disease prediction, and doctor recommendation. These three sub-modules follow the flow of a real-life consultation, so they run in a certain order. The way each knowledge base is built also differs according to the specifics of its sub-module.
(1)疾病咨询:(1) Disease consultation:
疾病咨询主要是尽量模拟医生的口吻与病人进行有效的交流。交流的目的是获取疾病症状词,从而为疾病预测模块提供原料。而咨询的方式,又会因为用户情况的不同而分为两种:其一是用户知道疾病名称,这样直接将疾病名称输出到疾病预测模块即可;第二种是用户不知道疾病名称,那么只能通过与用户不断的交流,来获取症状词,将症状词列表传递给疾病预测模块。因为第一种情况很简单,根据疾病咨询中获得的疾病名称在mysql数据库中检索,以知识图谱的形式为用户展示疾病预测结果;下面详细阐述第二种情况。Disease consultation mainly tries to imitate a doctor's tone in order to communicate effectively with the patient. The purpose of the exchange is to obtain symptom words, which are the input of the disease prediction module. The consultation takes one of two forms depending on the user's situation. In the first, the user knows the name of the disease, and the name is passed directly to the disease prediction module. In the second, the user does not know the name of the disease, so symptom words can only be obtained through continued dialogue with the user, and the resulting list of symptom words is passed to the disease prediction module. The first case is simple: the disease name obtained in the consultation is looked up in the mysql database and the prediction result is shown to the user in the form of a knowledge graph. The second case is described in detail below.
首先应该明确,与用户交流的目的是获取症状词。而用户输入的是自然语言,如何从用户输入的口语中获取到比较专业的症状词:First, it should be clear that the purpose of communicating with the user is to obtain symptom words. The user, however, enters natural language, so the question is how to extract reasonably professional symptom words from colloquial input:
我们首先构建了一张疾病症状词表。症状词表来源于数据采集模块爬取到的疾病数据。其疾病数据中的疾病症状和临床表现中的词语,可以作为症状词,因此对这些词语进行了进一步处理,并放入了症状词表中。We first constructed a disease symptom vocabulary. The symptom vocabulary comes from the disease data crawled by the data collection module. The words in the disease symptoms and clinical manifestations in the disease data can be used as symptom words, so these words are further processed and put into the symptom vocabulary.
另外由于用户输入的口语症状词可能不在疾病症状词表中,也或许有些症状词有多种说法,但是它们都表示同一个意思。例如:“发烧”的口语表示为“发烫”,其他说法甚至还有“发热”等。当用户输入其中一种时,即表示用户有此种症状。因此,还需要为症状词表补充同义词。为了数据的准确性,我们写自动化程序,首先将同义词表中的同义词自动添加到各个症状词后面。自动化程序的思路是,模拟人查同义词表的过程,以每一个症状词为原本词去查同义词表,查到同义词即补充到症状词表后面。考虑到同义词可能不够齐全和一些口语症状词可能不在其列,我们还人工为这些症状词补充了同义词。最终经过程序和人的双重补充,达到的症状词有2455条,其中每个症状词后面的同义词可能有4到5个,当然不排除有的症状词没有同义词的可能。In addition, a colloquial symptom word entered by the user may not be in the disease symptom vocabulary, and some symptoms have several expressions that all mean the same thing. For example, "发烧" (fever) is expressed colloquially as "发烫" (feeling hot), and there are further variants such as "发热" (running a temperature). When the user enters any one of these, the user has that symptom. The symptom vocabulary therefore needs to be supplemented with synonyms. For the sake of data accuracy, we first wrote an automated program that appends the synonyms from a synonym table to each symptom word. The program imitates a person consulting a synonym table: it looks up each symptom word as the head word and appends every synonym it finds to that word's entry in the symptom vocabulary. Because the synonym table may be incomplete and some colloquial symptom words may not be listed in it, we also added synonyms for these symptom words by hand. After this combined automatic and manual supplementation, the vocabulary holds 2,455 symptom words, each typically followed by 4 to 5 synonyms, although some symptom words may have none.
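The two-pass supplementation (automatic lookup, then manual additions) can be sketched as below. The function name `expand_vocabulary` and the dictionary-shaped synonym table are assumptions for illustration; the text does not say how the synonym table is stored.

```python
def expand_vocabulary(symptoms, thesaurus, manual=None):
    """Map each head symptom word to its list of synonyms.

    thesaurus: synonym table consulted automatically (head word -> synonyms).
    manual:    extra colloquial forms added by hand (head word -> synonyms).
    """
    manual = manual or {}
    vocabulary = {}
    for head in symptoms:
        seen = {head}
        synonyms = []
        # Automatic entries first, then the hand-added ones, without repeats.
        for word in list(thesaurus.get(head, [])) + list(manual.get(head, [])):
            if word not in seen:
                seen.add(word)
                synonyms.append(word)
        vocabulary[head] = synonyms
    return vocabulary

vocab = expand_vocabulary(
    ["发烧", "流鼻涕", "呕吐"],
    thesaurus={"发烧": ["发烫", "发热"]},
    manual={"流鼻涕": ["流清鼻"], "发烧": ["发烫"]},
)
```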
最后要对用户输入的自然语句进行切分,需要一个全而新的词库。这里我们采用mmseg4J分词器,采用搜狗输入法提供的最新词典,再加入我们自己构建的一些口语症状描述词作为分词的大词典。这样,只要用户的输入语句中包含我们构建好的症状词或者症状词的同义词,甚至是口语描述,我们都能够识别出来。Finally, segmenting the user's natural-language sentences requires a comprehensive, up-to-date lexicon. Here we use the mmseg4J word segmenter with the latest dictionary provided by Sogou Input Method, to which we add the colloquial symptom descriptions we built ourselves, forming one large segmentation dictionary. In this way, as long as the user's input contains one of our symptom words, a synonym of a symptom word, or even a colloquial description, we can recognize it.
例如,当用户输入:“我的孩子最近老流清鼻,有时候头还发烫,该怎么办呢?”,我们的分词结果是:“我的|孩子|最近|老|流清鼻|,|有时候|头|还|发烫|,|该|怎么办|呢|?|”。然后将分出的词与症状词表匹配,首先与症状词表的主词匹配,倘若不匹配,则再次与症状词的同义词匹配,如果匹配则表明:用户的本次描述捕获到了症状词。例如此次用户的输入,我们一一将分词结果去匹配,最后发现,“流清鼻”虽然与症状词表的主词“流鼻涕”不相匹配,但是与我们手动添加的通俗说法“流清鼻”匹配;而“发烫”也与主词“发烧”不相匹配,但是却匹配于程序构建的源于同义词表的词语“发烫”。For example, when the user enters "我的孩子最近老流清鼻,有时候头还发烫,该怎么办呢?" ("My child keeps having a clear runny nose lately, and sometimes the head feels hot; what should I do?"), the segmentation result is "我的|孩子|最近|老|流清鼻|,|有时候|头|还|发烫|,|该|怎么办|呢|?|". Each token is then matched against the symptom vocabulary: first against the head words and, if that fails, against the synonyms of each head word. A match means that the user's description has yielded a symptom word. Matching the tokens of this input one by one, we find that "流清鼻" (clear runny nose) does not match the head word "流鼻涕" (runny nose) but does match the colloquial form "流清鼻" that we added manually; likewise "发烫" (feels hot) does not match the head word "发烧" (fever) but does match the synonym "发烫" that the program added from the synonym table.
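The head-word-then-synonym lookup in this example can be sketched as follows. Segmentation itself is done by mmseg4J in the system described; here the token list is simply taken from the example above, and the function name `match_symptoms` is hypothetical.

```python
def match_symptoms(tokens, vocabulary):
    """Return the head symptom words found in a segmented sentence.

    vocabulary maps each head word to its synonyms. A token is tried
    against the head words first, then against the synonyms.
    """
    synonym_to_head = {
        synonym: head
        for head, synonyms in vocabulary.items()
        for synonym in synonyms
    }
    found = []
    for token in tokens:
        head = token if token in vocabulary else synonym_to_head.get(token)
        if head is not None and head not in found:
            found.append(head)
    return found

vocabulary = {"流鼻涕": ["流清鼻"], "发烧": ["发烫", "发热"], "呕吐": []}
tokens = "我的|孩子|最近|老|流清鼻|,|有时候|头|还|发烫|,|该|怎么办|呢|?".split("|")
symptoms = match_symptoms(tokens, vocabulary)
```

With the example sentence, the two colloquial tokens resolve to the head words "流鼻涕" and "发烧", which is the list handed to the disease prediction module.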
为了疾病预测的准确性,我们尽量匹配足够多的症状,才去结束疾病咨询的过程。可是有些疾病症状确实不够多,让用户一味的输出非典型症状,有时候反而不利于疾病的预测。因此我们疾病咨询遵循的机制是:“尽量但不勉强”。当咨询后用户输入的速度减慢,或者不再输入新症状时,表明用户症状的描述完全,则结束本次问询。For the accuracy of disease prediction, we try to match as many symptoms as possible before ending the disease consultation process. However, there are not enough symptoms of some diseases, allowing users to blindly output atypical symptoms, sometimes it is not conducive to the prediction of diseases. Therefore, the mechanism we follow in disease consultation is: "try but not force". When the input speed of the user slows down after the consultation, or no new symptoms are entered, it indicates that the description of the user's symptoms is complete, and the inquiry ends.
(2)疾病预测:(2) Disease prediction:
包括疾病名称预测和症状词列表预测;疾病名称预测是根据疾病咨询中获得的疾病名称在mysql数据库中检索,以知识图谱的形式为用户展示疾病预测结果;症状词列表预测是将疾病咨询中获得的症状词列表与mysql数据库中每一种疾病的症状进行匹配,根据匹配程度以知识图谱的形式为用户展示疾病预测结果;This includes prediction from a disease name and prediction from a symptom word list. Prediction from a disease name looks up the disease name obtained in the disease consultation in the mysql database and shows the prediction result to the user in the form of a knowledge graph. Prediction from a symptom word list matches the symptom word list obtained in the disease consultation against the symptoms of each disease in the mysql database and, according to the degree of matching, shows the prediction result to the user in the form of a knowledge graph;
当用户知道疾病名称时,疾病的预测也就是基于疾病名的一个检索;我们采用开源的全文搜索框架Lucene,分别对mysql数据库中准备好的疾病数据的“名称”、“概述”、“详细资料”等字段提前建立好索引;建立索引时赋予不同的权重,“名称”的权重最大,根据用户提供的疾病名称与mysql数据库中的疾病名称的吻合程度,对每个检索结果进行打分,这样当用户提供的疾病名称与疾病库中的疾病名称吻合时,检索后输出的分值就大,将得分高的疾病预测结果反馈给用户。每个检索结果都带有一个分值,系统并不能保证用户输入的疾病名称完全准确,这样基于一个分值的输出就更加具有客观性。当用户输入的疾病名称不在我们系统时,计算结果的分值会小的接近于0,用户就可以不必以系统的预测为参考,也不会延误用户寻找其他途径的咨询方式。When the user knows the disease name, disease prediction amounts to a search on that name. We use the open-source full-text search framework Lucene and build indexes in advance over fields such as "name", "overview", and "details" of the disease data prepared in the mysql database. The fields are given different weights at indexing time, with "name" weighted highest. Each search result is scored by how well the disease name supplied by the user matches the disease names in the mysql database, so a result whose name matches scores high, and the high-scoring results are returned to the user. Because the system cannot guarantee that the name entered by the user is exactly right, attaching a score to every result makes the output more objective. When the name entered by the user is not in our system, the scores are close to 0, which tells the user not to rely on the system's prediction and avoids delaying the user's search for advice elsewhere.
症状词列表预测的主要方法是,将“疾病咨询”模块获得的症状词列表与mysql数据库中每一种疾病的症状进行匹配,倘若匹配度高,则将其对应的疾病预测给用户。当然,这里的匹配不能简单的用有和无的匹配方式进行。因为,有的疾病症状多,有的疾病症状少,有的疾病症状突出,往往一种症状就能左右这种疾病。而流行感冒这种疾病,则症状相对而言就比较多了。所以我们应该采取一种算法,可以综合考虑以上综合因素。The main method of symptom word list prediction is to match the symptom word list obtained by the "disease consultation" module with the symptoms of each disease in the mysql database. If the matching degree is high, the corresponding disease will be predicted to the user. Of course, the matching here cannot simply be done with the matching method of having and not having. Because some diseases have many symptoms, some diseases have few symptoms, and some diseases have prominent symptoms, and often one symptom can control the disease. Influenza is a disease with relatively more symptoms. Therefore, we should adopt an algorithm that can comprehensively consider the above comprehensive factors.
我们采取综合了tf-idf的余弦相似度匹配算法,来进行疾病与症状的匹配。首先要对mysql数据库中所有的疾病症状排序,而后根据疾病症状的序列,为mysql数据库中每一种疾病构建一条向量P,同时根据疾病咨询中获得的症状词列表构建一条假想的疾病向量Pˊ。We adopt a cosine similarity matching algorithm that combines tf-idf to match diseases and symptoms. First, sort all the disease symptoms in the mysql database, and then construct a vector P for each disease in the mysql database according to the sequence of disease symptoms, and construct an imaginary disease vector Pˊ according to the list of symptom words obtained in the disease consultation.
构建疾病向量的方法:倘若某疾病相应的位置确实出现某种症状,这个位置的向量因子就是此症状相对于此疾病的tf-idf值,倘若没有出现某种症状,这个位置的向量因子取0值;The method of constructing the disease vector: If a certain symptom does appear in the corresponding position of a certain disease, the vector factor of this position is the tf-idf value of the symptom relative to the disease. If there is no certain symptom, the vector factor of this position is taken as 0 value;
tf-idf值的计算方式如下:The tf-idf value is calculated as follows:
tf-idf(wij)=tf(wij)×idf(wi)tf-idf(wij)=tf(wij)×idf(wi)
tf(Term Frequency)值,表示一个症状词在当前疾病数据中出现的频率,频率越大表明此症状与此疾病越相关。疾病数据包括疾病描述、临床表现、症状等我们所采集到的数据;其计算公式为:The tf (Term Frequency) value is the frequency with which a symptom word appears in the data of the current disease; the higher the frequency, the more relevant the symptom is to the disease. Disease data includes the disease description, clinical manifestations, symptoms, and the other data we collected. The calculation formula is:
tf(wij)=n(wij)/D(j)
其中,n(wij)表示第i个症状词在第j种疾病数据中出现的次数,D(j)表示第j种疾病数据分词后的总词数;例如,“打喷嚏”在“流行性感冒”的数据中出现了4次,而“流行性感冒”的数据总词数有500个,那么其tf(wij)值便为0.008。Among them, n(wij) represents the number of occurrences of the i-th symptom word in the j-th disease data, and D(j) represents the total number of words in the j-th disease data; There are 4 times in the data of "cold", and the total number of words in the data of "influenza" is 500, so its tf(wij) value is 0.008.
idf(Inverse Document Frequency)值,表示一个症状的稀有程度,越稀有的症状预测一种疾病的能力往往越强。其计算公式为:The idf (Inverse Document Frequency) value measures how rare a symptom is; the rarer the symptom, the stronger its ability to predict a disease tends to be. The calculation formula is:
idf(wi)=log(A/a(wi))
其中,A表示当前的疾病总数,a(wi)表示出现第i个症状词的疾病总数;例如“流鼻涕”出现在了35种疾病中,而我们的疾病总数有7000个,那么其idf值便为:log(7000/35)≈2.30。Among them, A represents the current total number of diseases, a(wi) represents the total number of diseases with the i-th symptom word; for example, "runny nose" appears in 35 diseases, and our total number of diseases is 7000, then its idf value Then it is: log(7000/35)≈2.30.
由上可知,一个症状对一种疾病的预测性取决于两个因素:tf值,即当前症状在此疾病中出现的频率,多则强;idf值,即此症状的稀有程度,越稀有的症状出现在一种疾病中时,越能够预测此疾病。It can be seen from the above that the predictability of a symptom to a disease depends on two factors: the tf value, that is, the frequency of the current symptom in this disease, the more the stronger; the idf value, that is, the rarity of the symptom, the rarer it is Symptoms are more predictive of a disease when they occur in it.
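The two factors can be checked against the numbers given above with a few lines of code. This is a sketch: the function names are ours, and the base-10 logarithm is inferred from the worked value log(7000/35)≈2.30 rather than stated in the text.

```python
import math

def tf(count_in_disease, total_words_in_disease):
    """tf(wij): occurrences of symptom i in disease j over the disease's word count."""
    return count_in_disease / total_words_in_disease

def idf(total_diseases, diseases_with_symptom):
    """idf(wi): base-10 log (assumed from the worked example) of A / a(wi)."""
    return math.log10(total_diseases / diseases_with_symptom)

def tf_idf(count_in_disease, total_words_in_disease,
           total_diseases, diseases_with_symptom):
    return (tf(count_in_disease, total_words_in_disease)
            * idf(total_diseases, diseases_with_symptom))

tf_sneeze = tf(4, 500)          # "打喷嚏" in the influenza data
idf_runny_nose = idf(7000, 35)  # "流鼻涕" across 7000 diseases
```

Note that the two worked values in the text concern different symptom words, so they are not multiplied together here.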
按照上述方法,两条疾病向量构建完成后,计算两条疾病向量的余弦值,计算公式如下:Once the two disease vectors have been built by the above method, the cosine of the two vectors is computed with the following formula:
cos(PPˊ)=(P·Pˊ)/(|P|×|Pˊ|)
构建疾病向量的具体操作方法:The specific operation method of constructing the disease vector:
首先将所有的疾病症状排好序。假设当前疾病库中的疾病症状有10个:First put all the symptoms of the disease in order. Suppose there are 10 disease symptoms in the current disease database:
表1 症状词向量序列Table 1 Symptom word vector sequence
假设流行感冒的疾病症状有“发烧”、“流鼻涕”、“呕吐”、“打喷嚏”,而这些症状相对于此疾病的tf-idf值分别为:0.0022,0.0184,0.0053,0.0242;那么最终构建的流行性感冒的向量为:P(0.0022,0,0,0.0184,0.0053,0,0,0,0.0242,0)。Assuming that the symptoms of influenza include "fever", "runny nose", "vomiting", and "sneezing", and the tf-idf values of these symptoms relative to the disease are: 0.0022, 0.0184, 0.0053, 0.0242; then the final The constructed influenza vector is: P(0.0022, 0, 0, 0.0184, 0.0053, 0, 0, 0, 0.0242, 0).
构建好每一种疾病的向量后,就可以进行疾病预测了。当疾病咨询模块传来症状列表以后,我们会为当前用户构建一条假想疾病的向量。构建方法与疾病向量的方法相同。假设在疾病咨询模块所捕获到的用户症状为:“发烧”、“流鼻涕”、“打喷嚏”,那么为预期疾病构建的向量为:Pˊ(0.0022,0,0,0.0184,0,0,0,0,0.0242,0)。After constructing the vector of each disease, the disease prediction can be performed. When the disease consultation module sends a list of symptoms, we will construct a vector of hypothetical diseases for the current user. The construction method is the same as that of the disease vector. Assuming that the user symptoms captured in the disease consultation module are: "fever", "runny nose", "sneezing", then the vector constructed for the expected disease is: Pˊ(0.0022, 0, 0, 0.0184, 0, 0, 0, 0, 0.0242, 0).
等到两条向量就绪,就可以进行计算两条向量之间的余弦值了。计算公式如下:Once the two vectors are ready, the cosine between them can be computed. The formula is:
cos(PPˊ)=(P·Pˊ)/(|P|×|Pˊ|)
其中分子为两条向量的内积,分母为两条向量模的乘积;The numerator is the inner product of two vectors, and the denominator is the product of the modulus of two vectors;
我们还以上面的例子进行计算,则计算结果为cos(PPˊ)≈0.9852,可以看到余弦值接近于1,说明两条向量非常相似,由此证明用户患有流行感冒的概率是比较大的,我们可以将此疾病信息预测给用户。We also use the above example to calculate, and the calculated result is cos(PPˊ)≈0.9852. It can be seen that the cosine value is close to 1, indicating that the two vectors are very similar, which proves that the probability of the user suffering from influenza is relatively high , we can predict this disease information to the user.
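The worked example can be reproduced with a short script. The vectors are the ones given in the text; the function name `cosine` and the choice to return 0 for an all-zero vector are our assumptions.

```python
import math

def cosine(p, q):
    """Inner product of p and q divided by the product of their norms."""
    dot = sum(a * b for a, b in zip(p, q))
    norms = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norms if norms else 0.0  # assumed convention for zero vectors

# Influenza vector P and the user's hypothetical disease vector P' from the text.
P = [0.0022, 0, 0, 0.0184, 0.0053, 0, 0, 0, 0.0242, 0]
P_user = [0.0022, 0, 0, 0.0184, 0, 0, 0, 0, 0.0242, 0]

similarity = cosine(P, P_user)
print(round(similarity, 4))  # prints 0.9852
```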
采取tf-idf值可以保证某种疾病的关键症状能够获得比较大的权重,从而在进行余弦相似度计算时,可以算出比较大的结果。这样当用户出现这些关键症状时,我们系统的预测就能够尽可能的靠拢那些拥有关键症状词的疾病。当每条疾病向量P与假想疾病向量Pˊ的余弦值计算完成后,余弦值按照从大到小排序,选择前十个余弦值对应的疾病信息,返还给前端做可视化展示,将该疾病预测结果反馈给用户。Taking the tf-idf value can ensure that the key symptoms of a certain disease can get a relatively large weight, so that a relatively large result can be calculated when performing cosine similarity calculations. In this way, when users have these key symptoms, the prediction of our system can be as close as possible to those diseases with key symptom words. After the calculation of the cosine value of each disease vector P and the imaginary disease vector Pˊ is completed, the cosine values are sorted from large to small, and the disease information corresponding to the first ten cosine values is selected, and returned to the front end for visual display, and the disease prediction result Feedback to users.
余弦值越大越接近于1,就说明这个假想向量与相应疾病向量越相似,用户的患儿就越有可能患有此种疾病。当然,之所以不将余弦值最大的一个结果返回给用户,是因为此系统毕竟不是真人医生,并不能确诊,因此返回前十个供用户参考。不过这也印证了此模块的名字“疾病预测”,预测的结果当然不只一个了。The larger the cosine value is, the closer it is to 1, it means that the imaginary vector is more similar to the corresponding disease vector, and the user's child is more likely to suffer from this disease. Of course, the reason why the result with the largest cosine value is not returned to the user is because this system is not a real doctor after all, and cannot make a diagnosis, so the first ten results are returned for the user's reference. However, this also confirms the name of this module "Disease Prediction", and of course there are more than one predicted results.
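Ranking all diseases by cosine value and keeping the first ten could look like the sketch below. The disease names and vectors in the usage lines are made up for illustration, and `rank_diseases` is a hypothetical helper, not a function named in the text.

```python
import heapq
import math

def cosine(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    norms = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norms if norms else 0.0

def rank_diseases(disease_vectors, user_vector, k=10):
    """Return up to k (disease, cosine) pairs, largest cosine first."""
    scored = ((name, cosine(vec, user_vector))
              for name, vec in disease_vectors.items())
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])

# Made-up three-symptom vectors, only to exercise the ranking.
disease_vectors = {
    "disease_a": [0.02, 0.00, 0.01],
    "disease_b": [0.00, 0.03, 0.00],
    "disease_c": [0.02, 0.00, 0.00],
}
ranking = rank_diseases(disease_vectors, [0.02, 0.0, 0.0], k=2)
```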
倘若用户在疾病咨询模块给出的症状词过少,在疾病预测模块不足以构建向量时,系统提供另一种预测疾病方法。当用户给出一个症状词时,在mysql数据库中查找,遍历每一个疾病的症状列表,假如搜索到某一个疾病的症状列表中包含当前症状词,则计算当前症状词的tf-idf值与该疾病症状列表中所有症状词的tf-idf总和的比率,比率越接近于1,说明用户患有此种疾病的可能性越大;然后取出此疾病症状列表中第二大tf-idf值的症状词去询问用户是否满足,倘若满足,则继续询问第三大tf-idf值的症状,依次类推,当已经确定的症状词数与此疾病症状词总数的比例达到65%时,将该疾病预测结果反馈给用户;如果已经确定的症状词数与此疾病症状词总数的比例不足65%时,再按照上述方法依次对其它包含当前症状词的疾病症状列表进行预测。If the user gives too few symptom words in the disease consultation module for the disease prediction module to build a vector, the system offers another prediction method. When the user gives a single symptom word, the system searches the mysql database and traverses the symptom list of every disease. If the symptom list of a disease contains the current symptom word, the system computes the ratio of that word's tf-idf value to the sum of the tf-idf values of all symptom words in that list; the closer this ratio is to 1, the more likely the user is to have the disease. The system then takes the symptom word with the second-largest tf-idf value in that list and asks the user whether it applies; if it does, it goes on to ask about the symptom with the third-largest tf-idf value, and so on. When the number of confirmed symptom words reaches 65% of the total number of symptom words for the disease, the disease is returned to the user as the prediction result. If the proportion stays below 65%, the same procedure is applied in turn to the other diseases whose symptom lists contain the current symptom word.
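One possible reading of this fallback procedure is sketched below. Several details are our assumptions because the text leaves them open: candidate diseases are visited in descending order of the tf-idf ratio, the remaining symptoms of a disease are all asked (in descending tf-idf order) even after a "no", and the `ask` callback stands in for the dialogue with the user.

```python
def predict_from_single_symptom(symptom, diseases, ask, threshold=0.65):
    """diseases: name -> {symptom word: tf-idf value}.
    ask(word) -> bool stands in for asking the user about a symptom."""
    candidates = []
    for name, weights in diseases.items():
        total = sum(weights.values())
        if symptom in weights and total > 0:
            candidates.append((weights[symptom] / total, name))

    # Assumption: try the disease with the highest ratio first.
    for _, name in sorted(candidates, reverse=True):
        weights = diseases[name]
        confirmed = 1  # the symptom the user already gave
        others = sorted((w for w in weights if w != symptom),
                        key=weights.get, reverse=True)
        for word in others:
            if confirmed / len(weights) >= threshold:
                break
            if ask(word):
                confirmed += 1
        if confirmed / len(weights) >= threshold:
            return name
    return None

# Made-up data: the user reports s1 and, when asked, confirms s2.
diseases = {
    "disease_a": {"s1": 0.5, "s2": 0.3, "s3": 0.2},
    "disease_b": {"s1": 0.1, "s4": 0.9},
}
result = predict_from_single_symptom("s1", diseases, ask=lambda w: w == "s2")
```

Here two of the three symptoms of `disease_a` are confirmed (about 67%), which passes the 65% threshold.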
(3)医生推荐:(3) Doctor's recommendation:
医生推荐模块主要功能是基于用户先前的疾病咨询记录和疾病预测结果的点击记录,为用户推荐与用户所患疾病相关的医生信息;医生信息包括医生的姓名、科室、所属医院、简介、主治、擅长、联系方式等。The main function of the doctor recommendation module is to recommend doctor information related to the user's disease based on the user's previous disease consultation records and disease prediction results; doctor information includes the doctor's name, department, hospital, profile, attending, Skills, contacts, etc.
为了能够高速有效的给用户推荐到合适的医生,系统依然使用Lucene框架为医生的信息建立索引。且不同的信息赋予不同的权重,主治和擅长的权重均大于其他信息,简介的权重次之;如主治和擅长的权重都为3,简介的权重为1。这样当用户的检索信息中包含的某种疾病症状或者疾病名称在医生的主治和擅长中出现时,当前医生所获得的检索排名分数就高,说明此医生越是擅长此种疾病,同时此医生被推荐的可能性就越大。选取排名前十的医生推荐给用户。让用户自己去选择点击查看医生的详细信息,并确定是否选择此医生去实地就医。To recommend suitable doctors quickly and effectively, the system again uses the Lucene framework to index the doctors' information. Different fields are given different weights: the "conditions treated" (主治) and "specialties" (擅长) fields are weighted higher than the other fields, followed by the profile; for example, conditions treated and specialties each have weight 3 and the profile has weight 1. Thus, when a symptom or disease name in the user's search information appears in a doctor's conditions treated or specialties, that doctor receives a high ranking score, which indicates that the doctor is better suited to the disease and is more likely to be recommended. The ten top-ranked doctors are recommended to the user, who can click to view a doctor's details and decide whether to visit that doctor in person.
在整个过程结束之后,系统还会询问用户对于本次问诊是否满意,这将作为系统的一种反馈,将本次问诊的记录添加到个人机器人模块。当在个人机器人模块中用户选择公开自己机器人时,将会有可能获得系统机器人的问题解答,而问题解答的信息来源便是本次用户点赞的问诊记录。After the whole process ends, the system also asks the user whether they are satisfied with this consultation. The answer serves as feedback to the system, and the record of the consultation is added to the personal robot module. When the user chooses to make their robot public in the personal robot module, they may receive answers to questions from the system robot, and the source of those answers is the consultation records that the user has rated positively.
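The effect of the field weights can be illustrated with a toy scorer. This is a stand-in and not Lucene's scoring formula: it simply adds a field's weight each time a query term occurs in that field. The field keys (`treats`, `specialty`, `profile`) and the sample doctors are invented for the example.

```python
# Weights from the text: conditions treated and specialties 3, profile 1.
FIELD_WEIGHTS = {"treats": 3, "specialty": 3, "profile": 1}

def score_doctor(doctor, terms):
    """Sum the weight of every field in which a query term appears."""
    return sum(
        weight
        for field, weight in FIELD_WEIGHTS.items()
        for term in terms
        if term in doctor.get(field, "")
    )

def recommend(doctors, terms, k=10):
    """Return the names of up to k doctors with a non-zero score, best first."""
    scored = [(score_doctor(d, terms), d["name"]) for d in doctors]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: -pair[0])  # stable: ties keep input order
    return [name for _, name in scored[:k]]

doctors = [
    {"name": "doctor_a", "treats": "流行性感冒", "specialty": "小儿呼吸", "profile": ""},
    {"name": "doctor_b", "treats": "", "specialty": "", "profile": "曾诊治流行性感冒"},
    {"name": "doctor_c", "treats": "湿疹", "specialty": "皮肤", "profile": ""},
]
names = recommend(doctors, ["流行性感冒"])
```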
步骤3,个人机器人知识库构建:Step 3, personal robot knowledge base construction:
个人机器人是用户自行创建的在系统中的虚拟代理。用户可以将自己在育儿的过程中,所积累的经验和知识通过培养的方式不断的向个人机器人输入自己的知识,这样就实现了机器人的成长;个人机器人以一个问题对应一个答案的形式存储在mysql数据库中,并以知识社区的形式为不同用户提供过程式的育儿知识服务;个人机器人知识库的构建过程,就是用户不断的输入自己的知识让机器人记住的过程。Personal robots are virtual agents in the system that users create themselves. Users can continuously input their knowledge to the personal robot through the experience and knowledge they have accumulated in the process of raising children, so as to realize the growth of the robot; the personal robot is stored in the form of a question corresponding to an answer. mysql database, and provide procedural parenting knowledge services for different users in the form of a knowledge community; the construction process of the personal robot knowledge base is the process of users continuously inputting their own knowledge for the robot to remember.
首先,每一个用户登录系统后,都可以创建一个自己的机器人。机器人要有昵称、年龄、外表、称呼语等数据,用户将在系统的引导下一步步完善这些数据。而后,用户每次进入个人机器人模块时,已经创建好的机器人就会给用户打招呼,并开始服务。系统服务的主要模式是问答,问答的难点在于如何实现知识的增长和基于用户的问题寻找答案。First of all, after each user logs in to the system, they can create a robot of their own. The robot must have data such as nickname, age, appearance, and salutation, and the user will complete these data step by step under the guidance of the system. Then, every time the user enters the personal robot module, the created robot will greet the user and start serving. The main mode of system service is question answering. The difficulty of question answering lies in how to realize the growth of knowledge and find answers based on users' questions.
知识的增长主要来源于用户的培养。当用户询问自己的机器人一个问题时,倘若系统以前从用户那里获取过这个问题的答案,系统便会直接给出答案。倘若系统的知识库检索不到答案,机器人则会要求学习这个知识,于是用户将会在系统的引导下教会机器人这个知识点。相应的这个知识点也会存储在系统的知识库中,作为下一次对用户提问进行答案检索的依据。系统中用户的问题和答案,采取一对一的存储方式,一个问题对应一个答案存储在mysql数据库的表中。The growth of knowledge comes mainly from training by the user. When a user asks their robot a question, the system gives the answer directly if it has previously obtained an answer to that question from the user. If no answer can be retrieved from the knowledge base, the robot asks to learn it, and the user teaches the robot this knowledge point under the guidance of the system. The knowledge point is then stored in the system's knowledge base and serves as the basis for answer retrieval the next time the user asks. Questions and answers are stored one-to-one: each question corresponds to one answer in a table of the mysql database.
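The answer-or-learn loop can be sketched with an in-memory dictionary standing in for the mysql table. The class and function names are hypothetical, and exact-match lookup is used here only for brevity; the retrieval actually described relies on full-text search.

```python
class PersonalRobot:
    """Toy stand-in for one user's robot; a dict replaces the mysql table."""

    def __init__(self):
        self._answers = {}  # one question -> one answer

    def ask(self, question):
        return self._answers.get(question)

    def teach(self, question, answer):
        self._answers[question] = answer

def chat(robot, question, teacher):
    """Answer from the knowledge base, or learn the answer from the user.

    teacher(question) stands in for the guided step in which the user
    supplies the missing answer.
    """
    answer = robot.ask(question)
    if answer is None:
        answer = teacher(question)
        robot.teach(question, answer)
    return answer
```

The first time a question is asked, the teacher callback is invoked and the pair is stored; later calls with the same question are answered from the store.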
答案的寻找依旧依靠于Lucene全文检索引擎。系统首先对所有的用户问题和答案数据分域建立索引,当用户问题提出时,首先对用户问题进行分词处理,然后去掉停用词和虚词,以主要实词和用户名建立一个嵌套查询去索引中查找。之所以叫嵌套查询,是因为Lucene中有一套查询机制,它可以组合“与”、“或”、“非”的嵌套,使得查询更加灵活。这里我们嵌套的查询是:(用户id) and (实词1 or 实词2 or 实词n),这样就能够查找指定用户知识库中的特定问题。查找结果也是基于分数的返回,我们选择分数最高问题的答案作为结果用来给用户回答。用户可能会对问题的答案不甚满意,这说明当前问题与用户的问题有差异,于是系统机器人将提醒用户重新记住此新问题。不甚满意的另一种可能是,用户提问的方式有所改变,虽然是同一个问题,可能因为时间太过久远而用户用新的语句提出了此问题,由此导致系统不能够识别这个问题。对于这种状况,系统的优化方法是在实现Lucene建立索引时,添加同义词词典,这样可以解决由于句子中词语改变而造成的识别错误。例如用户提问“孩子发烧怎么办?”和用户提问“孩子发热怎么办?”会检索到一样的答案。但是这样只能解决由于用词不同而产生的问题检索失败,对于因为句法等原因导致的不识别问题,牵扯到更深层次的研究,本系统暂时没有涉及。Finding an answer again relies on the Lucene full-text search engine. The system first indexes all user question and answer data field by field. When the user asks a question, the question is first segmented into words, stop words and function words are removed, and a nested query built from the main content words and the user name is run against the index. It is called a nested query because Lucene has a query mechanism that can nest combinations of "and", "or", and "not", which makes queries more flexible. The nested query here is: (user id) and (content word 1 or content word 2 or content word n), which finds specific questions in the specified user's knowledge base. Results are again returned with scores, and we take the answer to the highest-scoring question as the reply to the user. The user may be dissatisfied with the answer, which indicates that the retrieved question differs from the user's question; the system robot then prompts the user to teach it this new question. Another possible cause of dissatisfaction is that the user has phrased the question differently: it is the same question, but after a long time the user asks it in new words, so the system fails to recognize it. To handle this, the system adds a synonym dictionary when building the Lucene index, which resolves recognition errors caused by changed wording. For example, the questions "孩子发烧怎么办?" and "孩子发热怎么办?" (both meaning "What should I do if my child has a fever?") retrieve the same answer. However, this only resolves retrieval failures caused by different word choices; failures caused by syntax and similar factors involve deeper research and are not addressed by this system for now.
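The nested query can be written in Lucene's classic query-parser syntax, which supports `field:term`, `AND`/`OR`, and parenthesized groups. The sketch below only builds the query string; the field names `userId` and `question` are assumptions, and escaping of query-syntax characters inside the words is not handled.

```python
def build_query(user_id, words, stop_words):
    """Build '(user id) AND (word1 OR word2 ...)' as a query string.

    Field names 'userId' and 'question' are hypothetical. Words that
    contain quotes or other query-syntax characters would need escaping.
    """
    content_words = [w for w in words if w not in stop_words]
    user_clause = f"userId:{user_id}"
    if not content_words:
        return user_clause
    any_word = " OR ".join(f'question:"{w}"' for w in content_words)
    return f"{user_clause} AND ({any_word})"

query = build_query(42, ["孩子", "发烧", "怎么办"], stop_words={"怎么办"})
```

For the segmented question in the usage line, the resulting string restricts the search to user 42 and matches any question containing either remaining content word.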
另外,在个人机器人积累到10个问题以上时,用户可以选择公开自己的机器人,这样当用户再次提出一个问题时,或许在自己的知识库中检索失败,导致自己的机器人不能回答,而此时因为公开了数据,所以可以获得其他用户机器人的答案。与此同时,也将意味着自己的机器人也会为他人服务。通过这种模式,可以实现用户之间交流和分享育儿经验。In addition, when the personal robot has accumulated more than 10 questions, the user can choose to disclose his robot, so that when the user asks a question again, it may fail to search in his knowledge base, causing his robot to be unable to answer. Because the data is made public, answers from other user bots are available. At the same time, it will also mean that your own robot will also serve others. Through this model, users can communicate and share parenting experiences.
步骤4,系统交互实现:Step 4, system interaction implementation:
我们也充分人性化的考虑了用户的使用习惯,通过PC端和移动端实现用户交互;作为一个机器人系统,首先应当考虑的就是如何让系统更加人性化。而人性化不仅体现在算法上面,更加体现在交互上。为了让系统更加具有人的品质,我们不但使用语音技术让系统具有人的声音,而且还使用了一些数据可视化技术,使得数据的展示更加形象生动。这样,系统不仅从视觉还是从听觉上,都会带给用户耳目一新的体验。We also fully consider the user's usage habits in a humane way, and realize user interaction through PC and mobile terminals; as a robot system, the first thing to consider is how to make the system more humanized. The humanization is not only reflected in the algorithm, but also in the interaction. In order to make the system more human-like, we not only use speech technology to make the system have a human voice, but also use some data visualization technology to make the data display more vivid. In this way, the system will bring users a refreshing experience not only in terms of vision but also hearing.
在安卓手机端,语音合成技术和语音识别技术都比较成熟,系统可以完全实现服务器数据的读和听,因此用户和系统的交互是纯语音交互。当系统构建好需要机器人说出的话以后,通过http协议传输json数据到手机端,手机直接解析后调用语音接口合成音频文件,让手机朗读出来即实现了机器人的“说”。而用户回答机器人问题时,也只是说出来,安卓端将调用语音系统API将用户语音转化为文字,并构建json字符串传给系统服务器,从而实现了机器人的“听”。浏览器端的语音技术相对较弱,不过也可以实现语音的合成,这时机器人只会“说”,而识别就需要用户手动输入文字了。不过,PC键盘的文字输入也是非常方便的,只要用户拥有一般人的打字速度,交互还是非常流畅的。On Android phones, speech synthesis and speech recognition are both relatively mature, so the system can fully read out and listen for server data, and the interaction between user and system is purely by voice. Once the system has composed what the robot should say, it sends json data to the phone over the http protocol; the phone parses the data, calls the speech interface to synthesize an audio file, and reads it aloud, which realizes the robot's "speaking". When the user answers the robot, they simply speak; the Android client calls the speech system API to convert the user's speech into text, builds a json string, and sends it to the system server, which realizes the robot's "listening". Speech technology in the browser is comparatively weak, but speech synthesis is still possible, so there the robot can only "speak", and for recognition the user has to type the text manually. Text input on a PC keyboard is nevertheless very convenient: as long as the user types at an average speed, the interaction remains smooth.
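The json exchange between server and client might be shaped as in the sketch below. The message fields (`type`, `stage`, `text`) are invented for illustration; the text does not specify the actual message format.

```python
import json

def build_reply(text, stage):
    """Server side: wrap the robot's sentence for the client to read aloud.

    The keys 'type', 'stage', and 'text' are hypothetical.
    """
    return json.dumps({"type": "speak", "stage": stage, "text": text},
                      ensure_ascii=False)

def parse_user_message(raw):
    """Server side: extract the recognized text sent back by the client."""
    data = json.loads(raw)
    return data["text"].strip()

reply = build_reply("请问孩子有哪些症状?", stage="consultation")
heard = parse_user_message('{"type": "heard", "text": " 孩子发烧 "}')
```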
当然,在听与说的同时,视觉上还需要让用户感受到机器人的动作。安卓端将使用帧动画的形式,机器人在说与听的同时,播放机器人交互动作动画,让用户体验到是在与机器进行交流,而不是手机。浏览器端将使用js实现图片的轮播,从而也会拥有动画的效果。Of course, while listening and speaking, it is also necessary for users to feel the robot's movements visually. The Android terminal will use the form of frame animation. While the robot is speaking and listening, it will play the animation of robot interaction actions, so that users can experience that they are communicating with a machine, not a mobile phone. The browser side will use js to realize the carousel of pictures, which will also have the effect of animation.
从服务器端传来的还有一些展示数据,例如疾病预测的结果、疾病详情、医生推荐结果、医生详情等,这些数据安卓端将在WebView控件中调用JS,采用ECharts展现。浏览器端嵌入ECharts早已经非常成熟,所以也没有什么技术难度。ECharts动态图表展示,将使得数据形象生动,而不再只像文字一般单调。The server also sends display data, such as disease prediction results, disease details, doctor recommendation results, and doctor details. On Android, this data is rendered with ECharts by calling JS inside a WebView control. Embedding ECharts in the browser is already very mature, so it poses no technical difficulty. Dynamic ECharts charts make the data vivid rather than as monotonous as plain text.
对于系统中机器人表达的一些提示性语句,为了不让用户感到呆板,我们趣味性的设置了许多种表达。在系统运行一个流程节点时,系统会遍历当前提示语的所有说法,随机选取一个给用户提示。这样,在用户连续多次使用系统后,每次都基本会遇到不同的表达,将会增加体验的新奇性。For some suggestive sentences expressed by robots in the system, in order not to make users feel dull, we have set up many kinds of expressions in an interesting way. When the system runs a process node, the system will traverse all the sayings of the current prompt, and randomly select one to prompt the user. In this way, after users use the system multiple times in a row, they will basically encounter different expressions each time, which will increase the novelty of the experience.
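Random selection among several phrasings of the same prompt can be sketched as follows. The node name and the example phrasings are made up; only the mechanism (collect all phrasings for the current flow node, pick one at random) comes from the text.

```python
import random

# Hypothetical prompt table: flow node -> alternative phrasings.
PROMPTS = {
    "ask_symptom": [
        "宝宝哪里不舒服呢?",
        "能告诉我孩子有什么症状吗?",
        "孩子现在有哪些表现?",
    ],
}

def pick_prompt(node, rng=random):
    """Return one randomly chosen phrasing for the given flow node."""
    return rng.choice(PROMPTS[node])

prompt = pick_prompt("ask_symptom")
```

Passing a seeded `random.Random` instance as `rng` makes the choice reproducible, which is convenient in tests.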
The intelligent parenting knowledge service system of the present invention, which adopts the above intelligent parenting knowledge service method, comprises an information collection module, an expert robot module, a personal robot module, and a system interaction module;
the information collection module is connected to the MySQL database and is used to periodically obtain the latest children's disease data through a crawler program;
the expert robot module is used to provide users, intuitively and in the form of a knowledge graph, with knowledge services related to pediatric diseases; the expert robot module comprises a disease consultation module, a disease prediction module, and a doctor recommendation module;
the disease consultation module is connected to the disease prediction module and is used to obtain the disease name or symptom words provided by the user and to pass the disease name and the list of symptom words to the disease prediction module; the disease consultation module comprises the mmseg4j tokenizer and a disease symptom vocabulary; the mmseg4j tokenizer segments the colloquial symptom descriptions entered by the user; the disease symptom vocabulary contains the words found in the disease symptoms and clinical manifestations stored in the MySQL database, together with their synonyms, and is used to recognize the symptom words among the segments produced by the mmseg4j tokenizer.
the disease prediction module is used to look up the disease name in the MySQL database and present the disease prediction result to the user in the form of a knowledge graph, and to match the list of symptom words against the symptoms of each disease in the MySQL database and present the disease prediction results to the user, in the form of a knowledge graph, according to the degree of matching;
the doctor recommendation module is connected to both the disease consultation module and the disease prediction module and, based on the consultation records of the disease consultation module and the click records on the disease prediction results, recommends to the user doctors related to the disease concerned;
the personal robot module is used to store knowledge entered by users in the MySQL database as question-answer pairs, one answer per question, and to provide process-oriented parenting knowledge services to different users in the form of a knowledge community;
the system interaction module comprises a PC client and a mobile client; it uses speech technology to give the system a human voice, uses data visualization to present data more vividly, and includes prompts with multiple alternative phrasings.
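The hand-off from the disease consultation module to the disease prediction module can be illustrated with a simplified sketch. Several assumptions apply: forward maximum matching against the vocabulary is used here as a stand-in for the mmseg4j tokenizer (it is not mmseg4j's API), the vocabulary and disease table are small hypothetical examples rather than data from the MySQL database, and the "degree of matching" is taken to be the fraction of a disease's symptoms that the user reported, which the source does not specify.

```python
# Hypothetical symptom vocabulary: colloquial or formal word -> canonical symptom.
SYMPTOM_VOCAB = {
    "发烧": "发热",
    "发热": "发热",
    "咳嗽": "咳嗽",
    "流鼻涕": "流涕",
    "流涕": "流涕",
    "拉肚子": "腹泻",
    "腹泻": "腹泻",
}

# Hypothetical disease table: disease name -> set of canonical symptoms.
DISEASE_SYMPTOMS = {
    "感冒": {"发热", "咳嗽", "流涕"},
    "肠胃炎": {"腹泻", "发热"},
}


def extract_symptoms(text, vocab=SYMPTOM_VOCAB):
    """Forward maximum matching: at each position take the longest vocabulary word."""
    max_len = max(len(word) for word in vocab)
    found, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if piece in vocab:
                canonical = vocab[piece]  # map synonyms to one canonical word
                if canonical not in found:
                    found.append(canonical)
                i += size
                break
        else:
            i += 1  # no vocabulary word starts here; skip one character
    return found


def rank_diseases(symptoms, diseases=DISEASE_SYMPTOMS):
    """Score each disease by the fraction of its symptoms that were reported."""
    reported = set(symptoms)
    scored = [(name, len(reported & known) / len(known))
              for name, known in diseases.items()]
    # Keep only diseases with at least one matching symptom, best match first.
    return sorted((s for s in scored if s[1] > 0), key=lambda s: s[1], reverse=True)
```

For instance, `rank_diseases(extract_symptoms("宝宝发烧还拉肚子"))` ranks "肠胃炎" ahead of "感冒" under this toy data, because both of its listed symptoms are matched. A real deployment would load the vocabulary and disease symptoms from the database and might weight symptoms differently.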
After a series of studies from design through implementation, the system runs stably. Its scheduled, targeted data collection, comprehensive and rigorous knowledge base construction, and friendly, dynamic voice interaction are intended to ease the worries parents face while raising children and even to bring them enjoyment, filling a gap in intelligent children's medical knowledge services in the mobile Internet era.
The above are only preferred embodiments of the present invention and are not intended to limit its scope of protection. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201710210882.8A (published as CN106991284B) | 2017-03-31 | 2017-03-31 | Intelligent parenting knowledge service method and system |
| Publication Number | Publication Date |
|---|---|
| CN106991284A | 2017-07-28 |
| CN106991284B | 2019-12-31 |
| Application Number | Status | Title | Priority Date | Filing Date |
|---|---|---|---|---|
| CN201710210882.8A | Active | Intelligent parenting knowledge service method and system | 2017-03-31 | 2017-03-31 |
| Country | Link |
|---|---|
| CN (1) | CN106991284B (en) |
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |
| GR01 | Patent grant |