Disclosure of Invention
The embodiments of the present application provide a travel method, a travel device, a computer device, and a storage medium for the blind, which can provide a walking direction for a blind person according to the current road conditions, so that the blind person gets clear of travel influencing factors more quickly, improving the blind person's travel efficiency and travel experience. The technical solution is as follows:
In one aspect, a travel method for the blind is provided, the method comprising:
acquiring road condition images of a blind person in real time during the blind person's travel, wherein the road condition images reflect road condition information of the blind person's current position;
in a case where the road condition images contain at least one travel influencing factor, determining a next walking plan for the at least one travel influencing factor, wherein the next walking plan is used to prompt the blind person's next walking direction;
and sending a voice signal of the next walking plan to a terminal, so that the terminal plays the voice signal of the next walking plan to the blind person.
In another aspect, a travel device for the blind is provided, the device comprising:
an acquisition module, configured to acquire road condition images of a blind person in real time during the blind person's travel, wherein the road condition images reflect road condition information of the blind person's current position;
a first determining module, configured to determine, in a case where the road condition images contain at least one travel influencing factor, a next walking plan for the at least one travel influencing factor, wherein the next walking plan is used to prompt the blind person's next walking direction;
and a sending module, configured to send a voice signal of the next walking plan to a terminal, so that the terminal plays the voice signal of the next walking plan to the blind person.
In some embodiments, the first determining module includes:
an acquisition unit, configured to acquire, in a case where the road condition images contain any travel influencing factor, the distance between the blind person and the travel influencing factor;
a first determining unit, configured to determine a next walking plan for the blind person in a case where the distance between the blind person and the travel influencing factor does not exceed a preset distance.
In some embodiments, the first determining unit is configured to: in a case where the distance between the blind person and the travel influencing factor does not exceed the preset distance, if the travel influencing factor is an obstacle, determine the blind person's next walking direction and walking distance based on the size of the obstacle; if the travel influencing factor is steps, determine the blind person's next walking direction and walking distance based on step information, the step information comprising at least one of the direction, number, and height of the steps; and if the travel influencing factor is a traffic light, determine whether the blind person stops walking next based on the color and timing of the traffic light.
In some embodiments, the first determining module is configured to determine, in a case where the road condition images contain at least one travel influencing factor, a relative position between the at least one travel influencing factor and the blind person, the relative position comprising a distance and a direction; and formulate, in order of the relative positions from near to far, a next walking plan for the at least one travel influencing factor, the next walking plan comprising a walking direction, a distance, and a time;
the sending module is configured to generate the voice signal based on the next walking plan, and send the voice signal of the next walking plan to the terminal, so that the terminal plays, to the blind person, the walking mode for the at least one travel influencing factor in order of the relative positions from near to far.
In some embodiments, the apparatus further comprises:
a recognition module, configured to perform semantic recognition on the text of the blind person's voice signal using a large language model, to obtain the blind person's intention;
a second determining module, configured to determine, in a case where the blind person's intention is to go to a target position, a target path based on the blind person's current position and the target position, the target path being used to guide the blind person from the current position to the target position;
the first determining module is configured to determine, in a case where the road condition images contain at least one travel influencing factor, the next walking plan based on the at least one travel influencing factor and the target path.
In some embodiments, the second determining module includes:
a second determining unit, configured to determine, in a case where the blind person's intention is to go to the target position, a plurality of paths between the blind person's current position and the target position based on the current position and the target position;
a third determining unit, configured to determine the target path based on road condition information of the plurality of paths, wherein the road condition information includes travel influencing factors and reflects the travel difficulty of the blind person walking on the corresponding path.
In some embodiments, the road condition information includes a plurality of influence factor types, and priorities exist among the plurality of influence factor types;
the third determining unit is configured to: for any path, determine the influence factor types present in the path based on the road condition information of the path; perform a weighted summation of the factor scores of the influence factor types based on their priorities, to obtain a path score of the path, wherein a factor score represents the difficulty for the blind person of getting clear of the corresponding travel influencing factor, and the path score represents the difficulty for the blind person of traversing the path; and determine the path with the lowest difficulty as the target path based on the path scores of the plurality of paths.
In some embodiments, the third determining unit is configured to: in a case where a target number of paths are determined, based on the path scores of the plurality of paths, to share the same difficulty, that difficulty being the lowest among the plurality of paths, send voice signals describing the target number of paths to the terminal, the voice signals being used to announce to the blind person the road condition information of each of the target number of paths; and determine the target path based on the blind person's voice signal received in reply.
In some embodiments, the recognition module is configured to perform voiceprint recognition on an initially acquired voice signal to obtain voiceprint features of the voice signal; and, in a case where the voiceprint features are the same as those of the blind person, perform semantic recognition on the text of the voice signal using the large language model, to obtain the blind person's intention.
In some embodiments, the first determining module is further configured to re-plan a route for the blind person based on the blind person's current position and the target position in a case where an emergency occurs on a front road segment of the target path, wherein the front road segment is a road segment of the target path that the blind person has not yet passed through, and the emergency increases the travel difficulty of the target path.
In another aspect, a computer device is provided, the computer device comprising a processor and a memory, the memory being configured to store at least one segment of a computer program, which is loaded and executed by the processor to implement the travel method for the blind in the embodiments of the present application.
In another aspect, a computer-readable storage medium is provided, in which at least one segment of a computer program is stored, the computer program being loaded and executed by a processor to implement the travel method for the blind in the embodiments of the present application.
In another aspect, a computer program product is provided, comprising a computer program stored in a computer-readable storage medium; a processor of a computer device reads the computer program from the computer-readable storage medium and executes it, causing the computer device to perform the travel method for the blind provided in the various optional implementations of the above aspects.
The embodiments of the present application provide a travel method for the blind. By acquiring road condition images of the blind person in real time, the method can determine the road condition information of the blind person's current position during travel, that is, determine whether travel influencing factors that affect the blind person's travel exist nearby under the current conditions. In a case where such factors exist, the method determines the blind person's next walking plan and sends a voice signal of that plan to the terminal, so that the blind person can walk according to the walking direction in the voice signal, get clear of the travel influencing factors more quickly, and enjoy improved travel efficiency and travel experience.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, the embodiments of the present application will be described in further detail with reference to the accompanying drawings.
The terms "first," "second," and the like in this disclosure are used for distinguishing between similar elements or items having substantially the same function and function, and it should be understood that there is no logical or chronological dependency between the terms "first," "second," and "n," and that there is no limitation on the amount and order of execution.
The term "at least one" in the present application means one or more, and the meaning of "a plurality of" means two or more.
It should be noted that the information (including but not limited to user equipment information, user personal information, and the like), data (including but not limited to data for analysis, stored data, presented data, and the like), and signals involved in the present application are all authorized by the user or fully authorized by all parties, and the collection, use, and processing of the related data must comply with the relevant laws, regulations, and standards of the relevant countries and regions. For example, the road condition images and positions involved in the present application are acquired with full authorization.
The travel method for the blind provided by the embodiments of the present application can be executed by a computer device. In some embodiments, the computer device is a terminal or a server. In the following, the implementation environment of the travel method for the blind provided by the embodiments of the present application is introduced by taking the case where the computer device is a server as an example. Fig. 1 is a schematic diagram of an implementation environment of the travel method for the blind provided by an embodiment of the present application. Referring to fig. 1, the implementation environment includes a terminal 101 and a server 102. The terminal 101 and the server 102 can be directly or indirectly connected through wired or wireless communication, which is not limited in the present application.
In some embodiments, the terminal 101 is a smart phone, a tablet computer, a notebook computer, a smart speaker, a smart watch, a smart voice interaction device, or the like, but is not limited thereto. The terminal 101 runs an application program supporting image acquisition. The application may be a navigation application, a multimedia application, an intelligent voice assistant, or the like, which is not limited in the embodiments of the present application. Illustratively, the terminal 101 is a terminal used by a user. The terminal 101 can capture an image of the road conditions where the user is currently located and then transmit the road condition image to the server 102, which makes the user's next walking plan according to the road condition image.
Those skilled in the art will recognize that the number of terminals may be greater or smaller. For example, there may be only one terminal, or tens or hundreds of terminals, or more. The embodiments of the present application do not limit the number or device type of the terminals.
In some embodiments, the server 102 is a stand-alone physical server, a server cluster or distributed system composed of a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Networks), big data, and artificial intelligence platforms. The server 102 is used to provide background services for the application supporting image acquisition. In some embodiments, the server 102 takes on the primary computing work and the terminal 101 takes on the secondary computing work; alternatively, the server 102 takes on the secondary computing work and the terminal 101 takes on the primary computing work; alternatively, the server 102 and the terminal 101 perform collaborative computing using a distributed computing architecture.
Fig. 2 is a flowchart of a travel method for the blind according to an embodiment of the present application. Referring to fig. 2, in the embodiment of the present application, the method is described by taking its execution by a server as an example. The travel method for the blind comprises the following steps:
201. In the traveling process of the blind person, the server acquires road condition images of the blind person in real time, wherein the road condition images reflect road condition information of the blind person's current position.
In the embodiment of the present application, the blind person carries the terminal while traveling. The terminal can photograph the road conditions near the blind person in real time to obtain road condition images and then upload them to the server. The server determines the road condition information near the blind person based on the road condition images. The road condition information may include at least one of obstacles, congestion, traffic lights, steps, and other information near the blind person, which is not limited in the embodiments of the present application.
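By way of illustration, the detection in this step could be implemented with an off-the-shelf object detector. The following is a minimal sketch assuming the ultralytics YOLO API and a hypothetical weights file "traffic_factors.pt" fine-tuned on the factor classes named above; none of these names are prescribed by the present application.

```python
# Minimal sketch: detect travel influencing factors in one road condition image.
# "traffic_factors.pt" is a hypothetical fine-tuned model; class names are illustrative.
from ultralytics import YOLO

FACTOR_CLASSES = {"obstacle", "pedestrian", "traffic_light", "steps"}

model = YOLO("traffic_factors.pt")

def detect_travel_factors(road_image_path: str) -> list[dict]:
    """Return the travel influencing factors visible in the image."""
    result = model(road_image_path)[0]
    factors = []
    for box in result.boxes:
        name = result.names[int(box.cls)]
        if name in FACTOR_CLASSES and float(box.conf) > 0.5:
            factors.append({"type": name, "bbox": box.xyxy[0].tolist()})
    return factors
```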
202. In a case where the road condition image contains at least one travel influencing factor, the server determines a next walking plan for the at least one travel influencing factor; the next walking plan is used to prompt the blind person's next walking direction.
In the embodiment of the present application, travel influencing factors are factors that affect the blind person's walking to some extent, such as obstacles, other pedestrians, traffic lights, or steps, which are not limited in the embodiments of the present application. Obstacles and other pedestrians may block the blind person's way. Under traffic lights of different colors, the blind person's walking state differs. Compared with a flat road, steps add a certain difficulty to the blind person's travel. The server can identify the entities in the road condition image. In a case where the entities in the road condition image include at least one travel influencing factor, the server can formulate the blind person's next walking plan for the at least one travel influencing factor. The next walking plan may be a way of bypassing an obstacle or a pedestrian, a walking strategy for the current traffic light, a way of climbing the steps, or the like, which is not limited in the embodiments of the present application.
203. The server transmits the voice signal of the next walking plan to the terminal so that the terminal plays the voice signal of the next walking plan to the blind person.
In the embodiment of the present application, the server generates a voice signal according to the next walking plan and then transmits the voice signal to the terminal carried by the blind person, so that the terminal plays the voice signal of the next walking plan. The blind person can walk according to the voice signal of the next walking plan so as to get clear of the travel influencing factors.
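A minimal sketch of this step is shown below, assuming the gTTS library as the text-to-speech backend; the upload endpoint is a hypothetical placeholder for whatever transport connects the server to the terminal.

```python
# Minimal sketch: synthesize the next walking plan and push the audio to the terminal.
import requests
from gtts import gTTS

def send_walking_plan(plan_text: str, terminal_id: str) -> None:
    gTTS(text=plan_text, lang="en").save("next_plan.mp3")
    with open("next_plan.mp3", "rb") as f:
        # Hypothetical endpoint; the application does not prescribe the transport.
        requests.post(f"https://example.invalid/terminals/{terminal_id}/audio", data=f.read())

# e.g. send_walking_plan("Turn slightly left and walk two meters.", "terminal-101")
```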
The embodiments of the present application provide a travel method for the blind. By acquiring road condition images of the blind person in real time, the method can determine the road condition information of the blind person's current position during travel, that is, determine whether travel influencing factors that affect the blind person's travel exist nearby under the current conditions. In a case where such factors exist, the method determines the blind person's next walking plan and sends a voice signal of that plan to the terminal, so that the blind person can walk according to the walking direction in the voice signal, get clear of the travel influencing factors more quickly, and enjoy improved travel efficiency and travel experience.
Fig. 3 is a flowchart of another travel method for the blind according to an embodiment of the present application. Referring to fig. 3, in the embodiment of the present application, the method is described by taking its execution by a server as an example. The travel method for the blind comprises the following steps:
301. Before the blind person goes out, the server determines a target path for the blind person based on the blind person's voice signal.
In the embodiment of the present application, the blind person can interact with the server by voice through the terminal. The server can determine the blind person's intention from the blind person's voice signal. The intention may be to go to a place, ask the time, ask about the weather, or the like, which is not limited in the embodiments of the present application. Taking the intention of going to a place as an example, the server can plan a path for the blind person before the blind person goes out. The server acquires the blind person's voice signal from the terminal and, by recognizing it, identifies the destination the blind person wants to reach. The server may then determine a target path based on the blind person's current position and the destination. The server may obtain the current position from the blind person's voice signal, or determine it from the location of the terminal, which is not limited in the embodiments of the present application.
The embodiments of the present application do not limit the way the blind person's intention is understood. Optionally, the server may employ a large language model to understand the intention. Correspondingly, the server determines the blind person's target path based on the voice signal as follows: the server performs semantic recognition on the text of the blind person's voice signal using a large language model, obtaining the blind person's intention. Then, in a case where the intention is to go to a target position, the server determines a target path based on the blind person's current position and the target position. The target position is the destination, and the target path is used to guide the blind person from the current position to the target position. The large language model may be a ChatGPT (Chat Generative Pre-trained Transformer) model or a LLaMA (Large Language Model Meta Artificial Intelligence) model, which is not limited in the embodiments of the present application. The text of the blind person's voice signal may be obtained through an ASR (Automatic Speech Recognition) model, which is not limited in the embodiments of the present application. According to the solution provided by the embodiments of the present application, performing semantic recognition on the text of the blind person's voice signal through a large language model makes it possible to determine the blind person's intention more accurately and thus to formulate a more accurate target path. Moreover, compared with a traditional voice recognition solution, there is no need to store every possible utterance or keyword of the blind person in a database: the blind person can speak freely, the large language model can still accurately understand the intention, and storage resources are saved.
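As one possible implementation, the intent step could be realized with a hosted chat model. The sketch below assumes the OpenAI chat completions API; the model name, prompt, and JSON schema are illustrative assumptions, not part of the present application.

```python
# Minimal sketch: map the ASR transcript of the blind person's speech to an intent.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def recognize_intent(utterance_text: str) -> dict:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any capable chat model would do
        messages=[
            {"role": "system",
             "content": 'Classify the user intent as JSON: '
                        '{"intent": "go_to|ask_time|ask_weather|other", '
                        '"destination": "<place or null>"}'},
            {"role": "user", "content": utterance_text},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)

# e.g. recognize_intent("Take me to the activity center")
# -> {"intent": "go_to", "destination": "activity center"}
```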
In some embodiments, to further improve the accuracy of understanding the blind person's intention, the server may additionally perform voiceprint recognition on the acquired voice signal. Correspondingly, the process by which the server performs semantic recognition on the text of the blind person's voice signal using a large language model to obtain the intention includes: the server performs voiceprint recognition on the initially acquired voice signal to obtain its voiceprint features. Then, in a case where the voiceprint features are the same as those of the blind person, the server performs semantic recognition on the text of the voice signal using the large language model to obtain the blind person's intention. In a case where the voiceprint features differ from those of the blind person, the server does not proceed with semantic recognition. According to the solution provided by the embodiments of the present application, the server can store the blind person's voiceprint features in advance; when a voice signal is received, voiceprint recognition is performed first to determine whether it is the blind person's voice, and intention recognition is performed only if so, preventing other people's voice signals from interfering with the blind person's travel. Furthermore, ending recognition when the voice signal is not the blind person's saves running overhead.
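The voiceprint gate could look like the following sketch, which assumes the resemblyzer package for speaker embeddings; the enrollment file name and the 0.75 similarity threshold are illustrative assumptions.

```python
# Minimal sketch: only speech matching the enrolled voiceprint reaches the LLM.
import numpy as np
from resemblyzer import VoiceEncoder, preprocess_wav

encoder = VoiceEncoder()
enrolled = encoder.embed_utterance(preprocess_wav("blind_user_enroll.wav"))

def is_blind_user(wav_path: str, threshold: float = 0.75) -> bool:
    probe = encoder.embed_utterance(preprocess_wav(wav_path))
    # Embeddings are L2-normalized, so the dot product is the cosine similarity.
    return float(np.dot(enrolled, probe)) >= threshold
```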
In planning a path for the blind person from the current location to the target location, the server may determine at least one path between the current location and the target location. In the case that there is only one path between the current position and the target position, the server takes the path as the target path for the blind person. In the case where there are a plurality of paths between the current position and the target position, the server may select one path from the plurality of paths as the target path for the blind person.
In some embodiments, the server may select one of the plurality of paths as the blind person's target path according to the travel difficulty of the paths. Correspondingly, in a case where the blind person's intention is to go to the target position, the process by which the server determines the target path based on the current position and the target position includes: the server determines a plurality of paths between the current position and the target position, and then determines the target path based on the road condition information of these paths. The road condition information includes travel influencing factors and reflects the travel difficulty of the blind person walking on the corresponding path. The more travel influencing factors in a path, the greater its travel difficulty; that is, the travel difficulty of a path is positively correlated with the number of travel influencing factors. According to the solution provided by the embodiments of the present application, planning the path with the least travel difficulty for the blind person facilitates travel, that is, improves the blind person's travel efficiency and travel experience.
Optionally, besides the number of travel influencing factors, the server may determine the travel difficulty of a path according to the types of the travel influencing factors. The road condition information includes a plurality of influence factor types, and priorities exist among them. The priority of an influence factor type is positively correlated with the negative influence that travel influencing factors of that type have on the blind person's travel; that is, the higher the priority of an influence factor type, the harder it is for the blind person to get clear of travel influencing factors of that type. Correspondingly, the process by which the server determines the target path based on the road condition information of the plurality of paths includes: for any path, the server determines the influence factor types present in the path based on its road condition information; then performs a weighted summation of the factor scores of these influence factor types based on their priorities, obtaining a path score for the path; and finally determines the path with the lowest difficulty as the target path based on the path scores. The path score indicates the difficulty for the blind person of traversing the path. According to the solution provided by the embodiments of the present application, travel influencing factors of different types affect the blind person's travel differently; by assigning different priorities to different types, the travel influencing factors in a path can be treated differentially, so the travel difficulty of the path can be computed more accurately.
A factor score indicates the difficulty for the blind person of getting clear of the corresponding travel influencing factor. For example, if the travel influencing factor is steps, the more steps there are, the greater the difficulty reflected by the factor score, and the fewer steps, the smaller the difficulty. If the travel influencing factor is an obstacle, the larger the obstacle, the greater the difficulty reflected by its factor score, and the smaller the obstacle, the smaller the difficulty; the size of the obstacle may be its overall size or the area it occupies, which is not limited in the embodiments of the present application. If the travel influencing factor is pedestrians, the more pedestrians there are, the greater the difficulty reflected by the factor score, indicating a highly crowded path; the fewer pedestrians, the smaller the difficulty, indicating a less crowded path.
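The two paragraphs above amount to a weighted-sum scoring rule, sketched below; the priority weights, factor-score formulas, and dictionary keys are illustrative assumptions rather than values fixed by the present application.

```python
# Minimal sketch: weight each factor score by its type's priority, sum per path,
# and pick the path with the lowest total difficulty.
PRIORITY_WEIGHTS = {"traffic_light": 3.0, "steps": 2.0, "obstacle": 1.5, "pedestrian": 1.0}

def factor_score(factor: dict) -> float:
    """Difficulty of getting clear of one travel influencing factor."""
    if factor["type"] == "steps":
        return factor["count"] * factor.get("height_m", 0.15)
    if factor["type"] == "obstacle":
        return factor["size_m2"]  # occupied area as a proxy for size
    if factor["type"] == "pedestrian":
        return factor["count"] * 0.1  # crowding grows with pedestrian count
    return 1.0  # e.g. a traffic light contributes a fixed base difficulty

def path_score(factors: list[dict]) -> float:
    return sum(PRIORITY_WEIGHTS[f["type"]] * factor_score(f) for f in factors)

def pick_target_path(paths: dict[str, list[dict]]) -> str:
    """Return the name of the path with the lowest difficulty."""
    return min(paths, key=lambda name: path_score(paths[name]))
```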
In some embodiments, besides being determined as in the above embodiments, the target path may be specified by the blind person. The server can send voice signals describing the plurality of paths to the terminal, and the terminal plays them to the blind person so that the blind person learns the road condition information of each path; the blind person may then select one of them as the target path. Alternatively, the server may first screen the plurality of paths once based on their path scores using the method in the above embodiments, and the blind person then selects the target path from the screened results. Correspondingly, the process by which the server determines the path with the lowest difficulty as the target path based on the path scores includes: in a case where a target number of paths are determined, based on the path scores, to share the same difficulty, that difficulty being the lowest among the plurality of paths, the server sends voice signals describing the target number of paths to the terminal. The voice signals announce to the blind person the road condition information of each of these paths. The server then determines the target path based on the blind person's voice signal received in reply. According to the solution provided by the embodiments of the present application, when the server screens out a target number of equally least-difficult paths, the road condition information of each is played to the blind person, so that the blind person can select a suitable target path according to their own needs; this respects the blind person's intention and improves the travel experience.
For example, the server determines, using the large language model, that the planned routes do not differ much in difficulty and each has advantages and disadvantages: route A has few pedestrians but more steps, while route B has more pedestrians but fewer steps. Having weighed the advantages and disadvantages without being able to identify a clearly optimal route, the server can send voice signals for route A and route B through the terminal, describing each route's road conditions and trade-offs to the blind person. The blind person then decides which route to take based on the description, which preserves the blind person's freedom of choice.
In some embodiments, the blind person may also mark a plurality of locations, such as home, the company, friends' homes, and the activity center, using voice signals. The next time guidance is needed, the server can directly determine the corresponding target path from the blind person's utterance, such as "go home," "go to the company," or "go to friend xx's home."
302. In the traveling process of the blind person along the target path, the server acquires road condition images of the blind person in real time, wherein the road condition images reflect road condition information of the blind person's current position.
In the embodiment of the present application, after the target path is determined, the server starts navigating along it to guide the blind person. While the blind person walks on the target path, the terminal photographs the road conditions at the blind person's position on the path, obtaining a road condition image of the blind person's current location. The server then acquires the road condition image from the terminal, so as to subsequently determine the road condition information near the blind person based on it; that is, the server can determine from the road condition image whether travel influencing factors exist near the blind person. If they do, the server proceeds to step 303. If not, the server can send a voice signal to the terminal informing the blind person that no travel influencing factors are nearby and that the previous walking plan can be kept, and then continues to acquire the next road condition image from the terminal.
During the blind person's travel, the blind person and the server can interact by voice, and the blind person can actively ask for travel-related information, for example "how long is left," "how far have I gone," the weather, or the congestion situation. The terminal uploads the blind person's voice signal to the server. The server can then recognize the voice signal through the large language model to understand the blind person's intention, determine the corresponding reply, and send a voice signal of the reply to the terminal to convey it to the blind person. Because the blind person cannot perceive day, night, or the weather, and night and bad weather are unsafe for travel, providing the blind person with the real-time time and the current weather can improve the travel experience. Of course, the blind person can also interact with the server by voice before going out to obtain this information; the embodiments of the present application do not limit when the voice interaction takes place.
303. In a case where the road condition image contains at least one travel influencing factor, the server determines a next walking plan for the at least one travel influencing factor; the next walking plan is used to prompt the blind person's next walking direction.
In the embodiment of the present application, when the road condition image contains at least one travel influencing factor, the server can determine the next walking plan based on the at least one travel influencing factor and the target path. According to the solution provided by the embodiments of the present application, formulating the blind person's next walking plan for the at least one travel influencing factor on the basis of the target path prevents the plan from deviating from the target path, guarantees that the blind person reaches the target position, and improves the blind person's travel efficiency and travel experience.
In some embodiments, the server may determine the blind person's next walking plan based on the distance between the blind person and a travel influencing factor. Correspondingly, the process by which the server determines the next walking plan for the at least one travel influencing factor includes: in a case where the road condition image contains any travel influencing factor, the server acquires the distance between the blind person and that factor, and determines a next walking plan for the blind person only if the distance does not exceed a preset distance; the embodiments of the present application do not limit the size of the preset distance. According to the solution provided by the embodiments of the present application, a next walking plan is formulated only for travel influencing factors within the preset distance of the blind person, that is, only for nearby factors. Since the blind person will not necessarily walk toward distant travel influencing factors, this saves running overhead compared with planning for all factors in the road condition image; it also avoids wrong guidance caused by making useless plans for the blind person, improving travel efficiency and travel experience.
Further, the server determines the blind person's next walking plan in a case where the distance between the blind person and the travel influencing factor does not exceed the preset distance and the factor lies within a preset range of the target path, as sketched below. Because the blind person walks along the target path and generally will not walk toward travel influencing factors outside the preset range of the path, this further saves running overhead compared with planning for all factors in the road condition image; it also further avoids wrong guidance from useless plans, improving the blind person's travel efficiency and travel experience.
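A minimal sketch of this filtering rule, assuming straight-line distances and illustrative thresholds for the preset distance and the preset range:

```python
# Minimal sketch: plan only for factors that are both within the preset distance
# of the blind person and within a corridor around the target path.
import math

PRESET_DISTANCE_M = 10.0  # illustrative preset distance
PATH_CORRIDOR_M = 3.0     # illustrative "preset range" around the target path

def needs_plan(factor_pos, user_pos, path_points) -> bool:
    dist_to_user = math.dist(factor_pos, user_pos)
    dist_to_path = min(math.dist(factor_pos, p) for p in path_points)
    return dist_to_user <= PRESET_DISTANCE_M and dist_to_path <= PATH_CORRIDOR_M
```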
In some embodiments, the server may formulate different next walking plans for different types of travel influencing factors. Correspondingly, in a case where the distance between the blind person and the travel influencing factor does not exceed the preset distance, the process by which the server determines the blind person's next walking plan includes: if the travel influencing factor is an obstacle, the server determines the blind person's next walking direction and walking distance based on the size of the obstacle; if it is steps, the server determines the next walking direction and walking distance based on the step information, the step information including at least one of the direction, number, and height of the steps; and if it is a traffic light, the server determines whether the blind person stops walking next based on the color and timing of the traffic light. According to the solution provided by the embodiments of the present application, formulating different next walking plans for different types of travel influencing factors guides the blind person more accurately out of their influence, improving travel efficiency and travel experience.
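A minimal sketch of this per-type planning rule follows; the dictionary keys, clearance margins, and prompt phrasing are illustrative assumptions.

```python
# Minimal sketch: dispatch on the factor type to formulate the next walking plan.
def plan_next_step(factor: dict) -> str:
    if factor["type"] == "obstacle":
        width = factor["size_m"]  # obstacle width across the walking direction
        return f"Turn slightly left and walk {width + 0.5:.1f} meters to pass the obstacle."
    if factor["type"] == "steps":
        return (f"{factor['count']} steps {factor['direction']} ahead, each about "
                f"{factor['height_m']:.2f} meters high; step carefully.")
    if factor["type"] == "traffic_light":
        if factor["color"] == "red":
            return f"Red light; please stop, about {factor['seconds_left']} seconds remain."
        return "Green light; you may cross now."
    return "Continue straight ahead."
```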
In some embodiments, the server may determine the blind person's next walking plans in sequence according to the distances between the blind person and the travel influencing factors. Correspondingly, in a case where the road condition image contains at least one travel influencing factor, the process by which the server determines the next walking plan includes: the server determines the relative position between each travel influencing factor and the blind person, the relative position including a distance and a direction; it then formulates a next walking plan for the at least one travel influencing factor in order of the relative positions from near to far, the next walking plan including a walking direction, a distance, and a time. According to the solution provided by the embodiments of the present application, determining the blind person's next walking plans in order of distance lets the blind person get clear of each travel influencing factor in turn, ensuring travel accuracy and improving travel efficiency and travel experience.
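Near-to-far sequencing then reduces to a sort, as in the sketch below, which reuses plan_next_step() from the previous sketch; the "pos" key is an illustrative assumption.

```python
# Minimal sketch: announce plans for the nearest factor first.
import math

def ordered_plans(factors: list[dict], user_pos) -> list[str]:
    nearest_first = sorted(factors, key=lambda f: math.dist(f["pos"], user_pos))
    return [plan_next_step(f) for f in nearest_first]
```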
In some embodiments, the server may re-plan the route for the blind person when an emergency occurs on the target path. Correspondingly, in a case where an emergency occurs on a front road segment of the target path, the server re-plans a route based on the blind person's current position and the target position. The front road segment is a road segment of the target path that the blind person has not yet passed through, and the emergency increases the travel difficulty of the target path. The emergency may be a traffic accident, road congestion, or a road closure under traffic control, which is not limited in the embodiments of the present application. According to the solution provided by the embodiments of the present application, when an emergency occurs on the target path, its travel difficulty generally increases, and walking the original path could greatly and negatively affect the blind person's travel; re-planning the route in this case facilitates the blind person's travel and improves travel efficiency and travel experience.
For example, when a traffic accident occurs ahead on the blind person's walking route, the server can obtain the road condition information in real time through the road condition images and thus remind the blind person in time to change route or slow down.
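A minimal rerouting sketch under these assumptions; plan_path() stands in for the path planning described in step 301 and is hypothetical, as is the segment dictionary layout.

```python
# Minimal sketch: re-plan when an emergency appears on a not-yet-passed segment.
def on_road_update(target_path: list[dict], user_pos, target_pos) -> list[dict]:
    remaining = [seg for seg in target_path if not seg.get("passed")]
    if any(seg.get("emergency") for seg in remaining):
        print("Voice prompt: incident ahead, your route has been re-planned.")  # stand-in for the TTS push
        return plan_path(user_pos, target_pos)  # hypothetical re-planning call
    return target_path
```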
304. The server transmits the voice signal of the next walking plan to the terminal so that the terminal plays the voice signal of the next walking plan to the blind person.
In the embodiment of the present application, the server generates a voice signal based on the next walking plan and sends it to the terminal, so that the terminal plays, to the blind person, the walking mode for the at least one travel influencing factor in order of the relative positions from near to far. This ensures the accuracy of the blind person's travel, and thus improves travel efficiency and travel experience.
To describe the travel method for the blind provided by the embodiments of the present application more clearly, the method is further described below with reference to the accompanying drawings. Fig. 4 is a framework diagram of a travel method for the blind according to an embodiment of the present application. Referring to fig. 4, the blind person carries a terminal comprising a positioning module, a voice module, and an image module. The terminal obtains satellite positioning data of the current position through the positioning module and transmits it to the server, which identifies the satellite positioning data based on a large language model (e.g., ChatGPT) and determines the current position. The large language model is trained on a large corpus containing real-world dialogue, so that it can understand and learn human language and hold conversations. The terminal can acquire the blind person's voice signal through the voice module and transmit it to the server. The server recognizes the voice signal through a speech recognition module to obtain text; the speech recognition module may be an ASR model, which is not limited in the embodiments of the present application. The server then recognizes the text with the large language model and determines the blind person's intention, for example the target position the blind person wants to go to. Next, the server determines a target path between the current position and the target position through a path planning module, converts the target path into text through the large language model, converts that text into a voice signal, and sends the voice signal to the terminal, which plays it through its voice module. While the blind person travels along the target path, the terminal photographs the nearby road conditions through the image module to obtain road condition images and sends them to the server. The server converts a road condition image into text through an image recognition module, recognizes the text through the large language model, and determines the current road condition information. The server then formulates a next walking plan for the current road conditions through a walking planning module, converts the plan into text through the large language model, converts that text into a voice signal, and sends the voice signal to the terminal, which plays it through its voice module to guide the blind person. The travel method for the blind provided by the embodiments of the present application is thus a blind-guiding method combining speech recognition, large-model understanding and decision-making, computer vision, path planning, and related technologies.
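The Fig. 4 pipeline can be stitched together from the helpers sketched in the steps above. In the sketch below, transcribe() stands in for any ASR model (e.g., Whisper) and is hypothetical; every other name comes from the earlier sketches and is equally illustrative.

```python
# Minimal sketch: one guidance cycle wiring voiceprint gate, intent understanding,
# factor detection, filtering, near-to-far planning, and the voice push together.
def guide_once(wav_path, road_image_path, user_pos, path_points, terminal_id):
    if not is_blind_user(wav_path):                      # voiceprint gate
        return
    intent = recognize_intent(transcribe(wav_path))      # transcribe(): hypothetical ASR call
    if intent["intent"] != "go_to":                      # only navigation requests start guidance
        return
    factors = detect_travel_factors(road_image_path)     # computer vision on the road image
    # Mapping each detected bbox to a ground position "pos" is omitted here for brevity.
    nearby = [f for f in factors if needs_plan(f["pos"], user_pos, path_points)]
    for plan in ordered_plans(nearby, user_pos):         # near-to-far sequencing
        send_walking_plan(plan, terminal_id)             # TTS push to the terminal
```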
The embodiments of the present application provide a travel method for the blind. By acquiring road condition images of the blind person in real time, the method can determine the road condition information of the blind person's current position during travel, that is, determine whether travel influencing factors that affect the blind person's travel exist nearby under the current conditions. In a case where such factors exist, the method determines the blind person's next walking plan and sends a voice signal of that plan to the terminal, so that the blind person can walk according to the walking direction in the voice signal, get clear of the travel influencing factors more quickly, and enjoy improved travel efficiency and travel experience.
Fig. 5 is a block diagram of a travel device for the blind according to an embodiment of the present application. The device is configured to perform the steps of the above travel method for the blind. Referring to fig. 5, the device comprises: an acquisition module 501, a first determining module 502, and a sending module 503.
The acquisition module 501 is configured to acquire road condition images of the blind person in real time during the blind person's travel, wherein the road condition images reflect road condition information of the blind person's current position;
the first determining module 502 is configured to determine, in a case where the road condition image contains at least one travel influencing factor, a next walking plan for the at least one travel influencing factor, the next walking plan being used to prompt the blind person's next walking direction;
and the sending module 503 is configured to send a voice signal of the next walking plan to the terminal, so that the terminal plays the voice signal of the next walking plan to the blind person.
In some embodiments, referring to fig. 6, which is a block diagram of another travel device for the blind according to an embodiment of the present application, the first determining module 502 includes:
the acquisition unit 5021 is configured to acquire, in a case where the road condition image contains any travel influencing factor, the distance between the blind person and the travel influencing factor;
the first determining unit 5022 is configured to determine a next walking plan for the blind person in a case where the distance between the blind person and the travel influencing factor does not exceed a preset distance.
In some embodiments, with continued reference to fig. 6, the first determining unit 5022 is configured to: in a case where the distance between the blind person and the travel influencing factor does not exceed the preset distance, if the travel influencing factor is an obstacle, determine the blind person's next walking direction and walking distance based on the size of the obstacle; if the travel influencing factor is steps, determine the next walking direction and walking distance based on step information, the step information including at least one of the direction, number, and height of the steps; and if the travel influencing factor is a traffic light, determine whether the blind person stops walking next based on the color and timing of the traffic light.
In some embodiments, with continued reference to fig. 6, the first determining module 502 is configured to determine, in a case where the road condition image contains at least one travel influencing factor, a relative position between the at least one travel influencing factor and the blind person, the relative position including a distance and a direction; and formulate, in order of the relative positions from near to far, a next walking plan for the at least one travel influencing factor, the next walking plan including a walking direction, a distance, and a time;
the sending module 503 is configured to generate a voice signal based on the next walking plan, and send the voice signal of the next walking plan to the terminal, so that the terminal plays, to the blind person, the walking mode for the at least one travel influencing factor in order of the relative positions from near to far.
In some embodiments, with continued reference to fig. 6, the apparatus further comprises:
the recognition module 504 is configured to perform semantic recognition on the text of the blind person's voice signal using a large language model, to obtain the blind person's intention;
the second determining module 505 is configured to determine, in a case where the blind person's intention is to go to a target position, a target path based on the blind person's current position and the target position, the target path being used to guide the blind person from the current position to the target position;
the first determining module 502 is configured to determine, in a case where the road condition image contains at least one travel influencing factor, the next walking plan based on the at least one travel influencing factor and the target path.
In some embodiments, with continued reference to fig. 6, the second determination module 505 includes:
the second determining unit 5051 is configured to determine, in a case where the blind person's intention is to go to the target position, a plurality of paths between the blind person's current position and the target position based on the current position and the target position;
the third determining unit 5052 is configured to determine the target path based on road condition information of the plurality of paths, wherein the road condition information includes travel influencing factors and reflects the travel difficulty of the blind person walking on the corresponding path.
In some embodiments, with continued reference to fig. 6, the road condition information includes a plurality of influence factor types, and priorities exist among the plurality of influence factor types;
the third determining unit 5052 is configured to: for any path, determine the influence factor types present in the path based on the road condition information of the path; perform a weighted summation of the factor scores of the influence factor types based on their priorities, to obtain a path score of the path, wherein a factor score indicates the difficulty for the blind person of getting clear of the corresponding travel influencing factor, and the path score indicates the difficulty for the blind person of traversing the path; and determine the path with the lowest difficulty as the target path based on the path scores of the plurality of paths.
In some embodiments, with continued reference to fig. 6, the third determining unit 5052 is configured to: in a case where a target number of paths are determined, based on the path scores of the plurality of paths, to share the same and lowest difficulty among the plurality of paths, send voice signals describing the target number of paths to the terminal, the voice signals being used to announce to the blind person the road condition information of each of the target number of paths; and determine the target path based on the blind person's voice signal received in reply.
In some embodiments, with continued reference to fig. 6, the recognition module 504 is configured to perform voiceprint recognition on the initially acquired voice signal to obtain voiceprint features of the voice signal; and, in a case where the voiceprint features are the same as those of the blind person, perform semantic recognition on the text of the voice signal using the large language model, to obtain the blind person's intention.
In some embodiments, with continued reference to fig. 6, the first determining module 502 is further configured to re-plan a route for the blind person based on the blind person's current position and the target position in a case where an emergency occurs on a front road segment of the target path, wherein the front road segment is a road segment of the target path that the blind person has not yet passed through, and the emergency increases the travel difficulty of the target path.
The embodiments of the present application provide a travel device for the blind. By acquiring road condition images of the blind person in real time, the device can determine the road condition information of the blind person's current position during travel, that is, determine whether travel influencing factors that affect the blind person's travel exist nearby under the current conditions. In a case where such factors exist, the device determines the blind person's next walking plan and sends a voice signal of that plan to the terminal, so that the blind person can walk according to the walking direction in the voice signal, get clear of the travel influencing factors more quickly, and enjoy improved travel efficiency and travel experience.
It should be noted that the travel device for the blind provided in the above embodiments is illustrated only by the division of the above functional modules when guiding the blind person to walk. In practical applications, the above functions may be allocated to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the travel device for the blind provided in the above embodiments belongs to the same concept as the embodiments of the travel method for the blind; its specific implementation process is detailed in the method embodiments and is not repeated here.
In the embodiments of the present application, the computer device can be configured as a terminal or a server. When the computer device is configured as a terminal, the technical solution provided by the embodiments of the present application may be implemented with the terminal as the execution body; when it is configured as a server, the technical solution may be implemented with the server as the execution body; the technical solution may also be implemented through interaction between the terminal and the server, which is not limited in the embodiments of the present application.
Fig. 7 is a block diagram of a terminal 700 according to an embodiment of the present application. The terminal 700 may be a portable mobile terminal such as a smart phone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer. The terminal 700 may also be referred to by other names such as user device, portable terminal, laptop terminal, or desktop terminal.
In general, the terminal 700 includes: a processor 701 and a memory 702.
The processor 701 may include one or more processing cores, such as a 4-core processor or an 8-core processor. The processor 701 may be implemented in at least one hardware form of DSP (Digital Signal Processing), FPGA (Field-Programmable Gate Array), or PLA (Programmable Logic Array). The processor 701 may also include a main processor and a coprocessor: the main processor, also referred to as a CPU (Central Processing Unit), is a processor for processing data in an awake state; the coprocessor is a low-power processor for processing data in a standby state. In some embodiments, the processor 701 may be integrated with a GPU (Graphics Processing Unit) responsible for rendering and drawing the content that the display screen needs to display. In some embodiments, the processor 701 may also include an AI (Artificial Intelligence) processor for processing computing operations related to machine learning.
The memory 702 may include one or more computer-readable storage media, which may be non-transitory. The memory 702 may also include high-speed random access memory and non-volatile memory, such as one or more magnetic disk storage devices or flash memory storage devices. In some embodiments, a non-transitory computer-readable storage medium in the memory 702 is used to store at least one computer program, which is executed by the processor 701 to implement the travel method for the blind provided by the method embodiments of the present application.
In some embodiments, the terminal 700 may optionally further include a peripheral interface 703 and at least one peripheral device. The processor 701, the memory 702, and the peripheral interface 703 may be connected by buses or signal lines. Each peripheral device may be connected to the peripheral interface 703 via a bus, a signal line, or a circuit board. Specifically, the peripheral device includes at least one of a radio frequency circuit 704, a display screen 705, a camera assembly 706, an audio circuit 707, a positioning component 708, and a power supply 709.
The peripheral interface 703 may be used to connect at least one I/O (Input/Output)-related peripheral device to the processor 701 and the memory 702. In some embodiments, the processor 701, the memory 702, and the peripheral interface 703 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 701, the memory 702, and the peripheral interface 703 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
The radio frequency circuit 704 is configured to receive and transmit RF (Radio Frequency) signals, also referred to as electromagnetic signals. The radio frequency circuit 704 communicates with a communication network and other communication devices via electromagnetic signals, converting an electrical signal into an electromagnetic signal for transmission, or converting a received electromagnetic signal into an electrical signal. In some embodiments, the radio frequency circuit 704 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and so forth. The radio frequency circuit 704 may communicate with other terminals via at least one wireless communication protocol. The wireless communication protocol includes, but is not limited to: the World Wide Web, metropolitan area networks, intranets, mobile communication networks of each generation (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks. In some embodiments, the radio frequency circuit 704 may also include NFC (Near Field Communication)-related circuitry, which is not limited by the application.
The display screen 705 is used to display a UI (User Interface). The UI may include graphics, text, icons, video, and any combination thereof. When the display screen 705 is a touch display screen, it also has the ability to collect touch signals on or above its surface. The touch signal may be input to the processor 701 as a control signal for processing. In this case, the display screen 705 may also be used to provide virtual buttons and/or a virtual keyboard, also referred to as soft buttons and/or a soft keyboard. In some embodiments, there may be one display screen 705, disposed on the front panel of the terminal 700; in other embodiments, there may be at least two display screens 705, respectively disposed on different surfaces of the terminal 700 or in a folded design; in still other embodiments, the display screen 705 may be a flexible display screen disposed on a curved surface or a folded surface of the terminal 700. The display screen 705 may even be arranged in a non-rectangular irregular pattern, that is, an irregularly shaped screen. The display screen 705 may be made of materials such as an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
The camera assembly 706 is used to capture images or video. In some embodiments, the camera assembly 706 includes a front camera and a rear camera. Typically, the front camera is disposed on the front panel of the terminal and the rear camera is disposed on the rear surface of the terminal. In some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera can be fused to realize a background blurring function, and the main camera and the wide-angle camera can be fused to realize panoramic shooting and virtual reality (VR) shooting functions or other fused shooting functions. In some embodiments, the camera assembly 706 may also include a flash. The flash may be a single-color-temperature flash or a dual-color-temperature flash. A dual-color-temperature flash is a combination of a warm-light flash and a cold-light flash, and can be used for light compensation under different color temperatures.
The audio circuit 707 may include a microphone and a speaker. The microphone is used to collect sound waves of the user and the environment, convert the sound waves into electrical signals, and input them to the processor 701 for processing or to the radio frequency circuit 704 for voice communication. For stereo acquisition or noise reduction, a plurality of microphones may be disposed at different portions of the terminal 700. The microphone may also be an array microphone or an omnidirectional pickup microphone. The speaker is used to convert electrical signals from the processor 701 or the radio frequency circuit 704 into sound waves. The speaker may be a conventional thin-film speaker or a piezoelectric ceramic speaker. A piezoelectric ceramic speaker can convert the electrical signal not only into sound waves audible to humans but also into sound waves inaudible to humans for purposes such as ranging. In some embodiments, the audio circuit 707 may also include a headphone jack.
The positioning component 708 is used to locate the current geographic location of the terminal 700 for navigation or LBS (Location Based Service). The positioning component 708 may be a positioning component based on the GPS (Global Positioning System) of the United States, the BeiDou system of China, or the Galileo system of the European Union.
The power supply 709 is used to supply power to the various components in the terminal 700. The power supply 709 may use alternating current, direct current, disposable batteries, or rechargeable batteries. When the power supply 709 includes a rechargeable battery, the rechargeable battery may be a wired rechargeable battery, charged through a wired line, or a wireless rechargeable battery, charged through a wireless coil. The rechargeable battery may also be used to support fast-charge technology.
In some embodiments, the terminal 700 further includes one or more sensors 710. The one or more sensors 710 include, but are not limited to: acceleration sensor 711, gyro sensor 712, pressure sensor 713, optical sensor 714, and proximity sensor 715.
The acceleration sensor 711 can detect the magnitudes of acceleration on the three coordinate axes of a coordinate system established with the terminal 700. For example, the acceleration sensor 711 may be used to detect the components of gravitational acceleration on the three coordinate axes. The processor 701 may control the display screen 705 to display the user interface in a landscape view or a portrait view based on the gravitational acceleration signal collected by the acceleration sensor 711. The acceleration sensor 711 may also be used to collect motion data of a game or a user.
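For illustration, the landscape/portrait decision can be reduced to comparing the gravity components along the device's two screen axes. This is a minimal sketch under that assumption, not the application's implementation:

```python
def choose_orientation(ax: float, ay: float) -> str:
    """ax, ay: gravity components (m/s^2) along the device's short and long
    screen axes; the larger component indicates which edge points down."""
    return "portrait" if abs(ay) >= abs(ax) else "landscape"

assert choose_orientation(0.1, 9.7) == "portrait"   # held upright
assert choose_orientation(9.7, 0.3) == "landscape"  # lying on its side
```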
The gyro sensor 712 may detect the body direction and rotation angle of the terminal 700, and may cooperate with the acceleration sensor 711 to collect the user's 3D actions on the terminal 700. Based on the data collected by the gyro sensor 712, the processor 701 may implement functions such as motion sensing (for example, changing the UI according to a tilting operation by the user), image stabilization during shooting, game control, and inertial navigation.
The pressure sensor 713 may be disposed on a side frame of the terminal 700 and/or at a lower layer of the display screen 705. When the pressure sensor 713 is disposed on a side frame of the terminal 700, it can detect the user's grip signal on the terminal 700, and the processor 701 performs left-right hand recognition or quick operation according to the grip signal collected by the pressure sensor 713. When the pressure sensor 713 is disposed at the lower layer of the display screen 705, the processor 701 controls operability controls on the UI according to the user's pressure operation on the display screen 705. The operability controls include at least one of a button control, a scroll bar control, an icon control, and a menu control.
The optical sensor 714 is used to collect the ambient light intensity. In one embodiment, the processor 701 may control the display brightness of the display screen 705 based on the ambient light intensity collected by the optical sensor 714. Specifically, when the intensity of the ambient light is high, the display brightness of the display screen 705 is turned up; when the ambient light intensity is low, the display brightness of the display screen 705 is turned down. In another embodiment, the processor 701 may also dynamically adjust the shooting parameters of the camera assembly 706 based on the ambient light intensity collected by the optical sensor 714.
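A minimal sketch of such a brightness mapping follows; the lux saturation point and the brightness range are illustrative assumptions, not values from the application:

```python
def display_brightness(lux: float,
                       min_level: float = 0.1,
                       max_level: float = 1.0,
                       full_bright_lux: float = 1000.0) -> float:
    """Linearly map ambient light in lux to a brightness level in
    [min_level, max_level], saturating at full_bright_lux."""
    ratio = max(0.0, min(lux / full_bright_lux, 1.0))
    return min_level + ratio * (max_level - min_level)

assert display_brightness(0.0) == 0.1      # dark room: dimmest level
assert display_brightness(2000.0) == 1.0   # bright sunlight: full brightness
```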
The proximity sensor 715, also referred to as a distance sensor, is typically disposed on the front panel of the terminal 700. The proximity sensor 715 is used to collect the distance between the user and the front surface of the terminal 700. In one embodiment, when the proximity sensor 715 detects that the distance between the user and the front surface of the terminal 700 gradually decreases, the processor 701 controls the display screen 705 to switch from the bright-screen state to the off-screen state; when the proximity sensor 715 detects that the distance gradually increases, the processor 701 controls the display screen 705 to switch from the off-screen state to the bright-screen state.
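This on/off switching is easy to get wrong without hysteresis; the sketch below uses two illustrative centimetre thresholds (assumptions, not values from the application) so that the screen does not flicker when the distance hovers near the boundary:

```python
def update_screen_state(distance_cm: float, screen_on: bool,
                        near_cm: float = 3.0, far_cm: float = 8.0) -> bool:
    """Turn the screen off when the user comes near and back on when they
    move away; two thresholds add hysteresis so the state does not flicker."""
    if screen_on and distance_cm <= near_cm:
        return False   # user close to the face: switch off
    if not screen_on and distance_cm >= far_cm:
        return True    # user moved away: switch back on
    return screen_on   # within the hysteresis band: keep current state
```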
Those skilled in the art will appreciate that the structure shown in fig. 7 is not limiting of the terminal 700 and may include more or fewer components than shown, or may combine certain components, or may employ a different arrangement of components.
Fig. 8 is a schematic structural diagram of a server according to an embodiment of the present application. The server 800 may vary considerably in configuration or performance, and may include one or more processors (Central Processing Units, CPU) 801 and one or more memories 802, where at least one computer program is stored in the memories 802 and is loaded and executed by the processor 801 to implement the travel method for the blind provided by the above method embodiments. Of course, the server may also have components such as a wired or wireless network interface, a keyboard, and an input/output interface for implementing other functions of the device, which are not described herein.
The embodiment of the application also provides a computer-readable storage medium, in which at least one computer program is stored; the at least one computer program is loaded and executed by a processor of a computer device to implement the operations performed by the computer device in the travel method for the blind. For example, the computer-readable storage medium may be a Read-Only Memory (ROM), a Random Access Memory (RAM), a Compact Disc Read-Only Memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
Embodiments of the present application also provide a computer program product comprising a computer program stored in a computer readable storage medium. The processor of the computer device reads the computer program from the computer-readable storage medium, and the processor executes the computer program so that the computer device performs the travel method for the blind provided in the above-described various alternative implementations.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program for instructing relevant hardware, where the program may be stored in a computer readable storage medium, and the storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The foregoing describes only preferred embodiments of the present application and is not intended to limit the application; the scope of protection of the application is defined by the appended claims.