Disclosure of Invention
The invention provides a method for diagnosing website access faults, which comprises the following steps:
acquiring website performance index data, and executing routine diagnosis according to the index data;
judging whether the website access is successful according to the conventional diagnosis information obtained by the conventional diagnosis;
and if the judgment result is that the website access fails, further acquiring deep diagnosis information data, and executing deep diagnosis according to the deep diagnosis information data.
Preferably, the routine diagnosis further comprises: acquiring an HTTP request status code, and judging whether the request state is normal according to the HTTP request status code.
Preferably, the routine diagnosis further comprises: if the request state is normal, further acquiring performance monitoring data of the accessed website, and diagnosing the website performance according to the performance monitoring data.
Preferably, the performance monitoring data includes the TCP connection time, the DNS resolution time, and link information of the resource files of the web page.
Preferably, the deep diagnosis further comprises: executing network connectivity detection for the terminal by using a public domain name server, and judging whether the terminal has successfully accessed the Internet.
Preferably, the deep diagnosis further comprises: if the terminal has successfully accessed the Internet, obtaining a DNS resolution result and, according to that result, either tracking the reason for the resolution failure or checking the validity and correctness of the resolved IP address.
Preferably, the deep diagnosis further comprises: if the DNS resolution fails, further detecting whether the domain name server is reachable or whether the domain name server has a service failure.
Preferably, the deep diagnosis further comprises: judging whether the TCP connection is successful, and if so, further executing HTTP connection detection; otherwise, further searching for the reason for the TCP connection failure.
Preferably, the deep diagnosis further comprises: if the HTTP connection is successful, further executing diagnosis for the front-end web page of the website.
According to another aspect of the present invention, there is also provided a website access failure diagnosis system, including: a storage device and a processor; wherein the storage device is for storing a computer program which, when executed by the processor, implements the method described above.
Compared with the prior art, the invention has the following beneficial technical effects. The website access fault diagnosis method and system provided by the invention extend the open-source automated testing framework Selenium WebDriver based on the W3C Resource Timing and Navigation Timing standards supported by modern browser cores, so that performance data of the browser accessing a Web site is collected and used as the basis for subsequent diagnosis. Meanwhile, fault diagnosis is divided into a routine level and a deep level. At the initial stage of accessing the website, relatively simple routine diagnosis is carried out using the collected access performance monitoring data; for a website that can be accessed successfully, this diagnosis helps to build an access quality evaluation system for the website. For a website whose access fails, deep diagnosis is then carried out to further investigate and locate the reason for the access failure. Both routine diagnosis and deep diagnosis integrate various network test tools to achieve comprehensive diagnosis of website access faults. With this method and system, the reason a user fails to access a website can be diagnosed and the fault point determined, so that the fault can be reported or resolved in time and access recovered as soon as possible.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more clearly understood, the method and system for diagnosing website access failures provided in the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Through careful study of the factors influencing website access quality, the inventors found that the reasons for website access failure can be analyzed from three aspects: the network, the server side, and the front end.
The network factor refers to access failure caused by the network itself, specifically, such as a link physical failure, a DNS server failure, or network congestion. The root cause of network congestion is the lack of network resources, i.e. the user load is greater than the capacity and processing capacity of the network resources, but the presented failure manifestation can be manifold, e.g. arriving packets are frequently dropped due to insufficient router cache; or a bandwidth bottleneck is formed on a low-speed link in the data transmission process due to insufficient link bandwidth; or the low speed of the processor causes the processing speed of the router CPU not to keep pace with the high-speed link when the router CPU executes operations such as queuing cache, updating a routing table and the like.
The server-side factors refer to various parameters influencing the performance of the server, such as hardware configuration, I/O performance, bandwidth and other reference indexes of the server. For example, bandwidth quality is an important factor affecting response time of a website, and a server side generally rejects a service due to an overload caused by the fact that the above parameters cannot meet the requirements, so that website access fails. In addition, the server may also fail to access the website due to physical problems, such as power failure.
Front-end factors refer to resources of the front-end page that affect the website access performance. Generally, the performance of the website is greatly affected by the quality of front-end page optimization, and the user experience of the front end can even determine whether users continue to use the website. For example, if the JS, CSS, or picture resources of a page are too large or generate too many HTTP requests, access performance degrades; and if the requested resources are not compressed, the data transmission load may become too high, so that the page is displayed slowly or even times out, eventually resulting in a website access failure.
In order to quickly diagnose the access performance of a website and determine the cause of a fault, the inventors propose a method based on the W3C Resource Timing and Navigation Timing standards supported by modern browser cores, which combines routine diagnosis and deep diagnosis to quickly and accurately diagnose the access quality of the website and locate the cause of the fault. A detailed description is given below with reference to specific embodiments.
The website access fault diagnosis method provided by the preferred embodiment of the invention specifically comprises the following steps:
S10 Routine diagnosis
Fig. 1 is a schematic diagram of the diagnostic tree of routine diagnosis provided by the present invention. As shown in Fig. 1, when a website is accessed through a browser, routine diagnosis is first performed on the website according to the acquired performance index data of the website. The diagnostic flow may be executed step by step along the diagnostic tree shown in Fig. 1, and specifically includes the following steps:
S101 HTTP request status diagnosis
The HTTP request status code is obtained. If the status code is greater than or equal to 500, the conclusion is: server error. If the status code is greater than or equal to 400 and less than 500, the conclusion is: client request error. If the status code is equal to 200, the conclusion is: the request state is normal, and the detection of S102 may continue.
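The status-code rule of S101 can be sketched as follows. This is a minimal illustration, not the embodiment's actual code; the function name and the returned strings are hypothetical, and the "undetermined" fallback for codes the text does not mention is an assumption.

```python
def classify_http_status(status_code: int) -> str:
    """Map an HTTP status code to the S101 conclusion described above."""
    if status_code >= 500:
        return "server error"
    if 400 <= status_code < 500:
        return "client request error"
    if status_code == 200:
        return "request state normal"
    # Assumption: the text does not say how other codes (1xx, 3xx, 204, ...)
    # are handled, so they are reported as undetermined here.
    return "undetermined"
```

Only the "request state normal" result would let the flow continue to S102.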
S102 website performance diagnostics
When the request state is normal, a browser is opened by using Selenium WebDriver to access the target website, and performance monitoring data of the accessed website is acquired by using the Resource Timing API and the Navigation Timing API provided by W3C. The data is recorded as timingInfo and resourcesInfo and is used to further diagnose the website performance, specifically as follows:
(1) The TCP connection time is calculated by using the connectStart and connectEnd fields in timingInfo and compared with a preset threshold. If the connection time exceeds the preset threshold, the conclusion is: network failure, TCP connection time is too long.
Specifically, the connection time of the TCP can be calculated by using the following formula:
timingInfo.connectEnd-timingInfo.connectStart
wherein timingInfo.connectEnd and timingInfo.connectStart respectively represent the end time point and the start time point of the TCP connection.
(2) The DNS resolution time is calculated by using the domainLookupStart and domainLookupEnd fields in timingInfo and compared with a preset threshold. If the DNS resolution time exceeds the threshold, the conclusion is: network failure, DNS resolution time is too long.
Specifically, the DNS resolution time can be calculated by using the following formula:
timingInfo.domainLookupEnd-timingInfo.domainLookupStart
wherein timingInfo.domainLookupEnd and timingInfo.domainLookupStart respectively represent the end time point and the start time point of DNS resolution.
(3) According to the link information of all resource files of the website page in resourcesInfo, each resource link is requested and downloaded by using the curl command, so as to obtain the download time and size of the resource file. In this way it is known whether each resource link is valid and downloadable; together with whether the resource size exceeds a preset threshold, this determines whether the resource file is loaded successfully, and the success rate of resource downloading is given. When the success rate of resource downloading is lower than the threshold, the conclusion is that resource loading on the website page may have timed out or failed.
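The two timing checks in items (1) and (2) can be sketched as below. This is a minimal sketch under stated assumptions: timing_info is taken to be a plain dictionary holding the Navigation Timing fields in milliseconds, and the threshold values and function name are hypothetical, since the text only speaks of "a preset threshold".

```python
from typing import Dict, List

# Assumed thresholds in milliseconds; the text does not give concrete values.
TCP_THRESHOLD_MS = 1000.0
DNS_THRESHOLD_MS = 500.0


def diagnose_timing(timing_info: Dict[str, float],
                    tcp_threshold_ms: float = TCP_THRESHOLD_MS,
                    dns_threshold_ms: float = DNS_THRESHOLD_MS) -> List[str]:
    """Return the network-failure conclusions triggered by the timing data."""
    findings = []
    # TCP connection time: connectEnd - connectStart
    tcp_ms = timing_info["connectEnd"] - timing_info["connectStart"]
    # DNS resolution time: domainLookupEnd - domainLookupStart
    dns_ms = timing_info["domainLookupEnd"] - timing_info["domainLookupStart"]
    if tcp_ms > tcp_threshold_ms:
        findings.append("network failure: TCP connection time is too long")
    if dns_ms > dns_threshold_ms:
        findings.append("network failure: DNS resolution time is too long")
    return findings
```

An empty result means neither threshold was exceeded.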
S103 routine diagnostic completion
If no fault occurs in the diagnosis of S101-S102, the conclusion that the website is normally accessed can be obtained, and the conventional diagnosis is finished; if any failure occurs in the above-described diagnosis in S101 to S102, the deep diagnosis in step S20 will be continued.
S20 deep diagnosis
Fig. 2 is a schematic diagram of a diagnostic tree for deep diagnosis provided by the present invention, as shown in fig. 2, when the website access failure is found through the conventional diagnosis in step S10, the conventional diagnosis information can be checked, and the deep diagnosis is triggered according to the requirement, specifically, the deep diagnosis for the website access failure can be performed according to the further obtained deep diagnosis information data, and the execution logic of the diagnostic process can be executed step by step from the diagnostic tree shown in fig. 2, and specifically includes the following steps:
S201 Internet access diagnosis
The Ping command is used to detect network connectivity between the terminal and some publicly known domain name servers or publicly known website servers, such as '114.114.114.114' and '8.8.8.8'. If the tested domain name server or known website server responds to the test packet sent by the Ping operation and returns the response time of the test packet and other related information, the network between the terminal and the tested host is connected, the terminal has accessed the Internet, and the subsequent deep diagnosis can continue. Otherwise, it is determined that the terminal has not accessed the network, and the cause is further investigated, such as a network card failure of the terminal, no available IP address being assigned, or an unstable local area network.
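The Internet access check of S201 might look like the sketch below. It assumes a Linux-style ping that accepts the `-c` (count) and `-W` (timeout in seconds) options; other platforms use different flags. Treating a reply from any one probed host as sufficient is also an assumption, as are the function names.

```python
import subprocess
from typing import Callable, Iterable

# Public DNS servers named in the text.
PUBLIC_HOSTS = ("114.114.114.114", "8.8.8.8")


def ping_once(host: str, timeout_s: int = 2) -> bool:
    """Send one echo request with the system ping (Linux-style flags assumed)."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 1,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def has_internet_access(hosts: Iterable[str] = PUBLIC_HOSTS,
                        probe: Callable[[str], bool] = ping_once) -> bool:
    """Assumption: a reply from any probed host counts as Internet access."""
    return any(probe(host) for host in hosts)
```

The probe is passed in as a parameter so that the decision logic can be exercised without sending real packets.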
If the diagnosis result of S201 determines that the terminal has accessed the network, further performing deep diagnosis of the network.
S202 DNS resolution diagnosis
A DNS resolution result is acquired by using the nslookup command. If domain name resolution returns no result, the reason for the resolution failure is further tracked; otherwise, the validity and correctness of the resolved IP address are subsequently checked. This specifically includes the following:
(1) domain name server reachable diagnostics
If domain name resolution returns no result, it can be further judged whether the domain name server in use is reachable. Specifically, the Ping command is used to test connectivity to that domain name server. If the test shows that the server is reachable, it is further necessary to detect whether the domain name server can normally provide resolution service; otherwise, the fault reason is judged to be that the DNS server is unreachable, and the diagnosis ends.
(2) Domain name server fault diagnosis
If the server is reachable, it can be further detected whether the domain name server has a service failure. Specifically, a list of common domain names is provided, and the domain name server is used to resolve them. If the common domain names can be resolved normally, the domain name server is judged to be providing resolution service normally; the reason for the DNS resolution failure may be that the website domain name has not yet been added or updated on the domain name server, and the diagnosis ends. If the common domain names in the list cannot be resolved, the domain name server is judged to have a temporary service fault, and the diagnosis ends.
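The two-stage DNS failure diagnosis above can be summarized as a small decision function. This is a sketch only: the reachability test and the per-server resolver are passed in as hypothetical callables (in practice they might wrap ping and nslookup), and reading "the common domain names can be resolved" as "at least one resolves" is an interpretation, not something the text states.

```python
from typing import Callable, Iterable


def diagnose_dns_failure(dns_server: str,
                         server_reachable: Callable[[str], bool],
                         resolve_with: Callable[[str, str], bool],
                         common_domains: Iterable[str]) -> str:
    """Decide why resolution of the target domain returned no result."""
    # (1) Reachability of the domain name server in use.
    if not server_reachable(dns_server):
        return "DNS server unreachable"
    # (2) Service check: can the server resolve well-known domains?
    # Assumption: one successful resolution is enough to call the service healthy.
    if any(resolve_with(dns_server, domain) for domain in common_domains):
        return "domain name not yet added or updated on the DNS server"
    return "DNS server service fault"
```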
S203 TCP connection diagnosis
Socket programming is used to judge whether the TCP connection with the website server is successful. Specifically, if the TCP connection is successful, the HTTP connection is detected next; if the TCP connection fails, the reason for the failure is further searched.
If the TCP connection fails, the Ping command is adopted to test the connectivity of the website server, if the connectivity shows that the website server has no fault and can normally provide services, the reason for the TCP connection failure can be the connection rejection of the server caused by the overweight load of the website server, and the diagnosis is finished. If not, it means that the network link from the terminal to the website server (including the website server itself) may have a fault, and it is necessary to continue to use Traceroute tool to obtain the detailed information of the routing link from the terminal to the website server, so as to further locate the fault point, and end the diagnosis.
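A socket-based TCP check and the follow-up branching of S203 could be sketched as follows. The function names, the timeout value, and the returned strings are hypothetical; the ping test is injected as a callable so that the branching can be shown without depending on a particular ping implementation.

```python
import socket
from typing import Callable


def tcp_connect_ok(host: str, port: int, timeout_s: float = 3.0) -> bool:
    """Try to open a TCP connection; True if the handshake completes."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


def diagnose_tcp(host: str, port: int,
                 ping_ok: Callable[[str], bool],
                 connect: Callable[[str, int], bool] = tcp_connect_ok) -> str:
    """Follow the S203 branches: connected, refused by a reachable server, or link fault."""
    if connect(host, port):
        return "TCP connected: continue with HTTP connection diagnosis"
    if ping_ok(host):
        return "server reachable but connection failed: server may be overloaded"
    return "link fault suspected: use traceroute to locate the fault point"
```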
If the TCP connection is successful, the connection state of the HTTP can be diagnosed.
S204 HTTP connection diagnosis
First, the URL after any redirection of the website is acquired, and whether the website uses HTTPS is judged according to whether the URL scheme is HTTPS or HTTP. Selenium WebDriver together with the Resource Timing API and the Navigation Timing API provided by W3C is used to obtain performance monitoring data of the accessed website, which is recorded as timingInfo and resourcesInfo for subsequent diagnosis, specifically as follows:
If the website uses HTTPS, the secureConnectionStart field of timingInfo is used to determine whether the SSL handshake has failed. Specifically, if timingInfo contains the field and its value is 0, it may be determined that the SSL connection failed, and the diagnosis ends; the reason for the website access failure may be, for example, a certificate failure. Otherwise, the SSL handshake is considered successful, and the HTTP connection state continues to be detected.
Secondly, an HTTP connection request is sent to the target website and corresponding return data is obtained, so that whether response exists in the HTTP connection or not is detected.
If the HTTP request does not respond, the information such as the packet loss rate, the average time delay and the like can be obtained by sending a Ping command to the website server to judge whether network congestion occurs. Specifically, if the packet loss rate is greater than 0, a conclusion that the packet loss condition exists is obtained, and the reasons for packet loss mainly include physical line faults, equipment faults, virus attacks, routing information errors and the like, and the diagnosis is finished; if the average time delay exceeds a preset threshold value, the conclusion that the time delay is large and network congestion possibly occurs is obtained, and the diagnosis is finished; if none of the above conditions exist, the reason for no response of the HTTP request may be other network reasons, such as poor service quality of the web server, and the diagnosis is ended.
If the HTTP request has a response, the status code is checked. If the status code is 200, more detailed detection of the website page elements is needed; if the status code is not equal to 200, the corresponding abnormal reason, for example 404 Not Found or 403 Forbidden, is given according to the error code, and the diagnosis ends.
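For the no-response branch of S204, the packet loss rate and average delay have to be extracted from the ping output before the rules can be applied. The sketch below assumes the summary format printed by Linux iputils ping ("... 25% packet loss ..." and "rtt min/avg/max/mdev = a/b/c/d ms"); other ping implementations print different text. The delay threshold and the function names are hypothetical.

```python
import re
from typing import Optional, Tuple


def parse_ping_summary(output: str) -> Tuple[Optional[float], Optional[float]]:
    """Extract (packet loss %, average RTT in ms) from iputils-style ping output."""
    loss = re.search(r"([\d.]+)% packet loss", output)
    rtt = re.search(r"= [\d.]+/([\d.]+)/", output)  # second value is the average
    loss_pct = float(loss.group(1)) if loss else None
    avg_ms = float(rtt.group(1)) if rtt else None
    return loss_pct, avg_ms


def diagnose_no_response(output: str, delay_threshold_ms: float = 200.0) -> str:
    """Apply the S204 rules for an HTTP request that got no response."""
    loss_pct, avg_ms = parse_ping_summary(output)
    if loss_pct is not None and loss_pct > 0:
        return "packet loss"
    if avg_ms is not None and avg_ms > delay_threshold_ms:
        return "high delay, possible network congestion"
    return "other network reasons"
```

Packet loss is checked before delay, following the order in which the text lists the two conditions.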
S205 website front end webpage diagnosis
If the HTTP request has a response and the returned status code is 200, that is, the HTTP request is successful, the resource files of the front-end page of the website can be detected. For example, according to the link information of all resource files of the website page in resourcesInfo, each resource link is requested and downloaded by using the curl command, so as to obtain the download time and size of the resource file. In this way it is known whether each resource link is valid and downloadable; together with whether the resource size exceeds a preset threshold, this determines whether the resource file is loaded successfully, and the resource download success rate is given. When the resource download success rate is lower than the threshold, the conclusion is that resource loading on the website page timed out or failed, and the diagnosis ends; otherwise, taking all the preceding steps together, the conclusion is that there is no serious fault and the website can be accessed normally, and the diagnosis ends.
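The success-rate judgment described above can be sketched as follows, starting from download results that are assumed to have been collected already (for example by running curl on each link). The record type, the size limit, the rate threshold, and the choice to treat an empty resource list as fully successful are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ResourceResult:
    url: str
    downloaded: bool   # link was valid and the file could be fetched
    size_bytes: int    # size reported for the downloaded file


def download_success_rate(results: List[ResourceResult],
                          max_size_bytes: int = 2_000_000) -> float:
    """Fraction of resources that downloaded and stayed within the size limit."""
    if not results:
        return 1.0  # assumption: a page without resources counts as fully loaded
    ok = sum(1 for r in results if r.downloaded and r.size_bytes <= max_size_bytes)
    return ok / len(results)


def diagnose_front_end(results: List[ResourceResult],
                       rate_threshold: float = 0.9) -> str:
    """Conclude S205 from the resource download success rate."""
    if download_success_rate(results) < rate_threshold:
        return "resource loading timed out or failed"
    return "no serious fault, website accessed normally"
```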
According to another aspect of the present invention, there is also provided a system for performing the above network access failure diagnosis method, the diagnosis system including a data acquisition module, a routine diagnosis module and a deep diagnosis module. The acquisition module can be used for acquiring performance index data of the WEB site and diagnosis information data for fault diagnosis, transmitting the performance index data of the WEB site to the conventional diagnosis module for executing conventional diagnosis of website access quality, and transmitting the diagnosis information data to the deep diagnosis module for executing deep diagnosis of the website access quality.
Specifically, routine diagnostics may be used to perform diagnostics on HTTP request status and website performance; deep diagnostics can be used to perform comprehensive diagnostics and point of failure diagnostics on internet access, the network itself, the server side, and the web front end.
In an embodiment of the present invention, the system may be implemented in the form of a terminal plug-in implemented by Java, and specifically, the plug-in may perform interactive data transmission with a plug-in manager by using JSON format based on HTTP protocol.
Although in the above embodiments ping, nslookup, and traceroute are used as examples to describe the website access fault diagnosis method provided by the present invention, it should be understood by those skilled in the art that, besides the network test tools in the above embodiments, other conventional network test tools with similar functions may also be applied to the method.
Although the present invention has been described by way of preferred embodiments, the present invention is not limited to the embodiments described herein, and various changes and modifications may be made without departing from the scope of the present invention.