Incomplete Visits Table #1076
-
Hi, would it be possible to explain how the Incomplete Visits table works? Thanks
Replies: 1 comment
-
Hey,
I just realized that we never documented this anywhere.
The short version is that you should discard all data from incomplete visits.
A visit counts as failed in one of two cases: an error occurred while executing the command sequence, or the command sequence was interrupted by a shutdown.
Some data may have been saved for such a visit, but because it is incomplete we never included it in our analysis.
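For anyone who wants to apply the "discard incomplete visits" rule to a crawl database, here is a minimal sketch. It assumes the data is in SQLite, that the incomplete visits are listed by `visit_id` in a table named `incomplete_visits`, and that the other tables (here `site_visits` and `http_requests`) carry the same `visit_id` column. Those table and column names are my assumptions for illustration, so check them against your own schema first.

```python
import sqlite3


def complete_visit_ids(conn):
    """Return the visit_ids that are NOT listed as incomplete."""
    rows = conn.execute(
        "SELECT visit_id FROM site_visits "
        "WHERE visit_id NOT IN (SELECT visit_id FROM incomplete_visits) "
        "ORDER BY visit_id"
    ).fetchall()
    return [row[0] for row in rows]


def requests_from_complete_visits(conn):
    """Return (visit_id, url) rows, skipping data saved by incomplete visits."""
    return conn.execute(
        "SELECT visit_id, url FROM http_requests "
        "WHERE visit_id NOT IN (SELECT visit_id FROM incomplete_visits) "
        "ORDER BY visit_id"
    ).fetchall()


def build_demo_db():
    """Build a tiny in-memory database with the assumed (hypothetical) schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE site_visits (visit_id INTEGER PRIMARY KEY, site_url TEXT);
        CREATE TABLE incomplete_visits (visit_id INTEGER NOT NULL);
        CREATE TABLE http_requests (visit_id INTEGER, url TEXT);
        INSERT INTO site_visits VALUES
            (1, 'https://a.example'),
            (2, 'https://b.example'),
            (3, 'https://c.example');
        -- Visit 2 failed, but still left a partial row in http_requests.
        INSERT INTO incomplete_visits VALUES (2);
        INSERT INTO http_requests VALUES
            (1, 'https://a.example/x'),
            (2, 'https://b.example/y'),
            (3, 'https://c.example/z');
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    print(complete_visit_ids(conn))
    print(requests_from_complete_visits(conn))
```

The same `NOT IN (SELECT visit_id FROM incomplete_visits)` filter would have to be applied to every table you analyze, since partial rows from a failed visit can show up in any of them.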
If you are running larger crawls with lots of failing websites, I would recommend taking a look at crawler.py, which is what we previously used. Retrying each website up to three times after a failure significantly reduced our failure percentage.
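To illustrate the retry idea, here is a rough sketch of a retry loop. This is not the code from crawler.py: `visit` is a hypothetical callable that returns `True` when a visit completes and `False` (or raises) when it fails, and I am treating "up to three times" as a cap of three attempts in total, which is an assumption.

```python
def crawl_with_retries(sites, visit, max_attempts=3):
    """Visit each site, retrying failures up to max_attempts total attempts.

    `visit` is a hypothetical callable: visit(site) -> bool.
    Returns (failed_sites, attempts_used_per_site).
    """
    failed = []
    attempts_used = {}
    for site in sites:
        for attempt in range(1, max_attempts + 1):
            attempts_used[site] = attempt
            try:
                ok = visit(site)
            except Exception:
                # Treat any error during the visit as a failed attempt.
                ok = False
            if ok:
                break
        else:
            # The loop ran out of attempts without a successful visit.
            failed.append(site)
    return failed, attempts_used


if __name__ == "__main__":
    calls = {}

    def flaky_visit(site):
        # Demo stand-in: one site succeeds on its second attempt,
        # one never succeeds, the rest succeed immediately.
        calls[site] = calls.get(site, 0) + 1
        if site == "down.example":
            return False
        if site == "flaky.example":
            return calls[site] >= 2
        return True

    sites = ["ok.example", "flaky.example", "down.example"]
    print(crawl_with_retries(sites, flaky_visit))
```

Note that each failed attempt is still an incomplete visit in its own right, so its data should be discarded in the same way even if a later retry of the same site succeeds.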