-- Switching from mnogosearch
Depending on how you stopped your crawl, it can happen, as you said, that the resume option isn't listed. This indicates that no URLs were saved to disk in the queue to restart the crawl from. If you really need to restart that crawl, open a shell, switch into the src/executables folder, and run:
php ArcTool.php inject timestamp file
Here timestamp is the timestamp of the crawl whose queue the URLs should be added to, and file is a file with one URL per line.
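If it helps, here is a minimal sketch of one way to build that file of URLs. It is not part of Yioop; the function name write_seed_file and the file name seed_urls.txt are made up for illustration, and it assumes you only want http/https URLs with duplicates removed.

```python
from urllib.parse import urlparse


def write_seed_file(urls, path):
    """Write unique http(s) URLs to path, one per line; return how many were written."""
    seen = set()
    kept = []
    for url in urls:
        url = url.strip()
        parts = urlparse(url)
        # Skip blanks, non-web schemes, and anything already kept.
        if parts.scheme in ("http", "https") and parts.netloc and url not in seen:
            seen.add(url)
            kept.append(url)
    with open(path, "w", encoding="utf-8") as handle:
        for url in kept:
            handle.write(url + "\n")
    return len(kept)
```

Calling write_seed_file(my_urls, "seed_urls.txt") would give a file you could pass as the file argument of the inject command above.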
For your second question, I think what you want is what I do for video sites. Under Web Scrapers, make a new web scraper. As its signature, give an XPath that evaluates to non-empty on pages you know have an image that could serve as a thumbnail. Then under extract fields you need to have a line:
THUMB_URL=xpath_for_thumb_url_on_page
You can look at the pre-built web scraper for videos that comes with Yioop for an example. It does this using Open Graph meta information.
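To get a feel for what an Open Graph based rule selects, the sketch below finds the og:image meta tag in a small sample page. It is only an illustration under stated assumptions: the sample page and the function find_thumb_url are invented, and the exact XPath used by the bundled Yioop scraper is not reproduced here. Python's ElementTree supports only a subset of XPath, so the attribute is read with get() instead of a full expression such as //meta[@property='og:image']/@content, and the page has to be well-formed markup for this parser to accept it.

```python
import xml.etree.ElementTree as ET

# Invented, well-formed sample page carrying an Open Graph image tag.
SAMPLE_PAGE = """<html>
  <head>
    <title>Sample video page</title>
    <meta property="og:image" content="https://www.example.com/thumbs/clip42.jpg" />
  </head>
  <body><p>Video goes here.</p></body>
</html>"""


def find_thumb_url(page):
    """Return the og:image content attribute, or None if the page has no such tag."""
    root = ET.fromstring(page)
    # Plays the role of the signature: non-empty only on pages with an og:image tag.
    meta = root.find(".//meta[@property='og:image']")
    if meta is None:
        return None
    # Plays the role of the THUMB_URL rule: the URL stored in the content attribute.
    return meta.get("content")
```

On a page without an og:image tag the function returns None, which corresponds to the signature not matching.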
(Edited: 2021-03-06)