2014-11-08

Crawl error: response code Issuer certificate is invalid.

Originally Posted By: buzz
When I try to crawl my own https site, which is perfectly accessible from the web without any errors, "fetcher.php" shows me this:

"response code Issuer certificate is invalid."

Complete text:

[Sat, 08 Nov 2014 03:39:32 -0800] Start process pages... Current Memory:6485672
[Sat, 08 Nov 2014 03:39:32 -0800] https://www.flirtzoo.com response code Issuer certificate is invalid.
[Sat, 08 Nov 2014 03:39:32 -0800] No page processor for mime type:


Why am I having this error?

-- Crawl error: response code Issuer certificate is invalid
My guess is that your website has a self-signed certificate or something similar, and this is confusing the page requests made by the Yioop software within the website. (Yioop communicates with itself via HTTP.) You could probably hack the code in lib/fetch_url.php getPages() to add an option so that it ignores self-signed SSL/TLS certificates. This would be something like setting the curl option CURLOPT_SSL_VERIFYPEER to false, which turns off certificate verification altogether. Another option, of course, is to get a real certificate.
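To make the trade-off concrete, here is a minimal sketch of what "turning off peer verification" means. It is written in Python with the standard ssl module, not in Yioop's PHP, and it is not Yioop's code: the helper name make_tls_context is made up for illustration. It only builds the two kinds of client configuration (strict and lax) and does not open any connection.

```python
import ssl


def make_tls_context(verify: bool = True) -> ssl.SSLContext:
    """Build a client TLS context.

    verify=False is the rough analogue of setting curl's
    CURLOPT_SSL_VERIFYPEER to false: the certificate chain is not checked.
    (make_tls_context is a hypothetical helper, not part of Yioop.)
    """
    ctx = ssl.create_default_context()
    if not verify:
        # Hostname checking has to be switched off first; otherwise Python
        # refuses to set verify_mode to CERT_NONE.
        ctx.check_hostname = False
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


strict = make_tls_context()
lax = make_tls_context(verify=False)
print("strict verifies peer:", strict.verify_mode == ssl.CERT_REQUIRED)
print("lax verifies peer:", lax.verify_mode != ssl.CERT_NONE)
```

With verification off, the fetcher would accept any certificate, including a forged one, so this is probably best treated as a debugging aid rather than a permanent setting.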

-- Crawl error: response code Issuer certificate is invalid
Actually, I looked at your certificate, and at least Firefox said it was okay. I remember a while back there was some issue with a Dutch or Belgian certificate provider having their certificates revoked. Maybe that's why curl is having a problem?

-- Crawl error: response code Issuer certificate is invalid
Originally Posted By: buzz
Thank you for your response. The certificate is from "AlphaSSL", which as far as I know is a big US(?) certificate provider. I don't think the certificate's validity is the problem; all browsers validate it.

Is curl in any way using the hostname of the server? I have Yioop installed under a different domain name (/home/folder) on the same physical server as the site I am trying to crawl, and the hostname is set to the site I am trying to crawl (which uses SSL).

-- Crawl error: response code Issuer certificate is invalid
I see. I saw the nv-sa in the certificate and thought it was Netherlands-based, but I guess that was you,
not your certificate provider. I can try a small crawl from my laptop in the next day or so to see
what gives and adjust the SSL/TLS settings in the most recent version of Yioop so it works. I'll let you
know here what I find. Just for the record, what version of PHP are you running? That will influence
the version of curl and could affect its processing of SSL/TLS.

Best,
Chris

-- Crawl error: response code Issuer certificate is invalid
Originally Posted By: buzz
Yes, that is actually true: it is provided by a Dutch certificate reseller, but the certificate comes from AlphaSSL.

I am using:
OS: CentOS 7
PHP: 5.4.16
Apache: 2.4.6

Regards,
2014-11-14

-- Crawl error: response code Issuer certificate is invalid
Sorry for being slow in getting back to you. I did a quick test crawl. It downloads the robots.txt without a problem.
However, that file contains:
User-agent: *
Disallow: /

which forbids Yioop from crawling that site. So the crawler stops there.
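
For anyone who wants to check this kind of thing themselves, here is a small sketch using Python's standard urllib.robotparser. It is not how Yioop does it internally (Yioop is PHP); the helper is_allowed and the user-agent string "ExampleBot" are placeholders made up for the example. It feeds the two robots.txt lines quoted above to a parser and asks whether a crawler may fetch the site's front page.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content quoted in the post above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    (is_allowed is a hypothetical helper written for this example.)
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# "ExampleBot" is a placeholder; a real crawler sends its own name.
print(is_allowed(ROBOTS_TXT, "ExampleBot", "https://www.flirtzoo.com/"))
```

Because the rule applies to every user agent and disallows the root path, any well-behaved crawler should get a "not allowed" answer for every URL on the site.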