Which file executes the SQL queries in Yioop, and does Yioop index Wikipedia pages in all languages? (+0/0). - 13/06/2018 Yioop Software Help

Hello. I have some very important questions about Yioop and how it stores data:
1 - Does Yioop support downloading and indexing Wikipedia pages in every available language, for example "https://en.wikipedia.org/wiki/Mark_Zuckerberg", "https://es.wikipedia.org/wiki/Mark_Zuckerberg", and so on?
2 - Which files execute the SQL queries in Yioop? What exactly is the path to these files?
3 - How can I access the database of web pages downloaded and indexed by Yioop? Where exactly is the Yioop search engine database?
Thank you for any clarification; this is very important to me.
 

-- Which file executes the SQL queries in Yioop, and does Yioop index Wikipedia pages in all languages? (+0/0). - 04/07/2018 Yioop Software Help

Hi,
Most of the questions you are asking are answered in the Documentation for Yioop (https://www.seekquarry.com/p/Documentation) if you search that page.
  1. You can index whatever pages you want by listing them as seed sites for the crawl you want to carry out.
  2. Yioop does not use a SQL database to store crawls. It can use SQLite, MySQL, or PostgreSQL to store group and wiki information and other non-crawl data. Crawls are stored in work_directory/cache as folders whose names begin with IndexData followed by a timestamp. Each of these folders has three sub-folders: dictionaries, which contains binary files that let you look up, based on the hash of a query term, which index shards have postings for that term; posting_doc_shards, which contains index shard files, each a binary file of postings that record, for each occurrence of a query term, which document contained it; and summaries, which contains web archive files with compressed summaries of each web page downloaded.
In Version 5 of Yioop, the work_directory/cache folder also contains Archive folders, which hold the compressed full pages downloaded during a crawl.
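As a rough illustration of the layout described above, the following Python sketch walks a cache folder, reports each IndexData folder together with any of the three expected sub-folders that are missing, and lists the Archive folders. It is not part of Yioop: the function name describe_cache and the relative path work_directory/cache are assumptions, so substitute the work directory of your own installation. It only inspects folder names and does not try to read the binary files.

```python
from pathlib import Path

# Sub-folder names as described in the reply above.
EXPECTED_SUBFOLDERS = ("dictionaries", "posting_doc_shards", "summaries")


def describe_cache(cache_dir):
    """Summarize the crawl folders found directly under cache_dir.

    Returns a dict with two keys:
      "indexes":  {folder name: [expected sub-folders that are missing]}
      "archives": sorted names of folders starting with "Archive"
    (describe_cache is a hypothetical helper, not a Yioop function.)
    """
    cache = Path(cache_dir)
    indexes = {}
    archives = []
    for entry in sorted(cache.iterdir()):
        if not entry.is_dir():
            continue
        if entry.name.startswith("IndexData"):
            missing = [name for name in EXPECTED_SUBFOLDERS
                       if not (entry / name).is_dir()]
            indexes[entry.name] = missing
        elif entry.name.startswith("Archive"):
            archives.append(entry.name)
    return {"indexes": indexes, "archives": archives}


if __name__ == "__main__":
    # Assumed location; replace with your own work directory.
    cache_path = Path("work_directory") / "cache"
    if cache_path.is_dir():
        print(describe_cache(cache_path))
    else:
        print(f"{cache_path} not found; adjust the path")
```

An empty list next to an IndexData folder means all three sub-folders are present.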
To understand how the indexing and crawl processes work, you should read Yioop Ranking (https://www.seekquarry.com/p/Ranking).
If you want to find out where the SQL database used by Yioop for groups and wikis is and what it contains, go to Server Settings and look at how it is configured under Database Set-Up. Typically, a SQLite database is used, and it is stored in work_directory/data/public_data.db. Its contents can be viewed using any SQLite viewer.
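If you would rather not install a separate viewer, a short Python script can list what is in that file. This is only a sketch using Python's standard sqlite3 module: the helper name list_tables is hypothetical, the path is the typical location mentioned above and may differ on your installation, and the file is opened read-only so that nothing Yioop depends on gets modified.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return the names of all tables in the SQLite file at db_path.

    list_tables is a hypothetical helper, not part of Yioop.
    """
    # Build a file: URI and add mode=ro so the database is opened read-only.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]


if __name__ == "__main__":
    # Typical location per the reply above; adjust for your installation.
    db_file = Path("work_directory") / "data" / "public_data.db"
    if db_file.exists():
        for table in list_tables(db_file):
            print(table)
    else:
        print(f"{db_file} not found; check Database Set-Up in Server Settings")
```

This only applies when the configured database is SQLite; for MySQL or PostgreSQL you would use the corresponding client instead.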

Last Edited: 04/07/2018
 
 