Scrape scientific name information for plants from the Agroforestry Species Switchboard 2.0.

Last update: Dec 23, 2021

Overview

Agroforestry Species Switchboard 2.0 Scraper

Scrapes scientific name information for plants from the Agroforestry Species Switchboard 2.0.

Requirements

Python >= 3.10 (pyenv can make Python version management easier)
pipenv

How to run

Install dependencies

cp env.sample .env
pipenv --python 3
pipenv install

Run
```
pipenv run python main.py
```
The result is written to a file named result.*.csv.

Test Shell

pipenv run scrapy shell 'http://apps.worldagroforestry.org/products/switchboard/index.php/species_search/Acacia%20abyssinica'
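
The search URL in the command above is the base path followed by the percent-encoded scientific name. A minimal sketch of building it for other species, assuming every search URL follows the same pattern as this one example (the function name `build_search_url` is hypothetical):

```python
from urllib.parse import quote

# Base URL copied from the scrapy shell command above.
BASE_URL = (
    "http://apps.worldagroforestry.org/products/switchboard"
    "/index.php/species_search/"
)


def build_search_url(species_name: str) -> str:
    """Return the Switchboard search URL for a scientific name."""
    # Collapse stray whitespace, then percent-encode (space becomes %20).
    cleaned = " ".join(species_name.split())
    return BASE_URL + quote(cleaned)


print(build_search_url("Acacia abyssinica"))
```

The printed URL can be passed to `pipenv run scrapy shell` in single quotes, as in the command above.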

Cleanup All Outputs

rm result.* && rm log.*
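
Note that `rm result.* && rm log.*` stops before removing the logs if no result file exists. A portable alternative could look like this sketch; the function `cleanup_outputs` is hypothetical and not part of this repo.

```python
from pathlib import Path


def cleanup_outputs(directory: str = ".") -> list[str]:
    """Delete result.* and log.* files in `directory`; return removed names."""
    removed = []
    for pattern in ("result.*", "log.*"):
        for path in Path(directory).glob(pattern):
            # Only plain files are removed; directories are left alone.
            if path.is_file():
                path.unlink()
                removed.append(path.name)
    return sorted(removed)


if __name__ == "__main__":
    for name in cleanup_outputs():
        print(f"removed {name}")
```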

Special Cases

| Case | Link | Note |
| --- | --- | --- |
| ICRAF Databases Not Found | Engelhardia spicata | |
| Genus Found | Forficula | What to do next? |
| Multiple Species Found | Alstonia spectabilis | Get the matched species right? |
| Species Variant Found | Engelhardtia spicata | Need human to check |
| Similar Species Found | Costus speciosus | Need human to check |
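
One way to make these cases explicit in code is an enum plus a set of the cases flagged for manual review. This is only a sketch: the names `SearchOutcome` and `needs_human_check` are hypothetical, and how the scraper actually detects each case is not shown here.

```python
from enum import Enum


class SearchOutcome(Enum):
    """Special cases listed above (identifier names are hypothetical)."""

    DATABASES_NOT_FOUND = "ICRAF Databases Not Found"
    GENUS_FOUND = "Genus Found"
    MULTIPLE_SPECIES_FOUND = "Multiple Species Found"
    SPECIES_VARIANT_FOUND = "Species Variant Found"
    SIMILAR_SPECIES_FOUND = "Similar Species Found"


# Cases whose note above says "Need human to check".
NEEDS_HUMAN_CHECK = frozenset({
    SearchOutcome.SPECIES_VARIANT_FOUND,
    SearchOutcome.SIMILAR_SPECIES_FOUND,
})


def needs_human_check(outcome: SearchOutcome) -> bool:
    """Return True if the case is marked for manual review."""
    return outcome in NEEDS_HUMAN_CHECK
```

The remaining cases carry open questions in their notes, so the sketch leaves them unclassified.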

Contributing

Fork this repo
Develop
Create pull request
Tag @rizqirizqi for review
Merge

License

GPL-3.0

Owner

Mgs. M. Rizqi Fadhlurrahman
