Timing Results

Time in seconds to perform various image-fetching tasks (lower is better):

| Task | Multi-thread multi-search (concurrent_image_search) | Multi-thread single-search (concurrent_images_download) | Single-thread single-search (download_images) | google-images-download by hardikvasa |
| --- | --- | --- | --- | --- |
| Download 200 cat pictures | 23.6 | 22.4 | 92.7 | 148.4 |
| Download 200 cat & dog pictures | 28.7 | 47.7 | 254.2 | 330.4 |
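To make the gaps easier to compare, the timings above can be expressed as speedups relative to the single-threaded download_images baseline. The snippet below is only an illustrative calculation over the published numbers; the dictionary layout and the `speedup` helper are assumptions made for this example and are not part of the library.

```python
# Timings in seconds, copied from the results table above.
TIMINGS = {
    "Download 200 cat pictures": {
        "concurrent_image_search": 23.6,
        "concurrent_images_download": 22.4,
        "download_images": 92.7,
        "google-images-download": 148.4,
    },
    "Download 200 cat & dog pictures": {
        "concurrent_image_search": 28.7,
        "concurrent_images_download": 47.7,
        "download_images": 254.2,
        "google-images-download": 330.4,
    },
}


def speedup(task, method, baseline="download_images"):
    """Return how many times faster `method` is than `baseline` for `task`."""
    row = TIMINGS[task]
    return row[baseline] / row[method]


if __name__ == "__main__":
    for task in TIMINGS:
        for method in ("concurrent_image_search", "concurrent_images_download"):
            print(f"{task}: {method} is {speedup(task, method):.1f}x faster")
```

On these figures, the concurrent functions come out roughly 4x faster than download_images for one search term, and concurrent_image_search is roughly 9x faster for two.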

All tests were run with the following config:

  • total_images=200
  • headers={'User-Agent': 'Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36'}
  • progress_bar=False
  • verbose=True

Both concurrent_image_search and concurrent_images_download were run with:

  • max_image_fetching_threads=20
  • image_download_timeout=3

concurrent_image_search was also run with max_similtanous_threads=2.

google-images-download was run with the following config: arguments = {"keywords":"cat", "limit":200, "chromedriver": "chromedriver.exe", "format": "jpg", "print_urls":False}

Explanation

Understandably, the concurrent functions beat the single-threaded ones in every case because they can download multiple images simultaneously. concurrent_image_search goes one step further when given multiple search terms by running the searches simultaneously, whereas the other two functions must run them one after the other. What's interesting is that, for a single search term, concurrent_image_search is slightly slower than concurrent_images_download (23.6 s vs 22.4 s), even though the former actually calls the latter when executing. This delay is likely because concurrent_image_search must first hand the call off to a thread handler, whereas concurrent_images_download starts immediately.
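The effect described above can be reproduced without any network access. The following is a minimal sketch, not this library's implementation: it replaces each image download with a short `time.sleep`, then compares a sequential loop, a thread pool over images, and an outer thread pool over search terms. All function names here (`fake_download`, `download_sequential`, `download_concurrent`, `search_concurrent`, `timed`) are hypothetical and exist only for this example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_download(url, delay=0.05):
    """Stand-in for a network fetch: wait as if on I/O, then return the URL."""
    time.sleep(delay)
    return url


def download_sequential(urls):
    """Fetch one image at a time, like a single-threaded downloader."""
    return [fake_download(u) for u in urls]


def download_concurrent(urls, max_threads=20):
    """Fetch images in a thread pool; pool.map keeps the input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        return list(pool.map(fake_download, urls))


def search_concurrent(terms, images_per_term, max_searches=2, max_threads=20):
    """Run several searches at once, each with its own concurrent download."""
    def one_search(term):
        urls = [f"{term}/{i}" for i in range(images_per_term)]
        return term, download_concurrent(urls, max_threads)

    with ThreadPoolExecutor(max_workers=max_searches) as pool:
        return dict(pool.map(one_search, terms))


def timed(fn, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    urls = [f"cat/{i}" for i in range(20)]
    _, t_seq = timed(download_sequential, urls)
    _, t_con = timed(download_concurrent, urls)
    _, t_multi = timed(search_concurrent, ["cat", "dog"], 20)
    print(f"sequential: {t_seq:.2f}s")
    print(f"concurrent: {t_con:.2f}s")
    print(f"two concurrent searches: {t_multi:.2f}s")
```

Because the simulated work is pure waiting, the threaded versions finish in roughly the time of one download per batch. The extra pool layer in `search_concurrent` adds a small fixed cost, which is consistent with the explanation offered above for the single-term slowdown, although this sketch does not prove that is the cause in the real library.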