
How-to: download linked images from a website

by Denzel D. · Dec. 04, 11 · Interview

A question got me thinking: how would I implement a program that downloads the pictures that a web page links to?

Here is a sample console application I came up with:

using System;
using System.Collections.Generic;
using System.Drawing;
using System.IO;
using System.Net;
using System.Text.RegularExpressions;
using System.Threading;

namespace ConsoleApplication
{
    class Program
    {
        static int totalFiles = 0;
        static int currentFiles = 0;

        static void Main(string[] args)
        {
            GetImages("http://www.textureking.com/index.php/category/all-textures");
        }

        static void GetImages(string url)
        {
            // Download the page markup as a string.
            string responseString;
            HttpWebRequest initialRequest = (HttpWebRequest)WebRequest.Create(url);
            using (HttpWebResponse initialResponse = (HttpWebResponse)initialRequest.GetResponse())
            using (StreamReader reader = new StreamReader(initialResponse.GetResponseStream()))
            {
                responseString = reader.ReadToEnd();
            }

            // Match f="...<ext> (the tail of href="). The extensions are grouped so
            // the alternation applies only to the extension, not the whole pattern.
            List<string> imageset = new List<string>();
            Regex regex = new Regex(@"f=""[^""]*\.(?:jpg|bmp|tif|gif|png)", RegexOptions.IgnoreCase);
            foreach (Match m in regex.Matches(responseString))
            {
                // Skip URLs that were already collected.
                if (!imageset.Contains(m.Value))
                    imageset.Add(m.Value);
            }

            // Strip the leading f=" (three characters) from every match.
            for (int i = 0; i < imageset.Count; i++)
                imageset[i] = imageset[i].Remove(0, 3);

            totalFiles = imageset.Count;
            currentFiles = totalFiles;
            Console.WriteLine(totalFiles.ToString() + " images will be downloaded.");

            foreach (string f in imageset)
            {
                ThreadPool.QueueUserWorkItem(new WaitCallback(DownloadImage), f);
            }

            // Keep the process alive while the background threads work.
            Console.Read();
        }

        static void DownloadImage(object path)
        {
            // Several pool threads run this method at once, so decrement atomically.
            int remaining = Interlocked.Decrement(ref currentFiles);
            Console.WriteLine("Downloading " + Path.GetFileName(path.ToString()) + "... (" + (totalFiles - remaining).ToString() + "/" + totalFiles + ")");

            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(path.ToString());
            using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
            using (Stream stream = response.GetResponseStream())
            using (Image image = Image.FromStream(stream))
            {
                image.Save(@"D:\Temporary\" + Path.GetFileName(path.ToString()));
            }

            Console.WriteLine(Path.GetFileName(path.ToString()) + " downloaded.");
        }
    }
}

The sample URL passed to GetImages points to a page that links to several textures, and the program downloads each of them.

I am using a regular expression to find the URLs. The match ignores case because I am not sure whether the file extensions are written in upper or lower case. Each match still starts with f=" (the tail of href="), so the first three characters are removed afterwards. Since the same URL can appear more than once on a page, I check whether the List already contains the match before adding it, so there are no duplicates.
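The extraction and de-duplication step can be tried in isolation, without any network access. The following is a minimal sketch, assuming the page markup is already available as a string. LinkExtractor and ExtractImageLinks are hypothetical names that do not appear in the sample above. This variant matches the full href attribute, groups the extensions so the alternation applies only to them, and captures the URL directly instead of trimming characters afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original sample.
static class LinkExtractor
{
    // Matches href="...<ext>" and captures the URL in group 1.
    static readonly Regex ImageLink = new Regex(
        @"href=""([^""]*\.(?:jpg|bmp|tif|gif|png))""",
        RegexOptions.IgnoreCase);

    public static List<string> ExtractImageLinks(string html)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        foreach (Match m in ImageLink.Matches(html))
        {
            string link = m.Groups[1].Value;
            // HashSet.Add returns false when the value is already present,
            // so each URL is kept once, in order of first appearance.
            if (seen.Add(link))
                result.Add(link);
        }
        return result;
    }
}
```

A HashSet is used here only to make the duplicate check cheaper than List.Contains on long pages; the comparison stays case-sensitive, as in the original sample.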

The save path can also be changed; I left it hardcoded for testing purposes. To make the path dynamic, pass a generic collection or an array as the state parameter of the DownloadImage method, then cast it back inside the method and read the needed values (by index, for example).
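One way to pass more than the URL through the single state object is sketched below. It assumes an object[] holding the URL at index 0 and the target folder at index 1; DownloadJob, Pack and GetDestination are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.IO;

// Hypothetical sketch; these names are not part of the original sample.
static class DownloadJob
{
    // Packs the URL and the target folder into the one state object
    // that ThreadPool.QueueUserWorkItem accepts.
    public static object[] Pack(string url, string targetFolder)
    {
        return new object[] { url, targetFolder };
    }

    // Casts the state back to an array and builds the destination path
    // from the folder (index 1) and the file name of the URL (index 0).
    public static string GetDestination(object state)
    {
        object[] values = (object[])state;
        string url = (string)values[0];
        string targetFolder = (string)values[1];
        return Path.Combine(targetFolder, Path.GetFileName(url));
    }
}
```

With this in place, the work item could be queued with DownloadJob.Pack(f, folder) as the state, and DownloadImage could call DownloadJob.GetDestination(path) instead of concatenating the hardcoded folder.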

NOTE: I am using ThreadPool here, and all thread pool threads are background threads: if the application is closed, the downloads are canceled. To avoid this and wait for all downloads to complete (which is probably not a good idea, but still a possibility), use the Thread class with IsBackground set to false.
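The Thread-based alternative could look roughly like the sketch below. It is an assumption-laden illustration, not the author's code: ForegroundRunner and RunAll are hypothetical names, and the work delegate stands in for the actual download. Threads created with the Thread class are foreground threads by default, so setting IsBackground to false is only there to make the intent explicit.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical sketch; these names are not part of the original sample.
static class ForegroundRunner
{
    // Starts one foreground thread per item and blocks until all have finished.
    public static void RunAll(IEnumerable<string> items, Action<string> work)
    {
        var workers = new List<Thread>();
        foreach (string item in items)
        {
            string current = item; // local copy captured by the lambda
            var worker = new Thread(() => work(current));
            worker.IsBackground = false; // foreground: keeps the process alive
            workers.Add(worker);
            worker.Start();
        }

        // Join blocks the caller until each worker thread has exited.
        foreach (Thread started in workers)
            started.Join();
    }
}
```

One thread per file is the simplest form of this idea; for a page with many links it creates many threads at once, so some form of limit would be needed in practice.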
