.NET Core Parallel.ForEach issues

The comments already explain why Parallel.ForEach is a poor fit for this task: it is designed for CPU-bound (CPU-intensive) work. If you use it for I/O-bound operations such as web requests, each thread pool thread it occupies sits blocked while waiting for a response and does nothing useful in the meantime. It still works, but it is not the best tool for this scenario.
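
To illustrate the point about blocked threads, here is a minimal sketch. It is not the code from the question: BlockingDemo, FakeBlockingRequest and CountBlockedThreads are hypothetical names, and Thread.Sleep is only an assumed stand-in for a synchronous web request. The method reports how many distinct threads spent their time sleeping inside the loop body.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

static class BlockingDemo {
    // Hypothetical stand-in for a synchronous web request:
    // the calling thread is blocked for the whole wait.
    static void FakeBlockingRequest(string url) {
        Thread.Sleep(200);
    }

    // Runs the blocking call under Parallel.ForEach and returns the number
    // of distinct threads that were tied up doing nothing but waiting.
    public static int CountBlockedThreads(string[] urls) {
        var threadIds = new ConcurrentDictionary<int, bool>();
        Parallel.ForEach(urls, url => {
            threadIds.TryAdd(Environment.CurrentManagedThreadId, true);
            FakeBlockingRequest(url);
        });
        return threadIds.Count;
    }
}
```

The exact count depends on the machine and the thread pool's current state, but every thread counted here was unavailable for other work while it waited.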

What you need are the asynchronous web request methods (such as HttpWebRequest.GetResponseAsync). That raises another problem: you don't want to start all of your web requests at once (as another answer suggests), because there may be thousands of URLs (ids) in your list. To limit concurrency you can use a synchronization construct designed for exactly that, for example a semaphore (SemaphoreSlim in the code below). A semaphore is a bit like a queue: it lets X callers pass, and the rest wait until one of the busy ones finishes its work and releases its slot (a slightly simplified description). Here is an example:

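using System;
using System.Collections.Generic;
using System.Net;
using System.Threading;
using System.Threading.Tasks;

static async Task ProcessUrls(string[] urls) {
    var tasks = new List<Task>();
    // semaphore that allows at most 10 requests to be in flight at once
    using (var semaphore = new SemaphoreSlim(10)) {
        foreach (var url in urls) {
            // wait (without blocking a thread) until a slot is free for this request
            await semaphore.WaitAsync();
            // MakeRequest releases the slot when it finishes
            tasks.Add(MakeRequest(semaphore, url));
        }
        // wait for the remaining requests to complete before disposing the semaphore
        await Task.WhenAll(tasks);
    }
}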

private static async Task MakeRequest(SemaphoreSlim semaphore, string url) {
    try {
        var request = (HttpWebRequest) WebRequest.Create(url);

        using (var response = await request.GetResponseAsync().ConfigureAwait(false)) {
            // do something with response    
        }
    }
    catch (Exception ex) {
        // handle or log ex here; once caught, a failed URL no longer faults Task.WhenAll
    }
    finally {
        // don't forget to release
        semaphore.Release();
    }
}
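
If you want to convince yourself that the throttle holds, the following self-contained sketch repeats the same pattern without touching the network. The names ThrottleDemo, FakeRequest and RunAsync are hypothetical, and Task.Delay is an assumed stand-in for the real request; the code tracks how many simulated requests are in flight and returns the peak, which should never exceed the limit.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

static class ThrottleDemo {
    static int running;      // simulated requests currently in flight
    static int maxObserved;  // highest value of `running` seen so far

    // Hypothetical stand-in for MakeRequest: Task.Delay simulates the network wait.
    static async Task FakeRequest(SemaphoreSlim semaphore, string url) {
        try {
            int now = Interlocked.Increment(ref running);
            // record the peak with a compare-and-swap loop
            int seen;
            while (now > (seen = Volatile.Read(ref maxObserved))) {
                Interlocked.CompareExchange(ref maxObserved, now, seen);
            }
            await Task.Delay(50).ConfigureAwait(false);
        }
        finally {
            // leave the "in flight" count before handing the slot to the next caller
            Interlocked.Decrement(ref running);
            semaphore.Release();
        }
    }

    // Same structure as ProcessUrls, but returns the peak concurrency observed.
    public static async Task<int> RunAsync(string[] urls, int limit) {
        running = 0;
        maxObserved = 0;
        var tasks = new List<Task>();
        using (var semaphore = new SemaphoreSlim(limit)) {
            foreach (var url in urls) {
                await semaphore.WaitAsync();
                tasks.Add(FakeRequest(semaphore, url));
            }
            await Task.WhenAll(tasks);
        }
        return maxObserved;
    }
}
```

For example, `int peak = await ThrottleDemo.RunAsync(urls, 10);` lets you compare the observed peak against the limit you passed in. Decrementing the counter before calling Release is deliberate: it keeps the count from briefly overshooting the limit when the next request starts.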
