
Making Code Faster: Start From Scratch

The author continues refactoring and improves performance yet again.

by Oren Eini · Nov. 24, 16 · Tutorial

After introducing the problem and doing some very obvious things, we have managed to get to nine seconds instead of 30. That is pretty awesome, but we can do better.

Let's see what happens if we write it from scratch, without LINQ.

var stats = new Dictionary<long, long>();

foreach (var line in File.ReadLines(args[0]))
{
    var parts = line.Split(' ');
    var duration = (DateTime.Parse(parts[1]) - DateTime.Parse(parts[0])).Ticks;
    var id = long.Parse(parts[2]);

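    // TryGetValue leaves existingDuration at 0 when the id has not been seen yet,
    // so the first record for an id starts its sum from zero.
    long existingDuration;
    stats.TryGetValue(id, out existingDuration);
    stats[id] = existingDuration + duration;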
}

using (var output = File.CreateText("summary.txt"))
{
    foreach (var entry in stats)
    {
        output.WriteLine($"{entry.Key:D10} {TimeSpan.FromTicks(entry.Value):c}");
    }
}

The code is still pretty small and idiomatic, but dropping LINQ gave us some interesting numbers: 10.4 seconds to run (so comparable to the parallel LINQ version), but we also allocated 2.9 GB (down from 3.49 GB) and our peak working set didn’t exceed 30 MB.
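To try the loop above without the real input file, here is a minimal sketch that runs the same aggregation over a few in-memory lines. The `SequentialSummary` class, the sample records, and the exact timestamp format are assumptions made for illustration (the post only tells us that each line holds a start, an end, and an id separated by spaces), and parsing with `CultureInfo.InvariantCulture` is my addition so the sketch behaves the same on any machine.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

public static class SequentialSummary
{
    // Sums the (end - start) duration per id, in ticks.
    public static Dictionary<long, long> Aggregate(IEnumerable<string> lines)
    {
        var stats = new Dictionary<long, long>();
        foreach (var line in lines)
        {
            var parts = line.Split(' ');
            var start = DateTime.Parse(parts[0], CultureInfo.InvariantCulture);
            var end = DateTime.Parse(parts[1], CultureInfo.InvariantCulture);
            var id = long.Parse(parts[2], CultureInfo.InvariantCulture);

            // existingDuration stays 0 when the id is new.
            long existingDuration;
            stats.TryGetValue(id, out existingDuration);
            stats[id] = existingDuration + (end - start).Ticks;
        }
        return stats;
    }

    public static void Main()
    {
        // Hypothetical sample records; the real input file is not shown in this post.
        var lines = new[]
        {
            "2015-01-01T16:44:31 2015-01-01T19:09:11 00043064",
            "2015-01-01T10:00:00 2015-01-01T10:30:00 00000007",
            "2015-01-02T08:00:00 2015-01-02T09:00:00 00043064",
        };

        foreach (var entry in Aggregate(lines))
        {
            Console.WriteLine($"{entry.Key:D10} {TimeSpan.FromTicks(entry.Value):c}");
        }
    }
}
```

With these sample lines, id 43064 accumulates 2:24:40 plus 1:00:00, and id 7 accumulates 0:30:00.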

Taking the next step and parallelizing this approach:

var stats = new ConcurrentDictionary<long, long>();
Parallel.ForEach(File.ReadLines(args[0]), line =>
{
    var parts = line.Split(' ');
    var duration = (DateTime.Parse(parts[1]) - DateTime.Parse(parts[0])).Ticks;
    var id = long.Parse(parts[2]);
    stats.AddOrUpdate(id, duration, (_, existingDuration) => existingDuration + duration);
});

This runs in eight seconds, with 3.49 GB of allocations and a peak working set of 50 MB. That is good, but we can do better.
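As an aside on the `AddOrUpdate` call in the parallel version above: the update delegate runs outside the dictionary's locks and can be invoked more than once for a single call when threads race on the same key, so it should stay free of side effects, as the simple addition here is. The following is a small sketch that checks the totals still come out right under contention; `AddOrUpdateSketch` and its parameters are hypothetical names, and the records are synthetic rather than read from a file.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class AddOrUpdateSketch
{
    // Adds `duration` once per record, spreading records over `distinctIds` keys.
    public static ConcurrentDictionary<long, long> Run(int records, int distinctIds, long duration)
    {
        var stats = new ConcurrentDictionary<long, long>();
        Parallel.For(0, records, i =>
        {
            long id = i % distinctIds;
            // The update lambda may be retried on contention, so it only computes a value.
            stats.AddOrUpdate(id, duration, (_, existing) => existing + duration);
        });
        return stats;
    }

    public static void Main()
    {
        var stats = Run(100000, 4, 10);
        foreach (var entry in stats)
        {
            Console.WriteLine($"{entry.Key:D10} {entry.Value}");
        }
    }
}
```

Each of the four ids receives 25,000 records of 10 ticks, so every total ends up at 250,000 no matter how the work is scheduled.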

// code

var stats = new Dictionary<string, FastRecord>();

foreach (var line in File.ReadLines(args[0]))
{
    var parts = line.Split(' ');
    var duration = (DateTime.Parse(parts[1]) - DateTime.Parse(parts[0])).Ticks;

    FastRecord value;
    var idAsStr = parts[2];
    if (stats.TryGetValue(idAsStr, out value) == false)
    {
        stats[idAsStr] = value = new FastRecord
        {
            Id = long.Parse(idAsStr),
        };
    }
    value.DurationInTicks += duration;
}


using (var output = File.CreateText("summary.txt"))
{
    foreach (var entry in stats)
    {
        output.WriteLine($"{entry.Value.Id:D10} {TimeSpan.FromTicks(entry.Value.DurationInTicks):c}");
    }
}

// record class
public class FastRecord
{
    public long Id;
    public long DurationInTicks;
}

Now, instead of using a dictionary of long to long, we’re using a dedicated class as the value, and the key is the string representation of the id. Most of the time, this saves us the need to parse the long, because we only parse an id the first time we see it. It also reduces the number of dictionary operations: once a record exists, a single lookup is enough and we update the record in place.

This dropped the runtime to 10.2 seconds (compared to 10.4 seconds for the previous single-threaded implementation). That is good, but this is just the first stage; what I really want to do is save on all those expensive dictionary calls when running in parallel.
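To make the effect of the string key concrete, here is a minimal sketch that counts how often `long.Parse` actually runs. The `IdTotal` and `StringKeySketch` names, the `ParseCalls` counter, and the sample ids are hypothetical and exist only for this illustration. Note that the approach assumes every id is always written the same way: with string keys, "7" and "0000000007" would be treated as two different ids.

```csharp
using System;
using System.Collections.Generic;

public class IdTotal
{
    public long Id;
    public long Total;
}

public static class StringKeySketch
{
    // Counts how many times an id string had to be parsed.
    public static int ParseCalls;

    public static Dictionary<string, IdTotal> Aggregate(IEnumerable<string> ids, long amount)
    {
        ParseCalls = 0;
        var stats = new Dictionary<string, IdTotal>();
        foreach (var idAsStr in ids)
        {
            IdTotal value;
            if (stats.TryGetValue(idAsStr, out value) == false)
            {
                // First time we see this id: parse once and store the record.
                ParseCalls++;
                stats[idAsStr] = value = new IdTotal { Id = long.Parse(idAsStr) };
            }
            // The record is a reference type, so this updates it in place
            // without a second dictionary operation.
            value.Total += amount;
        }
        return stats;
    }

    public static void Main()
    {
        var ids = new[] { "0000000007", "0000043064", "0000000007", "0000000007", "0000043064" };
        var stats = Aggregate(ids, 5);
        Console.WriteLine($"records: {ids.Length}, parse calls: {ParseCalls}");
        Console.WriteLine($"total for id 7: {stats["0000000007"].Total}");
    }
}
```

Five records with two distinct ids lead to only two parse calls, and id 7 ends with a total of 15.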

Here is the parallel version:

var stats = new ConcurrentDictionary<string, FastRecord>();

Parallel.ForEach(File.ReadLines(args[0]), line =>
{
    var parts = line.Split(' ');
    var duration = (DateTime.Parse(parts[1]) - DateTime.Parse(parts[0])).Ticks;

    var idAsStr = parts[2];
    var value = stats.GetOrAdd(idAsStr, s => new FastRecord
    {
        Id = long.Parse(s)
    });

    Interlocked.Add(ref value.DurationInTicks, duration);
});

And that one runs in 4.1 seconds, allocates 3 GB, and has a peak working set of 48 MB.
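The parallel version relies on two things: `GetOrAdd` hands every thread the same stored record for a given key (the factory may run more than once under a race, but only one result is kept), and `Interlocked.Add` makes the update to the shared field atomic. Here is a small sketch that exercises both under contention; `Counter`, `ParallelAccumulate`, and the synthetic ids are assumptions for illustration, not part of the original program.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public class Counter
{
    public long Id;
    public long Total;
}

public static class ParallelAccumulate
{
    // Adds `amount` to the record of each id, from many threads at once.
    public static ConcurrentDictionary<string, Counter> Run(string[] ids, long amount)
    {
        var stats = new ConcurrentDictionary<string, Counter>();
        Parallel.ForEach(ids, idAsStr =>
        {
            // Racing threads may each build a Counter, but all of them
            // receive the single instance that ended up in the dictionary.
            var value = stats.GetOrAdd(idAsStr, s => new Counter { Id = long.Parse(s) });

            // A plain += on a shared field could lose updates; Interlocked.Add cannot.
            Interlocked.Add(ref value.Total, amount);
        });
        return stats;
    }

    public static void Main()
    {
        // 100,000 synthetic records spread over four zero-padded ids.
        var ids = Enumerable.Range(0, 100000).Select(i => (i % 4).ToString("D10")).ToArray();
        var stats = Run(ids, 10);
        foreach (var entry in stats.OrderBy(e => e.Value.Id))
        {
            Console.WriteLine($"{entry.Value.Id:D10} {entry.Value.Total}");
        }
    }
}
```

Every id should report a total of 250,000. Replacing `Interlocked.Add` with `value.Total += amount` is an easy way to see why the atomic call matters, since lost updates can then make the totals come up short.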

We are now close to eight times faster than the initial version! ...but we can probably still do better. I’ll go over that in my next post.


Published at DZone with permission of Oren Eini, DZone MVB. See the original article here.
