Fail, Fail, Fail: More Job Candidate Fails

Sometimes, reading candidates' answers is just something that I know is going to piss me off.

We have a question that goes something like this (the actual question is much more detailed):

We have a 15TB CSV file that contains web logs. The entries are sorted by date (since this is how they were entered). Find all the log entries within a given date range. You may not read more than 32 MB.

A candidate replied with an answer that had the following code:

    string line = string.Empty;
    StreamReader file;
    try
    {
        file = new StreamReader(filename);
    }
    catch (FileNotFoundException ex)
    {
        Console.WriteLine("The file is not found.");
        return;
    }
    while ((line = file.ReadLine()) != null)
    {
        var values = line.Split(',');
        DateTime date = Convert.ToDateTime(values[0]);
        if (date.Date >= startDate && date.Date <= endDate)
            output.Add(line);
        // Results size in MB
        double size = (GetObjectSize(output) / 1024f) / 1024f;
        if (size >= 32)
        {
            Console.WriteLine("Results size exceeded 32MB, the search will stop.");
            break;
        }
    }

My reply was:

The data file is 15TB in size. If the data is beyond the first 32MB, it won't be found.

The candidate then fixed his code. It now includes:

    var lines = File.ReadLines(filename);

Yep, this is on a 15TB file.

Now I’m going to have to lie down for a bit. I am not feeling so good.
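For contrast, here is a minimal sketch of one common way to approach a question like this. It is not necessarily the answer the actual (more detailed) question expects. Because the entries are sorted by date, you can binary search over byte offsets with seeks instead of scanning from the start. On a 15TB file that is roughly 44 probes, each reading only a small buffer, which leaves nearly all of the 32 MB budget for the matching entries themselves. The class and method names (`LogRangeSearch`, `Find`) are hypothetical. The sketch assumes UTF-8 text, `\n`-terminated lines, no header row or blank lines, and a timestamp in the first column that `DateTime.Parse` can read.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Text;

public static class LogRangeSearch
{
    // Reads one line from the current position, dropping the line ending.
    // Returns false when the stream is already at end of file.
    static bool TryReadLine(Stream s, out string line)
    {
        var bytes = new List<byte>();
        int b;
        while ((b = s.ReadByte()) != -1 && b != '\n')
            bytes.Add((byte)b);
        line = Encoding.UTF8.GetString(bytes.ToArray()).TrimEnd('\r');
        return b != -1 || bytes.Count > 0;
    }

    // Positions the stream at the first line start at or after 'offset'.
    static void SeekToLineStart(Stream s, long offset)
    {
        if (offset == 0) { s.Position = 0; return; }
        s.Position = offset - 1;
        int b;
        while ((b = s.ReadByte()) != -1 && b != '\n') { }
    }

    // Assumption: the first CSV column holds the entry's timestamp.
    static DateTime Timestamp(string line) =>
        DateTime.Parse(line.Split(',')[0], CultureInfo.InvariantCulture);

    // Lazily yields every entry whose timestamp is in [start, end].
    public static IEnumerable<string> Find(string path, DateTime start, DateTime end)
    {
        using var fs = new FileStream(path, FileMode.Open, FileAccess.Read);

        // Binary search for the smallest offset whose next line is >= start
        // (or that has no next line at all).
        long lo = 0, hi = fs.Length;
        while (lo < hi)
        {
            long mid = lo + (hi - lo) / 2;
            SeekToLineStart(fs, mid);
            if (!TryReadLine(fs, out var probe) || Timestamp(probe) >= start)
                hi = mid;
            else
                lo = mid + 1;
        }

        // Read forward from the first match and stop at the first entry past 'end'.
        SeekToLineStart(fs, lo);
        while (TryReadLine(fs, out var entry) && Timestamp(entry) <= end)
            yield return entry;
    }
}
```

A caller would loop over `LogRangeSearch.Find(filename, startDate, endDate)` with `foreach` and handle each entry as it arrives. The sketch does not enforce the 32 MB cap on the result scan, so a real answer would also need to count bytes read in that final loop.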


Published at DZone with permission of Ayende Rahien, DZone MVB.
