DZone
Thanks for visiting DZone today,
Edit Profile
  • Manage Email Subscriptions
  • How to Post to DZone
  • Article Submission Guidelines
Sign Out View Profile
  • Post an Article
  • Manage My Drafts
Over 2 million developers have joined DZone.
Log In / Join
Please enter at least three characters to search
Refcards Trend Reports
Events Video Library
Refcards
Trend Reports

Events

View Events Video Library

Zones

Culture and Methodologies Agile Career Development Methodologies Team Management
Data Engineering AI/ML Big Data Data Databases IoT
Software Design and Architecture Cloud Architecture Containers Integration Microservices Performance Security
Coding Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks
Culture and Methodologies
Agile Career Development Methodologies Team Management
Data Engineering
AI/ML Big Data Data Databases IoT
Software Design and Architecture
Cloud Architecture Containers Integration Microservices Performance Security
Coding
Frameworks Java JavaScript Languages Tools
Testing, Deployment, and Maintenance
Deployment DevOps and CI/CD Maintenance Monitoring and Observability Testing, Tools, and Frameworks

Last call! Secure your stack and shape the future! Help dev teams across the globe navigate their software supply chain security challenges.

Modernize your data layer. Learn how to design cloud-native database architectures to meet the evolving demands of AI and GenAI workloads.

Releasing software shouldn't be stressful or risky. Learn how to leverage progressive delivery techniques to ensure safer deployments.

Avoid machine learning mistakes and boost model performance! Discover key ML patterns, anti-patterns, data strategies, and more.

Related

  • Python Packages for Validating Database Migration Projects
  • Data Analytics Using Python
  • .NET Performance Optimization Techniques for Expert Developers
  • How to Use Python for Data Science

Trending

  • Medallion Architecture: Efficient Batch and Stream Processing Data Pipelines With Azure Databricks and Delta Lake
  • Why High-Performance AI/ML Is Essential in Modern Cybersecurity
  • Understanding and Mitigating IP Spoofing Attacks
  • Enhancing Security With ZTNA in Hybrid and Multi-Cloud Deployments
  1. DZone
  2. Data Engineering
  3. Data
  4. Getting Started With C# DataFrame and XPlot.Ploty

Getting Started With C# DataFrame and XPlot.Ploty

For the Python programming language, Pandas is an efficient and popular data analysis tool, especially its Dataframe, used to manipulate and display data.

By 
Jirapongse Phuriphanvichai user avatar
Jirapongse Phuriphanvichai
·
Updated Jan. 27, 22 · Tutorial
Likes (5)
Comment
Save
Tweet
Share
24.4K Views

Join the DZone community and get the full member experience.

Join For Free

For the Python programming language, Pandas is an efficient and popular data analysis tool, especially its Dataframe, used to manipulate and display data. For the .NET programming languages, we can use Deedle or Microsoft.Data.Analysis package available in NuGet which also provides a DataFrame class used to manipulate, transform, and display data.

This example focuses on Microsoft.Data.Analysis package by demonstrating some basic features of the DataFrame class in Jupyter Notebook.

It also uses the XPlot.Plotly package which is F# data visualization package to plot charts for the data in the Dataframe. The source code is available on GitHub.

Prerequisite

To run examples in this article, please refer to this Using .NET Core in Jupyter Notebook article for setting up Jupyter Notebook to support the .NET programming languages.

Install the Package

The Microsoft.Data.Analysis package is available in Nuget so the dotnet-interactive #r magic command can be used to install the package from NuGet.

Run the below command to install Microsoft.Data.Analysis package version 0.4.0.

C#
 
#r "nuget:Microsoft.Data.Analysis,0.4.0"


Refer the Namespaces

This article uses classes from the following four packages. Therefore, it uses the using statement to refer to those packages.

  • XPlot.Plotly: A cross-platform data visualization package for the F# and .NET programming languages
  • Microsoft.Data.Analysis: An easy-to-use and high-performance library for data analysis and transformation
  • System.Linq: Classes and interfaces that support queries that use Language-Integrated Query
  • Microsoft.AspNetCore.Html: Types for manipulating HTML content
C#
 
using XPlot.Plotly;
using Microsoft.Data.Analysis;
using System.Linq;
using Microsoft.AspNetCore.Html;


Render a DataFrame as an HTML Table

By default, a DataFrame is rendered as an HTML table with one row and two columns (Columns and Rows).

Columns vs. Rows

This can be overridden by registering the custom formatter for the DataFrame. The below code registers custom formatters for the Dataframe and DataFrameRow to render the data in an HTML table. 

Custom Formatting

It only displays the first 100 rows. This can be changed by modifying the value of the take variable. 

Java
 
Formatter<DataFrame>.Register((df, writer) =>
{
    var headers = new List<IHtmlContent>();
    headers.Add(th(i("index")));
    headers.AddRange(df.Columns.Select(c => (IHtmlContent) th(c.Name)));
    var rows = new List<List<IHtmlContent>>();
    var take = 100;
    for (var i = 0; i < Math.Min(take, df.Rows.Count); i++)
    {
        var cells = new List<IHtmlContent>();
        cells.Add(td(i));
        foreach (var obj in df.Rows[i])
        {
            cells.Add(td(obj));
        }
        rows.Add(cells);
    } 
    var t = table(
        thead(
            headers),
        tbody(
            rows.Select(
                r => tr(r))));
    writer.Write(t);    
    writer.Write(df.Rows.Count + " x "+df.Columns.Count);
}, "text/html");
 
Formatter<DataFrameRow>.Register((dataFrameRow, writer) =>
{
    var cells = new List<IHtmlContent>();
    cells.Add(td(i));
    foreach (var obj in dataFrameRow)
    {
        cells.Add(td(obj));
    }
    var t = table(
        tbody(
            cells));
    writer.Write(t);
}, "text/html");


Create the DataFrame

DataFrameColumn

A DataFrame can be created by passing the list of DataFrameColumn objects to the DataFrame's constructor.

public DataFrame(params DataFrameColumn[] columns);
public DataFrame(IEnumerable<DataFrameColumn> columns);


The following code creates a DataFrame that has 200 rows and 2 columns. The first column contains dates and the second column contains random integer numbers. It calls the PrimitiveDataFrameColumn constructor to create the DataFrameColumn instances.

C#
 
var start = new DateTime(2009,1,1);
Random rand = new Random();
var numDataPoint = 200; 

PrimitiveDataFrameColumn<DateTime> date = new PrimitiveDataFrameColumn<DateTime>("Date", 
    Enumerable.Range(0, numDataPoint)
          .Select(offset => start.AddDays(offset))
          .ToList()); 
PrimitiveDataFrameColumn<int> data = new PrimitiveDataFrameColumn<int>("Data",
    Enumerable.Range(0, numDataPoint)
                        .Select(r => rand.Next(100))
                        .ToList()); 
var df = new DataFrame(date, data);
df


CSV File

The DataFrame can also be created from a CSV file by calling the DataFrame.LoadCsv static method.

The following code creates a DataFrame from the ohcldata.csv file. This file is downloaded from 5.30. Example - Basic OHLC (Open, High, Low, Close) Financial Plot website. This file contains the daily Open, High, Low, Close financial data.

C#
 
var df1 = DataFrame.LoadCsv("ohlcdata.csv");
df1

Then, the Info method can be used to generate a summary of each column in the DataFrame. 

C#
 
df1.Info()

Access the DataFrame

Access Data by Indices

A row index and column index can be used to access the specific data in the DataFrame. The index is zero-based numbering.

The below code access data in the first row and the second column.

C#
 
df[0,1]



Out[7]: 11; First Row, Second Column


After that, a new value can be assigned to the DataFrame.

The below code increases the data in the first row and the second column by 10.

C#
 
df[0,1] = int.Parse(df[0,1].ToString()) + 10;
df.Head(10)

Out[8]: Chart; Increase Data by 10

Access Row Data

The entire row can be accessed by using a row index. The index is zero-based numbering.

The below code accesses the tenth row in the DataFrame.

C#
 
df1.Rows[9]


The column index can also be used to access the specific column in the row.

The below accesses the fourth column in the tenth row.

C#
 
df1.Rows[9][5]


Then, the new value can also be assigned to the column.

The below code assigns 50000000 to the sixth column.

C#
 
df1.Rows[9][5] = 50000000f;
df1.Head(10)


Out[11]: Chart; Access Row Data


Access Column Data

The entire column can be accessed by using a column name or index. The index is zero-based numbering.

The below code accesses the column named Data (the second column) in the DataFrame.

C#
 
//df.Columns["Data"] or df.Columns[1]
df.Columns["Data"]  


Out[12]: Chart; Access Column Data


The data in the column can be changed by using the DataFrame's overloaded operators.

The below code increases all data in the column by ten.

C#
 
 df.Columns["Data"]= df.Columns["Data"]+10;
 df

Out[13]: Chart; Increase All Data By 10

Insert Data

Add a New Column

The DataFrame maintains a list of DataFrameColumns in the DataFrameColumnCollection. A new column can be added to the DataFrameColumnCollection.

The below code adds a new integer column to the DataFrame.

C#
 
df.Columns.Add(new PrimitiveDataFrameColumn<int>("Data1", df.Rows.Count()));
df

Out [14]: Chart; Add a New Column

The data in the new column is set to null.

The following code fills null values in the new column (Data1) with 10.

C#
 
df.Columns["Data1"].FillNulls(10, true);
df

Out[15]: Chart; Fill Null Values

Append a New Row

The Append method can be used to append new rows to the DataFrame.

The following code creates a list of KeyValuePair instances and then adds it to the DataFrame.

C#
 
df.Append(new List<KeyValuePair<string, object>>() { 
    new KeyValuePair<string, object>("Date", DateTime.Now),
    new KeyValuePair<string, object>("Data", 12),
    new KeyValuePair<string, object>("Data1", 50)
}, true);
df.Tail(10)


Out [16]: Chart; Append a New Row


Manipulate the DataFrame

Sort the DataFrame

The OrderBy or OrderByDescending method can be used to order the DataFrame by a specified column.

The following code sorts the DataFrame by the column named Data.

C#
 
df.OrderBy("Data")

Out [17]:Chart; Sort DataFrame

Group the DataFrame

The GroupBy method can be used to group the rows of the DataFrame by unique values in the column.

The following code groups the DataFrame by the column named Data and then counts the number of values in each group.

C#
 
var groupByData = df.GroupBy("Data");
groupByData.Count().OrderBy("Data")

Out[18]: Chart; Group the DataFrame

Filter the DataFrame

The Filter method can be used to filter the DataFrame by row indices or boolean values.

The following code filters the DataFrame by returning rows that have the values in the column named Data greater than fifty.

C#
 
df.Filter(df.Columns["Data"].ElementwiseGreaterThan(50))


Out[19]: Chart; Filter the DataFrame


Merge the DataFrame

The Merge method can be used to merge two DataFrames with a database-style join.

The following code joins two DataFrames by using the Date column contained in both DataFrames. First, it converts the data type in the Date column of the df1 from the string type to the DataTime type. Then, it calls the Merge method to join the DataFrames.

Java
 
df1.Columns["Date"] = new PrimitiveDataFrameColumn<DateTime>("Date", 
    df1.Columns["Date"]
                .Cast<object>()
                .ToList()
                .Select(x => DateTime.ParseExact(x.ToString(), "yyyy-MM-dd", System.Globalization.CultureInfo.InvariantCulture))
                .Cast<DateTime>());               

df1.Merge<DateTime>(df, "Date", "Date")


Out[20]: Chart; Merge the DataFrame


Plot Charts by Using XPlot.Ploty

XPlot.Ploty is a cross-platform data visualization package for the F# and .NET programming languages. It is based on Plotly which is the popular JavaScript charting library.

The following examples demonstrate how to use XPlot.Ploty to plot charts by using the data in the DataFrame.

Line Chart

The following code plots a line chart from the Open column in the DataFrame.

C#
 
var chart1 = Chart.Plot(
    new Graph.Scatter
    {
        x = df1.Columns["Date"],
        y = df1.Columns["Open"],        
        mode = "lines+markers"
    }
);
var chart1_layout = new Layout.Layout{
    title="Open Price",
    xaxis =new Graph.Xaxis{
        title = "Date"
        },
    yaxis =new Graph.Yaxis{
    title = "Price (USD)"
        }           
    };
chart1.WithLayout(chart1_layout);
chart1


Line Chart Example


Line Chart With Mulitple Lines

The following code plots the Open, and Close columns in a line chart.

C#
 
var chart2_list = new List<Graph.Scatter> 
{
    new Graph.Scatter
    {
         x = df1.Columns["Date"],
        y = df1.Columns["Open"],
        name="Open",
        mode = "lines"
    },
    new Graph.Scatter    
    {       
        x = df1.Columns["Date"],
        y = df1.Columns["Close"],
        name="Close",
        mode = "lines"
    }
    
};
 
var chart2 = Chart.Plot(
    chart2_list
);
 
var chart2_layout = new Layout.Layout{
    title="Open and Close Price",
    xaxis =new Graph.Xaxis{
        title = "Date"
        },
    yaxis =new Graph.Yaxis{
    title = "Price (USD)"
        }           
    };
chart2.WithLayout(chart2_layout);
chart2


Multiple Line Chart Example


Bar Chart

The following code plots a bar chart from the Volume column in the DataFrame.

C#
 
var chart3 = Chart.Plot(

    new Graph.Bar
    {
        x = df1.Columns["Date"],
        y = df1.Columns["Volume"],        
        marker = new Graph.Marker{color = "rgb(0, 0, 109)"}
    }
);
var chart3_layout = new Layout.Layout{
    title="Volume",
    xaxis =new Graph.Xaxis{
        title = "Date"
        },
    yaxis =new Graph.Yaxis{
    title = "Unit"
        }           
    };
chart3.WithLayout(chart3_layout);
chart3



Bar Chart Example


Candlestick Chart

The following code plots a candlestick chart from the Open, High, Low, Close columns in the DataFrame.

C#
 
var chart4 = Chart.Candlestick(df1.OrderBy("Date").Rows.Select(row => new Tuple<string, double, double, double, double>(
                 ((DateTime)row[0]).ToString("yyyy-MM-dd"),
                 double.Parse(row[1].ToString()),
                 double.Parse(row[2].ToString()),
                 double.Parse(row[3].ToString()),
                 double.Parse(row[4].ToString())
                )));
chart4.WithLayout(new Layout.Layout{
    title="OHLC",
    xaxis =new Graph.Xaxis{
        title = "Date"
        },
    yaxis =new Graph.Yaxis{
    title = "Price (USD)"
        }           
    });
chart4


Candlestick Chart


References

  1. Phplot.sourceforge.net. n.d. 5.30. Example - Basic OHLC (Open, High, Low, Close) Financial Plot. [online] Available at: http://phplot.sourceforge.net/phplotdocs/ex-ohlcbasic.html [Accessed 6 May 2021].
  2. Bluemountaincapital.github.io. n.d. Deedle: Exploratory data library for .NET. [online] Available at: https://bluemountaincapital.github.io/Deedle/ [Accessed 6 May 2021].
  3. Govindarajan, P., 2019. An Introduction to DataFrame | .NET Blog. [online] .NET Blog. Available at: https://devblogs.microsoft.com/dotnet/an-introduction-to-dataframe/ [Accessed 6 May 2021].
  4. Sequeira, J., 2020. dotnet/interactive: Magic Commands. [online] GitHub. Available at: https://github.com/dotnet/interactive/blob/main/docs/magic-commands.md [Accessed 6 May 2021].
  5. Winnington, E., 2019. Eric Winnington - Tips and tricks for C# Jupyter notebook. [online] Ewinnington.github.io. Available at: https://ewinnington.github.io/posts/jupyter-tips-csharp [Accessed 6 May 2021].
  6. Fslab.org. n.d. XPlot - F# Data Visualization Package. [online] Available at: https://fslab.org/XPlot/index.html [Accessed 6 May 2021].
  7. Phuriphanvichai, J., 2021. Using .NET Core in Jupyter Notebook | Refinitiv Developers. [online] Developers.refinitiv.com. Available at: https://developers.refinitiv.com/en/article-catalog/article/using--net-core-in-jupyter-notebook.html [Accessed 10 May 2021].
Database Pandas Data visualization .NET jupyter notebook

Opinions expressed by DZone contributors are their own.

Related

  • Python Packages for Validating Database Migration Projects
  • Data Analytics Using Python
  • .NET Performance Optimization Techniques for Expert Developers
  • How to Use Python for Data Science

Partner Resources

×

Comments
Oops! Something Went Wrong

The likes didn't load as expected. Please refresh the page and try again.

ABOUT US

  • About DZone
  • Support and feedback
  • Community research
  • Sitemap

ADVERTISE

  • Advertise with DZone

CONTRIBUTE ON DZONE

  • Article Submission Guidelines
  • Become a Contributor
  • Core Program
  • Visit the Writers' Zone

LEGAL

  • Terms of Service
  • Privacy Policy

CONTACT US

  • 3343 Perimeter Hill Drive
  • Suite 100
  • Nashville, TN 37211
  • support@dzone.com

Let's be friends:

Likes
There are no likes...yet! 👀
Be the first to like this post!
It looks like you're not logged in.
Sign in to see who liked this post!