Pandas Dataframe or similar in C#.NET
Question:
I am currently working on implementing the C# version of a Gurobi linear programming model that was originally built in Python. I have a number of CSV files from which I import the data into pandas DataFrames, and I fetch columns from those DataFrames to create the variables used in my linear program. The Python code for creating the variables from the DataFrames is as follows:
import os
import pandas as pd
from gurobipy import multidict
dataPath = "C:/Users/XYZ/Desktop/LinearProgramming/TestData"
# pd.read_csv replaces DataFrame.from_csv, which was removed in pandas 1.0
routeData = pd.read_csv(os.path.join(dataPath, "DirectLink.csv"), index_col=None)
# Create three variables with Gurobi's multidict from the named columns, keeping RouteID as the key
routeID, transportCost, routeType = multidict({x[0]: [x[1], x[2]] for x in routeData[['RouteID', 'TransportCost', 'RouteType']].values})
Example: if the CSV structure is as follows:
RouteID RouteEfficiency TransportCost RouteType
1 0.8 2.00 F
2 0.9 5.00 D
3 0.7 6.00 R
4 0.6 3.00 T
The 3 variables should be:
RouteID: 1 2 3 4
TransportCost:
1:2.00
2:5.00
3:6.00
4:3.00
RouteType:
1:F
2:D
3:R
4:T
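For reference, the three variables above can be sketched without pandas or Gurobi, using only the standard library. This is a minimal illustration rather than the original code: it assumes the file is comma-separated with the header row shown in the example, and `SAMPLE_CSV` and `load_route_columns` are hypothetical names introduced here. It shows the shape a C# port would need to reproduce: a list of keys plus one dictionary per column.

```python
import csv
import io

# Hypothetical stand-in for DirectLink.csv, assuming comma-separated values
# and the header row from the example above.
SAMPLE_CSV = """RouteID,RouteEfficiency,TransportCost,RouteType
1,0.8,2.00,F
2,0.9,5.00,D
3,0.7,6.00,R
4,0.6,3.00,T
"""


def load_route_columns(csv_file):
    """Return (route_ids, transport_cost, route_type), keyed by RouteID."""
    route_ids = []
    transport_cost = {}
    route_type = {}
    # DictReader maps each row to {column name: string value}.
    for row in csv.DictReader(csv_file):
        route_id = int(row["RouteID"])
        route_ids.append(route_id)
        transport_cost[route_id] = float(row["TransportCost"])
        route_type[route_id] = row["RouteType"]
    return route_ids, transport_cost, route_type


route_ids, transport_cost, route_type = load_route_columns(io.StringIO(SAMPLE_CSV))
print(route_ids)
print(transport_cost)
print(route_type)
```

In C#, the same shape would typically be a `List<int>` for the keys with a `Dictionary<int, double>` and a `Dictionary<int, string>` for the two columns, whichever library ends up reading the CSV.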
Now, I want to create a C# version of the above code that does the same task, but I learned that C# has no built-in DataFrame type. I looked for alternatives but was unable to find anything. Please help me with this.
Answers:
Deedle is a .NET library that handles DataFrames.
New kid on the block
https://devblogs.microsoft.com/dotnet/an-introduction-to-dataframe/
Announced today, still in preview, Microsoft’s own take on a DataFrame 🙂
I was after a .NET representation of the Python Pandas library, and I came across this C# port: https://github.com/SciSharp/Pandas.NET. As of 21st March 2022, the last update was 2 months ago. The repository has 5 contributors, 36 people watching, and 51 forks.
This port includes the Pandas DataFrames structure and methods.
ML.NET 2.0 was released on Nov 10, 2022, and the announcement blog post has a section on DataFrame: https://devblogs.microsoft.com/dotnet/announcing-ml-net-2-0/#dataframe
DataFrame ships in the Microsoft.Data.Analysis NuGet package (https://www.nuget.org/packages/Microsoft.Data.Analysis/), and the source code is here: https://github.com/dotnet/machinelearning/tree/main/src/Microsoft.Data.Analysis
The ML.NET docs are here: https://learn.microsoft.com/en-ca/dotnet/machine-learning/