What are the functions of .NET for Apache Spark 1.0?

2025-01-19 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/01 Report--

This article covers what .NET for Apache Spark 1.0 includes: the platforms it supports, its DataFrame and UDF APIs, its extension framework, and its performance relative to Scala and PySpark.

.NET for Apache Spark 1.0 has been released. It is a .NET framework for big-data processing on Spark that makes Apache Spark easy for .NET developers to use.

The package is led by Microsoft and the .NET Foundation and has been under development for about two years. At the Spark + AI Summit in 2019, Microsoft announced .NET for Apache Spark and released its first preview version, v0.1.0.

Version 1.0 includes the following:

Support for .NET applications that target .NET Standard 2.0 (.NET Core 3.1 or later is recommended).

Support for the Apache Spark 2.4 and 3.0 DataFrame APIs, including the ability to write Spark SQL. For example:

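using Microsoft.Spark.Sql;
using static Microsoft.Spark.Sql.Functions;

// inputfile is the path to the input CSV file, defined elsewhere
var spark = SparkSession.Builder().GetOrCreate();
var tweets = spark.Read().Schema("date STRING, time STRING, author STRING, tweet STRING").Format("csv").Load(inputfile);

// Count tweets per lower-cased author, highest count first
tweets = tweets.GroupBy(Lower(Col("author")).As("author"))
    .Agg(Count("tweet").As("tweetcount"))
    .OrderBy(Desc("tweetcount"));

// Save the result as a table, then query it with Spark SQL
tweets.Write().SaveAsTable("tweetcount");
spark.Sql(@"SELECT * FROM tweetcount").Show();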

Support for writing Apache Spark applications with .NET user-defined functions (UDFs). For example:

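// Define and register UDF
// (the generic type arguments are missing from the source listing; int? and string are assumed here)
var concat = Udf<int?, string, string>((age, name) => name + age);

// Use UDF
df.Filter(df["age"] > 21).Select(concat(df["age"], df["name"])).Show();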
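The lambda handed to Udf is an ordinary .NET delegate, so its per-row logic can be tried without a Spark cluster. The following is a minimal sketch, not part of the release notes: it assumes the age column maps to int? and the name column to string, and the class name UdfLambdaSketch is hypothetical.

```csharp
using System;

public static class UdfLambdaSketch
{
    // Same shape as the UDF lambda above: two inputs (int?, string), one string result.
    // The input types are an assumption; the source listing does not show them.
    public static readonly Func<int?, string, string> Concat =
        (age, name) => name + age;

    public static void Main()
    {
        // A row with both values present.
        Console.WriteLine(Concat(30, "Andy"));      // prints "Andy30"

        // A row whose age is null: a null int? concatenates as an empty string.
        Console.WriteLine(Concat(null, "Michael")); // prints "Michael"
    }
}
```

Running the lambda locally like this is a quick way to check how null values are handled before the function is registered as a UDF.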

An API extension framework for adding support for other Spark libraries. It currently includes support for the Linux Foundation's Delta Lake, Microsoft's open-source Hyperspace, ML.NET, and Apache Spark's MLlib functionality.

Performance work on moving data between the Spark runtime and .NET UDFs, including improved pickling interop and support for Apache Arrow.

Competitive performance: .NET for Apache Spark programs that do not use UDFs run at the same speed as non-UDF Spark applications written in Scala or PySpark. When an application does use UDFs, a .NET for Apache Spark program is at least as fast as the equivalent PySpark program and is generally faster.

That concludes this overview of what .NET for Apache Spark 1.0 includes. The best way to get familiar with these features is to try them out in practice.
