This article explains LINQ queries and their performance. The material is simple and clear and easy to follow; let's work through "LINQ queries and performance analysis" step by step.
Orcas (VS2008 & Framework 3.5) brings us many exciting new features, and the introduction of LINQ in particular is a great convenience for developers and framework designers alike. In the past, when we used O/R Mapping frameworks, what worried us most was the performance of complex queries and database operations. So what does LINQ bring us in this respect? Let's start with LINQ queries.
1: LINQ queries:
In a project, the most common database operation is the query, especially in more complex projects, where a query often pulls data from many tables and pieces it together to present to the user. Without O/R Mapping, we handle this in one of two ways: complex SQL statements or stored procedures.
With O/R Mapping, however, to keep the retrieved data strongly typed we often have to stitch results together across multiple methods, which is not only troublesome but also carries a definite performance cost; we then end up in the awkward situation of mixing the O/R Mapping framework with raw, unencapsulated ADO.NET. So how does LINQ handle querying? First, let's look at some more complex query syntax.
Design two tables, representing the department (Department) and the employee (Employee):
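For reference, here is a minimal sketch of how LINQ to SQL might map these two tables. The column names come from the SQL shown later in the article; the types, the attribute mapping, and the association details are assumptions, since the original designer-generated classes (including the DBDataClassesDataContext with its Employees and Departments table properties) are not listed.

using System.Data.Linq;
using System.Data.Linq.Mapping;

[Table(Name = "dbo.Department")]
public class Department
{
    private EntitySet<Employee> _employees = new EntitySet<Employee>();

    [Column(IsPrimaryKey = true)]
    public string DepId { get; set; }       // assumed type

    [Column]
    public string DepName { get; set; }

    // One-to-many association used by the nested-query and projection examples below.
    [Association(Storage = "_employees", ThisKey = "DepId", OtherKey = "DepId")]
    public EntitySet<Employee> Employees
    {
        get { return _employees; }
        set { _employees.Assign(value); }
    }
}

[Table(Name = "dbo.Employee")]
public class Employee
{
    private EntityRef<Department> _department;

    [Column(IsPrimaryKey = true)]
    public int EmployeeId { get; set; }     // assumed type

    [Column]
    public string DepId { get; set; }       // assumed type; compared to "1001" later in the article

    [Column]
    public string EmployeeName { get; set; }

    [Column]
    public decimal EmployeeSalary { get; set; }

    // Many-to-one association used by the employee.Department.DepName query below.
    [Association(Storage = "_department", ThisKey = "DepId", OtherKey = "DepId", IsForeignKey = true)]
    public Department Department
    {
        get { return _department.Entity; }
        set { _department.Entity = value; }
    }
}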
1. Fuzzy query:
Fuzzy queries are hard to implement in an O/R Mapping framework, and even when implemented they are usually not very efficient. LINQ gives us a rich set of fuzzy-query methods; for example, to look up employees surnamed Zhang in the Employee table:
DBDataClassesDataContext dbdata = new DBDataClassesDataContext();
var query = from employee in dbdata.Employees
            where employee.EmployeeName.StartsWith("Zhang")
            select employee;
At execution time, this query is translated into:
SELECT [t0].[EmployeeId], [t0].[DepId], [t0].[EmployeeName], [t0].[EmployeeSalary] FROM [dbo].[Employee] AS [t0] WHERE [t0].[EmployeeName] LIKE @p0
In this SQL statement the @p0 parameter is passed with a trailing "%", so every record whose EmployeeName starts with the given prefix is returned. LINQ offers similar methods such as Contains and EndsWith.
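As a quick illustration of those other two methods, here is a short sketch continuing with the dbdata context above; the search strings are made up for the example:
// Employees whose name contains "an" anywhere (translates to LIKE '%an%')
var containsQuery = from employee in dbdata.Employees
                    where employee.EmployeeName.Contains("an")
                    select employee;

// Employees whose name ends with "san" (translates to LIKE '%san')
var endsWithQuery = from employee in dbdata.Employees
                    where employee.EmployeeName.EndsWith("san")
                    select employee;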
2. Nested query:
Suppose we need to fetch all employees in the Marketing Department. With earlier O/R Mapping frameworks (ignoring mapped associations for the moment) we often had to make two trips: first fetch the department's id, then use that id to fetch all of its employees. LINQ gives us a better option; we can write:
var query = from employee in dbdata.Employees
            where employee.DepId == (from department in dbdata.Departments
                                     where department.DepName == "Marketing Department"
                                     select department).Single().DepId
            select employee;
Here we use a nested query, which at execution time is converted into a nested SQL query:
SELECT [t0].[EmployeeId], [t0].[DepId], [t0].[EmployeeName], [t0].[EmployeeSalary] FROM [dbo].[Employee] AS [t0] WHERE [t0].[DepId] = (SELECT [t1].[DepId] FROM [dbo].[Department] AS [t1] WHERE [t1].[DepName] = @p0)
Of course, we don't have to go to this much trouble; by using the existing entity relationship we can do it much more simply:
var query = from employee in dbdata.Employees
            where employee.Department.DepName == "Marketing Department"
            select employee;
However, this query is fundamentally different from the previous one, so let's see what SQL it translates into:
SELECT [t0].[EmployeeId], [t0].[DepId], [t0].[EmployeeName], [t0].[EmployeeSalary] FROM [dbo].[Employee] AS [t0] LEFT OUTER JOIN [dbo].[Department] AS [t1] ON [t1].[DepId] = [t0].[DepId] WHERE [t1].[DepName] = @p0
This time the translation is not a nested SQL statement but a LEFT OUTER JOIN. The explicit nested query is still a sharp tool we can reach for in special cases.
3. Projection
If the queries so far are still quite simple, combining anonymous types with projection is one of the sharpest tools a LINQ query offers: it is very flexible and covers most of our query needs. Consider an example: how do we get the total salary of each department?
var query = from department in dbdata.Departments
            select new
            {
                depid = department.DepId,
                depname = department.DepName,
                depsalary = department.Employees.Sum(e => e.EmployeeSalary)
            };
In this query, the new keyword creates an anonymous type with three properties, depid, depname, and depsalary, where depsalary is computed. The statement is translated into:
SELECT [t0].[DepId] AS [depid], [t0].[DepName] AS [depname], (SELECT SUM([t1].[EmployeeSalary]) FROM [dbo].[Employee] AS [t1] WHERE [t1].[DepId] = [t0].[DepId]) AS [depsalary] FROM [dbo].[Department] AS [t0]
Elegant code, and so convenient.
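Because the projection produces an anonymous type, the result is usually consumed on the spot. A minimal sketch, continuing from the query above (the output formatting is just for illustration):
foreach (var dep in query)
{
    // Each element carries only the three projected properties.
    Console.WriteLine("{0} {1}: total salary {2}", dep.depid, dep.depname, dep.depsalary);
}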
4. Using LINQ's extension methods
LINQ provides many extension methods that make all kinds of queries easier. Let's look at a few typical ones:
A) Average, Max
decimal x = dbdata.Employees.Average(e => e.EmployeeSalary);
decimal y = dbdata.Employees.Max(e => e.EmployeeSalary);
You don't need to write any SQL statements. The LINQ to SQL object relational mapper provided in Orcas handles fetching, tracking, and updating objects that map to your database data definitions and stored procedures.
As long as you filter and shape the results with LINQ's query extension methods, LINQ to SQL executes only the SQL needed to fetch that data. Note that the Average and Max calls above obviously do not return every row of the table; they use T-SQL aggregate functions to compute the value inside the database and return only a scalar.
B) Where, OrderBy
Sometimes we just want to filter and sort a single table. In that case we don't need a lengthy query expression; we can simply chain the LINQ extension methods, for example:
var query = dbdata.Employees.Where(e => e.EmployeeSalary > 2000).OrderBy(e => e.EmployeeName);
Lambda syntax is used here, and the statement is translated into the following SQL:
SELECT [t0].[EmployeeId], [t0].[DepId], [t0].[EmployeeName], [t0].[EmployeeSalary] FROM [dbo].[Employee] AS [t0] WHERE [t0].[EmployeeSalary] > @p0 ORDER BY [t0].[EmployeeName]
This works through the expression-tree support behind lambda expressions together with the IQueryable interface, and the code stays clean and tidy.
C) Take, Skip
Handling large volumes of data has always been a headache for developers; controls such as the GridView in .NET 1.1 and 2.0 never handled large result sets particularly well. LINQ queries solve this problem nicely.
var query = dbdata.Employees.Skip(10).Take(10);
This means: skip the first 10 records of the table and take the next 10, i.e. records 11 through 20. It is converted into the following SQL:
SELECT [t1].[EmployeeId], [t1].[DepId], [t1].[EmployeeName], [t1].[EmployeeSalary] FROM (SELECT ROW_NUMBER() OVER (ORDER BY [t0].[EmployeeId], [t0].[DepId], [t0].[EmployeeName], [t0].[EmployeeSalary]) AS [ROW_NUMBER], [t0].[EmployeeId], [t0].[DepId], [t0].[EmployeeName], [t0].[EmployeeSalary] FROM [dbo].[Employee] AS [t0]) AS [t1] WHERE [t1].[ROW_NUMBER] BETWEEN @p0 + 1 AND @p0 + @p1 ORDER BY [t1].[ROW_NUMBER]
As the SQL above shows, the most expensive part of paging is handed entirely to the database. Unlike GridView-style paging, which pulls back all the data and then pages it for display, only the requested page is retrieved, so it is far more efficient.
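Skip and Take make it easy to wrap paging into a small helper. A minimal sketch under the entity assumptions above; GetEmployeePage is a hypothetical helper, not part of LINQ to SQL:
using System.Linq;

// Returns one page of employees, ordered by name.
// pageIndex is zero-based; pageSize is the number of rows per page.
static IQueryable<Employee> GetEmployeePage(DBDataClassesDataContext dbdata, int pageIndex, int pageSize)
{
    return dbdata.Employees
                 .OrderBy(e => e.EmployeeName)   // ROW_NUMBER paging needs a stable ordering
                 .Skip(pageIndex * pageSize)
                 .Take(pageSize);
}

// Usage: the second page of 10 employees, i.e. records 11-20.
// var page = GetEmployeePage(dbdata, 1, 10).ToList();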
D) ToList and ToArray
By default the query result type is IEnumerable. Many developers aren't used to this and prefer a collection or an array; that's no problem, you can call ToList or ToArray to convert the query result into a List or an array.
The key thing to understand is that building a query does not actually touch the database: LINQ uses deferred (lazy) loading. If you don't want deferred execution and need the results right away, call ToList or ToArray. This is a very useful mechanism.
For example, when we need to display the employees of two departments, we can first fetch the departments into a List and then enumerate each department's employees in turn. This is more efficient because the departments don't have to be fetched from the database again on every access.
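A minimal sketch of that pattern, continuing with the dbdata context above (the department names are placeholders; List requires System.Collections.Generic and the query operators require System.Linq):
// ToList() executes the query immediately and caches the departments in memory.
List<Department> departments = dbdata.Departments
                                     .Where(d => d.DepName == "Marketing Department" || d.DepName == "Sales Department")
                                     .ToList();

foreach (Department dep in departments)
{
    Console.WriteLine(dep.DepName);
    foreach (Employee emp in dep.Employees)   // walks the association for each cached department
    {
        Console.WriteLine("  " + emp.EmployeeName);
    }
}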
2: Performance analysis of data operations:
Once we can use the query techniques above fluently, we should also understand how these database operations work underneath, so that we can adjust our data-access statements in time and keep them running efficiently. So let's take a look at LINQ's database operations and see what it actually does.
Compared with NHibernate, LINQ does do better in both the performance and the controllability of O/R Mapping. First, LINQ's default data mapping is expressed with attributes, a mechanism native to .NET, so the various properties of a data object are fixed at compile time. Most O/R Mapping tools such as NHibernate still describe data objects in XML mapping files and read that metadata from external files at run time, which obviously costs some runtime efficiency.
Second, the way data is fetched differs. LINQ's powerful SQL-generation machinery can analyze and optimize the SQL it emits for various data operations, and the resulting efficiency gain is clear.
Of course, as an O/R Mapping tool its efficiency cannot match accessing the database directly with SQL statements, that is, what we usually call raw SqlDataReader/SqlDataAdapter access. Even so, LINQ's performance is a pleasant surprise.
In a test I ran, executing the same bulk query with SqlDataReader and with LINQ, LINQ fell behind by less than 10%, while NHibernate's query efficiency was much lower, almost twice as slow. For such powerful data-mapping functionality that overhead is acceptable. Unfortunately, LINQ to SQL currently supports only SQL Server (although LINQ itself also targets XML, Entities, and so on).
For queries, we can optimize by inspecting the SQL statements that LINQ generates, which is very convenient. For update efficiency, however, we need to look at LINQ's update mechanism. In general we write something like this:
var query = from emp in dbdata.Employees
            where emp.DepId == "1001"
            select emp;
Employee employee = query.First();
employee.EmployeeName = "Li Si";
dbdata.SubmitChanges();
The code above fetches all employees whose department id is 1001 from the Employee table; we then take the first record (for simplicity we take just one here; in practice we could use Where to pick out the records that meet our conditions), change the name to "Li Si", and submit the update to the database.
What does LINQ do with this code? The query pulls the matching records from the database into memory and marks them all as unchanged. When the employee name is modified, the modified object is marked as Dirty (changed).
When SubmitChanges is called, SQL statements are generated and executed for every object in memory marked Dirty. In other words, completing one data update costs at least one query plus one update.
Thanks to deferred loading (lazy load), only one record is actually pulled from the database by the statement above, and only that one record is updated, so efficiency is still very good. In my tests, randomly picking one of 250,000 rows to update was almost as fast as randomly picking one of 10 rows, because the change-state comparison happens in memory. Now let's look at the SQL the update actually generates:
UPDATE [dbo].[Employee] SET [EmployeeName] = @p4 WHERE ([EmployeeId] = @p0) AND ([DepId] = @p1) AND ([EmployeeName] = @p2) AND ([EmployeeSalary] = @p3)
We modified only the EmployeeName field, and indeed the generated SQL updates only that column. But why does the WHERE clause include conditions beyond the primary key? This is part of the rigor of LINQ's generated SQL: it guards against concurrent transactions updating the same record. If transaction A has already updated the record, transaction B's update will fail.
We can't help asking: what if we want to update the primary key field itself? Could the statement then hit other records by mistake? It could, which is why LINQ stipulates that the primary key field cannot be updated; if you do need to change a key, delete the record and reinsert a new one, as sketched below.
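A minimal sketch of that delete-and-reinsert approach, using the Table<T> methods DeleteOnSubmit and InsertOnSubmit; the key values are made up for the example:
Employee old = dbdata.Employees.First(e => e.EmployeeId == 5);   // 5 is a made-up key
Employee replacement = new Employee
{
    EmployeeId = 500,                // the desired new key, also made up
    DepId = old.DepId,
    EmployeeName = old.EmployeeName,
    EmployeeSalary = old.EmployeeSalary
};
dbdata.Employees.DeleteOnSubmit(old);          // remove the row with the old key
dbdata.Employees.InsertOnSubmit(replacement);  // insert a copy under the new key
dbdata.SubmitChanges();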
Such rigorously generated SQL can also bring us some trouble, though. Consider the following scenario: a table contains a counter column, and the SQL we would normally write looks like this:
UPDATE CountTable SET CountColumn = CountColumn + 1 WHERE CountId = @countId
However, the SQL statement generated by the LINQ update is:
UPDATE [dbo].[CountTable] SET [CountColumn] = @p2 WHERE ([CountId] = @p0) AND ([CountColumn] = @p1)
The parameter @p2 is the new value computed in the application, and @p1 is the original value of CountColumn. In other words, CountColumn + 1 is not computed by the database, so under heavy concurrency the update often fails. In a test that used multiple threads to simulate concurrent users incrementing the counter, the value recorded in the database came out smaller than the value kept in a static variable, which means some of the database updates failed.
In addition, each update has to be completed in two steps, a query followed by the update, which also hurts efficiency considerably.
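For clarity, here is a hedged sketch of the LINQ pattern that produces the UPDATE shown above, together with one possible workaround. CountTable, CountTables, CountId, CountColumn, and countId are assumed names for this illustration; ExecuteCommand is the DataContext method that sends hand-written SQL directly to the database.
int countId = 1;   // made-up key for the example

// The LINQ pattern: read the row, increment in memory, write back.
// This produces the non-atomic UPDATE ... WHERE ([CountId] = @p0) AND ([CountColumn] = @p1) shown above.
var counter = dbdata.CountTables.First(c => c.CountId == countId);
counter.CountColumn = counter.CountColumn + 1;   // computed in the application, not in the database
dbdata.SubmitChanges();                          // fails if another session changed CountColumn in the meantime

// One possible workaround: let the database perform the increment atomically.
dbdata.ExecuteCommand("UPDATE CountTable SET CountColumn = CountColumn + 1 WHERE CountId = {0}", countId);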
Thank you for reading. That is the content of "LINQ query and performance analysis". I hope this article has given you a deeper understanding of LINQ queries and their performance; the specifics still need to be verified in practice.