2025-02-28 Update From: SLTechnology News&Howtos
First, data flow design optimization
A data flow has two defining characteristics: it streams data, and it processes data in memory buffers. Optimizing a data flow means designing around both.
1, Streaming: extract, transform, and load the data at the same time
Streaming means that while the source is still extracting data, the transformation components are already processing it and the destination is already loading it. Different components work on the data at the same time.
Set-based operations in an RDBMS are synchronous: an operation must complete before its results can be used for anything else, which follows from the atomicity of transactions. A data flow, by contrast, streams. As data moves through the pipeline, the data flow task can perform joins, lookups, and other transformations in parallel. When designing a data flow, make full use of this streaming behavior and keep synchronous steps to a minimum.
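The streaming idea can be sketched with Python generators. This is only an illustration, not SSIS code: the function names `extract`, `transform`, and `load` and the sample rows are made up for the example, and the interleaving here happens on one thread, whereas SSIS moves buffers of rows between components on multiple threads.

```python
from typing import Iterable, Iterator, List, Tuple


def extract(rows: Iterable[dict], log: List[Tuple[str, int]]) -> Iterator[dict]:
    """Source: hands rows downstream one at a time instead of returning a full list."""
    for row in rows:
        log.append(("extract", row["id"]))
        yield row


def transform(rows: Iterable[dict], log: List[Tuple[str, int]]) -> Iterator[dict]:
    """Transformation: processes each row as soon as the source produces it."""
    for row in rows:
        log.append(("transform", row["id"]))
        yield {**row, "name": row["name"].upper()}


def load(rows: Iterable[dict], log: List[Tuple[str, int]]) -> List[dict]:
    """Destination: stores each row as soon as the transformation emits it."""
    target = []
    for row in rows:
        log.append(("load", row["id"]))
        target.append(row)
    return target


source_rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
log: List[Tuple[str, int]] = []
loaded = load(transform(extract(source_rows, log), log), log)

# The log shows row 1 being loaded before row 2 has even been extracted.
for step in log:
    print(step)
```

The first row reaches the destination before the second row leaves the source, which is the behavior a set-based statement cannot offer.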
For example, suppose you execute an INSERT statement to load data into Table1 and then run an UPDATE statement against Table1. The UPDATE cannot start until the INSERT has finished, so the two statements run synchronously, one after the other.
The optimized design builds a data flow that implements the same logic as the INSERT statement and uses a transformation component to implement the same logic as the UPDATE statement.
This design replaces the T-SQL INSERT and UPDATE statements with the source, transformation, and destination components of a Data Flow Task, which takes full advantage of streaming. While data is still being extracted, the transformation component is already modifying the stream, so the "insert" and the "update" happen in the same pass and the overall processing time drops.
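The difference between the two designs can be sketched in Python, with the standard `sqlite3` module standing in for the target database. The `amount` and `status` columns and the "large"/"small" rule are hypothetical, chosen only to give the UPDATE something to do; in SSIS the second variant would correspond to a source, a transformation such as a derived column, and a destination.

```python
import sqlite3

ROWS = [(1, 50.0), (2, 150.0), (3, 100.0)]
DDL = "CREATE TABLE Table1 (id INTEGER PRIMARY KEY, amount REAL, status TEXT)"
SELECT_ALL = "SELECT id, amount, status FROM Table1 ORDER BY id"


def insert_then_update(rows):
    """Synchronous design: load everything, then revisit every row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    conn.executemany("INSERT INTO Table1 (id, amount) VALUES (?, ?)", rows)
    # The UPDATE can only start once the INSERT has completed.
    conn.execute(
        "UPDATE Table1 SET status = "
        "CASE WHEN amount >= 100 THEN 'large' ELSE 'small' END"
    )
    return conn.execute(SELECT_ALL).fetchall()


def transform_while_loading(rows):
    """Streaming design: derive the value in flight and write each row once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DDL)
    # Generator: each row is transformed just before it is handed to the insert.
    derived = (
        (row_id, amount, "large" if amount >= 100 else "small")
        for row_id, amount in rows
    )
    conn.executemany(
        "INSERT INTO Table1 (id, amount, status) VALUES (?, ?, ?)", derived
    )
    return conn.execute(SELECT_ALL).fetchall()


two_step_result = insert_then_update(ROWS)
single_pass_result = transform_while_loading(ROWS)
print(two_step_result == single_pass_result)
```

Both functions leave the table in the same state, but the second touches each row only once.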
Sometimes the RDBMS is still the faster choice. For example, if the table has a suitable index, sorting the data with an ORDER BY clause in the source query is much faster than using the SSIS Sort transformation.
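A small sketch of why an index helps, again using Python's `sqlite3` as a stand-in for the source database (SQL Server's plans look different, and the `orders` table and `ix_orders_key` index are invented for the example). With an index on the sort column the database can return rows in index order instead of performing a separate sort step.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_key INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?)", [(k,) for k in (42, 7, 19, 3, 88)]
)


def query_plan(sql: str) -> str:
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text


query = "SELECT order_key FROM orders ORDER BY order_key"

plan_without_index = query_plan(query)  # needs an explicit sort step
conn.execute("CREATE INDEX ix_orders_key ON orders (order_key)")
plan_with_index = query_plan(query)     # can read rows in index order

sorted_keys = [row[0] for row in conn.execute(query)]

print(plan_without_index)
print(plan_with_index)
print(sorted_keys)
```

In SSIS, a source that returns pre-sorted rows also has to be marked as sorted (the output's IsSorted property and the SortKeyPosition of the sort columns) before downstream components will treat it that way.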
2, The SSIS engine uses memory buffers to temporarily store the data stream
The SSIS engine holds the data stream in memory buffers and performs most transformations on data that resides in memory, which is what makes SSIS data processing efficient. Avoid letting the data stream spill to disk or to other storage media with slow I/O.
When the server runs low on memory, SSIS writes buffers to disk. Disk I/O is far slower than RAM, so the package runs much more slowly. Blocking and semi-blocking transformations are the most memory-intensive, so monitor their memory usage to avoid memory pressure.
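To see why blocking transformations need more memory, the sketch below counts how many input rows a transformation has to read before it can emit its first output row. It is a conceptual model in plain Python, not SSIS code; `derive_column`, `sort_rows`, and `rows_read_before_first_output` are hypothetical names.

```python
from typing import Callable, Iterable, Iterator, List


def derive_column(rows: Iterable[int]) -> Iterator[int]:
    """Non-blocking: each row is emitted as soon as it has been read."""
    for row in rows:
        yield row * 2


def sort_rows(rows: Iterable[int]) -> Iterator[int]:
    """Blocking: every input row must be held in memory before any output."""
    buffered = list(rows)
    buffered.sort()
    yield from buffered


def rows_read_before_first_output(
    transformation: Callable[[Iterable[int]], Iterator[int]],
    data: List[int],
) -> int:
    """Count the input rows consumed up to the first output row."""
    consumed = 0

    def counting_source() -> Iterator[int]:
        nonlocal consumed
        for row in data:
            consumed += 1
            yield row

    next(transformation(counting_source()))
    return consumed


data = [5, 3, 8, 1]
non_blocking_reads = rows_read_before_first_output(derive_column, data)
blocking_reads = rows_read_before_first_output(sort_rows, data)
print(non_blocking_reads, blocking_reads)
```

The row-by-row transformation needs one row at a time, while the sort has to hold the whole input, which is the kind of component that pushes buffers to disk when memory is short.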
Second, optimizing data flow transformations
1, Buffers and execution trees
Each execution tree in a data flow uses its own buffer profile, which means that components downstream of an execution tree may need a different set of columns, depending on their processing logic. Buffer performance is directly tied to row width: a narrower row lets each buffer hold more rows, which allows higher data flow throughput.
Columns used by an upstream execution tree may not be needed by downstream execution trees. SSIS raises a warning when a column is no longer used by any downstream execution tree. Each warning identifies a column that downstream components no longer use and that should be removed from the pipeline after its initial use. Any component with an asynchronous output can drop the column from that output.
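The effect of row width can be estimated with simple arithmetic. The sketch below is a deliberate simplification: it assumes the Data Flow Task defaults of 10 MB for DefaultBufferSize and 10,000 for DefaultBufferMaxRows and ignores the other factors the engine considers when sizing buffers. The column names and byte widths are hypothetical.

```python
DEFAULT_BUFFER_SIZE = 10 * 1024 * 1024  # bytes; assumed DefaultBufferSize
DEFAULT_BUFFER_MAX_ROWS = 10_000        # assumed DefaultBufferMaxRows


def rows_per_buffer(
    column_widths: dict,
    buffer_size: int = DEFAULT_BUFFER_SIZE,
    max_rows: int = DEFAULT_BUFFER_MAX_ROWS,
) -> int:
    """Rough estimate: rows that fit in one buffer, capped by the row limit."""
    row_width = sum(column_widths.values())
    return min(max_rows, buffer_size // row_width)


# Hypothetical row with two wide columns that downstream components never use.
wide_row = {"id": 4, "name": 200, "notes": 4000, "payload": 8000}

# The same row after the unused columns have been removed from the pipeline.
narrow_row = {"id": 4, "name": 200}

wide_estimate = rows_per_buffer(wide_row)
narrow_estimate = rows_per_buffer(narrow_row)
print(wide_estimate, narrow_estimate)
```

Under these assumptions the wide row fits only a few hundred times per buffer, while the narrow row reaches the row cap, so removing the columns flagged by the warnings lets each buffer carry far more rows.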
2, Engine threads
Adding more execution threads to the data flow improves CPU utilization. Set the Data Flow Task's EngineThreads property to a value greater than the number of execution trees and components so that SSIS has enough threads available.