Network Security Internet Technology Development Database Servers Mobile Phone Android Software Apple Software Computer Software News IT Information

In addition to Weibo, there is also WeChat

Please pay attention

WeChat public account

Shulou

How to fix the trap when using the Python ORM tool SQLAlchemy

2025-02-22 Update From: SLTechnology News&Howtos shulou NAV: SLTechnology News&Howtos > Development >

Share

Shulou(Shulou.com)06/02 Report--

In this issue, the editor will bring you about how to fix the trap when using the Python ORM tool SQLAlchemy. The article is rich in content and analyzes and narrates it from a professional point of view. I hope you can get something after reading this article.

When using SQLAlchemy, those seemingly small choices can have a significant impact on the performance of this object-relational mapping toolkit.

Object-relational mapping Object-relational mapping (ORM) makes the job of application developers easier, largely because it allows you to interact with databases in languages you might know (such as Python), rather than using raw SQL statements to query. SQLAlchemy is a Python ORM toolkit that provides the ability to access SQL databases using Python. It is a mature ORM tool, which adds the advantages of model relationship, powerful query construction paradigm, simple serialization and so on. However, its ease of use makes it easy to forget what's going on behind it. Seemingly small choices made when using SQLAlchemy can have a very large performance impact.

Explains some of the most important performance issues that developers encounter when using SQLAlchemy and how to solve them.

You only need to count to retrieve the entire result set.

Sometimes developers only need a result count, but instead of using the database count feature, they get all the results and then use len in Python to complete the count.

Count = len (User.query.filter_by (acct_active=True) .all ())

Instead, the count method using SQLAlchemy performs counting on the server side, reducing the amount of data sent to the client. Calling all () in the previous example also results in the instantiation of the model object, which can be very expensive if there is a lot of data.

Unless there is something else you need to do, just use the count method:

Count = User.query.filter_by (acct_active=True). Count () retrieves the entire model when only a few columns are needed

In many cases, only a few columns of data are needed to issue a query. SQLAlchemy can only get the columns you want, instead of returning the entire model instance. This not only reduces the amount of data sent, but also avoids instantiating the entire object. Using tuples of column data instead of models can be much faster.

Result = User.query.all () for user in result: print (user.name, user.email)

Instead, use the with_entities method to select only what you want:

Result = User.query.with_entities (User.name, User.email). All () for (username, email) in result: print (username, email) updates one object per loop

Avoid using loops to update collections individually. Although the database can perform a single update very quickly, the round-trip time between the application and the database server accumulates quickly. In general, strive for fewer queries under reasonable circumstances.

For user in users_to_update: user.acct_active = True db.session.add (user)

Use the batch update method instead:

Query = User.query.filter (user.id.in_ ([user.id for user in users_to_update])) query.update ({"acct_active": True}, synchronize_session=False) trigger cascade deletion

ORM allows simple configuration of model relationships, but there are some subtle behaviors that can be surprising. Most databases maintain relational integrity through foreign keys and various cascading options. SQLAlchemy allows you to define models using foreign keys and cascading options, but ORM has its own cascading logic that can replace databases.

Consider the following models:

Class Artist (Base): _ _ tablename__ = "artist" id = Column (Integer, primary_key=True) songs = relationship ("Song", cascade= "all, delete") class Song (Base): _ _ tablename__ = "song" id = Column (Integer, primary_key=True) artist_id = Column (Integer, ForeignKey ("artist.id", ondelete= "CASCADE"))

Deleting a singer will cause ORM to issue a delete query on the song table, thus preventing deletions due to foreign keys. This behavior can become a bottleneck for complex relationships and a large number of records.

Include the passive_deletes option to ensure that the database manages the relationship. However, please make sure that your database has this feature. For example, SQLite does not manage foreign keys by default.

Songs = relationship ("Song", cascade all, delete ", passive_deletes=True) when greedy loading is to be used, lazy loading should be used

Lazy loading is the default method for SQLAlchemy to handle relationships. Judging from the previous example build, loading a singer does not load his or her songs at the same time. This is usually a good idea, but if you always need to load certain relationships, separate queries can be wasteful.

Popular serialization frameworks like Marshmallow can trigger cascading queries if relationships are allowed to be loaded in a deferred manner.

There are several ways to control this behavior. The easiest way is through the relationship function itself.

Songs = relationship ("Song", lazy= "joined", cascade= "all, delete")

This will cause a left join to be added to any singer's query, so the songs collection will be available immediately. Although more data is returned to the client, the number of round trips may be much less.

SQLAlchemy provides finer-grained control for situations where this synthesis approach is not available, and you can use the joinedload () function to switch the loading of joins on a per-query basis.

From sqlalchemy.orm import joinedload artists = Artist.query.options (joinedload (Artist.songs)) print (artists.songs) # Does not incur a roundtrip to load uses ORM for batch record import

When importing thousands of records, the overhead of building a complete model instance becomes a major bottleneck. Imagine loading thousands of records of songs from a file, each of which is first converted into a dictionary.

For song in songs: db.session.add (Song (`song))

Instead, bypass ORM and use only the core SQLAlchemy parameter binding functionality.

Batch = [] insert_stmt = Song.__table__.insert () for song in songs: if len (batch) > 1000: db.session.execute (insert_stmt, batch) batch.clear () batch.append (song) if batch: db.session.execute (insert_stmt, batch)

Keep in mind that this method will naturally skip any client-side ORM logic you may rely on, such as Python-based column defaults. Although this method is faster than loading objects as complete model instances, your database may have a faster batch load method. For example, PostgreSQL's COPY command provides the best performance for loading a large number of records.

Call commit or refresh prematurely

In many cases, you need to associate a child record with its parent record, and vice versa. One obvious way is to refresh the session so that an ID is assigned to the problematic record.

Artist = Artist (name= "Bob Dylan") song = Song (title= "Mr. Tambourine Man") db.session.add (artist) db.session.flush () song.artist_id = artist.id

For each request, multiple submissions or refreshes are usually unnecessary and undesirable. Database refresh involves forcing disk writes on the database server, and in most cases, the client will block until the server confirms that the data has been written.

SQLAlchemy can track relationships and management keys behind the scenes.

Artist = Artist (name= "Bob Dylan") song = Song (title= "Mr. Tambourine Man") artist.songs.append (song) Summary

I hope this series of common traps will help you avoid these problems and keep your application running smoothly. In general, measurement is the key when diagnosing performance problems. Most databases provide performance diagnostics that can help you locate problems, such as PostgreSQL's pg_stat_statements module.

The above is the editor for you to share how to fix the trap when using the Python ORM tool SQLAlchemy, if you happen to have similar doubts, you might as well refer to the above analysis to understand. If you want to know more about it, you are welcome to follow the industry information channel.

Welcome to subscribe "Shulou Technology Information " to get latest news, interesting things and hot topics in the IT industry, and controls the hottest and latest Internet news, technology news and IT industry trends.

Views: 0

*The comments in the above article only represent the author's personal views and do not represent the views and positions of this website. If you have more insights, please feel free to contribute and share.

Share To

Development

Wechat

© 2024 shulou.com SLNews company. All rights reserved.

12
Report