Analysis of Python iterator and iterator slice

2025-03-28 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

This article explains Python iterators and iterator slices. The explanation is kept simple and clear so that it is easy to follow.

1. Iteration and iterators

First, a few basic concepts need to be clarified: iteration, iterable objects, and iterators.

Iteration is a way of traversing container-type objects (such as strings, lists, dictionaries, and so on). For example, iterating over the string "abc" means fetching its characters one by one from left to right. (PS: in Chinese, the word for iteration carries a sense of repeated, progressive cycles, but in Python it should be understood as a one-way, linear pass. If the term is unfamiliar, simply read it as traversal.)

So how do you write an iteration? The most common syntax is the for loop.

# Implementing iteration with a for loop
for char in "abc":
    print(char, end="")
# Output: abc

The for loop implements iteration, but not every object can be used in a for loop. For example, if the string "abc" in the example above is replaced with an integer, a TypeError is raised: 'int' object is not iterable.

The word "iterable" in this error message means "able to be iterated over": the int type is not iterable, whereas strings are, and so are lists, tuples, dictionaries, and so on.

So how do you tell whether an object is iterable? Why are these objects iterable? And how do you make an object iterable?

To make an object iterable, it must implement the iterable protocol, that is, the __iter__() magic method. In other words, any object that implements this magic method is an iterable object.

So how do you tell whether an object implements this method? Besides trying it in a for loop as above, I know of four ways:
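As a minimal sketch of the protocol just described, the hypothetical class Playlist below (it is not from the original article) becomes usable in a for loop simply by defining __iter__, which here delegates to the built-in iter() on an internal list.

```python
class Playlist:
    """Hypothetical container used only for illustration."""

    def __init__(self, songs):
        self._songs = list(songs)

    def __iter__(self):
        # Returning an iterator is all the iterable protocol requires.
        return iter(self._songs)


playlist = Playlist(["intro", "verse", "outro"])
for song in playlist:
    print(song)

# The object can be traversed again because each traversal asks for a fresh iterator.
titles = list(playlist)
```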

# Method 1: use dir() and look for __iter__ in the result
dir(2)        # no __iter__ (output omitted)
dir("abc")    # has __iter__ (output omitted)

# Method 2: check with isinstance()
from collections.abc import Iterable
isinstance(2, Iterable)      # False
isinstance("abc", Iterable)  # True

# Method 3: check with hasattr()
hasattr(2, "__iter__")      # False
hasattr("abc", "__iter__")  # True

# Method 4: call iter() and see whether it raises an error
iter(2)      # TypeError: 'int' object is not iterable
iter("abc")  # returns a str_iterator object

# PS: an object can also be iterable by implementing __getitem__; for brevity, this article skips that case.
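The __getitem__ case mentioned in the PS can be sketched as follows. Squares and is_iterable are hypothetical names introduced only for this illustration; the point is that iter() can succeed even when the __iter__ checks report False, so calling iter() is the most inclusive of the four checks.

```python
from collections.abc import Iterable


class Squares:
    """Hypothetical class that defines only __getitem__, not __iter__."""

    def __getitem__(self, index):
        if index >= 4:
            raise IndexError(index)  # tells iter() where the sequence ends
        return index * index


def is_iterable(obj):
    """Return True if iter(obj) succeeds (method 4 from the list above)."""
    try:
        iter(obj)
    except TypeError:
        return False
    return True


sq = Squares()
print(hasattr(sq, "__iter__"))   # False
print(isinstance(sq, Iterable))  # False
print(is_iterable(sq))           # True
print(list(sq))                  # [0, 1, 4, 9]
```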

The most noteworthy of these is iter(), a Python built-in function that turns an iterable object into an iterator. That sentence says two things: (1) iterable objects and iterators are two different things; (2) an iterable object can be turned into an iterator.

In fact, an iterator must be an iterable object, but an iterable object is not necessarily an iterator. What's the difference between the two?

As the blue circle in the figure shows, the key difference between an ordinary iterable object and an iterator can be summarized as "one thing in common, two differences". The thing in common is that both are iterable (__iter__). The two differences are that, once an iterable object is converted into an iterator, it loses some attributes (__getitem__) and gains some attributes (__next__).

First, consider the added attribute __next__, which is the key to what makes an iterator an iterator. In fact, an iterator is defined as an object that implements both the __iter__ method and the __next__ method.

With this extra attribute, an object can carry out its own iteration without the help of the external for loop syntax. I have coined two terms to describe these two kinds of traversal (PS: they are called traversal here for ease of understanding; they could equally be called iteration): "it-traversal" means being traversed by external syntax, and "self-traversal" means traversing through the object's own method.

With these two terms, we can say that an iterable object is an object that supports "it-traversal", and an iterator is an object that, on top of that, also supports "self-traversal".
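To make that definition concrete, here is a small sketch of a hand-written iterator. The class name Countdown is an assumption made for illustration and does not appear in the original article; it implements __iter__ (returning itself) and __next__ (returning the next value or raising StopIteration).

```python
class Countdown:
    """Hypothetical iterator that counts down from start to 1."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        # An iterator returns itself from __iter__.
        return self

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


cd = Countdown(3)
first = next(cd)   # 3
rest = list(cd)    # [2, 1]
again = list(cd)   # [] because the iterator is exhausted
```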

ob1 = "abc"
ob2 = iter("abc")
ob3 = iter("abc")

# ob1: it-traversal
for i in ob1:
    print(i, end=" ")  # a b c
for i in ob1:
    print(i, end=" ")  # a b c

# ob1: self-traversal
ob1.__next__()  # AttributeError: 'str' object has no attribute '__next__'

# ob2: it-traversal
for i in ob2:
    print(i, end=" ")  # a b c
for i in ob2:
    print(i, end=" ")  # no output

# ob2: self-traversal
ob2.__next__()  # raises StopIteration

# ob3: self-traversal
ob3.__next__()  # a
ob3.__next__()  # b
ob3.__next__()  # c
ob3.__next__()  # raises StopIteration

As the example above shows, the advantage of an iterator is that it supports self-traversal. It is also one-way and non-repeating: once the traversal is complete, calling __next__ again raises StopIteration, and a second for loop produces no output.

An analogy comes to mind: an ordinary iterable object is like a magazine of bullets. It-traversal takes the bullets out and puts them back when the operation is finished, so the object can be iterated repeatedly (that is, a for loop can be run many times and return the same result). An iterator is like a gun with a non-detachable magazine already loaded: whether by it-traversal or self-traversal, every step fires a bullet. This is consumptive traversal and cannot be reused (that is, the traversal comes to an end).

To sum up briefly: iteration is a way of traversing elements. Classified by how it is implemented, there is external iteration and internal iteration: an object that supports external iteration (it-traversal) is an iterable object, and an object that also supports internal iteration (self-traversal) is an iterator. Classified by how elements are consumed, there is reusable iteration and one-time iteration: ordinary iterable objects are reusable, while iterators are single-use.
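The two styles are closely related: a for loop is itself built on iter() and next(). The sketch below spells out roughly what the loop does behind the scenes; manual_for is a hypothetical helper written for this illustration, not part of Python.

```python
def manual_for(iterable):
    """Collect items the way a for loop would, using iter() and next()."""
    collected = []
    iterator = iter(iterable)      # step 1: obtain an iterator
    while True:
        try:
            item = next(iterator)  # step 2: ask for the next item
        except StopIteration:      # step 3: stop when the iterator is exhausted
            break
        collected.append(item)
    return collected


print(manual_for("abc"))  # ['a', 'b', 'c']
```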
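One way to see why ordinary iterables are reusable while iterators are not is to look at what iter() returns in each case. The variable names below are chosen only for this sketch.

```python
text = "abc"
it = iter(text)

# Each call to iter() on the string builds a new, independent iterator.
fresh_each_time = iter(text) is not iter(text)

# Calling iter() on an iterator just returns the same object.
same_object = iter(it) is it

first_pass = list(it)   # ['a', 'b', 'c']
second_pass = list(it)  # [] (already consumed)
```

A for loop over the string therefore always starts from the beginning, whereas a for loop over the iterator continues from wherever the iterator currently stands.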

2. Iterator slice

The other difference mentioned earlier is that an ordinary iterable object loses some attributes when it is converted into an iterator, the key one being __getitem__. In "Python Advanced: implementing slicing of Custom objects", I introduced this magic method and used it to implement slicing for custom objects.

So the question is: why doesn't the iterator inherit this property?

First, an iterator is traversed consumptively, which makes it full of uncertainty: its length and its index-to-item mapping shrink dynamically as it is consumed, so it is hard to get at a particular item, and the __getitem__ attribute no longer serves a purpose. Second, forcing this attribute onto an iterator would not be reasonable. As the saying goes, a melon picked by force is not sweet.

So a new question arises: why use iterators when such an important attribute (and other attributes not identified here) is lost?

The answer is that iterators have powerful and useful features that nothing else can replace, which is why Python is designed this way. For reasons of space I will not expand on this here and will return to the topic in a later article.

There is one more question: can an iterator have this attribute after all, that is, can an iterator be made to support slicing?

hi = "Welcome to the official account: Python cat"
it = iter(hi)

# Normal slicing
hi[-10:]  # 'Python cat'

# Counterexample: slicing an iterator
it[-10:]  # TypeError: 'str_iterator' object is not subscriptable

Because an iterator lacks __getitem__, it cannot use normal slicing syntax. If you want to slice one, there are only two options: build your own wheel and write the implementation logic yourself, or find a ready-made wheel.
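As a rough idea of what the first option might look like, here is a deliberately naive sketch. The helper name simple_islice is hypothetical, and the function only handles a non-negative start and stop with a step of 1.

```python
def simple_islice(iterable, start, stop):
    """Yield items whose position is in [start, stop); hypothetical helper."""
    if start < 0 or stop < 0:
        raise ValueError("only non-negative positions are supported")
    for index, item in enumerate(iterable):
        if index >= stop:
            break
        if index >= start:
            yield item


it = iter("123456789")
print(list(simple_islice(it, 2, 6)))  # ['3', '4', '5', '6']
```

Note one weakness of this version: it pulls one item past stop from the underlying iterator before it breaks, so that item is lost to later consumers.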

Python's itertools module is the wheel we are looking for, and we can easily implement iterator slicing with the methods it provides.

import itertools

# Example 1: a simple iterator
s = iter("123456789")
for x in itertools.islice(s, 2, 6):
    print(x, end=" ")  # Output: 3 4 5 6
for x in itertools.islice(s, 2, 6):
    print(x, end=" ")  # Output: 9

# Example 2: a Fibonacci sequence iterator
class Fib():
    def __init__(self):
        self.a, self.b = 1, 1

    def __iter__(self):
        while True:
            yield self.a
            self.a, self.b = self.b, self.a + self.b

f = iter(Fib())
for x in itertools.islice(f, 2, 6):
    print(x, end=" ")  # Output: 2 3 5 8
for x in itertools.islice(f, 2, 6):
    print(x, end=" ")  # Output: 34 55 89 144

The islice() function of the itertools module combines iterators and slicing neatly, and finally answers the earlier question. However, compared with ordinary slices, iterator slices have many limitations. First, islice() is not a "pure function" (a pure function follows the principle "the same input gives the same output", as mentioned earlier in "Advice from Kenneth Reitz: avoid unnecessary object-oriented programming"). Second, it only supports slicing in the forward direction and does not accept negative indexes, a consequence of the consumptive nature of iterators.

So I cannot help asking: what logic does the slicing function of the itertools module use? The following is the roughly equivalent Python code given in the official documentation:
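The negative-index limitation is easy to observe. The sketch below also shows one possible workaround for taking the last few items, using collections.deque with maxlen; that workaround is a suggestion added here, not something the original article proposes, and it consumes the entire iterator.

```python
from collections import deque
from itertools import islice

letters = iter("abcdefg")

try:
    islice(letters, -3, None)
    negative_ok = True
except ValueError:
    negative_ok = False  # islice() rejects negative start/stop/step

# Workaround: keep only the last 3 items while consuming the iterator.
tail = list(deque(letters, maxlen=3))
print(negative_ok, tail)  # False ['e', 'f', 'g']
```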

import sys

def islice(iterable, *args):
    # islice('ABCDEFG', 2) --> A B
    # islice('ABCDEFG', 2, 4) --> C D
    # islice('ABCDEFG', 2, None) --> C D E F G
    # islice('ABCDEFG', 0, None, 2) --> A C E G
    s = slice(*args)
    # The index range is [0, sys.maxsize]; the default step is 1
    start, stop, step = s.start or 0, s.stop or sys.maxsize, s.step or 1
    it = iter(range(start, stop, step))
    try:
        nexti = next(it)
    except StopIteration:
        # Consume *iterable* up to the *start* position.
        for i, element in zip(range(start), iterable):
            pass
        return
    try:
        for i, element in enumerate(iterable):
            if i == nexti:
                yield element
                nexti = next(it)
    except StopIteration:
        # Consume to *stop*.
        for i, element in zip(range(i + 1, stop), iterable):
            pass

The islice() function can only index in one direction, but in return it makes it possible to slice an infinite iterator (within what the system supports). This is the most imaginative use case for iterator slicing.
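A short sketch of slicing an infinite iterator, using itertools.count as the endless source (the choice of count and of the even-number sequence is only an example):

```python
from itertools import count, islice

evens = count(start=0, step=2)  # 0, 2, 4, ... never ends

window = list(islice(evens, 5, 10))
print(window)  # [10, 12, 14, 16, 18]

next_value = next(evens)  # iteration resumes right after the slice
```

Ordinary slicing could never do this, because it would first need the whole sequence in memory.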

In addition, iterator slicing has a real application scenario: reading data in a given range of rows in a file object.

In "Reading and Writing Guide for Python Learners (including basic and advanced, suggested collection)", I introduced several ways to read content from a file: readline() is of marginal value and rarely used; read() suits cases where there is little content to read or where all of it must be processed at once; readlines() is used more often, because a for loop over its result handles the data line by line. Note, however, that readlines() loads every line into a list first; it is iterating over the file object itself that reads a little at a time and keeps memory pressure low.

Line-by-line iteration has its advantages, but it runs from the first line to the last. If the file has thousands of lines and we only want a few specific ones (for example, lines 1000-1009), processing every line is wasteful. Since file objects are themselves iterators, we can use an iterator slice to pick out the range first and then process only those lines, which is much more efficient.

# Contents of test.txt
'''
cat
Python cat
python is a cat.
this is the end.
'''

from itertools import islice

with open('test.txt', 'r', encoding='utf-8') as f:
    print(hasattr(f, "__next__"))  # check whether f is an iterator
    content = islice(f, 2, 4)
    for line in content:
        print(line.strip())

# Output:
# True
# python is a cat.
# this is the end.
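The same idea can be wrapped in a small reusable function for the "lines 1000-1009" scenario. This is only a sketch: read_lines is a hypothetical helper, positions are 0-based, and the temporary 2000-line file exists purely so the example can run on its own.

```python
import os
import tempfile
from itertools import islice


def read_lines(path, start, stop):
    """Return lines [start, stop) of a text file (0-based); hypothetical helper."""
    with open(path, "r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in islice(f, start, stop)]


# Build a throwaway 2000-line file to try it on.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False,
                                 encoding="utf-8") as tmp:
    tmp.writelines(f"line {n}\n" for n in range(2000))

selected = read_lines(tmp.name, 1000, 1010)
os.remove(tmp.name)
print(selected[0], selected[-1])  # line 1000 line 1009
```

islice() still has to read past the first 1000 lines, but it never stores them, and it stops reading as soon as the requested range has been delivered.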

Thank you for reading. That concludes "Analysis of Python iterators and iterator slices". After working through this article you should have a deeper understanding of Python iterators and iterator slices; how best to use them still needs to be verified in practice.
