2025-01-29 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/02 Report--
This article shows how to implement remote module import on top of Python's import mechanism. It is concise and easy to follow, and I hope the detailed walkthrough gives you something to take away.
Module import means using the code of one module from within another, which makes code reuse possible.
When you see the title, you may wonder why I would post such a basic article. (Of course, there will be basic articles too.)
Quite the opposite: I consider this article advanced Python material. It discusses Python import hooks in depth and explains them with a real case.
To keep the article systematic and complete, a short opening part covers the basics. Please read on patiently, because the later sections are the core of the article.
1. Import system basics
1.1 What can be imported
Several kinds of unit can be imported: modules, packages, variables, and so on.
For beginners, it is worth spelling out how these basic concepts differ.
Module: a file such as *.py, *.pyc, *.pyd, *.so, or *.dll. It is the smallest unit that carries Python code.
Package: packages fall into two categories:
Regular packages: a folder that contains an __init__.py file and may contain other subpackages or modules
Namespace packages
Namespace packages are unfamiliar to many people, so here is an excerpt from the official documentation.
A namespace package is a composite of various portions, where each portion contributes a subpackage to the parent package. Portions may reside in different locations on the file system. Portions may also be found in zip files, on the network, or anywhere else that Python searches during import. Namespace packages may or may not correspond directly to objects on the file system; they may be virtual modules that have no concrete representation.
Namespace packages do not use an ordinary list for their __path__ attribute. They instead use a custom iterable type which automatically performs a new search for package portions on the next import attempt within that package if the path of their parent package (or sys.path for a top-level package) changes.
With namespace packages, there is no parent/__init__.py file. In fact, there may be multiple parent directories found during import search, where each one is provided by a different portion. Thus parent/one may not be physically located next to parent/two. In this case, Python will create a namespace package for the top-level parent package whenever it or one of its subpackages is imported.
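The behavior described in the excerpt can be tried out with a small sketch. The package name nsdemo_parent and the temporary directories are made up for illustration; the only thing that matters is that two unrelated directories each contain a folder of the same name without an __init__.py.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two unrelated directories each contribute one portion of "nsdemo_parent".
# No __init__.py is created anywhere, so Python treats it as a namespace package.
root_a = Path(tempfile.mkdtemp())
root_b = Path(tempfile.mkdtemp())
(root_a / "nsdemo_parent").mkdir()
(root_b / "nsdemo_parent").mkdir()
(root_a / "nsdemo_parent" / "one.py").write_text("value = 1\n")
(root_b / "nsdemo_parent" / "two.py").write_text("value = 2\n")

sys.path[:0] = [str(root_a), str(root_b)]
importlib.invalidate_caches()

one = importlib.import_module("nsdemo_parent.one")
two = importlib.import_module("nsdemo_parent.two")
nsdemo_parent = sys.modules["nsdemo_parent"]

# __path__ is a custom iterable, not a plain list; it lists both portions.
portions = list(nsdemo_parent.__path__)
print(one.value, two.value)
print(portions)
```

Both submodules import successfully even though they live in different directories, and the package's __path__ contains one entry per portion.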
1.2 Relative and absolute imports
When importing modules or packages, Python provides two styles of import:
Absolute import: import foo.bar or from foo import bar
Relative import: from . import B or from ..A import B, where . stands for the current package and .. for its parent package
You can choose whichever suits your needs, but note that Python 2 treated a bare import inside a package as an implicit relative import by default (from __future__ import absolute_import, available since Python 2.5, opted out of that behavior). In Python 3, absolute import is the default and relative imports must be written explicitly with leading dots.
Each style has its own advantages and disadvantages:
Relative imports are convenient when you develop and maintain your own project, because the package name is not hard-coded in every import.
Absolute imports make the import structure of a module clearer and avoid import errors caused by name clashes between packages.
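To make the two styles concrete, here is a minimal sketch that builds a throwaway package on disk and imports the same function both ways. The package name impdemo_pkg and its module names are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

root = Path(tempfile.mkdtemp())
pkg = root / "impdemo_pkg"
pkg.mkdir()
(pkg / "__init__.py").write_text("")
(pkg / "helpers.py").write_text("def greet():\n    return 'hi'\n")
# Absolute import: the path is spelled out from the top-level package.
(pkg / "uses_absolute.py").write_text("from impdemo_pkg.helpers import greet\n")
# Relative import: the leading dot means "the package this module lives in".
(pkg / "uses_relative.py").write_text("from .helpers import greet\n")

sys.path.insert(0, str(root))
importlib.invalidate_caches()

absolute_mod = importlib.import_module("impdemo_pkg.uses_absolute")
relative_mod = importlib.import_module("impdemo_pkg.uses_relative")

# Both styles resolve to the very same function object.
same_function = absolute_mod.greet is relative_mod.greet
print(same_function)
```

If the package were renamed, only uses_absolute.py would need editing, which is the hard-coding trade-off mentioned above.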
1.3 Import conventions
PEP 8 has requirements on the order of imports: imports from different sources should be clearly separated from each other by a blank line.
Import statements should be on separate lines
# bad
import os, sys

# good
import os
import sys
Import statements should use absolute imports
# bad
from ..bar import Bar

# good
from foo.bar import test
Import statements should be placed at the top of the file, after the module comments and docstring, and before module globals
Import statements should be grouped in this order, with a blank line between groups: standard library modules, third-party modules, then your own local modules. Within each group, sort alphabetically.
# standard library modules
import os
import sys

# third-party modules
import flask

# local modules
from foo import bar
This was also covered in an earlier Python tutorial.
2. Making good use of __import__
Importing modules and packages with the import keyword is about as basic as Python gets.
But it is not the only way. There are also importlib.import_module() and __import__(), among others.
__import__ may be unfamiliar to the average developer.
Unlike import, __import__ is a function. That makes it more flexible, and it is often used in frameworks to load plug-ins dynamically.
In fact, the import statement itself calls __import__ internally. The following two imports are equivalent.
# using import
import os

# using __import__
os = __import__('os')
By the same reasoning, the following two are also equivalent.
# using import ... as ...
import pandas as pd

# using __import__
pd = __import__('pandas')
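One detail worth knowing when choosing between the two functions: with a dotted name, __import__ returns the top-level package, while importlib.import_module returns the submodule itself. A minimal sketch using only the standard library:

```python
import importlib
import os

# __import__ with a dotted name returns the top-level package ...
top = __import__("os.path")
# ... while importlib.import_module returns the submodule that was named.
sub = importlib.import_module("os.path")

print(top is os)
print(sub is os.path)
```

For this reason importlib.import_module is usually the more convenient choice when module names come from configuration.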
As mentioned above, __import__ is often used to load plug-ins dynamically, which the import statement cannot do because it needs the module name written out in the source.
Plug-ins usually live in a specific folder. At run time you may not use all of them, and new ones may be added later.
Hard-coding them with the import keyword is inelegant, because the code has to change whenever a plug-in is added or replaced. It is better to list the plug-ins in a configuration file and let the code read that configuration and import the requested plug-ins dynamically, which is more flexible and less error-prone.
Suppose one of my projects has four plug-ins: plugin01, plugin02, plugin03, and plugin04, each implementing a core function run(). Sometimes I only want plugin02 and plugin04, so I list just those two in the configuration file.
# my.conf
custom_plugins = ['plugin02', 'plugin04']
So how do I load and run them dynamically?
# main.py
import sys

# conf is the object holding the settings read from my.conf
for plugin in conf.custom_plugins:
    __import__(plugin)
    sys.modules[plugin].run()
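The same idea can be written as a self-contained sketch. It assumes the plug-in list has already been read from configuration, uses importlib.import_module instead of __import__, and creates two throwaway plug-in files so that it can run anywhere; the helper name load_plugins and the temporary directory are illustrative only.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def load_plugins(names):
    """Import each named plug-in and return the module objects in order."""
    return [importlib.import_module(name) for name in names]


# Demo setup: write two tiny plug-ins, each exposing run(), into a temp folder.
plugin_dir = Path(tempfile.mkdtemp())
for name in ("plugin02", "plugin04"):
    (plugin_dir / f"{name}.py").write_text(
        f"def run():\n    return '{name} ran'\n"
    )
sys.path.insert(0, str(plugin_dir))
importlib.invalidate_caches()

# In a real project this list would come from the configuration file.
custom_plugins = ["plugin02", "plugin04"]
results = [plugin.run() for plugin in load_plugins(custom_plugins)]
print(results)
```

Adding or removing a plug-in now only requires changing the list, not the loading code.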
3. Understanding the module cache
Importing the same module repeatedly within one module does not actually import it twice. The import statement first checks whether the module is already present in sys.modules. If it is, the cached module is reused; if not, Python finds and loads it.
Let's try it. In my_mod02, I import my_mod01 twice. Naively, each import should run the code in my_mod01 (that is, print in mod01), but in practice it is printed only once.
$ cat my_mod01.py
print('in mod01')
$ cat my_mod02.py
import my_mod01
import my_mod01
$ python my_mod02.py
in mod01
The explanation is sys.modules.
sys.modules is a dictionary (key: module name, value: module object) that holds every module that has been imported in the current interpreter.
# test_module.py
import sys

print(sys.modules.get('json', 'NotFound'))

import json
print(sys.modules.get('json', 'NotFound'))
The output is shown below: once json has been imported, sys.modules contains the json module object.
$ python test_module.py
NotFound
<module 'json' from '...'>
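The cache can also be observed directly by comparing object identity. This short sketch uses json only because it ships with Python; any module would do.

```python
import sys

import json  # first import: loads the module and stores it in sys.modules

cached = sys.modules["json"]

import json as json_again  # second import: served straight from the cache

# Both names point at the single module object held in sys.modules.
same_object = json_again is cached
print(same_object)
```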
Because of this cache, a plain import statement cannot reload a module.
If you do want to reload, the importlib library can do it. There are real scenarios for this: when debugging, after finding and fixing a bug you normally have to restart the service and load the program again. With module reloading, you can keep debugging after changing the code without restarting the service.
Continuing the example above, my_mod02.py is rewritten as follows
# my_mod02.py
import importlib
import my_mod01
importlib.reload(my_mod01)
Run this module with python3. Unlike before, my_mod01.py is now executed twice
$ python3 my_mod02.py
in mod01
in mod01
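The debugging scenario, where a module's source changes while the program keeps running, can be sketched as follows. The module name demo_reload_mod and the temporary directory are assumptions made for the demonstration. The two versions of the file deliberately differ in length so that any cached bytecode is recognized as stale even if both writes happen within the same second.

```python
import importlib
import sys
import tempfile
from pathlib import Path

module_dir = Path(tempfile.mkdtemp())
module_file = module_dir / "demo_reload_mod.py"
module_file.write_text("name = 'python'\n")
sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()

demo_reload_mod = importlib.import_module("demo_reload_mod")
before = demo_reload_mod.name

# Simulate editing the source while the program is still running.
module_file.write_text("name = 'ming'\n")

# A second import would hit the sys.modules cache; reload re-executes the file.
importlib.reload(demo_reload_mod)
after = demo_reload_mod.name
print(before, after)
```

Note that reload updates the existing module object in place, so other code holding a reference to the module sees the new values too.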
4. Finders and loaders
If the named module cannot be found in sys.modules, Python's import protocol is invoked to find and load it.
The protocol consists of two conceptual objects: finders and loaders.
Importing a Python module can therefore be split into two steps:
Finding the module, done by a finder
Loading the module, done by a loader
4.1 What is a finder?
Simply put, a finder defines a lookup strategy that tells the program how to locate a module.
Python ships with several default finders, which live in sys.meta_path.
These finders matter little to most users, so before Python 3.3 the interpreter hid them. They were called implicit finders.
# Python 2.7
>>> import sys
>>> sys.meta_path
[]
>>>
Because this made it harder for developers to understand the import mechanism, since Python 3.3 the whole import machinery is exposed through sys.meta_path and no implicit machinery remains.
# Python 3.7
>>> import sys
>>> sys.meta_path
[<class '_frozen_importlib.BuiltinImporter'>, <class '_frozen_importlib.FrozenImporter'>, <class '_frozen_importlib_external.PathFinder'>]
>>>
Looking at Python's default finders, there are three kinds:
One that knows how to import built-in modules
One that knows how to import frozen modules
One that knows how to import modules from an import path (the path based finder)
Can we define a finder ourselves? Of course. You only need to:
Define a class that implements the find_module method (Python 2 and legacy Python 3) or the find_spec method (Python 3.4+). If the module is found, return a loader object (from find_module) or a ModuleSpec object (from find_spec, described later); if it is not found, return None. Note that Python 3.12 removed support for find_module, so new code should implement find_spec.
Register the finder by inserting it at the front of sys.meta_path so that it is consulted first.
import sys

class MyFinder(object):
    @classmethod
    def find_module(cls, name, path, target=None):
        print("Importing", name, path, target)
        # MyLoader will be defined later
        return MyLoader()

# Finders are consulted in order, so insert this one at the front
sys.meta_path.insert(0, MyFinder)
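For current Python versions, a finder of the same kind can be sketched with find_spec. The class name LoggingFinder is hypothetical. It only records which names pass through it and returns None, so the default finders still do the actual work. colorsys is used as the test module simply because it is small and part of the standard library.

```python
import importlib.abc
import sys


class LoggingFinder(importlib.abc.MetaPathFinder):
    """Records every module name the import system asks about."""

    def __init__(self):
        self.seen = []

    def find_spec(self, fullname, path, target=None):
        self.seen.append(fullname)
        return None  # "not found here": defer to the next finder


finder = LoggingFinder()
sys.meta_path.insert(0, finder)  # first position, so it is consulted first
try:
    sys.modules.pop("colorsys", None)  # make sure the import is not cached
    import colorsys
finally:
    sys.meta_path.remove(finder)  # always undo the registration

print(finder.seen)
```

Returning None is what lets the import fall through to the built-in, frozen, and path based finders.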
Finders come in two types:
object
 +-- Finder (deprecated)
      +-- MetaPathFinder
      +-- PathEntryFinder
Note that before version 3.4 a finder returned the loader (Loader) object directly, while from 3.4 onward it returns a module spec (ModuleSpec), which contains the loader.
What loaders and module specs are is covered in the following sections.
4.2 What is a loader?
The finder is only responsible for locating the module; the object that actually loads it is the loader.
An ordinary loader must define a method named load_module().
Why "ordinary"? Because there are many kinds of loader:
object
 +-- Finder (deprecated)
 |    +-- MetaPathFinder
 |    +-- PathEntryFinder
 +-- Loader
      +-- ResourceLoader --------+
      +-- InspectLoader          |
           +-- ExecutionLoader --+
                                 +-- FileLoader
                                 +-- SourceLoader
The source code shows that different loaders have different abstract methods.
A loader is usually returned by a finder. See PEP 302 for details.
So how do we write our own loader?
You only need to:
Define a class that implements the load_module method
Check the import-related attributes
Create a module object and bind all import-related attributes to it
Store the module in sys.modules (the order matters, because it prevents recursive imports)
Then load the module (this is the core step)
If loading fails, raise ImportError; if it succeeds, return the module object.
A concrete example follows later in the article.
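The steps above describe the legacy load_module protocol. With the newer create_module/exec_module protocol, the import machinery itself creates the module object and registers it in sys.modules, so a loader only has to execute the code. The following is a minimal sketch of that newer style, not the article's own loader: the names DictLoader, DictFinder, SOURCES, and hello_mem are invented for illustration, and the "storage" is just an in-memory dictionary.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "repository": module name -> source text.
SOURCES = {"hello_mem": "greeting = 'hello from memory'\n"}


class DictLoader(importlib.abc.Loader):
    def create_module(self, spec):
        return None  # None means: use the default module creation

    def exec_module(self, module):
        # The core step: run the source inside the module's namespace.
        source = SOURCES[module.__name__]
        code = compile(source, f"<memory:{module.__name__}>", "exec")
        exec(code, module.__dict__)


class DictFinder(importlib.abc.MetaPathFinder):
    def find_spec(self, fullname, path, target=None):
        if fullname not in SOURCES:
            return None  # not ours: let other finders try
        return importlib.util.spec_from_loader(fullname, DictLoader())


sys.meta_path.insert(0, DictFinder())

import hello_mem

print(hello_mem.greeting)
```

Because the machinery adds the module to sys.modules before calling exec_module, recursive imports are handled without any extra code in the loader.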
4.3 The module spec
The import machinery uses a variety of information about each module during import, especially before loading. Most of the information is common to all modules. The purpose of a module's spec is to encapsulate this import-related information on a per-module basis.
The spec is exposed as the __spec__ attribute of the module object. See ModuleSpec for details on its contents.
Since Python 3.4, the finder no longer returns the loader; it returns a ModuleSpec object, which stores more information:
Module name
Loader
Absolute path of the module
So how do you view the ModuleSpec of a module?
Here is an example.
$ cat my_mod02.py
import my_mod01
print(my_mod01.__spec__)
$ python3 my_mod02.py
in mod01
ModuleSpec(name='my_mod01', loader=<_frozen_importlib_external.SourceFileLoader object at 0x...>, origin='/home/MING/my_mod01.py')
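A spec can also be obtained without writing a module of your own, through importlib.util.find_spec. This sketch inspects the standard library's json package; the exact origin path and loader class depend on the installation, so no output is shown.

```python
import importlib.util

# Ask the import system for the spec of a module by name.
spec = importlib.util.find_spec("json")

print(spec.name)                    # the module name
print(type(spec.loader).__name__)   # the loader that would load it
print(spec.origin)                  # where the module comes from
```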
The ModuleSpec contains the loader, so does that give us another way to reload a module?
Let's verify it.
There are two files.
The first is my_info.py
# my_info.py
name = 'python'
The second is main.py
# main.py
import my_info

print(my_info.name)

# add a breakpoint
import pdb; pdb.set_trace()

# load again
my_info.__spec__.loader.load_module()

print(my_info.name)
I added a breakpoint in main.py so that, while execution is paused there, I can change name in my_info.py to ming and check whether the reload takes effect.
$ python3 main.py
python
> /home/MING/main.py(9)<module>()
-> my_info.__spec__.loader.load_module()
(Pdb) c
ming
The reload works. Keep in mind that load_module() has been deprecated since Python 3.4, so importlib.reload() is the supported way to do this in new code.
4.4 What is an importer?
You may come across the term importer in other articles, but it is nothing new.
An importer is simply an object that implements both the finder and the loader interfaces, so an importer is at once a finder and a loader.
5. Importing modules remotely
Python's default finders and loaders only support importing local modules; they cannot import modules from a remote location.
To give you a better understanding of the Python import hook mechanism, I will show by example how to implement an importer for remote modules.
5.1 Writing the importer
When importing a package, the Python interpreter first gets the list of finders from sys.meta_path.
The default order is: built-in module finder -> frozen module finder -> path based finder (local sys.path, which is where third-party modules are found)
If none of these three finders locates the module, an ImportError is raised.
So there are two ways to implement remote import.
One is to implement your own meta path importer.
The other is to write a hook and add it to sys.path_hooks so that it recognizes a particular directory naming pattern.
I use the first approach here.
An importer needs a finder and a loader.
First, the finder.
According to the source code, there are two kinds of finder:
MetaPathFinder
PathEntryFinder
MetaPathFinder is used here.
Before Python 3.4, finders had to implement find_module(). Python 3.4+ recommends find_spec() instead. That did not make find_module() unusable at the time: the import protocol still fell back to find_module() when find_spec() was missing. That fallback was removed in Python 3.12.
Here is how to write the finder with find_module().
from importlib import abc

class UrlMetaFinder(abc.MetaPathFinder):
    def __init__(self, baseurl):
        self._baseurl = baseurl

    def find_module(self, fullname, path=None):
        if path is None:
            baseurl = self._baseurl
        else:
            # not under the originally configured url: give up
            if not path.startswith(self._baseurl):
                return None
            baseurl = path

        try:
            loader = UrlMetaLoader(baseurl)
            # old pattern only: loader.load_module(fullname)
            return loader
        except Exception:
            return None
If you use find_spec(), note that the method is called with two or three arguments.
The first is the fully qualified name of the module being imported, for example foo.bar.baz. The second is the path entries to use for the module search: for top-level modules it is None, and for submodules or subpackages it is the value of the parent package's __path__ attribute. If the appropriate __path__ attribute cannot be accessed, a ModuleNotFoundError is raised. The third argument is an existing module object that will be the target of loading later. The import system passes in a target module only during reload.
from importlib import abc
from importlib.machinery import ModuleSpec

class UrlMetaFinder(abc.MetaPathFinder):
    def __init__(self, baseurl):
        self._baseurl = baseurl

    def find_spec(self, fullname, path=None, target=None):
        if path is None:
            baseurl = self._baseurl
        else:
            # not under the originally configured url: give up
            if not path.startswith(self._baseurl):
                return None
            baseurl = path

        try:
            loader = UrlMetaLoader(baseurl)
            return ModuleSpec(fullname, loader, is_package=loader.is_package(fullname))
        except Exception:
            return None
Next is the loader.
According to the source code, there are two kinds of loader to choose from:
FileLoader
SourceLoader
In theory either one can do what we want; I use SourceLoader here.
The abstract class SourceLoader has several important methods to keep in mind when writing a loader:
get_code: gets the source code; implement it to suit your own scenario.
exec_module: executes the code and binds the resulting names in module.__dict__
get_data: abstract method that must be implemented; returns the bytes stored at the given path.
get_filename: abstract method that must be implemented; returns the file name
Older blog posts often implement load_module() in the loader. That method has been deprecated since Python 3.4, although you can still provide it for compatibility.
import sys
import types
import urllib.request as urllib2
from importlib import abc

class UrlMetaLoader(abc.SourceLoader):
    def __init__(self, baseurl):
        self.baseurl = baseurl

    def get_code(self, fullname):
        f = urllib2.urlopen(self.get_filename(fullname))
        return f.read()

    def load_module(self, fullname):
        code = self.get_code(fullname)
        # types.ModuleType replaces imp.new_module, which is gone in Python 3.12
        mod = sys.modules.setdefault(fullname, types.ModuleType(fullname))
        mod.__file__ = self.get_filename(fullname)
        mod.__loader__ = self
        mod.__package__ = fullname
        exec(code, mod.__dict__)
        return mod

    def get_data(self, path):
        pass

    def exec_module(self, module):
        pass

    def get_filename(self, fullname):
        return self.baseurl + fullname + '.py'
When you use this old pattern to implement your own loader, two points need attention:
exec_module must be overridden with an empty body, even though it is not an abstract method.
load_module has to be called manually in the finder for the module to be loaded.
In new code you should rely on exec_module() and create_module() instead. Both are already implemented in the base class and fit our scenario, so there is no need to write them again. Compared with the old pattern, there is also no need to call load_module() manually in the finder.
import importlib.abc
import urllib.request as urllib2

class UrlMetaLoader(importlib.abc.SourceLoader):
    def __init__(self, baseurl):
        self.baseurl = baseurl

    def get_code(self, fullname):
        f = urllib2.urlopen(self.get_filename(fullname))
        return f.read()

    def get_data(self, path):
        pass

    def get_filename(self, fullname):
        return self.baseurl + fullname + '.py'
With both the finder and the loader in place, don't forget to register the custom finder (UrlMetaFinder) in sys.meta_path.
def install_meta(address):
    finder = UrlMetaFinder(address)
    sys.meta_path.append(finder)
With all the pieces explained, we put them together into one module (my_importer.py)
# my_importer.py
import sys
import importlib.abc
import urllib.request as urllib2
from importlib.machinery import ModuleSpec

class UrlMetaFinder(importlib.abc.MetaPathFinder):
    def __init__(self, baseurl):
        self._baseurl = baseurl

    # find_spec is used here because Python 3.12+ no longer calls find_module
    def find_spec(self, fullname, path=None, target=None):
        if path is None:
            baseurl = self._baseurl
        else:
            # not under the originally configured url: give up
            if not path.startswith(self._baseurl):
                return None
            baseurl = path

        try:
            loader = UrlMetaLoader(baseurl)
            return ModuleSpec(fullname, loader, is_package=loader.is_package(fullname))
        except Exception:
            return None

class UrlMetaLoader(importlib.abc.SourceLoader):
    def __init__(self, baseurl):
        self.baseurl = baseurl

    def get_code(self, fullname):
        f = urllib2.urlopen(self.get_filename(fullname))
        return f.read()

    def get_data(self, path):
        pass

    def get_filename(self, fullname):
        return self.baseurl + fullname + '.py'

def install_meta(address):
    finder = UrlMetaFinder(address)
    sys.meta_path.append(finder)
5.2 Setting up a remote server
At the start I said we would implement a way to import modules remotely.
So I also need a remote server to host my modules. For convenience, the http.server module that ships with Python can do this with a single command.
$ mkdir httpserver && cd httpserver
$ cat > my_info.py << EOF
name = 'Python programming time'
print('ok')
EOF
$ python3 -m http.server 12800
With the server running, the import can be verified from a Python session:
>>> from my_importer import install_meta
>>> install_meta('http://localhost:12800/')  # register the finder in sys.meta_path
>>> import my_info  # prints ok, so the import succeeded
ok
>>> my_info.name  # verify that the variable is accessible
'Python programming time'
At this point, I have implemented a simple importer that can import modules on a remote server.
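For readers who want to try the whole flow in one file, here is a self-contained sketch under a few stated assumptions. It is a variation on the importer above rather than the same code: the class names UrlFinder and UrlLoader and the module name remote_demo_mod are invented, the source is fetched inside the finder so that a missing remote file falls through to the normal ImportError, only top-level modules are handled, and the "remote" server is a local http.server started in a background thread on a free port.

```python
import functools
import http.server
import importlib.abc
import importlib.util
import sys
import tempfile
import threading
import urllib.error
import urllib.request
from pathlib import Path


class UrlLoader(importlib.abc.Loader):
    """Executes source bytes that the finder has already downloaded."""

    def __init__(self, url, source):
        self.url = url
        self.source = source

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        module.__file__ = self.url
        exec(compile(self.source, self.url, "exec"), module.__dict__)


class UrlFinder(importlib.abc.MetaPathFinder):
    """Looks for <baseurl><name>.py; top-level modules only."""

    def __init__(self, baseurl):
        self.baseurl = baseurl

    def find_spec(self, fullname, path, target=None):
        if path is not None:  # submodules are out of scope for this sketch
            return None
        url = f"{self.baseurl}{fullname}.py"
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                source = response.read()
        except (urllib.error.URLError, OSError):
            return None  # not available remotely: let the import fail normally
        loader = UrlLoader(url, source)
        return importlib.util.spec_from_loader(fullname, loader, origin=url)


# Demo: serve a temporary directory locally, then import a module from it.
served_dir = Path(tempfile.mkdtemp())
(served_dir / "remote_demo_mod.py").write_text("name = 'served over http'\n")

handler = functools.partial(
    http.server.SimpleHTTPRequestHandler, directory=str(served_dir)
)
server = http.server.ThreadingHTTPServer(("127.0.0.1", 0), handler)
threading.Thread(target=server.serve_forever, daemon=True).start()

finder = UrlFinder(f"http://127.0.0.1:{server.server_address[1]}/")
sys.meta_path.append(finder)  # last position: local modules still win
try:
    import remote_demo_mod
    remote_name = remote_demo_mod.name
finally:
    sys.meta_path.remove(finder)
    server.shutdown()
    server.server_close()

print(remote_name)
```

Appending the finder to the end of sys.meta_path means it is only consulted after the default finders have failed. Executing code fetched over the network is inherently risky, so anything beyond a local experiment should at least restrict and authenticate the source.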
That is how to implement remote module import with Python's import mechanism.