2025-01-18 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
Today I would like to share how to read configuration files in Python through a secondary encapsulation (a wrapper) of ConfigParser. The content is detailed and the logic is clear. Many people are probably not very familiar with this topic, so this article is offered for reference. I hope you get something out of it.
Reading a configuration file in Python: a secondary encapsulation of ConfigParser
Straight to the code.
test.conf

    [database]
    connect = mysql
    sleep = no
    test = yes

config.py
    # -*- coding: utf-8 -*-
    # Written for Python 3 (in Python 2 the module is named ConfigParser).
    __author__ = 'guoqianqian'

    import os
    import configparser

    current_dir = os.path.abspath(os.path.dirname(__file__))


    class OperationalError(Exception):
        """Operation error."""


    class Dictionary(dict):
        """Custom dict whose keys can also be read and written as attributes."""

        def __getattr__(self, key):
            return self.get(key, None)

        __setattr__ = dict.__setitem__
        __delattr__ = dict.__delitem__


    class Config:
        def __init__(self, file_name="test", cfg=None):
            """
            @param file_name: file name without extension.
            @param cfg: configuration file path.
            """
            # Environment variables starting with TEST_ become parser defaults.
            env = {}
            for key, value in os.environ.items():
                if key.startswith("TEST_"):
                    env[key] = value

            config = configparser.ConfigParser(env)

            if cfg:
                config.read(cfg)
            else:
                config.read(os.path.join(current_dir, "conf", "%s.conf" % file_name))

            for section in config.sections():
                setattr(self, section, Dictionary())
                for name, raw_value in config.items(section):
                    try:
                        # Ugly fix to avoid '0' and '1' being parsed as a
                        # boolean value: raise an exception so that they
                        # are parsed as integers instead.
                        if config.get(section, name) in ["0", "1"]:
                            raise ValueError

                        value = config.getboolean(section, name)
                    except ValueError:
                        try:
                            value = config.getint(section, name)
                        except ValueError:
                            value = config.get(section, name)

                    setattr(getattr(self, section), name, value)

        def get(self, section):
            """Get a section.
            @param section: section to fetch.
            @return: the section's options as a Dictionary.
            """
            try:
                return getattr(self, section)
            except AttributeError as e:
                raise OperationalError("Option %s is not found in "
                                       "configuration, error: %s" % (section, e))


    if __name__ == "__main__":
        conf = Config()
        print(conf.get("database").connect)
        print(conf.get("database").sleep)
        print(conf.get("database").test)
Execution result
mysql
False
True
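The conversion order used in Config.__init__ (boolean first, then integer, then plain string, with "0" and "1" deliberately kept as integers) can be tried in isolation. The following is a minimal sketch for Python 3; the helper name coerce and the extra options port and debug are illustrative and not part of the original code.

```python
import configparser

SAMPLE = """
[database]
connect = mysql
sleep = no
test = yes
port = 3306
debug = 1
"""


def coerce(config, section, name):
    """Return the option as bool, int or str, in that order of preference."""
    raw = config.get(section, name)
    # "0" and "1" are valid booleans for getboolean(), so skip the boolean
    # step for them and let them fall through to the integer conversion.
    if raw not in ("0", "1"):
        try:
            return config.getboolean(section, name)
        except ValueError:
            pass
    try:
        return config.getint(section, name)
    except ValueError:
        return raw


config = configparser.ConfigParser()
config.read_string(SAMPLE)
for name in config.options("database"):
    print(name, repr(coerce(config, "database", name)))
```

Values such as yes, no, on, off, true and false come back as booleans, digit strings as integers, and everything else unchanged.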
Directory structure

    Demo/
        conf/
            test.conf
        config.py

Reading a configuration file and a simple encapsulation
I previously wrote crawler data to a database as an exercise. This time I want to move the database settings into an ini configuration file. The advantage is that several databases can be listed in the configuration file, which makes switching between them easy (the file can also hold other settings, such as mailbox details and URLs).
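1. The configparser module
Python reads configuration files with its built-in configparser module. The file format is similar to ini files on Windows.
In Python 3 the module is part of the standard library, so nothing needs to be installed. (In Python 2 the module is named ConfigParser; a backport named configparser can be installed with pip.)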
2. Basic usage of configparser for reading files
(1) Create a new config.ini file. It contains a [Mysql-Database] section with the options host, user, password, db and charset, which the code below reads.
(2) Create a new readconfig.py file that reads the configuration file.
    import configparser

    cf = configparser.ConfigParser()
    # Read the configuration file. With an absolute path the os module is not needed.
    cf.read(r"E:\Crawler\config.ini")

    # Get all sections in the file, returned as a list. One file can hold several
    # groups of settings (database, mailbox, ...), each introduced by [section].
    secs = cf.sections()
    print(secs)

    # Get the keys of the section named Mysql-Database
    options = cf.options("Mysql-Database")
    print(options)

    # Get all key/value pairs of the section named Mysql-Database
    items = cf.items("Mysql-Database")
    print(items)

    # Get the value of host in [Mysql-Database]
    host = cf.get("Mysql-Database", "host")
    print(host)
Running the code prints the list of sections, the option names, the key/value pairs and the host value, which can be compared with config.ini.
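Since the contents of config.ini are not reproduced here, the following self-contained sketch shows the same calls against a sample file. All values (localhost, root and so on) and the second section Email are placeholders chosen for illustration; only the section name Mysql-Database and its option names come from the code in this article.

```python
import configparser
import os
import tempfile

SAMPLE_INI = """\
[Mysql-Database]
host = localhost
user = root
password = secret
db = test
charset = utf8

[Email]
smtp_server = smtp.example.com
"""

# Write the sample to a temporary file so cf.read() has a real path to open.
tmp_dir = tempfile.mkdtemp()
ini_path = os.path.join(tmp_dir, "config.ini")
with open(ini_path, "w", encoding="utf-8") as fh:
    fh.write(SAMPLE_INI)

cf = configparser.ConfigParser()
loaded = cf.read(ini_path, encoding="utf-8")  # list of files parsed successfully

print(loaded)
print(cf.sections())                 # section names, in file order
print(cf.options("Mysql-Database"))  # option names of one section
print(cf.items("Mysql-Database"))    # (name, value) pairs of one section
print(cf.get("Mysql-Database", "host"))
```

Note that cf.read() silently skips files it cannot open and returns the list of files it did read, so checking that return value is a cheap way to catch a wrong path.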
3. Using the os module to read the configuration file by relative path
The project layout: config.ini is in the project root (E:\Crawler), and readconfig.py is one directory below it.
readconfig.py:
    import configparser
    import os

    # Directory one level above the directory containing this file,
    # that is, the project root E:\Crawler
    root_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))

    cf = configparser.ConfigParser()
    # Build the path to config.ini and read it
    cf.read(root_dir + "/config.ini")

    # Get all sections in the file, returned as a list
    secs = cf.sections()
    print(secs)

    # Get the keys of the section named Mysql-Database
    options = cf.options("Mysql-Database")
    print(options)

    # Get all key/value pairs of the section named Mysql-Database
    items = cf.items("Mysql-Database")
    print(items)

    # Get the value of host in [Mysql-Database]
    host = cf.get("Mysql-Database", "host")
    print(host)
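One caveat when locating config.ini by relative path: os.path.abspath('.') resolves against the current working directory of the process, whereas a path derived from the script's own location stays fixed no matter where the script is launched. The sketch below illustrates the difference. The helper names and the temporary project layout are made up for the demonstration.

```python
import os
import tempfile


def root_from_cwd():
    """Parent of the current working directory (changes with os.chdir)."""
    return os.path.dirname(os.path.abspath("."))


def root_from_file(file_path):
    """Parent of the directory containing file_path (independent of the cwd)."""
    return os.path.dirname(os.path.dirname(os.path.abspath(file_path)))


# Hypothetical layout: <base>/Crawler/common/readconfig.py
base = os.path.realpath(tempfile.mkdtemp())
project_root = os.path.join(base, "Crawler")
common_dir = os.path.join(project_root, "common")
os.makedirs(common_dir)
script = os.path.join(common_dir, "readconfig.py")

original_cwd = os.getcwd()
try:
    os.chdir(common_dir)
    print("from common/:", root_from_cwd() == project_root)
    os.chdir(base)
    print("from elsewhere:", root_from_cwd() == project_root)
    print("file based:", root_from_file(script) == project_root)
finally:
    os.chdir(original_cwd)
```

Inside a real script the call would be root_from_file(__file__).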
Alternatively, build the path with os.path.join():
    import configparser
    import os

    # Directory one level above the directory containing this file,
    # that is, the project root E:\Crawler
    root_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
    configpath = os.path.join(root_dir, "config.ini")

    cf = configparser.ConfigParser()
    cf.read(configpath)  # read the configuration file

    # Get all sections in the file, returned as a list
    secs = cf.sections()
    print(secs)

    # Get the keys of the section named Mysql-Database
    options = cf.options("Mysql-Database")
    print(options)

    # Get all key/value pairs of the section named Mysql-Database
    items = cf.items("Mysql-Database")
    print(items)

    # Get the value of host in [Mysql-Database]
    host = cf.get("Mysql-Database", "host")
    print(host)

4. Rewriting the earlier requests + regular expression crawler for Maoyan movies so that it reads the configuration file
Reading the configuration file (readconfig.py) and operating the database (handleDB.py) are each encapsulated in a class.
readconfig.py is as follows:
    import configparser
    import os


    class ReadConfig:
        """Read the configuration file."""

        def __init__(self, filepath=None):
            if filepath:
                configpath = filepath
            else:
                # Project root: one level above the directory containing this file
                root_dir = os.path.dirname(os.path.dirname(os.path.abspath(__file__)))
                configpath = os.path.join(root_dir, "config.ini")
            self.cf = configparser.ConfigParser()
            self.cf.read(configpath)

        def get_db(self, param):
            value = self.cf.get("Mysql-Database", param)
            return value


    if __name__ == '__main__':
        test = ReadConfig()
        t = test.get_db("host")
        print(t)
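The introduction mentions keeping several databases in one file to make switching easy. One possible way to support that, sketched below, is to pass the section name to the reader instead of hard-coding Mysql-Database. The class name SectionReader, the section Mysql-Test and all values are hypothetical; fallback is a standard keyword argument of ConfigParser.get().

```python
import configparser

SAMPLE_INI = """\
[Mysql-Database]
host = localhost
db = maoyan

[Mysql-Test]
host = 127.0.0.1
db = maoyan_test
"""


class SectionReader:
    """Read options from one named section of an already loaded parser."""

    def __init__(self, parser, section):
        if not parser.has_section(section):
            raise KeyError("no section named %r" % section)
        self.parser = parser
        self.section = section

    def get(self, option, default=None):
        # fallback is returned when the option is missing from the section.
        return self.parser.get(self.section, option, fallback=default)


parser = configparser.ConfigParser()
parser.read_string(SAMPLE_INI)

prod = SectionReader(parser, "Mysql-Database")
test = SectionReader(parser, "Mysql-Test")
print(prod.get("host"), test.get("host"), test.get("charset", "utf8"))
```

Switching databases then only means constructing the reader with a different section name.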
handleDB.py is as follows:
    # coding: utf-8
    # author: hmk

    from common.readconfig import ReadConfig
    import pymysql.cursors


    class HandleMysql:
        def __init__(self):
            self.data = ReadConfig()

        def conn_mysql(self):
            """Connect to the database."""
            host = self.data.get_db("host")
            user = self.data.get_db("user")
            password = self.data.get_db("password")
            db = self.data.get_db("db")
            charset = self.data.get_db("charset")
            self.conn = pymysql.connect(host=host, user=user, password=password,
                                        db=db, charset=charset)
            self.cur = self.conn.cursor()

        def execute_sql(self, sql, data):
            """Execute an SQL statement that modifies data."""
            self.conn_mysql()
            self.cur.execute(sql, data)
            self.conn.commit()

        def search(self, sql):
            """Execute a query."""
            self.conn_mysql()
            self.cur.execute(sql)
            return self.cur.fetchall()

        def close_mysql(self):
            """Close the database connection."""
            self.cur.close()
            self.conn.close()


    if __name__ == '__main__':
        test = HandleMysql()
        sql = "select * from maoyan_movie"
        for i in test.search(sql):
            print(i)
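conn_mysql() reads the five connection options one by one. As a small variation (not the author's code), the same values can be gathered into a dictionary in one step and unpacked into the connect call. The function name connection_params and the sample values are assumptions; the sketch stops short of opening a connection so that it runs without a MySQL server.

```python
import configparser

REQUIRED = ("host", "user", "password", "db", "charset")

SAMPLE_INI = """\
[Mysql-Database]
host = localhost
user = root
password = secret
db = maoyan
charset = utf8
"""


def connection_params(parser, section="Mysql-Database"):
    """Collect the connection options of a section into a dict."""
    missing = [name for name in REQUIRED if not parser.has_option(section, name)]
    if missing:
        raise KeyError("missing options in [%s]: %s" % (section, ", ".join(missing)))
    return {name: parser.get(section, name) for name in REQUIRED}


parser = configparser.ConfigParser()
parser.read_string(SAMPLE_INI)
params = connection_params(parser)
print(sorted(params))
# With pymysql installed, HandleMysql.conn_mysql() could then call:
#     self.conn = pymysql.connect(**params)
```

Checking for missing options up front gives a single clear error instead of a failure halfway through connecting.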
Finally, the main script, which calls the methods above:
    # coding: utf-8
    # author: hmk

    import re

    import requests

    from common import handleDB


    class Crawler:
        """Define a crawler."""

        def __init__(self):
            self.db = handleDB.HandleMysql()

        @staticmethod
        def get_html(url, header):
            response = requests.get(url=url, headers=header)
            if response.status_code == 200:
                return response.text
            else:
                return None

        @staticmethod
        def get_data(html, list_data):
            # The literal HTML around each capture group is missing here;
            # "…" marks each gap.
            pattern = re.compile(r'….*?…(\d+)….*?'      # match the movie ranking
                                 r'…(.*?)'               # match the movie name
                                 r'….*?…(.*?)…'          # match the release time
                                 r'.*?…(.*?)…'           # match the integer part of the score
                                 r'.*?…(.*?)….*?…', re.S)  # match the fractional part of the score
            m = pattern.findall(html)
            # findall returns a list in which each movie is a tuple,
            # so each set of movie information can be processed in turn.
            for i in m:
                ranking = i[0]          # ranking
                movie = i[1]            # name
                release_time = i[2]     # release time
                score = i[3] + i[4]     # score: integer part plus fractional part
                # Each movie becomes a small list appended to the large list,
                # so the large list ends up holding all movie information.
                list_data.append([ranking, movie, release_time, score])

        def write_data(self, sql, data):
            # execute_sql() opens the connection itself, so no separate
            # conn_mysql() call is needed here.
            try:
                self.db.execute_sql(sql, data)
                print('import successful')
            except Exception as exc:
                print('import failed:', exc)
            self.db.close_mysql()

        def run_main(self):
            start_url = 'http://maoyan.com/board/4'
            depth = 10  # crawl depth (number of pages)
            header = {"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9",
                      "Accept-Encoding": "gzip, deflate, sdch",
                      "Accept-Language": "zh-CN,zh;q=0.8",
                      "Cache-Control": "max-age=0",
                      "Connection": "keep-alive",
                      "Host": "maoyan.com",
                      "Referer": "http://maoyan.com/board",
                      "Upgrade-Insecure-Requests": "1",
                      "User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 "
                                    "(KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36"}

            for page in range(depth):
                url = start_url + '?offset=' + str(10 * page)
                html = self.get_html(url, header)
                if html is None:  # request failed, skip this page
                    continue
                list_data = []
                self.get_data(html, list_data)

                # list_data is the large list produced by the regular expression
                # matching; each element is one movie's small list, which is
                # written to the database as one row.
                for movie in list_data:
                    sql = "insert into maoyan_movie (ranking,movie,release_time,score) values (%s, %s, %s, %s)"
                    self.write_data(sql, movie)


    if __name__ == '__main__':
        test = Crawler()
        test.run_main()
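To see how get_data() handles the matches and how the page URLs are built, here is a sketch run against a hand-written HTML fragment. The markup and the pattern below are invented for illustration and are not Maoyan's actual page structure. The point is that re.findall() with several groups returns one tuple per match, and that the score is assembled from its integer and fractional parts.

```python
import re

# Hypothetical markup, for illustration only.
HTML = """
<dd><i class="rank">1</i><p class="name">Movie A</p>
<p class="time">1993-01-01</p><i class="int">9.</i><i class="frac">5</i></dd>
<dd><i class="rank">2</i><p class="name">Movie B</p>
<p class="time">1994-09-10</p><i class="int">9.</i><i class="frac">4</i></dd>
"""

PATTERN = re.compile(
    r'<dd>.*?class="rank">(\d+)</i>'   # ranking
    r'.*?class="name">(.*?)</p>'       # movie name
    r'.*?class="time">(.*?)</p>'       # release time
    r'.*?class="int">(.*?)</i>'        # integer part of the score
    r'.*?class="frac">(.*?)</i>',      # fractional part of the score
    re.S,                              # let "." match newlines too
)


def get_data(html):
    """Return one [ranking, movie, release_time, score] list per movie."""
    rows = []
    for ranking, movie, release_time, integer, fraction in PATTERN.findall(html):
        rows.append([ranking, movie, release_time, integer + fraction])
    return rows


def page_urls(start_url, depth):
    """Build the paginated URLs: the offset grows by 10 per page."""
    return [start_url + "?offset=" + str(10 * i) for i in range(depth)]


for row in get_data(HTML):
    print(row)
print(page_urls("http://maoyan.com/board/4", 3))
```

Each row has exactly four elements, matching the four %s placeholders needed by the insert statement.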