2025-04-05 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/03 Report--
import os
import re
import shutil
import time

import docx  # python-docx: pip install python-docx
from win32com import client as wc  # pywin32; needs Microsoft Word on Windows


# Copy the .doc files from all nested directories into one backup directory
# and return how many were found.
def count_files(file_dir):
    count = 0
    dst_dir = file_dir + "back"
    for p, d, f in os.walk(file_dir):
        for c in f:
            if c.split('.')[-1] == "doc":
                count += 1
                src_dir = os.path.join(p, c)
                print(src_dir)
                if not os.path.exists(dst_dir):
                    os.makedirs(dst_dir)
                shutil.copy(src_dir, dst_dir)
    return count


# Extract the email address from each docx resume document.
# The python-docx module is used to read the documents.
def count_mail(file_dir, dst_file):
    mail_list = []
    # The pattern tolerates tabs and spaces inside the domain part;
    # all whitespace is stripped from each match below.
    pattern = re.compile(r'''([a-zA-Z0-9._%+-]+@[a-zA-Z0-9\t\s.-]+(\.[a-zA-Z0-9\t\s]{2,4}))''', re.VERBOSE)
    for parent, dirctiory, files in os.walk(file_dir):
        for f in files:
            doc = docx.Document(os.path.join(parent, f))
            for para in doc.paragraphs:
                for groups in pattern.findall(para.text):
                    mail_list.append("".join(groups[0].split()) + "\n")
    with open(dst_file, 'w') as f:
        f.writelines(mail_list)
    print("= email addresses written successfully =")


# python-docx can only handle files with the docx suffix, so files with the
# doc suffix must first be converted to docx through the win32com module.
# Despite its name, this function converts doc to docx.
def docxTodoc(old_doc, new_doc):
    word = wc.Dispatch('Word.Application')
    for parent, directory, files in os.walk(old_doc):
        for f in files:
            doc = word.Documents.Open(os.path.join(parent, f))  # source file
            # File under the target path
            new_filepath = os.path.join(new_doc, os.path.splitext(f)[0] + ".docx")
            print(new_filepath)
            # 12 = wdFormatXMLDocument, i.e. save as .docx
            doc.SaveAs(new_filepath, 12, False, "", True, "", False, False, False, False)
            doc.Close()
            print(time.time())
    word.Quit()


if __name__ == '__main__':
    old_doc = r"C:\Users\icestick\Desktop\51job_exported_resume_20180917"  # source directory holding the doc files to convert
    new_doc = r"C:\Users\icestick\Desktop\new_doc"  # target directory for the converted docx files
    mail_extract = r"C:\Users\Icestick\Desktop\test.txt"  # file that receives the extracted email addresses

    print(count_files(old_doc))
    if not os.path.exists(new_doc):
        os.mkdir(new_doc)
        print("= directory created successfully =")
    docxTodoc(old_doc, new_doc)
    print("= docx format conversion finished =")
    count_mail(new_doc, mail_extract)
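The email-matching step in count_mail can be tried on plain strings, without Word or python-docx installed. The following is a minimal sketch, not the script's own code: it assumes a conventional username@domain.suffix pattern that does not allow whitespace inside the address, and the helper name extract_emails and the sample text are hypothetical.

```python
import re

# Assumed pattern (stricter than the one in count_mail): no whitespace
# is allowed inside the address, so no stripping is needed afterwards.
EMAIL_RE = re.compile(
    r"""
    [a-zA-Z0-9._%+-]+   # username
    @
    [a-zA-Z0-9.-]+      # domain, possibly with several dot-separated labels
    \.[a-zA-Z]{2,4}     # final suffix of 2 to 4 letters
    """,
    re.VERBOSE,
)


def extract_emails(text):
    """Return the distinct addresses found in text, in order of appearance."""
    # The pattern has no capturing groups, so findall returns whole matches.
    matches = EMAIL_RE.findall(text)
    # dict.fromkeys removes duplicates while keeping the first-seen order.
    return list(dict.fromkeys(matches))


if __name__ == "__main__":
    sample = "Email: li.lei@example.com, backup lilei_01@mail.example.cn"
    for address in extract_emails(sample):
        print(address)
```

In the script itself, the same call would be applied to para.text for every paragraph of every document. Rejecting whitespace inside the address is a design choice of this sketch: it avoids swallowing the word that follows an address, at the cost of missing addresses that the resume export has split with spaces.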