2025-01-16 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to build a downloader for MOOC open courses in Python. Many people are unsure how to go about this in practice, so the editor consulted a range of materials and put together a simple, easy-to-use approach that will hopefully answer those questions. Follow along below!
Development tools
Python version: 3.7.8
Related modules:
DecryptLogin module
tqdm module
click module
argparse module
and some modules that come with Python.
Environment setup
Install Python and add it to your PATH environment variable, then use pip to install the required modules.
A quick preview
How to run:
python moocdl.py --url <course link>
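To make the command line above concrete, here is a minimal sketch of how such an option could be parsed with argparse, one of the modules listed earlier. It assumes the script takes a single --url option; the function name parse_args and the help text are illustrative and not taken from the original moocdl.py.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; --url is the only option assumed in this sketch."""
    parser = argparse.ArgumentParser(
        description='Download the courseware of an icourse163 course.'
    )
    parser.add_argument('--url', dest='url', type=str, required=True,
                        help='link to the course home page')
    return parser.parse_args(argv)


# Passing the arguments explicitly keeps the example independent of sys.argv.
# The course link below is a placeholder, not a real course.
args = parse_args(['--url', 'https://www.icourse163.org/course/EXAMPLE-0000'])
print(args.url)
```

Calling parse_args() with no argument reads sys.argv instead, which is what a real script would do.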
The effect is as follows:
[Screenshot: moocdl]
I tested a randomly chosen course. Its videos were served in m3u8 format, so the download was a bit slow. By default, all of the courseware is downloaded and saved into the corresponding directories.
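One reason m3u8 downloads tend to be slow is that an m3u8 file is only a playlist: it lists many small media segments, and each segment has to be fetched separately. The sketch below shows how the segment URLs can be read out of a playlist. It is an illustration only: the function name list_segments and the sample playlist are made up here, and the downloader's own m3u8download method is not shown in this article.

```python
from urllib.parse import urljoin


def list_segments(playlist_text, playlist_url):
    """Return absolute URLs of the media segments named in an m3u8 playlist."""
    segments = []
    for line in playlist_text.splitlines():
        line = line.strip()
        # Lines starting with '#' are tags or comments, not segment URIs.
        if not line or line.startswith('#'):
            continue
        # Segment URIs may be relative to the playlist location.
        segments.append(urljoin(playlist_url, line))
    return segments


# A made-up playlist with two segments.
sample = '''#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
seg0.ts
#EXTINF:8.5,
seg1.ts
#EXT-X-ENDLIST
'''
print(list_segments(sample, 'https://example.com/video/index.m3u8'))
```

A full video is then rebuilt by downloading every segment in order and concatenating them, which is why the request count grows with the video length.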
How it works
First, we need to simulate logging in to icourse163 (China University MOOC) before we can download the course materials. For this we can use the DecryptLogin package, which this official account open-sourced earlier:
from DecryptLogin import login

def login(self, username, password):
    # simulate the login and keep both the returned info and the session
    lg = login.Login()
    infos_return, session = lg.icourse163(username, password)
    return infos_return, session
Next, a brief explanation of how to download the materials of a given course. First we need the basic information about the course. Request the course home page and the information can be found directly in the returned page:
The code implementation to extract the course information we need is as follows:
import re

# get information from the main page of the course
url = url.replace('learn/', 'course/')
response = self.session.get(url)
# whitespace around the colon is tolerated in both patterns
term_id = re.findall(r'termId\s*:\s*"(\d+)"', response.text)[0]
course_name = ' - '.join(re.findall(r'name\s*:\s*"(.+)"', response.text))
course_name = self.filterBadCharacter(course_name)
# non-capturing group, so findall returns only the numeric ID
course_id = re.findall(r'https?://www.icourse163.org/(?:course|learn)/\w+-(\d+)', url)[0]
print(f'Information obtained from the course main page:\n\t[course name]: {course_name}, [course ID]: {course_id}, [TID]: {term_id}')
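The extraction step can be tried offline against a small sample. The sketch below runs the same kind of regular expressions on a made-up page fragment and a placeholder course link; the fragment's layout is an assumption for illustration, and the real page may differ.

```python
import re

# Hypothetical fragment of a course page; the real layout may differ.
sample_page = '''
window.courseDto = {
name:"Sample Course",
};
window.termDto = {
termId : "1234567890",
};
'''
# Placeholder link in the <letters>-<digits> form the URL pattern expects.
sample_url = 'https://www.icourse163.org/course/ABC-1001'

# \s* tolerates optional whitespace around the colon.
term_id = re.findall(r'termId\s*:\s*"(\d+)"', sample_page)[0]
course_name = ' - '.join(re.findall(r'name\s*:\s*"(.+)"', sample_page))
# The non-capturing group makes findall return only the numeric part.
course_id = re.findall(
    r'https?://www\.icourse163\.org/(?:course|learn)/\w+-(\d+)', sample_url
)[0]

print(course_name, course_id, term_id)
```

With a capturing group such as (course|learn), re.findall would return a tuple per match instead of the ID string, which is why the group is written as non-capturing here.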
Then use this information to crawl the corresponding resource list:
import os

# get resource list
resource_list = []
data = {
    'tid': term_id,
    'mob-token': self.infos_return['results']['mob-token'],
}
response = self.session.post('https://www.icourse163.org/mob/course/courseLearn/v1', data=data)
course_info = response.json()
# contentType values handled here: 1 = video, 3 = pdf, 4 = rich text
file_types = [1, 3, 4]
for chapter_num, chapter in enumerate(course_info.get('results', {}).get('termDto', {}).get('chapters', [])):
    # 'lessons' may be present but None, so fall back to an empty list
    for lesson_num, lesson in enumerate(chapter.get('lessons') or []):
        for unit_num, unit in enumerate(lesson.get('units', [])):
            if unit['contentType'] not in file_types:
                continue
            # build <course>/<chapter>/<lesson>/<unit> as the save directory
            savedir = course_name
            self.checkdir(savedir)
            for item in [self.filterBadCharacter(chapter['name']), self.filterBadCharacter(lesson['name']), self.filterBadCharacter(unit['name'])]:
                savedir = os.path.join(savedir, item)
                self.checkdir(savedir)
            if unit['contentType'] == file_types[0]:
                savename = self.filterBadCharacter(unit['name']) + '.mp4'
                resource_list.append({
                    'savedir': savedir,
                    'savename': savename,
                    'type': 'video',
                    'contentId': unit['contentId'],
                    'id': unit['id'],
                })
            elif unit['contentType'] == file_types[1]:
                savename = self.filterBadCharacter(unit['name']) + '.pdf'
                resource_list.append({
                    'savedir': savedir,
                    'savename': savename,
                    'type': 'pdf',
                    'contentId': unit['contentId'],
                    'id': unit['id'],
                })
            elif unit['contentType'] == file_types[2]:
                if unit.get('jsonContent'):
                    # NOTE: eval() on server-supplied text is risky; json.loads()
                    # is the safer choice if the payload is strict JSON
                    json_content = eval(unit['jsonContent'])
                    savename = self.filterBadCharacter(json_content['fileName'])
                    resource_list.append({
                        'savedir': savedir,
                        'savename': savename,
                        'type': 'rich_text',
                        'jsonContent': json_content,
                    })
print(f'Successfully obtained the resource list, number of resources: {len(resource_list)}')
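The resource-list code relies on two helpers, filterBadCharacter and checkdir, whose bodies are not shown in this article. The sketch below is one plausible way to write them, under the assumption that the first replaces characters that are invalid in file names and the second creates a missing directory. The snake_case name filter_bad_character, the character set, and the 'untitled' fallback are choices made for this illustration.

```python
import os
import re
import tempfile


def filter_bad_character(text):
    """Replace characters that are not allowed in file names (assumed rule set)."""
    cleaned = re.sub(r'[\\/:*?"<>|\r\n\t]', '_', text).strip()
    # Fall back to a fixed name so the path component is never empty.
    return cleaned or 'untitled'


def checkdir(dirpath):
    """Create the directory if it is missing; return True if it was created."""
    if os.path.exists(dirpath):
        return False
    os.makedirs(dirpath)
    return True


# Build chapter/lesson/unit directories under a temporary root,
# mirroring the nested savedir construction in the loop above.
root = tempfile.mkdtemp()
savedir = root
for name in ['Week 1: Intro', 'Lesson 1/2', 'What is "MOOC"?']:
    savedir = os.path.join(savedir, filter_bad_character(name))
    checkdir(savedir)

print(os.path.relpath(savedir, root))
```

Sanitizing each name before joining matters because a chapter title containing a slash would otherwise be interpreted as an extra directory level.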
Finally, you can parse and download according to the resource type:
import os
import time
import random
from tqdm import tqdm
from urllib.parse import urlencode

# download the corresponding resource
pbar = tqdm(resource_list)
for resource in pbar:
    pbar.set_description(f'downloading {resource["savename"]}')
    # --download video
    if resource['type'] == 'video':
        data = {
            'bizType': '1',
            'mob-token': self.infos_return['results']['mob-token'],
            'bizId': resource['id'],
            'contentType': '1',
        }
        # retry until the server returns a resource token
        while True:
            response = self.session.post('https://www.icourse163.org/mob/j/v1/mobileResourceRpcBean.getResourceToken.rpc', data=data)
            if response.json()['results'] is not None:
                break
            time.sleep(0.5 + random.random())
        signature = response.json()['results']['videoSignDto']['signature']
        data = {
            'enVersion': '1',
            'clientType': '2',
            'mob-token': self.infos_return['results']['mob-token'],
            'signature': signature,
            'videoId': resource['contentId'],
        }
        response = self.session.post('https://vod.study.163.com/mob/api/v1/vod/videoByNative', data=data)
        # ----download video, trying quality 3 first, then 2, then 1
        videos = response.json()['results']['videoInfo']['videos']
        resolutions, video_url = [3, 2, 1], None
        for resolution in resolutions:
            for video in videos:
                if video['quality'] == resolution:
                    video_url = video['videoUrl']
                    break
            if video_url is not None:
                break
        if '.m3u8' in video_url:
            self.m3u8download({
                'download_url': video_url,
                'savedir': resource['savedir'],
                'savename': resource['savename'],
            })
        else:
            self.defaultdownload({
                'download_url': video_url,
                'savedir': resource['savedir'],
                'savename': resource['savename'],
            })
        # ----download subtitles
        srt_info = response.json()['results']['videoInfo']['srtCaptions']
        if srt_info:
            for srt_item in srt_info:
                srt_name = os.path.splitext(resource['savename'])[0] + '_' + srt_item['languageCode'] + '.srt'
                srt_url = srt_item['url']
                response = self.session.get(srt_url)
                with open(os.path.join(resource['savedir'], srt_name), 'wb') as fp:
                    fp.write(response.content)
    # --download PDF
    elif resource['type'] == 'pdf':
        data = {
            't': '3',
            'cid': resource['contentId'],
            'unitId': resource['id'],
            'mob-token': self.infos_return['results']['mob-token'],
        }
        response = self.session.post('http://www.icourse163.org/mob/course/learn/v1', data=data)
        pdf_url = response.json()['results']['learnInfo']['textOrigUrl']
        self.defaultdownload({
            'download_url': pdf_url,
            'savedir': resource['savedir'],
            'savename': resource['savename'],
        })
    # --download rich text
    elif resource['type'] == 'rich_text':
        download_url = 'http://www.icourse163.org/mob/course/attachment.htm?' + urlencode(resource['jsonContent'])
        self.defaultdownload({
            'download_url': download_url,
            'savedir': resource['savedir'],
            'savename': resource['savename'],
        })

That concludes this walkthrough of how to build a MOOC open course downloader in Python. Hopefully it has answered your questions. Combining theory with practice is the best way to learn, so go and try it yourself! If you want to learn more, keep following this site; the editor will continue to bring you more practical articles.