The right way to develop Function Compute functions: a crawler

Updated 2025-01-17. From: SLTechnology News&Howtos (shulou)

In "Function Compute local running and debugging: basic usage of Fun Local", we introduced how to use Fun Local to run and debug functions locally. That brief introduction, however, did not show how much Fun Local improves the efficiency of Function Compute development.

This time, we take a simple scenario as an example: developing a simple crawler function (the code is based on a Function Compute console template). It shows how to build, from scratch and in the right way, a serverless crawler application that scales automatically and is billed by the number of invocations.

Development steps

We split the complete application into multiple steps, and after each step is completed, we will run and verify accordingly.

1. Create a Fun project

First, we create a directory called image-crawler as the root of the project. Then create a file called template.yml in this directory with the following contents:

    ROSTemplateFormatVersion: '2015-09-01'
    Transform: 'Aliyun::Serverless-2018-04-03'
    Resources:
      localdemo:
        Type: 'Aliyun::Serverless::Service'
        Properties:
          Description: 'local invoke demo'
        image-crawler:
          Type: 'Aliyun::Serverless::Function'
          Properties:
            Handler: index.handler
            CodeUri: code/
            Description: 'Hello world with python2.7!'
            Runtime: python2.7

If you are not familiar with the Serverless Application Model defined by Fun, refer to its documentation.

After the operation is completed, our project directory structure is as follows:

    .
    └── template.yml

2. Write helloworld function code

Create a directory called code under the root directory, and create a file called index.py under that directory with a simple helloworld function:

    def handler(event, context):
        return 'hello world!'

Execute under the project root:

    fun local invoke image-crawler

The function runs successfully.

After the operation is completed, our project directory structure is as follows:

    .
    ├── code
    │   └── index.py
    └── template.yml

3. Run the function with a trigger event

Let's make a small change to the code from step 2 so that it prints the event to the log.

    import logging

    logger = logging.getLogger()


    def handler(event, context):
        logger.info("event:" + event)
        return 'hello world!'

Run the function with a trigger event; the event appears in the log output.

As you can see, our function has correctly received the trigger event.
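One caveat is worth a short sketch. The concatenation `"event:" + event` works when the runtime passes the event as a `str`, as the python2.7 runtime used in this article does. Assuming a runtime that passes `bytes` instead (an assumption, not something this article states), a tolerant version could look like the following. `event_to_text` is a hypothetical helper introduced only for illustration.

```python
import logging

logger = logging.getLogger()


def event_to_text(event):
    """Return the event as text whether it arrives as bytes or str."""
    if isinstance(event, bytes):
        return event.decode("utf-8")
    return event


def handler(event, context):
    # Lazy %-style formatting avoids concatenating str and bytes.
    logger.info("event: %s", event_to_text(event))
    return "hello world!"
```

The handler's return value is unchanged, so it can stand in for the version above without affecting the later steps.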

For more help on Fun Local, see the Fun Local help documentation.

4. Get the source content of the web page

Next, we add the code to get the content of the page.

    import logging
    import json
    import urllib

    logger = logging.getLogger()


    def handler(event, context):
        logger.info("event:" + event)
        evt = json.loads(event)
        url = evt['url']
        html = get_html(url)
        logger.info("html content length:" + str(len(html)))
        return 'Done!'


    def get_html(url):
        # python2.7: urllib.urlopen fetches the page
        page = urllib.urlopen(url)
        html = page.read()
        return html

The logic is simple: we use the urllib library directly to read the content of the web page.
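The article targets the python2.7 runtime, where `urllib.urlopen` exists. In Python 3 the same function lives in `urllib.request`. A minimal sketch of `get_html` under that assumption (a Python 3 runtime) follows; the `timeout` parameter and its default are illustrative additions, not part of the original code.

```python
import urllib.request


def get_html(url, timeout=10):
    """Fetch a URL and return the response body as bytes."""
    # The timeout value is an illustrative choice, not from the article.
    with urllib.request.urlopen(url, timeout=timeout) as page:
        return page.read()
```

Note that in Python 3 the body is `bytes`, so it would need to be decoded before a `str` regular expression can be applied to it.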

Run the function; the log shows the length of the HTML content.

5. Parse the pictures in the web page

We are going to extract the jpg images contained in the page with a regular expression. This step can be tedious because it involves fine-tuning the expression. To get there quickly, we use the local debugging provided by fun local. For the debugging method, see "Function Compute local running and debugging: basic usage of Fun Local".

First, set a breakpoint on the following line:

    logger.info("html content length:" + str(len(html)))

Then start the function in debug mode. After the VS Code debugger attaches, execution continues until it stops at our breakpoint.

We can see the local variable directly in the Locals column, which contains the variable html, that is, the html source code we obtained. We can copy its value, analyze it, and then design a regular expression.

We can start with a simple one, for example http:\/\/[^\s,"]*\.jpg.

How can we quickly verify that it is correct? We can take advantage of the Watch feature provided by the debugger.

Create a Watch expression and enter the following value:

    re.findall(re.compile(r'http:\/\/[^\s,"]*\.jpg'), html)

After you press Enter, the debugger shows the result of evaluating the expression.

Regular expressions are rarely right on the first try; you can edit the Watch expression repeatedly until the result is correct.

Once the image parsing logic is correct, we add it to the code:

    import re

    # Match http URLs that end in .jpg
    reg = r'http:\/\/[^\s,"]*\.jpg'
    imgre = re.compile(reg)


    def get_img(html):
        return re.findall(imgre, html)
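To make the behaviour of this pattern concrete, here is a small worked example that runs it against a made-up HTML fragment. The sample markup and the `example.com` URLs are invented for illustration only.

```python
import re

# Same pattern as in the article; the dot before "jpg" is escaped.
reg = r'http:\/\/[^\s,"]*\.jpg'
imgre = re.compile(reg)


def get_img(html):
    return re.findall(imgre, html)


# Made-up sample markup, for illustration only.
sample = (
    '<img src="http://example.com/a.jpg"> '
    '<img src="https://example.com/b.jpg"> '
    '{"thumb":"http://example.com/c.png"}'
)
matches = get_img(sample)
print(matches)  # ['http://example.com/a.jpg']
```

Only the first URL matches: the second uses https and the third is not a jpg. If https links should be collected as well, starting the pattern with `https?` would be one way to do it.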

Then call it in the handler method:

    def handler(event, context):
        logger.info("event:" + event)
        evt = json.loads(event)
        url = evt['url']
        html = get_html(url)
        img_list = get_img(html)
        logger.info(img_list)
        return 'Done!'

After the writing is complete, you can continue to execute locally to verify the results:

    echo '{"url": "https://image.baidu.com/search/index?tn=baiduimage&word=%E5%A3%81%E7%BA%B8"}' \
        | fun local invoke image-crawler

As you can see, img_list is printed to the console.
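The test event passed on the command line is a JSON object whose `url` field carries a percent-encoded search word. The sketch below shows one way such a payload could be built in Python 3; it is an illustration of the encoding, not a step from the original workflow.

```python
import json
from urllib.parse import quote

# "壁纸" ("wallpaper") is the search word in the article's test URL.
word = quote("壁纸")
url = "https://image.baidu.com/search/index?tn=baiduimage&word=" + word
event = json.dumps({"url": url})
print(word)  # %E5%A3%81%E7%BA%B8
```

The resulting `event` string can be piped to `fun local invoke image-crawler` in the same way as the hand-written JSON.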

6. Upload pictures to OSS

We chose OSS to store the parsed images.

First, we need to configure the OSS endpoint and the OSS bucket through environment variables.

Configure the environment variables in the template (the OSS bucket must be created in advance):

    EnvironmentVariables:
      OSSEndpoint: oss-cn-hangzhou.aliyuncs.com
      BucketName: fun-local-test

You can then get these two environment variables directly in the function:

    import os

    endpoint = os.environ['OSSEndpoint']
    bucket_name = os.environ['BucketName']
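If either variable is missing, `os.environ[...]` raises a bare `KeyError`. A minimal sketch of a more explicit check is shown below; `read_oss_config` is a hypothetical helper name, and the error message is an illustrative choice.

```python
import os


def read_oss_config(environ=os.environ):
    """Return (endpoint, bucket_name), failing early if either is missing."""
    missing = [k for k in ("OSSEndpoint", "BucketName") if k not in environ]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return environ["OSSEndpoint"], environ["BucketName"]
```

Passing the mapping as a parameter keeps the helper easy to exercise without touching the real process environment.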

In addition, when fun local runs a function, it provides an extra variable that marks the run as local. With this flag, the function can behave differently when run locally, for example connecting to RDS when running online and to MySQL when running locally.

Here, we use this flag to create the OSS client in two different ways. When the function runs online, the credentials contain a temporary AccessKey obtained by assuming a role, which expires after a limited time; when it runs locally, they do not. The OSS SDK provides a constructor for each case, which we can use directly:

    import oss2

    creds = context.credentials
    # `local` is the flag described above that marks a local run
    if (local):
        auth = oss2.Auth(creds.access_key_id, creds.access_key_secret)
    else:
        auth = oss2.StsAuth(creds.access_key_id, creds.access_key_secret, creds.security_token)
    bucket = oss2.Bucket(auth, endpoint, bucket_name)
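The snippet uses `local` without showing where it comes from. The sketch below assumes the flag is exposed as an environment variable named `local`; that name is an assumption to be checked against the Fun documentation. To stay runnable without the oss2 package, it returns the name of the auth class and its arguments instead of constructing the object. `Credentials`, `is_local`, and `auth_args` are hypothetical names used only for illustration.

```python
import os
from collections import namedtuple

# Stand-in for context.credentials, for illustration only.
Credentials = namedtuple(
    "Credentials", ["access_key_id", "access_key_secret", "security_token"]
)


def is_local(environ=os.environ):
    # Assumption: the flag is an environment variable named "local".
    return bool(environ.get("local", ""))


def auth_args(creds, local):
    """Return the auth class name and the constructor arguments to use."""
    if local:
        return "Auth", (creds.access_key_id, creds.access_key_secret)
    return "StsAuth", (
        creds.access_key_id,
        creds.access_key_secret,
        creds.security_token,
    )
```

In the real function, the returned name would correspond to `oss2.Auth` or `oss2.StsAuth`, as in the snippet above.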

Then we iterate through all the pictures and upload each one to OSS:

    import datetime

    count = 0
    for item in img_list:
        count += 1
        logging.info(item)
        # Get each picture
        pic = urllib.urlopen(item)
        # Store all the pictures in oss bucket, keyed by timestamp in microsecond unit
        bucket.put_object(str(datetime.datetime.now().microsecond) + '.png', pic)
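One detail of the object key deserves attention: `datetime.datetime.now().microsecond` returns only the microsecond component of the current time (0 to 999999), so two uploads can end up with the same key, and the matched jpg files are stored with a `.png` suffix. The sketch below shows one alternative naming scheme, written for Python 3; `object_key` is a hypothetical helper and hashing the URL is a swapped-in technique, not the article's method.

```python
import hashlib
import posixpath
from urllib.parse import urlparse


def object_key(image_url):
    """Build a stable OSS object key from an image URL."""
    digest = hashlib.sha1(image_url.encode("utf-8")).hexdigest()
    # Keep the original extension (for example ".jpg"), if any.
    ext = posixpath.splitext(urlparse(image_url).path)[1]
    return digest + ext
```

With this scheme the same image URL always maps to the same key, so re-running the crawler overwrites earlier copies instead of adding duplicates.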

Run the function locally again:

    echo '{"url": "https://image.baidu.com/search/index?tn=baiduimage&word=%E5%A3%81%E7%BA%B8"}' \
        | fun local invoke image-crawler

As you can see from the log, the pictures are parsed one by one and uploaded to OSS.

Log in to the OSS console and you can see these pictures.

Deployment

After local development is complete, we still need to publish the function online to make it a callable service. In the past, this meant logging in to the console, creating services, creating functions, configuring environment variables, creating roles, and so on. With fun, none of that is necessary.

There is one difference between running locally and online: online, Function Compute must be authorized to access OSS. How? Simply add one line to our template (see the Policies documentation):

    Policies: AliyunOSSFullAccess

The updated template.yml is as follows:

    ROSTemplateFormatVersion: '2015-09-01'
    Transform: 'Aliyun::Serverless-2018-04-03'
    Resources:
      localdemo:
        Type: 'Aliyun::Serverless::Service'
        Properties:
          Description: 'local invoke demo'
          Policies: AliyunOSSFullAccess
        image-crawler:
          Type: 'Aliyun::Serverless::Function'
          Properties:
            Handler: index.handler
            CodeUri: code/
            Description: 'Hello world with python2.7!'
            Runtime: python2.7
            EnvironmentVariables:
              OSSEndpoint: oss-cn-hangzhou.aliyuncs.com
              BucketName: fun-local-test

Then run fun deploy; the log shows that the deployment succeeded.

Verify in the console

When you log in to the console, you can see that our service, function, code, environment variables, and so on are all in place.

Enter the JSON we used for testing as the trigger event, then execute. The result is consistent with the local run.

Verify with fcli

See the fcli help documentation for reference.

Execute the following command in a terminal to list the functions:

    fcli function list --service-name localdemo

You can see that our image-crawler has been created successfully.

{"Functions": ["image-crawler", "java8", "nodejs6", "nodejs8", "php72", "python27", "python3"], "NextToken": null}

You can call the function to run using the following command:

    fcli function invoke --service-name localdemo \
        --function-name image-crawler \
        --event-str '{"url": "https://image.baidu.com/search/index?tn=baiduimage&word=%E5%A3%81%E7%BA%B8"}'

After a successful run, you will get the same results as the console and fun local.

Summary

At this point, our development is over.

This article used the local running and debugging capabilities of fun local to develop the function locally, running it repeatedly to get feedback and iterate on the code quickly.

Once local development is complete, no changes to the code are needed: a single fun deploy command deploys it to the cloud with the expected results.

The method introduced here is not the only way to develop for Function Compute. The point of this article is that, with the right approach, developing functions is enjoyable and the development process is smooth. Enjoy.
