How to use Spark to analyze Lagou.com recruitment information


This article shows how to use Spark to analyze recruitment data from Lagou.com. The content is straightforward and easy to follow, and hopefully it will resolve your doubts. Let the editor walk you through it.

What if you use traditional programming language tools?

Suppose we use a traditional language such as Node.js for everything from data collection and storage to reading and using the data.

If we want to know exactly how many jobs there are in different salary ranges and sort them from most to least, we may need to:

1. Create a new object to hold the data for each company

2. Loop through the files, reading the data to fill in each company's records

3. Record the information for each position in each company, keyed by salary

4. Sort by the number of positions

The steps look simple. But leaving aside the fact that memory is likely to be overwhelmed once the dataset grows larger, the details of steps 2 and 3 require a lot of defensive code: how do you loop through the file data? What if the file names are irregular? What if a file is corrupt or contains malformed data? The JSON in each file is not a directly usable array of positions, so is the logic to reshape that JSON structure really so easy to implement? (A rough sketch of just the aggregation part follows below.)
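
To make the manual route concrete, here is a rough sketch of just the in-memory aggregation (steps 3 and 4), written in plain Scala for consistency with the Spark examples later. The Position type is hypothetical, and the file reading and JSON reshaping from steps 1 and 2 are deliberately left out; that is exactly where most of the code, and most of the pain, would be:

// Hypothetical, simplified record type; the real Lagou JSON is nested and messier.
case class Position(companyFullName: String, salary: String)

// Steps 3 and 4: bucket positions by salary range, count them, and sort descending.
def countBySalary(positions: Seq[Position]): Seq[(String, Int)] =
  positions
    .groupBy(_.salary)
    .map { case (salary, group) => salary -> group.size }
    .toSeq
    .sortBy { case (_, count) => -count }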

Admittedly, there is nothing a general-purpose programming language cannot do; it is only a matter of time. And speaking of time: if there is an obviously faster way, why wouldn't you use it?

Use Spark for analysis

Let's use Spark to implement the same logic described above. The following steps are run in the interactive notebook tool Zeppelin:

1. Read the data

val job = sqlContext.read.json("jobs")
job.registerTempTable("job")
job.printSchema()
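
As a side note, on Spark 2.x or later the entry point is usually a SparkSession rather than a SQLContext. A minimal equivalent sketch, assuming the JSON files still live in a "jobs" directory (in Zeppelin the spark variable is normally provided for you):

import org.apache.spark.sql.SparkSession

// Build (or reuse) a SparkSession; in a Zeppelin notebook `spark` already exists.
val spark = SparkSession.builder().appName("lagou-jobs").getOrCreate()

// Read the semi-structured JSON and expose it to Spark SQL.
val job = spark.read.json("jobs")
job.createOrReplaceTempView("job")
job.printSchema()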

2. Get the number of positions in each salary range, sorted

%sql
SELECT postionCol.salary, COUNT(postionCol.salary) salary_count
FROM job
LATERAL VIEW explode(content.positionResult.result) positionTable AS postionCol
WHERE content.positionResult.queryAnalysisInfo.positionName = "ios"
GROUP BY postionCol.salary
ORDER BY salary_count DESC
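
If you prefer the DataFrame API over SQL, the same aggregation can be written directly in Scala. This is only a sketch, assuming the job DataFrame and the nested schema used above (content.positionResult.result holding the array of positions):

import org.apache.spark.sql.functions.{col, explode, desc}

// Explode the nested result array into one row per position for the "ios" query,
// then count positions per salary range and sort from most to least.
val salaryCounts = job
  .where(col("content.positionResult.queryAnalysisInfo.positionName") === "ios")
  .select(explode(col("content.positionResult.result")).as("postionCol"))
  .groupBy(col("postionCol.salary"))
  .count()
  .orderBy(desc("count"))

salaryCounts.show()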

You really can use SQL-like syntax directly for complex queries over semi-structured data. How does that feel after seeing it?

If your SQL background is not strong, my advice is: read the documentation when you have time, and Google the English keywords when you need them.

Several Spark SQL sample queries for data you may be interested in

For those who need them:

Show the number of openings for a position, by company name

%sql
SELECT postionCol.companyFullName, COUNT(postionCol.companyFullName) postition_count
FROM job
LATERAL VIEW explode(content.positionResult.result) positionTable AS postionCol
WHERE content.positionResult.queryAnalysisInfo.positionName = "ios"
GROUP BY postionCol.companyFullName
ORDER BY postition_count DESC

Show the working-experience requirements for a position

%sql
SELECT postionCol.workYear, COUNT(postionCol.workYear) workYears
FROM job
LATERAL VIEW explode(content.positionResult.result) positionTable AS postionCol
WHERE content.positionResult.queryAnalysisInfo.positionName = "ios"
GROUP BY postionCol.workYear
ORDER BY workYears DESC

Show the educational requirements for a position

%sql
SELECT postionCol.education, COUNT(postionCol.education) education_count
FROM job
LATERAL VIEW explode(content.positionResult.result) positionTable AS postionCol
WHERE content.positionResult.queryAnalysisInfo.positionName = "ios"
GROUP BY postionCol.education
ORDER BY education_count DESC

Show the company sizes for a position

%sql
SELECT postionCol.companySize, COUNT(postionCol.companySize) company_size_ount
FROM job
LATERAL VIEW explode(content.positionResult.result) positionTable AS postionCol
WHERE content.positionResult.queryAnalysisInfo.positionName = "ios"
GROUP BY postionCol.companySize
ORDER BY company_size_ount DESC
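
Since the four sample queries above differ only in the field they group by, you could also wrap them in a small helper and call it from a Scala paragraph in Zeppelin. This is just a sketch, assuming the same job temp table and the "ios" keyword used above:

// Count positions grouped by an arbitrary field of the exploded result array.
// `field` should be one of the columns used above (salary, companyFullName, workYear, ...).
def countBy(field: String) = sqlContext.sql(
  s"""SELECT postionCol.$field, COUNT(postionCol.$field) AS cnt
     |FROM job
     |LATERAL VIEW explode(content.positionResult.result) positionTable AS postionCol
     |WHERE content.positionResult.queryAnalysisInfo.positionName = "ios"
     |GROUP BY postionCol.$field
     |ORDER BY cnt DESC""".stripMargin)

countBy("workYear").show()
countBy("education").show()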

That is the entire content of "How to use Spark to analyze Lagou.com recruitment information". Thank you for reading! Hopefully sharing this content has helped you; if you want to learn more, you are welcome to follow the industry information channel.
