What Python developers need to know before moving to the Go language

2025-03-30 Update From: SLTechnology News&Howtos

This article covers what Python developers need to know before switching to the Go language, analyzed and described from a professional point of view. I hope you get something out of it.

Background

At Repustate, one of the best technical achievements we have ever made is implementing sentiment analysis for Arabic. Arabic is a hard nut to crack because its morphology is quite complex. Tokenization (splitting a sentence into separate words) is also harder in Arabic than in English, because an Arabic word can itself contain white space (for example, depending on where an "aleph" falls within the word). It is hardly a secret that Repustate uses support vector machines (SVMs) to work out the most likely meaning behind a sentence and then attach sentiment to it. Overall, we use 22 models (22 SVMs), and every word in a document is analyzed. So for a 500-word document, that is more than 10,000 comparisons against the SVMs (500 × 22 = 11,000).

Python

Repustate is almost entirely a Python shop. We use Django for the API and the website. So, to keep the code base consistent, it made sense (at the time) to implement the Arabic sentiment engine in Python as well. Python is a good choice for prototyping and implementation: it is very expressive, and its third-party libraries are excellent. If you are building for the web, Python is great. But when you do low-level computation and rely heavily on hash tables (the dictionary type in Python) for comparisons, everything slows down. We could process about two or three Arabic documents per second, which is far too slow. By comparison, our English sentiment engine can process about 500 documents per second.

Bottleneck

So we fired up the Python profiler and started investigating what was taking so long. Remember that we have 22 SVMs and every word has to pass through each of them? Well, all of that was being done sequentially, not in parallel. So our first response was to change the sequential processing into a map/reduce-like operation. To put it simply: Python is not well suited to map/reduce, and it does not work well when you need concurrency. At PyCon 2013, Guido talked about Tulip, his new project to make up for this shortcoming of Python, but it will take a while to arrive. If something better is already available, why should we wait?

Choose Go or go home?

My friends at Mozilla told me that Mozilla is moving a lot of its core logging infrastructure to Go, in part because of the power of goroutines. Anyone who knows a little about how programming languages work (interpreted vs. compiled, dynamic vs. static) will say, "Well, of course Go will be faster." Yes, we could also have rewritten everything in Java and seen a similar speedup, but that is not why Go wins. Code you write in Go just seems to come out right. I don't know exactly why, but once the code compiles (which is very fast), you feel that it will work: not only run without crashing, but also be logically correct. I know that sounds dubious, but it is true. Go is very similar to Python in terms of verbosity (or lack of it), and it treats functions as first-class objects, so functional programming is easy to reason about. And of course, goroutines and channels make your life easier, you get a big performance boost from static typing, and you have finer control over memory allocation without paying too much in expressiveness.
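
As a rough illustration of the map/reduce-style restructuring described in the Bottleneck section, the sketch below fans one word out to several models with goroutines and collects the scores over a channel. The model type, the scoreWord function and the toy scoring logic are hypothetical stand-ins invented for this example; the article does not show Repustate's actual SVM code.

```go
package main

import (
	"fmt"
	"sync"
)

// model is a hypothetical stand-in for one trained classifier; the real
// SVM evaluation is not shown in the article.
type model func(word string) float64

// scoreWord runs every model against one word concurrently (the "map"
// step) and sums the results read from the channel (the "reduce" step).
func scoreWord(models []model, word string) float64 {
	results := make(chan float64, len(models)) // buffered: senders never block
	var wg sync.WaitGroup
	for _, m := range models {
		wg.Add(1)
		go func(m model) {
			defer wg.Done()
			results <- m(word)
		}(m)
	}
	wg.Wait()
	close(results)

	total := 0.0
	for r := range results {
		total += r
	}
	return total
}

func main() {
	// 22 toy models, mirroring the count in the article; each one just
	// scales the word length by its own weight.
	models := make([]model, 22)
	for i := range models {
		weight := float64(i + 1)
		models[i] = func(word string) float64 {
			return weight * float64(len(word))
		}
	}
	fmt.Println(scoreWord(models, "go"))
}
```

Buffering the channel to len(models) lets every goroutine finish without waiting for a reader, which keeps the sketch short; a real pipeline would more likely bound the number of workers.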

Things I wish I knew earlier (Tips & Tricks)

Despite all this praise, you really do need to change your way of thinking when working with Go code compared to Python. So here is a list of notes I took while migrating the code, just thoughts that popped into my head as I converted Python code to Go:

◆ There is no built-in set type (you have to use a map and check for existence)

◆ Because there are no sets, you have to write your own methods for intersection, union and so on

◆ There is no tuple type; you have to write your own struct or use slices (that is, arrays)

◆ There is no method like __getattr__(), so you always have to check for existence instead of setting default values. For example, in Python you can write value = dict.get("a_key", "default_value")

◆ You must always check for errors (or at least explicitly ignore them)

◆ You cannot have unused variables or packages, so simple tests sometimes require commenting out some code

◆ Converting between []byte and string. regexp uses []byte (which is mutable). That makes sense, but it is still annoying to keep converting variables back and forth.

◆ Python is more forgiving. You can slice a string with an out-of-range index without an error, and you can slice with negative indexes. Not in Go.

◆ You cannot mix types in a data structure. Maybe it is not very clean, but in Python I sometimes use a dictionary whose values are a mixture of strings and lists. Not in Go: you have to clean up your data structures or define custom structs

◆ You cannot unpack a tuple or list into several variables (for example: x, y, z = [1, 2, 3])

◆ Camel-case naming (if a function or struct name does not start with a capital letter, it will not be exposed to other packages). I prefer Python's lowercase-with-underscores style.

◆ You must explicitly check whether an error is != nil, unlike in Python, where many types can be used in bool-like checks (0, "", None can all be interpreted as "not set")

◆ The documentation for some modules, such as crypto/md5, is sparse, but go-nuts on IRC is great and provides a lot of help.

◆ Converting a number to a string (int64 -> string) is not the same as []byte -> string (where you just use string([]byte)). You need to use strconv.

◆ Go code reads more like a programming language, whereas Python can read almost like pseudocode. Go has more non-alphanumeric characters and uses || and && rather than "or" and "and"

◆ To write to a file, there are both File.Write([]byte) and File.WriteString(string), which runs counter to the Python zen that Python developers are used to: "there is only one way to do it."

◆ String interpolation is awkward, and you have to fall back on fmt.Sprintf frequently

◆ There are no constructors, so the idiom is to write a NewType() function that returns the struct you want

◆ else (or else if) must be formatted correctly: the else has to be on the same line as the closing curly brace of the if. Strange.

◆ The assignment operator differs depending on whether you are inside or outside a function, that is, = versus :=

◆ If I only want the keys or the values, as with dict.keys() or dict.values(), or a list of tuples, as with dict.items(), there is no equivalent in Go; you have to iterate over the map and build the list yourself.

◆ I sometimes use this idiom: a dictionary whose values are functions, which I call by key. You can do this in Go, but all the functions must accept and return the same things, that is, have the same signature.

◆ If you use JSON and your JSON mixes types, good luck. You have to define a custom struct that matches the format of the JSON blob and then unmarshal the raw JSON into an instance of that struct. That is a lot more work than object = json.loads(json_blob) in the Python world
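
A few of the notes above can be sketched in one small program: a set emulated with a map, a lookup with a default value, collecting map keys by hand, converting int64 to string with strconv, and unmarshalling JSON into a custom struct. All the names here (stringSet, getOr, sortedKeys, review, parseReview) are made up for illustration and are not from the original code base; this is one possible way to write these helpers, not the only one.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
)

// stringSet emulates a set with a map; struct{} values occupy no space.
type stringSet map[string]struct{}

func newStringSet(items ...string) stringSet {
	s := make(stringSet, len(items))
	for _, it := range items {
		s[it] = struct{}{}
	}
	return s
}

// intersect returns the members present in both sets.
func intersect(a, b stringSet) stringSet {
	out := stringSet{}
	for k := range a {
		if _, ok := b[k]; ok {
			out[k] = struct{}{}
		}
	}
	return out
}

// getOr mimics Python's dict.get(key, default) using the comma-ok idiom.
func getOr(m map[string]string, key, def string) string {
	if v, ok := m[key]; ok {
		return v
	}
	return def
}

// sortedKeys stands in for dict.keys(): iterate and build the slice by hand.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys) // map iteration order is not fixed, so sort for stable output
	return keys
}

// review is a hypothetical struct shaped like the JSON blob used in main.
type review struct {
	Text string   `json:"text"`
	Tags []string `json:"tags"`
}

// parseReview decodes JSON with mixed value types into the typed struct.
func parseReview(blob string) (review, error) {
	var r review
	err := json.Unmarshal([]byte(blob), &r)
	return r, err
}

func main() {
	common := intersect(newStringSet("a", "b", "c"), newStringSet("b", "c", "d"))
	fmt.Println(len(common))

	conf := map[string]string{"lang": "ar", "mode": "fast"}
	fmt.Println(getOr(conf, "missing", "default_value"))
	fmt.Println(sortedKeys(conf))

	// int64 -> string goes through strconv; string(n) would not give "42".
	fmt.Println(strconv.FormatInt(int64(42), 10))

	r, err := parseReview(`{"text": "great", "tags": ["pos", "short"]}`)
	if err != nil { // errors are checked explicitly, as the notes point out
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(r.Text, r.Tags)
}
```

The helpers are typed for string keys only; a different element type would need its own copy or a generic version.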

Is it worth it?

It was worth it. A million times over. The speedup is so large that it is hard to give up. I also think Go is an up-and-coming language, so when hiring new employees, it helps to have Go as an important part of Repustate's technology stack.

That is what Python developers need to know before switching to the Go language. If you have run into similar questions, the analysis above may help.
