Password crisis: deep learning is speeding up password cracking!

2025-07-15 Update From: SLTechnology News&Howtos

Shulou(Shulou.com)06/02 Report--

Information security experts have been exploring how generative adversarial networks (GANs) can improve our online security, with encouraging results. Recently, researchers at Stevens Institute of Technology in New Jersey and the New York Institute of Technology developed a way to guess passwords using a GAN.

The team designed an experiment that ran its password guessing technique, known as "PassGAN", against leaked password data and found that the software could guess 47 percent of the passwords from these accounts, much higher than competing tools such as HashCat and John the Ripper. In this article, we cover the following in detail:

The historical background of this technology

How criminals can take advantage of this technology

How does this technology work?

How to use this technology if you are on the red team (attacker)

How to protect your own security as a user

How to protect yourself and your enterprise if you are on the blue team (defender)

Before we get to the point, let's look at what "password guessing" means. It is a rather vague term; in the context of this article, it means cracking a hash. That raises the next question: what is a password hash?

What is a password hash?

When large data leaks such as those at Dropbox, LinkedIn, and Ashley Madison occur, what gets published is (usually) a list of e-mail addresses and password hashes. A hash is the result of a hashing algorithm transforming an input of any length (called the pre-image) into a fixed-length output, the hash value. This transformation is a compressing mapping: the space of hash values is usually much smaller than the space of inputs, so different inputs may hash to the same output, and the input cannot be uniquely determined from the hash value.

For example, suppose I use the MD5 algorithm to process the plaintext password "password". The result is the string "5f4dcc3b5aa765d61d8327deb882cf99", which is called a "hash". It is irreversible; that is, the plaintext password (here, "password") cannot be computed directly from the hash.
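As a small illustration of the hashing step, the sketch below uses Python's standard hashlib module to compute MD5 digests. The helper name md5_hex is made up for this example, and MD5 appears only because the article uses it as its example, not because it is a good choice for storing passwords.

```python
import hashlib

def md5_hex(text: str) -> str:
    """Return the MD5 digest of `text` as 32 hexadecimal characters."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

digest = md5_hex("password")
print(digest)       # 5f4dcc3b5aa765d61d8327deb882cf99
print(len(digest))  # 32

# A much longer input still maps to a digest of the same fixed length.
print(len(md5_hex("a much longer input " * 100)))  # 32
```

The MD5 digest of the string "password" is exactly the 32-character value quoted above, and the output length stays the same however long the input is.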

This is useful for security because it means that sites like Adobe, LinkedIn, or Google can have hundreds of millions of user accounts without storing anyone's actual password. They can still check whether a user knows the password, using only the stored hash rather than the password itself. When users log on to these sites, they send their password, which is then hashed, and the site checks whether the resulting hash matches the hash associated with the e-mail address in the database.
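The login check described above can be sketched roughly as follows. This is an illustration under assumptions, not any particular site's implementation: the in-memory users dictionary, the function names, and the iteration count are hypothetical, and the sketch deliberately swaps in salted PBKDF2 from Python's standard library in place of a bare hash.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumed work factor for this sketch

# Hypothetical in-memory "database": e-mail -> (salt, stored hash)
users = {}

def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256: a deliberate substitute for a bare, unsalted hash.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)

def register(email: str, password: str) -> None:
    salt = os.urandom(16)
    # Only the salt and the hash are stored, never the password itself.
    users[email] = (salt, hash_password(password, salt))

def check_login(email: str, password: str) -> bool:
    record = users.get(email)
    if record is None:
        return False
    salt, stored = record
    # Hash the submitted password and compare in constant time.
    return hmac.compare_digest(stored, hash_password(password, salt))

register("alice@example.com", "correct horse battery staple")
print(check_login("alice@example.com", "correct horse battery staple"))  # True
print(check_login("alice@example.com", "password"))                      # False
```

The site can confirm that the user knows the password even though the password itself is never written to storage.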

Hashing slows attackers down effectively, but it cannot stop them completely. With modern tools, attackers can guess hundreds of thousands to hundreds of millions of passwords per second, depending on the type of hash algorithm they are trying to crack. With each check this cheap, the attacker's effort goes into choosing which candidate passwords to try rather than into testing each guess.
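To make "guessing" concrete, here is a minimal sketch of the idea: hash each candidate and compare it with the leaked hash. The function name and the five-word guess list are invented for illustration; real tools such as HashCat do the same thing at vastly higher speed and with far larger candidate lists.

```python
import hashlib
from typing import Iterable, Optional

def guess_password(target_hash: str, candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate whose MD5 digest equals target_hash, else None."""
    for candidate in candidates:
        if hashlib.md5(candidate.encode("utf-8")).hexdigest() == target_hash:
            return candidate
    return None

leaked_hash = "5f4dcc3b5aa765d61d8327deb882cf99"  # the MD5 hash string quoted earlier
wordlist = ["123456", "qwerty", "letmein", "password", "dragon"]  # made-up guess list

print(guess_password(leaked_hash, wordlist))    # password
print(guess_password(leaked_hash, ["a", "b"]))  # None
```

The hash is never reversed; the attacker only succeeds if the right candidate is in the list, which is why the quality of the guesses matters so much.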

PassGAN joins the fight

Attackers and security experts are engaged in a never-ending race to find better and faster ways to crack these hash lists. At present, PassGAN is among the most powerful "weapons" in that race.

PassGAN uses a relatively new technique called a generative adversarial network (GAN); combined with HashCat, it produces 18% more correct password guesses than HashCat alone. A GAN is a kind of neural network setup that trains a discriminator and a generator against each other in alternating turns, so that the generator learns to sample from a complex probability distribution, for example to generate pictures, text, or speech.

Let's start with the first program, the discriminator, which is a deep convolutional neural network. In short, it is a system that learns patterns at increasingly abstract levels. Its job is to judge whether the data produced by the generator is close to the real thing. It returns a number between 0 and 1, where 0 means "not a password" and 1 means "very similar to a password".

The second program is the generator, whose job is to produce data that fools the discriminator. It starts with a random string of text and gets a score for it from the discriminator. The first score is usually very low because the string is just random characters.

Conceptually, the generator then modifies the string and asks the discriminator to score the modified version. If the score rises, the modification is kept; if not, it is undone and a different modification is tried. The process repeats until the score reaches a given value, so the generator gradually gets better at producing password-like strings. (In an actual GAN the generator is itself a neural network, and what gets adjusted are its weights, using gradients from the discriminator, rather than a single string.)
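The simplified modify-score-keep-or-undo loop described above can be mimicked with a toy program. Everything here is an assumption for illustration: toy_discriminator is a hand-written scoring rule (lowercase letters followed by two digits count as "password-like"), not a trained network, and the hill-climbing search is not how PassGAN or any real GAN is trained.

```python
import random
import string

ALPHABET = string.ascii_lowercase + string.digits + "!@#$%"

def toy_discriminator(candidate: str) -> float:
    """Hand-written stand-in for a trained discriminator; returns a score in [0, 1].

    Assumption: "password-like" means lowercase letters followed by
    two trailing digits (for example "monkey12").
    """
    hits = 0
    for i, ch in enumerate(candidate):
        wants_digit = i >= len(candidate) - 2
        if (wants_digit and ch.isdigit()) or (not wants_digit and ch.islower()):
            hits += 1
    return hits / len(candidate)

def hill_climb(length: int = 8, steps: int = 2000, target: float = 1.0, seed: int = 0):
    """Start from a random string; keep a random edit only if the score rises."""
    rng = random.Random(seed)
    current = [rng.choice(ALPHABET) for _ in range(length)]
    first_score = score = toy_discriminator("".join(current))
    for _ in range(steps):
        if score >= target:
            break
        pos = rng.randrange(length)
        old = current[pos]
        current[pos] = rng.choice(ALPHABET)      # modify one character
        new_score = toy_discriminator("".join(current))
        if new_score > score:
            score = new_score                    # keep the modification
        else:
            current[pos] = old                   # undo it and try again
    return "".join(current), first_score, score

guess, first_score, final_score = hill_climb()
print(guess, first_score, final_score)
```

The final score can never be lower than the first one, because an edit survives only when the score improves. What the toy leaves out is the adversarial part: in a GAN the discriminator keeps learning too.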

To put it simply, the basic idea of a GAN is that the generator and the discriminator play a game of constant one-upmanship: the discriminator sharpens its eye and tries its best to tell real samples (such as real pictures) from fake samples produced by the generator, while the generator learns to pass off fakes as real, producing samples that the discriminator judges to be genuine. In the ideal contest, both sides keep improving: the discriminator's eye gets sharper and the generator's forgeries get more convincing.
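For readers who prefer notation, this game is commonly written as the minimax objective of the original GAN formulation. Here D is the discriminator, G is the generator, x is a real sample, and z is the random noise fed to the generator. This is the textbook form and not necessarily the exact objective PassGAN is trained with.

```latex
\min_{G} \max_{D} \;
  \mathbb{E}_{x \sim p_{\text{data}}}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}}\big[\log\big(1 - D(G(z))\big)\big]
```

The discriminator tries to make both terms large by scoring real samples near 1 and generated samples near 0, while the generator tries to make the second term small by getting D(G(z)) close to 1.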

How well-intentioned and malicious actors use PassGAN

Like any other security innovation, PassGAN can be abused by malicious actors: cracked passwords can be used to attack other systems where the same password has been reused, endangering enterprises and their systems. Password reuse can also cause serious harm to ordinary users. To protect themselves, users should not rely on passwords alone but should also enable multi-factor authentication (MFA). Many websites already offer this feature, and it is worth turning on promptly.

For red teams, whose purpose is to strengthen organizational security, PassGAN can also help crack local passwords, such as Windows SAM hashes or Linux /etc/shadow password hashes.

If you are a member of a red team and are interested in PassGAN tools, you can go to https://github.com/brannondorsey/PassGAN for the open source implementation of PassGAN released by Brannon Dorsey.

In the PassGAN experiment, the researchers explored different neural network configurations, parameters and training processes to determine the appropriate balance between learning and overfitting. Specifically, the main contributions of the researchers are as follows:

1. GANs can generate high-quality password guesses. In the experiment, the researchers were able to match 2,774,269 (46.86%) of the 5,919,936 passwords in a test set of real user passwords from the RockYou dataset, and 4,996,980 (11.53%) of the 43,354,871 passwords in the LinkedIn dataset.

2. The technique can compete with the most advanced password generation rules. Even though these rules were specifically tuned for the datasets used in the evaluation, the output quality of PassGAN is comparable to or better than that of the password generation rules (in HashCat).

3. PassGAN can be used to supplement password generation rules. In the experiment, the researchers used PassGAN to generate password matches that none of the password rules could produce.

4. The researchers also point out that the best password cracking results come from combining PassGAN and HashCat. Combining the output of the two matched 18% more passwords than HashCat alone.

5. Unlike password generation rules, PassGAN can generate an almost unlimited number of password guesses. Experiments show that the number of new (unique) password guesses increases steadily with the total number of passwords the GAN generates. This matters because the number of unique passwords that rules can generate is ultimately limited by the size of the password dataset used to instantiate those rules.

6. It may also be useful to run the output of PassGAN through a standard, mutation-based password rule set for greater coverage.
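Points 4 and 6 above, combining guess lists and passing generated guesses through mutation rules, can be sketched as follows. The three rules here (capitalize, append digits, simple character substitution) and both guess lists are invented stand-ins; they are not HashCat's rule syntax and not real PassGAN output.

```python
from typing import Iterable, Set

def mangle(word: str) -> Set[str]:
    """Apply a few illustrative mutation rules to one guess."""
    leet = word.translate(str.maketrans({"a": "@", "o": "0", "s": "$"}))
    return {
        word,               # the guess itself
        word.capitalize(),  # first letter uppercase
        word + "1",         # common digit suffixes
        word + "123",
        leet,               # simple character substitutions
    }

def expand(guesses: Iterable[str]) -> Set[str]:
    """Run every guess through the mutation rules and collect unique results."""
    out: Set[str] = set()
    for guess in guesses:
        out |= mangle(guess)
    return out

gan_guesses = ["monkey", "sunshine", "iloveyou"]     # stand-in for generator output
rule_guesses = ["monkey", "Password1", "dragon123"]  # stand-in for rule-based output

combined = expand(gan_guesses) | set(rule_guesses)
print(len(expand(gan_guesses)), len(combined))
```

Using a set for the union means duplicates between the two sources are counted once, so the combined size reflects only the unique candidates each source contributes.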

Best practices for enterprise passwords

If you are responsible for managing an enterprise, it is recommended that you implement a strong password policy and use passphrases instead of passwords. According to Wikipedia, the difference between the two is that a passphrase is longer than a normal password.

In recent years, many experts have come to think that a sufficiently long passphrase removes the need for the older requirements (uppercase and lowercase letters, numbers, special symbols, and so on) that make passwords hard to remember. For example, a password made up of the initials of a memorable line such as "like a river flowing eastward" is not only easy to remember but also better than a password that meets every complexity requirement yet does little real good.

Take the password "*dJoeo30(#JS)3%$" as an example.

From an entropy point of view, this is a very good password. It has 16 characters and includes uppercase letters, lowercase letters, symbols, and numbers. If each character has 95 possibilities, then there are 95^16 (95 to the power of 16, about 4.4 x 10^31) possible passwords of this form, and even at a trillion guesses per second, trying them all would take on the order of a trillion years.
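The arithmetic behind that estimate can be checked in a few lines. The constants come from the text; the variable names and the choice of 365.25 days per year are assumptions made for this sketch.

```python
import math

ALPHABET_SIZE = 95          # possibilities per character, as in the text
LENGTH = 16                 # characters in the example password
GUESSES_PER_SECOND = 1e12   # "a trillion guesses per second"
SECONDS_PER_YEAR = 365.25 * 24 * 3600

keyspace = ALPHABET_SIZE ** LENGTH
bits = LENGTH * math.log2(ALPHABET_SIZE)
years = keyspace / GUESSES_PER_SECOND / SECONDS_PER_YEAR

print(f"keyspace = {keyspace:.2e}")    # about 4.40e+31
print(f"entropy  = {bits:.1f} bits")   # about 105.1 bits
print(f"years    = {years:.2e}")       # about 1.39e+12 to try every combination
```

Trying every combination takes about 1.4 trillion years at this rate; on average an attacker would hit the right one after roughly half of that.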

In terms of password strength it is indeed a good password, but unfortunately no one can remember it: our brains struggle to retain random strings that mean nothing to us. This is why PassGAN can crack so many passwords so easily: passwords that humans find easy to remember (such as P@a$$word) tend to be easy to guess as well.

This is why we advocate using passphrases. A passphrase can carry a lot of entropy without being an arbitrary string of random characters; it is something meaningful and easy to remember, such as "this album is very good, the weather is very cold". If we assume the attacker knows that the user's passphrase is a combination of eight English words, then with a per capita vocabulary of 20000 words there are 20000^8 (20000 to the eighth power) possible combinations, which is not easy to crack.
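A short sketch of the passphrase arithmetic, assuming eight words drawn uniformly at random from a 20000-word vocabulary (the estimate only holds if the words really are chosen at random). The eight-entry WORDLIST is a hypothetical miniature used to keep the example self-contained; a real list would contain thousands of words.

```python
import math
import secrets

VOCABULARY_SIZE = 20_000   # vocabulary estimate used in the text
WORDS = 8                  # words per passphrase

passphrase_space = VOCABULARY_SIZE ** WORDS
random_password_space = 95 ** 16   # the 16-character example from earlier

print(f"{passphrase_space:.2e}")                          # 2.56e+34
print(f"{WORDS * math.log2(VOCABULARY_SIZE):.1f} bits")   # 114.3 bits
print(passphrase_space > random_password_space)           # True

# Hypothetical miniature word list, for illustration only.
WORDLIST = ["album", "river", "weather", "cold", "orange", "window", "paper", "train"]

def random_passphrase(n_words: int = WORDS) -> str:
    # secrets.choice draws from a cryptographically secure random source.
    return " ".join(secrets.choice(WORDLIST) for _ in range(n_words))

print(random_passphrase())
```

Under these assumptions the eight-word passphrase has a larger search space than the 16-character random password, while being far easier to remember.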

Next, you need to consider how users in the enterprise get their passwords. Should you expect the receptionist to know how to create a secure password? Should you expect everyone in the enterprise to have a strong password? Are the passwords you give users easy for them to remember, or will they write them down on sticky notes? These questions deserve attention and thought.
