
A case of Linux Kernel outputting Chinese characters

2025-02-25 Update. From: SLTechnology News&Howtos


This article walks through a small experiment: getting the Linux kernel's virtual terminal to output Chinese characters.

You can easily type Chinese and see Chinese output in an SSH terminal when logging in to Linux from Windows or macOS.

But it is almost impossible to display Chinese on Linux's own virtual terminal:

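[root@localhost font]# echo shoes > /dev/tty2

Two question marks are displayed (the echoed word, rendered here in translation as "shoes", was presumably two Chinese characters in the original command). Clearly the Linux kernel does not recognize Chinese.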

Why can't the Linux kernel recognize Chinese? There is a distinction to sort out first:

Input and display output in a remote SSH terminal are handled by the host running the SSH client, such as Windows or macOS. They have nothing to do with Linux.

Input and display output on a local Linux virtual terminal, such as /dev/tty1, are handled by the Linux kernel itself.

For example, when I use iTerm on macOS to SSH into a remote CentOS Linux machine, all keyboard input and display output in iTerm is handled by the macOS host running iTerm.

By contrast, if you type directly on the CentOS Linux virtual terminal and expect output there, both input and output must be handled by the Linux kernel itself.

That is basically it. As for why the Linux kernel does not support Chinese, the answer requires understanding how the kernel handles Unicode in virtual terminal input and output, which is tedious.

In any case, I simply cannot output Chinese here. This is not my job and obviously not a necessary task, so I'm just doing it for fun.

The goal of this article is to make the Linux virtual terminal output Chinese.

Just output Chinese; even a single Chinese character will do. Specifically, when I type the 'A' character on the keyboard, the display should echo a Chinese character.

So this article does not attempt to make the Linux kernel fully support Chinese. Many people and communities have already done that, but it is not much fun. That kind of work can be taken on as a paid side job, and any job done for money is not very playful, because it has to be done fast.

You don't need to know tedious Unicode encodings or boring font file formats. Let's see how to play.

Let me show the effect first. Here is an 8×16 dot-matrix example:

It is not very good-looking, so I made the following 28×16 dot matrix:

Here's how this is done.

Between the moment you hit a key and the moment a character appears on the virtual terminal's monitor, there are actually two mappings:

Mapping between keyboard and character set

A keystroke event is converted to a code in a character set; for example, pressing the 'A' key maps to 0x41.

Mapping between character set and font

A character code is mapped to a dot matrix for display; for example, 0x41 maps to an 8×16 dot matrix that the eye reads as the character 'A'.
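The second mapping can be sketched in a few lines of C. This is only an illustration, not kernel code: the function names are made up for this example, and the 16 bytes are assumed to match the classic VGA 8x16 bitmap for 'A' (one byte per row, most significant bit on the left).

```c
#include <assert.h>
#include <stdio.h>

/* Assumed 8x16 bitmap for 'A' (0x41): one byte per row, MSB = leftmost pixel. */
const unsigned char glyph_A[16] = {
    0x00, 0x00, 0x10, 0x38, 0x6c, 0xc6, 0xc6, 0xfe,
    0xc6, 0xc6, 0xc6, 0xc6, 0x00, 0x00, 0x00, 0x00
};

/* Second mapping: character code -> bitmap. This toy table only knows 0x41. */
const unsigned char *glyph_for_code(unsigned char code)
{
    return code == 0x41 ? glyph_A : NULL;
}

/* Turn one bitmap row into 8 characters: '#' for a set pixel, '.' for a clear one. */
void row_to_text(unsigned char row, char out[9])
{
    for (int bit = 0; bit < 8; bit++)
        out[bit] = (row & (0x80 >> bit)) ? '#' : '.';
    out[8] = '\0';
}

/* Print a whole 16-row glyph to stdout, one text line per row. */
void print_glyph(const unsigned char *glyph)
{
    char line[9];

    assert(glyph != NULL);
    for (int r = 0; r < 16; r++) {
        row_to_text(glyph[r], line);
        puts(line);
    }
}
```

Calling print_glyph(glyph_for_code(0x41)) draws the letter as text. Swapping the 16 bytes behind a code changes what is drawn without changing the code itself, which is the idea the rest of this article relies on.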

Linux's console does not recognize character codes above 0x00ff, so it cannot handle Unicode code points beyond 0x00ff. If you want it to, you have to change the kernel code.

As I just said, modifying the kernel to fully support Chinese is commercial work. It is boring, and no one shares the result.

So I tried to solve the problem by changing the two mappings above. Since this is only about display, I won't touch the mapping between keyboard and character set; that would still run into the problem of codes exceeding 0x00ff.

That leaves only one way to display Chinese: modify the mapping between character set and font!

This mapping must be stored somewhere, either in kernel memory or in the file system. The config file of the current kernel shows the following:

[root@localhost font]# cat /boot/config-3.10.0-862.11.6.el7.x86_64 | grep FONT
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y

Let's see what's in /proc/kallsyms:

[root@localhost font]# cat /proc/kallsyms | grep font.*8x
ffffffffb006a3e0 R font_vga_8x8
ffffffffb006a420 r fontdata_8x8
ffffffffb006ac20 R font_vga_8x16
ffffffffb006ac60 r fontdata_8x16
ffffffffb0307a10 r __ksymtab_font_vga_8x16
ffffffffb03234b8 r __kcrctab_font_vga_8x16
ffffffffb034246e r __kstrtab_font_vga_8x16

Well, this is the font saved in the kernel:

[root@localhost rh]# ll ./drivers/video/console/font_8x*
-rw-r--r--. 1 root root 95976 Sep 17 2018 ./drivers/video/console/font_8x16.c
-rw-r--r--. 1 root root 50858 Sep 17 2018 ./drivers/video/console/font_8x8.c

I won't analyze these two files here. They simply confirm that the kernel uses its own built-in font during initialization, when nothing but the kernel itself is available.

Once user space is up, however, the font can be changed, and changed in all sorts of fancy ways; the two built-in 8x8 and 8x16 fonts are far from the only options.

So we need to find the font file that the distribution installed, change the shape of one glyph in it, and turn that glyph into a Chinese character. It's that simple.

You don't have to hunt for where the font file is installed. Running setfont under strace reveals it:

[root@localhost]# strace -f -e trace=open setfont
...
strace: Process 6276 attached
[pid 6276] open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 4
...
[pid 6276] open("/lib/kbd/consolefonts/default8x16.psfu.gz", O_RDONLY|O_NOCTTY|O_NONBLOCK) = 4
[pid 6276] +++ exited with 0 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=6276, si_uid=0, si_status=0, si_utime=0, si_stime=0} ---
+++ exited with 0 +++

This is it: /lib/kbd/consolefonts/default8x16.psfu.gz

There is no need to look up the psfu font format; a specific character can be located by pattern matching on its bitmap.
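A minimal sketch of that kind of pattern matching, assuming the decompressed font file has already been read into memory. The function name find_pattern is made up for this example. Unlike a scan in fixed 16-byte steps, it moves one byte at a time, so it does not depend on how long the file header is.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the offset of the first occurrence of pat in data, or -1 if absent.
   The search advances one byte at a time, so it makes no assumption about
   the size of the header in front of the glyph table. */
long find_pattern(const unsigned char *data, size_t len,
                  const unsigned char *pat, size_t pat_len)
{
    assert(data != NULL && pat != NULL && pat_len > 0);
    if (len < pat_len)
        return -1;
    for (size_t i = 0; i + pat_len <= len; i++) {
        if (memcmp(data + i, pat, pat_len) == 0)
            return (long)i;
    }
    return -1;
}
```

Searching for the first few rows of a known glyph is usually enough to find where that glyph starts; a longer pattern lowers the chance of a false match.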

I'm going to find 'A' first, and then change the 'B' and 'C' that follow it into the characters of my name, "Zhao" and "Ya".

First I need to draw "Zhao" and "Ya" as dot matrices. Here is my drawing of "Zhao":

00000000
00000000
00100000
11111000
00100101
00100101
11111010
00100011
00111010
01100101
01100000
10011000
10000111
00000000
00000000
00000000
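Each row of eight bits above becomes one byte of the glyph. A small sketch of that conversion, with helper names invented for this example:

```c
#include <assert.h>
#include <stddef.h>

/* Convert a string of exactly eight '0'/'1' characters into one bitmap byte.
   The leftmost character becomes the most significant bit. */
unsigned char row_from_bits(const char *bits)
{
    unsigned char value = 0;

    for (size_t i = 0; i < 8; i++) {
        assert(bits[i] == '0' || bits[i] == '1');
        value = (unsigned char)((value << 1) | (bits[i] == '1'));
    }
    return value;
}

/* Convert `rows` bit strings into `rows` bytes, one byte per glyph row. */
void glyph_from_rows(const char *const *bit_rows, size_t rows,
                     unsigned char *out)
{
    for (size_t r = 0; r < rows; r++)
        out[r] = row_from_bits(bit_rows[r]);
}
```

For instance, the third and fourth rows of the drawing, 00100000 and 11111000, convert to 0x20 and 0xf8.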

Next, we will replace the dot matrix of 'B' with this one, and draw a "Ya" character to replace the dot matrix of 'C'.

The glyph chart of the default font can be found at the following site:

https://www.zap.org.au/software/fonts/console-fonts-distributed/psftx-centos-7.5/default8x16.psfu.large.pdf

From it we can read off the dot-matrix array of the 'A' character and then match that array in the decompressed default8x16.psfu file. The code is as follows:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

unsigned char zhaoya[32] = {
    // first row: "Zhao"
    0x00, 0x00, 0x20, 0xf8, 0x25, 0x25, 0xfa, 0x23,
    0x3a, 0x65, 0x60, 0x98, 0x87, 0x00, 0x00, 0x00,
    // second row: "Ya"
    0x00, 0x00, 0x00, 0x7e, 0x24, 0x24, 0x24, 0xa5,
    0xa5, 0x66, 0x24, 0x24, 0x7e, 0x00, 0x00, 0x00
};

int main(int argc, char **argv)
{
    unsigned char buf[16];
    off_t offset = 0;
    int s = 0;  // 0: looking for 'A', 1: next glyph is 'B', 2: next glyph is 'C'
    int fd = open("default8x16.psfu", O_RDWR);

    if (fd < 0) {
        perror("open");
        return 1;
    }
    // Scan 16-byte blocks from offset 0. This assumes the file header
    // length is a multiple of 16, so glyphs stay aligned to the blocks.
    // The loop stops at end of file if 'A' is never found.
    while (pread(fd, buf, 16, offset) == 16) {
        if (s == 2) {  // replace 'C'
            pwrite(fd, &zhaoya[16], 16, offset);
            break;
        }
        if (s == 1) {  // replace 'B'
            pwrite(fd, &zhaoya[0], 16, offset);
            s = 2;
        } else if (buf[0] == 0x00 && buf[1] == 0x00 &&
                   buf[2] == 0x10 && buf[3] == 0x38) {
            // A simple method to identify 'A'
            printf("A found at %lld!\n", (long long)offset);
            s = 1;
        }
        offset += 16;
    }
    close(fd);
    return 0;
}

Compile and run it, then load the modified default8x16.psfu into the kernel with setfont:

[root@localhost font]# setfont ./default8x16.psfu

Now switch to the Linux virtual terminal tty2. When you type a capital 'B' on the keyboard, the character "Zhao" appears.

Although complex Chinese characters can be squeezed into 16×8 or even 8×8 dots, the result is too ugly.

So I looked for a higher-resolution font. On Ubuntu I found a high-resolution 28×16 font, Arabic-VGA28x16.psf.gz. It is modified in exactly the same way as before. Its glyph chart is here:

https://www.zap.org.au/software/fonts/console-fonts-distributed/psftx-debian-9.4/Lat7-VGA28x16.psf.pdf

I don't need to draw my own 28×16 dot matrix; I can use the ready-made GNU Unifont. In unifont_sample-12.1.01.hex, the dot matrices can be looked up directly by the Unicode code points of "Zhao" and "Ya". To look up the Unicode code point of any character, see:

https://graphemica.com/
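Unifont .hex files store one glyph per line as a hexadecimal code point, a colon, and the bitmap as hex digits (32 digits for an 8x16 glyph, 64 for a 16x16 glyph). A rough sketch of reading one such line follows; the function name parse_hex_line is invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse one Unifont .hex line of the form "CODEPOINT:HEXDIGITS".
   Stores the code point in *codepoint and the bitmap bytes in out.
   Returns the number of bitmap bytes, or -1 if the line is malformed
   or the bitmap does not fit into cap bytes. */
int parse_hex_line(const char *line, unsigned long *codepoint,
                   unsigned char *out, size_t cap)
{
    char *end;
    const char *p;
    size_t n = 0;

    assert(line != NULL && codepoint != NULL && out != NULL);
    *codepoint = strtoul(line, &end, 16);
    if (end == line || *end != ':')
        return -1;
    p = end + 1;
    /* Consume two hex digits per byte; stop at newline or end of string. */
    while (isxdigit((unsigned char)p[0]) && isxdigit((unsigned char)p[1])) {
        unsigned int byte;

        if (n == cap || sscanf(p, "%2x", &byte) != 1)
            return -1;
        out[n++] = (unsigned char)byte;
        p += 2;
    }
    return (int)n;
}
```

To find a particular character, read the file line by line and keep the line whose parsed code point equals the one you looked up. A 16x16 glyph yields 32 bytes, two per row.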

The code to replace the glyphs is as follows:

#include <fcntl.h>
#include <string.h>
#include <unistd.h>
#include "zhao"  // expected to define code[]: three 16x16 glyphs, 32 bytes each

#define L (28 * 2)  // bytes per glyph cell: 28 rows of 2 bytes

int fd;

int main(int argc, char **argv)
{
    unsigned char buf[L];
    // this 0x0e60 is the offset obtained by pattern matching.
    off_t offset = 0x0e60;
    int i;

    fd = open("Lat7-VGA28x16.psf", O_RDWR);
    if (fd < 0)
        return 1;
    // Overwrite three consecutive glyph cells.
    for (i = 0; i < 3; i++) {
        memset(buf, 0, L);
        memcpy(buf + 8, &code[32 * i], 32);  // glyph starts 4 rows below the top
        pwrite(fd, buf, L, offset);
        offset += L;
    }
    close(fd);
    return 0;
}
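The arithmetic behind the buf+8 in the listing: a cell is 28 rows of 2 bytes (56 bytes), a 16x16 glyph is 32 bytes, and byte offset 8 is row 4. A small sketch that makes the row offset explicit, with the function name place_glyph made up for this example and 2 bytes per row assumed from L = 28*2:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a glyph of glyph_rows rows into a taller cell of cell_rows rows,
   starting at row top_row. Both use row_bytes bytes per row. Rows of the
   cell that the glyph does not cover are cleared to zero. */
void place_glyph(unsigned char *cell, size_t cell_rows,
                 const unsigned char *glyph, size_t glyph_rows,
                 size_t row_bytes, size_t top_row)
{
    assert(top_row + glyph_rows <= cell_rows);
    memset(cell, 0, cell_rows * row_bytes);
    memcpy(cell + top_row * row_bytes, glyph, glyph_rows * row_bytes);
}
```

With top_row = 4 this reproduces the layout used in the listing: 4 blank rows above the glyph and 8 below. A top_row of (28 - 16) / 2 = 6 would center the glyph vertically instead.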

And then the effect is:

Not bad.

In fact, this article only covers:

Making a crude dot matrix.

The mapping relationship between keyboard, ASCII/Unicode, and font.

A way of locating and analyzing a problem without knowing every detail.

The simpler the better; the more complex the worse.

In fact, the third and fourth points are the most important.

Finally, if you want to see the glyphs of the font currently loaded on your virtual terminal, type:

[root@localhost font]# showconsolefont

It will show:
