2025-03-01 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)05/31 Report--
Many readers are unfamiliar with how to install and use flex on Linux, so this article summarizes the topic in detail, with clear steps, as a practical reference.
On Linux, flex is a tool for generating lexical analyzers (scanners) that recognize lexical patterns in text. It reads a description of the scanner to generate from the given input file or, if no file name is given, from standard input.
Environment used for this tutorial: Linux 5.9.8, Dell G3 computer.
Flex: lexical analyzer
Flex is a lexical analyzer generator: it turns a .l file into a .c source file, and that C file is the lexical analyzer. The generated program reads its input, matches it against the regular expressions, and performs the action attached to each match. The input comes from outside the program, for example from standard input or from a file.
Flex is a tool for generating scanners that recognize lexical patterns in text. Flex reads the given input file or, if no file name is given, standard input, to obtain a description of the scanner to generate. The description takes the form of rules, each a pair of a regular expression and C code. The output of flex is a C source file, lex.yy.c, which defines the function yylex(). Compiling this file produces an executable. When the executable runs, it scans its input for matches of each regular expression, and whenever it finds one it executes the C code associated with that expression. Flex is not a GNU project, but the GNU project wrote a manual for it.
Usage
Install flex
sudo apt-get install flex    # or download and install the package that matches your system
Then create a new text file and enter the following:
%%
[0-9]+  printf("?");
#       return 0;
.       ECHO;
%%

int main(int argc, char* argv[]) {
    yylex();
    return 0;
}

int yywrap() {
    return 1;
}
Save this file as hide-digits.l. Note that each %% in this file must start at the beginning of its line (that is, there must be no spaces before the %%).
After that, at the terminal, enter:
flex hide-digits.l
At this point, a new file, lex.yy.c, appears in the directory. Compile this C file and run the result:
gcc -o hide-digits lex.yy.c
./hide-digits
Then type any text in the terminal and press Enter. Every character other than a digit is echoed unchanged, while each run of digits is replaced by a single ?. Typing # makes the program exit. For example:
eruiewdkfj
eruiewdkfj
1245
?
fdsaf4578
fdsaf?
...
#
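To make the behaviour of the three rules concrete, the following is a hand-written C sketch of the same logic applied to a string instead of standard input. It is not what flex generates (flex emits a table-driven automaton); the function name hide_digits and its buffer-based interface are assumptions made for this illustration.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hand-written sketch of the hide-digits rules (hypothetical helper, not
 * flex output). Writes the transformed text to out (capacity cap, including
 * the terminating NUL) and returns the number of characters written. */
size_t hide_digits(const char *in, char *out, size_t cap)
{
    assert(in != NULL && out != NULL && cap > 0);
    size_t n = 0;
    while (*in != '\0' && n + 1 < cap) {
        if (isdigit((unsigned char)*in)) {
            /* [0-9]+ : consume the whole run of digits, emit one '?' */
            while (isdigit((unsigned char)*in))
                in++;
            out[n++] = '?';
        } else if (*in == '#') {
            /* #      : corresponds to "return 0", so scanning stops */
            break;
        } else {
            /* . and the default rule: echo the character unchanged */
            out[n++] = *in++;
        }
    }
    out[n] = '\0';
    return n;
}
```

Called with the input "fdsaf4578" and a sufficiently large buffer, this sketch fills the buffer with "fdsaf?", matching the session shown above.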
When flex is run on the command line, the argument passed to it (here hide-digits.l) is the tokenization pattern file, which mainly contains the token-matching patterns that the user writes as regular expressions. Flex translates these regular expressions into a C function, yylex, and writes it to lex.yy.c. This function can be regarded as a finite-state automaton.
Let's go through the code in hide-digits.l in detail. The first section is:
%%
[0-9]+  printf("?");
#       return 0;
.       ECHO;
%%
A flex pattern file is divided into sections by lines that contain only %%. The section shown above is the rules section, and each line in it is a rule. A rule consists of a pattern followed by an action: the pattern is a regular expression and the action is C code. Whenever a pattern is matched, the C code after it is executed.
Flex translates this into a function called yylex, which scans the input file (standard input by default) and executes the C code following the rule when it scans a complete, longest string that can match the regular expression of a rule. If there are no return statements in the C code, the yylex function continues to run after the C code is executed, starting the next round of scanning and matching.
When the input matches the patterns of several rules, yylex chooses the rule with the longest match; if several rules match the same length, it chooses the one that appears first in the file.
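The selection policy described above can be sketched in plain C. This is only an illustration under stated assumptions: the two rules (the keyword "if" and an identifier made of lowercase letters) and the names match_if, match_ident, and pick_rule are invented for the example, and a real flex scanner reaches the same decision with a single automaton rather than by trying each rule in turn.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Each hypothetical "rule" reports how many leading characters of s it
 * matches (0 means no match). */
typedef size_t (*match_fn)(const char *s);

/* Rule 0: the literal keyword "if". */
static size_t match_if(const char *s)
{
    return strncmp(s, "if", 2) == 0 ? 2 : 0;
}

/* Rule 1: an identifier, [a-z]+ . */
static size_t match_ident(const char *s)
{
    size_t n = 0;
    while (islower((unsigned char)s[n]))
        n++;
    return n;
}

/* Longest match wins; on a tie the rule listed first wins.
 * Returns the index of the chosen rule, or -1 if no rule matches.
 * The length of the match is stored in *len. */
int pick_rule(const char *s, size_t *len)
{
    static const match_fn rules[] = { match_if, match_ident };
    assert(s != NULL && len != NULL);
    int best = -1;
    *len = 0;
    for (size_t i = 0; i < sizeof rules / sizeof rules[0]; i++) {
        size_t n = rules[i](s);
        if (n > *len) {          /* strictly longer, so earlier rules keep ties */
            *len = n;
            best = (int)i;
        }
    }
    return best;
}
```

For the input "if (x)" both rules match two characters, so the earlier keyword rule is chosen; for "iffy" the identifier rule matches four characters and wins on length. This is why keyword rules are normally listed before the identifier rule.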
int main(int argc, char* argv[]) {
    yylex();
    return 0;
}

int yywrap() {
    return 1;
}
The main function in the second section is the entry point of the program; flex copies this code unchanged to the end of lex.yy.c. The final function, yywrap, is required by flex: yylex calls it when it reaches the end of the input, and a return value of 1 means that there is no further input to scan.
Example
word-spliter.l
%{
#define T_WORD 1
int numChars = 0, numWords = 0, numLines = 0;
%}

WORD        ([^ \t\n\r\a]+)

%%
\n          { numLines++; numChars++; }
{WORD}      { numWords++; numChars += yyleng; return T_WORD; }
<<EOF>>     { return 0; }
.           { numChars++; }
%%

int main() {
    int token_type;
    /* yylex returns T_WORD for each word and 0 at end of input */
    while ((token_type = yylex()) != 0) {
        printf("WORD:\t%s\n", yytext);
    }
    printf("\nChars\tWords\tLines\n");
    printf("%d\t%d\t%d\n", numChars, numWords, numLines);
    return 0;
}

int yywrap() {
    return 1;
}
In this example, two global variables yytext and yyleng provided by flex are used to represent the string just matched and its length, respectively.
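The counting done by the three rules can also be written by hand, which may help when checking the totals the scanner prints. The following is a minimal sketch, assuming the text is already in memory as a NUL-terminated string; the struct counts type and the count_words function are hypothetical names, and the local variable len plays the role that yyleng plays in the flex version.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result type; the fields mirror numChars, numWords, numLines. */
struct counts {
    int chars;
    int words;
    int lines;
};

/* Count characters, words, and lines following the same rules as
 * word-spliter.l. A word is a maximal run of characters other than
 * space, \t, \n, \r and \a. */
struct counts count_words(const char *text)
{
    assert(text != NULL);
    struct counts c = { 0, 0, 0 };
    const char *p = text;
    while (*p != '\0') {
        size_t len = strcspn(p, " \t\n\r\a");   /* length of the word at p */
        if (len > 0) {                /* {WORD} rule */
            c.words++;
            c.chars += (int)len;
            p += len;
        } else {
            if (*p == '\n')           /* \n rule also counts a line */
                c.lines++;
            c.chars++;                /* \n rule and the catch-all "." rule */
            p++;
        }
    }
    return c;
}
```

For the text "hello world\nfoo\n" this sketch counts 16 characters, 3 words, and 2 lines.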
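Compile and run
flex word-spliter.l
gcc -o word-spliter lex.yy.c
./word-spliter < word-spliter.l
Output:
WORD:   %{
WORD:   #define
...
WORD:   }

Chars   Words   Lines
470     70      27
This program is a primitive tokenizer: it splits the input file into WORDs and prints them to the terminal, while counting the characters, words, and lines in the input file. Here a WORD is a run of consecutive non-whitespace characters.
Extension
(1) List all the token types that are needed.
(2) Assign each token type a unique number and write its regular expression.
(3) Write the rule (pattern and action) for each token type.
Type 1 is the single-character operators, 15 in total:
+ * - / % = , ; ! < > ( ) { }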
Type 2 is the two-character operators and keywords, 16 in total:
<=, >=, ==, !=, &&, ||
void, int, while, if, else, return, break, continue, print, readint
Type 3 is integer constants, string constants, and identifiers (variable names and function names).
After extension
%{
#include "token.h"
int cur_line_num = 1;
void init_scanner();
void lex_error(char* msg, int line);
%}

/* Definitions, note: \042 is '"' */
INTEGER             ([0-9]+)
UNTERM_STRING       (\042[^\042\n]*)
STRING              (\042[^\042\n]*\042)
IDENTIFIER          ([_a-zA-Z][_a-zA-Z0-9]*)
OPERATOR            ([-+*/%=,;!<>(){}])
SINGLE_COMMENT1     ("//"[^\n]*)
SINGLE_COMMENT2     ("#"[^\n]*)

%%

[\n]                { cur_line_num++;                       }
[ \t\r\a]+          { /* ignore all spaces */               }
{SINGLE_COMMENT1}   { /* skip single-line comment */        }
{SINGLE_COMMENT2}   { /* skip single-line comment */        }

{OPERATOR}          { return yytext[0];         }

"<="                { return T_Le;              }
">="                { return T_Ge;              }
"=="                { return T_Eq;              }
"!="                { return T_Ne;              }
"&&"                { return T_And;             }
"||"                { return T_Or;              }
"void"              { return T_Void;            }
"int"               { return T_Int;             }
"while"             { return T_While;           }
"if"                { return T_If;              }
"else"              { return T_Else;            }
"return"            { return T_Return;          }
"break"             { return T_Break;           }
"continue"          { return T_Continue;        }
"print"             { return T_Print;           }
"readint"           { return T_ReadInt;         }

{INTEGER}           { return T_IntConstant;     }
{STRING}            { return T_StringConstant;  }
{IDENTIFIER}        { return T_Identifier;      }

<<EOF>>             { return 0; }

{UNTERM_STRING}     { lex_error("Unterminated string constant", cur_line_num);  }
.                   { lex_error("Unrecognized character", cur_line_num);        }

%%

int main(int argc, char* argv[]) {
    int token;
    init_scanner();
    /* yylex returns 0 at end of input */
    while ((token = yylex()) != 0) {
        print_token(token);
        puts(yytext);
    }
    return 0;
}

void init_scanner() {
    printf("%-20s%s\n", "TOKEN-TYPE", "TOKEN-VALUE");
    printf("----------------------------------------\n");
}

void lex_error(char* msg, int line) {
    printf("\nError at line %-3d: %s\n\n", line, msg);
}

int yywrap(void) {
    return 1;
}
In the file above, note that a string enclosed in double quotes inside a pattern is taken literally: the special characters in it do not need to be escaped. A double quote itself, however, must be escaped (as \" or \042). This is one way in which flex patterns differ from ordinary regular expressions.
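To show what the two string patterns built on \042 accept, here is a hand-written C sketch that mirrors (\042[^\042\n]*\042) and (\042[^\042\n]*). The function name scan_string and its terminated output parameter are assumptions made for this illustration; the generated scanner does not contain such a function.

```c
#include <assert.h>
#include <stddef.h>

/* Hand-written sketch of the STRING and UNTERM_STRING patterns
 * (hypothetical helper). If s starts with a double quote, returns the
 * length of the match and sets *terminated to 1 when the closing quote is
 * found on the same line, or to 0 otherwise. Returns 0 when s does not
 * start with a double quote. */
size_t scan_string(const char *s, int *terminated)
{
    assert(s != NULL && terminated != NULL);
    *terminated = 0;
    if (s[0] != '\042')          /* '\042' is the octal escape for '"' */
        return 0;
    size_t n = 1;
    while (s[n] != '\0' && s[n] != '\042' && s[n] != '\n')
        n++;                     /* [^\042\n]* */
    if (s[n] == '\042') {        /* closing quote found: STRING */
        *terminated = 1;
        n++;
    }
    return n;                    /* otherwise: UNTERM_STRING */
}
```

Because a terminated string is always one character longer than its unterminated prefix, the longest-match rule makes the scanner prefer STRING whenever the closing quote is present, and UNTERM_STRING is reached only when it is missing.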
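The numbers of all tokens other than the single-character operators are defined in the token.h file below. A single-character operator uses its own character code as its token number, so the named tokens start at 256. The file also provides a print_token function that prints the name of a token given its number.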
#ifndef TOKEN_H
#define TOKEN_H

#include <stdio.h>   /* printf, used by print_token */

typedef enum {
    T_Le = 256, T_Ge, T_Eq, T_Ne, T_And, T_Or, T_IntConstant,
    T_StringConstant, T_Identifier, T_Void, T_Int, T_While,
    T_If, T_Else, T_Return, T_Break, T_Continue, T_Print,
    T_ReadInt
} TokenType;

/* Print a single-character token as itself and a named token by name. */
static void print_token(int token) {
    static char* token_strs[] = {
        "T_Le", "T_Ge", "T_Eq", "T_Ne", "T_And", "T_Or", "T_IntConstant",
        "T_StringConstant", "T_Identifier", "T_Void", "T_Int", "T_While",
        "T_If", "T_Else", "T_Return", "T_Break", "T_Continue", "T_Print",
        "T_ReadInt"
    };

    if (token < 256) {
        printf("%-20c", token);
    } else {
        printf("%-20s", token_strs[token-256]);
    }
}

#endif
Makefile
out: scanner

scanner: lex.yy.c token.h
	gcc -o $@ $<