2025-02-22 Update From: SLTechnology News&Howtos
Shulou(Shulou.com)06/01 Report--
This article explains how to optimize the rendering performance of NanoVG. The method is simple, fast, and practical.
NanoVG optimization
NanoVG, as its name suggests, is a very small vector drawing library. Compared with the hundreds of thousands of lines of code in cairo and skia, NanoVG has fewer than 5000 lines of C, so the "nano" in its name is well earned. Its design, interface, and code quality are exemplary. Its one drawback is performance: on low-end Android devices with large screens in particular, even a simple interface renders at only a dozen or so frames per second. I ran into this awkward problem recently while porting AWTK to Android.
After optimization, AWTK's overall rendering performance on low-end devices improved by 3 to 5 times. I am writing up my notes here for anyone who needs them.
NanoVG's performance bottleneck is the fragment shader. A fragment shader can be thought of as a callback that the GPU invokes for every pixel it processes, so it runs millions of times per frame and has a large impact on performance.
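To make "millions of times per frame" concrete, here is a rough back-of-the-envelope sketch. The function names, the screen size, and the overdraw factor (how many times each pixel is painted per frame) are illustrative assumptions, not measurements from the article.

```c
#include <stdint.h>

/* Hypothetical helper: estimated fragment shader invocations per frame.
   overdraw is an assumed average number of times each pixel is painted. */
uint64_t sketch_fragments_per_frame(uint32_t width, uint32_t height,
                                    uint32_t overdraw) {
    return (uint64_t)width * height * overdraw;
}

/* Hypothetical helper: the same estimate per second at a given frame rate. */
uint64_t sketch_fragments_per_second(uint32_t width, uint32_t height,
                                     uint32_t overdraw, uint32_t fps) {
    return sketch_fragments_per_frame(width, height, overdraw) * fps;
}
```

For an assumed 1080 x 1920 screen painted once, this gives about 2 million invocations per frame, so every instruction removed from the shader is saved roughly 2 million times per frame.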
Let's first look at NanoVG's fragment shader code:
    static const char* fillFragShader =
        "#ifdef GL_ES\n"
        "#if defined(GL_FRAGMENT_PRECISION_HIGH) || defined(NANOVG_GL3)\n"
        " precision highp float;\n"
        "#else\n"
        " precision mediump float;\n"
        "#endif\n"
        "#endif\n"
        "#ifdef NANOVG_GL3\n"
        "#ifdef USE_UNIFORMBUFFER\n"
        " layout(std140) uniform frag {\n"
        "  mat3 scissorMat;\n"
        "  mat3 paintMat;\n"
        "  vec4 innerCol;\n"
        "  vec4 outerCol;\n"
        "  vec2 scissorExt;\n"
        "  vec2 scissorScale;\n"
        "  vec2 extent;\n"
        "  float radius;\n"
        "  float feather;\n"
        "  float strokeMult;\n"
        "  float strokeThr;\n"
        "  int texType;\n"
        "  int type;\n"
        " };\n"
        "#else\n" // NANOVG_GL3 && !USE_UNIFORMBUFFER
        " uniform vec4 frag[UNIFORM_ARRAY_SIZE];\n"
        "#endif\n"
        " uniform sampler2D tex;\n"
        " in vec2 ftcoord;\n"
        " in vec2 fpos;\n"
        " out vec4 outColor;\n"
        "#else\n" // !NANOVG_GL3
        " uniform vec4 frag[UNIFORM_ARRAY_SIZE];\n"
        " uniform sampler2D tex;\n"
        " varying vec2 ftcoord;\n"
        " varying vec2 fpos;\n"
        "#endif\n"
        "#ifndef USE_UNIFORMBUFFER\n"
        " #define scissorMat mat3(frag[0].xyz, frag[1].xyz, frag[2].xyz)\n"
        " #define paintMat mat3(frag[3].xyz, frag[4].xyz, frag[5].xyz)\n"
        " #define innerCol frag[6]\n"
        " #define outerCol frag[7]\n"
        " #define scissorExt frag[8].xy\n"
        " #define scissorScale frag[8].zw\n"
        " #define extent frag[9].xy\n"
        " #define radius frag[9].z\n"
        " #define feather frag[9].w\n"
        " #define strokeMult frag[10].x\n"
        " #define strokeThr frag[10].y\n"
        " #define texType int(frag[10].z)\n"
        " #define type int(frag[10].w)\n"
        "#endif\n"
        "\n"
        "float sdroundrect(vec2 pt, vec2 ext, float rad) {\n"
        " vec2 ext2 = ext - vec2(rad,rad);\n"
        " vec2 d = abs(pt) - ext2;\n"
        " return min(max(d.x,d.y),0.0) + length(max(d,0.0)) - rad;\n"
        "}\n"
        "\n"
        "// Scissoring\n"
        "float scissorMask(vec2 p) {\n"
        " vec2 sc = (abs((scissorMat * vec3(p,1.0)).xy) - scissorExt);\n"
        " sc = vec2(0.5,0.5) - sc * scissorScale;\n"
        " return clamp(sc.x,0.0,1.0) * clamp(sc.y,0.0,1.0);\n"
        "}\n"
        "#ifdef EDGE_AA\n"
        "// Stroke - from [0..1] to clipped pyramid, where the slope is 1px.\n"
        "float strokeMask() {\n"
        " return min(1.0, (1.0-abs(ftcoord.x*2.0-1.0))*strokeMult) * min(1.0, ftcoord.y);\n"
        "}\n"
        "#endif\n"
        "\n"
        "void main(void) {\n"
        " vec4 result;\n"
        " float scissor = scissorMask(fpos);\n"
        "#ifdef EDGE_AA\n"
        " float strokeAlpha = strokeMask();\n"
        " if (strokeAlpha < strokeThr) discard;\n"
        "#else\n"
        " float strokeAlpha = 1.0;\n"
        "#endif\n"
        " if (type == 0) { // Gradient\n"
        "  // Calculate gradient color using box gradient\n"
        "  vec2 pt = (paintMat * vec3(fpos,1.0)).xy;\n"
        "  float d = clamp((sdroundrect(pt, extent, radius) + feather*0.5) / feather, 0.0, 1.0);\n"
        "  vec4 color = mix(innerCol,outerCol,d);\n"
        "  // Combine alpha\n"
        "  color *= strokeAlpha * scissor;\n"
        "  result = color;\n"
        " } else if (type == 1) { // Image\n"
        "  // Calculate color fron texture\n"
        "  vec2 pt = (paintMat * vec3(fpos,1.0)).xy / extent;\n"
        "#ifdef NANOVG_GL3\n"
        "  vec4 color = texture(tex, pt);\n"
        "#else\n"
        "  vec4 color = texture2D(tex, pt);\n"
        "#endif\n"
        "  if (texType == 1) color = vec4(color.xyz*color.w,color.w);"
        "  if (texType == 2) color = vec4(color.x);"
        "  // Apply color tint and alpha.\n"
        "  color *= innerCol;\n"
        "  // Combine alpha\n"
        "  color *= strokeAlpha * scissor;\n"
        "  result = color;\n"
        " } else if (type == 2) { // Stencil fill\n"
        "  result = vec4(1,1,1,1);\n"
        " } else if (type == 3) { // Textured tris\n"
        "#ifdef NANOVG_GL3\n"
        "  vec4 color = texture(tex, ftcoord);\n"
        "#else\n"
        "  vec4 color = texture2D(tex, ftcoord);\n"
        "#endif\n"
        "  if (texType == 1) color = vec4(color.xyz*color.w,color.w);"
        "  if (texType == 2) color = vec4(color.x);"
        "  color *= scissor;\n"
        "  result = color * innerCol;\n"
        " }\n"
        "#ifdef NANOVG_GL3\n"
        " outColor = result;\n"
        "#else\n"
        " gl_FragColor = result;\n"
        "#endif\n"
        "}\n";

The shader is complete but also complex: it handles both scissoring (clipping) and anti-aliasing. After careful analysis, I found several performance problems.

1. Color fill

Solid color fill and gradient fill share the same code:

    " if (type == 0) { // Gradient\n"
    "  // Calculate gradient color using box gradient\n"
    "  vec2 pt = (paintMat * vec3(fpos,1.0)).xy;\n"
    "  float d = clamp((sdroundrect(pt, extent, radius) + feather*0.5) / feather, 0.0, 1.0);\n"
    "  vec4 color = mix(innerCol,outerCol,d);\n"
    "  // Combine alpha\n"
    "  color *= strokeAlpha * scissor;\n"
    "  result = color;\n"

Problem: A solid color fill needs only one instruction, whereas a gradient fill needs dozens. Sharing one code path for both makes solid color fills more than 10 times slower.

Solution: Split color fill into the following cases and optimize each one separately.

Solid color fill of rectangles. For a rectangle that needs no clipping (the most common case), a direct assignment is enough, and performance improves by more than 20 times:

    " if (type == 5) { //fast fill color\n"
    "  result = innerCol;\n"

Solid color fill of general polygons. Removing the gradient sampling function more than doubles performance:

    " } else if(type == 7) { // fill color\n"
    "  strokeAlpha = strokeMask();\n"
    "  if (strokeAlpha < strokeThr) discard;\n"
    "  float scissor = scissorMask(fpos);\n"
    "  vec4 color = innerCol;\n"
    "  color *= strokeAlpha * scissor;\n"
    "  result = color;\n"

Gradient fill (only a tiny fraction of fills). This case is very rare, so it keeps the previous code.

Result: On average, fill performance improves by more than 10 times!

2. Fonts

For text, the ratio of pixels that need to be displayed to pixels that do not averages about 1:1.

    " } else if (type == 3) { // Textured tris\n"
    "#ifdef NANOVG_GL3\n"
    "  vec4 color = texture(tex, ftcoord);\n"
    "#else\n"
    "  vec4 color = texture2D(tex, ftcoord);\n"
    "#endif\n"
    "  if (texType == 1) color = vec4(color.xyz*color.w,color.w);"
    "  if (texType == 2) color = vec4(color.x);"
    "  color *= scissor;\n"
    "  result = color * innerCol;\n"
    " }\n"

Problem: If both the visible and the invisible pixels go through the full pipeline, about half of the time is wasted.

Solution: Skip the pixel immediately when color.x < 0.02, and move scissoring and anti-aliasing after that check.

    " } else if (type == 3) { // Textured tris\n"
    "#ifdef NANOVG_GL3\n"
    "  vec4 color = texture(tex, ftcoord);\n"
    "#else\n"
    "  vec4 color = texture2D(tex, ftcoord);\n"
    "#endif\n"
    "  if(color.x < 0.02) discard;\n"
    "  strokeAlpha = strokeMask();\n"
    "  if (strokeAlpha < strokeThr) discard;\n"
    "  float scissor = scissorMask(fpos);\n"
    "  color = vec4(color.x);"
    "  color *= scissor;\n"
    "  result = color * innerCol;\n"
    " }\n"

Result: Font rendering performance doubles!
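The article shows only the shader side of the fill-type split from section 1. The host-side decision could look roughly like the following minimal sketch. The struct, the function names, and the two flags are hypothetical stand-ins, not NanoVG or AWTK code; only the type ids 0, 5, and 7 come from the shader snippets above. It assumes a solid fill can be recognized by identical inner and outer colors, and it leaves image paints out entirely.

```c
/* Shader type ids: 0 is the original gradient path; 5 and 7 are the ids
   used in the article's modified shader. The enum names are hypothetical. */
enum {
    SKETCH_FILL_GRADIENT   = 0,
    SKETCH_FILL_FAST_COLOR = 5,
    SKETCH_FILL_COLOR      = 7
};

/* Hypothetical, simplified paint state: RGBA inner and outer colors only. */
typedef struct {
    float inner[4];
    float outer[4];
} SketchPaint;

/* Assumption: a paint whose inner and outer colors match is a solid color. */
int sketch_is_solid(const SketchPaint* p) {
    for (int i = 0; i < 4; i++) {
        if (p->inner[i] != p->outer[i]) return 0;
    }
    return 1;
}

/* Pick the cheapest shader path that still draws the fill correctly. */
int sketch_pick_fill_type(const SketchPaint* p, int is_axis_aligned_rect,
                          int inside_scissor) {
    if (!sketch_is_solid(p)) return SKETCH_FILL_GRADIENT;
    /* No anti-aliasing or per-pixel scissor needed: plain assignment. */
    if (is_axis_aligned_rect && inside_scissor) return SKETCH_FILL_FAST_COLOR;
    /* Solid color, but edges or clipping still need the masks. */
    return SKETCH_FILL_COLOR;
}
```

The ordering matters: the gradient path is the fallback that is always correct, and the cheaper paths are chosen only when their preconditions are known to hold.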
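3. Anti-aliasing

The anti-aliasing function is implemented as follows (honestly, I do not fully understand it either):

    "float strokeMask() {\n"
    " return min(1.0, (1.0-abs(ftcoord.x*2.0-1.0))*strokeMult) * min(1.0, ftcoord.y);\n"
    "}\n"

Problem: Compared with a simple assignment, enabling anti-aliasing makes performance drop by 5 to 10 times. Without anti-aliasing, however, polygon edges look poor. Leaving it out looks bad and putting it in is too slow, so this looks like a dilemma.

Solution: Rectangle fills do not need anti-aliasing, and more than 90% of fills are rectangle fills. Handling rectangle fills separately takes a single instruction and improves performance by more than 20 times:

    " if (type == 5) { //fast fill color\n"
    "  result = innerCol;\n"

Result: Combined with the scissoring and rectangle optimizations, performance improves by more than 10 times.

4. Scissoring

Doing the scissoring in the shader is reasonable, but it costs a great deal of performance.

    "// Scissoring\n"
    "float scissorMask(vec2 p) {\n"
    " vec2 sc = (abs((scissorMat * vec3(p,1.0)).xy) - scissorExt);\n"
    " sc = vec2(0.5,0.5) - sc * scissorScale;\n"
    " return clamp(sc.x,0.0,1.0) * clamp(sc.y,0.0,1.0);\n"
    "}\n"

Problem: Compared with a simple assignment, adding scissoring makes performance drop by more than 10 times. Without scissoring, however, widgets such as scroll views cannot be implemented, so this also looks like a dilemma.

Solution: More than 90% of fills lie entirely inside the scissor region, so there is no need to test every pixel. The test can be done once per path, outside the shader.

    /* Returns 1 if every fill vertex of the path lies inside the scissor
       rectangle, 0 otherwise. */
    static int glnvg__pathInScissor(const NVGpath* path, NVGscissor* scissor) {
      int32_t i = 0;
      /* xform[4], xform[5] hold the center; extent holds the half sizes. */
      float cx = scissor->xform[4];
      float cy = scissor->xform[5];
      float hw = scissor->extent[0];
      float hh = scissor->extent[1];
      float l = cx - hw;
      float t = cy - hh;
      float r = l + 2 * hw - 1;
      float b = t + 2 * hh - 1;
      const NVGvertex* verts = path->fill;

      for (i = 0; i < path->nfill; i++) {
        const NVGvertex* iter = verts + i;
        int x = iter->x;
        int y = iter->y;
        if (x < l || x > r || y < t || y > b) {
          return 0;
        }
      }

      return 1;
    }

Result: Combined with the scissoring and rectangle optimizations, performance improves by more than 10 times.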
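The host-side scissor test reads the scissor center straight from xform[4] and xform[5], which is only meaningful when the scissor transform is a pure translation. A self-contained variant that makes that assumption explicit could look like the sketch below. The struct and function names are hypothetical stand-ins for the NanoVG types, and both the "disabled scissor has negative extents" check and the conservative fallback for rotated or scaled scissors are additions of this sketch, not part of the article's code.

```c
/* Hypothetical stand-ins for the NanoVG vertex and scissor types. */
typedef struct { float x, y; } SketchVertex;

typedef struct {
    float xform[6];  /* 2x3 affine matrix; translation in [4] and [5] */
    float extent[2]; /* half width and half height */
} SketchScissor;

/* Returns 1 when every vertex is known to lie inside the scissor rectangle,
   so the per-pixel scissor mask can be skipped. Returns 0 when unsure. */
int sketch_points_in_scissor(const SketchVertex* verts, int count,
                             const SketchScissor* s) {
    /* Assumption: negative extents mean scissoring is disabled. */
    if (s->extent[0] < 0.0f || s->extent[1] < 0.0f) return 1;

    /* The box test below is only valid for a pure translation, so fall back
       to the shader path for rotated, skewed, or scaled scissors. */
    if (s->xform[0] != 1.0f || s->xform[1] != 0.0f ||
        s->xform[2] != 0.0f || s->xform[3] != 1.0f) {
        return 0;
    }

    float l = s->xform[4] - s->extent[0];
    float t = s->xform[5] - s->extent[1];
    float r = l + 2.0f * s->extent[0] - 1.0f;
    float b = t + 2.0f * s->extent[1] - 1.0f;

    for (int i = 0; i < count; i++) {
        if (verts[i].x < l || verts[i].x > r ||
            verts[i].y < t || verts[i].y > b) {
            return 0;
        }
    }
    return 1;
}
```

Returning 0 whenever the answer is uncertain keeps the optimization safe: the worst case is that a path takes the slower shader route it would have taken anyway.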