All materials referenced in this article: https://163.53.94.133/assets/uploads/files/cmp-key-shuxun.pdf
Shanghai Digital News CIO Qian Yu
Shanghai Digital News started as a traditional data center company, so why did it move into cloud computing? After 2011 the data center industry came to look more and more like real estate: the business is highly replicable and intensely competitive. By 2013 new technologies had emerged, including the explosive growth of OpenStack, so in 2014 we decided to do cloud computing.
Our original definition was multi-platform. Looking at real application scenarios, it is not that virtual machines or containers are inherently better; in different scenarios there is no question of one replacing the other. But once you build both platforms you hit a very awkward problem: the virtual machine network and the container network are two different things.
Along the way we evaluated roughly ten SDN technologies, from commercial products to open source projects to small-scale domestic offerings. At that time Tungsten Fabric was still called OpenContrail, and the version available then only supported OpenStack.
The term CMP (cloud management platform) only became popular in recent years, but when we first started we already had the idea of a CMP.
The portals of both OpenStack and native K8s are built from the perspective of operations and maintenance, not of delivering services to outside users, so from the user's point of view they are painful to use. We therefore decided to unify the two platforms and build a complete web platform around the user's own interface.
By then the direction of our cloud platform and SDN was set, built mainly on OpenStack and K8s. We found that K8s is not a PaaS platform in itself; it only solves the problem of managing Docker. In a small environment it hardly matters which you use, and you do not need SDN at all, OpenStack included: if the business environment is not too complex, a traditional VLAN works fine as long as you control the scale and avoid broadcast storms.
However, if your business scenarios are complex, workloads that used to live in virtual machines now also live in containers. This kind of business puts enormous pressure on the network, and it is impossible to hand-craft a policy for every line of business; whenever a workload migrates or needs troubleshooting, the back-end operations staff are overwhelmed. In the past you wrote a PBR rule, added a static route, and at most hung a few routes off a switch. None of that applies in a cloud network: a single tenant may have countless virtual machines running countless different applications, and the traffic flows go in every direction. Handling this the traditional way is hopeless, you never reach the end of it, so SDN is the way to go.
Tungsten Fabric (hereinafter TF) is indeed very good, but it has some issues. Switches that fully support OVSDB are more compatible with TF. That is not to say OpenFlow is bad, the same can be done with flow tables, but it is more troublesome.
At the bottom layer, the data communication ports are backed by SDN through TF. Back in 2015, in the OpenContrail era when K8s had just appeared, we proposed a container-based approach, because the drawbacks of the virtual machine approach for operations, capacity expansion and migration made it hard to guarantee future business. OpenStack was also still early then; we basically deployed it ourselves, and through joint development with Juniper Networks, OpenContrail was deployed alongside it.
In addition, Shanghai Digital News, as a data center operator, provides traditional hosting, and all of those customers are thinking about moving to the cloud. On the cloud side we use SDN rather than traditional VLANs, so how do users connect when they go to the cloud? You cannot simply open a VLAN and do some mapping; it is harder than that.
There is also the question of how to connect the many business scenarios where users still sit physically in the machine room to the cloud's overlay network, rather than bolting on a separate, standalone service.
To solve the VLAN mapping problem, you cannot provide a dedicated line for a user and at the same time ask him to change his VLAN network, which is unrealistic, so we did a lot of secondary development here. The same goes for OpenStack and OpenShift: the open source community versions are effectively single-node, and real scenarios need at least multi-node. Getting community-version software into a production environment, including integrating it with TF, still takes a great deal of secondary development.
What follows are problems from actual use cases we encountered during development, some of our own and some from user application scenarios.
Neutron's stability is relatively poor. In our tests, at around 2,500 virtual machines an inexplicable jitter appeared and brought them all down, so we remain cautious about native Neutron.
If K8s only runs a single line of business, basic native Flannel or Calico is enough. Calico is the most widely used open source option in K8s environments, but it does not support multi-data-center, multi-business scenarios.
Although OpenShift has OVS, whether it can interoperate with OpenStack is doubtful; in the end it simply cannot, they are two entirely separate systems.
It is also unclear whether VNF and CNF can coexist. Why would a virtual machine need to reach a container? To us it seemed quite unreasonable, but in finance and e-commerce some workloads run in virtual machines, some customers have bought commercial software, and some users with their own development capability have already put parts of their systems into containers.
In the past we would spin up a pile of load balancers inside virtual machines; on the container side we solved it with a single node exposing many ports (NodePort), including for many service registries that have no portal and cannot segment the network to do proxy forwarding. Users found it very hard to see whether this problem could be solved at all. Through trial and error we recently solved interworking between VNF and CNF at the OpenStack virtual machine level, using the management network to connect the two.
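To make the container side concrete, here is a minimal sketch (not our actual code) of exposing a service through a NodePort with the official Kubernetes Python client; the service name, label selector and port numbers are hypothetical:

```python
# Minimal sketch: expose a containerized service on a node port instead of
# running a dedicated load balancer VM. The service name, label selector and
# port numbers below are hypothetical examples, not production values.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when run in-cluster
v1 = client.CoreV1Api()

service = client.V1Service(
    metadata=client.V1ObjectMeta(name="registry-svc"),
    spec=client.V1ServiceSpec(
        type="NodePort",
        selector={"app": "registry"},          # pods labelled app=registry
        ports=[client.V1ServicePort(
            port=8080,         # cluster-internal service port
            target_port=8080,  # container port
            node_port=30080,   # port opened on every node (30000-32767 range)
        )],
    ),
)
v1.create_namespaced_service(namespace="default", body=service)
```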
In TF 4.0, mutual access between the virtual machine network and the container network is based on labels. It works, but it is not convenient. Even in version 5.1 the portal still does not expose this as an option, so every time you have to dig through the virtual machines and containers and map them by hand, which is tedious; we tried some secondary development here and it was exhausting. If these two things could be brought together, management would be much more convenient.
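For illustration only: one way to let a VM virtual network and a pod virtual network reach each other in TF is to attach a shared allow policy to both networks through the config API. This is a hedged sketch using the contrail-api-client (vnc_api) Python bindings, not the label-based workflow described above; the host, credentials and network names are assumptions:

```python
# Hedged sketch, assuming the contrail-api-client (vnc_api) bindings: connect a
# VM virtual network and a pod virtual network by attaching one shared
# allow-all network policy to both. Host, credentials and names are placeholders.
from vnc_api.vnc_api import (VncApi, NetworkPolicy, PolicyEntriesType,
                             PolicyRuleType, AddressType, PortType,
                             ActionListType, VirtualNetworkPolicyType,
                             SequenceType)

vnc = VncApi(username="admin", password="secret", tenant_name="admin",
             api_server_host="192.0.2.10", api_server_port="8082")

vm_vn = vnc.virtual_network_read(fq_name=["default-domain", "demo", "vm-net"])
pod_vn = vnc.virtual_network_read(fq_name=["default-domain", "demo", "pod-net"])

# Bidirectional "pass" rule between the two virtual networks, any protocol/port.
rule = PolicyRuleType(
    direction="<>", protocol="any",
    src_addresses=[AddressType(virtual_network=vm_vn.get_fq_name_str())],
    dst_addresses=[AddressType(virtual_network=pod_vn.get_fq_name_str())],
    src_ports=[PortType(-1, -1)], dst_ports=[PortType(-1, -1)],
    action_list=ActionListType(simple_action="pass"))

policy = NetworkPolicy("vm-pod-allow",
                       network_policy_entries=PolicyEntriesType([rule]))
vnc.network_policy_create(policy)

# Attach the policy to both networks so traffic is allowed in either direction.
for vn in (vm_vn, pod_vn):
    vn.add_network_policy(policy, VirtualNetworkPolicyType(sequence=SequenceType(0, 0)))
    vnc.virtual_network_update(vn)
```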
How do software-defined firewall and load balancer services land in these cross-platform scenarios? In most user scenarios commercial software is used, from all kinds of vendors; each supplies its own image, and each virtual appliance has its own characteristics. Integrating them with TF definitely requires secondary development, but at present TF seems to be the only platform where this is even feasible; with the others it is harder still.
On the question of VPC: as we understand it, TF's notion of a VPC is more reasonable than current domestic SDN offerings, essentially establishing a tunnel between the two endpoints. As for directly managing the virtual machines inside a public cloud, that seems unlikely; even if the provider handed it to you, you would probably give up in the end, since the problem of version iteration alone cannot be solved, and nobody really does this. The advantage is that there is a portal where you can see the actual state of the whole business.
One complaint: TF does solve OpenStack's network scaling and stability problems, but it is a bit picky about network cards. In some special scenarios, such as running VDI over the IDP protocol, we found that Intel and Broadcom NICs were not so friendly.
Compared with OpenStack, integrating TF with OpenShift has so far been a little more difficult. OpenShift's open source OKD has some problems. If you only install TF with OpenShift or K8s and run simple applications, you will not notice them; but once you run several business chains, tags, applications, routing gateways, business orchestration and so on, problems appear across the whole process.
That said, as the documentation suggests, TF's support for OpenShift is still much better than that of other open source or commercial software; with TF we can at least see a path forward for secondary development.
With regard to service chains, it would be better if they could match on ports without interfering with the attributes of the whole network; in some specific scenarios this gets quite complex.
We are currently adapting to a multi-cloud environment, so far only AWS and Azure, but this is driven by actual application scenarios; there is no need to connect every public cloud platform, and the business is not that complicated.
We have tested support for DPDK and SmartNICs. In OpenStack's default kernel environment, throughput basically cannot meet the figures security vendors quote for their software; only with DPDK can the nominal values be reached. However, DPDK is tied to Intel, and some applications run with it while others do not, so if you want to use DPDK, evaluate it against your own usage scenario.
TF provides a REST-like API, so even if you want to build a CMP it is relatively easy to call backend parameters, though the API documentation is a bit messy.
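As a rough illustration of what such a CMP backend call can look like, assuming a TF configuration API node reachable on the default port 8082 (the address and token below are placeholders):

```python
# Rough illustration of a CMP backend call against the TF configuration API,
# which normally listens on port 8082 and returns plain JSON. The controller
# address and token are placeholders.
import requests

API = "http://192.0.2.10:8082"
HEADERS = {"X-Auth-Token": "<keystone token, if auth is enabled>"}

resp = requests.get(f"{API}/virtual-networks", headers=HEADERS, timeout=10)
resp.raise_for_status()

# Print the fully-qualified name and UUID of every virtual network.
for vn in resp.json().get("virtual-networks", []):
    print(":".join(vn["fq_name"]), vn["uuid"])
```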
We take cloud computing very seriously and have been polishing our platform continuously since 2014. Everything is presented from the user's point of view, including surfacing TF's backend APIs in the front end so that a tenant can adjust things according to his own policy. In the CMP era, if an application scenario cannot be brought up within 15 minutes it has failed, never mind needing a third party's help, so we open up a great many permissions to users.
Our PaaS backend is OpenShift, and everything that runs on the PaaS platform is re-presented in our own front end, including TF's functions for OpenShift. Monitoring can be done inside TF itself, with no need to fall back on collection via something as primitive as SNMP.
That is how far the Shanghai Digital News platform has come so far. I chose TF because its protocols are relatively standard and the problems can be solved with BGP; I am quite resistant to proprietary protocols. Some competitors always want to impose a unified proprietary stack, which is impossible in the end, so openness is much more acceptable.
As for VXLAN: in practice, if you run it in the kernel and the traffic volume is very large, the loss is still significant. On switches or network cards without specific VXLAN optimization, the direct performance loss is around 30%.
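For context, a quick back-of-the-envelope check (illustrative, with assumed MTU values): the fixed VXLAN header overhead is only a few percent of a frame, which suggests that a loss on the order of 30% comes from per-packet kernel encapsulation work rather than the extra bytes on the wire, and that is exactly the cost hardware offload removes.

```python
# Back-of-the-envelope check (illustrative): VXLAN adds a fixed ~50 bytes of
# outer headers, only a few percent of a frame, so a ~30% loss mostly reflects
# per-packet kernel encap/decap work, which NIC/switch offload removes.
VXLAN_OVERHEAD = 14 + 20 + 8 + 8   # outer Ethernet + IPv4 + UDP + VXLAN headers

for mtu in (1500, 9000):           # assumed MTU values
    print(f"MTU {mtu}: header overhead {VXLAN_OVERHEAD / mtu:.1%}, "
          f"payload left {mtu - VXLAN_OVERHEAD} bytes")
```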
Looking at TF as a whole, it largely manages the different platforms and their different network features together, but containers and virtual machines still involve a certain amount of manual work, and it would be better if TF solved that. In addition, TF's authentication mechanisms for OpenStack and OpenShift are not consistent.
What has been more painful in recent years is that support, whether from the open source community or official channels, is relatively thin: it focuses mainly on installation, with some troubleshooting, but says little about deploying real application scenarios. Doing cloud computing is not just spinning up a virtual machine; for that it hardly matters whether you use OpenStack, plain KVM already solves it. Cloud computing is not virtualization: it carries business logic, which means the platform has to provide a lot of support for the businesses of users actually landing on it.
We adopted TF relatively early, starting with version 3.2 and integrating officially from version 4.0. I believe that if you have your own business logic and some development capability, you can build good products of your own on top of TF; it is highly programmable and versatile. Out of 100 I would score it 80, with the remaining 20 lost to support.
Above, I have laid out some of the problems we ran into, from actual application scenarios to the various situations encountered during development.
Thank you very much!
About Tungsten Fabric:
Tungsten Fabric is an open source project built on standard protocols that provides all the components needed for network virtualization and network security. Its components include an SDN controller, a virtual router, an analytics engine, published northbound APIs, hardware integration features, cloud orchestration integrations and extensive REST APIs.
About the TF Chinese Community:
The TF Chinese Community was started spontaneously by a group of Chinese volunteers who follow and love SDN, including technology veterans, marketing veterans, industry experts and experienced users. It serves as a bridge between the community and China, spreading information, relaying questions, organizing activities, and uniting everyone interested in multi-cloud interconnection to effectively solve the problems encountered in building cloud networks.
Follow on WeChat: TF Chinese Community