

Uncovering the Transformer in iPhone: based on the GPT-2 architecture, with emoji in the tokenizer, dug up by an MIT alumnus


Shulou (Shulou.com) 11/24 report:

The "secret" of Apple's Transformer has been uncovered by enthusiasts.

Amid the wave of large models, even a company as conservative as Apple now mentions "Transformer" at every press conference.

At this year's WWDC, for example, Apple announced that the new versions of iOS and macOS will ship with a built-in Transformer language model to power text prediction in the input method.

Apple has not officially disclosed much more, but technology enthusiasts couldn't sit still.

A developer named Jack Cook turned the macOS Sonoma beta upside down and really did dig up a lot of new information:

On the model architecture, Cook believes Apple's language model appears to be based on GPT-2.

On the tokenizer side, emoji are strikingly prominent.

Let's take a look at more details.

Based on the GPT-2 architecture

Let's first review what Apple's Transformer-based language model can do on iPhone, MacBook, and other devices.

It mainly shows up in the input method: backed by the language model, Apple's own keyboard can offer word prediction and error correction.

Jack Cook tested this specifically and found that the feature mostly predicts a single upcoming word.

△ Source: Jack Cook's blog post

The model sometimes predicts multiple upcoming words, but only when the sentence's meaning is very obvious, similar to the autocomplete function in Gmail.
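For a concrete feel of this behavior, here is a minimal sketch of single-word prediction using the open-source GPT-2 from Hugging Face. This is an illustration only: Apple's on-device model is not publicly loadable, and the prompt is made up.

```python
# Minimal sketch of next-word prediction with the open-source GPT-2.
# Illustrates the behavior Cook describes; it does NOT load Apple's model.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

prompt = "I'll see you at the"  # hypothetical example prompt
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, vocab)

# Distribution over the next token; show the top 5 candidates,
# roughly what a predictive keyboard would surface above the keys.
top = torch.topk(logits[0, -1], k=5)
for score, token_id in zip(top.values, top.indices):
    print(repr(tokenizer.decode(int(token_id))), float(score))
```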

△ Source: Jack Cook's blog post

So where exactly is this model installed? After some deep digging, Cook determined:

I found the predictive text model in /System/Library/LinguisticData/RequiredAssets_en.bundle/AssetData/en.lm/unilm.bundle.

The reason is:

1. Many of the files in unilm.bundle do not exist in macOS Ventura; they only appear in the new macOS Sonoma beta.

2. unilm.bundle contains an sp.dat file, which exists in both Ventura and the Sonoma beta, but the Sonoma beta version has been updated with a set of tokens that looks very much like a tokenizer's vocabulary.

3. The number of tokens in sp.dat matches two other files in unilm.bundle, unilm_joint_cpu.espresso.shape and unilm_joint_ane.espresso.shape, which describe the shapes of the layers in the Espresso/CoreML model (see the sketch after this list).
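The name sp.dat hints at a SentencePiece model. Below is a sketch of how one might inspect the bundle and count the tokens; the paths are from the article, and treating sp.dat as a standard SentencePiece model is an assumption based only on its name, not something Apple documents.

```python
# Sketch: list the bundle contents and count tokens in sp.dat.
# ASSUMPTION: sp.dat parses as a standard SentencePiece model.
from pathlib import Path
import sentencepiece as spm

bundle = Path(
    "/System/Library/LinguisticData/RequiredAssets_en.bundle"
    "/AssetData/en.lm/unilm.bundle"
)

# Files like unilm_joint_cpu.espresso.shape live alongside sp.dat.
for f in sorted(bundle.iterdir()):
    print(f.name)

sp = spm.SentencePieceProcessor(model_file=str(bundle / "sp.dat"))
print("vocab size:", sp.get_piece_size())  # Cook reports ~15,000 tokens
```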

Furthermore, based on the network structure described in unilm_joint_cpu, Cook speculated that the Apple model is based on the GPT-2 architecture:

The main components are token embeddings, positional encoding, decoder blocks, and an output layer, and each decoder block contains strings like gpt2_transformer_layer_3d.
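To picture that structure, here is a minimal PyTorch sketch of such a GPT-2-style stack. The hidden size of 512 and the 15,000-token vocabulary come from Cook's findings below; the layer count, head count, and context length are placeholders, and the whole thing is an illustration of the inferred structure, not Apple's actual network.

```python
# Illustrative GPT-2-style stack matching the structure Cook describes:
# token embeddings + positional encoding -> N decoder blocks -> output layer.
# Sizes other than d_model=512 and vocab=15000 are placeholders.
import torch
import torch.nn as nn

class DecoderBlock(nn.Module):
    def __init__(self, d_model=512, n_heads=8):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ln2 = nn.LayerNorm(d_model)
        self.mlp = nn.Sequential(          # GPT-2 uses a 4x MLP expansion
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x):
        # Causal mask: each position attends only to earlier positions.
        n = x.size(1)
        mask = torch.triu(
            torch.ones(n, n, dtype=torch.bool, device=x.device), diagonal=1
        )
        h = self.ln1(x)
        a, _ = self.attn(h, h, h, attn_mask=mask)
        x = x + a
        x = x + self.mlp(self.ln2(x))
        return x

class TinyGPT2(nn.Module):
    def __init__(self, vocab=15000, d_model=512, n_layers=8, n_ctx=256):
        super().__init__()
        self.tok = nn.Embedding(vocab, d_model)    # token embeddings
        self.pos = nn.Embedding(n_ctx, d_model)    # positional encoding
        self.blocks = nn.ModuleList(
            [DecoderBlock(d_model) for _ in range(n_layers)]
        )
        self.ln_f = nn.LayerNorm(d_model)
        self.head = nn.Linear(d_model, vocab, bias=False)  # output layer
        self.head.weight = self.tok.weight         # tied embeddings (assumed)

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        x = self.tok(ids) + self.pos(pos)
        for blk in self.blocks:
            x = blk(x)
        return self.head(self.ln_f(x))
```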

△ Source: Jack Cook's blog post

Based on the size of each layer, Cook further speculated that the Apple model has about 34 million parameters and a hidden size of 512. In other words, it is smaller than the smallest version of GPT-2.
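That figure is easy to sanity-check with back-of-the-envelope arithmetic. The sketch below assumes a GPT-2-style layout with tied embeddings and a 4x MLP expansion; the context length is a guess, and the layer count is solved for rather than stated anywhere in the article.

```python
# Back-of-the-envelope parameter count for a GPT-2-style model with
# d_model = 512 and a 15,000-token vocabulary (both from Cook's findings).
# Tied input/output embeddings, 4x MLP expansion, and n_ctx are assumptions.
d, vocab, n_ctx = 512, 15_000, 256

embeddings = vocab * d + n_ctx * d   # token table + position table
per_layer = 4 * d * d + 8 * d * d    # attention (q, k, v, out) + MLP

for n_layers in range(4, 13):
    total = embeddings + n_layers * per_layer
    print(n_layers, f"{total / 1e6:.1f}M")
# Around 8 layers this lands in the low 30s of millions of parameters,
# consistent with Cook's ~34M estimate (biases and LayerNorms add a little).
```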

Cook thinks this is mainly because Apple wants a model that consumes little power yet can run quickly and frequently.

Apple's official line at WWDC was that the model runs "every time you tap a key."

However, this also means the text-prediction model cannot continue whole sentences or paragraphs.

△ Source: Jack Cook's blog post

In addition to the model architecture, Cook also dug up information about the tokenizer.

In unilm.bundle/sp.dat he found a vocabulary of 15,000 tokens which, notably, contains 100 emoji.
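A rough way to reproduce that emoji tally, under the same SentencePiece assumption as before: the codepoint ranges below are a crude approximation of "emoji", so the exact count may differ.

```python
# Sketch: count vocabulary entries made up entirely of emoji codepoints.
# ASSUMPTIONS: sp.dat parses as a SentencePiece model; the ranges below
# are a rough approximation of "emoji", not a complete Unicode definition.
import sentencepiece as spm

EMOJI_RANGES = [
    (0x1F300, 0x1F5FF),  # symbols & pictographs
    (0x1F600, 0x1F64F),  # emoticons
    (0x1F680, 0x1F6FF),  # transport & map symbols
    (0x1F900, 0x1FAFF),  # supplemental symbols & pictographs
    (0x2600, 0x27BF),    # misc symbols & dingbats
]

def is_emoji(piece: str) -> bool:
    return bool(piece) and all(
        any(lo <= ord(ch) <= hi for lo, hi in EMOJI_RANGES) for ch in piece
    )

sp = spm.SentencePieceProcessor(model_file="sp.dat")  # path as needed
emoji_pieces = [
    sp.id_to_piece(i)
    for i in range(sp.get_piece_size())
    if is_emoji(sp.id_to_piece(i))
]
print(len(emoji_pieces))  # Cook reports 100 emoji among ~15,000 tokens
```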

Cook reveals Cook

Although this Cook is not that Cook (Tim), his blog post attracted a lot of attention as soon as it was released.

Based on his findings, netizens enthusiastically discussed how Apple balances user experience with the application of cutting-edge technology.

As for Jack Cook himself, he earned bachelor's and master's degrees in computer science from MIT and is currently pursuing a master's degree in Internet social science at Oxford University.

Previously, he interned at Nvidia, where he focused on language models such as BERT. He has also worked as a senior R&D engineer for natural language processing at The New York Times.

So, do his findings spark any thoughts for you? Feel free to share your views in the comments.

Original text link:

https://jackcook.com/2023/09/08/predictive-text.html
