Language models are unreliable at writing long stories; model design has to turn to "writer bionics" and mimic how human writers actually work!
ChatGPT, the chatbot released by OpenAI, has truly gone viral. It seems to know everything under the sun, handles reason and emotion alike, and can dash off an 800-word essay without a draft.

However, ChatGPT continues a story by generating it word by word in one linear pass, which is far from how humans actually write; on that basis, it is destined to remain a writing assistant and will never become a true AI writer.
Recently, Dr. Tian Yuandong's team presented Re3 (Recursive Reprompting and Revision) at EMNLP 2022, a story-generation framework built on large language models. It gets the model to produce consistent stories through prompt design alone, with no fine-tuning of the large model at all.
Paper: https://arxiv.org/pdf/2210.06774.pdf
Code: https://github.com/yangkevin2/emnlp22-re3-story-generation
Instead of the language model's linear, word-by-word logic, the Re3 framework generates stories hierarchically in four stages. First, the Plan stage generates the story's characters, their attributes, and an outline. Then, given the outline and characters, the Draft stage repeatedly generates candidate passages. Those candidates are filtered by the Rewrite stage, which keeps the passages most relevant to the preceding text and discards the rest (this step requires training a small ranking model). Finally, the Edit stage corrects obvious factual errors. The whole loop is sketched below.
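To make the division of labor concrete, here is a minimal runnable sketch of the four-stage loop. Every function is a placeholder stand-in: the real system prompts GPT-3 at each step, and none of these names come from the paper's code.

```python
# Illustrative sketch of the Re3 Plan -> Draft -> Rewrite -> Edit loop.
# All functions are placeholders, not the paper's actual API.

def plan(premise: str) -> dict:
    # Plan: expand the premise into a setting, characters, and an outline.
    return {"premise": premise,
            "setting": "a placeholder setting",
            "characters": {"Alice": "a placeholder protagonist"},
            "outline": ["beginning", "middle", "end"]}

def draft(story_plan: dict, story: list, n: int = 4) -> list:
    # Draft: sample n candidate continuations from the plan plus
    # the most recently generated passages.
    context = " ".join(story[-2:]) or story_plan["premise"]
    return [f"[candidate {i} continuing: {context[:40]}...]" for i in range(n)]

def rewrite(story: list, candidates: list) -> str:
    # Rewrite: rerank candidates (Re3 trains small relevance and
    # coherence rankers for this) and keep the best one.
    return max(candidates, key=len)  # placeholder scoring

def edit(passage: str) -> str:
    # Edit: correct contradictions of character attributes
    # (age, occupation, relationships) before committing the passage.
    return passage

def generate_story(premise: str, passages: int = 3) -> str:
    story_plan = plan(premise)
    story = []
    for _ in range(passages):
        story.append(edit(rewrite(story, draft(story_plan, story))))
    return "\n\n".join(story)

print(generate_story("A lighthouse keeper finds a message in a bottle."))
```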
The authors argue that this iterative approach is much closer to how human writers generally work than a language model's mindless word-by-word generation.
In terms of results, Re3 writes much longer stories than other current text-generation methods: the longest runs to 7,500 words, about half an hour of reading. It is no exaggeration to say that is more than ten times the length other approaches manage.
Most importantly, the text Re3 generates is largely self-consistent: the main characters do not change abruptly, the style does not suddenly shift, and plot threads set up earlier can be matched up later. Nor does it fall into the endless text loops that other approaches often produce.
Re3's stories, nearly 3,000 words long, also hold together well. Compared with stories of similar length generated directly from the same base model, human evaluators judged more Re3 stories to have a coherent overall plot (a 14% absolute increase) and to stay relevant to the given initial premise (a 20% increase).
Nevertheless, Re3 still falls well short of human writers. The main problems are characters with no motivation, haphazard plot arrangement, and muddled details, to say nothing of pacing, rhythm, or theme; the stories can be tiring to read.
Writers take great care with subtle foreshadowing that pays off much later; as the Chinese idiom puts it, "a gray line left by a snake in the grass can foreshadow events a thousand miles away." An important detail, a character's action, or the emotional coloring of a scene may become the decisive factor in how the story develops next, to say nothing of intricate relationships between characters and unexpected plot turns. These advanced writing skills remain far beyond AI's reach.
Kevin Yang, the paper's first author, is a fourth-year doctoral student at the University of California, Berkeley. His main research interest is controllable natural-language generation in structured settings, for example using structured methods of controllable generation to improve the coherence of long texts.
The second author, Dr. Tian Yuandong, is a research scientist and senior manager at Meta AI. His research interests include deep reinforcement learning and its applications in games, as well as theoretical analysis of deep learning models. He received his bachelor's and master's degrees from Shanghai Jiao Tong University in 2005 and 2008, and his doctorate from the Carnegie Mellon University Robotics Institute in 2013.
Generating long, coherent stories, the problem the Re3 framework takes on, is a hard problem in artificial intelligence: the model needs a comprehensive command of language, the world, and common sense to pull it off.
Previous work on automatic story generation produced stories ranging from a few sentences to one or two paragraphs in length. While stories that short make a good testbed for text generation, they are still far shorter than a typical short story.
The Re3 framework was designed to address this shortcoming of current story-generation models by producing much longer "short" stories.
Coherence and relevance matter far more for a long story than for a short one. The Re3 researchers are the first to automate the generation of coherent stories at this length, and further increases are limited primarily by evaluation cost rather than by technical obstacles.
The system must maintain a coherent overall plot across thousands of words: given an initial premise from the user, for example, the model needs to stay relevant to that premise over thousands of words of generated text.
Other difficulties include maintaining stylistic consistency over long spans and avoiding factual contradictions.
The researchers observed that human writers do not produce a long piece in a single pass; writing is an iterative process: (a) create a detailed plan, (b) draft each passage according to the plan, (c) revise or rewrite the whole piece, and (d) keep editing to refine the details.
Inspired by this, the Re3 framework takes a similar approach to generating long stories.
Plan Module
The Plan module uses structured prompting of GPT3-Instruct-175B (GPT-3 fine-tuned to follow human instructions) to expand the initial premise into a more detailed story setting, simulating a human writer's high-level planning: the background, the characters with their descriptions, and a story outline.
These plan components are themselves generated by prompting and are reused repeatedly to build the prompts for story passages in the Draft module, i.e. recursive reprompting.
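A sketch of what this structured, stage-by-stage prompting might look like, assuming a hypothetical complete() stand-in for the GPT3-Instruct call; the prompt templates are illustrative, not the ones published with the paper:

```python
# Each plan component is generated from the ones before it, and all of
# them are reused later to build the Draft module's prompts.

def complete(prompt: str) -> str:
    # Stand-in for a GPT3-Instruct-175B completion call; swap in a
    # real language-model API client here.
    return "<model completion>"

premise = "A detective who can hear the thoughts of houses investigates a fire."

setting = complete(f"Premise: {premise}\n\nDescribe the setting of the story:")
characters = complete(
    f"Premise: {premise}\nSetting: {setting}\n\n"
    "List the main characters, one per line, as Name: description."
)
outline = complete(
    f"Premise: {premise}\nSetting: {setting}\nCharacters:\n{characters}\n\n"
    "Write a short numbered outline of the plot:"
)
```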
Draft Module
To generate a coherent story, the Draft module selects the most relevant parts of the plan and of the previously generated story, then reassembles them into a single prompt from which GPT3-175B generates the next passage.

This can be seen as a step beyond existing chain-of-thought approaches: the module dynamically selects the parts of previous language-model outputs that are relevant to the current step and, when needed, runs additional processing on them (e.g., summarization).
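A minimal sketch of how such a prompt could be assembled from the plan and the story so far; build_draft_prompt() and its fields are hypothetical, and the real system's selection and summarization logic is considerably richer:

```python
# Illustrates only the "reassemble into a single prompt" idea.

def build_draft_prompt(story_plan: dict, story_so_far: list,
                       outline_idx: int, max_context_chars: int = 2000) -> str:
    # Keep only the tail of the story that fits the context budget;
    # Re3 can also summarize earlier text instead of dropping it.
    recent_text = "\n\n".join(story_so_far)[-max_context_chars:]
    return (
        f"Setting: {story_plan['setting']}\n"
        f"Characters: {story_plan['characters']}\n"
        f"Current outline point: {story_plan['outline'][outline_idx]}\n\n"
        f"Story so far:\n{recent_text}\n\n"
        "Continue the story:"
    )
```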
Rewrite Module
The Rewrite module mimics human rewriting by reranking candidate continuations using a mix of relevance scores, coherence scores, and simple heuristic filters.

Training these rankers is the only place in the Re3 framework where existing story data is used; all the generation modules run zero-shot via prompting.
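A toy sketch of the reranking step, with trivial placeholder scorers standing in for the paper's trained relevance and coherence rankers:

```python
def relevance_score(outline_point: str, passage: str) -> float:
    # Placeholder: word overlap instead of a trained relevance ranker.
    return len(set(outline_point.lower().split()) & set(passage.lower().split()))

def coherence_score(previous_text: str, passage: str) -> float:
    # Placeholder: a trained ranker would score continuity with prior text.
    return 0.0

def passes_filters(passage: str) -> bool:
    # Heuristic filter: discard very short or highly repetitive passages.
    words = passage.split()
    return len(words) > 20 and len(set(words)) / len(words) > 0.4

def pick_best(candidates: list, outline_point: str, previous_text: str) -> str:
    survivors = [c for c in candidates if passes_filters(c)] or candidates
    return max(survivors,
               key=lambda c: relevance_score(outline_point, c)
                             + coherence_score(previous_text, c))
```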
Edit Module
The final Edit module addresses the problem of detecting and correcting long-range factual inconsistencies.

To make the task more tractable, the researchers focused on factual inconsistencies in character attributes, such as age, occupation, and relationships to other characters.

At a high level, the detection system maintains a compact knowledge base for each character in the form of an attribute dictionary. For each new story passage, the system checks for factual conflicts only against these attribute dictionaries, then updates them with the passage's new information, creating a fresh dictionary whenever a new character appears.

Concretely, the Edit module's detection system is a proof of concept inspired by OpenIE: it breaks the process down into simple GPT-3 queries and makes corrections with GPT-3's editing API, leaving plenty of room for later improvement.
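A compact sketch of the attribute-dictionary bookkeeping described above; extract_attributes() is a hypothetical placeholder for the GPT-3 extraction queries:

```python
def extract_attributes(passage: str) -> dict:
    # Placeholder for the OpenIE-inspired GPT-3 extraction queries,
    # returning e.g. {"Alice": {"age": "32", "occupation": "doctor"}}.
    return {}

def check_and_update(knowledge_base: dict, passage: str) -> list:
    """Return detected conflicts and fold new facts into the KB."""
    conflicts = []
    for character, attrs in extract_attributes(passage).items():
        # New characters get a fresh, empty attribute dictionary.
        known = knowledge_base.setdefault(character, {})
        for attr, value in attrs.items():
            if attr in known and known[attr] != value:
                conflicts.append(f"{character}.{attr}: {known[attr]!r} vs {value!r}")
            else:
                known[attr] = value
    # Conflicting spans would then be rewritten, e.g. via GPT-3's editing API.
    return conflicts
```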
In the experiments, the researchers evaluated the generated text on interestingness (whether it appeals to readers), coherence, relevance (faithfulness to the initial premise), and humanlikeness (how close it reads to human writing).

Each annotator saw one premise and two corresponding stories (one generated by Re3, one by a randomly chosen baseline) and gave each story a binary rating on each of the four metrics.
The results show that Re3 can write a long story that follows the intended premise while maintaining a coherent overall plot, validating the design choices inspired by the human writing process as well as recursive reprompting. Re3 showed significant improvements in coherence and relevance over both baselines, and annotators also flagged significantly fewer writing problems in Re3's stories.
References:
https://arxiv.org/abs/2210.06774
https://github.com/yangkevin2/emnlp22-re3-story-generation
https://twitter.com/KevinYa33964384/status/1582149319032852480
https://zhuanlan.zhihu.com/p/578638528
This article comes from the WeChat official account Xin Zhiyuan (ID: AI_era). Editor: LRS.