#News ·2025-01-08
A look at one approach to PPT generation: PPTAgent. Traditional PPT generation methods usually follow an end-to-end text generation paradigm that focuses only on text content and ignores layout design and PPT structure. PPTAgent instead uses an edit-based generation paradigm to handle spatial relationships and design style.
Each slide in the traditional method can be represented by the following formula:
PPTAgent framework
PPTAgent is a framework for automatically generating PPTs. Its edit-based workflow is divided into two stages: PPT analysis and PPT generation.
The main goal of the first stage, PPT analysis, is to provide structured, semantic reference information for PPT generation through slide clustering and content schema extraction. The results of this stage directly affect the quality and efficiency of the generation stage.
Slide clustering is the process of grouping the slides in a reference PPT according to their function and content.
(Figure: clustering algorithm and clustering example)
Slides fall into two broad categories:
a. Structural slides: These slides support the structure of the presentation, such as opening slides, transition slides, and closing slides. PPTAgent uses the LLM to infer the functional role of each such slide and groups the slides by role. These slides usually have distinctive textual features.
b. Content slides: These slides convey specific information, such as slides containing bullet points, charts, and images. For these slides, PPTAgent applies hierarchical clustering based on image similarity: it computes the image similarity between slides and groups similar slides together.
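The image-similarity clustering of content slides can be sketched as follows. This is a minimal illustration, not PPTAgent's implementation: it assumes each slide image has already been reduced to a feature vector, and the cosine similarity, the average linkage, the threshold value, and the function names are all assumptions made for the example.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def cluster_slides(features, threshold=0.9):
    """Average-linkage agglomerative clustering (hypothetical helper).

    Repeatedly merges the two most similar clusters and stops once the
    best pair falls below `threshold`. Returns lists of slide indices.
    """
    clusters = [[i] for i in range(len(features))]

    def linkage(c1, c2):
        # Average pairwise similarity between the members of two clusters.
        sims = [cosine_similarity(features[i], features[j]) for i in c1 for j in c2]
        return sum(sims) / len(sims)

    while len(clusters) > 1:
        best_sim, best_pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                sim = linkage(clusters[a], clusters[b])
                if sim > best_sim:
                    best_sim, best_pair = sim, (a, b)
        if best_sim < threshold:
            break
        a, b = best_pair
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


# Toy feature vectors standing in for slide-image embeddings (made up).
slide_features = [
    [1.0, 0.0, 0.1],
    [0.9, 0.1, 0.1],
    [0.0, 1.0, 0.0],
    [0.1, 0.9, 0.0],
]
print(cluster_slides(slide_features))  # [[0, 1], [2, 3]]
```

A similarity threshold is used here instead of a fixed cluster count because the number of layout groups in a reference PPT is not known in advance.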
After clustering, content schema extraction describes each slide element along three dimensions:
a. Category: Describes the type of the element, such as text box or image.
b. Modality: Describes how the element is rendered, such as plain text or text with graphics.
c. Content: Describes the specific content of the element, such as its text or the alternative text of an image.
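To make the three dimensions concrete, an extracted schema could be held in a small data structure like the one below. This is only a sketch: the class name, field names, helper function, and sample values are hypothetical and are not taken from PPTAgent's code.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class ElementSchema:
    """One slide element described along the three dimensions above."""
    category: str  # type of element, e.g. "title" or "main image"
    modality: str  # how it is rendered, e.g. "text" or "image"
    content: str   # the text itself, or the alt text of an image


def build_slide_schema(elements):
    """Collect element descriptions into a JSON-serializable dict."""
    return {"elements": [asdict(e) for e in elements]}


# Sample values are invented for illustration.
schema = build_slide_schema([
    ElementSchema("title", "text", "Quarterly results"),
    ElementSchema("main image", "image", "Bar chart of revenue by region"),
])
print(json.dumps(schema, indent=2))
```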
The second stage uses the analysis results of the first stage to generate a new PPT. At its heart is an interactive editing process that produces the target PPT from reference slides and the input document. The steps are: generate a structured outline that specifies the reference slide and related content for each new slide; use LLMs to iteratively edit the reference slides into new slides; and provide five specialized APIs that let LLMs edit, delete, and copy text elements, as well as edit and delete visual elements.
Outline generation: The LLM is guided to create a structured outline in line with human preferences. Each entry specifies the reference slide, the index of the relevant document section, and the title and description of the new slide. Drawing on the LLM's planning and summarizing capabilities, together with the semantic information extracted from the reference PPT, this step produces a coherent and engaging outline that guides the generation of the new PPT.
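An outline entry as described above might be modeled like this. The sketch is illustrative only: the class, the field names, and the validation helper are hypothetical, and storing the section reference as a list of indices is an assumption made so that one slide can draw on more than one section.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class OutlineEntry:
    """One planned slide in the outline (hypothetical structure)."""
    reference_slide: int        # index of the reference slide to edit
    section_indices: List[int]  # indices of the relevant document sections
    title: str
    description: str


def validate_outline(outline, num_reference_slides, num_sections):
    """Return a list of error messages for out-of-range references."""
    errors = []
    for pos, entry in enumerate(outline):
        if not 0 <= entry.reference_slide < num_reference_slides:
            errors.append(f"entry {pos}: unknown reference slide {entry.reference_slide}")
        for idx in entry.section_indices:
            if not 0 <= idx < num_sections:
                errors.append(f"entry {pos}: unknown document section {idx}")
    return errors


# Invented example outline.
outline = [
    OutlineEntry(0, [0], "Introduction", "Opening slide with the talk title"),
    OutlineEntry(2, [1, 2], "Method", "Bullet summary of the two stages"),
]
print(validate_outline(outline, num_reference_slides=3, num_sections=3))  # []
```

Checking the indices before generation is useful because an LLM-produced outline can point at slides or sections that do not exist.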
Slide generation: New slides are generated by iteratively editing reference slides under the guidance of the outline. To enable precise manipulation of slide elements, PPTAgent implements five specialized APIs that let the LLM edit, delete, and copy text elements, as well as edit and delete visual elements. In addition, to help the LLM understand the slide structure, PPTAgent converts each slide from its original XML format to an HTML representation that is easier for the LLM to interpret.
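The five editing operations and the HTML view can be pictured with a toy in-memory slide. This is a sketch under stated assumptions: the `Slide` and `Element` classes, the method names, and the HTML layout are hypothetical stand-ins, not PPTAgent's actual API, and a real implementation would operate on the PPT's XML.

```python
from dataclasses import dataclass, field, replace
from html import escape
from typing import Dict


@dataclass
class Element:
    kind: str   # "text" or "image"
    value: str  # text content, or image path


@dataclass
class Slide:
    elements: Dict[str, Element] = field(default_factory=dict)

    def _get(self, element_id, kind):
        # Look up an element and check it has the expected kind.
        element = self.elements.get(element_id)
        if element is None or element.kind != kind:
            raise KeyError(f"no {kind} element with id {element_id!r}")
        return element

    # Three text operations: edit, delete, copy.
    def edit_text(self, element_id, new_text):
        self._get(element_id, "text").value = new_text

    def delete_text(self, element_id):
        self._get(element_id, "text")
        del self.elements[element_id]

    def copy_text(self, element_id, new_id):
        if new_id in self.elements:
            raise ValueError(f"element id {new_id!r} already exists")
        self.elements[new_id] = replace(self._get(element_id, "text"))

    # Two visual operations: edit, delete.
    def edit_image(self, element_id, new_path):
        self._get(element_id, "image").value = new_path

    def delete_image(self, element_id):
        self._get(element_id, "image")
        del self.elements[element_id]

    def to_html(self):
        """Render a simplified HTML view of the slide for the LLM."""
        lines = ['<div class="slide">']
        for element_id, el in self.elements.items():
            if el.kind == "text":
                lines.append(f'  <p id="{escape(element_id)}">{escape(el.value)}</p>')
            else:
                lines.append(f'  <img id="{escape(element_id)}" src="{escape(el.value)}">')
        lines.append("</div>")
        return "\n".join(lines)


# Edit a reference slide into a new slide (all values invented).
slide = Slide({
    "title": Element("text", "Old title"),
    "bullet_0": Element("text", "Old bullet"),
    "figure": Element("image", "old.png"),
})
slide.edit_text("title", "PPTAgent overview")
slide.copy_text("bullet_0", "bullet_1")
slide.edit_text("bullet_0", "Stage 1: analysis")
slide.edit_text("bullet_1", "Stage 2: generation")
slide.delete_image("figure")
print(slide.to_html())
```

Restricting the model to a handful of id-addressed operations keeps each edit small and checkable, which is the appeal of editing a reference slide rather than generating one from scratch.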
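Evaluation metrics. The existing metrics include Success Rate (SR), Perplexity (PPL), ROUGE-L, and FID.
PPTEval scores a presentation along three dimensions: Content, Design, and Coherence.
Together, these metrics evaluate the quality of the generated PPT along different dimensions.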
PPTAgent: Generating and Evaluating Presentations Beyond Text-to-Slides, https://arxiv.org/pdf/2501.03936v1