Tokenizing Buildings: A Transformer for Layout Synthesis

arXiv — cs.CV · Friday, December 5, 2025 at 5:00:00 AM
  • A new Transformer-based architecture called Small Building Model (SBM) has been introduced for layout synthesis in Building Information Modeling (BIM) scenes. This model addresses the challenge of tokenizing buildings by integrating diverse architectural features into sequences while maintaining their compositional structure, utilizing a sparse attribute-feature matrix to represent room properties.
  • The development of SBM is significant as it enhances the efficiency and accuracy of layout synthesis in BIM, allowing for high-fidelity room embeddings and improved predictive capabilities through its encoder-decoder pipeline for Data-Driven Entity Prediction (DDEP).
  • This advancement reflects a broader trend in the AI field, where Transformer models are increasingly applied to complex tasks across various domains, including Computer-Aided Design (CAD) and multimodal understanding. The integration of such models is reshaping how architectural and design processes are approached, emphasizing the importance of efficient data representation and synthesis.
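The sparse attribute-feature matrix mentioned above can be pictured as a rooms-by-attributes table in which each room fills in only the attributes that apply to it. The attribute names and room values below are invented for illustration; the paper's actual schema is not specified here.

```python
import numpy as np

# Hypothetical attribute vocabulary; the real SBM feature set is not
# described in this summary, so these names are assumptions.
ATTRIBUTES = ["area", "height", "window_count", "door_count", "hvac_zone"]
ATTR_INDEX = {name: i for i, name in enumerate(ATTRIBUTES)}

def rooms_to_matrix(rooms):
    """Encode room dicts into an array with NaN marking unspecified
    attributes -- a dense stand-in for a sparse representation."""
    mat = np.full((len(rooms), len(ATTRIBUTES)), np.nan)
    for r, attrs in enumerate(rooms):
        for name, value in attrs.items():
            mat[r, ATTR_INDEX[name]] = value
    return mat

rooms = [
    {"area": 24.0, "window_count": 2},               # e.g. an office
    {"area": 8.5, "height": 2.6, "door_count": 1},   # e.g. a corridor
]
M = rooms_to_matrix(rooms)
print(M.shape)             # (2, 5)
print(int(np.isnan(M).sum()))  # 5 unspecified entries
```

Rows of such a matrix can then be flattened into token sequences for the Transformer, with the NaN/empty cells simply omitted, which is what keeps the representation sparse.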
— via World Pulse Now AI Editorial System


Continue Reading
Self-Paced and Self-Corrective Masked Prediction for Movie Trailer Generation
Positive · Artificial Intelligence
A new method for movie trailer generation, named SSMP, uses self-paced and self-corrective masked prediction with bi-directional contextual modeling to improve trailer quality. This approach addresses the error propagation that limits traditional selection-then-ranking methods of trailer creation.
Controllable Long-term Motion Generation with Extended Joint Targets
Positive · Artificial Intelligence
A new framework called COMET has been introduced for generating stable and controllable character motion in real-time, addressing challenges in computer animation related to fine-grained control and motion degradation over long sequences. This autoregressive model utilizes a Transformer-based conditional VAE to allow precise control over user-specified joints, enhancing tasks such as goal-reaching and in-betweening.
Sliding-Window Merging for Compacting Patch-Redundant Layers in LLMs
Positive · Artificial Intelligence
A new method called Sliding-Window Merging (SWM) has been proposed to enhance the efficiency of large language models (LLMs) by compacting patch-redundant layers. This technique identifies and merges consecutive layers based on their functional similarity, thereby maintaining performance while simplifying model architecture. Extensive experiments indicate that SWM outperforms traditional pruning methods in zero-shot inference performance.
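The core idea described for SWM, merging consecutive layers that behave similarly, can be sketched with a toy example. Real SWM operates on transformer layers and presumably uses a learned or output-based similarity; here layers are stand-in weight vectors and the similarity is plain cosine, both assumptions for illustration only.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two flat vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def sliding_window_merge(layers, threshold=0.95):
    """Slide over consecutive layers; average a layer into its
    predecessor when the pair is functionally similar enough."""
    merged = [layers[0]]
    for layer in layers[1:]:
        if cosine(merged[-1], layer) >= threshold:
            merged[-1] = (merged[-1] + layer) / 2  # fuse similar neighbors
        else:
            merged.append(layer)
    return merged

rng = np.random.default_rng(0)
base = rng.normal(size=16)
layers = [
    base,
    base + 0.01 * rng.normal(size=16),  # near-duplicate of its neighbor
    rng.normal(size=16),                # functionally distinct layer
]
compact = sliding_window_merge(layers)
print(len(layers), "->", len(compact))  # 3 -> 2 with this seed
```

The window only ever compares adjacent layers, which is what keeps the procedure cheap relative to searching all layer pairs.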
Reconstructing KV Caches with Cross-layer Fusion For Enhanced Transformers
Positive · Artificial Intelligence
Researchers have introduced FusedKV, a novel approach to reconstructing key-value (KV) caches in transformer models, enhancing their efficiency by fusing information from bottom and middle layers. This method addresses the significant memory demands of KV caches during long sequence processing, which has been a bottleneck in transformer performance. Preliminary findings indicate that this fusion retains essential positional information without the computational burden of rotary embeddings.
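The fusion idea can be sketched as reconstructing an upper layer's key-value cache from a bottom-layer and a middle-layer cache, so upper layers need not store their own. The fixed blend weight `alpha` below is a placeholder assumption; whatever learned fusion FusedKV actually uses is not detailed in this summary.

```python
import numpy as np

def fuse_kv(bottom_kv, middle_kv, alpha=0.5):
    """Blend (K, V) pairs from two source layers into a reconstructed
    cache for an upper layer. `alpha` is an illustrative stand-in for
    a learned fusion rule."""
    k = alpha * bottom_kv[0] + (1 - alpha) * middle_kv[0]
    v = alpha * bottom_kv[1] + (1 - alpha) * middle_kv[1]
    return k, v

seq_len, d_head = 8, 4
rng = np.random.default_rng(1)
bottom = (rng.normal(size=(seq_len, d_head)),
          rng.normal(size=(seq_len, d_head)))
middle = (rng.normal(size=(seq_len, d_head)),
          rng.normal(size=(seq_len, d_head)))

k, v = fuse_kv(bottom, middle, alpha=0.3)
print(k.shape, v.shape)  # (8, 4) (8, 4)
```

The memory saving comes from the upper layers reusing the fused cache at inference time instead of materializing one per layer, which matters most for long sequences where KV storage dominates.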
MAGE-ID: A Multimodal Generative Framework for Intrusion Detection Systems
Positive · Artificial Intelligence
A new framework named MAGE-ID has been introduced to enhance Intrusion Detection Systems (IDS) by addressing challenges such as heterogeneous network traffic and data imbalance between benign and attack flows. This multimodal generative framework utilizes a diffusion-based approach to synthesize data from tabular flow features and their transformed images, improving detection performance significantly on datasets like CIC-IDS-2017 and NSL-KDD.
AutoBrep: Autoregressive B-Rep Generation with Unified Topology and Geometry
Positive · Artificial Intelligence
A novel Transformer model named AutoBrep has been introduced to generate boundary representations (B-Reps) in Computer-Aided Design (CAD) with high quality and valid topology. This model addresses the challenge of end-to-end generation of B-Reps by employing a unified tokenization scheme that encodes geometric and topological characteristics as discrete tokens, facilitating a breadth-first traversal of the B-Rep face adjacency graph during inference.
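The breadth-first traversal of the face adjacency graph can be sketched directly: faces become tokens in BFS order. The cube adjacency below is a made-up toy B-Rep; AutoBrep's actual token vocabulary also encodes geometry, which this sketch omits.

```python
from collections import deque

def bfs_face_tokens(adjacency, start):
    """Emit faces of a B-Rep adjacency graph in breadth-first order,
    i.e. the order in which they would be tokenized."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        face = queue.popleft()
        order.append(face)            # emit this face as the next token
        for nb in adjacency[face]:
            if nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return order

# Faces of a cube: each face is adjacent to the four faces it touches.
cube = {
    "top":    ["north", "east", "south", "west"],
    "bottom": ["north", "east", "south", "west"],
    "north":  ["top", "bottom", "east", "west"],
    "south":  ["top", "bottom", "east", "west"],
    "east":   ["top", "bottom", "north", "south"],
    "west":   ["top", "bottom", "north", "south"],
}
print(bfs_face_tokens(cube, "top"))
# ['top', 'north', 'east', 'south', 'west', 'bottom']
```

Fixing a deterministic traversal order like this is what lets an autoregressive model predict the next face token while keeping topology (who borders whom) recoverable from the sequence.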
Toward Content-based Indexing and Retrieval of Head and Neck CT with Abscess Segmentation
Positive · Artificial Intelligence
A new study has introduced AbscessHeNe, a dataset of 4,926 contrast-enhanced CT slices specifically focused on head and neck abscesses, which are critical for timely diagnosis and treatment. This dataset aims to enhance the development of semantic segmentation models that can accurately identify abscess boundaries and assess deep neck space involvement.
Multimodal LLMs See Sentiment
Positive · Artificial Intelligence
A new framework named MLLMsent has been proposed to enhance the sentiment reasoning capabilities of Multimodal Large Language Models (MLLMs). This framework explores sentiment classification directly from images, sentiment analysis on generated image descriptions, and fine-tuning LLMs on sentiment-labeled descriptions, achieving state-of-the-art results in recent benchmarks.