Computer Vision Project (Ongoing) – Structural Drawing Interpretation Using Deep Learning

Summary

Developing a computer vision pipeline to extract structural elements directly from engineering drawings. The project focuses on identifying beams, columns, and structural walls in plan drawings and producing instance-level masks that preserve geometry and connectivity. Rather than simply locating objects, the goal is to generate structured outputs that could support automated modelling or quantity extraction workflows.

I am building custom datasets from real structural plans, training deep convolutional models, and evaluating performance using precision–recall and IoU metrics. The longer-term objective is to move from image recognition toward usable structural data.

Problem Context

Structural drawings are visually clean for engineers but complex for machines. Elements are defined by linework, overlap with text and dimensions, and vary across drafting styles. The challenge is not detecting shapes in isolation, but correctly identifying structural members while preserving member thickness, orientation, continuity across intersections, and connectivity between beams and columns. If the output cannot retain geometry, it has limited engineering value.

Dataset Construction

Datasets are constructed from structural plan drawings by converting vector PDFs to high-resolution raster images, cropping them into tiled patches to maintain detail, and manually annotating instance-level masks for beams, columns, and walls. Instance-level labelling treats each structural member as a separate object rather than as part of a single class mask, which matters later when extracting centrelines and connectivity.
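The tiling step can be illustrated with a short sketch. This is not the project code: the function names, the 1024-pixel tile size, and the 128-pixel overlap are assumptions chosen for illustration. It computes only the crop boxes, so it can be paired with any image library.

```python
def tile_origins(length, tile, overlap):
    """Return start offsets of tiles that cover [0, length) with the given overlap."""
    if tile <= 0 or not 0 <= overlap < tile:
        raise ValueError("need tile > 0 and 0 <= overlap < tile")
    if length <= tile:
        return [0]
    stride = tile - overlap
    origins = list(range(0, length - tile, stride))
    origins.append(length - tile)  # last tile sits flush with the far edge
    return origins


def tile_boxes(width, height, tile=1024, overlap=128):
    """Return (x0, y0, x1, y1) crop boxes covering a raster of the given size."""
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in tile_origins(height, tile, overlap)
        for x in tile_origins(width, tile, overlap)
    ]
```

Overlapping tiles mean a member cut by one tile boundary is likely to appear whole in a neighbouring tile, at the cost of duplicate detections that have to be merged afterwards.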

Augmentation (rotation, scaling, brightness variation, and controlled noise injection) is applied to reduce the risk of the model overfitting to a single drafting style.
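A minimal sketch of paired augmentation follows, assuming greyscale images stored as nested lists of 0 to 255 values. The names `rot90` and `augment` and the parameter defaults are hypothetical, and scaling is left out for brevity. The point it shows is that geometric transforms must be applied identically to the image and its mask, while photometric changes touch only the image.

```python
import random


def rot90(grid, k=1):
    """Return a copy of a 2D list rotated clockwise by k quarter-turns."""
    grid = [list(row) for row in grid]
    for _ in range(k % 4):
        grid = [list(row) for row in zip(*grid[::-1])]
    return grid


def augment(image, mask, rng, max_brightness=0.2, noise_sigma=5.0):
    """Apply one random rotation to image and mask, then jitter the image only."""
    k = rng.randrange(4)  # same quarter-turn count for image and mask
    gain = 1.0 + rng.uniform(-max_brightness, max_brightness)
    image = rot90(image, k)
    mask = rot90(mask, k)
    # Brightness gain and Gaussian noise change pixel values, not geometry,
    # so the mask is left untouched here.
    image = [
        [min(255, max(0, round(p * gain + rng.gauss(0.0, noise_sigma)))) for p in row]
        for row in image
    ]
    return image, mask
```

Passing a seeded `random.Random` instance as `rng` keeps an augmentation run reproducible.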

Model Approach

The current pipeline uses an instance-aware convolutional architecture that produces a class label, bounding region, and per-instance segmentation mask for each structural member, so that each member can be isolated while retaining its geometry.

From the mask outputs, post-processing extracts centrelines, estimates member thickness, identifies intersections between elements, and cleans up spurious detections. This geometry-aware stage is essential: a bounding box alone is insufficient for structural interpretation.
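As an illustration of the centreline and thickness steps, here is a deliberately simplified sketch. It assumes a single straight member that is roughly axis-aligned, with the mask given as a nested list of 0/1 values. The function name is hypothetical, and inclined or curved members would need a different approach such as skeletonisation.

```python
from statistics import median


def centreline_and_thickness(mask):
    """Return ([(x, y), ...] centreline points, median thickness in pixels)."""
    pts = [(x, y) for y, row in enumerate(mask) for x, v in enumerate(row) if v]
    if not pts:
        return [], 0.0
    xs = [x for x, _ in pts]
    ys = [y for _, y in pts]
    # The longer bounding-box side is taken as the member's running direction.
    horizontal = (max(xs) - min(xs)) >= (max(ys) - min(ys))

    # Group pixels into cross-sections perpendicular to the running direction.
    sections = {}
    for x, y in pts:
        along, across = (x, y) if horizontal else (y, x)
        sections.setdefault(along, []).append(across)

    centreline = []
    for along in sorted(sections):
        mid = sum(sections[along]) / len(sections[along])
        centreline.append((along, mid) if horizontal else (mid, along))

    # Median is used so a few ragged cross-sections do not skew the estimate.
    thickness = median(len(v) for v in sections.values())
    return centreline, thickness
```

For a horizontal member two pixels thick, this returns one midpoint per column and a thickness of 2.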

Training & Evaluation

Training is conducted in PyTorch with stratified train/validation/test splits, a loss that combines cross-entropy classification and mask terms, and tuned confidence thresholds. Performance is assessed via precision–recall curves, mean Average Precision (mAP), and mask Intersection over Union (IoU).
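The mask metrics can be sketched as below. This is an illustrative single-class version rather than the project's evaluation script. It assumes each mask is a set of (x, y) pixel coordinates, each prediction is a (score, mask) pair, and predictions are matched greedily to ground truth in descending score order. The function names and the 0.5 IoU threshold are assumptions.

```python
def mask_iou(a, b):
    """Intersection over Union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def precision_recall(preds, truths, iou_threshold=0.5):
    """Greedy one-to-one matching of (score, mask) predictions to truth masks."""
    matched = set()
    tp = 0
    for _, mask in sorted(preds, key=lambda p: p[0], reverse=True):
        best_iou, best_j = 0.0, None
        for j, gt in enumerate(truths):
            if j in matched:
                continue  # each ground-truth instance may be matched once
            iou = mask_iou(mask, gt)
            if iou > best_iou:
                best_iou, best_j = iou, j
        if best_j is not None and best_iou >= iou_threshold:
            matched.add(best_j)
            tp += 1
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(truths) if truths else 0.0
    return precision, recall
```

Sweeping the confidence cut-off applied to `preds` and recomputing the pair at each step traces out a precision–recall curve.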

Rather than overall accuracy, I track failure modes: thin members being partially detected, columns merging with beam masks, and drawing annotations such as text and dimension lines being incorrectly classified as structure. These inform both dataset refinement and model adjustments.
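Two of these failure modes can be flagged automatically. The sketch below is one possible approach, not the project's method: masks are sets of (x, y) pixels, and the function names and the 0.5 and 0.9 coverage thresholds are assumptions that would need tuning.

```python
def coverage(gt, pred):
    """Fraction of the ground-truth pixels covered by the prediction."""
    return len(gt & pred) / len(gt) if gt else 0.0


def flag_failures(preds, truths, covered=0.5, complete=0.9):
    """Return (merged, partial) lists for manual review.

    merged:  (pred_index, [truth_indices]) where one prediction covers
             at least `covered` of two or more ground-truth members.
    partial: (truth_index, best_coverage) where a member is detected but
             its best prediction covers less than `complete` of it.
    """
    merged = []
    for i, pmask in enumerate(preds):
        hit = [j for j, g in enumerate(truths) if coverage(g, pmask) >= covered]
        if len(hit) > 1:
            merged.append((i, hit))

    partial = []
    for j, g in enumerate(truths):
        best = max((coverage(g, p) for p in preds), default=0.0)
        if 0.0 < best < complete:
            partial.append((j, best))
    return merged, partial
```

Members with zero coverage are plain misses and are already counted by recall, so they are left out of the partial list.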

Engineering Direction

The emphasis is structural usefulness, not just image classification. Key questions guiding development: Can extracted masks be converted into clean centrelines? Can beam–column joints be detected reliably? Can the output feed into a simplified structural model? The work is gradually moving from detection toward structured geometric representation.
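One simple way to approach the joint-detection question is to treat a beam and a column as connected when their masks touch after a small dilation. The sketch below assumes masks are sets of (x, y) pixels keyed by a member identifier. The function names, the square dilation kernel, and the one-pixel radius are illustrative assumptions rather than the method used in the project.

```python
def dilate(mask, radius=1):
    """Grow a pixel set by `radius` pixels using a square kernel."""
    offsets = [
        (dx, dy)
        for dx in range(-radius, radius + 1)
        for dy in range(-radius, radius + 1)
    ]
    return {(x + dx, y + dy) for x, y in mask for dx, dy in offsets}


def find_joints(beams, columns, radius=1):
    """Return sorted (beam_id, column_id) pairs whose masks touch or nearly touch."""
    joints = []
    for b_id, b_mask in beams.items():
        grown = dilate(b_mask, radius)  # tolerate small gaps at member ends
        for c_id, c_mask in columns.items():
            if grown & c_mask:
                joints.append((b_id, c_id))
    return sorted(joints)
```

The resulting pairs form the edges of a beam–column connectivity graph, which is the kind of structured output a simplified structural model would need.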

Code & Repository

The full source code is not published here as it forms part of an ongoing supervised research project and includes structural drawing datasets provided under restricted use agreements. The implementation includes custom data loaders and image tiling pipelines, PyTorch training and inference scripts, evaluation scripts for precision–recall and IoU, and post-processing routines for geometry extraction.

Key Skills & Tools

Computer Vision

Instance-level segmentation
Mask-based structural extraction
Morphological image processing
Geometric post-processing

Machine Learning

PyTorch
Custom dataset construction and annotation
Precision–recall and IoU evaluation
Hyperparameter tuning
Failure mode analysis

Engineering Application

Geometry-aware model design
Structural connectivity extraction
Bridging vision outputs to engineering use
Ongoing supervised research project